
dsr1-fp4-b200-dynamo-sglang-mtp: 8k1k 6-variant MTP sweep on local split recipes#1688

Open
Ankur-singh wants to merge 4 commits into main from dsr1-fp4-b200-8k1k-mtp

Ankur-singh (Collaborator) commented Jun 8, 2026

Restructure the DeepSeek-R1 FP4 B200 dynamo-sglang MTP disagg sweep to an 8k1k-only, 6-variant configuration backed by local split recipes (one flat recipe YAML per topology).

  • 3 low-latency (1p5d / 1p3d / 1p1d, TP4 prefill / TP8 decode) + 3 MTP2 high-throughput (2p1d / 3p1d / 5p1d, DEP4 prefill / DEP8 decode) topologies
  • Bump container image to lmsysorg/sglang:v0.5.12.post1
  • Add the dsr1/fp4 recipe-copy path to launch_b200-dgxc.sh

Note

Low Risk
Changes are benchmark YAML, launch scripting, and perf changelog only; risk is misconfigured cluster topologies or wrong recipe paths causing failed or misleading B200 runs, not production app logic.

Overview
Restructures the dsr1-fp4-b200-dynamo-sglang-mtp benchmark sweep to use in-repo flat recipe YAMLs instead of srt-slurm zip_override/override recipe references, and bumps the SGLang image to lmsysorg/sglang:v0.5.12.post1.

For 1k1k, the four MTP disagg points are unchanged in intent; CONFIG_FILE now points at recipes/sglang/dsr1/b200-fp4/1k1k/disagg/mtp/*.yaml. For 8k1k, the search space is replaced with a six-variant sweep: three low-latency topologies (1p5d / 1p3d / 1p1d with TP4 prefill, TP8 decode, ep: 1, dp-attn: false) and three MTP2 throughput Pareto points (2p1d / 3p1d / 5p1d with DEP4 prefill / DEP8 decode and fixed high concurrencies).
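For reviewers who want the 8k1k search space at a glance, here is a minimal POSIX shell sketch that lists the six variants named above. It is illustrative only and not part of the PR: the `variants` list, `parse_topology`, and `describe_variants` are hypothetical names, and it assumes that a label such as `1p5d` means one prefill worker and five decode workers.

```shell
#!/bin/sh
# The six 8k1k variants from the PR description, one per line:
# topology|prefill parallelism|decode parallelism|group
variants="1p5d|TP4|TP8|low-latency
1p3d|TP4|TP8|low-latency
1p1d|TP4|TP8|low-latency
2p1d|DEP4|DEP8|mtp2-throughput
3p1d|DEP4|DEP8|mtp2-throughput
5p1d|DEP4|DEP8|mtp2-throughput"

# Assumption: "NpMd" means N prefill workers and M decode workers.
# Prints "N M"; returns 1 when the label does not look like NpMd.
# The glob check is deliberately loose, this is not a strict validator.
parse_topology() {
    case "$1" in
        [0-9]*p[0-9]*d) ;;
        *) return 1 ;;
    esac
    _rest="${1#*p}"              # drop everything up to the first "p"
    echo "${1%%p*} ${_rest%d}"   # text before "p", then text before the trailing "d"
}

# Prints one human-readable line per variant.
describe_variants() {
    echo "$variants" | while IFS='|' read -r topo prefill decode group; do
        counts=$(parse_topology "$topo") || continue
        echo "${topo}: ${counts% *} prefill (${prefill}), ${counts#* } decode (${decode}), ${group}"
    done
}
```

Calling `describe_variants` prints six lines, one per recipe; the parallelism labels are copied from the description, not read from the recipe YAMLs.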

Adds the corresponding recipe files under benchmarks/multi_node/srt-slurm-recipes/sglang/dsr1/b200-fp4/, documents the change in perf-changelog.yaml, and extends launch_b200-dgxc.sh to clone srt-slurm at main and copy those recipes into the checkout for dynamo-sglang + dsr1 + fp4.

Reviewed by Cursor Bugbot for commit 79a9f56. Bugbot is set up for automated code reviews on this repo.

github-actions (bot, Contributor) commented Jun 8, 2026

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes are similar to the official vLLM recipes and/or the SGLang cookbook.

If they are not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class so that the entire ML community can benefit from your hard work! Thank you.

PR authors are responsible for ensuring that, after merging, all GitHub Action jobs fully pass. Failures are often just flakes, and simply re-running the failed jobs will fix them. If failed jobs are re-run, PR authors are responsible for ensuring that they pass. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, PR authors should request a review and get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.


Comment thread .github/configs/nvidia-master.yaml Outdated
…tinode runner

Only the 8k1k scenario is updated (6-variant local split recipes). The
1k1k scenario and the b200-multinode runner are unchanged from main; the
image bump to v0.5.12.post1 is shared (1k1k follows via the dynamo-sglang
container alias).
…cipes

Flatten the srt-slurm b200-fp4 1k1k recipe (base + zip_override_mtp_*[i])
into 4 standalone per-topology recipes under
recipes/sglang/dsr1/b200-fp4/1k1k/disagg/mtp/, matching the 8k1k local
layout, and point the config at them instead of srt-slurm. Behavior is
unchanged (faithful flatten; dynamo-sglang container alias preserved).

cursor (bot) left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.


cd "$SRT_REPO_DIR" || exit 1
git checkout main
mkdir -p recipes/sglang/dsr1/b200-fp4
cp -rT "$GITHUB_WORKSPACE/benchmarks/multi_node/srt-slurm-recipes/sglang/dsr1/b200-fp4" recipes/sglang/dsr1/b200-fp4


Launch branch breaks non-MTP

High Severity

The new dynamo-sglang + dsr1 + fp4 branch applies to every DeepSeek-R1 FP4 disagg run, not only the MTP sweep. It checks out srt-slurm main and copies only recipes/sglang/dsr1/b200-fp4, while dsr1-fp4-b200-dynamo-sglang still points CONFIG_FILE at recipes/b200-fp4/1k1k.yaml overrides that lived on sa-submission-q2-2026.
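One possible way to narrow the branch, sketched under assumptions rather than taken from the PR: key the recipe copy and the srt-slurm ref on where `CONFIG_FILE` points instead of on framework, model, and precision alone. The helper names `uses_local_recipes` and `pick_srt_ref` are hypothetical; the two paths and the two ref names come from the description above, and falling back to `sa-submission-q2-2026` for everything else is an assumption about the previous behavior.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only when the config path lives under the
# in-repo recipe tree that the launch script copies into the srt-slurm checkout.
uses_local_recipes() {
    case "$1" in
        recipes/sglang/dsr1/b200-fp4/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: prints the srt-slurm ref to check out for a config path.
# Assumption: configs outside the local recipe tree still need the ref that
# holds the older override recipes (named in the review comment).
pick_srt_ref() {
    if uses_local_recipes "$1"; then
        echo "main"
    else
        echo "sa-submission-q2-2026"
    fi
}
```

With this shape, a config at `recipes/b200-fp4/1k1k.yaml` would resolve to `sa-submission-q2-2026` and skip the copy, while the MTP recipes under `recipes/sglang/dsr1/b200-fp4/` would resolve to `main`. Whether the non-MTP config should instead be migrated to local recipes is a separate decision for the author.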

