dsv4-fp4-gb300-dynamo-trt: STP + MTP disagg trtllm recipes on GB300 by Ankur-singh · Pull Request #1689 · SemiAnalysisAI/InferenceX

Ankur-singh · 2026-06-08T20:28:30Z

Summary

Adds disaggregated TRT-LLM inference benchmarks for DeepSeek-V4-Pro MXFP4 on
GB300 via the Dynamo frontend, covering both STP and MTP configurations.

Changes

.github/configs/nvidia-master.yaml adds two new config keys:
- dsv4-fp4-gb300-dynamo-trt (STP): 27 scenarios, ISL 1024/OSL 1024 (14 conc points) + ISL 8192/OSL 1024 (13 conc points)
- dsv4-fp4-gb300-dynamo-trt-mtp (MTP): 27 scenarios, same sequence-length coverage
- Container: nvcr.io#nvidia/ai-dynamo/tensorrtllm-runtime:1.3.0-deepseek-v4-dev.1
- Recipes referenced from NVIDIA/srt-slurm branch sa-submission-q2-2026
runners/launch_gb300-nv.sh adds a new branch for framework=dynamo-trt with model_prefix=dsv4. It overrides SRT_SLURM_MODEL_PREFIX to deepseek-ai/DeepSeek-V4-Pro to match the recipe's model.path (HuggingFace-style) and clones NVIDIA/srt-slurm@sa-submission-q2-2026.
perf-changelog.yaml gains an entry, appended at the end, documenting both keys; the PR link was backfilled after the PR was opened.

Note

Low Risk
Benchmark and CI runner configuration only; no application runtime or auth logic changes.

Overview
Adds GB300 disaggregated DeepSeek-V4-Pro MXFP4 benchmark coverage for TensorRT-LLM via Dynamo, in both STP and MTP variants.

nvidia-master.yaml gains dsv4-fp4-gb300-dynamo-trt and dsv4-fp4-gb300-dynamo-trt-mtp (54 scenarios total: 1k/1k and 8k/1k fixed-seq-len sweeps with prefill/decode parallelism and CONFIG_FILE pointers to NVIDIA/srt-slurm sa-submission-q2-2026 recipes). Image: tensorrtllm-runtime:1.3.0-deepseek-v4-dev.1.
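The scenario counts quoted above can be checked with a short tally. This is a minimal sketch: the dictionary layout and the names `CONC_POINTS`, `CONFIG_KEYS`, and `scenarios_per_key` are illustrative assumptions, not the schema of nvidia-master.yaml. Only the per-sweep point counts (14 and 13) and the two config key names come from the PR description.

```python
# Concurrency points per (ISL, OSL) sweep, as stated in the PR description.
# The dictionary layout is an illustrative assumption, not the YAML schema.
CONC_POINTS = {
    (1024, 1024): 14,
    (8192, 1024): 13,
}

CONFIG_KEYS = [
    "dsv4-fp4-gb300-dynamo-trt",      # STP
    "dsv4-fp4-gb300-dynamo-trt-mtp",  # MTP, same sequence-length coverage
]


def scenarios_per_key(conc_points):
    """Sum the concurrency points across all sequence-length sweeps."""
    return sum(conc_points.values())


per_key = scenarios_per_key(CONC_POINTS)
total = per_key * len(CONFIG_KEYS)
print(per_key, total)  # 27 54
```

Each key contributes 14 + 13 = 27 scenarios, and the two keys together give the 54 scenarios mentioned in the overview.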

launch_gb300-nv.sh adds a dynamo-trt + dsv4 path that sets SRT_SLURM_MODEL_PREFIX to the HuggingFace model id and checks out that srt-slurm branch. perf-changelog.yaml documents the new config keys.
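The selection logic described for the launcher can be modelled roughly as follows. The real launcher is a shell script whose contents are not shown here, so this Python sketch is only an illustration: the function name `resolve_dsv4_overrides` and the dictionary keys `srt_slurm_repo` and `srt_slurm_ref` are hypothetical, and the handling of any other framework or model prefix is not described in the PR, so the sketch simply returns `None` for those.

```python
from typing import Optional


def resolve_dsv4_overrides(framework: str, model_prefix: str) -> Optional[dict]:
    """Return the overrides described in the PR for the dynamo-trt + dsv4 path.

    Returns None for any other combination, because the PR description does
    not say how the launcher handles those cases.
    """
    if framework == "dynamo-trt" and model_prefix == "dsv4":
        return {
            # HuggingFace-style model id, matching the recipe's model.path.
            "SRT_SLURM_MODEL_PREFIX": "deepseek-ai/DeepSeek-V4-Pro",
            # Repository and branch holding the referenced recipes.
            "srt_slurm_repo": "NVIDIA/srt-slurm",      # hypothetical key name
            "srt_slurm_ref": "sa-submission-q2-2026",  # hypothetical key name
        }
    return None


overrides = resolve_dsv4_overrides("dynamo-trt", "dsv4")
print(overrides["SRT_SLURM_MODEL_PREFIX"])
```

Keeping the override keyed on both the framework and the model prefix means other GB300 paths are left untouched by this change, which fits the low-risk note above.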

Reviewed by Cursor Bugbot for commit e2dd5ca. Bugbot is set up for automated code reviews on this repo.

github-actions · 2026-06-08T20:28:37Z

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes are similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. Often, failures are just flakes, and simply re-running the failed jobs will fix them. If failed jobs are re-run, PR authors are responsible for ensuring they pass. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, PR authors should request a review and get approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

dsv4-fp4-gb300-dynamo-trt: add disagg trtllm recipes for STP and MTP …

db84c59

…on GB300

Ankur-singh requested a review from a team June 8, 2026 20:28

Ankur-singh requested review from jgangani and kedarpotdar-nv as code owners June 8, 2026 20:28

github-project-automation Bot added this to InferenceMAX Board Jun 8, 2026

Update perf-changelog pr-link for #1689

e2dd5ca

Ankur-singh added the full-sweep-enabled label Jun 8, 2026

Ankur-singh wants to merge 2 commits into main from dsv4-fp4-gb300-dynamo-trt
