Pull requests: SemiAnalysisAI/InferenceX
[AMD] dsv4-fp4-mi355x-atom-disagg, add multi-node ATOM/mooncake disaggregation support
AMD
#1683
opened Jun 8, 2026 by
seungrokj
Collaborator
5 tasks
dsv4-fp4-b300-sglang: align env vars to GB300
full-sweep-enabled
#1682
opened Jun 8, 2026 by
yhyang201
Collaborator
[AMD][MI35X] Qwen3.5-fp4 SGLang single-node benchmark
AMD
full-sweep-enabled
#1680
opened Jun 8, 2026 by
1am9trash
Collaborator
Bump actions/checkout from 6.0.2 to 6.0.3 in the github-actions group
dependencies
Pull requests that update a dependency file
github_actions
Pull requests that update GitHub Actions code
#1679
opened Jun 8, 2026 by
dependabot
Bot
Add DSv4-Pro FP4 GB200 SGLang disagg + MTP config
full-sweep-enabled
#1676
opened Jun 5, 2026 by
Ankur-singh
Collaborator
Add DSv4-Pro FP4 GB200 SGLang disagg config
full-sweep-enabled
#1675
opened Jun 5, 2026 by
Ankur-singh
Collaborator
[AMD][MI355X] Bump qwen3.5-bf16 single-node SGLang image to v0.5.12.post1
#1673
opened Jun 5, 2026 by
ChangLiu0709
Collaborator
2 of 3 tasks
chore(sweep): re-run MiniMax-M2.5 vLLM sweeps for monitoring
full-sweep-enabled
#1666
opened Jun 4, 2026 by
arygupt
Collaborator
[WIP] Initial work to add llm-d-vllm framework with H200
#1660
opened Jun 4, 2026 by
ezrasilvera
Collaborator
Throwaway: conc-64 gsm8k eval for DEP8+MTP3 dispatch token bug
non-canary-full-sweep-enabled
Run the full sweep without the canary gate (full search space, no trim)
#1659
opened Jun 3, 2026 by
Oseltamivir
Collaborator
[NV] Add Kimi K2.5 FP4 B200/B300 EP sweep
full-sweep-enabled
#1658
opened Jun 3, 2026 by
jasonlizhengjian
Collaborator
Draft
AMD - gpt-oss vllm mxfp4: AITER tuning + n-gram spec decode + server …
AMD
#1657
opened Jun 3, 2026 by
nehaprakriya
Collaborator
Draft
fix(power): classify zero-decode-GPU multinode runs as aggregated
#1646
opened Jun 2, 2026 by
arygupt
Collaborator
dsr1-fp4-mi355x-sglang: bump ROCm 7.0->7.2 image + add TP4 search-space
AMD
#1645
opened Jun 2, 2026 by
JohnQinAMD
Collaborator
Use official TRT-LLM image (1.3.0rc15.post1) for DSv4 B300 TRT (non-MTP + MTP)
full-sweep-enabled
#1636
opened Jun 1, 2026 by
Oseltamivir
Collaborator
feat(power): vendor-agnostic GPU power/telemetry aggregation core
#1635
opened Jun 1, 2026 by
arygupt
Collaborator
2 of 3 tasks
Enable Rust frontend (VLLM_USE_RUST_FRONTEND=1)
full-sweep-enabled
#1634
opened Jun 1, 2026 by
chunfangamd
Collaborator
Update new fixed-AR-MTP CI workflow for kimik2.5_int4, kimik2.5_fp4, …
#1633
opened Jun 1, 2026 by
haic0
Collaborator
[Klaud Cold] Update gptoss-fp4-mi300x-vllm vLLM ROCm image to v0.22.0
full-sweep-enabled
#1621
opened May 30, 2026 by
functionstackx
Collaborator
1 task
[Klaud Cold] Update minimaxm2.5-fp8-mi300x-vllm vLLM ROCm image to v0.22.0
full-sweep-enabled
#1618
opened May 30, 2026 by
functionstackx
Collaborator
1 task
[Klaud Cold] Update kimik2.5-int4-mi300x-vllm vLLM ROCm image to v0.22.0
full-sweep-enabled
#1615
opened May 30, 2026 by
functionstackx
Collaborator
1 task