fix: [ModelOpt-Windows][modelopt 0.43.0] [genai_llm][README]: Sho (#5997787)#1077
Conversation
Signed-off-by: Pensieve Bot <pensieve-bot@nvidia.com>
📝 Walkthrough

Windows installation documentation is updated to provide separate instruction paths for CUDA-capable NVIDIA RTX GPUs versus other systems using DirectML, replacing single-path guidance with GPU-specific package recommendations for ONNX Runtime and related dependencies.
Estimated code review effort: 🎯 1 (Trivial) | ⏱️ ~5 minutes

🚥 Pre-merge checks: ✅ 3 passed | ❌ 1 failed

❌ Failed checks (1 warning)
✅ Passed checks (3 passed)
📝 Coding Plan
Actionable comments posted: 1
Caution: Some comments are outside the diff and can't be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
examples/windows/README.md (1)
Lines 68-69: ⚠️ Potential issue | 🟡 Minor: Version mismatch across documentation files.

The DirectML package versions specified here (`onnxruntime-directml==1.20.0` and `onnxruntime-genai-directml>=0.4.0`) differ from those in `examples/windows/accuracy_benchmark/README.md` (line 42: `onnxruntime-directml==1.21.1` and `onnxruntime-genai-directml==0.6.0`). This inconsistency may confuse users following different examples.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/windows/README.md` around lines 68 - 69, The README contains inconsistent DirectML package versions; update the two pip lines so the onnxruntime-directml and onnxruntime-genai-directml versions match the versions used in examples/windows/accuracy_benchmark/README.md (replace onnxruntime-directml==1.20.0 with onnxruntime-directml==1.21.1 and onnxruntime-genai-directml>=0.4.0 with onnxruntime-genai-directml==0.6.0) to ensure both docs use the same package versions for onnxruntime-directml and onnxruntime-genai-directml.
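The mismatch called out above comes down to how version specifiers compose: one release can satisfy a loose lower bound (`>=`) while a different release satisfies the pin (`==`), so installs following the two docs can genuinely diverge. A minimal stdlib sketch; the `parse` and `satisfies` helpers are illustrative, not part of pip or any tool discussed here:

```python
def parse(v: str) -> tuple:
    """Parse a simple X.Y.Z version string into a comparable tuple."""
    return tuple(int(part) for part in v.split("."))

def satisfies(version: str, spec: str) -> bool:
    """Check a version against a simple ==X.Y.Z or >=X.Y.Z specifier."""
    if spec.startswith("=="):
        return parse(version) == parse(spec[2:])
    if spec.startswith(">="):
        return parse(version) >= parse(spec[2:])
    raise ValueError(f"unsupported specifier: {spec}")

# 0.6.0 satisfies the loose spec in one doc, but 0.4.0 does not satisfy
# the pin in the other, so the two docs admit different install results.
print(satisfies("0.6.0", ">=0.4.0"))  # True
print(satisfies("0.4.0", "==0.6.0"))  # False
```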
🧹 Nitpick comments (2)
examples/windows/README.md (1)
Lines 60-61: Consider pinning CUDA package versions for reproducibility.

The CUDA installation commands install `onnxruntime-genai-cuda` and `onnxruntime` without version specifications, while the DirectML path uses specific versions. This inconsistency could lead to compatibility issues when package versions drift.

📌 Suggested version pinning
Consider specifying versions for consistency and reproducibility:
```diff
-pip install onnxruntime-genai-cuda
-pip install onnxruntime
+pip install onnxruntime-genai-cuda==<version>
+pip install onnxruntime==<version>
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/windows/README.md` around lines 60-61: The README installs onnxruntime-genai-cuda and onnxruntime without versions, which can cause drift and incompatibility; update the two pip lines so they pin explicit, compatible versions (e.g., match the DirectML path's pinned versions or the tested CUDA-compatible onnxruntime/onnxruntime-genai-cuda pair) to ensure reproducible installs; modify the lines that currently reference onnxruntime-genai-cuda and onnxruntime to include the chosen version specifiers.

examples/windows/accuracy_benchmark/README.md (1)
Line 41: Missing version specifications for CUDA packages.

Similar to the main README, the CUDA installation lacks version pins while DirectML has specific versions. This asymmetry may lead to compatibility issues.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@examples/windows/accuracy_benchmark/README.md` at line 41, The CUDA install row in the README currently lists `pip install onnxruntime` and `pip install onnxruntime-genai-cuda` without version pins; update the table row "Install ONNX Runtime Packages (CUDA)" to include explicit version specifiers matching the main README (e.g., the same pinned versions used there) by replacing the unpinned package entries with the versioned packages for `onnxruntime` and `onnxruntime-genai-cuda` so CUDA and DirectML installs are symmetrical and compatible.
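One way to produce the explicit pins this nitpick asks for is to read exact versions out of a known-good environment. A small sketch, assuming you capture `pip freeze` output yourself; the sample text and the `extract_pins` helper are hypothetical:

```python
import re

def extract_pins(freeze_output: str, packages: set) -> dict:
    """Pull exact versions for the named packages out of `pip freeze` output."""
    pins = {}
    for line in freeze_output.splitlines():
        m = re.match(r"([A-Za-z0-9_.-]+)==(\S+)", line.strip())
        if m and m.group(1).lower() in packages:
            pins[m.group(1).lower()] = m.group(2)
    return pins

# Sample freeze output from a hypothetical tested CUDA environment.
sample = """\
numpy==1.26.4
onnxruntime==1.21.1
onnxruntime-genai-cuda==0.6.0
"""
pins = extract_pins(sample, {"onnxruntime", "onnxruntime-genai-cuda"})
for name, version in pins.items():
    print(f"pip install {name}=={version}")
```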
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@examples/windows/accuracy_benchmark/README.md`:
- Line 42: Summary: The DirectML package versions differ between the
accuracy_benchmark README and other Windows installation docs, causing
confusion. Fix: choose a single canonical version policy (either standardize to
onnxruntime-directml==1.20.0 and onnxruntime-genai-directml>=0.4.0 or update all
docs to 1.21.1/0.6.0) or explicitly document why the accuracy_benchmark example
requires different versions; then update the three documents that mention these
packages (the accuracy_benchmark README entry containing
"onnxruntime-directml==1.21.1" / "onnxruntime-genai-directml==0.6.0", the
Windows README entry that lists "onnxruntime-directml==1.20.0" /
"onnxruntime-genai-directml>=0.4.0", and the Olive installation doc that lists
those same versions) so they either all use the chosen canonical versions or
include a short note explaining the divergence and required versions for that
example.
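The canonical-version policy described above can also be checked mechanically rather than by eye. A sketch that scans doc text for `onnxruntime-directml` specifiers and flags packages whose specifiers disagree across files; the inline snippets reproduce the versions quoted in this review, and the helper names are illustrative:

```python
import re

# Match e.g. "onnxruntime-directml==1.20.0" or "onnxruntime-genai-directml>=0.4.0".
SPEC_RE = re.compile(r"(onnxruntime(?:-genai)?-directml)\s*(==|>=)\s*([\d.]+)")

def collect_specs(doc_text: str) -> set:
    """Return every (package, operator, version) specifier found in a doc."""
    return set(SPEC_RE.findall(doc_text))

docs = {
    "examples/windows/README.md":
        "pip install onnxruntime-directml==1.20.0\n"
        "pip install onnxruntime-genai-directml>=0.4.0\n",
    "examples/windows/accuracy_benchmark/README.md":
        "pip install onnxruntime-directml==1.21.1\n"
        "pip install onnxruntime-genai-directml==0.6.0\n",
}

# A package is consistent only if every doc states the identical specifier.
by_package = {}
for text in docs.values():
    for pkg, op, ver in collect_specs(text):
        by_package.setdefault(pkg, set()).add((op, ver))

mismatched = sorted(pkg for pkg, specs in by_package.items() if len(specs) > 1)
print(mismatched)  # both DirectML packages disagree across the two docs
```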
---
Outside diff comments:
In `@examples/windows/README.md`:
- Around line 68-69: The README contains inconsistent DirectML package versions;
update the two pip lines so the onnxruntime-directml and
onnxruntime-genai-directml versions match the versions used in
examples/windows/accuracy_benchmark/README.md (replace
onnxruntime-directml==1.20.0 with onnxruntime-directml==1.21.1 and
onnxruntime-genai-directml>=0.4.0 with onnxruntime-genai-directml==0.6.0) to
ensure both docs use the same package versions for onnxruntime-directml and
onnxruntime-genai-directml.
---
Nitpick comments:
In `@examples/windows/accuracy_benchmark/README.md`:
- Line 41: The CUDA install row in the README currently lists `pip install
onnxruntime` and `pip install onnxruntime-genai-cuda` without version pins;
update the table row "Install ONNX Runtime Packages (CUDA)" to include explicit
version specifiers matching the main README (e.g., the same pinned versions used
there) by replacing the unpinned package entries with the versioned packages for
`onnxruntime` and `onnxruntime-genai-cuda` so CUDA and DirectML installs are
symmetrical and compatible.
In `@examples/windows/README.md`:
- Around line 60-61: The README installs onnxruntime-genai-cuda and onnxruntime
without versions which can cause drift and incompatibility; update the two pip
lines so they pin explicit, compatible versions (e.g., match the DirectML path's
pinned versions or the tested CUDA-compatible onnxruntime/onnxruntime-genai-cuda
pair) to ensure reproducible installs—modify the lines that currently reference
onnxruntime-genai-cuda and onnxruntime to include the chosen version specifiers.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: e840ddd5-7f97-48ec-92d2-417e2b87511a
📒 Files selected for processing (3)
- docs/source/getting_started/windows/_installation_with_olive.rst
- examples/windows/README.md
- examples/windows/accuracy_benchmark/README.md
| **Install PyTorch and Related Packages** | `pip install torch==2.7.0 torchvision==0.22.0 torchaudio==2.7.0 --index-url https://download.pytorch.org/whl/cu128` |
| **Install ONNX Runtime Packages** | `pip install onnxruntime-directml==1.21.1` <br> `pip install onnxruntime-genai-directml==0.6.0` |
| **Install ONNX Runtime Packages (CUDA)** | `pip install onnxruntime` <br> `pip install onnxruntime-genai-cuda` |
| **Install ONNX Runtime Packages (DirectML)** | `pip install onnxruntime-directml==1.21.1` <br> `pip install onnxruntime-genai-directml==0.6.0` |
DirectML versions differ from other documentation.
This file specifies onnxruntime-directml==1.21.1 and onnxruntime-genai-directml==0.6.0, while examples/windows/README.md (line 68-69) and docs/source/getting_started/windows/_installation_with_olive.rst (line 33-34) specify onnxruntime-directml==1.20.0 and onnxruntime-genai-directml>=0.4.0.
This version discrepancy needs clarification—either different examples require different versions (which should be documented), or one set of versions should be standardized across all documentation.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@examples/windows/accuracy_benchmark/README.md` at line 42, Summary: The
DirectML package versions differ between the accuracy_benchmark README and other
Windows installation docs, causing confusion. Fix: choose a single canonical
version policy (either standardize to onnxruntime-directml==1.20.0 and
onnxruntime-genai-directml>=0.4.0 or update all docs to 1.21.1/0.6.0) or
explicitly document why the accuracy_benchmark example requires different
versions; then update the three documents that mention these packages (the
accuracy_benchmark README entry containing "onnxruntime-directml==1.21.1" /
"onnxruntime-genai-directml==0.6.0", the Windows README entry that lists
"onnxruntime-directml==1.20.0" / "onnxruntime-genai-directml>=0.4.0", and the
Olive installation doc that lists those same versions) so they either all use
the chosen canonical versions or include a short note explaining the divergence
and required versions for that example.
Fixes #5997787
Summary
The Windows README and installation documentation recommend DirectML packages (onnxruntime-genai-directml and onnxruntime-directml) for Olive installation, but the issue author suggests these should be replaced with CUDA variants (onnxruntime-genai-cuda and onnxruntime) since the test environment uses RTX GPUs with CUDA 12.9. This appears to be a documentation inconsistency that may mislead users with CUDA-capable GPUs.
Root Cause
The documentation provides a one-size-fits-all installation recommendation for Olive that defaults to DirectML packages, but doesn't account for users who have CUDA-capable RTX GPUs and want to use CUDA acceleration instead. The installation instructions should either: (1) provide both CUDA and DirectML options, or (2) recommend CUDA for RTX GPU users and mention DirectML as an alternative.
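The two-path recommendation the root cause calls for can be expressed as a small selection helper. A sketch, with DirectML versions taken from the accuracy_benchmark README and CUDA packages left unpinned as in the current docs; the `nvidia-smi`-on-PATH check is a crude illustrative heuristic, not an official detection method:

```python
import shutil

CUDA_PACKAGES = ["onnxruntime", "onnxruntime-genai-cuda"]
DIRECTML_PACKAGES = ["onnxruntime-directml==1.21.1",
                     "onnxruntime-genai-directml==0.6.0"]

def recommended_packages(has_cuda_gpu: bool) -> list:
    """Pick the ONNX Runtime install set for the detected GPU backend."""
    return CUDA_PACKAGES if has_cuda_gpu else DIRECTML_PACKAGES

# Crude heuristic: an NVIDIA driver install usually puts nvidia-smi on PATH.
has_cuda = shutil.which("nvidia-smi") is not None
for pkg in recommended_packages(has_cuda):
    print(f"pip install {pkg}")
```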
Agent Fix Summary
Successfully fixed the Windows installation documentation issue by providing both CUDA and DirectML options. Updated three key files:
All changes verified via Slurm validation test which confirmed both CUDA and DirectML options are properly documented in the installation instructions.
Files Changed
- docs/source/getting_started/windows/_installation_with_olive.rst
- examples/windows/README.md
- examples/windows/accuracy_benchmark/README.md

Reproduction
To validate on a Slurm cluster, save the files below under `tools/launcher/` in Model-Optimizer and run:

```shell
cd tools/launcher
uv run launch.py --yaml examples/triage/test_windows_doc_fix.yaml --yes
```

- tools/launcher/examples/triage/test_windows_doc_fix.yaml
- tools/launcher/examples/triage/test_doc_fix.sh

Auto-generated by pensieve
/magic-triage agentic fix; please review before merging.