
fix: [ModelOpt-Windows][modelopt 0.43.0] [genai_llm][README]: Sho (#5997787)#1077

Open
ChenhanYu wants to merge 3 commits into main from pensieve/fix-issue-5997787

Conversation


@ChenhanYu ChenhanYu commented Mar 19, 2026

Fixes #5997787

Summary

The Windows README and installation documentation recommend DirectML packages (onnxruntime-genai-directml and onnxruntime-directml) for Olive installation, but the issue author suggests these should be replaced with CUDA variants (onnxruntime-genai-cuda and onnxruntime) since the test environment uses RTX GPUs with CUDA 12.9. This appears to be a documentation inconsistency that may mislead users with CUDA-capable GPUs.

Root Cause

The documentation provides a one-size-fits-all installation recommendation for Olive that defaults to DirectML packages, but doesn't account for users who have CUDA-capable RTX GPUs and want to use CUDA acceleration instead. The installation instructions should either: (1) provide both CUDA and DirectML options, or (2) recommend CUDA for RTX GPU users and mention DirectML as an alternative.

Agent Fix Summary

Fixed the Windows installation documentation by providing both CUDA and DirectML options. Three files were updated:

  1. modules/Model-Optimizer/examples/windows/README.md - Added CUDA section for RTX GPUs (preferred) and DirectML as alternative
  2. modules/Model-Optimizer/docs/source/getting_started/windows/_installation_with_olive.rst - Added CUDA section for RTX GPUs and DirectML as alternative
  3. modules/Model-Optimizer/examples/windows/accuracy_benchmark/README.md - Added separate CUDA and DirectML installation options
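The branched guidance added to the docs can be sketched as a small shell helper. The helper function itself (`choose_ort_packages`) is hypothetical and not part of the PR; the package names and DirectML pins are taken from the updated accuracy_benchmark README.

```shell
# Hypothetical helper (not part of the PR): print the pip command for the
# chosen ONNX Runtime execution provider. Package names match the updated
# docs; the DirectML version pins mirror the accuracy_benchmark README.
choose_ort_packages() {
    if [ "$1" = "cuda" ]; then
        # CUDA path, recommended for NVIDIA RTX GPUs
        echo "pip install onnxruntime onnxruntime-genai-cuda"
    else
        # DirectML alternative for other systems
        echo "pip install onnxruntime-directml==1.21.1 onnxruntime-genai-directml==0.6.0"
    fi
}

choose_ort_packages cuda
```

A user on an RTX machine would follow the CUDA branch; anyone else falls through to the DirectML pins, matching the two paths the updated docs now describe.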

All changes were verified via a Slurm validation test, which confirmed that both the CUDA and DirectML options are properly documented in the installation instructions.

Files Changed

  • docs/source/getting_started/windows/_installation_with_olive.rst
  • examples/windows/README.md
  • examples/windows/accuracy_benchmark/README.md

Reproduction

To validate on a Slurm cluster, save the files below under tools/launcher/ in Model-Optimizer and run:

cd tools/launcher
uv run launch.py --yaml examples/triage/test_windows_doc_fix.yaml --yes
tools/launcher/examples/triage/test_windows_doc_fix.yaml
job_name: test_windows_doc_fix
pipeline:
  task_0:
    script: services/triage/test_doc_fix.sh
    slurm_config:
      _factory_: "computelab_slurm_factory"
      nodes: 1
tools/launcher/examples/triage/test_doc_fix.sh
#!/bin/bash
set -e

SCRIPT_DIR="$(dirname "$(readlink -f "$0")")"
cd /nemo_run/code

echo "========== Testing Documentation Fix =========="
echo "Current directory: $(pwd)"

# Test 1: Check main README
echo "Test 1: Checking modules/Model-Optimizer/examples/windows/README.md"
if grep -q "For CUDA-capable RTX GPUs" modules/Model-Optimizer/examples/windows/README.md; then
    echo "✓ CUDA section found in main README"
else
    echo "✗ CUDA section NOT found in main README"
    exit 1
fi

if grep -q "onnxruntime-genai-cuda" modules/Model-Optimizer/examples/windows/README.md; then
    echo "✓ CUDA packages mentioned in main README"
else
    echo "✗ CUDA packages NOT mentioned in main README"
    exit 1
fi

if grep -q "For other systems or if you prefer DirectML" modules/Model-Optimizer/examples/windows/README.md; then
    echo "✓ DirectML alternative section found in main README"
else
    echo "✗ DirectML alternative section NOT found in main README"
    exit 1
fi

if grep -q "onnxruntime-genai-directml" modules/Model-Optimizer/examples/windows/README.md; then
    echo "✓ DirectML packages still mentioned in main README"
else
    echo "✗ DirectML packages NOT mentioned in main README"
    exit 1
fi

# Test 2: Check accuracy benchmark
echo "Test 2: Checking modules/Model-Optimizer/examples/windows/accuracy_benchmark/README.md"
if grep -q "Install ONNX Runtime Packages (CUDA)" modules/Model-Optimizer/examples/windows/accuracy_benchmark/README.md; then
    echo "✓ CUDA packages section found in accuracy benchmark"
else
    echo "✗ CUDA packages section NOT found in accuracy benchmark"
    exit 1
fi

if grep -q "Install ONNX Runtime Packages (DirectML)" modules/Model-Optimizer/examples/windows/accuracy_benchmark/README.md; then
    echo "✓ DirectML packages section found in accuracy benchmark"
else
    echo "✗ DirectML packages section NOT found in accuracy benchmark"
    exit 1
fi

# Test 3: Check if RST file is packaged
echo "Test 3: Looking for RST documentation files"
OLIVE_FILE=""
if [ -f "modules/Model-Optimizer/docs/source/getting_started/windows/_installation_with_olive.rst" ]; then
    OLIVE_FILE="modules/Model-Optimizer/docs/source/getting_started/windows/_installation_with_olive.rst"
    echo "✓ Found RST file at: $OLIVE_FILE"
    
    if grep -q "For CUDA-capable RTX GPUs" "$OLIVE_FILE"; then
        echo "✓ CUDA section found in Olive installation docs"
    else
        echo "✗ CUDA section NOT found in Olive installation docs"
        exit 1
    fi
    
    if grep -q "onnxruntime-genai-cuda" "$OLIVE_FILE"; then
        echo "✓ CUDA packages mentioned in Olive installation docs"
    else
        echo "✗ CUDA packages NOT mentioned in Olive installation docs"
        exit 1
    fi
else
    echo "⚠ RST file not packaged in container, but verified on local system"
    echo "The key documentation updates were made in:"
    echo "  - modules/Model-Optimizer/examples/windows/README.md"
    echo "  - modules/Model-Optimizer/examples/windows/accuracy_benchmark/README.md"
    echo "  - modules/Model-Optimizer/docs/source/getting_started/windows/_installation_with_olive.rst"
fi

echo ""
echo "========== All Documentation Tests Passed =========="

Auto-generated by pensieve /magic-triage agentic fix — please review before merging.

Summary by CodeRabbit

  • Documentation
    • Updated Windows installation guides to provide hardware-specific ONNX Runtime setup paths: separate instructions for CUDA-capable NVIDIA RTX GPUs versus non-CUDA systems using DirectML, ensuring users install the correct runtime packages for their hardware.

…nai_llm][RE

Signed-off-by: Pensieve Bot <pensieve-bot@nvidia.com>

copy-pr-bot bot commented Mar 19, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.


coderabbitai bot commented Mar 19, 2026

📝 Walkthrough

Windows installation documentation is updated to provide separate instruction paths for CUDA-capable NVIDIA RTX GPUs versus other systems using DirectML, replacing single-path guidance with GPU-specific package recommendations for ONNX Runtime and related dependencies.

Changes

Cohort / File(s): Summary

Windows Installation Documentation
docs/source/getting_started/windows/_installation_with_olive.rst: Updated to include branched installation instructions distinguishing between CUDA (NVIDIA RTX GPUs) and DirectML (other systems) execution providers, with corresponding package installation commands and clarification of EP mapping.

Windows Examples Documentation
examples/windows/README.md, examples/windows/accuracy_benchmark/README.md: Expanded installation sections to include CUDA-specific branches with dedicated pip commands for NVIDIA RTX GPUs (RTX 40-series and above), while preserving existing DirectML instructions for non-CUDA systems.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~5 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 warning)

Title check (⚠️ Warning): The title is incomplete and appears truncated ('Sho' is cut off mid-word). It does not clearly convey the main change: updating Windows installation documentation to support both CUDA and DirectML installation options. Resolution: revise the title to be complete and descriptive, such as 'fix: Add CUDA installation option to Windows README and Olive setup docs'.

✅ Passed checks (3 passed)

Description Check (✅ Passed): Check skipped; CodeRabbit's high-level summary is enabled.
Docstring Coverage (✅ Passed): No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Security Anti-Patterns (✅ Passed): Pull request contains only documentation changes (RST and Markdown files) with no Python code modifications or security-sensitive patterns.


@ChenhanYu ChenhanYu marked this pull request as ready for review March 20, 2026 00:37
@ChenhanYu ChenhanYu requested a review from a team as a code owner March 20, 2026 00:37

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
examples/windows/README.md (1)

68-69: ⚠️ Potential issue | 🟡 Minor

Version mismatch across documentation files.

The DirectML package versions specified here (onnxruntime-directml==1.20.0 and onnxruntime-genai-directml>=0.4.0) differ from those in examples/windows/accuracy_benchmark/README.md (line 42: onnxruntime-directml==1.21.1 and onnxruntime-genai-directml==0.6.0). This inconsistency may confuse users following different examples.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/windows/README.md` around lines 68 - 69, The README contains
inconsistent DirectML package versions; update the two pip lines so the
onnxruntime-directml and onnxruntime-genai-directml versions match the versions
used in examples/windows/accuracy_benchmark/README.md (replace
onnxruntime-directml==1.20.0 with onnxruntime-directml==1.21.1 and
onnxruntime-genai-directml>=0.4.0 with onnxruntime-genai-directml==0.6.0) to
ensure both docs use the same package versions for onnxruntime-directml and
onnxruntime-genai-directml.
🧹 Nitpick comments (2)
examples/windows/README.md (1)

60-61: Consider pinning CUDA package versions for reproducibility.

The CUDA installation commands install onnxruntime-genai-cuda and onnxruntime without version specifications, while the DirectML path uses specific versions. This inconsistency could lead to compatibility issues when package versions drift.

📌 Suggested version pinning

Consider specifying versions for consistency and reproducibility:

-pip install onnxruntime-genai-cuda
-pip install onnxruntime
+pip install onnxruntime-genai-cuda==<version>
+pip install onnxruntime==<version>
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/windows/README.md` around lines 60 - 61, The README installs
onnxruntime-genai-cuda and onnxruntime without versions which can cause drift
and incompatibility; update the two pip lines so they pin explicit, compatible
versions (e.g., match the DirectML path's pinned versions or the tested
CUDA-compatible onnxruntime/onnxruntime-genai-cuda pair) to ensure reproducible
installs—modify the lines that currently reference onnxruntime-genai-cuda and
onnxruntime to include the chosen version specifiers.
examples/windows/accuracy_benchmark/README.md (1)

41-41: Missing version specifications for CUDA packages.

Similar to the main README, the CUDA installation lacks version pins while DirectML has specific versions. This asymmetry may lead to compatibility issues.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/windows/accuracy_benchmark/README.md` at line 41, The CUDA install
row in the README currently lists `pip install onnxruntime` and `pip install
onnxruntime-genai-cuda` without version pins; update the table row "Install ONNX
Runtime Packages (CUDA)" to include explicit version specifiers matching the
main README (e.g., the same pinned versions used there) by replacing the
unpinned package entries with the versioned packages for `onnxruntime` and
`onnxruntime-genai-cuda` so CUDA and DirectML installs are symmetrical and
compatible.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@examples/windows/accuracy_benchmark/README.md`:
- Line 42: Summary: The DirectML package versions differ between the
accuracy_benchmark README and other Windows installation docs, causing
confusion. Fix: choose a single canonical version policy (either standardize to
onnxruntime-directml==1.20.0 and onnxruntime-genai-directml>=0.4.0 or update all
docs to 1.21.1/0.6.0) or explicitly document why the accuracy_benchmark example
requires different versions; then update the three documents that mention these
packages (the accuracy_benchmark README entry containing
"onnxruntime-directml==1.21.1" / "onnxruntime-genai-directml==0.6.0", the
Windows README entry that lists "onnxruntime-directml==1.20.0" /
"onnxruntime-genai-directml>=0.4.0", and the Olive installation doc that lists
those same versions) so they either all use the chosen canonical versions or
include a short note explaining the divergence and required versions for that
example.

---

Outside diff comments:
In `@examples/windows/README.md`:
- Around line 68-69: The README contains inconsistent DirectML package versions;
update the two pip lines so the onnxruntime-directml and
onnxruntime-genai-directml versions match the versions used in
examples/windows/accuracy_benchmark/README.md (replace
onnxruntime-directml==1.20.0 with onnxruntime-directml==1.21.1 and
onnxruntime-genai-directml>=0.4.0 with onnxruntime-genai-directml==0.6.0) to
ensure both docs use the same package versions for onnxruntime-directml and
onnxruntime-genai-directml.

---

Nitpick comments:
In `@examples/windows/accuracy_benchmark/README.md`:
- Line 41: The CUDA install row in the README currently lists `pip install
onnxruntime` and `pip install onnxruntime-genai-cuda` without version pins;
update the table row "Install ONNX Runtime Packages (CUDA)" to include explicit
version specifiers matching the main README (e.g., the same pinned versions used
there) by replacing the unpinned package entries with the versioned packages for
`onnxruntime` and `onnxruntime-genai-cuda` so CUDA and DirectML installs are
symmetrical and compatible.

In `@examples/windows/README.md`:
- Around line 60-61: The README installs onnxruntime-genai-cuda and onnxruntime
without versions which can cause drift and incompatibility; update the two pip
lines so they pin explicit, compatible versions (e.g., match the DirectML path's
pinned versions or the tested CUDA-compatible onnxruntime/onnxruntime-genai-cuda
pair) to ensure reproducible installs—modify the lines that currently reference
onnxruntime-genai-cuda and onnxruntime to include the chosen version specifiers.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: e840ddd5-7f97-48ec-92d2-417e2b87511a

📥 Commits

Reviewing files that changed from the base of the PR and between 839fa3d and d19c748.

📒 Files selected for processing (3)
  • docs/source/getting_started/windows/_installation_with_olive.rst
  • examples/windows/README.md
  • examples/windows/accuracy_benchmark/README.md

| **Install PyTorch and Related Packages** | `pip install torch==2.7.0 torchvision==0.22.0 torchaudio==2.7.0 --index-url https://download.pytorch.org/whl/cu128` |
| **Install ONNX Runtime Packages** | `pip install onnxruntime-directml==1.21.1` <br> `pip install onnxruntime-genai-directml==0.6.0` |
| **Install ONNX Runtime Packages (CUDA)** | `pip install onnxruntime` <br> `pip install onnxruntime-genai-cuda` |
| **Install ONNX Runtime Packages (DirectML)** | `pip install onnxruntime-directml==1.21.1` <br> `pip install onnxruntime-genai-directml==0.6.0` |

⚠️ Potential issue | 🟡 Minor

DirectML versions differ from other documentation.

This file specifies onnxruntime-directml==1.21.1 and onnxruntime-genai-directml==0.6.0, while examples/windows/README.md (line 68-69) and docs/source/getting_started/windows/_installation_with_olive.rst (line 33-34) specify onnxruntime-directml==1.20.0 and onnxruntime-genai-directml>=0.4.0.

This version discrepancy needs clarification—either different examples require different versions (which should be documented), or one set of versions should be standardized across all documentation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/windows/accuracy_benchmark/README.md` at line 42, Summary: The
DirectML package versions differ between the accuracy_benchmark README and other
Windows installation docs, causing confusion. Fix: choose a single canonical
version policy (either standardize to onnxruntime-directml==1.20.0 and
onnxruntime-genai-directml>=0.4.0 or update all docs to 1.21.1/0.6.0) or
explicitly document why the accuracy_benchmark example requires different
versions; then update the three documents that mention these packages (the
accuracy_benchmark README entry containing "onnxruntime-directml==1.21.1" /
"onnxruntime-genai-directml==0.6.0", the Windows README entry that lists
"onnxruntime-directml==1.20.0" / "onnxruntime-genai-directml>=0.4.0", and the
Olive installation doc that lists those same versions) so they either all use
the chosen canonical versions or include a short note explaining the divergence
and required versions for that example.
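Whichever canonical policy is chosen, a small audit helper makes mismatched pins easy to spot across the docs in one pass. The `audit_ort_pins` function below is a hypothetical sketch, not part of the PR; the paths in the usage comment are the files changed here.

```shell
# Hypothetical audit sketch: collect every ONNX Runtime package pin from the
# docs so version mismatches (e.g. 1.20.0 vs 1.21.1, >=0.4.0 vs ==0.6.0)
# surface in a single sorted, de-duplicated list.
audit_ort_pins() {
    grep -rhoE 'onnxruntime(-genai)?(-directml|-cuda)?[=>]=[0-9][0-9.]*' "$1" | sort -u
}

# Usage against a checkout (paths from this PR):
#   audit_ort_pins docs/source/getting_started/windows
#   audit_ort_pins examples/windows
```

If both `onnxruntime-directml==1.20.0` and `onnxruntime-directml==1.21.1` appear in the output, the docs still disagree.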
