
Fridah/kinjal/vllm modelopt reload#1068

Draft
Fridah-nv wants to merge 1 commit into kinjal/vllm_modelopt_reload from fridah/kinjal/vllm_modelopt_reload

Conversation


Fridah-nv (Contributor) commented Mar 18, 2026

What does this PR do?

Type of change: ?

Enable vLLM fakequant export in `examples/llm_ptq`. Example command:

```bash
python ../llm_ptq/hf_ptq.py \
  --pyt_ckpt_path <MODEL_PATH> \
  --qformat nvfp4 \
  --calib_size 512 \
  --export_path <EXPORT_DIR> \
  --vllm_fakequant_export \
  --trust_remote_code
```
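For context on what the exported artifact is, here is a minimal sketch of saving and reloading a torch-serialized state file. Only the file name `vllm_fq_modelopt_state.pth` comes from this PR; the dictionary contents below are hypothetical stand-ins, and the `weights_only=True` reload follows the PR template's security guidance rather than the actual loader in `vllm_serve_fakequant.py`:

```python
import os
import tempfile

import torch

# Hypothetical stand-in for the exported fakequant state; the real
# contents of vllm_fq_modelopt_state.pth are defined by hf_ptq.py.
state = {"quant_cfg": "nvfp4", "amax": torch.tensor([0.5, 1.0])}

with tempfile.TemporaryDirectory() as export_dir:
    path = os.path.join(export_dir, "vllm_fq_modelopt_state.pth")
    torch.save(state, path)

    # Reload with weights_only=True, per the template's warning against
    # torch.load(..., weights_only=False) on untrusted files.
    restored = torch.load(path, weights_only=True)

print(sorted(restored))  # ['amax', 'quant_cfg']
```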

Usage

# Add a code snippet demonstrating how to use this

Testing

Before your PR is "Ready for review"

Make sure you read and follow Contributor guidelines and your commits are signed (git commit -s -S).

Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded trust_remote_code=True, torch.load(..., weights_only=False), pickle, etc.).

  • Is this change backward compatible?: ✅ / ❌ / N/A
  • If you copied code from any other sources or added a new PIP dependency, did you follow guidance in CONTRIBUTING.md: ✅ / ❌ / N/A
  • Did you write any new necessary tests?: ✅ / ❌ / N/A
  • Did you update Changelog?: ✅ / ❌ / N/A

Additional Information

Signed-off-by: Fridah-nv <201670829+Fridah-nv@users.noreply.github.com>

copy-pr-bot bot commented Mar 18, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.


coderabbitai bot commented Mar 18, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 758471f1-4c68-4561-ae86-6e93d047977d

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Comment @coderabbitai help to get the list of available commands and usage tips.


@Fridah-nv Fridah-nv changed the base branch from main to kinjal/vllm_modelopt_reload March 18, 2026 21:39
@realAsma realAsma requested a review from kinjalpatel27 March 19, 2026 13:34
Comment on lines +76 to +86

Alternatively, the dedicated `hf_ptq_export.py` script (**deprecated** — use `hf_ptq.py` with `--vllm_fakequant_export` instead) can be used for a simpler interface:

```bash
python hf_ptq_export.py \
--pyt_ckpt_path <MODEL_PATH> \
--quant_cfg NVFP4_DEFAULT_CFG \
--export_path <EXPORT_DIR> \
--trust_remote_code
```


We should remove hf_ptq_export.py; in my understanding, examples/llm_ptq/hf_ptq.py should be sufficient. Is that correct, @kinjalpatel27?


Yes, it's sufficient. We added hf_ptq_export.py to avoid overcrowding the hf_ptq.py example.

Comment on lines +1025 to +1031
```python
parser.add_argument(
    "--vllm_fakequant_export",
    default=False,
    action="store_true",
    help="Export as vLLM fake-quant checkpoint (produces vllm_fq_modelopt_state.pth "
    "for use with vllm_serve_fakequant.py).",
)
```

Move this to the end of the argument list.


Also, let's add a line to the hf_ptq.py README about this argument, pointing to the vllm_serve README's export section.
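The flag quoted in the hunk above is a standard argparse `store_true` option. A self-contained sketch of its behavior (the bare parser here is hypothetical; only the flag definition comes from the diff):

```python
import argparse

# Minimal parser carrying just the new flag from the hunk above.
parser = argparse.ArgumentParser()
parser.add_argument(
    "--vllm_fakequant_export",
    default=False,
    action="store_true",
    help="Export as vLLM fake-quant checkpoint.",
)

# store_true takes no value: omitted -> False (the default), passed -> True.
off = parser.parse_args([]).vllm_fakequant_export
on = parser.parse_args(["--vllm_fakequant_export"]).vllm_fakequant_export
print(off, on)  # False True
```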

Comment on lines +76 to +86

Alternatively, the dedicated `hf_ptq_export.py` script (**deprecated** — use `hf_ptq.py` with `--vllm_fakequant_export` instead) can be used for a simpler interface:

```bash
python hf_ptq_export.py \
--pyt_ckpt_path <MODEL_PATH> \
--quant_cfg NVFP4_DEFAULT_CFG \
--export_path <EXPORT_DIR> \
--trust_remote_code
```


Suggested change
Alternatively, the dedicated `hf_ptq_export.py` script (**deprecated** — use `hf_ptq.py` with `--vllm_fakequant_export` instead) can be used for a simpler interface:
```bash
python hf_ptq_export.py \
--pyt_ckpt_path <MODEL_PATH> \
--quant_cfg NVFP4_DEFAULT_CFG \
--export_path <EXPORT_DIR> \
--trust_remote_code
```

