Conversation
Signed-off-by: Kai Xu <kaix@nvidia.com>
Codecov Report — ✅ All modified and coverable lines are covered by tests.

| Coverage Diff | main | #1011 | +/- |
|---|---|---|---|
| Coverage | 72.12% | 70.27% | -1.86% |
| Files | 209 | 227 | +18 |
| Lines | 23628 | 25857 | +2229 |
| Hits | 17042 | 18170 | +1128 |
| Misses | 6586 | 7687 | +1101 |
Signed-off-by: Meng Xin <mxin@nvidia.com>
Added a separate `ptq` skill; it needs further tuning. Claude Opus can follow the skill, but Sonnet needs more guidance.
Copy nel-assistant skill as local evaluation skill so we can extend it to support optimized model evaluation requirements. Update modelopt orchestrator to reference the evaluation skill. Signed-off-by: Kai Xu <kaix@nvidia.com>
Add deployment skill (vLLM, SGLang, TRT-LLM serving) and update modelopt orchestrator to support three pipelines:
- PTQ only
- PTQ + Deploy (serve as API endpoint)
- PTQ + Evaluate (accuracy benchmark)

Signed-off-by: Kai Xu <kaix@nvidia.com>
Signed-off-by: Meng Xin <mxin@nvidia.com>
Thanks. The skills are still at an early stage, so it’d be great to get more people using them and giving feedback. Testing across a broader set of models and optimization recipes will help us iterate quickly and make the workflows more robust.
What does this PR do?
Type of change: ?
Adds a Claude Code skill suite for interactive model optimization with ModelOpt. The skills guide users through an end-to-end workflow: optimize the model with ModelOpt APIs, deploy it on vLLM and benchmark speed, and evaluate accuracy with NeMo Evaluator (nel).
Usage
Invoke the skill in Claude Code:
/ptq
Say which model you want to quantize and with what quantization spec, e.g. `nvfp4 mlp only`.
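As a rough illustration of how a free-form spec like `nvfp4 mlp only` could be split into a quantization format and a layer-scope hint, here is a minimal sketch. The parser, the `KNOWN_FORMATS` set, and the returned dict shape are all hypothetical; this is not the skill's actual implementation.

```python
# Hypothetical sketch: map a free-form quantization spec string
# (e.g. "nvfp4 mlp only") to a format name plus scope hints.
# Names here are illustrative, not the skill's real parser.

KNOWN_FORMATS = {"nvfp4", "fp8", "int8", "int4_awq"}

def parse_spec(spec: str) -> dict:
    """Split a spec string into a quantization format and scope hints."""
    tokens = spec.lower().split()
    fmt = next((t for t in tokens if t in KNOWN_FORMATS), None)
    # Remaining tokens (minus filler words) are treated as scope hints,
    # e.g. restricting quantization to MLP layers.
    scope = [t for t in tokens if t != fmt and t not in {"only", "layers", "with"}]
    return {"format": fmt, "scope": scope}

print(parse_spec("nvfp4 mlp only"))  # {'format': 'nvfp4', 'scope': ['mlp']}
```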
Slack Bot Setup
1. Create the Slack App
Go to api.slack.com/apps:
- Create an app named `modelopt-bot`
- Enable Socket Mode and generate an app-level token with the `connections:write` scope → save as `SLACK_APP_TOKEN`
- Add the bot token scopes `app_mentions:read`, `chat:write`, `im:history`, `im:read`
- Subscribe to the `app_mention` and `message.im` bot events
- Install the app and save the bot token as `SLACK_BOT_TOKEN`

2. Set up environment
cd slack-bot
pip install -r requirements.txt
export SLACK_BOT_TOKEN="xoxb-..."
export SLACK_APP_TOKEN="xapp-..."
export SKILLS_CWD="/path/to/modelopt_agent"
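A quick sanity check on these variables before launching can save a confusing startup failure. The sketch below validates the three variables exported above; the `check_env` helper is illustrative and not part of the actual bot.

```python
# Minimal sketch of validating the environment the bot expects.
# Variable names match the setup step above; the helper itself
# is an illustration, not code from the repository.
import os

REQUIRED = {
    "SLACK_BOT_TOKEN": "xoxb-",   # bot token from the app install
    "SLACK_APP_TOKEN": "xapp-",   # app-level token for Socket Mode
}

def check_env(env: dict) -> list:
    """Return a list of problems; an empty list means the env looks usable."""
    problems = []
    for name, prefix in REQUIRED.items():
        value = env.get(name, "")
        if not value:
            problems.append(f"{name} is not set")
        elif not value.startswith(prefix):
            problems.append(f"{name} should start with '{prefix}'")
    if not env.get("SKILLS_CWD"):
        problems.append("SKILLS_CWD is not set")
    return problems

print(check_env({"SLACK_BOT_TOKEN": "xoxb-abc",
                 "SLACK_APP_TOKEN": "xapp-abc",
                 "SKILLS_CWD": "/tmp"}))  # []
```

Calling `check_env(dict(os.environ))` at the top of a launch script reports all problems at once instead of failing on the first missing token.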
3. Run the bot
python bot.py
Expected output:
Starting ModelOpt Slack Bot...
Skills directory: /path/to/modelopt_agent
Found skills: ptq, deployment, evaluation, modelopt
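The `Found skills: ...` line suggests the bot discovers skills by scanning `SKILLS_CWD`. A plausible sketch of such discovery is below; the assumption that each skill is a subdirectory containing a `SKILL.md` file follows the Claude Code skill convention but is not confirmed by this PR.

```python
# Hypothetical sketch of skill discovery: scan the skills directory
# for subfolders containing a SKILL.md file. The SKILL.md layout is
# an assumption, not taken from this repository.
from pathlib import Path
import tempfile

def find_skills(skills_dir: str) -> list:
    """Return sorted names of subdirectories that contain SKILL.md."""
    root = Path(skills_dir)
    return sorted(p.name for p in root.iterdir()
                  if p.is_dir() and (p / "SKILL.md").exists())

# Usage with a throwaway directory:
with tempfile.TemporaryDirectory() as d:
    for name in ["ptq", "deployment"]:
        skill = Path(d) / name
        skill.mkdir()
        (skill / "SKILL.md").write_text("# skill\n")
    print(find_skills(d))  # ['deployment', 'ptq']
```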
4. Test in Slack
- `hello`
- `@modelopt quantize Qwen3-0.6B with fp8`
- `@modelopt deploy ./qwen3-0.6b-fp8`
- `@modelopt evaluate my model on mmlu`

Testing
Before your PR is "Ready for review"
- Make sure you read and follow Contributor guidelines and your commits are signed (`git commit -s -S`).
- Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded `trust_remote_code=True`, `torch.load(..., weights_only=False)`, `pickle`, etc.).
- CONTRIBUTING.md: ✅ / ❌ / N/A

Additional Information