Skip to content

Add test coverage for Muon muon_lr/adam_lr overrides#8047

Open
sowndappan5 wants to merge 3 commits into
deepspeedai:masterfrom
sowndappan5:master
Open

Add test coverage for Muon muon_lr/adam_lr overrides#8047
sowndappan5 wants to merge 3 commits into
deepspeedai:masterfrom
sowndappan5:master

Conversation

@sowndappan5
Copy link
Copy Markdown

Summary

Add coverage for separate learning rate overrides in the Muon optimizer path and fix the related Muon blog documentation.

Background

Muon parameters and non-Muon parameters are automatically split into separate optimizer groups. The intended behavior is:

  • muon_lr applies to Muon parameter groups
  • adam_lr applies to Adam parameter groups
  • lr remains the fallback for both groups when overrides are not provided

Changes

  • add a parameterized test covering:
    • legacy lr fallback behavior
    • separate muon_lr / adam_lr override behavior
  • fix the Muon blog table header to label muon_lr and adam_lr correctly

Validation

Ran:
python -m pytest DeepSpeed/tests/unit/ops/muon/test_muon_partial_training.py -k learning_rate_overrides -q -rs

Result:

  • test collected successfully
  • skipped locally because this distributed test requires 2 GPUs, while the local environment has 1 GPU

Comment thread blogs/muon-optimizer/README.md Outdated
### Evaluation Results

| Optimizer | Learning Rate | adam_lr (for Muon) | MBPP | MBPP+ | MMLU | GSM8K |
| Optimizer | muon_lr | adam_lr | MBPP | MBPP+ | MMLU | GSM8K |
Copy link
Copy Markdown
Collaborator

@delock delock Jun 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like change the title this way does not consistent with table content down below, i.e. AdamW learning rate is not muon learning rate. Does this file have to be modified?

@sowndappan5
Copy link
Copy Markdown
Author

sowndappan5 commented Jun 5, 2026

I removed the README table change and updated the PR to keep it focused on the test coverage for the existing muon_lr / adam_lr behavior.

@delock
Copy link
Copy Markdown
Collaborator

delock commented Jun 5, 2026

Hi @sowndappan5 thanks for your new test cases. Can you address the comments in README.md and also fix the DCO tests by sign-off your commits? Thanks!

@sowndappan5
Copy link
Copy Markdown
Author

Thanks, I’ve addressed the README feedback and pushed the update. I’m fixing the DCO issue now by signing off the commits.

Signed-off-by: Sowndappan S <147894621+sowndappan5@users.noreply.github.com>
Signed-off-by: Sowndappan S <147894621+sowndappan5@users.noreply.github.com>
@sowndappan5
Copy link
Copy Markdown
Author

I’ve addressed the README feedback and force-pushed signed-off commits to fix the DCO issue. The PR is now updated and pending review/CI.

@delock delock enabled auto-merge (squash) June 5, 2026 12:46
auto-merge was automatically disabled June 5, 2026 13:08

Head branch was pushed to by a user without write access

…date README contributors section

Signed-off-by: Sowndappan S <147894621+sowndappan5@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants