Skip to content

ci: Temp disable megatron lora grpo tests#2062

Closed
chtruong814 wants to merge 1 commit intomainfrom
chtruong/disable-grpo-lora-test
Closed

ci: Temp disable megatron lora grpo tests#2062
chtruong814 wants to merge 1 commit intomainfrom
chtruong/disable-grpo-lora-test

Conversation

@chtruong814
Copy link
Copy Markdown
Contributor

@chtruong814 chtruong814 commented Mar 4, 2026

What does this PR do ?

Temp disable megatron lora grpo tests

These tests were failing and a fix is being worked on. In the mean time, let's skip the tests.
#1889

Issues

List issues that this PR closes (syntax):

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
  • Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Additional Information

  • ...

Summary by CodeRabbit

  • Tests
    • Disabled GPU functional tests for grpo_megatron_lora and grpo_megatron_lora_async configurations in the L1 functional test suite.

Signed-off-by: Charlie Truong <chtruong@nvidia.com>
@chtruong814 chtruong814 requested a review from a team as a code owner March 4, 2026 17:51
@chtruong814 chtruong814 added CI:L1 Run doctests, unit tests, and functional tests CI:docs Run doctest and removed CI:L1 Run doctests, unit tests, and functional tests labels Mar 4, 2026
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Mar 4, 2026

📝 Walkthrough

Walkthrough

This PR disables two GPU functional tests by commenting out their execution commands in a test script. Changes are minimal and affect only test invocations with no functional logic modifications.

Changes

Cohort / File(s) Summary
Test Disabling
tests/functional/L1_Functional_Tests_GPU.sh
Commented out execution of grpo_megatron_lora.sh and grpo_megatron_lora_async.sh tests.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

Possibly related PRs

Suggested labels

CI:Lfast

Suggested reviewers

  • yuki-97
🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely describes the main change: temporarily disabling megatron lora grpo tests in the CI pipeline.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Test Results For Major Changes ✅ Passed PR contains only minor CI configuration changes—temporarily disabling test invocations due to known failures being addressed separately. No feature additions, breaking changes, or algorithmic modifications.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch chtruong/disable-grpo-lora-test

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/functional/L1_Functional_Tests_GPU.sh`:
- Around line 55-56: Restore the two commented-out run_test lines for
./tests/functional/grpo_megatron_lora.sh and
./tests/functional/grpo_megatron_lora_async.sh and wrap them with an explicit
gate (e.g., check SKIP_L1_GRPO_MEGATRON_LORA environment variable or a
--skip-l1-grpo flag) so the tests remain in CI but can be skipped; when
skipping, emit a clear log message that includes the tracking PR/issue
identifier (e.g., "Skipping L1 GRPO+Megatron+LoRA tests per PR#1234/issue#5678")
so re-enable criteria are visible. Ensure the gate logic is placed where the
original run_test invocations were removed and that the skip variable defaults
to false so coverage runs unless explicitly set.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 26743456-3b95-440f-878a-c2ee6df553d1

📥 Commits

Reviewing files that changed from the base of the PR and between bd7f236 and d0b5daf.

📒 Files selected for processing (1)
  • tests/functional/L1_Functional_Tests_GPU.sh

Comment on lines +55 to +56
# run_test uv run --no-sync bash ./tests/functional/grpo_megatron_lora.sh
# run_test uv run --no-sync bash ./tests/functional/grpo_megatron_lora_async.sh
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Avoid silently removing unique CI coverage; gate these tests with an explicit temporary switch.

Line 55 and Line 56 currently remove the only L1 coverage for GRPO+Megatron+LoRA (including async), which means these paths won’t run in the main CI flow (.github/workflows/cicd-main.yml:310-327, plus main/scheduled L1 behavior in lines 100-120 of that workflow). Please keep these invocations in place behind an explicit skip flag and log the tracking PR/issue so re-enable criteria stays visible.

Proposed change
-# run_test      uv run --no-sync bash ./tests/functional/grpo_megatron_lora.sh
-# run_test      uv run --no-sync bash ./tests/functional/grpo_megatron_lora_async.sh
+if [[ "${DISABLE_GRPO_MEGATRON_LORA_TESTS:-1}" == "1" ]]; then
+    echo "Temporarily skipping GRPO Megatron LoRA tests (tracking fix: PR `#1889`)"
+else
+    run_test      uv run --no-sync bash ./tests/functional/grpo_megatron_lora.sh
+    run_test      uv run --no-sync bash ./tests/functional/grpo_megatron_lora_async.sh
+fi
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# run_test uv run --no-sync bash ./tests/functional/grpo_megatron_lora.sh
# run_test uv run --no-sync bash ./tests/functional/grpo_megatron_lora_async.sh
if [[ "${DISABLE_GRPO_MEGATRON_LORA_TESTS:-1}" == "1" ]]; then
echo "Temporarily skipping GRPO Megatron LoRA tests (tracking fix: PR `#1889`)"
else
run_test uv run --no-sync bash ./tests/functional/grpo_megatron_lora.sh
run_test uv run --no-sync bash ./tests/functional/grpo_megatron_lora_async.sh
fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/functional/L1_Functional_Tests_GPU.sh` around lines 55 - 56, Restore
the two commented-out run_test lines for
./tests/functional/grpo_megatron_lora.sh and
./tests/functional/grpo_megatron_lora_async.sh and wrap them with an explicit
gate (e.g., check SKIP_L1_GRPO_MEGATRON_LORA environment variable or a
--skip-l1-grpo flag) so the tests remain in CI but can be skipped; when
skipping, emit a clear log message that includes the tracking PR/issue
identifier (e.g., "Skipping L1 GRPO+Megatron+LoRA tests per PR#1234/issue#5678")
so re-enable criteria are visible. Ensure the gate logic is placed where the
original run_test invocations were removed and that the skip variable defaults
to false so coverage runs unless explicitly set.

@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented Apr 23, 2026

Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@terrykong terrykong closed this Apr 23, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CI:docs Run doctest

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants