fix(logging): partition raw rewards for correct samples#1996

Open
Jiang020609 wants to merge 1 commit into THUDM:main from Jiang020609:fix/log-correct-samples-raw-reward-partition

@Jiang020609

Contributor

Fixes #1784.

Summary:
- Partition `raw_reward` alongside `total_lengths` in `process_rollout_data` so that `--log-correct-samples` sees per-rank lists.
- Preserve `global_raw_reward` for `--log-passrate` and exclude it from the generic rollout metrics.
- Add CPU tests for partitioning and passrate preservation.

Tests:
- `.venv\Scripts\python.exe -m pytest tests\test_process_rollout_data.py`
- `.venv\Scripts\python.exe -m ruff check slime\utils\data.py slime\backends\megatron_utils\data.py tests\test_process_rollout_data.py`
- `.venv\Scripts\pre-commit.exe run --files slime\utils\data.py slime\backends\megatron_utils\data.py tests\test_process_rollout_data.py`
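For readers unfamiliar with the change, here is a minimal sketch of the idea, not the code in this PR. The helper name `partition_rollout_fields`, the strided split, and the dict layout are all assumptions made for illustration; the actual partitioning in `process_rollout_data` may differ. It only shows that every per-sample list is sliced the same way for a rank, while an unpartitioned copy of the rewards is kept under `global_raw_reward`.

```python
from typing import Any


def partition_rollout_fields(
    data: dict[str, list[Any]],
    dp_rank: int,
    dp_size: int,
    per_sample_keys: tuple[str, ...] = ("total_lengths", "raw_reward"),
) -> dict[str, list[Any]]:
    """Return this rank's slice of every per-sample field (hypothetical helper)."""
    out = dict(data)
    if "raw_reward" in data:
        # Keep the full-batch rewards for metrics that need every sample.
        out["global_raw_reward"] = list(data["raw_reward"])
    for key in per_sample_keys:
        if key in data:
            # Strided split is an assumption; the real scheme may differ.
            out[key] = data[key][dp_rank::dp_size]
    return out


rollout = {
    "total_lengths": [10, 20, 30, 40],
    "raw_reward": [1.0, 0.0, 1.0, 1.0],
}
rank1 = partition_rollout_fields(rollout, dp_rank=1, dp_size=2)
```

With both fields sliced identically, per-rank code that indexes `raw_reward` by the positions of `total_lengths` stays within bounds, and a passrate-style metric can still read the whole batch from `global_raw_reward`.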

Development

Successfully merging this pull request may close these issues.

[Bug] IndexError in log_rollout_data when --log-correct-samples enabled with DP > 1
