[paddle-adapt] comm/test_dcp_alltoall: 29 PASS with assert_close Paddle compat patch (#25) #26
Merged
…le compat patch
- §44/§45: torch.testing.assert_close bfloat16/float16 isclose kernel not registered in Paddle compat
- §52: Paddle compat wraps ALL assert_close internal errors with "resulted in the unexpected exception above" (not just bfloat16/float16); fix: check this outer message first before dtype-specific conditions
- §46: torch.equal returns Tensor not bool in Paddle compat
- §47: tensor.multiply(scalar) does not accept Python scalar
- §48: tensor.clamp_min/clamp_max aliases missing

Skipped tests (multiprocessing/MPI/MNNVL/NVSHMEM, too complex): test_all_gather_matmul.py, test_allreduce*.py, test_mixed_comm.py, test_trtllm_allreduce*.py, test_mnnvl_*.py, test_nvshmem*.py, test_vllm_custom_allreduce.py
Regression: all previous PASS cases still pass
Refs: MISMATCH_EXPERIMENT §52
Description
Adapt the comm tests for Paddle compat mode. Only test_dcp_alltoall (the single-GPU DCP LL128 FIFO All-to-All test) is adaptable. All multiprocessing/MPI/MNNVL/NVSHMEM tests are skipped as too complex.
Changes
Adaptation Points
§44/§45 - torch.testing.assert_close: the bfloat16/float16 isclose kernel is not registered in Paddle compat. Fix: catch the resulting error and fall back to a numpy-based allclose.
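§52 (NEW) - Paddle compat wraps ALL assert_close internal errors with "resulted in the unexpected exception above" (not just bfloat16/float16), so float32 comparisons are affected too. Fix: check for this outer message first, before the dtype-specific conditions.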
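The fallback order described in §44/§45 and §52 can be sketched roughly as below. This is a minimal illustration, not the patch itself: the helper names (`is_compat_wrapped_error`, `assert_close_with_fallback`), the `assert_close_fn` and `to_numpy` parameters, and the choice to catch any `Exception` are all assumptions made so the sketch runs without a GPU framework installed. Only the marker string is taken from this PR.

```python
import numpy as np

# Outer message quoted in this PR; per §52 it wraps every internal
# assert_close error in Paddle compat, regardless of dtype.
WRAPPED_MARKER = "resulted in the unexpected exception above"


def is_compat_wrapped_error(exc):
    """Return True if the exception text carries the outer wrapper message."""
    return WRAPPED_MARKER in str(exc)


def assert_close_with_fallback(actual, expected, assert_close_fn, to_numpy,
                               *, rtol, atol):
    """Try the framework comparison first, then fall back to NumPy.

    assert_close_fn: hypothetical stand-in for torch.testing.assert_close.
    to_numpy: hypothetical converter from a tensor to a NumPy array.
    """
    try:
        assert_close_fn(actual, expected, rtol=rtol, atol=atol)
        return
    except Exception as exc:
        # Check the outer message first. Anything without it (for example a
        # genuine AssertionError from a real mismatch) is re-raised untouched.
        if not is_compat_wrapped_error(exc):
            raise

    # Fallback: compare in float64 on the host with NumPy.
    a = np.asarray(to_numpy(actual), dtype=np.float64)
    b = np.asarray(to_numpy(expected), dtype=np.float64)
    np.testing.assert_allclose(a, b, rtol=rtol, atol=atol)
```

In a real test, `assert_close_fn` would presumably be `torch.testing.assert_close`, and `to_numpy` something that upcasts the tensor to float32 before converting. Whether that conversion works unchanged under Paddle compat is not confirmed here.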
§46 - torch.equal returns a Tensor, not a bool. §47 - tensor.multiply(scalar) does not accept a Python scalar. §48 - tensor.clamp_min/clamp_max aliases are missing.
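One way to work around §46 to §48 in test code is with small duck-typed helpers like the following. These are hypothetical and not taken from the patch. They assume that `bool()` works on the single-element result of `torch.equal`, that `tensor * scalar` is supported, and that `tensor.clamp(min=..., max=...)` is available in Paddle compat; none of these assumptions is confirmed by this PR.

```python
def equal_as_bool(equal_fn, a, b):
    """§46: coerce the comparison result to a plain Python bool.

    equal_fn is a stand-in for torch.equal, which per this PR returns a
    Tensor rather than a bool under Paddle compat.
    """
    return bool(equal_fn(a, b))


def scale(tensor, scalar):
    """§47: use the * operator instead of tensor.multiply(scalar)."""
    return tensor * scalar


def clamp_min(tensor, lower):
    """§48: replacement for the missing tensor.clamp_min alias."""
    return tensor.clamp(min=lower)


def clamp_max(tensor, upper):
    """§48: replacement for the missing tensor.clamp_max alias."""
    return tensor.clamp(max=upper)
```

Keeping the helpers free of framework imports means the same call sites can run against either backend. The trade-off is that nothing here checks the assumptions above until the tests actually run.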
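Skipped Tests (too complex)
Multiprocessing/MPI/MNNVL/NVSHMEM tests: test_all_gather_matmul.py, test_allreduce*.py, test_mixed_comm.py, test_trtllm_allreduce*.py, test_mnnvl_*.py, test_nvshmem*.py, test_vllm_custom_allreduce.py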
Test Results
comm/test_dcp_alltoall: 29 PASS. Regression: all previous PASS cases still pass.
Checklist
Refs: MISMATCH_EXPERIMENT §52