[paddle-adapt] gemm: adapt tests/gemm/test_group_gemm.py, test_mm_bf16.py, test_bmm_bf16.py by BingooYang · Pull Request #17 · PFCCLab/flashinfer

BingooYang · 2026-05-15T04:38:20Z

📌 Description

Adapt three GEMM tests to run under paddle.enable_compat() mode.

Changes

Cherry-picked base adaptations from adapt/gemm_bf16 (§35 torch.device keyword arg fix, conftest.py patches)
tests/gemm/test_mm_bf16.py: torch.device(device="cuda") → torch.device("cuda") (§35)
tests/gemm/test_bmm_bf16.py: same fix (§35)
tests/gemm/test_group_gemm.py: no code changes needed - passes with sm80 backend as-is
Added 3 representative CI cases to scripts/paddle_all_test_cases.sh
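
The §35 fix listed above is mechanical, so it can be sketched as a small source rewrite. The helper name `positionalize_device_kwarg` and the regex are illustrative assumptions, not code from this PR, and the pattern only handles simple arguments with no nested parentheses or commas.

```python
import re

# Matches torch.device(device=<arg>) where <arg> has no parentheses or commas.
# Hypothetical pattern for illustration; not taken from the PR.
_DEVICE_KWARG = re.compile(r"torch\.device\(\s*device\s*=\s*([^(),]+?)\s*\)")


def positionalize_device_kwarg(source: str) -> str:
    """Rewrite torch.device(device=X) to torch.device(X) in a source string."""
    return _DEVICE_KWARG.sub(r"torch.device(\1)", source)


before = 'x = torch.randn(4, device=torch.device(device="cuda"))'
after = positionalize_device_kwarg(before)
print(after)
```

Calls that already pass the device positionally are left untouched, so running the rewrite twice is harmless.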

Test Results

File	PASS	SKIP	FAIL	Notes
test_group_gemm.py (sm80)	288	36	0	sm90 SKIP: SM100 device does not support sm90 GEMM
test_mm_bf16.py (non-cudnn)	1081	3870	450	FAIL: §47 env issue (multiple libcudart libraries, .so.12 and .so.13, found in cudnn/auto backends)
test_bmm_bf16.py (cutlass)	32	0	112	FAIL: same §47 env issue in auto+float32 cases

Known Non-Paddle Issues

§47: RuntimeError: Multiple libcudart libraries found — environment-level CUDA version conflict in cudnn and auto backends. Not fixable in Python.
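
To confirm a §47-style conflict on a given machine, one option is to list which libcudart files the test process has mapped. The sketch below is a diagnostic under stated assumptions: it is Linux-only (it reads `/proc/self/maps`), the function names are hypothetical, and it is not the check that raises the `RuntimeError` above.

```python
import re
from pathlib import Path

# A mapped-file path ending in libcudart.so, optionally followed by version digits.
_LIBCUDART = re.compile(r"(\S*libcudart\.so(?:\.\d+)*)\s*$")


def libcudart_paths(maps_text: str) -> list[str]:
    """Return the distinct libcudart paths found in a /proc/<pid>/maps dump."""
    found: list[str] = []
    for line in maps_text.splitlines():
        match = _LIBCUDART.search(line)
        if match and match.group(1) not in found:
            found.append(match.group(1))
    return found


def loaded_libcudart() -> list[str]:
    """List libcudart libraries mapped into the current process (Linux only)."""
    maps = Path("/proc/self/maps")
    return libcudart_paths(maps.read_text()) if maps.exists() else []


if __name__ == "__main__":
    # Call after importing the backends under test; more than one entry
    # suggests two CUDA runtimes are loaded side by side.
    print(loaded_libcudart())
```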

🔍 Related Issues

Part of the Paddle compatibility adaptation series.

🚀 Pull Request Checklist

✅ Pre-commit Checks

pre-commit run --all-files: all checks passed

🧪 Tests

Tests added to scripts/paddle_all_test_cases.sh
Regression: norm PASS (102+35 cases), comm PASS

Reviewer Notes

test_group_gemm.py required zero code changes — sm80 backend passes with base conftest.py patches alone. cudnn/auto failures are §47 env issue, unrelated to Paddle adaptation.

- Add paddle.enable_compat() and monkey-patches to tests/conftest.py:
  - Stream.cuda_stream property (paddle uses __cuda_stream__() returning tuple)
  - torch.cuda.current_blas_handle (paddle.cuda lacks this API)
- Fix torch.device(device=...) -> torch.device(...) across test files
- Add __is_paddle_compatible_library__ = True to flashinfer/__init__.py
- Add use_paddle_compatible_api() helper to flashinfer/utils.py
- Make flashinfer/triton imports optional (triton may not be available)
- Add _CudaOutOfMemoryError sentinel in flashinfer/autotuner.py
- Fix _get_cuda_stream() in cutlass/torch.py for paddle compat
- Rename package to flashinfer-python-paddle in pyproject.toml

Test results:
- test_group_gemm.py: 288 passed, 360 skipped
- test_mm_bf16.py: 1081 passed (cudnn/auto failures due to libcudart env conflict)
- test_bmm_bf16.py: 32 passed (cudnn/auto failures due to libcudart env conflict)

Known limitations (not adaptation issues):
- cudnn/auto backend: libcudart.so.12 vs .13 conflict (environment issue)
- res_dtype != bfloat16: paddle tensor copy between different dtypes not supported
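
The Stream.cuda_stream patch described in this commit can be illustrated with a stand-in class. This is a minimal sketch, not the conftest.py code: `FakeStream` and `add_cuda_stream_property` are hypothetical names, and the tuple layout (version first, raw handle second) is an assumption about what `__cuda_stream__()` returns.

```python
class FakeStream:
    """Stand-in for a stream type that only exposes __cuda_stream__()."""

    def __init__(self, handle: int) -> None:
        self._handle = handle

    def __cuda_stream__(self):
        # Assumed layout: (protocol_version, raw_stream_handle).
        return (0, self._handle)


def add_cuda_stream_property(stream_cls) -> None:
    """Attach a torch-style `cuda_stream` property if the class lacks one."""
    if hasattr(stream_cls, "cuda_stream"):
        return
    # Read the handle lazily so the property always reflects the stream it is called on.
    stream_cls.cuda_stream = property(lambda self: self.__cuda_stream__()[1])


add_cuda_stream_property(FakeStream)
stream = FakeStream(0x1234)
print(hex(stream.cuda_stream))
```

Guarding with `hasattr` keeps the patch idempotent, which matters when conftest.py is imported more than once in a session.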

…m_bf16 under paddle compat

- test_group_gemm.py: sm80 backend 288 PASS, 36 SKIP (batch_size*rows>8192); sm90 SKIP (SM100 device, no sm90 GEMM support); zero code changes needed
- test_mm_bf16.py: adapted via §35 fix (torch.device kwarg -> positional); cutlass/tgv/cublaslt/tinygemm backends pass; cudnn/auto-float32 FAIL due to §47 env issue (Multiple libcudart.so.12 vs .so.13)
- test_bmm_bf16.py: adapted via §35; cutlass backend pass; auto+float32 FAIL due to §47
- Regression: norm PASS (102+35 cases), comm PASS, cherry-picked base fixes from c11b6f55

Refs: adaptation-paddle/adaptation_exp.md §35 §47

Your Name added 3 commits May 15, 2026 10:51

style: fix ruff F401 and formatting (pre-commit auto-fix)

3af530f

- Replace try/import paddle with importlib.util.find_spec() in utils.py
- Apply ruff-format to 5 modified files
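
The switch to importlib.util.find_spec() avoids importing a module just to test for its presence, which is also what clears the ruff F401 unused-import warning. A minimal sketch of that pattern follows; `is_module_available` is a hypothetical name, and the actual helper in utils.py is not shown in this PR page.

```python
import importlib.util


def is_module_available(name: str) -> bool:
    """Report whether `name` can be imported, without importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec can raise when a parent package is missing or a module
        # has no usable __spec__; treat both cases as "not available".
        return False


HAS_PADDLE = is_module_available("paddle")
print(HAS_PADDLE)
```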

BingooYang force-pushed the adapt/gemm_all branch from a7265ce to 3af530f Compare May 15, 2026 06:33

BingooYang wants to merge 3 commits into PFCCLab:0.6 from BingooYang:adapt/gemm_all