[Others] update flash mask version #7819
Thanks for your contribution! |
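CI report generated from the following code (updated every 30 minutes):
1 Task overview: 2 required tasks have failed and must be resolved before this PR can be merged.
2 Task status summary
2.1 Required tasks: 8/10 passed
2.2 Optional tasks: 30/32 passed
3 Failure details (required only)
Run FastDeploy Unit Tests and Coverage / run_tests_with_coverage: coverage below threshold (confidence: high)
Failed cases: none (all unit tests passed; the coverage check failed) Root cause details: Key logs: Fix suggestions:
Fix suggestion summary: add unit tests for flash_attn_backend.py L88-99 or apply for an exemption Related changes:
Approval: code conventions (confidence: high)
Root cause details: Fix suggestions:
Fix suggestion summary: ask xyxinyang or zyyzghb to approve this PR Related changes: the PR added |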
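One way to approach the coverage suggestion above is to exercise the import-fallback path without the real package installed. The sketch below is only an illustration: `try_import_fa4` is a hypothetical stand-in for the import logic in flash_attn_backend.py (the actual code at L88-99 is not shown in this conversation), and the assumption is that those lines are the new `paddlefleet.ops` import branch. It relies on two standard-library facts: a `None` entry in `sys.modules` makes the import raise `ModuleNotFoundError`, and `unittest.mock.patch.dict` restores `sys.modules` afterwards.

```python
import sys
import types
import unittest
from unittest import mock


def try_import_fa4():
    """Hypothetical stand-in for the import logic under test.

    Returns the flashmask attention callable, or None when it cannot be imported.
    """
    try:
        from paddlefleet.ops import is_flash_mask_available

        if not is_flash_mask_available():
            return None
        from paddlefleet.ops.flash_mask.cute.interface import flashmask_attention

        return flashmask_attention
    except ImportError:  # also covers ModuleNotFoundError
        return None


class TestFlashMaskFallback(unittest.TestCase):
    def test_missing_package_falls_back(self):
        # A None entry in sys.modules forces the import to fail.
        blocked = {"paddlefleet": None, "paddlefleet.ops": None}
        with mock.patch.dict(sys.modules, blocked):
            self.assertIsNone(try_import_fa4())

    def test_unavailable_flag_falls_back(self):
        # Fake modules whose availability check reports False.
        fake_ops = types.ModuleType("paddlefleet.ops")
        fake_ops.is_flash_mask_available = lambda: False
        fake_pkg = types.ModuleType("paddlefleet")
        fake_pkg.__path__ = []  # mark the fake parent as a package
        fake_pkg.ops = fake_ops
        fakes = {"paddlefleet": fake_pkg, "paddlefleet.ops": fake_ops}
        with mock.patch.dict(sys.modules, fakes):
            self.assertIsNone(try_import_fa4())
```

Saved as a test module, this can be run with `python -m unittest`. A real test would import and call the backend's own loader rather than the stand-in.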
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## develop #7819 +/- ##
==========================================
Coverage ? 63.33%
==========================================
Files ? 462
Lines ? 64371
Branches ? 9872
==========================================
Hits ? 40769
Misses ? 20835
Partials ? 2767
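CI report generated from the following code (updated every 30 minutes):
1 Task overview: ⏳ Required tasks in progress: 5 running, 1 pending; no required failures so far.
2 Task status summary
2.1 Required tasks: 3/9 passed
2.2 Optional tasks: 24/28 passed
3 Failure details (required only): no required tasks have failed. |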
PaddlePaddle-bot
left a comment
🤖 Paddle-CI-Agent | pr_review |
2026-05-19 22:59:58
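📋 Review summary
PR overview: migrates flash mask from a standalone package to paddlefleet.ops and removes the old pinned-version dependency from requirements.txt.
Scope of changes: fastdeploy/model_executor/layers/attention/flash_attn_backend.py, requirements.txt
Impact tag: [OP]
Issues
| Severity | File | Summary |
|---|---|---|
| 🔴 Bug | flash_attn_backend.py:102 | fa4 is not assigned in the inner exception branch, yet the outer code still uses it, causing a NameError crash |
📝 PR convention check
The title format is compliant, all required sections of the PR description are filled in, and the checklist is ticked; the PR meets the conventions. ✓
Overall assessment
The PR's intent is clear and the dependency migration is reasonable. However, the newly added nested exception handling contains a definite variable-scope bug: when flash_mask is unavailable, fa4 is never assigned, and the outer code's subsequent access raises a NameError that crashes the process. This must be fixed before merging.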
logger.info(f"The current platform[sm{get_sm_version()}] can't import Flash Attention V4.")

global flashmask_attention_v4
flashmask_attention_v4 = fa4
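🔴 Bug: fa4 is not assigned in the exception branch, but the outer code still accesses it
When is_flash_mask_available() returns False, or when an ImportError/ModuleNotFoundError is caught by the inner except, fa4 is never assigned. The statement flashmask_attention_v4 = fa4 on line 102 of the outer try block then raises NameError: name 'fa4' is not defined. The outer except ImportError cannot catch a NameError, so the program crashes, which makes the behavior when flash_mask is entirely unavailable worse than before.
Suggested fix: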
try:
    paddle.enable_compat(scope={"cutlass"})
    fa4 = None
    try:
        from paddlefleet.ops import is_flash_mask_available
        if is_flash_mask_available():
            from paddlefleet.ops.flash_mask.cute.interface import (
                flashmask_attention as fa4,
            )
        else:
            raise ModuleNotFoundError("flash_mask not available.")
    except (ImportError, ModuleNotFoundError):
        logger.info(f"The current platform[sm{get_sm_version()}] can't import Flash Attention V4.")
    if fa4 is not None:
        global flashmask_attention_v4
        flashmask_attention_v4 = fa4
        FLASH_ATTN_VERSION = 4
        logger.info("The current platform supports Flash Attention V4.")
except ImportError:
    logger.info(f"The current platform[sm{get_sm_version()}] can't import Flash Attention V4.")
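The failure mode described in this review comment can be reproduced in isolation. The following is a minimal sketch, not the code from flash_attn_backend.py: `load_unguarded`, `load_guarded`, `missing_flash_mask`, and the `importer` parameter are hypothetical names standing in for the real import statements. Inside a function the unbound access raises UnboundLocalError, which is a subclass of NameError; at module level it would be a plain NameError. Either way, `except ImportError` does not catch it.

```python
def load_unguarded(importer):
    """Mirrors the reported pattern: fa4 is bound only if the inner import succeeds."""
    try:
        try:
            fa4 = importer()
        except ImportError:  # also covers ModuleNotFoundError, a subclass
            print("can't import Flash Attention V4")
        # If the import failed, fa4 was never bound, so this line raises
        # UnboundLocalError (a NameError subclass).
        return fa4
    except ImportError:
        # Not reached for the error above: NameError is not an ImportError.
        return None


def load_guarded(importer):
    """Sentinel pattern from the suggested fix: bind fa4 before the inner try."""
    fa4 = None
    try:
        fa4 = importer()
    except ImportError:
        print("can't import Flash Attention V4")
    return fa4  # None signals that the kernel is unavailable


def missing_flash_mask():
    """Hypothetical importer that simulates flash_mask being unavailable."""
    raise ModuleNotFoundError("flash_mask not available.")


if __name__ == "__main__":
    try:
        load_unguarded(missing_flash_mask)
    except NameError as exc:
        print(f"escaped the ImportError handler: {type(exc).__name__}")
    print(load_guarded(missing_flash_mask))
```

Binding the sentinel before the inner try keeps the "unavailable" case on the normal control path, so the caller can branch on `None` instead of relying on an exception handler that does not match.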
root seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it. |
Motivation
Upgrading flashinfer to 0.6.11 requires nvidia-cutlass-dsl>=4.4.2 (https://github.com/PaddlePaddle/FastDeploy/pull/7799), while the old flash mask version pins nvidia-cutlass-dsl==4.4.2, which causes a conflict. This PR therefore upgrades the flash mask version.
Modifications
Upgrade the flash mask version.
Version information is recorded at: https://ku.baidu-int.com/knowledge/HFVrC7hq1Q/pKzJfZczuc/YeqWcBGW4m/EUBpKxHfTurV5G
Usage or Command
NA
Accuracy Tests
NA
Checklist
[FDConfig],[APIServer],[Engine],[Scheduler],[PD Disaggregation],[Executor],[Graph Optimization],[Speculative Decoding],[RL],[Models],[Quantization],[Loader],[OP],[KVCache],[DataProcessor],[BugFix],[Docs],[CI],[Optimization],[Feature],[Benchmark],[Others],[XPU],[HPU],[GCU],[DCU],[Iluvatar],[Metax]
pre-commit before commit.
release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.