core: per-rule CWE field + CWE-aware cross-rule dedup #56
Merged
Adds a `cwe` field to each rule. When two rules report findings at the same (file, line) and share a CWE (e.g. DESER_TORCH001 and AI202 both flagging one `torch.load` line under CWE-502), the engine collapses them into one: the finding whose rule declares the higher severity wins, with lexicographic `rule_id` order as a stable tiebreaker on equal severity. The CWE itself does not set severity; each rule's severity still comes from its own TOML field. Findings with distinct CWEs at the same line stay distinct, so `os.system(eval(user_input))` correctly reports both CWE-78 and CWE-94.
Rust core
- rules.rs / issues.rs: new optional `cwe: Option<String>`, carried from Rule → Issue and exposed to Python via pyo3
- analysis/{config,ast,taint}_analysis.rs: pass it through Issue::new
- analysis/mod.rs: 2-stage dedup
  - stage 1 = existing fingerprint dedup (same rule, exact match)
  - stage 2 = CWE-aware merge by (file, line, cwe), highest severity wins. Rules without a CWE skip stage 2.
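The two stages can be sketched in Python as follows. This is an illustrative model only, not the Rust code in analysis/mod.rs: the `Issue` dataclass, the `SEVERITY_RANK` ordering, the example severities, and the choice that the lexicographically smaller `rule_id` wins a tie are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed severity ordering; the engine's real ranking may differ.
SEVERITY_RANK = {"Low": 0, "Medium": 1, "High": 2, "Critical": 3}


@dataclass(frozen=True)
class Issue:  # hypothetical stand-in for the engine's Issue type
    rule_id: str
    file: str
    line: int
    severity: str
    cwe: Optional[str] = None


def _rank(issue):
    # Higher severity sorts first, then the smaller rule_id (assumed tiebreak).
    return (-SEVERITY_RANK[issue.severity], issue.rule_id)


def dedup(issues):
    # Stage 1: fingerprint dedup, modeled here as dropping exact repeats.
    stage1 = list(dict.fromkeys(issues))

    # Stage 2: keep one winner per (file, line, cwe) group.
    winners = {}
    passthrough = []
    for issue in stage1:
        if issue.cwe is None:
            passthrough.append(issue)  # rules without a CWE skip stage 2
            continue
        key = (issue.file, issue.line, issue.cwe)
        best = winners.get(key)
        if best is None or _rank(issue) < _rank(best):
            winners[key] = issue
    merged = [*winners.values(), *passthrough]
    return sorted(merged, key=lambda i: (i.file, i.line, i.rule_id))


issues = [
    Issue("AI202", "/repo/train.py", 10, "High", "CWE-502"),
    Issue("DESER_TORCH001", "/repo/train.py", 10, "Critical", "CWE-502"),
    Issue("PY102", "/repo/app.py", 5, "High", "CWE-78"),
    Issue("PY001", "/repo/app.py", 5, "Critical", "CWE-94"),
]
kept = dedup(issues)
print([i.rule_id for i in kept])  # ['PY001', 'PY102', 'DESER_TORCH001']
```

The two findings on `/repo/app.py` line 5 both survive because their CWEs differ, while the two CWE-502 findings on `/repo/train.py` collapse to the higher-severity one.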
cli.py
- file_path passed to Rust is now `py_file.resolve()` (absolute, canonical) so AST-rule and pattern-rule findings agree on the same path string and stage-2 dedup actually triggers.
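A small sketch of why canonical paths matter for the stage-2 key: two spellings of the same file only compare equal as strings after `Path.resolve()`. The file names are hypothetical and need not exist on disk.

```python
from pathlib import Path


def canonical(path_like):
    # resolve() makes the path absolute and collapses "." and ".." segments
    # (and symlinks, where the components exist).
    return str(Path(path_like).resolve())


raw_a = "pkg/../model.py"
raw_b = "model.py"

same_before = raw_a == raw_b                       # different strings
same_after = canonical(raw_a) == canonical(raw_b)  # same canonical path
print(same_before, same_after)  # False True
```

With raw strings, a grouping key of (file, line, cwe) would treat these as two different files and the findings would never merge.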
reporting.py
- JSON output gains a top-level `cwe` field on each issue
- SARIF output emits `external/cwe/cwe-N` in each rule's `properties.tags`. This is the CodeQL-style CWE tag convention rather than a formal SARIF taxonomy; it parses cleanly in GitHub Code Scanning and DefectDojo
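As a rough illustration of the two output shapes, the sketch below builds a JSON issue with a top-level `cwe` key and a SARIF rule entry carrying the tag. The helper names and the other issue fields are hypothetical and not taken from reporting.py.

```python
import json
import re


def cwe_tag(cwe):
    # "CWE-502" -> "external/cwe/cwe-502"; None if the id is missing or malformed.
    match = re.fullmatch(r"CWE-(\d+)", cwe or "")
    return f"external/cwe/cwe-{match.group(1)}" if match else None


def json_issue(rule_id, file, line, cwe=None):
    # Hypothetical issue shape; only the top-level "cwe" key comes from the PR.
    return {"rule_id": rule_id, "file": file, "line": line, "cwe": cwe}


def sarif_rule(rule_id, cwe=None):
    tag = cwe_tag(cwe)
    return {"id": rule_id, "properties": {"tags": [tag] if tag else []}}


print(json.dumps(json_issue("YAML001", "/repo/cfg.py", 12, "CWE-502")))
print(json.dumps(sarif_rule("YAML001", "CWE-502")))
```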
setup.py
- RustExtension declares `debug=False` so `pip install -e .` produces release-mode binaries; previously editable installs ran ~3× slower.
Rules — all 179 [[rule]] blocks now declare a CWE (built-in-rules.toml + built-in-rules-ai.toml). Mapping summary:
| CWE | Class | Rules |
|---|---|---|
| CWE-78 | command injection | PROC819, SHELL602/689, PY102/103/106, AI503, ... |
| CWE-22 | path traversal | PATH813, OPEN1149, AI502, ZIPSLIP001, FILE526, ... |
| CWE-94 | code/template injection | PY001/305/500, SEC501, SSTI001, SANDBOX307/308, AI101/102/103/105/106/107, ... |
| CWE-502 | insecure deserialization | DESER*, PY002/107/204/301/302/306, YAML001, AI201/202/203/204/205, RUAMEL_UNSAFE001, ... |
| CWE-89 | SQL injection | PY101, SQL586/693, ORM001/002, AI104/504, ... |
| CWE-918 | SSRF | SSRF_001, NET705, AI501, ENV_URL001, ... |
| CWE-295 | TLS / cert verification | TLS001, SSL531, SSH001, G405, NET705 |
| CWE-327 | weak crypto | PY201/202/203/205, HASH807 |
| CWE-338 | weak PRNG | CRYPTO708, RAND810 |
| CWE-798 | hardcoded credentials | G101/101B/102/104/110..133, AI002/404, AUTH711, ADMIN795, CFG001, ... |
| CWE-352 | CSRF | G404, CSRF747, OAUTH774 |
| CWE-489 | active debug code | G401/403, FLASK001, FLASK_DEBUG001, DJANGO_DEBUG001, DEBUG798 |
| CWE-79 | XSS | PY105 |
| CWE-611 | XXE | PY303, XXE001 |
| CWE-942 | CORS | CORS780 |
| CWE-601 | open redirect | OPEN_REDIRECT001 |
| CWE-1004 | sensitive cookie attr | COOKIE792, COOKIE_FILE001 |
| CWE-319 | cleartext transmission | HTTPS789, AI403 |
| CWE-200 | info disclosure | INFO738, BACKUP801, FILE528, AI402, AI405 |
| CWE-117 | log injection | LOG741 |
| CWE-208 | timing attack | TIMING759 |
| CWE-1333 | ReDoS | REGEX870 |
(full list in the rule TOMLs themselves)
New AST rules
- YAML001 yaml.load() without SafeLoader (CWE-502, Critical)
- FLASK_DEBUG001 .run(debug=True) on Flask/FastAPI (CWE-489, High)
AI202 hardened
- pattern tightened to `torch\.load\s*\(`
- exclude_pattern now matches DESER_TORCH001's: skip lines with `weights_only=True`
- now redundant with DESER_TORCH001 (both CWE-502) → stage-2 dedup collapses them to one Critical finding per torch.load line
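The effect of the tightened pattern plus the exclusion can be sketched as a per-line check. The `torch\.load\s*\(` regex is the one quoted above; the exact `exclude_pattern` regex is not shown in this PR, so the form used here is an assumption.

```python
import re

PATTERN = re.compile(r"torch\.load\s*\(")
# Assumed form of the exclusion; the PR only says such lines are skipped.
EXCLUDE = re.compile(r"weights_only\s*=\s*True")


def flags_line(line):
    # A line is reported when it calls torch.load( and is not excluded.
    return bool(PATTERN.search(line)) and not EXCLUDE.search(line)


samples = [
    "model = torch.load(path)",
    "model = torch.load (path, map_location='cpu')",
    "model = torch.load(path, weights_only=True)",
    "loader = torch.loader(path)",
]
print([flags_line(s) for s in samples])  # [True, True, False, False]
```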
Test on Ghy0501/MCITlib (4,743 .py / 27,568 functions):
| Metric | this branch | main (post-ParzivalHack#55) |
|---|---|---|
| wall clock | 593s | 606s |
| total findings | 1,740 | 3,103 |
| unique (file, line, CWE) groups | 1,740 | 1,918 |
| duplicate groups (≥2 rules) | 0 | 1,185 |
| excess duplicate findings | 0 | 1,185 |
| heuristic-TP | 1,684 | 3,047 |
| heuristic-FP | 56 | 56 |
Dedup is reflected directly: branch produces 0 duplicate groups where main produces 1,185 (i.e. 1,185 places where 2+ rules describe the same vulnerability at the same line). FP count is identical (56) since FPs are pattern-shape artifacts that don't depend on dedup. The remaining 178-finding gap (1,918 unique vs 1,740) is AI202 no longer flagging torch.load(..., weights_only=True). Wall clock −13s is within noise.
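The reported figures are internally consistent, which the arithmetic below makes explicit. The numbers are copied from the results above; the variable names are only for illustration.

```python
main = {"total": 3103, "unique_groups": 1918, "tp": 3047, "fp": 56}
branch = {"total": 1740, "unique_groups": 1740, "tp": 1684, "fp": 56}

# Findings beyond one per (file, line, CWE) group are the excess duplicates.
excess_main = main["total"] - main["unique_groups"]
excess_branch = branch["total"] - branch["unique_groups"]

# Groups that exist on main but not on the branch (attributed to AI202 above).
group_gap = main["unique_groups"] - branch["unique_groups"]

print(excess_main, excess_branch, group_gap)  # 1185 0 178
print(main["tp"] + main["fp"] == main["total"])        # True
print(branch["tp"] + branch["fp"] == branch["total"])  # True
```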
ParzivalHack (Owner) approved these changes on Jun 1, 2026 and left a comment:
Dedup works fine (and the new per-rule CWE field is a great addition). Merging :)