core: per-rule CWE field + CWE-aware cross-rule dedup #56

Merged
ParzivalHack merged 1 commit into ParzivalHack:main from satoridev01:feat/cwe-dedup on Jun 1, 2026

@satoridev01
Contributor

core: per-rule CWE field + CWE-aware cross-rule dedup

Adds a cwe field on each rule. When two rules report findings at the same (file, line) and share the same CWE (e.g. DESER_TORCH001 + AI202 both flagging one torch.load line under CWE-502), the engine collapses them: the finding whose rule declares the higher severity wins, with rule_id lex order as stable tiebreaker on equal severity. CWE itself does not set severity — each rule's severity comes from its own TOML field. Distinct CWEs at the same line stay distinct, so os.system(eval(user_input)) correctly reports both CWE-78 and CWE-94.

Rust core

  • rules.rs / issues.rs: new optional cwe: Option<String>, carried from Rule → Issue and exposed to Python via pyo3
  • analysis/{config,ast,taint}_analysis.rs: pass it through Issue::new
  • analysis/mod.rs: 2-stage dedup
    stage 1 = existing fingerprint dedup (same rule, exact match)
    stage 2 = CWE-aware merge by (file, line, cwe), highest severity wins. Rules without a CWE skip stage 2.
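
The two stages can be sketched in Python as follows. This is an illustration only, not the Rust code in analysis/mod.rs: the `Issue` shape, the `SEVERITY_RANK` table, the fingerprint tuple, and the tiebreak direction (smaller rule_id wins) are all assumptions made for the example, as are the severities assigned to the sample rules.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed severity ordering; the engine's real levels may differ.
SEVERITY_RANK = {"Low": 0, "Medium": 1, "High": 2, "Critical": 3}


@dataclass(frozen=True)
class Issue:
    rule_id: str
    file: str
    line: int
    severity: str
    cwe: Optional[str] = None


def _wins(a, b):
    # Higher severity wins; on a tie the lexicographically smaller
    # rule_id wins (the direction is an assumption of this sketch).
    ra, rb = SEVERITY_RANK[a.severity], SEVERITY_RANK[b.severity]
    if ra != rb:
        return ra > rb
    return a.rule_id < b.rule_id


def dedup(issues):
    # Stage 1: drop exact repeats from the same rule. The tuple below
    # is a stand-in for the engine's real fingerprint.
    seen = set()
    stage1 = []
    for issue in issues:
        fingerprint = (issue.rule_id, issue.file, issue.line)
        if fingerprint not in seen:
            seen.add(fingerprint)
            stage1.append(issue)

    # Stage 2: merge by (file, line, cwe). Issues without a CWE skip it.
    best = {}
    passthrough = []
    for issue in stage1:
        if issue.cwe is None:
            passthrough.append(issue)
            continue
        key = (issue.file, issue.line, issue.cwe)
        current = best.get(key)
        if current is None or _wins(issue, current):
            best[key] = issue

    merged = list(best.values()) + passthrough
    return sorted(merged, key=lambda i: (i.file, i.line, i.rule_id))


findings = [
    Issue("AI202", "/repo/m.py", 10, "High", "CWE-502"),
    Issue("DESER_TORCH001", "/repo/m.py", 10, "Critical", "CWE-502"),
    Issue("PY102", "/repo/m.py", 20, "High", "CWE-78"),
    Issue("PY001", "/repo/m.py", 20, "High", "CWE-94"),
]
for issue in dedup(findings):
    print(issue.line, issue.rule_id, issue.cwe)
```

In this sample the two CWE-502 findings on line 10 collapse into one, while the CWE-78 and CWE-94 findings on line 20 both survive because their keys differ.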

cli.py

  • file_path passed to Rust is now py_file.resolve() (absolute, canonical) so AST-rule and pattern-rule findings agree on the same path string and stage-2 dedup actually triggers.
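
A small sketch of why the canonical path matters for a string-keyed merge. The helper `dedup_key` and the sample paths are hypothetical; only `Path.resolve()` comes from the change itself.

```python
from pathlib import Path


def dedup_key(path, line, cwe):
    # Canonicalize first so different spellings of one file compare equal.
    return (str(Path(path).resolve()), line, cwe)


relative = Path("pkg") / "model.py"
dotted = Path("pkg") / ".." / "pkg" / "model.py"

# As raw strings the two spellings differ, so a merge keyed on the
# unresolved path would treat them as different files.
raw_equal = str(relative) == str(dotted)

# After resolve() both become the same absolute path.
resolved_equal = dedup_key(relative, 10, "CWE-502") == dedup_key(dotted, 10, "CWE-502")

print(raw_equal, resolved_equal)
```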

reporting.py

  • JSON output gains a top-level cwe field on each issue
  • SARIF output emits external/cwe/cwe-N in each rule's properties.tags — standard SARIF taxon, parses cleanly in GitHub Code Scanning and DefectDojo
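
One way the tag and the JSON field could be produced, as a sketch rather than the actual reporting.py code. The function names and every dictionary key other than `cwe`, `properties`, and `tags` are assumptions for illustration.

```python
import json


def cwe_to_sarif_tag(cwe):
    # "CWE-502" -> "external/cwe/cwe-502"
    prefix, _, number = cwe.partition("-")
    if prefix.upper() != "CWE" or not number.isdigit():
        raise ValueError(f"unexpected CWE identifier: {cwe!r}")
    return f"external/cwe/cwe-{number}"


def sarif_rule(rule_id, cwe=None):
    # Rules without a CWE simply carry no tag.
    rule = {"id": rule_id}
    if cwe is not None:
        rule["properties"] = {"tags": [cwe_to_sarif_tag(cwe)]}
    return rule


def json_issue(rule_id, file, line, severity, cwe=None):
    # Hypothetical issue shape; only the top-level "cwe" key is taken
    # from the description above.
    return {
        "rule_id": rule_id,
        "file": file,
        "line": line,
        "severity": severity,
        "cwe": cwe,
    }


print(json.dumps(sarif_rule("YAML001", "CWE-502"), indent=2))
print(json.dumps(json_issue("YAML001", "/repo/cfg.py", 12, "Critical", "CWE-502")))
```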

setup.py

  • RustExtension declares debug=False so pip install -e . produces release-mode binaries; previously editable installs ran ~3× slower.

Rules — all 179 [[rule]] blocks now declare a CWE (built-in-rules.toml + built-in-rules-ai.toml). Mapping summary:

  CWE-78   command injection          PROC819, SHELL602/689, PY102/103/106, AI503, ...
  CWE-22   path traversal             PATH813, OPEN1149, AI502, ZIPSLIP001, FILE526, ...
  CWE-94   code/template injection    PY001/305/500, SEC501, SSTI001, SANDBOX307/308, AI101/102/103/105/106/107, ...
  CWE-502  insecure deserialization   DESER*, PY002/107/204/301/302/306, YAML001, AI201/202/203/204/205, RUAMEL_UNSAFE001, ...
  CWE-89   SQL injection              PY101, SQL586/693, ORM001/002, AI104/504, ...
  CWE-918  SSRF                       SSRF_001, NET705, AI501, ENV_URL001, ...
  CWE-295  TLS / cert verification    TLS001, SSL531, SSH001, G405, NET705
  CWE-327  weak crypto                PY201/202/203/205, HASH807
  CWE-338  weak PRNG                  CRYPTO708, RAND810
  CWE-798  hardcoded credentials      G101/101B/102/104/110..133, AI002/404, AUTH711, ADMIN795, CFG001, ...
  CWE-352  CSRF                       G404, CSRF747, OAUTH774
  CWE-489  active debug code          G401/403, FLASK001, FLASK_DEBUG001, DJANGO_DEBUG001, DEBUG798
  CWE-79   XSS                        PY105
  CWE-611  XXE                        PY303, XXE001
  CWE-942  CORS                       CORS780
  CWE-601  open redirect              OPEN_REDIRECT001
  CWE-1004 sensitive cookie attr      COOKIE792, COOKIE_FILE001
  CWE-319  cleartext transmission     HTTPS789, AI403
  CWE-200  info disclosure            INFO738, BACKUP801, FILE528, AI402, AI405
  CWE-117  log injection              LOG741
  CWE-208  timing attack              TIMING759
  CWE-1333 ReDoS                      REGEX870
  (full list in the rule TOMLs themselves)

New AST rules

  • YAML001 yaml.load() without SafeLoader (CWE-502, Critical)
  • FLASK_DEBUG001 .run(debug=True) on Flask/FastAPI (CWE-489, High)
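
To make the YAML001 check concrete, here is a rough equivalent written with Python's `ast` module. The real rule runs inside the Rust engine, so this is only an approximation of the idea; the function names and the decision to also accept `CSafeLoader` are assumptions of the sketch.

```python
import ast

# Loaders treated as safe in this sketch (CSafeLoader is an assumption).
SAFE_LOADERS = {"SafeLoader", "CSafeLoader"}


def _loader_name(node):
    # Accept both `SafeLoader` and `yaml.SafeLoader`.
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        return node.attr
    return None


def find_unsafe_yaml_load(source):
    """Return line numbers of yaml.load(...) calls lacking a safe Loader."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        is_yaml_load = (
            isinstance(func, ast.Attribute)
            and func.attr == "load"
            and isinstance(func.value, ast.Name)
            and func.value.id == "yaml"
        )
        if not is_yaml_load:
            continue
        # The loader may be the second positional argument or Loader=...
        candidates = list(node.args[1:2])
        candidates += [kw.value for kw in node.keywords if kw.arg == "Loader"]
        if not any(_loader_name(c) in SAFE_LOADERS for c in candidates):
            hits.append(node.lineno)
    return sorted(hits)


sample = """import yaml
a = yaml.load(data)
b = yaml.load(data, Loader=yaml.SafeLoader)
c = yaml.load(data, Loader=yaml.Loader)
d = yaml.safe_load(data)
"""
print(find_unsafe_yaml_load(sample))
```

The sketch only recognizes the literal name `yaml`, so an aliased import such as `import yaml as y` would be missed; a production rule would need import tracking.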

AI202 hardened

  • pattern tightened to torch\.load\s*\(
  • exclude_pattern now matches DESER_TORCH001's: skip lines with weights_only=True
  • now redundant with DESER_TORCH001 (both CWE-502) → stage-2 dedup collapses them to one Critical finding per torch.load line
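
The effect of the tightened pattern plus the exclude can be checked with a few lines of Python. The pattern is the one quoted above; the exact regex used for the exclude is not shown in this description, so `EXCLUDE` below is an assumed spelling, and `flags_line` is a hypothetical helper.

```python
import re

PATTERN = re.compile(r"torch\.load\s*\(")         # quoted in the bullet above
EXCLUDE = re.compile(r"weights_only\s*=\s*True")  # assumed form of the exclude


def flags_line(line):
    # A line is reported when the pattern matches and the exclude does not.
    return bool(PATTERN.search(line)) and not EXCLUDE.search(line)


for text in (
    "model = torch.load(path)",
    "model = torch.load(path, weights_only=True)",
    "state = torch.load_state(path)",
):
    print(flags_line(text), text)
```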

Test on Ghy0501/MCITlib (4,743 .py / 27,568 functions):

                                  this branch     main (post-#55)
  wall clock                          593s              606s
  total findings                     1,740             3,103
  unique (file, line, CWE) groups    1,740             1,918
  duplicate groups (≥2 rules)            0             1,185
  excess duplicate findings              0             1,185
  heuristic-TP                       1,684             3,047
  heuristic-FP                          56                56

Dedup is reflected directly: branch produces 0 duplicate groups where main produces 1,185 (i.e. 1,185 places where 2+ rules describe the same vulnerability at the same line). FP count is identical (56) since FPs are pattern-shape artifacts that don't depend on dedup. The remaining 178-finding gap (1,918 unique vs 1,740) is AI202 no longer flagging torch.load(..., weights_only=True). Wall clock −13s is within noise.
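
The figures in the table are internally consistent, which a few lines of arithmetic confirm. The variable names are made up for this check; the numbers are copied from the table.

```python
# Values copied from the benchmark table above.
main_total, main_unique, main_excess = 3103, 1918, 1185
branch_total, branch_unique = 1740, 1740

# On main, removing the excess duplicates leaves the unique groups.
assert main_total - main_excess == main_unique

# Heuristic TP + FP add up to the totals on both sides.
assert 1684 + 56 == branch_total
assert 3047 + 56 == main_total

# Gap attributed to AI202 skipping weights_only=True lines.
gap = main_unique - branch_unique

# Overall drop in reported findings: dedup plus the AI202 gap.
reduction = main_total - branch_total
assert reduction == main_excess + gap

print(gap, reduction)
```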

@ParzivalHack ParzivalHack added the enhancement New feature or request label Jun 1, 2026
Owner

@ParzivalHack ParzivalHack left a comment

Dedup works fine (and the new per-rule CWE field is a great addition). Merging :)

@ParzivalHack ParzivalHack merged commit 8dc016b into ParzivalHack:main Jun 1, 2026
1 of 3 checks passed