Add confusable / homoglyph detection for Unicode spoofing #317

Merged
JE-Chen merged 1 commit into dev from feat/confusables-batch · Jun 22, 2026

JE-Chen (Member) commented Jun 22, 2026

Summary

secrets_scan finds secret-shaped tokens and guardrail screens for prompt injection, but nothing catches visual spoofing. A Cyrillic "а" (U+0430) renders identically to a Latin "a" (U+0061) in most fonts, so "pаypal" reads as "paypal" yet compares unequal. This is the basis of IDN-homograph phishing and lookalike UI labels. Following Unicode TR39 (UTS #39, Unicode Security Mechanisms), this PR folds confusable characters to a prototype skeleton and flags mixed-script tokens.
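To make the skeleton idea concrete, here is a minimal sketch of TR39-style folding. It is not the code from this PR: the names `skeleton_sketch` and `is_confusable_sketch` and the eight-entry mapping table are assumptions for illustration, and a real implementation would presumably draw on much more of the TR39 confusables data. Treating identical strings as not confusable is also a choice made here, not something the PR states.

```python
import unicodedata

# Hypothetical, hand-picked subset of confusable mappings (illustration only).
_CONFUSABLES = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u0455": "s",  # CYRILLIC SMALL LETTER DZE
    "\u0456": "i",  # CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
    "\u03bf": "o",  # GREEK SMALL LETTER OMICRON
}


def skeleton_sketch(text: str) -> str:
    """Fold text to a prototype form: NFD, map confusables, NFD again."""
    decomposed = unicodedata.normalize("NFD", text)
    folded = "".join(_CONFUSABLES.get(ch, ch) for ch in decomposed)
    return unicodedata.normalize("NFD", folded)


def is_confusable_sketch(left: str, right: str) -> bool:
    """Distinct strings count as confusable when their skeletons match."""
    return left != right and skeleton_sketch(left) == skeleton_sketch(right)


spoofed = "p\u0430ypal"  # second letter is Cyrillic U+0430, not Latin "a"
print(spoofed == "paypal")                      # plain comparison
print(skeleton_sketch(spoofed) == "paypal")     # comparison after folding
print(is_confusable_sketch(spoofed, "paypal"))  # lookalike check
```

Comparing skeletons rather than raw strings is what lets a lookalike such as the spoofed "paypal" above be matched against the trusted spelling.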

Layers

  • Headless core: je_auto_control/utils/confusables/confusables.py: skeleton, is_confusable, detect_homoglyphs, scripts_of, is_mixed_script. Pure stdlib (unicodedata), no PySide6, pure functions.
  • Facade: re-exported (with skeleton aliased to confusable_skeleton) + __all__.
  • Executor: AC_confusable_scan → {skeleton, homoglyphs, mixed_script, scripts}, AC_confusable_compare → {confusable}.
  • MCP: ac_confusable_scan / ac_confusable_compare.
  • Script Builder: two CommandSpecs under Data.
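The mixed-script side of the core layer can be sketched with the stdlib alone. This is an illustration under stated assumptions, not the PR's implementation: the function names below are hypothetical, the script of a character is approximated from the first word of its Unicode name (the stdlib `unicodedata` module does not expose the Script property directly), and the result keys simply mirror three of the names listed for AC_confusable_scan.

```python
import unicodedata
from typing import List, Optional, Set, Tuple

# Assumption: a coarse script label taken from the Unicode character name.
_KNOWN_SCRIPTS = {"LATIN", "CYRILLIC", "GREEK", "ARMENIAN", "HEBREW", "ARABIC"}


def script_of_char(ch: str) -> Optional[str]:
    """Return a coarse script label for a letter, or None if not recognised."""
    if not ch.isalpha():
        return None
    first_word = unicodedata.name(ch, "").split(" ", 1)[0]
    return first_word.title() if first_word in _KNOWN_SCRIPTS else None


def scripts_in(text: str) -> Set[str]:
    """Collect the script labels of all recognised letters in the text."""
    return {s for s in map(script_of_char, text) if s is not None}


def is_mixed(text: str) -> bool:
    """A token is mixed-script when its letters span more than one script."""
    return len(scripts_in(text)) > 1


def find_foreign_letters(
    text: str, expected: str = "Latin"
) -> List[Tuple[int, str, str]]:
    """List (index, code point, script) for letters outside the expected script."""
    hits = []
    for index, ch in enumerate(text):
        script = script_of_char(ch)
        if script is not None and script != expected:
            hits.append((index, f"U+{ord(ch):04X}", script))
    return hits


def scan_sketch(text: str) -> dict:
    """Bundle the checks into one result dict (hypothetical shape)."""
    return {
        "scripts": sorted(scripts_in(text)),
        "mixed_script": is_mixed(text),
        "homoglyphs": find_foreign_letters(text),
    }


print(scan_sketch("p\u0430ypal"))
```

Note that `find_foreign_letters` flags every letter outside the expected script, whether or not it resembles a Latin letter. A stricter detector would also check each hit against a confusables table.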

Tests / docs

  • test/unit_test/headless/test_confusables_batch.py — 8 headless tests (skeleton folding, confusable pairs, homoglyph positions, mixed-script, scripts + wiring + facade).
  • EN/Zh feature docs v109_features_doc.rst + toctrees; 3 README What's-new sections.

@codacy-production

Up to standards ✅

🟢 Issues: 0 new issues

🟢 Metrics: complexity 33 · duplication 0

@JE-Chen JE-Chen merged commit 74475fe into dev Jun 22, 2026
16 checks passed
@JE-Chen JE-Chen deleted the feat/confusables-batch branch June 22, 2026 03:17