Add fuzzy string matching and dedupe#248

Merged
JE-Chen merged 2 commits into dev from feat/fuzzy-match-batch on Jun 19, 2026

JE-Chen (Member) commented Jun 19, 2026

Dependency-light data batch (approved under "approve all"). Covers all layers, plus tests, EN/Zh v40 docs, and the README.

Feature (utils/fuzzy, stdlib default + optional rapidfuzz)

  • fuzzy_ratio (similarity in 0..1), fuzzy_best_match (closest of a list of choices, returned with score and index, or None when below the cutoff), fuzzy_matches (top-N, sorted by score), fuzzy_dedupe (collapses near-duplicates, keeping the first occurrence). Robust to OCR and UI-copy noise, so a script can act on "the button that looks like Submit".
  • The default backend is stdlib difflib: zero extra dependencies, so the feature (and its tests) runs everywhere, including the headless quality.yml job. The optional [fuzzy] extra adds rapidfuzz for speed; scores are normalised to 0..1 either way, so callers never depend on the backend (BACKEND names the active one). Options: ignore_case (default true) and score_cutoff.
  • Executor: AC_fuzzy_ratio / AC_fuzzy_best_match / AC_fuzzy_dedupe (choices/items accepted as a list or as a builder JSON string); MCP: ac_fuzzy_*; Builder: listed under Data.
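
For readers who want a feel for the behaviour described above, here is a minimal sketch, not the code merged in this PR. The function names (ratio, best_match, dedupe), their signatures, the return shape, and the 0.9 dedupe cutoff are assumptions made for illustration; only the difflib default, the optional rapidfuzz backend, the 0..1 normalisation, ignore_case, score_cutoff, and the BACKEND name come from the description.

```python
from difflib import SequenceMatcher

try:
    # Optional accelerator; rapidfuzz.fuzz.ratio returns a score in 0..100.
    from rapidfuzz import fuzz as _rapidfuzz_fuzz
    BACKEND = "rapidfuzz"
except ImportError:
    _rapidfuzz_fuzz = None
    BACKEND = "difflib"


def ratio(a, b, ignore_case=True):
    """Return a similarity in 0..1 regardless of the active backend."""
    if ignore_case:
        a, b = a.lower(), b.lower()
    if _rapidfuzz_fuzz is not None:
        return _rapidfuzz_fuzz.ratio(a, b) / 100.0  # normalise 0..100 to 0..1
    return SequenceMatcher(None, a, b).ratio()


def best_match(query, choices, score_cutoff=0.0, ignore_case=True):
    """Return (choice, score, index) for the closest choice, or None below cutoff."""
    best = None
    for index, choice in enumerate(choices):
        score = ratio(query, choice, ignore_case)
        if score >= score_cutoff and (best is None or score > best[1]):
            best = (choice, score, index)
    return best


def dedupe(items, score_cutoff=0.9, ignore_case=True):
    """Collapse near-duplicates, keeping the first occurrence of each group."""
    kept = []
    for item in items:
        # Keep an item only if it is not close to anything already kept.
        if all(ratio(item, k, ignore_case) < score_cutoff for k in kept):
            kept.append(item)
    return kept


if __name__ == "__main__":
    labels = ["Cancel", "Submit", "Help"]
    print(BACKEND, best_match("Subm1t", labels, score_cutoff=0.8))
    print(dedupe(["Submit", "submit ", "Cancel", "Submit"]))
```

Dividing the rapidfuzz score by 100 is what lets callers pass the same score_cutoff whichever backend is installed.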

Verification

  • 11 tests pass (ratio bounds/identity/case, best-match closest + None below cutoff, matches sorted and limited, dedupe collapse + keep-distinct, executor round-trip, wiring, facade). Assertions check ordering and thresholds, not backend-specific floats. ruff clean; radon reports nothing at CC rank C or worse; bandit clean; PySide6-free.
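
The "ordering/thresholds, not backend-specific floats" point could look roughly like the sketch below. It is not taken from the PR's test suite: the similarity helper is a difflib stand-in for the real ratio function, and the test names and the 0.8 threshold are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Stand-in for the PR's ratio function (case-insensitive, 0..1)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def test_ratio_bounds_identity_and_case():
    assert similarity("Submit", "Submit") == 1.0
    assert similarity("Submit", "SUBMIT") == 1.0
    assert 0.0 <= similarity("Submit", "Cancel") <= 1.0


def test_closer_string_scores_higher():
    # Ordering check: does not pin the exact float a given backend returns.
    assert similarity("Subm1t", "Submit") > similarity("Subm1t", "Cancel")


def test_score_clears_threshold():
    # Threshold check: one OCR-style substitution should stay well above 0.8.
    assert similarity("Subm1t", "Submit") >= 0.8
```

Asserting exact values such as 0.8333 would tie the tests to one backend's scoring formula, which is what the description says the suite avoids.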

codacy-production Bot commented Jun 19, 2026


Up to standards ✅

🟢 Issues: 0 new issues

🟢 Metrics: complexity 39 · duplication 0

JE-Chen merged commit a2bf011 into dev on Jun 19, 2026
16 checks passed
JE-Chen deleted the feat/fuzzy-match-batch branch on June 19, 2026 16:14