Merged
6 changes: 6 additions & 0 deletions README/WHATS_NEW_zh-CN.md
@@ -1,5 +1,11 @@
# What's New - AutoControl

## What's new (2026-06-22) - Check-Digit Algorithms

Compute / verify Luhn, Verhoeff, Damm and ISO 7064 MOD 97-10 check digits. Full reference: [`docs/source/Zh/doc/new_features/v115_features_doc.rst`](../docs/source/Zh/doc/new_features/v115_features_doc.rst).

- **`luhn_validate` / `luhn_check_digit` / `verhoeff_*` / `damm_*` / `mod97_10_*`** (`AC_checksum_validate`, `AC_checksum_digit`): `pii_text` detects card/IBAN shapes by regex and `data_quality` does regex validation, but no feature computed or verified a *check digit*. This adds the four schemes behind most identifiers (cards/IMEI, national ID numbers, IBAN), the shared engine that `identifier_validate` builds on. Pure standard library, deterministic.

## What's new (2026-06-22) - Moving-Average Smoothing

Smooth a noisy sequence of values. Full reference: [`docs/source/Zh/doc/new_features/v102_features_doc.rst`](../docs/source/Zh/doc/new_features/v102_features_doc.rst).
6 changes: 6 additions & 0 deletions README/WHATS_NEW_zh-TW.md
@@ -1,5 +1,11 @@
# What's New - AutoControl

## What's new (2026-06-22) - Check-Digit Algorithms

Compute / verify Luhn, Verhoeff, Damm and ISO 7064 MOD 97-10 check digits. Full reference: [`docs/source/Zh/doc/new_features/v115_features_doc.rst`](../docs/source/Zh/doc/new_features/v115_features_doc.rst).

- **`luhn_validate` / `luhn_check_digit` / `verhoeff_*` / `damm_*` / `mod97_10_*`** (`AC_checksum_validate`, `AC_checksum_digit`): `pii_text` detects card/IBAN shapes by regex and `data_quality` does regex validation, but no feature computed or verified a *check digit*. This adds the four schemes behind most identifiers (cards/IMEI, national ID numbers, IBAN), the shared engine that `identifier_validate` builds on. Pure standard library, deterministic.

## What's new (2026-06-22) - Moving-Average Smoothing

Smooth a noisy sequence of values. Full reference: [`docs/source/Zh/doc/new_features/v102_features_doc.rst`](../docs/source/Zh/doc/new_features/v102_features_doc.rst).
6 changes: 6 additions & 0 deletions WHATS_NEW.md
@@ -1,5 +1,11 @@
# What's New — AutoControl

## What's new (2026-06-22) — Check-Digit Algorithms

Compute / verify Luhn, Verhoeff, Damm and ISO 7064 MOD 97-10 check digits. Full reference: [`docs/source/Eng/doc/new_features/v115_features_doc.rst`](docs/source/Eng/doc/new_features/v115_features_doc.rst).

- **`luhn_validate` / `luhn_check_digit` / `verhoeff_*` / `damm_*` / `mod97_10_*`** (`AC_checksum_validate`, `AC_checksum_digit`): `pii_text` detects card/IBAN shapes by regex and `data_quality` does regex validation, but nothing computed or verified a *check digit*. This adds the four schemes behind most identifiers (cards/IMEI, national IDs, IBAN) — the shared engine `identifier_validate` builds on. Pure-stdlib, deterministic.

## What's new (2026-06-22) — GNU gettext Catalog I/O (.po / .mo)

Read/compile the de-facto translation format. Full reference: [`docs/source/Eng/doc/new_features/v114_features_doc.rst`](docs/source/Eng/doc/new_features/v114_features_doc.rst).
49 changes: 49 additions & 0 deletions docs/source/Eng/doc/new_features/v115_features_doc.rst
@@ -0,0 +1,49 @@
Check-Digit Algorithms
======================

``pii_text`` detects credit-card and IBAN *shapes* by regex and ``data_quality``
does type / range / regex validation, but nothing actually computes or verifies a
*check digit*. This adds the shared arithmetic engine for the four schemes behind
most real-world identifiers — and the primitive that account-number, card, IBAN,
ISBN and EAN validation build on.

Pure standard library (integer arithmetic; the Verhoeff and Damm tables are small
embedded constants). Every function is pure (string in, bool / str out), so it is
fully deterministic in CI.

Headless API
------------

.. code-block:: python

    from je_auto_control import (
        luhn_validate, luhn_check_digit,
        verhoeff_validate, verhoeff_check_digit,
        damm_validate, damm_check_digit,
        mod97_10_validate, mod97_10_check_digits,
    )

    luhn_validate("4111111111111111")                  # True  (credit-card / IMEI)
    luhn_check_digit("7992739871")                     # '3'   -> 79927398713
    verhoeff_validate("2363")                          # True  (catches transpositions)
    damm_check_digit("572")                            # '4'
    mod97_10_validate("3214282912345698765432161182")  # True  (IBAN engine)

- **Luhn** (mod 10): credit cards, IMEI, many national IDs. Catches all
  single-digit errors and every adjacent transposition except ``09`` / ``90``.
- **Verhoeff** and **Damm**: decimal schemes that catch *all* single-digit and
  adjacent-transposition errors (stronger than Luhn).
- **ISO 7064 MOD 97-10**: the two-check-digit scheme behind IBAN and similar.
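
The detection claims in the list above can be checked with a small stand-alone sketch. The helpers below are local re-implementations written only for illustration (``luhn_ok``, ``damm_ok``, ``swap_adjacent`` and the other names are hypothetical and are not the ``je_auto_control`` functions), and the Damm table is the commonly published one. The sketch builds a number containing the pair ``09``, swaps it to ``90``, and compares how the two schemes react.

```python
from typing import List

# Commonly published Damm quasigroup table (assumed here for illustration).
DAMM_TABLE = (
    (0, 3, 1, 7, 5, 9, 8, 6, 4, 2), (7, 0, 9, 2, 1, 5, 4, 8, 6, 3),
    (4, 2, 0, 6, 8, 7, 1, 3, 5, 9), (1, 7, 5, 0, 9, 8, 3, 4, 2, 6),
    (6, 1, 2, 3, 0, 4, 5, 9, 7, 8), (3, 6, 7, 4, 2, 0, 9, 5, 8, 1),
    (5, 8, 6, 9, 7, 2, 0, 1, 3, 4), (8, 9, 4, 5, 3, 6, 2, 0, 1, 7),
    (9, 4, 3, 8, 6, 1, 7, 2, 0, 5), (2, 5, 8, 1, 4, 3, 6, 7, 9, 0),
)


def digits_of(text: str) -> List[int]:
    """Keep only the decimal digits of ``text``."""
    return [int(ch) for ch in text if ch.isdigit()]


def luhn_sum(digits: List[int]) -> int:
    """Luhn sum: double every second digit counted from the right."""
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total


def luhn_digit(payload: str) -> str:
    """Check digit that makes ``payload`` pass Luhn once appended."""
    return str((10 - luhn_sum(digits_of(payload) + [0]) % 10) % 10)


def luhn_ok(number: str) -> bool:
    digits = digits_of(number)
    return bool(digits) and luhn_sum(digits) % 10 == 0


def damm_interim(text: str) -> int:
    """Run the digits of ``text`` through the Damm table."""
    interim = 0
    for digit in digits_of(text):
        interim = DAMM_TABLE[interim][digit]
    return interim


def damm_ok(number: str) -> bool:
    return bool(digits_of(number)) and damm_interim(number) == 0


def swap_adjacent(text: str, index: int) -> str:
    """Swap the characters at ``index`` and ``index + 1``."""
    return text[:index] + text[index + 1] + text[index] + text[index + 2:]


payload = "109"                                      # contains the pair "09"
luhn_number = payload + luhn_digit(payload)          # "1099"
damm_number = payload + str(damm_interim(payload))   # "1093"

luhn_swapped = swap_adjacent(luhn_number, 1)         # "1909"
damm_swapped = swap_adjacent(damm_number, 1)         # "1903"

print(luhn_ok(luhn_number), luhn_ok(luhn_swapped))   # True True  (swap missed)
print(damm_ok(damm_number), damm_ok(damm_swapped))   # True False (swap caught)
```

Luhn misses this swap because ``0`` and ``9`` contribute the same amount whether or not they are doubled (``9 * 2 = 18``, which reduces to ``9``), so exchanging them never changes the sum.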

Each scheme exposes ``*_validate(number)`` (does the value incl. its check digit
verify?) and ``*_check_digit`` / ``*_check_digits`` (what digit(s) to append to a
bare payload?). Non-digit characters are ignored, so spaced/grouped input works.
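
Because non-digit characters are dropped rather than converted, an IBAN presumably has to be turned into a digit string before it reaches the MOD 97-10 functions. The sketch below follows the usual IBAN procedure (move the first four characters to the end, then map ``A`` to ``10`` through ``Z`` to ``35``) and reproduces the 28-digit value used in the example above. ``iban_to_numeric``, ``mod97_ok`` and ``iban_check_digits`` are hypothetical helper names introduced for illustration, not part of ``je_auto_control``.

```python
def iban_to_numeric(iban: str) -> str:
    """Rearrange an IBAN and expand its letters into digits."""
    compact = "".join(
        ch for ch in iban.upper() if ch.isascii() and ch.isalnum()
    )
    rearranged = compact[4:] + compact[:4]
    # int(ch, 36) maps '0'-'9' to 0-9 and 'A'-'Z' to 10-35.
    return "".join(str(int(ch, 36)) for ch in rearranged)


def mod97_ok(numeric: str) -> bool:
    """ISO 7064 MOD 97-10 test: the whole number must be 1 modulo 97."""
    return bool(numeric) and int(numeric) % 97 == 1


def iban_check_digits(country: str, bban: str) -> str:
    """Check digits for a country code and BBAN, using '00' as placeholder."""
    numeric = iban_to_numeric(country + "00" + bban)
    return f"{98 - int(numeric) % 97:02d}"


numeric = iban_to_numeric("GB82 WEST 1234 5698 7654 32")
print(numeric)            # 3214282912345698765432161182
print(mod97_ok(numeric))  # True

check_digits = iban_check_digits("GB", "WEST12345698765432")
print(check_digits)       # 82
```

Passing the raw IBAN string straight to a digit-only validator would silently discard ``GB`` and ``WEST``, so the conversion step belongs in the caller.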

Executor commands
-----------------

``AC_checksum_validate`` takes a ``scheme`` (``luhn`` / ``verhoeff`` / ``damm`` /
``mod97``) plus a ``number`` and returns ``{valid}``; ``AC_checksum_digit`` returns
``{check_digit}`` for a ``partial``. Both are exposed as MCP tools
(``ac_checksum_validate`` / ``ac_checksum_digit``) and as Script Builder commands
under **Data**.
1 change: 1 addition & 0 deletions docs/source/Eng/eng_index.rst
@@ -137,6 +137,7 @@ Comprehensive guides for all AutoControl features.
doc/new_features/v112_features_doc
doc/new_features/v113_features_doc
doc/new_features/v114_features_doc
doc/new_features/v115_features_doc
doc/ocr_backends/ocr_backends_doc
doc/observability/observability_doc
doc/operations_layer/operations_layer_doc
41 changes: 41 additions & 0 deletions docs/source/Zh/doc/new_features/v115_features_doc.rst
@@ -0,0 +1,41 @@
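Check-Digit Algorithms
======================

``pii_text`` detects credit-card and IBAN *shapes* by regex and ``data_quality`` does type / range / regex validation, but no feature actually
computes or verifies a *check digit*. This adds the shared arithmetic engine for the four schemes behind most real-world identifiers, which is also the building block that account-number, card, IBAN,
ISBN and EAN validation rely on.

Pure standard library (integer arithmetic; the Verhoeff and Damm tables are small embedded constants). Every function is pure (string in, bool / str out),
so it is fully deterministic in CI.

Headless API
------------

.. code-block:: python

    from je_auto_control import (
        luhn_validate, luhn_check_digit,
        verhoeff_validate, verhoeff_check_digit,
        damm_validate, damm_check_digit,
        mod97_10_validate, mod97_10_check_digits,
    )

    luhn_validate("4111111111111111")                  # True  (credit card / IMEI)
    luhn_check_digit("7992739871")                     # '3'   -> 79927398713
    verhoeff_validate("2363")                          # True  (catches transposition errors)
    damm_check_digit("572")                            # '4'
    mod97_10_validate("3214282912345698765432161182")  # True  (IBAN engine)

- **Luhn** (mod 10): credit cards, IMEI, many national ID numbers. Catches all single-digit errors and every adjacent transposition except ``09`` / ``90``.
- **Verhoeff** and **Damm**: decimal schemes that catch *all* single-digit and adjacent-transposition errors (stronger than Luhn).
- **ISO 7064 MOD 97-10**: the two-check-digit scheme used by IBAN and similar identifiers.

Each scheme provides ``*_validate(number)`` (does the value, including its check digit, verify?) and ``*_check_digit`` / ``*_check_digits``
(which check digit(s) should be appended to a bare payload?). Non-digit characters are ignored, so spaced or grouped input also works.

Executor commands
-----------------

``AC_checksum_validate`` takes a ``scheme`` (``luhn`` / ``verhoeff`` / ``damm`` / ``mod97``) and a ``number`` and returns
``{valid}``; ``AC_checksum_digit`` returns ``{check_digit}`` for a ``partial``. Both are available as MCP tools
(``ac_checksum_validate`` / ``ac_checksum_digit``) and as Script Builder commands under the **Data** category.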
1 change: 1 addition & 0 deletions docs/source/Zh/zh_index.rst
@@ -137,6 +137,7 @@ Complete usage guides for all AutoControl features.
doc/new_features/v112_features_doc
doc/new_features/v113_features_doc
doc/new_features/v114_features_doc
doc/new_features/v115_features_doc
doc/ocr_backends/ocr_backends_doc
doc/observability/observability_doc
doc/operations_layer/operations_layer_doc
14 changes: 14 additions & 0 deletions je_auto_control/__init__.py
@@ -245,6 +245,12 @@
from je_auto_control.utils.gettext_catalog import (
    GettextCatalog, parse_po, parse_po_file, read_mo, read_mo_file,
)
# Check-digit algorithms (Luhn / Verhoeff / Damm / ISO 7064 MOD 97-10)
from je_auto_control.utils.checksum import (
    damm_check_digit, damm_validate, luhn_check_digit, luhn_validate,
    mod97_10_check_digits, mod97_10_validate, verhoeff_check_digit,
    verhoeff_validate,
)
# CI workflow annotations (GitHub Actions)
from je_auto_control.utils.ci_annotations import (
    emit_annotations, format_annotation,
@@ -1007,6 +1013,14 @@ def start_autocontrol_gui(*args, **kwargs):
    "parse_po_file",
    "read_mo",
    "read_mo_file",
    "luhn_validate",
    "luhn_check_digit",
    "verhoeff_validate",
    "verhoeff_check_digit",
    "damm_validate",
    "damm_check_digit",
    "mod97_10_validate",
    "mod97_10_check_digits",
    "emit_annotations", "format_annotation",
    "ClipboardHistory", "default_clipboard_history",
    "analyze_heal_log", "heal_stats", "scan_secrets",
18 changes: 18 additions & 0 deletions je_auto_control/gui/script_builder/command_schema.py
@@ -2170,6 +2170,24 @@ def _add_resilience_specs(specs: List[CommandSpec]) -> None:
        ),
        description="Pick the plural-correct translation for count n.",
    ))
    specs.append(CommandSpec(
        "AC_checksum_validate", "Data", "Checksum: Validate",
        fields=(
            FieldSpec("scheme", FieldType.STRING,
                      placeholder="luhn | verhoeff | damm | mod97"),
            FieldSpec("number", FieldType.STRING, placeholder="4111111111111111"),
        ),
        description="Validate a number's check digit (Luhn/Verhoeff/Damm/mod97).",
    ))
    specs.append(CommandSpec(
        "AC_checksum_digit", "Data", "Checksum: Check Digit",
        fields=(
            FieldSpec("scheme", FieldType.STRING,
                      placeholder="luhn | verhoeff | damm | mod97"),
            FieldSpec("partial", FieldType.STRING, placeholder="799273987"),
        ),
        description="Compute the check digit(s) to append to a value.",
    ))
    specs.append(CommandSpec(
        "AC_diff_rows", "Data", "Dataset Diff: Rows by Key",
        fields=(
12 changes: 12 additions & 0 deletions je_auto_control/utils/checksum/__init__.py
@@ -0,0 +1,12 @@
"""Check-digit algorithms: Luhn, Verhoeff, Damm, ISO 7064 MOD 97-10."""
from je_auto_control.utils.checksum.checksum import (
    damm_check_digit, damm_validate, luhn_check_digit, luhn_validate,
    mod97_10_check_digits, mod97_10_validate, verhoeff_check_digit,
    verhoeff_validate,
)

__all__ = [
    "damm_check_digit", "damm_validate", "luhn_check_digit", "luhn_validate",
    "mod97_10_check_digits", "mod97_10_validate", "verhoeff_check_digit",
    "verhoeff_validate",
]
120 changes: 120 additions & 0 deletions je_auto_control/utils/checksum/checksum.py
@@ -0,0 +1,120 @@
"""Check-digit algorithms: Luhn, Verhoeff, Damm, ISO 7064 MOD 97-10.

``pii_text`` detects credit-card and IBAN *shapes* by regex and ``data_quality``
does type/range/regex validation, but nothing in the project actually computes or
verifies a *check digit*. This is the shared arithmetic engine that catches the
single-digit and adjacent-transposition typos those formats are designed to
detect, and the primitive that ``identifier_validate`` (IBAN / ISBN / EAN / card)
builds on.

Pure standard library (integer arithmetic only; the Verhoeff and Damm tables are
small embedded constants). Every function is pure (string in, bool/str out), so it
is fully deterministic in CI.
"""
from typing import List

# Verhoeff dihedral-group multiplication, permutation and inverse tables.
_VERHOEFF_D = (
    (0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (1, 2, 3, 4, 0, 6, 7, 8, 9, 5),
    (2, 3, 4, 0, 1, 7, 8, 9, 5, 6), (3, 4, 0, 1, 2, 8, 9, 5, 6, 7),
    (4, 0, 1, 2, 3, 9, 5, 6, 7, 8), (5, 9, 8, 7, 6, 0, 4, 3, 2, 1),
    (6, 5, 9, 8, 7, 1, 0, 4, 3, 2), (7, 6, 5, 9, 8, 2, 1, 0, 4, 3),
    (8, 7, 6, 5, 9, 3, 2, 1, 0, 4), (9, 8, 7, 6, 5, 4, 3, 2, 1, 0),
)
_VERHOEFF_P = (
    (0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (1, 5, 7, 6, 2, 8, 3, 0, 9, 4),
    (5, 8, 0, 3, 7, 9, 6, 1, 4, 2), (8, 9, 1, 6, 0, 4, 3, 5, 2, 7),
    (9, 4, 5, 3, 1, 2, 6, 8, 7, 0), (4, 2, 8, 6, 5, 7, 3, 9, 0, 1),
    (2, 7, 9, 3, 8, 0, 6, 4, 1, 5), (7, 0, 4, 6, 9, 1, 3, 2, 5, 8),
)
_VERHOEFF_INV = (0, 4, 3, 2, 1, 5, 6, 7, 8, 9)

# Damm quasigroup table (totally anti-symmetric).
_DAMM = (
    (0, 3, 1, 7, 5, 9, 8, 6, 4, 2), (7, 0, 9, 2, 1, 5, 4, 8, 6, 3),
    (4, 2, 0, 6, 8, 7, 1, 3, 5, 9), (1, 7, 5, 0, 9, 8, 3, 4, 2, 6),
    (6, 1, 2, 3, 0, 4, 5, 9, 7, 8), (3, 6, 7, 4, 2, 0, 9, 5, 8, 1),
    (5, 8, 6, 9, 7, 2, 0, 1, 3, 4), (8, 9, 4, 5, 3, 6, 2, 0, 1, 7),
    (9, 4, 3, 8, 6, 1, 7, 2, 0, 5), (2, 5, 8, 1, 4, 3, 6, 7, 9, 0),
)


def _digits(value: object) -> List[int]:
    """Extract the decimal digits of a value as a list of ints."""
    return [int(ch) for ch in str(value) if ch.isdigit()]


# --- Luhn (mod 10) --------------------------------------------------------

def _luhn_sum(digits: List[int]) -> int:
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total


def luhn_validate(number: object) -> bool:
    """Whether ``number`` (incl. its trailing check digit) passes Luhn."""
    digits = _digits(number)
    return bool(digits) and _luhn_sum(digits) % 10 == 0


def luhn_check_digit(partial: object) -> str:
    """Return the Luhn check digit to append to ``partial`` (no check digit)."""
    total = _luhn_sum(_digits(partial) + [0])
    return str((10 - total % 10) % 10)


# --- Verhoeff -------------------------------------------------------------

def verhoeff_validate(number: object) -> bool:
    """Whether ``number`` (incl. check digit) passes the Verhoeff scheme."""
    digits = _digits(number)
    if not digits:
        # Input without any digit is rejected, as in luhn_validate.
        return False
    check = 0
    for index, digit in enumerate(reversed(digits)):
        check = _VERHOEFF_D[check][_VERHOEFF_P[index % 8][digit]]
    return check == 0


def verhoeff_check_digit(partial: object) -> str:
    """Return the Verhoeff check digit to append to ``partial``."""
    check = 0
    # index + 1 leaves position 0 free for the check digit being computed.
    for index, digit in enumerate(reversed(_digits(partial))):
        check = _VERHOEFF_D[check][_VERHOEFF_P[(index + 1) % 8][digit]]
    return str(_VERHOEFF_INV[check])


# --- Damm -----------------------------------------------------------------

def damm_validate(number: object) -> bool:
    """Whether ``number`` (incl. check digit) passes the Damm scheme."""
    digits = _digits(number)
    if not digits:
        # Input without any digit is rejected, as in luhn_validate.
        return False
    interim = 0
    for digit in digits:
        interim = _DAMM[interim][digit]
    return interim == 0


def damm_check_digit(partial: object) -> str:
    """Return the Damm check digit to append to ``partial``."""
    interim = 0
    for digit in _digits(partial):
        interim = _DAMM[interim][digit]
    return str(interim)


# --- ISO 7064 MOD 97-10 (the IBAN engine) ---------------------------------

def mod97_10_validate(number: object) -> bool:
    """Whether the numeric string ``number`` satisfies ISO 7064 MOD 97-10."""
    digits = "".join(ch for ch in str(number) if ch.isdigit())
    return bool(digits) and int(digits) % 97 == 1


def mod97_10_check_digits(partial: object) -> str:
    """Return the two MOD 97-10 check digits to append to ``partial``."""
    digits = "".join(ch for ch in str(partial) if ch.isdigit())
    value = int(digits) if digits else 0
    # 98 - (payload * 100 mod 97) keeps the result in the range 02..98.
    return f"{98 - (value * 100) % 97:02d}"
24 changes: 24 additions & 0 deletions je_auto_control/utils/executor/action_executor.py
@@ -3047,6 +3047,28 @@ def _gettext_ngettext(po: str, msgid: str, msgid_plural: str,
    return {"text": catalog.ngettext(msgid, msgid_plural, int(n))}


def _checksum_validate(scheme: str, number: str) -> Dict[str, Any]:
    """Adapter: validate a number's check digit under a named scheme."""
    from je_auto_control.utils import checksum as cs
    validators = {"luhn": cs.luhn_validate, "verhoeff": cs.verhoeff_validate,
                  "damm": cs.damm_validate, "mod97": cs.mod97_10_validate}
    func = validators.get(scheme)
    if func is None:
        raise AutoControlActionException(f"unknown checksum scheme: {scheme!r}")
    return {"valid": func(number)}


def _checksum_digit(scheme: str, partial: str) -> Dict[str, Any]:
    """Adapter: compute the check digit(s) for a value under a named scheme."""
    from je_auto_control.utils import checksum as cs
    digits = {"luhn": cs.luhn_check_digit, "verhoeff": cs.verhoeff_check_digit,
              "damm": cs.damm_check_digit, "mod97": cs.mod97_10_check_digits}
    func = digits.get(scheme)
    if func is None:
        raise AutoControlActionException(f"unknown checksum scheme: {scheme!r}")
    return {"check_digit": func(partial)}


def _cas_put(name: str, key: str, value: Any,
             expected_version: Any = None) -> Dict[str, Any]:
    """Adapter: optimistic put into a named versioned store."""
@@ -4740,6 +4762,8 @@ def __init__(self):
"AC_format_message": _format_message,
"AC_gettext_translate": _gettext_translate,
"AC_gettext_ngettext": _gettext_ngettext,
"AC_checksum_validate": _checksum_validate,
"AC_checksum_digit": _checksum_digit,
"AC_detect_drift": _detect_drift,
"AC_categorical_drift": _categorical_drift,
"AC_diff_rows": _diff_rows,