7 changes: 7 additions & 0 deletions README.md
@@ -13,6 +13,7 @@

## Table of Contents

- [What's new (2026-06-20) — Locale-Aware Number, Currency & Date Parsing](#whats-new-2026-06-20--locale-aware-number-currency--date-parsing)
- [What's new (2026-06-20) — Perceptual-Hash Image Dedupe](#whats-new-2026-06-20--perceptual-hash-image-dedupe)
- [What's new (2026-06-20) — S3-Compatible Artifact Store](#whats-new-2026-06-20--s3-compatible-artifact-store)
- [What's new (2026-06-20) — Fuzzy String Matching & Dedupe](#whats-new-2026-06-20--fuzzy-string-matching--dedupe)
@@ -95,6 +96,12 @@

---

## What's new (2026-06-20) — Locale-Aware Number, Currency & Date Parsing

Parse localized numbers/currency/dates. Full reference: [`docs/source/Eng/doc/new_features/v43_features_doc.rst`](docs/source/Eng/doc/new_features/v43_features_doc.rst).

- **`parse_decimal` / `parse_number` / `format_decimal` / `format_currency` / `format_date`** (`AC_parse_decimal` / `AC_parse_number` / `AC_format_decimal` / `AC_format_currency` / `AC_format_date`, `ac_*`): OCR/UI text like `"1.234,56"` (de_DE) parses correctly to `1234.56` via **Babel**'s CLDR data, and values format back per-locale. `babel` is an optional `[locale]` extra, imported lazily; functional tests run under `importorskip` (wiring/facade always verified).
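
To see why a locale-aware parser is needed, the following stdlib-only sketch shows `float()` rejecting the German form and a hand-rolled normalization recovering the value. `parse_with_separators` is a hypothetical helper written for this illustration, with the separators passed in explicitly; it is not part of `je_auto_control`, which looks the separators up from Babel's CLDR data instead.

```python
from decimal import Decimal


def parse_with_separators(text: str, group: str, decimal: str) -> Decimal:
    """Hypothetical helper: drop the grouping mark, normalize the decimal mark."""
    normalized = text.strip().replace(group, "").replace(decimal, ".")
    return Decimal(normalized)


# float() only understands "." as the decimal mark, so the German form fails.
try:
    float("1.234,56")
    float_accepts_german = True
except ValueError:
    float_accepts_german = False

# Separators are hard-coded here (an assumption for the sketch).
german = parse_with_separators("1.234,56", group=".", decimal=",")
american = parse_with_separators("1,234.56", group=",", decimal=".")

print(float_accepts_german, german, american)
```

Hard-coding separators does not scale past a handful of locales, which is presumably why the feature delegates to Babel rather than maintaining its own table.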

## What's new (2026-06-20) — Perceptual-Hash Image Dedupe

Collapse near-identical screenshots. Full reference: [`docs/source/Eng/doc/new_features/v42_features_doc.rst`](docs/source/Eng/doc/new_features/v42_features_doc.rst).
7 changes: 7 additions & 0 deletions README/README_zh-CN.md
@@ -12,6 +12,7 @@

## 目录

- [本次更新 (2026-06-20) — 区域设置感知的数字、货币与日期解析](#本次更新-2026-06-20--区域设置感知的数字货币与日期解析)
- [本次更新 (2026-06-20) — 感知哈希图像去重](#本次更新-2026-06-20--感知哈希图像去重)
- [本次更新 (2026-06-20) — S3 兼容成品存储](#本次更新-2026-06-20--s3-兼容成品存储)
- [本次更新 (2026-06-20) — 模糊字符串匹配与去重](#本次更新-2026-06-20--模糊字符串匹配与去重)
@@ -94,6 +95,12 @@

---

## 本次更新 (2026-06-20) — 区域设置感知的数字、货币与日期解析

解析本地化的数字/货币/日期。完整参考:[`docs/source/Zh/doc/new_features/v43_features_doc.rst`](../docs/source/Zh/doc/new_features/v43_features_doc.rst)。

- **`parse_decimal` / `parse_number` / `format_decimal` / `format_currency` / `format_date`**(`AC_parse_decimal` / `AC_parse_number` / `AC_format_decimal` / `AC_format_currency` / `AC_format_date`、`ac_*`):像 `"1.234,56"`(de_DE)这样的 OCR/UI 文本会通过 **Babel** 的 CLDR 数据正确解析为 `1234.56`,值也能依区域设置格式化回去。`babel` 为可选 `[locale]` extra,采延迟导入;功能测试以 `importorskip` 运行(wiring/facade 一律验证)。

## 本次更新 (2026-06-20) — 感知哈希图像去重

收合近乎相同的屏幕截图。完整参考:[`docs/source/Zh/doc/new_features/v42_features_doc.rst`](../docs/source/Zh/doc/new_features/v42_features_doc.rst)。
7 changes: 7 additions & 0 deletions README/README_zh-TW.md
@@ -12,6 +12,7 @@

## 目錄

- [本次更新 (2026-06-20) — 區域設定感知的數字、貨幣與日期解析](#本次更新-2026-06-20--區域設定感知的數字貨幣與日期解析)
- [本次更新 (2026-06-20) — 感知雜湊影像去重](#本次更新-2026-06-20--感知雜湊影像去重)
- [本次更新 (2026-06-20) — S3 相容成品儲存](#本次更新-2026-06-20--s3-相容成品儲存)
- [本次更新 (2026-06-20) — 模糊字串比對與去重](#本次更新-2026-06-20--模糊字串比對與去重)
@@ -94,6 +95,12 @@

---

## 本次更新 (2026-06-20) — 區域設定感知的數字、貨幣與日期解析

解析在地化的數字/貨幣/日期。完整參考:[`docs/source/Zh/doc/new_features/v43_features_doc.rst`](../docs/source/Zh/doc/new_features/v43_features_doc.rst)。

- **`parse_decimal` / `parse_number` / `format_decimal` / `format_currency` / `format_date`**(`AC_parse_decimal` / `AC_parse_number` / `AC_format_decimal` / `AC_format_currency` / `AC_format_date`、`ac_*`):像 `"1.234,56"`(de_DE)這樣的 OCR/UI 文字會透過 **Babel** 的 CLDR 資料正確解析為 `1234.56`,值也能依區域設定格式化回去。`babel` 為選用 `[locale]` extra,採延遲匯入;功能測試以 `importorskip` 執行(wiring/facade 一律驗證)。

## 本次更新 (2026-06-20) — 感知雜湊影像去重

收合近乎相同的螢幕截圖。完整參考:[`docs/source/Zh/doc/new_features/v42_features_doc.rst`](../docs/source/Zh/doc/new_features/v42_features_doc.rst)。
55 changes: 55 additions & 0 deletions docs/source/Eng/doc/new_features/v43_features_doc.rst
@@ -0,0 +1,55 @@
Locale-Aware Number, Currency & Date Parsing
============================================

Text scraped from a localized UI or OCR rarely parses with Python's ``float()``:
``"1.234,56"`` means 1234.56 in ``de_DE`` but is malformed to ``float``. These
helpers parse such strings, and format values back, using **Babel**'s CLDR
data, so flows can read and assert on numbers, currency, and dates across
locales.

``babel`` is an **optional** dependency (``pip install je_auto_control[locale]``)
imported lazily, so the package stays importable without it; the functions raise
a ``RuntimeError`` with an install hint only when called without Babel. The
module does not import ``PySide6``.

Headless API
------------

.. code-block:: python

    from je_auto_control import (
        parse_decimal, parse_number, format_decimal, format_currency,
        format_date)

    parse_decimal("1.234,56", locale="de_DE")               # -> 1234.56
    parse_number("1,234", locale="en_US")                   # -> 1234

    format_decimal(1234.5, locale="en_US")                  # -> "1,234.5"
    format_currency(1234.5, "USD", locale="en_US")          # -> "$1,234.50"
    format_date("2026-06-20", locale="de_DE", fmt="short")  # -> "20.06.26"

``format_date`` accepts an ISO ``YYYY-MM-DD`` string or a ``date`` object, plus a
``fmt`` of ``short`` / ``medium`` / ``long`` / ``full`` (default ``medium``).
Parsing and formatting round-trip within a single locale.

.. note::

   The functional path requires Babel; CI runs these tests under
   ``importorskip`` so they execute wherever Babel is installed and are skipped
   otherwise. The wiring/facade are always verified.

Executor commands
-----------------

================================  ===================================================
Command                           Effect
================================  ===================================================
``AC_parse_decimal``              ``{value}`` float from a locale decimal string.
``AC_parse_number``               ``{value}`` int from a locale integer string.
``AC_format_decimal``             ``{text}`` number formatted for a locale.
``AC_format_currency``            ``{text}`` currency (ISO 4217) for a locale.
``AC_format_date``                ``{text}`` ISO date formatted for a locale.
================================  ===================================================

The same operations are exposed as MCP tools (``ac_parse_decimal`` /
``ac_parse_number`` / ``ac_format_decimal`` / ``ac_format_currency`` /
``ac_format_date``) and as Script Builder commands under **Data**.
1 change: 1 addition & 0 deletions docs/source/Eng/eng_index.rst
@@ -65,6 +65,7 @@ Comprehensive guides for all AutoControl features.
doc/new_features/v40_features_doc
doc/new_features/v41_features_doc
doc/new_features/v42_features_doc
doc/new_features/v43_features_doc
doc/ocr_backends/ocr_backends_doc
doc/observability/observability_doc
doc/operations_layer/operations_layer_doc
52 changes: 52 additions & 0 deletions docs/source/Zh/doc/new_features/v43_features_doc.rst
@@ -0,0 +1,52 @@
區域設定感知的數字、貨幣與日期解析
==================================

從在地化 UI 或 OCR 擷取的文字,鮮少能直接通過 Python 的 ``float()``:``"1.234,56"``
在 ``de_DE`` 是一千二百多,但對 ``float`` 卻是格式錯誤。這些輔助函式以 **Babel** 的
CLDR 資料解析這類字串(並可反向格式化值),讓流程能跨區域設定讀取並斷言數字、貨幣與日
期。

``babel`` 為**選用**相依(``pip install je_auto_control[locale]``),採延遲匯入,因此套
件在沒有它時仍可匯入;函式僅在未安裝 Babel 而被呼叫時才拋出明確錯誤。不匯入
``PySide6``。

無頭 API
--------

.. code-block:: python

    from je_auto_control import (
        parse_decimal, parse_number, format_decimal, format_currency,
        format_date)

    parse_decimal("1.234,56", locale="de_DE")               # -> 1234.56
    parse_number("1,234", locale="en_US")                   # -> 1234

    format_decimal(1234.5, locale="en_US")                  # -> "1,234.5"
    format_currency(1234.5, "USD", locale="en_US")          # -> "$1,234.50"
    format_date("2026-06-20", locale="de_DE", fmt="short")  # -> "20.06.26"

``format_date`` 接受 ISO ``YYYY-MM-DD`` 字串或 ``date`` 物件,``fmt`` 可為 ``short`` /
``medium`` / ``long`` / ``full``。同一區域設定內解析 + 格式化可往返一致。

.. note::

   功能路徑需要 Babel;CI 以 ``importorskip`` 執行這些測試,因此在有安裝 Babel 處執
   行、否則跳過。wiring/facade 則一律驗證。

執行器指令
----------

================================  ===================================================
指令                                效果
================================  ===================================================
``AC_parse_decimal``              由區域設定小數字串得到 ``{value}`` float。
``AC_parse_number``               由區域設定整數字串得到 ``{value}`` int。
``AC_format_decimal``             依區域設定格式化數字的 ``{text}``。
``AC_format_currency``            依區域設定的貨幣(ISO 4217)``{text}``。
``AC_format_date``                依區域設定格式化 ISO 日期的 ``{text}``。
================================  ===================================================

相同操作亦提供為 MCP 工具(``ac_parse_decimal`` / ``ac_parse_number`` /
``ac_format_decimal`` / ``ac_format_currency`` / ``ac_format_date``),以及 Script
Builder 中 **Data** 分類下的指令。
1 change: 1 addition & 0 deletions docs/source/Zh/zh_index.rst
@@ -65,6 +65,7 @@ AutoControl 所有功能的完整使用指南。
doc/new_features/v40_features_doc
doc/new_features/v41_features_doc
doc/new_features/v42_features_doc
doc/new_features/v43_features_doc
doc/ocr_backends/ocr_backends_doc
doc/observability/observability_doc
doc/operations_layer/operations_layer_doc
6 changes: 6 additions & 0 deletions je_auto_control/__init__.py
@@ -243,6 +243,10 @@
from je_auto_control.utils.image_dedup import (
    average_hash, dedupe_images, dhash, hamming_distance, images_similar,
)
# Locale-aware number/currency/date parsing & formatting (optional babel)
from je_auto_control.utils.locale_parse import (
    format_currency, format_date, format_decimal, parse_decimal, parse_number,
)
# Background popup/interrupt watchdog (unattended automation)
from je_auto_control.utils.watchdog import (
    PopupWatchdog, WatchdogRule, default_popup_watchdog,
@@ -694,6 +698,8 @@ def start_autocontrol_gui(*args, **kwargs):
"set_default_store",
"average_hash", "dedupe_images", "dhash", "hamming_distance",
"images_similar",
"format_currency", "format_date", "format_decimal", "parse_decimal",
"parse_number",
# MCP server
"AuditLogger", "HttpMCPServer", "MCPContent", "MCPPrompt",
"MCPPromptArgument", "MCPResource", "MCPServer", "MCPTool",
48 changes: 48 additions & 0 deletions je_auto_control/gui/script_builder/command_schema.py
@@ -946,6 +946,54 @@ def _add_misc_specs(specs: List[CommandSpec]) -> None:
        ),
        description="Collapse near-duplicate images by perceptual hash.",
    ))
    specs.append(CommandSpec(
        "AC_parse_decimal", "Data", "Locale: Parse Decimal",
        fields=(
            FieldSpec("text", FieldType.STRING, placeholder="1.234,56"),
            FieldSpec("locale", FieldType.STRING, optional=True,
                      default="en_US"),
        ),
        description="Parse a locale-formatted decimal string to a float.",
    ))
    specs.append(CommandSpec(
        "AC_parse_number", "Data", "Locale: Parse Number",
        fields=(
            FieldSpec("text", FieldType.STRING, placeholder="1,234"),
            FieldSpec("locale", FieldType.STRING, optional=True,
                      default="en_US"),
        ),
        description="Parse a locale-formatted integer string to an int.",
    ))
    specs.append(CommandSpec(
        "AC_format_decimal", "Data", "Locale: Format Decimal",
        fields=(
            FieldSpec("value", FieldType.FLOAT),
            FieldSpec("locale", FieldType.STRING, optional=True,
                      default="en_US"),
        ),
        description="Format a number for a locale.",
    ))
    specs.append(CommandSpec(
        "AC_format_currency", "Data", "Locale: Format Currency",
        fields=(
            FieldSpec("value", FieldType.FLOAT),
            FieldSpec("currency", FieldType.STRING, placeholder="USD"),
            FieldSpec("locale", FieldType.STRING, optional=True,
                      default="en_US"),
        ),
        description="Format a value as currency (ISO 4217) for a locale.",
    ))
    specs.append(CommandSpec(
        "AC_format_date", "Data", "Locale: Format Date",
        fields=(
            FieldSpec("value", FieldType.STRING, placeholder="2026-06-20"),
            FieldSpec("locale", FieldType.STRING, optional=True,
                      default="en_US"),
            FieldSpec("fmt", FieldType.ENUM, optional=True, default="medium",
                      choices=("short", "medium", "long", "full")),
        ),
        description="Format an ISO date string for a locale.",
    ))
    specs.append(CommandSpec(
        "AC_generate_sop", "Report", "Generate SOP Document",
        fields=(
37 changes: 37 additions & 0 deletions je_auto_control/utils/executor/action_executor.py
@@ -3136,6 +3136,38 @@ def _dedupe_images(paths: Any, max_distance: int = 5) -> Dict[str, Any]:
max_distance=max_distance)}


def _parse_decimal(text: str, locale: str = "en_US") -> Dict[str, Any]:
    """Adapter: parse a locale-formatted decimal string to a float."""
    from je_auto_control.utils.locale_parse import parse_decimal
    return {"value": parse_decimal(text, locale)}


def _parse_number(text: str, locale: str = "en_US") -> Dict[str, Any]:
    """Adapter: parse a locale-formatted integer string to an int."""
    from je_auto_control.utils.locale_parse import parse_number
    return {"value": parse_number(text, locale)}


def _format_decimal(value: float, locale: str = "en_US") -> Dict[str, Any]:
    """Adapter: format a number for a locale."""
    from je_auto_control.utils.locale_parse import format_decimal
    return {"text": format_decimal(value, locale)}


def _format_currency(value: float, currency: str,
                     locale: str = "en_US") -> Dict[str, Any]:
    """Adapter: format a value as currency for a locale."""
    from je_auto_control.utils.locale_parse import format_currency
    return {"text": format_currency(value, currency, locale)}


def _format_date(value: str, locale: str = "en_US",
                 fmt: str = "medium") -> Dict[str, Any]:
    """Adapter: format an ISO date string for a locale."""
    from je_auto_control.utils.locale_parse import format_date
    return {"text": format_date(value, locale, fmt)}


class Executor:
"""
Executor
@@ -3397,6 +3429,11 @@ def __init__(self):
"AC_s3_delete": _s3_delete,
"AC_image_hash": _image_hash,
"AC_dedupe_images": _dedupe_images,
"AC_parse_decimal": _parse_decimal,
"AC_parse_number": _parse_number,
"AC_format_decimal": _format_decimal,
"AC_format_currency": _format_currency,
"AC_format_date": _format_date,
"AC_a11y_record_start": _a11y_record_start,
"AC_a11y_record_stop": _a11y_record_stop,
"AC_a11y_record_events": _a11y_record_events,
9 changes: 9 additions & 0 deletions je_auto_control/utils/locale_parse/__init__.py
@@ -0,0 +1,9 @@
"""Locale-aware number/currency/date parsing & formatting (optional babel)."""
from je_auto_control.utils.locale_parse.locale_parse import (
    format_currency, format_date, format_decimal, parse_decimal, parse_number,
)

__all__ = [
    "format_currency", "format_date", "format_decimal", "parse_decimal",
    "parse_number",
]
59 changes: 59 additions & 0 deletions je_auto_control/utils/locale_parse/locale_parse.py
@@ -0,0 +1,59 @@
"""Parse and format numbers, currency, and dates the way a locale writes them.

Text scraped from a localized UI or OCR rarely parses with Python's ``float()``:
``"1.234,56"`` means 1234.56 in ``de_DE`` but is malformed to ``float``. These
helpers parse such strings (and format values back) using **Babel**'s CLDR data,
so flows can read and assert on numbers/currency/dates across locales.

``babel`` is an **optional** dependency (``pip install je_auto_control[locale]``);
it is imported lazily, so the package stays importable without it and the
functions raise a clear error only when actually called without Babel installed.
Imports no ``PySide6``.
"""
import datetime
from typing import Any, Union


def _numbers() -> Any:
    """Lazily import ``babel.numbers``, raising a clear error if Babel is absent."""
    try:
        from babel import numbers
    except ImportError as error:  # pragma: no cover - exercised without babel
        raise RuntimeError(
            "locale parsing requires Babel: pip install "
            "je_auto_control[locale]") from error
    return numbers


def parse_decimal(text: str, locale: str = "en_US") -> float:
    """Parse a locale-formatted decimal string into a ``float``."""
    return float(_numbers().parse_decimal(text, locale=locale))


def parse_number(text: str, locale: str = "en_US") -> int:
    """Parse a locale-formatted integer string into an ``int``.

    The text is parsed as a decimal first, so any fractional part is
    truncated by ``int()`` rather than rejected.
    """
    return int(_numbers().parse_decimal(text, locale=locale))


def format_decimal(value: Union[int, float], locale: str = "en_US") -> str:
    """Format a number the way ``locale`` writes decimals."""
    return _numbers().format_decimal(value, locale=locale)


def format_currency(value: Union[int, float], currency: str,
                    locale: str = "en_US") -> str:
    """Format ``value`` as ``currency`` (ISO 4217) for ``locale``."""
    return _numbers().format_currency(value, currency, locale=locale)


def format_date(value: Union[str, datetime.date], locale: str = "en_US",
                fmt: str = "medium") -> str:
    """Format a date (or ISO ``YYYY-MM-DD`` string) for ``locale``."""
    try:
        from babel.dates import format_date as _format_date
    except ImportError as error:  # pragma: no cover - exercised without babel
        raise RuntimeError(
            "locale formatting requires Babel: pip install "
            "je_auto_control[locale]") from error
    # ISO strings are converted to a date; date objects pass through unchanged.
    date_value = (datetime.date.fromisoformat(value)
                  if isinstance(value, str) else value)
    return _format_date(date_value, format=fmt, locale=locale)