diff --git a/README.md b/README.md index ae0c5c49..d0f3f09e 100644 --- a/README.md +++ b/README.md @@ -13,6 +13,7 @@ ## Table of Contents +- [What's new (2026-06-22) — ICU-lite MessageFormat (Plural / Select)](#whats-new-2026-06-22--icu-lite-messageformat-plural--select) - [What's new (2026-06-22) — Locale-Aware List Formatting](#whats-new-2026-06-22--locale-aware-list-formatting) - [What's new (2026-06-22) — Bidirectional-Text QA (Trojan-Source Scan)](#whats-new-2026-06-22--bidirectional-text-qa-trojan-source-scan) - [What's new (2026-06-22) — Readability Scoring](#whats-new-2026-06-22--readability-scoring) @@ -165,6 +166,12 @@ --- +## What's new (2026-06-22) — ICU-lite MessageFormat (Plural / Select) + +Render count-aware localised messages. Full reference: [`docs/source/Eng/doc/new_features/v113_features_doc.rst`](docs/source/Eng/doc/new_features/v113_features_doc.rst). + +- **`format_message` / `plural_category` / `ordinal_category`** (`AC_format_message`): `i18n_test.check_catalog` only compares placeholder sets and `interpolate` is flat `${var}` — neither renders `"{count, plural, one {# item} other {# items}}"`. This implements the ICU MessageFormat subset most apps use: `select`, `plural`, `selectordinal` with CLDR categories, exact `=N` selectors, the `#` count, `offset:`, nesting and apostrophe quoting. Injectable plural rules. Pure-stdlib, deterministic. + ## What's new (2026-06-22) — Locale-Aware List Formatting Join items the way a language expects ("A, B, and C"). Full reference: [`docs/source/Eng/doc/new_features/v112_features_doc.rst`](docs/source/Eng/doc/new_features/v112_features_doc.rst). 
diff --git a/README/README_zh-CN.md b/README/README_zh-CN.md index c0de127e..fa7c5706 100644 --- a/README/README_zh-CN.md +++ b/README/README_zh-CN.md @@ -12,6 +12,7 @@ ## 目录 +- [本次更新 (2026-06-22) — ICU-lite MessageFormat(复数 / 选择)](#本次更新-2026-06-22--icu-lite-messageformat复数--选择) - [本次更新 (2026-06-22) — 区域感知列表格式化](#本次更新-2026-06-22--区域感知列表格式化) - [本次更新 (2026-06-22) — 双向文字 QA(Trojan-Source 扫描)](#本次更新-2026-06-22--双向文字-qatrojan-source-扫描) - [本次更新 (2026-06-22) — 可读性评分](#本次更新-2026-06-22--可读性评分) @@ -168,6 +169,12 @@ 平滑噪声值序列。完整参考:[`docs/source/Zh/doc/new_features/v102_features_doc.rst`](../docs/source/Zh/doc/new_features/v102_features_doc.rst)。 +## 本次更新 (2026-06-22) — ICU-lite MessageFormat(复数 / 选择) + +渲染依数量变化的在地化消息。完整参考:[`docs/source/Zh/doc/new_features/v113_features_doc.rst`](../docs/source/Zh/doc/new_features/v113_features_doc.rst)。 + +- **`format_message` / `plural_category` / `ordinal_category`**(`AC_format_message`):`i18n_test.check_catalog` 只比较占位符集合、`interpolate` 只做扁平 `${var}`——两者都无法渲染 `"{count, plural, one {# item} other {# items}}"`。本功能实作多数应用会用到的 ICU MessageFormat 子集:`select`、`plural`、`selectordinal` 搭配 CLDR 类别、优先于类别的精确 `=N` 选择器、`#` 数量、`offset:`、嵌套与单引号转义。复数规则可注入。纯标准库、确定。 + ## 本次更新 (2026-06-22) — 区域感知列表格式化 依某语言的期望串接项目(「A、B and C」)。完整参考:[`docs/source/Zh/doc/new_features/v112_features_doc.rst`](../docs/source/Zh/doc/new_features/v112_features_doc.rst)。 diff --git a/README/README_zh-TW.md b/README/README_zh-TW.md index b8f79163..37ecf28a 100644 --- a/README/README_zh-TW.md +++ b/README/README_zh-TW.md @@ -12,6 +12,7 @@ ## 目錄 +- [本次更新 (2026-06-22) — ICU-lite MessageFormat(複數 / 選擇)](#本次更新-2026-06-22--icu-lite-messageformat複數--選擇) - [本次更新 (2026-06-22) — 地區感知清單格式化](#本次更新-2026-06-22--地區感知清單格式化) - [本次更新 (2026-06-22) — 雙向文字 QA(Trojan-Source 掃描)](#本次更新-2026-06-22--雙向文字-qatrojan-source-掃描) - [本次更新 (2026-06-22) — 可讀性評分](#本次更新-2026-06-22--可讀性評分) @@ -168,6 +169,12 @@ 
平滑雜訊值序列。完整參考:[`docs/source/Zh/doc/new_features/v102_features_doc.rst`](../docs/source/Zh/doc/new_features/v102_features_doc.rst)。 +## 本次更新 (2026-06-22) — ICU-lite MessageFormat(複數 / 選擇) + +渲染依數量變化的在地化訊息。完整參考:[`docs/source/Zh/doc/new_features/v113_features_doc.rst`](../docs/source/Zh/doc/new_features/v113_features_doc.rst)。 + +- **`format_message` / `plural_category` / `ordinal_category`**(`AC_format_message`):`i18n_test.check_catalog` 只比較佔位符集合、`interpolate` 只做扁平 `${var}`——兩者都無法渲染 `"{count, plural, one {# item} other {# items}}"`。本功能實作多數應用會用到的 ICU MessageFormat 子集:`select`、`plural`、`selectordinal` 搭配 CLDR 類別、優先於類別的精確 `=N` 選擇器、`#` 數量、`offset:`、巢狀與單引號跳脫。複數規則可注入。純標準函式庫、具決定性。 + ## 本次更新 (2026-06-22) — 地區感知清單格式化 依某語言的期望串接項目(「A、B and C」)。完整參考:[`docs/source/Zh/doc/new_features/v112_features_doc.rst`](../docs/source/Zh/doc/new_features/v112_features_doc.rst)。 diff --git a/docs/source/Eng/doc/new_features/v113_features_doc.rst b/docs/source/Eng/doc/new_features/v113_features_doc.rst new file mode 100644 index 00000000..8ff5831f --- /dev/null +++ b/docs/source/Eng/doc/new_features/v113_features_doc.rst @@ -0,0 +1,46 @@ +ICU-lite MessageFormat (Plural / Select) +======================================== + +``i18n_test.check_catalog`` only compares placeholder *sets* and ``interpolate`` +does flat ``${var}`` substitution — neither can render the count-aware messages +real localisation needs, e.g. ``"{count, plural, one {# item} other {# items}}"``. +This implements the ICU MessageFormat subset most apps use. + +Pure standard library; imports no ``PySide6``. The plural/ordinal category +functions are pure and the rule callables are injectable, so rendering is fully +deterministic in CI. + +Headless API +------------ + +.. 
code-block:: python + + from je_auto_control import format_message, plural_category, ordinal_category + + plural = "{count, plural, one {# item} other {# items}}" + format_message(plural, {"count": 1}) # '1 item' + format_message(plural, {"count": 5}) # '5 items' + + select = "{g, select, male {He} female {She} other {They}} won" + format_message(select, {"g": "female"}) # 'She won' + + ordinal = "{place, selectordinal, one {#st} two {#nd} few {#rd} other {#th}}" + format_message(ordinal, {"place": 3}) # '3rd' + + plural_category(2) # 'other' + ordinal_category(3) # 'few' + +Supported: simple ``{name}`` arguments, ``select`` (e.g. gender), ``plural`` and +``selectordinal`` with the CLDR categories (``zero``/``one``/``two``/``few``/ +``many``/``other``), exact ``=N`` selectors that win over a category, the ``#`` +count placeholder, a plural ``offset:`` (``#`` becomes count − offset), nested +arguments, and ICU apostrophe quoting (``''`` → ``'``; ``'{'`` → literal brace). +``plural_rules`` / ``ordinal_rules`` let you inject custom category functions; +``locale`` selects the built-ins (``en``, ``fr``). + +Executor commands +----------------- + +``AC_format_message`` takes a ``pattern`` plus a JSON ``args`` object and returns +``{text}``, accepting ``locale``. It is exposed as the MCP tool +``ac_format_message`` and as a Script Builder command under **Data**. diff --git a/docs/source/Eng/eng_index.rst b/docs/source/Eng/eng_index.rst index 6d30a4ec..f22f4ae4 100644 --- a/docs/source/Eng/eng_index.rst +++ b/docs/source/Eng/eng_index.rst @@ -135,6 +135,7 @@ Comprehensive guides for all AutoControl features. 
doc/new_features/v110_features_doc doc/new_features/v111_features_doc doc/new_features/v112_features_doc + doc/new_features/v113_features_doc doc/ocr_backends/ocr_backends_doc doc/observability/observability_doc doc/operations_layer/operations_layer_doc diff --git a/docs/source/Zh/doc/new_features/v113_features_doc.rst b/docs/source/Zh/doc/new_features/v113_features_doc.rst new file mode 100644 index 00000000..dbb1c599 --- /dev/null +++ b/docs/source/Zh/doc/new_features/v113_features_doc.rst @@ -0,0 +1,40 @@ +ICU-lite MessageFormat(複數 / 選擇) +=================================== + +``i18n_test.check_catalog`` 只比較佔位符*集合*、``interpolate`` 只做扁平 ``${var}`` 取代——兩者都無法渲染 +真正在地化所需的依數量變化訊息,例如 ``"{count, plural, one {# item} other {# items}}"``。本功能實作多數應用 +會用到的 ICU MessageFormat 子集。 + +純標準函式庫;不匯入 ``PySide6``。複數/序數類別函式為純函式,且規則 callable 可注入,因此渲染在 CI 中 +完全具決定性。 + +無頭 API +-------- + +.. code-block:: python + + from je_auto_control import format_message, plural_category, ordinal_category + + plural = "{count, plural, one {# item} other {# items}}" + format_message(plural, {"count": 1}) # '1 item' + format_message(plural, {"count": 5}) # '5 items' + + select = "{g, select, male {He} female {She} other {They}} won" + format_message(select, {"g": "female"}) # 'She won' + + ordinal = "{place, selectordinal, one {#st} two {#nd} few {#rd} other {#th}}" + format_message(ordinal, {"place": 3}) # '3rd' + + plural_category(2) # 'other' + ordinal_category(3) # 'few' + +支援:簡單 ``{name}`` 參數、``select``(如性別)、``plural`` 與 ``selectordinal`` 搭配 CLDR 類別 +(``zero``/``one``/``two``/``few``/``many``/``other``)、優先於類別的精確 ``=N`` 選擇器、``#`` 數量佔位符、 +複數 ``offset:``(``#`` 變為 count − offset)、巢狀參數,以及 ICU 單引號跳脫(``''`` → ``'``;``'{'`` → 字面 +大括號)。``plural_rules`` / ``ordinal_rules`` 可注入自訂類別函式;``locale`` 選擇內建規則(``en``、``fr``)。 + +執行器命令 +---------- + +``AC_format_message`` 接受 ``pattern`` 與 JSON ``args`` 物件並回傳 ``{text}``,可帶 ``locale``。它以 MCP 工具 +``ac_format_message`` 以及 Script Builder 中 **Data** 分類下的命令提供。 diff --git 
a/docs/source/Zh/zh_index.rst b/docs/source/Zh/zh_index.rst index ae08cf99..1cf876c2 100644 --- a/docs/source/Zh/zh_index.rst +++ b/docs/source/Zh/zh_index.rst @@ -135,6 +135,7 @@ AutoControl 所有功能的完整使用指南。 doc/new_features/v110_features_doc doc/new_features/v111_features_doc doc/new_features/v112_features_doc + doc/new_features/v113_features_doc doc/ocr_backends/ocr_backends_doc doc/observability/observability_doc doc/operations_layer/operations_layer_doc diff --git a/je_auto_control/__init__.py b/je_auto_control/__init__.py index b759c06f..96584da8 100644 --- a/je_auto_control/__init__.py +++ b/je_auto_control/__init__.py @@ -237,6 +237,10 @@ from je_auto_control.utils.bidi_check import is_balanced as is_bidi_balanced # Locale-aware list formatting ("A, B, and C") in the style of CLDR from je_auto_control.utils.list_format import format_list +# ICU-lite MessageFormat (plural / select / selectordinal rendering) +from je_auto_control.utils.message_format import ( + format_message, ordinal_category, plural_category, +) # CI workflow annotations (GitHub Actions) from je_auto_control.utils.ci_annotations import ( emit_annotations, format_annotation, @@ -991,6 +995,9 @@ def start_autocontrol_gui(*args, **kwargs): "is_trojan_source", "strip_bidi_controls", "format_list", + "format_message", + "ordinal_category", + "plural_category", "emit_annotations", "format_annotation", "ClipboardHistory", "default_clipboard_history", "analyze_heal_log", "heal_stats", "scan_secrets", diff --git a/je_auto_control/gui/script_builder/command_schema.py b/je_auto_control/gui/script_builder/command_schema.py index be454adf..28f2be76 100644 --- a/je_auto_control/gui/script_builder/command_schema.py +++ b/je_auto_control/gui/script_builder/command_schema.py @@ -2139,6 +2139,17 @@ def _add_resilience_specs(specs: List[CommandSpec]) -> None: ), description="Join items into a localised list ('A, B, and C').", )) + specs.append(CommandSpec( + "AC_format_message", "Data", "Text: Format Message 
(ICU)", + fields=( + FieldSpec("pattern", FieldType.STRING, + placeholder="{count, plural, one {# item} other {# items}}"), + FieldSpec("args", FieldType.STRING, placeholder='{"count": 3}'), + FieldSpec("locale", FieldType.STRING, optional=True, + placeholder="en | fr"), + ), + description="Render ICU plural/select/selectordinal message.", + )) specs.append(CommandSpec( "AC_diff_rows", "Data", "Dataset Diff: Rows by Key", fields=( diff --git a/je_auto_control/utils/executor/action_executor.py b/je_auto_control/utils/executor/action_executor.py index b5b6b47d..7679ae47 100644 --- a/je_auto_control/utils/executor/action_executor.py +++ b/je_auto_control/utils/executor/action_executor.py @@ -3021,6 +3021,16 @@ def _format_list(items: Any, style: str = "and", return {"text": format_list(list(items), style=style, locale=locale)} +def _format_message(pattern: str, args: Any = None, + locale: str = "en") -> Dict[str, Any]: + """Adapter: render an ICU-lite MessageFormat pattern.""" + import json + from je_auto_control.utils.message_format import format_message + if isinstance(args, str): + args = json.loads(args) + return {"text": format_message(pattern, args or {}, locale=locale)} + + def _cas_put(name: str, key: str, value: Any, expected_version: Any = None) -> Dict[str, Any]: """Adapter: optimistic put into a named versioned store.""" @@ -4711,6 +4721,7 @@ def __init__(self): "AC_bidi_check": _bidi_check, "AC_bidi_strip": _bidi_strip, "AC_format_list": _format_list, + "AC_format_message": _format_message, "AC_detect_drift": _detect_drift, "AC_categorical_drift": _categorical_drift, "AC_diff_rows": _diff_rows, diff --git a/je_auto_control/utils/mcp_server/tools/_factories.py b/je_auto_control/utils/mcp_server/tools/_factories.py index 30775162..9c62f224 100644 --- a/je_auto_control/utils/mcp_server/tools/_factories.py +++ b/je_auto_control/utils/mcp_server/tools/_factories.py @@ -3682,6 +3682,23 @@ def list_format_tools() -> List[MCPTool]: ] +def message_format_tools() 
-> List[MCPTool]: + return [ + MCPTool( + name="ac_format_message", + description=("Render an ICU-lite MessageFormat 'pattern' against " + "'args' (plural/select/selectordinal, =N, #). " + "'locale' picks plural rules. Returns {text}."), + input_schema=schema( + {"pattern": {"type": "string"}, "args": {"type": "object"}, + "locale": {"type": "string"}}, + ["pattern"]), + handler=h.format_message, + annotations=READ_ONLY, + ), + ] + + def bidi_check_tools() -> List[MCPTool]: return [ MCPTool( @@ -5744,7 +5761,7 @@ def media_assert_tools() -> List[MCPTool]: timeseries_tools, anomaly_tools, smoothing_tools, idempotency_tools, dedup_window_tools, sequence_gap_tools, optimistic_tools, outbox_tools, locale_collation_tools, confusables_tools, readability_tools, - bidi_check_tools, list_format_tools, + bidi_check_tools, list_format_tools, message_format_tools, dataset_diff_tools, referential_tools, link_header_tools, multipart_tools, http_content_tools, cookie_jar_tools, http_conditional_tools, saga_tools, decision_table_tools, locator_repair_tools, diff --git a/je_auto_control/utils/mcp_server/tools/_handlers.py b/je_auto_control/utils/mcp_server/tools/_handlers.py index fc330fe4..6c351095 100644 --- a/je_auto_control/utils/mcp_server/tools/_handlers.py +++ b/je_auto_control/utils/mcp_server/tools/_handlers.py @@ -2002,6 +2002,11 @@ def format_list(items, style="and", locale="en"): return _format_list(items, style, locale) +def format_message(pattern, args=None, locale="en"): + from je_auto_control.utils.executor.action_executor import _format_message + return _format_message(pattern, args, locale) + + def detect_drift(reference, current, threshold=0.25, bins=10): from je_auto_control.utils.executor.action_executor import _detect_drift return _detect_drift(reference, current, threshold, bins) diff --git a/je_auto_control/utils/message_format/__init__.py b/je_auto_control/utils/message_format/__init__.py new file mode 100644 index 00000000..451451ef --- /dev/null +++ 
b/je_auto_control/utils/message_format/__init__.py @@ -0,0 +1,6 @@ +"""ICU-lite MessageFormat (plural / select / selectordinal rendering).""" +from je_auto_control.utils.message_format.message_format import ( + format_message, ordinal_category, plural_category, +) + +__all__ = ["format_message", "ordinal_category", "plural_category"] diff --git a/je_auto_control/utils/message_format/message_format.py b/je_auto_control/utils/message_format/message_format.py new file mode 100644 index 00000000..35578298 --- /dev/null +++ b/je_auto_control/utils/message_format/message_format.py @@ -0,0 +1,230 @@ +"""ICU-lite MessageFormat: plural / select / selectordinal message rendering. + +``i18n_test.check_catalog`` only compares placeholder *sets* and ``interpolate`` +does flat ``${var}`` substitution — neither can render the count-aware messages +real localisation needs: ``"{count, plural, one {# item} other {# items}}"``. +This implements the ICU MessageFormat subset most apps use: simple ``{name}`` +arguments, ``select`` (e.g. gender), ``plural`` and ``selectordinal`` with CLDR +plural categories, exact ``=N`` selectors, the ``#`` count placeholder, an +``offset:`` and ICU apostrophe quoting. + +Pure standard library; imports no ``PySide6``. The plural/ordinal category +functions are pure and the rule callables are injectable, so rendering is fully +deterministic in CI. 
+""" +from typing import Any, Callable, Dict, List, Mapping, Optional, Tuple + +Node = Tuple +PluralRule = Callable[[Any], str] + +_WHITESPACE = " \t\r\n" +_TOKEN_STOP = set(_WHITESPACE) | {",", "{", "}"} +_QUOTABLE = "{}#|" + + +# --- CLDR plural / ordinal categories ------------------------------------- + +def _to_operands(value: Any) -> Tuple[float, int, bool]: + """Return ``(number, integer_part, is_integer)`` for a numeric value.""" + number = float(value) + return number, int(number), number.is_integer() + + +def _cardinal_en(_number: float, integer: int, is_int: bool) -> str: + return "one" if (is_int and integer == 1) else "other" + + +def _cardinal_fr(_number: float, integer: int, _is_int: bool) -> str: + return "one" if integer in (0, 1) else "other" + + +def _ordinal_en(_number: float, integer: int, is_int: bool) -> str: + if not is_int: + return "other" + mod10, mod100 = integer % 10, integer % 100 + if mod10 == 1 and mod100 != 11: + return "one" + if mod10 == 2 and mod100 != 12: + return "two" + if mod10 == 3 and mod100 != 13: + return "few" + return "other" + + +_CARDINAL = {"en": _cardinal_en, "fr": _cardinal_fr} +_ORDINAL = {"en": _ordinal_en} + + +def plural_category(number: Any, locale: str = "en") -> str: + """Return the CLDR cardinal plural category (``one``/``other``/...).""" + rule = _CARDINAL.get(locale, _cardinal_en) + return rule(*_to_operands(number)) + + +def ordinal_category(number: Any, locale: str = "en") -> str: + """Return the CLDR ordinal plural category (``one``/``two``/``few``/...).""" + rule = _ORDINAL.get(locale, _ordinal_en) + return rule(*_to_operands(number)) + + +def _format_number(value: Any) -> str: + """Render a number without a trailing ``.0`` for integer values.""" + if isinstance(value, float) and value.is_integer(): + return str(int(value)) + return str(value) + + +# --- parsing -------------------------------------------------------------- + +def _skip_ws(text: str, index: int) -> int: + while index < len(text) and 
text[index] in _WHITESPACE: + index += 1 + return index + + +def _read_token(text: str, index: int) -> Tuple[str, int]: + start = index + while index < len(text) and text[index] not in _TOKEN_STOP: + index += 1 + return text[start:index], index + + +def _flush(buffer: List[str], nodes: List[Node]) -> None: + if buffer: + nodes.append(("text", "".join(buffer))) + buffer.clear() + + +def _consume_quote(text: str, index: int, buffer: List[str]) -> int: + """Handle an ICU apostrophe at ``index``; append literal text to buffer.""" + nxt = text[index + 1] if index + 1 < len(text) else "" + if nxt == "'": + buffer.append("'") + return index + 2 + if nxt in _QUOTABLE: + index += 1 + while index < len(text) and text[index] != "'": + buffer.append(text[index]) + index += 1 + return index + 1 if index < len(text) else index + buffer.append("'") + return index + 1 + + +def _parse_message(text: str, index: int) -> Tuple[List[Node], int]: + """Parse a (sub)message until end of string or an unescaped ``}``.""" + nodes: List[Node] = [] + buffer: List[str] = [] + while index < len(text) and text[index] != "}": + char = text[index] + if char == "{": + _flush(buffer, nodes) + node, index = _parse_argument(text, index) + nodes.append(node) + elif char == "#": + _flush(buffer, nodes) + nodes.append(("hash",)) + index += 1 + elif char == "'": + index = _consume_quote(text, index, buffer) + else: + buffer.append(char) + index += 1 + _flush(buffer, nodes) + return nodes, index + + +def _parse_options(text: str, index: int) -> Tuple[Dict[str, List[Node]], int, int]: + """Parse ``selector {submessage}`` pairs (and an optional ``offset:``).""" + options: Dict[str, List[Node]] = {} + offset = 0 + index = _skip_ws(text, index) + while index < len(text) and text[index] != "}": + selector, index = _read_token(text, index) + index = _skip_ws(text, index) + if selector.startswith("offset:"): + offset = int(selector[len("offset:"):]) + continue + submessage, index = _parse_message(text, index + 1) 
+ options[selector] = submessage + index = _skip_ws(text, index + 1) + return options, offset, index + + +def _parse_argument(text: str, index: int) -> Tuple[Node, int]: + """Parse a ``{...}`` argument starting at the opening brace.""" + index = _skip_ws(text, index + 1) + name, index = _read_token(text, index) + index = _skip_ws(text, index) + if index < len(text) and text[index] == "}": + return ("arg", name), index + 1 + index = _skip_ws(text, index + 1) # skip the comma + arg_type, index = _read_token(text, index) + index = _skip_ws(text, index) + if arg_type not in ("plural", "selectordinal", "select"): + raise ValueError(f"unknown argument type: {arg_type!r}") + options, offset, index = _parse_options(text, index + 1) # skip the comma + index += 1 # skip the closing brace + if arg_type == "select": + return ("select", name, options), index + return ("plural", name, options, arg_type == "selectordinal", offset), index + + +# --- rendering ------------------------------------------------------------ + +def _render_select(node: Node, args: Mapping[str, Any], + rules: Tuple[PluralRule, PluralRule]) -> str: + _, name, options = node + chosen = options.get(str(args.get(name, ""))) or options.get("other") or [] + return _render(chosen, args, rules) + + +def _render_plural(node: Node, args: Mapping[str, Any], + rules: Tuple[PluralRule, PluralRule]) -> str: + _, name, options, is_ordinal, offset = node + value = args.get(name, 0) + number, integer, is_int = _to_operands(value) + exact = "=" + (str(integer) if is_int else _format_number(number)) + chosen = options.get(exact) + if chosen is None: + rule = rules[1] if is_ordinal else rules[0] + chosen = options.get(rule(number - offset)) or options.get("other") or [] + return _render(chosen, args, rules, plural_value=number - offset) + + +def _render(nodes: List[Node], args: Mapping[str, Any], + rules: Tuple[PluralRule, PluralRule], + plural_value: Optional[float] = None) -> str: + parts: List[str] = [] + for node in nodes: + kind 
= node[0] + if kind == "text": + parts.append(node[1]) + elif kind == "hash": + parts.append(_format_number(plural_value) + if plural_value is not None else "#") + elif kind == "arg": + parts.append(str(args.get(node[1], ""))) + elif kind == "select": + parts.append(_render_select(node, args, rules)) + else: + parts.append(_render_plural(node, args, rules)) + return "".join(parts) + + +def format_message(pattern: str, arguments: Optional[Mapping[str, Any]] = None, + *, locale: str = "en", + plural_rules: Optional[PluralRule] = None, + ordinal_rules: Optional[PluralRule] = None) -> str: + """Render an ICU-lite ``pattern`` against ``arguments``. + + Supports ``{name}`` placeholders, ``select``, ``plural`` and + ``selectordinal`` with CLDR categories, exact ``=N`` selectors, ``#`` (the + count, minus any ``offset:``) and ``'`` quoting. ``plural_rules`` / + ``ordinal_rules`` override the locale's category functions. + """ + args = arguments or {} + nodes, _ = _parse_message(pattern or "", 0) + cardinal = plural_rules or (lambda value: plural_category(value, locale)) + ordinal = ordinal_rules or (lambda value: ordinal_category(value, locale)) + return _render(nodes, args, (cardinal, ordinal)) diff --git a/test/unit_test/headless/test_message_format_batch.py b/test/unit_test/headless/test_message_format_batch.py new file mode 100644 index 00000000..9291eeee --- /dev/null +++ b/test/unit_test/headless/test_message_format_batch.py @@ -0,0 +1,97 @@ +"""Headless tests for ICU-lite MessageFormat. No Qt.""" +import json + +import je_auto_control as ac +from je_auto_control.utils.message_format import ( + format_message, ordinal_category, plural_category, +) + +_PLURAL = "{count, plural, one {# item} other {# items}}" + + +def test_simple_placeholder(): + assert format_message("Hello {name}!", {"name": "World"}) == "Hello World!" 
+ + +def test_plural_one_and_other_with_hash(): + assert format_message(_PLURAL, {"count": 1}) == "1 item" + assert format_message(_PLURAL, {"count": 5}) == "5 items" + + +def test_exact_selector_beats_category(): + pattern = "{count, plural, =0 {no items} one {# item} other {# items}}" + assert format_message(pattern, {"count": 0}) == "no items" + assert format_message(pattern, {"count": 1}) == "1 item" + + +def test_select(): + pattern = "{g, select, male {He} female {She} other {They}} won" + assert format_message(pattern, {"g": "female"}) == "She won" + assert format_message(pattern, {"g": "nb"}) == "They won" # other + + +def test_selectordinal(): + pattern = ("{place, selectordinal, one {#st} two {#nd} few {#rd} " + "other {#th}}") + assert format_message(pattern, {"place": 1}) == "1st" + assert format_message(pattern, {"place": 2}) == "2nd" + assert format_message(pattern, {"place": 3}) == "3rd" + assert format_message(pattern, {"place": 11}) == "11th" # special + + +def test_offset_adjusts_hash(): + pattern = "{n, plural, offset:1 one {you and # other} other {you and # others}}" + assert format_message(pattern, {"n": 3}) == "you and 2 others" + + +def test_nested_select_and_plural(): + pattern = ("{g, select, " + "male {He has {n, plural, one {# cat} other {# cats}}} " + "other {They have {n, plural, one {# cat} other {# cats}}}}") + assert format_message(pattern, {"g": "male", "n": 1}) == "He has 1 cat" + assert format_message(pattern, {"g": "x", "n": 4}) == "They have 4 cats" + + +def test_apostrophe_quoting(): + assert format_message("it''s {x} '{lit}'", {"x": "A"}) == "it's A {lit}" + + +def test_category_helpers(): + assert plural_category(1) == "one" + assert plural_category(2) == "other" + assert plural_category(1, locale="fr") == "one" + assert plural_category(0, locale="fr") == "one" # French: 0 is "one" + assert ordinal_category(3) == "few" + assert ordinal_category(13) == "other" + + +def test_injectable_rules(): + # always-other rule overrides 
the locale default + assert format_message(_PLURAL, {"count": 1}, + plural_rules=lambda _n: "other") == "1 items" + + +# --- wiring --------------------------------------------------------------- + +def test_executor_round_trip(): + rec = ac.execute_action([[ + "AC_format_message", + {"pattern": _PLURAL, "args": json.dumps({"count": 2})}]]) + out = next(v for v in rec.values() if isinstance(v, dict)) + assert out["text"] == "2 items" + + +def test_wiring(): + known = ac.executor.known_commands() + assert "AC_format_message" in set(known) + from je_auto_control.utils.mcp_server.tools import build_default_tool_registry + names = {t.name for t in build_default_tool_registry()} + assert "ac_format_message" in names + from je_auto_control.gui.script_builder.command_schema import _build_specs + specs = {s.command for s in _build_specs()} + assert "AC_format_message" in specs + + +def test_facade_exports(): + for attr in ("format_message", "plural_category", "ordinal_category"): + assert hasattr(ac, attr) and attr in ac.__all__
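The built-in cardinal table in this patch covers only `en` and `fr`, so other locales would go through the injectable `plural_rules` / `ordinal_rules` callables. Below is a minimal sketch of such a callable, assuming the CLDR cardinal rules for Russian purely as an illustration. `russian_cardinal` is a hypothetical name that does not appear in the patch, and, like the patch's `_to_operands`, it treats integer-valued floats as integers.

```python
from typing import Any


def russian_cardinal(value: Any) -> str:
    """Hypothetical rule callable: CLDR cardinal category for Russian.

    Assumption: integer-valued floats (e.g. 2.0) count as integers,
    mirroring how the patch derives its operands from float(value).
    """
    number = float(value)
    if not number.is_integer():
        # CLDR puts values with fraction digits in "other" for Russian.
        return "other"
    integer = abs(int(number))
    mod10, mod100 = integer % 10, integer % 100
    if mod10 == 1 and mod100 != 11:
        return "one"    # 1, 21, 31, ... but not 11
    if 2 <= mod10 <= 4 and not 12 <= mod100 <= 14:
        return "few"    # 2-4, 22-24, ... but not 12-14
    return "many"       # 0, 5-20, 25-30, ...
```

Since `PluralRule` is `Callable[[Any], str]`, a function of this shape could be passed as `format_message(pattern, args, plural_rules=russian_cardinal)`. The pattern would then carry `one`, `few`, `many` and `other` branches; a category with no matching branch falls back to `other`.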