Integration-Automation · JE-Chen · Jun 22, 2026 · Jun 22, 2026
diff --git a/README.md b/README.md
@@ -13,6 +13,7 @@
 
 ## Table of Contents
 
+- [What's new (2026-06-22) — Locale-Aware String Collation](#whats-new-2026-06-22--locale-aware-string-collation)
 - [What's new (2026-06-22) — Transactional Outbox](#whats-new-2026-06-22--transactional-outbox)
 - [What's new (2026-06-22) — Optimistic-Concurrency Versioned Store](#whats-new-2026-06-22--optimistic-concurrency-versioned-store)
 - [What's new (2026-06-22) — Per-Stream Sequence-Gap Detection](#whats-new-2026-06-22--per-stream-sequence-gap-detection)
@@ -160,6 +161,12 @@
 
 ---
 
+## What's new (2026-06-22) — Locale-Aware String Collation
+
+Sort strings the way a reader of the language expects. Full reference: [`docs/source/Eng/doc/new_features/v108_features_doc.rst`](docs/source/Eng/doc/new_features/v108_features_doc.rst).
+
+- **`sort_strings` / `collation_compare` / `collation_key`** (`AC_collation_sort`, `AC_collation_compare`): Python's default `sorted` is codepoint order, so `"Z" < "a"` and `"ä"` lands far from `"a"`. This Unicode-Collation-lite key orders by base letter, then accent (secondary), then case (tertiary), with an optional `tailoring` alphabet so Swedish puts `å ä ö` after `z`. Pure-stdlib (`unicodedata`), deterministic across platforms — unlike `locale.strxfrm`.
+
 ## What's new (2026-06-22) — Transactional Outbox
 
 Durably buffer events and drain them at-least-once. Full reference: [`docs/source/Eng/doc/new_features/v107_features_doc.rst`](docs/source/Eng/doc/new_features/v107_features_doc.rst).

diff --git a/README/README_zh-CN.md b/README/README_zh-CN.md
@@ -12,6 +12,7 @@
 
 ## 目录
 
+- [本次更新 (2026-06-22) — 区域感知字符串排序](#本次更新-2026-06-22--区域感知字符串排序)
 - [本次更新 (2026-06-22) — 事务型 Outbox](#本次更新-2026-06-22--事务型-outbox)
 - [本次更新 (2026-06-22) — 乐观并发版本存储](#本次更新-2026-06-22--乐观并发版本存储)
 - [本次更新 (2026-06-22) — 逐流序号间隙检测](#本次更新-2026-06-22--逐流序号间隙检测)
@@ -163,6 +164,12 @@
 
 平滑噪声值序列。完整参考:[`docs/source/Zh/doc/new_features/v102_features_doc.rst`](../docs/source/Zh/doc/new_features/v102_features_doc.rst)。
 
+## 本次更新 (2026-06-22) — 区域感知字符串排序
+
+依某语言读者的期望排序字符串。完整参考:[`docs/source/Zh/doc/new_features/v108_features_doc.rst`](../docs/source/Zh/doc/new_features/v108_features_doc.rst)。
+
+- **`sort_strings` / `collation_compare` / `collation_key`**(`AC_collation_sort`、`AC_collation_compare`):Python 默认的 `sorted` 是码位顺序,因此 `"Z" < "a"`,而 `"ä"` 离 `"a"` 很远。本 Unicode-Collation-lite 键先依基底字母、再依变音符号(次层)、再依大小写(三层)排序,并可用 `tailoring` 字母表让瑞典文将 `å ä ö` 排在 `z` 之后。纯标准库(`unicodedata`)、跨平台确定——不像 `locale.strxfrm`。
+
 ## 本次更新 (2026-06-22) — 事务型 Outbox
 
 持久化缓冲事件并以至少一次传递排空。完整参考:[`docs/source/Zh/doc/new_features/v107_features_doc.rst`](../docs/source/Zh/doc/new_features/v107_features_doc.rst)。

diff --git a/README/README_zh-TW.md b/README/README_zh-TW.md
@@ -12,6 +12,7 @@
 
 ## 目錄
 
+- [本次更新 (2026-06-22) — 地區感知字串排序](#本次更新-2026-06-22--地區感知字串排序)
 - [本次更新 (2026-06-22) — 交易型 Outbox](#本次更新-2026-06-22--交易型-outbox)
 - [本次更新 (2026-06-22) — 樂觀並行版本儲存](#本次更新-2026-06-22--樂觀並行版本儲存)
 - [本次更新 (2026-06-22) — 逐串流序號間隙偵測](#本次更新-2026-06-22--逐串流序號間隙偵測)
@@ -163,6 +164,12 @@
 
 平滑雜訊值序列。完整參考:[`docs/source/Zh/doc/new_features/v102_features_doc.rst`](../docs/source/Zh/doc/new_features/v102_features_doc.rst)。
 
+## 本次更新 (2026-06-22) — 地區感知字串排序
+
+依某語言讀者的期望排序字串。完整參考:[`docs/source/Zh/doc/new_features/v108_features_doc.rst`](../docs/source/Zh/doc/new_features/v108_features_doc.rst)。
+
+- **`sort_strings` / `collation_compare` / `collation_key`**(`AC_collation_sort`、`AC_collation_compare`):Python 預設的 `sorted` 是碼位順序,因此 `"Z" < "a"`,而 `"ä"` 離 `"a"` 很遠。本 Unicode-Collation-lite 鍵先依基底字母、再依變音符號(次層)、再依大小寫(三層)排序,並可用 `tailoring` 字母表讓瑞典文將 `å ä ö` 排在 `z` 之後。純標準函式庫(`unicodedata`)、跨平台具決定性——不像 `locale.strxfrm`。
+
 ## 本次更新 (2026-06-22) — 交易型 Outbox
 
 持久化緩衝事件並以至少一次傳遞排空。完整參考:[`docs/source/Zh/doc/new_features/v107_features_doc.rst`](../docs/source/Zh/doc/new_features/v107_features_doc.rst)。

diff --git a/docs/source/Eng/doc/new_features/v108_features_doc.rst b/docs/source/Eng/doc/new_features/v108_features_doc.rst
@@ -0,0 +1,47 @@
+Locale-Aware String Collation
+=============================
+
+``text_normalize`` canonicalises text and ``locale_parse`` formats numbers, but
+nothing sorts strings the way a reader of a given language expects. Python's
+default ``sorted`` is codepoint order, so ``"Z" < "a"`` and ``"ä"`` lands far
+from ``"a"``. A real collation orders by *base letter* first, then *accent*,
+then *case*, and lets a locale tailor the alphabet (Swedish sorts ``å ä ö`` after
+``z``).
+
+This builds a Unicode-Collation-lite sort key with three levels — primary (base
+letter), secondary (diacritics), tertiary (case) — plus an optional alphabet
+``tailoring``. Pure standard library (``unicodedata``); imports no ``PySide6``.
+Every function is pure, so it is fully deterministic across platforms (unlike
+``locale.strxfrm``, which depends on the host's installed locales).
+
+Headless API
+------------
+
+.. code-block:: python
+
+    from je_auto_control import sort_strings, collation_compare, collation_key
+
+    sort_strings(["résumé", "rest", "resume"])
+    # ['rest', 'resume', 'résumé']   (accent is a secondary difference)
+
+    swedish = "abcdefghijklmnopqrstuvwxyzåäö"
+    sort_strings(["zebra", "äpple", "apple"], tailoring=swedish)
+    # ['apple', 'zebra', 'äpple']    (å ä ö sort after z)
+
+    collation_compare("apple", "Apple")        # -1  (lowercase before uppercase)
+    sort_strings(rows, key=lambda r: r["name"])  # sort dicts by a field
+
+``strength`` (``primary`` / ``secondary`` / ``tertiary``) caps the levels
+compared, so ``strength="primary"`` is accent- and case-insensitive.
+``tailoring`` is an ordered alphabet whose characters sort in the given order and
+before any unlisted character; a precomposed letter such as ``"å"`` keeps its
+alphabet rank instead of decomposing to ``a`` + ring above. ``collation_key``
+returns the raw comparable tuple for use as a ``sorted`` key.
+
+Executor commands
+-----------------
+
+``AC_collation_sort`` takes a JSON list, plus an optional ``reverse`` flag, and
+returns ``{sorted}``; ``AC_collation_compare`` returns ``{order: -1|0|1}``. Both
+accept ``strength`` and ``tailoring``, and both are exposed as MCP tools
+(``ac_collation_sort`` / ``ac_collation_compare``) and Script Builder **Data** commands.
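
For reviewers who want to see the three levels in isolation, here is a minimal standalone sketch using only `unicodedata`. `lite_key` is a hypothetical name for illustration, not the module's `collation_key`. It omits tailoring and uses folded characters rather than integer weights.

```python
import unicodedata


def lite_key(text, strength=3):
    """Hypothetical three-level key: base letters, then accents, then case."""
    primary, secondary, tertiary = [], [], []
    for ch in unicodedata.normalize("NFKD", text):
        if unicodedata.combining(ch):
            secondary.append(ord(ch))   # diacritic goes to the secondary level
            continue
        folded = ch.casefold()
        primary.append(folded)          # base letter goes to the primary level
        tertiary.append(ch != folded)   # False (lowercase) sorts before True
    return (tuple(primary), tuple(secondary), tuple(tertiary))[:strength]


words = ["Zoo", "apple", "äpple", "Apple"]
print(sorted(words))                # ['Apple', 'Zoo', 'apple', 'äpple']
print(sorted(words, key=lite_key))  # ['apple', 'Apple', 'äpple', 'Zoo']
```

With `strength=1` the key keeps only base letters, so `lite_key("Äpple", 1)` and `lite_key("apple", 1)` compare equal.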
diff --git a/docs/source/Eng/eng_index.rst b/docs/source/Eng/eng_index.rst
@@ -130,6 +130,7 @@ Comprehensive guides for all AutoControl features.
    doc/new_features/v105_features_doc
    doc/new_features/v106_features_doc
    doc/new_features/v107_features_doc
+   doc/new_features/v108_features_doc
    doc/ocr_backends/ocr_backends_doc
    doc/observability/observability_doc
    doc/operations_layer/operations_layer_doc

diff --git a/docs/source/Zh/doc/new_features/v108_features_doc.rst b/docs/source/Zh/doc/new_features/v108_features_doc.rst
@@ -0,0 +1,39 @@
+地區感知字串排序(Collation)
+============================
+
+``text_normalize`` 正規化文字、``locale_parse`` 格式化數字,但沒有任何功能能依某語言讀者的期望排序字串。
+Python 預設的 ``sorted`` 是碼位順序,因此 ``"Z" < "a"``,而 ``"ä"`` 會離 ``"a"`` 很遠。真正的排序會先依
+*基底字母*、再依*變音符號*、再依*大小寫*,並讓地區得以調整字母表(瑞典文將 ``å ä ö`` 排在 ``z`` 之後)。
+
+本功能建立一個 Unicode-Collation-lite 排序鍵,含三個層級——主層(基底字母)、次層(變音符號)、三層(大小寫)
+——以及選用的字母表 ``tailoring``。純標準函式庫(``unicodedata``);不匯入 ``PySide6``。每個函式皆為純函式,
+因此跨平台完全具決定性(不像 ``locale.strxfrm`` 取決於主機已安裝的地區設定)。
+
+無頭 API
+--------
+
+.. code-block:: python
+
+    from je_auto_control import sort_strings, collation_compare, collation_key
+
+    sort_strings(["résumé", "rest", "resume"])
+    # ['rest', 'resume', 'résumé']   (變音符號為次層差異)
+
+    swedish = "abcdefghijklmnopqrstuvwxyzåäö"
+    sort_strings(["zebra", "äpple", "apple"], tailoring=swedish)
+    # ['apple', 'zebra', 'äpple']    (å ä ö 排在 z 之後)
+
+    collation_compare("apple", "Apple")        # -1  (小寫在大寫之前)
+    sort_strings(rows, key=lambda r: r["name"])  # 依欄位排序字典
+
+``strength``(``primary`` / ``secondary`` / ``tertiary``)限制比較的層級,因此 ``strength="primary"`` 為
+不分變音符號與大小寫。``tailoring`` 是有序字母表,所列字元依給定順序排序,且排在任何未列字元之前;像 ``"å"``
+這類預組字元會保有其字母表排名,而非分解為 ``a`` + 上圓圈。``collation_key`` 回傳可比較的原始 tuple,供作
+``sorted`` 的 key 使用。
+
+執行器命令
+----------
+
+``AC_collation_sort`` 接受 JSON 列表並回傳 ``{sorted}``;``AC_collation_compare`` 回傳 ``{order: -1|0|1}``。
+兩者皆接受 ``strength`` 與 ``tailoring``,並以 MCP 工具(``ac_collation_sort`` / ``ac_collation_compare``)
+以及 Script Builder 中 **Data** 分類下的命令提供。
diff --git a/docs/source/Zh/zh_index.rst b/docs/source/Zh/zh_index.rst
@@ -130,6 +130,7 @@ AutoControl 所有功能的完整使用指南。
    doc/new_features/v105_features_doc
    doc/new_features/v106_features_doc
    doc/new_features/v107_features_doc
+   doc/new_features/v108_features_doc
    doc/ocr_backends/ocr_backends_doc
    doc/observability/observability_doc
    doc/operations_layer/operations_layer_doc

diff --git a/je_auto_control/__init__.py b/je_auto_control/__init__.py
@@ -213,6 +213,11 @@
 )
 # Transactional outbox (durable at-least-once event delivery)
 from je_auto_control.utils.outbox import Outbox
+# Locale-aware string collation (deterministic multi-level sort keys)
+from je_auto_control.utils.locale_collation import (
+    collation_key, sort_strings,
+)
+from je_auto_control.utils.locale_collation import compare as collation_compare
 # CI workflow annotations (GitHub Actions)
 from je_auto_control.utils.ci_annotations import (
     emit_annotations, format_annotation,
@@ -943,6 +948,9 @@ def start_autocontrol_gui(*args, **kwargs):
     "DedupWindow", "SequenceTracker",
     "VersionConflict", "VersionedStore", "check_if_match", "if_match_header",
     "Outbox",
+    "collation_key",
+    "collation_compare",
+    "sort_strings",
     "emit_annotations", "format_annotation",
     "ClipboardHistory", "default_clipboard_history",
     "analyze_heal_log", "heal_stats", "scan_secrets",

diff --git a/je_auto_control/gui/script_builder/command_schema.py b/je_auto_control/gui/script_builder/command_schema.py
@@ -2066,6 +2066,30 @@ def _add_resilience_specs(specs: List[CommandSpec]) -> None:
         ),
         description="List events still awaiting successful delivery.",
     ))
+    specs.append(CommandSpec(
+        "AC_collation_sort", "Data", "Text: Collation Sort",
+        fields=(
+            FieldSpec("items", FieldType.STRING,
+                      placeholder='["zebra", "apple", "Äpple"]'),
+            FieldSpec("strength", FieldType.STRING, optional=True,
+                      placeholder="tertiary"),
+            FieldSpec("tailoring", FieldType.STRING, optional=True,
+                      placeholder="abc...xyzåäö"),
+            FieldSpec("reverse", FieldType.BOOL, optional=True),
+        ),
+        description="Locale-aware sort (base letter, then accent, then case).",
+    ))
+    specs.append(CommandSpec(
+        "AC_collation_compare", "Data", "Text: Collation Compare",
+        fields=(
+            FieldSpec("first", FieldType.STRING, placeholder="apple"),
+            FieldSpec("second", FieldType.STRING, placeholder="Äpple"),
+            FieldSpec("strength", FieldType.STRING, optional=True,
+                      placeholder="tertiary"),
+            FieldSpec("tailoring", FieldType.STRING, optional=True),
+        ),
+        description="Locale-aware compare; returns order -1/0/1.",
+    ))
     specs.append(CommandSpec(
         "AC_diff_rows", "Data", "Dataset Diff: Rows by Key",
         fields=(
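
One thing that may be worth checking for the `reverse` field: the `_collation_sort` adapter in `action_executor.py` coerces it with `bool(reverse)`. If a caller ever passes the flag as a string (an assumption, since the schema declares it as `FieldType.BOOL`), `"false"` would be truthy. The sketch below shows the pitfall and one possible guard; `to_bool` is a hypothetical helper, not something in this PR.

```python
def to_bool(value):
    """Hypothetical helper: interpret common string spellings of a boolean."""
    if isinstance(value, str):
        return value.strip().lower() in {"1", "true", "yes", "on"}
    return bool(value)


print(bool("false"))     # True: any non-empty string is truthy
print(to_bool("false"))  # False
print(to_bool(True))     # True
```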

diff --git a/je_auto_control/utils/executor/action_executor.py b/je_auto_control/utils/executor/action_executor.py
@@ -2956,6 +2956,26 @@ def _outbox_pending(name: str) -> Dict[str, Any]:
     return {"pending": outbox.pending()}
 
 
+def _collation_sort(items: Any, strength: str = "tertiary",
+                    tailoring: Any = None, reverse: Any = False) -> Dict[str, Any]:
+    """Adapter: locale-aware sort of a list of strings."""
+    import json
+    from je_auto_control.utils.locale_collation import sort_strings
+    if isinstance(items, str):
+        items = json.loads(items)
+    ordered = sort_strings(list(items), strength=strength,
+                           tailoring=tailoring or None, reverse=bool(reverse))
+    return {"sorted": ordered}
+
+
+def _collation_compare(first: str, second: str, strength: str = "tertiary",
+                       tailoring: Any = None) -> Dict[str, Any]:
+    """Adapter: locale-aware comparison of two strings."""
+    from je_auto_control.utils.locale_collation import compare
+    return {"order": compare(first, second, strength=strength,
+                             tailoring=tailoring or None)}
+
+
 def _cas_put(name: str, key: str, value: Any,
              expected_version: Any = None) -> Dict[str, Any]:
     """Adapter: optimistic put into a named versioned store."""
@@ -4638,6 +4658,8 @@ def __init__(self):
             "AC_cas_get": _cas_get,
             "AC_outbox_enqueue": _outbox_enqueue,
             "AC_outbox_pending": _outbox_pending,
+            "AC_collation_sort": _collation_sort,
+            "AC_collation_compare": _collation_compare,
             "AC_detect_drift": _detect_drift,
             "AC_categorical_drift": _categorical_drift,
             "AC_diff_rows": _diff_rows,

diff --git a/je_auto_control/utils/locale_collation/__init__.py b/je_auto_control/utils/locale_collation/__init__.py
@@ -0,0 +1,6 @@
+"""Locale-aware string collation (deterministic multi-level sort keys)."""
+from je_auto_control.utils.locale_collation.locale_collation import (
+    collation_key, compare, sort_strings,
+)
+
+__all__ = ["collation_key", "compare", "sort_strings"]
diff --git a/je_auto_control/utils/locale_collation/locale_collation.py b/je_auto_control/utils/locale_collation/locale_collation.py
@@ -0,0 +1,122 @@
+"""Locale-aware string collation (deterministic multi-level sort keys).
+
+``text_normalize`` canonicalises text and ``locale_parse`` formats numbers, but
+nothing sorts strings the way a human reading a given language expects: Python's
+default ``sorted`` is codepoint order, so ``"Z" < "a"`` and ``"ä"`` lands far
+from ``"a"``. A real collation orders by base letter first, then accent, then
+case, and lets a locale tailor the alphabet (Swedish sorts ``å ä ö`` after
+``z``).
+
+This builds a Unicode-Collation-lite sort key with three levels — primary (base
+letter), secondary (diacritics), tertiary (case) — plus an optional alphabet
+``tailoring``. Pure standard library (``unicodedata``); imports no ``PySide6``.
+Every function is pure (text in, key/order out), so it is fully deterministic in
+CI and across platforms (unlike ``locale.strxfrm``).
+"""
+import unicodedata
+from typing import Callable, Dict, List, Optional, Sequence, Tuple
+
+_STRENGTHS = {"primary": 1, "secondary": 2, "tertiary": 3}
+
+CollationKey = Tuple[Tuple[int, ...], ...]
+
+
+def _build_tailoring(tailoring: Optional[str]) -> Optional[Dict[str, int]]:
+    """Map each character of an ordered alphabet to its primary rank."""
+    if not tailoring:
+        return None
+    ranks: Dict[str, int] = {}
+    for index, char in enumerate(tailoring):
+        folded = char.casefold()
+        if folded not in ranks:
+            ranks[folded] = index
+    return ranks
+
+
+def _untailored_weight(base: str, ranks: Optional[Dict[str, int]],
+                       offset: int) -> int:
+    """Primary weight of a folded base character outside any tailoring."""
+    if not base:
+        return offset if ranks is not None else 0
+    return offset + ord(base[0]) if ranks is not None else ord(base[0])
+
+
+def _char_weights(char: str, ranks: Optional[Dict[str, int]],
+                  offset: int) -> Tuple[List[int], List[int], List[int]]:
+    """Primary/secondary/tertiary weight contributions of one character.
+
+    A tailored character is treated atomically (no decomposition) so a
+    precomposed letter like ``"å"`` keeps its alphabet rank; everything else is
+    NFKD-decomposed so diacritics fall to the secondary level.
+    """
+    folded = char.casefold()
+    if ranks is not None and folded in ranks:
+        return [ranks[folded]], [], [1 if char != folded else 0]
+    primary: List[int] = []
+    secondary: List[int] = []
+    tertiary: List[int] = []
+    for sub in unicodedata.normalize("NFKD", char):
+        if unicodedata.combining(sub):
+            secondary.append(ord(sub))
+            continue
+        subfold = sub.casefold()
+        primary.append(_untailored_weight(subfold, ranks, offset))
+        tertiary.append(1 if sub != subfold else 0)
+    return primary, secondary, tertiary
+
+
+def collation_key(text: str, *, strength: str = "tertiary",
+                  tailoring: Optional[str] = None) -> CollationKey:
+    """Return a comparable multi-level sort key for ``text``.
+
+    Levels: primary (base letter), secondary (diacritics), tertiary (case,
+    lowercase before uppercase). ``strength`` (``primary`` / ``secondary`` /
+    ``tertiary``) caps the levels compared. ``tailoring`` is an ordered alphabet
+    whose characters sort in the given order and before any unlisted character
+    (so a Swedish ``"...xyzåäö"`` puts ``å`` after ``z``).
+    """
+    level = _STRENGTHS.get(strength)
+    if level is None:
+        raise ValueError(f"unknown strength: {strength!r}")
+    ranks = _build_tailoring(tailoring)
+    offset = len(tailoring) if tailoring else 0
+    primary: List[int] = []
+    secondary: List[int] = []
+    tertiary: List[int] = []
+    for char in text or "":
+        char_primary, char_secondary, char_tertiary = _char_weights(
+            char, ranks, offset)
+        primary.extend(char_primary)
+        secondary.extend(char_secondary)
+        tertiary.extend(char_tertiary)
+    levels = (tuple(primary), tuple(secondary), tuple(tertiary))
+    return levels[:level]
+
+
+def compare(first: str, second: str, *, strength: str = "tertiary",
+            tailoring: Optional[str] = None) -> int:
+    """Return ``-1`` / ``0`` / ``1`` ordering ``first`` against ``second``."""
+    key_first = collation_key(first, strength=strength, tailoring=tailoring)
+    key_second = collation_key(second, strength=strength, tailoring=tailoring)
+    if key_first < key_second:
+        return -1
+    if key_first > key_second:
+        return 1
+    return 0
+
+
+def sort_strings(items: Sequence[object], *, strength: str = "tertiary",
+                 tailoring: Optional[str] = None, reverse: bool = False,
+                 key: Optional[Callable[[object], str]] = None) -> List[object]:
+    """Return ``items`` sorted by collation key.
+
+    ``key`` extracts the string from each item (default: the item itself), so
+    dicts or tuples can be sorted by one of their fields.
+    """
+    extract = key or (lambda item: item)
+
+    def sort_key(item: object) -> CollationKey:
+        return collation_key(str(extract(item)), strength=strength,
+                             tailoring=tailoring)
+
+    return sorted(items, key=sort_key, reverse=reverse)
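
A possible edge case for the tailoring path: `collation_key` iterates over `text` one code point at a time, so a decomposed `"a\u030a"` (letter plus combining ring above) does not match a precomposed `"å"` listed in the tailoring. The sketch below is a condensed restatement of the primary-level logic from this diff, under the hypothetical name `sketch_key`. It suggests that NFC-normalising the input first would make both forms agree; whether the module should do that is left to the author.

```python
import unicodedata


def sketch_key(text, tailoring=None):
    """Condensed restatement of the diff's primary-level weights."""
    ranks = {}
    for index, ch in enumerate(tailoring or ""):
        ranks.setdefault(ch.casefold(), index)
    offset = len(tailoring or "")
    primary = []
    for ch in text:
        folded = ch.casefold()
        if folded in ranks:                     # tailored: atomic alphabet rank
            primary.append(ranks[folded])
            continue
        for sub in unicodedata.normalize("NFKD", ch):
            if unicodedata.combining(sub):      # diacritics are not primary
                continue
            base = sub.casefold() or sub
            primary.append(offset + ord(base[0]))
    return tuple(primary)


swedish = "abcdefghijklmnopqrstuvwxyzåäö"
composed = "\u00e5"       # precomposed å
decomposed = "a\u030a"    # a + combining ring above

print(sketch_key(composed, swedish))    # (26,)
print(sketch_key(decomposed, swedish))  # (0,)
print(sketch_key(unicodedata.normalize("NFC", decomposed), swedish))  # (26,)
```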