Merged
7 changes: 7 additions & 0 deletions README.md
@@ -13,6 +13,7 @@

## Table of Contents

- [What's new (2026-06-19) — Approval Testing (Golden-Master Baselines)](#whats-new-2026-06-19--approval-testing-golden-master-baselines)
- [What's new (2026-06-19) — Network Egress Allowlist Guard](#whats-new-2026-06-19--network-egress-allowlist-guard)
- [What's new (2026-06-19) — Just-In-Time Credential Leases](#whats-new-2026-06-19--just-in-time-credential-leases)
- [What's new (2026-06-19) — Maker-Checker Approval Gate](#whats-new-2026-06-19--maker-checker-approval-gate)
@@ -87,6 +88,12 @@

---

## What's new (2026-06-19) — Approval Testing (Golden-Master Baselines)

Lock outputs against a human-approved baseline. Full reference: [`docs/source/Eng/doc/new_features/v35_features_doc.rst`](docs/source/Eng/doc/new_features/v35_features_doc.rst).

- **`verify_artifact` / `approve_artifact`** (`AC_verify_artifact` / `AC_approve_artifact` / `AC_pending_artifacts`, `ac_*`): golden-master / snapshot testing for *any* artifact (text, JSON, OCR output, screenshot bytes). `verify_artifact` compares produced content to `<name>.approved.<ext>`; a mismatch or missing baseline writes `<name>.received.<ext>` for review and fails, and `approve_artifact` promotes a reviewed received file to the baseline. Complements pixel diffing with a review-gated baseline you commit alongside the test; names are path-traversal-checked.

## What's new (2026-06-19) — Network Egress Allowlist Guard

Pin which hosts automation may reach. Full reference: [`docs/source/Eng/doc/new_features/v34_features_doc.rst`](docs/source/Eng/doc/new_features/v34_features_doc.rst).
7 changes: 7 additions & 0 deletions README/README_zh-CN.md
@@ -12,6 +12,7 @@

## 目录

- [本次更新 (2026-06-19) — 核准式测试(Golden-Master 基准)](#本次更新-2026-06-19--核准式测试golden-master-基准)
- [本次更新 (2026-06-19) — 网络出口允许清单守卫](#本次更新-2026-06-19--网络出口允许清单守卫)
- [本次更新 (2026-06-19) — 即时凭证租约](#本次更新-2026-06-19--即时凭证租约)
- [本次更新 (2026-06-19) — Maker-Checker 审批闸门](#本次更新-2026-06-19--maker-checker-审批闸门)
@@ -86,6 +87,12 @@

---

## 本次更新 (2026-06-19) — 核准式测试(Golden-Master 基准)

将输出锁定到人工核准的基准。完整参考:[`docs/source/Zh/doc/new_features/v35_features_doc.rst`](../docs/source/Zh/doc/new_features/v35_features_doc.rst)。

- **`verify_artifact` / `approve_artifact`**(`AC_verify_artifact` / `AC_approve_artifact` / `AC_pending_artifacts`、`ac_*`):对*任何*产物(文本、JSON、OCR 输出、屏幕截图字节)进行 golden-master / snapshot 测试。`verify_artifact` 将产出内容与 `<name>.approved.<ext>` 比对;不符或缺少基准会写入 `<name>.received.<ext>` 供审查并失败,`approve_artifact` 则将审查后的 received 文件晋升为基准。以与测试一起提交、受审查把关的基准补强逐像素比对;名称会经过路径穿越检查。

## 本次更新 (2026-06-19) — 网络出口允许清单守卫

钉选自动化可连线的主机。完整参考:[`docs/source/Zh/doc/new_features/v34_features_doc.rst`](../docs/source/Zh/doc/new_features/v34_features_doc.rst)。
7 changes: 7 additions & 0 deletions README/README_zh-TW.md
@@ -12,6 +12,7 @@

## 目錄

- [本次更新 (2026-06-19) — 核准式測試(Golden-Master 基準)](#本次更新-2026-06-19--核准式測試golden-master-基準)
- [本次更新 (2026-06-19) — 網路出口允許清單守衛](#本次更新-2026-06-19--網路出口允許清單守衛)
- [本次更新 (2026-06-19) — 即時憑證租約](#本次更新-2026-06-19--即時憑證租約)
- [本次更新 (2026-06-19) — Maker-Checker 審批閘門](#本次更新-2026-06-19--maker-checker-審批閘門)
@@ -86,6 +87,12 @@

---

## 本次更新 (2026-06-19) — 核准式測試(Golden-Master 基準)

將輸出鎖定到人工核准的基準。完整參考:[`docs/source/Zh/doc/new_features/v35_features_doc.rst`](../docs/source/Zh/doc/new_features/v35_features_doc.rst)。

- **`verify_artifact` / `approve_artifact`**(`AC_verify_artifact` / `AC_approve_artifact` / `AC_pending_artifacts`、`ac_*`):對*任何*產物(文字、JSON、OCR 輸出、螢幕截圖位元組)進行 golden-master / snapshot 測試。`verify_artifact` 將產出內容與 `<name>.approved.<ext>` 比對;不符或缺少基準會寫入 `<name>.received.<ext>` 供審查並失敗,`approve_artifact` 則將審查後的 received 檔晉升為基準。以與測試一起提交、受審查把關的基準補強逐像素比對;名稱會經過路徑穿越檢查。

## 本次更新 (2026-06-19) — 網路出口允許清單守衛

釘選自動化可連線的主機。完整參考:[`docs/source/Zh/doc/new_features/v34_features_doc.rst`](../docs/source/Zh/doc/new_features/v34_features_doc.rst)。
52 changes: 52 additions & 0 deletions docs/source/Eng/doc/new_features/v35_features_doc.rst
@@ -0,0 +1,52 @@
Approval Testing (Golden-Master Baselines)
==========================================

Approval testing (a.k.a. golden-master / snapshot testing) reframes "is this
output still correct?" as "does it still match the version a human approved?".
:func:`verify_artifact` compares produced content to a stored
``<name>.approved.<ext>`` baseline:

* **match** → the check passes;
* **mismatch or missing baseline** → the produced bytes are written to
  ``<name>.received.<ext>`` and the check fails, so a reviewer can diff the two
  and, if the change is intended, promote it with :func:`approve_artifact`.

It works for *any* artifact — rendered text, JSON, OCR output, screenshot bytes
— so it complements pixel diffing with a review-gated baseline you commit
alongside the test. Pure standard library; imports no ``PySide6``. Names are
validated against path traversal.
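
The traversal check can be pictured as a basename comparison: a name is accepted only if it is a bare file-name component. The following is a stand-alone sketch of that idea, not an import from the library; the helper name ``is_safe_artifact_name`` is hypothetical and exists only for illustration.

```python
import os


def is_safe_artifact_name(name: str) -> bool:
    """Return True when ``name`` is a bare file-name component.

    Hypothetical helper for illustration; it is not part of the library.
    """
    # Reject empty names and the special directory entries outright.
    if not name or name in (".", ".."):
        return False
    # A name that contains a directory part changes when reduced to its
    # basename, so comparing the two detects any embedded separator.
    return name == os.path.basename(name)


for candidate in ("invoice_render", "../secrets", "reports/invoice", ".."):
    print(f"{candidate!r}: {is_safe_artifact_name(candidate)}")
```

Only the first candidate is a bare name; the other three either contain a directory part or are a special directory entry.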

Headless API
------------

.. code-block:: python

    from je_auto_control import verify_artifact, approve_artifact

    result = verify_artifact("invoice_render", produced_text,
                             approvals_dir="tests/.approvals")
    if not result.match:
        # first run is "new", a changed output is "mismatch"; review the
        # .received file, then bless it:
        approve_artifact("invoice_render", approvals_dir="tests/.approvals")

``content`` may be ``str`` or ``bytes`` (pass ``extension="png"`` for binary
snapshots). A verified run clears any stale received file.
``pending_artifacts(dir)`` lists names still awaiting approval. ``ApprovalResult``
carries ``status`` (``verified`` / ``mismatch`` / ``new``), ``match``, and both
file paths.
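
The status values are easiest to follow as a lifecycle: the first run is ``new``, an approved and unchanged output is ``verified``, and a changed output is ``mismatch``. The sketch below walks through that sequence for a binary snapshot. It uses small stand-in functions (``verify`` and ``approve``, both hypothetical) that mimic the documented file naming, so it runs without the library installed; it is an illustration under those assumptions, not the library's implementation.

```python
import tempfile
from pathlib import Path


def verify(name: str, content: bytes, approvals_dir: Path, ext: str = "txt") -> str:
    """Stand-in for verify_artifact: return 'verified', 'mismatch' or 'new'."""
    approved = approvals_dir / f"{name}.approved.{ext}"
    received = approvals_dir / f"{name}.received.{ext}"
    if approved.is_file() and approved.read_bytes() == content:
        received.unlink(missing_ok=True)  # a passing run clears stale output
        return "verified"
    approvals_dir.mkdir(parents=True, exist_ok=True)
    status = "mismatch" if approved.is_file() else "new"
    received.write_bytes(content)  # keep the produced bytes for review
    return status


def approve(name: str, approvals_dir: Path, ext: str = "txt") -> None:
    """Stand-in for approve_artifact: promote the received file."""
    received = approvals_dir / f"{name}.received.{ext}"
    received.replace(approvals_dir / f"{name}.approved.{ext}")


def lifecycle() -> list:
    """Run new -> approve -> verified -> mismatch and report what is pending."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / ".approvals"
        shot_v1, shot_v2 = b"\x89PNG-v1", b"\x89PNG-v2"  # fake screenshot bytes
        statuses = [verify("login_screen", shot_v1, root, "png")]      # new
        approve("login_screen", root, "png")
        statuses.append(verify("login_screen", shot_v1, root, "png"))  # verified
        statuses.append(verify("login_screen", shot_v2, root, "png"))  # mismatch
        pending = sorted(p.name for p in root.glob("*.received.*"))
        return statuses + pending


print(lifecycle())
```

After the last call one received file is left behind, which is the state ``pending_artifacts`` is meant to report.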

Executor commands
-----------------

================================ ===================================================
Command                          Effect
================================ ===================================================
``AC_verify_artifact``           Compare ``content`` to the approved baseline.
``AC_approve_artifact``          Promote the received artifact to the baseline.
``AC_pending_artifacts``         List artifacts awaiting approval.
================================ ===================================================

The same operations are exposed as MCP tools (``ac_verify_artifact`` /
``ac_approve_artifact`` / ``ac_pending_artifacts``) and as Script Builder
commands under **Testing**.
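
One way to picture how a command name reaches its implementation is a lookup table from name to adapter function. The sketch below assumes each action is a ``[command, kwargs]`` pair; that shape, the stub adapters, and ``run_actions`` are all assumptions made for illustration, so consult the executor documentation for the real action format.

```python
from typing import Any, Callable, Dict, List


# Stub adapters standing in for the real ones. Each returns a plain dict so
# the result can be serialised for a script runner or a tool client.
def _verify(name: str, content: str, approvals_dir: str = ".approvals",
            extension: str = "txt") -> Dict[str, Any]:
    return {"command": "verify", "name": name, "extension": extension}


def _pending(approvals_dir: str = ".approvals") -> Dict[str, Any]:
    return {"command": "pending", "dir": approvals_dir}


# Command name -> adapter, in the spirit of the executor's command table.
COMMANDS: Dict[str, Callable[..., Dict[str, Any]]] = {
    "AC_verify_artifact": _verify,
    "AC_pending_artifacts": _pending,
}


def run_actions(actions: List[list]) -> List[Dict[str, Any]]:
    """Dispatch each [command, kwargs] pair (assumed shape) to its adapter."""
    results = []
    for command, kwargs in actions:
        if command not in COMMANDS:
            raise KeyError(f"unknown command: {command}")
        results.append(COMMANDS[command](**kwargs))
    return results


print(run_actions([
    ["AC_verify_artifact", {"name": "login_screen", "content": "hello"}],
    ["AC_pending_artifacts", {}],
]))
```

Keeping the adapters thin and dict-returning means the same function can back a script command and a tool call without further conversion.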
1 change: 1 addition & 0 deletions docs/source/Eng/eng_index.rst
@@ -57,6 +57,7 @@ Comprehensive guides for all AutoControl features.
doc/new_features/v32_features_doc
doc/new_features/v33_features_doc
doc/new_features/v34_features_doc
doc/new_features/v35_features_doc
doc/ocr_backends/ocr_backends_doc
doc/observability/observability_doc
doc/operations_layer/operations_layer_doc
46 changes: 46 additions & 0 deletions docs/source/Zh/doc/new_features/v35_features_doc.rst
@@ -0,0 +1,46 @@
核准式測試(Golden-Master 基準)
================================

核准式測試(又稱 golden-master / snapshot 測試)把「這個輸出還正確嗎?」重新表述為
「它是否仍與人工核准過的版本相符?」。:func:`verify_artifact` 將產出的內容與儲存的
``<name>.approved.<ext>`` 基準比對:

* **相符** → 檢查通過;
* **不符或缺少基準** → 產出的位元組會被寫入 ``<name>.received.<ext>`` 且檢查失敗,讓
  審查者可比對兩者,若變更為預期,即以 :func:`approve_artifact` 晉升。

它適用於*任何*產物 —— 渲染後的文字、JSON、OCR 輸出、螢幕截圖位元組 —— 因此以一個受
審查把關、與測試一起提交的基準,補強逐像素比對。純標準函式庫,不匯入 ``PySide6``。
名稱會經過路徑穿越驗證。

無頭 API
--------

.. code-block:: python

    from je_auto_control import verify_artifact, approve_artifact

    result = verify_artifact("invoice_render", produced_text,
                             approvals_dir="tests/.approvals")
    if not result.match:
        # 首次執行為 "new",輸出變更為 "mismatch";審查 .received 檔後再核可:
        approve_artifact("invoice_render", approvals_dir="tests/.approvals")

``content`` 可為 ``str`` 或 ``bytes``(二進位快照請傳 ``extension="png"``)。相符的執
行會清除任何過期的 received 檔。``pending_artifacts(dir)`` 列出仍待核准的名稱。
``ApprovalResult`` 帶有 ``status``(``verified`` / ``mismatch`` / ``new``)、
``match`` 及兩個檔案路徑。

執行器指令
----------

================================ ===================================================
指令                             效果
================================ ===================================================
``AC_verify_artifact``           將 ``content`` 與已核准基準比對。
``AC_approve_artifact``          將 received 產物晉升為基準。
``AC_pending_artifacts``         列出待核准的產物。
================================ ===================================================

相同操作亦提供為 MCP 工具(``ac_verify_artifact`` / ``ac_approve_artifact`` /
``ac_pending_artifacts``),以及 Script Builder 中 **Testing** 分類下的指令。
1 change: 1 addition & 0 deletions docs/source/Zh/zh_index.rst
@@ -57,6 +57,7 @@ AutoControl 所有功能的完整使用指南。
doc/new_features/v32_features_doc
doc/new_features/v33_features_doc
doc/new_features/v34_features_doc
doc/new_features/v35_features_doc
doc/ocr_backends/ocr_backends_doc
doc/observability/observability_doc
doc/operations_layer/operations_layer_doc
6 changes: 6 additions & 0 deletions je_auto_control/__init__.py
@@ -212,6 +212,10 @@
from je_auto_control.utils.egress import (
    EgressBlocked, EgressPolicy, get_egress_policy, set_egress_policy,
)
# Approval testing: verify artifacts against a human-approved baseline
from je_auto_control.utils.approval import (
    ApprovalResult, approve_artifact, pending_artifacts, verify_artifact,
)
# Background popup/interrupt watchdog (unattended automation)
from je_auto_control.utils.watchdog import (
    PopupWatchdog, WatchdogRule, default_popup_watchdog,
@@ -650,6 +654,8 @@ def start_autocontrol_gui(*args, **kwargs):
    "ApprovalGate", "CredentialBroker", "CredentialBrokerError",
    "default_broker", "set_secret_resolver",
    "EgressBlocked", "EgressPolicy", "get_egress_policy", "set_egress_policy",
    "ApprovalResult", "approve_artifact", "pending_artifacts",
    "verify_artifact",
    # MCP server
    "AuditLogger", "HttpMCPServer", "MCPContent", "MCPPrompt",
    "MCPPromptArgument", "MCPResource", "MCPServer", "MCPTool",
29 changes: 29 additions & 0 deletions je_auto_control/gui/script_builder/command_schema.py
@@ -772,6 +772,35 @@
        fields=(),
        description="Clear the egress policy back to allow-all.",
    ))
    specs.append(CommandSpec(
        "AC_verify_artifact", "Testing", "Approval: Verify Artifact",
        fields=(
            FieldSpec("name", FieldType.STRING, placeholder="login_screen"),
            FieldSpec("content", FieldType.STRING),
            FieldSpec("approvals_dir", FieldType.STRING, optional=True,
                      default=".approvals"),
            FieldSpec("extension", FieldType.STRING, optional=True,
                      default="txt"),
        ),
        description="Compare content to its approved baseline (snapshot test).",
    ))
    specs.append(CommandSpec(
        "AC_approve_artifact", "Testing", "Approval: Promote Received",
        fields=(
            FieldSpec("name", FieldType.STRING),
            FieldSpec("approvals_dir", FieldType.STRING, optional=True,
                      default=".approvals"),
            FieldSpec("extension", FieldType.STRING, optional=True,
                      default="txt"),
        ),
        description="Promote a received artifact to the approved baseline.",
    ))
    specs.append(CommandSpec(
        "AC_pending_artifacts", "Testing", "Approval: List Pending",
        fields=(FieldSpec("approvals_dir", FieldType.STRING, optional=True,
                          default=".approvals"),),
        description="List artifacts awaiting approval.",
    ))
    specs.append(CommandSpec(
        "AC_generate_sop", "Report", "Generate SOP Document",
        fields=(
9 changes: 9 additions & 0 deletions je_auto_control/utils/approval/__init__.py
@@ -0,0 +1,9 @@
"""Approval testing: verify artifacts against an approved baseline."""
from je_auto_control.utils.approval.approval_test import (
    ApprovalResult, approve_artifact, pending_artifacts, verify_artifact,
)

__all__ = [
    "ApprovalResult", "approve_artifact", "pending_artifacts",
    "verify_artifact",
]
93 changes: 93 additions & 0 deletions je_auto_control/utils/approval/approval_test.py
@@ -0,0 +1,93 @@
"""Approval testing — lock an artifact against a human-approved baseline.

The approval-testing workflow (a.k.a. golden-master / snapshot testing) turns
"is this output still correct?" into "does this output still match the version
a human approved?". :func:`verify_artifact` compares produced ``content`` to a
stored ``<name>.approved.<ext>`` baseline:

* match → the check passes;
* mismatch or missing baseline → the produced bytes are written to
  ``<name>.received.<ext>`` and the check fails, so a reviewer can diff the two
  and, if the change is intended, promote it with :func:`approve_artifact`.

It works for any artifact — rendered text, JSON, OCR output, screenshot bytes —
complementing pixel diffing with a review-gated baseline. Pure standard
library; imports no ``PySide6``.
"""
import os
from dataclasses import dataclass
from pathlib import Path
from typing import List, Union

DEFAULT_DIR = ".approvals"


@dataclass(frozen=True)
class ApprovalResult:
    """Outcome of :func:`verify_artifact`."""

    name: str
    status: str  # "verified" | "mismatch" | "new"
    match: bool
    approved_path: str
    received_path: str


def _safe_name(name: str) -> str:
    """Reject path-traversal in ``name`` and return it unchanged if safe."""
    if not name or name != os.path.basename(name) or name in (".", ".."):
        raise ValueError(f"unsafe approval name: {name!r}")
    return name


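def _paths(name: str, approvals_dir: str, extension: str):
    base = Path(approvals_dir)
    safe = _safe_name(name)
    ext = extension.lstrip(".")
    # The extension is part of the file name too, so it must not carry a path.
    if ext != os.path.basename(ext):
        raise ValueError(f"unsafe approval extension: {extension!r}")
    return (base / f"{safe}.approved.{ext}",
            base / f"{safe}.received.{ext}")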


def _as_bytes(content: Union[str, bytes]) -> bytes:
    return content.encode("utf-8") if isinstance(content, str) else bytes(content)


def verify_artifact(name: str, content: Union[str, bytes],
                    approvals_dir: str = DEFAULT_DIR,
                    extension: str = "txt") -> ApprovalResult:
    """Compare ``content`` to the approved baseline for ``name``.

    On match the received file is cleared and ``match`` is ``True``; otherwise
    the produced bytes are written to the received file for review.
    """
    approved, received = _paths(name, approvals_dir, extension)
    produced = _as_bytes(content)
    if approved.is_file() and approved.read_bytes() == produced:
        if received.is_file():
            received.unlink()
        return ApprovalResult(name, "verified", True,
                              str(approved), str(received))
    received.parent.mkdir(parents=True, exist_ok=True)
    received.write_bytes(produced)
    status = "mismatch" if approved.is_file() else "new"
    return ApprovalResult(name, status, False, str(approved), str(received))


def approve_artifact(name: str, approvals_dir: str = DEFAULT_DIR,
                     extension: str = "txt") -> str:
    """Promote the received artifact for ``name`` to be the approved baseline."""
    approved, received = _paths(name, approvals_dir, extension)
    if not received.is_file():
        raise FileNotFoundError(
            f"no received artifact to approve for {name!r}")
    os.replace(received, approved)
    return str(approved)


def pending_artifacts(approvals_dir: str = DEFAULT_DIR) -> List[str]:
    """Return the names of artifacts with a received file awaiting approval."""
    base = Path(approvals_dir)
    if not base.is_dir():
        return []
    names = [path.name.split(".received.", 1)[0]
             for path in base.glob("*.received.*")]
    return sorted(names)
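
The module leaves the review step to whatever diff tool the reviewer prefers. For text artifacts, a minimal sketch of that step using the standard library's `difflib` might look like the following; `review_diff` is a hypothetical helper, not part of this module, and it assumes both files are UTF-8 text.

```python
import difflib
import tempfile
from pathlib import Path


def review_diff(approved_path: str, received_path: str) -> str:
    """Return a unified diff between the approved and received text files.

    Hypothetical reviewer helper; assumes UTF-8 text artifacts.
    """
    approved = Path(approved_path)
    received = Path(received_path)
    # A missing baseline (the "new" status) is diffed against empty content.
    old = (approved.read_text(encoding="utf-8").splitlines()
           if approved.is_file() else [])
    new = received.read_text(encoding="utf-8").splitlines()
    return "\n".join(difflib.unified_diff(
        old, new, fromfile=approved.name, tofile=received.name, lineterm=""))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        approved = Path(tmp) / "invoice_render.approved.txt"
        received = Path(tmp) / "invoice_render.received.txt"
        approved.write_text("customer: ACME\ntotal: 10\n", encoding="utf-8")
        received.write_text("customer: ACME\ntotal: 12\n", encoding="utf-8")
        print(review_diff(str(approved), str(received)))
```

The two paths correspond to the `approved_path` and `received_path` fields of `ApprovalResult`, so a test runner could print this diff whenever `match` is false.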
27 changes: 27 additions & 0 deletions je_auto_control/utils/executor/action_executor.py
@@ -2967,6 +2967,30 @@
return {"allow": None, "deny": []}


# Default approvals directory shared by the adapters below.
_APPROVALS_DIR = ".approvals"


def _verify_artifact(name: str, content: Any,
                     approvals_dir: str = _APPROVALS_DIR,
                     extension: str = "txt") -> Dict[str, Any]:
    """Adapter: verify an artifact against its approved baseline."""
    from je_auto_control.utils.approval import verify_artifact
    result = verify_artifact(name, content, approvals_dir, extension)
    return {"status": result.status, "match": result.match,
            "approved_path": result.approved_path,
            "received_path": result.received_path}


def _approve_artifact(name: str, approvals_dir: str = _APPROVALS_DIR,
                      extension: str = "txt") -> Dict[str, Any]:
    """Adapter: promote a received artifact to the approved baseline."""
    from je_auto_control.utils.approval import approve_artifact
    return {"approved": approve_artifact(name, approvals_dir, extension)}


def _pending_artifacts(approvals_dir: str = _APPROVALS_DIR) -> Dict[str, Any]:
    """Adapter: list artifacts awaiting approval."""
    from je_auto_control.utils.approval import pending_artifacts
    return {"pending": pending_artifacts(approvals_dir)}


class Executor:
"""
Executor
@@ -3209,6 +3233,9 @@
"AC_egress_allow": _egress_allow,
"AC_egress_check": _egress_check,
"AC_egress_reset": _egress_reset,
"AC_verify_artifact": _verify_artifact,
"AC_approve_artifact": _approve_artifact,
"AC_pending_artifacts": _pending_artifacts,
"AC_a11y_record_start": _a11y_record_start,
"AC_a11y_record_stop": _a11y_record_stop,
"AC_a11y_record_events": _a11y_record_events,