Merged
7 changes: 7 additions & 0 deletions README.md
@@ -13,6 +13,7 @@

## Table of Contents

- [What's new (2026-06-20) — Perceptual-Hash Image Dedupe](#whats-new-2026-06-20--perceptual-hash-image-dedupe)
- [What's new (2026-06-20) — S3-Compatible Artifact Store](#whats-new-2026-06-20--s3-compatible-artifact-store)
- [What's new (2026-06-20) — Fuzzy String Matching & Dedupe](#whats-new-2026-06-20--fuzzy-string-matching--dedupe)
- [What's new (2026-06-19) — Video Step-Overlay Report](#whats-new-2026-06-19--video-step-overlay-report)
@@ -94,6 +95,12 @@

---

## What's new (2026-06-20) — Perceptual-Hash Image Dedupe

Collapse near-identical screenshots. Full reference: [`docs/source/Eng/doc/new_features/v42_features_doc.rst`](docs/source/Eng/doc/new_features/v42_features_doc.rst).

- **`average_hash` / `dhash` / `hamming_distance` / `images_similar` / `dedupe_images`** (`AC_image_hash` / `AC_dedupe_images`, `ac_*`): perceptual hashing maps visually similar images to close fingerprints, so near-duplicate frames in a recording or step report cluster by Hamming distance and collapse to one representative. Uses **Pillow** (already a core dependency, so no extra package is needed); the dedupe/compare logic is pure Python with an injectable `hasher`, so clustering is unit-tested without any image, while the real Pillow path is tested under `importorskip`.
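
As a rough illustration of what "cluster by Hamming distance" means for hex fingerprints, the sketch below reproduces the XOR-and-count approach from `perceptual_hash.py` in this change as a standalone function, so it runs without the package installed. The two fingerprints are made-up values, not hashes of real images.

```python
def hamming_distance(hash_a: str, hash_b: str) -> int:
    """Count the bits that differ between two hex fingerprints."""
    # XOR leaves a 1 wherever the two fingerprints disagree.
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")


# Made-up 64-bit fingerprints (16 hex digits each), for illustration only.
base = "ff00ff00ff00ff00"
almost_same = "ff00ff00ff00ff01"   # differs from base in the lowest bit
inverted = "00ff00ff00ff00ff"      # every bit flipped relative to base

near = hamming_distance(base, almost_same)
far = hamming_distance(base, inverted)
print(near, far)
```

With a tolerance such as `max_distance=5`, the first pair would count as near-duplicates and the second pair would not.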

## What's new (2026-06-20) — S3-Compatible Artifact Store

Push run artifacts to object storage. Full reference: [`docs/source/Eng/doc/new_features/v41_features_doc.rst`](docs/source/Eng/doc/new_features/v41_features_doc.rst).
7 changes: 7 additions & 0 deletions README/README_zh-CN.md
@@ -12,6 +12,7 @@

## Table of Contents

- [What's new (2026-06-20): Perceptual-Hash Image Dedupe](#本次更新-2026-06-20--感知哈希图像去重)
- [What's new (2026-06-20): S3-Compatible Artifact Store](#本次更新-2026-06-20--s3-兼容成品存储)
- [What's new (2026-06-20): Fuzzy String Matching & Dedupe](#本次更新-2026-06-20--模糊字符串匹配与去重)
- [What's new (2026-06-19): Video Step-Overlay Report](#本次更新-2026-06-19--视频步骤叠加报告)
@@ -93,6 +94,12 @@

---

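## What's new (2026-06-20): Perceptual-Hash Image Dedupe

Collapse near-identical screenshots. Full reference: [`docs/source/Zh/doc/new_features/v42_features_doc.rst`](../docs/source/Zh/doc/new_features/v42_features_doc.rst).

- **`average_hash` / `dhash` / `hamming_distance` / `images_similar` / `dedupe_images`** (`AC_image_hash` / `AC_dedupe_images`, `ac_*`): perceptual hashing maps visually similar images to close fingerprints, so near-duplicate frames in a recording or step report can be clustered by Hamming distance and collapsed to one representative. Uses **Pillow** (already a core dependency, so no extra package is needed); the dedupe/compare logic is pure Python with an injectable `hasher`, so clustering is unit-tested without any image, while the real Pillow path is tested under `importorskip`.

## What's new (2026-06-20): S3-Compatible Artifact Store

Push run artifacts to object storage. Full reference: [`docs/source/Zh/doc/new_features/v41_features_doc.rst`](../docs/source/Zh/doc/new_features/v41_features_doc.rst).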
7 changes: 7 additions & 0 deletions README/README_zh-TW.md
@@ -12,6 +12,7 @@

## Table of Contents

- [What's new (2026-06-20): Perceptual-Hash Image Dedupe](#本次更新-2026-06-20--感知雜湊影像去重)
- [What's new (2026-06-20): S3-Compatible Artifact Store](#本次更新-2026-06-20--s3-相容成品儲存)
- [What's new (2026-06-20): Fuzzy String Matching & Dedupe](#本次更新-2026-06-20--模糊字串比對與去重)
- [What's new (2026-06-19): Video Step-Overlay Report](#本次更新-2026-06-19--影片步驟疊加報告)
@@ -93,6 +94,12 @@

---

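## What's new (2026-06-20): Perceptual-Hash Image Dedupe

Collapse near-identical screenshots. Full reference: [`docs/source/Zh/doc/new_features/v42_features_doc.rst`](../docs/source/Zh/doc/new_features/v42_features_doc.rst).

- **`average_hash` / `dhash` / `hamming_distance` / `images_similar` / `dedupe_images`** (`AC_image_hash` / `AC_dedupe_images`, `ac_*`): perceptual hashing maps visually similar images to close fingerprints, so near-duplicate frames in a recording or step report can be clustered by Hamming distance and collapsed to one representative. Uses **Pillow** (already a core dependency, so no extra package is needed); the dedupe/compare logic is pure Python with an injectable `hasher`, so clustering is unit-tested without any image, while the real Pillow path is tested under `importorskip`.

## What's new (2026-06-20): S3-Compatible Artifact Store

Push run artifacts to object storage. Full reference: [`docs/source/Zh/doc/new_features/v41_features_doc.rst`](../docs/source/Zh/doc/new_features/v41_features_doc.rst).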
47 changes: 47 additions & 0 deletions docs/source/Eng/doc/new_features/v42_features_doc.rst
@@ -0,0 +1,47 @@
Perceptual-Hash Image Dedupe
============================

A screen recording or a step report often contains many nearly identical frames.
Perceptual hashes (average-hash and difference-hash) map visually similar images
to numerically close fingerprints, so frames can be clustered by Hamming distance
and collapsed — keeping one representative per distinct view.
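
To make the idea of "numerically close fingerprints" concrete, here is a minimal pure-Python sketch of how the two bit patterns can be derived from a grayscale grid. It works on plain lists of brightness values instead of Pillow images, and `average_bits`, `difference_bits` and the tiny 2x2 grids are hypothetical names and data chosen for illustration, not part of the package.

```python
from typing import List


def average_bits(pixels: List[int]) -> str:
    """One bit per pixel: 1 if the pixel is brighter than the mean, else 0."""
    mean = sum(pixels) / len(pixels)
    return "".join("1" if p > mean else "0" for p in pixels)


def difference_bits(pixels: List[int], width: int) -> str:
    """One bit per horizontal pair: 1 if a pixel is brighter than its right neighbour."""
    rows = len(pixels) // width
    return "".join(
        "1" if pixels[r * width + c] > pixels[r * width + c + 1] else "0"
        for r in range(rows) for c in range(width - 1)
    )


# Hypothetical 2x2 grayscale grids (row-major), for illustration only.
frame = [10, 200, 30, 220]
brighter = [p + 20 for p in frame]      # same view, uniformly brighter
mirrored = [200, 10, 220, 30]           # a visibly different layout

print(average_bits(frame), average_bits(brighter), average_bits(mirrored))
print(difference_bits(frame, 2), difference_bits(mirrored, 2))
```

The uniformly brighter grid yields the same average-hash bits as the original, while the mirrored grid flips every bit, which is the property the clustering relies on.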

The hashing functions use **Pillow** (already a core dependency, so no extra
package is required); the dedupe/compare logic is pure Python and the ``hasher`` is
injectable, so clustering is unit-testable without any image. The module does not
import ``PySide6``.

Headless API
------------

.. code-block:: python

    from je_auto_control import (
        average_hash, dhash, hamming_distance, images_similar, dedupe_images)

    h1 = average_hash("frame1.png")          # hex fingerprint
    h2 = average_hash("frame2.png")
    hamming_distance(h1, h2)                 # bits that differ
    images_similar(h1, h2, max_distance=5)   # within tolerance?

    dedupe_images(["a.png", "b.png", "c.png"], max_distance=5)
    # -> keeps one image per near-duplicate cluster (first wins)

``average_hash`` compares each pixel to the mean brightness; ``dhash`` compares
each pixel to its right neighbour (more robust to gamma shifts). ``dedupe_images``
accepts a ``hasher`` hook (defaulting to ``average_hash``) so the clustering can
be tested with precomputed hashes.
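
The injectable ``hasher`` can be exercised without any image files. The sketch below is a standalone re-statement of the first-wins clustering loop, assuming the behaviour described above; ``dedupe`` and the ``FAKE_HASHES`` table are hypothetical stand-ins, and the fingerprints are invented values.

```python
from typing import Any, Callable, List, Sequence


def hamming_distance(hash_a: str, hash_b: str) -> int:
    """Count the bits that differ between two hex fingerprints."""
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")


def dedupe(items: Sequence[Any], hasher: Callable[[Any], str],
           max_distance: int = 5) -> List[Any]:
    """Keep an item only if no already-kept item is within max_distance bits."""
    kept: List[Any] = []
    kept_hashes: List[str] = []
    for item in items:
        fingerprint = hasher(item)
        if any(hamming_distance(fingerprint, seen) <= max_distance
               for seen in kept_hashes):
            continue  # near-duplicate of something already kept
        kept.append(item)
        kept_hashes.append(fingerprint)
    return kept


# Invented fingerprints keyed by file name; no files are opened.
FAKE_HASHES = {"a.png": "0000", "b.png": "0001", "c.png": "ffff"}

unique = dedupe(list(FAKE_HASHES), FAKE_HASHES.__getitem__)
strict = dedupe(list(FAKE_HASHES), FAKE_HASHES.__getitem__, max_distance=0)
print(unique, strict)
```

Because the loop is greedy, the result depends on input order: the first member of each cluster is the one that survives.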

Executor commands
-----------------

================================  ===================================================
Command                           Effect
================================  ===================================================
``AC_image_hash``                 ``{hash}`` of an image (``algo``: average/dhash).
``AC_dedupe_images``              ``{unique}`` with near-duplicate images collapsed.
================================  ===================================================

``paths`` accepts a list or a JSON-string list, so the visual builder (whose
``paths`` field is a plain string) can pass it unchanged. The same operations are
exposed as MCP tools (``ac_image_hash`` / ``ac_dedupe_images``) and as Script
Builder commands under **Image**.
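
The executor adapter in this change passes ``paths`` through ``_coerce_list``, whose body is not shown here. As a rough idea of what "a list or a JSON-string list" could mean in practice, the following sketch uses a hypothetical helper named ``coerce_paths``; it is an assumption for illustration, not the project's implementation.

```python
import json
from typing import Any, List


def coerce_paths(value: Any) -> List[str]:
    """Hypothetical helper: accept a list/tuple or a JSON-encoded list of paths."""
    if isinstance(value, str):
        # A string is assumed to hold a JSON array such as '["a.png", "b.png"]'.
        value = json.loads(value)
    if not isinstance(value, (list, tuple)):
        raise TypeError("paths must be a list or a JSON-string list")
    return [str(item) for item in value]


from_builder = coerce_paths('["a.png", "b.png"]')   # string typed into a form field
from_script = coerce_paths(["a.png", "b.png"])      # native list from a script
print(from_builder == from_script)
```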
1 change: 1 addition & 0 deletions docs/source/Eng/eng_index.rst
@@ -64,6 +64,7 @@ Comprehensive guides for all AutoControl features.
doc/new_features/v39_features_doc
doc/new_features/v40_features_doc
doc/new_features/v41_features_doc
doc/new_features/v42_features_doc
doc/ocr_backends/ocr_backends_doc
doc/observability/observability_doc
doc/operations_layer/operations_layer_doc
43 changes: 43 additions & 0 deletions docs/source/Zh/doc/new_features/v42_features_doc.rst
@@ -0,0 +1,43 @@
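Perceptual-Hash Image Dedupe
============================

A screen recording or a step report often contains many nearly identical frames.
Perceptual hashes (average-hash and difference-hash) map visually similar images
to numerically close fingerprints, so frames can be clustered by Hamming distance
and collapsed, keeping one representative per distinct view.

The hashing functions use **Pillow** (already a core dependency, so no extra
package is required); the dedupe/compare logic is pure Python and the ``hasher`` is
injectable, so clustering is unit-testable without any image. The module does not
import ``PySide6``.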

Headless API
------------

.. code-block:: python

    from je_auto_control import (
        average_hash, dhash, hamming_distance, images_similar, dedupe_images)

    h1 = average_hash("frame1.png")          # hex fingerprint
    h2 = average_hash("frame2.png")
    hamming_distance(h1, h2)                 # bits that differ
    images_similar(h1, h2, max_distance=5)   # within tolerance?

    dedupe_images(["a.png", "b.png", "c.png"], max_distance=5)
    # -> keeps one image per near-duplicate cluster (first wins)

``average_hash`` compares each pixel to the mean brightness; ``dhash`` compares
each pixel to its right neighbour (more robust to gamma shifts). ``dedupe_images``
accepts a ``hasher`` hook (defaulting to ``average_hash``) so the clustering can
be tested with precomputed hashes.

Executor commands
-----------------

================================  ===================================================
Command                           Effect
================================  ===================================================
``AC_image_hash``                 ``{hash}`` of an image (``algo``: average/dhash).
``AC_dedupe_images``              ``{unique}`` with near-duplicate images collapsed.
================================  ===================================================

``paths`` accepts a list or a JSON-string list (so the visual builder works). The
same operations are exposed as MCP tools (``ac_image_hash`` / ``ac_dedupe_images``)
and as Script Builder commands under **Image**.
1 change: 1 addition & 0 deletions docs/source/Zh/zh_index.rst
@@ -64,6 +64,7 @@ Comprehensive guides for all AutoControl features.
doc/new_features/v39_features_doc
doc/new_features/v40_features_doc
doc/new_features/v41_features_doc
doc/new_features/v42_features_doc
doc/ocr_backends/ocr_backends_doc
doc/observability/observability_doc
doc/operations_layer/operations_layer_doc
6 changes: 6 additions & 0 deletions je_auto_control/__init__.py
@@ -239,6 +239,10 @@
    S3ArtifactStore, configure_default_store, get_default_store,
    set_default_store,
)
# Perceptual-hash image dedupe (Pillow aHash/dHash)
from je_auto_control.utils.image_dedup import (
    average_hash, dedupe_images, dhash, hamming_distance, images_similar,
)
# Background popup/interrupt watchdog (unattended automation)
from je_auto_control.utils.watchdog import (
    PopupWatchdog, WatchdogRule, default_popup_watchdog,
@@ -688,6 +692,8 @@ def start_autocontrol_gui(*args, **kwargs):
    "fuzzy_best_match", "fuzzy_dedupe", "fuzzy_matches", "fuzzy_ratio",
    "S3ArtifactStore", "configure_default_store", "get_default_store",
    "set_default_store",
    "average_hash", "dedupe_images", "dhash", "hamming_distance",
    "images_similar",
    # MCP server
    "AuditLogger", "HttpMCPServer", "MCPContent", "MCPPrompt",
    "MCPPromptArgument", "MCPResource", "MCPServer", "MCPTool",
18 changes: 18 additions & 0 deletions je_auto_control/gui/script_builder/command_schema.py
@@ -928,6 +928,24 @@ def _add_misc_specs(specs: List[CommandSpec]) -> None:
        fields=(FieldSpec("key", FieldType.STRING),),
        description="Delete an object from the default S3 store.",
    ))
    specs.append(CommandSpec(
        "AC_image_hash", "Image", "Perceptual Hash",
        fields=(
            FieldSpec("path", FieldType.FILE_PATH),
            FieldSpec("algo", FieldType.ENUM, optional=True, default="average",
                      choices=("average", "dhash")),
        ),
        description="Perceptual hash of an image (average or dhash).",
    ))
    specs.append(CommandSpec(
        "AC_dedupe_images", "Image", "Dedupe Near-Identical Images",
        fields=(
            FieldSpec("paths", FieldType.STRING,
                      placeholder='["a.png", "b.png"]'),
            FieldSpec("max_distance", FieldType.INT, optional=True, default=5),
        ),
        description="Collapse near-duplicate images by perceptual hash.",
    ))
    specs.append(CommandSpec(
        "AC_generate_sop", "Report", "Generate SOP Document",
        fields=(
16 changes: 16 additions & 0 deletions je_auto_control/utils/executor/action_executor.py
@@ -3122,6 +3122,20 @@ def _s3_delete(key: str) -> Dict[str, Any]:
    return {"deleted": get_default_store().delete(key)}


def _image_hash(path: str, algo: str = "average") -> Dict[str, Any]:
    """Adapter: perceptual hash of an image (average or dhash)."""
    from je_auto_control.utils.image_dedup import average_hash, dhash
    hasher = dhash if algo == "dhash" else average_hash
    return {"hash": hasher(path)}


def _dedupe_images(paths: Any, max_distance: int = 5) -> Dict[str, Any]:
    """Adapter: drop near-duplicate images, keeping the first of each cluster."""
    from je_auto_control.utils.image_dedup import dedupe_images
    return {"unique": dedupe_images(_coerce_list(paths),
                                    max_distance=max_distance)}


class Executor:
    """
    Executor
@@ -3381,6 +3395,8 @@ def __init__(self):
            "AC_s3_download": _s3_download,
            "AC_s3_list": _s3_list,
            "AC_s3_delete": _s3_delete,
            "AC_image_hash": _image_hash,
            "AC_dedupe_images": _dedupe_images,
            "AC_a11y_record_start": _a11y_record_start,
            "AC_a11y_record_stop": _a11y_record_stop,
            "AC_a11y_record_events": _a11y_record_events,
9 changes: 9 additions & 0 deletions je_auto_control/utils/image_dedup/__init__.py
@@ -0,0 +1,9 @@
"""Perceptual-hash image dedupe (Pillow-based aHash/dHash, no extra deps)."""
from je_auto_control.utils.image_dedup.perceptual_hash import (
    average_hash, dedupe_images, dhash, hamming_distance, images_similar,
)

__all__ = [
    "average_hash", "dedupe_images", "dhash", "hamming_distance",
    "images_similar",
]
74 changes: 74 additions & 0 deletions je_auto_control/utils/image_dedup/perceptual_hash.py
@@ -0,0 +1,74 @@
"""Perceptual hashing to dedupe near-identical screenshots / frames.

A screen recording or a step report often contains many nearly identical frames.
Perceptual hashes (average-hash and difference-hash) map visually similar images
to numerically close fingerprints, so they can be clustered by Hamming distance
and collapsed — keeping one representative per distinct view.

The hashing functions use **Pillow** (already a core dependency — no extra
package required); the dedupe/compare logic is pure Python and the ``hasher`` is
injectable, so clustering is unit-testable without any image. Imports no
``PySide6``.
"""
from typing import Any, Callable, List, Optional, Sequence


def _gray_resized(image: Any, size: tuple) -> Any:
    from PIL import Image
    img = image if hasattr(image, "convert") else Image.open(image)
    return img.convert("L").resize(size)


def _bits_to_hex(bits: str) -> str:
    width = (len(bits) + 3) // 4
    return f"{int(bits, 2):0{width}x}" if bits else "0"


def average_hash(image: Any, hash_size: int = 8) -> str:
    """Average-hash an image to a hex fingerprint (brightness vs. the mean)."""
    pixels = list(_gray_resized(image, (hash_size, hash_size)).getdata())
    average = sum(pixels) / len(pixels)
    return _bits_to_hex("".join("1" if p > average else "0" for p in pixels))


def dhash(image: Any, hash_size: int = 8) -> str:
    """Difference-hash an image (each pixel brighter than its right neighbour)."""
    width = hash_size + 1
    pixels = list(_gray_resized(image, (width, hash_size)).getdata())
    bits = [
        "1" if pixels[row * width + col] > pixels[row * width + col + 1]
        else "0"
        for row in range(hash_size) for col in range(hash_size)
    ]
    return _bits_to_hex("".join(bits))


def hamming_distance(hash_a: str, hash_b: str) -> int:
    """Number of differing bits between two hex fingerprints."""
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")


def images_similar(hash_a: str, hash_b: str, max_distance: int = 5) -> bool:
    """Whether two fingerprints are within ``max_distance`` bits."""
    return hamming_distance(hash_a, hash_b) <= max_distance


def dedupe_images(images: Sequence[Any], *, max_distance: int = 5,
                  hasher: Optional[Callable[[Any], str]] = None) -> List[Any]:
    """Keep one image per near-duplicate cluster (first wins).

    Each image is dropped when its hash is within ``max_distance`` bits of an
    already-kept image. ``hasher`` defaults to :func:`average_hash`; inject a
    fake to test the clustering without real images.
    """
    compute = hasher or average_hash
    kept: List[Any] = []
    kept_hashes: List[str] = []
    for image in images:
        fingerprint = compute(image)
        if any(hamming_distance(fingerprint, seen) <= max_distance
               for seen in kept_hashes):
            continue
        kept.append(image)
        kept_hashes.append(fingerprint)
    return kept
30 changes: 29 additions & 1 deletion je_auto_control/utils/mcp_server/tools/_factories.py
@@ -2927,6 +2927,34 @@ def artifact_store_tools() -> List[MCPTool]:
    ]


def image_dedup_tools() -> List[MCPTool]:
    return [
        MCPTool(
            name="ac_image_hash",
            description=("Perceptual hash of an image file for similarity "
                         "comparison. 'algo' is 'average' (default) or "
                         "'dhash'. Returns {hash} (hex)."),
            input_schema=schema(
                {"path": {"type": "string"},
                 "algo": {"type": "string", "enum": ["average", "dhash"]}},
                ["path"]),
            handler=h.image_hash,
            annotations=READ_ONLY,
        ),
        MCPTool(
            name="ac_dedupe_images",
            description=("Collapse near-duplicate images by perceptual hash, "
                         "keeping the first of each cluster (images within "
                         "'max_distance' bits are dropped). Returns {unique}."),
            input_schema=schema(
                {"paths": {"type": "array", "items": {"type": "string"}},
                 "max_distance": {"type": "integer"}}, ["paths"]),
            handler=h.dedupe_images,
            annotations=READ_ONLY,
        ),
    ]


def unattended_tools() -> List[MCPTool]:
    return [
        MCPTool(
@@ -3987,7 +4015,7 @@ def media_assert_tools() -> List[MCPTool]:
     process_doc_tools, tween_drag_tools, plugin_sdk_tools, governance_tools,
     credential_lease_tools, egress_tools, approval_testing_tools,
     trajectory_eval_tools, compliance_tools, agent_trace_tools,
-    video_report_tools, fuzzy_tools, artifact_store_tools,
+    video_report_tools, fuzzy_tools, artifact_store_tools, image_dedup_tools,
     screen_record_tools,
     process_and_shell_tools, remote_desktop_tools, gamepad_tools,
     usb_passthrough_tools, assertion_tools, data_source_tools,
11 changes: 11 additions & 0 deletions je_auto_control/utils/mcp_server/tools/_handlers.py
@@ -1412,6 +1412,17 @@ def s3_delete(key):
    return {"deleted": get_default_store().delete(key)}


def image_hash(path, algo="average"):
    from je_auto_control.utils.image_dedup import average_hash, dhash
    hasher = dhash if algo == "dhash" else average_hash
    return {"hash": hasher(path)}


def dedupe_images(paths, max_distance=5):
    from je_auto_control.utils.image_dedup import dedupe_images as _dedupe
    return {"unique": _dedupe(paths, max_distance=max_distance)}


def vlm_locate(description: str,
               screen_region: Optional[List[int]] = None,
               model: Optional[str] = None) -> Optional[List[int]]: