diff --git a/README.md b/README.md
index 9dc5e697..f1aedfca 100644
--- a/README.md
+++ b/README.md
@@ -13,6 +13,7 @@
 
 ## Table of Contents
 
+- [What's new (2026-06-22) — Moving-Average Smoothing](#whats-new-2026-06-22--moving-average-smoothing)
 - [What's new (2026-06-22) — Single-Series Anomaly Detection](#whats-new-2026-06-22--single-series-anomaly-detection)
 - [What's new (2026-06-22) — Near-Duplicate Text Detection (SimHash / MinHash)](#whats-new-2026-06-22--near-duplicate-text-detection-simhash--minhash)
 - [What's new (2026-06-22) — String-Distance Similarity Metrics](#whats-new-2026-06-22--string-distance-similarity-metrics)
@@ -154,6 +155,12 @@
 
 ---
 
+## What's new (2026-06-22) — Moving-Average Smoothing
+
+Smooth a noisy value series. Full reference: [`docs/source/Eng/doc/new_features/v102_features_doc.rst`](docs/source/Eng/doc/new_features/v102_features_doc.rst).
+
+- **`sma` / `wma` / `ewma` / `rolling`** (`AC_sma`, `AC_ewma`): `stats.describe` summarizes a whole sample and `timeseries` rolls counters into rates, but nothing smoothed a noisy signal. This adds trailing simple/weighted/exponentially-weighted moving averages and a generic rolling reducer, all returning a same-length list aligned to the input timeline. Pure-stdlib, deterministic.
+
 ## What's new (2026-06-22) — Single-Series Anomaly Detection
 
 Flag the spike in one live metric series. Full reference: [`docs/source/Eng/doc/new_features/v101_features_doc.rst`](docs/source/Eng/doc/new_features/v101_features_doc.rst).
diff --git a/README/README_zh-CN.md b/README/README_zh-CN.md
index 88181797..c6d27157 100644
--- a/README/README_zh-CN.md
+++ b/README/README_zh-CN.md
@@ -12,6 +12,7 @@
 
 ## 目录
 
+- [本次更新 (2026-06-22) — 移动平均平滑](#本次更新-2026-06-22--移动平均平滑)
 - [本次更新 (2026-06-22) — 单序列异常检测](#本次更新-2026-06-22--单序列异常检测)
 - [本次更新 (2026-06-22) — 近似重复文本检测(SimHash / MinHash)](#本次更新-2026-06-22--近似重复文本检测simhash--minhash)
 - [本次更新 (2026-06-22) — 字符串距离相似度量](#本次更新-2026-06-22--字符串距离相似度量)
@@ -153,6 +154,12 @@
 
 ---
 
+## 本次更新 (2026-06-22) — 移动平均平滑
+
+平滑噪声值序列。完整参考:[`docs/source/Zh/doc/new_features/v102_features_doc.rst`](../docs/source/Zh/doc/new_features/v102_features_doc.rst)。
+
+- **`sma` / `wma` / `ewma` / `rolling`**(`AC_sma`、`AC_ewma`):`stats.describe` 汇总整个样本,`timeseries` 把计数器滚成速率,但没有东西能平滑噪声信号。本功能加入尾端简单/加权/指数加权移动平均与通用滚动归约器,全部返回与输入时间线对齐的等长 list。纯标准库、确定。
+
 ## 本次更新 (2026-06-22) — 单序列异常检测
 
 标记单一实时度量序列中的尖峰。完整参考:[`docs/source/Zh/doc/new_features/v101_features_doc.rst`](../docs/source/Zh/doc/new_features/v101_features_doc.rst)。
diff --git a/README/README_zh-TW.md b/README/README_zh-TW.md
index c564c6da..fdad9cbd 100644
--- a/README/README_zh-TW.md
+++ b/README/README_zh-TW.md
@@ -12,6 +12,7 @@
 
 ## 目錄
 
+- [本次更新 (2026-06-22) — 移動平均平滑](#本次更新-2026-06-22--移動平均平滑)
 - [本次更新 (2026-06-22) — 單序列異常偵測](#本次更新-2026-06-22--單序列異常偵測)
 - [本次更新 (2026-06-22) — 近似重複文字偵測(SimHash / MinHash)](#本次更新-2026-06-22--近似重複文字偵測simhash--minhash)
 - [本次更新 (2026-06-22) — 字串距離相似度量](#本次更新-2026-06-22--字串距離相似度量)
@@ -153,6 +154,12 @@
 
 ---
 
+## 本次更新 (2026-06-22) — 移動平均平滑
+
+平滑雜訊值序列。完整參考:[`docs/source/Zh/doc/new_features/v102_features_doc.rst`](../docs/source/Zh/doc/new_features/v102_features_doc.rst)。
+
+- **`sma` / `wma` / `ewma` / `rolling`**(`AC_sma`、`AC_ewma`):`stats.describe` 彙總整個樣本,`timeseries` 把計數器滾成速率,但沒有東西能平滑雜訊訊號。本功能加入尾端簡單/加權/指數加權移動平均與通用滾動歸約器,全部回傳與輸入時間線對齊的等長 list。純標準函式庫、具決定性。
+
 ## 本次更新 (2026-06-22) — 單序列異常偵測
 
 標記單一即時度量序列中的尖峰。完整參考:[`docs/source/Zh/doc/new_features/v101_features_doc.rst`](../docs/source/Zh/doc/new_features/v101_features_doc.rst)。
diff --git a/docs/source/Eng/doc/new_features/v102_features_doc.rst b/docs/source/Eng/doc/new_features/v102_features_doc.rst
new file mode 100644
index 00000000..7befe796
--- /dev/null
+++ b/docs/source/Eng/doc/new_features/v102_features_doc.rst
@@ -0,0 +1,35 @@
+Moving-Average Smoothing
+========================
+
+``stats.describe`` summarises a whole sample and ``timeseries`` rolls counters
+into rates, but nothing smoothed a noisy signal or weighted recent points. This
+adds trailing simple / weighted / exponentially-weighted moving averages and a
+generic rolling reducer.
+
+Pure standard library; imports no ``PySide6``. Every function is pure (values
+in, list out), so it is fully deterministic in CI.
+
+Headless API
+------------
+
+.. code-block:: python
+
+    from je_auto_control import sma, wma, ewma, rolling
+
+    sma([1, 2, 3, 4], 2)          # [1.0, 1.5, 2.5, 3.5]
+    ewma([1, 2, 3], alpha=0.5)    # [1.0, 1.5, 2.25]
+    wma(values, [1, 2, 3])        # weights align to the latest points
+    rolling(values, 5, max)       # generic trailing-window reduction
+
+``sma`` averages each trailing window of ``window`` points; ``wma`` applies the
+given weights (latest-aligned); ``ewma`` smooths with factor ``alpha`` in
+``(0, 1]``; ``rolling`` applies any reducer over each trailing window. All
+return a same-length list, so the result lines up with the input timeline (a
+``resource_profiler`` FPS/CPU series, a latency stream, etc.).
+
+Executor commands
+-----------------
+
+``AC_sma`` takes ``values`` and a ``window`` and returns ``{series}``; ``AC_ewma``
+takes ``values`` and an optional ``alpha`` (default ``0.3``). Both are exposed as
+MCP tools (``ac_sma`` / ``ac_ewma``) and as Script Builder commands under **Data**.
diff --git a/docs/source/Eng/eng_index.rst b/docs/source/Eng/eng_index.rst
index 17f7ac91..5a4c0510 100644
--- a/docs/source/Eng/eng_index.rst
+++ b/docs/source/Eng/eng_index.rst
@@ -124,6 +124,7 @@ Comprehensive guides for all AutoControl features.
    doc/new_features/v99_features_doc
    doc/new_features/v100_features_doc
    doc/new_features/v101_features_doc
+   doc/new_features/v102_features_doc
    doc/ocr_backends/ocr_backends_doc
    doc/observability/observability_doc
    doc/operations_layer/operations_layer_doc
diff --git a/docs/source/Zh/doc/new_features/v102_features_doc.rst b/docs/source/Zh/doc/new_features/v102_features_doc.rst
new file mode 100644
index 00000000..13612133
--- /dev/null
+++ b/docs/source/Zh/doc/new_features/v102_features_doc.rst
@@ -0,0 +1,29 @@
+移動平均平滑
+============
+
+``stats.describe`` 彙總整個樣本,``timeseries`` 把計數器滾成速率,但沒有東西能平滑雜訊訊號或加權近期點。
+本功能加入尾端的簡單 / 加權 / 指數加權移動平均,以及一個通用的滾動歸約器。
+
+純標準函式庫;不匯入 ``PySide6``。每個函式皆為純函式(輸入值、輸出 list),因此在 CI 中完全具決定性。
+
+無頭 API
+--------
+
+.. code-block:: python
+
+    from je_auto_control import sma, wma, ewma, rolling
+
+    sma([1, 2, 3, 4], 2)          # [1.0, 1.5, 2.5, 3.5]
+    ewma([1, 2, 3], alpha=0.5)    # [1.0, 1.5, 2.25]
+    wma(values, [1, 2, 3])        # 權重對齊到最新的點
+    rolling(values, 5, max)       # 通用尾端視窗歸約
+
+``sma`` 對每個 ``window`` 點的尾端視窗取平均;``wma`` 套用給定權重(對齊最新);``ewma`` 以 ``(0, 1]`` 的
+``alpha`` 平滑;``rolling`` 對每個尾端視窗套用任意歸約器。全部回傳等長 list,因此結果與輸入時間線對齊
+(``resource_profiler`` FPS/CPU 序列、延遲串流等)。
+
+執行器命令
+----------
+
+``AC_sma`` 對 ``values`` 在 ``window`` 上回傳 ``{series}``;``AC_ewma`` 對 ``alpha`` 回傳 ``{series}``。
+兩者皆以 MCP 工具(``ac_sma`` / ``ac_ewma``)以及 Script Builder 中 **Data** 分類下的命令提供。
diff --git a/docs/source/Zh/zh_index.rst b/docs/source/Zh/zh_index.rst
index b1457a45..fc78a5ef 100644
--- a/docs/source/Zh/zh_index.rst
+++ b/docs/source/Zh/zh_index.rst
@@ -124,6 +124,7 @@ AutoControl 所有功能的完整使用指南。
    doc/new_features/v99_features_doc
    doc/new_features/v100_features_doc
    doc/new_features/v101_features_doc
+   doc/new_features/v102_features_doc
    doc/ocr_backends/ocr_backends_doc
    doc/observability/observability_doc
    doc/operations_layer/operations_layer_doc
diff --git a/je_auto_control/__init__.py b/je_auto_control/__init__.py
index 485fbadc..05157793 100644
--- a/je_auto_control/__init__.py
+++ b/je_auto_control/__init__.py
@@ -434,6 +434,8 @@
     detect_anomalies, ewma_control, mad_anomalies, mad_scores,
     zscore_anomalies, zscore_scores,
 )
+# Moving-average smoothing (SMA / WMA / EWMA / rolling)
+from je_auto_control.utils.smoothing import ewma, rolling, sma, wma
 # Bulkhead concurrency isolation + rate-limit header parsing
 from je_auto_control.utils.bulkhead import (
     Bulkhead, BulkheadFullError, next_delay, parse_ratelimit, parse_retry_after,
@@ -1005,6 +1007,7 @@ def start_autocontrol_gui(*args, **kwargs):
     "ts_rate", "ts_resample",
     "detect_anomalies", "ewma_control", "mad_anomalies", "mad_scores",
     "zscore_anomalies", "zscore_scores",
+    "ewma", "rolling", "sma", "wma",
     "Bulkhead", "BulkheadFullError", "next_delay", "parse_ratelimit",
     "parse_retry_after",
     "Cassette", "CassetteMissError",
diff --git a/je_auto_control/gui/script_builder/command_schema.py b/je_auto_control/gui/script_builder/command_schema.py
index 8edecef3..b3343d89 100644
--- a/je_auto_control/gui/script_builder/command_schema.py
+++ b/je_auto_control/gui/script_builder/command_schema.py
@@ -1979,6 +1979,22 @@ def _add_resilience_specs(specs: List[CommandSpec]) -> None:
         ),
         description="Flag outliers in a numeric series (MAD / z-score).",
     ))
+    specs.append(CommandSpec(
+        "AC_sma", "Data", "Smoothing: Simple Moving Average",
+        fields=(
+            FieldSpec("values", FieldType.STRING, placeholder="[1, 2, 3, 4, 5]"),
+            FieldSpec("window", FieldType.INT, placeholder="3"),
+        ),
+        description="Trailing simple moving average over a window.",
+    ))
+    specs.append(CommandSpec(
+        "AC_ewma", "Data", "Smoothing: EWMA",
+        fields=(
+            FieldSpec("values", FieldType.STRING, placeholder="[1, 2, 3, 4, 5]"),
+            FieldSpec("alpha", FieldType.FLOAT, optional=True, default=0.3),
+        ),
+        description="Exponentially-weighted moving average of a series.",
+    ))
     specs.append(CommandSpec(
         "AC_diff_rows", "Data", "Dataset Diff: Rows by Key",
         fields=(
diff --git a/je_auto_control/utils/executor/action_executor.py b/je_auto_control/utils/executor/action_executor.py
index 45187de4..e48147bd 100644
--- a/je_auto_control/utils/executor/action_executor.py
+++ b/je_auto_control/utils/executor/action_executor.py
@@ -3423,6 +3423,24 @@ def _detect_anomalies(values: Any, method: str = "mad",
                                           threshold=threshold)}
 
 
+def _sma(values: Any, window: Any) -> Dict[str, Any]:
+    """Adapter: trailing simple moving average."""
+    import json
+    from je_auto_control.utils.smoothing import sma
+    if isinstance(values, str):
+        values = json.loads(values)
+    return {"series": sma(values, int(window))}
+
+
+def _ewma(values: Any, alpha: Any = 0.3) -> Dict[str, Any]:
+    """Adapter: exponentially-weighted moving average."""
+    import json
+    from je_auto_control.utils.smoothing import ewma
+    if isinstance(values, str):
+        values = json.loads(values)
+    return {"series": ewma(values, alpha=float(alpha))}
+
+
 def _evaluate_slo(records: Any, target: float,
                   window_s: Optional[float] = None) -> Dict[str, Any]:
     """Adapter: SLI + error budget for outcome records (list or JSON string)."""
@@ -4524,6 +4542,8 @@ def __init__(self):
             "AC_ts_rate": _ts_rate,
             "AC_ts_downsample": _ts_downsample,
             "AC_detect_anomalies": _detect_anomalies,
+            "AC_sma": _sma,
+            "AC_ewma": _ewma,
             "AC_detect_drift": _detect_drift,
             "AC_categorical_drift": _categorical_drift,
             "AC_diff_rows": _diff_rows,
diff --git a/je_auto_control/utils/mcp_server/tools/_factories.py b/je_auto_control/utils/mcp_server/tools/_factories.py
index 706de390..78c6dd06 100644
--- a/je_auto_control/utils/mcp_server/tools/_factories.py
+++ b/je_auto_control/utils/mcp_server/tools/_factories.py
@@ -3548,6 +3548,31 @@ def dataset_diff_tools() -> List[MCPTool]:
     ]
 
 
+def smoothing_tools() -> List[MCPTool]:
+    return [
+        MCPTool(
+            name="ac_sma",
+            description=("Trailing simple moving average of a numeric 'values' "
+                         "series over the last 'window' points. Returns {series}."),
+            input_schema=schema(
+                {"values": {"type": "array"}, "window": {"type": "integer"}},
+                ["values", "window"]),
+            handler=h.sma,
+            annotations=READ_ONLY,
+        ),
+        MCPTool(
+            name="ac_ewma",
+            description=("Exponentially-weighted moving average of 'values' with "
+                         "smoothing factor 'alpha'. Returns {series}."),
+            input_schema=schema(
+                {"values": {"type": "array"}, "alpha": {"type": "number"}},
+                ["values"]),
+            handler=h.ewma,
+            annotations=READ_ONLY,
+        ),
+    ]
+
+
 def anomaly_tools() -> List[MCPTool]:
     return [
         MCPTool(
@@ -5499,7 +5524,7 @@ def media_assert_tools() -> List[MCPTool]:
         secret_ref_tools, config_schema_tools, config_redaction_tools,
         data_profile_tools, http_problem_tools, dotenv_tools, sse_client_tools,
         layered_config_tools, data_drift_tools, schema_compat_tools,
-        timeseries_tools, anomaly_tools,
+        timeseries_tools, anomaly_tools, smoothing_tools,
         dataset_diff_tools, referential_tools, link_header_tools, multipart_tools,
         http_content_tools, cookie_jar_tools, http_conditional_tools, saga_tools,
         decision_table_tools, locator_repair_tools,
diff --git a/je_auto_control/utils/mcp_server/tools/_handlers.py b/je_auto_control/utils/mcp_server/tools/_handlers.py
index a576f103..f67e9dd9 100644
--- a/je_auto_control/utils/mcp_server/tools/_handlers.py
+++ b/je_auto_control/utils/mcp_server/tools/_handlers.py
@@ -1910,6 +1910,16 @@ def detect_anomalies(values, method="mad", threshold=None):
     return _detect_anomalies(values, method, threshold)
 
 
+def sma(values, window):
+    from je_auto_control.utils.executor.action_executor import _sma
+    return _sma(values, window)
+
+
+def ewma(values, alpha=0.3):
+    from je_auto_control.utils.executor.action_executor import _ewma
+    return _ewma(values, alpha)
+
+
 def detect_drift(reference, current, threshold=0.25, bins=10):
     from je_auto_control.utils.executor.action_executor import _detect_drift
     return _detect_drift(reference, current, threshold, bins)
diff --git a/je_auto_control/utils/smoothing/__init__.py b/je_auto_control/utils/smoothing/__init__.py
new file mode 100644
index 00000000..6cb2cd1d
--- /dev/null
+++ b/je_auto_control/utils/smoothing/__init__.py
@@ -0,0 +1,4 @@
+"""Moving-average smoothing for AutoControl value series."""
+from je_auto_control.utils.smoothing.smoothing import ewma, rolling, sma, wma
+
+__all__ = ["ewma", "rolling", "sma", "wma"]
diff --git a/je_auto_control/utils/smoothing/smoothing.py b/je_auto_control/utils/smoothing/smoothing.py
new file mode 100644
index 00000000..4767fa25
--- /dev/null
+++ b/je_auto_control/utils/smoothing/smoothing.py
@@ -0,0 +1,57 @@
+"""Moving-average smoothing for noisy value series.
+
+``stats.describe`` summarises a whole sample and ``timeseries`` rolls counters
+into rates, but nothing smooths a noisy signal or weights recent points. This
+adds trailing simple / weighted / exponentially-weighted moving averages and a
+generic rolling reducer, e.g. for a ``resource_profiler`` FPS/CPU timeline.
+
+Pure standard library; imports no ``PySide6``. Every function is pure (values
+in, list out), so it is fully deterministic in CI.
+"""
+from typing import Callable, List, Sequence
+
+
+def sma(values: Sequence[float], window: int) -> List[float]:
+    """Trailing simple moving average over the last ``window`` points."""
+    if window <= 0:
+        raise ValueError("window must be positive")
+    out: List[float] = []
+    for i in range(len(values)):
+        chunk = values[max(0, i - window + 1):i + 1]
+        out.append(sum(chunk) / len(chunk))
+    return out
+
+
+def wma(values: Sequence[float], weights: Sequence[float]) -> List[float]:
+    """Trailing weighted moving average; ``weights`` align to the latest points."""
+    weight_list = list(weights)
+    if not weight_list:
+        raise ValueError("weights must be non-empty")
+    out: List[float] = []
+    for i in range(len(values)):
+        chunk = values[max(0, i - len(weight_list) + 1):i + 1]
+        applied = weight_list[-len(chunk):]
+        out.append(sum(x * w for x, w in zip(chunk, applied)) / sum(applied))
+    return out
+
+
+def ewma(values: Sequence[float], *, alpha: float) -> List[float]:
+    """Exponentially-weighted moving average (smoothing factor ``alpha``)."""
+    if not 0 < alpha <= 1:
+        raise ValueError("alpha must be in (0, 1]")
+    out: List[float] = []
+    previous = None
+    for value in values:
+        previous = value if previous is None else alpha * value + (
+            1 - alpha) * previous
+        out.append(previous)
+    return out
+
+
+def rolling(values: Sequence[float], window: int,
+            func: Callable[[Sequence[float]], float]) -> List[float]:
+    """Apply ``func`` over each trailing window of size ``window``."""
+    if window <= 0:
+        raise ValueError("window must be positive")
+    return [func(values[max(0, i - window + 1):i + 1])
+            for i in range(len(values))]
diff --git a/test/unit_test/headless/test_smoothing_batch.py b/test/unit_test/headless/test_smoothing_batch.py
new file mode 100644
index 00000000..61bfd2c1
--- /dev/null
+++ b/test/unit_test/headless/test_smoothing_batch.py
@@ -0,0 +1,67 @@
+"""Headless tests for moving-average smoothing. Pure stdlib, no Qt."""
+import json
+import statistics
+
+import pytest
+
+import je_auto_control as ac
+from je_auto_control.utils.smoothing import ewma, rolling, sma, wma
+
+
+def test_sma_trailing():
+    assert sma([1, 2, 3, 4], 2) == pytest.approx([1.0, 1.5, 2.5, 3.5])
+    assert sma([5, 5, 5], 3) == pytest.approx([5.0, 5.0, 5.0])
+
+
+def test_wma_weights_latest():
+    # weights [1,2] weight the latest point twice
+    assert wma([1, 3], [1, 2]) == pytest.approx([1.0, (1 * 1 + 3 * 2) / 3])
+
+
+def test_ewma():
+    out = ewma([1, 2, 3], alpha=0.5)
+    assert out[0] == pytest.approx(1.0)
+    assert out[1] == pytest.approx(1.5)
+    assert out[2] == pytest.approx(2.25)
+
+
+def test_rolling_generic():
+    assert rolling([1, 9, 2, 8], 2, max) == pytest.approx([1, 9, 9, 8])
+    assert rolling([1, 2, 3], 3, statistics.fmean)[-1] == pytest.approx(2.0)
+
+
+def test_validation():
+    for bad in (lambda: sma([1], 0), lambda: rolling([1], 0, sum),
+                lambda: wma([1], []), lambda: list(ewma([1], alpha=0))):
+        with pytest.raises(ValueError):
+            bad()
+
+
+# --- wiring ---------------------------------------------------------------
+
+def test_executor_round_trip():
+    rec = ac.execute_action([[
+        "AC_sma", {"values": json.dumps([1, 2, 3, 4]), "window": 2}]])
+    assert next(v for v in rec.values()
+                if isinstance(v, dict))["series"] == pytest.approx(
+        [1.0, 1.5, 2.5, 3.5])
+    rec2 = ac.execute_action([[
+        "AC_ewma", {"values": json.dumps([1, 2, 3]), "alpha": 0.5}]])
+    assert next(v for v in rec2.values()
+                if isinstance(v, dict))["series"][-1] == pytest.approx(2.25)
+
+
+def test_wiring():
+    known = ac.executor.known_commands()
+    assert {"AC_sma", "AC_ewma"} <= set(known)
+    from je_auto_control.utils.mcp_server.tools import build_default_tool_registry
+    names = {t.name for t in build_default_tool_registry()}
+    assert {"ac_sma", "ac_ewma"} <= names
+    from je_auto_control.gui.script_builder.command_schema import _build_specs
+    specs = {s.command for s in _build_specs()}
+    assert {"AC_sma", "AC_ewma"} <= specs
+
+
+def test_facade_exports():
+    for attr in ("sma", "wma", "ewma", "rolling"):
+        assert hasattr(ac, attr) and attr in ac.__all__
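
Review note: the patch documents that every smoother returns a same-length list, which implies the first few outputs are computed over a shorter warm-up window. The sketch below is one way to see that behaviour side by side. It assumes the three functions behave exactly as written in `smoothing.py` in this patch; they are re-declared (validation omitted) so the snippet runs without `je_auto_control` installed, and the `latency_ms` series is made up for illustration.

```python
from typing import List, Sequence


def sma(values: Sequence[float], window: int) -> List[float]:
    """Trailing simple moving average (same logic as the patch)."""
    out: List[float] = []
    for i in range(len(values)):
        # The window is shorter than `window` until enough points exist.
        chunk = values[max(0, i - window + 1):i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def wma(values: Sequence[float], weights: Sequence[float]) -> List[float]:
    """Trailing weighted moving average (same logic as the patch)."""
    weight_list = list(weights)
    out: List[float] = []
    for i in range(len(values)):
        chunk = values[max(0, i - len(weight_list) + 1):i + 1]
        # During warm-up only the last len(chunk) weights are used,
        # so the newest point always gets the final (largest) weight.
        applied = weight_list[-len(chunk):]
        out.append(sum(x * w for x, w in zip(chunk, applied)) / sum(applied))
    return out


def ewma(values: Sequence[float], *, alpha: float) -> List[float]:
    """Exponentially-weighted moving average (same logic as the patch)."""
    out: List[float] = []
    previous = None
    for value in values:
        # The first point seeds the average; later points blend in by alpha.
        previous = value if previous is None else (
            alpha * value + (1 - alpha) * previous)
        out.append(previous)
    return out


# Hypothetical latency samples with one spike at index 3.
latency_ms = [10, 10, 10, 40, 10, 10]

sma_out = sma(latency_ms, 3)
wma_out = wma(latency_ms, [1, 2, 3])
ewma_out = ewma(latency_ms, alpha=0.5)

print(sma_out)   # [10.0, 10.0, 10.0, 20.0, 20.0, 20.0]
print(wma_out)   # [10.0, 10.0, 10.0, 25.0, 20.0, 15.0]
print(ewma_out)  # [10, 10.0, 10.0, 25.0, 17.5, 13.75]
```

With these inputs the simple average spreads the spike evenly across three outputs, while the weighted and exponential variants react harder at first and then decay. Note that `ewma` returns its first element unchanged, so an `int` input stays an `int` at index 0.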