feat(adb_vision): restore live piloting and expand audit logging#55
feat(adb_vision): restore live piloting and expand audit logging#55
Conversation
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
There was a problem hiding this comment.
Pull request overview
Restores and hardens the adb_vision/ MCP “live piloting” surface by adding structured audit logging across MCP/CLI + child events, expanding the screenshot backend ladder (including screenrecord), and updating docs to reflect the canonical logging/screenshot approach.
Changes:
- Added structured JSONL audit logging (
mcp_audit.jsonl) for MCP tool calls, subprocesses, HTTP calls, and backend events. - Expanded screenshot fallback ladder and updated tests to reflect the new backend order and scrcpy capture approach.
- Added local diagnostic/piloting CLIs (
diagnose.py,pilot.py) and updated architecture/roadmap/logging docs accordingly.
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| docs/dev/logging.md | Documents MCP audit JSONL and service surface guardrails. |
| docs/ROADMAP.md | Updates roadmap items for audit logging + screenshot ladder. |
| docs/ARCHITECTURE.md | Updates architecture diagram/notes for new logging + backend ladder. |
| adb_vision/tool_audit.py | Implements structured audit logging + optional FastMCP middleware. |
| adb_vision/test_tool_audit.py | Adds unit tests for audit logging and parent/child correlation. |
| adb_vision/test_server.py | Updates screenshot fallback-chain and scrcpy tests for new behavior. |
| adb_vision/server.py | Integrates audit middleware + logs adb subprocess calls; returns screenshot metadata. |
| adb_vision/screenshot.py | Adds HTTP/subprocess audit events; adds screenrecord backend; updates scrcpy pathing + frame extraction. |
| adb_vision/pyproject.toml | Adds new runtime deps (av/uiautomator2/pymemuc). |
| adb_vision/pilot.py | Adds minimal CLI wrapper for manual piloting with audit logging. |
| adb_vision/diagnose.py | Adds diagnostic CLI with backend probing and desktop screenshot fallback. |
| adb_vision/desktop_capture.ps1 | Implements Windows desktop/window capture fallback used by diagnostics. |
| adb_vision/README.md | Updates documentation for new screenshot ladder including screenrecord. |
| TDD_IMPLEMENTATION_PLAN.md | Updates implementation status/notes reflecting audit + new backends/CLIs. |
Comments suppressed due to low confidence (1)
adb_vision/pyproject.toml:17
server.pyandscreenshot.pynow importtool_audit, but the wheel build config still limitsonly-includetoserver.py,screenshot.py, and__init__.py. Building/installing the wheel would fail at runtime withModuleNotFoundError: tool_audit. Update the wheel include list (or packaging strategy) to shiptool_audit.py(and any other required runtime modules).
[tool.hatch.build.targets.wheel]
only-include = ["server.py", "screenshot.py", "__init__.py"]
| ADB = r"D:\Program Files\Microvirt\MEmu\adb.exe" | ||
| SERIAL = "127.0.0.1:21513" | ||
| GAME_PACKAGE = "com.YoStarEN.AzurLane" | ||
| SCREEN_DIR = Path(__file__).parent / "pilot_screens" | ||
| BLACK_STDDEV_THRESHOLD = 5.0 | ||
| SCREENSHOT_METHODS = ("droidcast", "scrcpy", "u2", "screenrecord", "screencap") |
There was a problem hiding this comment.
diagnose.py hard-codes a Windows-specific ADB path (D:\Program Files\Microvirt\MEmu\adb.exe) and the emulator serial. This makes the diagnostic CLI brittle across machines and CI (and duplicates server._find_adb() logic). Consider accepting --adb / --serial CLI flags (defaulting to the same detection logic as server.py) so diagnostics work in more environments without editing code.
| remote_path = f"/data/local/tmp/adb_vision_capture_{os.getpid()}.mp4" | ||
| with tempfile.TemporaryDirectory(prefix="adb_vision_") as tmpdir: |
There was a problem hiding this comment.
remote_path is derived only from os.getpid(), so concurrent screenrecord captures within the same process would race on the same /data/local/tmp/adb_vision_capture_<pid>.mp4 (one call can delete or overwrite the other's recording). Use a per-invocation unique suffix (e.g., event_id, monotonic counter, or random token) to avoid collisions.
| async def _capture_screenrecord(*, adb_run: AdbRunFn, serial: str, adb_exe: str) -> str: | ||
| ffmpeg_exe = shutil.which("ffmpeg") | ||
| if not ffmpeg_exe: | ||
| raise RuntimeError("ffmpeg not found in PATH; required for screenrecord fallback") | ||
|
|
||
| remote_path = f"/data/local/tmp/adb_vision_capture_{os.getpid()}.mp4" | ||
| with tempfile.TemporaryDirectory(prefix="adb_vision_") as tmpdir: | ||
| video_path = Path(tmpdir) / "capture.mp4" | ||
| frame_path = Path(tmpdir) / "frame.png" | ||
| png_data = b"" | ||
|
|
||
| try: | ||
| await adb_run("shell", "rm", "-f", remote_path, timeout=5.0) | ||
| except Exception: | ||
| pass | ||
|
|
||
| try: | ||
| await adb_run( | ||
| "shell", | ||
| "screenrecord", | ||
| f"--time-limit={_SCREENRECORD_SECONDS}", | ||
| remote_path, | ||
| timeout=10.0 + _SCREENRECORD_SECONDS, | ||
| ) | ||
| await adb_run("pull", remote_path, str(video_path), timeout=20.0) | ||
| await _run_logged_process( | ||
| argv=[ | ||
| ffmpeg_exe, | ||
| "-y", | ||
| "-i", | ||
| str(video_path), | ||
| "-frames:v", | ||
| "1", | ||
| "-update", | ||
| "1", | ||
| str(frame_path), | ||
| ], | ||
| timeout=20.0, | ||
| command_name="ffmpeg.exec", | ||
| ) | ||
| png_data = frame_path.read_bytes() | ||
| finally: | ||
| try: | ||
| await adb_run("shell", "rm", "-f", remote_path, timeout=5.0) | ||
| except Exception: | ||
| pass | ||
|
|
||
| if not _is_png(png_data): | ||
| raise RuntimeError("screenrecord frame extraction did not produce a PNG") | ||
| return base64.b64encode(png_data).decode("ascii") |
There was a problem hiding this comment.
The new screenrecord backend doesn’t have direct unit coverage (current tests only mock it in the fallback-chain test). Add focused tests for _capture_screenrecord (e.g., ffmpeg missing, success path with mocked adb_run/_run_logged_process and mocked Path.read_bytes, and cleanup behavior) so regressions in the fallback ladder are caught.
| png_b64, meta = payload | ||
| ms = int((time.monotonic() - t0) * 1000) | ||
| png_bytes = base64.b64decode(png_b64) | ||
| saved = _save_screenshot_png(png_b64, _action_seq + 1) | ||
| _action_log( | ||
| "adb_screenshot", | ||
| {"serial": ADB_SERIAL, "method": method}, | ||
| f"png_bytes={len(png_bytes)} saved={saved}", | ||
| f"png_bytes={len(png_bytes)} backend={meta.get('backend')} saved={saved}", |
There was a problem hiding this comment.
adb_screenshot base64-decodes the PNG to compute len(png_bytes), then _save_screenshot_png() decodes the same base64 again when writing the file. Since take_screenshot(..., include_metadata=True) already returns png_bytes in meta, consider using that value (and/or decoding once and reusing the bytes) to avoid the extra decode on every screenshot.
Summary
adb_visionsurface withdiagnose.pyandpilot.pyscreenrecord, desktop capture) when rawscreencapis blankTesting
cd adb_vision && uv run pytest test_server.py test_tool_audit.py -qcd adb_vision && uv run python diagnose.py --jsoncd adb_vision && uv run python pilot.py screenshot