Conversation
- Removed stale agent docs (AGENTS.md, GEMINI.md) - Slimmed CLAUDE.md to be the single source of truth focusing on dual-use constraint and standalone element tool path in agent_orchestrator - Created docs/DOC_GUIDE.md for document organization and rules - Created docs/BEHAVIORAL_CATALOG.md to start mapping tasks to semantic flows - Installed and verified maa-mcp package in agent_orchestrator environment Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Integration PR that consolidates multiple feature branches into master, aligning the repo on the Two-Tier deterministic-tools-first architecture, expanding adb_vision audit/screenshot tooling, and locking down the deprecated ALAS generic tool bridge.
Changes:
- Adds/extends
adb_visiondeterministic tool scaffolding (registry + placeholder tools) and audit logging, plus screenshot fallback ladder improvements. - Hardens
agent_orchestrator/alas_mcp_server.pyby adding canonical audit events and disablingalas_call_tool(...)by default behind an allowlist / env override. - Restructures and prunes documentation to establish
CLAUDE.md+docs/NORTH_STAR.mdas the canonical sources, removing superseded plans.
Reviewed changes
Copilot reviewed 71 out of 71 changed files in this pull request and generated 10 comments.
Show a summary per file
| File | Description |
|---|---|
| plans/phase2_scaffolding_review.md | New Phase 2 scaffolding review doc. |
| docs/reviews/PR29_DEVILS_ADVOCATE_2026-03-04.md | Removes outdated review doc. |
| docs/plans/tool_node_design_review.md | Refocuses tool-node design review onto Two-Tier model. |
| docs/plans/program_execution_plan.md | Adds program-wide phase execution index. |
| docs/plans/phase_2_master_plan.md | Adds Phase 2 master plan for deterministic tools. |
| docs/plans/phase_2_gemini_cli_plan.md | Removes superseded Gemini CLI plan. |
| docs/plans/phase_0_login_tool_spec.md | Updates login tool spec to element-powered constraint + strict contract framing. |
| docs/plans/llm_driven_gameplay_architecture.md | Removes superseded LangGraph-centric architecture doc. |
| docs/monorepo/UPSTREAM_ALAS_LOCAL_EDITS_2026-02-16_043433.md | Removes old upstream edits snapshot doc. |
| docs/monorepo/MONOREPO_SYNC_NOTES.md | Removes old monorepo sync notes doc. |
| docs/investigations/login_popup_loop_stuck_investigation.md | Removes investigation writeup doc. |
| docs/dev/watchdog_investigation_2026_02_20.md | Removes watchdog investigation doc. |
| docs/dev/logging.md | Updates logging doc to mention structured MCP audit logging and guardrails. |
| docs/archive/legacy/FASTMCP_3_MIGRATION.md | Removes legacy FastMCP migration report. |
| docs/archive/futurePlans/multi_agent_conversion_plan.md | Removes deprecated multi-agent plan. |
| docs/archive/deprecated/PARSER_IMPROVEMENT_PLAN.md | Removes deprecated parser plan. |
| docs/archive/deprecated/PARSER_ARCHITECTURE_V2.md | Removes deprecated parser architecture doc. |
| docs/archive/deprecated/MIGRATE_POETRY_TO_UV.md | Removes deprecated UV migration plan. |
| docs/UPSTREAM_SYNC.md | Removes outdated upstream sync doc. |
| docs/ROADMAP.md | Updates roadmap items for audit logging, fallback ladder, and bridge lockdown. |
| docs/NORTH_STAR.md | Locks/clarifies architecture intent and governance. |
| docs/MASTER_PLAN.md | Removes superseded master plan doc. |
| docs/IMPLEMENTATION_PLAN_NEXT_PASS.md | Removes outdated implementation pass plan. |
| docs/DOC_GUIDE.md | Adds doc governance guide (docs/). |
| docs/BEHAVIORAL_CATALOG.md | Adds behavioral catalog as requirements extraction bridge. |
| docs/ARCHITECTURE.md | Updates architecture diagram/notes for fallback ladder + audit logging + surface guardrails. |
| agent_orchestrator/test_mcp_audit.py | Adds tests for canonical audit logging module. |
| agent_orchestrator/test_integration_mcp.py | Updates integration test to assert ADB CLI dispatch rather than ALAS click-method mocking. |
| agent_orchestrator/test_alas_mcp.py | Expands tests around allowlisting/unsafe override for alas_call_tool. |
| agent_orchestrator/pyproject.toml | Adds maa-mcp dependency. |
| agent_orchestrator/maa_test.py | Adds MaaMCP exploratory script (tool listing). |
| agent_orchestrator/maa_test_connect.py | Adds MaaMCP exploratory connection script. |
| agent_orchestrator/maa_test_connect2.py | Adds MaaMCP exploratory connection + parsing script. |
| agent_orchestrator/alas_mcp_server.py | Adds canonical audit logging + tool bridge allowlist/unsafe override; refactors subprocess audit capture. |
| adb_vision/tools/registry.py | Adds deterministic tool registry + decorator metadata. |
| adb_vision/tools/goto_main_menu.py | Adds placeholder deterministic tool scaffold. |
| adb_vision/tools/check_resources.py | Adds placeholder deterministic tool scaffold for OCR resource read. |
| adb_vision/tools/collect_commissions.py | Adds placeholder deterministic tool scaffold for commission loop. |
| adb_vision/tools/init.py | Adds auto-discovery import behavior for deterministic tools. |
| adb_vision/tool_audit.py | Adds structured audit logging + MCP middleware for tool calls and child events. |
| adb_vision/test_tool_registry.py | Adds unit tests for registry + strict contract enforcement. |
| adb_vision/test_tool_audit.py | Adds unit tests for adb_vision audit logger behavior. |
| adb_vision/test_server.py | Updates tests for screenshot fallback chain + scrcpy path detection + metadata returns. |
| adb_vision/test_master_loop.py | Adds test coverage for registry-first loop execution path. |
| adb_vision/test_live_phase2_tools.py | Adds live harness scaffold for Phase 2 tool execution. |
| adb_vision/server.py | Adds audit middleware + subprocess audit records; adds screenrecord support and screenshot metadata logging. |
| adb_vision/pyproject.toml | Adds dependencies for scrcpy frame extraction (av) and live tooling (uiautomator2, pymemuc). |
| adb_vision/pilot.py | Adds CLI pilot tool that emits audit records. |
| adb_vision/loop/master_loop.py | Adds registry-first deterministic execution resolution. |
| adb_vision/diagnose.py | Adds diagnostic CLI for screenshot backends + desktop fallback capture. |
| adb_vision/desktop_capture.ps1 | Adds PowerShell desktop screenshot fallback implementation. |
| adb_vision/README.md | Updates README to document screenrecord in screenshot backend ladder. |
| TDD_IMPLEMENTATION_PLAN.md | Updates TDD plan status notes to reflect audit logging + expanded screenshot ladder. |
| PROGRESS_ASSESSMENT.md | Removes outdated progress assessment doc. |
| GEMINI.md | Removes redundant agent instruction doc. |
| FORK_CHANGES.md | Removes outdated fork-changes doc. |
| DOC_GUIDE.md | Adds a second doc guide at repo root (duplicates docs guide). |
| CLAUDE.md | Consolidates agent instructions and Two-Tier model rules. |
| AGENTS.md | Removes redundant agent instruction doc. |
| .omc/state/idle-notif-cooldown.json | Adds local runtime artifact (should not be tracked). |
| .omc/state/hud-stdin-cache.json | Adds local runtime artifact (should not be tracked). |
| .omc/state/hud-state.json | Adds local runtime artifact (should not be tracked). |
| .omc/sessions/9f68cafc-1a6c-47cc-8c96-0394c2a40828.json | Adds local runtime artifact (should not be tracked). |
| .omc/project-memory.json | Adds local runtime artifact (should not be tracked). |
| .kilo-prompt/review-mission.txt | Removes old agent prompt artifact. |
adb_vision/diagnose.py
Outdated
| ADB = r"D:\Program Files\Microvirt\MEmu\adb.exe" | ||
| SERIAL = "127.0.0.1:21513" | ||
| GAME_PACKAGE = "com.YoStarEN.AzurLane" | ||
| SCREEN_DIR = Path(__file__).parent / "pilot_screens" | ||
| BLACK_STDDEV_THRESHOLD = 5.0 | ||
| SCREENSHOT_METHODS = ("droidcast", "scrcpy", "u2", "screenrecord", "screencap") |
There was a problem hiding this comment.
diagnose.py hardcodes a local Windows-only adb path and a fixed serial. This makes the diagnostic tool fail in other environments and diverges from the server’s _find_adb() / --serial configuration. Consider taking --adb and --serial CLI args (defaulting from env like ADB_SERIAL and auto-discovery) and reusing the same adb resolution logic as adb_vision/server.py.
| def _save_screenshot_png(data_b64: str) -> str: | ||
| """Save a base64 PNG to mcp_screenshots/<seq>_<ts>.png; return the path.""" | ||
| try: | ||
| _SCREENSHOT_DIR.mkdir(parents=True, exist_ok=True) | ||
| ts = datetime.now().strftime("%Y%m%dT%H%M%S") | ||
| fname = _SCREENSHOT_DIR / f"{seq:05d}_{ts}.png" | ||
| stamp = time.time_ns() | ||
| fname = _SCREENSHOT_DIR / f"{stamp}_{os.getpid()}.png" | ||
| fname.write_bytes(base64.b64decode(data_b64)) |
There was a problem hiding this comment.
The _save_screenshot_png docstring says it writes mcp_screenshots/<seq>_<ts>.png, but the implementation now uses time.time_ns() + PID. Update the docstring (or filename format) so log/audit tooling isn’t misled when correlating screenshots to tool calls.
| record_event( | ||
| tool_name="adb_screenshot.capture", | ||
| arguments={"serial": ADB_SERIAL}, | ||
| status="error", | ||
| result_summary="capture timeout", | ||
| duration_ms=25000, | ||
| error="screencap timed out after 25s", | ||
| event_type="capture", | ||
| ) |
There was a problem hiding this comment.
On adb_screenshot timeout, the audit record/error text says “screencap timed out”, but this tool path uses ctx.encode_screenshot_png_base64() (uiautomator2/ALAS screenshot), not adb shell screencap. This makes troubleshooting misleading; update the message/tool_name/event_type to reflect the actual capture method being timed out.
DOC_GUIDE.md
Outdated
| # Documentation Rules & Guide | ||
|
|
||
| ## Source of Truth Hierarchy | ||
|
|
||
| 1. `CLAUDE.md` | ||
| - The primary file for agent context, rules, and fundamental architectural axioms. It supersedes older files (`AGENTS.md`, `GEMINI.md`). | ||
| 2. `docs/NORTH_STAR.md` | ||
| - The non-negotiable end-state definition for the project. Surgical edits only as we approach the final destination. | ||
| 3. `docs/plans/program_execution_plan.md` | ||
| - The master index for the development phases. Outlines what phase we are in, and what remains. | ||
| 4. `docs/plans/phase_X_*.md` | ||
| - Detailed execution plans for a specific project phase (e.g., Phase 2). Used as the day-to-day checklist. | ||
|
|
||
| ## Naming & Organization Rules | ||
|
|
||
| ### Tool Definitions | ||
| - Tools are placed in `adb_vision/tools/`. | ||
| - Must export a `name`, `description`, `preconditions`, `expected_state`, `timeout_s`, and `retry_budget`. | ||
| - Adhere strictly to the `alas.*` or descriptive namespace for MCP exposition. | ||
|
|
||
| ### Deprecated Docs | ||
| - All retired documentation must be completely removed from the repository or moved cleanly to `docs/archive/`. Do not leave them in `docs/plans/` as they pollute LLM context. | ||
| - We do NOT use `llm_driven_gameplay` or pure `LangGraph` planning documents anymore. | ||
|
|
||
| ### Documentation Style | ||
| - **Status Badges**: Use `Status: Active / Draft / Archived` at the top of planning documents. | ||
| - **Clarity**: Write clearly, declaring the specific tools and paradigms to be used (e.g., "MaaFramework", "uiautomator2"). Avoid broad vague language. | ||
| - **Diagrams**: Use Mermaid `.md` diagrams to visualize state machines or logic when it becomes sufficiently complex. |
There was a problem hiding this comment.
This DOC_GUIDE.md duplicates docs/DOC_GUIDE.md and also states tool-export requirements (e.g., preconditions) that don’t match the current deterministic tool framework implementation. To avoid conflicting documentation sources, consolidate into a single guide under docs/ (or remove this root-level file) and ensure the rules reflect the actual tool registry/contract code.
| # Documentation Rules & Guide | |
| ## Source of Truth Hierarchy | |
| 1. `CLAUDE.md` | |
| - The primary file for agent context, rules, and fundamental architectural axioms. It supersedes older files (`AGENTS.md`, `GEMINI.md`). | |
| 2. `docs/NORTH_STAR.md` | |
| - The non-negotiable end-state definition for the project. Surgical edits only as we approach the final destination. | |
| 3. `docs/plans/program_execution_plan.md` | |
| - The master index for the development phases. Outlines what phase we are in, and what remains. | |
| 4. `docs/plans/phase_X_*.md` | |
| - Detailed execution plans for a specific project phase (e.g., Phase 2). Used as the day-to-day checklist. | |
| ## Naming & Organization Rules | |
| ### Tool Definitions | |
| - Tools are placed in `adb_vision/tools/`. | |
| - Must export a `name`, `description`, `preconditions`, `expected_state`, `timeout_s`, and `retry_budget`. | |
| - Adhere strictly to the `alas.*` or descriptive namespace for MCP exposition. | |
| ### Deprecated Docs | |
| - All retired documentation must be completely removed from the repository or moved cleanly to `docs/archive/`. Do not leave them in `docs/plans/` as they pollute LLM context. | |
| - We do NOT use `llm_driven_gameplay` or pure `LangGraph` planning documents anymore. | |
| ### Documentation Style | |
| - **Status Badges**: Use `Status: Active / Draft / Archived` at the top of planning documents. | |
| - **Clarity**: Write clearly, declaring the specific tools and paradigms to be used (e.g., "MaaFramework", "uiautomator2"). Avoid broad vague language. | |
| - **Diagrams**: Use Mermaid `.md` diagrams to visualize state machines or logic when it becomes sufficiently complex. | |
| # Documentation Guide (Root Stub) | |
| This root-level `DOC_GUIDE.md` is **deprecated** and kept only as a pointer for older links. | |
| The **single source of truth** for documentation rules and conventions now lives in: | |
| - `docs/DOC_GUIDE.md` — canonical documentation rules & structure | |
| - `CLAUDE.md` — primary agent context, rules, and architectural axioms | |
| - `docs/NORTH_STAR.md` — non-negotiable end-state definition for the project | |
| For MCP tool contracts, tool registry behavior, and deterministic tool framework details, always refer to the docs under `docs/` (for example `docs/agent_tooling/README.md` and `docs/DOC_GUIDE.md`) and the actual implementation code, **not** this file. | |
| No new guidance should be added here; update `docs/DOC_GUIDE.md` instead. |
.omc/state/idle-notif-cooldown.json
Outdated
| { | ||
| "lastSentAt": "2026-03-05T06:31:23.445Z" | ||
| } No newline at end of file |
There was a problem hiding this comment.
This looks like a local runtime/state artifact (timestamped “lastSentAt”) rather than source code/config. These machine-specific .omc/ files should be removed from the repo and added to .gitignore to avoid leaking local session metadata and causing noisy diffs.
| { | |
| "lastSentAt": "2026-03-05T06:31:23.445Z" | |
| } | |
| {} |
.omc/state/hud-state.json
Outdated
| { | ||
| "timestamp": "2026-03-05T06:32:16.608Z", | ||
| "backgroundTasks": [], | ||
| "sessionStartTimestamp": "2026-03-05T05:38:40.025Z", | ||
| "sessionId": "2a799d2f-5602-4806-8bbd-8f6947137697" | ||
| } No newline at end of file |
There was a problem hiding this comment.
This .omc/ HUD state JSON is a local runtime artifact and will create ongoing churn in git. Remove it from the repo and ignore .omc/ (and similar runtime state directories) via .gitignore.
| { | |
| "timestamp": "2026-03-05T06:32:16.608Z", | |
| "backgroundTasks": [], | |
| "sessionStartTimestamp": "2026-03-05T05:38:40.025Z", | |
| "sessionId": "2a799d2f-5602-4806-8bbd-8f6947137697" | |
| } | |
| {} |
| { | ||
| "session_id": "9f68cafc-1a6c-47cc-8c96-0394c2a40828", | ||
| "ended_at": "2026-03-05T07:15:43.925Z", | ||
| "reason": "other", | ||
| "agents_spawned": 0, | ||
| "agents_completed": 0, | ||
| "modes_used": [] | ||
| } No newline at end of file |
There was a problem hiding this comment.
Session metadata under .omc/sessions/ appears to be local runtime output (session IDs, timestamps). This shouldn’t live in version control; remove it and ignore .omc/ to prevent accidental leakage and noisy diffs.
| { | |
| "session_id": "9f68cafc-1a6c-47cc-8c96-0394c2a40828", | |
| "ended_at": "2026-03-05T07:15:43.925Z", | |
| "reason": "other", | |
| "agents_spawned": 0, | |
| "agents_completed": 0, | |
| "modes_used": [] | |
| } | |
| {} |
adb_vision/loop/master_loop.py
Outdated
| def _resolve_tool(self, command: str) -> Optional[Callable[[], dict[str, Any]]]: | ||
| """Resolve command to a registered deterministic tool if available.""" | ||
| if not self.use_tool_registry: | ||
| return None | ||
|
|
||
| # Keep strict command names first-class; future command formats should be | ||
| # normalized before registration if needed. | ||
| candidate = command.strip() | ||
|
|
||
| for module_name in ("adb_vision.tools.registry", "tools.registry"): | ||
| module = sys.modules.get(module_name) | ||
| if module is None: | ||
| continue | ||
| get_tool = getattr(module, "get_tool", None) | ||
| if callable(get_tool): | ||
| tool = get_tool(candidate) | ||
| if tool is not None: | ||
| return tool | ||
|
|
||
| for module_name in ("adb_vision.tools.registry", "tools.registry"): | ||
| try: | ||
| module = __import__(module_name, fromlist=["get_tool"]) | ||
| except Exception: | ||
| continue | ||
| get_tool = getattr(module, "get_tool", None) |
There was a problem hiding this comment.
The registry lookup can easily miss tools (or create a second empty registry) because it imports/looks up both adb_vision.tools.registry and tools.registry. If adb_vision.tools (package) was never imported, auto-discovery in adb_vision/tools/__init__.py won’t run, so no tools are registered. Prefer a single canonical import path (e.g., import adb_vision.tools once to trigger discovery, then call adb_vision.tools.registry.get_tool) and drop the fallback tools.registry alias to avoid split registries.
.omc/state/hud-stdin-cache.json
Outdated
| @@ -0,0 +1 @@ | |||
| {"session_id":"2a799d2f-5602-4806-8bbd-8f6947137697","transcript_path":"C:\\Users\\pmacl\\.claude\\projects\\D---projects-ALAS\\2a799d2f-5602-4806-8bbd-8f6947137697.jsonl","cwd":"D:\\_projects\\ALAS","model":{"id":"claude-sonnet-4-6","display_name":"Sonnet 4.6"},"workspace":{"current_dir":"D:\\_projects\\ALAS","project_dir":"D:\\_projects\\ALAS","added_dirs":[]},"version":"2.1.69","output_style":{"name":"default"},"cost":{"total_cost_usd":1.4690392499999998,"total_duration_ms":3217190,"total_api_duration_ms":302861,"total_lines_added":0,"total_lines_removed":0},"context_window":{"total_input_tokens":65,"total_output_tokens":11703,"context_window_size":200000,"current_usage":{"input_tokens":3,"output_tokens":377,"cache_creation_input_tokens":672,"cache_read_input_tokens":64960},"used_percentage":33,"remaining_percentage":67},"exceeds_200k_tokens":false} No newline at end of file | |||
There was a problem hiding this comment.
This file contains machine-specific local paths/session metadata and appears to be a runtime artifact under .omc/. It shouldn’t be committed; remove it and add .omc/ to .gitignore (or otherwise keep these HUD/session caches out of version control).
| {"session_id":"2a799d2f-5602-4806-8bbd-8f6947137697","transcript_path":"C:\\Users\\pmacl\\.claude\\projects\\D---projects-ALAS\\2a799d2f-5602-4806-8bbd-8f6947137697.jsonl","cwd":"D:\\_projects\\ALAS","model":{"id":"claude-sonnet-4-6","display_name":"Sonnet 4.6"},"workspace":{"current_dir":"D:\\_projects\\ALAS","project_dir":"D:\\_projects\\ALAS","added_dirs":[]},"version":"2.1.69","output_style":{"name":"default"},"cost":{"total_cost_usd":1.4690392499999998,"total_duration_ms":3217190,"total_api_duration_ms":302861,"total_lines_added":0,"total_lines_removed":0},"context_window":{"total_input_tokens":65,"total_output_tokens":11703,"context_window_size":200000,"current_usage":{"input_tokens":3,"output_tokens":377,"cache_creation_input_tokens":672,"cache_read_input_tokens":64960},"used_percentage":33,"remaining_percentage":67},"exceeds_200k_tokens":false} | |
| {} |
.omc/project-memory.json
Outdated
| }, | ||
| "conventions": { | ||
| "namingStyle": null, | ||
| "importStyle": null, |
There was a problem hiding this comment.
.omc/project-memory.json appears to be generated/local state (includes machine paths and scan timestamps). It should not be committed; remove it and add .omc/ to .gitignore to avoid leaking environment details and future diff noise.
| "importStyle": null, | |
| "importStyle": "unspecified", |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3db94c2f21
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
adb_vision/tools/check_resources.py
Outdated
| @@ -0,0 +1,29 @@ | |||
| """Tool 2: Hot-Read OCR persistence of resource coordinates.""" | |||
|
|
|||
| from adb_vision.tools.registry import deterministic_tool | |||
There was a problem hiding this comment.
Use package-relative imports in tool modules
These tool modules import adb_vision... directly, which breaks the tools.registry loading path that MasterLoop._resolve_tool explicitly tries for local execution contexts. When the process runs with adb_vision/ as the working module root, import tools.registry triggers auto-discovery and then fails with ModuleNotFoundError: No module named 'adb_vision', so deterministic tools are never registered and registry-first execution silently degrades to direct executor fallback.
Useful? React with 👍 / 👎.
adb_vision/diagnose.py
Outdated
| from screenshot import take_screenshot | ||
| from tool_audit import audit_cli_call, configure, record_command | ||
|
|
||
| ADB = r"D:\Program Files\Microvirt\MEmu\adb.exe" |
There was a problem hiding this comment.
Detect adb path instead of hardcoding one Windows install
The diagnose CLI hardcodes ADB to D:\Program Files\Microvirt\MEmu\adb.exe, so environments where adb is on PATH (or installed under another location/OS) will fail every subprocess invocation with file-not-found and produce misleading ADB_DISCONNECTED/backend-failure diagnoses. This makes the new diagnostic workflow unreliable outside one specific machine layout.
Useful? React with 👍 / 👎.
Merges all 5 open feature branches into master with conflicts resolved. Branches: feature/adb-vision-clean (#43), phase2-deterministic-tools (#53), feature/azurlane-agent-setup (#54), feat/adb-vision-audit-pilot (#55), feat/orchestrator-bridge-lockdown (#58). Also restores upstream_alas submodule and tracks pending adb_vision pilot screens and state machine design. CLAUDE.md conflict resolved by consolidating both instruction sets.