Skip to content

chore: consolidate all open PRs into master#59

Open
Coldaine wants to merge 15 commits intomasterfrom
consolidated-merge-review
Open

chore: consolidate all open PRs into master#59
Coldaine wants to merge 15 commits intomasterfrom
consolidated-merge-review

Conversation

@Coldaine
Copy link
Owner

Merges all 5 open feature branches into master with conflicts resolved. Branches: feature/adb-vision-clean (#43), phase2-deterministic-tools (#53), feature/azurlane-agent-setup (#54), feat/adb-vision-audit-pilot (#55), feat/orchestrator-bridge-lockdown (#58). Also restores upstream_alas submodule and tracks pending adb_vision pilot screens and state machine design. CLAUDE.md conflict resolved by consolidating both instruction sets.

AI Agent and others added 13 commits March 4, 2026 16:50
- Removed stale agent docs (AGENTS.md, GEMINI.md)
- Slimmed CLAUDE.md to be the single source of truth focusing on dual-use constraint and standalone element tool path in agent_orchestrator
- Created docs/DOC_GUIDE.md for document organization and rules
- Created docs/BEHAVIORAL_CATALOG.md to start mapping tasks to semantic flows
- Installed and verified maa-mcp package in agent_orchestrator environment

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 11, 2026 02:46
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Integration PR that consolidates multiple feature branches into master, aligning the repo on the Two-Tier deterministic-tools-first architecture, expanding adb_vision audit/screenshot tooling, and locking down the deprecated ALAS generic tool bridge.

Changes:

  • Adds/extends adb_vision deterministic tool scaffolding (registry + placeholder tools) and audit logging, plus screenshot fallback ladder improvements.
  • Hardens agent_orchestrator/alas_mcp_server.py by adding canonical audit events and disabling alas_call_tool(...) by default behind an allowlist / env override.
  • Restructures and prunes documentation to establish CLAUDE.md + docs/NORTH_STAR.md as the canonical sources, removing superseded plans.

Reviewed changes

Copilot reviewed 71 out of 71 changed files in this pull request and generated 10 comments.

Show a summary per file
File Description
plans/phase2_scaffolding_review.md New Phase 2 scaffolding review doc.
docs/reviews/PR29_DEVILS_ADVOCATE_2026-03-04.md Removes outdated review doc.
docs/plans/tool_node_design_review.md Refocuses tool-node design review onto Two-Tier model.
docs/plans/program_execution_plan.md Adds program-wide phase execution index.
docs/plans/phase_2_master_plan.md Adds Phase 2 master plan for deterministic tools.
docs/plans/phase_2_gemini_cli_plan.md Removes superseded Gemini CLI plan.
docs/plans/phase_0_login_tool_spec.md Updates login tool spec to element-powered constraint + strict contract framing.
docs/plans/llm_driven_gameplay_architecture.md Removes superseded LangGraph-centric architecture doc.
docs/monorepo/UPSTREAM_ALAS_LOCAL_EDITS_2026-02-16_043433.md Removes old upstream edits snapshot doc.
docs/monorepo/MONOREPO_SYNC_NOTES.md Removes old monorepo sync notes doc.
docs/investigations/login_popup_loop_stuck_investigation.md Removes investigation writeup doc.
docs/dev/watchdog_investigation_2026_02_20.md Removes watchdog investigation doc.
docs/dev/logging.md Updates logging doc to mention structured MCP audit logging and guardrails.
docs/archive/legacy/FASTMCP_3_MIGRATION.md Removes legacy FastMCP migration report.
docs/archive/futurePlans/multi_agent_conversion_plan.md Removes deprecated multi-agent plan.
docs/archive/deprecated/PARSER_IMPROVEMENT_PLAN.md Removes deprecated parser plan.
docs/archive/deprecated/PARSER_ARCHITECTURE_V2.md Removes deprecated parser architecture doc.
docs/archive/deprecated/MIGRATE_POETRY_TO_UV.md Removes deprecated UV migration plan.
docs/UPSTREAM_SYNC.md Removes outdated upstream sync doc.
docs/ROADMAP.md Updates roadmap items for audit logging, fallback ladder, and bridge lockdown.
docs/NORTH_STAR.md Locks/clarifies architecture intent and governance.
docs/MASTER_PLAN.md Removes superseded master plan doc.
docs/IMPLEMENTATION_PLAN_NEXT_PASS.md Removes outdated implementation pass plan.
docs/DOC_GUIDE.md Adds doc governance guide (docs/).
docs/BEHAVIORAL_CATALOG.md Adds behavioral catalog as requirements extraction bridge.
docs/ARCHITECTURE.md Updates architecture diagram/notes for fallback ladder + audit logging + surface guardrails.
agent_orchestrator/test_mcp_audit.py Adds tests for canonical audit logging module.
agent_orchestrator/test_integration_mcp.py Updates integration test to assert ADB CLI dispatch rather than ALAS click-method mocking.
agent_orchestrator/test_alas_mcp.py Expands tests around allowlisting/unsafe override for alas_call_tool.
agent_orchestrator/pyproject.toml Adds maa-mcp dependency.
agent_orchestrator/maa_test.py Adds MaaMCP exploratory script (tool listing).
agent_orchestrator/maa_test_connect.py Adds MaaMCP exploratory connection script.
agent_orchestrator/maa_test_connect2.py Adds MaaMCP exploratory connection + parsing script.
agent_orchestrator/alas_mcp_server.py Adds canonical audit logging + tool bridge allowlist/unsafe override; refactors subprocess audit capture.
adb_vision/tools/registry.py Adds deterministic tool registry + decorator metadata.
adb_vision/tools/goto_main_menu.py Adds placeholder deterministic tool scaffold.
adb_vision/tools/check_resources.py Adds placeholder deterministic tool scaffold for OCR resource read.
adb_vision/tools/collect_commissions.py Adds placeholder deterministic tool scaffold for commission loop.
adb_vision/tools/init.py Adds auto-discovery import behavior for deterministic tools.
adb_vision/tool_audit.py Adds structured audit logging + MCP middleware for tool calls and child events.
adb_vision/test_tool_registry.py Adds unit tests for registry + strict contract enforcement.
adb_vision/test_tool_audit.py Adds unit tests for adb_vision audit logger behavior.
adb_vision/test_server.py Updates tests for screenshot fallback chain + scrcpy path detection + metadata returns.
adb_vision/test_master_loop.py Adds test coverage for registry-first loop execution path.
adb_vision/test_live_phase2_tools.py Adds live harness scaffold for Phase 2 tool execution.
adb_vision/server.py Adds audit middleware + subprocess audit records; adds screenrecord support and screenshot metadata logging.
adb_vision/pyproject.toml Adds dependencies for scrcpy frame extraction (av) and live tooling (uiautomator2, pymemuc).
adb_vision/pilot.py Adds CLI pilot tool that emits audit records.
adb_vision/loop/master_loop.py Adds registry-first deterministic execution resolution.
adb_vision/diagnose.py Adds diagnostic CLI for screenshot backends + desktop fallback capture.
adb_vision/desktop_capture.ps1 Adds PowerShell desktop screenshot fallback implementation.
adb_vision/README.md Updates README to document screenrecord in screenshot backend ladder.
TDD_IMPLEMENTATION_PLAN.md Updates TDD plan status notes to reflect audit logging + expanded screenshot ladder.
PROGRESS_ASSESSMENT.md Removes outdated progress assessment doc.
GEMINI.md Removes redundant agent instruction doc.
FORK_CHANGES.md Removes outdated fork-changes doc.
DOC_GUIDE.md Adds a second doc guide at repo root (duplicates docs guide).
CLAUDE.md Consolidates agent instructions and Two-Tier model rules.
AGENTS.md Removes redundant agent instruction doc.
.omc/state/idle-notif-cooldown.json Adds local runtime artifact (should not be tracked).
.omc/state/hud-stdin-cache.json Adds local runtime artifact (should not be tracked).
.omc/state/hud-state.json Adds local runtime artifact (should not be tracked).
.omc/sessions/9f68cafc-1a6c-47cc-8c96-0394c2a40828.json Adds local runtime artifact (should not be tracked).
.omc/project-memory.json Adds local runtime artifact (should not be tracked).
.kilo-prompt/review-mission.txt Removes old agent prompt artifact.

Comment on lines +30 to +35
ADB = r"D:\Program Files\Microvirt\MEmu\adb.exe"
SERIAL = "127.0.0.1:21513"
GAME_PACKAGE = "com.YoStarEN.AzurLane"
SCREEN_DIR = Path(__file__).parent / "pilot_screens"
BLACK_STDDEV_THRESHOLD = 5.0
SCREENSHOT_METHODS = ("droidcast", "scrcpy", "u2", "screenrecord", "screencap")
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

diagnose.py hardcodes a local Windows-only adb path and a fixed serial. This makes the diagnostic tool fail in other environments and diverges from the server’s _find_adb() / --serial configuration. Consider taking --adb and --serial CLI args (defaulting from env like ADB_SERIAL and auto-discovery) and reusing the same adb resolution logic as adb_vision/server.py.

Copilot uses AI. Check for mistakes.
Comment on lines +17 to 23
def _save_screenshot_png(data_b64: str) -> str:
"""Save a base64 PNG to mcp_screenshots/<seq>_<ts>.png; return the path."""
try:
_SCREENSHOT_DIR.mkdir(parents=True, exist_ok=True)
ts = datetime.now().strftime("%Y%m%dT%H%M%S")
fname = _SCREENSHOT_DIR / f"{seq:05d}_{ts}.png"
stamp = time.time_ns()
fname = _SCREENSHOT_DIR / f"{stamp}_{os.getpid()}.png"
fname.write_bytes(base64.b64decode(data_b64))
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The _save_screenshot_png docstring says it writes mcp_screenshots/<seq>_<ts>.png, but the implementation now uses time.time_ns() + PID. Update the docstring (or filename format) so log/audit tooling isn’t misled when correlating screenshots to tool calls.

Copilot uses AI. Check for mistakes.
Comment on lines +251 to +259
record_event(
tool_name="adb_screenshot.capture",
arguments={"serial": ADB_SERIAL},
status="error",
result_summary="capture timeout",
duration_ms=25000,
error="screencap timed out after 25s",
event_type="capture",
)
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On adb_screenshot timeout, the audit record/error text says “screencap timed out”, but this tool path uses ctx.encode_screenshot_png_base64() (uiautomator2/ALAS screenshot), not adb shell screencap. This makes troubleshooting misleading; update the message/tool_name/event_type to reflect the actual capture method being timed out.

Copilot uses AI. Check for mistakes.
DOC_GUIDE.md Outdated
Comment on lines +1 to +28
# Documentation Rules & Guide

## Source of Truth Hierarchy

1. `CLAUDE.md`
- The primary file for agent context, rules, and fundamental architectural axioms. It supersedes older files (`AGENTS.md`, `GEMINI.md`).
2. `docs/NORTH_STAR.md`
- The non-negotiable end-state definition for the project. Surgical edits only as we approach the final destination.
3. `docs/plans/program_execution_plan.md`
- The master index for the development phases. Outlines what phase we are in, and what remains.
4. `docs/plans/phase_X_*.md`
- Detailed execution plans for a specific project phase (e.g., Phase 2). Used as the day-to-day checklist.

## Naming & Organization Rules

### Tool Definitions
- Tools are placed in `adb_vision/tools/`.
- Must export a `name`, `description`, `preconditions`, `expected_state`, `timeout_s`, and `retry_budget`.
- Adhere strictly to the `alas.*` or descriptive namespace for MCP exposition.

### Deprecated Docs
- All retired documentation must be completely removed from the repository or moved cleanly to `docs/archive/`. Do not leave them in `docs/plans/` as they pollute LLM context.
- We do NOT use `llm_driven_gameplay` or pure `LangGraph` planning documents anymore.

### Documentation Style
- **Status Badges**: Use `Status: Active / Draft / Archived` at the top of planning documents.
- **Clarity**: Write clearly, declaring the specific tools and paradigms to be used (e.g., "MaaFramework", "uiautomator2"). Avoid broad vague language.
- **Diagrams**: Use Mermaid `.md` diagrams to visualize state machines or logic when it becomes sufficiently complex.
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This DOC_GUIDE.md duplicates docs/DOC_GUIDE.md and also states tool-export requirements (e.g., preconditions) that don’t match the current deterministic tool framework implementation. To avoid conflicting documentation sources, consolidate into a single guide under docs/ (or remove this root-level file) and ensure the rules reflect the actual tool registry/contract code.

Suggested change
# Documentation Rules & Guide
## Source of Truth Hierarchy
1. `CLAUDE.md`
- The primary file for agent context, rules, and fundamental architectural axioms. It supersedes older files (`AGENTS.md`, `GEMINI.md`).
2. `docs/NORTH_STAR.md`
- The non-negotiable end-state definition for the project. Surgical edits only as we approach the final destination.
3. `docs/plans/program_execution_plan.md`
- The master index for the development phases. Outlines what phase we are in, and what remains.
4. `docs/plans/phase_X_*.md`
- Detailed execution plans for a specific project phase (e.g., Phase 2). Used as the day-to-day checklist.
## Naming & Organization Rules
### Tool Definitions
- Tools are placed in `adb_vision/tools/`.
- Must export a `name`, `description`, `preconditions`, `expected_state`, `timeout_s`, and `retry_budget`.
- Adhere strictly to the `alas.*` or descriptive namespace for MCP exposition.
### Deprecated Docs
- All retired documentation must be completely removed from the repository or moved cleanly to `docs/archive/`. Do not leave them in `docs/plans/` as they pollute LLM context.
- We do NOT use `llm_driven_gameplay` or pure `LangGraph` planning documents anymore.
### Documentation Style
- **Status Badges**: Use `Status: Active / Draft / Archived` at the top of planning documents.
- **Clarity**: Write clearly, declaring the specific tools and paradigms to be used (e.g., "MaaFramework", "uiautomator2"). Avoid broad vague language.
- **Diagrams**: Use Mermaid `.md` diagrams to visualize state machines or logic when it becomes sufficiently complex.
# Documentation Guide (Root Stub)
This root-level `DOC_GUIDE.md` is **deprecated** and kept only as a pointer for older links.
The **single source of truth** for documentation rules and conventions now lives in:
- `docs/DOC_GUIDE.md` — canonical documentation rules & structure
- `CLAUDE.md` — primary agent context, rules, and architectural axioms
- `docs/NORTH_STAR.md` — non-negotiable end-state definition for the project
For MCP tool contracts, tool registry behavior, and deterministic tool framework details, always refer to the docs under `docs/` (for example `docs/agent_tooling/README.md` and `docs/DOC_GUIDE.md`) and the actual implementation code, **not** this file.
No new guidance should be added here; update `docs/DOC_GUIDE.md` instead.

Copilot uses AI. Check for mistakes.
Comment on lines +1 to +3
{
"lastSentAt": "2026-03-05T06:31:23.445Z"
} No newline at end of file
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a local runtime/state artifact (timestamped “lastSentAt”) rather than source code/config. These machine-specific .omc/ files should be removed from the repo and added to .gitignore to avoid leaking local session metadata and causing noisy diffs.

Suggested change
{
"lastSentAt": "2026-03-05T06:31:23.445Z"
}
{}

Copilot uses AI. Check for mistakes.
Comment on lines +1 to +6
{
"timestamp": "2026-03-05T06:32:16.608Z",
"backgroundTasks": [],
"sessionStartTimestamp": "2026-03-05T05:38:40.025Z",
"sessionId": "2a799d2f-5602-4806-8bbd-8f6947137697"
} No newline at end of file
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This .omc/ HUD state JSON is a local runtime artifact and will create ongoing churn in git. Remove it from the repo and ignore .omc/ (and similar runtime state directories) via .gitignore.

Suggested change
{
"timestamp": "2026-03-05T06:32:16.608Z",
"backgroundTasks": [],
"sessionStartTimestamp": "2026-03-05T05:38:40.025Z",
"sessionId": "2a799d2f-5602-4806-8bbd-8f6947137697"
}
{}

Copilot uses AI. Check for mistakes.
Comment on lines +1 to +8
{
"session_id": "9f68cafc-1a6c-47cc-8c96-0394c2a40828",
"ended_at": "2026-03-05T07:15:43.925Z",
"reason": "other",
"agents_spawned": 0,
"agents_completed": 0,
"modes_used": []
} No newline at end of file
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Session metadata under .omc/sessions/ appears to be local runtime output (session IDs, timestamps). This shouldn’t live in version control; remove it and ignore .omc/ to prevent accidental leakage and noisy diffs.

Suggested change
{
"session_id": "9f68cafc-1a6c-47cc-8c96-0394c2a40828",
"ended_at": "2026-03-05T07:15:43.925Z",
"reason": "other",
"agents_spawned": 0,
"agents_completed": 0,
"modes_used": []
}
{}

Copilot uses AI. Check for mistakes.
Comment on lines +146 to +170
def _resolve_tool(self, command: str) -> Optional[Callable[[], dict[str, Any]]]:
"""Resolve command to a registered deterministic tool if available."""
if not self.use_tool_registry:
return None

# Keep strict command names first-class; future command formats should be
# normalized before registration if needed.
candidate = command.strip()

for module_name in ("adb_vision.tools.registry", "tools.registry"):
module = sys.modules.get(module_name)
if module is None:
continue
get_tool = getattr(module, "get_tool", None)
if callable(get_tool):
tool = get_tool(candidate)
if tool is not None:
return tool

for module_name in ("adb_vision.tools.registry", "tools.registry"):
try:
module = __import__(module_name, fromlist=["get_tool"])
except Exception:
continue
get_tool = getattr(module, "get_tool", None)
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The registry lookup can easily miss tools (or create a second empty registry) because it imports/looks up both adb_vision.tools.registry and tools.registry. If adb_vision.tools (package) was never imported, auto-discovery in adb_vision/tools/__init__.py won’t run, so no tools are registered. Prefer a single canonical import path (e.g., import adb_vision.tools once to trigger discovery, then call adb_vision.tools.registry.get_tool) and drop the fallback tools.registry alias to avoid split registries.

Copilot uses AI. Check for mistakes.
@@ -0,0 +1 @@
{"session_id":"2a799d2f-5602-4806-8bbd-8f6947137697","transcript_path":"C:\\Users\\pmacl\\.claude\\projects\\D---projects-ALAS\\2a799d2f-5602-4806-8bbd-8f6947137697.jsonl","cwd":"D:\\_projects\\ALAS","model":{"id":"claude-sonnet-4-6","display_name":"Sonnet 4.6"},"workspace":{"current_dir":"D:\\_projects\\ALAS","project_dir":"D:\\_projects\\ALAS","added_dirs":[]},"version":"2.1.69","output_style":{"name":"default"},"cost":{"total_cost_usd":1.4690392499999998,"total_duration_ms":3217190,"total_api_duration_ms":302861,"total_lines_added":0,"total_lines_removed":0},"context_window":{"total_input_tokens":65,"total_output_tokens":11703,"context_window_size":200000,"current_usage":{"input_tokens":3,"output_tokens":377,"cache_creation_input_tokens":672,"cache_read_input_tokens":64960},"used_percentage":33,"remaining_percentage":67},"exceeds_200k_tokens":false} No newline at end of file
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file contains machine-specific local paths/session metadata and appears to be a runtime artifact under .omc/. It shouldn’t be committed; remove it and add .omc/ to .gitignore (or otherwise keep these HUD/session caches out of version control).

Suggested change
{"session_id":"2a799d2f-5602-4806-8bbd-8f6947137697","transcript_path":"C:\\Users\\pmacl\\.claude\\projects\\D---projects-ALAS\\2a799d2f-5602-4806-8bbd-8f6947137697.jsonl","cwd":"D:\\_projects\\ALAS","model":{"id":"claude-sonnet-4-6","display_name":"Sonnet 4.6"},"workspace":{"current_dir":"D:\\_projects\\ALAS","project_dir":"D:\\_projects\\ALAS","added_dirs":[]},"version":"2.1.69","output_style":{"name":"default"},"cost":{"total_cost_usd":1.4690392499999998,"total_duration_ms":3217190,"total_api_duration_ms":302861,"total_lines_added":0,"total_lines_removed":0},"context_window":{"total_input_tokens":65,"total_output_tokens":11703,"context_window_size":200000,"current_usage":{"input_tokens":3,"output_tokens":377,"cache_creation_input_tokens":672,"cache_read_input_tokens":64960},"used_percentage":33,"remaining_percentage":67},"exceeds_200k_tokens":false}
{}

Copilot uses AI. Check for mistakes.
},
"conventions": {
"namingStyle": null,
"importStyle": null,
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.omc/project-memory.json appears to be generated/local state (includes machine paths and scan timestamps). It should not be committed; remove it and add .omc/ to .gitignore to avoid leaking environment details and future diff noise.

Suggested change
"importStyle": null,
"importStyle": "unspecified",

Copilot uses AI. Check for mistakes.
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3db94c2f21

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@@ -0,0 +1,29 @@
"""Tool 2: Hot-Read OCR persistence of resource coordinates."""

from adb_vision.tools.registry import deterministic_tool

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Use package-relative imports in tool modules

These tool modules import adb_vision... directly, which breaks the tools.registry loading path that MasterLoop._resolve_tool explicitly tries for local execution contexts. When the process runs with adb_vision/ as the working module root, import tools.registry triggers auto-discovery and then fails with ModuleNotFoundError: No module named 'adb_vision', so deterministic tools are never registered and registry-first execution silently degrades to direct executor fallback.

Useful? React with 👍 / 👎.

from screenshot import take_screenshot
from tool_audit import audit_cli_call, configure, record_command

ADB = r"D:\Program Files\Microvirt\MEmu\adb.exe"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Detect adb path instead of hardcoding one Windows install

The diagnose CLI hardcodes ADB to D:\Program Files\Microvirt\MEmu\adb.exe, so environments where adb is on PATH (or installed under another location/OS) will fail every subprocess invocation with file-not-found and produce misleading ADB_DISCONNECTED/backend-failure diagnoses. This makes the new diagnostic workflow unreliable outside one specific machine layout.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants