Skip to content

feat: adb_vision - clean ADB + VLM game automation stack#43

Open
Coldaine wants to merge 10 commits intomasterfrom
feature/adb-vision-clean
Open

feat: adb_vision - clean ADB + VLM game automation stack#43
Coldaine wants to merge 10 commits intomasterfrom
feature/adb-vision-clean

Conversation

@Coldaine
Copy link
Owner

@Coldaine Coldaine commented Mar 3, 2026

Summary

New adb_vision/ folder: clean ADB game automation with zero ALAS dependency.

What's in this PR

Architecture

Gemini CLI connects to adb_vision MCP server via stdio. Gemini sees the screen (screenshot), reasons about what to do, and controls the emulator (tap/swipe). Gemini IS the bot.

Screenshot Methods (Issues for Jules)

Testing

cd adb_vision && uv run pytest test_server.py -v   # 10/10 pass

- Clean MCP server (server.py) with 6 tools: screenshot, tap, swipe, keyevent, launch_game, get_focus
- Pluggable screenshot backends (screenshot.py) with dispatch + stubs for DroidCast, scrcpy, u2, screencap
- 10/10 unit tests passing (all mocked, no device needed)
- Gemini CLI wired as vision driver via MCP (drive.bat + system prompt)
- .mcp.json updated to point to adb_vision server
- .gemini/settings.json registers adb-vision MCP server

GitHub issues created for Jules:
- #40: DroidCast APK HTTP stream backend
- #41: scrcpy virtual display capture backend
- #42: uiautomator2 ATX HTTP backend

Zero ALAS dependency. Pure async ADB + FastMCP.
Copilot AI review requested due to automatic review settings March 3, 2026 04:44
@chatgpt-codex-connector
Copy link

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

"server.py",
"--serial",
"127.0.0.1:21513"
]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥 The Roast: Changing the port from 21503 to 21513? I see we're playing 'guess which MEmu instance' today. According to AGENTS.md, the documented environment uses 127.0.0.1:21503 for MEmuPlayer. This change breaks the 'Known local environment' documented in your own setup guide.

🩹 The Fix: Either document why port 21513 is needed, make this configurable via environment variable, or keep the documented default. Don't leave developers playing port roulette.

📏 Severity: critical

]
}
}
} No newline at end of file
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥 The Roast: Missing newline at end of file. It's 2026 and we're still having this conversation? Your editor is gaslighting you into believing this file is complete when git diff is side-eyeing you hard.

🩹 The Fix:

Suggested change
}
"PYTHONIOENCODING": "utf-8"
}
}
}

📏 Severity: nitpick

"server.py",
"--serial",
"127.0.0.1:21513"
],
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥 The Roast: Same port switcharoo here — 21503 became 21513 like it's no big deal. The AGENTS.md clearly states 127.0.0.1:21503 as the known environment. If you're changing the default, at least have the courtesy to update the docs or explain why instance 0 suddenly became instance 1.

🩹 The Fix: Align with documented environment or update AGENTS.md to explain the port change rationale.

📏 Severity: critical



ADB_EXECUTABLE: str = _find_adb()
ADB_SERIAL: str = "127.0.0.1:21513"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥 The Roast: Hardcoded serial at module level — the module loads once and this value is cast in stone. It's like defining your WiFi password in a global constant and calling it 'configuration'. The argparse default below becomes purely decorative since this is already set.

🩹 The Fix: Set this to None here and let main() initialize it properly from args or environment. Or use a class-based approach that doesn't rely on module-level globals.

📏 Severity: warning

try:
log.debug("trying screenshot backend: %s", name)
b64 = await capture_fn(adb_run=adb_run, serial=serial, adb_exe=adb_exe)
if b64 and len(base64.b64decode(b64)) > 5000:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥 The Roast: 5000 — the magic number that somehow distinguishes 'real screenshot' from 'blank MEmu garbage'. This threshold is doing more heavy lifting than a CrossFit enthusiast on leg day. Where did 5000 come from? The comment says '>> 5KB' but the check is '> 5000 bytes' which is a hair under 5KB. Close enough for government work, I suppose.

🩹 The Fix: Make this a named constant with a descriptive name like MIN_VALID_PNG_SIZE_BYTES = 5000 and document how you arrived at this value.

📏 Severity: suggestion

# Stub — implementation will be filled by a Jules issue.
# ---------------------------------------------------------------------------
async def _capture_droidcast(*, adb_run: AdbRunFn, serial: str, adb_exe: str) -> str:
raise NotImplementedError(
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥 The Roast: Three unimplemented backends that just raise NotImplementedError. I appreciate the optimism of providing a pluggable architecture when only one plugin actually works. It's like building a 4-car garage for a unicycle.

🩹 The Fix: Either implement at least one more backend so 'auto' failover makes sense, or remove the stubs and simplify until they're ready. Dead code is technical debt wearing a Halloween costume.

📏 Severity: suggestion

with open(_ACTION_LOG_PATH, "a", encoding="utf-8") as fh:
fh.write(json.dumps(record, ensure_ascii=True) + "\n")
except Exception:
pass
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥 The Roast: except Exception: pass — the 'see no evil, hear no evil' of error handling. When logging fails, we just... don't? Silent failures are like ninja bugs: you don't know they're there until they've stolen your sanity at 3 AM.

🩹 The Fix: At minimum, log to stderr as a fallback: except Exception as e: print(f'Failed to log action: {e}', file=sys.stderr)

📏 Severity: warning

found = shutil.which("adb")
if found:
return found
candidates = [
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥 The Roast: Windows-only hardcoded paths. I see C:\ and D:\ and I know this code won't work anywhere else without modification. It's like building a house with doors that only open in one direction — sure it works, but you're gonna confuse a lot of people.

🩹 The Fix: Add POSIX paths like /usr/bin/adb, /opt/android-sdk/platform-tools/adb, or just rely on shutil.which() entirely for non-Windows systems.

📏 Severity: suggestion

@kilo-code-bot
Copy link

kilo-code-bot bot commented Mar 3, 2026

Code Review Roast 🔥

Verdict: 4 New Issues Found | Recommendation: Address before merge

Overview

Severity Count
🚨 critical 0
⚠️ warning 1
💡 suggestion 1
🤏 nitpick 2
Issue Details (click to expand)
File Line Roast
adb_vision/screenshot.py 57 Fragile JSON parsing logic — assumes specific keys exist
adb_vision/server.py 28 Relative import prevents module usage
adb_vision/screenshot.py 19 Hardcoded port should be configurable
adb_vision/server.py 249 Hardcoded package name limits regional support

🏆 Best part: The upstream_alas submodule deletion shows you're actually committing to the "abandon ALAS" strategy instead of just talking about it. I respect the follow-through — most teams would keep it around "just in case" until the heat death of the universe.

💀 Worst part: The JSON parsing in _to_png_bytes is a debugging session waiting to happen. When some screenshot service returns {"image_data": "base64..."} instead of one of your blessed four keys, you'll spend 45 minutes wondering why the "working" code suddenly broke. I've been there. It wasn't fun. That's why it's a warning, not a nitpick.

📊 Overall: Like a demolition crew that remembered to bring dynamite but forgot the building permits — the intent is clear, the execution is mostly solid, but there are a few regulatory (read: architectural) details that need tidying before this passes inspection.

Files Reviewed (11 files)
  • .gemini/settings.json - already commented (port change, missing newline)
  • .githooks/pre-push - nested .git guard (good defensive addition)
  • .github/prompts/alas-separation.prompt.md - new comprehensive prompt
  • .mcp.json - already commented (port change, naming inconsistency)
  • CLAUDE.md - added git init/clone warning
  • adb_vision/server.py - 2 new issues (import style, hardcoded package)
  • adb_vision/screenshot.py - 2 new issues (fragile JSON parsing, hardcoded port)
  • docs/MASTER_PLAN.md - new authoritative plan (well structured)
  • docs/dev/memu_playbook.md - comprehensive MEmu reference
  • docs/monorepo/MONOREPO_SYNC_NOTES.md - added guardrails documentation
  • upstream_alas - deleted (bold architectural move)

Note: This PR contains 32 pre-existing comments. The 4 new issues above complement those and focus on architectural concerns not yet covered.

Fix these issues in Kilo Cloud

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Introduces a new standalone adb_vision/ module that provides a clean FastMCP stdio server for ADB-based “vision-driven” emulator control, explicitly avoiding any ALAS dependency, and wires editor/Gemini configs to use it.

Changes:

  • Added adb_vision/ MCP server (server.py) exposing ADB control + screenshot tools.
  • Implemented a pluggable screenshot dispatcher (screenshot.py) with stub backends and a screencap fallback.
  • Added a mocked unit test suite and Gemini CLI integration/config (test_server.py, drive.bat, .gemini/settings.json, .mcp.json).

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
adb_vision/test_server.py Adds mocked async unit tests for screenshot dispatch + MCP tools.
adb_vision/server.py New FastMCP server exposing ADB tools, action logging, and screenshot saving.
adb_vision/screenshot.py Screenshot backend dispatcher with stubs + screencap backend and blank-image guard.
adb_vision/pyproject.toml Declares adb_vision dependencies and dev test deps.
adb_vision/drive.bat Launch helper for Gemini CLI using the adb_vision policy prompt.
adb_vision/conftest.py Ensures local imports work for tests by adding adb_vision/ to sys.path.
adb_vision/init.py Package marker + brief module docstring.
adb_vision/README.md Usage/architecture documentation for adb_vision.
adb_vision/GEMINI_SYSTEM_PROMPT.md Gemini system prompt describing tools and workflow.
adb_vision/.gitignore Ignores runtime artifacts (JSONL logs, screenshots, venv, caches).
.mcp.json Points VS Code MCP config to the new adb_vision server entrypoint.
.gemini/settings.json Registers the adb_vision MCP server for Gemini CLI usage.

Comment on lines +13 to +29
def _fake_png(width: int = 1280, height: int = 720) -> bytes:
"""Generate a valid PNG with random noise that is > 5KB."""
from PIL import Image
import io
import random

random.seed(42)
# Random pixel data won't compress as small as solid color
pixels = bytes(random.randint(0, 255) for _ in range(width * height * 3))
img = Image.frombytes("RGB", (width, height), pixels)
buf = io.BytesIO()
img.save(buf, format="PNG")
return buf.getvalue()


FAKE_PNG_B64 = base64.b64encode(_fake_png()).decode("ascii")
FAKE_PNG_BYTES = base64.b64decode(FAKE_PNG_B64)
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_fake_png() defaults to generating a full 1280×720 random-noise image and is called at import time to build FAKE_PNG_B64. This makes the test module expensive to import and will slow the unit test suite significantly; consider using a much smaller deterministic PNG that still exceeds the 5KB threshold (or storing a small precomputed PNG bytes constant).

Copilot uses AI. Check for mistakes.
Comment on lines 2 to 12
"mcpServers": {
"alas": {
"command": "uv",
"args": [
"run",
"--directory",
"agent_orchestrator",
"alas_mcp_server.py",
"--config",
"alas"
"adb_vision",
"server.py",
"--serial",
"127.0.0.1:21513"
],
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The VS Code MCP server entry is still named alas even though it now launches adb_vision/server.py. Renaming the key to something like adb-vision would reduce confusion (and avoids clobbering any existing ALAS MCP configuration).

Copilot uses AI. Check for mistakes.
Comment on lines +298 to +303
parser.add_argument(
"--screenshot-method",
default="auto",
choices=["auto", "droidcast", "scrcpy", "u2", "screencap"],
help="Default screenshot method",
)
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The --screenshot-method CLI option is parsed but never used, so it currently has no effect on tool behavior. Either remove this argument or wire it into adb_screenshot() (e.g., set a module-level default method in main() and have adb_screenshot() use it when the caller doesn’t explicitly pass method).

Suggested change
parser.add_argument(
"--screenshot-method",
default="auto",
choices=["auto", "droidcast", "scrcpy", "u2", "screencap"],
help="Default screenshot method",
)

Copilot uses AI. Check for mistakes.
t0 = time.monotonic()
await _adb_run("shell", "input", "tap", str(x), str(y), timeout=5.0)
ms = int((time.monotonic() - t0) * 1000)
_action_log("adb_tap", {"x": x, "y": y}, f"tapped {x},{y}", "", ms)
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action log records for adb_tap omit the active ADB serial, which makes the JSONL log ambiguous if the server is started with a non-default --serial. Consider including serial in the logged args (as is already done for some other tools).

Suggested change
_action_log("adb_tap", {"x": x, "y": y}, f"tapped {x},{y}", "", ms)
_action_log("adb_tap", {"serial": ADB_SERIAL, "x": x, "y": y}, f"tapped {x},{y}", "", ms)

Copilot uses AI. Check for mistakes.
Comment on lines +213 to +219
_action_log(
"adb_swipe",
{"x1": x1, "y1": y1, "x2": x2, "y2": y2, "duration_ms": duration_ms},
f"swiped {x1},{y1}->{x2},{y2}",
"",
ms,
)
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action log records for adb_swipe omit the active ADB serial, which makes the JSONL log ambiguous if the server is started with a non-default --serial. Consider including serial in the logged args (as is already done for some other tools).

Copilot uses AI. Check for mistakes.
t0 = time.monotonic()
await _adb_run("shell", "input", "keyevent", str(keycode), timeout=5.0)
ms = int((time.monotonic() - t0) * 1000)
_action_log("adb_keyevent", {"keycode": keycode}, f"keyevent {keycode}", "", ms)
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action log records for adb_keyevent omit the active ADB serial, which makes the JSONL log ambiguous if the server is started with a non-default --serial. Consider including serial in the logged args (as is already done for some other tools).

Suggested change
_action_log("adb_keyevent", {"keycode": keycode}, f"keyevent {keycode}", "", ms)
_action_log(
"adb_keyevent",
{"keycode": keycode, "serial": ADB_SERIAL},
f"keyevent {keycode}",
"",
ms,
)

Copilot uses AI. Check for mistakes.
Comment on lines +45 to +50
if b64 and len(base64.b64decode(b64)) > 5000:
# Sanity check: real screenshots are >> 5KB; blank MEmu
# screencaps are ~3.6KB
log.info("screenshot captured via %s", name)
return b64
log.warning("%s returned suspiciously small image (%d bytes)", name, len(base64.b64decode(b64)))
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

take_screenshot() base64-decodes the returned image multiple times (size check + warning), which does redundant work on potentially large screenshots. Decode once into bytes, reuse the length for the sanity check/logging, and only re-encode if needed.

Suggested change
if b64 and len(base64.b64decode(b64)) > 5000:
# Sanity check: real screenshots are >> 5KB; blank MEmu
# screencaps are ~3.6KB
log.info("screenshot captured via %s", name)
return b64
log.warning("%s returned suspiciously small image (%d bytes)", name, len(base64.b64decode(b64)))
img_bytes = base64.b64decode(b64)
img_size = len(img_bytes)
if b64 and img_size > 5000:
# Sanity check: real screenshots are >> 5KB; blank MEmu
# screencaps are ~3.6KB
log.info("screenshot captured via %s", name)
return b64
log.warning(
"%s returned suspiciously small image (%d bytes)",
name,
img_size,
)

Copilot uses AI. Check for mistakes.
AI Agent added 3 commits March 3, 2026 10:42
- add adb_vision live test and DroidCast debug/setup scripts
- include current alas_wrapped updates, assets, configs, and campaign content
- retain existing adb_vision unit test baseline (10 passed)
"server.py",
"--serial",
"127.0.0.1:21513"
]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥 The Roast: Changing the port from 21503 to 21513? I see we're playing 'guess the MEmu instance' today. The AGENTS.md specifically documents 127.0.0.1:21503 as the known environment for PatrickCustom.json. Either this is a deliberate instance change (in which case, document it) or someone's MEmu config drifted and we're codifying it.

🩹 The Fix: Add a comment explaining why 21513 is the correct port for this setup, or revert to 21503 if this was unintentional. Future-you will thank present-you when debugging connection failures.

📏 Severity: suggestion

]
}
}
} No newline at end of file
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥 The Roast: Missing newline at end of file. It's 2026 and we're still doing this? POSIX tools everywhere are shedding tears. This isn't just pedantry — cat and wc and half your git diff tools expect that trailing newline.

🩹 The Fix: Add a newline at the end of the file. In VS Code: cursor to end, press Enter once, save.

📏 Severity: nitpick

"server.py",
"--serial",
"127.0.0.1:21513"
],
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥 The Roast: Same port switcharoo here — 21503 became 21513. But wait, there's more! The VS Code MCP server entry is still named alas even though we're now pointing at adb_vision/server.py. So we have a server named 'alas' that runs 'adb_vision'. That's like naming your dog 'Cat' — technically allowed, but confusing for everyone involved.

🩹 The Fix: Either rename the server to adb-vision (matching .gemini/settings.json) or add a comment explaining why the naming discrepancy exists. Consistency is the hobgoblin of little minds, but it's also the savior of 3AM debugging sessions.

📏 Severity: suggestion



ADB_EXECUTABLE: str = _find_adb()
ADB_SERIAL: str = "127.0.0.1:21513"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥 The Roast: Hardcoded serial at module level — the module imports, then main() parses args, then ADB_SERIAL gets reassigned. This is technically fine for the stdio transport since main() runs before tools are called, but it's a landmine for anyone importing this module for testing. The global gets set at import time to 21513, and if your test tries to override it before calling main(), surprise!

🩹 The Fix: Consider making ADB_SERIAL default to None at module level and have _adb_run() resolve it at call time (falling back to env var or default). Or at minimum, document this initialization order hazard in a module-level docstring.

📏 Severity: nitpick

default="auto",
choices=["auto", "droidcast", "scrcpy", "u2", "screencap"],
help="Default screenshot method",
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥 The Roast: The --screenshot-method CLI option is parsed with great ceremony — choices validated, help text provided, the works — and then... completely ignored. It's like building a beautiful door and never installing the handle. The adb_screenshot tool has its own method parameter that defaults to 'auto' from the function signature, making this CLI arg a decorative element.

🩹 The Fix: Either wire this up (store a module-level default and use it in adb_screenshot when method='auto' is passed), or remove the argument until it's implemented. Silent no-ops are the original sin of CLI design.

📏 Severity: warning



FAKE_PNG_B64 = base64.b64encode(_fake_png()).decode("ascii")
FAKE_PNG_BYTES = base64.b64decode(FAKE_PNG_B64)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥 The Roast: _fake_png() defaults to generating a full 1280×720 random-noise PNG for every test — that's 2.7MB of random data per call, base64-encoded to 3.6MB. Your tests are now generating gigabytes of entropy over time. The random.seed(42) helps with reproducibility, but PIL's PNG compression + random noise = non-deterministic output sizes that could theoretically flap around your 5000-byte threshold.

🩹 The Fix: Use a solid color or pattern that compresses deterministically:

img = Image.new('RGB', (width, height), color=(255, 0, 0))  # solid red

This produces consistent, small PNGs and tests your threshold logic without relying on RNG behavior.

📏 Severity: nitpick

t0 = time.monotonic()
await _adb_run("shell", "input", "tap", str(x), str(y), timeout=5.0)
ms = int((time.monotonic() - t0) * 1000)
_action_log("adb_tap", {"x": x, "y": y}, f"tapped {x},{y}", "", ms)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥 The Roast: Action log records for adb_tap omit the active ADB serial. If someone has multiple emulators running (or switches serials during testing), the log will say 'tapped 100,200' without indicating which device got tapped. This makes debugging multi-device scenarios impossible.

🩹 The Fix: Include the serial in the action log args: args={"x": x, "y": y, "serial": ADB_SERIAL}. Same applies to adb_swipe and adb_keyevent.

📏 Severity: nitpick

f"swiped {x1},{y1}->{x2},{y2}",
"",
ms,
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥 The Roast: Action log records for adb_swipe omit the active ADB serial. Same issue as adb_tap — multi-device debugging becomes archaeology when you don't know which device received the swipe.

🩹 The Fix: Include "serial": ADB_SERIAL in the args dict for all action log calls.

📏 Severity: nitpick

t0 = time.monotonic()
await _adb_run("shell", "input", "keyevent", str(keycode), timeout=5.0)
ms = int((time.monotonic() - t0) * 1000)
_action_log("adb_keyevent", {"keycode": keycode}, f"keyevent {keycode}", "", ms)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥 The Roast: Action log records for adb_keyevent omit the active ADB serial. Consistency issue with other action logs — when debugging, you want to trace the full context of every operation.

🩹 The Fix: Include "serial": ADB_SERIAL in the args dict.

📏 Severity: nitpick

# screencaps are ~3.6KB
log.info("screenshot captured via %s", name)
return b64
log.warning("%s returned suspiciously small image (%d bytes)", name, len(base64.b64decode(b64)))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥 The Roast: take_screenshot() base64-decodes the returned image multiple times — once here to check size, potentially again elsewhere. For a 720p screenshot (~100KB PNG → ~136KB base64), this is wasteful. You're doing base64.b64decode(b64) twice in the success path: once for size check, once for the warning message.

🩹 The Fix: Decode once and reuse:

decoded = base64.b64decode(b64)
if decoded and len(decoded) > 5000:
    ...
log.warning("%s returned suspiciously small image (%d bytes)", name, len(decoded))

📏 Severity: nitpick

AI Agent added 3 commits March 4, 2026 02:49
Replace NotImplementedError stubs with real implementations:

- _capture_droidcast: HTTP GET http://127.0.0.1:53516/preview via
  asyncio.to_thread + urllib.request. Validates PNG header. Requires
  setup_droidcast.py to have run (pushes APK, forwards port 53516).
  This is the primary fix for MEmu/VirtualBox where screencap returns
  a blank ~3.6KB image due to the OpenGL framebuffer not being populated.

- _capture_u2: Forwards tcp:7912 via adb_run, then HTTP GET
  http://127.0.0.1:7912/screenshot from the uiautomator2 ATX agent.
  Requires uiautomator2 to be initialized on the device.

- _capture_scrcpy: Runs scrcpy screenshot <tmpfile> as a subprocess
  (scrcpy v2.7+). Falls back cleanly if scrcpy is not in PATH.

Add 7 new unit tests (all mocked, no device needed):
  test_droidcast_backend_success/not_running/bad_header
  test_u2_backend_success/not_running
  test_scrcpy_backend_not_in_path/success

All 17 tests pass.
name: 'alas-separation'
description: 'Separate agent orchestrator from ALAS-with-Dashboard while preserving state engine'
agent: 'agent'
model: 'Claude Sonnet 4'
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥 The Roast: Claude Sonnet 4 — I appreciate the optimism, but we're not living in 2027. Anthropic hasn't released a 'Sonnet 4' yet. Unless you're beta-testing from the future, this model name is hallucinated.

🩹 The Fix: Use claude-3-5-sonnet-latest or another valid model identifier.

📏 Severity: warning

"alas_mcp_server.py",
"--config",
"alas"
"adb_vision",
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥 The Roast: The MCP server name at line 3 is still alas even though you've completely replaced it with adb_vision. That's like putting a Ferrari badge on a Honda Civic — the config may work, but you're going to confuse every developer who looks at this later.

🩹 The Fix: Rename the server key from alas to adb-vision to match what it actually runs.

📏 Severity: suggestion

finally:
try:
os.unlink(tmp_path)
except FileNotFoundError:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥 The Roast: Bare except FileNotFoundError: pass — the silent treatment for file system errors. 'What happened to the temp file?' 'Dunno, it passed.' This won't bite you until you're debugging why screenshots mysteriously fail on Windows with 260-character path limits.

🩹 The Fix: Log the exception before swallowing it, even at debug level.

📏 Severity: nitpick

from pathlib import Path
from typing import Any, Dict, Optional

from screenshot import take_screenshot
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥 The Roast: from screenshot import take_screenshot — This import is playing Russian roulette with Python's module system. It only works when running as a script because Python adds the script's directory to sys.path. Try import adb_vision.server from another directory and watch it explode like my patience when a dependency import fails in production.

🩹 The Fix: Use absolute imports: from adb_vision.screenshot import take_screenshot. This requires ensuring the parent directory is in PYTHONPATH, but at least it works consistently whether run as script or imported as module.

📏 Severity: suggestion

if not raw:
raise RuntimeError("Empty screenshot payload")
if raw[:1] in (b"{", b"["):
payload = json.loads(raw.decode("utf-8", errors="replace"))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥 The Roast: This JSON parsing logic is like playing 'guess which key contains the image' with a blindfold on. If a server returns {"result": "<base64>"} instead of one of your four hardcoded keys, you raise RuntimeError even though the data might be perfectly valid. I've seen more robust error handling in a house of cards.

🩹 The Fix: Make the key list configurable or try to base64-decode values from ANY key if the content looks like base64. Better yet, check the HTTP Content-Type header and skip JSON parsing if it's image/png.

📏 Severity: warning


AdbRunFn = Callable[..., Awaitable[bytes]]

_DROIDCAST_PORT = 53516
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥 The Roast: _DROIDCAST_PORT = 53516 — The port number that launched a thousand conflicts. Hardcoding ports is like hardcoding passwords: it works until it doesn't, and then you're debugging why your tests fail on CI because something else grabbed that port. At least use an environment variable with a fallback so I can change it without editing source code.

🩹 The Fix:

_DROIDCAST_PORT = int(os.environ.get("DROIDCAST_PORT", 53516))

📏 Severity: nitpick

"android.intent.action.MAIN",
"-c",
"android.intent.category.LAUNCHER",
"-n",
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥 The Roast: Hardcoded package name com.YoStarEN.AzurLane — Oh, you thought everyone plays the English version? What about JP? CN? TW? This is as configurable as a brick wall. One day someone will want to automate the Japanese version and they'll have to fork the entire codebase just to change one string.

🩹 The Fix: Add a --package CLI argument or AZUR_LANE_PACKAGE environment variable with the EN version as default. Then pass it to adb_launch_game() instead of hardcoding it.

📏 Severity: nitpick

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants