12 changes: 10 additions & 2 deletions AGENTS.md
@@ -93,9 +93,15 @@ CODEC supports user-authored Python plugins at `~/.codec/plugins/*.py` that regi

**Audit:** every successful hook fire emits `hook_fired`; plugin-internal exceptions emit `hook_error` with `level="warning"` (operation still succeeded). `correlation_id` inherits from the wrapping operation per Step 1 §1.4 — never regenerated.

**Trust model:** local Python files curated by the user. No marketplace, no auto-install, no inter-plugin sandbox. Same trust model as skills.
**Trust model (PR-2F, closes D-18):** local Python files curated by the user, plus a SHA-256 allowlist gate + daemon-thread timeout. Same chokepoint pattern as PR-1A's `skills/.manifest.json` for skills:

Implementation: `codec_hooks.py` (run_with_hooks + PluginRegistry), wired into `codec_dispatch.py:run_skill` (covers wake-word + chat pre-LLM + chat post-LLM tag), `codec_agents.py:Agent.run` (crew), `codec_voice.py:dispatch_skill` (voice WebSocket), `codec_mcp.py:tool_fn` (MCP stdio + HTTP). Voice WebSocket also fires `emit_operation_start` / `emit_operation_end` from `VoicePipeline.run` start/finally.
- **`~/.codec/plugins.allowlist`** (JSON, 0600, atomic-write) keys plugins by basename → `{sha256, approved_at, approved_by}`. `PluginRegistry.get_fn` computes the file's current SHA-256 and refuses to call `exec_module` if the basename has no allowlist entry OR the hash differs from the recorded one. Refusal → `plugin_load_blocked` audit (`reason ∈ {not_in_allowlist, hash_mismatch, file_unreadable}`, optional `detail` field with the AST result from `is_dangerous_skill_code` for forensic clarity).
- **Allowlist IS the trust decision, AST IS NOT.** The same pattern as PR-1A: a plugin that imports `subprocess` (e.g. `self_improve.py`) would fail the AST check, but its hash is in the allowlist so it loads. AST is only consulted to enrich the audit emit for refused plugins.
- **Grandfather migration.** On the first `scan()` after PR-2F lands, if no allowlist file is present and plugins exist in `~/.codec/plugins/*.py`, their current hashes are written to the allowlist with `approved_by: "initial_migration"` and `plugin_allowlist_migrated` is emitted. Idempotent: subsequent scans no-op.
- **Daemon-thread timeout.** Every hook fire runs inside `threading.Thread(daemon=True)` with a hard timeout (default 500ms, configurable via `~/.codec/config.json:plugin_hook_timeout_ms`). On timeout: `plugin_hook_timeout` audit, return `None`, calling thread continues. Daemon=True so process shutdown isn't blocked.
- **Operator approval flow.** New skill `plugin_approve` (`SKILL_MCP_EXPOSE=False`) — invoked via chat "approve plugin newhook.py" — computes the file's SHA-256, writes the allowlist entry with `approved_by: "operator"`, clears any prior refusal cache, emits `plugin_approved`. The skill is NEVER MCP-exposed (claude.ai can't extend the plugin trust boundary).
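
The allowlist gate described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the code in `codec_hooks.py`: the names `sha256_of` and `check_plugin` are hypothetical, and only the `sha256` field and the three refusal reasons are taken from the text.

```python
import hashlib
import json
from pathlib import Path
from typing import Optional


def sha256_of(path: Path) -> str:
    """Hex SHA-256 of the file's current bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def check_plugin(path: Path, allowlist_path: Path) -> Optional[str]:
    """Return None if the plugin may load, else a refusal reason.

    Hypothetical helper: the reason strings mirror the ones named in
    the `plugin_load_blocked` audit, everything else is an assumption.
    """
    try:
        current = sha256_of(path)
    except OSError:
        return "file_unreadable"

    try:
        allowlist = json.loads(allowlist_path.read_text())
    except (OSError, ValueError):
        allowlist = {}  # a missing or corrupt allowlist trusts nothing
    if not isinstance(allowlist, dict):
        allowlist = {}

    entry = allowlist.get(path.name)  # keyed by basename
    if not isinstance(entry, dict):
        return "not_in_allowlist"
    if entry.get("sha256") != current:
        return "hash_mismatch"
    return None
```

Note that the decision depends only on the hash comparison. Nothing in this path inspects the plugin's source, which matches the "allowlist is the trust decision, AST is not" rule.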

Implementation: `codec_hooks.py` (run_with_hooks + PluginRegistry — now owns its `allowlist_path` for per-test isolation), wired into `codec_dispatch.py:run_skill` (covers wake-word + chat pre-LLM + chat post-LLM tag), `codec_agents.py:Agent.run` (crew), `codec_voice.py:dispatch_skill` (voice WebSocket), `codec_mcp.py:tool_fn` (MCP stdio + HTTP). Voice WebSocket also fires `emit_operation_start` / `emit_operation_end` from `VoicePipeline.run` start/finally.
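
For the daemon-thread timeout, a minimal sketch of the pattern might look like this. The function name `run_hook_with_timeout` is hypothetical and audit emission is omitted. The 500 ms default and the "return `None`, calling thread continues" behavior come from the description above.

```python
import threading
from typing import Any, Callable, Dict, Optional


def run_hook_with_timeout(fn: Callable[[], Any],
                          timeout_ms: int = 500) -> Optional[Any]:
    """Run fn in a daemon thread; return its result, or None on timeout/error.

    Illustrative only: the real hook runner would also emit
    `plugin_hook_timeout` / `hook_error` audits at the marked points.
    """
    result: Dict[str, Any] = {}

    def _target() -> None:
        try:
            result["value"] = fn()
        except Exception as exc:  # plugin-internal failure
            result["error"] = exc  # (audit `hook_error` would go here)

    worker = threading.Thread(target=_target, daemon=True)
    worker.start()
    worker.join(timeout_ms / 1000.0)

    if worker.is_alive():
        # Timed out (audit `plugin_hook_timeout` would go here).
        # The thread is abandoned, not killed.
        return None
    return result.get("value")
```

Python offers no way to forcibly stop a thread, so a timed-out hook keeps running in the background. The timeout only bounds how long the caller waits, and `daemon=True` keeps the abandoned thread from blocking interpreter shutdown.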

### AskUserQuestion + stuck detection + step budget (Phase 1 Step 3)

@@ -818,6 +824,8 @@ These zones break running infrastructure if changed without coordination. NEVER
- `~/.codec/oauth_state.json` — clearing this invalidates ALL claude.ai connections; touch only after explicit user OK + `pm2 restart codec-mcp-http`
- Keychain entry `ai.avadigital.codec.internal_token` (Phase 1 Wave 2, PR-2D) — deleting forces a full token regen across every CODEC daemon. Heartbeat / scheduler / skills miss for ≤30s while their cache misses; `internal_token_autogenerated` audits emit. Use only for rotation. The legacy `X-Internal: codec` literal header is no longer recognized anywhere in `AuthMiddleware` — restoring it would re-open D-11.
- Keychain entry `ai.avadigital.codec.audit_hmac_secret` (Phase 1 Wave 2, PR-2E) — backs HMAC-SHA256 signing of `~/.codec/audit.log`. Deleting it forces a fresh bootstrap on next write; subsequent `verify_audit_log()` will count lines signed by the OLD secret as `broken` (stdlib WARNING at bootstrap time tells the operator). Rotate only as part of a planned forensic cycle (export the old log first via the verify utility). NEVER call `codec_audit.log_event` from inside `codec_keychain.get_audit_hmac_secret()` — circular write path → deadlock on the non-reentrant `_LOCK`. The silent bootstrap path in PR-2E exists exactly to avoid this.
- `~/.codec/plugins.allowlist` (Phase 1 Wave 2, PR-2F) — SHA-256 allowlist that gates plugin loading. Deleting it forces every plugin in `~/.codec/plugins/` to fall through grandfather migration on next scan, which re-adds them all with `approved_by: "initial_migration"` — effectively a "trust everything currently in the dir again" reset. Use cautiously: if a malicious file is in the plugins dir at the time of deletion, the migration grandfathers it too. Atomic-write contract — never hand-edit; use the `plugin_approve` skill (operator-only, `SKILL_MCP_EXPOSE=False`) or `codec_hooks.approve_plugin(filename)`. File perms enforced at 0600.
- `~/.codec/config.json:plugin_hook_timeout_ms` (Phase 1 Wave 2, PR-2F) — default 500ms. Below ~50ms, legitimate hooks (notably the synchronous body of `self_improve.py`'s `on_operation_end`, which does a snapshot + daemon-thread spawn) start tripping the timeout. Above ~2000ms, the DoS risk returns: a malicious hook can block the calling thread for the full timeout. Keep in the [100, 1000] ms band unless surfaced.
- `~/.codec/audit.log` (Phase 1 Wave 2, PR-2E) — tampering with existing lines or appending forged lines is detected by HMAC verification (`verify_audit_log()` / `audit_verify` skill). Whole-line deletion is **NOT** detected by per-line HMAC (documented limitation of Q5=a — hash-chain considered too fragile on rotation). File perms enforced at 0600. Direct edits via shell are technically possible but observable post-hoc; do not edit by hand.
- `~/.codec/pending_questions.json` (Phase 1 Step 3) — direct edits race in-flight `threading.Event` waiters; agents will hang or skip answers. Use `POST /api/agents/answer/{qid}` or `codec_ask_user.submit_answer()` instead.
- `~/.codec/voice_session.json` (Phase 1 Step 3) — voice-session active-marker; `VoicePipeline.run` owns its lifecycle.
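
The "atomic-write, 0600" contract on `~/.codec/plugins.allowlist` is the reason hand-editing is discouraged. One common way to meet such a contract is sketched below, under the assumption of a POSIX filesystem. The helper name `write_allowlist_atomic` is hypothetical and is not claimed to match `codec_hooks.approve_plugin`.

```python
import json
import os
import tempfile
from pathlib import Path


def write_allowlist_atomic(path: Path, data: dict) -> None:
    """Write JSON to `path` so readers see the old or the new file, never a partial one."""
    # The temp file must live in the same directory so os.replace stays
    # on one filesystem, where the rename is atomic.
    fd, tmp = tempfile.mkstemp(dir=str(path.parent), prefix=path.name + ".")
    try:
        os.fchmod(fd, 0o600)  # mkstemp already uses 0600; set it explicitly anyway
        with os.fdopen(fd, "w") as f:
            json.dump(data, f, indent=2, sort_keys=True)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before the rename
        os.replace(tmp, path)  # atomic swap into place
    except BaseException:
        try:
            os.unlink(tmp)  # do not leave a stray temp file behind
        except FileNotFoundError:
            pass
        raise
```

An in-place edit with a text editor can leave a truncated or half-written file visible to a concurrent `scan()`, which the write-then-rename sequence avoids.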
32 changes: 23 additions & 9 deletions codec.py
@@ -797,16 +797,30 @@ def do_screenshot_question():
push(lambda: show_overlay('Screenshot failed', '#ff3333', 2000))
return
print(f"[CODEC] Screenshot captured ({len(ctx)} chars)")
# Show brief summary of what was captured, then open question dialog
summary = ctx[:120].replace('"', '\\"').replace('\n', ' ')
# PR-2F (closes D-21): pass the OCR summary as an osascript ARGV argument
# rather than interpolating it into the script source. AppleScript reads
# `summary` from `item 1 of argv` — NO string interpolation means an
# adversarial OCR result (`"\n display dialog "PWNED"`) is treated as
# literal text by AppleScript and cannot break out of the string context.
summary = ctx[:120]
body = f"I captured your screen:\n\n{summary}…\n\nWhat would you like to know about it?"
script = (
'on run argv\n'
' set bodyText to item 1 of argv\n'
' tell application "System Events"\n'
' set frontmost of first process whose frontmost is true to true\n'
' end tell\n'
' set t to text returned of (display dialog bodyText '
'default answer "" with title "CODEC Screenshot" '
'buttons {"Cancel","Ask"} default button "Ask")\n'
' return t\n'
'end run'
)
try:
r = subprocess.run(["osascript", "-e",
f'tell application "System Events"\nset frontmost of first process whose frontmost is true to true\nend tell\n'
f'set t to text returned of (display dialog '
f'"I captured your screen:\\n\\n{summary}…\\n\\nWhat would you like to know about it?" '
f'default answer "" with title "CODEC Screenshot" '
f'buttons {{"Cancel","Ask"}} default button "Ask")'],
capture_output=True, text=True, timeout=120)
r = subprocess.run(
["osascript", "-e", script, body],
capture_output=True, text=True, timeout=120,
)
question = r.stdout.strip()
if question:
task = question + " [SCREEN CONTEXT: " + ctx[:800] + "]"
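
The difference between the old interpolated call and the new argv-based call in this hunk can be shown in isolation. The sketch below is a simplified illustration, not the code in `codec.py`: `unsafe_script`, `safe_command`, and `ask` are hypothetical names, and the AppleScript is a trimmed version of the one in the diff.

```python
import subprocess
from typing import List

# Fixed script source: untrusted text never becomes part of it.
DIALOG_SCRIPT = (
    'on run argv\n'
    '    set bodyText to item 1 of argv\n'
    '    set t to text returned of (display dialog bodyText '
    'default answer "" with title "CODEC Screenshot" '
    'buttons {"Cancel","Ask"} default button "Ask")\n'
    '    return t\n'
    'end run'
)


def unsafe_script(summary: str) -> str:
    """Old pattern: untrusted text is spliced into AppleScript source."""
    return f'display dialog "{summary}" default answer ""'


def safe_command(body: str) -> List[str]:
    """New pattern: untrusted text travels as a separate argv element."""
    return ["osascript", "-e", DIALOG_SCRIPT, body]


def ask(body: str, timeout: int = 120) -> str:
    """Show the dialog and return the typed answer (macOS only)."""
    r = subprocess.run(safe_command(body),
                       capture_output=True, text=True, timeout=timeout)
    return r.stdout.strip()
```

With a payload such as `'"\n display dialog "PWNED"'`, `unsafe_script` yields source in which the payload closes the string literal and adds a statement. `safe_command` leaves `DIALOG_SCRIPT` byte-for-byte unchanged and delivers the payload to AppleScript only as the value of `item 1 of argv`. Because `subprocess.run` receives a list, no shell parses it either.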