[FIX] Rectify: CodexBackend Command Builder Env Contract Immunity #3394

Merged
Trecek merged 8 commits into develop from rectify-codexbackend-command-builder-gaps-missing-env-vars-s/3383 on May 31, 2026

Trecek commented on May 31, 2026

Summary

The CodexBackend command builder methods (build_food_truck_cmd, build_resume_cmd, build_headless_cmd) silently diverge from their ClaudeCodeBackend counterparts in which environment variables they inject. The root cause is that the CodingAgentBackend Protocol specifies only method signatures and says nothing about which env vars each method must produce, so a backend can satisfy the Protocol while omitting variables the session depends on.

The architectural fix: define per-session-type REQUIRED_ENV frozensets in _type_constants_env.py, wire them through EnvPolicy.build_env(required=...) at every build site, and add cross-backend parametrized contract tests that assert env var parity across all backends in the registry.
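A minimal sketch of the required-set idea, not the actual EnvPolicy implementation. The function signature, the exception class, and the contents of the frozenset are assumptions made for illustration; the PR names `SKILL_SESSION_REQUIRED_ENV` and `build_env(required=...)` but does not show their bodies, and the key strings below are treated as literal env var names only for the example.

```python
from typing import Dict, FrozenSet, Mapping, Optional

# Hypothetical membership: the PR does not list the full contents of this set.
SKILL_SESSION_REQUIRED_ENV: FrozenSet[str] = frozenset(
    {"MCP_CONNECTION_NONBLOCKING", "AUTOSKILLIT_HEADLESS_AUTO_GATE"}
)


class MissingRequiredEnvError(KeyError):
    """Hypothetical error raised when a built env lacks a required key."""


def build_env(
    base: Mapping[str, str],
    extras: Optional[Mapping[str, str]] = None,
    required: FrozenSet[str] = frozenset(),
) -> Dict[str, str]:
    """Merge base and extras, then fail loudly if any required key is absent."""
    env = dict(base)
    env.update(extras or {})
    missing = sorted(required - set(env))
    if missing:
        raise MissingRequiredEnvError(f"missing required env vars: {missing}")
    return env
```

With a check like this at every build site, a builder that forgets a variable fails when the command is built rather than diverging silently at runtime.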

Key changes:

  1. Add SKILL_SESSION_REQUIRED_ENV, ORCHESTRATOR_SESSION_REQUIRED_ENV, and RESUME_SESSION_BASELINE_KEYS frozensets to _type_constants_env.py
  2. Wire required through all build_*_cmd methods in both CodexBackend and ClaudeCodeBackend
  3. Fix CodexBackend.build_resume_cmd to use filtered base env (remove _HEADLESS_EXCLUSIVE_VARS leakage) and add missing sandbox flags
  4. Add missing env vars (AGENT_BACKEND_ENV_VAR, AUTOSKILLIT_HEADLESS_AUTO_GATE, MCP_CONNECTION_NONBLOCKING) to CodexBackend.build_food_truck_cmd
  5. Add cross-backend env contract tests and extend existing invariant tests to cover CodexBackend
  6. Replace NotImplementedError stubs (validate_skill_content, list_plugins) with safe returns
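The cross-backend contract test from item 5 could be approximated as follows. This is a sketch under stated assumptions: the registry is a plain dict, the three builder functions are hypothetical stand-ins that return only the env mapping, and the required set's contents are illustrative rather than taken from `_type_constants_env.py`.

```python
from typing import Callable, Dict, FrozenSet, List

# Illustrative required set; the real frozensets live in _type_constants_env.py.
REQUIRED: FrozenSet[str] = frozenset(
    {"MCP_CONNECTION_NONBLOCKING", "AUTOSKILLIT_HEADLESS_AUTO_GATE"}
)


def claude_env() -> Dict[str, str]:
    # Hypothetical stand-in for the env ClaudeCodeBackend would build.
    return {"MCP_CONNECTION_NONBLOCKING": "1", "AUTOSKILLIT_HEADLESS_AUTO_GATE": "1"}


def codex_env_before_fix() -> Dict[str, str]:
    # Hypothetical stand-in for the drifted Codex env described in the summary.
    return {"PATH": "/usr/bin"}


def codex_env_after_fix() -> Dict[str, str]:
    # Hypothetical stand-in for the Codex env once the missing vars are injected.
    return {"MCP_CONNECTION_NONBLOCKING": "1", "AUTOSKILLIT_HEADLESS_AUTO_GATE": "1"}


def env_contract_gaps(
    registry: Dict[str, Callable[[], Dict[str, str]]],
    required: FrozenSet[str],
) -> Dict[str, List[str]]:
    """Return, per backend name, the sorted required keys its env is missing."""
    gaps: Dict[str, List[str]] = {}
    for name, builder in registry.items():
        missing = sorted(required - set(builder()))
        if missing:
            gaps[name] = missing
    return gaps
```

Because the check iterates over the registry instead of naming backends, a backend added later is covered without anyone editing the test.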

Architecture Impact

module-dependency Diagram

classDiagram
    direction TB
    class CodexBackend {
        +build_food_truck_cmd()
        +build_resume_cmd()
        +build_headless_cmd()
        +validate_skill_content()
        +list_plugins()
    }
    class ClaudeCodeBackend {
        +build_food_truck_cmd()
        +build_resume_cmd()
        +build_headless_cmd()
    }
    class EnvPolicy {
        +build_env(required)
    }
    class CodingAgentBackend {
        <<Protocol>>
    }
    CodexBackend ..|> CodingAgentBackend
    ClaudeCodeBackend ..|> CodingAgentBackend
    CodexBackend --> EnvPolicy
    ClaudeCodeBackend --> EnvPolicy

process-flow Diagram

flowchart TD
    A[build_food_truck_cmd] --> B[EnvPolicy.build_env required=SKILL_SESSION_REQUIRED_ENV]
    C[build_resume_cmd] --> D[EnvPolicy.build_env required=RESUME_SESSION_BASELINE_KEYS]
    E[build_headless_cmd] --> F[EnvPolicy.build_env required=ORCHESTRATOR_SESSION_REQUIRED_ENV]
    B --> G[CodexBackend food truck process]
    D --> H[CodexBackend resume process]
    F --> I[CodexBackend headless process]

security Diagram

flowchart LR
    A[_HEADLESS_EXCLUSIVE_VARS] -->|filtered out| B[build_resume_cmd base env]
    C[missing sandbox flags] -->|added| D[CodexBackend.build_resume_cmd]
    E[AGENT_BACKEND_ENV_VAR MCP_CONNECTION_NONBLOCKING AUTOSKILLIT_HEADLESS_AUTO_GATE] -->|added| F[CodexBackend.build_food_truck_cmd]
    G[NotImplementedError stubs] -->|replaced| H[safe returns]

Closes #3383

Implementation Plan

Plan file: .autoskillit/temp/rectify/rectify_codex_env_contract_immunity_2026-05-30_215500.md

🤖 Generated with Claude Code via AutoSkillit

Token Usage Summary

| Step | Model | count | uncached | output | cache_read | peak_ctx | turns | cache_write | time |
|---|---|---|---|---|---|---|---|---|---|
| rectify* | opus[1m] | 1 | 2.5k | 26.1k | 3.0M | 142.2k | 332 | 224.2k | 24m 15s |
| review_approach* | sonnet | 1 | 6.9k | 6.2k | 175.4k | 38.3k | 53 | 32.5k | 4m 15s |
| dry_walkthrough* | opus | 1 | 48 | 14.6k | 2.0M | 79.7k | 206 | 63.2k | 10m 36s |
| implement* | sonnet | 1 | 658 | 48.3k | 7.4M | 138.4k | 244 | 122.7k | 14m 2s |
| audit_impl* | sonnet | 1 | 1.4k | 14.6k | 247.7k | 46.9k | 67 | 48.7k | 6m 32s |
| prepare_pr* | sonnet | 1 | 87.6k | 3.8k | 224.9k | 33.2k | 24 | 26.9k | 1m 26s |
| compose_pr* | sonnet | 1 | 40.8k | 2.2k | 163.9k | 27.8k | 13 | 15.5k | 47s |
| review_pr* | sonnet | 2 | 428 | 114.6k | 3.4M | 139.0k | 211 | 269.1k | 28m 22s |
| resolve_review* | opus | 2 | 2.5k | 34.1k | 3.7M | 94.7k | 216 | 145.9k | 25m 54s |
| Total | | | 142.8k | 264.4k | 20.3M | 142.2k | | 948.8k | 1h 56m |

* Step used a non-Anthropic provider; caching behavior may differ.

Token Efficiency

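| Step | LoC Changed | cache_read/LoC | cache_write/LoC | output/LoC |
|---|---|---|---|---|
| rectify | 0 | | | |
| review_approach | 0 | | | |
| dry_walkthrough | 0 | | | |
| implement | 314 | 23534.4 | 390.8 | 153.9 |
| audit_impl | 0 | | | |
| prepare_pr | 0 | | | |
| compose_pr | 0 | | | |
| review_pr | 0 | | | |
| resolve_review | 57 | 65648.0 | 2559.7 | 598.9 |
| Total | 371 | 54824.5 | 2557.4 | 712.6 |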

Model Usage Breakdown

| Model | steps | uncached | output | cache_read | cache_write | time |
|---|---|---|---|---|---|---|
| opus[1m] | 1 | 2.5k | 26.1k | 3.0M | 224.2k | 24m 15s |
| sonnet | 6 | 137.7k | 189.6k | 11.7M | 515.5k | 55m 26s |
| opus | 2 | 2.5k | 48.7k | 5.7M | 209.1k | 36m 30s |

Trecek force-pushed the rectify-codexbackend-command-builder-gaps-missing-env-vars-s/3383 branch from 3f12541 to 1af5947 on May 31, 2026 06:44
Trecek and others added 8 commits May 31, 2026 00:13
…Backend

Adds per-session-type REQUIRED_ENV frozensets and wires them through
EnvPolicy.build_env at every build site, closing the silent drift between
CodexBackend and ClaudeCodeBackend env injection. Fixes filtered base
env in Codex headless/resume commands, adds sandbox flag to resume cmd,
resolves NotImplementedError stubs, and adds cross-backend parametrized
contract tests that structurally prevent future env var drift.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…NV reinjection

MAX_MCP_OUTPUT_TOKENS is in _HEADLESS_EXCLUSIVE_VARS but legitimately
re-injected via _SESSION_BASELINE_ENV extras in build_resume_cmd. The leak
assertions must exclude intentionally re-injected keys. Also fixes ruff
import ordering in core/__init__.pyi.
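The adjusted leak assertion can be pictured like this. It is a sketch only: the second member of the exclusive set and the baseline value are made up for illustration, and `leaked_headless_vars` is a hypothetical helper, not a function from the repository.

```python
from typing import FrozenSet, List, Mapping

# MAX_MCP_OUTPUT_TOKENS comes from the commit message; the second key is hypothetical.
HEADLESS_EXCLUSIVE_VARS: FrozenSet[str] = frozenset(
    {"MAX_MCP_OUTPUT_TOKENS", "HYPOTHETICAL_HEADLESS_ONLY"}
)

# Keys the resume builder re-injects on purpose; the value is illustrative.
SESSION_BASELINE_ENV = {"MAX_MCP_OUTPUT_TOKENS": "50000"}


def leaked_headless_vars(
    env: Mapping[str, str],
    exclusive: FrozenSet[str] = HEADLESS_EXCLUSIVE_VARS,
    reinjected: Mapping[str, str] = SESSION_BASELINE_ENV,
) -> List[str]:
    """Return headless-exclusive keys present in env that were not re-injected on purpose."""
    forbidden = exclusive - set(reinjected)
    return sorted(forbidden & set(env))
```

A resume env that carries only `MAX_MCP_OUTPUT_TOKENS` passes, while one that also carries another headless-exclusive key is still reported as a leak.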

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…heck

Replace `assert isinstance(result, list)` with `assert result == []` in
TestCodexStubMethods to pin the actual stub return value and catch regressions.
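To illustrate why the equality check is stricter, consider the sketch below. Both stub functions are hypothetical stand-ins rather than the real CodexBackend methods; the second simulates a regression that starts returning data.

```python
from typing import List


def stub_as_merged() -> List[str]:
    # Hypothetical stand-in for a stub that returns a safe empty list.
    return []


def stub_after_regression() -> List[str]:
    # Hypothetical regression: still a list, but no longer empty.
    return ["unexpected-plugin"]


def passes_type_check(result: object) -> bool:
    """The old assertion: only the type is pinned."""
    return isinstance(result, list)


def passes_value_check(result: object) -> bool:
    """The new assertion: the exact return value is pinned."""
    return result == []
```

The type check accepts both stubs, so only the value check would catch the regression.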

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…tract file

Remove test_resume_cmd_env_uses_filtered_base, test_resume_cmd_has_sandbox_flag,
and test_headless_cmd_uses_filtered_base — all duplicate assertions already in
test_codex_backend.py. Clean up unused imports.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Both SKILL_SESSION_REQUIRED_ENV and ORCHESTRATOR_SESSION_REQUIRED_ENV
now include MCP_CONNECTION_NONBLOCKING, matching the mandatory status
already enforced by test_all_session_builders_inject_mcp_connection_nonblocking.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…n Codex tests

Import from autoskillit.execution.commands (the canonical re-export)
instead of the private autoskillit.execution.backends._claude_prompt
module, consistent with all other env-boundary tests in the repo.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…headless_cmd

Pre-merging env_extras into filtered_base subjected them to the
CodexEnvPolicy denylist, silently dropping matching keys. Pass them
as extras= instead, consistent with build_skill_session_cmd and
build_food_truck_cmd which bypass the denylist for extras.
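The difference between pre-merging and passing `extras=` can be sketched as follows. The denylist prefix, the key name, and this simplified `build_env` are assumptions for illustration; the real CodexEnvPolicy denylist is not shown in the PR.

```python
from typing import Dict, Mapping, Optional, Tuple

# Hypothetical denylist: drop any base key starting with one of these prefixes.
DENYLIST_PREFIXES: Tuple[str, ...] = ("CLAUDE_",)


def build_env(
    base: Mapping[str, str],
    extras: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Filter base through the denylist, then apply extras unfiltered."""
    env = {k: v for k, v in base.items() if not k.startswith(DENYLIST_PREFIXES)}
    env.update(extras or {})
    return env


filtered_base = {"PATH": "/usr/bin"}
env_extras = {"CLAUDE_HYPOTHETICAL_FLAG": "1"}  # hypothetical caller-supplied extra

# Old behavior: extras merged into the base are filtered along with it.
pre_merged = build_env({**filtered_base, **env_extras})

# New behavior: extras bypass the denylist.
passed_as_extras = build_env(filtered_base, extras=env_extras)
```

In this sketch the pre-merged call loses `CLAUDE_HYPOTHETICAL_FLAG`, while the `extras=` call keeps it, which mirrors the silent drop described in the commit message.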

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…verage tests

MCP_CONNECTION_NONBLOCKING is always injected fresh via extras (never
read from os.environ), so it belongs in the always_injected exemption
set alongside AGENT_BACKEND_ENV_VAR.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Trecek force-pushed the rectify-codexbackend-command-builder-gaps-missing-env-vars-s/3383 branch from 1af5947 to b34d0d9 on May 31, 2026 07:13
Trecek added this pull request to the merge queue on May 31, 2026
Merged via the queue into develop with commit 45223b4 May 31, 2026
3 checks passed
Trecek deleted the rectify-codexbackend-command-builder-gaps-missing-env-vars-s/3383 branch on May 31, 2026 07:23