Rectify: Codex Config Schema Drift Immunity — TOML Key Quoting + Pre-Flight Validation #3361
Merged
Trecek merged 7 commits on May 31, 2026
93142be to 73c3d02
…Flight Validation

- Add `_quote_key()` helper and `_BARE_KEY_RE` to `_codex_config.py`; apply to all 8 key-emission sites so dotted/special keys are quoted per TOML spec
- Add `_validate_codex_config()` to `codex.py`; calls `codex doctor --json` and checks `config.load` status after MCP registration in `ensure_pre_launch()`
- Add `TestSerializeTomlKeyQuoting` (8 tests) and 2 preservation tests to `test_codex_config.py` to catch round-trip corruption by construction
- Add `TestCodexEnsurePreLaunchConfigValidation` (8 tests) to `test_codex_backend.py` covering error, ok, timeout, OSError, ordering, malformed JSON, missing check, and integration scenarios

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Project arch rule requires `import regex as re` in src/ outside hooks/ and core/.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…simplify exception handler

Add a `returncode != 0` guard before JSON parsing to avoid silent false negatives when `codex doctor` exits non-zero. Simplify the redundant `except (json.JSONDecodeError, ValueError)` to `except json.JSONDecodeError`, since `JSONDecodeError` is a subclass of `ValueError`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add control character escaping (`\b`, `\t`, `\n`, `\f`, `\r`) in `_quote_key()` to prevent malformed TOML basic strings when keys contain control characters. Per the TOML spec, these must be escaped in basic strings.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

… TOML keys

Add `test_round_trip_key_with_backslash` and `test_round_trip_key_with_dot_and_double_quote` to `TestSerializeTomlKeyQuoting` to cover the backslash-escape and combined special character code paths in `_quote_key`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Use `getattr` for `result.returncode` and `result.stdout` to handle cases where `subprocess.run` is mocked with minimal result objects in test environments that don't provide all `CompletedProcess` attributes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ex_config

Cover the graceful-degradation path added in b2f0483 where `codex doctor` exits non-zero: ensure that `ensure_pre_launch()` returns `[]` without attempting JSON parsing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
73c3d02 to f8f2b4b
Summary
Python's `tomllib` validates TOML syntax but cannot detect Codex's Rust `serde` schema rejections. The session config setup copies `~/.codex/config.toml` verbatim via `shutil.copy2`, propagating schema-drifted configs that crash Codex at launch. Additionally, `_serialize_toml()` emits dotted TOML keys unquoted, corrupting configs on round-trip writes.

This is crash #6 in a chain of 6 crashes in the same code path, indicating a systemic architectural weakness. The immunity plan addresses the root causes: (1) add binary-driven pre-flight validation via `codex doctor --json`, (2) fix the TOML serializer's key quoting, and (3) add round-trip fidelity tests that catch serialization corruption by construction.

Part A (this PR): TOML key quoting fix (REQ-REM-005) + Pre-flight validation via `codex doctor --json` (REQ-REM-001).

Requirements

REQ-REM-001: Pre-flight config validation via `codex doctor --json`

Add a pre-flight check inside `CodexBackend.ensure_pre_launch()` at `codex.py:669-675` that runs `codex doctor --json` and parses the structured output for `checks["config.load"]["status"]`. If status is not `"ok"`, return a user-facing error string including the check's `summary` and `remediation` fields.

Why `--json` instead of exit code: `codex doctor` exits 0 even for some config errors (they appear as "Notes" not "Failures"). The `--json` output provides per-check structured status that is reliable regardless of exit-code semantics.

Ordering: The `codex doctor` call must come AFTER `ensure_codex_mcp_registered()` (which may write to config.toml), so the doctor validates the post-write config state.

Edge cases:
- `codex doctor --json` includes a websocket connectivity check with a 15-second timeout. Use `timeout=20` for the subprocess to avoid killing it before doctor finishes. Treat `subprocess.TimeoutExpired` as a non-blocking warning, not an error.
- `OSError` from the subprocess is NOT unreachable: catch `OSError` defensively.
- The error string should include (a) the `summary` and `remediation` from the JSON output, and (b) a suggestion to run `codex doctor` for full diagnostics.

REQ-REM-005: Fix `_serialize_toml` dotted-key quoting

`_serialize_toml()` in `_codex_config.py` must quote TOML keys that contain dots. Currently, a key like `gpt-5.5` is emitted unquoted, causing TOML to interpret it as nested keys on re-parse. The fix: quote any key string containing a `.` character (e.g., emit `"gpt-5.5" = 1` instead of `gpt-5.5 = 1`).

Conflict Resolution Decisions
The following files had merge conflicts that were automatically resolved.
None
Closes #3335
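As an illustration of REQ-REM-001, the pre-flight check might look roughly like the sketch below. It is not the `_validate_codex_config()` from `codex.py`: the function names `check_doctor_report` and `run_doctor_preflight` are hypothetical, and the JSON shape (a top-level `checks` object keyed by check name) is inferred from the `checks["config.load"]["status"]` expression in the requirement. Treating a non-zero exit, malformed JSON, or a missing check as non-blocking is an assumption based on the graceful-degradation commits.

```python
import json
import subprocess


def check_doctor_report(returncode: int, stdout: str) -> list[str]:
    """Turn `codex doctor --json` output into user-facing errors (empty = proceed)."""
    if returncode != 0:
        # Assumed graceful degradation: do not parse output from a failed run.
        return []
    try:
        report = json.loads(stdout)
    except json.JSONDecodeError:
        return []
    checks = report.get("checks") if isinstance(report, dict) else None
    check = checks.get("config.load") if isinstance(checks, dict) else None
    if not isinstance(check, dict):
        return []
    status = check.get("status")
    if status is None or status == "ok":
        return []
    summary = check.get("summary", "")
    remediation = check.get("remediation", "")
    return [
        f"Codex config failed validation ({status}): {summary} "
        f"{remediation} Run `codex doctor` for full diagnostics."
    ]


def run_doctor_preflight(timeout: float = 20.0) -> list[str]:
    """Run the doctor after any config writes; timeouts and OS errors are non-blocking."""
    try:
        result = subprocess.run(
            ["codex", "doctor", "--json"],
            capture_output=True,
            text=True,
            timeout=timeout,  # doctor's websocket check alone may take 15 s
        )
    except subprocess.TimeoutExpired:
        return []
    except OSError:
        return []
    return check_doctor_report(result.returncode, result.stdout)
```

Splitting the parsing into a pure function keeps the error, ok, malformed JSON, and missing-check cases testable without the `codex` binary. In this sketch, the caller would invoke `run_doctor_preflight()` only after `ensure_codex_mcp_registered()` has returned, so the doctor sees the post-write config.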
Implementation Plan
Plan file: `/home/talon/projects/autoskillit-runs/remediation-20260530-161302-198051/.autoskillit/temp/rectify/rectify_codex_config_schema_drift_immunity_2026-05-30_200000.md`

🤖 Generated with Claude Code via AutoSkillit
Token Usage Summary
* Step used a non-Anthropic provider; caching behavior may differ.
Token Efficiency
Model Usage Breakdown