Skip to content

release: 1.2.0 — opus-1m defaults + ephemeral worktree + git MCP isolation#44

Merged
sf-jin-ku merged 4 commits into
mainfrom
jin/review-isolation-and-defaults
May 12, 2026
Merged

release: 1.2.0 — opus-1m defaults + ephemeral worktree + git MCP isolation#44
sf-jin-ku merged 4 commits into
mainfrom
jin/review-isolation-and-defaults

Conversation

@sf-jin-ku
Copy link
Copy Markdown
Contributor

Summary

Bumps the plugin to v1.2.0 (user-confirmed: 1.1.0 → 1.2.0 follows the prior 1.0.9 → 1.1.0 minor jump) with three behavior changes for review / adversarial-review / rescue:

1. Default Claude model = opus-4.7[1m] + xhigh effort (5f82d2b)

  • --model unset → opusclaude-opus-4-7[1m] (1M-context variant)
  • --model sonnetclaude-sonnet-4-6[1m] (also 1M)
  • --model haikuclaude-haiku-4-5 (no 1M variant available)
  • Per-model effort defaults: xhigh for opus, high for sonnet, unset for haiku
  • xhigh is now a first-class VALID_EFFORTS member; the legacy xhigh → max alias is removed and max is reserved for explicit opt-in
  • --effort is now exposed on review and adversarial-review (was task-only)

2. Review/adversarial-review now run behind a Bash-free MCP boundary (105b8cd)

The previous Bash sub-pattern allowlist (Bash(git status:*) etc.) is not strictly enforced by the Claude CLI — once Bash appears in --allowedTools with any sub-pattern, the entire Bash tool opens up. Replace with three layers:

  • Bundled read-only git MCP server (scripts/lib/mcp-git.mjs) exposing diff, log, show, blame, status, grep, ls_files as structured tools. Argv is built inside the server (no shell), refs and paths are validated against tight character classes (no leading -, no traversal, no metacharacters), and the working tree is locked to CC_GIT_ROOT. MCP tool names are strictly enforced by --allowedTools.
  • Ephemeral git worktree per branch-mode review run (scripts/lib/review-worktree.mjs). Working-tree-scope reviews run in the original repo so staged/unstaged/untracked changes remain visible. Nested try/finally so every failure path tears the worktree down. Stale-worktree sweeper at review start handles kill -9 survivors.
  • New SANDBOX_REVIEW_TOOLS allowlist: Read, Glob, Grep, WebSearch, WebFetch, mcp__gitReview__* — no Bash entry. The read-only sandbox preset leaves network unrestricted so WebFetch/WebSearch and the Claude CLI's own API path keep working; safety comes from removing Bash, not from blocking network.

3. Codex review findings (9509c06)

  • working-tree scope now skips the worktree (otherwise git status in a HEAD worktree reported clean and Claude couldn't see the changes being reviewed).
  • ensureValidRef rejects refs starting with - so --ext-diff-style values can't be passed as git options.
  • The stdio MCP server now emits -32600 Invalid Request even for malformed payloads without an id.
  • git worktree add failures run git worktree prune so partial bookkeeping is reclaimed.
  • New sweepers (pruneStaleSandboxSettings, pruneStaleReviewMcpConfigs) clean up abandoned sandbox/cc-sandbox-*.json and mcp/cc-mcp-*.json (the latter contains CC_GIT_ROOT).
  • createReviewIsolation moved into lib/review-worktree.mjs and is unit-tested directly.

4. Drop env-scrubbing from the spawn path (eff2cdc)

Earlier work bundled a buildClaudeSpawnEnv helper that stripped loopback ANTHROPIC_BASE_URL / *_PROXY env vars from the child Claude process when they were not declared in ~/.claude/settings.json. The protected case (stale ad-hoc shell export pointing at a dead loopback proxy) is uncommon, and the scrubbing's asymmetric failure mode — parent Codex dies on the dead proxy, child Claude silently bypasses it — is harder to diagnose than the symmetric failure mode. cco users were already unaffected because cco setup registers its proxy entries in settings.json. Removing the scrubber simplifies the spawn path and makes child Claude consistently inherit whatever env the parent had.

Test plan

  • npm test — 483/483 pass (down from 489 after removing the 6 env-scrubbing tests)
  • npm run typecheck
  • npm run lint
  • npm run check:version-sync — package.json + .codex-plugin/plugin.json both at 1.2.0
  • npm run check:changelog — v1.2.0 section present
  • End-to-end `companion review --wait` on this branch (Claude review inspected its own changes via the bundled MCP git server)
  • Worktree + MCP config + sandbox settings cleanup verified on disk after the smoke run

Related

🤖 Generated with Claude Code

jinku and others added 4 commits May 11, 2026 16:23
Apply opus + xhigh effort as the global default whenever the user does
not pass --model or --effort to review, adversarial-review, or task.
The opus and sonnet aliases now resolve to their 1M-context variants
(claude-opus-4-7[1m], claude-sonnet-4-6[1m]); haiku stays on the
standard claude-haiku-4-5. Per-model effort defaults: xhigh for opus,
high for sonnet, unset for haiku. xhigh is promoted to a first-class
VALID_EFFORTS member and is no longer aliased to max — max is reserved
for users who explicitly request it.

Also expose --effort on review and adversarial-review so callers can
override the model-derived default, and update SKILL.md, README, and
the cli-runtime internal reference accordingly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… + git MCP

Replace the previous Bash-pattern allowlist (which the Claude CLI does not
strictly enforce — once Bash appears with any sub-pattern, the entire Bash
tool opens up) with a tighter design built around three layers:

1. A bundled read-only git MCP server (`scripts/lib/mcp-git.mjs`) exposing
   `diff`, `log`, `show`, `blame`, `status`, `grep`, `ls_files`. Argv is
   constructed inside the server with spawnSync (no shell), refs and paths
   are validated against tight character classes, and pathspecs cannot
   escape the configured git root.
2. An ephemeral `git worktree` created per review run, with cleanup in a
   nested try/finally so every setup failure path still tears it down. A
   stale-worktree sweeper runs at the start of each review.
3. An updated `SANDBOX_REVIEW_TOOLS` allowlist that drops Bash entirely
   and adds only `mcp__gitReview__*` plus Read/Glob/Grep/Web. The
   read-only sandbox preset stops blocking network so WebFetch/WebSearch
   and the Claude CLI's own API path keep working.

`runClaudeReview` and `runClaudeAdversarialReview` now construct the
worktree, spawn the MCP server via a generated `--mcp-config`, run Claude
inside the worktree, and clean everything up on the way out.

Tests cover the MCP request dispatch, every tool handler (including
injection rejection), the worktree lifecycle, the new allowlist contract,
and the new buildArgs flags (`--mcp-config`, `--strict-mcp-config`).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Six findings from the Codex review pass on commit 0d9bf09, all addressed:

1. Working-tree reviews were silently incomplete. The previous code always
   built a worktree at HEAD, so staged/unstaged/untracked changes
   disappeared from MCP git output — `status` reported clean and `diff`
   showed nothing. Replace `createReviewWorktree` at the review entry
   point with `createReviewIsolation`, which keeps the worktree for
   branch reviews and runs in the original repo for working-tree
   reviews. Containment for working-tree mode now relies on the
   Bash-free allowlist + read-only sandbox (the same surface that holds
   for the worktree case).

2. Reject refs that start with `-` (e.g. `-p`, `--ext-diff`). The
   previous `REF_PATTERN` allowed leading hyphens, so a "ref" could be
   passed to git as an option since refs are emitted before the `--`
   separator. Add an explicit guard + tests covering the dash-prefix
   class.

3. `runMcpGitServer` was suppressing `-32600 Invalid Request` responses
   because the response gate keyed off `request.id`. Malformed payloads
   (arrays, scalars) have no id, so the error never reached the caller.
   Send error responses regardless of request-id presence; keep the
   notification rule for valid no-id requests.

4. `git worktree add` failures could leave behind a `.git/worktrees/...`
   bookkeeping entry. Run `git worktree prune` on the failure path so
   partial setup does not strand metadata.

5. `kill -9` mid-run could leak `sandbox/cc-sandbox-*.json` and
   `mcp/cc-mcp-*.json` files forever, with the MCP file containing a
   stale `CC_GIT_ROOT`. Add `pruneStaleSandboxSettings` and
   `pruneStaleReviewMcpConfigs` sweepers (same 6h threshold as the
   worktree sweeper) and call them at review start.

6. `createReviewIsolation` is moved to the same module as the worktree
   helpers so it is unit-testable. Tests cover both modes plus the new
   stdio framing behavior (child-process round-trip of `mcp-git`
   confirming the -32600 error path actually emits a response).

The third Codex finding — that `Read`/`Glob`/`Grep` plus an open
network leave a broader exfiltration surface than the MCP-only git
boundary suggests — is intentional for now (the user explicitly asked
for network access). It is documented as a known trade-off rather than
fixed in this commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Drop the inherited buildClaudeSpawnEnv loopback-proxy scrubbing that was
bundled into commit 9327619. The case it protected (a stale ad-hoc
shell export of ANTHROPIC_BASE_URL / *_PROXY pointing at a dead
loopback URL, while the user is *not* using cco) is uncommon, and the
scrubbing's asymmetric behavior — parent Codex fails against the dead
proxy, child Claude silently bypasses it — makes diagnosis harder than
the symmetric failure mode. cco users already had their entries
registered in `~/.claude/settings.json` and were unaffected by the
scrubber in either direction.

Bump version to 1.2.0 (package.json + .codex-plugin/plugin.json) and
add a CHANGELOG entry describing:
- the new default Claude model (opus-4.7[1m], xhigh effort) for review,
  adversarial-review, and rescue/task; with sonnet/haiku per-model
  effort defaults
- the review/adversarial-review isolation rework (ephemeral worktree
  + bundled read-only git MCP server + Bash-free allowlist)
- the read-only sandbox preset leaving network unrestricted
- new `--effort` flag exposed on the review skills
- stale-resource sweepers for crashed runs

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: eff2cdcaf3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread scripts/lib/mcp-git.mjs
Comment on lines +287 to +289
diff(args, root) {
const cmd = ["diff", "--no-color"];
if (args?.stat === true) cmd.push("--stat");
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Add --no-ext-diff/--no-textconv to MCP git diff/show

The new MCP server runs git diff/git show without disabling external diff drivers, so repositories that define diff.external, textconv, or custom diff drivers can execute arbitrary local programs when Claude calls these tools. In review mode this bypasses the intended “no Bash” isolation and can still run mutating/exfiltrating commands via Git config or attributes, so the read-only boundary is not actually enforced for those repos.

Useful? React with 👍 / 👎.

Comment on lines +184 to +186
if (now - stat.mtimeMs < maxAgeMs) continue;
safeRemoveWorktree(repoRoot, full);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Restrict stale worktree pruning to the matching Git repo

pruneStaleReviewWorktrees scans a shared global runtime directory and removes every old entry using the current repoRoot, so starting a review in repo A can delete stale worktree directories that belong to repo B. Because cleanup force-removes the directory even if git worktree remove fails for the wrong repo, this can destroy another repo’s review workspace and leave its .git/worktrees bookkeeping inconsistent.

Useful? React with 👍 / 👎.

@sf-jin-ku sf-jin-ku merged commit 6d1e49f into main May 12, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant