test: harden v0.8.3 plan-mode trust-boundary tests + fix nested marker tracking#899
Conversation
…r tracking Follow-up from the v0.8.3 multi-model code review (GPT 5.4, Gemini 3.1 Pro, Kimi K2.5, MiniMax M2.7, Claude). All non-blocking; the release is correct. - processor.ts: remove the nested `// altimate_change` block around the reworded plan-no-tool warning. `script/upstream/analyze.ts` `findMarkers` uses a single `openBlock` with no nesting stack, so the inner `start` clobbered the outer plan-refusal block and dropped it from marker tracking. The warning text is already inside the outer block, so removing the inner markers keeps the strict gate green (pure deletion) and restores correct block tracking. (Gemini, verified against analyze.ts:499-520.) - plan.txt: revert the start-marker description edit. plan.txt is imported raw into the LLM prompt, so the added words were model-visible; the prose itself already documents the trivial-task escape hatch. (Gemini #5.) - plan-layer-e2e.test.ts / release-v0.8.3-adversarial.test.ts: replace `session: {} as any` with a structurally valid `{ slug, time: { created } }` and add an explicit `Flag.OPENCODE_EXPERIMENTAL_PLAN_MODE === false` precondition, so a future flag flip fails loudly instead of throwing an opaque TypeError in `Session.plan`. (Consensus; Gemini MAJOR.) - plan-layer-e2e.test.ts: add an end-to-end SINK test — drive insertReminders' output through the real `system`-array hoist and `MessageV2.toModelMessages`, asserting attacker text never reaches `system` and the hoisted reminder is not duplicated into the user role. (GPT.) - release-v0.8.3-adversarial.test.ts: make the wording-fix guards concept/synonym-tolerant (keep the load-bearing negative "too thin to act on" guard) so legitimate copy improvements don't break the suite. (MiniMax.) - plan-layer-e2e.test.ts: rename the mislabeled `insertReminders return shape` test — it asserts the `InsertRemindersResult` type alias, not runtime behaviour (which is covered by the behavioral describe). (MiniMax.) Closes #898 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Claude Code Review
This repository is configured for manual code reviews. Comment @claude review to trigger a review and subscribe this PR to future pushes, or @claude review once for a one-time review.
Tip: disable this comment in your organization's Code Review settings.
|
Ready to act? Review this PR in Change Stack to turn feedback into patch suggestions you can inspect and refine. No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (4)
💤 Files with no reviewable changes (1)
📝 WalkthroughWalkthroughThe PR removes altimate_change marker infrastructure from plan-layer code and prompts, consolidates test file-reading patterns, and significantly expands trust-boundary assertions around reminder hoisting and synthetic warning behavior in both plan-layer e2e and adversarial regression test suites. ChangesPlan-layer marker cleanup and test hardening
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related issues
Possibly related PRs
Suggested labels
Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
👋 This PR was automatically closed by our quality checks. Common reasons:
If you believe this was a mistake, please open an issue explaining your intended contribution and a maintainer will help you. |
|
👋 This PR was automatically closed by our quality checks. Common reasons:
If you believe this was a mistake, please open an issue explaining your intended contribution and a maintainer will help you. |
dev-punia-altimate
left a comment
There was a problem hiding this comment.
Multi-Persona Review — Verdict: skipped
Multi-persona review completed.
0/0 agents completed · 2s · 0 findings (0 critical, 0 high, 0 medium)
Multi-Persona Review · vllm:qwen3-next-80b (waves) + vllm-fallback (synth) ·
What does this PR do?
Fast-follow hardening from the v0.8.3 multi-model code review (GPT 5.4 + Gemini 3.1 Pro + Kimi K2.5 + MiniMax M2.7 + Claude). All reviewers' verdict on the shipped v0.8.3 was ship — would not have blocked; these are the non-blocking items they surfaced.
processor.ts). The v0.8.3 marker fix wrapped the reworded plan-no-tool warning in an inner// altimate_changeblock nested inside the outer plan-refusal block.script/upstream/analyze.tsfindMarkers(lines 499–520) uses a singleopenBlockwith no nesting stack, so the innerstartclobbered the outer block and itsendwas dropped — the outer block fell out of marker tracking. It passed the strict CI gate (hunk-based) and the count balance, so it never failed the release, but it degraded upstream-bridge accuracy. The warning text is already inside the outer block, so removing the inner markers keeps the strict gate green (pure deletion) and restores correct tracking. (Gemini; verified againstanalyze.ts.)session: {} as anywith a structurally valid{ slug, time: { created } }and added an explicitFlag.OPENCODE_EXPERIMENTAL_PLAN_MODE === falseprecondition, so a future flag flip fails loudly instead of throwing an opaqueTypeErrorinSession.plan(and silently dropping the intended coverage). (Consensus of all 5 reviewers; Gemini rated MAJOR.)trustedReminderParts; this drives the result through the realsystem-array hoist andMessageV2.toModelMessages, asserting attacker text never reachessystemand the hoisted reminder is not duplicated into the user role. (GPT.)too thin to act onguard) so legitimate copy improvements don't break the suite. (MiniMax.)InsertRemindersResulttype alias, not runtime behaviour. (MiniMax.)plan.txtstart-marker description edit — plan.txt is imported raw into the LLM prompt, so the added words were model-visible; the prose already documents the escape hatch. (Gemini.)Type of change
Issue for this PR
Closes #898
How did you verify your code works?
bun test test/session/plan-layer-e2e.test.ts test/skill/release-v0.8.3-adversarial.test.ts test/session/processor.test.ts test/session/system.test.ts test/provider/family.test.ts— 192 pass, 0 fail (incl. the new end-to-end sink test and flag preconditions).bun test test/session/ test/skill/ test/provider/— 1619 pass, 0 fail.bun run typecheck— clean.bun run script/upstream/analyze.ts --markers --base origin/main --strict— All custom code properly marked (confirms the nested-marker removal keeps the gate green and the outer block is no longer clobbered).processor.tsnow has a single outer block (345→411), no inner markers.Checklist
Summary by cubic
Hardened v0.8.3 plan‑mode trust boundary and added end‑to‑end sink coverage, while fixing nested marker tracking in
processor.tsto restore accurate upstream markers. Also reverted the model‑visibleplan.txtmarker description to keep the prompt unchanged. Closes #898.Bug Fixes
// altimate_changemarkers around the plan‑no‑tool warning soscript/upstream/analyze.tscorrectly tracks the outer block.plan.txtstart‑marker description since the file is imported raw into the LLM prompt.Test Improvements
systemand reminders aren’t duplicated into the user role.session: {} as anywith a valid{ slug, time: { created } }and added aFlag.OPENCODE_EXPERIMENTAL_PLAN_MODE === falseprecondition to fail loudly on future flips.InsertRemindersResulttype alias.Written for commit d1e1a4c. Summary will update on new commits.
Summary by CodeRabbit
Tests
Chores