
feat: adaptive gating + cross-review dedup for review army (v0.15.2.0) #760

Merged
garrytan merged 9 commits into main from garrytan/learning-phase-2.5-clean
Apr 5, 2026

garrytan (Owner) commented Apr 1, 2026

Summary

Reviews now learn from your decisions and get smarter over time.

  • Cross-review finding dedup — skip a finding once, it stays quiet until the code changes. No more re-skipping the same intentional patterns every PR.
  • Test stub suggestions — specialists suggest skeleton tests alongside findings using the detected test framework (Jest, Vitest, RSpec, pytest, Go test). Findings with test stubs are surfaced as ASK items.
  • Adaptive specialist gating — specialists dispatched 10+ times with zero findings get auto-gated. Security and data-migration are exempt (insurance policies). Force any specialist back with --security, --performance, etc.
  • Per-specialist stats — every review records which specialists ran, findings per specialist, and skip/gate reasons. Powers adaptive gating and gives /retro richer data.
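
The gating rule described above can be sketched roughly as follows. This is a minimal illustration, not the code in scripts/resolvers/review-army.ts: the `SpecialistStats` shape, the `shouldGate` name, and the `forced` parameter are all hypothetical, and only the threshold of 10 dispatches and the two exempt specialists come from the summary.

```typescript
// Hypothetical stats shape; the real format written by the PR is not shown here.
interface SpecialistStats {
  dispatches: number; // how many times the specialist has been run
  findings: number;   // total findings it has produced across those runs
}

const GATE_THRESHOLD = 10; // "dispatched 10+ times" from the summary
const EXEMPT = new Set<string>(["security", "data-migration"]); // never auto-gated

// Returns true when a specialist should be skipped for this review.
function shouldGate(
  name: string,
  stats: SpecialistStats | undefined,
  forced: Set<string>, // specialists forced back on, e.g. via --security
): boolean {
  if (forced.has(name) || EXEMPT.has(name)) return false;
  if (!stats) return false; // no history yet, so keep running it
  return stats.dispatches >= GATE_THRESHOLD && stats.findings === 0;
}
```

Checking the force list and the exemptions before the stats means a user override always wins, whatever the recorded history says.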

Files Changed

  • review/specialists/*.md — add test_stub optional field to all specialist schemas
  • review/design-checklist.md — document test_stub field
  • scripts/resolvers/review-army.ts — test framework detection, adaptive gating logic, per-specialist stats
  • review/SKILL.md.tmpl — Step 5.0 cross-review dedup, test stub override in Step 5a, enriched review-log
  • review/SKILL.md — regenerated
  • bin/gstack-specialist-stats — new binary for specialist hit rate tracking
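
To make the test_stub change concrete, here is a rough sketch of an optional field on a finding and the override that surfaces such findings as ASK items. Only the `test_stub` field name and the ASK classification appear in this PR. The `Finding` shape, the `"AUTO_FIX"` label, and `applyTestStubOverride` are assumptions for illustration, and the real logic lives in prompt templates, not in TypeScript.

```typescript
// Hypothetical finding shape; only `test_stub` and "ASK" are named in the PR.
interface Finding {
  specialist: string;
  summary: string;
  classification: "AUTO_FIX" | "ASK"; // "AUTO_FIX" is an assumed label
  test_stub?: string; // optional skeleton test suggested by the specialist
}

// Findings that carry a non-empty test stub are forced to ASK so that the
// user approves test creation; all other findings pass through unchanged.
function applyTestStubOverride(finding: Finding): Finding {
  return finding.test_stub ? { ...finding, classification: "ASK" } : finding;
}
```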

Test Coverage

All new code paths are prompt template logic (natural language instructions to Claude) and a shell script. No app-level code paths to unit test. Existing test suite passes (all assertions green).

Pre-Landing Review

No issues found. Changes are prompt templates + infrastructure only. No SQL, auth, or trust boundary changes.

Test plan

  • All bun tests pass (0 failures)
  • SKILL.md regenerated successfully from template
  • Clean cherry-pick onto main (no conflicts)

🤖 Generated with Claude Code

garrytan and others added 5 commits April 1, 2026 14:33
All specialist prompts now document test_stub as an optional output field,
enabling specialists to suggest test code alongside findings.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds gstack-specialist-stats binary for tracking specialist hit rates.
Resolver now detects test framework for test_stub generation, applies
adaptive gating to skip silent specialists, and compiles per-specialist
stats for the review-log entry.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
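
One plausible way to detect the test framework for the five frameworks named in the summary is sketched below. This is an assumption about the approach, not the resolver's actual logic: the marker files are common conventions for each tool, and `detectTestFramework` and its parameters are hypothetical.

```typescript
type Framework = "jest" | "vitest" | "rspec" | "pytest" | "go test";

// `files` is a list of repo-relative paths; `packageDeps` is the set of
// dependency names from package.json, if any. Both are assumed inputs.
function detectTestFramework(
  files: string[],
  packageDeps: string[] = [],
): Framework | null {
  const has = (name: string): boolean => files.includes(name);
  const hasPrefix = (prefix: string): boolean =>
    files.some((f) => f.startsWith(prefix));

  // Check Vitest before Jest: projects migrating between them may list both.
  if (packageDeps.includes("vitest") || hasPrefix("vitest.config.")) return "vitest";
  if (packageDeps.includes("jest") || hasPrefix("jest.config.")) return "jest";
  if (has(".rspec") || has("spec/spec_helper.rb")) return "rspec";
  if (has("pytest.ini") || has("conftest.py")) return "pytest";
  if (has("go.mod")) return "go test";
  return null; // unknown: the caller can skip test stub generation
}
```
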
…ew-log

Step 5.0 suppresses findings previously skipped by the user when the
relevant code hasn't changed. Test stub findings force ASK classification
so users approve test creation. Review-log now includes quality_score,
per-specialist stats, and per-finding action records.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
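
The Step 5.0 suppression rule can be illustrated with a small sketch. It assumes each finding has a stable fingerprint and that the code it pointed at is stored when the user skips it. `SkippedRecord`, `Candidate`, and `suppressPreviouslySkipped` are hypothetical names, and the actual step is a prompt instruction, not TypeScript.

```typescript
// Hypothetical record written when the user skips a finding.
interface SkippedRecord {
  fingerprint: string;  // stable identity of the finding (assumed)
  codeSnapshot: string; // the relevant code at the time it was skipped
}

// Hypothetical finding produced by the current review run.
interface Candidate {
  fingerprint: string;
  code: string; // the relevant code as it is now
}

// Drops findings that were skipped before and whose code is unchanged.
// New findings, and skipped findings whose code has changed, are kept.
function suppressPreviouslySkipped(
  findings: Candidate[],
  skipped: SkippedRecord[],
): Candidate[] {
  const snapshots = new Map<string, string>(
    skipped.map((s): [string, string] => [s.fingerprint, s.codeSnapshot]),
  );
  return findings.filter((f) => snapshots.get(f.fingerprint) !== f.code);
}
```

Comparing against the stored snapshot is what lets a skipped finding resurface once the underlying code changes.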
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
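[ -f a ] || [ -f b ] && X="y" mixes || and && without grouping. In
POSIX shell the two operators have equal precedence and associate left
to right, so the list already runs as (A || B) && C, but the intent is
easy to misread. Wrap the OR group in braces to make it explicit:
{ [ -f a ] || [ -f b ]; } && X="y".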

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
github-actions bot commented Apr 1, 2026

E2E Evals: ✅ PASS

5/5 tests passed | $0.55 total cost | 12 parallel runners

Suite Result Status Cost
e2e-review 3/3 $0.51
llm-judge 2/2 $0.04

12x ubicloud-standard-2 (Docker: pre-baked toolchain + deps) | wall clock ≈ slowest suite

garrytan added 4 commits April 2, 2026 18:34
…se-2.5-clean

# Conflicts:
#	CHANGELOG.md
#	VERSION
#	package.json
…se-2.5-clean

# Conflicts:
#	CHANGELOG.md
#	VERSION
@garrytan garrytan merged commit 9ca8f1d into main Apr 5, 2026
18 checks passed
joethorngren added a commit to joethorngren/jstack that referenced this pull request Apr 6, 2026
Upstream features merged:
- GStack Browser with anti-bot stealth (garrytan#695)
- Adaptive gating + cross-review dedup for review army (garrytan#760)
- Voice-friendly skill triggers for AquaVoice (garrytan#732)
- Native OpenClaw skills + ClaHub publishing (garrytan#832)
- Declarative multi-host platform (OpenCode, Slate, Cursor, OpenClaw) (garrytan#793)
- Interactive /plan-devex-review + DX review skills (garrytan#784, garrytan#796)
- Ship re-run verification checks (garrytan#833)
- Community security wave — 8 PRs, 4 contributors (garrytan#847)
- Security wave 1 — 14 fixes for audit (garrytan#810)
- Team-friendly gstack install mode (garrytan#809)
- Anti-skip rule for all review skills (garrytan#804)
- Various bug fixes and doc updates

Conflict resolution strategy:
- Branding files (README, CLAUDE.md, CHANGELOG, VERSION, CONTRIBUTING, TODOS): kept jstack
- SKILL.md / SKILL.md.tmpl files: took upstream (skill improvements)
- Code files (browse/, scripts/, tests, setup, package.json): took upstream
- Telemetry files (bin/gstack-telemetry-sync, supabase telemetry-ingest): kept deleted (jstack privacy policy)
- New upstream files (hosts/, openclaw/, devex-review/, etc.): accepted as-is

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>