docs: audit and resolve SYSTEM_PROMPT.md issues #240
Merged
… craft guidelines
…nciples from Claude Fable 5
…rinciples from Claude Opus 4.7
…tion from Claude Sonnet 4.5
Remove 6 borderline-quality additions that were redundant or conflicting:
- Security directive (duplicate of RESPONSE STANDARDS)
- Explain before you act (conflicts with No commentary between tool calls)
- Never apologize (conflicts with Owning Errors)
- No flattery (redundant with tone guidance)
- Present opposing perspectives (encourages hedging)
- Recognize past-context cues (misleading for model capabilities)

Also fix directive numbering (4→11, sequential) and remove blank line artifact at EOF.
- Bump SYSTEM_PROMPT.md version from 1 to 2
- Add AGENTS.md section 5.5 requiring prompt file version increments when PRs modify files in ./prompts/ (target branch version + 1)
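The "target branch version + 1" rule could be checked mechanically. The following is a minimal sketch, not the repository's actual tooling: the function name `version_bump_ok` and its arguments are hypothetical, and it assumes the versions have already been read as integers, since the commit does not say how the version is stored in the file.

```python
def version_bump_ok(target_version: int, pr_version: int, touched_paths: list[str]) -> bool:
    """Check the 'target branch version + 1' rule for a PR (hypothetical helper).

    target_version: prompt version on the target branch (assumed integer).
    pr_version:     prompt version on the PR branch (assumed integer).
    touched_paths:  repository-relative paths modified by the PR.
    """
    # The rule only applies when the PR modifies something under ./prompts/.
    touches_prompts = any(
        path.removeprefix("./").startswith("prompts/") for path in touched_paths
    )
    if not touches_prompts:
        return True
    # Exactly one increment over the target branch is required.
    return pr_version == target_version + 1


if __name__ == "__main__":
    # This PR bumps the version from 1 to 2 while editing the prompt file.
    print(version_bump_ok(1, 2, ["prompts/SYSTEM_PROMPT.md"]))
```

Treating a PR that does not touch ./prompts/ as passing is a choice made for this sketch; the commit message only states the requirement for PRs that do modify that directory.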
Restructure sections to group related concepts and elevate safety/security:
- Moved Security to CORE DIRECTIVES (#5): it's a safety constraint, not a response standard
- Consolidated 7 execution directives (5-11) into 5 tight EXECUTION BEHAVIOR bullets
- Moved CODE CRAFT and DELIVERABLES adjacent to RESPONSE STANDARDS
- Moved TONE & STYLE after response/craft standards
- Merged MEMORY CAPTURE + MEMORY USAGE into single MEMORY section
- Moved MEMORY section to end with other tool sections
- Removed redundant [SYSTEM NOTE] (already covered by directive #1)
- Trimmed EXAMPLE INTERACTIONS from 3 to 2 examples
- Removed EOF duplication artifact
- Reduced from 200 to 169 lines
The user likes the character flair from italics and bracketed asides — it's part of the persona. Reverting the over-formatting restriction.
1. Remove tool call cap: let task dictate tool count, not arbitrary limits
2. Soften search mandate: remove 'search regardless of confidence' to prevent context starvation
3. Consolidate execution bullets: renamed 'One message, one job' to 'Atomic execution', reordered for clarity
4. Clarify show-your-work vs. be-terse boundary: explicitly state terse execution for technical work
5. Fix code example: replace meta-comment with actual placeholder text
1. Remove tool call cap: task dictates tool count, not arbitrary limits
2. Soften search mandate: remove 'search regardless of confidence' to prevent context starvation
3. Rename 'One message, one job' to 'Atomic execution' for clarity
4. Clarify show-your-work vs. be-terse boundary: terse execution for technical work
5. Fix code example: replace meta-comment with actual placeholder text
1. Remove EOF duplication artifact (2 lines)
2. Reframe 'mysterious competence' to 'quiet competence': avoids tension with uncertainty guidance
3. Soften 'most capable... imaginable' to 'highly capable': prevents overconfidence
4. Clarify chameleon triggers: 'Intensity/Focus' now specifies 'when debugging code or solving a complex issue'
5. Clean code example: replace bracketed meta-asides with simple placeholders
…s conflict with todo tool

The old name implied the state file is for general multi-step work, which overlaps with the todo tool's purpose. The state file is specifically for when the model gets cut off mid-task and needs to resume; it is not a task management strategy. The todo tool handles multi-step decomposition; the state file handles interruption recovery.
Added explicit guidance: use todo tool for multi-step decomposition, persist queue state to state file if interrupted, resume by reading state file on next turn. This eliminates ambiguity about how the two mechanisms work together.
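The division of labour described above (todo tool for decomposition, state file for interruption recovery) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the helper names, the JSON layout, and the idea that the queue is a list of task strings are not taken from SYSTEM_PROMPT.md.

```python
import json
from pathlib import Path


def save_state(state_file: Path, pending: list[str]) -> None:
    """Persist the not-yet-finished tasks so a later turn can resume them."""
    # Hypothetical on-disk format: a single JSON object with a 'pending' list.
    state_file.write_text(json.dumps({"pending": pending}))


def load_state(state_file: Path) -> list[str]:
    """Return the pending tasks from a previous turn, or [] if none were saved."""
    if not state_file.exists():
        return []
    return json.loads(state_file.read_text())["pending"]


def clear_state(state_file: Path) -> None:
    """Remove the state file once the queue has been fully processed."""
    state_file.unlink(missing_ok=True)


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        state = Path(tmp) / "state.json"  # hypothetical file name
        # An interruption happens with two tasks still queued.
        save_state(state, ["update tests", "write changelog"])
        # On the next turn, resume by reading the state file first.
        print(load_state(state))
        clear_state(state)
```

In this sketch the state file holds only what is needed to resume, and it is deleted once the work completes, which keeps it from turning into a second task tracker alongside the todo tool.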
Reset to clean base (commit 203012a) and apply all fixes from three cold audits:
1. Remove tool call cap: 'let the task dictate the tool count'
2. Soften search mandate: remove 'search regardless of confidence'
3. Rename 'One message, one job' to 'Atomic execution'
4. Clarify show-your-work vs. be-terse boundary
5. Clean code example placeholders
6. Reframe 'mysterious competence' to 'quiet competence'
7. Softer 'highly capable' instead of 'most capable imaginable'
8. Clarify chameleon triggers: 'Intensity/Focus' for debugging
9. Rename 'Multi-turn state' to 'Interruption recovery'
10. Restore italics/brackets character flair in formatting
11. Remove all duplication artifacts
12. Clean EOF

Total: 168 lines, clean structure, no corruption.
- Remove dead YAML frontmatter (token waste)
- Resolve 'extra requirements' vs 'implied sub-tasks' contradiction
- Unify verbosity under explicit analysis/execution mode-switching
- Replace vague 'Chameleon of Character' with concrete role anchors
- Fix engineering-mode example to demonstrate dropped persona
- Replace 'trust your intuition' with 5 concrete sampling triggers
- Add priority hierarchy for conflicting directives
- Consolidate 3 redundant task-execution sections into 1
Description
Comprehensive overhaul of prompts/SYSTEM_PROMPT.md. Removes redundancy, restructures for clarity, and resolves five performance defects identified in a cold audit.
Versioning
Phase 1 — New directives
Added practical guidelines across seven areas:
Phase 2 — Removal of redundant/conflicting rules
Phase 3 — Restructure for clarity and priority
Phase 4 — Performance defect fixes (cold audit)
Resolved five measurable performance defects:
Phase 5 — Structural clarifications
Result
Type of Change
Testing
Reviewed the full system prompt for consistency, section coherence, and semantic equivalence. No behavioral rules were removed — only reorganized, consolidated, and clarified.
Coverage
Checklist
npm run lint passes