microsoft · Alan-Jowett · Apr 9, 2026 · Apr 8, 2026 · Apr 9, 2026 · Apr 9, 2026
diff --git a/CATALOG.md b/CATALOG.md
diff --git a/README.md b/README.md
@@ -569,12 +569,11 @@ personas, analysis protocols, and task templates.
 | `discover-tests-for-changes` | Find relevant tests for local code changes |
 | `scaffold-test-project` | Scaffold test project with build and runner setup |
 
-**Planning** (2 templates):
+**Planning** (1 template):
 
 | Name | Description |
 |------|-------------|
-| `plan-implementation` | Implementation task breakdown with dependencies |
-| `plan-refactoring` | Safe, incremental refactoring plan |
+| `plan-implementation` | Implementation task breakdown with dependencies. Use `mode=refactoring` for safe, incremental refactoring plans. |
 
 **Agent Authoring** (1 template):
 

diff --git a/docs/getting-started.md b/docs/getting-started.md
@@ -93,7 +93,7 @@ For the full composition model and assembly internals, see the
 | Write requirements | `author-requirements-doc` | software-architect |
 | Design a system | `author-design-doc` | software-architect |
 | Plan implementation | `plan-implementation` | software-architect |
-| Plan a refactoring | `plan-refactoring` | software-architect |
+| Plan a refactoring | `plan-implementation` (`mode=refactoring`) | software-architect |
 | Create a test plan | `author-validation-plan` | software-architect |
 | Audit for security | `investigate-security` | security-auditor |
 | Set up CI/CD | `author-pipeline` | devops-engineer |

diff --git a/manifest.yaml b/manifest.yaml
@@ -1522,16 +1522,9 @@ templates:
       path: templates/plan-implementation.md
       description: >
         Decompose a project into an actionable implementation plan
-        with tasks, dependencies, and risk assessment.
-      persona: software-architect
-      protocols: [anti-hallucination, self-verification]
-      format: implementation-plan
-
-    - name: plan-refactoring
-      path: templates/plan-refactoring.md
-      description: >
-        Plan a safe, incremental refactoring with step-by-step
-        changes that maintain correctness at each step.
+        with tasks, dependencies, and risk assessment. Supports two
+        modes: "implementation" (new feature/project) and "refactoring"
+        (safe, incremental transformation of existing code).
       persona: software-architect
       protocols: [anti-hallucination, self-verification]
       format: implementation-plan

diff --git a/protocols/guardrails/adversarial-falsification.md b/protocols/guardrails/adversarial-falsification.md
@@ -9,10 +9,12 @@ description: >
   Requires the reviewer to attempt to disprove every candidate finding before
   reporting it, reject known-safe patterns, and resist premature summarization.
 applicable_to:
-  - review-code
-  - investigate-bug
-  - investigate-security
   - exhaustive-bug-hunt
+  - engineering-workflow
+  - maintenance-workflow
+  - spec-extraction-workflow
+  - audit-spec-alignment
+  - audit-implementation-alignment
 ---
 
 # Protocol: Adversarial Falsification

diff --git a/protocols/guardrails/anti-hallucination.md b/protocols/guardrails/anti-hallucination.md
@@ -23,8 +23,8 @@ fabrication and enforce intellectual honesty.
 Every claim in your output MUST be categorized as one of:
 
 - **KNOWN**: Directly stated in or derivable from the provided context.
-- **INFERRED**: A reasonable conclusion drawn from the context, with the
-  reasoning chain made explicit.
+- **INFERRED**: A conclusion derived through a stated chain of logical steps
+  from the context, with the reasoning chain made explicit.
 - **ASSUMED**: Not established by context. The assumption MUST be flagged
   with `[ASSUMPTION]` and a justification for why it is reasonable.
 
@@ -45,8 +45,9 @@ additional context instead of proceeding.
 
 - When multiple interpretations of a requirement or behavior are possible,
   enumerate them explicitly rather than choosing one silently.
-- When confidence in a conclusion is low, state: "Low confidence — this conclusion
-  depends on [specific assumption]. Verify by [specific action]."
+- When a conclusion depends on 2 or more ASSUMED premises (per Rule 1), flag it
+  explicitly: "Low confidence — this conclusion depends on [N] assumptions:
+  [list each]. Verify by [specific action]."
 
 ### 4. Source Attribution
 

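Reviewer note on the revised uncertainty-disclosure rule in `anti-hallucination.md`: the new trigger is mechanical (two or more ASSUMED premises), so it can be checked rather than judged. The sketch below is only an illustration of that threshold. The `Premise` type and both function names are hypothetical and do not exist in the repository.

```python
from dataclasses import dataclass
from typing import List

# Epistemic categories named by Rule 1 of the protocol.
LABELS = {"KNOWN", "INFERRED", "ASSUMED"}


@dataclass(frozen=True)
class Premise:
    """Hypothetical record for one premise behind a conclusion."""
    text: str
    label: str  # expected to be one of LABELS


def assumed_premises(premises: List[Premise]) -> List[str]:
    """Return the text of every premise labeled ASSUMED."""
    for p in premises:
        if p.label not in LABELS:
            raise ValueError(f"unknown epistemic label: {p.label!r}")
    return [p.text for p in premises if p.label == "ASSUMED"]


def needs_low_confidence_flag(premises: List[Premise]) -> bool:
    """True when the conclusion rests on two or more ASSUMED premises."""
    return len(assumed_premises(premises)) >= 2
```

When the check returns `True`, the count and the list from `assumed_premises` supply the `[N]` and `[list each]` fields of the disclosure sentence quoted in the rule. Note that a conclusion resting on a single assumption no longer triggers the flag under the new wording.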
diff --git a/protocols/guardrails/definition-of-done.md b/protocols/guardrails/definition-of-done.md
@@ -10,6 +10,10 @@ description: >
   by requiring verification of functionality, tests, diagnostics, build
   health, regression safety, and plan alignment.
 applicable_to: []
+# User-composed protocol — not auto-included by any template.
+# Intended for: implementation planning, engineering workflows,
+# and any task where explicit completion criteria prevent premature
+# "done" declarations.
 ---
 
 # Protocol: Definition of Done

diff --git a/protocols/guardrails/input-clarity-gate.md b/protocols/guardrails/input-clarity-gate.md
@@ -11,6 +11,10 @@ description: >
   natural language input and generates targeted clarifying questions
   instead of findings.
 applicable_to: []
+# User-composed protocol — not auto-included by any template.
+# Intended for: interactive templates and workflows where
+# user-provided natural language input must be validated for
+# clarity before task execution begins.
 ---
 
 # Protocol: Input Clarity Gate

diff --git a/protocols/guardrails/self-verification.md b/protocols/guardrails/self-verification.md
@@ -39,17 +39,13 @@ presenting it as final. Treat it as a pre-submission checklist.
 
 ### 2. Citation Audit
 
-Every factual claim must use the epistemic categories defined in the
-`anti-hallucination` protocol (KNOWN / INFERRED / ASSUMED).
-
-- Every factual claim in the output MUST be traceable to:
-  - A specific location in the provided code or context, OR
-  - An explicit `[ASSUMPTION]` or `[INFERRED]` label.
-- Scan the output for claims that lack citations. For each:
-  - Add the citation if the source is identifiable.
-  - Label as `[ASSUMPTION]` if not grounded in provided context.
-  - Remove the claim if it cannot be supported or labeled.
-- **Zero uncited factual claims** is the target.
+Apply the epistemic labeling rules from the `anti-hallucination` protocol
+(Rules 1–4: KNOWN/INFERRED/ASSUMED classification, refusal to fabricate,
+uncertainty disclosure, source attribution). Scan the output for factual
+claims that lack epistemic labels or source citations, and remediate each:
+add the appropriate epistemic label (`[KNOWN]`, `[INFERRED]`, or
+`[ASSUMPTION]`), add the citation, or remove the claim. **Zero uncited factual
+claims** is the target.
 
 ### 3. Coverage Confirmation
 
@@ -59,11 +55,9 @@ Every factual claim must use the epistemic categories defined in the
     but not covered in the output?
   - If any areas were intentionally excluded, document why in a
     "Limitations" or "Coverage" section.
-- State explicitly:
-  - "**Examined**: [what was analyzed — directories, files, patterns]."
-  - "**Method**: [how items were found — search queries, commands, scripts]."
-  - "**Excluded**: [what was intentionally not examined, and why]."
-  - "**Limitations**: [what could not be examined due to access, time, or context]."
+- Include the four-field coverage statement defined in the
+  `operational-constraints` protocol (Rule 9: Examined, Method,
+  Excluded, Limitations).
 
 ### 4. Internal Consistency Check
 
@@ -95,7 +89,9 @@ other directive text intended for LLM consumption, scan for language
 that introduces non-deterministic interpretation:
 
 - [ ] Are all instructions specific enough that two different LLMs
-      would produce structurally similar output?
+      would produce output with the same section headings, the same
+      number of items per section (±20%), and the same classification
+      labels?
 - [ ] Are quantifiers concrete (specific counts or ranges, not
       "some" or "several")?
 - [ ] Are evaluation criteria observable (not subjective adjectives

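Reviewer note on the new determinism checklist item in `self-verification.md`: the three criteria (same section headings, item counts within ±20%, same classification labels) are concrete enough to compare two outputs mechanically. The sketch below is one possible reading, not part of the protocol. It assumes each output has already been parsed into a mapping from section heading to the list of classification labels of its items, that heading order does not matter, and that the ±20% tolerance is measured against the larger of the two counts. The diff does not specify any of these, and the function name is hypothetical.

```python
from typing import Dict, List

# Hypothetical parsed form: section heading -> one label per item in that section.
Outline = Dict[str, List[str]]


def structurally_similar(a: Outline, b: Outline, tolerance: float = 0.20) -> bool:
    """Compare two parsed outputs against the three checklist criteria."""
    # 1. Same section headings (order ignored, by assumption).
    if set(a) != set(b):
        return False

    # 2. Item count per section within the tolerance of the larger count.
    for heading in a:
        count_a, count_b = len(a[heading]), len(b[heading])
        if abs(count_a - count_b) > tolerance * max(count_a, count_b):
            return False

    # 3. Same set of classification labels used across the whole output.
    labels_a = {label for items in a.values() for label in items}
    labels_b = {label for items in b.values() for label in items}
    return labels_a == labels_b
```

For example, a "Findings" section with five items in one output and four in the other passes (a difference of one against an allowance of one), while five against three does not.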
diff --git a/protocols/guardrails/tool-reliability-defense.md b/protocols/guardrails/tool-reliability-defense.md
@@ -10,6 +10,10 @@ description: >
   confirmation. Addresses known failure modes in AI coding tools
   including edit corruption, rendering artifacts, and encoding errors.
 applicable_to: []
+# User-composed protocol — not auto-included by any template.
+# Intended for: agentic workflows and agent instruction authoring
+# where tool outputs (file edits, shell commands, search results)
+# must be independently verified before proceeding.
 ---
 
 # Protocol: Tool Reliability Defense

diff --git a/protocols/reasoning/fixed-point-verification.md b/protocols/reasoning/fixed-point-verification.md
@@ -10,6 +10,10 @@ description: >
   outputs differ, the transformation does not reach a fixed point
   and is not idempotent or round-trip stable.
 applicable_to: []
+# User-composed protocol — not auto-included by any template.
+# Intended for: compiler, formatter, serializer, migrator, or
+# linter auto-fix tasks where idempotency or round-trip stability
+# must be verified.
 ---
 
 # Protocol: Fixed-Point Verification
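
Reviewer note on `fixed-point-verification.md`: the hunk shows only the tail of the description, so the sketch below assumes the usual form of the check, which applies a transformation to its own output and compares the two results. The function name and both toy transformations are hypothetical illustrations, not code from the repository.

```python
from typing import Callable


def reaches_fixed_point(transform: Callable[[str], str], source: str) -> bool:
    """Apply the transform twice and report whether the second pass changes anything."""
    once = transform(source)
    twice = transform(once)
    return once == twice


def strip_trailing_whitespace(text: str) -> str:
    """Toy formatter: remove trailing whitespace from every line (idempotent)."""
    return "\n".join(line.rstrip() for line in text.splitlines())


def append_newline(text: str) -> str:
    """Toy formatter with a bug: adds a newline on every pass (not idempotent)."""
    return text + "\n"
```

A call such as `reaches_fixed_point(strip_trailing_whitespace, "a  \nb \n")` holds, while the same call with `append_newline` fails because each pass produces a longer output. That is the kind of defect the new comment suggests this protocol is intended to catch in formatters and auto-fixers.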