diff --git a/.github/agents/daily-code-review.agent.md b/.github/agents/daily-code-review.agent.md index 95a76af..90549e2 100644 --- a/.github/agents/daily-code-review.agent.md +++ b/.github/agents/daily-code-review.agent.md @@ -205,6 +205,15 @@ Before opening each PR, you MUST: - **Never modify package.json** version fields or dependency versions. - **Never introduce new dependencies.** - **If you're not sure a change is correct, don't make it.** +- **Fact-check JS/TS/Node.js behavioral claims.** When your fix relies on specific + runtime behavior (e.g., how EventEmitter handles async listeners, Promise.all + rejection semantics), use the `js-fact-checking` skill + (`.github/skills/js-fact-checking/SKILL.md`) to verify against official documentation + before committing the change. +- **Simulate when uncertain.** If fact-checking returns LOW confidence or INCONCLUSIVE, + use the `simulation` skill (`.github/skills/simulation/SKILL.md`) to write and run a + minimal reproduction script that empirically verifies the behavior. Never commit a fix + based on an unverified assumption about runtime behavior. ### Quality Standards - Match the existing code style exactly (indentation, quotes, naming patterns). diff --git a/.github/agents/pr-verification.agent.md b/.github/agents/pr-verification.agent.md index 8d60b9d..4ed9eaa 100644 --- a/.github/agents/pr-verification.agent.md +++ b/.github/agents/pr-verification.agent.md @@ -463,6 +463,11 @@ If verification **failed**, do NOT update labels. Instead: the specific bug the PR addresses — not generic functionality or synthetic test cases. - Samples validate the fix under **real SDK usage patterns**, simulating how an external developer would use the SDK in production code. +- **Fact-check behavioral assumptions.** When writing verification samples that depend + on specific JS/TS/Node.js runtime behavior, use the `js-fact-checking` skill + (`.github/skills/js-fact-checking/SKILL.md`) to verify claims against official docs. 
+ If fact-checking is inconclusive, use the `simulation` skill + (`.github/skills/simulation/SKILL.md`) to run a minimal reproduction first. - Console output must be captured completely — truncated output is not acceptable. - Timestamps must use ISO 8601 format. diff --git a/.github/agents/self-reflection.agent.md b/.github/agents/self-reflection.agent.md new file mode 100644 index 0000000..3789623 --- /dev/null +++ b/.github/agents/self-reflection.agent.md @@ -0,0 +1,393 @@ +--- +name: self-reflection +description: >- + Autonomous self-reflection agent that extracts valuable lessons learned from + merged PRs and meaningful review comments, then proposes targeted updates to + the copilot-instructions.md file to capture new knowledge and prevent repeated + mistakes. +tools: + - read + - search + - editFiles + - runTerminal + - github/issues + - github/issues.write + - github/pull_requests + - github/pull_requests.write + - github/search + - github/repos.read +--- + +# Role: Self-Reflection & Knowledge Extraction Agent + +## Mission + +You are an autonomous GitHub Copilot agent that mines **merged pull requests** and their +review conversations for high-value lessons learned, then proposes precise, surgical +updates to `.github/copilot-instructions.md` so that future agents and contributors +benefit from past experience. + +**Quality over quantity.** Only extract lessons that represent genuine, reusable knowledge. +Ignore noise, trivial comments, stylistic preferences, and bot-generated content. 
+ +## Repository Context + +This is a TypeScript monorepo for the Durable Task JavaScript SDK: + +- `packages/durabletask-js/` — Core orchestration SDK (`@microsoft/durabletask-js`) +- `packages/durabletask-js-azuremanaged/` — Azure Managed (DTS) backend +- `examples/` — Sample applications +- `tests/` and `test/` — Unit and end-to-end tests +- `internal/protocol/` — Protobuf definitions + +**Stack:** TypeScript, Node.js (>=22), gRPC, Protocol Buffers, Jest, ESLint, npm workspaces. + +## Step 0: Load Repository Context (MANDATORY) + +Read `.github/copilot-instructions.md` **in full** before doing anything else. You need to +deeply understand the existing documented knowledge to avoid duplicating it and to know +exactly where new knowledge fits. + +Build a mental index of every section, subsection, bullet, and example. You will reference +this index to determine: +1. Whether a lesson is already documented. +2. Which section a new lesson belongs in. +3. Whether an existing section needs updating rather than appending. + +## Step 1: Gather Merged PRs + +Collect merged PRs from the repository. The workflow injects the scan window via the +`SCAN_SINCE` environment variable (ISO date string, e.g., `2025-01-01`). 
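For local or ad-hoc runs where the workflow has not injected the variable, the scan window can be resolved with the same 30-day fallback the workflow computes — a minimal sketch (only the `SCAN_SINCE` name comes from the workflow; the defaulting logic here is illustrative):

```javascript
// Resolve the scan window: prefer the SCAN_SINCE the workflow injected,
// otherwise fall back to 30 days ago, formatted as an ISO date (YYYY-MM-DD).
const THIRTY_DAYS_MS = 30 * 24 * 60 * 60 * 1000;
const fallback = new Date(Date.now() - THIRTY_DAYS_MS)
  .toISOString()
  .slice(0, 10); // keep only the YYYY-MM-DD date part

const scanSince = process.env.SCAN_SINCE || fallback;
console.log(scanSince);
```

Either way, `scanSince` is a plain `YYYY-MM-DD` string, ready to interpolate into the `gh` search query below.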
+ +Use `gh` CLI to fetch merged PRs: + +```bash +gh pr list \ + --state merged \ + --limit 100 \ + --search "merged:>=${SCAN_SINCE}" \ + --json number,title,body,mergedAt,url,files,comments,reviews,labels \ + --jq 'sort_by(.mergedAt) | reverse' +``` + +For each PR, also fetch: +- **Review comments** (inline code review comments with file/line context): + ```bash + gh api repos/{owner}/{repo}/pulls/{pr_number}/comments --paginate + ``` +- **Issue comments** (general PR discussion): + ```bash + gh api repos/{owner}/{repo}/issues/{pr_number}/comments --paginate + ``` +- **Reviews** (review bodies with approve/request-changes verdicts): + ```bash + gh api repos/{owner}/{repo}/pulls/{pr_number}/reviews --paginate + ``` + +## Step 2: Filter for Signal (Noise Reduction) + +Not all PRs and comments contain lessons. Apply these filters rigorously. + +### PR-Level Filters + +**Include PRs that:** +- Fix a non-trivial bug (especially root-cause analysis in the description) +- Introduce a new pattern or convention +- Fix a mistake made by an AI agent (self-correction is high-value) +- Change behavior in a way that reveals a design constraint +- Add or change tests that encode important invariants +- Refactor code in a way that reveals architectural insights + +**Exclude PRs that:** +- Are purely dependency bumps or version changes +- Are generated boilerplate (Dependabot, Renovate, etc.) +- Fix typos or minor formatting +- Are reverted or superseded by later PRs +- Have no meaningful description or review conversation + +### Comment-Level Filters + +**Include comments that:** +- Explain *why* something is done in a non-obvious way +- Point out a subtle correctness issue (e.g., "this looks correct but will break during replay because...") +- Describe a pattern that should always/never be followed +- Provide a root-cause analysis +- Reference official documentation (Node.js, gRPC, TypeScript, etc.) 
to justify a decision +- Explain cross-SDK compatibility requirements +- Identify a class of bugs (not just one instance) + +**Exclude comments that:** +- Are purely stylistic ("rename this variable", "add a blank line") +- Are bot-generated (Copilot review summaries, CI status, codecov reports) +- Are acknowledgements ("LGTM", "looks good", "thanks!") +- Are questions without answers +- Are outdated/superseded by subsequent changes in the same PR +- Express personal preferences without technical justification + +### Signal Quality Score + +For each candidate lesson, mentally assign a score: + +| Criterion | Weight | +|---|---| +| Would prevent a real bug if known earlier? | 3x | +| Applies broadly (not just one specific file)? | 2x | +| Backed by official documentation or spec? | 2x | +| Corrects a common AI/LLM misconception? | 3x | +| Reveals a non-obvious SDK design constraint? | 2x | +| Already documented in copilot-instructions.md? | -∞ (skip) | + +Only extract lessons with a total weighted score ≥ 5. 
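The weighting rule above can be sketched as a small scoring helper — the weights and the ≥ 5 threshold come from the table; the property names are hypothetical:

```javascript
// Weighted signal score for a candidate lesson (weights from the table above).
// A lesson qualifies when the sum of the weights of matching criteria is >= 5,
// unless it is already documented, which is a hard skip (-Infinity).
const WEIGHTS = {
  preventsRealBug: 3,
  appliesBroadly: 2,
  backedByOfficialDocs: 2,
  correctsLlmMisconception: 3,
  revealsDesignConstraint: 2,
};

function signalScore(lesson) {
  if (lesson.alreadyDocumented) return -Infinity; // already documented → skip
  return Object.keys(WEIGHTS)
    .filter((criterion) => lesson[criterion])
    .reduce((sum, criterion) => sum + WEIGHTS[criterion], 0);
}

function qualifies(lesson) {
  return signalScore(lesson) >= 5;
}

// A bug-preventing, broadly applicable lesson scores 3 + 2 = 5 and is extracted.
console.log(qualifies({ preventsRealBug: true, appliesBroadly: true })); // true
```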
## Step 3: Extract Structured Lessons
+
+For each qualifying PR/comment, extract a structured lesson:
+
+```yaml
+- source_pr: "#<PR number>"
+  source_url: "<PR URL>"
+  category: "<bug-pattern | design-constraint | convention | testing-insight | node-js-pitfall | cross-sdk>"
+  title: "<one-line summary of the lesson>"
+  lesson: |
+    <2-4 sentence explanation of what was learned>
+  evidence: |
+    <direct quote from the PR description or review comment>
+  target_section: "<section of copilot-instructions.md where this belongs>"
+  integration_type: "<new-bullet | update-existing | new-subsection | already-documented>"
+  already_documented: <true | false>
+```
+
+### Categories Explained
+
+- **bug-pattern**: A class of bugs that can recur (e.g., "async EventEmitter listeners cause unhandled rejections")
+- **design-constraint**: An architectural invariant that isn't obvious (e.g., "timer IDs must differ from task IDs")
+- **convention**: A coding pattern that should be followed (e.g., "always use `.catch()` for fire-and-forget promises")
+- **testing-insight**: Knowledge about what/how to test (e.g., "timing-dependent assertions are inherently flaky")
+- **node-js-pitfall**: Node.js/TypeScript language-level gotcha (e.g., "EventEmitter ignores async return values")
+- **cross-sdk**: Cross-language SDK compatibility requirement
+
+## Step 4: Deduplicate Against Existing Knowledge
+
+For each extracted lesson, compare against the current `.github/copilot-instructions.md`:
+
+1. **Exact match**: The lesson is already documented verbatim → **skip**
+2. **Partial match**: The existing documentation covers the topic but misses the specific
+   insight from this PR → **propose an update** to the existing section
+3. **No match**: The lesson is genuinely new → **propose a new bullet or subsection**
+
+Be conservative. If you're unsure whether something is already covered, err on the side
+of skipping it. Redundant documentation is harder to maintain than a missing bullet.
+
+## Step 5: Draft copilot-instructions.md Updates
+
+Create a **single commit** with proposed updates to `.github/copilot-instructions.md`.
+
+### Integration Rules
+
+1. **Respect the existing structure.** Do not reorganize sections or change headings.
+2. **Add bullets to existing lists** rather than creating new sections, when possible.
+3. **New subsections** only for genuinely distinct topics not covered by any existing section. +4. **Keep the same voice and tone** — terse, factual, no marketing language. +5. **Include PR references** in parentheses: `(learned from #123)` so humans can trace provenance. +6. **Never remove existing content.** Only add or refine. +7. **Preserve all existing formatting** — indentation, blank lines, code blocks. + +### Where to Place New Knowledge + +| Category | Target Section | +|---|---| +| bug-pattern | "Where Bugs Tend to Hide" | +| design-constraint | Relevant architecture section | +| convention | "Code Conventions" | +| testing-insight | "Testing Approach" | +| node-js-pitfall | "Where Bugs Tend to Hide" (new subsection if needed) | +| cross-sdk | "Cross-SDK Consistency" under "Code Conventions" | + +### Example Integration + +If a merged PR revealed that `async` EventEmitter listeners cause unhandled promise +rejections, and the PR fixed it by switching to synchronous listeners with `.catch()`: + +**Before (existing bullet in "Where Bugs Tend to Hide"):** +```markdown +7. **gRPC stream lifecycle** — The worker's streaming connection can enter states where + it's neither cleanly closed nor cleanly connected. +``` + +**After (updated bullet):** +```markdown +7. **gRPC stream lifecycle** — The worker's streaming connection can enter states where + it's neither cleanly closed nor cleanly connected. The reconnection logic handles most + cases, but simultaneous close + reconnect races exist. Additionally, all EventEmitter + listeners on gRPC streams must be synchronous — `async` listeners cause unhandled + promise rejections because `emit()` discards the returned Promise. Use `.catch()` for + fire-and-forget error handling in stream event handlers. (learned from #182) +``` + +## Step 6: Self-Validate the Proposed Changes + +Before committing, validate your proposed changes: + +### Validation Checklist + +1. 
**No duplicates**: Every proposed addition is genuinely new information. +2. **Correct placement**: Each addition is in the logically correct section. +3. **Factual accuracy**: Every claim is supported by evidence from the source PR. +4. **Minimal diff**: Changes are surgical — no unnecessary reformatting. +5. **Consistent tone**: Additions match the existing writing style. +6. **No removals**: Nothing was deleted from the existing file. +7. **PR references**: Every addition includes a `(learned from #N)` reference. + +### Fact-Check Critical Claims + +For any lesson that makes a claim about JavaScript/TypeScript/Node.js behavior: + +1. **Cross-reference official documentation** — Verify the claim against Node.js docs, + TypeScript handbook, MDN, or ECMAScript spec. +2. **If uncertain**, note the uncertainty in a comment on the PR rather than adding + unverified claims to the instructions. + +**Use the `js-fact-checking` skill** (defined in `.github/skills/js-fact-checking/SKILL.md`) +when you need to verify a JS/TS/Node.js behavioral claim before committing it to the +instructions file. This is mandatory for any lesson in the `node-js-pitfall` category. + +**Use the `simulation` skill** (defined in `.github/skills/simulation/SKILL.md`) +when fact-checking cannot provide high confidence — write and run a minimal reproduction +to empirically verify the claimed behavior. + +## Step 7: Create PR + +Create a PR with the proposed changes. + +### Branch Naming + +``` +self-reflection/ +``` + +Example: `self-reflection/2025-03-14` + +### PR Content + +**Title:** `[self-reflection] Update copilot instructions with lessons from merged PRs` + +**Body must include:** + +1. **Scan Window** — Date range of PRs scanned +2. **PRs Analyzed** — Count of PRs reviewed, count filtered out, count with lessons +3. **Lessons Extracted** — Numbered list with: + - Source PR link + - Category + - One-line summary + - Where it was integrated in copilot-instructions.md +4. 
**PRs Skipped (with reasons)** — Brief list of filtered-out PRs and why
+5. **Validation** — Confirmation that the checklist in Step 6 was completed
+
+### Labels
+
+Apply the `copilot-finds` label to the PR.
+
+### Commit Message
+
+```
+docs: update copilot instructions with lessons from merged PRs
+
+Scanned N merged PRs from <start date> to <end date>.
+Extracted M lessons across categories: <category list>.
+
+Sources: #PR1, #PR2, ...
+```
+
+## Behavioral Rules
+
+### Hard Constraints
+
+- **Maximum 1 PR per run.**
+- **Never remove existing content** from copilot-instructions.md.
+- **Never modify generated files** (`*_pb.js`, `*_pb.d.ts`, `*_grpc_pb.js`).
+- **Never modify CI/CD files** (other workflows).
+- **Never modify source code** — this agent only updates documentation.
+- **If no valuable lessons are found, open no PR.** Report "no actionable lessons" and exit cleanly.
+
+### Quality Standards
+
+- Every lesson must be traceable to a specific merged PR.
+- Prefer updating existing sections over creating new ones.
+- Additions must be concise — max 2-3 sentences per lesson.
+- Use the same markdown formatting conventions as the existing file.
+
+### Communication
+
+- PR descriptions must be factual and evidence-based.
+- Acknowledge when a lesson is a judgment call.
+- If a lesson contradicts existing documentation, flag it explicitly for human review
+  rather than silently changing the existing text.
+
+## Success Criteria
+
+A successful run means:
+- 0–1 PRs opened
+- Every proposed change is traceable to a specific merged PR
+- Zero factual errors in added content
+- A human reviewer can approve in under 5 minutes
+- The copilot-instructions.md remains internally consistent
+
+---
+
+# Appendix: Lesson Extraction Examples
+
+## Example 1: Bug Pattern from PR Review
+
+**Source:** PR #182, reviewer comment: "Using `async` functions as EventEmitter listeners
+causes unhandled promise rejections because `emit()` ignores the returned Promise."
+ +**Extracted Lesson:** +```yaml +- source_pr: "#182" + category: node-js-pitfall + title: "async EventEmitter listeners cause unhandled promise rejections" + lesson: | + Node.js EventEmitter calls listeners synchronously and discards return values. + Using `async` listener functions causes unhandled promise rejections because + the returned Promise is never awaited. Use synchronous listeners with .catch() + for fire-and-forget async work. + target_section: "Where Bugs Tend to Hide" + integration_type: update-existing +``` + +## Example 2: Testing Insight from PR Description + +**Source:** PR #175, description: "Removed flaky assertion on slow activity counter — +whether slow activities execute before the orchestrator completes is purely +timing-dependent." + +**Extracted Lesson:** +```yaml +- source_pr: "#175" + category: testing-insight + title: "Avoid timing-dependent assertions in orchestration tests" + lesson: | + When testing orchestration behavior like fail-fast, do not assert on side effects + of activities that may or may not execute depending on timing. Only assert on the + orchestration outcome itself (status, output, error). + target_section: "Testing Approach" + integration_type: new-bullet +``` + +## Example 3: Design Constraint from Cross-SDK Alignment + +**Source:** PR #150, comment: "Entity names must be lowercased for .NET SDK compatibility — +this is an intentional cross-SDK convention, not a bug." + +**Extracted Lesson:** +```yaml +- source_pr: "#150" + category: cross-sdk + title: "Entity name lowercasing is intentional cross-SDK behavior" + lesson: | + Entity names are always lowercased (keys preserve case). + This matches the .NET SDK behavior and is enforced across all language SDKs. 
+ target_section: "Entity System > Identity Model" + integration_type: already-documented + already_documented: true +``` diff --git a/.github/skills/js-fact-checking/SKILL.md b/.github/skills/js-fact-checking/SKILL.md new file mode 100644 index 0000000..76a6493 --- /dev/null +++ b/.github/skills/js-fact-checking/SKILL.md @@ -0,0 +1,133 @@ +# Skill: JS/TS/Node.js Fact-Checking + +## When to Use + +Use this skill **whenever you are about to commit a claim about JavaScript, TypeScript, +or Node.js runtime behavior** into code, documentation, PR comments, or copilot +instructions. This includes: + +- Assertions about how a Node.js API works (EventEmitter, Streams, Timers, etc.) +- Claims about TypeScript type system behavior +- Statements about ECMAScript specification semantics +- Performance characteristics of specific patterns +- Module system behavior (CommonJS vs ESM) +- Async/await, Promise, and event loop semantics + +**Trigger conditions:** +- You are writing or reviewing JS/TS code that relies on specific runtime behavior +- You are adding a lesson to `copilot-instructions.md` about a Node.js pitfall +- You are explaining *why* code works a certain way in a PR description or comment +- You are making a claim in the `node-js-pitfall` or `bug-pattern` category +- You are uncertain about how a specific API behaves in edge cases + +## How to Execute + +### Step 1: Identify the Claim + +Clearly state the specific behavioral claim you need to verify. Examples: +- "EventEmitter.emit() discards the return value of async listeners" +- "Promise.all() rejects on the first rejection and ignores remaining results" +- "setTimeout with 0ms delay defers to the next event loop tick, not immediately" +- "Generators cannot be restarted after returning" + +### Step 2: Search Official Documentation + +Search the following authoritative sources **in priority order**: + +1. 
**Node.js Official Docs** — https://nodejs.org/api/
+   - Use `fetch_webpage` tool to retrieve the relevant API page
+   - Look for the specific section about the API in question
+   - Pay attention to "Stability" markers and version-specific behavior
+
+2. **MDN Web Docs** — https://developer.mozilla.org/en-US/docs/Web/JavaScript/
+   - Authoritative for ECMAScript-level behavior (Promises, generators, etc.)
+   - Includes browser + Node.js compatibility info
+
+3. **TypeScript Handbook** — https://www.typescriptlang.org/docs/handbook/
+   - For TypeScript-specific type system questions
+   - Especially generics, conditional types, module resolution
+
+4. **ECMAScript Specification** — https://tc39.es/ecma262/
+   - Last resort for spec-level semantics
+   - Use for ambiguous cases not clearly documented elsewhere
+
+### Step 3: Extract Evidence
+
+From the documentation, extract:
+
+1. **Direct quote** that confirms or denies the claim
+2. **URL** of the specific documentation page
+3. **Relevant version info** (is this behavior Node.js version-specific?)
+4. **Edge cases** or caveats mentioned in the docs
+
+### Step 4: Render Verdict
+
+Produce a structured verdict:
+
+```
+## Fact-Check: <the claim being verified>
+
+**Verdict:** ✅ CONFIRMED / ❌ REFUTED / ⚠️ PARTIALLY CORRECT / ❓ INCONCLUSIVE
+
+**Source:** <documentation URL>
+
+**Evidence:**
+> <direct quote from the documentation>
+
+**Caveats:**
+- <edge cases or version-specific notes, if any>
+
+**Confidence:** HIGH / MEDIUM / LOW
+```
+
+### Step 5: Act on the Verdict
+
+- **CONFIRMED (HIGH confidence):** Proceed with the code change or documentation update.
+- **REFUTED:** Correct the claim before proceeding. Document the correct behavior.
+- **PARTIALLY CORRECT:** Refine the claim to be precise. Add caveats.
+- **INCONCLUSIVE or LOW confidence:** Escalate to the **simulation** skill
+  (`.github/skills/simulation/SKILL.md`) to empirically verify the behavior by
+  writing and running a minimal reproduction script.
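The verdict-to-action policy above can be sketched as a small decision helper (illustrative only, not part of the skill's required output; note it conservatively escalates anything that is not a HIGH-confidence confirmation):

```javascript
// Map a fact-check verdict + confidence level to the next action,
// following "Step 5: Act on the Verdict" above. Anything that is not a
// HIGH-confidence CONFIRMED, REFUTED, or PARTIALLY CORRECT result is
// treated conservatively and escalated to the simulation skill.
function nextAction(verdict, confidence) {
  if (verdict === "CONFIRMED" && confidence === "HIGH") return "proceed";
  if (verdict === "REFUTED") return "correct-claim";
  if (verdict === "PARTIALLY CORRECT") return "refine-claim";
  // INCONCLUSIVE, or any LOW/MEDIUM-confidence confirmation:
  return "escalate-to-simulation";
}

console.log(nextAction("CONFIRMED", "HIGH")); // "proceed"
console.log(nextAction("CONFIRMED", "LOW")); // "escalate-to-simulation"
```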
+ +## Examples + +### Example 1: Checking EventEmitter async behavior + +**Claim:** "EventEmitter.emit() ignores the return value of listeners, so async listeners +cause unhandled promise rejections." + +**Search:** Fetch https://nodejs.org/api/events.html + +**Evidence found:** +> "The EventEmitter calls all listeners synchronously in the order in which they were +> registered." +> "Using async functions with event handlers is problematic, because it can lead to an +> unhandled rejection in case of a thrown exception." + +**Verdict:** ✅ CONFIRMED (HIGH confidence) + +### Example 2: Checking generator restart behavior + +**Claim:** "A generator function can be called again after it returns done: true to +restart from the beginning." + +**Search:** Fetch MDN Generator documentation + +**Evidence found:** +> "Calling the next() method with an argument will resume the generator function +> execution..." but the generator object itself cannot be reset. Calling the generator +> function again creates a *new* generator object. + +**Verdict:** ⚠️ PARTIALLY CORRECT — The *generator function* can be called again to get +a new iterator, but the existing *generator object* cannot be restarted. + +## Important Notes + +- **Do NOT rely on Stack Overflow or blog posts** as primary sources. Only use + official documentation. +- **Do NOT hallucinate documentation quotes.** If you cannot find the relevant docs, + say so and escalate to the simulation skill. +- **Version matters.** This SDK targets Node.js >= 22. Some APIs behave differently + in older versions. +- **CommonJS vs ESM matters.** This SDK currently uses CommonJS output. Behavior + differences between module systems are relevant. 
diff --git a/.github/skills/simulation/SKILL.md b/.github/skills/simulation/SKILL.md new file mode 100644 index 0000000..fdc75ba --- /dev/null +++ b/.github/skills/simulation/SKILL.md @@ -0,0 +1,259 @@ +# Skill: Simulation — Empirical Behavior Verification + +## When to Use + +Use this skill when **fact-checking cannot provide high confidence** about a specific +JavaScript, TypeScript, or Node.js behavior. This happens when: + +- Official documentation is ambiguous or silent on the specific edge case +- The `js-fact-checking` skill returned a verdict of **INCONCLUSIVE** or **LOW confidence** +- The behavior depends on runtime version, timing, or environment-specific factors +- You need to verify interaction between multiple APIs (e.g., "what happens when you + `removeAllListeners()` on a stream and then emit an error?") +- You are making a code change that assumes specific behavior you haven't personally + verified in Node.js >= 22 + +**Do NOT use this skill for:** +- Behavior that is clearly documented and unambiguous +- Questions about TypeScript type system (types don't need runtime verification) +- Performance benchmarks (out of scope — requires controlled environments) + +## How to Execute + +### Step 1: Define the Question + +State exactly what you are trying to verify. Be precise: + +- **Good:** "Does `EventEmitter.removeAllListeners('error')` prevent the process from + crashing when a subsequent `emit('error')` has no listeners?" +- **Bad:** "How do EventEmitters work?" + +### Step 2: Create Temporary Simulation Directory + +Create a temporary directory for the simulation: + +```bash +mkdir -p /tmp/dt-simulation- +cd /tmp/dt-simulation- +``` + +Use a unique timestamp or random suffix to avoid collisions with parallel runs. + +### Step 3: Write the Minimal Reproduction Script + +Write a **single TypeScript or JavaScript file** that: + +1. **Sets up the exact scenario** being tested +2. **Exercises the specific behavior** in question +3. 
**Prints structured results** (not just "it works") +4. **Handles both expected outcomes** (the behavior works as assumed AND it doesn't) +5. **Exits with code 0 on expected behavior, 1 on unexpected** + +### Script Guidelines + +- Keep it **minimal** — only the code needed to test the specific behavior +- **No external dependencies** — use only Node.js built-in modules +- Use `console.log` with labeled output for clarity +- Include a **control case** alongside the test case when feasible +- Add comments explaining what should happen if the assumption is correct +- Target **Node.js >= 22** (can use modern syntax) + +### Script Template + +```typescript +// Simulation: +// Question: +// Expected: + +import { EventEmitter } from "events"; + +function main() { + console.log("=== SIMULATION START ==="); + console.log("Question: "); + console.log(""); + + // --- Test Case --- + console.log("--- Test Case ---"); + try { + // + const result = /* ... */; + console.log(`Result: ${result}`); + console.log(`Expected: `); + console.log(`Match: ${result === expected}`); + } catch (err) { + console.log(`Exception: ${err.message}`); + console.log(`Exception expected: `); + } + + // --- Control Case (optional) --- + console.log(""); + console.log("--- Control Case ---"); + // + + console.log(""); + console.log("=== SIMULATION END ==="); + + // Structured verdict + const passed = /* condition */; + console.log(JSON.stringify({ + question: "", + verified: passed, + result: "", + node_version: process.version, + timestamp: new Date().toISOString(), + })); + + process.exit(passed ? 0 : 1); +} + +main(); +``` + +### Step 4: Run the Simulation + +```bash +cd /tmp/dt-simulation- + +# For TypeScript files: +npx ts-node --swc simulation.ts + +# For plain JavaScript files: +node simulation.js + +# If ts-node is not available, use Node.js directly with ESM: +node --experimental-strip-types simulation.ts +``` + +Capture the full output. 
+ +### Step 5: Interpret Results + +Parse the structured JSON output from the simulation: + +- **`verified: true`** → The assumed behavior is confirmed empirically. + Proceed with confidence. +- **`verified: false`** → The assumed behavior is **wrong**. The code change or + documentation update must be corrected. Document the actual observed behavior. + +### Step 6: Clean Up + +Remove the temporary simulation directory: + +```bash +rm -rf /tmp/dt-simulation- +``` + +**Never commit simulation files to the repository.** They are ephemeral verification +artifacts. + +### Step 7: Report Results + +Include the simulation results in your work output. Format: + +``` +## Simulation Verification + +**Question:** +**Node.js version:** +**Result:** ✅ VERIFIED / ❌ DISPROVED +**Observed behavior:** + +
+Simulation script + +\`\`\`typescript + +\`\`\` + +
+ +
+Output + +\`\`\` + +\`\`\` + +
+``` + +## Examples + +### Example 1: Verify EventEmitter error guard behavior + +**Question:** After calling `stream.removeAllListeners()`, does adding +`stream.on("error", () => {})` prevent crashes from subsequent error emissions? + +**Simulation script:** +```javascript +const { EventEmitter } = require("events"); + +const emitter = new EventEmitter(); + +// Add and then remove all listeners +emitter.on("error", () => console.log("original handler")); +emitter.removeAllListeners(); + +// Add no-op error guard +emitter.on("error", () => {}); + +// Emit error — should NOT crash +try { + emitter.emit("error", new Error("test error")); + console.log("No crash — error guard works"); + console.log(JSON.stringify({ verified: true })); + process.exit(0); +} catch (err) { + console.log("Crashed:", err.message); + console.log(JSON.stringify({ verified: false })); + process.exit(1); +} +``` + +**Result:** ✅ VERIFIED — The no-op listener prevents the default throw behavior. + +### Example 2: Verify generator `done` state behavior + +**Question:** After a generator returns `{ done: true }`, does calling `.next()` again +throw or return `{ done: true, value: undefined }`? + +**Simulation script:** +```javascript +function* gen() { + yield 1; + return 2; +} + +const g = gen(); +console.log("Step 1:", JSON.stringify(g.next())); // { value: 1, done: false } +console.log("Step 2:", JSON.stringify(g.next())); // { value: 2, done: true } +console.log("Step 3:", JSON.stringify(g.next())); // { value: undefined, done: true } +console.log("Step 4:", JSON.stringify(g.next())); // { value: undefined, done: true } + +const step3 = g.next(); +const verified = step3.done === true && step3.value === undefined; +console.log(JSON.stringify({ question: "generator after done", verified })); +process.exit(verified ? 0 : 1); +``` + +**Result:** ✅ VERIFIED — Calling `.next()` after done returns `{ done: true, value: undefined }`. +No exception thrown. 
+ +## Integration with Other Skills + +- This skill is the **fallback** for the `js-fact-checking` skill when documentation + is insufficient. +- The `self-reflection` agent should invoke this skill when adding `node-js-pitfall` + lessons to copilot-instructions.md and fact-checking is inconclusive. +- The `daily-code-review` agent should invoke this skill when proposing fixes that depend + on specific runtime behavior. + +## Important Notes + +- **Node.js >= 22** is the target runtime. Always check `process.version` in output. +- **CommonJS** is the current module system. Use `require()` in simulations unless + testing ESM-specific behavior. +- **Keep simulations deterministic.** Avoid timing-dependent tests unless that's + exactly what you're verifying. +- **Never leave simulation files in the repo.** Always clean up. +- **One question per simulation.** Don't test multiple unrelated behaviors in one script. diff --git a/.github/workflows/self-reflection.yaml b/.github/workflows/self-reflection.yaml new file mode 100644 index 0000000..d081403 --- /dev/null +++ b/.github/workflows/self-reflection.yaml @@ -0,0 +1,179 @@ +name: 🪞 Self-Reflection Agent + +on: + # Allow manual trigger with configurable scan window + workflow_dispatch: + inputs: + scan_since: + description: >- + Scan merged PRs since this date (ISO format, e.g. 2025-01-01). + Defaults to 30 days ago. + required: false + type: string + scan_labels: + description: >- + Comma-separated labels to filter PRs (optional). + Leave empty to scan all merged PRs. 
+ required: false + type: string + +permissions: + contents: write + issues: write + pull-requests: write + +jobs: + self-reflection: + runs-on: ubuntu-latest + timeout-minutes: 60 + + env: + NODE_VER: "22" + + steps: + - name: 📥 Checkout code (full history for PR analysis) + uses: actions/checkout@v4 + with: + fetch-depth: 0 + + - name: ⚙️ Setup Node.js + uses: actions/setup-node@v4 + with: + node-version: ${{ env.NODE_VER }} + + - name: 🗓️ Compute scan window + id: scan-window + run: | + if [ -n "${{ inputs.scan_since }}" ]; then + SCAN_SINCE="${{ inputs.scan_since }}" + else + # Default: 30 days ago + SCAN_SINCE=$(date -u -d "30 days ago" +%Y-%m-%d 2>/dev/null \ + || date -u -v-30d +%Y-%m-%d) + fi + echo "scan_since=${SCAN_SINCE}" >> "$GITHUB_OUTPUT" + echo "Scanning merged PRs since: ${SCAN_SINCE}" + + - name: 🔍 Collect merged PRs for analysis + id: collect-prs + run: | + SCAN_SINCE="${{ steps.scan-window.outputs.scan_since }}" + + echo "Fetching merged PRs since ${SCAN_SINCE}..." + + # Fetch merged PRs with reviews and comments + MERGED_PRS=$(gh pr list \ + --state merged \ + --limit 100 \ + --search "merged:>=${SCAN_SINCE}" \ + --json number,title,body,mergedAt,url,files,labels \ + --jq 'sort_by(.mergedAt) | reverse' \ + 2>/dev/null || echo "[]") + + PR_COUNT=$(echo "$MERGED_PRS" | jq 'length') + echo "Found ${PR_COUNT} merged PRs in scan window" + + # Check for any existing self-reflection PRs to avoid overlap + EXISTING_SELF_REFLECTION=$(gh pr list \ + --state open \ + --head "self-reflection/" \ + --json number,title,headRefName \ + --jq '[.[] | {number: .number, title: .title, branch: .headRefName}]' \ + 2>/dev/null || echo "[]") + + # Write context for agent + cat > /tmp/self-reflection-context.json <&1 || EXIT_CODE=$? 
+ + if [ $EXIT_CODE -eq 124 ]; then + echo "::warning::Self-reflection agent timed out after 40 minutes" + elif [ $EXIT_CODE -ne 0 ]; then + echo "::warning::Agent exited with code $EXIT_CODE" + fi + + echo "Self-reflection agent completed." + env: + COPILOT_GITHUB_TOKEN: ${{ secrets.COPILOT_GITHUB_TOKEN }} + GH_TOKEN: ${{ github.token }} + SCAN_SINCE: ${{ steps.scan-window.outputs.scan_since }} + CI: "true" + NO_COLOR: "1"