feat(webhook): add persistent retry with exponential backoff by jonathanpeterwu · Pull Request #19 · stackmemoryai/stackmemory

jonathanpeterwu · 2026-06-09T15:31:10Z

Summary

- Add a persistent webhook delivery queue with SQLite-backed retry and exponential backoff
- Failed webhook events retry up to 5 times on a backoff schedule of 1s, 2s, 4s, 8s, 16s, with jitter (delays are capped at 300s)
- Deliveries persist across process restarts, so events are no longer lost on crash
- Health endpoint now returns delivery stats by status (pending/processing/completed/failed/dead)
- Also fixes 46 pre-existing lint errors, skips the OpenRouter test on an invalid API key, and makes the redis import lazy
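
The retry schedule above can be sketched as follows. This is a minimal illustration, not the PR's code: the PR reuses an existing `calculateBackoff()` whose signature is not shown here, so `sketchBackoffMs`, `BackoffOptions`, and the 10% additive jitter are all assumptions made for the example.

```typescript
// Hypothetical helper for illustration only. The PR itself reuses
// calculateBackoff() from core/errors/recovery.ts, which is not shown here.
interface BackoffOptions {
  baseMs: number;      // first retry delay (1s in the PR's schedule)
  capMs: number;       // upper bound on any delay (300s in the PR)
  jitterRatio: number; // assumed: max fraction of the delay added as jitter
}

const DEFAULT_BACKOFF: BackoffOptions = { baseMs: 1000, capMs: 300_000, jitterRatio: 0.1 };

// retryIndex is 0-based: 0 -> 1s, 1 -> 2s, 2 -> 4s, ...
function sketchBackoffMs(
  retryIndex: number,
  opts: BackoffOptions = DEFAULT_BACKOFF,
  random: () => number = Math.random,
): number {
  const exponential = opts.baseMs * 2 ** retryIndex;
  const capped = Math.min(exponential, opts.capMs);
  const jitter = capped * opts.jitterRatio * random();
  // Clamp again so jitter can never push the delay past the cap.
  return Math.min(Math.round(capped + jitter), opts.capMs);
}

// With jitter disabled, five retries follow the schedule from the summary.
const scheduleMs = [0, 1, 2, 3, 4].map((n) => sketchBackoffMs(n, DEFAULT_BACKOFF, () => 0));
console.log(scheduleMs); // [1000, 2000, 4000, 8000, 16000]
```

With only five retries the longest base delay is 16s, so under these assumptions the 300s cap would only matter if the retry limit were raised.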

Changes

| File | Change |
| --- | --- |
| `src/integrations/linear/webhook-retry.ts` | New: WebhookDeliveryQueue with SQLite persistence, atomic claim, background worker |
| `src/integrations/linear/webhook-server.ts` | Replace in-memory queue with WebhookDeliveryQueue |
| `src/integrations/linear/__tests__/webhook-retry.test.ts` | 11 tests: enqueue, retry, dead letter, concurrency, stats |
| `src/core/storage/two-tier-storage.ts` | Make redis import lazy (optional dep) |
| `src/core/extensions/__tests__/openrouter-integration.test.ts` | Skip on 401/403 instead of failing |
| 8 other files | Fix pre-existing eslint errors |

Test plan

- 11 unit tests for WebhookDeliveryQueue (all passing)
- Existing test suite: 144/145 pass, 1 skipped (OpenRouter)
- Build succeeds
- Lint: 0 errors


New DaemonTelemetryService collects anonymous usage snapshots:
- Daemon health (uptime, context saves, memory triggers, errors)
- Session counts (total heartbeats, active now)
- Skill audit entries, handoff counts
- No PII: instance ID is random hex

Runs daily (default 24h interval, first at boot+30s). Stores rolling 90-snapshot history in ~/.stackmemory/telemetry.json. Opt out: STACKMEMORY_TELEMETRY=0 or telemetry.enabled: false in config.

…gest skills

Three-component system in DaemonDesirePathService:
1. ActionStreamLogger: PostToolUse hook captures tool:target pairs to ~/.stackmemory/desire-paths/action-stream.jsonl (no data/content)
2. PatternDetector: sliding window extracts repeated sequences, filters by min 3 occurrences across 2+ sessions, scores by freq×sessions
3. SkillSuggester: generates skill.md files from top patterns with inputs/outputs inferred from sequence endpoints

- 10MB JSONL rotation, 10K entry scan cap for performance
- Opt out: STACKMEMORY_DESIRE_PATHS=0 or desirePaths.enabled: false
- Scans every 6h, first at boot+2m
- Suggestions written to ~/.stackmemory/desire-paths/suggestions/
- 3 adversarial review rounds: fixed separator injection, added scan cap, improved skill naming with target directory context
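
As a rough illustration of the PatternDetector step, the sketch below counts fixed-size windows of actions per session, keeps sequences seen at least 3 times across at least 2 sessions, and scores them by frequency × sessions. The types, the function name `detectPatterns`, the fixed window size, and the JSON-encoded map key are assumptions for the example, not the service's actual code.

```typescript
// Hypothetical shapes; the real action-stream entry format is not shown here.
interface ActionEntry {
  session: string;
  action: string; // a "tool:target" pair
}

interface DetectedPattern {
  sequence: string[];
  occurrences: number;
  sessions: number;
  score: number; // occurrences × sessions
}

function detectPatterns(
  entries: ActionEntry[],
  windowSize = 3,
  minOccurrences = 3,
  minSessions = 2,
): DetectedPattern[] {
  // Group actions per session so a window never spans a session boundary.
  const bySession = new Map<string, string[]>();
  for (const e of entries) {
    const list = bySession.get(e.session) ?? [];
    list.push(e.action);
    bySession.set(e.session, list);
  }

  const seen = new Map<string, { sequence: string[]; occurrences: number; sessions: Set<string> }>();
  bySession.forEach((actions, session) => {
    for (let i = 0; i + windowSize <= actions.length; i++) {
      const sequence = actions.slice(i, i + windowSize);
      // A JSON-encoded key cannot collide when an action contains a separator.
      const key = JSON.stringify(sequence);
      const hit = seen.get(key) ?? { sequence, occurrences: 0, sessions: new Set<string>() };
      hit.occurrences += 1;
      hit.sessions.add(session);
      seen.set(key, hit);
    }
  });

  return Array.from(seen.values())
    .filter((p) => p.occurrences >= minOccurrences && p.sessions.size >= minSessions)
    .map((p) => ({
      sequence: p.sequence,
      occurrences: p.occurrences,
      sessions: p.sessions.size,
      score: p.occurrences * p.sessions.size,
    }))
    .sort((a, b) => b.score - a.score);
}
```

Overlapping windows are counted separately here, which is one of several reasonable choices; the actual detector may count differently.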


- Auto-starts daemon on session boot
- Writes session heartbeats for telemetry tracking
- Restores handoff context from previous sessions
- Sets STACKMEMORY_SESSION env for desire-path hook
- Determinism watcher + tracing
- bin/hermes-sm and bin/hermes-smd registered in package.json


Centralizes token estimation across 14 files through src/core/cache/token-estimator.ts and packages/sdk/src/token-estimator.ts. Lazy-loads cl100k_base encoder with char/4 fallback if WASM fails. Also ports context-budget hook to codex-sm exit handler for compact/restart nudges matching Claude Code behavior.
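
The lazy-load-with-fallback behaviour described above might look roughly like this. It is a sketch under assumptions: `TokenEncoder`, `EncoderLoader`, and `createTokenEstimator` are hypothetical names, and the real encoder is injected through a loader function rather than imported, so the example runs without any tokenizer package installed.

```typescript
// Minimal encoder shape for the sketch; the real cl100k_base encoder is not imported here.
interface TokenEncoder {
  encode(text: string): number[];
}

type EncoderLoader = () => TokenEncoder;

function createTokenEstimator(loadEncoder: EncoderLoader): (text: string) => number {
  let encoder: TokenEncoder | null = null;
  let loadFailed = false;

  return (text: string): number => {
    if (!encoder && !loadFailed) {
      try {
        encoder = loadEncoder(); // deferred until the first estimate (lazy load)
      } catch {
        loadFailed = true; // e.g. WASM init failure; do not retry on every call
      }
    }
    // Fall back to the char/4 heuristic when no encoder is available.
    return encoder ? encoder.encode(text).length : Math.ceil(text.length / 4);
  };
}

// A loader that always fails exercises the fallback path.
const estimateWithFallback = createTokenEstimator(() => {
  throw new Error("WASM unavailable");
});
console.log(estimateWithFallback("hello world")); // 3, from ceil(11 / 4)
```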

Skills are often prompt text (content) not code — content licenses like CC-BY-4.0 fit better than MIT for these. Adds KnownLicenseSchema enum with both code (MIT, Apache-2.0, ISC, BSD) and content (CC-BY-4.0, CC-BY-SA-4.0, CC0-1.0) licenses while keeping the field open for custom SPDX identifiers.

Markdown table parser + CLI commands + MCP tools for local-first task steering. Tasks live in master-tasks.md, optionally sync to Linear/GH.
- Parser: parse/serialize/update/add/getNext for pipe-delimited md tables
- CLI: stackmemory tasks init/md list/md next/md add/md update
- MCP: get_next_master_task, update_master_task, create_master_task
- 19 tests covering parse, round-trip, priority sorting, file ops
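
To make the parse/serialize round trip concrete, here is a small sketch for pipe-delimited Markdown tables. The function names and the column names in the test data are hypothetical; the actual master-tasks.md schema is not shown in this PR, and escaped pipes inside cells are not handled.

```typescript
type TableRow = Record<string, string>;

// Split "| a | b |" into ["a", "b"], dropping the outer pipes.
function splitRow(line: string): string[] {
  return line
    .trim()
    .replace(/^\|/, "")
    .replace(/\|$/, "")
    .split("|")
    .map((cell) => cell.trim());
}

function parseMarkdownTable(markdown: string): TableRow[] {
  const lines = markdown.split("\n").filter((l) => l.trim().startsWith("|"));
  if (lines.length < 2) return [];
  const headers = splitRow(lines[0]);
  // lines[1] is the |---|---| delimiter row, so data starts at index 2.
  return lines.slice(2).map((line) => {
    const cells = splitRow(line);
    const row: TableRow = {};
    headers.forEach((h, i) => {
      row[h] = cells[i] ?? "";
    });
    return row;
  });
}

function serializeMarkdownTable(headers: string[], rows: TableRow[]): string {
  const fmt = (cells: string[]) => `| ${cells.join(" | ")} |`;
  return [
    fmt(headers),
    fmt(headers.map(() => "---")),
    ...rows.map((r) => fmt(headers.map((h) => r[h] ?? ""))),
  ].join("\n");
}
```

Keeping rows as plain string records makes the round trip lossless for simple tables, at the cost of leaving any priority ordering to the caller.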

…rm, script-suggest

- dedup-reads: escalate to [STOP] at 5+ reads (was soft-only at 3+)
- desire-path-hook: auto-route Bash→Glob/Read/Grep with inline suggestions
- prewarm-tools: SessionStart hook emits top deferred tool pre-fetch hint
- script-suggest: detects multi-tool patterns matching existing scripts


…lete

- Skip patterns with <2 unique tools or <3 steps in generateSuggestions()
- Add cleanTrivialSuggestion() to auto-delete stale suggestion files
- Remove duplicate quality gate from autoPromote (filtering now upstream)
- Prevents bloat: 19/20 current patterns were trivial (git×2, Edit×3, etc.)
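
The triviality rule in the first bullet can be expressed as a small predicate. This is only a sketch: `isTrivialPattern` is a hypothetical name, and it assumes each step is a `tool:target` string as in the action stream.

```typescript
// A pattern is treated as trivial when it uses fewer than 2 unique tools
// or has fewer than 3 steps. Steps are assumed to be "tool:target" strings.
function isTrivialPattern(steps: string[]): boolean {
  const uniqueTools = new Set(steps.map((step) => step.split(":")[0]));
  return uniqueTools.size < 2 || steps.length < 3;
}

console.log(isTrivialPattern(["Edit:a.ts", "Edit:b.ts", "Edit:c.ts"])); // true (one tool)
console.log(isTrivialPattern(["Grep:src", "Read:src/a.ts", "Edit:src/a.ts"])); // false
```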

…/UI tasks

Design tasks bypass subscription-first cascade (Codex→Grok→API) and route directly to Claude CLI, which excels at creative UI/UX decisions.
- Add 'design' task type to SubagentRequest, TaskType, ModelRouterConfig
- Add forceProvider field for explicit provider override
- Add design prompt (opinionated, production-ready, no-ask-just-decide)
- Add delegateDesign() convenience method for wrapper CLIs


image-preprocess: PreToolUse hook that intercepts Read calls on image files and routes them through vision-capable models. image-extract-mcp: stdio MCP server providing a describe_image tool for text extraction from images via vision model APIs.


…harness

End-to-end integration bridging Stagehand browser automation with StackMemory's desire-path system for workflow discovery and replay.
- StagehandWorkflowCapture: wraps act/extract/observe, emits to action-stream
- WorkflowCache: persists workflows, bridges to desire-path patterns.json
- WorkflowReplayer: cached (0 tokens), AI (self-healing), hybrid modes
- WorkflowBenchmark: compare stagehand-ai vs cached vs playwright-code
- 4 MCP tools: workflow_list, workflow_get, workflow_replay, workflow_benchmarks
- Benchmark script with 3 test workflows (GitHub, HN, NPM)

Stagehand is a peer dependency and is not required for core functionality.

CliBrowserAgent: Playwright + claude/codex CLI hybrid that routes AI understanding through subscription CLIs instead of direct API. Falls back to Anthropic API when CLI hooks interfere.
- Playwright handles browser control (fast, deterministic)
- claude --print / codex -q handles extraction/action interpretation
- Results cached locally for zero-LLM replay on subsequent runs
- Benchmark script supports --cli mode vs --api mode

Known: claude --print triggers SessionEnd hooks that time out. TODO: fix hook interference or add CLAUDE_CODE_SKIP_HOOKS support.

When DISABLE_HOOKS=1 env var is set, all session lifecycle hooks exit immediately. Prevents timeout/cancellation when claude --print is invoked as a subprocess (e.g., from CliBrowserAgent). Hooks patched: chime-on-stop.sh, stop-checkpoint.js, session-rescue.sh, wiki-update.js, token-meter-finalize.js, gepa-session-hook.js


Token estimator tests assumed ceil(length/4) heuristic but implementation now uses js-tiktoken (cl100k_base). Subagent routing tests failed because codex CLI is installed locally, causing isCodexAvailable() to short-circuit multiProvider/batch/Kimi paths — fixed by spying on private methods.


Three adapter modes behind a unified ScreenAdapter interface:
- TmuxAdapter: reads pane buffer as text, sends keys (CLI, no API needed)
- DesktopAdapter: macOS screenshots + AppleScript + Haiku VLM
- BrowserAdapter: Playwright DOM reads for claude.ai/code web app

Rule-based state machine detects: IDLE, WORKING, PERMISSION_PROMPT, ERROR, RATE_LIMITED, STUCK, SESSION_ENDED from screen content. LLM decision layer (Haiku) handles ambiguous states and generates nudges for stuck sessions instead of blind restarts.

CLI: stackmemory operator start/stop/status/attach
- Drains master-tasks.md queue overnight on Max plan
- Auto-approves permission prompts, exponential backoff on rate limits
- JSONL logging + checkpoint file for monitoring
- 31 tests passing

- STUCK now nudges twice before marking blocked (was: immediate block)
- Track nudgeCount per task in checkpoint, reset on new task
- Fix ScreenshotAdapter adapterType to match union type
- Add index.ts barrel export for clean imports
- 32 tests passing

…→ apply)

Repackages the external instincts/continuous-learning concept into StackMemory as a native feature backed by SQLite.

New modules:
- PatternStore: CRUD with confidence scoring, decay, pruning
- PatternObserver: extracts patterns from trace events at session end (tool sequences, error→fix pairs, tool preferences)
- PatternApplier: surfaces relevant patterns in context retrieval
- CLI: stackmemory patterns list|learn|stats|prune|export|import

Schema: adds `patterns` table with domain, trigger, action, confidence scoring (0.3→0.85 based on observation count), project-scoping, and weekly confidence decay.

16 tests passing.

…arity

Fills 3 feature gaps vs continuous-learning-v2:
- promote: project→global (manual or auto-detect cross-project patterns)
- projects: list projects with pattern counts
- evolve: cluster related patterns, identify skill/command candidates

PatternStore gains: promote(), projects(), promotionCandidates(), findClusters()

CL-v2 for stackmemory is now retired: observer was already disabled, no hooks registered. Patterns system is the native replacement.


Replace in-memory webhook event queue with SQLite-backed delivery queue. Failed webhook events are now retried up to 5 times with exponential backoff (1s, 2s, 4s, 8s, 16s, capped at 300s) and jitter. Deliveries persist across process restarts.
- WebhookDeliveryQueue: SQLite persistence, atomic claim, background worker
- Reuses existing calculateBackoff() from core/errors/recovery.ts
- Health endpoint now returns delivery stats by status
- 11 tests covering enqueue, retry, dead letter, concurrency, stats
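
The delivery lifecycle implied by the five statuses can be sketched with an in-memory stand-in. This is not the SQLite-backed WebhookDeliveryQueue: there is no persistence, the claim is only trivially atomic because the code is single-threaded, jitter is omitted, and every name below is hypothetical. It also assumes "up to 5 times" means five retries after the initial attempt.

```typescript
type DeliveryStatus = "pending" | "processing" | "completed" | "failed" | "dead";

interface DeliveryRecord {
  id: number;
  status: DeliveryStatus;
  retries: number;       // retries scheduled so far
  nextAttemptAt: number; // epoch ms; eligible once now >= this value
}

const MAX_RETRIES = 5; // assumption: 5 retries after the first attempt

// Claim one eligible delivery and mark it 'processing' so no other worker takes it.
function claimNext(queue: DeliveryRecord[], now: number): DeliveryRecord | undefined {
  const claimed = queue.find(
    (d) => (d.status === "pending" || d.status === "failed") && d.nextAttemptAt <= now,
  );
  if (claimed) claimed.status = "processing";
  return claimed;
}

// Record the outcome of an attempt: complete, schedule a retry, or dead-letter.
function recordResult(d: DeliveryRecord, ok: boolean, now: number): void {
  if (ok) {
    d.status = "completed";
    return;
  }
  if (d.retries >= MAX_RETRIES) {
    d.status = "dead"; // retries exhausted
    return;
  }
  d.nextAttemptAt = now + 1000 * 2 ** d.retries; // 1s, 2s, 4s, 8s, 16s
  d.retries += 1;
  d.status = "failed";
}

// Counts per status, in the spirit of the health endpoint's delivery stats.
function statsByStatus(queue: DeliveryRecord[]): Record<DeliveryStatus, number> {
  const stats: Record<DeliveryStatus, number> = {
    pending: 0, processing: 0, completed: 0, failed: 0, dead: 0,
  };
  for (const d of queue) stats[d.status] += 1;
  return stats;
}
```

In a SQLite-backed version the claim would typically be a single conditional UPDATE inside a transaction, so that two workers cannot pick the same row.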

- Fix prettier formatting in hermes-sm, daemon-config, research-stream-service, telemetry-service
- Add eslint-disable for dynamic require() in operator, adapter-factory
- Replace inline require() with top-level imports in operator-logger, subagent-client


Ensures every bash script that invokes stackmemory or node loads the correct Node version from .nvmrc before executing. Prevents better-sqlite3 NODE_MODULE_VERSION errors when hooks, wrappers, or daemons run in contexts without nvm initialized (git GUIs, IDE integrations, background processes, subprocesses).

StackMemory Bot (CLI) added 30 commits May 5, 2026 19:11

feat(teleop): add native voice control prototype

2745dc7

feat(cli): add scaffold command for Company OS folder structure

a226e45

stackmemory scaffold creates company/, wiki/, skills/, clients/, raw/, and .stackmemory/config.yml. Enables local context management with file-based skill rot detection and tenant isolation.

feat(hooks): add self-healing daemon health check for SessionStart

dcb1b6a

fix(desire-paths): adaptive backoff - hourly when active, exponential to 12h when idle

6ed3b4a

feat(desire-paths): auto-promote skills above 0.8 confidence + 5 sessions

290b34d

feat(daemon): add research stream scanner for market signal detection

65421a8

fix(test): replace bun:test import with vitest in desire-path-service test

6395ce0

feat(bench): add hook benchmark script + baseline report

bff0d6c

Replays 7,589 action-stream entries through hook logic. Result: 324K token savings projected (22% waste reduction).

feat(hooks): weekly skill-mine reminder on SessionStart

46ac2b9

Emits reminder when >7 days since last mine and new suggested skills exist. Points to /workflow-skill-miner.

docs: rewrite CLAUDE.md as tool-agnostic agent guide

9b91953

Consolidate from StackMemory-specific config to a generic agent reference covering stack, structure, commands, and key patterns.

feat(hooks): project-aware prewarm tool cache

213113d

Filter action-stream by current project directory so prewarm suggestions are scoped to the repo you're working in. Falls back to global stats when no project-specific data exists.

feat(hooks): memory-loader SessionStart hook

9ed1d52

Reads MEMORY.md index at session start, scores entries by relevance to current project context, and surfaces the most useful memories.

feat(gepa): daemon watcher + session hook updates

ce84de1

daemon.js: persistent file watcher for all GEPA targets — triggers optimization on CLAUDE.md changes. Session hook and .before-optimize baseline updated for current optimization state.

chore(gepa): gen-001 variant updates from optimization run

dc7a026

Merge branch 'feat/token-optimization-hooks'

3ee0269

# Conflicts:
# src/daemon/services/desire-path-service.ts
# src/hooks/prewarm-tools.cjs

fix(browser): accept CLI output on non-zero exit from hook failures

b80fbeb

claude --print produces valid output before SessionEnd hooks fire. Exit 143 from hook cancellation shouldn't reject — check stdout content instead of exit code.
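
A minimal sketch of the "check stdout instead of exit code" rule follows. `CliResult` and `resolveCliOutput` are hypothetical names; the actual CliBrowserAgent code is not shown in this PR.

```typescript
// Hypothetical result shape for a finished CLI subprocess.
interface CliResult {
  exitCode: number | null;
  stdout: string;
  stderr: string;
}

// Accept the run whenever stdout has content, even on a non-zero exit such as
// 143 (hook cancellation). Reject only when there is nothing usable.
function resolveCliOutput(result: CliResult): string {
  const output = result.stdout.trim();
  if (output.length > 0) return output;
  throw new Error(`CLI produced no output (exit ${result.exitCode}): ${result.stderr.trim()}`);
}
```

The trade-off is that a genuinely failed run that still printed partial output would be accepted, so callers may want to validate the content as well.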

chore(deps): add stagehand + playwright for browser workflows

d4c8c83

StackMemory Bot (CLI) added 28 commits May 27, 2026 17:29

feat(hooks): add cd-thrash, linear-dedup, and bash-dominance guardrails

e46d2ff

- cd-thrash-guard: warns on 3+ cd commands in 10 tool calls
- linear-dedup: detects duplicate Linear API calls within 60s
- bash-dominance-guard: suggests Read/Grep/Glob/Edit over Bash equivalents
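
The cd-thrash rule (3+ `cd` commands within the last 10 tool calls) can be sketched as a sliding-window counter. The factory name, the `(tool, command)` signature, and the warning text are assumptions for illustration; the real hook's input format is not shown here.

```typescript
// Returns a checker that is called once per tool call and yields a warning
// string when the cd threshold is reached inside the sliding window.
function createCdThrashGuard(windowSize = 10, threshold = 3) {
  const recent: boolean[] = []; // true when the call was a Bash `cd`

  return (tool: string, command = ""): string | null => {
    const isCd = tool === "Bash" && /^\s*cd(\s|$)/.test(command);
    recent.push(isCd);
    if (recent.length > windowSize) recent.shift(); // keep only the last N calls

    const cdCount = recent.filter(Boolean).length;
    return isCd && cdCount >= threshold
      ? `cd-thrash: ${cdCount} cd commands in the last ${recent.length} tool calls`
      : null;
  };
}
```

Warning only on the call that is itself a `cd` avoids repeating the message on unrelated tool calls while the window is still above the threshold.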

fix(benchmark): fix NPM selector timeout + use page.evaluate extraction

82ce2c2

chore(gepa): update daemon state + generation variants

85746a2

chore(gepa): update hook state + scores

e3b521a

chore(gepa): auto-optimizer state + eval results

70206e2

chore(gepa): daemon state update

a41dc02

chore(gepa): auto-optimizer state update

ab1287a

feat(skill-packs): add ops/log-investigation pack

60dcabd

Distributable investigation skill teaching agents to debug production issues from structured wide-format logs. Encodes the tenant-context + domain-extras + timeline-reconstruction pattern.

feat(tracing): instrument MCP server with Raindrop Workshop

95057db

Every MCP tool call now emits begin/finish traces to Raindrop Workshop when RAINDROP_LOCAL_DEBUGGER env is set. Conditional — zero overhead when env var is absent. Flush on SIGINT shutdown.

chore(gepa): auto-optimizer state update

f8e260c

chore: add Raindrop Workshop agent config files

f855fa3

refactor(mcp): extract handleTool from IIFE wrapper

d2c345e

Replace async IIFE wrapping the tool dispatch switch with a named handleTool() function. Removes one indentation level from ~500 lines, makes the Raindrop tracing wrapper cleaner, and shrinks the bundle.

chore: handoff checkpoint on main

4ffb3bc

chore: handoff checkpoint on main

a9ccf87

chore: handoff checkpoint on main

1ff60a7

fix(test): skip OpenRouter live tests on invalid/missing API key

f95cc25

Tests already skipped when OPENROUTER_API_KEY env var is unset, but still failed with 401 when the key existed but was expired or invalid. Now catch auth errors (401/403) and skip gracefully via ctx.skip().

fix(storage): make redis import lazy in two-tier-storage

f907c83

Redis is optional (only needed for remote tier). The hard import crashed when redis wasn't installed, causing test suite failures. Now uses lazy require() with graceful fallback.

Merge remote-tracking branch 'origin/main' into feature/webhook-retry-backoff

b106553

# Conflicts:
# CLAUDE.md

chore: handoff checkpoint on feature/webhook-retry-backoff

972d8e6

docs: mark webhook retry backoff acceptance criteria as complete

a53eb9a

jonathanpeterwu merged commit 4d22836 into main Jun 9, 2026
3 of 6 checks passed

jonathanpeterwu deleted the feature/webhook-retry-backoff branch June 9, 2026 18:33

jonathanpeterwu merged 61 commits into main from feature/webhook-retry-backoff
