Context
Wedge 1.13 (heartbeat + state snapshot/restore + S3 backend) is shipped and verified end-to-end across claude-agent-sdk and gitagent. For deepagents, the warm-sandbox surface is currently gated off (commit 2885e0c) because of an engine-level conversation persistence gap.
This issue tracks the engineering work to lift that gate.
The gap (precisely characterized)
Cross-harness test, all three engines, same recipe:
1. POST /sandboxes with sessionStore: { kind: "mongo" }
   - Chat: "Remember the password: AURORA-42. Reply: ACKNOWLEDGED."
2. POST /sandboxes/:id/snapshot (memory state store)
3. DELETE /sandboxes/:id (sandbox A disposed)
4. POST /sandboxes/restore → new sandbox B (sessionId preserved from sessionStoreRef)
   - Chat on B: "What was the password I told you earlier?"
| Harness | Recalls AURORA-42 after restore? |
| --- | --- |
| claude-agent-sdk | ✓ PASS |
| gitagent | ✓ PASS |
| deepagents | ✗ FAIL: "This appears to be the start of our conversation" |
Layer-by-layer:
| Layer | claude-agent-sdk | gitagent | deepagents |
| --- | --- | --- | --- |
| OS workdir snapshot+restore (tarball round-trip) | ✓ | ✓ | ✓ (35 files captured + restored) |
| Agent tools see OS workdir | ✓ | ✓ | ✗ (LangGraph virtual FS) |
| Conversation log persists via SessionStore | ✓ | ✓ | ✗ (in-memory checkpointer) |
| Conversation survives snapshot → restore | ✓ | ✓ | ✗ |
Root cause
engine-deepagents uses LangGraph's MemorySaver checkpointer for conversation state. That checkpointer is fully in-memory; it doesn't read/write through the SessionStore protocol that engine-claude-agent-sdk (and the gitclaw flow in engine-gitagent) plug into for cross-process resume.
Multi-turn WITHIN one live deepagents sandbox works fine — the in-memory checkpointer survives across chat() calls on the same ComputerAgent instance (verified). The break only happens when the substrate disposes (TTL / DELETE / shutdown) and a new sandbox is constructed from the snapshot — there's nothing to read the prior conversation back from.
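The failure mode can be reproduced without LangGraph at all. The following is a minimal sketch, assuming a hypothetical `InMemoryConversationEngine` that stands in for an engine whose checkpointer is a plain in-process map; none of these names come from the codebase.

```typescript
// Hypothetical stand-in for an engine whose conversation state lives only in process memory.
type Message = { role: "user" | "assistant"; content: string };

class InMemoryConversationEngine {
  // Analogous to an in-memory checkpointer: keyed by session, owned by this instance.
  private readonly checkpoints = new Map<string, Message[]>();

  // Appends a user turn and returns how many turns the engine can now see.
  chat(sessionId: string, content: string): number {
    const history = this.checkpoints.get(sessionId) ?? [];
    history.push({ role: "user", content });
    this.checkpoints.set(sessionId, history);
    return history.length;
  }

  history(sessionId: string): Message[] {
    return this.checkpoints.get(sessionId) ?? [];
  }
}

// Sandbox A: multi-turn works because the same instance serves both calls.
const sandboxA = new InMemoryConversationEngine();
sandboxA.chat("s1", "Remember the password: AURORA-42.");
sandboxA.chat("s1", "Still there?");
const turnsInA = sandboxA.history("s1").length;

// Sandbox B: "restored" with the same sessionId, but it is a new instance,
// so its map starts empty and the earlier turns are gone.
const sandboxB = new InMemoryConversationEngine();
const turnsInB = sandboxB.history("s1").length;

console.log({ turnsInA, turnsInB });
```

Preserving the sessionId across restore is not enough on its own: the key survives, but the map it indexed into was discarded with sandbox A.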
Current behavior
Per commit 2885e0c, the warm-sandbox surface refuses deepagents at the door rather than ship a half-honest contract:
POST /sandboxes { harness: "deepagents", ... }
→ 400 { code: "HARNESS_NOT_SUPPORTED",
        message: "Warm sandboxes are not yet supported for harness=deepagents.
                  The LangGraph checkpointer state isn't bridged to SessionStore...
                  Use POST /run or POST /tasks for deepagents one-shot work." }
POST /sandboxes/restore from a deepagents snapshot is gated the same way (defensive, covers pre-existing snapshots).
Deepagents still works perfectly on POST /run (live SSE, single conversation) and POST /tasks (background mongo-persisted runs). The gate is specifically scoped to the warm-sandbox surface whose contract implies persistence.
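As an illustration of the gate's shape, here is a minimal sketch assuming a hypothetical helper `checkWarmSandboxHarness` shared by both entry points. The real checks live in examples/computeragent-server.ts and may be structured differently; the message below is abbreviated.

```typescript
type Harness = "claude-agent-sdk" | "gitagent" | "deepagents";

interface GateError {
  status: 400;
  code: "HARNESS_NOT_SUPPORTED";
  message: string;
}

// Hypothetical helper: returns an error for gated harnesses, or null if the request may proceed.
function checkWarmSandboxHarness(harness: Harness): GateError | null {
  if (harness === "deepagents") {
    return {
      status: 400,
      code: "HARNESS_NOT_SUPPORTED",
      message: "Warm sandboxes are not yet supported for harness=deepagents.",
    };
  }
  return null;
}

// Creation would pass the harness from the request body; restore would pass
// the harness recorded in the snapshot config.
const onCreate = checkWarmSandboxHarness("deepagents");
const onRestore = checkWarmSandboxHarness("claude-agent-sdk");
console.log(onCreate?.code, onRestore);
```

Keeping the check keyed on the harness name alone means POST /run and POST /tasks are unaffected, since they never call it.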
Proposed fix
Mirror what engine-claude-agent-sdk does today. Three slices:
1. packages/engine-deepagents/src/engine.ts
Currently uses an in-memory MemorySaver checkpointer. Replace with a checkpointer adapter that:
- On every turn, serializes the LangGraph state and calls sessionStore.save(sessionKey, serialized).
- On first turn after construction, calls sessionStore.load(sessionKey) and, if non-empty, seeds the LangGraph state before invoking the graph.
Reference for the protocol contract: packages/protocol/src/contracts.ts:57-62 (the sessionStore field on EngineContext). Reference for an existing implementation that consumes it: packages/engine-claude-agent-sdk/src/engine.ts:71-85.
LangGraph already has the BaseCheckpointSaver interface — implementing one that delegates to SessionStore is the cleanest path (no fork of upstream). Sketch:
class SessionStoreCheckpointer extends BaseCheckpointSaver {
  constructor(private store: SessionStore, private sessionId: string) { super(); }
  async getTuple(config) { /* sessionStore.load → deserialize → CheckpointTuple */ }
  async put(config, checkpoint, metadata) { /* serialize → sessionStore.save */ }
  async list(config, options?) { /* optional */ }
}
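To make the save/load round trip concrete, the following is a self-contained sketch that does not import LangGraph. The `SessionStore`, `Checkpoint`, and `CheckpointTuple` types here are simplified stand-ins (assumptions, not the real contracts), JSON is an assumed serialization format, and the class deliberately omits the config arguments and the additional members the real BaseCheckpointSaver expects.

```typescript
// Stand-in types (assumptions): the real SessionStore is defined in packages/protocol,
// and the real checkpoint types come from LangGraph.
interface SessionStore {
  load(key: string): Promise<string | null>;
  save(key: string, value: string): Promise<void>;
}
interface Checkpoint { id: string; channelValues: Record<string, unknown> }
interface CheckpointTuple { checkpoint: Checkpoint; metadata: Record<string, unknown> }

class SessionStoreCheckpointerSketch {
  private readonly store: SessionStore;
  private readonly sessionKey: string;

  constructor(store: SessionStore, sessionKey: string) {
    this.store = store;
    this.sessionKey = sessionKey;
  }

  // Reads the latest persisted checkpoint, or undefined if the session is new.
  async getTuple(): Promise<CheckpointTuple | undefined> {
    const raw = await this.store.load(this.sessionKey);
    if (!raw) return undefined;
    return JSON.parse(raw) as CheckpointTuple;
  }

  // Persists the checkpoint through the store so another process can resume from it.
  async put(checkpoint: Checkpoint, metadata: Record<string, unknown>): Promise<void> {
    const tuple: CheckpointTuple = { checkpoint, metadata };
    await this.store.save(this.sessionKey, JSON.stringify(tuple));
  }
}

// In-memory store used only for this demo; a real run would use the mongo-backed store.
class MapSessionStore implements SessionStore {
  private readonly data = new Map<string, string>();
  async load(key: string): Promise<string | null> { return this.data.get(key) ?? null; }
  async save(key: string, value: string): Promise<void> { this.data.set(key, value); }
}

async function demoRoundTrip(): Promise<unknown> {
  const store = new MapSessionStore();
  const beforeDispose = new SessionStoreCheckpointerSketch(store, "s1");
  await beforeDispose.put(
    { id: "cp-1", channelValues: { messages: ["Remember the password: AURORA-42."] } },
    { step: 1 },
  );
  // A new adapter instance with the same store and key models sandbox B after restore.
  const afterRestore = new SessionStoreCheckpointerSketch(store, "s1");
  const restored = await afterRestore.getTuple();
  return restored?.checkpoint.channelValues["messages"];
}

void demoRoundTrip().then((messages) => console.log(messages));
```

This sketch keeps only the latest checkpoint per session. Whether the real adapter needs per-checkpoint history (for `list`) depends on how deepagents uses the checkpointer, which this issue does not specify.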
2. Remove the gate
Once (1) is in, drop the harness === "deepagents" check in:
- examples/computeragent-server.ts validateSandboxBody() (~line 1958)
- examples/computeragent-server.ts POST /sandboxes/restore (~line 1390, the snap.config.harness === "deepagents" guard)
3. Cross-harness verification
Existing test scripts already parametrize by --harness. After (1) lands, the existing cross-harness suite should turn the documented skips into passes without code changes:
python3 scripts/test-sandboxes.py --harness deepagents
python3 scripts/test-sandboxes-s3.py --harness deepagents
Plus the conversation-persistence smoke test (currently a one-off bash script in this issue — promote it to t_conversation_persistence_across_restore in scripts/test-sandboxes.py).
Out of scope for this issue
- The OS-workdir attachment layer is a separate gap (deepagents' file tools query a virtual FS, not the OS workdir). Fixing the conversation persistence aspect doesn't address this. It's tracked separately as a smaller follow-up (deepagents could either sync the virtual FS from OS attachments on boot, or expose a tool that reads from cwd).
Verification on landing
End-to-end:
- POST /sandboxes with harness: "deepagents" → 201 (gate lifted).
- Repro the conversation-persistence recipe above → AURORA-42 recalled.
- Full cross-harness suites pass with --harness deepagents.
Provenance
- Code commit (gate): 2885e0c feat(sandboxes): reject harness=deepagents at the warm-sandbox door
- Verification commit (revealing the gap): 5e2816c test(sandboxes): in-place restore (target=<existingSandboxId>) verified
- Plan file: .claude/plans/lets-have-a-brainstroming-unified-hippo.md