Warm sandboxes (Wedge 1.12/1.13): bridge engine-deepagents MemorySaver to SessionStore #7

@shreyas-lyzr


Context

Wedge 1.13 (heartbeat + state snapshot/restore + S3 backend) is shipped and verified end-to-end across claude-agent-sdk and gitagent. For deepagents, the warm-sandbox surface is currently gated off (commit 2885e0c) because of an engine-level conversation persistence gap.

This issue tracks the engineering work to lift that gate.

The gap (precisely characterized)

Cross-harness test, all three engines, same recipe:

  1. POST /sandboxes with sessionStore: { kind: "mongo" }
  2. Chat: "Remember the password: AURORA-42. Reply: ACKNOWLEDGED."
  3. POST /sandboxes/:id/snapshot (memory state store)
  4. DELETE /sandboxes/:id (sandbox A disposed)
  5. POST /sandboxes/restore → new sandbox B (sessionId preserved from sessionStoreRef)
  6. Chat on B: "What was the password I told you earlier?"

| Harness | Recalls AURORA-42 after restore? |
| --- | --- |
| claude-agent-sdk | ✓ PASS |
| gitagent | ✓ PASS |
| deepagents | ✗ FAIL: "This appears to be the start of our conversation" |
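
The recipe above can be sketched as a single parametrized check. This is only an illustration: the chat route (`/sandboxes/:id/chat`), the response field names (`id`, `snapshotId`, `reply`), and the restore request body are assumptions, since the issue does not spell them out. The transport is injected so the same function can run against a real server or a stub.

```typescript
// Hypothetical transport signature; a real run would wrap fetch against the server URL.
// eslint-disable-next-line @typescript-eslint/no-explicit-any
type Http = (method: "POST" | "DELETE", path: string, body?: unknown) => Promise<any>;

// ASSUMED (not stated in the issue): the chat route, the `id` / `snapshotId` / `reply`
// response fields, and the shape of the restore request body.
async function recallsAfterRestore(http: Http, harness: string): Promise<boolean> {
  // 1. Create sandbox A with a mongo-backed session store.
  const a = await http("POST", "/sandboxes", { harness, sessionStore: { kind: "mongo" } });

  // 2. Plant a fact in the conversation.
  await http("POST", `/sandboxes/${a.id}/chat`, {
    message: "Remember the password: AURORA-42. Reply: ACKNOWLEDGED.",
  });

  // 3. Snapshot A, then 4. dispose it.
  const snap = await http("POST", `/sandboxes/${a.id}/snapshot`);
  await http("DELETE", `/sandboxes/${a.id}`);

  // 5. Restore into a new sandbox B.
  const b = await http("POST", "/sandboxes/restore", { snapshotId: snap.snapshotId });

  // 6. Ask B for the fact planted on A.
  const res = await http("POST", `/sandboxes/${b.id}/chat`, {
    message: "What was the password I told you earlier?",
  });
  return String(res.reply).includes("AURORA-42");
}
```

With the gap described in this issue, the function would be expected to resolve to `true` for claude-agent-sdk and gitagent and `false` for deepagents.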

Layer-by-layer:

| Layer | claude-agent-sdk | gitagent | deepagents |
| --- | --- | --- | --- |
| OS workdir snapshot+restore (tarball round-trip) | | | ✓ (35 files captured + restored) |
| Agent tools see OS workdir | | | ✗ (LangGraph virtual FS) |
| Conversation log persists via SessionStore | | | ✗ (in-memory checkpointer) |
| Conversation survives snapshot → restore | ✓ | ✓ | ✗ |

Root cause

engine-deepagents uses LangGraph's MemorySaver checkpointer for conversation state. That checkpointer is fully in-memory; it doesn't read/write through the SessionStore protocol that engine-claude-agent-sdk (and the gitclaw flow in engine-gitagent) plug into for cross-process resume.

Multi-turn WITHIN one live deepagents sandbox works fine — the in-memory checkpointer survives across chat() calls on the same ComputerAgent instance (verified). The break only happens when the substrate disposes (TTL / DELETE / shutdown) and a new sandbox is constructed from the snapshot — there's nothing to read the prior conversation back from.
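
A toy model of that failure mode, for illustration only. `InMemoryConversation` is a hypothetical stand-in for instance-scoped checkpointer state, not the LangGraph `MemorySaver` API; it shows why multi-turn works on one instance and why a freshly constructed instance starts from nothing.

```typescript
// Hypothetical stand-in for instance-scoped conversation state.
class InMemoryConversation {
  private turns: string[] = [];

  // Records the message and returns how many turns are visible on this call.
  chat(message: string): number {
    this.turns.push(message);
    return this.turns.length;
  }
}

function demo(): { sameInstance: number; afterRestore: number } {
  // Sandbox A: two chat() calls on the same instance share history.
  const sandboxA = new InMemoryConversation();
  sandboxA.chat("Remember the password: AURORA-42.");
  const sameInstance = sandboxA.chat("What was the password?");

  // Sandbox B is constructed after A is disposed. A snapshot can bring the
  // OS workdir back, but nothing repopulates the in-memory turns.
  const sandboxB = new InMemoryConversation();
  const afterRestore = sandboxB.chat("What was the password I told you earlier?");

  return { sameInstance, afterRestore };
}
```

Here `demo()` reports two visible turns on the live instance and only one (the new question) on the restored one, which matches the "start of our conversation" reply seen in the test.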

Current behavior

Per commit 2885e0c, the warm-sandbox surface refuses deepagents at the door rather than ship a half-honest contract:

POST /sandboxes  { harness: "deepagents", ... }
→ 400 { code: "HARNESS_NOT_SUPPORTED",
        message: "Warm sandboxes are not yet supported for harness=deepagents.
                  The LangGraph checkpointer state isn't bridged to SessionStore...
                  Use POST /run or POST /tasks for deepagents one-shot work." }

POST /sandboxes/restore from a deepagents snapshot is gated the same way (defensive, covers pre-existing snapshots).
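
A minimal sketch of how one shared check could serve both entry points. The helper and constant names here are hypothetical; the actual checks live in examples/computeragent-server.ts and may be structured differently.

```typescript
interface GateError {
  status: 400;
  code: "HARNESS_NOT_SUPPORTED";
  message: string;
}

// Harnesses whose conversation state is not yet bridged to SessionStore.
const UNSUPPORTED_WARM_HARNESSES = new Set<string>(["deepagents"]);

// Hypothetical helper: returns an error payload for gated harnesses, otherwise null.
function checkWarmSandboxHarness(harness: string): GateError | null {
  if (!UNSUPPORTED_WARM_HARNESSES.has(harness)) return null;
  return {
    status: 400,
    code: "HARNESS_NOT_SUPPORTED",
    message:
      `Warm sandboxes are not yet supported for harness=${harness}. ` +
      "Use POST /run or POST /tasks for one-shot work.",
  };
}

// POST /sandboxes reads the harness from the request body.
const onCreate = (body: { harness: string }): GateError | null =>
  checkWarmSandboxHarness(body.harness);

// POST /sandboxes/restore reads it from the stored snapshot config,
// which also covers snapshots taken before the gate existed.
const onRestore = (snap: { config: { harness: string } }): GateError | null =>
  checkWarmSandboxHarness(snap.config.harness);
```

Keeping the gated harnesses in one set would make lifting the gate a one-line change, though the real server may not be organized this way.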

Deepagents still works perfectly on POST /run (live SSE, single conversation) and POST /tasks (background mongo-persisted runs). The gate is specifically scoped to the warm-sandbox surface whose contract implies persistence.

Proposed fix

Mirror what engine-claude-agent-sdk does today. Three slices:

1. packages/engine-deepagents/src/engine.ts

Currently uses an in-memory MemorySaver checkpointer. Replace with a checkpointer adapter that:

  • On every turn, serializes the LangGraph state and calls sessionStore.save(sessionKey, serialized).
  • On first turn after construction, calls sessionStore.load(sessionKey) and, if non-empty, seeds the LangGraph state before invoking the graph.

Reference for the protocol contract: packages/protocol/src/contracts.ts:57-62 (the sessionStore field on EngineContext). Reference for an existing implementation that consumes it: packages/engine-claude-agent-sdk/src/engine.ts:71-85.

LangGraph already has the BaseCheckpointSaver interface — implementing one that delegates to SessionStore is the cleanest path (no fork of upstream). Sketch:

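import { BaseCheckpointSaver } from "@langchain/langgraph-checkpoint";

// Sketch: parameter types omitted. BaseCheckpointSaver also declares putWrites as abstract.
class SessionStoreCheckpointer extends BaseCheckpointSaver {
  constructor(private store: SessionStore, private sessionId: string) { super(); }
  async getTuple(config) { /* this.store.load → deserialize → CheckpointTuple (undefined if none) */ }
  async put(config, checkpoint, metadata, newVersions) { /* serialize → this.store.save; return config */ }
  async putWrites(config, writes, taskId) { /* persist pending writes alongside the checkpoint */ }
  async *list(config, options?) { /* optional: yield stored CheckpointTuples */ }
}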
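
To make the save-on-every-turn and load-on-first-turn behaviour concrete, here is a runnable sketch that uses local stand-in types instead of the real LangGraph imports. The `SessionStore` shape (`save`/`load` with string payloads), the `Checkpoint` and `Tuple` types, and JSON as the serialization format are all assumptions for illustration. A real adapter would extend `BaseCheckpointSaver`, honour the thread and checkpoint ids in `config`, and handle pending writes.

```typescript
// ASSUMED shape, based on the save/load calls described in this issue.
interface SessionStore {
  save(key: string, data: string): Promise<void>;
  load(key: string): Promise<string | null>;
}

// Simplified stand-ins for LangGraph's Checkpoint / CheckpointTuple.
interface Checkpoint { id: string; channelValues: Record<string, unknown>; }
interface Tuple { checkpoint: Checkpoint; metadata: Record<string, unknown>; }

class SessionStoreCheckpointerSketch {
  private store: SessionStore;
  private sessionId: string;
  private latest: Tuple | undefined; // newest checkpoint seen by this instance
  private loaded = false;            // whether the store has been consulted yet

  constructor(store: SessionStore, sessionId: string) {
    this.store = store;
    this.sessionId = sessionId;
  }

  // The first call after construction reads through to the store; later calls use the cache.
  async getTuple(): Promise<Tuple | undefined> {
    if (!this.loaded) {
      const raw = await this.store.load(this.sessionId);
      this.latest = raw ? (JSON.parse(raw) as Tuple) : undefined;
      this.loaded = true;
    }
    return this.latest;
  }

  // Every turn writes the serialized checkpoint through to the store.
  async put(checkpoint: Checkpoint, metadata: Record<string, unknown>): Promise<void> {
    this.latest = { checkpoint, metadata };
    this.loaded = true;
    await this.store.save(this.sessionId, JSON.stringify(this.latest));
  }
}

// In-memory SessionStore used only to exercise the sketch.
class MapSessionStore implements SessionStore {
  private data = new Map<string, string>();
  async save(key: string, data: string): Promise<void> { this.data.set(key, data); }
  async load(key: string): Promise<string | null> { return this.data.get(key) ?? null; }
}
```

A second `SessionStoreCheckpointerSketch` built over the same store and session id sees the checkpoint written by the first, which is the property the restore path needs. This sketch keeps only the latest checkpoint per session; whether history needs to be retained for `list` is a separate design decision.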

2. Remove the gate

Once (1) is in, drop the harness === "deepagents" check in:

  • examples/computeragent-server.ts validateSandboxBody() (~line 1958)
  • examples/computeragent-server.ts POST /sandboxes/restore (~line 1390, the snap.config.harness === "deepagents" guard)

3. Cross-harness verification

Existing test scripts already parametrize by --harness. After (1) lands, the existing cross-harness suite should turn the documented skips into passes without code changes:

python3 scripts/test-sandboxes.py --harness deepagents
python3 scripts/test-sandboxes-s3.py --harness deepagents

Plus the conversation-persistence smoke test (currently a one-off bash script in this issue — promote it to t_conversation_persistence_across_restore in scripts/test-sandboxes.py).

Out of scope for this issue

  • The OS-workdir attachment layer is a separate gap (deepagents' file tools query a virtual FS, not the OS workdir). Fixing the conversation persistence aspect doesn't address this. It's tracked separately as a smaller follow-up (deepagents could either sync the virtual FS from OS attachments on boot, or expose a tool that reads from cwd).

Verification on landing

End-to-end:

  1. POST /sandboxes with harness: "deepagents" → 201 (gate lifted).
  2. Repro the conversation-persistence recipe above → AURORA-42 recalled.
  3. Full cross-harness suites pass with --harness deepagents.

Provenance

  • Code commit (gate): 2885e0c feat(sandboxes): reject harness=deepagents at the warm-sandbox door
  • Verification commit (revealing the gap): 5e2816c test(sandboxes): in-place restore (target=<existingSandboxId>) verified
  • Plan file: .claude/plans/lets-have-a-brainstroming-unified-hippo.md

Labels: enhancement
