feat(viz): server-side streaming pipeline for the graph data layer#50

Merged
cdeust merged 3 commits into
mainfrom
viz/server-streaming-pipeline
May 31, 2026

@cdeust cdeust commented May 31, 2026

Summary

  • Lands the server / data-layer half of wip/layout-authority-sse-streaming onto main. No UI code is touched (per user direction: visualization UI is being rebuilt separately against this contract).
  • Adds the SSE event stream, CXGB binary snapshot, quadtree positions endpoint, phase state machine, and the full LayoutAuthority service.
  • Reworks the workflow-graph builder for streaming: skeleton-first baseline, interleaved load+ingest+emit, 500-item chunking, PG server-side cursor on memories, L6 symbols on the SSE stream.

What's new on the wire

endpoint              purpose
/api/graph/events     live SSE batches as each source ingests
/api/graph.bin        CXGB binary snapshot: ~110 ms cold load on a 135 k-node store vs ~1-3 s for JSON
/api/quadtree         Apache Arrow IPC of every node's (id, x, y, kind) for client-side hit testing
/api/graph/progress   phase state machine (L0 → L1 … L6:<proj> → L6_CROSS)
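As a rough illustration of what a client of /api/graph/events has to do, here is a minimal sketch of parsing a text/event-stream body into {nodes, edges} batches using only the standard library. The framing rules are the generic SSE ones; the sample payloads, the node `id` field, and the edge `source`/`target` fields are assumptions made for the example, not the server's confirmed schema.

```python
import json
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON payload of each complete SSE event in `lines`.

    Only `data:` fields are handled. A blank line ends an event, and
    several data lines in one event are joined with newlines.
    """
    buf = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if buf:
                yield json.loads("\n".join(buf))
                buf = []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("data:"):
            value = line[5:]
            if value.startswith(" "):
                value = value[1:]  # one leading space is part of the framing
            buf.append(value)
    # An event without a terminating blank line is incomplete and is dropped.


# Hypothetical stream content; the real payload fields may differ.
sample = [
    ": keep-alive\n",
    'data: {"nodes": [{"id": "d1"}], "edges": []}\n',
    "\n",
    'data: {"nodes": [{"id": "m1"}, {"id": "m2"}],\n',
    'data:  "edges": [{"source": "d1", "target": "m1"}]}\n',
    "\n",
]
batches = list(iter_sse_data(sample))
node_count = sum(len(b["nodes"]) for b in batches)
edge_count = sum(len(b["edges"]) for b in batches)
```

In a real client the `lines` iterable would come from a streaming HTTP response rather than a list, and each batch would be merged into the client-side graph as it arrives.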

What's new in the build pipeline

  • Skeleton-first baseline so the first poll returns a usable backbone in ~1 s.
  • Per-source 500-item chunking — large sources (107 k memories, 670 k L6 symbols) don't pop as one big jump.
  • Streaming PG cursor (iter_memories_chunked / iter_hot_memories_chunked) — Python no longer materialises 100 k+ rows before yielding the first node.
  • L6 symbols now emit through the same SSE stream as the structural phases.
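The 500-item chunking can be pictured with a small generator. This is a sketch of the general technique, not the code in this PR; the helper name `chunked` is hypothetical, and only the chunk size of 500 comes from the description above.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")
CHUNK_SIZE = 500  # chunk size quoted in the PR description


def chunked(items: Iterable[T], size: int = CHUNK_SIZE) -> Iterator[List[T]]:
    """Yield successive lists of at most `size` items.

    The input is consumed lazily, so a large source never has to be
    fully materialised before the first chunk is available.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


# 1,234 items split into two full chunks and one partial chunk.
sizes = [len(c) for c in chunked(range(1_234))]
```

Because the generator pulls from an iterator, it composes with a streaming database cursor: the first chunk can be emitted while later rows are still being fetched.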

What's new in LayoutAuthority

  • Full ForceAtlas2 server-side layout service with HNSW-backed position storage (layout_pg_store).
  • 98 design audits under tasks/layout-authority/audits/ capture the reasoning trail.

What is NOT in this PR (intentionally)

  • Any UI code — ui/unified/js/*, ui/unified-viz.html, the renderers, the bridge. The user is rebuilding the visualization UI from scratch against this server contract.

Test plan

  • 37 / 37 layout-authority + open_visualization tests pass
  • All touched server modules import cleanly
  • Reviewer: spot-check mcp_server/server/http_standalone_graph.py phase wiring
  • Reviewer: confirm mcp_server/infrastructure/pg_store_queries.py iter_hot_memories_chunked doesn't change non-streaming callers

🤖 Generated with Claude Code

cdeust and others added 3 commits May 31, 2026 10:35
Consolidates the server / data-layer half of the
wip/layout-authority-sse-streaming branch onto main. Everything in
this commit is server-side: PostgreSQL streaming, SSE event stream,
LayoutAuthority, workflow-graph builder, and the layout audit
trail. UI code stays on wip for the separate rebuild the user is
planning.

What's new on the wire:

  /api/graph/events        — live SSE stream of build batches.
                             Server-paced producer emits {nodes,edges}
                             chunks in label order as each source
                             finishes (see graph_event_stream.py +
                             http_standalone_endpoints.py).

  /api/graph.bin           — CXGB binary snapshot of the cumulative
                             graph for fast cold-cache loads (~110 ms
                             on a 135 k-node store vs ~1-3 s for
                             /api/graph JSON). See graph_snapshot.py.

  /api/quadtree            — Apache Arrow IPC of every node's
                             (id, x, y, kind) for client-side hit
                             testing. Computed positions come from
                             LayoutAuthority's ForceAtlas2 pass and
                             are persisted via layout_pg_store.

  /api/graph/progress      — phase state machine
                             (L0 domains → L1 setup → L2 tools →
                             L3 files → L4 discussions → L5 memories
                             → L6:<proj> per-project symbols →
                             L6_CROSS). Used by clients to render
                             the loading bar without a snapshot.
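To make the phase ordering concrete, here is a minimal sketch of how a client might rank phase labels and derive a loading-bar fraction. The exact label strings ("L0", "L6:<proj>", "L6_CROSS"), the forward-only transition rule, and the evenly weighted fraction are assumptions for illustration; the payload actually returned by /api/graph/progress is not shown in this PR text.

```python
# Structural phases in the documented order:
# domains, setup, tools, files, discussions, memories.
STRUCTURAL_PHASES = ["L0", "L1", "L2", "L3", "L4", "L5"]


def phase_rank(label: str) -> int:
    """Map a phase label to its position in the documented order."""
    if label in STRUCTURAL_PHASES:
        return STRUCTURAL_PHASES.index(label)
    if label.startswith("L6:") and len(label) > 3:
        return len(STRUCTURAL_PHASES)  # any per-project symbol phase
    if label == "L6_CROSS":
        return len(STRUCTURAL_PHASES) + 1
    raise ValueError(f"unknown phase label: {label!r}")


def is_valid_transition(current: str, nxt: str) -> bool:
    """Assumed rule: the build never moves backwards in the order."""
    return phase_rank(nxt) >= phase_rank(current)


def progress_fraction(label: str) -> float:
    """Assumed weighting: every phase counts equally toward completion."""
    return phase_rank(label) / (len(STRUCTURAL_PHASES) + 1)
```

All per-project `L6:<proj>` phases share one rank here, so moving from one project's symbols to another's counts as a valid transition without advancing the bar.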

What's new in the build pipeline:

  * Skeleton-first baseline so the very first poll returns a usable
    domain backbone in ~1 s; everything else streams in behind.
  * Interleaved load+ingest+emit — every per-source batch is
    published to /api/graph/events the instant it ingests, rather
    than waiting for the full source to finish.
  * Per-source 500-item chunking inside the streamer so large sources
    (107 k memories, 670 k L6 symbols) don't show up as one big jump.
  * Streaming PG cursor on the memories phase
    (iter_memories_chunked + iter_hot_memories_chunked) so the
    Python side doesn't materialise 100 k+ rows in memory before
    yielding the first node.
  * L6 symbols now emit through the same SSE stream as the structural
    phases (previously they only landed in the cumulative cache and
    were invisible to live clients).
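The streaming-cursor idea can be sketched with the DB-API `fetchmany` loop below. This is not the code of iter_memories_chunked: the helper name, the table layout, and the query are hypothetical, and an in-memory sqlite3 database stands in for PostgreSQL so the example runs anywhere.

```python
import sqlite3
from typing import Iterator, List, Tuple


def iter_rows_chunked(
    conn: sqlite3.Connection, query: str, chunk_size: int = 500
) -> Iterator[List[Tuple]]:
    """Yield lists of at most `chunk_size` rows as they are fetched."""
    cur = conn.cursor()
    try:
        cur.execute(query)
        while True:
            rows = cur.fetchmany(chunk_size)
            if not rows:
                break
            yield rows
    finally:
        cur.close()


# Stand-in data: 1,200 rows in a hypothetical `memories` table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memories (id INTEGER PRIMARY KEY, content TEXT)")
conn.executemany(
    "INSERT INTO memories (content) VALUES (?)",
    [(f"memory {i}",) for i in range(1_200)],
)
chunk_sizes = [
    len(rows)
    for rows in iter_rows_chunked(conn, "SELECT id, content FROM memories ORDER BY id")
]
conn.close()
```

With a PostgreSQL driver the same loop would most likely need a server-side (named) cursor, since a default client-side cursor typically loads the whole result set before the first `fetchmany` returns.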

What's new in LayoutAuthority:

  layout_authority.py + scheduler + geometry + lod + log + pressure
  + protocol + wire — the full ForceAtlas2 server-side layout
  service. Produces stable positions for every node so the client
  can render without running physics. Persists to layout_pg_store
  (HNSW-backed via pg_vector) and serves through /api/quadtree.

  98 design audits in tasks/layout-authority/audits/ capture the
  full reasoning trail for the layout-authority architecture
  (Pólya, Einstein, Fermi, ...).

What stays the same:

  * open_visualization handler still opens the bundled viz at
    127.0.0.1:3458 — but server-side bootstrap now passes
    ?viz=force as the default mode so the README hero view loads
    instead of the slow tilemap fallback.
  * mcp_server/handlers/workflow_graph build_workflow_graph()
    signature is unchanged for non-streaming callers; new
    on_batch / defer_native_ast / stage kwargs are optional.
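The backwards-compatibility claim can be illustrated with a toy builder whose streaming hook is an optional keyword argument. `build_graph_sketch` is a hypothetical stand-in, not build_workflow_graph(); only the `on_batch` name comes from the commit message, its callback signature is assumed, and defer_native_ast / stage are not modelled.

```python
from typing import Callable, List, Optional


def build_graph_sketch(
    nodes: List,
    edges: List,
    on_batch: Optional[Callable[[dict], None]] = None,
    chunk_size: int = 500,
) -> dict:
    """Build a cumulative graph, optionally publishing each batch.

    Callers that omit `on_batch` get the same return value as callers
    that pass it; the callback only adds incremental notifications.
    """
    graph = {"nodes": [], "edges": []}
    for start in range(0, len(nodes), chunk_size):
        batch = {"nodes": nodes[start:start + chunk_size], "edges": []}
        graph["nodes"].extend(batch["nodes"])
        if on_batch is not None:
            on_batch(batch)
    graph["edges"].extend(edges)
    if edges and on_batch is not None:
        on_batch({"nodes": [], "edges": list(edges)})
    return graph


# Hypothetical input: 1,200 nodes and a single edge.
seen = []
nodes = list(range(1_200))
edges = [(0, 1)]
streamed = build_graph_sketch(nodes, edges, on_batch=seen.append)
plain = build_graph_sketch(nodes, edges)
```

Keeping the hook keyword-only in spirit (optional, defaulting to None) is what lets existing non-streaming call sites stay untouched.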

What is NOT in this commit (intentionally):

  * Any UI code — workflow_graph.js, workflow_graph_bridge.js,
    workflow_graph_render_canvas.js, controls.js, unified-viz.html,
    graph.js, the workflow_graph_filters / panel / labels /
    humanize modules, graph_event_stream.js, graph_snapshot.js.
    The user is rebuilding the visualization UI from scratch
    against this server contract; the wip branch carries the
    research-spike UI code that is being discarded.

Tests:
  37 / 37 layout-authority + open_visualization tests pass.
  All touched server modules import cleanly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The file manipulates sys.path before the package imports so it can be
run directly without installing. ruff E402 flags these as out-of-order;
suppress with noqa since the path setup is intentional.
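The pattern being suppressed looks roughly like the sketch below. The directory layout is an assumption, and `json` is only a stand-in for the package imports that follow the path setup in the real file.

```python
import sys
from pathlib import Path

# Assumed layout: the importable package sits next to this script.
# When __file__ is undefined (e.g. in a REPL), fall back to the
# current working directory.
ROOT = Path(globals().get("__file__", "script.py")).resolve().parent
if str(ROOT) not in sys.path:
    sys.path.insert(0, str(ROOT))

# Imports that come after executable statements trip ruff's E402
# ("module level import not at top of file"), so each one carries
# an explicit suppression.
import json  # noqa: E402  (stand-in for the package imports)
```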

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

All 16 files introduced in the server-side streaming pipeline
(layout_authority*, graph_event_stream, graph_snapshot, etc.) needed
ruff formatting to pass `ruff format --check`. No logic changes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

@cdeust cdeust merged commit 9343046 into main May 31, 2026
7 of 11 checks passed