Skip to content

fix(backend): preserve snapshot events in loadEvents head+tail read for large JSONL files#1662

Open
markturansky wants to merge 6 commits into
ambient-code:mainfrom
markturansky:fix/loadevents-head-tail-snapshot-preservation-v2
Open

fix(backend): preserve snapshot events in loadEvents head+tail read for large JSONL files#1662
markturansky wants to merge 6 commits into
ambient-code:mainfrom
markturansky:fix/loadevents-head-tail-snapshot-preservation-v2

Conversation

@markturansky

@markturansky markturansky commented Jun 5, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Fixes chat history disappearing when navigating away and back for sessions with large event logs (>2MB JSONL)
  • Root cause: the tail-read optimization (ffe4a21) only reads the last 2MB, truncating early snapshot/lifecycle events (MESSAGES_SNAPSHOT, RUN_STARTED, etc.) that the frontend needs for full conversation replay
  • Implements a head+tail read strategy: scans file head for snapshot/lifecycle events, scans tail for streaming events, merges with type < /dev/null | timestamp deduplication

Test plan

  • TestFastExtractType - 6 sub-tests covering standard events, edge cases
  • TestLoadEventsHeadTailMerge - 3 sub-tests with 4.4MB JSONL files verifying head preservation, deduplication, and tail-only fallback
  • All 16 existing + new tests pass (go test ./websocket/ -v)
  • go build and go vet pass
  • Manual integration test: navigate away and back on a session with large event log, verify full conversation replay

Generated with Claude Code

Summary by CodeRabbit

  • Bug Fixes

    • Optimized the event log loading mechanism for large log files, improving load times and eliminating event duplication during replay sessions.
    • Added automatic fallback handling to ensure continued functionality when log file access encounters edge cases.
  • Tests

    • Expanded test coverage for event log loading, merging, and deduplication workflows.

…or large JSONL files

The tail-read optimization (ffe4a21) only reads the last 2MB of event log
files, truncating early snapshot/lifecycle events (MESSAGES_SNAPSHOT,
RUN_STARTED, etc.) that the frontend needs for full conversation replay.

This adds a head+tail read strategy: scan the file head for snapshot and
lifecycle events, scan the tail for streaming events, and merge them with
type+timestamp deduplication. Fixes chat history disappearing on navigation
for sessions with large event logs.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
@netlify

netlify Bot commented Jun 5, 2026

Copy link
Copy Markdown

Deploy Preview for cheerful-kitten-f556a0 canceled.

Name Link
🔨 Latest commit 505475e
🔍 Latest deploy log https://app.netlify.com/projects/cheerful-kitten-f556a0/deploys/6a275d2c732d5800084f1f30

@coderabbitai

coderabbitai Bot commented Jun 5, 2026

Copy link
Copy Markdown
Contributor
📝 Walkthrough

Walkthrough

Event log replay for large agui-events.jsonl files now reads the head for lifecycle/snapshot events and the tail for recent streaming events, merges them while deduplicating on type+timestamp, and falls back to full-file reads if tail seeking fails. Fast type extraction and buffered head scanning avoid JSON parsing overhead.

Changes

Event Log Head+Tail Loading

Layer / File(s) Summary
Head+tail read strategy and merging
components/backend/websocket/agui_store.go
Large file detection now scans head events (run started/finished, snapshots) and tail events (recent streaming). Merge deduplicates by type|timestamp composite key, preserving head snapshots first. Comments and skip-first logic updated to reflect dual-scan approach.
Head snapshot scanning and type extraction
components/backend/websocket/agui_store.go
scanHeadSnapshotEvents reads log from start up to replayMaxTailBytes with buffered scanning and early cutoff. fastExtractType extracts JSON type field using byte-level searches to avoid full unmarshaling overhead on non-head events.
Test coverage for head+tail loading
components/backend/websocket/agui_store_test.go
TestFastExtractType validates type extraction across edge cases (missing/empty inputs, non-first field). writeLargeEventFile generates oversized JSONL with configurable head/tail. TestLoadEventsHeadTailMerge validates head snapshot preservation, deduplication on overlap, and tail-only fallback.

Sequence Diagram

sequenceDiagram
  participant loadEvents
  participant scanHeadSnapshotEvents
  participant fastExtractType
  participant eventFile as agui-events.jsonl
  
  loadEvents->>eventFile: check file size
  alt file > replayMaxTailBytes
    loadEvents->>scanHeadSnapshotEvents: scan head for snapshots
    scanHeadSnapshotEvents->>eventFile: read from start
    scanHeadSnapshotEvents->>fastExtractType: extract type field
    fastExtractType-->>scanHeadSnapshotEvents: type value
    scanHeadSnapshotEvents-->>loadEvents: head events
    loadEvents->>eventFile: seek to tail offset
    loadEvents->>eventFile: scan tail events
    loadEvents->>loadEvents: merge by type|timestamp key
    loadEvents-->>loadEvents: deduplicated events
  else file <= replayMaxTailBytes
    loadEvents->>eventFile: read full file
  end
Loading

Possibly related PRs

  • ambient-code/platform#890: Earlier PR modified loadEvents to avoid excessive memory on large event logs; this PR adds head+tail scanning and deduplication on top of that foundation.
🚥 Pre-merge checks | ✅ 7 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (7 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed Title follows Conventional Commits format (fix(backend): ...) and accurately describes the main change: implementing head+tail read strategy to preserve snapshot events in loadEvents for large JSONL files.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Performance And Algorithmic Complexity ✅ Passed Bounded 2MB head+tail scans with O(1) dedup lookups; fastExtractType avoids unnecessary JSON parsing; linear O(n) algorithm with no O(n²), N+1, unbounded growth.
Security And Secret Handling ✅ Passed No hardcoded secrets, path traversal protection via sessionID validation, secure JSON serialization, no sensitive payloads logged, no SQL/command injection, no K8s Secret violations.
Kubernetes Resource Safety ✅ Passed PR modifies only Go backend code (agui_store.go, agui_store_test.go) for event log handling; contains no Kubernetes manifests, RBAC policies, resource definitions, or pod configurations.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
✨ Simplify code
  • Create PR with simplified code

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
components/backend/websocket/agui_store.go (1)

273-338: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Keep the head scan out of the tail window.

For files in (replayMaxTailBytes, 2*replayMaxTailBytes), these two reads overlap. Appending headEvents before tailEvents then reorders overlap-range events and can replay lifecycle/snapshot events ahead of earlier stream events from the same file region. Clamp the head scan to tailOffset (or otherwise preserve file order) before merging.

Suggested fix
-	headEvents := scanHeadSnapshotEvents(f)
-
 	tailOffset := fileSize - replayMaxTailBytes
+	headScanLimit := tailOffset
+	if headScanLimit > replayMaxTailBytes {
+		headScanLimit = replayMaxTailBytes
+	}
+	headEvents := scanHeadSnapshotEvents(f, headScanLimit)
 	if _, err := f.Seek(tailOffset, 0); err != nil {
-func scanHeadSnapshotEvents(f *os.File) []map[string]interface{} {
+func scanHeadSnapshotEvents(f *os.File, maxBytes int64) []map[string]interface{} {
+	if maxBytes <= 0 {
+		return nil
+	}
 	if _, err := f.Seek(0, 0); err != nil {
 		return nil
 	}
@@
-	for bytesRead < replayMaxTailBytes {
+	for bytesRead < maxBytes {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/backend/websocket/agui_store.go` around lines 273 - 338, The head
scan can overlap the tail window causing event reordering; clamp the head scan
to the start of the tail window by ensuring scanHeadSnapshotEvents only reads up
to tailOffset (or by changing its signature to accept an io.Reader/limit) so it
doesn't include bytes at or after tailOffset, then merge headEvents and
tailEvents as before; modify the call to scanHeadSnapshotEvents and/or its
implementation (referencing scanHeadSnapshotEvents and the tailOffset variable)
to use an io.LimitedReader or seek+read-to-boundary so headEvents never overlaps
the tail region.
🧹 Nitpick comments (1)
components/backend/websocket/agui_store_test.go (1)

497-503: ⚡ Quick win

Add one overlap-range case and one numeric-timestamp case here.

These fixtures only write string timestamps and aim for a file well above 4MB, so the suite never covers the two brittle paths in this merge logic: numeric timestamp values from types.BaseEvent and overlapping head/tail windows for files just over 2MB. A subtest for each would pin both regressions.

Also applies to: 536-664

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/backend/websocket/agui_store_test.go` around lines 497 - 503, The
test fixtures only produce string timestamps and miss two brittle merge paths;
update the padding loop that builds events (look for paddingCount,
paddingContent and the map with keys "type", "messageId", "delta", "timestamp")
to add two additional cases: (1) an extra subtest that produces overlapping
head/tail windows by inserting events with timestamps that create an overlap
around the ~2MB boundary (so the file is just over 2MB and triggers the
overlap-merge logic), and (2) a subtest that writes at least one event using a
numeric timestamp value (the numeric form used by types.BaseEvent) instead of a
string to exercise the numeric-timestamp merge path; ensure these events use the
same event shape (types.EventTypeTextMessageContent, unique messageId like
"msg-pad-overlap-#", and delta content) so existing merge assertions still run.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@components/backend/websocket/agui_store.go`:
- Around line 273-338: The head scan can overlap the tail window causing event
reordering; clamp the head scan to the start of the tail window by ensuring
scanHeadSnapshotEvents only reads up to tailOffset (or by changing its signature
to accept an io.Reader/limit) so it doesn't include bytes at or after
tailOffset, then merge headEvents and tailEvents as before; modify the call to
scanHeadSnapshotEvents and/or its implementation (referencing
scanHeadSnapshotEvents and the tailOffset variable) to use an io.LimitedReader
or seek+read-to-boundary so headEvents never overlaps the tail region.

---

Nitpick comments:
In `@components/backend/websocket/agui_store_test.go`:
- Around line 497-503: The test fixtures only produce string timestamps and miss
two brittle merge paths; update the padding loop that builds events (look for
paddingCount, paddingContent and the map with keys "type", "messageId", "delta",
"timestamp") to add two additional cases: (1) an extra subtest that produces
overlapping head/tail windows by inserting events with timestamps that create an
overlap around the ~2MB boundary (so the file is just over 2MB and triggers the
overlap-merge logic), and (2) a subtest that writes at least one event using a
numeric timestamp value (the numeric form used by types.BaseEvent) instead of a
string to exercise the numeric-timestamp merge path; ensure these events use the
same event shape (types.EventTypeTextMessageContent, unique messageId like
"msg-pad-overlap-#", and delta content) so existing merge assertions still run.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 3c42f091-cf69-4c4c-b293-1e200d213eb2

📥 Commits

Reviewing files that changed from the base of the PR and between cd9d5d9 and 9301f08.

📒 Files selected for processing (2)
  • components/backend/websocket/agui_store.go
  • components/backend/websocket/agui_store_test.go

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant