feat(traces): opt-in consent-gated server-side conversation trace storage #18
Merged
Gateway-side recording of physical requests, gated by org setting + client consent cookie; 30-day TTL; reconstruct SessionTrace at view time; OTel GenAI vocabulary alignment; admin read/delete + GDPR erasure. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add boolean storeTraces to settings schema for server-side trace storage opt-in. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add storeTraces: false to emptySettings so the GET endpoint returns a consistent value before the first PUT. Rewrite the "persists the storeTraces flag" test to use the shared mockModel/defaultQuotas helpers and add a second assertion that verifies the false default path. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Write one trace document per model call when storeTraces is on AND the request carries x-trace-consent: yes. Recording is strictly fire-and-forget (service swallows all errors). Also advertises availability via x-trace-storage: available response header. Covers both streaming and non-streaming completion paths. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
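The gate and the fire-and-forget behaviour described in this commit can be sketched as follows. This is a minimal illustration, not the gateway's actual code: `shouldRecordTrace`, `availabilityHeaders`, `recordFireAndForget`, and the `TraceSettings` shape are hypothetical names. Only `storeTraces`, `x-trace-consent: yes`, and `x-trace-storage: available` come from the commit message, and advertising availability independently of consent is an assumption.

```typescript
// Hypothetical shapes for illustration; only the flag and header names come from the commit.
interface TraceSettings { storeTraces?: boolean }
type RequestHeaders = Record<string, string | undefined>

// Record only when the org opted in AND the request carries x-trace-consent: yes.
function shouldRecordTrace (settings: TraceSettings, headers: RequestHeaders): boolean {
  return settings.storeTraces === true && headers['x-trace-consent'] === 'yes'
}

// Assumption: availability is advertised whenever the org setting is on,
// so the client can ask the user for consent.
function availabilityHeaders (settings: TraceSettings): Record<string, string> {
  return settings.storeTraces === true ? { 'x-trace-storage': 'available' } : {}
}

// Fire-and-forget: the write is never awaited and every error is swallowed,
// so a failed recording cannot affect the completion response.
function recordFireAndForget (write: () => Promise<void>): void {
  Promise.resolve().then(write).catch(() => {})
}
```

Wrapping the call in `Promise.resolve().then(write)` also catches a synchronous throw inside `write`, which a bare `write().catch(...)` would not.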
…ing in the response path Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adds GET/DELETE endpoints for trace-requests under /api/traces/:type/:id (conversation list + per-conversation detail/delete + per-user erasure). Mounts the router in app.ts and extends the test-env cleanup to clear trace-requests between test runs. Adds api spec verifying gateway recording end-to-end (consent gate, storeTraces toggle, auth guard, delete). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…lper Add a compound index on (owner.type, owner.id, userId) to support the per-user erasure query without a collection scan. Add two API tests: one asserting the 400 guard when ?userId is omitted, and a real end-to-end erasure test using an org owner where trackPerUser=true stores the member's userId. Also make waitForConversations() throw instead of silently returning an empty result on timeout. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adds pure `reconstructTrace(requests)` function that rebuilds a `SessionTrace` from an array of stored `StoredTraceRequest` documents so the existing trace viewer can display server-side traces without modification. Includes three unit tests covering system-prompt extraction, tool-call/result pairing, and sub-agent grouping. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ble-quoted previews safeJson keeps non-JSON string outputs as-is. Make the tool-result test assertion robust to object/string output forms. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…to gateway Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add TraceConsentSheet component that shows a bottom-sheet asking the user to accept or decline server-side trace storage when the gateway advertises x-trace-storage: available and no consent cookie exists. Mount it in AgentChat.vue. Add a consent toggle in the Settings tab of AgentChatDebugDialog for users to revisit their choice. Backed by e2e tests covering accept/persist, settings-toggle visibility, and no-sheet-when-disabled scenarios. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Introduce a module-level `consentRef` in trace-consent.ts, updated by `writeConsent`, so accepting/declining the bottom-sheet is immediately reflected in the Settings-tab toggle without a page reload. Also switch the bottom-sheet from `v-model` to `:model-value` on a read-only computed, and strengthen the e2e test with an immediate toBeHidden() assertion after Accept. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Route fetchStored/loadStored/deleteStored failures through the page's existing loadError ref so admins see feedback on 4xx/5xx/network errors. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…se UI Fix A: strip leading `subagent_` from tool-call names before matching against `subByKey` keys (which are built from the display name, e.g. `Researcher:0`), so that two distinct sub-agents firing in the same step each map to the correct sub-agent block by name rather than falling back to arrival order. Fix B: add a per-user GDPR erase action (mdiAccountRemove icon button) on each stored-conversation row that has a userId. Calls DELETE /api/traces/:type/:id?userId=… after window.confirm, then re-fetches the list. Bilingual i18n keys added. Updated the trace-review e2e to use `.first()` on the button locator now that rows can have two buttons. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
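Fix A above amounts to normalising the tool-call name before the lookup. A minimal sketch, assuming the `subByKey` key is the display name followed by `:` and an occurrence index (as the `Researcher:0` example suggests); `subAgentName` and `findSubAgent` are hypothetical names, not the functions in the codebase:

```typescript
const SUBAGENT_PREFIX = 'subagent_'

// Strip the leading prefix so the tool-call name matches the display name.
function subAgentName (toolName: string): string {
  return toolName.startsWith(SUBAGENT_PREFIX) ? toolName.slice(SUBAGENT_PREFIX.length) : toolName
}

// Look up the sub-agent block by name and occurrence instead of arrival order.
// Assumption: keys have the form "<displayName>:<occurrence>".
function findSubAgent<T> (subByKey: Map<string, T>, toolName: string, occurrence: number): T | undefined {
  return subByKey.get(`${subAgentName(toolName)}:${occurrence}`)
}
```

With this lookup, two sub-agents called in the same step resolve to their own blocks even when their results arrive in a different order than the calls.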
…gs form
The settings vjsf form mishandled the conditionally-hidden model/quota
sections and persisted-data round-tripping:
- models/quotas are required at the root but their sections are hidden
until a provider exists; vjsf prunes the hidden empty data on first
edit, so toggling the new "store traces" switch on an empty config
raised a global "information obligatoire" ("required information") error. Drop models/quotas
from the root required array (both have defaults).
- the server kept re-injecting models: {} (emptySettings + PUT
body.models ?? {}) which vjsf then strips from the hidden section,
producing a permanent diff that made Save reappear on every reload.
Round-trip models exactly instead: emptySettings omits it and PUT
stores it only when present. The API already preserves submitted data.
- extract the duplicated defaultQuotas into settings/service.ts and use
it (with optional chaining) in gateway/summary/usage now that
models/quotas are optional on the Settings type.
Regression tests cover: no required error when toggling store-traces on
an empty config, empty config converges after save, and a saved config
converges after one save.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…cation Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…mplification Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Replace the fixed $limit: 200 with $facet-based pagination. Honor the ?page and ?size query params (size clamped to 1-200, default 20). The response now includes a count field alongside results. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
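The pagination rules above can be sketched as a small parser plus a `$facet` stage. This is an illustration under assumptions, not the router's code: `parsePagination` and `paginationStage` are hypothetical names, and 1-based page numbering is assumed because the commit does not say. `$facet`, `$skip`, `$limit`, and `$count` are standard MongoDB aggregation stages.

```typescript
interface Pagination { page: number, size: number, skip: number }

// Parse ?page and ?size: size clamped to 1-200 with default 20; page assumed 1-based.
function parsePagination (query: { page?: string, size?: string }): Pagination {
  const rawSize = Number.parseInt(query.size ?? '', 10)
  const rawPage = Number.parseInt(query.page ?? '', 10)
  const size = Number.isNaN(rawSize) ? 20 : Math.min(200, Math.max(1, rawSize))
  const page = Number.isNaN(rawPage) ? 1 : Math.max(1, rawPage)
  return { page, size, skip: (page - 1) * size }
}

// One $facet stage yields the requested page and the total count in a single aggregation.
function paginationStage (p: Pagination) {
  return {
    $facet: {
      results: [{ $skip: p.skip }, { $limit: p.size }],
      count: [{ $count: 'total' }]
    }
  }
}
```

For example, `?page=3&size=10` gives `skip: 20`, and `?size=500` is clamped to 200.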
…t recorder Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Remove all live-capture methods (reset, setSystemPrompt, snapshotTools, startTurn, startToolCall, finishToolCall, startSubAgent, recordSubAgentStep, recordCompaction, recordModerationDecision, recordPhysicalRequest, finishStep, addStepMessages) and the transient private state they used (currentTurn, currentStep, pendingToolCalls). The class now exposes only fromTrace(), getTrace(), getTraceOverview(), getTraceEntry(), and getTraceEntries(). Rewrite the unit spec to use fromTrace() with hardcoded SessionTrace fixtures instead of the removed recording API. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ted traces Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…eview link Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…w page The handoff bridge and the /:type/:id/trace-review page are superseded by server-stored traces (activity page + /traces/:id/review). Removed their unit/e2e test files and updated typed-router.d.ts accordingly. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…bedded + standalone)
Add tests for compaction, moderation, hidden-context and tools-changed overview entries that were lost in the read-only surface rewrite. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ation-in-trace (out of scope) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… trace Tag the moderator gateway call with a moderation:<turnId> context so it is stored, derive contextKind 'moderation' server-side, and reconstruct it into a moderation trace entry (re-parsing the stored verdict). Also fixes systemPrompt reconstruction to use a turn request, not the moderation/compaction prompt. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The consent bottom-sheet (shown when trace storage is available but consent is undefined) overlays the chat and blocks message sends. Move storeTraces:true out of the shared settings into the review-page tests that grant consent, so the functionality tests (multi-message / compaction) aren't blocked. Also retry once locally to absorb dev-server load flakes during the long sequential e2e run. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The review-page assertions duplicated the dedicated 'Subagent trace appears on the review page' test, making :77 a double-flow test prone to timeout. It now only verifies the live UI chain (no trace storage / consent needed).
A turn blocked by moderation aborts its assistant stream before the gateway records the turn request, leaving only the moderation request stored. Reconstruct now surfaces such orphaned moderation/compaction requests as their own turn, so the moderation entry appears on the review page regardless of whether the turn was stored. Fixes the intermittent moderation review-page test.
…resAt Retention is a single fixed 30-day policy, so store createdAt as a BSON Date and put the TTL index on it (expireAfterSeconds: 30d) instead of computing a separate expiresAt at write time. mongoLib.configure drops+recreates the ttl-keys index.
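As a rough sketch of the retention scheme described here: the TTL is 30 × 24 × 60 × 60 = 2,592,000 seconds, and documents only need a `createdAt` Date. The `ttlIndex` object and `newTraceDoc` helper are hypothetical; how `mongoLib.configure` actually receives index definitions is not shown in the commit, so the `keys`/`options` shape below simply mirrors MongoDB's `createIndex(keys, options)`.

```typescript
// 30 days expressed in seconds for expireAfterSeconds.
const THIRTY_DAYS_SECONDS = 30 * 24 * 60 * 60

// Hypothetical index definition; 'ttl-keys' is the index name mentioned in the commit.
const ttlIndex = {
  keys: { createdAt: 1 },
  options: { name: 'ttl-keys', expireAfterSeconds: THIRTY_DAYS_SECONDS }
}

// createdAt must be a real Date (stored as a BSON Date) for MongoDB's TTL
// monitor to expire the document; strings or numbers are ignored by it.
function newTraceDoc<T extends object> (body: T, now: Date = new Date()): T & { createdAt: Date } {
  return { ...body, createdAt: now }
}
```

Putting the TTL on `createdAt` means a change to the retention period only touches the index, not the stored documents.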
GET /traces/conversation/:id queries by conversation.id with no owner, which no
existing index covered as a prefix (list-keys has it 3rd) — it was a collection
scan. Add { conversation.id: 1, createdAt: 1 }.
A physical request and the semantic entries reconstructed from its response share one timestamp, so the timestamp-only sort left the request after its derived info. Add a tie-break rank: inputs, then the physical request, then the extracted entries. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
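The tie-break can be illustrated with a two-key comparator. This is a sketch with hypothetical names and shapes (`TraceEntry`, `EntryKind`, `sortEntries`, numeric timestamps); only the ordering rule itself (inputs, then the physical request, then the extracted entries) comes from the commit.

```typescript
// Hypothetical entry shape; real trace entries carry more fields.
type EntryKind = 'input' | 'physical-request' | 'extracted'
interface TraceEntry { timestamp: number, kind: EntryKind, label: string }

// Rank used only when two entries share a timestamp.
const kindRank: Record<EntryKind, number> = { input: 0, 'physical-request': 1, extracted: 2 }

// Sort by timestamp first, then by rank; returns a new array.
function sortEntries (entries: TraceEntry[]): TraceEntry[] {
  return [...entries].sort((a, b) =>
    (a.timestamp - b.timestamp) || (kindRank[a.kind] - kindRank[b.kind])
  )
}
```

Because the rank is only consulted when the timestamps are equal, entries from different requests keep their chronological order.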
Reconcile the trace-storage work with main's quota refactor (#16) and streaming parallel-tool-call fix (#17):
- gateway/summary routers adopt resolveUsageIdentity/enforceQuotas from usage/enforce.ts while keeping trace recording and the streamed tool-call capture; merged the per-id toolCallIndex with streamedToolCalls
- defaultQuotas (now incl. untrusted) stays centralized in settings/service.ts; routers fall back to it, enforce.ts uses NonNullable<Settings['quotas']>
- settings.quotas/models remain optional (storeTraces form work)
- regenerated put-req validate.js and vjsf components from the merged schema
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add opt-in, consent-gated server-side storage of conversation traces, with admin review tooling, replacing the old in-browser session recorder and trace upload/handoff.
What changed:
- `trace-requests` collection storing each physical gateway request/response; written fire-and-forget only when the org enables `storeTraces` and the user consents (`x-trace-consent`). Auto-expires after 30 days via a TTL index.
- `/api/traces`: paginated conversation list, fetch-by-conversation, delete-conversation, and GDPR per-user erasure (`DELETE /:type/:id?userId=`).
- `main`: adopts the `usage/enforce.ts` quota refactor (#16, "feat(limits): add untrusted pool quota for anonymous + external usage") and the parallel-tool-call streaming fix (#17, "fix(gateway): preserve parallel tool calls in the streaming path"), keeping trace capture in the same stream loop.

Why: give admins a way to review real conversations for debugging/quality, while keeping it strictly opt-in per org and per user, with bounded retention.
Regression risks (reviewer focus):
- `settings` PUT now omits `models` when the form sends none (documented form-diff fix); with `replaceOne` this clears a stored empty `models`. Confirm no consumer reads `settings.models`/`settings.quotas` without optional chaining; both are now optional in the schema.
- … `trace-requests` collection at startup.
- `GET /traces/conversation/:id` looks up across owners before authorizing (returns 403 with no data to non-admins of the owner).