Skip to content

Add server observability tracing and metrics#1697

Merged
juliusmarminge merged 11 commits intomainfrom
t3code/effect-tracing-observability
Apr 3, 2026
Merged

Add server observability tracing and metrics#1697
juliusmarminge merged 11 commits intomainfrom
t3code/effect-tracing-observability

Conversation

@juliusmarminge
Copy link
Copy Markdown
Member

@juliusmarminge juliusmarminge commented Apr 2, 2026

Summary

  • Adds server-side observability plumbing for local trace files, OTLP export configuration, and trace/metric initialization.
  • Instruments key runtime paths, including RPC, git execution, provider/orchestration flow, and process startup, with trace attributes and metrics.
  • Refactors snapshot and checkpoint lookup logic to use the unified orchestration read model and updates related tests and docs.
  • Simplifies and reshapes several server/web data flows and UI components to align with the updated orchestration state model.

Testing

  • Not run locally; this PR body was prepared from the diff only.
  • Updated and added unit/integration coverage across server observability, git, checkpointing, orchestration, and web logic paths.

Note

Medium Risk
Introduces new tracing/metrics plumbing and wraps several core runtime paths (RPC, orchestration dispatch, provider operations, git, terminal) which could affect performance and failure behavior if instrumentation misbehaves or config is incorrect, though changes are mostly additive.

Overview
Adds server-side observability: local NDJSON trace recording (with batching/rotation) plus optional OTLP trace/metrics export driven by new ServerConfig fields and CLI env vars (T3CODE_TRACE_*, T3CODE_OTLP_*), and wires this via a new ObservabilityLive layer (replacing file-based logging in ServerLoggerLive).

Instruments key execution paths with spans and metrics: WebSocket RPC handlers (via observeRpc* wrappers), GitCore.execute (spans + command metrics + hook events), orchestration command dispatch (duration/total + ack latency on first committed event), provider session/turn APIs and runtime event ingestion, and terminal lifecycle/restart counters; startup phases and editor launches now annotate spans.

Exposes observability info to the web client in server config snapshots and adds a Settings "Diagnostics" section to display the logs directory and open it in the preferred editor, with updated fixtures/tests and a new docs link in README.md.

Written by Cursor Bugbot for commit c6b7082. This will update automatically on new commits. Configure here.

Note

Add distributed tracing, metrics, and local NDJSON trace file export to the server

  • Introduces a new ObservabilityLive layer (replacing ServerLoggerLive) that writes spans to a rotating local NDJSON trace file and optionally exports traces and metrics via OTLP when configured.
  • Adds ~15 named metrics counters and timers (prefixed t3_) covering RPC requests, orchestration commands, provider sessions/turns, git commands, and terminal sessions, with a withMetrics combinator to attach them to effects.
  • Instruments WebSocket RPC handlers, the orchestration engine, provider service, git core, and terminal manager with span annotations and metric emission.
  • Adds new observability fields to ServerConfig and ServerConfigShape for trace file path, OTLP URLs, batch settings, and service name, read from T3CODE_* environment variables.
  • Adds a Diagnostics section to GeneralSettingsPanel showing the logs directory path and a button to open it, powered by a new useServerObservability hook.
  • Risk: ServerConfig schema now requires an observability object; any existing serialized configs without it will fail schema validation.

Macroscope summarized c6b7082.

- Add local trace file output and OTLP export hooks
- Instrument RPC, git, orchestration, provider, and DB paths
- Document observability configuration
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 2, 2026

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 719b563d-3c5c-487e-bf31-545bd8a8dc10

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch t3code/effect-tracing-observability

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions github-actions bot added vouch:trusted PR author is trusted by repo permissions or the VOUCHED list. size:XXL 1,000+ changed lines (additions + deletions). labels Apr 2, 2026
Copy link
Copy Markdown
Contributor

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Autofix Details

Bugbot Autofix prepared fixes for both issues found in the latest run.

  • ✅ Fixed: Provider metrics always report "unknown" provider label
    • Hoisted resolveRoutableSession calls above the inner Effect.gen in all five affected operations so routed.adapter.provider is available for providerMetricAttributes instead of hardcoding "unknown".
  • ✅ Fixed: Exported withMetricAttributes is never used anywhere
    • Removed the unused withMetricAttributes re-export from Metrics.ts as it was dead code with no consumers in the codebase.

Create PR

Or push these changes by commenting:

@cursor push 397de494ef
Preview (397de494ef)
diff --git a/apps/server/src/observability/Metrics.ts b/apps/server/src/observability/Metrics.ts
--- a/apps/server/src/observability/Metrics.ts
+++ b/apps/server/src/observability/Metrics.ts
@@ -74,8 +74,6 @@
   attributes: Readonly<Record<string, unknown>>,
 ): ReadonlyArray<[string, string]> => Object.entries(compactMetricAttributes(attributes));
 
-export const withMetricAttributes = Metric.withAttributes;
-
 export const increment = (
   metric: Metric.Metric<number, unknown>,
   attributes: Readonly<Record<string, unknown>>,

diff --git a/apps/server/src/provider/Layers/ProviderService.ts b/apps/server/src/provider/Layers/ProviderService.ts
--- a/apps/server/src/provider/Layers/ProviderService.ts
+++ b/apps/server/src/provider/Layers/ProviderService.ts
@@ -460,12 +460,12 @@
         schema: ProviderInterruptTurnInput,
         payload: rawInput,
       });
+      const routed = yield* resolveRoutableSession({
+        threadId: input.threadId,
+        operation: "ProviderService.interruptTurn",
+        allowRecovery: true,
+      });
       return yield* Effect.gen(function* () {
-        const routed = yield* resolveRoutableSession({
-          threadId: input.threadId,
-          operation: "ProviderService.interruptTurn",
-          allowRecovery: true,
-        });
         yield* Effect.annotateCurrentSpan({
           "provider.operation": "interrupt-turn",
           "provider.kind": routed.adapter.provider,
@@ -479,7 +479,7 @@
       }).pipe(
         withMetrics({
           counter: providerTurnsTotal,
-          attributes: providerMetricAttributes("unknown", {
+          attributes: providerMetricAttributes(routed.adapter.provider, {
             operation: "interrupt",
           }),
         }),
@@ -494,12 +494,12 @@
         schema: ProviderRespondToRequestInput,
         payload: rawInput,
       });
+      const routed = yield* resolveRoutableSession({
+        threadId: input.threadId,
+        operation: "ProviderService.respondToRequest",
+        allowRecovery: true,
+      });
       return yield* Effect.gen(function* () {
-        const routed = yield* resolveRoutableSession({
-          threadId: input.threadId,
-          operation: "ProviderService.respondToRequest",
-          allowRecovery: true,
-        });
         yield* Effect.annotateCurrentSpan({
           "provider.operation": "respond-to-request",
           "provider.kind": routed.adapter.provider,
@@ -514,7 +514,7 @@
       }).pipe(
         withMetrics({
           counter: providerTurnsTotal,
-          attributes: providerMetricAttributes("unknown", {
+          attributes: providerMetricAttributes(routed.adapter.provider, {
             operation: "approval-response",
           }),
         }),
@@ -530,12 +530,12 @@
       schema: ProviderRespondToUserInputInput,
       payload: rawInput,
     });
+    const routed = yield* resolveRoutableSession({
+      threadId: input.threadId,
+      operation: "ProviderService.respondToUserInput",
+      allowRecovery: true,
+    });
     return yield* Effect.gen(function* () {
-      const routed = yield* resolveRoutableSession({
-        threadId: input.threadId,
-        operation: "ProviderService.respondToUserInput",
-        allowRecovery: true,
-      });
       yield* Effect.annotateCurrentSpan({
         "provider.operation": "respond-to-user-input",
         "provider.kind": routed.adapter.provider,
@@ -546,7 +546,7 @@
     }).pipe(
       withMetrics({
         counter: providerTurnsTotal,
-        attributes: providerMetricAttributes("unknown", {
+        attributes: providerMetricAttributes(routed.adapter.provider, {
           operation: "user-input-response",
         }),
       }),
@@ -560,12 +560,12 @@
         schema: ProviderStopSessionInput,
         payload: rawInput,
       });
+      const routed = yield* resolveRoutableSession({
+        threadId: input.threadId,
+        operation: "ProviderService.stopSession",
+        allowRecovery: false,
+      });
       return yield* Effect.gen(function* () {
-        const routed = yield* resolveRoutableSession({
-          threadId: input.threadId,
-          operation: "ProviderService.stopSession",
-          allowRecovery: false,
-        });
         yield* Effect.annotateCurrentSpan({
           "provider.operation": "stop-session",
           "provider.kind": routed.adapter.provider,
@@ -581,7 +581,7 @@
       }).pipe(
         withMetrics({
           counter: providerSessionsTotal,
-          attributes: providerMetricAttributes("unknown", {
+          attributes: providerMetricAttributes(routed.adapter.provider, {
             operation: "stop",
           }),
         }),
@@ -651,12 +651,12 @@
     if (input.numTurns === 0) {
       return;
     }
+    const routed = yield* resolveRoutableSession({
+      threadId: input.threadId,
+      operation: "ProviderService.rollbackConversation",
+      allowRecovery: true,
+    });
     return yield* Effect.gen(function* () {
-      const routed = yield* resolveRoutableSession({
-        threadId: input.threadId,
-        operation: "ProviderService.rollbackConversation",
-        allowRecovery: true,
-      });
       yield* Effect.annotateCurrentSpan({
         "provider.operation": "rollback-conversation",
         "provider.kind": routed.adapter.provider,
@@ -671,7 +671,7 @@
     }).pipe(
       withMetrics({
         counter: providerTurnsTotal,
-        attributes: providerMetricAttributes("unknown", {
+        attributes: providerMetricAttributes(routed.adapter.provider, {
           operation: "rollback",
         }),
       }),

You can send follow-ups to this agent here.

@macroscopeapp
Copy link
Copy Markdown
Contributor

macroscopeapp bot commented Apr 2, 2026

Approvability

Verdict: Needs human review

This PR introduces a complete server observability system including local trace files, OTLP export, and metrics throughout the codebase. The scope includes new infrastructure, runtime behavior changes (file creation, measurement overhead), and an unresolved review comment about per-query metrics performance overhead that warrants discussion.

You can customize Macroscope's approvability policy. Learn more.

juliusmarminge and others added 3 commits April 2, 2026 13:16
Resolve startup and git observability conflicts while preserving mainline behavior.

Co-authored-by: codex <codex@users.noreply.github.com>
- Record orchestration command ack latency and route provider metrics by resolved provider
- Capture RPC stream failures and keep trace sink writing past bad records
- Add circular attribute normalization coverage
Copy link
Copy Markdown
Contributor

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

There are 3 total unresolved issues (including 1 from previous review).

Autofix Details

Bugbot Autofix prepared fixes for both issues found in the latest run.

  • ✅ Fixed: Orchestration command metrics outcome always reports success
    • Moved withMetrics before Effect.catch so the metrics wrapper observes the actual success/failure exit of the command processing effect, rather than always seeing the void success returned by the catch handler.
  • ✅ Fixed: Redundant wrapper functions add unnecessary indirection
    • Removed the trivial rpcEffect, rpcStream, and rpcStreamEffect pass-through wrappers and replaced all call sites with direct calls to observeRpcEffect, observeRpcStream, and observeRpcStreamEffect.

Create PR

Or push these changes by commenting:

@cursor push 6cb70ca80d
Preview (6cb70ca80d)
diff --git a/apps/server/src/orchestration/Layers/OrchestrationEngine.ts b/apps/server/src/orchestration/Layers/OrchestrationEngine.ts
--- a/apps/server/src/orchestration/Layers/OrchestrationEngine.ts
+++ b/apps/server/src/orchestration/Layers/OrchestrationEngine.ts
@@ -201,6 +201,15 @@
       }
       yield* Deferred.succeed(envelope.result, { sequence: committedCommand.lastSequence });
     }).pipe(
+      Effect.withSpan(`orchestration.command.${envelope.command.type}`),
+      withMetrics({
+        counter: orchestrationCommandsTotal,
+        timer: orchestrationCommandDuration,
+        attributes: {
+          commandType: envelope.command.type,
+          aggregateKind: aggregateRef.aggregateKind,
+        },
+      }),
       Effect.catch((error) =>
         Effect.gen(function* () {
           yield* reconcileReadModelAfterDispatchFailure.pipe(
@@ -232,16 +241,7 @@
           yield* Deferred.fail(envelope.result, error);
         }),
       ),
-      Effect.withSpan(`orchestration.command.${envelope.command.type}`),
       Effect.asVoid,
-      withMetrics({
-        counter: orchestrationCommandsTotal,
-        timer: orchestrationCommandDuration,
-        attributes: {
-          commandType: envelope.command.type,
-          aggregateKind: aggregateRef.aggregateKind,
-        },
-      }),
     );
   };
 

diff --git a/apps/server/src/ws.ts b/apps/server/src/ws.ts
--- a/apps/server/src/ws.ts
+++ b/apps/server/src/ws.ts
@@ -76,31 +76,9 @@
       };
     });
 
-    const rpcEffect = <A, E, R>(
-      method: string,
-      effect: Effect.Effect<A, E, R>,
-      traceAttributes?: Readonly<Record<string, unknown>>,
-    ) => observeRpcEffect(method, effect, traceAttributes);
-
-    const rpcStream = <A, E, R>(
-      method: string,
-      stream: Stream.Stream<A, E, R>,
-      traceAttributes?: Readonly<Record<string, unknown>>,
-    ) => observeRpcStream(method, stream, traceAttributes);
-
-    const rpcStreamEffect = <A, StreamError, StreamContext, EffectError, EffectContext>(
-      method: string,
-      effect: Effect.Effect<
-        Stream.Stream<A, StreamError, StreamContext>,
-        EffectError,
-        EffectContext
-      >,
-      traceAttributes?: Readonly<Record<string, unknown>>,
-    ) => observeRpcStreamEffect(method, effect, traceAttributes);
-
     return WsRpcGroup.of({
       [ORCHESTRATION_WS_METHODS.getSnapshot]: (_input) =>
-        rpcEffect(
+        observeRpcEffect(
           ORCHESTRATION_WS_METHODS.getSnapshot,
           projectionSnapshotQuery.getSnapshot().pipe(
             Effect.mapError(
@@ -114,7 +92,7 @@
           { "rpc.aggregate": "orchestration" },
         ),
       [ORCHESTRATION_WS_METHODS.dispatchCommand]: (command) =>
-        rpcEffect(
+        observeRpcEffect(
           ORCHESTRATION_WS_METHODS.dispatchCommand,
           Effect.gen(function* () {
             const normalizedCommand = yield* normalizeDispatchCommand(command);
@@ -132,7 +110,7 @@
           { "rpc.aggregate": "orchestration" },
         ),
       [ORCHESTRATION_WS_METHODS.getTurnDiff]: (input) =>
-        rpcEffect(
+        observeRpcEffect(
           ORCHESTRATION_WS_METHODS.getTurnDiff,
           checkpointDiffQuery.getTurnDiff(input).pipe(
             Effect.mapError(
@@ -146,7 +124,7 @@
           { "rpc.aggregate": "orchestration" },
         ),
       [ORCHESTRATION_WS_METHODS.getFullThreadDiff]: (input) =>
-        rpcEffect(
+        observeRpcEffect(
           ORCHESTRATION_WS_METHODS.getFullThreadDiff,
           checkpointDiffQuery.getFullThreadDiff(input).pipe(
             Effect.mapError(
@@ -160,7 +138,7 @@
           { "rpc.aggregate": "orchestration" },
         ),
       [ORCHESTRATION_WS_METHODS.replayEvents]: (input) =>
-        rpcEffect(
+        observeRpcEffect(
           ORCHESTRATION_WS_METHODS.replayEvents,
           Stream.runCollect(
             orchestrationEngine.readEvents(
@@ -179,7 +157,7 @@
           { "rpc.aggregate": "orchestration" },
         ),
       [WS_METHODS.subscribeOrchestrationDomainEvents]: (_input) =>
-        rpcStreamEffect(
+        observeRpcStreamEffect(
           WS_METHODS.subscribeOrchestrationDomainEvents,
           Effect.gen(function* () {
             const snapshot = yield* orchestrationEngine.getReadModel();
@@ -238,15 +216,17 @@
           { "rpc.aggregate": "orchestration" },
         ),
       [WS_METHODS.serverGetConfig]: (_input) =>
-        rpcEffect(WS_METHODS.serverGetConfig, loadServerConfig, { "rpc.aggregate": "server" }),
+        observeRpcEffect(WS_METHODS.serverGetConfig, loadServerConfig, {
+          "rpc.aggregate": "server",
+        }),
       [WS_METHODS.serverRefreshProviders]: (_input) =>
-        rpcEffect(
+        observeRpcEffect(
           WS_METHODS.serverRefreshProviders,
           providerRegistry.refresh().pipe(Effect.map((providers) => ({ providers }))),
           { "rpc.aggregate": "server" },
         ),
       [WS_METHODS.serverUpsertKeybinding]: (rule) =>
-        rpcEffect(
+        observeRpcEffect(
           WS_METHODS.serverUpsertKeybinding,
           Effect.gen(function* () {
             const keybindingsConfig = yield* keybindings.upsertKeybindingRule(rule);
@@ -255,15 +235,15 @@
           { "rpc.aggregate": "server" },
         ),
       [WS_METHODS.serverGetSettings]: (_input) =>
-        rpcEffect(WS_METHODS.serverGetSettings, serverSettings.getSettings, {
+        observeRpcEffect(WS_METHODS.serverGetSettings, serverSettings.getSettings, {
           "rpc.aggregate": "server",
         }),
       [WS_METHODS.serverUpdateSettings]: ({ patch }) =>
-        rpcEffect(WS_METHODS.serverUpdateSettings, serverSettings.updateSettings(patch), {
+        observeRpcEffect(WS_METHODS.serverUpdateSettings, serverSettings.updateSettings(patch), {
           "rpc.aggregate": "server",
         }),
       [WS_METHODS.projectsSearchEntries]: (input) =>
-        rpcEffect(
+        observeRpcEffect(
           WS_METHODS.projectsSearchEntries,
           workspaceEntries.search(input).pipe(
             Effect.mapError(
@@ -277,7 +257,7 @@
           { "rpc.aggregate": "workspace" },
         ),
       [WS_METHODS.projectsWriteFile]: (input) =>
-        rpcEffect(
+        observeRpcEffect(
           WS_METHODS.projectsWriteFile,
           workspaceFileSystem.writeFile(input).pipe(
             Effect.mapError((cause) => {
@@ -293,15 +273,19 @@
           { "rpc.aggregate": "workspace" },
         ),
       [WS_METHODS.shellOpenInEditor]: (input) =>
-        rpcEffect(WS_METHODS.shellOpenInEditor, open.openInEditor(input), {
+        observeRpcEffect(WS_METHODS.shellOpenInEditor, open.openInEditor(input), {
           "rpc.aggregate": "workspace",
         }),
       [WS_METHODS.gitStatus]: (input) =>
-        rpcEffect(WS_METHODS.gitStatus, gitManager.status(input), { "rpc.aggregate": "git" }),
+        observeRpcEffect(WS_METHODS.gitStatus, gitManager.status(input), {
+          "rpc.aggregate": "git",
+        }),
       [WS_METHODS.gitPull]: (input) =>
-        rpcEffect(WS_METHODS.gitPull, git.pullCurrentBranch(input.cwd), { "rpc.aggregate": "git" }),
+        observeRpcEffect(WS_METHODS.gitPull, git.pullCurrentBranch(input.cwd), {
+          "rpc.aggregate": "git",
+        }),
       [WS_METHODS.gitRunStackedAction]: (input) =>
-        rpcStream(
+        observeRpcStream(
           WS_METHODS.gitRunStackedAction,
           Stream.callback<GitActionProgressEvent, GitManagerServiceError>((queue) =>
             gitManager
@@ -321,61 +305,63 @@
           { "rpc.aggregate": "git" },
         ),
       [WS_METHODS.gitResolvePullRequest]: (input) =>
-        rpcEffect(WS_METHODS.gitResolvePullRequest, gitManager.resolvePullRequest(input), {
+        observeRpcEffect(WS_METHODS.gitResolvePullRequest, gitManager.resolvePullRequest(input), {
           "rpc.aggregate": "git",
         }),
       [WS_METHODS.gitPreparePullRequestThread]: (input) =>
-        rpcEffect(
+        observeRpcEffect(
           WS_METHODS.gitPreparePullRequestThread,
           gitManager.preparePullRequestThread(input),
           { "rpc.aggregate": "git" },
         ),
       [WS_METHODS.gitListBranches]: (input) =>
-        rpcEffect(WS_METHODS.gitListBranches, git.listBranches(input), { "rpc.aggregate": "git" }),
+        observeRpcEffect(WS_METHODS.gitListBranches, git.listBranches(input), {
+          "rpc.aggregate": "git",
+        }),
       [WS_METHODS.gitCreateWorktree]: (input) =>
-        rpcEffect(WS_METHODS.gitCreateWorktree, git.createWorktree(input), {
+        observeRpcEffect(WS_METHODS.gitCreateWorktree, git.createWorktree(input), {
           "rpc.aggregate": "git",
         }),
       [WS_METHODS.gitRemoveWorktree]: (input) =>
-        rpcEffect(WS_METHODS.gitRemoveWorktree, git.removeWorktree(input), {
+        observeRpcEffect(WS_METHODS.gitRemoveWorktree, git.removeWorktree(input), {
           "rpc.aggregate": "git",
         }),
       [WS_METHODS.gitCreateBranch]: (input) =>
-        rpcEffect(WS_METHODS.gitCreateBranch, git.createBranch(input), {
+        observeRpcEffect(WS_METHODS.gitCreateBranch, git.createBranch(input), {
           "rpc.aggregate": "git",
         }),
       [WS_METHODS.gitCheckout]: (input) =>
-        rpcEffect(WS_METHODS.gitCheckout, Effect.scoped(git.checkoutBranch(input)), {
+        observeRpcEffect(WS_METHODS.gitCheckout, Effect.scoped(git.checkoutBranch(input)), {
           "rpc.aggregate": "git",
         }),
       [WS_METHODS.gitInit]: (input) =>
-        rpcEffect(WS_METHODS.gitInit, git.initRepo(input), { "rpc.aggregate": "git" }),
+        observeRpcEffect(WS_METHODS.gitInit, git.initRepo(input), { "rpc.aggregate": "git" }),
       [WS_METHODS.terminalOpen]: (input) =>
-        rpcEffect(WS_METHODS.terminalOpen, terminalManager.open(input), {
+        observeRpcEffect(WS_METHODS.terminalOpen, terminalManager.open(input), {
           "rpc.aggregate": "terminal",
         }),
       [WS_METHODS.terminalWrite]: (input) =>
-        rpcEffect(WS_METHODS.terminalWrite, terminalManager.write(input), {
+        observeRpcEffect(WS_METHODS.terminalWrite, terminalManager.write(input), {
           "rpc.aggregate": "terminal",
         }),
       [WS_METHODS.terminalResize]: (input) =>
-        rpcEffect(WS_METHODS.terminalResize, terminalManager.resize(input), {
+        observeRpcEffect(WS_METHODS.terminalResize, terminalManager.resize(input), {
           "rpc.aggregate": "terminal",
         }),
       [WS_METHODS.terminalClear]: (input) =>
-        rpcEffect(WS_METHODS.terminalClear, terminalManager.clear(input), {
+        observeRpcEffect(WS_METHODS.terminalClear, terminalManager.clear(input), {
           "rpc.aggregate": "terminal",
         }),
       [WS_METHODS.terminalRestart]: (input) =>
-        rpcEffect(WS_METHODS.terminalRestart, terminalManager.restart(input), {
+        observeRpcEffect(WS_METHODS.terminalRestart, terminalManager.restart(input), {
           "rpc.aggregate": "terminal",
         }),
       [WS_METHODS.terminalClose]: (input) =>
-        rpcEffect(WS_METHODS.terminalClose, terminalManager.close(input), {
+        observeRpcEffect(WS_METHODS.terminalClose, terminalManager.close(input), {
           "rpc.aggregate": "terminal",
         }),
       [WS_METHODS.subscribeTerminalEvents]: (_input) =>
-        rpcStream(
+        observeRpcStream(
           WS_METHODS.subscribeTerminalEvents,
           Stream.callback<TerminalEvent>((queue) =>
             Effect.acquireRelease(
@@ -386,7 +372,7 @@
           { "rpc.aggregate": "terminal" },
         ),
       [WS_METHODS.subscribeServerConfig]: (_input) =>
-        rpcStreamEffect(
+        observeRpcStreamEffect(
           WS_METHODS.subscribeServerConfig,
           Effect.gen(function* () {
             const keybindingsUpdates = keybindings.streamChanges.pipe(
@@ -425,7 +411,7 @@
           { "rpc.aggregate": "server" },
         ),
       [WS_METHODS.subscribeServerLifecycle]: (_input) =>
-        rpcStreamEffect(
+        observeRpcStreamEffect(
           WS_METHODS.subscribeServerLifecycle,
           Effect.gen(function* () {
             const snapshot = yield* lifecycleEvents.snapshot;

You can send follow-ups to this agent here.

juliusmarminge and others added 3 commits April 2, 2026 16:35
Co-authored-by: codex <codex@users.noreply.github.com>
- expose logs folder and OTLP tracing state in server config
- add a Diagnostics section with an open-logs-folder action
- update tests and observability docs for the new config shape
- Call `observeRpc*` directly for each WS method
- Remove the local `rpcEffect`/`rpcStream` aliases
@juliusmarminge juliusmarminge force-pushed the t3code/effect-tracing-observability branch from 88db488 to 3e9d5fc Compare April 3, 2026 00:30
Copy link
Copy Markdown
Contributor

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Autofix Details

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Redundant double call to normalizeModelMetricLabel
    • Stored the result of normalizeModelMetricLabel(input.model) in a local modelFamily variable and reused it for both the truthiness check and the attribute value.

Create PR

Or push these changes by commenting:

@cursor push 472c5ab580
Preview (472c5ab580)
diff --git a/apps/server/src/observability/Metrics.ts b/apps/server/src/observability/Metrics.ts
--- a/apps/server/src/observability/Metrics.ts
+++ b/apps/server/src/observability/Metrics.ts
@@ -155,11 +155,11 @@
   readonly provider: string;
   readonly model: string | null | undefined;
   readonly extra?: Readonly<Record<string, unknown>>;
-}) =>
-  compactMetricAttributes({
+}) => {
+  const modelFamily = normalizeModelMetricLabel(input.model);
+  return compactMetricAttributes({
     provider: input.provider,
-    ...(normalizeModelMetricLabel(input.model)
-      ? { modelFamily: normalizeModelMetricLabel(input.model) }
-      : {}),
+    ...(modelFamily ? { modelFamily } : {}),
     ...input.extra,
   });
+};

You can send follow-ups to this agent here.

- Record success, failure, and interrupt outcomes from command dispatch
- Normalize invalid dates and group GPT models under a shared label
- Add coverage for failed dispatch metric accounting
- name previously untraced effects
- annotate editor launch spans with context
Copy link
Copy Markdown
Contributor

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Autofix Details

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Metric records input provider instead of resolved provider
    • Changed sendTurn to use a mutable metricProvider variable set after routing, with a lazy attributes callback in withMetrics, matching the pattern used by all other provider operations.

Create PR

Or push these changes by commenting:

@cursor push 0ac1d4d33b
Preview (0ac1d4d33b)
diff --git a/apps/server/src/observability/Metrics.ts b/apps/server/src/observability/Metrics.ts
--- a/apps/server/src/observability/Metrics.ts
+++ b/apps/server/src/observability/Metrics.ts
@@ -91,7 +91,9 @@
 export interface WithMetricsOptions {
   readonly counter?: Metric.Metric<number, unknown>;
   readonly timer?: Metric.Metric<Duration.Duration, unknown>;
-  readonly attributes?: Readonly<Record<string, unknown>>;
+  readonly attributes?:
+    | Readonly<Record<string, unknown>>
+    | (() => Readonly<Record<string, unknown>>);
   readonly outcomeAttributes?: (
     outcome: ReturnType<typeof outcomeFromExit>,
   ) => Readonly<Record<string, unknown>>;
@@ -105,7 +107,8 @@
     const startedAt = Date.now();
     const exit = yield* Effect.exit(effect);
     const duration = Duration.millis(Math.max(0, Date.now() - startedAt));
-    const baseAttributes = options.attributes ?? {};
+    const baseAttributes =
+      typeof options.attributes === "function" ? options.attributes() : (options.attributes ?? {});
 
     if (options.timer) {
       yield* Metric.update(

diff --git a/apps/server/src/provider/Layers/ProviderService.ts b/apps/server/src/provider/Layers/ProviderService.ts
--- a/apps/server/src/provider/Layers/ProviderService.ts
+++ b/apps/server/src/provider/Layers/ProviderService.ts
@@ -407,12 +407,16 @@
       "provider.interaction_mode": input.interactionMode,
       "provider.attachment_count": input.attachments.length,
     });
+    let metricProvider = "unknown";
+    let metricModel = input.modelSelection?.model;
     return yield* Effect.gen(function* () {
       const routed = yield* resolveRoutableSession({
         threadId: input.threadId,
         operation: "ProviderService.sendTurn",
         allowRecovery: true,
       });
+      metricProvider = routed.adapter.provider;
+      metricModel = input.modelSelection?.model;
       yield* Effect.annotateCurrentSpan({
         "provider.kind": routed.adapter.provider,
         ...(input.modelSelection?.model ? { "provider.model": input.modelSelection.model } : {}),
@@ -442,13 +446,14 @@
       withMetrics({
         counter: providerTurnsTotal,
         timer: providerTurnDuration,
-        attributes: providerTurnMetricAttributes({
-          provider: input.modelSelection?.provider ?? "unknown",
-          model: input.modelSelection?.model,
-          extra: {
-            operation: "send",
-          },
-        }),
+        attributes: () =>
+          providerTurnMetricAttributes({
+            provider: metricProvider,
+            model: metricModel,
+            extra: {
+              operation: "send",
+            },
+          }),
       }),
     );
   });

You can send follow-ups to this agent here.

- Allow withMetrics attributes to be computed lazily
- Record sendTurn metrics with the resolved provider
allowRecovery: true,
});
metricProvider = routed.adapter.provider;
metricModel = input.modelSelection?.model;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redundant reassignment of metricModel variable

Low Severity

metricModel is assigned input.modelSelection?.model at line 411, then reassigned the exact same expression at line 419 inside the inner Effect.gen. Since input is never mutated between these two assignments, the second assignment is redundant. This may indicate the developer intended to capture a different value at that point (e.g., the resolved model from the routed session), which would be the actual bug, or it's simply dead code.

Additional Locations (1)
Fix in Cursor Fix in Web

timer: dbQueryDuration,
attributes: databaseMetricAttributes,
}),
);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DB query metrics add overhead to every statement

Medium Severity

The withMetrics wrapper is applied to every individual SQLite statement execution (runStatement and runValues). Each invocation creates an Effect.gen generator, calls Effect.exit (which boxes the result), records Date.now() twice, constructs a Duration, and updates both a counter and a timer metric. For a query-heavy application like an orchestration engine with frequent reads and writes, this per-query overhead on the hot path can meaningfully degrade throughput.

Additional Locations (1)
Fix in Cursor Fix in Web

@juliusmarminge juliusmarminge enabled auto-merge (squash) April 3, 2026 02:28
@juliusmarminge juliusmarminge merged commit 752f96e into main Apr 3, 2026
12 checks passed
@juliusmarminge juliusmarminge deleted the t3code/effect-tracing-observability branch April 3, 2026 02:30
Copy link
Copy Markdown
Contributor

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

There are 4 total unresolved issues (including 2 from previous reviews).

Fix All in Cursor

Bugbot Autofix is ON, but it could not run because the branch was deleted or merged before autofix could start.

return;
}

const error = Cause.squash(exit.cause) as OrchestrationDispatchError;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unsafe cast silently mishandles interrupt and defect causes

Medium Severity

Cause.squash(exit.cause) is cast as OrchestrationDispatchError, but for interrupt or defect causes it actually returns an InterruptedException or the raw defect value. The old code used Effect.catch, which only handled typed errors and let interrupts propagate naturally as fiber interruptions. The new code captures interrupts via Effect.exit and then feeds the mistyped squashed value to Deferred.fail, resolving the deferred with a value that doesn't conform to OrchestrationDispatchError. This also causes reconcileReadModelAfterDispatchFailure to run unnecessarily for interrupts since the Schema.is(OrchestrationCommandPreviouslyRejectedError) guard returns false.

Fix in Cursor Fix in Web

operation: "interrupt",
}),
}),
);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Metrics use outcomeAttributes where attributes fits better

Low Severity

Several provider operations (interruptTurn, respondToRequest, stopSession, rollbackConversation) pass provider/operation labels via outcomeAttributes instead of attributes, and the closure ignores the outcome parameter it receives. The outcomeAttributes option is designed for attributes that vary based on the outcome (success/failure/interrupt), but here the attributes are static. Using attributes with a lazy function would express the intent more clearly and be consistent with sendTurn.

Additional Locations (2)
Fix in Cursor Fix in Web

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size:XXL 1,000+ changed lines (additions + deletions). vouch:trusted PR author is trusted by repo permissions or the VOUCHED list.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant