Add server observability tracing and metrics#1697
Conversation
- Add local trace file output and OTLP export hooks - Instrument RPC, git, orchestration, provider, and DB paths - Document observability configuration
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Autofix Details
Bugbot Autofix prepared fixes for both issues found in the latest run.
- ✅ Fixed: Provider metrics always report "unknown" provider label
- Hoisted resolveRoutableSession calls above the inner Effect.gen in all five affected operations so routed.adapter.provider is available for providerMetricAttributes instead of hardcoding "unknown".
- ✅ Fixed: Exported
withMetricAttributesis never used anywhere- Removed the unused withMetricAttributes re-export from Metrics.ts as it was dead code with no consumers in the codebase.
Or push these changes by commenting:
@cursor push 397de494ef
Preview (397de494ef)
diff --git a/apps/server/src/observability/Metrics.ts b/apps/server/src/observability/Metrics.ts
--- a/apps/server/src/observability/Metrics.ts
+++ b/apps/server/src/observability/Metrics.ts
@@ -74,8 +74,6 @@
attributes: Readonly<Record<string, unknown>>,
): ReadonlyArray<[string, string]> => Object.entries(compactMetricAttributes(attributes));
-export const withMetricAttributes = Metric.withAttributes;
-
export const increment = (
metric: Metric.Metric<number, unknown>,
attributes: Readonly<Record<string, unknown>>,
diff --git a/apps/server/src/provider/Layers/ProviderService.ts b/apps/server/src/provider/Layers/ProviderService.ts
--- a/apps/server/src/provider/Layers/ProviderService.ts
+++ b/apps/server/src/provider/Layers/ProviderService.ts
@@ -460,12 +460,12 @@
schema: ProviderInterruptTurnInput,
payload: rawInput,
});
+ const routed = yield* resolveRoutableSession({
+ threadId: input.threadId,
+ operation: "ProviderService.interruptTurn",
+ allowRecovery: true,
+ });
return yield* Effect.gen(function* () {
- const routed = yield* resolveRoutableSession({
- threadId: input.threadId,
- operation: "ProviderService.interruptTurn",
- allowRecovery: true,
- });
yield* Effect.annotateCurrentSpan({
"provider.operation": "interrupt-turn",
"provider.kind": routed.adapter.provider,
@@ -479,7 +479,7 @@
}).pipe(
withMetrics({
counter: providerTurnsTotal,
- attributes: providerMetricAttributes("unknown", {
+ attributes: providerMetricAttributes(routed.adapter.provider, {
operation: "interrupt",
}),
}),
@@ -494,12 +494,12 @@
schema: ProviderRespondToRequestInput,
payload: rawInput,
});
+ const routed = yield* resolveRoutableSession({
+ threadId: input.threadId,
+ operation: "ProviderService.respondToRequest",
+ allowRecovery: true,
+ });
return yield* Effect.gen(function* () {
- const routed = yield* resolveRoutableSession({
- threadId: input.threadId,
- operation: "ProviderService.respondToRequest",
- allowRecovery: true,
- });
yield* Effect.annotateCurrentSpan({
"provider.operation": "respond-to-request",
"provider.kind": routed.adapter.provider,
@@ -514,7 +514,7 @@
}).pipe(
withMetrics({
counter: providerTurnsTotal,
- attributes: providerMetricAttributes("unknown", {
+ attributes: providerMetricAttributes(routed.adapter.provider, {
operation: "approval-response",
}),
}),
@@ -530,12 +530,12 @@
schema: ProviderRespondToUserInputInput,
payload: rawInput,
});
+ const routed = yield* resolveRoutableSession({
+ threadId: input.threadId,
+ operation: "ProviderService.respondToUserInput",
+ allowRecovery: true,
+ });
return yield* Effect.gen(function* () {
- const routed = yield* resolveRoutableSession({
- threadId: input.threadId,
- operation: "ProviderService.respondToUserInput",
- allowRecovery: true,
- });
yield* Effect.annotateCurrentSpan({
"provider.operation": "respond-to-user-input",
"provider.kind": routed.adapter.provider,
@@ -546,7 +546,7 @@
}).pipe(
withMetrics({
counter: providerTurnsTotal,
- attributes: providerMetricAttributes("unknown", {
+ attributes: providerMetricAttributes(routed.adapter.provider, {
operation: "user-input-response",
}),
}),
@@ -560,12 +560,12 @@
schema: ProviderStopSessionInput,
payload: rawInput,
});
+ const routed = yield* resolveRoutableSession({
+ threadId: input.threadId,
+ operation: "ProviderService.stopSession",
+ allowRecovery: false,
+ });
return yield* Effect.gen(function* () {
- const routed = yield* resolveRoutableSession({
- threadId: input.threadId,
- operation: "ProviderService.stopSession",
- allowRecovery: false,
- });
yield* Effect.annotateCurrentSpan({
"provider.operation": "stop-session",
"provider.kind": routed.adapter.provider,
@@ -581,7 +581,7 @@
}).pipe(
withMetrics({
counter: providerSessionsTotal,
- attributes: providerMetricAttributes("unknown", {
+ attributes: providerMetricAttributes(routed.adapter.provider, {
operation: "stop",
}),
}),
@@ -651,12 +651,12 @@
if (input.numTurns === 0) {
return;
}
+ const routed = yield* resolveRoutableSession({
+ threadId: input.threadId,
+ operation: "ProviderService.rollbackConversation",
+ allowRecovery: true,
+ });
return yield* Effect.gen(function* () {
- const routed = yield* resolveRoutableSession({
- threadId: input.threadId,
- operation: "ProviderService.rollbackConversation",
- allowRecovery: true,
- });
yield* Effect.annotateCurrentSpan({
"provider.operation": "rollback-conversation",
"provider.kind": routed.adapter.provider,
@@ -671,7 +671,7 @@
}).pipe(
withMetrics({
counter: providerTurnsTotal,
- attributes: providerMetricAttributes("unknown", {
+ attributes: providerMetricAttributes(routed.adapter.provider, {
operation: "rollback",
}),
}),You can send follow-ups to this agent here.
ApprovabilityVerdict: Needs human review This PR introduces a complete server observability system including local trace files, OTLP export, and metrics throughout the codebase. The scope includes new infrastructure, runtime behavior changes (file creation, measurement overhead), and an unresolved review comment about per-query metrics performance overhead that warrants discussion. You can customize Macroscope's approvability policy. Learn more. |
Resolve startup and git observability conflicts while preserving mainline behavior. Co-authored-by: codex <codex@users.noreply.github.com>
- Record orchestration command ack latency and route provider metrics by resolved provider - Capture RPC stream failures and keep trace sink writing past bad records - Add circular attribute normalization coverage
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
There are 3 total unresolved issues (including 1 from previous review).
Autofix Details
Bugbot Autofix prepared fixes for both issues found in the latest run.
- ✅ Fixed: Orchestration command metrics outcome always reports success
- Moved withMetrics before Effect.catch so the metrics wrapper observes the actual success/failure exit of the command processing effect, rather than always seeing the void success returned by the catch handler.
- ✅ Fixed: Redundant wrapper functions add unnecessary indirection
- Removed the trivial rpcEffect, rpcStream, and rpcStreamEffect pass-through wrappers and replaced all call sites with direct calls to observeRpcEffect, observeRpcStream, and observeRpcStreamEffect.
Or push these changes by commenting:
@cursor push 6cb70ca80d
Preview (6cb70ca80d)
diff --git a/apps/server/src/orchestration/Layers/OrchestrationEngine.ts b/apps/server/src/orchestration/Layers/OrchestrationEngine.ts
--- a/apps/server/src/orchestration/Layers/OrchestrationEngine.ts
+++ b/apps/server/src/orchestration/Layers/OrchestrationEngine.ts
@@ -201,6 +201,15 @@
}
yield* Deferred.succeed(envelope.result, { sequence: committedCommand.lastSequence });
}).pipe(
+ Effect.withSpan(`orchestration.command.${envelope.command.type}`),
+ withMetrics({
+ counter: orchestrationCommandsTotal,
+ timer: orchestrationCommandDuration,
+ attributes: {
+ commandType: envelope.command.type,
+ aggregateKind: aggregateRef.aggregateKind,
+ },
+ }),
Effect.catch((error) =>
Effect.gen(function* () {
yield* reconcileReadModelAfterDispatchFailure.pipe(
@@ -232,16 +241,7 @@
yield* Deferred.fail(envelope.result, error);
}),
),
- Effect.withSpan(`orchestration.command.${envelope.command.type}`),
Effect.asVoid,
- withMetrics({
- counter: orchestrationCommandsTotal,
- timer: orchestrationCommandDuration,
- attributes: {
- commandType: envelope.command.type,
- aggregateKind: aggregateRef.aggregateKind,
- },
- }),
);
};
diff --git a/apps/server/src/ws.ts b/apps/server/src/ws.ts
--- a/apps/server/src/ws.ts
+++ b/apps/server/src/ws.ts
@@ -76,31 +76,9 @@
};
});
- const rpcEffect = <A, E, R>(
- method: string,
- effect: Effect.Effect<A, E, R>,
- traceAttributes?: Readonly<Record<string, unknown>>,
- ) => observeRpcEffect(method, effect, traceAttributes);
-
- const rpcStream = <A, E, R>(
- method: string,
- stream: Stream.Stream<A, E, R>,
- traceAttributes?: Readonly<Record<string, unknown>>,
- ) => observeRpcStream(method, stream, traceAttributes);
-
- const rpcStreamEffect = <A, StreamError, StreamContext, EffectError, EffectContext>(
- method: string,
- effect: Effect.Effect<
- Stream.Stream<A, StreamError, StreamContext>,
- EffectError,
- EffectContext
- >,
- traceAttributes?: Readonly<Record<string, unknown>>,
- ) => observeRpcStreamEffect(method, effect, traceAttributes);
-
return WsRpcGroup.of({
[ORCHESTRATION_WS_METHODS.getSnapshot]: (_input) =>
- rpcEffect(
+ observeRpcEffect(
ORCHESTRATION_WS_METHODS.getSnapshot,
projectionSnapshotQuery.getSnapshot().pipe(
Effect.mapError(
@@ -114,7 +92,7 @@
{ "rpc.aggregate": "orchestration" },
),
[ORCHESTRATION_WS_METHODS.dispatchCommand]: (command) =>
- rpcEffect(
+ observeRpcEffect(
ORCHESTRATION_WS_METHODS.dispatchCommand,
Effect.gen(function* () {
const normalizedCommand = yield* normalizeDispatchCommand(command);
@@ -132,7 +110,7 @@
{ "rpc.aggregate": "orchestration" },
),
[ORCHESTRATION_WS_METHODS.getTurnDiff]: (input) =>
- rpcEffect(
+ observeRpcEffect(
ORCHESTRATION_WS_METHODS.getTurnDiff,
checkpointDiffQuery.getTurnDiff(input).pipe(
Effect.mapError(
@@ -146,7 +124,7 @@
{ "rpc.aggregate": "orchestration" },
),
[ORCHESTRATION_WS_METHODS.getFullThreadDiff]: (input) =>
- rpcEffect(
+ observeRpcEffect(
ORCHESTRATION_WS_METHODS.getFullThreadDiff,
checkpointDiffQuery.getFullThreadDiff(input).pipe(
Effect.mapError(
@@ -160,7 +138,7 @@
{ "rpc.aggregate": "orchestration" },
),
[ORCHESTRATION_WS_METHODS.replayEvents]: (input) =>
- rpcEffect(
+ observeRpcEffect(
ORCHESTRATION_WS_METHODS.replayEvents,
Stream.runCollect(
orchestrationEngine.readEvents(
@@ -179,7 +157,7 @@
{ "rpc.aggregate": "orchestration" },
),
[WS_METHODS.subscribeOrchestrationDomainEvents]: (_input) =>
- rpcStreamEffect(
+ observeRpcStreamEffect(
WS_METHODS.subscribeOrchestrationDomainEvents,
Effect.gen(function* () {
const snapshot = yield* orchestrationEngine.getReadModel();
@@ -238,15 +216,17 @@
{ "rpc.aggregate": "orchestration" },
),
[WS_METHODS.serverGetConfig]: (_input) =>
- rpcEffect(WS_METHODS.serverGetConfig, loadServerConfig, { "rpc.aggregate": "server" }),
+ observeRpcEffect(WS_METHODS.serverGetConfig, loadServerConfig, {
+ "rpc.aggregate": "server",
+ }),
[WS_METHODS.serverRefreshProviders]: (_input) =>
- rpcEffect(
+ observeRpcEffect(
WS_METHODS.serverRefreshProviders,
providerRegistry.refresh().pipe(Effect.map((providers) => ({ providers }))),
{ "rpc.aggregate": "server" },
),
[WS_METHODS.serverUpsertKeybinding]: (rule) =>
- rpcEffect(
+ observeRpcEffect(
WS_METHODS.serverUpsertKeybinding,
Effect.gen(function* () {
const keybindingsConfig = yield* keybindings.upsertKeybindingRule(rule);
@@ -255,15 +235,15 @@
{ "rpc.aggregate": "server" },
),
[WS_METHODS.serverGetSettings]: (_input) =>
- rpcEffect(WS_METHODS.serverGetSettings, serverSettings.getSettings, {
+ observeRpcEffect(WS_METHODS.serverGetSettings, serverSettings.getSettings, {
"rpc.aggregate": "server",
}),
[WS_METHODS.serverUpdateSettings]: ({ patch }) =>
- rpcEffect(WS_METHODS.serverUpdateSettings, serverSettings.updateSettings(patch), {
+ observeRpcEffect(WS_METHODS.serverUpdateSettings, serverSettings.updateSettings(patch), {
"rpc.aggregate": "server",
}),
[WS_METHODS.projectsSearchEntries]: (input) =>
- rpcEffect(
+ observeRpcEffect(
WS_METHODS.projectsSearchEntries,
workspaceEntries.search(input).pipe(
Effect.mapError(
@@ -277,7 +257,7 @@
{ "rpc.aggregate": "workspace" },
),
[WS_METHODS.projectsWriteFile]: (input) =>
- rpcEffect(
+ observeRpcEffect(
WS_METHODS.projectsWriteFile,
workspaceFileSystem.writeFile(input).pipe(
Effect.mapError((cause) => {
@@ -293,15 +273,19 @@
{ "rpc.aggregate": "workspace" },
),
[WS_METHODS.shellOpenInEditor]: (input) =>
- rpcEffect(WS_METHODS.shellOpenInEditor, open.openInEditor(input), {
+ observeRpcEffect(WS_METHODS.shellOpenInEditor, open.openInEditor(input), {
"rpc.aggregate": "workspace",
}),
[WS_METHODS.gitStatus]: (input) =>
- rpcEffect(WS_METHODS.gitStatus, gitManager.status(input), { "rpc.aggregate": "git" }),
+ observeRpcEffect(WS_METHODS.gitStatus, gitManager.status(input), {
+ "rpc.aggregate": "git",
+ }),
[WS_METHODS.gitPull]: (input) =>
- rpcEffect(WS_METHODS.gitPull, git.pullCurrentBranch(input.cwd), { "rpc.aggregate": "git" }),
+ observeRpcEffect(WS_METHODS.gitPull, git.pullCurrentBranch(input.cwd), {
+ "rpc.aggregate": "git",
+ }),
[WS_METHODS.gitRunStackedAction]: (input) =>
- rpcStream(
+ observeRpcStream(
WS_METHODS.gitRunStackedAction,
Stream.callback<GitActionProgressEvent, GitManagerServiceError>((queue) =>
gitManager
@@ -321,61 +305,63 @@
{ "rpc.aggregate": "git" },
),
[WS_METHODS.gitResolvePullRequest]: (input) =>
- rpcEffect(WS_METHODS.gitResolvePullRequest, gitManager.resolvePullRequest(input), {
+ observeRpcEffect(WS_METHODS.gitResolvePullRequest, gitManager.resolvePullRequest(input), {
"rpc.aggregate": "git",
}),
[WS_METHODS.gitPreparePullRequestThread]: (input) =>
- rpcEffect(
+ observeRpcEffect(
WS_METHODS.gitPreparePullRequestThread,
gitManager.preparePullRequestThread(input),
{ "rpc.aggregate": "git" },
),
[WS_METHODS.gitListBranches]: (input) =>
- rpcEffect(WS_METHODS.gitListBranches, git.listBranches(input), { "rpc.aggregate": "git" }),
+ observeRpcEffect(WS_METHODS.gitListBranches, git.listBranches(input), {
+ "rpc.aggregate": "git",
+ }),
[WS_METHODS.gitCreateWorktree]: (input) =>
- rpcEffect(WS_METHODS.gitCreateWorktree, git.createWorktree(input), {
+ observeRpcEffect(WS_METHODS.gitCreateWorktree, git.createWorktree(input), {
"rpc.aggregate": "git",
}),
[WS_METHODS.gitRemoveWorktree]: (input) =>
- rpcEffect(WS_METHODS.gitRemoveWorktree, git.removeWorktree(input), {
+ observeRpcEffect(WS_METHODS.gitRemoveWorktree, git.removeWorktree(input), {
"rpc.aggregate": "git",
}),
[WS_METHODS.gitCreateBranch]: (input) =>
- rpcEffect(WS_METHODS.gitCreateBranch, git.createBranch(input), {
+ observeRpcEffect(WS_METHODS.gitCreateBranch, git.createBranch(input), {
"rpc.aggregate": "git",
}),
[WS_METHODS.gitCheckout]: (input) =>
- rpcEffect(WS_METHODS.gitCheckout, Effect.scoped(git.checkoutBranch(input)), {
+ observeRpcEffect(WS_METHODS.gitCheckout, Effect.scoped(git.checkoutBranch(input)), {
"rpc.aggregate": "git",
}),
[WS_METHODS.gitInit]: (input) =>
- rpcEffect(WS_METHODS.gitInit, git.initRepo(input), { "rpc.aggregate": "git" }),
+ observeRpcEffect(WS_METHODS.gitInit, git.initRepo(input), { "rpc.aggregate": "git" }),
[WS_METHODS.terminalOpen]: (input) =>
- rpcEffect(WS_METHODS.terminalOpen, terminalManager.open(input), {
+ observeRpcEffect(WS_METHODS.terminalOpen, terminalManager.open(input), {
"rpc.aggregate": "terminal",
}),
[WS_METHODS.terminalWrite]: (input) =>
- rpcEffect(WS_METHODS.terminalWrite, terminalManager.write(input), {
+ observeRpcEffect(WS_METHODS.terminalWrite, terminalManager.write(input), {
"rpc.aggregate": "terminal",
}),
[WS_METHODS.terminalResize]: (input) =>
- rpcEffect(WS_METHODS.terminalResize, terminalManager.resize(input), {
+ observeRpcEffect(WS_METHODS.terminalResize, terminalManager.resize(input), {
"rpc.aggregate": "terminal",
}),
[WS_METHODS.terminalClear]: (input) =>
- rpcEffect(WS_METHODS.terminalClear, terminalManager.clear(input), {
+ observeRpcEffect(WS_METHODS.terminalClear, terminalManager.clear(input), {
"rpc.aggregate": "terminal",
}),
[WS_METHODS.terminalRestart]: (input) =>
- rpcEffect(WS_METHODS.terminalRestart, terminalManager.restart(input), {
+ observeRpcEffect(WS_METHODS.terminalRestart, terminalManager.restart(input), {
"rpc.aggregate": "terminal",
}),
[WS_METHODS.terminalClose]: (input) =>
- rpcEffect(WS_METHODS.terminalClose, terminalManager.close(input), {
+ observeRpcEffect(WS_METHODS.terminalClose, terminalManager.close(input), {
"rpc.aggregate": "terminal",
}),
[WS_METHODS.subscribeTerminalEvents]: (_input) =>
- rpcStream(
+ observeRpcStream(
WS_METHODS.subscribeTerminalEvents,
Stream.callback<TerminalEvent>((queue) =>
Effect.acquireRelease(
@@ -386,7 +372,7 @@
{ "rpc.aggregate": "terminal" },
),
[WS_METHODS.subscribeServerConfig]: (_input) =>
- rpcStreamEffect(
+ observeRpcStreamEffect(
WS_METHODS.subscribeServerConfig,
Effect.gen(function* () {
const keybindingsUpdates = keybindings.streamChanges.pipe(
@@ -425,7 +411,7 @@
{ "rpc.aggregate": "server" },
),
[WS_METHODS.subscribeServerLifecycle]: (_input) =>
- rpcStreamEffect(
+ observeRpcStreamEffect(
WS_METHODS.subscribeServerLifecycle,
Effect.gen(function* () {
const snapshot = yield* lifecycleEvents.snapshot;You can send follow-ups to this agent here.
Co-authored-by: codex <codex@users.noreply.github.com>
- expose logs folder and OTLP tracing state in server config - add a Diagnostics section with an open-logs-folder action - update tests and observability docs for the new config shape
- Call `observeRpc*` directly for each WS method - Remove the local `rpcEffect`/`rpcStream` aliases
88db488 to
3e9d5fc
Compare
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Autofix Details
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Redundant double call to
normalizeModelMetricLabel- Stored the result of
normalizeModelMetricLabel(input.model)in a localmodelFamilyvariable and reused it for both the truthiness check and the attribute value.
- Stored the result of
Or push these changes by commenting:
@cursor push 472c5ab580
Preview (472c5ab580)
diff --git a/apps/server/src/observability/Metrics.ts b/apps/server/src/observability/Metrics.ts
--- a/apps/server/src/observability/Metrics.ts
+++ b/apps/server/src/observability/Metrics.ts
@@ -155,11 +155,11 @@
readonly provider: string;
readonly model: string | null | undefined;
readonly extra?: Readonly<Record<string, unknown>>;
-}) =>
- compactMetricAttributes({
+}) => {
+ const modelFamily = normalizeModelMetricLabel(input.model);
+ return compactMetricAttributes({
provider: input.provider,
- ...(normalizeModelMetricLabel(input.model)
- ? { modelFamily: normalizeModelMetricLabel(input.model) }
- : {}),
+ ...(modelFamily ? { modelFamily } : {}),
...input.extra,
});
+};You can send follow-ups to this agent here.
- Record success, failure, and interrupt outcomes from command dispatch - Normalize invalid dates and group GPT models under a shared label - Add coverage for failed dispatch metric accounting
- name previously untraced effects - annotate editor launch spans with context
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Autofix Details
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Metric records input provider instead of resolved provider
- Changed sendTurn to use a mutable metricProvider variable set after routing, with a lazy attributes callback in withMetrics, matching the pattern used by all other provider operations.
Or push these changes by commenting:
@cursor push 0ac1d4d33b
Preview (0ac1d4d33b)
diff --git a/apps/server/src/observability/Metrics.ts b/apps/server/src/observability/Metrics.ts
--- a/apps/server/src/observability/Metrics.ts
+++ b/apps/server/src/observability/Metrics.ts
@@ -91,7 +91,9 @@
export interface WithMetricsOptions {
readonly counter?: Metric.Metric<number, unknown>;
readonly timer?: Metric.Metric<Duration.Duration, unknown>;
- readonly attributes?: Readonly<Record<string, unknown>>;
+ readonly attributes?:
+ | Readonly<Record<string, unknown>>
+ | (() => Readonly<Record<string, unknown>>);
readonly outcomeAttributes?: (
outcome: ReturnType<typeof outcomeFromExit>,
) => Readonly<Record<string, unknown>>;
@@ -105,7 +107,8 @@
const startedAt = Date.now();
const exit = yield* Effect.exit(effect);
const duration = Duration.millis(Math.max(0, Date.now() - startedAt));
- const baseAttributes = options.attributes ?? {};
+ const baseAttributes =
+ typeof options.attributes === "function" ? options.attributes() : (options.attributes ?? {});
if (options.timer) {
yield* Metric.update(
diff --git a/apps/server/src/provider/Layers/ProviderService.ts b/apps/server/src/provider/Layers/ProviderService.ts
--- a/apps/server/src/provider/Layers/ProviderService.ts
+++ b/apps/server/src/provider/Layers/ProviderService.ts
@@ -407,12 +407,16 @@
"provider.interaction_mode": input.interactionMode,
"provider.attachment_count": input.attachments.length,
});
+ let metricProvider = "unknown";
+ let metricModel = input.modelSelection?.model;
return yield* Effect.gen(function* () {
const routed = yield* resolveRoutableSession({
threadId: input.threadId,
operation: "ProviderService.sendTurn",
allowRecovery: true,
});
+ metricProvider = routed.adapter.provider;
+ metricModel = input.modelSelection?.model;
yield* Effect.annotateCurrentSpan({
"provider.kind": routed.adapter.provider,
...(input.modelSelection?.model ? { "provider.model": input.modelSelection.model } : {}),
@@ -442,13 +446,14 @@
withMetrics({
counter: providerTurnsTotal,
timer: providerTurnDuration,
- attributes: providerTurnMetricAttributes({
- provider: input.modelSelection?.provider ?? "unknown",
- model: input.modelSelection?.model,
- extra: {
- operation: "send",
- },
- }),
+ attributes: () =>
+ providerTurnMetricAttributes({
+ provider: metricProvider,
+ model: metricModel,
+ extra: {
+ operation: "send",
+ },
+ }),
}),
);
});You can send follow-ups to this agent here.
- Allow withMetrics attributes to be computed lazily - Record sendTurn metrics with the resolved provider
| allowRecovery: true, | ||
| }); | ||
| metricProvider = routed.adapter.provider; | ||
| metricModel = input.modelSelection?.model; |
There was a problem hiding this comment.
Redundant reassignment of metricModel variable
Low Severity
metricModel is assigned input.modelSelection?.model at line 411, then reassigned the exact same expression at line 419 inside the inner Effect.gen. Since input is never mutated between these two assignments, the second assignment is redundant. This may indicate the developer intended to capture a different value at that point (e.g., the resolved model from the routed session), which would be the actual bug, or it's simply dead code.
Additional Locations (1)
| timer: dbQueryDuration, | ||
| attributes: databaseMetricAttributes, | ||
| }), | ||
| ); |
There was a problem hiding this comment.
DB query metrics add overhead to every statement
Medium Severity
The withMetrics wrapper is applied to every individual SQLite statement execution (runStatement and runValues). Each invocation creates an Effect.gen generator, calls Effect.exit (which boxes the result), records Date.now() twice, constructs a Duration, and updates both a counter and a timer metric. For a query-heavy application like an orchestration engine with frequent reads and writes, this per-query overhead on the hot path can meaningfully degrade throughput.
Additional Locations (1)
| return; | ||
| } | ||
|
|
||
| const error = Cause.squash(exit.cause) as OrchestrationDispatchError; |
There was a problem hiding this comment.
Unsafe cast silently mishandles interrupt and defect causes
Medium Severity
Cause.squash(exit.cause) is cast as OrchestrationDispatchError, but for interrupt or defect causes it actually returns an InterruptedException or the raw defect value. The old code used Effect.catch, which only handled typed errors and let interrupts propagate naturally as fiber interruptions. The new code captures interrupts via Effect.exit and then feeds the mistyped squashed value to Deferred.fail, resolving the deferred with a value that doesn't conform to OrchestrationDispatchError. This also causes reconcileReadModelAfterDispatchFailure to run unnecessarily for interrupts since the Schema.is(OrchestrationCommandPreviouslyRejectedError) guard returns false.
| operation: "interrupt", | ||
| }), | ||
| }), | ||
| ); |
There was a problem hiding this comment.
Metrics use outcomeAttributes where attributes fits better
Low Severity
Several provider operations (interruptTurn, respondToRequest, stopSession, rollbackConversation) pass provider/operation labels via outcomeAttributes instead of attributes, and the closure ignores the outcome parameter it receives. The outcomeAttributes option is designed for attributes that vary based on the outcome (success/failure/interrupt), but here the attributes are static. Using attributes with a lazy function would express the intent more clearly and be consistent with sendTurn.



Summary
Testing
Note
Medium Risk
Introduces new tracing/metrics plumbing and wraps several core runtime paths (RPC, orchestration dispatch, provider operations, git, terminal) which could affect performance and failure behavior if instrumentation misbehaves or config is incorrect, though changes are mostly additive.
Overview
Adds server-side observability: local NDJSON trace recording (with batching/rotation) plus optional OTLP trace/metrics export driven by new
ServerConfigfields and CLI env vars (T3CODE_TRACE_*,T3CODE_OTLP_*), and wires this via a newObservabilityLivelayer (replacing file-based logging inServerLoggerLive).Instruments key execution paths with spans and metrics: WebSocket RPC handlers (via
observeRpc*wrappers),GitCore.execute(spans + command metrics + hook events), orchestration command dispatch (duration/total + ack latency on first committed event), provider session/turn APIs and runtime event ingestion, and terminal lifecycle/restart counters; startup phases and editor launches now annotate spans.Exposes observability info to the web client in server config snapshots and adds a Settings "Diagnostics" section to display the logs directory and open it in the preferred editor, with updated fixtures/tests and a new docs link in
README.md.Written by Cursor Bugbot for commit c6b7082. This will update automatically on new commits. Configure here.
Note
Add distributed tracing, metrics, and local NDJSON trace file export to the server
ObservabilityLivelayer (replacingServerLoggerLive) that writes spans to a rotating local NDJSON trace file and optionally exports traces and metrics via OTLP when configured.t3_) covering RPC requests, orchestration commands, provider sessions/turns, git commands, and terminal sessions, with awithMetricscombinator to attach them to effects.observabilityfields toServerConfigandServerConfigShapefor trace file path, OTLP URLs, batch settings, and service name, read fromT3CODE_*environment variables.GeneralSettingsPanelshowing the logs directory path and a button to open it, powered by a newuseServerObservabilityhook.ServerConfigschema now requires anobservabilityobject; any existing serialized configs without it will fail schema validation.Macroscope summarized c6b7082.