Summary
Our current tracing experience has several issues that make it unsuitable for debugging live systems. As more customers request end-to-end tracing, we need to revisit how the runtime publishes OpenTelemetry spans.
Current Problems
1. Spans per journal entry are confusing
- For ctx.run, the span duration includes internal retry timer backoffs, and the span is only published when completed
- If an invocation is stuck in a retry loop, the span looks fine — you can't tell it's retrying
- Runtime-generated spans for ctx.run are not correlatable with user-created spans in the SDK
2. Parent span published too late
- Until the entire invocation completes, OTel traces are not correctly shown/correlated because the parent span is published at the end
- This makes tracing useless for debugging things that are actively on fire
- The problem compounds with e2e tracing — more info is uncorrelated
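The late-parent problem can be illustrated with a small, self-contained model. This is only a sketch: the `Span` type, the span IDs, and the `orphans` helper are hypothetical and are not part of the runtime or of any OpenTelemetry library. It shows that a backend which has received the child spans, but not yet the parent, cannot attach them to anything.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    span_id: str
    parent_id: Optional[str]
    name: str


def orphans(received: list[Span]) -> list[str]:
    """Names of received spans whose parent has not been received yet."""
    known = {s.span_id for s in received}
    return [
        s.name
        for s in received
        if s.parent_id is not None and s.parent_id not in known
    ]


# Hypothetical publication order under the current model: journal-entry spans
# arrive first, the invocation (parent) span only when the invocation completes.
run_span = Span("s1", "root", "ctx.run")
call_span = Span("s2", "root", "call")
invocation_span = Span("root", None, "invocation")

while_running = [run_span, call_span]
after_completion = [run_span, call_span, invocation_span]

print(orphans(while_running))     # ['ctx.run', 'call']
print(orphans(after_completion))  # []
```

While the invocation is running, every child span is an orphan from the backend's point of view; the trace only becomes coherent once the parent arrives.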
Root Cause
The current tracing was designed before the UI existed and was optimized for post-hoc viewing (a "logical view" of the invocation). OpenTelemetry, however, cannot provide that logical view while the invocation is still running: a span is exported only once it has ended.
Proposal
Shift philosophy to representing what's physically happening, while it's happening.
- Focus on invocation attempts, not the whole invocation
- Publish spans as soon as an invocation attempt ends — this helps correlation and allows introspection of in-progress invocations
- Correlate all invocation attempts under a parent span (e.g. a "started" span)
- Use events instead of spans for journal entries inside the invocation attempt span (e.g. ctx.run becomes an event, not a child span)
- Still generate service-to-service spans as children (or linked — TBD) to enable navigating between service calls
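The points above can be sketched as a small in-memory model. Everything here is an assumption made for illustration: `AttemptTracer`, `InMemoryExporter`, the `started:<id>` parent identifier, and the invocation ID `inv_123` are hypothetical names, and the code does not use the OpenTelemetry API. How and when the parent span itself is published is deliberately left out.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Event:
    name: str
    at: float


@dataclass
class AttemptSpan:
    invocation_id: str
    attempt: int
    parent_id: str
    start: float
    end: Optional[float] = None
    # Journal entries are recorded as events on the attempt, not as child spans.
    events: list[Event] = field(default_factory=list)


class InMemoryExporter:
    """Stand-in for a span exporter; it only collects published spans."""

    def __init__(self) -> None:
        self.published: list[AttemptSpan] = []

    def publish(self, span: AttemptSpan) -> None:
        self.published.append(span)


class AttemptTracer:
    def __init__(self, invocation_id: str, exporter: InMemoryExporter) -> None:
        self.invocation_id = invocation_id
        self.exporter = exporter
        # All attempts share one parent so they are correlated in the trace.
        self.parent_id = f"started:{invocation_id}"
        self.attempts = 0
        self.current: Optional[AttemptSpan] = None

    def start_attempt(self, now: float) -> None:
        self.attempts += 1
        self.current = AttemptSpan(
            self.invocation_id, self.attempts, self.parent_id, start=now
        )

    def record_entry(self, name: str, now: float) -> None:
        assert self.current is not None, "no attempt in progress"
        self.current.events.append(Event(name, now))

    def end_attempt(self, now: float) -> None:
        assert self.current is not None, "no attempt in progress"
        self.current.end = now
        # Publish as soon as the attempt ends, not when the invocation ends.
        self.exporter.publish(self.current)
        self.current = None


exporter = InMemoryExporter()
tracer = AttemptTracer("inv_123", exporter)

tracer.start_attempt(now=0.0)
tracer.record_entry("ctx.run", now=0.2)
tracer.end_attempt(now=1.0)    # attempt 1 failed; its span is visible right away

tracer.start_attempt(now=4.0)  # attempt 2 is still in progress
tracer.record_entry("ctx.run", now=4.2)

print(len(exporter.published))  # 1
```

Even though the invocation has not completed, the first attempt is already available to the tracing backend, which is what makes a retry loop visible.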
Expected Outcome
- Users get a physical view of what happened during the invocation via OTel traces
- The logical view remains the Restate UI
- Users can search spans by invocation ID (navigate between Restate UI ↔ Jaeger and vice versa)
Example
An invocation retried 3 times with a 3-second retry interval, doing a ctx.run then a call, would show:
- A parent span covering the full invocation
- 3 child attempt spans (one per attempt), each published as soon as the attempt ends
- Events within each attempt span for journal entries
- A service-to-service child/linked span for the outgoing call
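As a rough worked version of this example, the sketch below lays the spans out on a timeline. It makes several assumptions that are not in the proposal: three attempts in total (matching the three attempt spans listed above), each attempt lasting 1 second, ctx.run failing in the first two attempts, and the outgoing call being modeled as a child rather than a linked span. The `Span` type and `build_trace` function are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Span:
    name: str
    start: float
    end: float
    events: list[str] = field(default_factory=list)
    children: list["Span"] = field(default_factory=list)


def build_trace(attempts: int, retry_interval: float, attempt_duration: float) -> Span:
    """Build the parent span with one child span per invocation attempt."""
    parent = Span("invocation", 0.0, 0.0)
    t = 0.0
    for n in range(1, attempts + 1):
        attempt = Span(f"attempt {n}", t, t + attempt_duration)
        attempt.events.append("ctx.run")
        if n == attempts:
            # Assumption: only the final attempt gets past ctx.run and
            # reaches the outgoing call.
            attempt.events.append("call")
            attempt.children.append(
                Span("call", t + attempt_duration / 2, attempt.end)
            )
        parent.children.append(attempt)
        # The next attempt starts after the retry interval has elapsed.
        t = attempt.end + retry_interval
    parent.end = parent.children[-1].end
    return parent


trace = build_trace(attempts=3, retry_interval=3.0, attempt_duration=1.0)

print(f"{trace.name}: {trace.start}s to {trace.end}s")
for attempt in trace.children:
    print(f"  {attempt.name}: {attempt.start}s to {attempt.end}s, events={attempt.events}")
    for child in attempt.children:
        print(f"    {child.name}: {child.start}s to {child.end}s")
```

Under these assumptions the attempts cover 0s to 1s, 4s to 5s, and 8s to 9s, so the gaps between attempt spans make the retry backoff directly visible instead of hiding it inside one long ctx.run span.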