Revamp runtime OpenTelemetry tracing: attempt-based spans with early publishing #4530

@slinkydeveloper

Summary

Our current tracing experience has several issues that make it unsuitable for debugging live systems. As more customers request end-to-end tracing, we need to revisit how the runtime publishes OpenTelemetry spans.

Current Problems

1. Spans per journal entry are confusing

  • For ctx.run, the span duration includes internal retry timer backoffs and is only published when completed
  • If an invocation is stuck in a retry loop, the span looks fine — you can't tell it's retrying
  • Runtime-generated spans for ctx.run are not correlatable with user-created spans in the SDK

2. Parent span published too late

  • Until the entire invocation completes, OTel traces are not correctly shown/correlated because the parent span is published at the end
  • This makes tracing useless for debugging things that are actively on fire
  • The problem compounds with e2e tracing — more info is uncorrelated

Root Cause

The current tracing was designed pre-UI and optimized for post-hoc viewing (a "logical view" of the invocation). But the OpenTelemetry spec doesn't support giving a logical view while the invocation is still running, since a span is only exported once it has ended.

Proposal

Shift the philosophy to representing what's physically happening, while it's happening.

  1. Focus on invocation attempts, not the whole invocation
  2. Publish spans as soon as an invocation attempt ends — this helps correlation and allows introspection of in-progress invocations
  3. Correlate all invocation attempts under a parent span (e.g. a "started" span)
  4. Use events instead of spans for journal entries inside the invocation attempt span (e.g. ctx.run becomes an event, not a child span)
  5. Still generate service-to-service spans as children (or linked — TBD) to enable navigating between service calls
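The points above can be sketched with a toy model. This is plain Python, not the OpenTelemetry SDK and not the runtime's implementation: `Span`, `Publisher`, `start_invocation`, `run_attempt`, and the invocation ID are all hypothetical names made up for illustration. It assumes the parent is a short "started" span that is ended right away so it can be published early, which is only one possible reading of point 3.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Optional

_span_ids = count(1)  # toy span-id generator, not real OTel IDs


@dataclass
class Span:
    name: str
    invocation_id: str          # hypothetical attribute used for searching
    parent_id: Optional[int]
    start: float
    end: Optional[float] = None
    events: list = field(default_factory=list)  # (timestamp, name) tuples
    span_id: int = field(default_factory=lambda: next(_span_ids))


class Publisher:
    """Stand-in for an exporter: a span becomes visible only once ended."""

    def __init__(self):
        self.published = []

    def end_span(self, span, at):
        span.end = at
        self.published.append(span)


def start_invocation(pub, invocation_id, now):
    # Assumption: the parent is a short "started" span, ended immediately,
    # so later attempt spans always have a published parent to attach to.
    parent = Span("started", invocation_id, parent_id=None, start=now)
    pub.end_span(parent, now)
    return parent


def run_attempt(pub, parent, index, start, end, journal_events):
    attempt = Span(f"attempt {index}", parent.invocation_id,
                   parent_id=parent.span_id, start=start)
    # Journal entries (e.g. ctx.run) are recorded as events, not child spans.
    attempt.events.extend(journal_events)
    # Publish as soon as the attempt ends, not when the invocation completes.
    pub.end_span(attempt, end)
    return attempt


pub = Publisher()
parent = start_invocation(pub, "inv_hypothetical", now=0.0)
first = run_attempt(pub, parent, index=1, start=0.0, end=0.5,
                    journal_events=[(0.2, "ctx.run")])
published_names = [s.name for s in pub.published]
```

In this sketch both the parent and the first attempt are already visible while the invocation is still in progress, which is the property the proposal is after.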

Expected Outcome

  • Users get a physical view of what happened during the invocation via OTel traces
  • The logical view remains the Restate UI
  • Users can search spans by invocation ID (navigate from the Restate UI to Jaeger and vice versa)

Example

An invocation that takes 3 attempts, with a 3-second retry interval between them, and does a ctx.run followed by a call, would show:

  • A parent span covering the full invocation
  • 3 child attempt spans (one per attempt), each published as soon as the attempt ends
  • Events within each attempt span for journal entries
  • A service-to-service child/linked span for the outgoing call
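As a rough illustration of the resulting shape, the timeline for this example can be worked out in a few lines of plain Python. Several things here are assumptions not stated above: each attempt is taken to last 1 second, the first two attempts are taken to fail during ctx.run, and only the last attempt reaches the call. The function names and dictionary layout are hypothetical.

```python
def attempt_windows(attempts, attempt_duration, retry_interval):
    """Return (start, end) per attempt, with the retry backoff between them."""
    windows = []
    start = 0.0
    for _ in range(attempts):
        end = start + attempt_duration
        windows.append((start, end))
        start = end + retry_interval  # backoff is NOT part of any attempt span
    return windows


def build_trace(windows):
    """Build the span tree: one parent, one child span per attempt."""
    trace = {
        "name": "invocation",
        "start": windows[0][0],
        "end": windows[-1][1],
        "children": [],
    }
    for index, (start, end) in enumerate(windows, start=1):
        is_last = index == len(windows)
        attempt = {
            "name": f"attempt {index}",
            "start": start,
            "end": end,
            # Journal entries are events on the attempt span.
            "events": ["ctx.run"] + (["call"] if is_last else []),
            # Assumption: only the final attempt reaches the outgoing call,
            # which gets its own service-to-service child (or linked) span.
            "children": [{"name": "call"}] if is_last else [],
        }
        trace["children"].append(attempt)
    return trace


# 3 attempts, 1 s each (assumed), 3 s retry interval (from the example).
windows = attempt_windows(attempts=3, attempt_duration=1.0, retry_interval=3.0)
trace = build_trace(windows)
time_in_attempts = sum(end - start for start, end in windows)
total_time = trace["end"] - trace["start"]
```

Under these assumptions the attempts cover 3 of the 9 seconds the invocation takes, and the two 3-second gaps between attempt spans make the retry backoff visible instead of hiding it inside one long ctx.run span.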
