
LiteLLM integration does not report cached, reasoning, or cache-write token usage #5455

@InterstellarStella

Description

How do you use Sentry?

Sentry SaaS (sentry.io)

Version

2.52.0

Steps to Reproduce

  1. Initialize Sentry with the LiteLLM integration and tracing enabled
  2. Make a completion call through LiteLLM to a provider that supports prompt caching, such as OpenAI or Anthropic (a minimal reproduction sketch follows this list)
  3. Inspect the resulting span data in Sentry's AI Agents dashboard
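
A minimal reproduction sketch for steps 1–2. The integration class name LiteLLMIntegration and its import path are assumptions here and should be verified against the SDK version in use:

import litellm
import sentry_sdk
from sentry_sdk.integrations.litellm import LiteLLMIntegration  # assumed import path

sentry_sdk.init(
    dsn="https://examplePublicKey@o0.ingest.sentry.io/0",  # placeholder DSN
    traces_sample_rate=1.0,  # enable tracing so LLM spans are recorded
    integrations=[LiteLLMIntegration()],
)

with sentry_sdk.start_transaction(name="litellm-cache-repro"):
    # Repeat a long prompt so providers with automatic prompt caching (e.g. OpenAI)
    # serve most input tokens from cache on the second call.
    messages = [{"role": "user", "content": "Summarize this document. " * 500}]
    for _ in range(2):
        response = litellm.completion(model="gpt-4o", messages=messages)
    print(response.usage)  # the detailed usage breakdown is present on the LiteLLM response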

Expected Result

The span should include all available token usage detail attributes, just like the OpenAI and Anthropic integrations do (see the example after this list):

  • gen_ai.usage.input_tokens (total input tokens)
  • gen_ai.usage.input_tokens.cached (cached input tokens, subset of total)
  • gen_ai.usage.input_tokens.cache_write (cache write tokens, if available)
  • gen_ai.usage.output_tokens (total output tokens)
  • gen_ai.usage.output_tokens.reasoning (reasoning tokens, subset of total)
  • gen_ai.usage.total_tokens
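
For illustration, a span with the full breakdown would carry data along these lines (the numbers are hypothetical):

expected_span_data = {
    "gen_ai.usage.input_tokens": 12000,
    "gen_ai.usage.input_tokens.cached": 10000,      # subset of input_tokens
    "gen_ai.usage.input_tokens.cache_write": 1024,  # only when the provider reports it
    "gen_ai.usage.output_tokens": 800,
    "gen_ai.usage.output_tokens.reasoning": 300,    # subset of output_tokens
    "gen_ai.usage.total_tokens": 12800,
}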

This data is necessary for Sentry to correctly calculate model costs using the formula documented here:

input cost = (input_tokens - cached_tokens) x input_rate + cached_tokens x cached_rate

Without cached/reasoning token breakdown, all tokens are charged at the full standard rate, producing inaccurate cost estimates.
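
As a quick numeric illustration (the rates and token counts below are made up, not real provider pricing):

input_tokens, cached_tokens = 12_000, 10_000
input_rate, cached_rate = 2.50 / 1_000_000, 1.25 / 1_000_000  # $/token, hypothetical

# With the cached breakdown (formula above):
correct = (input_tokens - cached_tokens) * input_rate + cached_tokens * cached_rate   # 0.0175
# Without it, every input token is billed at the full rate:
overestimate = input_tokens * input_rate                                              # 0.0300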

Actual Result

The LiteLLM integration's _success_callback only extracts three basic fields:

record_token_usage(
    span,
    input_tokens=getattr(usage, "prompt_tokens", None),
    output_tokens=getattr(usage, "completion_tokens", None),
    total_tokens=getattr(usage, "total_tokens", None),
)

The input_tokens_cached, input_tokens_cache_write, and output_tokens_reasoning parameters of record_token_usage() are never passed. As a result, cost calculations in the AI Agents dashboard overestimate costs for cache-heavy workloads (all input tokens are billed at the full rate) and misattribute output vs. reasoning token costs.
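
A possible fix is to forward the detailed counts whenever LiteLLM surfaces them. The sketch below assumes OpenAI-style usage.prompt_tokens_details.cached_tokens and usage.completion_tokens_details.reasoning_tokens, plus Anthropic-style cache_creation_input_tokens for cache writes; the exact field names need to be verified against LiteLLM's normalized usage objects:

# Sketch only -- the attribute names read from `usage` are assumptions.
prompt_details = getattr(usage, "prompt_tokens_details", None)
completion_details = getattr(usage, "completion_tokens_details", None)

record_token_usage(
    span,
    input_tokens=getattr(usage, "prompt_tokens", None),
    input_tokens_cached=getattr(prompt_details, "cached_tokens", None),
    input_tokens_cache_write=getattr(usage, "cache_creation_input_tokens", None),
    output_tokens=getattr(usage, "completion_tokens", None),
    output_tokens_reasoning=getattr(completion_details, "reasoning_tokens", None),
    total_tokens=getattr(usage, "total_tokens", None),
)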
