feat: add token usage metrics with OpenTelemetry integration#563

Open
ajbozarth wants to merge 8 commits into generative-computing:main from ajbozarth:feat/token-usage-metrics-v2

Conversation


@ajbozarth ajbozarth commented Feb 26, 2026

Misc PR

Type of PR

  • Bug Fix
  • New Feature
  • Documentation
  • Other

Description

Summary

Adds token usage metrics tracking across all Mellea backends using OpenTelemetry metrics counters, following Gen-AI Semantic Conventions for standardized observability.

Changes

Core Implementation

  • Added record_token_usage_metrics() function in mellea/telemetry/metrics.py
  • Implemented lazy initialization of token counters (mellea.llm.tokens.input, mellea.llm.tokens.output)
  • Integrated token tracking into all backends: OpenAI, Ollama, WatsonX, LiteLLM, and HuggingFace
  • Added console exporter support for debugging (MELLEA_METRICS_CONSOLE=true)

Configuration

  • New environment variable: MELLEA_METRICS_ENABLED (default: false)
  • New environment variable: MELLEA_METRICS_CONSOLE (default: false)
  • Metrics export via existing OTEL_EXPORTER_OTLP_ENDPOINT
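A sketch of how these flags might be read; the helper names are hypothetical and the real parsing in mellea may accept other truthy values (e.g. "1" or "yes"):

```python
import os


def _env_flag(name: str, default: str = "false") -> bool:
    # Treat only the literal string "true" (case-insensitive) as enabled.
    return os.environ.get(name, default).strip().lower() == "true"


def metrics_enabled() -> bool:
    # Master switch: metrics are off unless explicitly opted in.
    return _env_flag("MELLEA_METRICS_ENABLED")


def console_export_enabled() -> bool:
    # Debug aid: additionally print metrics to stdout via a console exporter.
    return _env_flag("MELLEA_METRICS_CONSOLE")
```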

Metrics Attributes

All token metrics include Gen-AI semantic convention attributes:

  • gen_ai.system - Backend system name (e.g., openai, ollama)
  • gen_ai.request.model - Model identifier
  • mellea.backend - Backend class name
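For illustration, the attribute set above could be assembled as below; the helper name is hypothetical, and the key names follow the OpenTelemetry Gen-AI semantic conventions as listed:

```python
def token_metric_attributes(
    system: str, model: str, backend_cls: type
) -> dict[str, str]:
    # One attribute dict per recorded measurement; the mellea.backend value
    # is the backend class name per the description above.
    return {
        "gen_ai.system": system,          # e.g. "openai", "ollama"
        "gen_ai.request.model": model,    # model identifier
        "mellea.backend": backend_cls.__name__,
    }
```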

Testing

  • Added comprehensive unit tests for metrics configuration and recording
  • Added integration tests for all backends (Ollama, OpenAI, WatsonX, LiteLLM, HuggingFace)
  • Tests verify proper token counting and attribute tagging

Documentation

  • Updated docs/dev/telemetry.md with complete metrics documentation
  • Added usage examples and configuration guide
  • Documented backend support matrix

Backend Support

Backend       Support   Token Source
OpenAI        ✅ Full    usage.prompt_tokens, usage.completion_tokens
Ollama        ✅ Full    prompt_eval_count, eval_count
WatsonX       ✅ Full    input_token_count, generated_token_count
LiteLLM       ✅ Full    usage.prompt_tokens, usage.completion_tokens
HuggingFace   ✅ Full    Calculated from input_ids and output sequences

Breaking Changes

None - metrics are disabled by default and require explicit opt-in via MELLEA_METRICS_ENABLED=true.

Testing

  • Tests added to the respective file if code was changed
  • New code has 100% coverage if code was added
  • Ensure existing tests and GitHub automation pass (a maintainer will kick off the GitHub automation when the rest of the PR is populated)

Add mellea.llm.tokens.input/output counters following Gen-AI semantic conventions with zero overhead when disabled

Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com>
…LM backends

Add record_token_usage_metrics() calls to all backend post_processing methods to track input/output tokens. Add get_value() helper in backends/utils.py to handle dict/object attribute extraction.

Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com>
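The get_value() helper described in the commit above could plausibly look like this; a sketch, not the actual implementation. Backends return usage either as plain dicts (e.g. Ollama) or as response objects with attributes (e.g. OpenAI client models), so one accessor covers both:

```python
from typing import Any


def get_value(obj: Any, key: str, default: Any = None) -> Any:
    # Uniform extraction: dict lookup for mappings, getattr otherwise,
    # with a default when the key/attribute is missing or obj is None.
    if obj is None:
        return default
    if isinstance(obj, dict):
        return obj.get(key, default)
    return getattr(obj, key, default)
```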
Calculate token counts from input_ids and output sequences. Record to both tracing spans and metrics using a helper function.

Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com>
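A rough sketch of the HuggingFace calculation described above, using plain lists in place of tensors. It assumes (as is the default for decoder-only models) that generate() returns sequences that include the prompt tokens, so the output count is total length minus prompt length; the function name is hypothetical:

```python
def hf_token_counts(
    input_ids: list[list[int]], sequences: list[list[int]]
) -> tuple[int, int]:
    # Input tokens: total length of the tokenized prompts.
    input_tokens = sum(len(ids) for ids in input_ids)
    # Generated sequences include the prompt, so subtract it back out
    # to get only the newly generated tokens.
    total_tokens = sum(len(seq) for seq in sequences)
    return input_tokens, total_tokens - input_tokens
```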
- Add integration tests for Ollama, OpenAI, LiteLLM, HuggingFace, WatsonX
- Tests revealed metrics were coupled with tracing (architectural issue)
- Fixed: Metrics now record independently of tracing spans
- WatsonX: Store full response to preserve usage information
- HuggingFace: Add zero-overhead guard, optimize test model

Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com>
…ation

Use MonkeyPatch for cleanup and update Watsonx to granite-4-h-small.

Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com>
- Add Token Usage Metrics section to docs/dev/telemetry.md with metric
  definitions, backend support table, and configuration examples
- Create metrics_example.py demonstrating token tracking with tested
  console output
- Update telemetry_example.py to reference new metrics example
- Update examples/telemetry/README.md with metrics quick start guide

Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com>
@ajbozarth ajbozarth self-assigned this Feb 26, 2026
@ajbozarth ajbozarth requested a review from a team as a code owner February 26, 2026 22:45
@github-actions

The PR description has been updated. Please fill out the template for your PR to be reviewed.


mergify bot commented Feb 26, 2026

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert|release)(?:\(.+\))?:

Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com>
@ajbozarth
Contributor Author

After opening this I had Bob and Claude do in-depth reviews, and they came back with a handful of things I want to address. I will work on fixing those tomorrow.

Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com>
@ajbozarth
Contributor Author

I've pushed a small update to test and document streaming support, as suggested by AI review.

As of now this is ready for full review and merge


Development

Successfully merging this pull request may close these issues.

Implement counters to track token usage across all LLM backends with model and backend labels
