feat(llma): pass raw provider usage metadata for backend cost calculations #411
Merged
Conversation
Add a raw_usage field to the TokenUsage type to capture raw provider usage metadata (OpenAI, Anthropic, Gemini). This enables the backend to extract modality-specific token counts (text vs. image vs. audio) for accurate cost calculations.

- Add raw_usage field to the TokenUsage TypedDict
- Update all provider converters to capture raw usage:
  - OpenAI: capture response.usage and chunk usage
  - Anthropic: capture usage from message_start and message_delta events
  - Gemini: capture usage_metadata from responses and chunks
- Pass raw usage as the $ai_usage property in PostHog events
- Update merge_usage_stats to handle raw_usage in both modes
- Add tests verifying $ai_usage is captured for all providers

The backend will extract provider-specific details and delete $ai_usage after processing to avoid bloating event properties.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
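The raw_usage addition described above can be sketched as a TypedDict extension. This is a hypothetical sketch: the exact field names on TokenUsage other than raw_usage are assumptions, not taken from the PR diff.

```python
from typing import Any, Dict, TypedDict


class TokenUsage(TypedDict, total=False):
    """Sketch of a TokenUsage shape with the new raw_usage field.

    Fields other than raw_usage are illustrative assumptions.
    """

    input_tokens: int
    output_tokens: int
    # Raw provider usage payload, forwarded to PostHog as $ai_usage so the
    # backend can extract modality-specific counts for cost calculations.
    raw_usage: Dict[str, Any]


# Example: an OpenAI-style usage payload attached verbatim.
usage: TokenUsage = {
    "input_tokens": 12,
    "output_tokens": 34,
    "raw_usage": {"prompt_tokens": 12, "completion_tokens": 34, "total_tokens": 46},
}
```

Because the backend deletes $ai_usage after processing, the raw payload never needs a stable schema across providers; each converter can attach whatever its SDK reports.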
andrewm4894 (Member) reviewed on Jan 27, 2026 and left a comment:
just double checking one or two things
Address PR review feedback from @andrewm4894:

1. **Serialization**: Add a serialize_raw_usage() helper with a fallback chain:
   - .model_dump() for Pydantic models (OpenAI/Anthropic)
   - .to_dict() for protobuf-like objects
   - vars() for simple objects
   - str() as a last resort
   This ensures we never pass unserializable objects to the PostHog client.
2. **Data-loss prevention**: Merge raw_usage in incremental mode instead of replacing it. For Anthropic streaming, message_start carries the input token details and message_delta carries the output token details; merging preserves both instead of losing the input data.
3. **Test coverage**: Enhance tests to verify:
   - JSON serializability via json.dumps()
   - the expected structure of the raw_usage dicts
   - both non-streaming and streaming modes
   Also fix the Gemini test mocks to return proper dicts from model_dump().

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
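The fallback chain in point 1 could look roughly like the following. This is a sketch under the assumptions stated in the commit message, not the PR's actual implementation; the broad exception handling mirrors the "never pass unserializable objects" goal.

```python
from typing import Any, Dict, Union


def serialize_raw_usage(raw: Any) -> Union[Dict[str, Any], str]:
    """Best-effort conversion of a provider usage object to plain data.

    Fallback chain: .model_dump() (Pydantic) -> .to_dict() (protobuf-like)
    -> vars() (simple objects) -> str() (last resort).
    """
    for attr in ("model_dump", "to_dict"):
        method = getattr(raw, attr, None)
        if callable(method):
            try:
                return method()
            except Exception:
                pass  # fall through to the next strategy
    try:
        return vars(raw)  # works for objects with a __dict__
    except TypeError:
        return str(raw)  # last resort: always JSON-serializable
```

The chain is ordered from most faithful to least: model_dump() and to_dict() preserve nested structure, vars() flattens to the instance dict, and str() guarantees something serializable even for exotic types.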
Address PR feedback from @andrewm4894: serialize in the converters, not in utils.

**Problem:** Utils was receiving raw Pydantic/protobuf objects and serializing them, which leaked provider-specific knowledge into generic code.

**Solution:** Move serialization into the converters, where the provider context exists.

Converters (new):
- OpenAI: serialize_raw_usage(response.usage) → dict
- Anthropic: serialize_raw_usage(event.usage) → dict
- Gemini: serialize_raw_usage(metadata) → dict

Utils (simplified):
- Just passes dicts through; no serialization needed
- Merge operations work with dicts only

**Benefits:**
1. Type correctness: raw_usage is always Dict[str, Any]
2. Separation of concerns: converters handle provider formats
3. Fail fast: serialization errors surface in the converters, with context
4. Cleaner abstraction: utils knows nothing about Pydantic or protobuf

**Flow:** Provider object → converter serializes → dict → utils → PostHog

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
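Once converters hand utils plain dicts, the incremental merge described earlier (message_start input details plus message_delta output details) reduces to a dict update. A minimal sketch, assuming the function name and key names for illustration:

```python
from typing import Any, Dict


def merge_raw_usage(current: Dict[str, Any], incoming: Dict[str, Any]) -> Dict[str, Any]:
    """Merge raw usage dicts in incremental (streaming) mode.

    Merging instead of replacing preserves input-token details from
    message_start while adding output-token details from message_delta.
    Incoming keys win on conflict, matching the "latest chunk" semantics.
    """
    merged = dict(current)
    merged.update(incoming)
    return merged


# Anthropic streaming example: two events contribute different halves.
start = {"input_tokens": 100, "cache_read_input_tokens": 20}  # message_start
delta = {"output_tokens": 55}                                 # message_delta
combined = merge_raw_usage(start, delta)
```

Since utils only ever sees dicts here, no isinstance checks against provider SDK types are needed in the generic code path.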
Fix mypy error: "Need type annotation for 'current_raw'". Extract the value first, then apply an explicit type annotation with a ternary conditional to satisfy mypy's type checker.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
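The extract-then-annotate pattern from this commit might look like the following. The surrounding function and key names are hypothetical; only the annotation technique reflects the commit message.

```python
from typing import Any, Dict


def get_current_raw(stats: Dict[str, Any]) -> Dict[str, Any]:
    """Illustrates the mypy fix: pull the value out first, then bind it
    to an explicitly annotated name via a ternary conditional."""
    value = stats.get("raw_usage")
    # Without the explicit annotation, mypy cannot infer a single type
    # for current_raw across both branches and reports
    # 'Need type annotation for "current_raw"'.
    current_raw: Dict[str, Any] = value if isinstance(value, dict) else {}
    return current_raw
```

The isinstance guard doubles as a runtime narrowing check, so mypy accepts the dict branch without a cast.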
andrewm4894 approved these changes on Jan 28, 2026.