fix: restore per-cycle span duration for execute_event_loop_cycle #1939
Open
Di-Is wants to merge 3 commits into strands-agents:main
PR strands-agents#1293 wrapped event_loop_cycle() in use_span(end_on_exit=True) and removed explicit span.end() calls. Because event_loop_cycle is an async generator, yield keeps the context manager open across recursive cycles, causing all execute_event_loop_cycle spans to share the same OTel end_time. Switch to end_on_exit=False and explicitly call span.end() via _end_span() in end_event_loop_cycle_span() and end_model_invoke_span(), restoring end_span_with_error() in all exception paths.
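The shared end time can be reproduced without OpenTelemetry. The following is a minimal sketch, not the SDK's code: `FakeSpan`, `Clock`, this local `use_span`, and `cycle` are hypothetical stand-ins, and the counter-based clock replaces real timestamps. It assumes each cycle finishes its own work before recursing.

```python
import asyncio
from contextlib import contextmanager


class Clock:
    """Deterministic stand-in for wall-clock time (hypothetical helper)."""

    def __init__(self):
        self.now = 0

    def tick(self):
        self.now += 1


class FakeSpan:
    """Stand-in for an OTel span; records the clock value when end() is called."""

    def __init__(self, clock):
        self.clock = clock
        self.end_time = None

    def end(self):
        if self.end_time is None:  # ending twice keeps the first end_time
            self.end_time = self.clock.now


@contextmanager
def use_span(span, end_on_exit):
    """Simplified model of a use_span-style context manager."""
    try:
        yield span
    finally:
        if end_on_exit:
            span.end()


async def cycle(depth, max_depth, clock, spans, end_on_exit):
    """Recursive async generator, loosely modelled on an event loop cycle."""
    span = FakeSpan(clock)
    spans.append(span)
    with use_span(span, end_on_exit):
        clock.tick()  # this cycle's own work
        if not end_on_exit:
            span.end()  # explicit end before recursing
        if depth < max_depth:
            # The `with` block stays open while child events are relayed.
            async for event in cycle(depth + 1, max_depth, clock, spans, end_on_exit):
                yield event
        else:
            yield "stop"


def collect_end_times(end_on_exit):
    async def run():
        spans = []
        async for _ in cycle(0, 2, Clock(), spans, end_on_exit):
            pass
        return [s.end_time for s in spans]

    return asyncio.run(run())


if __name__ == "__main__":
    print(collect_end_times(end_on_exit=True))   # [3, 3, 3]
    print(collect_end_times(end_on_exit=False))  # [1, 2, 3]
```

With `end_on_exit=True`, no span ends until the innermost generator finishes, so every cycle reports the same end time. Ending each span explicitly before recursing gives one end time per cycle.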
…boardInterrupt
Trace spans were not properly closed when a BaseException (e.g. KeyboardInterrupt, asyncio.CancelledError) was raised. Add explicit BaseException handlers to close spans, and aclose() calls to ensure async generators are cleaned up.
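The pattern described here can be sketched roughly as follows. This is an illustration under assumptions, not the PR's diff: `FakeSpan`, `stream`, `traced_cycle`, and `demo` are hypothetical names, and `span.end(error=...)` stands in for whatever `end_span_with_error()` does in the SDK.

```python
import asyncio


class FakeSpan:
    """Stand-in for a trace span (hypothetical)."""

    def __init__(self):
        self.ended = False
        self.error = None

    def end(self, error=None):
        if not self.ended:
            self.ended = True
            self.error = error


async def stream(log):
    """Inner async generator whose cleanup must run even on interruption."""
    try:
        yield "chunk-1"
        yield "chunk-2"
    finally:
        log.append("stream closed")


async def traced_cycle(span, inner):
    """Relay events from `inner`, closing the span on every exit path."""
    try:
        async for event in inner:
            yield event
        span.end()
    except BaseException as exc:  # also covers KeyboardInterrupt and CancelledError
        span.end(error=exc)
        raise
    finally:
        # `async for` does not close `inner` when interrupted, so close it here.
        await inner.aclose()


async def demo():
    span, log = FakeSpan(), []
    gen = traced_cycle(span, stream(log))
    await gen.__anext__()  # consume the first chunk
    try:
        # Simulate an interrupt arriving while the generator is suspended.
        await gen.athrow(KeyboardInterrupt())
    except KeyboardInterrupt:
        pass
    return span, log


if __name__ == "__main__":
    span, log = asyncio.run(demo())
    print(span.ended, type(span.error).__name__, log)
```

A plain `except Exception` would miss both `KeyboardInterrupt` and `asyncio.CancelledError`, since both derive from `BaseException` directly, which is why the broader handler is needed once spans no longer end on context exit.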
Reduce overhead by limiting force_flush calls to agent span completion instead of every span end. Add flush parameter to _end_span() with default False, passing True only from end_agent_span().
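A rough sketch of the flush change, assuming a tracer that holds a provider with a `force_flush()` method. `SketchTracer`, `FakeProvider`, and `FakeSpan` are hypothetical; only the method names `_end_span`, `end_event_loop_cycle_span`, `end_agent_span`, and the `flush` parameter come from the commit message, and their real signatures may differ.

```python
class FakeProvider:
    """Stand-in for a tracer provider; counts force_flush() calls."""

    def __init__(self):
        self.flush_count = 0

    def force_flush(self):
        self.flush_count += 1


class FakeSpan:
    """Stand-in for a trace span (hypothetical)."""

    def __init__(self):
        self.ended = False

    def end(self):
        self.ended = True


class SketchTracer:
    """Hypothetical tracer showing flush-on-agent-span-only."""

    def __init__(self, provider):
        self.provider = provider

    def _end_span(self, span, flush=False):
        span.end()
        if flush:
            # Exporting is deferred until the caller explicitly asks for it.
            self.provider.force_flush()

    def end_event_loop_cycle_span(self, span):
        self._end_span(span)  # no flush for inner spans

    def end_agent_span(self, span):
        self._end_span(span, flush=True)  # flush once, at the outermost span


def run_agent_with_cycles(cycle_count):
    provider = FakeProvider()
    tracer = SketchTracer(provider)
    for _ in range(cycle_count):
        tracer.end_event_loop_cycle_span(FakeSpan())
    tracer.end_agent_span(FakeSpan())
    return provider.flush_count


if __name__ == "__main__":
    print(run_agent_with_cycles(3))  # 1
```

In this model the number of flushes per agent invocation is constant rather than growing with the number of spans.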
Description
Since v1.24.0 (PR #1293), `execute_event_loop_cycle` spans no longer reflect per-cycle duration. When a cycle performs tool use and recurses, the parent cycle's native OTel span stays open until all recursive children complete, producing cumulative bottom-up latency instead of per-step latency in observability backends (Langfuse, Jaeger, etc.).

The root cause: `event_loop_cycle()` is an async generator whose body was wrapped in `use_span(end_on_exit=True)`. Because `yield` keeps the context manager open across recursive cycles, all `span.end()` calls fire simultaneously when the generator chain unwinds. The logical metadata (the `gen_ai.event.end_time` attribute) is set at the correct time, but backends use the native OTel `endTimeUnixNano` from `span.end()`.

Three commits, each reviewable independently:

1. fix: restore explicit span.end() to fix span end_time regression
   Switches to `end_on_exit=False` and restores explicit `span.end()` calls in `end_event_loop_cycle_span()` and `end_model_invoke_span()`, with `end_span_with_error()` on exception paths.
2. fix: handle BaseException in trace spans to prevent span leaks on KeyboardInterrupt
   With `end_on_exit=False`, spans must be explicitly closed on all paths. Adds `except BaseException` handlers and `aclose()` for async generators to cover `KeyboardInterrupt` and `asyncio.CancelledError`.
3. perf: only force flush tracer provider when ending agent spans
   `_end_span()` previously called `force_flush()` on every span end; this limits it to agent span completion only.

Resolves #1930, #1938
Related Issues
`execute_event_loop_cycle` spans all share the same OTel end_time #1938 (duplicate, closed)
Documentation PR
N/A
Type of Change
Bug fix
Testing
How have you tested the change? Verify that the changes do not break functionality or introduce warnings in consuming repositories: agents-docs, agents-tools, agents-cli
hatch run prepare
Checklist
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.