fix: stabilize function call IDs across streaming events #4653

giulio-leone wants to merge 1 commit into google:main
Conversation
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View the failed invocation of the CLA check for more information. For the most up-to-date status, view the checks section at the bottom of the pull request.
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request resolves an issue where client-generated function call IDs were inconsistent across streaming LLM response events. By implementing a caching mechanism, the system now ensures that a unique and stable identifier is maintained for each logical function call throughout its streaming lifecycle, thereby improving the reliability of downstream processes that depend on these IDs.
Response from ADK Triaging Agent

Hello @giulio-leone, thank you for your contribution! Before we can merge this pull request, we need you to sign our Contributor License Agreement (CLA). You can find more information and sign it here: https://cla.developers.google.com/ Thank you!
Code Review
This pull request introduces a caching mechanism to ensure stable function call IDs across partial and final events in streaming mode, which is a good fix for the described problem. The implementation correctly uses a dictionary to cache generated IDs, keyed by function name and index to handle multiple calls. The changes are well-tested with new unit tests covering various scenarios.
I have two main points of feedback:
- A suggestion to refactor the caching logic in `functions.py` to be more concise using `dict.setdefault()`.
- A more critical concern that the fix appears to be incomplete, as it only covers the SSE streaming mode (`run_async`) and not the live/bidi-streaming mode (`run_live`), where function call IDs would likely remain unstable. This should be addressed to ensure consistent behavior across all streaming modes.
```python
# Cache maps function call names to generated IDs so that partial and
# final streaming events for the same call share a stable ID.
function_call_id_cache: dict[str, str] = {}
```
This change correctly introduces a cache to stabilize function call IDs for the SSE streaming mode handled by _run_one_step_async. However, the live/bidi-streaming mode handled by run_live appears to be missing this fix. The run_live method does not create or pass a function_call_id_cache, leading to unstable function call IDs in that streaming scenario. The caching mechanism should also be implemented for the run_live flow to ensure consistent behavior across all streaming modes.
```python
if function_call_id_cache is not None and cache_key in function_call_id_cache:
  function_call.id = function_call_id_cache[cache_key]
else:
  function_call.id = generate_client_function_call_id()
  if function_call_id_cache is not None:
    function_call_id_cache[cache_key] = function_call.id
```
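Exercised in isolation, the cached lookup behaves as in this sketch (the ID generator here is a stand-in, not ADK's actual helper, and the `name:index` keys are illustrative):

```python
import uuid
from typing import Optional


def generate_client_function_call_id() -> str:
    # Stand-in for ADK's helper; assumed to wrap uuid4 with a prefix.
    return f"adk-{uuid.uuid4()}"


def resolve_id(cache_key: str, cache: Optional[dict[str, str]]) -> str:
    # Mirrors the PR's logic: reuse a cached ID when one exists,
    # otherwise generate a fresh ID and remember it for later chunks.
    if cache is not None and cache_key in cache:
        return cache[cache_key]
    new_id = generate_client_function_call_id()
    if cache is not None:
        cache[cache_key] = new_id
    return new_id


cache: dict[str, str] = {}
partial = resolve_id("search:0", cache)  # first (partial) chunk
final = resolve_id("search:0", cache)    # final chunk reuses the cached ID
other = resolve_id("search:1", cache)    # second call to the same function

assert partial == final
assert partial != other
```

A `run_live` flow would need to create and thread its own cache dict in the same way to get the stable-ID behavior.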
This logic for handling the cache can be simplified. Using dict.setdefault() can make the code more concise and easier to read by combining the check for existence, value retrieval, and setting the default value into a single operation.
Current:

```python
if function_call_id_cache is not None and cache_key in function_call_id_cache:
  function_call.id = function_call_id_cache[cache_key]
else:
  function_call.id = generate_client_function_call_id()
  if function_call_id_cache is not None:
    function_call_id_cache[cache_key] = function_call.id
```

Suggested:

```python
if function_call_id_cache is not None:
  function_call.id = function_call_id_cache.setdefault(
      cache_key, generate_client_function_call_id()
  )
else:
  function_call.id = generate_client_function_call_id()
```
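A minimal check of the suggested `setdefault` pattern, with a stand-in ID generator. One caveat worth noting: `setdefault` evaluates its default argument eagerly, so a throwaway ID is generated even on cache hits — cheap here, but a behavioral difference from the explicit `if`/`else`:

```python
import uuid


def generate_client_function_call_id() -> str:
    # Stand-in for ADK's helper.
    return f"adk-{uuid.uuid4()}"


cache: dict[str, str] = {}

# First call stores a fresh ID; second call returns the stored one,
# even though a (discarded) default ID is still generated.
first = cache.setdefault("search:0", generate_client_function_call_id())
second = cache.setdefault("search:0", generate_client_function_call_id())

assert first == second
assert len(cache) == 1
```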
When models don't provide function call IDs, ADK generates client-side IDs via populate_client_function_call_id(). In streaming mode, partial and final events for the same logical function call each get a fresh uuid4, causing an ID mismatch that breaks HITL (human-in-the-loop) workflows and SSE consumers that correlate function calls across chunks.

Root cause: _finalize_model_response_event creates a new Event object for each llm_response chunk, and populate_client_function_call_id generates a brand-new ID every time without knowledge of prior IDs.

Fix: Add an optional function_call_id_cache dict that maps (name, index) keys to previously generated IDs. The streaming loop in _run_async creates the cache before iteration and threads it through _postprocess_async → _finalize_model_response_event → populate_client_function_call_id, ensuring the same logical function call gets a stable ID across all streaming events. The cache is keyed by (name:index) to correctly handle multiple calls to the same function within a single response.

Fixes google#4609
Closing — CLA not yet signed. Will resubmit when ready.
Problem
When the LLM does not provide function call IDs (common with some model backends), ADK generates client-side IDs via `populate_client_function_call_id()`. In streaming mode, partial and final events for the same logical function call each receive a fresh `uuid4()`, resulting in an ID mismatch. This breaks:

- HITL (human-in-the-loop) workflows
- SSE consumers that correlate function calls across chunks
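The mismatch can be reproduced with a toy model of the pre-fix behavior (simplified illustration, not ADK code): each streaming event rebuilds its IDs with no memory of earlier chunks.

```python
import uuid


def populate_ids(names: list[str]) -> dict[str, str]:
    # Pre-fix behavior in miniature: a fresh uuid4-based ID per event,
    # with no cache linking events for the same logical call.
    return {name: f"adk-{uuid.uuid4()}" for name in names}


partial_event = populate_ids(["get_weather"])  # partial streaming chunk
final_event = populate_ids(["get_weather"])    # final chunk, same call

# The same logical function call ends up with two different IDs.
assert partial_event["get_weather"] != final_event["get_weather"]
```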
Root Cause
`_finalize_model_response_event` creates a new `Event` object for each `llm_response` chunk by merging `model_response_event` and `llm_response` via `Event.model_validate()`. Since the new event has no function call IDs, `populate_client_function_call_id()` generates brand-new IDs every time — there is no mechanism to remember previously generated IDs.

Fix
Introduce an optional `function_call_id_cache: dict[str, str]` parameter that maps `name:index` keys to previously generated IDs:

- `populate_client_function_call_id()` — accepts the cache, looks up existing IDs before generating new ones, and stores newly generated IDs
- `_finalize_model_response_event()` — threads the cache to `populate_client_function_call_id()`
- `_run_async()` — creates a single cache dict before iteration and passes it through `_postprocess_async` → `_finalize_model_response_event` → `populate_client_function_call_id`

The cache is keyed by `name:index` (not just the name) to correctly handle multiple calls to the same function within a single response.

Testing
Added 8 unit tests in `test_streaming_function_call_ids.py`, covering ID stability in `_finalize_model_response_event`.

All 358 flow tests pass, 33 streaming tests pass (2 consecutive clean runs).
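A representative shape for such tests might look like the following sketch (illustrative names and a simplified stand-in for the cache-aware ID population, not the PR's actual test code):

```python
import uuid


def resolve_id(key: str, cache: dict[str, str]) -> str:
    # Simplified stand-in for cache-aware client ID population.
    if key not in cache:
        cache[key] = f"adk-{uuid.uuid4()}"
    return cache[key]


def test_partial_and_final_events_share_ids():
    # One cache per streaming model response: chunks agree on the ID.
    cache: dict[str, str] = {}
    partial_id = resolve_id("roll_die:0", cache)
    final_id = resolve_id("roll_die:0", cache)
    assert partial_id == final_id


def test_separate_responses_get_fresh_ids():
    # A new cache per model response yields new IDs.
    first = resolve_id("roll_die:0", {})
    second = resolve_id("roll_die:0", {})
    assert first != second


test_partial_and_final_events_share_ids()
test_separate_responses_get_fresh_ids()
```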
Fixes #4609