
fix: stabilize function call IDs across streaming events#4653

Closed
giulio-leone wants to merge 1 commit into google:main from giulio-leone:fix/streaming-function-call-id-mismatch

Conversation

@giulio-leone

Problem

When the LLM does not provide function call IDs (common with some model backends), ADK generates client-side IDs via populate_client_function_call_id(). In streaming mode, partial and final events for the same logical function call each receive a fresh uuid4(), resulting in an ID mismatch.

This breaks:

  • HITL (human-in-the-loop) workflows that correlate partial and final function calls
  • SSE consumers that track function call state across streaming chunks
  • Tool execution that relies on stable IDs for function call → function response pairing

Root Cause

_finalize_model_response_event creates a new Event object for each llm_response chunk by merging model_response_event and llm_response via Event.model_validate(). Since the new event has no function call IDs, populate_client_function_call_id() generates brand-new IDs every time — there is no mechanism to remember previously generated IDs.
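The failure mode can be reproduced with a minimal, self-contained sketch (the helper below is a simplified stand-in for ADK's real function, and the `adk-` ID prefix is illustrative, not the actual format): ID population runs independently on each finalized event, so the same logical call gets two different IDs.

```python
import uuid

def populate_client_function_call_id(function_calls):
    # Simplified stand-in: assign a fresh uuid4-based ID to every call
    # that arrived without one, with no memory of earlier events.
    for call in function_calls:
        if not call.get("id"):
            call["id"] = f"adk-{uuid.uuid4()}"

# Partial and final streaming events describing the same logical call:
partial_event = [{"name": "get_weather", "id": None}]
final_event = [{"name": "get_weather", "id": None}]

populate_client_function_call_id(partial_event)
populate_client_function_call_id(final_event)

# The same logical call now carries two different IDs — the bug.
print(partial_event[0]["id"] != final_event[0]["id"])  # True
```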

Fix

Introduce an optional function_call_id_cache: dict[str, str] parameter that maps (name:index) keys to previously generated IDs:

  1. populate_client_function_call_id() — accepts the cache, looks up existing IDs before generating new ones, and stores newly generated IDs
  2. _finalize_model_response_event() — threads the cache to populate_client_function_call_id()
  3. Streaming loop in _run_async() — creates a single cache dict before iteration and passes it through _postprocess_async → _finalize_model_response_event → populate_client_function_call_id()

The cache is keyed by name:index (not just name) to correctly handle multiple calls to the same function within a single response.
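As a hedged sketch of the lookup described above (the `f"{name}:{index}"` key scheme matches the PR's description, but the data shapes and helper names here are illustrative, not the exact ADK code):

```python
import uuid

def generate_client_function_call_id() -> str:
    # Illustrative ID format; the real ADK format may differ.
    return f"adk-{uuid.uuid4()}"

def populate_client_function_call_id(function_calls, function_call_id_cache=None):
    for index, call in enumerate(function_calls):
        if call.get("id"):
            continue  # server-provided IDs are never overwritten
        cache_key = f"{call['name']}:{index}"
        if function_call_id_cache is not None and cache_key in function_call_id_cache:
            # Reuse the ID generated for an earlier (partial) event.
            call["id"] = function_call_id_cache[cache_key]
        else:
            call["id"] = generate_client_function_call_id()
            if function_call_id_cache is not None:
                function_call_id_cache[cache_key] = call["id"]

cache: dict[str, str] = {}
partial = [{"name": "get_weather", "id": None}]
final = [{"name": "get_weather", "id": None}]
populate_client_function_call_id(partial, cache)
populate_client_function_call_id(final, cache)
print(partial[0]["id"] == final[0]["id"])  # True
```

Because the streaming loop creates one cache per model response and threads it through every chunk, the partial and final events resolve to the same ID.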

Testing

Added 8 unit tests in test_streaming_function_call_ids.py:

  • Cache generation, reuse, and skip-when-server-provides-id
  • Multiple same-name function calls with separate stable IDs
  • Partial → final event ID stability via _finalize_model_response_event
  • Regression test showing IDs differ without cache (the bug scenario)

All 358 flow tests pass, 33 streaming tests pass (2 consecutive clean runs).
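The "multiple same-name calls" scenario from the test list can be sketched in isolation (the helper below is illustrative and not taken from the real test module):

```python
import uuid

def stable_ids(call_names, cache):
    # Key by name:index so repeated calls to the same tool stay distinct.
    ids = []
    for index, name in enumerate(call_names):
        key = f"{name}:{index}"
        ids.append(cache.setdefault(key, f"adk-{uuid.uuid4()}"))
    return ids

cache: dict[str, str] = {}
partial = stable_ids(["lookup", "lookup"], cache)  # two calls to the same tool
final = stable_ids(["lookup", "lookup"], cache)    # replayed in the final event

assert partial == final          # stable across partial and final events
assert partial[0] != partial[1]  # but distinct per call within one response
```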

Fixes #4609

@google-cla

google-cla bot commented Feb 28, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves an issue where client-generated function call IDs were inconsistent across streaming LLM response events. By implementing a caching mechanism, the system now ensures that a unique and stable identifier is maintained for each logical function call throughout its streaming lifecycle, thereby improving the reliability of various downstream processes that depend on these IDs.

Highlights

  • Problem Identification: When LLMs do not provide function call IDs, client-side generation via uuid4() in streaming mode results in different IDs for the same logical function call across partial and final events, breaking HITL workflows, SSE consumers, and tool execution.
  • Root Cause Analysis: The _finalize_model_response_event function creates new Event objects for each LLM response chunk, causing populate_client_function_call_id() to generate fresh, unremembered IDs every time.
  • Solution Implementation: An optional function_call_id_cache (a dictionary mapping name:index to IDs) was introduced. This cache is passed through _run_async -> _postprocess_async -> _finalize_model_response_event -> populate_client_function_call_id() to ensure stable IDs for function calls across streaming events.
  • Comprehensive Testing: Eight new unit tests were added to test_streaming_function_call_ids.py to cover cache generation, reuse, server-provided ID handling, multiple same-name function calls, and partial-to-final event ID stability, including a regression test for the original bug scenario.


Changelog
  • src/google/adk/flows/llm_flows/base_llm_flow.py
    • Modified _finalize_model_response_event to accept an optional function_call_id_cache parameter.
    • Updated _run_one_step_async to initialize and pass the function_call_id_cache to subsequent processing steps.
    • Adjusted _postprocess_async to receive and forward the function_call_id_cache.
  • src/google/adk/flows/llm_flows/functions.py
    • Enhanced populate_client_function_call_id to accept an optional function_call_id_cache and use it to store and retrieve function call IDs, ensuring stability based on a name:index key.
  • tests/unittests/flows/llm_flows/test_streaming_function_call_ids.py
    • Added a new test file with unit tests to validate the stability of function call IDs across streaming events, covering cache functionality, multiple function calls, and regression scenarios.
Activity
  • The author, giulio-leone, identified a problem where client-side generated function call IDs were unstable during streaming LLM responses.
  • A solution was designed to introduce a caching mechanism to maintain consistent IDs for function calls across partial and final streaming events.
  • Code changes were implemented in base_llm_flow.py and functions.py to integrate the new function_call_id_cache.
  • New unit tests were added to thoroughly verify the correctness and stability of the function call ID generation and caching.
  • The author confirmed that all existing flow tests and streaming tests passed after the changes, indicating no regressions.

@adk-bot
Collaborator

adk-bot commented Feb 28, 2026

Response from ADK Triaging Agent

Hello @giulio-leone, thank you for your contribution!

Before we can merge this pull request, we need you to sign our Contributor License Agreement (CLA). You can find more information and sign it here: https://cla.developers.google.com/

Thank you!

@adk-bot adk-bot added the core [Component] This issue is related to the core interface and implementation label Feb 28, 2026
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a caching mechanism to ensure stable function call IDs across partial and final events in streaming mode, which is a good fix for the described problem. The implementation correctly uses a dictionary to cache generated IDs, keyed by function name and index to handle multiple calls. The changes are well-tested with new unit tests covering various scenarios.

I have two main points of feedback:

  1. A suggestion to refactor the caching logic in functions.py to be more concise using dict.setdefault().
  2. A more critical concern that the fix appears to be incomplete, as it only covers the SSE streaming mode (run_async) and not the live/bidi-streaming mode (run_live), where function call IDs would likely remain unstable. This should be addressed to ensure consistent behavior across all streaming modes.

Comment on lines +836 to +838
# Cache maps function call names to generated IDs so that partial and
# final streaming events for the same call share a stable ID.
function_call_id_cache: dict[str, str] = {}
Contributor

high

This change correctly introduces a cache to stabilize function call IDs for the SSE streaming mode handled by _run_one_step_async. However, the live/bidi-streaming mode handled by run_live appears to be missing this fix. The run_live method does not create or pass a function_call_id_cache, leading to unstable function call IDs in that streaming scenario. The caching mechanism should also be implemented for the run_live flow to ensure consistent behavior across all streaming modes.

Comment on lines +197 to +202
if function_call_id_cache is not None and cache_key in function_call_id_cache:
  function_call.id = function_call_id_cache[cache_key]
else:
  function_call.id = generate_client_function_call_id()
  if function_call_id_cache is not None:
    function_call_id_cache[cache_key] = function_call.id
Contributor

medium

This logic for handling the cache can be simplified. Using dict.setdefault() can make the code more concise and easier to read by combining the check for existence, value retrieval, and setting the default value into a single operation.

Suggested change
- if function_call_id_cache is not None and cache_key in function_call_id_cache:
-   function_call.id = function_call_id_cache[cache_key]
- else:
-   function_call.id = generate_client_function_call_id()
-   if function_call_id_cache is not None:
-     function_call_id_cache[cache_key] = function_call.id
+ if function_call_id_cache is not None:
+   function_call.id = function_call_id_cache.setdefault(
+       cache_key, generate_client_function_call_id()
+   )
+ else:
+   function_call.id = generate_client_function_call_id()
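For readers unfamiliar with it, dict.setdefault() collapses the check-then-get-or-store pattern into one call: it returns the existing value when the key is present, otherwise it stores and returns the supplied default.

```python
cache = {}
first = cache.setdefault("get_weather:0", "id-A")   # key absent: stores "id-A"
second = cache.setdefault("get_weather:0", "id-B")  # key present: "id-B" ignored
print(first, second)  # id-A id-A
```

One trade-off worth noting: setdefault evaluates its default argument eagerly, so generate_client_function_call_id() would still run on a cache hit; the generated ID is simply discarded.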

@giulio-leone giulio-leone force-pushed the fix/streaming-function-call-id-mismatch branch from 5852641 to a1abd98 on February 28, 2026 14:38
When models don't provide function call IDs, ADK generates client-side
IDs via populate_client_function_call_id(). In streaming mode, partial
and final events for the same logical function call each get a fresh
uuid4, causing an ID mismatch that breaks HITL (human-in-the-loop)
workflows and SSE consumers that correlate function calls across chunks.

Root cause: _finalize_model_response_event creates a new Event object
for each llm_response chunk, and populate_client_function_call_id
generates a brand-new ID every time without knowledge of prior IDs.

Fix: Add an optional function_call_id_cache dict that maps
name:index keys to previously generated IDs. The streaming loop in
_run_async creates the cache before iteration and threads it through
_postprocess_async → _finalize_model_response_event →
populate_client_function_call_id, ensuring the same logical function
call gets a stable ID across all streaming events.

The cache is keyed by (name:index) to correctly handle multiple calls
to the same function within a single response.

Fixes google#4609
@giulio-leone giulio-leone force-pushed the fix/streaming-function-call-id-mismatch branch from a1abd98 to 82f8d5e on February 28, 2026 14:39
@giulio-leone
Author

Closing — CLA not yet signed. Will resubmit when ready.


Labels

core [Component] This issue is related to the core interface and implementation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

populate_client_function_call_id generates different UUIDs for the same function call across partial and final SSE streaming events

2 participants