Python: fix: filter history providers in handoff cloning to prevent duplicate messages#5214
Python: fix: filter history providers in handoff cloning to prevent duplicate messages#5214LEDazzio01 wants to merge 2 commits intomicrosoft:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR aims to fix duplicate chat history in Python handoff workflows by changing how agents are cloned for HandoffAgentExecutor, specifically preventing cloned agents from re-loading/storing prior messages via history providers.
Changes:
- Filter
BaseHistoryProviderinstances out of cloned agents’context_providersand add a no-opInMemoryHistoryProvider. - Refactor handoff executor conversation tracking/broadcasting and adjust handoff detection to return additional context.
- Add a new build-time validation intended to enforce a per-service-call history persistence requirement.
Comments suppressed due to low confidence (1)
python/packages/orchestrations/agent_framework_orchestrations/_handoff.py:428
- On handoff, the code appends
handoff_message(a function_result-bearing message) toself._cacheand then returns without clearing it. That leaves stale tool-result content in this executor’s cache, so the next time this agent is asked to respond it may replay tool artifacts and potentially trigger tool call/result mismatches. Either avoid caching this message, ensure it’s cleaned, or clear the cache before returning.
# Add the handoff message to the cache so that the next invocation of the agent includes
# the tool call result. This is necessary because each tool call must have a corresponding
# tool result.
self._cache.append(handoff_message)
await ctx.send_message(
AgentExecutorRequest(messages=[], should_respond=True),
target_id=handoff_target,
)
await ctx.add_event(
WorkflowEvent("handoff_sent", data=HandoffSentEvent(source=self.id, target=handoff_target))
)
self._autonomous_mode_turns = 0 # Reset autonomous mode turn counter on handoff
return
| return Agent( | ||
| client=agent.client, | ||
| id=agent.id, | ||
| name=agent.name, | ||
| description=agent.description, | ||
| context_providers=agent.context_providers, | ||
| context_providers=filtered_providers, | ||
| middleware=agent.agent_middleware, | ||
| require_per_service_call_history_persistence=agent.require_per_service_call_history_persistence, | ||
| default_options=cloned_options, # type: ignore[assignment] |
There was a problem hiding this comment.
agent.require_per_service_call_history_persistence is not an attribute on agent_framework.Agent (Agent stores unknown kwargs in additional_properties). This will raise AttributeError at runtime when cloning. If you need to propagate a flag, read it from agent.additional_properties (or add a real Agent property in core) and avoid passing it as a deprecated direct kwarg to Agent(...).
| # this ensures all options (including custom ones) are kept | ||
| cloned_options = deepcopy(options) | ||
| # Disable parallel tool calls to prevent the agent from invoking multiple handoff tools at once. | ||
| cloned_options["allow_multiple_tool_calls"] = False |
There was a problem hiding this comment.
_clone_chat_agent() no longer forces default_options['store']=False. Handoff executors replay conversation state via their own _full_conversation; allowing provider/service-side storage here can reintroduce duplicated history injection or stale tool-call state across turns. Consider restoring store=False (or otherwise explicitly defining the expected persistence mode for handoff clones).
| cloned_options["allow_multiple_tool_calls"] = False | |
| cloned_options["allow_multiple_tool_calls"] = False | |
| # Handoff executors replay the full conversation via _full_conversation, so cloned agents | |
| # must not rely on provider/service-side persistence that could rehydrate prior turns or | |
| # stale tool-call state and cause duplicate history injection. | |
| cloned_options["store"] = False |
| # Full conversation maintains the chat history between agents across handoffs, | ||
| # excluding internal agent messages such as tool calls and results. | ||
| self._full_conversation.extend(self._cache.copy()) | ||
|
|
There was a problem hiding this comment.
_full_conversation is extended with the raw _cache before the run. _cache can contain tool-control messages (e.g., function approval responses are stored as role="tool" with non-text contents by the base AgentExecutor). Persisting those into _full_conversation contradicts the comment here and can leak tool artifacts into termination checks/output. Clean/filter _cache (e.g., via clean_conversation_for_handoff) before extending _full_conversation.
| # Validate that all agents have require_per_service_call_history_persistence enabled. | ||
| # Handoff workflows use middleware that short-circuits tool calls (MiddlewareTermination), | ||
| # which means the service never sees those tool results. Without per-service-call | ||
| # history persistence, local history providers would persist tool results that | ||
| # the service has no record of, causing call/result mismatches on subsequent turns. | ||
| agents_missing_flag = [ | ||
| resolve_agent_id(agent) | ||
| for agent in resolved_agents.values() | ||
| if not agent.require_per_service_call_history_persistence | ||
| ] | ||
| if agents_missing_flag: | ||
| raise ValueError( | ||
| f"Handoff workflows require all participant agents to have " | ||
| f"'require_per_service_call_history_persistence=True'. " | ||
| f"The following agents are missing this setting: {', '.join(agents_missing_flag)}. " | ||
| f"Set this flag when constructing each Agent to ensure local history stays " | ||
| f"consistent with the service across handoff tool-call short-circuits." | ||
| ) |
There was a problem hiding this comment.
The new build-time validation references agent.require_per_service_call_history_persistence, but Agent does not define this attribute (unknown kwargs are stored under additional_properties). As written, .build() will raise AttributeError for normal agents. Either remove this check, implement the flag as a first-class Agent property in core, or read it safely from additional_properties with a documented default.
| # Filter out history providers to prevent duplicate messages. | ||
| # The HandoffAgentExecutor manages conversation history via _full_conversation, | ||
| # so history providers would re-inject previously stored messages on each | ||
| # agent.run() call, causing the entire conversation to appear twice. | ||
| # A no-op InMemoryHistoryProvider placeholder prevents the agent from | ||
| # auto-injecting a default one at runtime. | ||
| filtered_providers = [ | ||
| p for p in agent.context_providers | ||
| if not isinstance(p, BaseHistoryProvider) | ||
| ] | ||
| # Always add a no-op placeholder to prevent the agent from | ||
| # auto-injecting a default InMemoryHistoryProvider at runtime. | ||
| filtered_providers.append( | ||
| InMemoryHistoryProvider( | ||
| load_messages=False, | ||
| store_inputs=False, | ||
| store_outputs=False, | ||
| ) | ||
| ) | ||
|
|
There was a problem hiding this comment.
Behavior change: cloning now filters out BaseHistoryProvider instances and injects a no-op InMemoryHistoryProvider to avoid duplicate message injection. There are existing orchestration tests in this repo for handoff behavior; please add/adjust a regression test that asserts no duplicated messages are sent after handoff + resume when agents originally had history providers configured.
|
Addressing Copilot's review comments: Important context: This PR branch was created from a fork whose
Everything else in the diff — Specifically responding to each comment: 1. 2. 3. 4. 5. Missing regression tests — Valid feedback. I'll push a test for the history provider filtering behavior. |
Summary
Fixes #4695 — Supersedes #4714 (rebased onto current
main).When a
HandoffAgentExecutorclones an agent via_clone_chat_agent(), history providers fromthe original agent are copied verbatim. These providers re-inject previously stored messages on
each
agent.run()call, causing the entire conversation to appear twice — once from thehandoff's
_full_conversationand again from the history provider.Fix
Filter out
BaseHistoryProviderinstances during cloning and replace them with a no-opInMemoryHistoryProvider(load_messages=False, store_inputs=False, store_outputs=False)to prevent the agent from auto-injecting a default one at runtime.
Contribution Checklist