Skip to content

fix: replace RuntimeError with cancel_tool to prevent memory corruption#1

Open
xiaosu19 wants to merge 3 commits into
aws-samples:mainfrom
xiaosu19:fix/tool-limit-memory-corruption
Open

fix: replace RuntimeError with cancel_tool to prevent memory corruption#1
xiaosu19 wants to merge 3 commits into
aws-samples:mainfrom
xiaosu19:fix/tool-limit-memory-corruption

Conversation

@xiaosu19
Copy link
Copy Markdown

Problem

When tool calls reach the 20-call limit, raise RuntimeError in AfterToolCallEvent breaks the message history — toolUse blocks are left without matching toolResult blocks. When AgentCore Memory restores this corrupted history in subsequent requests, Bedrock's ConverseStream API rejects it with:

ValidationException: The number of toolResult blocks at messages.19.content exceeds the number of toolUse blocks of previous turn.

This makes the affected session permanently broken.

Fix

Replace AfterToolCallEvent + raise RuntimeError with BeforeToolCallEvent + event.cancel_tool.

This is the recommended pattern from Strands SDK docs. The tool call is cancelled gracefully — the model receives an error message and responds using already-gathered information. The conversation history remains consistent.

Changes

  • main.py: ~10 lines changed in the tool limit hook section

Testing

Deployed and verified in us-east-2 with Memory enabled. New sessions work correctly when hitting the 20-tool limit.

xiaosu added 3 commits May 13, 2026 15:22
…on on tool limit

When tool calls reach the 20-call limit, raising RuntimeError in
AfterToolCallEvent breaks the message history, leaving toolUse blocks
without matching toolResult blocks. When AgentCore Memory restores this
corrupted history in subsequent requests, Bedrock's ConverseStream API
rejects it with ValidationException.

Fix: Use BeforeToolCallEvent with event.cancel_tool instead. This
cancels the tool gracefully by returning an error message to the model,
which then responds using already-gathered information. The conversation
history remains consistent and Memory can safely restore it.
…emory

When MCP tool calls are interrupted (timeout, network error), Memory
saves incomplete history with toolUse but no toolResult. On restoration,
Strands SDK's repair logic can add incorrect toolResult counts
(strands-agents/sdk-python#2296), causing Bedrock API rejection.

Add fix_message_history() that validates toolUse/toolResult pairing
before each invocation and corrects any mismatches.
The previous fix_message_history() ran after Agent creation but before
invoke_async(). However, session_manager restores history inside
invoke_async(), so the fix ran too early.

Move to BeforeModelCallEvent hook which fires right before each model
call, after history restoration and SDK's own (buggy) repair logic.
This ensures messages are always valid when sent to Bedrock.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant