fix: replace RuntimeError with cancel_tool to prevent memory corruption by xiaosu19 · Pull Request #1 · aws-samples/sample-aws-techbot

xiaosu19 · 2026-05-13T07:23:01Z

Problem

When tool calls reach the 20-call limit, raise RuntimeError in AfterToolCallEvent breaks the message history — toolUse blocks are left without matching toolResult blocks. When AgentCore Memory restores this corrupted history in subsequent requests, Bedrock's ConverseStream API rejects it with:

ValidationException: The number of toolResult blocks at messages.19.content exceeds the number of toolUse blocks of previous turn.

This makes the affected session permanently broken.

Fix

Replace AfterToolCallEvent + raise RuntimeError with BeforeToolCallEvent + event.cancel_tool.

This is the recommended pattern from Strands SDK docs. The tool call is cancelled gracefully — the model receives an error message and responds using already-gathered information. The conversation history remains consistent.

Changes

main.py: ~10 lines changed in the tool limit hook section

Testing

Deployed and verified in us-east-2 with Memory enabled. New sessions work correctly when hitting the 20-tool limit.

…on on tool limit When tool calls reach the 20-call limit, raising RuntimeError in AfterToolCallEvent breaks the message history, leaving toolUse blocks without matching toolResult blocks. When AgentCore Memory restores this corrupted history in subsequent requests, Bedrock's ConverseStream API rejects it with ValidationException. Fix: Use BeforeToolCallEvent with event.cancel_tool instead. This cancels the tool gracefully by returning an error message to the model, which then responds using already-gathered information. The conversation history remains consistent and Memory can safely restore it.

…emory When MCP tool calls are interrupted (timeout, network error), Memory saves incomplete history with toolUse but no toolResult. On restoration, Strands SDK's repair logic can add incorrect toolResult counts (strands-agents/sdk-python#2296), causing Bedrock API rejection. Add fix_message_history() that validates toolUse/toolResult pairing before each invocation and corrects any mismatches.

The previous fix_message_history() ran after Agent creation but before invoke_async(). However, session_manager restores history inside invoke_async(), so the fix ran too early. Move to BeforeModelCallEvent hook which fires right before each model call, after history restoration and SDK's own (buggy) repair logic. This ensures messages are always valid when sent to Bedrock.

xiaosu added 3 commits May 13, 2026 15:22

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: replace RuntimeError with cancel_tool to prevent memory corruption#1

fix: replace RuntimeError with cancel_tool to prevent memory corruption#1
xiaosu19 wants to merge 3 commits into
aws-samples:mainfrom
xiaosu19:fix/tool-limit-memory-corruption

xiaosu19 commented May 13, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

xiaosu19 commented May 13, 2026

Problem

Fix

Changes

Testing

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant