mcp: add response size cap to prevent blowing LLM context windows#35782
mcp: add response size cap to prevent blowing LLM context windows#35782bobbyiliev merged 1 commit intoMaterializeInc:mainfrom
Conversation
|
Thanks for opening this PR! Here are a few tips to help make the review process smooth for everyone. PR title guidelines
Pre-merge checklist
|
d8016d0 to
fe75fcc
Compare
src/environmentd/src/http/mcp.rs
Outdated
| let rows = execute_sql(client, DISCOVERY_QUERY).await?; | ||
| debug!("get_data_products returned {} rows", rows.len()); | ||
| if rows.is_empty() { | ||
| warn!("No data products found - indexes must have comments"); |
There was a problem hiding this comment.
Is this error msg correct? mz_mcp_data_products has a LEFT JOIN with mz_comments, so it would return rows even if there are no comments on objects, if I'm reading things correctly.
(Also, is it intended that this is a log msg, and not an error to be returned to the user?)
There was a problem hiding this comment.
Oh yes, good catch! Removed the warning entirely as the agent already sees the empty array in the response.
src/environmentd/src/http/mcp.rs
Outdated
|
|
||
| let query_tool_enabled = ENABLE_MCP_AGENTS_QUERY_TOOL.get(dyncfgs); | ||
| let max_response_size = | ||
| usize::try_from(MCP_MAX_RESPONSE_SIZE.get(dyncfgs)).unwrap_or(usize::MAX); |
There was a problem hiding this comment.
Why not just make MCP_MAX_RESPONSE_SIZE a usize instead of a u32, to avoid the casting?
src/adapter-types/src/dyncfgs.rs
Outdated
| /// Maximum size (in bytes) of MCP tool response content after JSON serialization. | ||
| /// Responses exceeding this limit are rejected with a clear error telling the | ||
| /// agent to narrow its query. Keeps responses within LLM context window limits. | ||
| pub const MCP_MAX_RESPONSE_SIZE: Config<u32> = Config::new( |
There was a problem hiding this comment.
should this be in the tool description as well?
There was a problem hiding this comment.
Good call! Added the response limit dynamically to the read_data_product, query, and query_system_catalog tool descriptions. It's populated from the mcp_max_response_size dyncfg value at request time so it stays in sync.
The only thing that I can think of as a small tradeoff is a few extra tokens in the tool descriptions that the LLM processes on every request, but it should help agents proactively use LIMIT instead of hitting the error.
c99e14b to
d02243f
Compare
d02243f to
bda927d
Compare
Adds an
mcp_max_response_sizedyncfg (default 1MB) that rejects MCP tool responses exceeding the limit with a clear JSON-RPC error telling the agent to useLIMITorWHERE. This catches oversized responses after JSON serialization, which is what the LLM actually consumes.The existing
max_result_sizeoperates on raw row bytes during streaming and doesn't account for JSON formatting overhead.Fixes https://linear.app/materializeinc/issue/DEX-3/mcp-query-safety-limits-size-timeout