Skip to content

mcp: add response size cap to prevent blowing LLM context windows#35782

Merged
bobbyiliev merged 1 commit intoMaterializeInc:mainfrom
bobbyiliev:mcp-response-size-cap
Mar 31, 2026
Merged

mcp: add response size cap to prevent blowing LLM context windows#35782
bobbyiliev merged 1 commit intoMaterializeInc:mainfrom
bobbyiliev:mcp-response-size-cap

Conversation

@bobbyiliev
Copy link
Copy Markdown
Contributor

@bobbyiliev bobbyiliev commented Mar 30, 2026

Adds an mcp_max_response_size dyncfg (default 1MB) that rejects MCP tool responses exceeding the limit with a clear JSON-RPC error telling the agent to use LIMIT or WHERE. This catches oversized responses after JSON serialization, which is what the LLM actually consumes.

The existing max_result_size operates on raw row bytes during streaming and doesn't account for JSON formatting overhead.

Fixes https://linear.app/materializeinc/issue/DEX-3/mcp-query-safety-limits-size-timeout

@github-actions
Copy link
Copy Markdown

Thanks for opening this PR! Here are a few tips to help make the review process smooth for everyone.

PR title guidelines

  • Use imperative mood: "Fix X" not "Fixed X" or "Fixes X"
  • Be specific: "Fix panic in catalog sync when controller restarts" not "Fix bug" or "Update catalog code"
  • Prefix with area if helpful: compute: , storage: , adapter: , sql:

Pre-merge checklist

  • The PR title is descriptive and will make sense in the git log.
  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).

@bobbyiliev bobbyiliev force-pushed the mcp-response-size-cap branch from d8016d0 to fe75fcc Compare March 30, 2026 13:38
@bobbyiliev bobbyiliev marked this pull request as ready for review March 30, 2026 13:49
@bobbyiliev bobbyiliev requested review from a team as code owners March 30, 2026 13:49
let rows = execute_sql(client, DISCOVERY_QUERY).await?;
debug!("get_data_products returned {} rows", rows.len());
if rows.is_empty() {
warn!("No data products found - indexes must have comments");
Copy link
Copy Markdown
Contributor

@ggevay ggevay Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this error msg correct? mz_mcp_data_products has a LEFT JOIN with mz_comments, so it would return rows even if there are no comments on objects, if I'm reading things correctly.

(Also, is it intended that this is a log msg, and not an error to be returned to the user?)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yes, good catch! Removed the warning entirely as the agent already sees the empty array in the response.


let query_tool_enabled = ENABLE_MCP_AGENTS_QUERY_TOOL.get(dyncfgs);
let max_response_size =
usize::try_from(MCP_MAX_RESPONSE_SIZE.get(dyncfgs)).unwrap_or(usize::MAX);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just make MCP_MAX_RESPONSE_SIZE a usize instead of a u32, to avoid the casting?

/// Maximum size (in bytes) of MCP tool response content after JSON serialization.
/// Responses exceeding this limit are rejected with a clear error telling the
/// agent to narrow its query. Keeps responses within LLM context window limits.
pub const MCP_MAX_RESPONSE_SIZE: Config<u32> = Config::new(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be in the tool description as well?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call! Added the response limit dynamically to the read_data_product, query, and query_system_catalog tool descriptions. It's populated from the mcp_max_response_size dyncfg value at request time so it stays in sync.

The only thing that I can think of as a small tradeoff is a few extra tokens in the tool descriptions that the LLM processes on every request, but it should help agents proactively use LIMIT instead of hitting the error.

@bobbyiliev bobbyiliev force-pushed the mcp-response-size-cap branch 2 times, most recently from c99e14b to d02243f Compare March 31, 2026 12:38
@bobbyiliev bobbyiliev force-pushed the mcp-response-size-cap branch from d02243f to bda927d Compare March 31, 2026 12:57
@bobbyiliev bobbyiliev merged commit 860ea8e into MaterializeInc:main Mar 31, 2026
118 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants