feat: add maxMessageBytes to stdio transports, make ReadBuffer amortized O(1)#2245
feat: add maxMessageBytes to stdio transports, make ReadBuffer amortized O(1)#2245felixweinberger wants to merge 4 commits into
Conversation
🦋 Changeset detectedLatest commit: ee4a482 The changes in this PR will be included in the next version bump. This PR includes changesets to release 6 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
@modelcontextprotocol/client
@modelcontextprotocol/codemod
@modelcontextprotocol/server
@modelcontextprotocol/server-legacy
@modelcontextprotocol/express
@modelcontextprotocol/fastify
@modelcontextprotocol/hono
@modelcontextprotocol/node
commit: |
…ortized O(1) per byte A peer that writes a large amount of data without a newline previously grew the receiving process's memory without bound, and every incoming chunk re-copied the entire buffered backlog via Buffer.concat. There was no public way to bound or replace the read buffer. StdioClientTransport and StdioServerTransport now accept an optional maxMessageBytes. Oversized messages are dropped and reported through onerror as an SdkError with the new code SdkErrorCode.MessageTooLarge, and the transport recovers at the next newline boundary. Default is unlimited, so behavior is unchanged unless the option is set. The read buffer now uses a single growable buffer with read/scan offsets: appends are amortized O(1) per byte and newline scanning never revisits bytes, instead of concatenating the full backlog on every chunk.
…n guides v2 silently skips non-JSON lines on the stdio stream where v1 surfaced a SyntaxError through onerror; migrators relying on that signal should know. Also documents the new maxMessageBytes option as the supported replacement for flood protection previously built against v1 transport internals.
…tream flood recovery The existing chunked-assembly tests only exercised capacity events with no consumed prefix, so a copy or offset bug in the compaction and growth paths could corrupt a partially buffered message without any test noticing. The new cases interleave a consumed message with a partial one before forcing each capacity event, using distinct payload characters so spliced bytes change content detectably. Also pins the shrink-after-drain memory bound and adds a client transport test where a child process floods stdout past the limit without a newline and the transport recovers at the next boundary.
1616369 to
ee4a482
Compare
| ### Stdio transports: non-JSON lines are now skipped silently | ||
|
|
||
| In v1, a non-JSON line on the stdio stream (for example, debug output from a hot-reload | ||
| tool writing to stdout) surfaced as a `SyntaxError` through the transport's `onerror` | ||
| callback. In v2, the stdio read buffer silently skips lines that are not valid JSON and | ||
| continues with the next line; only valid-JSON messages that fail schema validation still | ||
| reach `onerror`. If you relied on `onerror` to detect a misbehaving server that writes | ||
| noise to stdout, that signal no longer fires for non-JSON lines. | ||
|
|
||
| Relatedly, both stdio transports now accept an optional `maxMessageBytes` setting that | ||
| bounds how large a single message may grow before it is dropped and reported via | ||
| `onerror` (`SdkError` with code `SdkErrorCode.MessageTooLarge`). v1 had no built-in | ||
| protection against a peer flooding the stream with unbounded data on a single line; if | ||
| you implemented such protection against v1 transport internals, migrate to this option. | ||
|
|
||
| ```typescript | ||
| import { StdioClientTransport } from '@modelcontextprotocol/client/stdio'; | ||
|
|
||
| const transport = new StdioClientTransport({ command: 'my-server' }, { maxMessageBytes: 4 * 1024 * 1024 }); | ||
| ``` | ||
|
|
There was a problem hiding this comment.
🟡 The new SdkErrorCode.MessageTooLarge member referenced in this section is missing from the authoritative SdkErrorCode enum listings later in this same guide (the "New SdkErrorCode enum" table, ~lines 746-761) and in docs/migration-SKILL.md (the "New SdkErrorCode enum values:" bullet list, ~lines 129-142), which were complete enumerations before this PR. Adding one table row and one bullet keeps both lists in sync with the enum and tells SKILL-file consumers the code's string value ('MESSAGE_TOO_LARGE').
Extended reasoning...
What the issue is. This PR adds a new enum member, SdkErrorCode.MessageTooLarge = 'MESSAGE_TOO_LARGE' (packages/core/src/errors/sdkErrors.ts:30), and the new stdio sections in both migration guides tell migrators to branch on exactly that code (docs/migration.md:155 and section 13 of docs/migration-SKILL.md). However, the authoritative enumerations of SdkErrorCode in those same documents were not updated: the "New SdkErrorCode enum" table in docs/migration.md:746-761 ("The new SdkErrorCode enum contains string-valued codes for local SDK errors:") and the "New SdkErrorCode enum values:" bullet list in docs/migration-SKILL.md:129-142 both still list only the 14 pre-PR members.
Why this is introduced by this PR. Before this change, both lists matched the enum exactly — 14 members in the enum, 14 entries in each list — so they read as complete enumerations. After this change the enum has 15 members and the lists are stale by precisely the member the PR adds. Within the same documents the PR edits, one section instructs users to handle SdkErrorCode.MessageTooLarge while the section that purports to enumerate every SdkErrorCode value omits it, leaving the guides internally inconsistent.
Step-by-step proof.
packages/core/src/errors/sdkErrors.tsin this PR addsMessageTooLarge = 'MESSAGE_TOO_LARGE'betweenSendFailedandInvalidResult, bringing the enum to 15 members.- The new stdio section added by this PR at
docs/migration.md:144-164says oversized messages are "reported viaonerror(SdkErrorwith codeSdkErrorCode.MessageTooLarge)";docs/migration-SKILL.mdsection 13 says the same. - The "New
SdkErrorCodeenum" table atdocs/migration.md:746-761listsNotConnected,AlreadyConnected,NotInitialized,CapabilityNotSupported,RequestTimeout,ConnectionClosed,SendFailed,InvalidResult, and the sixClientHttp*codes — every member of the enum exceptMessageTooLarge. - The "New
SdkErrorCodeenum values:" bullet list atdocs/migration-SKILL.md:129-142has the same complete-minus-one set, and is the only place in the SKILL file that records the string values of the codes — so an LLM consuming that file as the source of truth for the enum will not learn that'MESSAGE_TOO_LARGE'exists or what string it maps to.
Why nothing else prevents it. Both lists are hand-maintained prose with no check against the actual enum, so the only safeguard is updating them in the same change that extends the enum — which this PR does for the "Unchanged APIs" sections (already qualified for maxMessageBytes) but not for the enum listings. This matches the repo guideline that both migration docs stay in sync with behavior/API changes and that related additions be grouped into the existing sections rather than only mentioned in new prose.
Impact. Low — the new code is documented in the new stdio sections of both guides, so nobody is misled about behavior; this is purely a completeness/consistency gap in the reference lists. It mainly affects readers (or LLMs) who treat those lists as the canonical enumeration of SdkErrorCode and its string values. It is also distinct from the previously posted "Unchanged APIs list is stale" comment, which concerns a different section that this PR did update.
How to fix. Add one row to the table in docs/migration.md (e.g. | \SdkErrorCode.MessageTooLarge` | A single inbound message exceeded the configured maximum size |, ideally next to the other transport codes after SendFailed) and one bullet to docs/migration-SKILL.md (- `SdkErrorCode.MessageTooLarge` = `'MESSAGE_TOO_LARGE'``) in the same position.
| export type StdioServerTransportOptions = { | ||
| /** | ||
| * Maximum size, in bytes, that a single inbound message may occupy. | ||
| * | ||
| * Protects against a misbehaving client flooding the server with an unbounded | ||
| * amount of data on a single line, which would otherwise grow server memory | ||
| * without limit. When a message exceeds this size it is dropped, an | ||
| * {@linkcode SdkError} with code `SdkErrorCode.MessageTooLarge` is reported via | ||
| * `onerror`, and the transport recovers at the next newline boundary. | ||
| * | ||
| * Defaults to undefined (no limit), matching previous behavior. | ||
| */ | ||
| maxMessageBytes?: number; | ||
| }; | ||
|
|
There was a problem hiding this comment.
🟡 The pre-existing class-level JSDoc for StdioServerTransport (the "Server transport for stdio..." description plus the @example sourced from ./stdio.examples.ts#StdioServerTransport_basicUsage) now sits directly above the newly inserted StdioServerTransportOptions type, so the options bag gets documented as the transport class while the public class itself loses its JSDoc entirely. Move the type declaration above the JSDoc block (or give it its own doc comment), as StdioClientTransportOptions is handled in the client package.
Extended reasoning...
What the bug is. In packages/server/src/server/stdio.ts, the new export type StdioServerTransportOptions = {...} was inserted between the pre-existing class-level JSDoc block (lines 7–18: "Server transport for stdio: this communicates with an MCP client by reading from the current process' stdin..." including the @example fence sourced from ./stdio.examples.ts#StdioServerTransport_basicUsage) and export class StdioServerTransport (now at line 43). JSDoc/TSDoc binds a doc comment to the declaration that immediately follows it, so the doc block now attaches to the options type rather than the class.
How it manifests. Two visible effects: (1) IDE hover and TypeDoc output for StdioServerTransportOptions show "Server transport for stdio..." plus a usage example that constructs a transport and calls server.connect(transport) — wrong and confusing for an options bag; (2) hovering or generating docs for the public StdioServerTransport class itself now shows nothing, because the class has no doc comment at all.
Why this is introduced by this PR, not pre-existing. The diff hunk @@ -16,15 +16,52 @@ shows the JSDoc as unchanged context, with the type declaration inserted immediately after it; before this PR (git show HEAD~4:packages/server/src/server/stdio.ts) the same JSDoc sat directly above export class StdioServerTransport. The client-side counterpart in this PR does not have the problem: StdioClientTransportOptions in packages/client/src/client/stdio.ts is declared above (and separate from) the StdioClientTransport class JSDoc, so that class keeps its documentation.
Why nothing else catches it. TypeScript compiles fine either way — doc-comment attachment is not type-checked — and the sync:snippets tooling only keeps the @example fence content in sync with the source region; it does not verify which declaration the fence is attached to. So lint/typecheck/tests all pass with the misplaced doc.
Step-by-step proof.
- Open
packages/server/src/server/stdio.tsat HEAD: lines 7–18 are the/** Server transport for stdio... @example ... */block; line 19 isexport type StdioServerTransportOptions = {; line 43 isexport class StdioServerTransport implements Transport {with no preceding comment. - In an editor, hover
StdioServerTransportOptionsat any usage site (e.g. the constructor parameter): the tooltip reads "Server transport for stdio: this communicates with an MCP client..." and renders thenew StdioServerTransport(); await server.connect(transport)example. - Hover
StdioServerTransportitself: no documentation is shown. - Compare with
git show HEAD~4:packages/server/src/server/stdio.ts, where the same block immediately precedes the class declaration and hover on the class shows the description and example.
Impact. Documentation-only — no runtime behavior changes — but it degrades a public API of @modelcontextprotocol/server: the class loses its docs and the new options type carries misleading ones, contrary to the repo's CLAUDE.md requirement of JSDoc on public APIs and the @example/sync:snippets convention that targets this class doc.
How to fix. Move the StdioServerTransportOptions type declaration above the JSDoc block (optionally with its own short doc comment, e.g. "Options for {@linkcode StdioServerTransport}."), so the existing class JSDoc is again immediately followed by export class StdioServerTransport. This mirrors how the client package arranges StdioClientTransportOptions relative to StdioClientTransport.
Add an opt-in
maxMessageByteslimit to both stdio transports, surface violations as a newSdkErrorcode throughonerror, and rewriteReadBufferto be amortized O(1) per byte. Also documents an undocumented v1→v2 stdio behavior change (silent skipping of non-JSON lines).Motivation and Context
Backwards-compatibility gap this fixes: in v1 there was no supported way to protect a client from a stdio server that floods stdout, so integrators built that protection against transport internals — e.g. subclassing
StdioClientTransportand replacing its private_readBufferwith a bounded implementation. v2 has the same private internals (_readBufferis soft-private and may break at any release), so those integrations have nothing supported to migrate to: while migrating a large production MCP host application from v1 to v2, this was the one transport integration point that could not be expressed against the public v2 API at all.The underlying exposure affects stock consumers too, on both major versions: a server that writes a large amount of data to stdout without a newline (accidental binary output, a runaway log line, or a malicious peer) grows client memory without bound.
ReadBuffer.appendalso performsBuffer.concaton every chunk, so a large incomplete message costs O(n²) in copies on top of the unbounded retention.This PR adds the missing public hook:
Semantics: when a single message exceeds the limit — whether or not its newline has arrived — the data is dropped, one
SdkErrorwith the new codeSdkErrorCode.MessageTooLargeis reported viaonerror(matching how the stdio transports already surface read errors), and the transport recovers at the next newline boundary. The default isundefined(unlimited), so existing behavior is unchanged unless the option is set.Independent of the limit,
ReadBuffernow uses a single geometrically-growing buffer with read/scan offsets: appending is amortized O(1) per byte, newline scanning never revisits bytes, and the buffer shrinks back after a large message drains, replacing the per-chunkBuffer.concatof the whole backlog.The migration guides also gain a note for another stdio behavior change migrators currently discover by accident: v2 silently skips non-JSON lines (v1 surfaced a
SyntaxErrorviaonerror), so v1-era detection of noisy servers no longer fires.How Has This Been Tested?
ReadBufferunit tests: limit crossed mid-message (single error, silent discard while the oversized message streams in, recovery at next newline), oversized complete line dropped with the rest of the buffer processed, byte counting across many appends,clear()resetting recovery state, default-unlimited unchanged.tee/more) and server (pushedReadable) both reportMessageTooLargeviaonerrorexactly once and continue processing subsequent messages.onerrorfires exactly once withMessageTooLarge, and the subsequent message is delivered.StdioServerTransport({ maxMessageBytes })form (default process streams, limit enforced, listeners attached/detached on start/close) and the three-argument form withundefinedstream placeholders.Breaking Changes
None. The option defaults to unlimited;
ReadBuffer's public surface (append/readMessage/clear) is unchanged, with an optional constructor argument added.SdkErrorCode.MessageTooLargeis a new enum member.Types of changes
Checklist
Additional context
StdioServerTransportaccepts options as the sole argument (new StdioServerTransport({ maxMessageBytes })) in addition to the(stdin, stdout, options)form, so default-stream users are not forced intoundefinedplaceholders.ReadBufferOptionsis exported alongsideReadBufferfor custom transport authors. The "Unchanged APIs" lists in both migration guides now carry a qualifier for the new option.Limit semantics apply to complete-but-oversized lines as well as incomplete ones: parsing a huge line costs the same memory the limit exists to bound, so both paths drop and report.
During discard-until-newline recovery, buffered data is dropped eagerly on every read, so memory stays bounded by the OS pipe chunk size while the tail of an oversized message streams in.
The internal buffer shrinks back to its initial capacity once drained past a 128 KiB threshold, so one large message does not pin memory for the life of the transport.
A follow-up could consider a default limit in a future major; this PR deliberately keeps the default unlimited so v2 adopters see no behavior change.