Fix process crash when testing Stderr by ericstj · Pull Request #1449 · modelcontextprotocol/csharp-sdk

ericstj · 2026-03-19T17:48:04Z

Summary of fixes:

Unhandled exceptions in child processes — The conformance client could crash with an unhandled exception,
generating crash dumps that abort the parent test host. Wrapped the main logic in try/catch to report errors to
stderr instead.
Throwing stderr callbacks crash the process — A user-provided StandardErrorLines callback that throws would
propagate to the background ErrorDataReceived dispatch thread, crashing the process. Added try/catch with a
warning-level log.
Event handler lifetime — ErrorDataReceived/OutputDataReceived handlers could fire after disposal, including
ITestOutputHelper.WriteLine which throws after test completion. Stored handler delegates as named fields and detach
them during cleanup.
WaitForExit() hangs — The parameterless WaitForExit() waits for stdio stream EOF and hangs indefinitely when
grandchild processes hold pipe handles open. Replaced with WaitForExitAsync(ct) on .NET and WaitForExit(timeout) on
.NET Framework.
Cleanup race condition — When ReadMessagesAsync and DisposeAsync enter CleanupAsync concurrently, the second
caller returned without completing the message channel, causing await Completion to hang forever. The second caller
now calls SetDisconnected to ensure the channel completes.
Thread pool starvation from in-process StdioServerTransport — Tests that only validate DI/config were
instantiating StdioServerTransport, which permanently blocks a thread pool thread via Console.OpenStandardInput().
On 2-vCPU CI runners, multiple instances starved the pool. Changed these tests to use
StreamServerTransport(Stream.Null, Stream.Null)

src/ModelContextProtocol.Core/Client/StdioClientTransport.cs

ericstj · 2026-03-20T23:52:43Z

This may not be the root cause of the problem. We've now got dumps on both windows and linux. I'll have a look and see if the dumps reveal a better root-cause.

ericstj · 2026-03-23T16:28:29Z

Crashes fixed, but we have tests hanging. Could be due to the WaitForExit is now hanging. Downloaded the dumps and having a look.

ericstj · 2026-03-23T16:52:46Z

Yeah, WaitForExit is now hanging... I'm not convinced this is the right change. I'm wondering more "why aren't processes being cleaned up" and do we have a hang or test bug somewhere. Also - the fragility here where missing something means a crash seems busted. I think we should have a design that's resilient to not cleaning up external state.

…fix1448

ericstj · 2026-03-24T17:08:27Z

Found leaked StdioServerTransport objects in dumps, these kept handles open to StdIn and were causing threadpool starvation. -- at least that's the theory.

What this fixes is the test host hang, which CI reports as a failure. The causal chain:

7 tests create StdioServerTransport via DI without disposing ServiceProvider

Each transport starts Task.Run(ReadMessagesAsync) → blocks a thread pool thread on Console.OpenStandardInput()
(native read that never returns)

On 2-vCPU CI runners, 7 permanently blocked TP threads exhaust the pool

Other tests needing TP threads for async I/O (like Process.Start completions) starve and hang

CI blame collector detects the hang → kills the test host → reported as failure

The await using ensures DisposeAsync() is called on the ServiceProvider, which calls
StdioServerTransport.DisposeAsync(), which cancels the CancellationTokenSource and unblocks the read loop — freeing
those TP threads.

StdioServerTransport does not reliably close on linux, which causes a threadpool thread to be blocked on read. This can lead to threadpool starvation and application hangs. Since the tests in question do not require stdio transport, switch to using StreamServerTransport with null streams to avoid this issue.

src/ModelContextProtocol.Core/Client/StdioClientSessionTransport.cs

src/ModelContextProtocol.Core/Client/StdioClientTransport.cs

stephentoub · 2026-03-26T12:12:05Z

src/ModelContextProtocol.Core/Client/StdioClientTransport.cs

-                if (!processRunning && HasExited(process))
-                {
-                    process.WaitForExit();
-                }


Won't we now possibly miss some error messages being reported over stderr from the child process?

The WaitForExit() that was removed from DisposeProcess was only called when !processRunning && HasExited(process) — i.e. when the process had already exited on its own before we got here. The stderr flushing for that case is now handled earlier in StdioClientSessionTransport.GetUnexpectedExitExceptionAsync (line 99-106), which calls WaitForExitAsync(ct) on .NET or WaitForExit(timeout) on Framework. So stderr is still flushed before we build the error message and completion details.

In the processRunning case (where we call KillTree), the old code never called WaitForExit() either — the !processRunning guard prevented it. So there's no behavior change for that path.

The reason for removing it from DisposeProcess: DisposeProcess is a static helper called from both the session transport cleanup and the ConnectAsync error path. The session transport cleanup already handles stderr flushing, and the ConnectAsync error path doesn't need it. Keeping the no-arg WaitForExit() in DisposeProcess risked hangs when called from contexts where the streams aren't fully set up.

or WaitForExit(timeout) on Framework. So stderr is still flushed before we build the error message and completion details

WaitForExit(timeout) doesn't wait for EOF of the streams though, right?

You're right — WaitForExit(timeout) does not wait for stream EOF.

I checked the .NET runtime source for WaitForExitAsync. It calls WaitUntilOutputEOF(cancellationToken) which does _error.EOF.WaitAsync(cancellationToken) — so it DOES wait for stream EOF, and the cancellation token IS respected for the stream-drain wait, not just the process-exit wait. So on .NET, WaitForExitAsync(ct) is strictly better than the old WaitForExit() — it flushes streams AND can be cancelled via CancelShutdown() if grandchild processes hold pipe handles open.

On .NET Framework there's no WaitForExitAsync, so we use WaitForExit(timeout) which doesn't guarantee stream flush. The risk of missing events is low in practice: by this point HasExited has returned true (line 93), the error handler is still attached and receiving events, and the ErrorDataReceived dispatching happens asynchronously — events that were already in the OS pipe buffer will still be delivered during the timeout window. We'd only miss events if there's a significant delay between the process exiting and the runtime dispatching buffered pipe data, which shouldn't happen when the process has cleanly exited. The alternative — no-arg WaitForExit() — has no cancellation mechanism on Framework and hangs indefinitely when grandchild processes hold pipe handles open.

When CleanupAsync is entered from ReadMessagesAsync (stdout EOF), the cancellation token is _shutdownCts.Token which hasn't been canceled yet (base.CleanupAsync runs later). If stderr EOF hasn't been delivered by the threadpool yet, WaitForExitAsync hangs indefinitely. Use a linked CTS with ShutdownTimeout to bound the wait. The process is already dead; we're just draining stderr pipe buffers. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…-sdk into fix1448

Fix process crash when testing Stderr

69a22a2

ericstj mentioned this pull request Mar 19, 2026

Update System10Version to 10.0.4 and MicrosoftExtensionsVersion to 10.4.0 #1445

Merged

stephentoub reviewed Mar 19, 2026

View reviewed changes

src/ModelContextProtocol.Core/Client/StdioClientTransport.cs Outdated Show resolved Hide resolved

Clarify comment.

ee120f6

Fix additional crashes identified in dumps.

2539d5c

ericstj added 5 commits March 23, 2026 12:16

Undo WaitForExit changes

82623d1

Detach error handlers before disposing process.

db2baaa

Merge branch 'main' into fix1448

c1d3675

Fix undisposed resources

8f6e91f

Merge branch 'fix1448' of https://github.com/ericstj/csharp-sdk into …

fa6e7b3

…fix1448

ericstj added 3 commits March 25, 2026 10:19

More test fixes

5cae6fb

Fix the race condition between DisposeAsync and ReadMessageAsync.

0647e71

ericstj requested a review from stephentoub March 26, 2026 03:50

Fix test regression

a698b02

ericstj force-pushed the fix1448 branch from 94817c1 to a698b02 Compare March 26, 2026 04:30

stephentoub reviewed Mar 26, 2026

View reviewed changes

src/ModelContextProtocol.Core/Client/StdioClientSessionTransport.cs Outdated Show resolved Hide resolved

stephentoub reviewed Mar 26, 2026

View reviewed changes

src/ModelContextProtocol.Core/Client/StdioClientSessionTransport.cs Show resolved Hide resolved

stephentoub reviewed Mar 26, 2026

View reviewed changes

src/ModelContextProtocol.Core/Client/StdioClientTransport.cs Outdated Show resolved Hide resolved

stephentoub reviewed Mar 26, 2026

View reviewed changes

ericstj and others added 3 commits March 26, 2026 11:08

Address feedback and CI failure

8ad61e5

Merge branch 'main' of https://github.com/modelcontextprotocol/csharp…

289e6b6

…-sdk into fix1448

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix process crash when testing Stderr#1449

Fix process crash when testing Stderr#1449
ericstj wants to merge 15 commits intomodelcontextprotocol:mainfrom
ericstj:fix1448

ericstj commented Mar 19, 2026 •

edited

Loading

Uh oh!

Uh oh!

ericstj commented Mar 20, 2026

Uh oh!

ericstj commented Mar 23, 2026

Uh oh!

ericstj commented Mar 23, 2026

Uh oh!

ericstj commented Mar 24, 2026 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

stephentoub Mar 26, 2026

Uh oh!

ericstj Mar 26, 2026

Uh oh!

stephentoub Mar 26, 2026

Uh oh!

ericstj Mar 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

ericstj commented Mar 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

ericstj commented Mar 20, 2026

Uh oh!

ericstj commented Mar 23, 2026

Uh oh!

ericstj commented Mar 23, 2026

Uh oh!

ericstj commented Mar 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

stephentoub Mar 26, 2026

Choose a reason for hiding this comment

Uh oh!

ericstj Mar 26, 2026

Choose a reason for hiding this comment

Uh oh!

stephentoub Mar 26, 2026

Choose a reason for hiding this comment

Uh oh!

ericstj Mar 26, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ericstj commented Mar 19, 2026 •

edited

Loading

ericstj commented Mar 24, 2026 •

edited

Loading