Skip to content

mcp: log out-of-band errors instead of dropping them#887

Open
ravyg wants to merge 2 commits intomodelcontextprotocol:mainfrom
ravyg:feat/218-log-dropped-errors
Open

mcp: log out-of-band errors instead of dropping them#887
ravyg wants to merge 2 commits intomodelcontextprotocol:mainfrom
ravyg:feat/218-log-dropped-errors

Conversation

@ravyg
Copy link
Copy Markdown
Contributor

@ravyg ravyg commented Apr 9, 2026

Summary

Two error sites in the MCP runtime were previously unobservable:

  • mcp/shared.go startKeepalive — when a keepalive ping failed, the session was closed silently with no indication of why.
  • mcp/transport.go connectjsonrpc2.OnInternalError printed via the package-level log.Printf, bypassing the *slog.Logger configured on ServerOptions/ClientOptions.

Both sites now report through the existing *slog.Logger that Server and Client already guarantee non-nil via ensureLogger. The logger is threaded into the unexported startKeepalive and connect helpers as a parameter — no new public API surface.

Context

Per @jba's suggestion on #865:

Maybe we can just use the logger in all the places we currently drop the error. Then it's still reported, but we don't have to add any API.

This PR is that smaller alternative. It addresses the concrete observability gap in #218 without expanding ServerOptions. PR #865 (the ErrorHandler callback proposal) remains open pending community input on whether a structured callback API is also wanted on top of logging.

notifySessions already logs at Warn level on delivery errors and is intentionally left untouched.

Test plan

  • New regression test TestKeepAliveFailure_Logged asserts the keepalive failure produces a log line on the configured logger.
  • Verified the test fails when the new logger.Error(...) call is removed (sanity check that it pins the behavior).
  • go test ./mcp/... -count=1 passes.
  • go vet ./... clean.
  • go build ./... clean.

Refs #218

@ravyg ravyg marked this pull request as draft April 9, 2026 00:39
@ravyg ravyg marked this pull request as ready for review April 9, 2026 00:42
mcp/mcp_test.go Outdated
}

func (b *syncBuffer) Write(p []byte) (int, error) {
b.mu.Lock()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

slog already has a mutex around Write, so I don't think you need this lock, or this type.

ravyg added 2 commits April 10, 2026 11:53
Two errors that occur outside the request/response path were previously
unobservable:

  - Keepalive ping failures (mcp/shared.go startKeepalive) silently
    closed the session with no record of why.
  - jsonrpc2 internal errors (mcp/transport.go connect) were printed
    via the global log.Printf, bypassing the configured slog.Logger.

Both sites now report through the existing *slog.Logger that Server
and Client already guarantee non-nil via ensureLogger. No new public
API surface; the logger is threaded into the unexported helpers as a
parameter.

Includes a regression test (TestKeepAliveFailure_Logged) that asserts
the keepalive failure produces a log line on the configured logger,
and was verified to fail when the new log call is removed.

Per @jba's suggestion on modelcontextprotocol#865, this is the smaller no-new-API
alternative to the ErrorHandler callback proposal. modelcontextprotocol#865 remains open
pending community input on whether a structured callback is also
wanted.

Refs modelcontextprotocol#218
slog already serializes Write calls internally, so the mutex wrapper
is unnecessary.
@ravyg ravyg force-pushed the feat/218-log-dropped-errors branch from cf4b382 to 0586871 Compare April 10, 2026 19:03
@ravyg ravyg requested a review from jba April 10, 2026 19:07
Copy link
Copy Markdown
Contributor

@jba jba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants