159 changes: 159 additions & 0 deletions README.md
@@ -38,6 +38,7 @@ It implements the Model Context Protocol specification, handling model context r
- Supports resource registration and retrieval
- Supports stdio & Streamable HTTP (including SSE) transports
- Supports notifications for list changes (tools, prompts, resources)
- Supports sampling (server-to-client LLM completion requests)

### Supported Methods

@@ -50,6 +51,7 @@ It implements the Model Context Protocol specification, handling model context r
- `resources/list` - Lists all registered resources and their schemas
- `resources/read` - Retrieves a specific resource by name
- `resources/templates/list` - Lists all registered resource templates and their schemas
- `sampling/createMessage` - Requests LLM completion from the client (server-to-client)

### Custom Methods

@@ -102,6 +104,163 @@ end
- Raises `MCP::Server::MethodAlreadyDefinedError` if trying to override an existing method
- Supports the same exception reporting and instrumentation as standard methods

### Sampling

The Model Context Protocol allows servers to request LLM completions from clients through the `sampling/createMessage` method.
This enables servers to leverage the client's LLM capabilities without needing direct access to AI models.

**Key Concepts:**

- **Server-to-Client Request**: Unlike typical MCP methods (client→server), sampling is initiated by the server
- **Client Capability**: Clients must declare `sampling` capability during initialization
- **Tool Support**: When using tools in sampling requests, clients must declare `sampling.tools` capability
- **Human-in-the-Loop**: Clients can implement user approval before forwarding requests to LLMs
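On the wire, sampling reverses the usual request direction: the server emits a JSON-RPC request and waits for the client's response. A sketch of the two messages follows; the `id`, model name, and text are illustrative placeholders, not SDK output:

```ruby
# Server -> client: the sampling request (illustrative payload).
request = {
  jsonrpc: "2.0",
  id: 1,
  method: "sampling/createMessage",
  params: {
    messages: [{ role: "user", content: { type: "text", text: "Hi" } }],
    maxTokens: 100,
  },
}

# Client -> server: the matching response, correlated by `id`.
response = {
  jsonrpc: "2.0",
  id: 1,
  result: {
    role: "assistant",
    content: { type: "text", text: "Hello!" },
    model: "example-model",
    stopReason: "endTurn",
  },
}
```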

**Usage Example (Stdio transport):**

`Server#create_sampling_message` is for single-client transports (e.g., `StdioTransport`).
For multi-client transports (e.g., `StreamableHTTPTransport`), use `server_context.create_sampling_message` inside tools instead,
which routes the request to the correct client session.

```ruby
server = MCP::Server.new(name: "my_server")
transport = MCP::Server::Transports::StdioTransport.new(server)
server.transport = transport
```

The client must declare the `sampling` capability during initialization; this happens automatically when the client connects.
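For reference, the `initialize` params carrying that declaration look roughly like the following. This is a sketch with symbolized keys, as the SDK's handlers see them; the protocol version and client name are placeholders:

```ruby
# Sketch of a client's `initialize` params declaring sampling support.
init_params = {
  protocolVersion: "2025-06-18", # placeholder version string
  clientInfo: { name: "example-client", version: "1.0.0" },
  capabilities: {
    sampling: {
      tools: {}, # present only when the client also supports tool use in sampling
    },
  },
}

# The server's capability checks reduce to lookups like these:
init_params.dig(:capabilities, :sampling)         # truthy => sampling supported
init_params.dig(:capabilities, :sampling, :tools) # truthy => sampling.tools supported
```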

```ruby
result = server.create_sampling_message(
messages: [
{ role: "user", content: { type: "text", text: "What is the capital of France?" } }
],
max_tokens: 100,
system_prompt: "You are a helpful assistant.",
temperature: 0.7
)
```

The result contains the LLM response:

```ruby
{
role: "assistant",
content: { type: "text", text: "The capital of France is Paris." },
model: "claude-3-sonnet-20240307",
stopReason: "endTurn"
}
```

**Parameters:**

Required:

- `messages:` (Array) - Array of message objects with `role` and `content`
- `max_tokens:` (Integer) - Maximum tokens in the response

Optional:

- `system_prompt:` (String) - System prompt for the LLM
- `model_preferences:` (Hash) - Model selection preferences (e.g., `{ intelligencePriority: 0.8 }`)
- `include_context:` (String) - Context inclusion: `"none"`, `"thisServer"`, or `"allServers"` (soft-deprecated)
- `temperature:` (Float) - Sampling temperature
- `stop_sequences:` (Array) - Sequences that stop generation
- `metadata:` (Hash) - Additional metadata
- `tools:` (Array) - Tools available to the LLM (requires `sampling.tools` capability)
- `tool_choice:` (Hash) - Tool selection mode (e.g., `{ mode: "auto" }`)
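As the diff to `lib/mcp/server.rb` shows, the snake_case keyword arguments are mapped onto camelCase wire fields, and `nil` values are dropped via `compact`. A plain-Ruby sketch of that mapping, with illustrative values:

```ruby
# Sketch of the keyword-to-wire-field mapping (mirrors `build_sampling_params`);
# nil entries are omitted from the outgoing request.
args = {
  max_tokens: 300,
  system_prompt: "Answer briefly.",
  model_preferences: { intelligencePriority: 0.8 },
  stop_sequences: nil, # unset -> dropped by `compact`
}

params = {
  maxTokens: args[:max_tokens],
  systemPrompt: args[:system_prompt],
  modelPreferences: args[:model_preferences],
  stopSequences: args[:stop_sequences],
}.compact
# params carries no :stopSequences key
```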

**Using Sampling in Tools (works with both Stdio and HTTP transports):**

Tools that accept a `server_context:` parameter can call `create_sampling_message` on it.
The request is automatically routed to the correct client session.
Set `server.server_context = server` so that `server_context.create_sampling_message` delegates to the server:

```ruby
class SummarizeTool < MCP::Tool
description "Summarize text using LLM"
input_schema(
properties: {
text: { type: "string" }
},
required: ["text"]
)

def self.call(text:, server_context:)
result = server_context.create_sampling_message(
messages: [
{ role: "user", content: { type: "text", text: "Please summarize: #{text}" } }
],
max_tokens: 500
)

MCP::Tool::Response.new([{
type: "text",
text: result[:content][:text]
}])
end
end

server = MCP::Server.new(name: "my_server", tools: [SummarizeTool])
server.server_context = server
```

**Tool Use in Sampling:**

When tools are provided in a sampling request, the LLM can call them during generation.
The server must handle tool calls and continue the conversation with tool results:

```ruby
result = server.create_sampling_message(
messages: [
{ role: "user", content: { type: "text", text: "What's the weather in Paris?" } }
],
max_tokens: 1000,
tools: [
{
name: "get_weather",
description: "Get weather for a city",
inputSchema: {
type: "object",
properties: { city: { type: "string" } },
required: ["city"]
}
}
],
tool_choice: { mode: "auto" }
)

if result[:stopReason] == "toolUse"
tool_results = result[:content].map do |tool_use|
weather_data = get_weather(tool_use[:input][:city])

{
type: "tool_result",
toolUseId: tool_use[:id],
content: [{ type: "text", text: weather_data.to_json }]
}
end

final_result = server.create_sampling_message(
messages: [
{ role: "user", content: { type: "text", text: "What's the weather in Paris?" } },
{ role: "assistant", content: result[:content] },
{ role: "user", content: tool_results }
],
max_tokens: 1000,
tools: [...]
)
end
```

**Error Handling:**

- Raises `RuntimeError` if transport is not set
- Raises `RuntimeError` if client does not support `sampling` capability
- Raises `RuntimeError` if `tools` are used but client lacks `sampling.tools` capability
- Raises `StandardError` if client returns an error response
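A minimal sketch of guarding a sampling call against these errors; `safe_sampling` is a hypothetical helper, and the rescued classes match the plain `RuntimeError`/`StandardError` behavior listed above:

```ruby
# Hypothetical helper wrapping a sampling call with the error handling above.
# `server` is assumed to be an MCP::Server with a transport attached.
def safe_sampling(server, text)
  server.create_sampling_message(
    messages: [{ role: "user", content: { type: "text", text: text } }],
    max_tokens: 50,
  )
rescue RuntimeError => e
  # Transport missing, or the client lacks the required sampling capability.
  { error: "sampling unavailable: #{e.message}" }
rescue StandardError => e
  # The client returned a JSON-RPC error response.
  { error: "sampling failed: #{e.message}" }
end
```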

### Notifications

The server supports sending notifications to clients when lists of tools, prompts, or resources change. This enables real-time updates without polling.
4 changes: 1 addition & 3 deletions conformance/expected_failures.yml
@@ -1,7 +1,5 @@
server:
# TODO: Server-to-client requests (sampling/createMessage, elicitation/create) are not implemented.
# `Transport#send_request` does not exist in the current SDK.
- tools-call-sampling
# TODO: Server-to-client requests (elicitation/create) are not implemented.
- tools-call-elicitation
- elicitation-sep1034-defaults
- elicitation-sep1330-enums
13 changes: 8 additions & 5 deletions conformance/server.rb
@@ -156,7 +156,6 @@ def call(server_context:, **_args)
end
end

# TODO: Implement when `Transport` supports server-to-client requests.
class TestSampling < MCP::Tool
tool_name "test_sampling"
description "A tool that requests LLM sampling from the client"
@@ -166,11 +165,15 @@ class TestSampling < MCP::Tool
)

class << self
def call(prompt:)
MCP::Tool::Response.new(
[MCP::Content::Text.new("Sampling not supported in this SDK version").to_h],
error: true,
def call(prompt:, server_context:)
result = server_context.create_sampling_message(
messages: [{ role: "user", content: { type: "text", text: prompt } }],
max_tokens: 100,
)
model = result[:model] || "unknown"
text = result.dig(:content, :text) || ""

MCP::Tool::Response.new([MCP::Content::Text.new("LLM response: #{text} (model: #{model})").to_h])
end
end
end
81 changes: 80 additions & 1 deletion lib/mcp/server.rb
@@ -48,6 +48,7 @@ def initialize(method_name)
include Instrumentation

attr_accessor :description, :icons, :name, :title, :version, :website_url, :instructions, :tools, :prompts, :resources, :server_context, :configuration, :capabilities, :transport, :logging_message_notification
attr_reader :client_capabilities

def initialize(
description: nil,
@@ -86,6 +87,7 @@ def initialize(
validate!

@capabilities = capabilities || default_capabilities
@client_capabilities = nil
@logging_message_notification = nil

@handlers = {
@@ -198,6 +200,43 @@ def notify_log_message(data:, level:, logger: nil)
report_exception(e, { notification: "log_message" })
end

# Sends a `sampling/createMessage` request to the client.
# For single-client transports (e.g., `StdioTransport`). For multi-client transports
# (e.g., `StreamableHTTPTransport`), use `ServerSession#create_sampling_message` instead
# to ensure the request is routed to the correct client.
def create_sampling_message(
messages:,
max_tokens:,
system_prompt: nil,
model_preferences: nil,
include_context: nil,
temperature: nil,
stop_sequences: nil,
metadata: nil,
tools: nil,
tool_choice: nil
)
unless @transport
raise "Cannot send sampling request without a transport."
end

params = build_sampling_params(
@client_capabilities,
messages: messages,
max_tokens: max_tokens,
system_prompt: system_prompt,
model_preferences: model_preferences,
include_context: include_context,
temperature: temperature,
stop_sequences: stop_sequences,
metadata: metadata,
tools: tools,
tool_choice: tool_choice,
)

@transport.send_request(Methods::SAMPLING_CREATE_MESSAGE, params)
end

# Sets a custom handler for `resources/read` requests.
# The block receives the parsed request params and should return resource
# contents. The return value is set as the `contents` field of the response.
@@ -208,6 +247,45 @@ def resources_read_handler(&block)
@handlers[Methods::RESOURCES_READ] = block
end

def build_sampling_params(
capabilities,
messages:,
max_tokens:,
system_prompt: nil,
model_preferences: nil,
include_context: nil,
temperature: nil,
stop_sequences: nil,
metadata: nil,
tools: nil,
tool_choice: nil
)
unless capabilities&.dig(:sampling)
raise "Client does not support sampling."
end

if tools && !capabilities.dig(:sampling, :tools)
raise "Client does not support sampling with tools."
end

if tool_choice && !capabilities.dig(:sampling, :tools)
raise "Client does not support sampling with tool_choice."
end

{
messages: messages,
maxTokens: max_tokens,
systemPrompt: system_prompt,
modelPreferences: model_preferences,
includeContext: include_context,
temperature: temperature,
stopSequences: stop_sequences,
metadata: metadata,
tools: tools,
toolChoice: tool_choice,
}.compact
end

private

def validate!
@@ -355,10 +433,11 @@ def init(params, session: nil)
session.store_client_info(client: params[:clientInfo], capabilities: params[:capabilities])
else
@client = params[:clientInfo]
@client_capabilities = params[:capabilities]
end
protocol_version = params[:protocolVersion]
end

protocol_version = params[:protocolVersion] if params
negotiated_version = if Configuration::SUPPORTED_STABLE_PROTOCOL_VERSIONS.include?(protocol_version)
protocol_version
else
35 changes: 35 additions & 0 deletions lib/mcp/server/transports/stdio_transport.rb
@@ -53,6 +53,41 @@ def send_notification(method, params = nil)
MCP.configuration.exception_reporter.call(e, { error: "Failed to send notification" })
false
end

def send_request(method, params = nil)
request_id = generate_request_id
request = { jsonrpc: "2.0", id: request_id, method: method }
request[:params] = params if params

begin
send_response(request)
rescue => e
MCP.configuration.exception_reporter.call(e, { error: "Failed to send request" })
raise
end

while @open && (line = $stdin.gets)
begin
parsed = JSON.parse(line.strip, symbolize_names: true)
rescue JSON::ParserError => e
MCP.configuration.exception_reporter.call(e, { error: "Failed to parse response" })
raise
end

if parsed[:id] == request_id && !parsed.key?(:method)
if parsed[:error]
raise StandardError, "Client returned an error for #{method} request (code: #{parsed[:error][:code]}): #{parsed[:error][:message]}"
end

return parsed[:result]
else
response = @session ? @session.handle(parsed) : @server.handle(parsed)
send_response(response) if response
end
end

raise "Transport closed while waiting for response to #{method} request."
end
end
end
end