159 changes: 159 additions & 0 deletions README.md
@@ -38,6 +38,7 @@ It implements the Model Context Protocol specification, handling model context r
- Supports resource registration and retrieval
- Supports stdio & Streamable HTTP (including SSE) transports
- Supports notifications for list changes (tools, prompts, resources)
- Supports sampling (server-to-client LLM completion requests)

### Supported Methods

@@ -50,6 +51,7 @@ It implements the Model Context Protocol specification, handling model context r
- `resources/list` - Lists all registered resources and their schemas
- `resources/read` - Retrieves a specific resource by name
- `resources/templates/list` - Lists all registered resource templates and their schemas
- `sampling/createMessage` - Requests LLM completion from the client (server-to-client)

### Custom Methods

@@ -102,6 +104,163 @@ end
- Raises `MCP::Server::MethodAlreadyDefinedError` if trying to override an existing method
- Supports the same exception reporting and instrumentation as standard methods

### Sampling

The Model Context Protocol allows servers to request LLM completions from clients through the `sampling/createMessage` method.
This enables servers to leverage the client's LLM capabilities without needing direct access to AI models.

**Key Concepts:**

- **Server-to-Client Request**: Unlike typical MCP methods (client→server), sampling is initiated by the server
- **Client Capability**: Clients must declare `sampling` capability during initialization
- **Tool Support**: When using tools in sampling requests, clients must declare `sampling.tools` capability
- **Human-in-the-Loop**: Clients can implement user approval before forwarding requests to LLMs
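On the wire, sampling reverses the usual request direction: the server emits a JSON-RPC request and waits for the client's response. A sketch of the two messages follows; the `id`, model name, and text are illustrative placeholders, not SDK output:

```ruby
# Server -> client: the sampling request (illustrative payload).
request = {
  jsonrpc: "2.0",
  id: 1,
  method: "sampling/createMessage",
  params: {
    messages: [{ role: "user", content: { type: "text", text: "Hi" } }],
    maxTokens: 100,
  },
}

# Client -> server: the matching response, correlated by `id`.
response = {
  jsonrpc: "2.0",
  id: 1,
  result: {
    role: "assistant",
    content: { type: "text", text: "Hello!" },
    model: "example-model",
    stopReason: "endTurn",
  },
}
```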

**Usage Example (Stdio transport):**

`Server#create_sampling_message` is for single-client transports (e.g., `StdioTransport`).
For multi-client transports (e.g., `StreamableHTTPTransport`), use `server_context.create_sampling_message` inside tools instead,
which routes the request to the correct client session.

```ruby
server = MCP::Server.new(name: "my_server")
transport = MCP::Server::Transports::StdioTransport.new(server)
server.transport = transport
```

The client must declare the `sampling` capability during initialization; this happens automatically when the client connects.
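For reference, the `initialize` params carrying that declaration look roughly like the following. This is a sketch with symbolized keys, as the SDK's handlers see them; the protocol version and client name are placeholders:

```ruby
# Sketch of a client's `initialize` params declaring sampling support.
init_params = {
  protocolVersion: "2025-06-18", # placeholder version string
  clientInfo: { name: "example-client", version: "1.0.0" },
  capabilities: {
    sampling: {
      tools: {}, # present only when the client also supports tool use in sampling
    },
  },
}

# The server's capability checks reduce to lookups like these:
init_params.dig(:capabilities, :sampling)         # truthy => sampling supported
init_params.dig(:capabilities, :sampling, :tools) # truthy => sampling.tools supported
```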

```ruby
result = server.create_sampling_message(
messages: [
{ role: "user", content: { type: "text", text: "What is the capital of France?" } }
],
max_tokens: 100,
system_prompt: "You are a helpful assistant.",
temperature: 0.7
)
```

The result contains the LLM response:

```ruby
{
role: "assistant",
content: { type: "text", text: "The capital of France is Paris." },
model: "claude-3-sonnet-20240307",
stopReason: "endTurn"
}
```

**Parameters:**

Required:

- `messages:` (Array) - Array of message objects with `role` and `content`
- `max_tokens:` (Integer) - Maximum tokens in the response

Optional:

- `system_prompt:` (String) - System prompt for the LLM
- `model_preferences:` (Hash) - Model selection preferences (e.g., `{ intelligencePriority: 0.8 }`)
- `include_context:` (String) - Context inclusion: `"none"`, `"thisServer"`, or `"allServers"` (soft-deprecated)
- `temperature:` (Float) - Sampling temperature
- `stop_sequences:` (Array) - Sequences that stop generation
- `metadata:` (Hash) - Additional metadata
- `tools:` (Array) - Tools available to the LLM (requires `sampling.tools` capability)
- `tool_choice:` (Hash) - Tool selection mode (e.g., `{ mode: "auto" }`)
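As the diff to `lib/mcp/server.rb` shows, the snake_case keyword arguments are mapped onto camelCase wire fields, and `nil` values are dropped via `compact`. A plain-Ruby sketch of that mapping, with illustrative values:

```ruby
# Sketch of the keyword-to-wire-field mapping (mirrors `build_sampling_params`);
# nil entries are omitted from the outgoing request.
args = {
  max_tokens: 300,
  system_prompt: "Answer briefly.",
  model_preferences: { intelligencePriority: 0.8 },
  stop_sequences: nil, # unset -> dropped by `compact`
}

params = {
  maxTokens: args[:max_tokens],
  systemPrompt: args[:system_prompt],
  modelPreferences: args[:model_preferences],
  stopSequences: args[:stop_sequences],
}.compact
# params carries no :stopSequences key
```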

**Using Sampling in Tools (works with both Stdio and HTTP transports):**

Tools that accept a `server_context:` parameter can call `create_sampling_message` on it.
The request is automatically routed to the correct client session.
Set `server.server_context = server` so that `server_context.create_sampling_message` delegates to the server:

```ruby
class SummarizeTool < MCP::Tool
description "Summarize text using LLM"
input_schema(
properties: {
text: { type: "string" }
},
required: ["text"]
)

def self.call(text:, server_context:)
result = server_context.create_sampling_message(
messages: [
{ role: "user", content: { type: "text", text: "Please summarize: #{text}" } }
],
max_tokens: 500
)

MCP::Tool::Response.new([{
type: "text",
text: result[:content][:text]
}])
end
end

server = MCP::Server.new(name: "my_server", tools: [SummarizeTool])
server.server_context = server
```

**Tool Use in Sampling:**

When tools are provided in a sampling request, the LLM can call them during generation.
The server must handle tool calls and continue the conversation with tool results:

```ruby
result = server.create_sampling_message(
messages: [
{ role: "user", content: { type: "text", text: "What's the weather in Paris?" } }
],
max_tokens: 1000,
tools: [
{
name: "get_weather",
description: "Get weather for a city",
inputSchema: {
type: "object",
properties: { city: { type: "string" } },
required: ["city"]
}
}
],
tool_choice: { mode: "auto" }
)

if result[:stopReason] == "toolUse"
tool_results = result[:content].map do |tool_use|
weather_data = get_weather(tool_use[:input][:city])

{
type: "tool_result",
toolUseId: tool_use[:id],
content: [{ type: "text", text: weather_data.to_json }]
}
end

final_result = server.create_sampling_message(
messages: [
{ role: "user", content: { type: "text", text: "What's the weather in Paris?" } },
{ role: "assistant", content: result[:content] },
{ role: "user", content: tool_results }
],
max_tokens: 1000,
tools: [...]
)
end
```

**Error Handling:**

- Raises `RuntimeError` if transport is not set
- Raises `RuntimeError` if client does not support `sampling` capability
- Raises `RuntimeError` if `tools` are used but client lacks `sampling.tools` capability
- Raises `StandardError` if client returns an error response
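A minimal sketch of guarding a sampling call against these errors; `safe_sampling` is a hypothetical helper, and the rescued classes match the plain `RuntimeError`/`StandardError` behavior listed above:

```ruby
# Hypothetical helper wrapping a sampling call with the error handling above.
# `server` is assumed to be an MCP::Server with a transport attached.
def safe_sampling(server, text)
  server.create_sampling_message(
    messages: [{ role: "user", content: { type: "text", text: text } }],
    max_tokens: 50,
  )
rescue RuntimeError => e
  # Transport missing, or the client lacks the required sampling capability.
  { error: "sampling unavailable: #{e.message}" }
rescue StandardError => e
  # The client returned a JSON-RPC error response.
  { error: "sampling failed: #{e.message}" }
end
```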

### Notifications

The server supports sending notifications to clients when lists of tools, prompts, or resources change. This enables real-time updates without polling.
4 changes: 1 addition & 3 deletions conformance/expected_failures.yml
@@ -1,7 +1,5 @@
server:
# TODO: Server-to-client requests (sampling/createMessage, elicitation/create) are not implemented.
# `Transport#send_request` does not exist in the current SDK.
- tools-call-sampling
# TODO: Server-to-client requests (elicitation/create) are not implemented.
- tools-call-elicitation
- elicitation-sep1034-defaults
- elicitation-sep1330-enums
13 changes: 8 additions & 5 deletions conformance/server.rb
@@ -156,7 +156,6 @@ def call(server_context:, **_args)
end
end

# TODO: Implement when `Transport` supports server-to-client requests.
class TestSampling < MCP::Tool
tool_name "test_sampling"
description "A tool that requests LLM sampling from the client"
@@ -166,11 +165,15 @@ class TestSampling < MCP::Tool
)

class << self
def call(prompt:)
MCP::Tool::Response.new(
[MCP::Content::Text.new("Sampling not supported in this SDK version").to_h],
error: true,
def call(prompt:, server_context:)
result = server_context.create_sampling_message(
messages: [{ role: "user", content: { type: "text", text: prompt } }],
max_tokens: 100,
)
model = result[:model] || "unknown"
text = result.dig(:content, :text) || ""

MCP::Tool::Response.new([MCP::Content::Text.new("LLM response: #{text} (model: #{model})").to_h])
end
end
end
81 changes: 80 additions & 1 deletion lib/mcp/server.rb
@@ -48,6 +48,7 @@ def initialize(method_name)
include Instrumentation

attr_accessor :description, :icons, :name, :title, :version, :website_url, :instructions, :tools, :prompts, :resources, :server_context, :configuration, :capabilities, :transport, :logging_message_notification
attr_reader :client_capabilities

def initialize(
description: nil,
@@ -86,6 +87,7 @@ def initialize(
validate!

@capabilities = capabilities || default_capabilities
@client_capabilities = nil
@logging_message_notification = nil

@handlers = {
@@ -198,6 +200,43 @@ def notify_log_message(data:, level:, logger: nil)
report_exception(e, { notification: "log_message" })
end

# Sends a `sampling/createMessage` request to the client.
# For single-client transports (e.g., `StdioTransport`). For multi-client transports
# (e.g., `StreamableHTTPTransport`), use `ServerSession#create_sampling_message` instead
# to ensure the request is routed to the correct client.
def create_sampling_message(
messages:,
max_tokens:,
system_prompt: nil,
model_preferences: nil,
include_context: nil,
temperature: nil,
stop_sequences: nil,
metadata: nil,
tools: nil,
tool_choice: nil
)
unless @transport
raise "Cannot send sampling request without a transport."
end

params = build_sampling_params(
@client_capabilities,
messages: messages,
max_tokens: max_tokens,
system_prompt: system_prompt,
model_preferences: model_preferences,
include_context: include_context,
temperature: temperature,
stop_sequences: stop_sequences,
metadata: metadata,
tools: tools,
tool_choice: tool_choice,
)

@transport.send_request(Methods::SAMPLING_CREATE_MESSAGE, params)
end

# Sets a custom handler for `resources/read` requests.
# The block receives the parsed request params and should return resource
# contents. The return value is set as the `contents` field of the response.
@@ -208,6 +247,45 @@ def resources_read_handler(&block)
@handlers[Methods::RESOURCES_READ] = block
end

def build_sampling_params(
capabilities,
messages:,
max_tokens:,
system_prompt: nil,
model_preferences: nil,
include_context: nil,
temperature: nil,
stop_sequences: nil,
metadata: nil,
tools: nil,
tool_choice: nil
)
unless capabilities&.dig(:sampling)
raise "Client does not support sampling."
end

if tools && !capabilities.dig(:sampling, :tools)
raise "Client does not support sampling with tools."
end

if tool_choice && !capabilities.dig(:sampling, :tools)
raise "Client does not support sampling with tool_choice."
end

{
messages: messages,
maxTokens: max_tokens,
systemPrompt: system_prompt,
modelPreferences: model_preferences,
includeContext: include_context,
temperature: temperature,
stopSequences: stop_sequences,
metadata: metadata,
tools: tools,
toolChoice: tool_choice,
}.compact
end

private

def validate!
@@ -355,10 +433,11 @@ def init(params, session: nil)
session.store_client_info(client: params[:clientInfo], capabilities: params[:capabilities])
else
@client = params[:clientInfo]
@client_capabilities = params[:capabilities]
end
protocol_version = params[:protocolVersion]
end

protocol_version = params[:protocolVersion] if params
negotiated_version = if Configuration::SUPPORTED_STABLE_PROTOCOL_VERSIONS.include?(protocol_version)
protocol_version
else
35 changes: 35 additions & 0 deletions lib/mcp/server/transports/stdio_transport.rb
@@ -53,6 +53,41 @@ def send_notification(method, params = nil)
MCP.configuration.exception_reporter.call(e, { error: "Failed to send notification" })
false
end

def send_request(method, params = nil)
request_id = generate_request_id
request = { jsonrpc: "2.0", id: request_id, method: method }
request[:params] = params if params

begin
send_response(request)
rescue => e
MCP.configuration.exception_reporter.call(e, { error: "Failed to send request" })
raise
end

while @open && (line = $stdin.gets)
begin
parsed = JSON.parse(line.strip, symbolize_names: true)
rescue JSON::ParserError => e
MCP.configuration.exception_reporter.call(e, { error: "Failed to parse response" })
raise
end

if parsed[:id] == request_id && !parsed.key?(:method)
if parsed[:error]
raise StandardError, "Client returned an error for #{method} request (code: #{parsed[:error][:code]}): #{parsed[:error][:message]}"
end

return parsed[:result]
else
response = @session ? @session.handle(parsed) : @server.handle(parsed)
send_response(response) if response
end
end

raise "Transport closed while waiting for response to #{method} request."
end
end
end
end