diff --git a/.github/workflows/ci-openab-agent.yml b/.github/workflows/ci-openab-agent.yml index c0d5a3727..b33d8b613 100644 --- a/.github/workflows/ci-openab-agent.yml +++ b/.github/workflows/ci-openab-agent.yml @@ -4,9 +4,11 @@ on: push: paths: - 'openab-agent/**' + - '.github/workflows/ci-openab-agent.yml' pull_request: paths: - 'openab-agent/**' + - '.github/workflows/ci-openab-agent.yml' jobs: check: @@ -24,7 +26,9 @@ jobs: workspaces: openab-agent - run: cargo fmt --check - run: cargo clippy -- -D warnings + - run: cargo clippy --features mcp -- -D warnings - run: cargo test + - run: cargo test --features mcp - run: cargo test -- --ignored env: ANTHROPIC_API_KEY: "fake-key-for-ci" diff --git a/docs/adr/openab-agent-mcp.md b/docs/adr/openab-agent-mcp.md new file mode 100644 index 000000000..c90f1fa47 --- /dev/null +++ b/docs/adr/openab-agent-mcp.md @@ -0,0 +1,677 @@ +# ADR: openab-agent — MCP Client Support + +## 1. Context & Motivation + +`openab-agent` is the native Rust coding agent shipped with OpenAB (Cargo workspace member `openab-agent/`, introduced 2026-05-26 via PR #924, targeted at the v0.8.4-beta series). Its `docs/adr/openab-agent.md` charter commits to a small surface: 4 built-in tools (`read`, `write`, `edit`, `bash`), a ~500-token system prompt, no LLM SDK dependency, multi-model via thin HTTP. PR #955 added `Skills` support (`openab-agent/src/skills.rs`, 224 LOC, zero new crate dependencies) as the first extension mechanism — descriptor-only injection plus on-demand load via the existing `read` tool. + +The agent currently has **no MCP (Model Context Protocol) client**. This ADR proposes one. + +### 1.1 Why MCP for openab-agent + +- **Ecosystem leverage.** Every Postgres/GitHub/Figma/Jira/Slack integration users will ask for already exists as an MCP server (mcpbundles.com tracks ~9k tools across ~1.4k providers as of 2026-Q2). Re-implementing each as a Skill or built-in tool is duplicative. 
+- **Parity with peer agents.** Claude Code, Codex CLI, Cursor, Cline, Goose, opencode, OpenHands, Kiro, Junie, Roo Code all ship MCP clients. Users coming from any of these expect `mcpServers` config to "just work". +- **Skills cannot replace MCP.** Per Anthropic's framing — **Skills = procedural (how to do); MCP = connectivity (where data/tools live)**. Skills wrap CLI tools; MCP handles network, auth, streaming, server-side state. + +### 1.2 Why now + +Skills landed in PR #955. The repo's design pattern for "first-tier-but-tiny" extension is now established. MCP is the natural next layer. + +### 1.3 Prior internal attempts + +Four MCP PRs to upstream `openabdev/openab` have closed without merging: + +| PR | Title | State | Scope | +|---|---|---|---| +| #329, #330 | `feat(mcp): inject per-user MCP servers from Discord profiles into ACP sessions` | CLOSED | Broker forward | +| #345 | `feat: inject per-user MCP servers into ACP sessions` | CLOSED | Broker forward | +| #903 | `feat(agent): forward configured MCP servers` | CLOSED | Broker forward | + +All four targeted the broker layer — pass MCP server config through to the backing CLI (Claude Code / Codex / Cursor) and let that CLI handle MCP. **None addressed native MCP support inside `openab-agent` itself.** This ADR is scoped to the native agent. + +Issue #753 remains open and is broker-side (`[agent].inherit_cloud_mcp_servers` opt-out). This ADR does not change broker behavior. + +--- + +## 2. Goals & Non-Goals + +### In scope + +- MCP **client** support inside `openab-agent` +- Transports: stdio (local servers — Anthropic reference, npm/pypi community) and Streamable HTTP (vendor-hosted SaaS — Atlassian, Figma, Linear, Notion, Sentry, etc.). HTTP+SSE intentionally omitted — superseded by MCP spec 2025-11-25 and actively sunset by vendors (Atlassian deadline 2026-06-30). See §3.8 for landscape. 
+- OAuth login flow for MCP servers requiring it +- Per-session lifecycle with idle eviction +- Per-session config refresh — new ACP session re-reads `mcpServers` from disk (no file watcher, no mid-session reload; openab spawns short-lived sessions per thread so process restart is rarely needed) +- Progressive-disclosure tool surface (single meta-tool, not flat fan-out) +- Reuse of existing `src/auth.rs` PKCE / TokenStore where possible + +### Out of scope + +- MCP **server** functionality (host only) +- WASM / cdylib plugin runtime +- Sidecar / out-of-process MCP bridge +- Per-thread MCP isolation (broker concern, not agent) +- Replacing Skills (Skills and MCP coexist) + +--- + +## 3. Prior Art Survey + +Per `docs/adr/pr-contribution-guidelines.md`, OpenClaw and Hermes Agent are the mandatory references for architectural PRs. OpenClaw was evaluated and found **not applicable to this ADR**: it is a multi-channel messaging gateway (chat platforms ↔ MCP), not a coding agent. Its substantial MCP code (~2,900 LOC across `src/agents/mcp-*`, `src/config/mcp-*`, `src/mcp/`) addresses channel bridging rather than agent-side tool calling. The closer comparison for a coding-agent MCP client is **opencode (§3.2)**, included in addition to Hermes Agent. + +Five projects are surveyed below. 
Each contributes a design pattern the chosen architecture borrows from:
+
+| § | Project | Borrowed pattern |
+|---|---|---|
+| 3.1 | Hermes Agent | Circuit breaker (per-server fail threshold + cooldown) |
+| 3.2 | opencode | Per-server status enum + RFC 7591 dynamic OAuth |
+| 3.3 | pi-mcp-adapter | Single `mcp` meta-tool with action dispatch (progressive disclosure) |
+| 3.4 | Goose | MCP-as-primary-extension validation in a Rust codebase |
+| 3.5 | OpenHands | `filter_tools_regex` per-server tool scoping |
+
+### 3.1 Hermes Agent (mandatory reference)
+
+- Repo: https://github.com/NousResearch/hermes-agent (Python, Apache 2.0)
+- MCP module: ~5,175 LOC across 3 files (`mcp_tool.py` + 2 OAuth modules)
+- SDK: official `mcp==1.26.0`
+- Transports: stdio + Streamable HTTP + SSE
+- Tool naming: `mcp_{server}_{tool}` (single-underscore separators, no `__` boundary marker)
+- Lifecycle: per-server long-lived `asyncio.Task` on dedicated background event loop
+- Lazy loading: eager connect, but background-thread discovery with 0.75s join — non-blocking
+- Hot reload: mtime-poll on `~/.hermes/config.yaml` + `/reload-mcp` slash command
+- OAuth: mtime-based disk-watch for cross-process token refresh
+- **Notable**: ships a real circuit breaker — threshold 3 failures / 60s cooldown / half-open probe state. The only project surveyed that does so.
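A minimal Rust sketch of the breaker behaviour named in the last bullet (threshold of 3 failures, 60s cooldown, half-open probe). This is not Hermes' code, which is Python; the type and method names are hypothetical, and counting consecutive failures rather than failures inside a time window is a simplifying assumption.

```rust
use std::time::{Duration, Instant};

/// Breaker states: closed (calls pass), open (calls rejected), half-open (one probe allowed).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum BreakerState {
    Closed,
    Open { since: Instant },
    HalfOpen,
}

/// Hypothetical per-server breaker; names are illustrative only.
pub struct CircuitBreaker {
    state: BreakerState,
    consecutive_failures: u32,
    fail_threshold: u32,
    cooldown: Duration,
}

impl CircuitBreaker {
    pub fn new(fail_threshold: u32, cooldown: Duration) -> Self {
        Self { state: BreakerState::Closed, consecutive_failures: 0, fail_threshold, cooldown }
    }

    /// Returns true if a call may proceed at time `now`.
    /// An open breaker turns half-open once the cooldown has elapsed,
    /// which admits exactly one probe call.
    pub fn allow(&mut self, now: Instant) -> bool {
        match self.state {
            BreakerState::Closed => true,
            // A probe is already in flight; reject until it reports back.
            BreakerState::HalfOpen => false,
            BreakerState::Open { since } => {
                if now.saturating_duration_since(since) >= self.cooldown {
                    self.state = BreakerState::HalfOpen;
                    true
                } else {
                    false
                }
            }
        }
    }

    /// Any success closes the breaker and clears the failure count.
    pub fn record_success(&mut self) {
        self.consecutive_failures = 0;
        self.state = BreakerState::Closed;
    }

    /// A failed probe re-opens the breaker and restarts the cooldown;
    /// otherwise the breaker opens once the threshold is reached.
    pub fn record_failure(&mut self, now: Instant) {
        match self.state {
            BreakerState::HalfOpen => self.state = BreakerState::Open { since: now },
            _ => {
                self.consecutive_failures += 1;
                if self.consecutive_failures >= self.fail_threshold {
                    self.state = BreakerState::Open { since: now };
                }
            }
        }
    }

    pub fn state(&self) -> BreakerState {
        self.state
    }
}
```

With `CircuitBreaker::new(3, Duration::from_secs(60))`, the third `record_failure` opens the breaker, `allow` keeps returning `false` until 60 seconds have passed, and the first `allow` after that admits a single probe.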
+ +### 3.2 opencode (anomalyco/opencode, formerly sst/opencode) + +- Repo: https://github.com/anomalyco/opencode (TypeScript, MIT) — `sst/opencode` 301-redirects here after org transfer +- **Closest coding-agent comparison to openab-agent** +- MCP module: ~1,664 LOC across 5 files (`mcp/`, `auth.ts`, OAuth provider/callback, config) +- SDK: `@modelcontextprotocol/sdk@1.27.1` +- Transports: stdio + Streamable HTTP + SSE +- Tool naming: `{sanitized_client}_{sanitized_tool}` (single underscore) +- Lifecycle: shared singleton service via Effect `Layer`; one `Client` per server +- Lazy loading: eager connect with `concurrency: "unbounded"`; per-server status union prevents one bad server from crashing others +- Hot reload: subscribes to MCP spec's `notifications/tools/list_changed`; **no file watcher** for config — config change still requires restart +- OAuth: RFC 7591 dynamic client registration, callback `http://127.0.0.1:19876/mcp/oauth/callback`, EffectFlock cross-process locking on token store +- **Known issues** (cited as architectural cautionary tales): #11868 (113 GB virtual-memory leak, Windows v1.1.21), #7261 (heap not released + MCP orphan processes, v1.1.6), #13041 (per-session MCP+LSP duplication across concurrent sessions) — all rooted in child-process lifecycle, not protocol code + +### 3.3 pi-mcp-adapter + +- Repo: https://github.com/nicobailon/pi-mcp-adapter (TypeScript, MIT) +- An out-of-tree extension for the Pi coding agent (`pi.extensions`) — Pi itself has **no native MCP** +- MCP module: ~3,661 LOC (server-manager, proxy-modes, direct-tools, OAuth) +- SDK: `@modelcontextprotocol/sdk@^1.25.1` + `@modelcontextprotocol/ext-apps@^1.2.2` +- Transports: stdio + Streamable HTTP + SSE +- **Notable — the reason this is cited**: ships a **single `mcp` meta-tool** with sub-actions (`connect`, `describe`, `search`, `list`, `call`, `status`). All MCP capability is exposed through this one tool. 
Lazy connect happens inside `lazyConnect()` on first action that needs it. This is the **progressive-disclosure pattern** this ADR adopts. + +### 3.4 Goose (block / aaif-goose) + +- Repo: https://github.com/block/goose → https://github.com/aaif-goose/goose (Rust, Apache 2.0) +- **Most relevant precedent: a Rust coding agent built around MCP** +- Launched Jan 2025 with MCP as the *only* extension surface (no first-party plugin API to retrofit) +- Hand-rolled `mcp-client` crate (predated official Rust SDK) +- Per-session `Agent` owns an `ExtensionManager` that spawns MCP servers (stdio/SSE) as child processes +- Tools flattened into one namespace; extension name used as prefix to avoid collisions +- Supports `tools/list_changed` for hot reload +- Precedent for a Rust agent shipping MCP as the primary extension surface without WASM / cdylib / sidecar plumbing. + +### 3.5 OpenHands (All-Hands-AI) + +- Repo: https://github.com/OpenHands/OpenHands (Python, MIT) +- SDK: FastMCP (jlowin/fastmcp), not the reference SDK +- **Notable**: per-agent `filter_tools_regex` config — subset a server's tools without modifying the server. OAuth tokens cached under `~/.fastmcp/oauth-mcp-client-cache/` with auto-refresh; explicit "incompatible with headless" caveat for browser-based auth. +- Cited for OAuth + tool-surface scoping patterns where Hermes/opencode/Pi are weaker. 
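OpenHands scopes tools with a regex; §5.6 of this ADR uses `include` / `exclude` glob lists for the same purpose. The sketch below shows one way such a glob filter could work in Rust. `ToolFilter`, `glob_match`, the `*`-only glob syntax, and the rule that `exclude` wins over `include` are all assumptions for illustration, not the shipped implementation.

```rust
/// Matches `name` against a glob where `*` matches any run of characters
/// (including none). Only `*` is supported in this sketch.
pub fn glob_match(pattern: &str, name: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let n: Vec<char> = name.chars().collect();
    let (mut pi, mut ni) = (0usize, 0usize);
    // Index of the last `*` seen, and the name index it was last tried at.
    let mut star: Option<(usize, usize)> = None;
    while ni < n.len() {
        if pi < p.len() && p[pi] == '*' {
            star = Some((pi, ni));
            pi += 1;
        } else if pi < p.len() && p[pi] == n[ni] {
            pi += 1;
            ni += 1;
        } else if let Some((sp, sn)) = star {
            // Backtrack: let the last `*` absorb one more character.
            pi = sp + 1;
            ni = sn + 1;
            star = Some((sp, sn + 1));
        } else {
            return false;
        }
    }
    // Any trailing `*` in the pattern can match the empty string.
    p[pi..].iter().all(|&c| c == '*')
}

/// Hypothetical mirror of a per-server `tool_filter` config block.
pub struct ToolFilter {
    pub include: Vec<String>,
    pub exclude: Vec<String>,
}

impl ToolFilter {
    /// A tool passes if it matches some include glob (or the include list
    /// is empty) and matches no exclude glob. This precedence is assumed.
    pub fn allows(&self, tool: &str) -> bool {
        let included = self.include.is_empty()
            || self.include.iter().any(|g| glob_match(g, tool));
        let excluded = self.exclude.iter().any(|g| glob_match(g, tool));
        included && !excluded
    }
}
```

A filter with `include: ["read_*", "list_*"]` would expose `read_file` and `list_directory` while hiding `write_file`, without any change on the server side.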
+
+### 3.6 Comparison matrix
+
+| | Hermes | opencode | pi-mcp-adapter | Goose | OpenHands |
+|---|---|---|---|---|---|
+| Language | Python | TS | TS | Rust | Python |
+| SDK | mcp 1.26 | sdk 1.27 | sdk 1.25 | hand-rolled | FastMCP |
+| Transports | stdio+HTTP+SSE | stdio+HTTP+SSE | stdio+HTTP+SSE | stdio+SSE | stdio+HTTP |
+| Tool naming | `mcp_s_t` | `s_t` | configurable | ext-prefix | filter |
+| Lifecycle | per-srv task | shared singleton | per-ext + idle 10m | per-session ExtensionMgr | per-agent |
+| Lazy connect | no | no | ✅ meta | no (eager) | no |
+| Hot reload | mtime+cmd | `tools/list_changed` | session boundary | `tools/list_changed` | no |
+| OAuth | mtime disk-watch | RFC7591 + Flock | PKCE+auto | ? | FastMCP cache |
+| Circuit breaker | ✅ 3/60s | no | partial | no | no |
+| LOC | ~5,175 | ~1,664 | ~3,661 | unmeasured | unmeasured |
+
+### 3.7 Skills vs MCP — industry research
+
+Anthropic positions the two as **complementary**, not competing. The 2025-2026 consensus across practitioner blogs (Simon Willison, Anthropic engineering, StackOne) converged on:
+
+```
+  Skills                          MCP
+  ──────                          ────
+  Procedural knowledge            Live connectivity
+  Markdown + YAML frontmatter     Protocol spec + SDK
+  ~100 tokens/skill in prompt     10K-17K tokens/server in prompt
+  Body lazy-loaded via read tool  All tool schemas eagerly loaded
+  Local file                      Server (process or HTTP endpoint)
+  No auth, no lifecycle           OAuth, lifecycle, transports
+  Open standard (Dec 2025)        Linux Foundation steward (late 2025)
+```
+
+**Adoption**: no major OSS coding agent has rejected MCP in favor of Skills-only (or vice versa). All 11 surveyed agents (Claude Code, Codex CLI, Gemini CLI, Cursor, Cline, Goose, opencode, Junie, Kiro, Roo, GitHub Copilot agent-mode) support both.
+ +**Cost data**: large MCP server collections have been documented consuming substantial context budget — StackOne benchmarks Sonnet 4.6 at 42% tool-selection accuracy on the unmodified MCP surface vs 80% with their Code Mode wrapper, motivating the spec-level fix in MCP SEP-1576 ("Mitigating Token Bloat in MCP") which proposes progressive disclosure (**not yet ratified**). + +**Implication for this ADR**: progressive disclosure is not optional for openab-agent. The agent's design principle commits to a ~500-token system prompt; a naïve flat MCP integration would 30× that budget. Skills' descriptor-only injection pattern is the precedent. + +### 3.8 Transport landscape & SaaS MCP server adoption + +MCP defines three transport profiles. Their 2026-Q2 status: + +| Transport | Spec status | Where it lives | +|---|---|---| +| **stdio** | Stable | Local child process — Anthropic reference servers, npm/pypi community packages | +| **Streamable HTTP** | Current (MCP spec 2025-11-25), supersedes HTTP+SSE | Vendor-hosted SaaS endpoints | +| **HTTP+SSE** | Deprecated by spec 2025-11-25; vendor sunsets in progress | Legacy fixtures — Atlassian sunsets 2026-06-30 | + +``` + ┌──────────────────────────────── MCP Server Universe ─────────────────────────────────┐ + │ │ + │ ┌─────────────────────────────┐ ┌────────────────────────────────────┐ │ + │ │ LOCAL (majority of registry) │ │ REMOTE (vendor SaaS, growing) │ │ + │ │ │ │ │ │ + │ │ Transport: stdio │ │ Transport: Streamable HTTP │ │ + │ │ │ │ │ │ + │ │ filesystem sqlite │ │ Atlassian Figma Linear │ │ + │ │ postgres puppeteer │ │ Notion Sentry Supabase │ │ + │ │ github fetch │ │ HubSpot Slack Stripe │ │ + │ │ time gitlab │ │ Cloudflare Vercel Neon ... │ │ + │ │ ... 
│ │ │ │ + │ └─────────────────────────────┘ └────────────────────────────────────┘ │ + │ │ + │ ┌────────────────────────────────────────────────────────────────────────────┐ │ + │ │ LEGACY (deprecated, vendor sunsets in progress) │ │ + │ │ Transport: HTTP+SSE │ │ + │ │ e.g. Atlassian https://mcp.atlassian.com/v1/sse (off 2026-06-30) │ │ + │ └────────────────────────────────────────────────────────────────────────────┘ │ + │ │ + └──────────────────────────────────────────────────────────────────────────────────────┘ +``` + +#### Local stdio servers (representative sample) + +Anthropic reference + community packages. All ship as `command + args`; no network endpoint. + +| Server | Implementation | Distribution | +|---|---|---| +| `mcp-server-filesystem` | Node | `@modelcontextprotocol/server-filesystem` (npm) | +| `mcp-server-sqlite` | Python | `mcp-server-sqlite` (pypi) | +| `mcp-server-postgres` | Python | `mcp-server-postgres` (pypi) — local DB | +| `mcp-server-puppeteer` | Node | `@modelcontextprotocol/server-puppeteer` (npm) | +| `mcp-server-github` | Go / Node | `github-mcp-server` (binary) / `@modelcontextprotocol/server-github` (npm) | +| `mcp-server-fetch` | Python | `mcp-server-fetch` (pypi) | +| `mcp-server-time` | Rust | `mcp-server-time` (cargo) | +| `mcp-server-gitlab` | Node | `@modelcontextprotocol/server-gitlab` (npm) | + +**Container-image caveat for headless deployments**: Node/Python stdio servers require the corresponding interpreter (`node`, `python3`, `uvx`, `npx`) in the image. The openab base image ships none. Operators running openab-agent in headless environments (Fargate, Kubernetes pods, CI) must either bake the interpreter into a derived image or limit `mcpServers` to Go/Rust binaries (column above). A misconfigured server fails in isolation per §5.9. + +#### Vendor-hosted SaaS servers — all Streamable HTTP + +Survey of mainstream public endpoints (2026-Q2). Every active vendor endpoint surveyed is Streamable HTTP. 
The Atlassian SSE URL is the lone holdout and has a published sunset date. + +| Vendor | Endpoint | Transport | Notes | +|---|---|---|---| +| Atlassian (Rovo) | `https://mcp.atlassian.com/v1/mcp` | Streamable HTTP | Legacy SSE at `/v1/sse` sunset **2026-06-30** | +| Figma | `https://mcp.figma.com/mcp` | Streamable HTTP | OAuth via Figma account | +| Linear | `https://mcp.linear.app/mcp` | Streamable HTTP | OAuth | +| Notion | `https://mcp.notion.com/mcp` | Streamable HTTP | OAuth | +| Sentry | `https://mcp.sentry.dev/mcp` | Streamable HTTP | OAuth | +| Supabase | `https://mcp.supabase.com/mcp` | Streamable HTTP | OAuth | +| HubSpot | `https://mcp.hubspot.com/anthropic` | Streamable HTTP | OAuth | +| Slack | (vendor-hosted) | Streamable HTTP | OAuth | +| Stripe | hosted (see Stripe MCP docs for current path) | Streamable HTTP | API key | +| Cloudflare | multiple endpoints under `*.mcp.cloudflare.com` | Streamable HTTP | OAuth (workers/dns/r2/...) | +| Vercel | `https://mcp.vercel.com/` | Streamable HTTP | OAuth | +| Neon | `https://mcp.neon.tech/` | Streamable HTTP | OAuth | + +**Cover map**: stdio + Streamable HTTP covers all mainstream public MCP endpoints surveyed as of 2026-Q2. SSE-only deployments are legacy fixtures with vendor sunsets in progress; deferred to a hypothetical v2. + +--- + +## 4. Design Decision + +### 4.1 Architectural alternatives compared + +**Alternative A — Naïve flat in-core.** Every MCP tool from every connected server becomes a top-level entry in `tool_definitions()`. Surface explodes from 4 → 150+ tools; system prompt grows ~500 → ~17,000 tokens (5 servers × ~20 tools each, ~160 tokens per descriptor). Hermes Agent and opencode both pay this cost; StackOne benchmarks (§3.7) show tool-selection accuracy drops sharply under naïve flat surfaces. + +**Alternative B — Sidecar / plugin process.** Spawn a separate `openab-mcp-bridge` binary; agent core has no `rmcp` dependency; communicate via stdio JSON-RPC. 
RAM saving is 1-2 MB on a 15-40 MB baseline — noise — but the bridge process itself adds ~15 MB and inherits opencode's documented sidecar failure modes (#11868 113 GB leak / #7261 orphan processes / #13041 per-session duplication). Cost ≫ benefit (see §7). + +**Alternative C — CHOSEN: in-core `rmcp` + progressive-disclosure meta-tool.** `rmcp` enters `Cargo.toml`. Tool surface grows by exactly **1 tool**: `mcp`. All MCP capability (server enumeration, tool discovery, invocation, status) flows through that single tool's `action` field. System prompt grows ~500 → ~600 tokens (+100 for the meta-tool blurb). + +### 4.2 Why C honors openab-agent design principles + +| Principle (`docs/adr/openab-agent.md` §2) | A (flat) | B (sidecar) | **C (chosen)** | +|---|:---:|:---:|:---:| +| Minimal tool surface (4 tools) | ⛔ 150+ | ✅ 4 | ✅ 5 | +| Tiny system prompt (~500 tokens) | ⛔ ~17K | ✅ ~500 | ⚠️ ~600 (+100 over budget; accepted as smallest viable surface) | +| No SDK dependency | ⛔ rmcp | ✅ none | ⚠️ rmcp (+1-2 MB binary, see §7) | +| Multi-model | ✅ | ✅ | ✅ | + +C concedes the "no SDK dependency" principle for a 1-2 MB binary cost. §7 shows that cost is dominated by child-process RAM (5-80 MB per server, depending on implementation language) regardless of architecture, so the concession is dwarfed by usage cost. + +### 4.3 Symmetry with Skills (PR #955) + +Skills is openab's existing "first-tier-but-tiny" extension. 
The mapping is exact: + +``` +┌────────────────────────────┬─────────────────────────────────────┐ +│ Skills (224 LOC, in-core) │ MCP (proposed, in-core) │ +├────────────────────────────┼─────────────────────────────────────┤ +│ Inject metadata only │ Inject 1 meta-tool only │ +│ (name + description) │ (name + action sketch) │ +├────────────────────────────┼─────────────────────────────────────┤ +│ Body load via `read` tool │ Server connect via `mcp` tool │ +│ on agent's demand │ on agent's demand (lazy connect) │ +├────────────────────────────┼─────────────────────────────────────┤ +│ ~100 tokens / 10 skills │ ~100 tokens / N servers │ +├────────────────────────────┼─────────────────────────────────────┤ +│ No new crate dep │ Adds rmcp (1-2 MB binary delta) │ +└────────────────────────────┴─────────────────────────────────────┘ +``` + +Skills' authors weighed "simple in-core mechanism vs plugin abstraction" and chose in-core. The same trade-off applies to MCP: plugin abstraction is ~10× the complexity for negligible RAM saving. + +--- + +## 5. Detailed Design + +### 5.1 Tool surface (4 + 1) + +``` +openab-agent/src/tools.rs::tool_definitions() returns 5 entries: + + [ "read" ] ─── existing, unchanged + [ "write" ] ─── existing, unchanged + [ "edit" ] ─── existing, unchanged + [ "bash" ] ─── existing, unchanged + [ "mcp" ] ─── NEW +``` + +### 5.2 The `mcp` meta-tool schema + +```jsonc +{ + "name": "mcp", + "description": "Interact with configured MCP servers. 
Use action='help' for usage.", + "input_schema": { + "type": "object", + "properties": { + "action": { + "type": "string", + "enum": ["help", "list_servers", "list_tools", + "describe_tool", "call", "status", + "login", "complete_login"] + }, + "server": { "type": "string" }, + "tool": { "type": "string" }, + "arguments": { "type": "object" }, + "redirect_url": { "type": "string" } + }, + "required": ["action"] + } +} +``` + +Per-action contract: + +| action | required fields | returns | +|---|---|---| +| `help` | — | usage doc string | +| `list_servers` | — | `[{ name, status, transport, tools_count }]` | +| `list_tools` | `server` | `[{ name, description }]` | +| `describe_tool` | `server`, `tool` | `{ name, description, input_schema }` | +| `call` | `server`, `tool`, `arguments` | tool's `CallToolResult` | +| `status` | `server?` | per-server health / last error / OAuth state | +| `login` | `server` | `{ flow: "device", user_code, verification_url, ... }` or `{ flow: "paste", authorize_url, state, ... }` — see §6.4 | +| `complete_login` | `server`, `redirect_url` | `{ ok: true }` or `{ error }` — paste flow only; device flow polls internally | + +### 5.3 Agent loop interaction + +Typical multi-turn usage (lazy connect at first need, idle eviction after TTL): + +- **Turn 1** — LLM calls `mcp(action: "list_servers")`; no IO, served from config cache. Returns `["github (stdio)", ...]`. +- **Turn 2** — LLM calls `mcp(action: "list_tools", server: "github")`; `lazy_connect("github")` spawns child proc, `peer.list_all_tools()` fetches descriptors. Returns `[{name, description}, ...]`. +- **Turn 3** — LLM calls `mcp(action: "call", server, tool, arguments)`; `peer.call_tool()` invokes. Returns `CallToolResult`. +- **Idle (no MCP call for `idle_ttl`)** — `IdleEvictor` shuts down child proc, drops Peer; config + descriptor cache retained for fast re-connect. 
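The per-action contract in §5.2 lends itself to a small validation step before dispatch. The sketch below is one possible shape, using only the action names and required fields from that table; `Action`, `required_fields`, and `validate` are hypothetical names, and a real implementation would presumably deserialize the JSON input rather than take a list of present field names.

```rust
/// The eight actions from the `mcp` meta-tool schema.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Action {
    Help,
    ListServers,
    ListTools,
    DescribeTool,
    Call,
    Status,
    Login,
    CompleteLogin,
}

impl Action {
    /// Parses the `action` string; unknown values yield `None`.
    pub fn parse(s: &str) -> Option<Action> {
        Some(match s {
            "help" => Action::Help,
            "list_servers" => Action::ListServers,
            "list_tools" => Action::ListTools,
            "describe_tool" => Action::DescribeTool,
            "call" => Action::Call,
            "status" => Action::Status,
            "login" => Action::Login,
            "complete_login" => Action::CompleteLogin,
            _ => return None,
        })
    }

    /// Required fields per the per-action contract table.
    /// `status` takes an optional `server`, so it requires nothing.
    pub fn required_fields(self) -> &'static [&'static str] {
        match self {
            Action::Help | Action::ListServers | Action::Status => &[],
            Action::ListTools | Action::Login => &["server"],
            Action::DescribeTool => &["server", "tool"],
            Action::Call => &["server", "tool", "arguments"],
            Action::CompleteLogin => &["server", "redirect_url"],
        }
    }
}

/// Hypothetical pre-dispatch check: `present` lists the field names that
/// were supplied in the tool input alongside `action`.
pub fn validate(action: &str, present: &[&str]) -> Result<Action, String> {
    let parsed = Action::parse(action)
        .ok_or_else(|| format!("unknown action: {}", action))?;
    for field in parsed.required_fields() {
        if !present.contains(field) {
            return Err(format!("action '{}' requires field '{}'", action, field));
        }
    }
    Ok(parsed)
}
```

Rejecting a malformed call here, before any `lazy_connect`, keeps a bad request from spawning a child process only to fail afterwards.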
+ +### 5.4 Module layout + +``` +openab-agent/src/ +├── agent.rs (existing — add 1 match arm in execute_tool) +├── auth.rs (existing — TokenStore reused by mcp/oauth.rs) +├── llm.rs (existing — UNCHANGED, ToolDef is already generic) +├── tools.rs (existing — add `mcp` to tool_definitions()) +├── skills.rs (existing — UNCHANGED) +└── mcp/ (NEW module) + ├── mod.rs (public: McpRuntimeManager, dispatch()) + ├── config.rs (mcpServers schema, ${env:VAR} interpolation) + ├── runtime.rs (per-server lifecycle, lazy connect, idle TTL) + ├── meta_tool.rs (action dispatch: list_servers / list_tools / ...) + ├── oauth.rs (uses src/auth.rs TokenStore; built-in providers) + └── breaker.rs (circuit breaker per server) +``` + +Estimated total: **500-750 LOC** (no `reload.rs`; per-session refresh handled by `McpRuntimeManager::new()` re-reading config at session start). `llm.rs` is unchanged because both Anthropic and OpenAI Responses providers consume the generic `ToolDef` abstraction. + +#### 5.4.1 Runtime activation & isolation choices + +Three intentional choices that surfaced in PR #959 review (chaodu F2 / F6 / F7) and are load-bearing enough to belong in the design contract: + +1. **Runtime opt-in gate (F6, env-only).** `load_runtime_or_warn()` returns `None` unless `OPENAB_AGENT_MCP={1,true,yes,on}` (case-insensitive) is set in the process env, even when `mcp.json` is present. Reasoning: file presence is not a strong enough activation signal — `mcp.json` can land in a deploy tree incidentally (image baseline, project clone) and an unrelated agent shouldn't start spawning third-party child processes. The CLI subcommands (`mcp list / status / connect / doctor`) call `load_config_or_exit` instead and work without the env var so operators can inspect a config before activating it. +2. 
**Stdio child env scrubbing (F2, intentional security).** `Dial::Stdio` calls `env_clear()` and passes only the 4-var baseline allowlist (`HOME`, `PATH`, `TERM`, `USER` on Unix; Windows equivalents) plus the explicit `env:` map from `mcp.json`. Reasoning: openab-agent inherits high-value secrets from its launcher (`DISCORD_BOT_TOKEN`, `ANTHROPIC_API_KEY`, AWS credentials, GitHub tokens) and stdio MCP servers are third-party binaries with no contractual constraint on what they read from their environment. Leaking those by default is a much larger risk than the convenience of inherited proxy/locale settings. Servers that genuinely need additional env (proxy, certs, locale, provider config) declare them per-server in the config — a future `inherit_env` opt-in list is tracked as follow-up if user demand surfaces. +3. **Per-process shared `McpRuntimeManager` (F7).** A single manager is `Arc`-cloned across all ACP sessions of the same process. Reasoning: MCP servers are expensive to spawn (stdio child fork, HTTP handshake + OAuth) and most are pure-state read-only tools where cross-session visibility is benign. Trade-off: a `mcp connect github` in session A makes the `github` server immediately available in session B. We accept this — per-session isolation would multiply child processes and break the breaker / TTL accounting in §5.7 / §5.9. + +#### 5.4.2 Discovery slice — bounded catalogue + idle semantics (F1) + +The §5.1 / §5.2 single `mcp` meta-tool minimizes the LLM-facing tool surface, but it also *hides* the configured server names from the LLM. The F1 PoC reproduced the resulting failure mode: when a user said "use mcp fs to list /workspace", the LLM called `mcp(action: "status")`, saw `fs: disconnected`, read it as "broken", and refused to retry. Two intentional choices remove that failure mode without re-flattening the tool surface: + +1. 
**Static server catalogue in the system prompt.** `mcp::format_system_prompt_appendix(manager)` appends a `## MCP tool` section containing the tool intro plus `- **{name}** ({transport})` per configured server (with a `requires \`mcp login \`` annotation when an `oauth` block is present). The list is built once at `Agent::new_boxed` time from a sync `manager.catalog()` snapshot frozen at `from_config`, so no async or lock coordination is needed inside `build_system_prompt`. Token-budget invariance is preserved: section size grows **O(server count)** — not O(server count × tool count) — because per-tool descriptors stay behind `mcp(action: "list_tools", server)`. The PoC measured ≤100 tokens per server-side entry under this pattern; flattening tools per-server (≈ what the multi-tool alternative in §4.1 would expose) blows that budget by ~30× for a typical 30-tool github server. + + Mirror with the Skills catalogue (`skills::format_skills_prompt`): both advertise *names + headline metadata* in the always-present system prompt and force *body / contract* discovery through an explicit tool call (`mcp(action: "list_tools" | "describe_tool")` here, `read("skills/")` there). Same intent (the LLM knows the surface exists; details are lazy), same token-budget shape (linear in surface count, not in surface depth). + +2. **Status label `idle` for lazy-connect servers.** The meta-tool's `status_label` returns `idle` — not `disconnected` — when a server is in `ServerStatus::Disconnected` with no failure history. `disconnected` reads as "broken" to the LLM (PoC observation above); `idle` correctly signals "ready, will dial on first call". The genuine failure case still maps to `status: "failed"` with the dial / handshake error in `last_error`, so the LLM can distinguish "not tried yet" from "tried and broke". The system-prompt section also advertises these semantics explicitly so the LLM doesn't have to guess. 
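The two choices above can be sketched as follows. `status_label` and `format_system_prompt_appendix` are named in the text, but the signatures here are assumptions (the real appendix builder takes a manager, not a slice), as are `CatalogEntry`, the labels other than `idle` and `failed`, the `[login required]` annotation wording, and the decision to report a disconnected server with failure history as `failed`.

```rust
/// Hypothetical mirror of the per-server connection states.
#[derive(Debug, Clone, PartialEq)]
pub enum ServerStatus {
    Disconnected,
    Connecting,
    Connected,
    Failed,
    NeedsAuth,
}

/// Label shown to the LLM. A never-dialed server reads as "idle" so the
/// model does not mistake lazy connect for a broken server.
pub fn status_label(status: &ServerStatus, has_failure_history: bool) -> &'static str {
    match status {
        ServerStatus::Disconnected if !has_failure_history => "idle",
        // Assumption: a disconnected server that has failed before is reported as failed.
        ServerStatus::Disconnected | ServerStatus::Failed => "failed",
        ServerStatus::Connecting => "connecting",
        ServerStatus::Connected => "connected",
        ServerStatus::NeedsAuth => "needs_auth",
    }
}

/// One configured server as it appears in the static catalogue.
pub struct CatalogEntry {
    pub name: String,
    pub transport: String,
    pub needs_login: bool,
}

/// Builds the system-prompt section: one line per server, no per-tool
/// descriptors, so the size is linear in the number of servers.
pub fn format_system_prompt_appendix(catalog: &[CatalogEntry]) -> String {
    let mut out = String::from("## MCP tool\n");
    for entry in catalog {
        out.push_str(&format!("- **{}** ({})", entry.name, entry.transport));
        if entry.needs_login {
            // Placeholder wording; the real annotation text may differ.
            out.push_str(" [login required]");
        }
        out.push('\n');
    }
    out
}
```

Because each server contributes exactly one line, adding a thirty-tool server costs the same prompt space as adding a one-tool server.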
+
+These choices are wired into PR #959 (Phase 1) because the failure mode they fix is reachable as soon as `list_servers` and `status` ship — deferring to Phase 2/3 would mean shipping a known-broken discovery UX on the foundation slice.
+
+### 5.5 `rmcp` dependency & features
+
+```toml
+# openab-agent/Cargo.toml
+[dependencies]
+rmcp = { version = "1.7", default-features = false, features = [
+    "client",
+    "transport-child-process",
+    "transport-streamable-http-client-reqwest",
+    "auth",
+] }
+```
+
+- `client` only — we host nothing
+- `transport-child-process` — stdio servers (majority of registry, see §3.8)
+- `transport-streamable-http-client-reqwest` — vendor-hosted SaaS endpoints (reqwest is already a transitive dep)
+- `auth` — OAuth helpers
+- `default-features = false` — avoid pulling SSE / server features we don't need (SSE intentionally omitted per §3.8 — superseded by Streamable HTTP in MCP spec 2025-11-25, all surveyed vendors migrated or migrating)
+
+Binary delta estimate: **+1-2 MB** (see §7).
+
+### 5.6 Config schema
+
+Single root key `mcpServers` to match Claude Code / Codex / Cursor / Cline convention. Loaded from `.openab/agent/mcp.json` (project) and `~/.openab/agent/mcp.json` (global); the project-local file takes precedence on name collision.
+ +```jsonc +{ + "mcpServers": { + "github": { + "type": "stdio", + "command": "github-mcp-server", + "args": ["--repo-token", "${env:GITHUB_TOKEN}"], + "env": { "GH_HOST": "github.com" } + }, + "linear": { + "type": "http", + "url": "https://mcp.linear.app/mcp", + "oauth": { "provider": "linear" } + }, + "fs": { + "type": "stdio", + "command": "mcp-server-filesystem", + "args": ["/workspace"], + "tool_filter": { "include": ["read_*", "list_*"] } + } + } +} +``` + +- `${env:VAR}` interpolation matches Cursor / Cline; missing var = startup error for that server (others continue) +- `tool_filter` supports `include` / `exclude` glob lists (lifted from OpenHands' `filter_tools_regex`) +- Per-server failure isolated — one bad server does not block agent boot + +### 5.7 Lifecycle + +``` + ┌─────────────────────────────────────┐ + │ McpRuntimeManager (1 per agent) │ + │ │ + │ config: Arc │ + │ servers: Map │ + │ idle_ttl: Duration (10m) │ + │ max_concurrent: usize (10) │ + └─────────────────────────────────────┘ + │ + │ on first call needing server X: + ▼ + ┌─────────────────────────────────────┐ + │ ServerHandle (lazy) │ + │ │ + │ state: Disconnected | Connecting | │ + │ Connected(Peer) | Failed | │ + │ NeedsAuth │ + │ last_used: Instant │ + │ breaker: CircuitBreaker │ + │ tools_cache: Vec │ + └─────────────────────────────────────┘ + │ + ┌─────────────────┼─────────────────┐ + │ │ + ┌───────────┐ ┌───────────┐ + │ child proc│ │ HTTP conn │ + │ (stdio) │ │ (reqwest) │ + └───────────┘ └───────────┘ +``` + +- **Lazy connect**: server is `Disconnected` at boot; transitions to `Connecting → Connected` on first action needing it +- **Idle eviction**: background task evicts servers idle > `idle_ttl` (default 10m, configurable per server). State drops to `Disconnected`; tools cache retained for fast re-connect +- **Concurrency cap**: `max_concurrent_servers` bounds simultaneously-`Connected` servers (default 10; see §7 for constrained-env tuning). 
When at cap, the LRU connected server is force-evicted before connecting a new one +- **Connection reuse**: while connected, all `mcp call` actions reuse the same `Peer` + +### 5.8 Config refresh model + +Rather than file-watching mid-session, openab-agent re-reads `mcp.json` at session boundaries: + +- **New ACP session** → `McpRuntimeManager::new()` parses `mcp.json` from scratch; ~5 LOC of glue, zero hot-path code +- **Mid-session config edit** → not visible until next session; users re-open the Discord/Slack thread (cheap in openab's per-thread session model) +- **Process restart** → applies config changes globally; rarely needed because broker spawns short-lived agent processes per session + +This drops `notify` crate + lease counter + diff applier (~150 LOC, race-condition hotspot) for an 80% UX coverage. Hermes' `/reload-mcp` slash command (§3.1) is the precedent for "explicit user-triggered reload >> implicit file watcher" in a coding-agent context. + +### 5.9 Error isolation & circuit breaker + +Adopted from Hermes Agent (the only surveyed project that ships one): + +``` + ┌─────────────────────────────────────────┐ + │ CircuitBreaker (per server) │ + │ │ + │ state: Closed | Open | HalfOpen │ + │ fail_threshold: 3 (configurable) │ + │ cooldown: 60s (configurable) │ + └─────────────────────────────────────────┘ + │ + ┌───────────────────────┼───────────────────────┐ + │ │ │ + ▼ ▼ ▼ + 3 fails in 30s 60s elapsed 1 success + ─────────────► ─────────────► ─────────────► + Closed → Open Open → HalfOpen HalfOpen → Closed + (allow 1 probe) + │ + │ probe fails + ▼ + HalfOpen → Open + (reset cooldown) +``` + +While `Open`, `mcp call` returns `{"error":"server unavailable, cooldown 45s remaining"}` immediately — no child-process resurrection attempts, no LLM hang. + +`rmcp` error model maps cleanly: + +| `rmcp` error | meta-tool response | Counts toward breaker? 
| +|---|---|---| +| `ServiceError::McpError` (protocol) | `{ error: msg, code }` | No (server-level intent) | +| `ServiceError::TransportSend/Closed` | `{ error: "transport", server: ... }` | Yes | +| `CallToolResult { isError: true }` | passed through as result | No (tool-level) | + +--- + +## 6. OAuth + +### 6.1 Shared TokenStore + +`openab-agent/src/auth.rs` already implements hand-rolled PKCE for Codex (`CODEX_AUTHORIZE_URL`, port 1455). The TokenStore (`~/.openab/agent/auth.json`, 0o600) is reused — `mcp/oauth.rs` calls into the same store with namespaced keys (`mcp:` vs `codex`). + +**Persistence assumption**: TokenStore is treated as persistent state. Deployments must mount `~/.openab/` on durable storage — hostPath / PVC (k8s work-agents), volume + S3 sync (Fargate Mira), or developer-laptop home directory. Ephemeral container filesystems force a re-bootstrap on every restart and are not a supported configuration. + +**Cold-start refresh**: on process start the runtime reads TokenStore lazily (on first `mcp call` per server). Expired access tokens trigger an in-process refresh via the stored refresh token; success updates the store and proceeds transparently. Refresh failure (revoked / expired refresh token) flips the server's state to `NeedsAuth` (§5.7); the next `mcp call` returns an error that prompts the LLM to re-run the §6.4 login flow. No human interaction is required as long as the refresh token remains valid. + +**Refresh-token rotation race with async persistence layers**: OAuth 2.1 servers issue a new refresh token on every rotation and immediately revoke the previous one; reuse of a revoked refresh token is treated as a replay attack and cascade-revokes the entire token chain. 
Deployments where TokenStore persistence is asynchronous (Fargate S3 sidecar sync, eventually-consistent volumes) must flush new tokens to durable storage *before* the agent can be killed — otherwise a Spot interruption between local write and remote sync restores the revoked token from S3 on the next task and locks the user out. Contract: + +- **Agent side**: `TokenStore` calls `fsync(2)` after every write to `auth.json` +- **Deployment side**: the S3 / volume sync layer must trigger on `auth.json` mtime change (`inotify` / `fsnotify` event), not poll on a cron. Cron-driven sync (≥1 min interval) is incompatible with refresh-token rotation under Spot interruption +- **Reference deployment**: Mira (openab-ecs Fargate Spot) `mira-home/` S3 sync configuration + +### 6.2 Built-in providers (Phase 2) + +| Provider | Auth URL | Token URL | Callback | Scopes | +|---|---|---|---|---| +| `anthropic-mcp` | `https://claude.ai/oauth/authorize` | `https://platform.claude.com/v1/oauth/token` | `localhost:53692/callback` | `org:create_api_key user:profile user:inference user:sessions:claude_code user:mcp_servers user:file_upload` (subset varies per use) | +| `github-copilot` | (existing pi/anthropic flow) | existing | existing | existing | +| `generic` | from `mcpServers[name].oauth.authorize_url` | from `.oauth.token_url` | dynamically allocated port | from `.oauth.scopes` | + +Callback values apply when the browser flow is engaged (`--browser` / `$DISPLAY` set), and when the agent-guided paste-back branch of §6.4 is selected (user copies the redirect URL from the browser URL bar). The device-code branch of §6.4 ignores the callback entirely. + +### 6.3 Custom provider extension point + +Config can declare `oauth: { authorize_url, token_url, client_id, scopes, device_authorization_endpoint?, discovery?, discovery_allowlist? }` for any server. The generic provider handles PKCE + callback + token persistence. No code change needed for new MCP servers that use standard OAuth 2.1. 
If `device_authorization_endpoint` is set, the §6.4 device-code flow is preferred over paste-back. RFC 8414 dynamic discovery is opt-in only and requires an allowlist (see §6.4). + +### 6.4 Agent-guided OAuth flow (default) + +openab-agent's primary deployment surface is containerized (k8s pods, Fargate tasks), where `localhost:53692/callback` is unreachable and there is no display on which to open a browser. Two non-browser flows are supported; the runtime picks one per server based on capability. Browser-callback remains a laptop-only opt-in (`$DISPLAY` set, or `--browser` passed to `openab-agent mcp login`). + +**Selection logic** (on `mcp(action: "login", server: X)`): + +1. If `X` declares an `oauth.device_authorization_endpoint` in config (§6.3), the runtime uses **device-code flow** (RFC 8628). This matches openab's existing CLI convention (`claude auth login`, `codex --device-auth`, `grok --device-auth`). +2. Otherwise the runtime uses **paste-back flow** (standard auth-code + PKCE). This is the universal fallback for OAuth 2.1 servers without a device endpoint (Linear, Notion, Figma, Sentry, ...). + +RFC 8414 dynamic discovery (`/.well-known/oauth-authorization-server`) is **disabled by default**. Operators opt in per-server via `oauth.discovery: true` plus an explicit `oauth.discovery_allowlist` of permitted domains (e.g. `["*.anthropic.com"]`); boot rejects `discovery: true` without an allowlist. Rationale: awsvpc egress restrictions + SSRF surface in multi-tenant deployments. + +**Device-code flow** (typically platform OAuth: Anthropic, OpenAI, xAI): + +- `login` returns `{ flow: "device", user_code, verification_url, expires_in }`. Agent relays to chat: "Open `https://example.com/device`, enter code: `ABCD-EFGH`". +- The runtime polls the token endpoint in the background (5s interval, RFC 8628 §3.5). On success it persists tokens under `mcp:X` and transitions the server to `Connected`. +- LLM checks `mcp(action: "status", server: X)` to learn when ready; `complete_login` is not required for this branch. 
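The selection rule and the discovery allowlist check described above can be sketched in a few lines of std-only Rust. This is a minimal illustration under stated assumptions, not the shipped implementation: the `OAuthConfig` struct, the `LoginFlow` enum, and the `validate`, `select_flow`, and `host_allowed` functions are hypothetical names, and the allowlist matcher only handles exact hosts and a leading `*.` wildcard, since the ADR does not specify the glob semantics.

```rust
/// Hypothetical, simplified view of a server's `oauth` config block (§6.3).
/// Field names follow the ADR; the struct itself is an assumption.
#[derive(Debug, Default)]
pub struct OAuthConfig {
    pub device_authorization_endpoint: Option<String>,
    pub discovery: bool,
    pub discovery_allowlist: Vec<String>,
}

/// The two non-browser login flows described in §6.4.
#[derive(Debug, PartialEq, Eq)]
pub enum LoginFlow {
    DeviceCode,
    PasteBack,
}

/// Boot-time check: `discovery: true` without an allowlist is rejected.
pub fn validate(cfg: &OAuthConfig) -> Result<(), String> {
    if cfg.discovery && cfg.discovery_allowlist.is_empty() {
        return Err("oauth.discovery requires oauth.discovery_allowlist".to_string());
    }
    Ok(())
}

/// Flow selection on `login`: device-code when an endpoint is declared,
/// otherwise the paste-back fallback.
pub fn select_flow(cfg: &OAuthConfig) -> LoginFlow {
    if cfg.device_authorization_endpoint.is_some() {
        LoginFlow::DeviceCode
    } else {
        LoginFlow::PasteBack
    }
}

/// Allowlist match for a discovery host. Only exact names and a leading
/// `*.` wildcard are handled; anything richer is outside this sketch.
pub fn host_allowed(host: &str, allowlist: &[String]) -> bool {
    allowlist.iter().any(|pat| match pat.strip_prefix("*.") {
        // `*.example.com` matches `a.example.com` but not `example.com`
        // and not `evilexample.com` (the byte before the suffix must be a dot).
        Some(suffix) => {
            host.len() > suffix.len() + 1
                && host.ends_with(suffix)
                && host.as_bytes()[host.len() - suffix.len() - 1] == b'.'
        }
        None => host == pat.as_str(),
    })
}
```

A caller would run `validate` once at config load, then call `select_flow` when handling `mcp(action: "login", ...)`. Keeping the wildcard check strict about the dot boundary is a deliberate choice in this sketch, because a plain suffix match would let a look-alike domain through the allowlist.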
+ +**Paste-back flow** (typically MCP SaaS: Linear, Notion, Figma, ...): + +- `login` returns `{ flow: "paste", authorize_url, state }`. Runtime persists transient `{verifier, state}` in TokenStore. Agent relays to chat: "Open this link, sign in, paste the URL you land on back here". +- User pastes the URL as next chat message; LLM calls `mcp(action: "complete_login", server: X, redirect_url: "...")`. +- Runtime parses `code` + `state`, validates `state`, performs PKCE token exchange against `token_url`, persists tokens under `mcp:X`, drops transient state. + +**Security** (both flows): + +- Device-code `user_code` is short-lived (RFC 8628 §3.2, typically ≤10 min); an attacker who sees the code in chat must also race the polling loop and prove device ownership. +- Paste-back redirect URL carries only the authorization code (OAuth 2.1 PKCE; implicit/hybrid removed); code is single-use + ≤10 min; PKCE verifier held in-process makes intercepted codes unusable. +- Token exchange happens entirely inside the agent process; the chat channel never carries access or refresh tokens. Refresh rotation runs in-process per §6.1. + +`openab-agent/src/auth.rs` already ships all three paths for Codex OAuth (browser L150-244, paste-back L165-201, device L328-440). This ADR generalizes that pattern across MCP servers and centralizes flow selection on per-server capability rather than per-CLI hard-coding. OpenHands notes the same headless-OAuth incompatibility (§3.5) without shipping a fix. + +--- + +## 7. Memory Impact Analysis + +Included because the sidecar alternative (§4.1 B) was motivated by memory. + +`openab-agent` baseline is 15-40 MB RSS. `rmcp` with the §5.5 feature set adds +1-2 MB binary delta and +0 MB idle RSS (no servers configured). Once servers connect, child processes dominate: Go ~10-20 MB, Rust ~5-10 MB, Python/Node ~30-80 MB each. + +| Aspect | A. Naïve flat | B. Sidecar | **C. 
In-core + meta-tool** | +|---|---|---|---| +| Idle RAM delta | +1-2 MB | +0 MB | +1-2 MB | +| Per-server RAM | +5-80 MB (child) | +15 MB bridge + 5-80 MB | +5-80 MB | +| System prompt tokens | +17,000 | +600 (if sidecar discloses lazily) | +600 | +| Lifecycle complexity | Medium | High (2 procs, IPC, version skew) | Medium | +| Crash blast radius | Bad server kills loop | Bridge crash = all gone | Bad server isolated | + +The 1-2 MB sidecar saving is dominated by per-server child RAM (identical across architectures) and by token cost (identical *as long as progressive disclosure is used*). Memory does not justify the sidecar. + +**Constrained-environment note (Fargate / small Kubernetes pods).** Fargate Spot tasks at 512 MB / 1 GB have no swap; OOMKill is hard. Worst-case stack — agent baseline 40 MB + 5 Node/Python stdio servers at 80 MB each + LLM context buffers — sums to ~440-540 MB, which trips a 512 MB task before any prompt processing. Two mitigations: (a) lower `max_concurrent_servers` to 3 in `mcp.json` (§5.7), bounding worst case to ~280 MB; (b) prefer Go/Rust stdio servers (5-20 MB) or HTTP servers (0 MB local) over Node/Python interpreters. The `mcp doctor` CLI (§8) flags configurations whose worst-case sum exceeds the cgroup limit. + +--- + +## 8. CLI Surface + +``` +openab-agent mcp list — show configured servers + status +openab-agent mcp status [server] — health, last error, OAuth state +openab-agent mcp add — append a stdio server to config +openab-agent mcp add --url — append an http server +openab-agent mcp remove — remove a server from config +openab-agent mcp login [--browser] — run OAuth flow (see §6.4; --browser opts into localhost callback) +openab-agent mcp refresh — force-refresh OAuth token +openab-agent mcp test [json] — invoke a tool from CLI (debug) +openab-agent mcp doctor — diagnose config, network, auth +``` + +Subcommand placement under existing `openab-agent` binary — no new binary. 
CLI is a thin wrapper over `McpRuntimeManager` to keep the same code path validated by both LLM-driven and human-driven flows. + +--- + +## 9. Rollout Plan + +~6 weeks across three phases: + +1. **Foundation (3w)** — `rmcp` + stdio + meta-tool + minimal CLI, behind `--features mcp` +2. **Network & auth (2w)** — Streamable HTTP transport + OAuth providers + `login`/`refresh` CLI; promote flag default-on +3. **Resilience (1w)** — circuit breaker + `doctor` CLI; remove flag + +Week-by-week task breakdown lives on the tracking issue (filed at PR open). + +--- + +## 10. Open Questions + +1. **Should `mcp.json` live in the agent or the broker?** Agent owns its own config today; broker's `[agent].inherit_cloud_mcp_servers` (issue #753) is a separate concern. Proposal: agent reads `mcp.json` directly; broker can layer additional servers via env or kubectl ConfigMap. **Owner**: needs broker-team alignment. +2. **Native-agent feature parity with broker-forward path.** PRs #329/#330/#345/#903 attempted broker-side MCP forwarding to backing CLIs. With native MCP in openab-agent, do we deprecate that path, keep it for non-native CLIs, or unify? Proposal: native agent uses its own MCP runtime; broker continues to forward to backing CLIs that lack native MCP (Cursor, Copilot). **Owner**: broker-team. + +Resolved at design time (tracked in tracking issue, not open): tool-naming prefix (`_` single-underscore, matching Hermes §3.1 / opencode §3.2 convention), `session/load` re-enumeration (process-local state, re-read), per-tool permission gates (post-Phase-3 opt-in flag), `resources`/`prompts` capabilities (v2). + +--- + +## 11. 
References + +### Internal + +- `docs/adr/openab-agent.md` — agent charter, design principles cited in §4.2 +- `docs/adr/pr-contribution-guidelines.md` — prior-art requirements followed in §3 +- `openab-agent/src/skills.rs` (PR #955) — extension-pattern precedent cited in §4.3 +- `openab-agent/src/auth.rs` — TokenStore reused in §6.1 +- PRs #329, #330, #345, #903 — closed broker-forward attempts, §1.3 +- Issue #753 — broker-side MCP opt-out (out of scope) +- PR #951 — SessionPool persisted-mapping fix (informs §10 resolved-at-design-time list) + +### External — projects + +- Hermes Agent: https://github.com/NousResearch/hermes-agent +- opencode: https://github.com/anomalyco/opencode (formerly https://github.com/sst/opencode) +- pi-mcp-adapter: https://github.com/nicobailon/pi-mcp-adapter +- Goose: https://github.com/aaif-goose/goose (formerly https://github.com/block/goose) +- OpenHands: https://github.com/OpenHands/OpenHands +- rmcp: https://github.com/modelcontextprotocol/rust-sdk +- OpenClaw (evaluated per `pr-contribution-guidelines.md`, scope not applicable — see §3; canonical repo URL not publicly resolvable, internal reference via avasdream blog cited in guidelines) + +### External — specs & research + +- MCP spec: https://modelcontextprotocol.io +- MCP spec changelog 2025-11-25 (Streamable HTTP supersedes HTTP+SSE): https://modelcontextprotocol.io/specification/2025-11-25/basic/transports +- MCP SEP-1576 — Mitigating Token Bloat in MCP: https://github.com/modelcontextprotocol/modelcontextprotocol/issues/1576 +- Atlassian Rovo MCP SSE→Streamable HTTP migration notice (sunset 2026-06-30): https://community.atlassian.com/forums/Rovo-articles/Migrating-from-Atlassian-s-MCP-Server-SSE-to-Streamable-HTTP/ba-p/3092878 +- Figma MCP server (Streamable HTTP): https://help.figma.com/hc/en-us/articles/32132100833559-Guide-to-the-Dev-Mode-MCP-Server +- Anthropic — Equipping agents for the real world with Agent Skills: 
https://www.anthropic.com/engineering/equipping-agents-for-the-real-world-with-agent-skills +- Anthropic — Code execution with MCP: https://www.anthropic.com/engineering/code-execution-with-mcp +- Simon Willison — Claude Skills (2025-10-16): https://simonwillison.net/2025/Oct/16/claude-skills/ +- StackOne — MCP Token Optimization: https://www.stackone.com/blog/mcp-token-optimization/ +- opencode issues cited in §3.2, §4.1, §7: #11868, #7261, #13041 diff --git a/openab-agent/Cargo.lock b/openab-agent/Cargo.lock index 5f878017f..42ed19750 100644 --- a/openab-agent/Cargo.lock +++ b/openab-agent/Cargo.lock @@ -11,6 +11,15 @@ dependencies = [ "memchr", ] +[[package]] +name = "android_system_properties" +version = "0.1.5" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "819e7219dbd41043ac279b19830f2efc897156490d7fd6ea916720117ee66311" +dependencies = [ + "libc", +] + [[package]] name = "anstream" version = "1.0.0" @@ -67,12 +76,29 @@ version = "1.0.102" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "7f202df86484c868dbad7eaa557ef785d5c66295e41b460ef922eca0723b842c" +[[package]] +name = "async-trait" +version = "0.1.89" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "9035ad2d096bed7955a320ee7e2230574d28fd3c3a0f186cbea1ff3c7eed5dbb" +dependencies = [ + "proc-macro2", + "quote", + "syn", +] + [[package]] name = "atomic-waker" version = "1.1.2" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "1505bd5d3d116872e7271a6d4e16d81d0c8570876c8de68093a09ac269d8aac0" +[[package]] +name = "autocfg" +version = "1.5.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "f2032f911046de80f0a198e0901378627c33f59ea0ac00e363d481118bd70a53" + [[package]] name = "base64" version = "0.22.1" @@ -85,6 +111,24 @@ version = "2.11.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = 
"c4512299f36f043ab09a583e57bceb5a5aab7a73db1805848e8fef3c9e8c78b3" +[[package]] +name = "block-buffer" +version = "0.10.4" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "3078c7629b62d3f0439517fa394996acacc5cbc91c5a20d8c658e77abd503a71" +dependencies = [ + "generic-array", +] + +[[package]] +name = "block-buffer" +version = "0.12.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "cdd35008169921d80bc60d3d0ab416eecb028c4cd653352907921d95084790be" +dependencies = [ + "hybrid-array", +] + [[package]] name = "bumpalo" version = "3.20.3" @@ -99,9 +143,9 @@ checksum = "1e748733b7cbc798e1434b6ac524f0c1ff2ab456fe201501e6497c8417a4fc33" [[package]] name = "cc" -version = "1.2.62" +version = "1.2.63" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "a1dce859f0832a7d088c4f1119888ab94ef4b5d6795d1ce05afb7fe159d79f98" +checksum = "556e016178bb5662a08681bbe0f00f8e17631781a4dfc8c45e466e4b185ec27f" dependencies = [ "find-msvc-tools", "shlex", @@ -119,6 +163,20 @@ version = "0.2.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "613afe47fcd5fac7ccf1db93babcb082c5994d996f20b8b159f2ad1658eb5724" +[[package]] +name = "chrono" +version = "0.4.44" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "c673075a2e0e5f4a1dde27ce9dee1ea4558c7ffe648f576438a20ca1d2acc4b0" +dependencies = [ + "iana-time-zone", + "js-sys", + "num-traits", + "serde", + "wasm-bindgen", + "windows-link", +] + [[package]] name = "clap" version = "4.6.1" @@ -165,11 +223,81 @@ version = "1.0.5" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "1d07550c9036bf2ae0c684c4297d503f838287c83c53686d05370d0e139ae570" +[[package]] +name = "const-oid" +version = "0.10.2" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "a6ef517f0926dd24a1582492c791b6a4818a4d94e789a334894aa15b0d12f55c" + +[[package]] +name = "core-foundation-sys" 
+version = "0.8.7" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "773648b94d0e5d620f64f280777445740e61fe701025087ec8b57f45c791888b" + +[[package]] +name = "cpufeatures" +version = "0.2.17" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "59ed5838eebb26a2bb2e58f6d5b5316989ae9d08bab10e0e6d103e656d1b0280" +dependencies = [ + "libc", +] + +[[package]] +name = "cpufeatures" +version = "0.3.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "8b2a41393f66f16b0823bb79094d54ac5fbd34ab292ddafb9a0456ac9f87d201" +dependencies = [ + "libc", +] + +[[package]] +name = "crypto-common" +version = "0.1.7" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "78c8292055d1c1df0cce5d180393dc8cce0abec0a7102adb6c7b1eef6016d60a" +dependencies = [ + "generic-array", + "typenum", +] + +[[package]] +name = "crypto-common" +version = "0.2.2" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "ce6e4c961d6cd6c9a86db418387425e8bdeaf05b3c8bc1411e6dca4c252f1453" +dependencies = [ + "hybrid-array", +] + +[[package]] +name = "digest" +version = "0.10.7" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "9ed9a281f7bc9b7576e61468ba615a66a5c8cfdff42420a70aa82701a3b1e292" +dependencies = [ + "block-buffer 0.10.4", + "crypto-common 0.1.7", +] + +[[package]] +name = "digest" +version = "0.11.3" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "f1dd6dbb5841937940781866fa1281a1ff7bd3bf827091440879f9994983d5c2" +dependencies = [ + "block-buffer 0.12.0", + "const-oid", + "crypto-common 0.2.2", +] + [[package]] name = "displaydoc" -version = "0.2.5" +version = "0.2.6" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "97369cbbc041bc366949bc74d34658d6cda5621039731c6310521892a3a20ae0" +checksum = "1ac70aa55017e108007fbaf5aa0f54b021c98f92ff8af59d42eda9da96e3dd4f" dependencies = [ 
"proc-macro2", "quote", @@ -219,6 +347,21 @@ dependencies = [ "percent-encoding", ] +[[package]] +name = "futures" +version = "0.3.32" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "8b147ee9d1f6d097cef9ce628cd2ee62288d963e16fb287bd9286455b241382d" +dependencies = [ + "futures-channel", + "futures-core", + "futures-executor", + "futures-io", + "futures-sink", + "futures-task", + "futures-util", +] + [[package]] name = "futures-channel" version = "0.3.32" @@ -226,6 +369,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "07bbe89c50d7a535e539b8c17bc0b49bdb77747034daa8087407d655f3f7cc1d" dependencies = [ "futures-core", + "futures-sink", ] [[package]] @@ -234,6 +378,40 @@ version = "0.3.32" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "7e3450815272ef58cec6d564423f6e755e25379b217b0bc688e295ba24df6b1d" +[[package]] +name = "futures-executor" +version = "0.3.32" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "baf29c38818342a3b26b5b923639e7b1f4a61fc5e76102d4b1981c6dc7a7579d" +dependencies = [ + "futures-core", + "futures-task", + "futures-util", +] + +[[package]] +name = "futures-io" +version = "0.3.32" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "cecba35d7ad927e23624b22ad55235f2239cfa44fd10428eecbeba6d6a717718" + +[[package]] +name = "futures-macro" +version = "0.3.32" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "e835b70203e41293343137df5c0664546da5745f82ec9b84d40be8336958447b" +dependencies = [ + "proc-macro2", + "quote", + "syn", +] + +[[package]] +name = "futures-sink" +version = "0.3.32" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "c39754e157331b013978ec91992bde1ac089843443c49cbc7f46150b0fad0893" + [[package]] name = "futures-task" version = "0.3.32" @@ -246,12 +424,27 @@ version = "0.3.32" source = 
"registry+https://github.com/rust-lang/crates.io-index" checksum = "389ca41296e6190b48053de0321d02a77f32f8a5d2461dd38762c0593805c6d6" dependencies = [ + "futures-channel", "futures-core", + "futures-io", + "futures-macro", + "futures-sink", "futures-task", + "memchr", "pin-project-lite", "slab", ] +[[package]] +name = "generic-array" +version = "0.14.7" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "85649ca51fd72272d7821adaf274ad91c288277713d9c18820d8499a7ff69e9a" +dependencies = [ + "typenum", + "version_check", +] + [[package]] name = "getrandom" version = "0.2.17" @@ -352,11 +545,20 @@ version = "1.10.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "6dbf3de79e51f3d586ab4cb9d5c3e2c14aa28ed23d180cf89b4df0454a69cc87" +[[package]] +name = "hybrid-array" +version = "0.4.12" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "9155a582abd142abc056962c29e3ce5ff2ad5469f4246b537ed42c5deba857da" +dependencies = [ + "typenum", +] + [[package]] name = "hyper" -version = "1.9.0" +version = "1.10.1" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "6299f016b246a94207e63da54dbe807655bf9e00044f73ded42c3ac5305fbcca" +checksum = "55281c53a1894c864990125767da440a4e630446785086f52523b20033b74498" dependencies = [ "atomic-waker", "bytes", @@ -411,6 +613,30 @@ dependencies = [ "tracing", ] +[[package]] +name = "iana-time-zone" +version = "0.1.65" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "e31bc9ad994ba00e440a8aa5c9ef0ec67d5cb5e5cb0cc7f8b744a35b389cc470" +dependencies = [ + "android_system_properties", + "core-foundation-sys", + "iana-time-zone-haiku", + "js-sys", + "log", + "wasm-bindgen", + "windows-core", +] + +[[package]] +name = "iana-time-zone-haiku" +version = "0.1.2" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "f31827a206f56af32e590ba56d5d2d085f558508192593743f16b2306495269f" +dependencies = [ 
+ "cc", +] + [[package]] name = "icu_collections" version = "2.2.0" @@ -538,6 +764,25 @@ version = "2.12.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "d98f6fed1fde3f8c21bc40a1abb88dd75e67924f9cffc3ef95607bad8017f8e2" +[[package]] +name = "is-docker" +version = "0.2.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "928bae27f42bc99b60d9ac7334e3a21d10ad8f1835a4e12ec3ec0464765ed1b3" +dependencies = [ + "once_cell", +] + +[[package]] +name = "is-wsl" +version = "0.4.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "173609498df190136aa7dea1a91db051746d339e18476eed5ca40521f02d7aa5" +dependencies = [ + "is-docker", + "once_cell", +] + [[package]] name = "is_terminal_polyfill" version = "1.70.2" @@ -624,21 +869,33 @@ dependencies = [ [[package]] name = "memchr" -version = "2.8.0" +version = "2.8.1" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "f8ca58f447f06ed17d5fc4043ce1b10dd205e060fb3ce5b979b8ed8e59ff3f79" +checksum = "6b947ae49db0d222b1dbc6b113ce7248a3fc3a6ca21b696717bfc000ba4484d8" [[package]] name = "mio" -version = "1.2.0" +version = "1.2.1" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "50b7e5b27aa02a74bac8c3f23f448f8d87ff11f92d3aac1a6ed369ee08cc56c1" +checksum = "02bd0af71c67b473010cbbc60715ee815645a4dc942899111f494b4b737d6fda" dependencies = [ "libc", "wasi", "windows-sys 0.61.2", ] +[[package]] +name = "nix" +version = "0.31.3" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "cf20d2fde8ff38632c426f1165ed7436270b44f199fc55284c38276f9db47c3d" +dependencies = [ + "bitflags", + "cfg-if", + "cfg_aliases", + "libc", +] + [[package]] name = "nu-ansi-term" version = "0.50.3" @@ -648,6 +905,34 @@ dependencies = [ "windows-sys 0.61.2", ] +[[package]] +name = "num-traits" +version = "0.2.19" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = 
"071dfc062690e90b734c0b2273ce72ad0ffa95f0c74596bc250dcfd960262841" +dependencies = [ + "autocfg", +] + +[[package]] +name = "oauth2" +version = "5.0.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "51e219e79014df21a225b1860a479e2dcd7cbd9130f4defd4bd0e191ea31d67d" +dependencies = [ + "base64", + "chrono", + "getrandom 0.2.17", + "http", + "rand 0.8.6", + "serde", + "serde_json", + "serde_path_to_error", + "sha2 0.10.9", + "thiserror 1.0.69", + "url", +] + [[package]] name = "once_cell" version = "1.21.4" @@ -660,20 +945,39 @@ version = "1.70.2" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "384b8ab6d37215f3c5301a95a4accb5d64aa607f1fcb26a11b5303878451b4fe" +[[package]] +name = "open" +version = "5.3.5" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "2fbaa89d2ddc8473c78a3adf69eea8cffa28c483b8e02a971ef31527cd0fc92c" +dependencies = [ + "is-wsl", + "libc", + "pathdiff", +] + [[package]] name = "openab-agent" version = "0.1.0" dependencies = [ "anyhow", + "base64", "clap", + "getrandom 0.4.2", "libc", - "reqwest", + "open", + "reqwest 0.12.28", + "rmcp", "serde", "serde_json", + "sha2 0.11.0", + "temp-env", "tempfile", "tokio", "tracing", "tracing-subscriber", + "url", + "urlencoding", "uuid", ] @@ -700,6 +1004,12 @@ dependencies = [ "windows-link", ] +[[package]] +name = "pathdiff" +version = "0.2.3" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "df94ce210e5bc13cb6651479fa48d14f601d9858cfe0467f43ae157023b938d3" + [[package]] name = "percent-encoding" version = "2.3.2" @@ -749,6 +1059,20 @@ dependencies = [ "unicode-ident", ] +[[package]] +name = "process-wrap" +version = "9.1.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "2e842efad9119158434d193c6682e2ebee4b44d6ad801d7b349623b3f57cdf55" +dependencies = [ + "futures", + "indexmap", + "nix", + "tokio", + "tracing", + "windows", +] + [[package]] name = "quinn" 
version = "0.11.9" @@ -763,7 +1087,7 @@ dependencies = [ "rustc-hash", "rustls", "socket2", - "thiserror", + "thiserror 2.0.18", "tokio", "tracing", "web-time", @@ -778,13 +1102,13 @@ dependencies = [ "bytes", "getrandom 0.3.4", "lru-slab", - "rand", + "rand 0.9.4", "ring", "rustc-hash", "rustls", "rustls-pki-types", "slab", - "thiserror", + "thiserror 2.0.18", "tinyvec", "tracing", "web-time", @@ -825,14 +1149,35 @@ version = "6.0.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "f8dcc9c7d52a811697d2151c701e0d08956f92b0e24136cf4cf27b57a6a0d9bf" +[[package]] +name = "rand" +version = "0.8.6" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "5ca0ecfa931c29007047d1bc58e623ab12e5590e8c7cc53200d5202b69266d8a" +dependencies = [ + "libc", + "rand_chacha 0.3.1", + "rand_core 0.6.4", +] + [[package]] name = "rand" version = "0.9.4" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "44c5af06bb1b7d3216d91932aed5265164bf384dc89cd6ba05cf59a35f5f76ea" dependencies = [ - "rand_chacha", - "rand_core", + "rand_chacha 0.9.0", + "rand_core 0.9.5", +] + +[[package]] +name = "rand_chacha" +version = "0.3.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "e6c10a63a0fa32252be49d21e7709d4d4baf8d231c2dbce1eaa8141b9b127d88" +dependencies = [ + "ppv-lite86", + "rand_core 0.6.4", ] [[package]] @@ -842,7 +1187,16 @@ source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "d3022b5f1df60f26e1ffddd6c66e8aa15de382ae63b3a0c1bfc0e4d3e3f325cb" dependencies = [ "ppv-lite86", - "rand_core", + "rand_core 0.9.5", +] + +[[package]] +name = "rand_core" +version = "0.6.4" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "ec0be4795e2f6a28069bec0b5ff3e2ac9bafc99e6a9a7dc3547996c5c816922c" +dependencies = [ + "getrandom 0.2.17", ] [[package]] @@ -918,6 +1272,40 @@ dependencies = [ "webpki-roots", ] +[[package]] +name = "reqwest" +version = "0.13.4" 
+source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "219c5811de6525e5416c7d5d53bb656d3afdbc6c5af816e0802bcfa42dbdc1c3" +dependencies = [ + "base64", + "bytes", + "futures-core", + "futures-util", + "http", + "http-body", + "http-body-util", + "hyper", + "hyper-util", + "js-sys", + "log", + "percent-encoding", + "pin-project-lite", + "serde", + "serde_json", + "sync_wrapper", + "tokio", + "tokio-util", + "tower", + "tower-http", + "tower-service", + "url", + "wasm-bindgen", + "wasm-bindgen-futures", + "wasm-streams", + "web-sys", +] + [[package]] name = "ring" version = "0.17.14" @@ -932,6 +1320,31 @@ dependencies = [ "windows-sys 0.52.0", ] +[[package]] +name = "rmcp" +version = "1.7.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "0810a9f717d9828f475fe1f629f4c305c8464b7f496c3a854b58d29e65f4058e" +dependencies = [ + "async-trait", + "chrono", + "futures", + "http", + "oauth2", + "pin-project-lite", + "process-wrap", + "reqwest 0.13.4", + "serde", + "serde_json", + "sse-stream", + "thiserror 2.0.18", + "tokio", + "tokio-stream", + "tokio-util", + "tracing", + "url", +] + [[package]] name = "rustc-hash" version = "2.1.2" @@ -1053,6 +1466,17 @@ dependencies = [ "zmij", ] +[[package]] +name = "serde_path_to_error" +version = "0.1.20" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "10a9ff822e371bb5403e391ecd83e182e0e77ba7f6fe0160b795797109d1b457" +dependencies = [ + "itoa", + "serde", + "serde_core", +] + [[package]] name = "serde_urlencoded" version = "0.7.1" @@ -1065,6 +1489,28 @@ dependencies = [ "serde", ] +[[package]] +name = "sha2" +version = "0.10.9" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "a7507d819769d01a365ab707794a4084392c824f54a7a6a7862f8c3d0892b283" +dependencies = [ + "cfg-if", + "cpufeatures 0.2.17", + "digest 0.10.7", +] + +[[package]] +name = "sha2" +version = "0.11.0" +source = 
"registry+https://github.com/rust-lang/crates.io-index" +checksum = "446ba717509524cb3f22f17ecc096f10f4822d76ab5c0b9822c5f9c284e825f4" +dependencies = [ + "cfg-if", + "cpufeatures 0.3.0", + "digest 0.11.3", +] + [[package]] name = "sharded-slab" version = "0.1.7" @@ -1076,9 +1522,9 @@ dependencies = [ [[package]] name = "shlex" -version = "1.3.0" +version = "2.0.1" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "0fda2ff0d084019ba4d7c6f371c95d8fd75ce3524c3cb8fb653a3023f6323e64" +checksum = "f8fadd59c855ef2080decdef8ff161eb6661b86933c9d82e5ba29dc602a55aba" [[package]] name = "signal-hook-registry" @@ -1104,14 +1550,27 @@ checksum = "67b1b7a3b5fe4f1376887184045fcf45c69e92af734b7aaddc05fb777b6fbd03" [[package]] name = "socket2" -version = "0.6.3" +version = "0.6.4" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "3a766e1110788c36f4fa1c2b71b387a7815aa65f88ce0229841826633d93723e" +checksum = "52d1cfed4120b4d927bf7c0f86d2087a4a7d6027c906d9f9d525a80573b9be51" dependencies = [ "libc", "windows-sys 0.61.2", ] +[[package]] +name = "sse-stream" +version = "0.2.3" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "f3962b63f038885f15bce2c6e02c0e7925c072f1ac86bb60fd44c5c6b762fb72" +dependencies = [ + "bytes", + "futures-util", + "http-body", + "http-body-util", + "pin-project-lite", +] + [[package]] name = "stable_deref_trait" version = "1.2.1" @@ -1161,6 +1620,15 @@ dependencies = [ "syn", ] +[[package]] +name = "temp-env" +version = "0.3.6" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "96374855068f47402c3121c6eed88d29cb1de8f3ab27090e273e420bdabcf050" +dependencies = [ + "parking_lot", +] + [[package]] name = "tempfile" version = "3.27.0" @@ -1174,13 +1642,33 @@ dependencies = [ "windows-sys 0.61.2", ] +[[package]] +name = "thiserror" +version = "1.0.69" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = 
"b6aaf5339b578ea85b50e080feb250a3e8ae8cfcdff9a461c9ec2904bc923f52" +dependencies = [ + "thiserror-impl 1.0.69", +] + [[package]] name = "thiserror" version = "2.0.18" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "4288b5bcbc7920c07a1149a35cf9590a2aa808e0bc1eafaade0b80947865fbc4" dependencies = [ - "thiserror-impl", + "thiserror-impl 2.0.18", +] + +[[package]] +name = "thiserror-impl" +version = "1.0.69" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "4fee6c4efc90059e10f81e6d42c60a18f76588c3d74cb83a0b242a2b6c7504c1" +dependencies = [ + "proc-macro2", + "quote", + "syn", ] [[package]] @@ -1266,6 +1754,30 @@ dependencies = [ "tokio", ] +[[package]] +name = "tokio-stream" +version = "0.1.18" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "32da49809aab5c3bc678af03902d4ccddea2a87d028d86392a4b1560c6906c70" +dependencies = [ + "futures-core", + "pin-project-lite", + "tokio", +] + +[[package]] +name = "tokio-util" +version = "0.7.18" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "9ae9cec805b01e8fc3fd2fe289f89149a9b66dd16786abd8b19cfa7b48cb0098" +dependencies = [ + "bytes", + "futures-core", + "futures-sink", + "pin-project-lite", + "tokio", +] + [[package]] name = "tower" version = "0.5.3" @@ -1378,6 +1890,12 @@ version = "0.2.5" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "e421abadd41a4225275504ea4d6566923418b7f05506fbc9c0fe86ba7396114b" +[[package]] +name = "typenum" +version = "1.20.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "b6f5e870be6c3b371b77fe0ee0bafb859fa4964b4404c27de1d380043c4dda20" + [[package]] name = "unicode-ident" version = "1.0.24" @@ -1406,8 +1924,15 @@ dependencies = [ "idna", "percent-encoding", "serde", + "serde_derive", ] +[[package]] +name = "urlencoding" +version = "2.1.3" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = 
"daf8dba3b7eb870caf1ddeed7bc9d2a049f3cfdfae7cb521b087cc33ae4c49da" + [[package]] name = "utf8_iter" version = "1.0.4" @@ -1422,9 +1947,9 @@ checksum = "06abde3611657adf66d383f00b093d7faecc7fa57071cce2578660c9f1010821" [[package]] name = "uuid" -version = "1.23.1" +version = "1.23.2" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "ddd74a9687298c6858e9b88ec8935ec45d22e8fd5e6394fa1bd4e99a87789c76" +checksum = "d258b83ceec21034727ecee8c382cfa6c3e133699b0742c64571814fb420c9f7" dependencies = [ "getrandom 0.4.2", "js-sys", @@ -1437,6 +1962,12 @@ version = "0.1.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "ba73ea9cf16a25df0c8caa16c51acb937d5712a8429db78a3ee29d5dcacd3a65" +[[package]] +name = "version_check" +version = "0.9.5" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "0b928f33d975fc6ad9f86c8f283853ad26bdd5b10b7f1542aa2fa15e2289105a" + [[package]] name = "want" version = "0.3.1" @@ -1547,6 +2078,19 @@ dependencies = [ "wasmparser", ] +[[package]] +name = "wasm-streams" +version = "0.5.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "9d1ec4f6517c9e11ae630e200b2b65d193279042e28edd4a2cda233e46670bbb" +dependencies = [ + "futures-util", + "js-sys", + "wasm-bindgen", + "wasm-bindgen-futures", + "web-sys", +] + [[package]] name = "wasmparser" version = "0.244.0" @@ -1588,12 +2132,107 @@ dependencies = [ "rustls-pki-types", ] +[[package]] +name = "windows" +version = "0.62.2" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "527fadee13e0c05939a6a05d5bd6eec6cd2e3dbd648b9f8e447c6518133d8580" +dependencies = [ + "windows-collections", + "windows-core", + "windows-future", + "windows-numerics", +] + +[[package]] +name = "windows-collections" +version = "0.3.2" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "23b2d95af1a8a14a3c7367e1ed4fc9c20e0a26e79551b1454d72583c97cc6610" +dependencies = [ + 
"windows-core", +] + +[[package]] +name = "windows-core" +version = "0.62.2" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "b8e83a14d34d0623b51dce9581199302a221863196a1dde71a7663a4c2be9deb" +dependencies = [ + "windows-implement", + "windows-interface", + "windows-link", + "windows-result", + "windows-strings", +] + +[[package]] +name = "windows-future" +version = "0.3.2" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "e1d6f90251fe18a279739e78025bd6ddc52a7e22f921070ccdc67dde84c605cb" +dependencies = [ + "windows-core", + "windows-link", + "windows-threading", +] + +[[package]] +name = "windows-implement" +version = "0.60.2" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "053e2e040ab57b9dc951b72c264860db7eb3b0200ba345b4e4c3b14f67855ddf" +dependencies = [ + "proc-macro2", + "quote", + "syn", +] + +[[package]] +name = "windows-interface" +version = "0.59.3" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "3f316c4a2570ba26bbec722032c4099d8c8bc095efccdc15688708623367e358" +dependencies = [ + "proc-macro2", + "quote", + "syn", +] + [[package]] name = "windows-link" version = "0.2.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "f0805222e57f7521d6a62e36fa9163bc891acd422f971defe97d64e70d0a4fe5" +[[package]] +name = "windows-numerics" +version = "0.3.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "6e2e40844ac143cdb44aead537bbf727de9b044e107a0f1220392177d15b0f26" +dependencies = [ + "windows-core", + "windows-link", +] + +[[package]] +name = "windows-result" +version = "0.4.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "7781fa89eaf60850ac3d2da7af8e5242a5ea78d1a11c49bf2910bb5a73853eb5" +dependencies = [ + "windows-link", +] + +[[package]] +name = "windows-strings" +version = "0.5.1" +source = "registry+https://github.com/rust-lang/crates.io-index" 
+checksum = "7837d08f69c77cf6b07689544538e017c1bfcf57e34b4c0ff58e6c2cd3b37091" +dependencies = [ + "windows-link", +] + [[package]] name = "windows-sys" version = "0.52.0" @@ -1654,6 +2293,15 @@ dependencies = [ "windows_x86_64_msvc 0.53.1", ] +[[package]] +name = "windows-threading" +version = "0.2.1" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "3949bd5b99cafdf1c7ca86b43ca564028dfe27d66958f2470940f73d86d75b37" +dependencies = [ + "windows-link", +] + [[package]] name = "windows_aarch64_gnullvm" version = "0.52.6" @@ -1875,18 +2523,18 @@ dependencies = [ [[package]] name = "zerocopy" -version = "0.8.48" +version = "0.8.50" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "eed437bf9d6692032087e337407a86f04cd8d6a16a37199ed57949d415bd68e9" +checksum = "3b065d4f0e55f82fae73202e189638116a87c55ab6b8e6c2721e13dd9d854ad1" dependencies = [ "zerocopy-derive", ] [[package]] name = "zerocopy-derive" -version = "0.8.48" +version = "0.8.50" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "70e3cd084b1788766f53af483dd21f93881ff30d7320490ec3ef7526d203bad4" +checksum = "0b631b19d36a892ab55420c92dbc83ccd79274f25be714855d3074aa71cab639" dependencies = [ "proc-macro2", "quote", diff --git a/openab-agent/Cargo.toml b/openab-agent/Cargo.toml index f059cfc6a..5b091a47e 100644 --- a/openab-agent/Cargo.toml +++ b/openab-agent/Cargo.toml @@ -21,9 +21,20 @@ getrandom = "0.4.2" urlencoding = "2.1.3" open = "5.3.5" url = "2.5.8" +rmcp = { version = "1.7", default-features = false, optional = true, features = [ + "client", + "transport-child-process", + "transport-streamable-http-client-reqwest", + "auth", +] } [target.'cfg(unix)'.dependencies] libc = "0.2" +[features] +default = [] +mcp = ["dep:rmcp"] + [dev-dependencies] tempfile = "3" +temp-env = "0.3.6" diff --git a/openab-agent/src/acp.rs b/openab-agent/src/acp.rs index 38054f25d..9585612b2 100644 --- a/openab-agent/src/acp.rs +++ 
b/openab-agent/src/acp.rs @@ -1,5 +1,7 @@ use crate::agent::Agent; use crate::llm::AnthropicProvider; +#[cfg(feature = "mcp")] +use crate::mcp::{self, McpRuntimeManager}; use serde::{Deserialize, Serialize}; use serde_json::{json, Value}; use std::collections::HashMap; @@ -35,6 +37,8 @@ pub struct AcpServer { // TODO(v0.2): add session TTL and periodic cleanup to prevent OOM sessions: HashMap, working_dir: String, + #[cfg(feature = "mcp")] + mcp_manager: Option, } impl AcpServer { @@ -44,6 +48,8 @@ impl AcpServer { working_dir: std::env::current_dir() .map(|p| p.to_string_lossy().to_string()) .unwrap_or_else(|_| "/tmp".to_string()), + #[cfg(feature = "mcp")] + mcp_manager: mcp::load_runtime_or_warn(), } } @@ -154,7 +160,12 @@ impl AcpServer { } }; - let agent = Agent::new_boxed(provider, self.working_dir.clone()); + let agent = Agent::new_boxed( + provider, + self.working_dir.clone(), + #[cfg(feature = "mcp")] + self.mcp_manager.clone(), + ); self.sessions.insert(session_id.clone(), agent); let resp = JsonRpcResponse { jsonrpc: "2.0", @@ -259,10 +270,16 @@ mod tests { #[test] fn test_session_new() { - // Set a fake key so from_env() succeeds in CI - unsafe { std::env::set_var("ANTHROPIC_API_KEY", "test-key") }; - let mut server = AcpServer::new(); - let resp_str = server.handle_session_new(2); + let resp_str = temp_env::with_vars( + [ + ("ANTHROPIC_API_KEY", Some("test-key")), + ("OPENAB_AGENT_PROVIDER", None), + ], + || { + let mut server = AcpServer::new(); + server.handle_session_new(2) + }, + ); let resp: Value = serde_json::from_str(&resp_str).unwrap(); assert_eq!(resp["jsonrpc"], "2.0"); assert_eq!(resp["id"], 2); @@ -271,14 +288,19 @@ mod tests { #[test] fn test_session_new_missing_key() { - // Ensure no OAuth token exists either - let auth_path = - std::path::PathBuf::from(std::env::var("HOME").unwrap_or_else(|_| "/tmp".to_string())) - .join(".openab/agent/auth.json"); - let _ = std::fs::remove_file(&auth_path); - unsafe { 
std::env::remove_var("ANTHROPIC_API_KEY") }; - let mut server = AcpServer::new(); - let resp_str = server.handle_session_new(3); + let tmp = tempfile::TempDir::new().unwrap(); + let home = tmp.path().to_string_lossy().to_string(); + let resp_str = temp_env::with_vars( + [ + ("ANTHROPIC_API_KEY", None), + ("OPENAB_AGENT_PROVIDER", None), + ("HOME", Some(home.as_str())), + ], + || { + let mut server = AcpServer::new(); + server.handle_session_new(3) + }, + ); let resp: Value = serde_json::from_str(&resp_str).unwrap(); assert!(resp["error"].is_object()); assert!(resp["error"]["message"] diff --git a/openab-agent/src/agent.rs b/openab-agent/src/agent.rs index b4a32d722..7a2b69c91 100644 --- a/openab-agent/src/agent.rs +++ b/openab-agent/src/agent.rs @@ -1,14 +1,18 @@ use anyhow::Result; +#[cfg(feature = "mcp")] +use serde::Deserialize; use std::path::PathBuf; use tracing::{debug, info}; use crate::llm::{ContentBlock, LlmEvent, LlmProvider, Message, ToolDef}; +#[cfg(feature = "mcp")] +use crate::mcp::{self, McpRuntimeManager}; use crate::skills; use crate::tools; const SYSTEM_PROMPT: &str = r#"You are openab-agent, a coding assistant. You help users by reading, writing, and editing files, and running shell commands. -You have 4 tools available: +You have these tools available: - read: Read file contents or list a directory - write: Create or overwrite a file - edit: Replace a string in a file (first occurrence) @@ -16,6 +20,13 @@ You have 4 tools available: Be direct and concise. Execute tasks immediately rather than explaining what you would do. When you need to understand code, read the relevant files first."#; +// The MCP system-prompt appendix is generated dynamically by +// `mcp::format_system_prompt_appendix(manager)` so the LLM sees both the +// `mcp` tool intro AND a server catalogue (PR #959 F1 discovery slice). 
+// Previously a static const here, but that hid the configured server names +// from the LLM and produced the "fs is disconnected, I give up" failure +// mode observed in the F1 PoC. + const MAX_TOOL_LOOPS: usize = 50; /// Maximum number of messages to keep in context. When exceeded, oldest /// messages (excluding the first user message) are dropped. @@ -27,35 +38,72 @@ pub struct Agent { working_dir: PathBuf, system_prompt: String, tools: Vec, + #[cfg(feature = "mcp")] + mcp_manager: Option, } impl Agent { #[cfg(test)] pub fn new(provider: impl LlmProvider + 'static, working_dir: String) -> Self { - let system_prompt = Self::build_system_prompt(&working_dir); + let system_prompt = Self::build_system_prompt( + &working_dir, + #[cfg(feature = "mcp")] + None, + ); Self { provider: Box::new(provider), messages: Vec::new(), working_dir: PathBuf::from(working_dir), system_prompt, tools: tools::tool_definitions(), + #[cfg(feature = "mcp")] + mcp_manager: None, } } - pub fn new_boxed(provider: Box, working_dir: String) -> Self { - let system_prompt = Self::build_system_prompt(&working_dir); + pub fn new_boxed( + provider: Box, + working_dir: String, + #[cfg(feature = "mcp")] mcp_manager: Option, + ) -> Self { + let system_prompt = Self::build_system_prompt( + &working_dir, + #[cfg(feature = "mcp")] + mcp_manager.as_ref(), + ); + #[cfg(feature = "mcp")] + let tools = { + let mut t = tools::tool_definitions(); + if mcp_manager.is_some() { + t.push(mcp::mcp_tool_def()); + } + t + }; + #[cfg(not(feature = "mcp"))] + let tools = tools::tool_definitions(); Self { provider, messages: Vec::new(), working_dir: PathBuf::from(working_dir), system_prompt, - tools: tools::tool_definitions(), + tools, + #[cfg(feature = "mcp")] + mcp_manager, } } - /// Run the agent with a user prompt, executing tool calls until completion. - /// Returns the final text response. - fn build_system_prompt(working_dir: &str) -> String { + /// Build the system prompt sent on every LLM call. 
Composition order: + /// 1. base prompt (`SYSTEM_PROMPT`, optionally prefixed by project-local + /// `AGENTS.md`), + /// 2. MCP appendix — tool intro + server catalogue (PR #959 F1 + /// discovery slice); only when `mcp_manager` is `Some`, + /// 3. skills catalogue. + /// + /// Built once at `Agent::new*` time and reused on every `call_llm`. + fn build_system_prompt( + working_dir: &str, + #[cfg(feature = "mcp")] mcp_manager: Option<&McpRuntimeManager>, + ) -> String { let wd = std::path::Path::new(working_dir); let agents_md = wd.join("AGENTS.md"); let custom = std::fs::read_to_string(&agents_md).unwrap_or_default(); @@ -66,6 +114,13 @@ impl Agent { format!("{}\n\n---\n\n{}", custom.trim(), SYSTEM_PROMPT) }; + #[cfg(feature = "mcp")] + let base = if let Some(mgr) = mcp_manager { + format!("{base}{}", mcp::format_system_prompt_appendix(mgr)) + } else { + base + }; + let discovered = skills::discover_skills(wd); if discovered.is_empty() { base @@ -140,7 +195,7 @@ impl Agent { let mut tool_results: Vec = Vec::new(); for (id, name, input) in &tool_calls { info!("executing tool: {name}"); - let result = tools::execute_tool(name, input, &self.working_dir).await; + let result = self.execute_tool_call(name, input).await; match result { Ok(output) => { tool_results.push(ContentBlock::ToolResult { @@ -178,12 +233,50 @@ impl Agent { /// first user message and maintaining strict user/assistant alternation. fn truncate_context(&mut self) { while self.messages.len() > MAX_CONTEXT_MESSAGES { - // Drain in pairs (assistant + user) from index 1 to maintain alternation - let end = (1 + 2).min(self.messages.len()); + // Drain a (assistant, user) pair from indices 1..3, preserving + // the original first user message at index 0 so user/assistant + // alternation stays intact. The `min()` guard is defensive — if + // the loop is ever entered with fewer than 3 messages, we drain + // whatever single tail message exists rather than panic. 
+ let end = 3.min(self.messages.len()); self.messages.drain(1..end); } } + /// Route the `mcp` meta-tool to the MCP runtime when configured; + /// everything else goes to the stateless `tools::execute_tool`. Keeping + /// the routing here (rather than inside `tools.rs`) lets `tools.rs` stay + /// stateless and free of MCP/feature plumbing. + async fn execute_tool_call(&self, name: &str, input: &serde_json::Value) -> Result<String> { + // Defensive guard (PR #959 chaodu F5): even though the `mcp` tool is + // only registered when a manager is loaded — and the system-prompt + // appendix is gated on `mcp_manager.is_some()` — a sufficiently + // creative LLM could still emit a `mcp(...)` tool call. Surface an + // actionable, non-fatal error so the loop continues instead of + // panicking or leaking an impl-detail message. + #[cfg(feature = "mcp")] + if name == mcp::MCP_TOOL_NAME { + let Some(manager) = self.mcp_manager.as_ref() else { + return Err(anyhow::anyhow!( + "tool `mcp` is not available in this session — \ + MCP runtime was not opted in (set `OPENAB_AGENT_MCP=true` \ + and configure `mcp.json`). Do not call `mcp` again." + )); + }; + let action = mcp::meta_tool::Action::deserialize(input) + .map_err(|e| anyhow::anyhow!("invalid mcp action payload: {e}"))?; + let value = mcp::meta_tool::dispatch(manager, action).await?; + return Ok(serde_json::to_string(&value)?); + } + #[cfg(not(feature = "mcp"))] + if name == "mcp" { + return Err(anyhow::anyhow!( + "tool `mcp` is not compiled into this build. Do not call `mcp` again."
+ )); + } + tools::execute_tool(name, input, &self.working_dir).await + } + async fn call_llm(&self) -> Result> { self.provider .chat(&self.system_prompt, &self.messages, &self.tools) @@ -298,6 +391,57 @@ mod tests { } } + #[cfg(feature = "mcp")] + #[test] + fn build_system_prompt_includes_mcp_catalogue_when_manager_provided() { + // PR #959 F1 discovery slice: when an MCP manager is wired in, the + // system prompt must surface the configured server catalogue so the + // LLM knows `list_tools` is worth calling (the "fs disconnected, I + // give up" failure mode the static const previously caused). + use crate::mcp::config::McpConfig; + let cfg: McpConfig = serde_json::from_str( + r#"{ + "mcpServers": { + "fs": { "type": "stdio", "command": "mcp-server-filesystem" }, + "linear": { + "type": "http", + "url": "https://mcp.linear.app/mcp", + "oauth": { "provider": "linear" } + } + } + }"#, + ) + .unwrap(); + let mgr = McpRuntimeManager::from_config(cfg); + + let tmp = tempfile::TempDir::new().unwrap(); + let prompt = Agent::build_system_prompt(&tmp.path().to_string_lossy(), Some(&mgr)); + + assert!( + prompt.contains("## MCP tool"), + "missing MCP section:\n{prompt}" + ); + assert!( + prompt.contains("**fs** (stdio)"), + "missing fs catalogue entry:\n{prompt}" + ); + assert!( + prompt.contains("requires `mcp login linear`"), + "missing OAuth login hint:\n{prompt}" + ); + } + + #[cfg(feature = "mcp")] + #[test] + fn build_system_prompt_omits_mcp_section_when_no_manager() { + let tmp = tempfile::TempDir::new().unwrap(); + let prompt = Agent::build_system_prompt(&tmp.path().to_string_lossy(), None); + assert!( + !prompt.contains("## MCP tool"), + "MCP section leaked into prompt without manager:\n{prompt}" + ); + } + #[tokio::test] #[ignore] // Integration test: executes real file tools async fn test_agent_multiple_tool_calls() { diff --git a/openab-agent/src/auth.rs b/openab-agent/src/auth.rs index 385ccede9..f34e681aa 100644 --- a/openab-agent/src/auth.rs +++ 
b/openab-agent/src/auth.rs @@ -517,15 +517,16 @@ mod tests { #[test] fn test_codex_client_id_default() { - unsafe { std::env::remove_var("OPENAB_AGENT_OAUTH_CLIENT_ID") }; - assert_eq!(codex_client_id(), "app_EMoamEEZ73f0CkXaXp7hrann"); + temp_env::with_var("OPENAB_AGENT_OAUTH_CLIENT_ID", None::<&str>, || { + assert_eq!(codex_client_id(), "app_EMoamEEZ73f0CkXaXp7hrann"); + }); } #[test] fn test_codex_client_id_override() { - unsafe { std::env::set_var("OPENAB_AGENT_OAUTH_CLIENT_ID", "custom_id") }; - assert_eq!(codex_client_id(), "custom_id"); - unsafe { std::env::remove_var("OPENAB_AGENT_OAUTH_CLIENT_ID") }; + temp_env::with_var("OPENAB_AGENT_OAUTH_CLIENT_ID", Some("custom_id"), || { + assert_eq!(codex_client_id(), "custom_id"); + }); } #[test] diff --git a/openab-agent/src/main.rs b/openab-agent/src/main.rs index a37693079..7acf6769b 100644 --- a/openab-agent/src/main.rs +++ b/openab-agent/src/main.rs @@ -2,6 +2,8 @@ mod acp; mod agent; mod auth; mod llm; +#[cfg(feature = "mcp")] +mod mcp; mod skills; mod tools; @@ -22,6 +24,32 @@ enum Commands { #[command(subcommand)] provider: AuthProvider, }, + /// Inspect / manage configured MCP servers + #[cfg(feature = "mcp")] + Mcp { + #[command(subcommand)] + action: McpAction, + }, +} + +#[cfg(feature = "mcp")] +#[derive(Subcommand)] +enum McpAction { + /// List configured MCP servers (loads global + project mcp.json) + List { + /// Substitute ${env:VAR} placeholders with real values. + /// WARNING: output will contain secrets if your config references + /// tokens via env vars — do not paste publicly. + #[arg(long)] + resolve: bool, + }, + /// Show per-server runtime status + Status, + /// Spawn the configured server and run the MCP handshake (smoke-test). 
+ Connect { + /// Server name as configured in mcp.json + name: String, + }, } #[derive(Subcommand)] @@ -70,5 +98,11 @@ async fn main() { auth::show_status(); } }, + #[cfg(feature = "mcp")] + Some(Commands::Mcp { action }) => match action { + McpAction::List { resolve } => mcp::cli_list_servers(resolve), + McpAction::Status => mcp::cli_show_status().await, + McpAction::Connect { name } => mcp::cli_connect(name).await, + }, } } diff --git a/openab-agent/src/mcp/config.rs b/openab-agent/src/mcp/config.rs new file mode 100644 index 000000000..fbb25087b --- /dev/null +++ b/openab-agent/src/mcp/config.rs @@ -0,0 +1,324 @@ +//! `mcpServers` config schema + loader. See ADR §5.6. +//! +//! Loaded from `.openab/agent/mcp.json` (project) and `~/.openab/agent/mcp.json` +//! (global), project entries take precedence on name collision. + +use std::collections::HashMap; +use std::path::{Path, PathBuf}; + +use anyhow::{anyhow, Context, Result}; +use serde::{Deserialize, Serialize}; + +#[derive(Debug, Default, Clone, Serialize, Deserialize)] +pub struct McpConfig { + #[serde(rename = "mcpServers", default)] + pub servers: HashMap, +} + +#[derive(Debug, Clone, Serialize, Deserialize)] +#[serde(tag = "type", rename_all = "snake_case")] +pub enum ServerConfig { + Stdio { + command: String, + #[serde(default)] + args: Vec, + #[serde(default)] + env: HashMap, + #[serde(default, rename = "tool_filter")] + tool_filter: Option, + }, + Http { + url: String, + #[serde(default)] + oauth: Option, + #[serde(default, rename = "tool_filter")] + tool_filter: Option, + }, +} + +impl ServerConfig { + /// Static label used by the `mcp` meta-tool's `list_servers` action. + /// Returning `&'static str` lets `snapshot()` avoid cloning the + /// (potentially large) `Stdio { args, env, ... }` payload just to + /// read the transport variant. + pub fn transport_label(&self) -> &'static str { + match self { + ServerConfig::Stdio { .. } => "stdio", + ServerConfig::Http { .. 
} => "http", + } + } + + /// `true` when the server is HTTP with an `oauth` block — used by the + /// system-prompt catalogue (PR #959 F1 discovery slice) to hint that + /// the LLM should ask the user to run `mcp login ` before calling. + pub fn requires_oauth(&self) -> bool { + matches!(self, ServerConfig::Http { oauth: Some(_), .. }) + } +} + +#[derive(Debug, Default, Clone, Serialize, Deserialize)] +pub struct ToolFilter { + #[serde(default)] + pub include: Vec, + #[serde(default)] + pub exclude: Vec, +} + +/// OAuth block. Phase 1 only parses `provider` + `scopes`; custom-provider +/// fields (§6.3: `authorize_url`, `token_url`, `device_authorization_endpoint`, +/// `discovery`, `discovery_allowlist`) land with the Phase 2 auth slice. +#[derive(Debug, Clone, Serialize, Deserialize)] +pub struct OAuthConfig { + #[serde(default)] + pub provider: Option, + #[serde(default)] + pub scopes: Vec, +} + +impl McpConfig { + /// Load + merge global and project configs from the standard locations. + /// Missing files are treated as empty. + pub fn load() -> Result { + let global = home_dir().map(|h| h.join(".openab/agent/mcp.json")); + let project = std::env::current_dir() + .ok() + .map(|c| c.join(".openab/agent/mcp.json")); + Self::load_layered(global.as_deref(), project.as_deref()) + } + + /// Load + merge two layers; project wins on name collision. 
+ pub fn load_layered(global: Option<&Path>, project: Option<&Path>) -> Result { + let mut merged = Self::default(); + for path in [global, project].into_iter().flatten() { + if !path.exists() { + continue; + } + let layer = Self::load_file(path)?; + merged.servers.extend(layer.servers); + } + Ok(merged) + } + + fn load_file(path: &Path) -> Result { + let raw = std::fs::read_to_string(path) + .with_context(|| format!("read mcp config {}", path.display()))?; + serde_json::from_str(&raw).with_context(|| format!("parse mcp config {}", path.display())) + } +} + +impl ServerConfig { + /// Return a copy with `${env:VAR}` placeholders resolved against the + /// process environment. Missing env vars are an error for that server; + /// callers should skip the server and continue (ADR §5.6 "per-server + /// failure isolated"). `name` is the server name used in error context. + pub fn resolved(&self, name: &str) -> Result { + self.resolved_with_env(name, &std::env::vars().collect()) + } + + fn resolved_with_env(&self, name: &str, env: &HashMap) -> Result { + let json = serde_json::to_value(self)?; + let resolved = interpolate_value(json, env) + .with_context(|| format!("resolve env for mcp server {name:?}"))?; + Ok(serde_json::from_value(resolved)?) + } +} + +fn interpolate_value( + value: serde_json::Value, + env: &HashMap, +) -> Result { + use serde_json::Value; + match value { + Value::String(s) => Ok(Value::String(interpolate_env(&s, env)?)), + Value::Array(items) => items + .into_iter() + .map(|v| interpolate_value(v, env)) + .collect::>>() + .map(Value::Array), + Value::Object(map) => map + .into_iter() + .map(|(k, v)| interpolate_value(v, env).map(|v| (k, v))) + .collect::>>() + .map(Value::Object), + other => Ok(other), + } +} + +/// Replace `${env:VAR}` tokens in `input` with the matching env value. +/// Missing variables produce an error naming the offender. 
+pub fn interpolate_env(input: &str, env: &HashMap<String, String>) -> Result<String> { + let mut out = String::with_capacity(input.len()); + let mut rest = input; + while let Some(start) = rest.find("${env:") { + out.push_str(&rest[..start]); + let after = &rest[start + "${env:".len()..]; + let end = after + .find('}') + .ok_or_else(|| anyhow!("unterminated ${{env:..}} in {input:?}"))?; + let var = &after[..end]; + let val = env + .get(var) + .ok_or_else(|| anyhow!("env var ${var} not set (referenced by mcp config)"))?; + out.push_str(val); + rest = &after[end + 1..]; + } + out.push_str(rest); + Ok(out) +} + +fn home_dir() -> Option<PathBuf> { + std::env::var_os("HOME").map(PathBuf::from) +} + +#[cfg(test)] +mod tests { + use super::*; + + fn env(pairs: &[(&str, &str)]) -> HashMap<String, String> { + pairs + .iter() + .map(|(k, v)| (k.to_string(), v.to_string())) + .collect() + } + + #[test] + fn interpolate_replaces_tokens() { + let e = env(&[("FOO", "bar"), ("X", "y")]); + assert_eq!( + interpolate_env("a${env:FOO}b${env:X}", &e).unwrap(), + "abarby" + ); + } + + #[test] + fn interpolate_passes_through_plain_strings() { + let e = env(&[]); + assert_eq!(interpolate_env("plain", &e).unwrap(), "plain"); + } + + #[test] + fn interpolate_errors_on_missing_var() { + let e = env(&[]); + let err = interpolate_env("${env:MISSING}", &e) + .unwrap_err() + .to_string(); + assert!(err.contains("MISSING"), "expected MISSING in error: {err}"); + } + + #[test] + fn interpolate_errors_on_unterminated() { + let e = env(&[("FOO", "bar")]); + assert!(interpolate_env("${env:FOO", &e).is_err()); + } + + #[test] + fn resolved_errors_on_missing_env_var_with_var_name() { + // chaodu F9 (#959 review): contract is that a missing env var + // referenced via `${env:VAR}` in any config field surfaces an error + // naming the offender, so users can fix `mcp.json` instead of + // chasing a generic parse failure.
+ let cfg = ServerConfig::Stdio { + command: "github-mcp-server".into(), + args: vec!["--token".into(), "${env:CHAODU_F9_MISSING}".into()], + env: HashMap::new(), + tool_filter: None, + }; + let err = format!( + "{:#}", + cfg.resolved_with_env("github", &env(&[])).unwrap_err() + ); + assert!( + err.contains("CHAODU_F9_MISSING"), + "expected missing var name in error: {err}" + ); + assert!( + err.contains("github"), + "expected server name in error context: {err}" + ); + } + + #[test] + fn parses_stdio_and_http_servers() { + let json = r#"{ + "mcpServers": { + "fs": { + "type": "stdio", + "command": "mcp-server-filesystem", + "args": ["/workspace"], + "tool_filter": { "include": ["read_*"] } + }, + "linear": { + "type": "http", + "url": "https://mcp.linear.app/mcp", + "oauth": { "provider": "linear" } + } + } + }"#; + let cfg: McpConfig = serde_json::from_str(json).unwrap(); + assert_eq!(cfg.servers.len(), 2); + match cfg.servers.get("fs").unwrap() { + ServerConfig::Stdio { + command, + args, + tool_filter, + .. + } => { + assert_eq!(command, "mcp-server-filesystem"); + assert_eq!(args, &vec!["/workspace".to_string()]); + assert_eq!(tool_filter.as_ref().unwrap().include, vec!["read_*"]); + } + _ => panic!("expected stdio"), + } + match cfg.servers.get("linear").unwrap() { + ServerConfig::Http { url, oauth, .. } => { + assert_eq!(url, "https://mcp.linear.app/mcp"); + assert_eq!(oauth.as_ref().unwrap().provider.as_deref(), Some("linear")); + } + _ => panic!("expected http"), + } + } + + #[test] + fn resolved_substitutes_env_in_args() { + let env = env(&[("MCP_TEST_TOKEN", "secret123")]); + let cfg = ServerConfig::Stdio { + command: "github-mcp-server".into(), + args: vec!["--token".into(), "${env:MCP_TEST_TOKEN}".into()], + env: HashMap::new(), + tool_filter: None, + }; + match cfg.resolved_with_env("github", &env).unwrap() { + ServerConfig::Stdio { args, .. 
} => { + assert_eq!(args[1], "secret123"); + } + _ => unreachable!(), + } + } + + #[test] + fn merge_project_wins() { + let dir = tempfile::tempdir().unwrap(); + let global = dir.path().join("global.json"); + let project = dir.path().join("project.json"); + std::fs::write( + &global, + r#"{"mcpServers":{"fs":{"type":"stdio","command":"global-fs"},"x":{"type":"stdio","command":"global-x"}}}"#, + ) + .unwrap(); + std::fs::write( + &project, + r#"{"mcpServers":{"fs":{"type":"stdio","command":"project-fs"}}}"#, + ) + .unwrap(); + let cfg = McpConfig::load_layered(Some(&global), Some(&project)).unwrap(); + assert_eq!(cfg.servers.len(), 2); + match cfg.servers.get("fs").unwrap() { + ServerConfig::Stdio { command, .. } => assert_eq!(command, "project-fs"), + _ => unreachable!(), + } + match cfg.servers.get("x").unwrap() { + ServerConfig::Stdio { command, .. } => assert_eq!(command, "global-x"), + _ => unreachable!(), + } + } +} diff --git a/openab-agent/src/mcp/meta_tool.rs b/openab-agent/src/mcp/meta_tool.rs new file mode 100644 index 000000000..eefe5b275 --- /dev/null +++ b/openab-agent/src/mcp/meta_tool.rs @@ -0,0 +1,513 @@ +//! Single `mcp` meta-tool the LLM sees. See ADR §5.2 + §5.3. +//! +//! Phase 1 scope: action enum + dispatch wiring + all six Phase 1 actions +//! (`help`, `list_servers`, `list_tools`, `describe_tool`, `call`, `status`). +//! The Phase 2 `login` / `complete_login` actions land with the OAuth slice. + +use anyhow::{anyhow, Context, Result}; +use serde::Deserialize; +use serde_json::{json, Value}; + +use super::runtime::{McpRuntimeManager, ServerStatus}; + +/// Deserialized form of the meta-tool's input JSON (ADR §5.2). The LLM +/// sends `{ "action": "...", ... }`; `tag = "action"` routes by that field. 
+#[derive(Debug, Deserialize)] +#[serde(tag = "action", rename_all = "snake_case")] +pub enum Action { + Help, + ListServers, + ListTools { + server: String, + }, + DescribeTool { + server: String, + tool: String, + }, + Call { + server: String, + tool: String, + #[serde(default)] + arguments: Value, + }, + Status { + #[serde(default)] + server: Option<String>, + }, +} + +/// Entry point — the LLM tool dispatcher hands us a deserialized `Action` +/// and we return the JSON payload that becomes the tool result. +pub async fn dispatch(manager: &McpRuntimeManager, action: Action) -> Result<Value> { + match action { + Action::Help => Ok(json!(HELP)), + Action::ListServers => Ok(list_servers(manager).await), + Action::ListTools { server } => list_tools(manager, &server).await, + Action::DescribeTool { server, tool } => describe_tool(manager, &server, &tool).await, + Action::Call { + server, + tool, + arguments, + } => call_tool(manager, &server, &tool, arguments).await, + Action::Status { server } => Ok(status(manager, server.as_deref()).await), + } +} + +const HELP: &str = "\ +The `mcp` tool lets you talk to configured MCP servers. + +Actions: + help show this message + list_servers list configured servers and status + list_tools(server) list tools exposed by a server + describe_tool(server, tool) show input_schema for one tool + call(server, tool, args) invoke a tool + status(server?) per-server health + last error + +Connections are lazy: the first action that needs a server spawns its \ +child process and runs the handshake. Idle servers are evicted after \ +the configured TTL."; + +async fn call_tool( + manager: &McpRuntimeManager, + server: &str, + tool: &str, + arguments: Value, +) -> Result<Value> { + // Lenient arg coercion per Mira's Tick 18 review: LLMs often send + // `null` or omit `arguments` for no-arg tools; rejecting those would + // make zero-arg calls fragile. Only real type errors (string, number, + // array, bool) are refused.
+ let args_map = match arguments { + Value::Object(map) => map, + Value::Null => serde_json::Map::new(), + other => { + return Err(anyhow!( + "mcp call arguments must be a JSON object (or null/omitted for no-arg tools), got {other}" + )); + } + }; + manager + .connect(server) + .await + .with_context(|| format!("connect mcp server {server:?}"))?; + let peer = manager.arc_peer(server).await?; + let params = rmcp::model::CallToolRequestParams::new(tool.to_string()).with_arguments(args_map); + let result = peer + .call_tool(params) + .await + .with_context(|| format!("call_tool {tool:?} on {server:?}"))?; + serde_json::to_value(&result).context("serialize CallToolResult") +} + +/// Lazy-connect + list all tools on `server`. Shared by `list_tools` / +/// `describe_tool` (and the planned `tools_cache` on ServerHandle will plug +/// in here). The `Arc` clone lets the I/O `.await` run with +/// no runtime lock held. +async fn fetch_tools(manager: &McpRuntimeManager, server: &str) -> Result<Vec<rmcp::model::Tool>> { + manager + .connect(server) + .await + .with_context(|| format!("connect mcp server {server:?}"))?; + let peer = manager.arc_peer(server).await?; + peer.list_all_tools() + .await + .with_context(|| format!("list_all_tools on {server:?}")) +} + +async fn list_tools(manager: &McpRuntimeManager, server: &str) -> Result<Value> { + let entries: Vec<Value> = fetch_tools(manager, server) + .await? + .into_iter() + .map(|t| { + json!({ + "name": t.name, + "description": t.description, + }) + }) + .collect(); + Ok(Value::Array(entries)) +} + +async fn describe_tool(manager: &McpRuntimeManager, server: &str, tool: &str) -> Result<Value> { + // Progressive disclosure (ADR §5.2): `list_tools` returns compact + // `{name, description}`; this action returns the full `input_schema` + // for one tool. MCP has no single-tool query, so we list + filter. + let tool_def = fetch_tools(manager, server) + .await?
+ .into_iter() + .find(|t| t.name.as_ref() == tool) + .ok_or_else(|| anyhow!("no tool {tool:?} on mcp server {server:?}"))?; + Ok(json!({ + "name": tool_def.name, + "description": tool_def.description, + "input_schema": tool_def.input_schema, + })) +} + +async fn status(manager: &McpRuntimeManager, filter: Option<&str>) -> Value { + let snapshot = manager.snapshot().await; + let entries: Vec<Value> = snapshot + .into_iter() + .filter(|(name, _, _)| match filter { + Some(f) => f == name.as_str(), + None => true, + }) + .map(|(name, status, transport)| { + let last_error = match &status { + ServerStatus::Failed(msg) => Some(msg.clone()), + _ => None, + }; + json!({ + "name": name, + "status": status_label(&status), + "transport": transport, + "last_error": last_error, + }) + }) + .collect(); + Value::Array(entries) +} + +async fn list_servers(manager: &McpRuntimeManager) -> Value { + let snapshot = manager.snapshot().await; + let entries: Vec<Value> = snapshot + .into_iter() + .map(|(name, status, transport)| { + json!({ + "name": name, + "status": status_label(&status), + "transport": transport, + }) + }) + .collect(); + Value::Array(entries) +} + +fn status_label(status: &ServerStatus) -> &'static str { + match status { + // `Disconnected` is the cold/idle state — config loaded but the + // child process hasn't been spawned yet. Lazy connect happens on + // the first `call` / `list_tools`, so this is NOT a failure mode. + // Earlier label `"disconnected"` confused LLMs into reporting the + // server as broken on a plain `list_servers` (PR #959 F1 PoC + // observation). `"failed"` already covers the error case below.
+ ServerStatus::Disconnected => "idle", + ServerStatus::Connecting => "connecting", + ServerStatus::Connected => "connected", + ServerStatus::Failed(_) => "failed", + } +} + +#[cfg(test)] +mod tests { + use super::*; + use crate::mcp::config::McpConfig; + + fn mgr_from(json: &str) -> McpRuntimeManager { + let cfg: McpConfig = serde_json::from_str(json).unwrap(); + McpRuntimeManager::from_config(cfg) + } + + #[tokio::test] + async fn help_returns_doc_string() { + let mgr = mgr_from(r#"{"mcpServers":{}}"#); + let result = dispatch(&mgr, Action::Help).await.unwrap(); + let s = result.as_str().unwrap(); + assert!(s.contains("list_servers")); + assert!(s.contains("call(server, tool")); + } + + #[tokio::test] + async fn list_servers_reports_name_status_transport() { + let mgr = mgr_from( + r#"{ + "mcpServers": { + "fs": { "type": "stdio", "command": "mcp-server-filesystem" }, + "linear": { "type": "http", "url": "https://mcp.linear.app/mcp" } + } + }"#, + ); + let result = dispatch(&mgr, Action::ListServers).await.unwrap(); + let entries = result.as_array().unwrap(); + assert_eq!(entries.len(), 2); + let by_name: std::collections::HashMap<_, _> = entries + .iter() + .map(|e| (e["name"].as_str().unwrap(), e)) + .collect(); + assert_eq!(by_name["fs"]["transport"], "stdio"); + assert_eq!(by_name["fs"]["status"], "idle"); + assert_eq!(by_name["linear"]["transport"], "http"); + } + + #[tokio::test] + async fn list_servers_empty_yields_empty_array() { + let mgr = mgr_from(r#"{"mcpServers":{}}"#); + let result = dispatch(&mgr, Action::ListServers).await.unwrap(); + assert!(result.as_array().unwrap().is_empty()); + } + + #[tokio::test] + async fn call_rejects_non_object_arguments() { + let mgr = mgr_from( + r#"{ + "mcpServers": { + "fs": { "type": "stdio", "command": "true" } + } + }"#, + ); + let err = dispatch( + &mgr, + Action::Call { + server: "fs".into(), + tool: "read".into(), + arguments: json!("oops, a string"), + }, + ) + .await + .unwrap_err() + .to_string(); + 
assert!(err.contains("must be a JSON object"), "got: {err}"); + } + + #[tokio::test] + async fn call_null_arguments_passes_validation_and_reaches_connect() { + // Null args should be coerced to {} and fail at the *connect* step + // (binary doesn't exist), not at the validation step. + let mgr = mgr_from( + r#"{ + "mcpServers": { + "broken": { + "type": "stdio", + "command": "/nonexistent/openab-mcp-test-stub-zzz" + } + } + }"#, + ); + let err = dispatch( + &mgr, + Action::Call { + server: "broken".into(), + tool: "read".into(), + arguments: Value::Null, + }, + ) + .await + .unwrap_err() + .to_string(); + assert!(err.contains("connect mcp server"), "got: {err}"); + assert!(!err.contains("must be a JSON object"), "got: {err}"); + } + + #[tokio::test] + async fn list_tools_propagates_connect_failure() { + let mgr = mgr_from( + r#"{ + "mcpServers": { + "broken": { + "type": "stdio", + "command": "/nonexistent/path/openab-mcp-test-stub-zzz" + } + } + }"#, + ); + let err = dispatch( + &mgr, + Action::ListTools { + server: "broken".into(), + }, + ) + .await + .unwrap_err() + .to_string(); + assert!(err.contains("connect mcp server"), "got: {err}"); + } + + #[tokio::test] + async fn describe_tool_propagates_connect_failure() { + let mgr = mgr_from( + r#"{ + "mcpServers": { + "broken": { + "type": "stdio", + "command": "/nonexistent/path/openab-mcp-test-stub-zzz" + } + } + }"#, + ); + let err = dispatch( + &mgr, + Action::DescribeTool { + server: "broken".into(), + tool: "read".into(), + }, + ) + .await + .unwrap_err() + .to_string(); + assert!(err.contains("connect mcp server"), "got: {err}"); + } + + #[tokio::test] + async fn status_lists_each_server_with_null_last_error_by_default() { + let mgr = mgr_from( + r#"{ + "mcpServers": { + "fs": { "type": "stdio", "command": "mcp-server-filesystem" }, + "linear": { "type": "http", "url": "https://mcp.linear.app/mcp" } + } + }"#, + ); + let result = dispatch(&mgr, Action::Status { server: None }) + .await + .unwrap(); + let 
entries = result.as_array().unwrap(); + assert_eq!(entries.len(), 2); + for e in entries { + assert_eq!(e["status"], "idle"); + assert!(e["last_error"].is_null()); + } + } + + #[tokio::test] + async fn status_labels_failed_servers_with_last_error() { + // Status uses a `Failed` state distinct from `idle`; the LLM should + // see the failure surfaced explicitly via `status: "failed"` + + // `last_error: ` rather than collapsing into `idle`. + let mgr = mgr_from( + r#"{ + "mcpServers": { + "broken": { + "type": "stdio", + "command": "/nonexistent/openab-mcp-test-stub-zzz" + } + } + }"#, + ); + // Trip the Failed state via a connect attempt that will fail at spawn. + let _ = dispatch( + &mgr, + Action::Call { + server: "broken".into(), + tool: "anything".into(), + arguments: serde_json::json!({}), + }, + ) + .await; + let result = dispatch(&mgr, Action::Status { server: None }) + .await + .unwrap(); + let entries = result.as_array().unwrap(); + assert_eq!(entries.len(), 1); + assert_eq!(entries[0]["status"], "failed"); + assert!( + !entries[0]["last_error"].is_null(), + "Failed status should carry last_error" + ); + } + + #[tokio::test] + async fn status_filter_by_server_returns_single_entry() { + let mgr = mgr_from( + r#"{ + "mcpServers": { + "fs": { "type": "stdio", "command": "mcp-server-filesystem" }, + "linear": { "type": "http", "url": "https://mcp.linear.app/mcp" } + } + }"#, + ); + let result = dispatch( + &mgr, + Action::Status { + server: Some("fs".into()), + }, + ) + .await + .unwrap(); + let entries = result.as_array().unwrap(); + assert_eq!(entries.len(), 1); + assert_eq!(entries[0]["name"], "fs"); + assert_eq!(entries[0]["transport"], "stdio"); + } + + #[tokio::test] + async fn status_unknown_filter_returns_empty_array() { + let mgr = mgr_from( + r#"{ + "mcpServers": { + "fs": { "type": "stdio", "command": "mcp-server-filesystem" } + } + }"#, + ); + let result = dispatch( + &mgr, + Action::Status { + server: Some("nope".into()), + }, + ) + .await + 
.unwrap(); + assert!(result.as_array().unwrap().is_empty()); + } + + #[tokio::test] + async fn status_surfaces_last_error_after_failed_connect() { + let mgr = mgr_from( + r#"{ + "mcpServers": { + "broken": { + "type": "stdio", + "command": "/nonexistent/path/openab-mcp-test-stub-zzz" + } + } + }"#, + ); + let _ = dispatch( + &mgr, + Action::ListTools { + server: "broken".into(), + }, + ) + .await; + let result = dispatch(&mgr, Action::Status { server: None }) + .await + .unwrap(); + let entries = result.as_array().unwrap(); + assert_eq!(entries.len(), 1); + assert_eq!(entries[0]["status"], "failed"); + let last_error = entries[0]["last_error"].as_str().unwrap(); + assert!(last_error.contains("spawn"), "got: {last_error}"); + } + + #[test] + fn action_deserializes_from_meta_tool_payload() { + let payload = json!({ + "action": "call", + "server": "github", + "tool": "create_issue", + "arguments": { "title": "x" } + }); + let action: Action = serde_json::from_value(payload).unwrap(); + match action { + Action::Call { + server, + tool, + arguments, + } => { + assert_eq!(server, "github"); + assert_eq!(tool, "create_issue"); + assert_eq!(arguments["title"], "x"); + } + other => panic!("expected Call, got {other:?}"), + } + } + + #[test] + fn action_status_server_is_optional() { + let action: Action = serde_json::from_value(json!({ "action": "status" })).unwrap(); + assert!(matches!(action, Action::Status { server: None })); + let action: Action = + serde_json::from_value(json!({ "action": "status", "server": "fs" })).unwrap(); + assert!(matches!(action, Action::Status { server: Some(_) })); + } +} diff --git a/openab-agent/src/mcp/mod.rs b/openab-agent/src/mcp/mod.rs new file mode 100644 index 000000000..6f1e8c514 --- /dev/null +++ b/openab-agent/src/mcp/mod.rs @@ -0,0 +1,287 @@ +//! Native MCP client. See `docs/adr/openab-agent-mcp.md`. 
+ +pub mod config; +pub mod meta_tool; +pub mod runtime; + +use serde_json::json; + +use crate::llm::ToolDef; +use config::{McpConfig, ServerConfig}; + +pub use runtime::McpRuntimeManager; + +/// Shared tool name used by `mcp_tool_def()` and the agent dispatch arm — +/// keeps the implicit contract between the two call sites explicit. +pub const MCP_TOOL_NAME: &str = "mcp"; + +/// The single `mcp` tool definition the LLM sees (ADR §5.2). The schema is +/// intentionally permissive on the per-action fields — the LLM should call +/// `mcp(action="help")` first to learn the action-specific contract. +pub fn mcp_tool_def() -> ToolDef { + ToolDef { + name: MCP_TOOL_NAME.to_string(), + description: "Talk to configured MCP servers. Call with \ + {action: 'help'} first to see the available actions \ + (help, list_servers, list_tools, describe_tool, call, status)." + .to_string(), + input_schema: json!({ + "type": "object", + "properties": { + "action": { + "type": "string", + "enum": ["help", "list_servers", "list_tools", + "describe_tool", "call", "status"], + "description": "Which meta-tool action to invoke" + }, + "server": { + "type": "string", + "description": "Server name (required by list_tools / describe_tool / call; optional filter for status)" + }, + "tool": { + "type": "string", + "description": "Tool name on the server (required by describe_tool / call)" + }, + "arguments": { + "description": "Tool arguments for call — JSON object, or null/omitted for no-arg tools" + } + }, + "required": ["action"] + }), + } +} + +fn load_config_or_exit() -> McpConfig { + McpConfig::load().unwrap_or_else(|e| { + eprintln!("failed to load mcp config: {e:#}"); + std::process::exit(1); + }) +} + +/// Runtime opt-in env var (PR #959 review, chaodu F6). MCP stays dormant +/// unless this is explicitly set to a truthy value, even when `mcp.json` +/// exists at one of the search paths. 
Prevents accidental activation in +/// environments where the config file might be present incidentally +/// (e.g. project tree copied into a container image, baseline VM rollouts). +pub const OPT_IN_ENV: &str = "OPENAB_AGENT_MCP"; + +/// Returns `true` when the user has explicitly opted into the MCP runtime +/// via `OPENAB_AGENT_MCP={1,true,yes,on}` (case-insensitive). Any other +/// value — including unset, empty, or `false` — keeps MCP dormant. +fn opted_in() -> bool { + matches!( + std::env::var(OPT_IN_ENV) + .as_deref() + .map(str::to_ascii_lowercase) + .as_deref(), + Ok("1" | "true" | "yes" | "on") + ) +} + +/// Construct an `McpRuntimeManager` from on-disk config — returns `None` +/// when MCP is not opted in (see [`OPT_IN_ENV`]) or no servers are +/// configured, so callers can skip the entire MCP path (saves system-prompt +/// tokens + keeps the LLM from hallucinating an empty tool surface). Parse +/// failure falls back to `None` with a `tracing::warn!`. Long-running +/// servers (ACP, future HTTP) call this; CLI subcommands use +/// `load_config_or_exit` instead so they work without the opt-in env var. +pub fn load_runtime_or_warn() -> Option<McpRuntimeManager> { + if !opted_in() { + return None; + } + let cfg = McpConfig::load().unwrap_or_else(|e| { + tracing::warn!("mcp config failed to load, starting with no servers: {e:#}"); + McpConfig::default() + }); + if cfg.servers.is_empty() { + tracing::warn!( + "{OPT_IN_ENV} is set but no mcp servers configured at \ + ~/.openab/agent/mcp.json or ./.openab/agent/mcp.json" + ); + None + } else { + Some(McpRuntimeManager::from_config(cfg)) + } +} + +/// Build the MCP section appended to the system prompt at session start +/// (PR #959 chaodu F1, discovery slice). Mirrors the skills-catalogue +/// pattern: advertise *server names + transports* — not individual tools — +/// so the LLM knows the surface exists and can call +/// `mcp(action="list_tools", server=...)` to discover capabilities on demand.
+/// +/// Token-budget invariance: the section grows O(server count), not +/// O(server count × tool count). PR #959 F1 PoC measured ≤100 tokens per +/// server-side meta entry under this pattern; flattening per-tool would +/// blow that invariance up. +/// +/// Status semantics worth surfacing to the LLM (matches `status_label` in +/// `meta_tool`): `idle` = ready (lazy-connect on first call), not broken. +pub fn format_system_prompt_appendix(manager: &McpRuntimeManager) -> String { + let catalog = manager.catalog(); + let mut out = String::from( + "\n\n## MCP tool\n\n\ + Use the `mcp` tool to talk to configured MCP servers. Key actions: \ + `list_tools(server)` discovers a server's tools, \ + `call(server, tool, arguments)` invokes one. Servers auto-connect \ + on first use — `status: \"idle\"` means ready (not broken); \ + `status: \"failed\"` carries the error reason in `last_error`. \ + Call `mcp(action=\"help\")` only if action shapes are unclear.\n\n", + ); + if catalog.is_empty() { + out.push_str( + "No MCP servers are configured. The `mcp` tool will report an \ + empty `list_servers` until one is added.\n", + ); + return out; + } + out.push_str("Configured servers:\n"); + for entry in catalog { + if entry.requires_oauth { + out.push_str(&format!( + "- **{}** ({}, requires `mcp login {}` before first call)\n", + entry.name, entry.transport, entry.name, + )); + } else { + out.push_str(&format!("- **{}** ({})\n", entry.name, entry.transport)); + } + } + out +} + +/// `openab-agent mcp list [--resolve]`. +/// +/// Default: print configs verbatim (`${env:VAR}` placeholders kept as-is) so +/// `mcp list` is safe to paste into bug reports. `--resolve` opts into +/// substituting env vars and prints a leading warning — useful for debugging +/// missing-env startup failures locally. 
+pub fn cli_list_servers(resolve: bool) { + let cfg = load_config_or_exit(); + if cfg.servers.is_empty() { + println!("No MCP servers configured."); + println!(" global: ~/.openab/agent/mcp.json"); + println!(" project: ./.openab/agent/mcp.json"); + return; + } + if resolve { + eprintln!("⚠ --resolve: env vars substituted into output below."); + eprintln!("⚠ Output may contain secrets — do not paste publicly."); + eprintln!(); + } + let mut servers: Vec<_> = cfg.servers.iter().collect(); + servers.sort_by_key(|(name, _)| *name); + for (name, server) in servers { + print_server(name, server, resolve); + } +} + +fn print_server(name: &str, server: &ServerConfig, resolve: bool) { + if resolve { + match server.resolved(name) { + Ok(r) => print_json("✓", name, &r), + Err(e) => println!("✗ {name}: {e:#}"), + } + } else { + print_json("•", name, server); + } +} + +fn print_json<T: serde::Serialize>(status: &str, name: &str, value: &T) { + println!("{status} {name}"); + if let Ok(json) = serde_json::to_string_pretty(value) { + for line in json.lines() { + println!(" {line}"); + } + } +} + +/// `openab-agent mcp status`. +/// +/// Prints per-server runtime status. Servers start `Disconnected` and only +/// advance after `mcp connect <name>` (or, later, lazy dial from the agent +/// path). +pub async fn cli_show_status() { + let manager = McpRuntimeManager::from_config(load_config_or_exit()); + if manager.is_empty().await { + println!("No MCP servers configured."); + return; + } + for (name, status) in manager.statuses().await { + println!("{} {name}", status.icon()); + } +} + +/// `openab-agent mcp connect <name>`. Spawns the configured stdio server, +/// runs the rmcp handshake, and reports success or the failure reason. +/// The connection is dropped on process exit — this CLI is a smoke-test +/// for `mcp.json` entries, not a long-running session.
+pub async fn cli_connect(name: String) { + let manager = McpRuntimeManager::from_config(load_config_or_exit()); + match manager.connect(&name).await { + Ok(()) => println!("● connected: {name}"), + Err(e) => { + eprintln!("✗ {name}: {e:#}"); + std::process::exit(1); + } + } +} + +#[cfg(test)] +mod tests { + use super::*; + use config::McpConfig; + + fn mgr_from(json: &str) -> McpRuntimeManager { + let cfg: McpConfig = serde_json::from_str(json).unwrap(); + McpRuntimeManager::from_config(cfg) + } + + #[test] + fn format_system_prompt_appendix_lists_each_server() { + let mgr = mgr_from( + r#"{ + "mcpServers": { + "fs": { "type": "stdio", "command": "mcp-server-filesystem" }, + "weather": { "type": "http", "url": "https://example/mcp" } + } + }"#, + ); + let s = format_system_prompt_appendix(&mgr); + assert!(s.contains("## MCP tool")); + assert!(s.contains("Configured servers:")); + assert!(s.contains("**fs** (stdio)")); + assert!(s.contains("**weather** (http)")); + // Status semantics must be advertised so LLM doesn't misread `idle` + // as a failure (PR #959 F1 PoC observation). 
+ assert!(s.contains("idle")); + } + + #[test] + fn format_system_prompt_appendix_marks_oauth_servers() { + let mgr = mgr_from( + r#"{ + "mcpServers": { + "linear": { + "type": "http", + "url": "https://mcp.linear.app/mcp", + "oauth": { "provider": "linear", "scopes": ["read"] } + } + } + }"#, + ); + let s = format_system_prompt_appendix(&mgr); + assert!( + s.contains("requires `mcp login linear`"), + "OAuth servers must surface the login hint; got:\n{s}" + ); + } + + #[test] + fn format_system_prompt_appendix_handles_empty_catalog() { + let mgr = mgr_from(r#"{"mcpServers":{}}"#); + let s = format_system_prompt_appendix(&mgr); + assert!(s.contains("## MCP tool")); + assert!(s.contains("No MCP servers are configured")); + assert!(!s.contains("Configured servers:")); + } +} diff --git a/openab-agent/src/mcp/runtime.rs b/openab-agent/src/mcp/runtime.rs new file mode 100644 index 000000000..5af3aa4a8 --- /dev/null +++ b/openab-agent/src/mcp/runtime.rs @@ -0,0 +1,517 @@ +//! Per-server lifecycle manager. See ADR §5.4 + §5.7. +//! +//! Handles live behind `Arc<RwLock<HashMap>>` so `connect()` (async, +//! spawns child processes) is `Send` across `.await` and a background idle- +//! eviction task can share the map with foreground `mcp call` invocations +//! (ADR §5.7). Read-heavy / write-light fits `RwLock`. +//! +//! `connect()` uses a double-lock pattern: a short write lock to mark +//! `Connecting`, release the lock, run the rmcp handshake without holding +//! any lock, then re-acquire briefly to install the client or record the +//! failure. Holding the write lock across the `serve(...).await` would +//! starve every reader (including `mcp status` and the eviction scan) for +//! the duration of a child-process spawn + handshake.
+ +use std::collections::HashMap; +use std::sync::Arc; + +use anyhow::{anyhow, Context, Result}; +use rmcp::service::{RoleClient, RunningService}; +use rmcp::transport::{ConfigureCommandExt, StreamableHttpClientTransport, TokioChildProcess}; +use rmcp::ServiceExt; +use tokio::process::Command; +use tokio::sync::RwLock; + +use super::config::{McpConfig, ServerConfig}; + +#[allow(dead_code)] // NeedsAuth lands with the Phase 2 OAuth slice (ADR §5.7) +#[derive(Debug, Clone, PartialEq, Eq)] +pub enum ServerStatus { + Disconnected, + Connecting, + Connected, + Failed(String), +} + +impl ServerStatus { + pub fn icon(&self) -> &'static str { + match self { + ServerStatus::Disconnected => "○", + ServerStatus::Connecting => "◐", + ServerStatus::Connected => "●", + ServerStatus::Failed(_) => "✗", + } + } +} + +pub struct ServerHandle { + pub name: String, + pub config: ServerConfig, + pub status: ServerStatus, + /// `Arc` so foreground callers can clone a peer handle out under a + /// short read lock, drop the guard, and then run `peer.list_all_tools()` + /// / `peer.call_tool()` without holding any runtime lock across the + /// I/O `.await` (avoids writer starvation + `Future is not Send` traps). + pub client: Option<Arc<RunningService<RoleClient, ()>>>, +} + +impl std::fmt::Debug for ServerHandle { + fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { + f.debug_struct("ServerHandle") + .field("name", &self.name) + .field("config", &self.config) + .field("status", &self.status) + .field("client", &self.client.is_some()) + .finish() + } +} + +/// Immutable, lock-free view of a configured server for catalogue +/// advertising in the system prompt (PR #959 chaodu F1, discovery slice). +/// Lives outside the `RwLock` so `format_system_prompt_appendix` +/// can build the prompt synchronously at `Agent::new_with_provider` time +/// without coordinating with the async runtime.
+#[derive(Debug, Clone)] +pub struct CatalogEntry { + pub name: String, + pub transport: &'static str, + pub requires_oauth: bool, +} + +/// Owns one `ServerHandle` per configured server, behind an async `RwLock` +/// so the foreground LLM path and the background eviction task can share it. +#[derive(Debug, Default, Clone)] +pub struct McpRuntimeManager { + handles: Arc<RwLock<HashMap<String, ServerHandle>>>, + /// Sorted-by-name snapshot of static server identity (name + transport + + /// oauth-required flag). Frozen at `from_config` — never mutated, so it + /// is safe to read without locking. Used by the system-prompt catalogue + /// (PR #959 F1 discovery slice). + catalog: Arc<[CatalogEntry]>, +} + +impl McpRuntimeManager { + pub fn from_config(cfg: McpConfig) -> Self { + let mut catalog: Vec<CatalogEntry> = cfg + .servers + .iter() + .map(|(name, config)| CatalogEntry { + name: name.clone(), + transport: config.transport_label(), + requires_oauth: config.requires_oauth(), + }) + .collect(); + catalog.sort_by(|a, b| a.name.cmp(&b.name)); + let handles: HashMap<_, _> = cfg + .servers + .into_iter() + .map(|(name, config)| { + let handle = ServerHandle { + name: name.clone(), + config, + status: ServerStatus::Disconnected, + client: None, + }; + (name, handle) + }) + .collect(); + Self { + handles: Arc::new(RwLock::new(handles)), + catalog: catalog.into(), + } + } + + /// Lock-free, synchronous access to the configured-server catalogue. + /// See `CatalogEntry` for the rationale. + pub fn catalog(&self) -> &[CatalogEntry] { + &self.catalog + } + + /// Snapshot of `(name, status)` sorted by name. Clones out so the read + /// guard is dropped before returning — callers don't hold a lock.
+ pub async fn statuses(&self) -> Vec<(String, ServerStatus)> { + let mut out: Vec<_> = { + let guard = self.handles.read().await; + guard + .iter() + .map(|(name, h)| (name.clone(), h.status.clone())) + .collect() + }; + out.sort_by(|(a, _), (b, _)| a.cmp(b)); + out + } + + pub async fn is_empty(&self) -> bool { + self.handles.read().await.is_empty() + } + + /// Clone the live MCP client handle for `name` out from under a short + /// read lock. The caller `.await`s on the returned `Arc` with no + /// runtime lock held, so background writers (idle eviction, new + /// `connect`s) are not starved by long-running tool calls. + /// + /// Errors if the server isn't configured or isn't currently + /// `Connected`. Callers that want lazy-connect should run + /// `connect(name)` first. + pub async fn arc_peer(&self, name: &str) -> Result<Arc<RunningService<RoleClient, ()>>> { + let guard = self.handles.read().await; + let handle = guard + .get(name) + .ok_or_else(|| anyhow!("no mcp server named {name:?}"))?; + handle + .client + .as_ref() + .cloned() + .ok_or_else(|| anyhow!("mcp server {name:?} is not connected")) + } + + /// Snapshot of `(name, status, transport_label)` sorted by name. Used + /// by the `list_servers` meta-tool action; the static transport label + /// avoids cloning the `Stdio { args, env, .. }` payload. + pub async fn snapshot(&self) -> Vec<(String, ServerStatus, &'static str)> { + let mut out: Vec<_> = { + let guard = self.handles.read().await; + guard + .iter() + .map(|(name, h)| (name.clone(), h.status.clone(), h.config.transport_label())) + .collect() + }; + out.sort_by(|(a, ..), (b, ..)| a.cmp(b)); + out + } + + /// Lazy-connect the named server (ADR §5.7). Idempotent if already + /// `Connected` with a live client. HTTP servers requiring OAuth are + /// rejected until the Phase 2 auth slice lands (ADR §6).
+ pub async fn connect(&self, name: &str) -> Result<()> { + let dial = { + let mut guard = self.handles.write().await; + let handle = guard + .get_mut(name) + .ok_or_else(|| anyhow!("no mcp server named {name:?}"))?; + if matches!(handle.status, ServerStatus::Connected) && handle.client.is_some() { + return Ok(()); + } + let resolved = handle.config.resolved(name)?; + let dial = match resolved { + ServerConfig::Stdio { + command, args, env, .. + } => Dial::Stdio { command, args, env }, + // Reject oauth-protected servers BEFORE the `Connecting` + // transition: we never attempted a handshake, so leaving + // status at `Disconnected` is the honest state. Status + // becomes `Failed` only when a dial was actually tried. + ServerConfig::Http { + oauth: Some(_), + url, + .. + } => { + return Err(anyhow!( + "oauth-protected http server {url:?} requires the auth slice (Phase 2 §6)" + )); + } + ServerConfig::Http { url, .. } => Dial::Http { url }, + }; + handle.status = ServerStatus::Connecting; + dial + }; + + let dial_result = dial.run().await; + + let mut guard = self.handles.write().await; + let handle = guard + .get_mut(name) + .ok_or_else(|| anyhow!("server {name:?} vanished during connect"))?; + // Race guard: a concurrent connect() may have installed a client while + // we were dialing. Yield to the winner — `dial_result` drops here, + // killing the duplicate child via RunningService's Drop impl. + if matches!(handle.status, ServerStatus::Connected) && handle.client.is_some() { + return Ok(()); + } + match dial_result { + Ok(client) => { + handle.status = ServerStatus::Connected; + handle.client = Some(Arc::new(client)); + Ok(()) + } + Err(e) => { + let msg = format!("{e:#}"); + handle.status = ServerStatus::Failed(msg.clone()); + Err(anyhow!(msg)) + } + } + } +} + +/// Per-transport dial parameters, extracted under the manager's write lock +/// then dialed without holding the lock. 
Flat (no nested `*Dial` structs) +/// because two variants don't warrant a dispatch enum. +enum Dial { + Stdio { + command: String, + args: Vec<String>, + env: HashMap<String, String>, + }, + Http { + url: String, + }, +} + +impl Dial { + async fn run(self) -> Result<RunningService<RoleClient, ()>> { + match self { + Dial::Stdio { command, args, env } => { + let cmd = Command::new(&command).configure(|c| { + c.env_clear(); + c.envs(stdio_child_env(&env)); + c.args(&args); + }); + let transport = TokioChildProcess::new(cmd) + .with_context(|| format!("spawn mcp child process {command:?}"))?; + ().serve(transport) + .await + .with_context(|| format!("mcp handshake with {command:?}")) + } + Dial::Http { url } => { + let transport = StreamableHttpClientTransport::from_uri(url.as_str()); + ().serve(transport) + .await + .with_context(|| format!("mcp handshake with {url:?}")) + } + } + } +} + +fn stdio_child_env(explicit: &HashMap<String, String>) -> HashMap<String, String> { + let mut env = baseline_child_env(); + env.extend(explicit.clone()); + env +} + +fn baseline_child_env() -> HashMap<String, String> { + let mut env = HashMap::new(); + for key in baseline_env_keys() { + if let Ok(val) = std::env::var(key) { + env.insert((*key).to_string(), val); + } + } + env +} + +#[cfg(unix)] +fn baseline_env_keys() -> &'static [&'static str] { + &["HOME", "PATH", "TERM", "USER"] +} + +#[cfg(windows)] +fn baseline_env_keys() -> &'static [&'static str] { + &[ + "HOME", + "PATH", + "TERM", + "USERPROFILE", + "USERNAME", + "SystemRoot", + "SystemDrive", + ] +} + +#[cfg(not(any(unix, windows)))] +fn baseline_env_keys() -> &'static [&'static str] { + &["HOME", "PATH", "TERM"] +} + +#[cfg(test)] +mod tests { + use super::*; + + #[tokio::test] + async fn from_config_initializes_each_server_disconnected() { + let json = r#"{ + "mcpServers": { + "fs": { "type": "stdio", "command": "mcp-server-filesystem" }, + "linear": { "type": "http", "url": "https://mcp.linear.app/mcp" } + } + }"#; + let cfg: McpConfig = serde_json::from_str(json).unwrap(); + let mgr =
McpRuntimeManager::from_config(cfg); + let statuses = mgr.statuses().await; + assert_eq!(statuses.len(), 2); + for (_, status) in statuses { + assert_eq!(status, ServerStatus::Disconnected); + } + } + + #[tokio::test] + async fn empty_config_yields_empty_manager() { + let mgr = McpRuntimeManager::from_config(McpConfig::default()); + assert!(mgr.is_empty().await); + assert!(mgr.statuses().await.is_empty()); + assert!(mgr.catalog().is_empty()); + } + + #[test] + fn catalog_is_sorted_and_flags_oauth() { + let json = r#"{ + "mcpServers": { + "linear": { + "type": "http", + "url": "https://mcp.linear.app/mcp", + "oauth": { "provider": "linear", "scopes": ["read"] } + }, + "fs": { "type": "stdio", "command": "mcp-server-filesystem" }, + "weather": { "type": "http", "url": "https://example/mcp" } + } + }"#; + let cfg: McpConfig = serde_json::from_str(json).unwrap(); + let mgr = McpRuntimeManager::from_config(cfg); + let cat = mgr.catalog(); + let names: Vec<&str> = cat.iter().map(|e| e.name.as_str()).collect(); + assert_eq!(names, vec!["fs", "linear", "weather"]); + let by_name: std::collections::HashMap<&str, &CatalogEntry> = + cat.iter().map(|e| (e.name.as_str(), e)).collect(); + assert_eq!(by_name["fs"].transport, "stdio"); + assert!(!by_name["fs"].requires_oauth); + assert_eq!(by_name["linear"].transport, "http"); + assert!(by_name["linear"].requires_oauth); + assert_eq!(by_name["weather"].transport, "http"); + assert!(!by_name["weather"].requires_oauth); + } + + #[tokio::test] + async fn statuses_sorted_by_name() { + let json = r#"{ + "mcpServers": { + "zed": { "type": "stdio", "command": "z" }, + "alpha": { "type": "stdio", "command": "a" }, + "mid": { "type": "stdio", "command": "m" } + } + }"#; + let cfg: McpConfig = serde_json::from_str(json).unwrap(); + let mgr = McpRuntimeManager::from_config(cfg); + let names: Vec<String> = mgr.statuses().await.into_iter().map(|(n, _)| n).collect(); + assert_eq!(names, vec!["alpha", "mid", "zed"]); + } + + #[tokio::test] + async fn
connect_unknown_server_errors() {
+        let mgr = McpRuntimeManager::from_config(McpConfig::default());
+        let err = mgr.connect("missing").await.unwrap_err().to_string();
+        assert!(err.contains("missing"), "expected 'missing' in {err}");
+    }
+
+    #[tokio::test]
+    async fn connect_http_with_oauth_defers_to_auth_slice() {
+        let json = r#"{
+            "mcpServers": {
+                "linear": {
+                    "type": "http",
+                    "url": "https://mcp.linear.app/mcp",
+                    "oauth": { "provider": "linear" }
+                }
+            }
+        }"#;
+        let cfg: McpConfig = serde_json::from_str(json).unwrap();
+        let mgr = McpRuntimeManager::from_config(cfg);
+        let err = mgr.connect("linear").await.unwrap_err().to_string();
+        assert!(err.contains("oauth"), "expected 'oauth' in {err}");
+        // OAuth rejection happens BEFORE the Connecting transition, so the
+        // server remains Disconnected; no dial was attempted.
+        assert_eq!(mgr.statuses().await[0].1, ServerStatus::Disconnected);
+    }
+
+    #[tokio::test]
+    async fn connect_http_anonymous_to_dead_address_records_failed() {
+        // 127.0.0.1:1 is a TCP port that no MCP server will ever bind. The
+        // handshake `.serve()` future fails fast at the connect() syscall,
+        // so this test stays hermetic: no network reachability is assumed.
+        let json = r#"{
+            "mcpServers": {
+                "dead": { "type": "http", "url": "http://127.0.0.1:1/mcp" }
+            }
+        }"#;
+        let cfg: McpConfig = serde_json::from_str(json).unwrap();
+        let mgr = McpRuntimeManager::from_config(cfg);
+        let err = mgr.connect("dead").await.unwrap_err().to_string();
+        assert!(err.contains("handshake"), "expected 'handshake' in {err}");
+        match &mgr.statuses().await[0].1 {
+            ServerStatus::Failed(_) => {}
+            other => panic!("expected Failed, got {other:?}"),
+        }
+    }
+
+    #[tokio::test]
+    async fn connect_to_missing_binary_records_failed() {
+        let json = r#"{
+            "mcpServers": {
+                "broken": {
+                    "type": "stdio",
+                    "command": "/nonexistent/path/openab-mcp-test-stub-zzz"
+                }
+            }
+        }"#;
+        let cfg: McpConfig = serde_json::from_str(json).unwrap();
+        let mgr = McpRuntimeManager::from_config(cfg);
+        let err = mgr.connect("broken").await.unwrap_err().to_string();
+        assert!(err.contains("spawn"), "expected 'spawn' in {err}");
+        match &mgr.statuses().await[0].1 {
+            ServerStatus::Failed(msg) => assert!(msg.contains("spawn")),
+            other => panic!("expected Failed, got {other:?}"),
+        }
+    }
+
+    #[tokio::test]
+    async fn race_guard_no_stuck_connecting_on_concurrent_failures() {
+        // Two concurrent connect() tasks race against a guaranteed-failure
+        // server (non-existent binary). Per chaodu F4 (#959 review), the
+        // race guard must never leave status stuck at `Connecting` even when
+        // both dial attempts fail. Final status must be Failed (terminal),
+        // and a third connect() after the race must still be allowed to
+        // retry from Failed.
+        let json = r#"{
+            "mcpServers": {
+                "broken": {
+                    "type": "stdio",
+                    "command": "/nonexistent/path/openab-mcp-race-test-zzz"
+                }
+            }
+        }"#;
+        let cfg: McpConfig = serde_json::from_str(json).unwrap();
+        let mgr = std::sync::Arc::new(McpRuntimeManager::from_config(cfg));
+        let a = {
+            let mgr = mgr.clone();
+            tokio::spawn(async move { mgr.connect("broken").await })
+        };
+        let b = {
+            let mgr = mgr.clone();
+            tokio::spawn(async move { mgr.connect("broken").await })
+        };
+        let _ = a.await.unwrap();
+        let _ = b.await.unwrap();
+        match &mgr.statuses().await[0].1 {
+            ServerStatus::Failed(_) => {}
+            other => panic!("expected Failed after race, got {other:?}"),
+        }
+        // From Failed, a follow-up connect() must still attempt a fresh
+        // dial, which proves the Failed → Connecting transition isn't gated out.
+        assert!(mgr.connect("broken").await.is_err());
+        assert!(matches!(mgr.statuses().await[0].1, ServerStatus::Failed(_)));
+    }
+
+    #[test]
+    fn stdio_child_env_keeps_only_baseline_plus_explicit() {
+        let mut explicit = HashMap::new();
+        explicit.insert("MCP_TOKEN".to_string(), "server-token".to_string());
+        explicit.insert("PATH".to_string(), "/custom/bin".to_string());
+
+        let env = stdio_child_env(&explicit);
+
+        assert_eq!(
+            env.get("MCP_TOKEN").map(String::as_str),
+            Some("server-token")
+        );
+        assert_eq!(env.get("PATH").map(String::as_str), Some("/custom/bin"));
+        assert!(!env.contains_key("DISCORD_BOT_TOKEN"));
+        assert!(!env.contains_key("ANTHROPIC_API_KEY"));
+    }
+}
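
Reviewer note: `stdio_child_env` builds the child's environment from an allowlist of baseline keys and then overlays the per-server entries, so explicit values win and everything else in the parent environment is dropped. The same pattern can be sketched as a pure function that takes a snapshot of the parent environment instead of reading `std::env::var`, which makes the scrubbing behaviour checkable without depending on the test runner's real environment. This is only an illustration under that assumption; `scrubbed_env` and its parameters are hypothetical names and are not part of the patch.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not in the patch): derive a child environment from an
/// explicit snapshot of the parent environment rather than from `std::env`.
fn scrubbed_env(
    parent: &HashMap<String, String>,
    baseline_keys: &[&str],
    explicit: &HashMap<String, String>,
) -> HashMap<String, String> {
    // Step 1: copy only the allowlisted keys that are actually set in the parent.
    let mut env: HashMap<String, String> = baseline_keys
        .iter()
        .filter_map(|key| {
            parent
                .get(*key)
                .map(|val| ((*key).to_string(), val.clone()))
        })
        .collect();
    // Step 2: apply per-server entries last, so they override baseline values.
    env.extend(explicit.iter().map(|(k, v)| (k.clone(), v.clone())));
    env
}
```

Called with a parent snapshot that contains a secret such as `ANTHROPIC_API_KEY`, the result holds only the allowlisted keys present in the snapshot plus the explicit entries, which mirrors what `stdio_child_env_keeps_only_baseline_plus_explicit` asserts against the real process environment.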