feat(openab-agent): native MCP client (ADR + implementation)#959
feat(openab-agent): native MCP client (ADR + implementation)#959brettchien wants to merge 53 commits into
Conversation
Adds ADR for in-core rmcp + progressive-disclosure meta-tool, deferred to symmetry with the Skills extension pattern from PR openabdev#955. Memory analysis rules out the sidecar alternative; per-session config refresh replaces file-watcher hot reload to drop ~150 LOC of race-condition hotspot. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ate guardrails - §6.4 flow selection: device-code preferred (matches existing CLI convention), paste-back universal fallback, browser laptop opt-in - §5.2 add login/complete_login meta-tool actions; login returns flow-tagged union - §6.1 spell out TokenStore persistence assumption + cold-start refresh - §6.3 device_authorization_endpoint extension point + RFC 8414 discovery - §3.8 stdio container-image caveat (interpreter required for Node/Python) - §5.7 max_concurrent_servers knob (default 10, see §7 for constrained tuning) - §7 Fargate 512MB/1GB OOM analysis + mitigations Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ntract - §6.4 RFC 8414 discovery disabled by default; opt-in requires oauth.discovery=true + oauth.discovery_allowlist (boot rejects otherwise). Rationale: awsvpc egress + SSRF surface in multi-tenant deployments. - §6.1 add RTR race warning: async persistence layers (Fargate S3 sync, eventually-consistent volumes) must flush new tokens to durable storage before Spot interruption, else cascade-revoke locks the user out. Contract: fsync(2) agent-side + mtime-event-driven sync deployment-side. - §6.3 expose oauth.discovery / oauth.discovery_allowlist on custom providers. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Phase 1 foundation slice per ADR §5.4 + §5.6:
- Add optional `rmcp 1.7` dep + `mcp` feature flag (default off)
- Wire `#[cfg(feature = "mcp")] mod mcp;` into main
- New `mcp/config.rs`: `McpConfig` / `ServerConfig` (Stdio | Http) /
`ToolFilter` / `OAuthConfig`, global+project layered load, project
precedence on name collision, `${env:VAR}` interpolation with
per-server error context
OAuth fields limited to `provider` + `scopes`; custom-provider
endpoints (§6.3) deferred to Phase 2 auth slice to avoid dead schema.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
CI fmt-check found 8 formatting deltas in the Phase 1 scaffold. No logic change. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
CI previously only built the default feature set, leaving the Phase 1 MCP scaffold (mcp/config.rs, gated by --features mcp) without compile, clippy, or test coverage. Adds explicit --features mcp invocations and watches workflow file changes so this gap is closed for the rest of the rollout (ADR §9). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Wires up the Phase 1 config loader so clippy --features mcp can see it. McpConfig::load() + ServerConfig::resolved() + serde pretty-print are all reachable from main, clearing the dead-code denial of compile under -D warnings. Output groups successful servers (✓) and failures (✗ with reason), sorted by name for deterministic display. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
OpenAB PR ScreeningThis is auto-generated by the OpenAB project-screening flow for context collection and reviewer handoff.
Screening reportscreened PR #959 and moved the project item from `Incoming` to `PR-Screening`.comment: #959 (comment) IntentAdd native MCP client support inside FeatFeature work. The PR introduces an ADR and initial implementation surface for Who It ServesPrimary beneficiaries are agent runtime operators and deployers who need MCP tools available inside OpenAB agent sessions without broker mediation. Maintainers and reviewers also benefit from a concrete ADR that defines auth, lifecycle, persistence, and rollout constraints before the full implementation lands. Rewritten PromptImplement native MCP client support for Merge PitchThis should move forward because native MCP belongs at the agent-runtime layer if OpenAB wants tools to be session-local, model-visible, and independent of gateway/broker transport. The risk profile is high enough that the draft status is appropriate: MCP touches auth, process execution, network egress, token storage, prompt budget, and runtime reliability. The likely reviewer concern is scope control. Best-Practice ComparisonOpenClaw is relevant because this feature needs durable job/tool execution boundaries, explicit routing, retry/backoff, and run logs. PR #959 aligns with OpenClaw on isolated execution and resilience intent, but it still needs concrete run-log/diagnostic behavior and explicit delivery/routing semantics for tool calls once implementation expands beyond config. Hermes Agent is directly relevant. The PR already borrows the right instincts: daemon/runtime ownership, fresh session thinking, persisted state, and circuit breaker patterns. The Hermes comparison should stay concrete in implementation review: token writes need atomic persistence, concurrent refresh needs a race contract, scheduled/background work should not share stale sessions accidentally, and each MCP server invocation path needs clear lifecycle ownership. Implementation OptionsConservative: merge only the ADR and feature-gated config/module scaffold first. Balanced: keep this PR as the foundation PR. Land ADR, Ambitious: land the full staged plan in this PR: stdio, Streamable HTTP, OAuth flows, token persistence, circuit breaker, doctor CLI, and feature flag removal. Comparison Table
RecommendationUse the balanced path. Treat this PR as the MCP foundation PR: ADR, gated dependency, config, stdio transport, meta-tool, and focused tests. Keep Streamable HTTP/OAuth as the next PR, and resilience/doctor/default-on behavior as a third PR. |
sort_by_key over (name, _) is the cleaner form; no behavior change. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Per Mira's security review on the Phase 1 thread: the previous
'mcp list' eagerly called ServerConfig::resolved(), substituting
\${env:GITHUB_TOKEN} etc. into the printed output. Three leak
paths matter — pasting CLI output into bug reports / chat, screen
sharing, and stdout log collection.
New behavior:
- Default: print raw config; \${env:VAR} placeholders kept verbatim.
Safe to paste publicly; reader still sees which env var feeds
each field.
- --resolve opts into substitution and prints a two-line warning
banner. Useful for diagnosing missing-env startup failures.
No CLI-shape break: 'mcp list' still works; --resolve is additive.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
println! puts the warning banner into stdout, which (a) gets swallowed by 'mcp list --resolve > dump.json' redirection so the user never sees the security notice, and (b) corrupts the JSON payload for downstream pipes like 'mcp list --resolve | jq'. Routing the banner through eprintln! keeps it visible regardless of redirection and keeps stdout pure JSON for piping. Standard Unix convention: data on stdout, diagnostics on stderr. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Introduces McpRuntimeManager owning one ServerHandle per configured server, each starting in ServerStatus::Disconnected per ADR §5.7 (lazy connect). Wires the manager via `mcp status` CLI so the types are exercised by clippy --features mcp; actual rmcp TokioChildProcess dial + Connected / Failed transitions land in the next slice to keep that risky bit isolated for bisecting.
`clippy --features mcp -- -D warnings` doesn't compile the test target, so `len()`'s only callers (the unit tests) didn't keep it alive. Tests already used `statuses().len()` in one place — switch the other two to match and drop the now-dead method. `is_empty()` stays because `cli_show_status` calls it.
…ct() ADR §5.7 lifecycle needs a `connect()` method that spawns a child process and awaits an rmcp handshake — both `Send` across `.await`. Plain `HashMap<String, ServerHandle>` is not `Sync`, and the background idle- eviction task will share the map with the foreground `mcp call` path, so the read-heavy / write-light access pattern wants `tokio::sync::RwLock`. This slice lands the lock migration only: - `handles: Arc<RwLock<HashMap<String, ServerHandle>>>` - `statuses()` and `is_empty()` become async; `cli_show_status` follows - `McpRuntimeManager` is now `Clone` (Arc bump) so the eviction task can hold its own handle - `connect(name)` transitions to `Connecting` and returns; the actual `rmcp::TokioChildProcess` dial + `Connected` / `Failed` transitions land in the next slice — keeping that bit isolated for bisecting 6 unit tests cover snapshot ordering, unknown-server error, transition, and clone-shares-state.
The chained `.statuses().await.into_iter().map(|(n, _)| n).collect()` fits on one line under rustfmt's default 100-char width — broke it preemptively last tick on the wrong side of the threshold.
`clippy --features mcp -D warnings` doesn't compile the test target, so `connect()`'s only callers (the unit tests) don't keep it alive. The real rmcp dial lands in the next slice and will be reachable from the meta-tool dispatch / `mcp call` CLI path; until then a targeted allow keeps clippy quiet without papering over the type-level dead_code on ServerStatus variants (which is already explicitly allowed).
connect() now spawns the configured stdio child via rmcp's TokioChildProcess, runs the JSON-RPC handshake (().serve(transport)), and parks the resulting RunningService on ServerHandle.client. Lock discipline uses a double-acquire: a brief write lock to mark Connecting, drop, run the dial without any lock held, then a second brief write lock to install the client or record Failed(msg). Holding the write lock across the spawn+handshake would block mcp status and the future idle-eviction scan for the entire dial latency. HTTP transport is rejected with a "phase 2" error so misconfigured entries surface cleanly without advancing state past Disconnected. Adds `openab-agent mcp connect <name>` as a smoke-test CLI for mcp.json entries; the RunningService is dropped on process exit. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…rver Per Mira's Tick 13 review: when two callers race on the same name, both pass the first-lock idempotency check (Disconnected), both spawn a child, both come back to acquire the install-lock. The second arrival must yield to the winner, otherwise it overwrites the installed client and silently drops the working RunningService. Adds a 3-line double-check at the second lock acquisition: if status is already Connected with a live client, return Ok(()) — `dial_result` falls out of scope here and RunningService::Drop kills the duplicate child. Cheaper than a Connecting-aware notify/condvar and adequate until the eviction-aware slice lands. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Three rustfmt collapses that I missed under default 100-char width:
* import order: bare `rmcp::ServiceExt` sorts AFTER `rmcp::transport::*`
(rustfmt puts sub-module paths before bare items within a crate).
* `return Err(anyhow!(...))` for the HTTP phase-2 message fits on one
line (~88 chars).
* `let Self { command, args, env } = self;` fits on one line.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…patch
Adds the single `mcp` tool the LLM sees (ADR §5.2). This slice lands:
* `Action` enum tagged on `action`, matching the eight ADR actions,
deserialized straight from the LLM's tool-call payload.
* `dispatch(manager, action) -> Result<Value>` as the single entry
point — both `agent.rs::execute_tool` (next slice) and tests go
through this.
* `help` returns the static usage doc.
* `list_servers` returns `[{name, status, transport}]` via a new
`McpRuntimeManager::snapshot()` that clones out under a read lock.
* The four IO-bearing actions (`list_tools`, `describe_tool`, `call`,
`status`) return a `not yet implemented` error so the surface area
is visible to callers without pre-implementing the peer-borrow
path. `login` / `complete_login` land with the Phase 2 OAuth slice.
`#[allow(dead_code)]` on `Action` and `dispatch` because clippy
--features mcp doesn't compile the test target and agent.rs
wire-up lands next slice.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…LLM fallback hint Two of Mira's three Tick 15 findings (Arc-wrap for peer borrow lands with the actual list_tools/call slice — Arc without a consumer earns no keep). (1) `not_implemented_msg(action)` names the requested action, lists the actions that DO work (`help`, `list_servers`), and tells the LLM to fall back to native `read` / `write` / `edit` / `bash` rather than retrying. The old generic "not yet implemented" gave the model nothing to act on. (2) `ServerConfig::transport_label() -> &'static str` replaces the config clone inside `snapshot()`. The `Stdio` variant carries an `args: Vec<String>` and `env: HashMap<String, String>`; cloning that just to read the variant tag was wasted heap traffic on every `list_servers` call. `snapshot()` now yields `(String, ServerStatus, &'static str)`. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
rustfmt's fn_call_width default (60) splits assert!() args when the inline args exceed it; the new "read, write, edit, bash" fallback assert tipped the third call from ~55 to ~62 chars. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Apply Mira's Tick 15 review (3) and ship the first IO-bearing
meta-tool action.
* `ServerHandle.client` is now `Option<Arc<RunningService<...>>>`.
`connect()` wraps the dial result in `Arc::new`.
* `McpRuntimeManager::arc_peer(name)` clones the Arc out under a
short read lock, drops the guard, returns the handle. Callers
`.await` on `peer.list_all_tools()` / `peer.call_tool()` with no
runtime lock held — kills the writer-starvation risk Mira flagged
and sidesteps `Future is not Send` from holding a guard across
`.await`.
* `Action::ListTools { server }` now wired: lazy `connect()` →
`arc_peer()` → `peer.list_all_tools()` → `[{name, description}]`.
`describe_tool` / `call` / `status` still stub.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Apply Mira's Tick 18 review (3) — lenient args parsing for the
`call` action so LLMs that send `arguments: null` (or omit the field
entirely) for no-arg tools don't bounce off a strict type check.
* `Action::Call { server, tool, arguments }` is wired through
`call_tool()`: validate arguments → lazy `connect()` → `arc_peer()`
→ `peer.call_tool(CallToolRequestParams::new(...).with_arguments(map))`
→ serialize `CallToolResult` to JSON.
* Argument coercion: `Value::Object(map) → map`, `Value::Null → {}`,
everything else (string, number, array, bool) is rejected with a
message that names the actual type so the LLM can correct itself.
* Tests cover both branches: non-object string args fail early at
validation; null args pass validation and fail later at connect.
* `describe_tool` and `status` still stub; supported-list message
updated to include `call`.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The CallToolRequestParams chain fits in exactly 100 chars; rustfmt keeps it on one line. Burned a tick on a defensive break. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… 1 surface
describe_tool returns full input_schema for one tool (list+filter via
list_all_tools — MCP spec has no single-tool query). status reports
per-server {name, status, transport, last_error} with optional name
filter; surfaces Failed's embedded error as last_error.
Extract fetch_tools helper shared by list_tools + describe_tool —
this is the natural insertion point for the planned tools_cache.
New mcp::flow module exposes init_paste_authorize(provider, client_id, redirect_uri) -> PasteAuthorize, which generates the PKCE pair + state nonce internally and returns the authorize URL plus the secrets the caller must persist for complete_login to validate the callback. Internalizing pair generation removes a footgun (caller can't mismatch verifier/state) and shrinks the API to the two parameters that actually vary per call. auth::generate_pkce promoted from private to pub so the MCP flow path can share it with Codex — security primitive, single source of truth, no drift on future hardening. Module-scope #![allow(dead_code)] consistent with mcp::oauth — first prod caller (the §6.4 login orchestration) lands in the next slice. Tests cover URL structure, percent-encoding of redirect_uri, scope form-encoding, unparseable authorize_url error path, and custom-provider URL composition. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
rustfmt's reorder_imports does NOT sort by pure alphabetical order across use statements — local path roots have a fixed precedence: self < super < crate < external crates. My alphabetical assumption (c < s, so crate first) was wrong in both flow.rs's module-scope imports and its mod tests block. Burned Tick 36 (twice, same file). Runbook updated alongside the Tick 14 sub-module-vs-bare ordering note since the two rules apply at different scopes. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
ServerStatus gains a NeedsAuth variant; McpRuntimeManager::connect() now transitions oauth-protected http servers into that state with an error pointing the user at `mcp login <name>` instead of staying silently Disconnected. Icon "◌" (U+25CC DOTTED CIRCLE) matches the existing geometric family (○ ◐ ●) rather than "⚠" — the latter is emoji-prone (Discord and many terminals upgrade it via VS16), which would break aligned status output. Status label "needs_auth" wired through meta_tool's snake_case status_label() table. Tests cover both the transition + error format and an idempotency guarantee: a second connect() on a NeedsAuth server must keep the state sticky (only successful `mcp login` clears it). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…l_width
Tick 37's assert!(err.contains("mcp login"), "...") was 84 chars inline
but rustfmt's default fn_call_width=60 measures the arg list and split
it. Match the formatter.
McpRuntimeManager::start_paste_login(server) wires flow::init_paste_authorize for built-in OAuth providers (ADR §6.4). The PKCE verifier + state are stashed in an in-memory pending_logins map (HashMap<String, PendingPasteLogin>) for the next slice's complete_login to consume. Server status flips to NeedsAuth. Scope this slice: - Built-in providers only (anthropic-mcp). Custom-provider paste-back needs runtime callback port allocation; deferred to a follow-up slice. - Custom providers declaring device_authorization_endpoint short-circuit with an explicit "use device flow" error (ADR §6.4 selection logic). - ADR §6.4 says transient state lives "in TokenStore"; this slice keeps it in-process. auth.json needs a heterogeneous-entry schema change to hold non-token shapes — separate slice. oauth::builtin_client_id is the per-provider client_id resolver — env-var- required (no hard-coded default) so paste-back fails loud rather than emitting an authorize URL with a placeholder client_id. flow.rs sheds its module-level #![allow(dead_code)] now that init_paste_authorize has a prod caller transitively from start_paste_login (itself allow-dead-code until the next slice wires the mcp::login action).
Three distinct fmt rule misses on the start_paste_login slice: - oauth.rs: chain at receiver+chain = exactly 60 chars stays inline (over-broken on the unwrap_err+to_string chain) - runtime.rs: struct-pattern binders force per-line when arm body is also long enough that the inline form would overflow — different threshold than the existing connect()'s Stdio arm - runtime.rs: field-access chain ELEMENT counts toward chain_width; `start.authorize_url.starts_with(...)` is 2 chain elements, not 1, so receiver+chain over 60 → break
Untagged Serde enum `AuthEntry` keeps refresh-task state machine separate
from in-flight paste-login state. Per Mira's Tick 39 review: repurposing
TokenStore fields for pending entries would have made refresh loop on
them. Adds `{load,save,remove}_pending_login` helpers (mcp-gated, wired
in next slice via runtime::start_paste_login).
`start_paste_login` now writes the `mcp-pending:<server>` entry via
`auth::save_pending_login`, dropping the in-memory `pending_logins`
map. Aligns runtime state with the ADR §6.4 contract ("kept in
TokenStore") so `complete_login` survives an agent restart.
To keep tests off the real `$HOME/.openab/agent/auth.json` (the
cross-module HOME-env race that bit Tick 24), the auth-path becomes
an injected `PathBuf` field: `from_config()` defaults to
`auth::auth_path()`, `from_config_with_auth_path()` lets tests point
at a tempdir. The two tests that exercise the disk path adopt a
`mgr_with_tempdir` helper; rejection tests untouched (they error
before persist). `auth::{load,save,remove}_pending_login` likewise
take `&Path` so they're driven by the injected path, not a global.
`(McpRuntimeManager::from_config_with_auth_path(cfg, path), dir)` at 60 chars between parens trips rustfmt's tuple width heuristic and gets exploded into a 4-line literal. Bind the manager first so the tuple stays a 2-token one-liner. No behaviour change.
Pure URL → authorization-code helper for the upcoming `runtime::complete_login` (ADR §6.4). Validates the `state` echo before returning the `code` so CSRF / cross-flow contamination fails closed before any token-endpoint round-trip. Surfaces an `error=` query param verbatim and tolerates extra parameters (`iss`, vendor tracking) without rejecting valid callbacks. Token exchange + runtime wiring follow in the next slice; helper carries `#[allow(dead_code)]` until that lands so the no-feature build stays warning-clean.
Reverts the 18 OAuth-related commits (2a69bf9..0953985) so this PR contains only ADR + Phase 1 foundation: rmcp dep, mcpServers config loader, stdio transport, meta-tool dispatch, anonymous Streamable HTTP transport body. Phase 2 OAuth work (TokenStore namespacing, provider catalog, paste-back flow primitives, AuthEntry split, parse_paste_callback) is preserved on feat/openab-agent-mcp-oauth-stash and will be proposed as a separate PR once openabdev#959 lands. Per shaun-agent auto-screen Balanced recommendation: smaller PRs get reviewed; one mega-PR doesn't. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
shaun-agent
left a comment
There was a problem hiding this comment.
Staff Review — feat(openab-agent): native MCP client (ADR + Phase 1)
Reviewer: @shaun-agent (staff, on behalf of @mrshroom69)
Verdict: Approve with minor fixes required before merge
This is a high-quality PR. The ADR is thorough (657-line design doc with prior-art survey, memory analysis, rollout plan), the Phase 1 implementation is well-scoped (~1,270 LOC across 4 new files + surgical edits to 4 existing files), CI passes both feature combos, and the screening recommendation (Balanced path) is correctly followed. The progressive-disclosure meta-tool pattern is the right call for openab-agent's ~500-token system prompt budget.
Issues to fix (blocking merge)
1. unsafe { std::env::set_var(...) } in tests — unsound on Rust ≥1.83
acp.rs tests (lines 277, 290) and config.rs test (line 253) use unsafe { std::env::set_var(...) }. Since Rust 1.83, set_var is unsafe precisely because it's UB in multi-threaded contexts. The ENV_LOCK mutex in acp.rs is a partial mitigation but doesn't protect against other threads the tokio runtime may spawn.
Fix: Use temp_env crate (dev-dependency) or restructure tests to pass env via explicit HashMap parameter (the config.rs interpolate_env tests already do this correctly — the resolved() test is the outlier).
2. is_none_or — stabilized in Rust 1.82, verify MSRV
meta_tool.rs line uses .filter(|(name, _, _)| filter.is_none_or(|f| f == name.as_str())). Option::is_none_or was stabilized in 1.82. Confirm the workspace MSRV is ≥1.82 or replace with .map_or(true, |f| f == name.as_str()).
3. System prompt wording says "4 tools" → now "these tools"
The diff correctly changes "You have 4 tools available:" to "You have these tools available:" — good. But the bullet list still only enumerates read/write/edit/bash. When mcp feature is on, the MCP_SYSTEM_PROMPT_APPENDIX appends the 5th tool description separately. This works but creates a slightly awkward prompt structure (list of 4, then a separate "Additional tool:" paragraph). Consider making the appendix inject into the list directly, or at minimum change the intro to not imply the list is exhaustive. Low priority — cosmetic.
Observations (non-blocking, for follow-up PRs)
-
No idle eviction yet. ADR §5.7 describes a background eviction task with configurable TTL. Phase 1 doesn't implement it — connected servers stay alive until process exit. Fine for Phase 1 (short-lived ACP sessions), but should be tracked for Phase 2/3.
-
#[allow(dead_code)]onNeedsAuthvariant — acknowledged in comment as Phase 2. Clean. -
tool_filterparsed but not enforced. Config acceptsinclude/excludeglobs butlist_tools/calldon't filter. Should be documented as Phase 2 scope or removed from the config schema to avoid user confusion. -
Race in double-lock connect pattern. The "race guard" comment (runtime.rs ~line 180) correctly identifies that a concurrent
connect()may win. The losing dial'sRunningServiceis dropped (killing the child). This is correct but worth a brief integration test withtokio::spawnto prove no panic under contention. -
ADR references
OpenClawbut notes it's "not applicable" — this satisfies thepr-contribution-guidelines.mdrequirement to evaluate both mandatory references and state inapplicability with evidence. Good.
Summary
The architecture is sound, the code is clean, tests cover the happy path and failure modes well, and the scope is correctly bounded to Phase 1. Fix items 1-2 above (both are ~5-line changes), and this is ready to merge.
cc @brettchien
Proposed Fix (2 files, 7 lines changed)@brettchien — here are the two fixes from the review above. Both are minimal: 1.
|
|
Staff follow-up on #959: I found and prepared a local fix set, but I am not pushing it yet pending review/approval per Shaun's instruction. Proposed fix:
Verification:
Requesting review from <@1490707025028448306> and <@1507445470614524087>. If both approve, I will push the patch. If there are gaps, I'll route back to Shaun instead of pushing. |
Staff decision on proposed fixesReviewing the two fix sets proposed (mine + codex-MBP2018's): ✅ Approved fixes (merge-blocking)
❌ Rejected fix
Action needed@brettchien — please apply these three fixes to your branch:
Once those land and CI is green, this PR is approved from staff side. The |
5ea08b5 to
e90ef01
Compare
Status after fix pushHead is now Applied only the fixes approved in staff decision:
CI is green:
PR is still draft and <@1490707025028448306> <@1507445470614524087> please give final consensus:
|
shaun-agent
left a comment
There was a problem hiding this comment.
Staff Approval ✅
Verified commit e90ef01 addresses all three blocking items:
- ✅
env_clear()security fix —runtime.rsnow callsenv_clear()+stdio_child_env()which only passesHOME/PATH/TERM/USER+ explicit config env. Test confirmsDISCORD_BOT_TOKEN/ANTHROPIC_API_KEYare excluded. - ✅
is_none_or→match—meta_tool.rsuses explicit match (avoids both the MSRV concern and the clippyunnecessary_map_orlint). - ✅
unsafe set_var—acp.rs/auth.rs/config.rsall migrated totemp-env.
CI: both checks green.
@brettchien — from staff side this is approved. When you're ready, mark as ready-for-review (undraft) and it can merge. No further changes needed from our side.
|
CHANGES REQUESTED What This PR DoesAdds native MCP client support to How It Works
Findings
Finding Details🔴 F1: No recovery from Failed stateCurrently: Err(e) => {
handle.status = ServerStatus::Failed(msg.clone());
Err(anyhow!(msg))
}Suggested fix: allow 🟡 F6: Feature gating → runtime opt-inRecommended approach:
Reviewer Attribution
What's Good (🟢)
Blocker: F1 must be addressed. F2-F10 are suggestions/confirmations — author can accept, defer with tracking issue, or explain why current behavior is correct. Reviewed by: 超渡法師 (coordinator), 口渡法師, 擺渡法師, 覺渡法師 |
|
Maintainer clarification on F6 (feature gating strategy): The desired behavior is a two-layer gate: Compile-time (Cargo feature)
Runtime (activation gate)
Why
Please adjust the implementation accordingly: flip |
…F6/F9/F10 + F2/F7 doc)
Code changes:
- F6: gate `load_runtime_or_warn()` on `OPENAB_AGENT_MCP={1,true,yes,on}`
env var. Compile-time feature flag is being removed in Phase 3 PR openabdev#969;
this adds the runtime half so MCP stays dormant unless explicitly opted
in, preventing accidental activation when mcp.json is present.
- F4: integration test in `mcp::runtime::tests` exercising two concurrent
`connect()` tasks against a guaranteed-failure server, asserts no
stuck-Connecting state and that follow-up dials still attempt fresh.
- F5: defensive guard in `Agent::execute_tool_call` for `mcp` tool calls
when manager is unavailable — returns actionable error instead of
silently falling through to fs-tool dispatch.
- F9: regression test for missing-env-var error path; asserts the offender
variable name + server name surface in the chained error message.
- F10: `truncate_context` drain bounds clarified — `3.min(len)` instead of
`(1 + 2).min(len)`, with comment explaining the preserved first message.
Doc changes (ADR §5.4.1 added):
- F2: rationale for `env_clear()` baseline + explicit `env:` allowlist —
blocks token leakage through ambient env, matches MCP SDK behavior.
- F7: rationale for single shared `McpRuntimeManager` per process —
connection pooling, deduplicated handshakes, lifecycle owned by Agent.
|
@chaodu-agent thanks for the deep review — here's the resolution of the F1-F10 findings. Fixed in this PR (commit 5af7d86)
Documented in ADR §5.4.1 (this PR)
Addressed by stacked PRs (no separate follow-up needed)
Token-budget PoC — concrete numbers for F11 (🟢)Your F11 noted "system prompt growth ≤ ~100 tokens regardless of server count" as a 🟢 design strength. We measured the actual cost with
Δ = 233 tokens, fixed. Higher than your "≤100" eyeball (the Pre-gate on
|
Discord Discussion URL: https://discord.com/channels/1490643539682529300/1510657283552706663/1510690101120995369
📚 PR stack — merge in this order:
mcp doctor, stacks on feat(openab-agent): MCP Phase 2 — OAuth 2.1 paste-back (stacks on #959) #962)Each downstream PR rebases + cleans up its diff once the upstream PR merges. Reviewers can take all three in parallel; merges must serialize.
Summary
Adds an ADR (
docs/adr/openab-agent-mcp.md) proposing native MCP client support insideopenab-agent(the Rust coding agent in the Cargo workspace) and lands Phase 1 of the rollout (foundation: rmcp dep + stdio transport + meta-tool + anonymous Streamable HTTP transport body).Per
shaun-agentauto-screen Balanced recommendation, OAuth (Phase 2) and resilience (Phase 3) are split into separate PRs to keep each diff reviewable.ADR commits
cb8b710— ADR scaffold: prior-art survey (Hermes / opencode / pi-mcp-adapter / Goose / OpenHands), in-corermcp+ progressive-disclosure meta-tool architecture, OAuth + lifecycle + memory analysis, rollout planfdd8738— OAuth flow unification: device-code preferred (matches existingclaude/codex/grokCLI convention) + paste-back universal fallback + browser laptop opt-in; TokenStore persistence assumption + cold-start refresh47363a2— Hardening: RFC 8414 dynamic discovery disabled by default (awsvpc egress + SSRF surface), opt-in via per-server allowlist; refresh-token rotation race contract (fsync agent-side + mtime-event-driven sync deployment-side, motivated by Fargate Spot + S3 sidecar sync)Phase 1 implementation
mcpServersconfig loader (openab-agent/src/mcp/config.rs) +mcp listCLI subcommand with secret redactionmcp/runtime.rs) —Disconnected | Connecting | Connected | Failed,Arc<RwLock<_>>per-server, double-lock connect with race guardrmcp::transport::child_process::TokioChildProcessNeedsAuthonce Phase 2 lands)mcp/meta_tool.rs) — singlemcpaction dispatch (help/list_servers/list_tools/describe_tool/call/status) wired into the agent dispatch loopmcp(default on); CI exercises both feature combos (note: Phase 3 feat(openab-agent): MCP Phase 3 — circuit breaker + flag removal (stacks on #959 + #962) #969 removes this flag)Scope
openab-agent— not broker forward[agent].inherit_cloud_mcp_servers(issue feat(acp): allow opting out of inherited cloud MCP connectors via [agent].inherit_cloud_mcp_servers #753) remains out of scope and unaffectedTest plan
mcp-server-filesystem) connects + tool list + tool call via meta-toolmcp list;--resolvewarnings on stderr🤖 Generated with Claude Code