feat(openab-agent): MCP Phase 2 — OAuth 2.1 paste-back (stacks on #959) by brettchien · Pull Request #962 · openabdev/openab

brettchien · 2026-06-01T06:10:00Z

Discord Discussion URL: https://discord.com/channels/1490643539682529300/1510657283552706663/1510690101120995369

📚 PR stack — merge in this order:

feat(openab-agent): native MCP client (ADR + implementation) #959 — Phase 1: Foundation (ADR + rmcp + stdio + meta-tool)
feat(openab-agent): MCP Phase 2 — OAuth 2.1 paste-back (stacks on #959) #962 (this PR) — Phase 2: OAuth 2.1 paste-back ← stacks on feat(openab-agent): native MCP client (ADR + implementation) #959, merge second
feat(openab-agent): MCP Phase 3 — circuit breaker + flag removal (stacks on #959 + #962) #969 — Phase 3: Resilience (circuit breaker + mcp doctor, stacks on feat(openab-agent): MCP Phase 2 — OAuth 2.1 paste-back (stacks on #959) #962)

Once #959 merges, this branch rebases and the diff collapses to only the 18 OAuth-specific commits. Reviewers can take all three in parallel; merges must serialize.

Summary

Phase 2 of the openab-agent native MCP client (ADR docs/adr/openab-agent-mcp.md §6). Adds OAuth 2.1 paste-back authentication for Streamable HTTP servers.

OAuth commits

commit	content	ADR ref
`2a69bf9`	Namespaced `TokenStore` + fsync	§6.1
`4c77595`	Provider catalog (anthropic / openai / xai)	§6.2
`da86d5b`	`OAuthConfig` fields + discovery boot check	§6.3
`65f1c4d`	Custom provider resolution	§6.3
`04b84f9`	`ResolvedProvider` as enum `{Builtin, Custom}`	§6.3
`2400144`	Paste-back flow primitives (PKCE S256, state nonce)	§6.4
`95f33bc`	`NeedsAuth` runtime state for protected HTTP servers	§6.4
`af3a25d`	`start_paste_login` + builtin client_id resolver	§6.4
`7dd718c`	Split `auth.json` into `TokenStore \| PendingPasteLogin`	§6.4
`79b1e2e`	Persist pending paste-login to `auth.json`	§6.4
`bc006e2`	`parse_paste_callback` URL helper (state validation + CSRF)	§6.4

Plus fmt fixups: 06acde6, 7ea3e59, f61b498, 7f4ef17, 32232c0, ca91f1d, 111f43a.

Scope

Paste-back universal fallback only — device-code preferred path tracked separately
Builtin client_id catalog covers anthropic / openai / xai; arbitrary providers via custom OAuthConfig
mcp login CLI subcommand: next slice (will wire start_paste_login + parse_paste_callback + complete_login)
Circuit breaker + mcp doctor deferred to Phase 3 (feat(openab-agent): MCP Phase 3 — circuit breaker + flag removal (stacks on #959 + #962) #969)

Test plan

mcp login <server> happy path against vendor-hosted MCP (Linear, etc.) in headless container
State mismatch rejected (RFC 6749 §10.12)
PKCE verifier round-trip
Refresh-token rotation race: agent fsync + auth.json mtime stable
mcp list redacts client_secret (regression on Phase 1 secret-redaction)
Both cargo feature combos green

🤖 Generated with Claude Code

Adds ADR for in-core rmcp + progressive-disclosure meta-tool, deferred to symmetry with the Skills extension pattern from PR openabdev#955. Memory analysis rules out the sidecar alternative; per-session config refresh replaces file-watcher hot reload to drop ~150 LOC of race-condition hotspot. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…ate guardrails - §6.4 flow selection: device-code preferred (matches existing CLI convention), paste-back universal fallback, browser laptop opt-in - §5.2 add login/complete_login meta-tool actions; login returns flow-tagged union - §6.1 spell out TokenStore persistence assumption + cold-start refresh - §6.3 device_authorization_endpoint extension point + RFC 8414 discovery - §3.8 stdio container-image caveat (interpreter required for Node/Python) - §5.7 max_concurrent_servers knob (default 10, see §7 for constrained tuning) - §7 Fargate 512MB/1GB OOM analysis + mitigations Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…ntract - §6.4 RFC 8414 discovery disabled by default; opt-in requires oauth.discovery=true + oauth.discovery_allowlist (boot rejects otherwise). Rationale: awsvpc egress + SSRF surface in multi-tenant deployments. - §6.1 add RTR race warning: async persistence layers (Fargate S3 sync, eventually-consistent volumes) must flush new tokens to durable storage before Spot interruption, else cascade-revoke locks the user out. Contract: fsync(2) agent-side + mtime-event-driven sync deployment-side. - §6.3 expose oauth.discovery / oauth.discovery_allowlist on custom providers. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Phase 1 foundation slice per ADR §5.4 + §5.6: - Add optional `rmcp 1.7` dep + `mcp` feature flag (default off) - Wire `#[cfg(feature = "mcp")] mod mcp;` into main - New `mcp/config.rs`: `McpConfig` / `ServerConfig` (Stdio | Http) / `ToolFilter` / `OAuthConfig`, global+project layered load, project precedence on name collision, `${env:VAR}` interpolation with per-server error context OAuth fields limited to `provider` + `scopes`; custom-provider endpoints (§6.3) deferred to Phase 2 auth slice to avoid dead schema. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

CI fmt-check found 8 formatting deltas in the Phase 1 scaffold. No logic change. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

CI previously only built the default feature set, leaving the Phase 1 MCP scaffold (mcp/config.rs, gated by --features mcp) without compile, clippy, or test coverage. Adds explicit --features mcp invocations and watches workflow file changes so this gap is closed for the rest of the rollout (ADR §9). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Wires up the Phase 1 config loader so clippy --features mcp can see it. McpConfig::load() + ServerConfig::resolved() + serde pretty-print are all reachable from main, clearing the dead-code denial of compile under -D warnings. Output groups successful servers (✓) and failures (✗ with reason), sorted by name for deterministic display. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

sort_by_key over (name, _) is the cleaner form; no behavior change. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Per Mira's security review on the Phase 1 thread: the previous 'mcp list' eagerly called ServerConfig::resolved(), substituting \${env:GITHUB_TOKEN} etc. into the printed output. Three leak paths matter — pasting CLI output into bug reports / chat, screen sharing, and stdout log collection. New behavior: - Default: print raw config; \${env:VAR} placeholders kept verbatim. Safe to paste publicly; reader still sees which env var feeds each field. - --resolve opts into substitution and prints a two-line warning banner. Useful for diagnosing missing-env startup failures. No CLI-shape break: 'mcp list' still works; --resolve is additive. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

println! puts the warning banner into stdout, which (a) gets swallowed by 'mcp list --resolve > dump.json' redirection so the user never sees the security notice, and (b) corrupts the JSON payload for downstream pipes like 'mcp list --resolve | jq'. Routing the banner through eprintln! keeps it visible regardless of redirection and keeps stdout pure JSON for piping. Standard Unix convention: data on stdout, diagnostics on stderr. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Introduces McpRuntimeManager owning one ServerHandle per configured server, each starting in ServerStatus::Disconnected per ADR §5.7 (lazy connect). Wires the manager via `mcp status` CLI so the types are exercised by clippy --features mcp; actual rmcp TokioChildProcess dial + Connected / Failed transitions land in the next slice to keep that risky bit isolated for bisecting.

`clippy --features mcp -- -D warnings` doesn't compile the test target, so `len()`'s only callers (the unit tests) didn't keep it alive. Tests already used `statuses().len()` in one place — switch the other two to match and drop the now-dead method. `is_empty()` stays because `cli_show_status` calls it.

…ct() ADR §5.7 lifecycle needs a `connect()` method that spawns a child process and awaits an rmcp handshake — both `Send` across `.await`. Plain `HashMap<String, ServerHandle>` is not `Sync`, and the background idle- eviction task will share the map with the foreground `mcp call` path, so the read-heavy / write-light access pattern wants `tokio::sync::RwLock`. This slice lands the lock migration only: - `handles: Arc<RwLock<HashMap<String, ServerHandle>>>` - `statuses()` and `is_empty()` become async; `cli_show_status` follows - `McpRuntimeManager` is now `Clone` (Arc bump) so the eviction task can hold its own handle - `connect(name)` transitions to `Connecting` and returns; the actual `rmcp::TokioChildProcess` dial + `Connected` / `Failed` transitions land in the next slice — keeping that bit isolated for bisecting 6 unit tests cover snapshot ordering, unknown-server error, transition, and clone-shares-state.

The chained `.statuses().await.into_iter().map(|(n, _)| n).collect()` fits on one line under rustfmt's default 100-char width — broke it preemptively last tick on the wrong side of the threshold.

`clippy --features mcp -D warnings` doesn't compile the test target, so `connect()`'s only callers (the unit tests) don't keep it alive. The real rmcp dial lands in the next slice and will be reachable from the meta-tool dispatch / `mcp call` CLI path; until then a targeted allow keeps clippy quiet without papering over the type-level dead_code on ServerStatus variants (which is already explicitly allowed).

connect() now spawns the configured stdio child via rmcp's TokioChildProcess, runs the JSON-RPC handshake (().serve(transport)), and parks the resulting RunningService on ServerHandle.client. Lock discipline uses a double-acquire: a brief write lock to mark Connecting, drop, run the dial without any lock held, then a second brief write lock to install the client or record Failed(msg). Holding the write lock across the spawn+handshake would block mcp status and the future idle-eviction scan for the entire dial latency. HTTP transport is rejected with a "phase 2" error so misconfigured entries surface cleanly without advancing state past Disconnected. Adds `openab-agent mcp connect <name>` as a smoke-test CLI for mcp.json entries; the RunningService is dropped on process exit. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…rver Per Mira's Tick 13 review: when two callers race on the same name, both pass the first-lock idempotency check (Disconnected), both spawn a child, both come back to acquire the install-lock. The second arrival must yield to the winner, otherwise it overwrites the installed client and silently drops the working RunningService. Adds a 3-line double-check at the second lock acquisition: if status is already Connected with a live client, return Ok(()) — `dial_result` falls out of scope here and RunningService::Drop kills the duplicate child. Cheaper than a Connecting-aware notify/condvar and adequate until the eviction-aware slice lands. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Three rustfmt collapses that I missed under default 100-char width: * import order: bare `rmcp::ServiceExt` sorts AFTER `rmcp::transport::*` (rustfmt puts sub-module paths before bare items within a crate). * `return Err(anyhow!(...))` for the HTTP phase-2 message fits on one line (~88 chars). * `let Self { command, args, env } = self;` fits on one line. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…patch Adds the single `mcp` tool the LLM sees (ADR §5.2). This slice lands: * `Action` enum tagged on `action`, matching the eight ADR actions, deserialized straight from the LLM's tool-call payload. * `dispatch(manager, action) -> Result<Value>` as the single entry point — both `agent.rs::execute_tool` (next slice) and tests go through this. * `help` returns the static usage doc. * `list_servers` returns `[{name, status, transport}]` via a new `McpRuntimeManager::snapshot()` that clones out under a read lock. * The four IO-bearing actions (`list_tools`, `describe_tool`, `call`, `status`) return a `not yet implemented` error so the surface area is visible to callers without pre-implementing the peer-borrow path. `login` / `complete_login` land with the Phase 2 OAuth slice. `#[allow(dead_code)]` on `Action` and `dispatch` because clippy --features mcp doesn't compile the test target and agent.rs wire-up lands next slice. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…LLM fallback hint Two of Mira's three Tick 15 findings (Arc-wrap for peer borrow lands with the actual list_tools/call slice — Arc without a consumer earns no keep). (1) `not_implemented_msg(action)` names the requested action, lists the actions that DO work (`help`, `list_servers`), and tells the LLM to fall back to native `read` / `write` / `edit` / `bash` rather than retrying. The old generic "not yet implemented" gave the model nothing to act on. (2) `ServerConfig::transport_label() -> &'static str` replaces the config clone inside `snapshot()`. The `Stdio` variant carries an `args: Vec<String>` and `env: HashMap<String, String>`; cloning that just to read the variant tag was wasted heap traffic on every `list_servers` call. `snapshot()` now yields `(String, ServerStatus, &'static str)`. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

rustfmt's fn_call_width default (60) splits assert!() args when the inline args exceed it; the new "read, write, edit, bash" fallback assert tipped the third call from ~55 to ~62 chars. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Apply Mira's Tick 15 review (3) and ship the first IO-bearing meta-tool action. * `ServerHandle.client` is now `Option<Arc<RunningService<...>>>`. `connect()` wraps the dial result in `Arc::new`. * `McpRuntimeManager::arc_peer(name)` clones the Arc out under a short read lock, drops the guard, returns the handle. Callers `.await` on `peer.list_all_tools()` / `peer.call_tool()` with no runtime lock held — kills the writer-starvation risk Mira flagged and sidesteps `Future is not Send` from holding a guard across `.await`. * `Action::ListTools { server }` now wired: lazy `connect()` → `arc_peer()` → `peer.list_all_tools()` → `[{name, description}]`. `describe_tool` / `call` / `status` still stub. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Apply Mira's Tick 18 review (3) — lenient args parsing for the `call` action so LLMs that send `arguments: null` (or omit the field entirely) for no-arg tools don't bounce off a strict type check. * `Action::Call { server, tool, arguments }` is wired through `call_tool()`: validate arguments → lazy `connect()` → `arc_peer()` → `peer.call_tool(CallToolRequestParams::new(...).with_arguments(map))` → serialize `CallToolResult` to JSON. * Argument coercion: `Value::Object(map) → map`, `Value::Null → {}`, everything else (string, number, array, bool) is rejected with a message that names the actual type so the LLM can correct itself. * Tests cover both branches: non-object string args fail early at validation; null args pass validation and fail later at connect. * `describe_tool` and `status` still stub; supported-list message updated to include `call`. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The CallToolRequestParams chain fits in exactly 100 chars; rustfmt keeps it on one line. Burned a tick on a defensive break. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

… 1 surface describe_tool returns full input_schema for one tool (list+filter via list_all_tools — MCP spec has no single-tool query). status reports per-server {name, status, transport, last_error} with optional name filter; surfaces Failed's embedded error as last_error. Extract fetch_tools helper shared by list_tools + describe_tool — this is the natural insertion point for the planned tools_cache.

…in_width

Surfaces the `mcp` action enum to the LLM via a new ToolDef and routes calls through McpRuntimeManager when configured. AcpServer constructs the manager once at startup (warn-and-default on config parse failure) and clones the cheap Arc-backed handle into each session's Agent. Tools.rs stays stateless and feature-flag-free — the mcp routing arm lives in Agent::execute_tool_call where it has access to the manager. A shared MCP_TOOL_NAME const keeps mcp_tool_def() and the dispatch arm in sync.

The `mut` on `tools` and `base` is consumed only inside `#[cfg(feature = "mcp")]` blocks, so the no-feature build saw them as unused. Gate `allow(unused_mut)` behind `cfg_attr(not(feature = "mcp"), ...)` so both builds stay clean.

`test_session_new` and `test_session_new_missing_key` both mutate `ANTHROPIC_API_KEY`. cargo runs tests in parallel, so a `set` from one thread can be observed by the other before the first reads it. The pre-existing race was timing-tight enough to pass historically; adding `mcp::load_runtime_or_warn()` to `AcpServer::new()` widened the gap between set and read enough to flip the race deterministic. Guarding both tests with a shared `Mutex<()>` serializes them without adding a `serial_test` dep. Poisoned guard is recovered via `into_inner()` — we don't care if a prior assertion panicked.

Mira Tick 46 catch: when a provider omits both refresh_token and expires_in (Figma, Sentry, xAI), the prior `now + 0` fallback set the token "already expired", causing the next connect() to enter the refresh path, fail closed on the empty refresh_token, and bounce the user back to NeedsAuth seconds after a successful login. `expires_in: None` now uses the u64::MAX sentinel — `is_expired` returns false until the provider eventually 401s on use (at which point `mcp login` re-runs is the correct UX for non-refreshable tokens). Test now asserts the sentinel directly so the regression can't sneak back in. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

`openab-agent mcp login <name> [--paste URL]` drives the whole paste-back loop: start_paste_login pins PKCE state, the CLI shows the authorize URL, blocks on stdin (or --paste for scripted use) for the redirect URL, then complete_login validates state + exchanges the code + persists the TokenStore. --paste exists for CI smoke tests and scripted setups; without it the CLI is interactive (read_redirect_from_stdin). State-mismatch or network failure leaves the pending entry intact so the user can retry with a fresh paste without re-issuing `mcp login`. Drops the #[allow(dead_code)] markers from start_paste_login, complete_login, save_namespaced_token_at, remove_pending_login and PasteLoginStart — all reachable from main via cli_login now, so dead_code propagates through the call graph (skill rule). Phase 2 user-facing surface is now complete: list / status / connect / login. mcp status surfacing of NeedsAuth + the device-code flow remain for the next slices. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

… --paste Mira's PR openabdev#962 review (Q2) noted that --paste echoes the redirect URL into shell history and `ps`. PKCE makes either route theoretically safe, but `echo "<url>" | openab-agent mcp login <name>` leaves no trace — preferred for CI / scripts.

`auth::list_pending_logins_at` enumerates `mcp-pending:<name>` entries; `McpRuntimeManager::pending_logins` delegates via injected auth_path. `cli_show_status` cross-references against configured servers to flag both "login pending — finish via mcp login <name>" and orphaned entries with no matching mcp.json server. Introduces `PENDING_PREFIX` const so the read/write sides can't drift on the literal.

…_status one-pass Simplify pass on 9bca3de: - auth.rs: PENDING_PREFIX demoted to private; pending_key() promoted to pub fn as the single public constructor. runtime.rs uses it instead of formatting against the const, so the namespace literal lives in exactly one place. - mcp/mod.rs: cli_show_status folds the orphan calculation into the print loop via pending.remove(name) — the leftover set IS the orphans, dropping the second configured: HashSet<&str> pass. - mcp/mod.rs: drop the cli_login "Security note" docstring — the same rationale lives on the Login clap subcommand doc in main.rs, which is the user-facing surface. - runtime.rs: trim the spawn_blocking rationale on pending_logins — moved next to list_pending_logins_at where the sync read actually lives. Net -10 LOC across 3 files; ~/ops/openab-agent-ci.sh --quick green.

Phase 2 / ADR §6.5: HTTP+oauth servers now dial with their cached access_token when one is present and not within REFRESH_SKEW_SECONDS of expiry. Refresh-token rotation for the expired-but-recoverable case is the next slice — for now, expired tokens fall through to NeedsAuth alongside missing tokens, so `mcp login` stays the single user-actionable path. - auth::is_expired now pub(crate) — single source of truth for the expiry+skew check across codex (get_valid_token) and mcp (connect). - Dial::Http carries Option<String> auth; Dial::run uses StreamableHttpClientTransportConfig::with_uri(url).auth_header(token) when present, falls back to from_uri for anonymous HTTP. - connect() oauth branch tries load_namespaced_token_at first; the Ok+!expired arm builds an authenticated Dial, all other arms bounce to NeedsAuth. Tests: - connect_oauth_with_valid_cached_token_attempts_dial_not_needs_auth proves the bearer-injection path reaches handshake (fails at the hermetic 127.0.0.1:1 dead address, surfaces Failed not NeedsAuth). - connect_oauth_with_expired_token_bounces_to_needs_auth pins the expiry guard against future refresh-flow regressions.

Phase 2 / ADR §6.6: HTTP+oauth servers with an expired cached token now attempt RFC 6749 §6 refresh-grant before bouncing to NeedsAuth. On success the rotated TokenStore is fsynced and the dial proceeds with the new bearer; on failure the bounce is preserved so `mcp login` stays the user-actionable path. - post_token_refresh: sibling of post_token_exchange for grant_type= refresh_token (public client, no client_secret). - try_refresh_oauth_token: resolves client_id from current config (so a rotated catalog entry is picked up), POSTs the refresh, builds a new TokenStore preserving the previous refresh_token if the provider omits one (Google-style rotation), persists via save_namespaced_token_at. - DialPlan two-phase enum lets the inside-lock branch tag "needs refresh" and defer the async refresh to outside the write lock — slow refresh must not block concurrent `mcp status` reads. Tests: - connect_oauth_expired_no_refresh_token_bounces_to_needs_auth pins the empty-refresh short-circuit (no POST attempted). - connect_oauth_expired_with_refresh_token_failed_refresh_bounces_to_needs_auth proves the refresh path runs and that any failure surfaces as user-actionable NeedsAuth (current fixture's custom provider plus dead token endpoint both fail the refresh). Custom-provider refresh shares the same gap as custom-provider login — resolve_paste_client errors before the refresh POST runs. Will be unblocked by the same future slice that wires custom-provider login.

…ected winner Mira + Kirin caught a race: two concurrent connect() calls on the same expired-token server can interleave so one wins the refresh + dial while the other's refresh fails — and the loser was unconditionally overwriting `Connected` with `NeedsAuth`, giving the user a false "please log in" signal on a healthy session. Guard the status mutation with `!matches!(h.status, Connected)` so a concurrent winner's terminal state is preserved.

Mirrors finish_login (oauth.rs:271) so a hostile/buggy provider returning a huge expires_in can't trigger a debug-build overflow panic in the refresh-token rotation path.

…d refresh post_token_exchange and post_token_refresh diverged only in the form payload — same Client::builder() boilerplate, same status/body folding, same response shape. Extract post_token_form so each grant call site is just the RFC 6749 form tuple + a grant_label for the error context. Also drops a narrating comment in connect() that recapped the three match arms below it without adding any non-obvious context.

Adds the §3.1 device authorization request + §3.4 polling tick HTTP helpers — pure data plumbing, no runtime wiring yet. The polling-tick response classifier is split out as a pure fn (classify_device_poll) so the §3.5 error-code → flow-state mapping (authorization_pending, slow_down, access_denied, expired_token) is unit-testable without a mock HTTP server. DeviceAuthResponse defaults interval to 5s per RFC 8628 §3.5 when the provider omits it. verification_uri_complete (§3.3.1 extension) is Option since verification_uri + user_code is the always-present fallback. Both POST fns + the two types are #[allow(dead_code)] — next slice wires start_device_login() and the polling loop into McpRuntimeManager.

…loop) - `start_device_login(name)`: POST §3.1 + spawn detached §3.4 polling task - `DeviceLoginStart`: public user-facing bundle (user_code / verification_uri / verification_uri_complete / expires_in) - `resolve_device_client`: Custom-provider-only gate with explicit error for missing `device_authorization_endpoint` / `client_id` - `run_device_poll_loop`: deadline check + SlowDown back-off (+5s) + success persistence + NeedsAuth on terminal failure - `finalize_device_login` / `mark_device_login_failed`: status transitions (Disconnected on success keeps MCP handshake out of detached task; next `connect()` picks up the cached token via the oauth-aware DialPlan) - 4 gating tests: unknown server / stdio / missing device endpoint / HTTP reach #[allow(dead_code)] on the new surface until the next slice wires `mcp login --device` CLI. Quick gate green.

- main.rs: add `--device` flag to `McpAction::Login` (conflicts_with `paste`) - mcp/mod.rs: `cli_login_device(name)` — prints verification_uri_complete (§3.3.1) when present, falls back to `verification_uri` + `user_code`; watches `statuses()` until Disconnected (success) / NeedsAuth (terminal failure) or the provider-declared `expires_in` deadline elapses - runtime.rs: drop `#[allow(dead_code)]` on the Tick-55 device-flow chain (`start_device_login` is now called from the CLI surface)

- config.rs: add `oauth.redirect_uri: Option<String>` to `OAuthConfig` for the paste-back branch on custom providers (built-ins pin their callback in `ProviderSpec`; ignored by the device-code branch) - oauth.rs: propagate `redirect_uri` through `ResolvedProvider::Custom` - runtime.rs: in `resolve_paste_client`, accept Custom + client_id + redirect_uri; emit precise per-missing-field error otherwise: - device endpoint present → "use device flow" (preserved) - missing client_id → "requires `oauth.client_id`" - missing redirect_uri → "requires `oauth.redirect_uri` (must match the redirect URL pre-registered with the provider)" - Tests: rename `_rejects_custom_provider_for_now` → `_rejects_custom_without_redirect_uri`; add `_rejects_custom_without_client_id` + success-path `_custom_with_client_id_and_redirect_uri_succeeds` Closes the last Phase 2 paste-back functional gap. ADR §6.3 enumerates optional custom fields but doesn't currently list `redirect_uri`; follow-up ADR amendment lands separately.

Adds the field to the §6.3 config-shape list and a follow-up paragraph explaining why custom paste-back requires it (no local listener; must match provider OAuth app registration) and why the device-code / built-in branches ignore it. Follow-up to 84837ee (custom-provider paste-back support); Mira + Kirin dual sign-off requested this as a non-blocking docs amendment.

- auth.rs: convert free function `is_expired(&TokenStore) -> bool` into `impl TokenStore { pub fn is_expired(&self) -> bool }`. Switch the skew addition to `saturating_add` for parity with the rest of the OAuth arithmetic (defends `u64::MAX` "never expires" sentinel + any near-overflow clock value). - auth.rs: fold `show_status`'s inline duplicate of the expiry formula into the new method. - mcp/runtime.rs: drop the free-function import + update the cached-token DialPlan branch to `store.is_expired()`. - Tests: rewrite the three existing `is_expired` cases to method form and add `_sentinel_u64_max` covering the saturating_add behavior. Mira Tick 50 non-blocking cleanup.

Drop stale `#[allow(dead_code)]` markers now that runtime/flow wire all items through prod paths. Reuse `reqwest::Client` across the device-poll loop so TLS / TCP handshakes are amortized across the dozens-to-hundreds of polls a 30-minute device-flow window can produce. Trim session-narrative parentheticals from doc comments. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Reviewer-pass findings: - Drop dead auth.rs wrappers (`load/save/remove_namespaced_token`) that were never wired; runtime/tests use the `_at` siblings directly. - Extract `build_token_store` helper to dedupe the `TokenStore` construction + `u64::MAX` sentinel + refresh-rotation logic across `finish_login`, `finalize_device_login`, and `try_refresh_oauth_token`. - Preserve underlying error context in `complete_login` via `.with_context()` instead of `.map_err(|_| anyhow!(...))`. - Trim session-narrative parentheticals ("Tick N", "Mira's review") from doc comments; keep the technical reason that future readers need. - Refresh stale doc strings that still said "next slice" for behaviour Phase 2 itself ships. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

shaun-agent · 2026-06-01T12:01:52Z

OpenAB PR Screening

This is auto-generated by the OpenAB project-screening flow for context collection and reviewer handoff.
Click 👍 if you find this useful. Human review will be done within 24 hours. We appreciate your support and contribution 🙏

Title: feat(openab-agent): MCP Phase 2 — OAuth 2.1 paste-back (stacks on feat(openab-agent): native MCP client (ADR + implementation) #959)
Source: feat(openab-agent): MCP Phase 2 — OAuth 2.1 paste-back (stacks on #959) #962
Status: moved to PR-Screening
Generated at: 2026-06-01T12:02:32.192Z
Discord thread: not available

Screening report

posted screening and moved the project item to `PR-Screening`.

GitHub comment: #962 (comment)
Project action: https://github.com/orgs/openabdev/projects/1, item PVTI_lADOEFbZWM4BUUALzguVxqM, status now PR-Screening

Intent

PR #962 is trying to add Phase 2 of the openab-agent native MCP client: OAuth 2.1 paste-back authentication for Streamable HTTP MCP servers. The operator-visible problem is that protected HTTP MCP servers can be discovered but cannot yet complete a durable auth flow in headless or Discord-driven environments.

Feat

Feature work. It introduces provider resolution, OAuth config/discovery checks, PKCE/state primitives, pending paste-login persistence, token-store namespacing/fsync behavior, and runtime NeedsAuth state for protected HTTP servers. The PR is draft and stacked on #959, so reviewers should treat the current diff as noisy until Phase 1 lands and this branch rebases.

Who It Serves

Primary beneficiaries: agent runtime operators and deployers who need OpenAB agents to connect to protected MCP servers. Secondary beneficiaries: maintainers who need the MCP auth model split into reviewable phases before exposing mcp login UX.

Rewritten Prompt

Implement the OAuth 2.1 paste-back foundation for openab-agent MCP Streamable HTTP servers, stacked after #959. Add typed OAuth configuration, builtin provider metadata for Anthropic/OpenAI/xAI, custom provider resolution, PKCE S256 verifier/challenge generation, CSRF-resistant state handling, pending-login persistence in auth.json, and runtime state that marks protected servers as needing auth. Keep this slice below CLI UX: do not add the final mcp login command wiring unless it can be tested end to end. Include focused tests for provider resolution, state mismatch rejection, PKCE round-trip, secret redaction, auth persistence/fsync behavior, and feature combinations.

Merge Pitch

This should move forward as the next MCP capability slice because it builds the security-critical auth substrate before user-facing command wiring. Risk is medium-high: OAuth state, token persistence, and secret redaction are easy places to create durable security bugs. The likely reviewer concern is not whether OAuth is needed, but whether this draft can be reviewed while stacked on #959 and whether the implementation has enough tests around CSRF/state, PKCE, persistence, and redaction before it becomes the base for CLI login.

Best-Practice Comparison

OpenClaw is relevant mainly as an operational reference: durable job/state persistence, explicit routing, retries, and run logs map to the need for durable auth state and clear auth/runtime transitions. This PR appears aligned on durable persisted state, but should make failure and recovery behavior explicit before CLI UX lands.

Hermes Agent is also relevant: its fresh-session scheduled runs and atomic persisted state are a useful bar for auth.json writes and pending-login recovery. The paste-back design should be checked for atomic writes, lock behavior if multiple agent processes touch auth state, and deterministic recovery after interruption.

Implementation Options

Conservative: wait for #959 to merge, rebase, then review only the OAuth commits with no new scope. Require focused unit tests for PKCE, state mismatch, provider resolution, pending-login serialization, and secret redaction before marking ready.

Balanced: after rebase, merge the OAuth foundation behind the existing MCP feature gates, but split any large runtime behavior into a follow-up if it is not directly needed for paste-back state. Add a small integration-style test around protected HTTP server auth state without requiring a live vendor server.

Ambitious: rebase and complete the full paste-back user path in one PR, including mcp login, callback parsing, token exchange completion, redacted listing, and a mocked end-to-end OAuth server test. This gives reviewers a complete UX but increases review size and security blast radius.

Comparison Table

Option	Speed	Complexity	Reliability	Maintainability	User Impact	Fit for OpenAB now
Conservative	Fast after #959	Low	Good if tests are added	Good, smallest review	No immediate login UX	Strong fit for stacked draft cleanup
Balanced	Moderate	Medium	Better coverage of runtime behavior	Good if runtime scope stays tight	Prepares reliable next-slice CLI	Best fit
Ambitious	Slow	High	Potentially strong, but larger risk	Harder to review and bisect	Delivers visible login path	Premature while #959 is still pending

Recommendation

Use the balanced path after #959 lands: rebase, collapse the diff to OAuth-only commits, keep mcp login as the next slice, and require focused tests for PKCE/state, provider resolution, persistence/recovery, feature gates, and secret redaction before advancing from draft. If runtime changes remain large after rebase, split nonessential runtime expansion into Phase 3 so Masami or Pahud can review the security-sensitive OAuth core cleanly.

Concurrent connect() callers may each hit `try_refresh_oauth_token` with the same stale `refresh_token` (no per-server single-flight gate yet). The connect() race guard prevents state corruption; the worst case is N duplicate POSTs to the token endpoint. A Phase 3 follow-up will add per-server single-flight for providers that cascade-revoke on replayed refresh tokens. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Align ADR with implementation: the device-code polling loop persists tokens then sets status to Disconnected, so the next connect() reads the cached token via the oauth-aware dial path. Keeps the rmcp handshake out of the detached polling task. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Track the most-recent device-poll JoinHandle per server in `McpRuntimeManager.device_login_tasks`. On a fresh `start_device_login` the prior task is aborted before the new one is registered, so a retry after a transient failure can't leave two loops racing to finalize the same server (would risk double-persisting tokens / contradictory status transitions). Phase 2 follow-up openabdev#4 of 4. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Add `refresh_locks: HashMap<server, Arc<tokio::Mutex>>` and acquire the per-server lock before the refresh POST. After taking the lock, re-read the on-disk token; if a prior waiter already refreshed, return the cached store without a second POST. Closes the cascade-revoke window on providers (Google, ...) that invalidate every issued refresh_token when the same one is replayed concurrently. Phase 2 follow-up openabdev#1 of 4. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Replace the truncate+write+sync_all path with: write same-dir tmp (PID + atomic counter for uniqueness, mode 0o600 via create_new), fsync the tmp, rename(2) onto auth.json, fsync the parent directory. On crash, the file is either the prior version or the new version — never half-written. The old implementation truncated `auth.json` before writing, so a mid-write crash left a corrupt file that the next startup would fail to parse. On the ECS deployment that then re-restored a now-revoked refresh_token from S3 (ADR §6.1). Phase 2 follow-up openabdev#3 of 4. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

brettchien and others added 30 commits May 31, 2026 15:30

style(openab-agent/mcp): apply cargo fmt to config.rs

3d58865

CI fmt-check found 8 formatting deltas in the Phase 1 scaffold. No logic change. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

fix(openab-agent/mcp): satisfy clippy::unnecessary_sort_by

82c43e5

sort_by_key over (name, _) is the cleaner form; no behavior change. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

fix(openab-agent/mcp): satisfy cargo fmt --check on runtime test

dc1b0a6

The chained `.statuses().await.into_iter().map(|(n, _)| n).collect()` fits on one line under rustfmt's default 100-char width — broke it preemptively last tick on the wrong side of the threshold.

fix(openab-agent/mcp): collapse let-params chain to satisfy cargo fmt

2a859e6

The CallToolRequestParams chain fits in exactly 100 chars; rustfmt keeps it on one line. Burned a tick on a defensive break. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

fix(openab-agent/mcp): break dispatch().await.unwrap() chain past cha…

f36fb77

…in_width

fix(openab-agent/mcp): use is_none_or per clippy::unnecessary_map_or

6a3e9e0

brettchien and others added 16 commits June 1, 2026 10:34

fix(openab-agent/mcp): saturating_add for refresh expires_at

1e4863f

Mirrors finish_login (oauth.rs:271) so a hostile/buggy provider returning a huge expires_in can't trigger a debug-build overflow panic in the refresh-token rotation path.

brettchien force-pushed the feat/openab-agent-mcp-oauth branch from 93da300 to e5ea59f Compare June 1, 2026 10:36

brettchien and others added 2 commits June 1, 2026 11:21

brettchien and others added 5 commits June 1, 2026 12:08

brettchien mentioned this pull request Jun 1, 2026

tracking: openab-agent native MCP — Phase 3 (Resilience) #966

Open

12 tasks

brettchien marked this pull request as ready for review June 1, 2026 15:59

brettchien requested a review from thepagent as a code owner June 1, 2026 15:59

brettchien mentioned this pull request Jun 1, 2026

feat(openab-agent): MCP Phase 3 — circuit breaker + flag removal (stacks on #959 + #962) #969

Open

7 tasks

github-actions Bot added the pending-maintainer label Jun 1, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(openab-agent): MCP Phase 2 — OAuth 2.1 paste-back (stacks on #959)#962

feat(openab-agent): MCP Phase 2 — OAuth 2.1 paste-back (stacks on #959)#962
brettchien wants to merge 94 commits into
openabdev:mainfrom
brettchien:feat/openab-agent-mcp-oauth

brettchien commented Jun 1, 2026 •

edited

Loading

Uh oh!

shaun-agent commented Jun 1, 2026 •

edited

Loading

Intent

Feat

Who It Serves

Rewritten Prompt

Merge Pitch

Best-Practice Comparison

Implementation Options

Comparison Table

Recommendation

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

brettchien commented Jun 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

OAuth commits

Scope

Test plan

Uh oh!

shaun-agent commented Jun 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

OpenAB PR Screening

Intent

Feat

Who It Serves

Rewritten Prompt

Merge Pitch

Best-Practice Comparison

Implementation Options

Comparison Table

Recommendation

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

brettchien commented Jun 1, 2026 •

edited

Loading

shaun-agent commented Jun 1, 2026 •

edited

Loading