Add browser pool parity fields to MCP #112

Open
IlyaasK wants to merge 26 commits into main from browser-pools-parity-mcp-tool

@IlyaasK IlyaasK commented Jun 1, 2026

Contributor

Summary

  • add update support to the existing manage_browser_pools MCP tool
  • add browser pool create/update parity fields from the SDK: start_url, chrome_policy, kiosk_mode, extension selection, profile ID/save settings, and viewport settings
  • add update-only discard_all_idle support without requiring agents to resend unchanged pool size/config
  • reuse shared browser config helpers for profile, extension, and viewport request shaping across manage_browsers and manage_browser_pools
  • derive browser config response payload types from SDK method params instead of locally re-declaring SDK-shaped objects
  • improve MCP agent ergonomics with compact browser-pool/browser list responses, structured next_actions, stable pool-ID follow-ups, and structured SSH forwarding guidance
  • centralize MCP response helpers, pagination response formatting, and pagination schemas so tools do not hand-roll repeated { items, has_more, next_offset } payloads
  • tighten MCP schema validation for integer-valued fields such as pagination, timeouts, viewport dimensions, click counts, and durations
  • address review-bot findings by preserving full default profile inventory listing, keeping explicit empty paginated pages as structured JSON, ignoring empty chrome_policy objects, and guarding nullish pool list responses
  • restore the cleaner computer_action schema/execution structure from main while keeping clipboard support and integer validation from this stack

Why

  • browser pools can now be configured with the same core browser-session options used when warming pool browsers
  • MCP clients should be able to create and update pools without dropping to the SDK/CLI for start URLs, Chrome policy, extensions, profiles, or viewport settings
  • update-only operations such as discard_all_idle should not force agents to resend unchanged pool size/configuration
  • agents need compact list responses for selection, full get responses for inspection, and explicit next-step guidance after create/update/acquire operations
  • default profile listing should remain a complete inventory for agents, while explicit limit/offset still provides paginated control when requested
  • explicit empty pages should not read like global inventory absence, because that can make agents run setup or recovery flows unnecessarily
  • keeping browser config mapping, response formatting, and pagination behavior in shared helpers avoids drift across stacked project/API-key/browser/app/profile changes

Agent Experience / Flow

This PR gives agents a fast-session workflow for repeated browser work. Instead of creating a fresh browser for every task, an agent can configure a pool once, acquire pre-warmed sessions, use normal browser tools, and release sessions back into the pool.

Typical flow:

  1. Agent decides a pool is useful when a workflow needs repeated browser sessions with the same profile, viewport, policy, start URL, extension, proxy, or timeout configuration.
  2. Agent calls manage_browser_pools create size=<n> name=<name> with the same browser-shaping options it would otherwise pass to manage_browsers create, such as profile_id, start_url, chrome_policy, viewport_width, and extension_id.
  3. Agent uses manage_browser_pools list for a compact pool inventory and manage_browser_pools get id_or_name=<pool_id> for full details.
  4. Agent calls manage_browser_pools acquire id_or_name=<pool_id> to get a browser session.
  5. Agent uses the returned session_id with computer_action, execute_playwright_code, browser_curl, or exec_command.
  6. Agent calls manage_browser_pools release id_or_name=<pool_id> session_id=<id> reuse=true when the session can return to the pool, or reuse=false when it should be discarded and replaced.
  7. Agent calls manage_browser_pools update id_or_name=<pool_id> with only the fields it wants to change. size is required for create, but update accepts partial bodies such as discard_all_idle=true.
  8. Agent uses flush to remove idle browsers without deleting the pool, and delete only when the pool is no longer needed.
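The flow above can be sketched as a sequence of tool-call argument objects. This is illustrative only: the `action` key name, the `ToolCall` type, the `poolLifecycle` helper, and every literal value (pool name, size, URL) are assumptions, not taken from the tool schema in this PR.

```typescript
// Hypothetical shape of an MCP tool call; the `action` key is an assumption.
type ToolCall = { tool: string; args: Record<string, unknown> };

// Lists the calls an agent might issue for one pooled session, in order.
function poolLifecycle(poolId: string, sessionId: string): ToolCall[] {
  const pools = (args: Record<string, unknown>): ToolCall => ({
    tool: "manage_browser_pools",
    args,
  });
  return [
    // create: size is required; other browser-shaping options are optional.
    pools({ action: "create", size: 3, name: "example-pool", start_url: "https://example.com" }),
    pools({ action: "acquire", id_or_name: poolId }),
    // Normal browser tools take the session_id returned by acquire.
    { tool: "computer_action", args: { session_id: sessionId } },
    pools({ action: "release", id_or_name: poolId, session_id: sessionId, reuse: true }),
    // update: a partial body is enough, size is not resent.
    pools({ action: "update", id_or_name: poolId, discard_all_idle: true }),
    pools({ action: "flush", id_or_name: poolId }),
  ];
}
```

Follow-up calls use the pool ID rather than the display name, matching the stable pool-ID behavior described in this PR.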

Agent-facing response behavior:

  • manage_browser_pools create/update returns a compact browser_pool summary plus next_actions.
  • manage_browser_pools acquire returns a compact browser summary plus next_actions for control/release/get-detail follow-up.
  • manage_browser_pools list returns compact summaries and points agents to get for full details.
  • Pool next-actions use the stable pool id, not the display name.
  • manage_browsers create/update returns structured JSON with browser and next_actions.
  • SSH forwarding guidance is now structured under ssh_port_forwarding instead of appended as mixed JSON + markdown.
  • Paginated tools share the same response shape through paginatedJsonResponse(page, { mapItem, note }); paginated empty pages always stay structured JSON with items, has_more, and next_offset.
  • manage_profiles list keeps the old agent-friendly default of returning the full matching inventory unless the caller explicitly asks for limit or offset.
  • Empty full-inventory guidance remains available through itemsJsonResponse, not through paginated responses.
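A minimal sketch of the split between paginated and full-inventory responses described above. The function names `paginatedSketch` and `itemsSketch`, the `PageLike` type, and the defaulting of missing `has_more`/`next_offset` values are assumptions for illustration; the actual helpers in src/lib/mcp/responses.ts may differ.

```typescript
// MCP tool results carry a list of content parts; only text parts are used here.
type ToolResponse = { content: { type: "text"; text: string }[]; isError?: boolean };

// Assumed structural page shape, not the SDK's actual page class.
interface PageLike<T> {
  getPaginatedItems(): T[];
  has_more?: boolean;
  next_offset?: number | null;
}

// Paginated pages always stay structured JSON, even when the page is empty.
function paginatedSketch<T, R>(
  page: PageLike<T>,
  opts: { mapItem: (item: T) => R; note?: string },
): ToolResponse {
  const payload = {
    items: page.getPaginatedItems().map(opts.mapItem),
    has_more: page.has_more ?? false,
    next_offset: page.next_offset ?? null,
    ...(opts.note ? { note: opts.note } : {}),
  };
  return { content: [{ type: "text", text: JSON.stringify(payload, null, 2) }] };
}

// Full-inventory listings may replace an empty result with setup guidance.
function itemsSketch<T>(items: T[], emptyText: string): ToolResponse {
  const text = items.length === 0 ? emptyText : JSON.stringify({ items }, null, 2);
  return { content: [{ type: "text", text }] };
}
```

Keeping the empty-guidance text out of the paginated helper means an empty page at a high offset cannot be mistaken for "nothing exists".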

How

  • typed pool create/update request bodies from KernelClient["browserPools"]
  • SDK-derived browser config output types from KernelClient["browsers"] and KernelClient["browserPools"]
  • shared profile, extension, viewport, and start URL request shaping in src/lib/mcp/browser-config.ts
  • shared response helpers in src/lib/mcp/responses.ts, including separate item and paginated response helpers
  • shared pagination input schema in src/lib/mcp/schemas.ts
  • create requires size; update allows partial configuration/update-only bodies and rejects only completely empty update requests
  • empty chrome_policy: {} is ignored, so it does not count as a real update field or get sent to the API
  • compact pool/browser summaries intentionally omit routine connection details while keeping full details available through get
  • computer_action keeps a single action schema and prefix executor so clipboard support does not split or duplicate the computer-use surface
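The update rules above (partial bodies allowed, empty chrome_policy ignored, completely empty updates rejected) could look roughly like the following. `PoolUpdateInput` and `buildUpdateBodySketch` are hypothetical names, the field list is a subset chosen for illustration, and the error message is invented.

```typescript
// Assumed subset of update fields; the real tool accepts more.
interface PoolUpdateInput {
  size?: number;
  start_url?: string;
  kiosk_mode?: boolean;
  discard_all_idle?: boolean;
  chrome_policy?: Record<string, unknown>;
}

function buildUpdateBodySketch(input: PoolUpdateInput): Record<string, unknown> {
  const body: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(input)) {
    if (value === undefined) continue;
    // An empty chrome_policy object does not count as a real update field.
    if (key === "chrome_policy" && Object.keys(value as object).length === 0) continue;
    body[key] = value;
  }
  // Only a completely empty update is rejected; size is not required here.
  if (Object.keys(body).length === 0) {
    throw new Error("update requires at least one field to change");
  }
  return body;
}
```

With this shape, `buildUpdateBodySketch({ discard_all_idle: true })` yields a one-field body, while `buildUpdateBodySketch({ chrome_policy: {} })` throws instead of sending an empty update.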

Behavior changes

Intentional changes worth calling out for reviewers (none are regressions):

  • Pagination now rejects out-of-range limit/offset at the schema level (.min/.max) instead of relying on the old silent API-side clamp.
  • Not-found and validation responses in manage_apps and manage_profiles now set isError: true instead of returning plain content, so MCP clients surface them as errors.
  • An empty chrome_policy: {} is now dropped rather than sent as an update field.
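To make the first two changes concrete, here is a hand-rolled sketch of rejecting out-of-range pagination and surfacing it with isError: true. In the PR the bounds live in the Zod schemas; this sketch avoids the Zod dependency, and the function name `checkPagination` and the specific bounds (limit 1 to 100, offset at least 0) are assumptions.

```typescript
type ToolResponse = { content: { type: "text"; text: string }[]; isError?: boolean };

// Assumed bounds for illustration; the real schema limits may differ.
const LIMIT_MIN = 1;
const LIMIT_MAX = 100;

// Returns null when the inputs are acceptable, otherwise an MCP error result.
function checkPagination(limit: number, offset: number): ToolResponse | null {
  const problems: string[] = [];
  if (!Number.isInteger(limit) || limit < LIMIT_MIN || limit > LIMIT_MAX) {
    problems.push(`limit must be an integer between ${LIMIT_MIN} and ${LIMIT_MAX}`);
  }
  if (!Number.isInteger(offset) || offset < 0) {
    problems.push("offset must be a non-negative integer");
  }
  if (problems.length === 0) return null;
  // isError lets MCP clients treat this as a failure rather than plain content.
  return { content: [{ type: "text", text: problems.join("; ") }], isError: true };
}
```

The point of rejecting early is that the caller learns the request was invalid instead of silently receiving a clamped page.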

Validation

Static and build checks:

  • bunx prettier --check src/lib/mcp/responses.ts src/lib/mcp/tools/apps.ts src/lib/mcp/tools/browsers.ts src/lib/mcp/tools/profiles.ts
  • git diff --check
  • git diff --cached --check
  • bunx tsc --noEmit --incremental false
  • production bun run build with dummy deployment env and network access for Google Fonts

Focused checks from the latest commit:

  • Direct response-helper smoke verified unpaginated empty inventories can still show setup guidance through itemsJsonResponse.
  • Direct response-helper smoke verified paginated empty pages preserve structured JSON with items: [], has_more, and next_offset.
  • Direct response-helper smoke verified paginated responses stay structured even if emptyText is forced through at runtime.
  • TypeScript verified emptyText is no longer part of paginatedJsonResponse options.

Earlier PR validation retained:

  • In-process toolset normalization test verified manage_computer_action, manage_search_docs, and legacy browser_utilities aliases disable the expected toolsets, and unknown disabled toolsets throw a clear error.
  • Focused MCP handler smoke verified default profile listing auto-iterates all matching profiles, explicit profile pagination preserves page metadata, empty pool chrome_policy returns a clean MCP error instead of sending an empty update, non-empty chrome_policy is still sent and summarized, and nullish pool list responses return No browser pools found.
  • Local MCP HTTP smoke against http://localhost:3002/mcp verified initialize, OPTIONS /mcp, tools/list, resources/list, prompts/list, schema assertions, and search_docs dummy-env behavior.
  • Live MCP validation checks verified fractional/invalid inputs return isError: true for pagination, pool size, browser_curl, computer_action, exec_command, and manage_browsers update validation paths.

Platform/API smoke context:

  • Full localhost browser-pool CRUD against the supplied org/key remains blocked by the platform plan gate: 403 Browser pools require a Start-Up or Enterprise plan.
  • The latest pass used a dummy opaque Bearer token for MCP transport/schema/validation tests, so it did not perform real Kernel API CRUD.

Notes

  • bun run format:check still fails on pre-existing AGENTS.md markdown table formatting. This PR uses touched-file Prettier checks to avoid unrelated project-instruction churn.

Note

Medium Risk
Broad MCP surface changes (pool updates, pagination validation, error isError behavior) could affect agent workflows; no auth changes but SDK upgrade ties behavior to 0.60.0 APIs.

Overview
Extends manage_browser_pools with update, SDK parity fields on create/update (start_url, chrome_policy, kiosk_mode, extensions, profile ID/save, viewport, discard_all_idle), and bumps @onkernel/sdk to 0.60.0.

browser-config.ts centralizes profile/extension/viewport/start URL shaping for manage_browsers and pools; list/get responses use compact summaries, next_actions, structured ssh_port_forwarding, and shared paginatedJsonResponse / itemsJsonResponse helpers. MCP resources move to browser-pools:// with registerJsonResourceTemplate for per-entity URIs; pagination and numeric tool inputs get stricter Zod bounds. Docs add browser_curl and clipboard on computer_action.

Reviewed by Cursor Bugbot for commit 6f3e4a7.

@vercel

vercel Bot commented Jun 1, 2026


Project: mcp
Deployment: Ready
Actions: Preview, Comment
Updated (UTC): Jun 22, 2026 3:44pm

Comment thread src/lib/mcp/tools/browser-pools.ts Outdated
width,
height,
...(params.viewport_refresh_rate !== undefined && {
refresh_rate: params.viewport_refresh_rate,

@vercel vercel Bot Jun 1, 2026


Browser pool update action fails with "size is required" error even when user only wants to update other fields


@firetiger-agent


Monitoring Plan: Add update action to manage_browser_pools MCP tool

What this PR does: Lets AI agents update existing browser pool configurations (size, profile, extensions, viewport, chrome policy, kiosk mode, etc.) via the MCP tool — previously only create was supported.

Intended effect:

  • PATCH /browser_pools/{id_or_name} volume: baseline 119–169 req/hr; confirmed if requests flow successfully with ≤1% 4xx rate post-deploy
  • PATCH 5xx rate: baseline 0/hr; confirmed if stays at 0

Risks:

  • Field-scope validation regression: actionFieldError() may block valid fields; alert if PATCH 4xx rate exceeds 5/hr for 2 consecutive hours (baseline: 0–1/hr)
  • size required on update: buildPoolConfigParams() throws if size is omitted; callers intending to patch only chrome_policy or kiosk_mode will get a silent MCP error; alert if any "size is required" log appears in API logs post-deploy
  • discard_all_idle pool drain — updating a live pool with this flag destroys idle browsers; alert if kernel_browser_pool_building_count sustains above 30 for >15 min
  • Acquire 5xx regression — misconfigured chrome_policy on a pool could cause new provisions to fail; alert if acquire 5xx exceed 20/hr for 2 consecutive hours (baseline: 0–20/hr isolated)
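The "exceeds N/hr for 2 consecutive hours" conditions above amount to a run-length check over hourly counts. A minimal sketch, assuming hourly error counts are already available as an array; `sustainedBreach` is a hypothetical helper, not part of the monitoring tool.

```typescript
// Returns true once `consecutiveHours` hourly counts in a row exceed `threshold`.
function sustainedBreach(
  hourlyCounts: number[],
  threshold: number,
  consecutiveHours: number,
): boolean {
  let run = 0;
  for (const count of hourlyCounts) {
    // "Exceeds" is strict: a count equal to the threshold resets the run.
    run = count > threshold ? run + 1 : 0;
    if (run >= consecutiveHours) return true;
  }
  return false;
}
```

For the PATCH 4xx rule this would be called as `sustainedBreach(counts, 5, 2)`, so a single isolated spike does not alert.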

Status updates will be posted automatically on this PR as monitoring progresses.


@firetiger-agent


Monitoring plan created for browser pool update action and field-scope enforcement in MCP server.

Affected services: kernel-mcp-server (Vercel Preview), api

Key signals to watch (baseline: 24h pre-deploy):

  • PATCH /browser_pools/{id_or_name} error rate (4xx/5xx) — expect stable or slight increase from new update calls
  • kernel_browser_pool_building_count — watch for sustained spike if discard_all_idle is used
  • POST /browser_pools/{id_or_name}/acquire 5xx rate — baseline has sparse spikes (0–20/hr); sustained elevation would indicate pool configuration regression

Alert thresholds:

  • PATCH 4xx rate exceeds 5 errors/hr (baseline: 0–1/hr)
  • PATCH 5xx any spike (baseline: 0/hr)
  • Acquire 5xx sustained above 20/hr for 2+ hours (post-deploy)



@masnwilliams masnwilliams left a comment

Collaborator

reviewed — large but well-structured: extracts shared helpers (responses.ts, browser-config.ts, resource-templates.ts) and brings browser pools to create/update parity. typechecks clean across the branch. this is the cleanup the earlier PRs were pointing at — resolves the duplicate textResponse, and the ResourceTemplate migration likely fixes "get one by URI" resources that the old static-URI + startsWith approach couldn't actually serve. also removes the dead extension-check from #110.

worth confirming before merge

  • pool update size cast (browser-pools.ts, ~updateParams as BrowserPoolUpdateParams): verified BrowserPoolUpdateParams.size is required in @onkernel/sdk@0.58.0, so the cast silences a real type error. it rests entirely on the comment's claim that the backend PATCH accepts partial bodies. if true → the SDK type is wrong, please file an SDK/OpenAPI fix and reference it here. if not → an update that omits size (e.g. just toggling headless) will 4xx at runtime despite typechecking. this is the one real risk in the PR.
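The concern can be reproduced in isolation. In the sketch below, `UpdateParamsLike` is a stand-in for the SDK's BrowserPoolUpdateParams as described in this review (size required); the real type has more fields, and the helper name is hypothetical.

```typescript
// Stand-in for the SDK type: size is required, other fields optional.
interface UpdateParamsLike {
  size: number;
  kiosk_mode?: boolean;
}

// Reports what `size` actually is at runtime, regardless of the static type.
function sizeTypeAtRuntime(body: UpdateParamsLike): string {
  return typeof body.size;
}

const partialBody: Partial<UpdateParamsLike> = { kiosk_mode: true };

// Assigning without a cast does not compile, because size is optional here:
// const direct: UpdateParamsLike = partialBody;

// The cast compiles, but the value still has no size.
const castBody = partialBody as UpdateParamsLike;
```

Here `sizeTypeAtRuntime(castBody)` returns "undefined", so whether the request succeeds depends entirely on the backend accepting a body without size.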

notes

  • resource URI changed browser_pools:// → browser-pools:// (hyphen). documented in the README diff so it's intentional; just confirm nothing external referenced the old underscore form. note the other three schemes stayed single-word (apps://, browsers://, profiles://), so this is the only one that changed.

agent-fit

no new concern — pool lifecycle (incl. flush/delete) is core. cross-cutting: still no MCP annotations on any tool; adding readOnlyHint/destructiveHint here would let clients auto-confirm reads and prompt on flush/delete.

@socket-security

socket-security Bot commented Jun 3, 2026


Review the following changes in direct dependencies.

Diff: Updated
Package: @onkernel/sdk 0.58.0 ⏵ 0.60.0
Supply Chain Security: 77 (+5)
Vulnerability: 100
Quality: 100
Maintenance: 98 (+1)
License: 100


Comment thread src/lib/mcp/tools/profiles.ts Outdated
Resolve the active Cursor profile pagination bug by preserving structured JSON for explicit empty pages, restore the cleaner computer_action structure from main with clipboard support, derive browser config payload types from the SDK, and remove stale dead types.
Remove emptyText from paginated response options so explicit empty pages always return items, has_more, and next_offset. Keep empty guidance only on non-paginated item responses and update app/browser list callers accordingly.
@masnwilliams

Collaborator

reviewed the full diff — solid refactor + additive feature. no lost functionality (every tool/action/param preserved; deletions moved into the shared helpers), SDK types are derived everywhere they should be, and tsc --noEmit is clean against sdk 0.60.0. a few small things:

nits

  • src/lib/mcp/tools/apps.ts:9: textResponse is imported but never used. dead import (tsc misses it since noUnusedLocals isn't set, but lint may flag it).
  • src/lib/mcp/tools/browser-pools.ts:258: fill_rate_per_minute is the only numeric pool field left as a bare z.number(); every sibling got .int().min(). if it's an integer percentage, .int().min(0).max(100) would match the others; if fractional is allowed, at least .min(0).

question

  • src/lib/mcp/responses.ts:1: PaginatedPage<T> is a hand-written structural type rather than derived from the SDK page type. works fine via structural typing, but it's the one spot that wouldn't fail at compile time if the SDK renamed getPaginatedItems/next_offset. intentional to decouple from SDK internals, or worth deriving?

behavior changes worth a line in the PR body (not bugs — all defensible)

  • pagination now rejects out-of-range limit/offset (schema .min/.max) instead of relying on the old silent API clamp.
  • not-found / validation responses in manage_apps and manage_profiles now set isError: true (previously plain content) — arguably a correctness improvement.
  • empty chrome_policy: {} is now dropped rather than sent as an update field.

nice work — the net-negative line counts in browsers/profiles/apps are the good kind of churn.

- Remove unused textResponse import in apps.ts (dead import)
- Add .min(0) to browser pool fill_rate_per_minute so the schema
  rejects negatives; left unbounded above and non-integer since the
  SDK/API type it as a plain number (fractional/>100% rates are valid)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@IlyaasK

IlyaasK commented Jun 22, 2026

Contributor Author

thanks @masnwilliams — addressed in b3b3c74:

nits

  • apps.ts:9 — dropped the unused textResponse import.
  • fill_rate_per_minute — added .min(0). Left it unbounded above and non-integer on purpose: the SDK/OpenAPI type it as a plain number ("Percentage of the pool to fill per minute. Defaults to 10%") with no int/max, so .int().max(100) would reject inputs the API actually accepts (fractional rates, or >100%/min fast-fill). .min(0) just stops negatives, which is the only bound that's unambiguously safe.
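The difference between the count-style fields and the percentage field can be sketched without Zod as two predicates. Both function names are hypothetical, the lower bound of 1 for counts is an assumption, and the finiteness check is this sketch's own addition rather than something stated in the thread.

```typescript
// Count-style fields such as size or timeout_seconds: whole numbers only.
// The minimum of 1 is an assumption for illustration.
function isValidCount(value: number): boolean {
  return Number.isInteger(value) && value >= 1;
}

// fill_rate_per_minute is a percentage: fractional values and values above
// 100 are accepted, and only negatives are rejected.
function isValidFillRate(value: number): boolean {
  return Number.isFinite(value) && value >= 0;
}
```

Under these rules 12.5 and 150 are acceptable fill rates, while 2.5 is not an acceptable pool size.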

question — PaginatedPage<T>

Keeping it hand-written; intentional. It's a generic helper consumed by every paginated endpoint (profiles/apps/browsers), each a distinct SDK page class, so a structural "anything with getPaginatedItems() / has_more / next_offset" is the right shape — deriving from a concrete page type would either pin it to one endpoint or need a hand-maintained union.

Worth noting the exposure is narrower than it looks: getPaginatedItems() is required, so a real SDK page stops being assignable the moment the SDK renames it → the call site breaks at compile time. Only the two optional fields (has_more/next_offset) can silently drift to undefined on a field rename — and deriving wouldn't fix that either, since the SDK types those optional too, so any supertype inherits the optionality. So the structural type isn't actually giving up compile-time safety a derived one would buy.

(No test for that optional-field drift since there's no test runner in the repo yet — didn't want to pull a framework into this PR. Happy to add one separately if we want the insurance.)
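A small sketch of the structural-typing argument above. The field types on `PaginatedPage<T>` are assumptions (the thread only names the members), and both page classes are fakes standing in for distinct SDK page classes.

```typescript
// Assumed member types; only the member names come from the discussion.
interface PaginatedPage<T> {
  getPaginatedItems(): T[];
  has_more?: boolean;
  next_offset?: number | null;
}

// A fake page class that matches the structural type.
class FakeProfilesPage {
  has_more = true;
  next_offset = 20;
  getPaginatedItems(): string[] {
    return ["profile-a", "profile-b"];
  }
}

// Simulates an SDK release that renamed has_more to hasMore.
// Renaming getPaginatedItems instead would make this class unassignable
// to PaginatedPage<T>, which is a compile-time error at the call site.
class FakeRenamedPage {
  hasMore = true;
  getPaginatedItems(): number[] {
    return [1, 2, 3];
  }
}

function pageMeta<T>(page: PaginatedPage<T>) {
  return {
    count: page.getPaginatedItems().length,
    has_more: page.has_more ?? false,
    next_offset: page.next_offset ?? null,
  };
}
```

`pageMeta(new FakeRenamedPage())` still compiles and reports has_more as false, which is the silent optional-field drift described above.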

behavior changes

Documented in the PR body now under a new "Behavior changes" section (out-of-range pagination rejection, isError: true on not-found/validation, empty chrome_policy dropped).

Comment thread src/lib/mcp/tools/browser-pools.ts
Match the .int() validation used by sibling pool fields (size,
timeout_seconds) per the MCP integer-param convention; fill rate
percentages are whole numbers in practice.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Comment thread src/lib/mcp/tools/browser-pools.ts
Revert the .int() constraint: fill_rate_per_minute is a percentage,
not a count, and the SDK/API type it as a plain number with no integer
restriction, so .int() rejected valid fractional rates. Added a comment
noting the intentional deviation from the sibling count fields.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Comment thread src/lib/mcp/tools/browser-pools.ts
The acquire next_actions echoed the caller's id_or_name (possibly a
display name) in the release hint, while create/update use the stable
pool id. Prefer browser.pool.id, falling back to id_or_name only when
the SDK omits the pool ref, so follow-ups survive a pool rename.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
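A sketch of the fallback described in the commit above. `AcquiredBrowser` is a reduced, assumed shape of the acquire result, and `releaseHint` plus the `action` key are hypothetical names for illustration.

```typescript
// Assumed minimal shape of an acquired browser; the SDK type has more fields.
interface AcquiredBrowser {
  session_id: string;
  pool?: { id: string };
}

// Builds the release follow-up, preferring the stable pool id over the
// caller-supplied id_or_name, which may be a display name.
function releaseHint(browser: AcquiredBrowser, idOrName: string) {
  const poolRef = browser.pool?.id ?? idOrName;
  return {
    tool: "manage_browser_pools",
    args: { action: "release", id_or_name: poolRef, session_id: browser.session_id },
  };
}
```

Because the hint carries the pool id whenever the response includes one, it stays valid if the pool is renamed between acquire and release.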

@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes using high effort and found 1 potential issue.


Reviewed by Cursor Bugbot for commit 87a580c.

Comment thread src/lib/mcp/tools/browser-pools.ts Outdated
buildPoolConfigParams included proxy_id whenever it was defined, so an
empty string was forwarded to the API on pool create/update. Match the
truthiness check manage_browsers create uses (and the sibling name
field) so empty proxy_id is dropped; pools have no clear-proxy semantics
for an empty string.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
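The difference between a defined-check and a truthiness check is easy to see in a reduced version of the builder. `PoolParams` and `buildPoolConfigSketch` are hypothetical and cover only three fields; the real buildPoolConfigParams handles more.

```typescript
interface PoolParams {
  size?: number;
  name?: string;
  proxy_id?: string;
}

function buildPoolConfigSketch(params: PoolParams): Record<string, unknown> {
  return {
    // Numeric field: include whenever it is defined.
    ...(params.size !== undefined ? { size: params.size } : {}),
    // String fields: truthiness check, so "" is dropped like undefined.
    ...(params.name ? { name: params.name } : {}),
    ...(params.proxy_id ? { proxy_id: params.proxy_id } : {}),
  };
}
```

With the truthiness check, `buildPoolConfigSketch({ size: 2, proxy_id: "" })` produces a body with size only, so an empty string is never forwarded to the API.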