feat: auto-pause agent during Browserbase captcha solving (WIP)#1752

Open
derekmeegan wants to merge 9 commits into main from derek/autowait_captchas_on_browserbase

@derekmeegan
Contributor

@derekmeegan derekmeegan commented Feb 25, 2026

why

Automatically pause agent execution when Browserbase's captcha solver is active. Listens for browserbase-solving-started/finished/errored console messages and blocks the agent's prepareStep (DOM/hybrid) or action handler (CUA) until solving completes, errors, or hits a 90s timeout.

Also updates the system prompt to tell the agent not to interact with captchas since they are handled transparently.

what changed

added callbacks

test plan

Will review and test the code tomorrow.


Summary by cubic

Automatically pauses agent actions while Browserbase’s captcha solver runs to prevent accidental interactions and keep execution stable. We listen for solver console events and resume on finish, error, or a 90s timeout.

  • New Features
    • Added CaptchaSolver to track "browserbase-solving-started/finished/errored" via console.
    • Blocked prepareStep (DOM/hybrid) and action handler (CUA) while solving, with a 90s timeout.
    • Logged solver errors and reset an error flag after handling; clean attach/dispose lifecycle to avoid listener leaks.
    • Enabled by default for Browserbase sessions unless browserSettings.solveCaptchas is false.
    • Updated system prompt to tell the agent not to interact with captchas and that execution will auto-pause during solving.

Written for commit 6d8d7f3. Summary will update on new commits. Review in cubic
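
For readers following along, here is a minimal sketch of the console-event tracking described in the summary above. The PR's actual `CaptchaSolver` source is not shown in this thread, so `PageLike`, `ConsoleMessageLike`, and `CaptchaStateTracker` are hypothetical names, and the sketch assumes the console message text equals the event name exactly.

```ts
// Hypothetical minimal page shape; the real page type used by the PR is not shown here.
interface ConsoleMessageLike {
  text(): string;
}
type ConsoleListener = (msg: ConsoleMessageLike) => void;
interface PageLike {
  on(event: "console", listener: ConsoleListener): void;
  off(event: "console", listener: ConsoleListener): void;
}

type SolverState = "idle" | "solving" | "errored";

class CaptchaStateTracker {
  state: SolverState = "idle";
  private page: PageLike | null = null;

  // One stable listener reference so that off() removes exactly what on() added.
  private readonly listener: ConsoleListener = (msg) => {
    switch (msg.text()) {
      case "browserbase-solving-started":
        this.state = "solving";
        break;
      case "browserbase-solving-finished":
        this.state = "idle";
        break;
      case "browserbase-solving-errored":
        this.state = "errored";
        break;
      // Any other console output is ignored.
    }
  };

  attach(page: PageLike): void {
    this.dispose(); // never leave a listener behind on a previous page
    this.page = page;
    page.on("console", this.listener);
  }

  dispose(): void {
    this.page?.off("console", this.listener);
    this.page = null;
  }
}
```

Keeping the listener as a single field is what makes the attach/dispose lifecycle leak-free: `dispose()` can always remove the same function reference that `attach()` registered.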

Automatically pause agent execution when Browserbase's captcha solver is
active. Listens for browserbase-solving-started/finished/errored console
messages and blocks the agent's prepareStep (DOM/hybrid) or action handler
(CUA) until solving completes, errors, or hits a 90s timeout.

Also updates the system prompt to tell the agent not to interact with
captchas since they are handled transparently.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@changeset-bot

changeset-bot bot commented Feb 25, 2026

⚠️ No Changeset found

Latest commit: 4d9bd2a

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

@greptile-apps
Contributor

greptile-apps bot commented Feb 25, 2026

Greptile Summary

Integrated automatic captcha solving pause mechanism for Browserbase environments. Agent execution (both DOM/hybrid and CUA modes) now pauses when Browserbase's captcha solver is active, listening for console messages (browserbase-solving-started/finished/errored) and blocking agent steps/actions until solving completes or times out after 90 seconds.

Key changes:

  • Created CaptchaSolver class to track captcha solving state via console message listeners
  • Updated agent system prompt to tell agents not to interact with captchas
  • Integrated captcha solver into both V3AgentHandler (DOM/hybrid modes) and V3CuaAgentHandler (CUA mode)
  • Added isCaptchaSolverEnabled getter that checks Browserbase settings
  • Proper cleanup of listeners in all execution paths (success, error, abort)

Issues found:

  • Race condition in waitIfSolving() when called concurrently - second call overwrites resolveWait
  • Potential missed captcha start if console message arrives before listener attachment
  • Timeout handling slightly inconsistent with success/error paths

Confidence Score: 3/5

  • This PR has good architectural design but contains a race condition that could cause indefinite waiting in concurrent scenarios.
  • The implementation is well-structured with proper cleanup paths and integrates cleanly into both agent handler types. However, the CaptchaSolver.waitIfSolving() race condition could cause real issues if called concurrently (e.g., if the agent somehow triggers multiple steps simultaneously or if there's internal parallelism). Since this is marked WIP and the author plans to test tomorrow, these issues should be addressed before merge. The timeout handling inconsistency and potential listener attachment race are minor concerns.
  • Pay close attention to packages/core/lib/v3/agent/utils/captchaSolver.ts - the race condition in waitIfSolving() needs resolution before production use.

Important Files Changed

| Filename | Overview |
| --- | --- |
| packages/core/lib/v3/agent/utils/captchaSolver.ts | New captcha solver class to pause agent execution during Browserbase captcha solving via console messages. Has potential race condition in waitIfSolving(). |
| packages/core/lib/v3/handlers/v3AgentHandler.ts | Integrated captcha solver into DOM/hybrid agent execution, blocking prepareStep during captcha solving. Proper cleanup in all execution paths. |
| packages/core/lib/v3/handlers/v3CuaAgentHandler.ts | Integrated captcha solver into CUA agent execution, blocking action handler during captcha solving. Proper cleanup in finally block. |

Sequence Diagram

sequenceDiagram
    participant Agent as Agent (DOM/CUA)
    participant Handler as Agent Handler
    participant CS as CaptchaSolver
    participant Page as Browser Page
    participant BB as Browserbase

    Agent->>Handler: execute() / stream()
    Handler->>CS: new CaptchaSolver()
    Handler->>CS: attach(page)
    CS->>Page: on("console", listener)
    
    loop Each Step/Action
        Agent->>Handler: prepareStep() / actionHandler()
        Handler->>CS: waitIfSolving()
        
        alt No captcha solving
            CS-->>Handler: resolve immediately
            Handler->>Agent: continue
        else Captcha solving
            BB->>Page: console: "browserbase-solving-started"
            Page->>CS: listener triggered
            CS->>CS: solving = true
            Note over CS,Handler: Agent blocked
            BB->>Page: console: "browserbase-solving-finished"
            Page->>CS: listener triggered
            CS->>CS: solving = false, resolveWait()
            CS-->>Handler: resolve
            Handler->>Agent: continue
        else Timeout (90s)
            CS->>CS: timeout triggers
            CS->>CS: solving = false, lastSolveErrored = true
            CS-->>Handler: resolve
            Handler->>Handler: log error
            Handler->>Agent: continue
        end
    end
    
    Agent->>Handler: complete/error/abort
    Handler->>CS: dispose()
    CS->>Page: off("console", listener)
    CS->>CS: resolve pending wait

Last reviewed commit: 6d8d7f3

Contributor

@greptile-apps greptile-apps bot left a comment

5 files reviewed, 3 comments

Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

2 issues found across 5 files

Confidence score: 3/5

  • Concurrent waitIfSolving() calls in packages/core/lib/v3/agent/utils/captchaSolver.ts can orphan earlier waiters due to a single resolveWait slot, causing hangs until timeout for some callers.
  • packages/core/lib/v3/handlers/v3AgentHandler.ts may leak captchaSolver if streamText() throws before callbacks run, since disposal only happens inside stream handlers in stream().
  • These are concrete runtime risks (waiting hangs and resource leaks), so there is some user-impacting risk despite the rest of the change looking contained.
  • Pay close attention to packages/core/lib/v3/agent/utils/captchaSolver.ts and packages/core/lib/v3/handlers/v3AgentHandler.ts - concurrency waiting and disposal paths.
Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/core/lib/v3/agent/utils/captchaSolver.ts">

<violation number="1" location="packages/core/lib/v3/agent/utils/captchaSolver.ts:72">
P1: Bug: `resolveWait` is a single slot, so concurrent `waitIfSolving()` calls orphan earlier waiters. The second call overwrites the resolver, leaving the first caller's promise unresolved until its 90s timeout. Consider sharing a single deferred promise across all waiters, e.g.:

```ts
private waitPromise: Promise<void> | null = null;

waitIfSolving(): Promise<void> {
  if (!this.solving) return Promise.resolve();
  if (!this.waitPromise) {
    this.waitPromise = new Promise<void>((resolve) => {
      const timer = setTimeout(() => { /* ... */ }, SOLVE_TIMEOUT_MS);
      this.resolveWait = () => { clearTimeout(timer); resolve(); this.waitPromise = null; };
    });
  }
  return this.waitPromise;
}
```</violation>
</file>

<file name="packages/core/lib/v3/handlers/v3AgentHandler.ts">

<violation number="1" location="packages/core/lib/v3/handlers/v3AgentHandler.ts:293">
P2: Resource leak: if `streamText()` throws synchronously, `captchaSolver` is never disposed. The `execute()` method properly wraps this in a `try/finally`, but `stream()` only disposes inside stream callbacks (`onError`/`onFinish`/`onAbort`), which won't fire if `streamText` itself throws. Wrap the `streamText` call and subsequent setup in a try/catch that calls `captchaSolver?.dispose()` on failure.</violation>
</file>
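
The disposal fix suggested in the second violation could look roughly like the sketch below. `SolverLike`, `StreamCallbacks`, and `startStreamWithCleanup` are hypothetical stand-ins (the handler's real `stream()` body and the `streamText` signature are not shown in this thread); the point is only that the solver is disposed exactly once, whether the stream call throws synchronously or a callback fires later.

```ts
// Hypothetical stand-ins for the solver and the stream callbacks named in the review.
interface SolverLike {
  dispose(): void;
}
interface StreamCallbacks {
  onFinish: () => void;
  onError: (error: unknown) => void;
  onAbort: () => void;
}

function startStreamWithCleanup<T>(
  solver: SolverLike | undefined,
  startStream: (callbacks: StreamCallbacks) => T,
): T {
  let disposed = false;
  // Idempotent cleanup: safe to call from any callback and from the catch block.
  const disposeOnce = (): void => {
    if (disposed) return;
    disposed = true;
    solver?.dispose();
  };

  try {
    return startStream({
      onFinish: disposeOnce,
      onError: disposeOnce,
      onAbort: disposeOnce,
    });
  } catch (error) {
    // The callbacks never fire if the call itself throws, so clean up here.
    disposeOnce();
    throw error;
  }
}
```

The `disposed` flag matters because an error callback and a finish callback can both fire for the same stream; without it the solver could be disposed twice.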
Architecture diagram
sequenceDiagram
    participant User as Client/V3
    participant Handler as Agent Handler (DOM/CUA)
    participant Solver as NEW: CaptchaSolver
    participant Page as Browser Page (Browserbase)
    participant LLM as LLM Provider

    User->>Handler: execute(instruction)
    
    opt NEW: isBrowserbase && captchaSolverEnabled
        Handler->>Solver: attach(page)
        Solver->>Page: on("console", listener)
    end

    Note over Handler, LLM: CHANGED: System prompt tells LLM to ignore captchas

    loop Agent Step Loop
        Page-->>Solver: NEW: console("browserbase-solving-started")
        Note right of Solver: solving = true

        Handler->>Handler: prepareStep() / actionHandler()
        
        Handler->>Solver: NEW: waitIfSolving()
        
        alt Solver is active
            Note over Solver: NEW: Block execution (max 90s timeout)
            
            alt Solve Success
                Page-->>Solver: NEW: console("browserbase-solving-finished")
                Solver-->>Handler: Resolve (Resume)
            else Solve Error
                Page-->>Solver: NEW: console("browserbase-solving-errored")
                Solver-->>Handler: Resolve (Resume with error flag)
                Handler->>Handler: Log solver error & resetError()
            else Timeout (90s)
                Solver->>Solver: Internal timeout
                Solver-->>Handler: Resolve (Resume)
            end
        else Solver is idle
            Solver-->>Handler: Resolve immediately
        end

        Handler->>LLM: Request next action
        LLM-->>Handler: Action response
        Handler->>Page: Execute action
    end

    Handler->>Solver: NEW: dispose()
    Solver->>Page: off("console", listener)
    Handler-->>User: Result

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

- Fix race condition: concurrent waitIfSolving() callers now share a
  single deferred promise so no waiter is orphaned.
- Extract settle() helper for consistent timeout/resolve/cleanup paths.
- Fix stream() disposal: wrap streamText() in try/catch so captchaSolver
  is disposed if streamText throws synchronously.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
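
The commit itself is not shown in this thread, so the following is only a sketch of what the shared-deferred approach with a `settle()` helper might look like. `SharedWaitGate`, `start()`, and `finish()` are hypothetical names; `waitIfSolving`, `settle`, `lastSolveErrored`, and the 90s timeout come from the review discussion above.

```ts
const SOLVE_TIMEOUT_MS = 90_000;

class SharedWaitGate {
  lastSolveErrored = false;
  private solving = false;
  private waitPromise: Promise<void> | null = null;
  private resolveWait: (() => void) | null = null;
  private timer: ReturnType<typeof setTimeout> | null = null;
  private readonly timeoutMs: number;

  constructor(timeoutMs: number = SOLVE_TIMEOUT_MS) {
    this.timeoutMs = timeoutMs;
  }

  // Hypothetical hooks; in the PR these transitions are driven by console events.
  start(): void {
    this.solving = true;
    this.lastSolveErrored = false;
  }
  finish(): void {
    this.settle();
  }

  waitIfSolving(): Promise<void> {
    if (!this.solving) return Promise.resolve();
    // All concurrent callers share one promise, so no waiter is orphaned.
    if (!this.waitPromise) {
      this.waitPromise = new Promise<void>((resolve) => {
        this.resolveWait = () => resolve();
        this.timer = setTimeout(() => {
          this.lastSolveErrored = true; // a timeout is treated like a failed solve
          this.settle();
        }, this.timeoutMs);
      });
    }
    return this.waitPromise;
  }

  // Single exit path for finish, error, and timeout.
  private settle(): void {
    this.solving = false;
    if (this.timer !== null) clearTimeout(this.timer);
    this.timer = null;
    const resolve = this.resolveWait;
    this.resolveWait = null;
    this.waitPromise = null;
    resolve?.();
  }
}
```

Routing every outcome through `settle()` keeps the timer, the resolver, and the cached promise in sync, which is the inconsistency the first review flagged between the timeout path and the success/error paths.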
@derekmeegan
Contributor Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: dc13b5a25a

Comment on lines 315 to 316
const page = await this.v3.context.awaitActivePage();
captchaSolver.attach(page);

P1: Reattach captcha listener when active page changes

In execute() the solver is attached once to the initial page, but agent actions/tools later resolve the current page dynamically via awaitActivePage(), which can switch to a popup/new tab during the run. In that case captcha console events from the new active page are never observed, so waitIfSolving() will not block and the agent can keep interacting while Browserbase is solving. Please track active-page changes (or attach listeners per active page) instead of binding only once.

Comment on lines 197 to 200
const roadblocksSection = isBrowserbase
? `<roadblocks>
<note>captchas, popups, etc.</note>
<captcha>If you see a captcha, use the wait tool. It will automatically be solved by our internal solver.</captcha>
<captcha>Captchas are automatically detected and solved by the Browserbase captcha solver. Do NOT attempt to solve or interact with captchas yourself — they will be handled transparently without any action from you. Your execution will be paused automatically while a captcha is being solved and will resume once it is complete.</captcha>

P2: Gate captcha auto-pause prompt on solveCaptchas setting

The prompt now always claims execution is automatically paused for captchas whenever isBrowserbase is true, but runtime only enables this behavior when isCaptchaSolverEnabled (browserSettings.solveCaptchas !== false). For Browserbase sessions with solveCaptchas: false, the model is told not to act because pause is automatic even though no solver/pause is active, which can break captcha handling. This roadblocks text should be conditioned on the same enabled flag as the handlers.
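
A minimal sketch of the gating this comment asks for, assuming the prompt builder receives both flags. `isCaptchaSolverEnabled` mirrors the rule stated earlier in the thread (`browserSettings.solveCaptchas !== false` on Browserbase); `buildCaptchaNote`, the shortened prompt wording, and the empty string for the solver-off path are illustrative assumptions, not the PR's actual text.

```ts
interface BrowserSettingsLike {
  solveCaptchas?: boolean;
}

// Enabled by default on Browserbase unless solveCaptchas is explicitly false.
function isCaptchaSolverEnabled(
  isBrowserbase: boolean,
  browserSettings?: BrowserSettingsLike,
): boolean {
  return isBrowserbase && browserSettings?.solveCaptchas !== false;
}

// Hypothetical helper: only promise an automatic pause when the runtime will deliver it.
function buildCaptchaNote(captchaSolverEnabled: boolean): string {
  if (!captchaSolverEnabled) {
    return ""; // solver-off path: make no claim about automatic solving
  }
  return (
    "<captcha>Captchas are solved automatically by the Browserbase captcha solver. " +
    "Do NOT interact with them; execution pauses while a captcha is being solved.</captcha>"
  );
}
```

Deriving the prompt text and the handler behavior from the same boolean keeps the two from drifting apart, which is the mismatch described above for sessions with `solveCaptchas: false`.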

…aptchas

- CaptchaSolver now accepts a page-provider callback and re-attaches
  the console listener whenever the active page changes (popups, new
  tabs). This ensures captcha events are observed on whichever page is
  currently active.
- System prompt captcha messaging is now gated on captchaSolverEnabled
  (solveCaptchas !== false) rather than just isBrowserbase, so sessions
  with solveCaptchas: false don't incorrectly tell the agent that
  captchas are auto-solved.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
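
As a rough illustration of the page-provider change in this commit message, here is a sketch that re-attaches the console listener when the active page changes. The class name `ReattachingListener` and the `PageLike` shape are hypothetical; `ensureAttached`, `detachListener`, and `attachedPage` are the names quoted in the review of this commit, and the reset of `solving` on a page switch follows that review's suggestion rather than anything confirmed to be in the commit itself.

```ts
// Hypothetical minimal page shape; the real page type is not shown in this thread.
interface ConsoleMessageLike {
  text(): string;
}
type ConsoleListener = (msg: ConsoleMessageLike) => void;
interface PageLike {
  on(event: "console", listener: ConsoleListener): void;
  off(event: "console", listener: ConsoleListener): void;
}

class ReattachingListener {
  solving = false;
  private attachedPage: PageLike | null = null;
  private readonly pageProvider: () => Promise<PageLike>;

  private readonly listener: ConsoleListener = (msg) => {
    const text = msg.text();
    if (text === "browserbase-solving-started") {
      this.solving = true;
    } else if (
      text === "browserbase-solving-finished" ||
      text === "browserbase-solving-errored"
    ) {
      this.solving = false;
    }
  };

  constructor(pageProvider: () => Promise<PageLike>) {
    this.pageProvider = pageProvider;
  }

  // Call before each step so the listener follows popups and new tabs.
  async ensureAttached(): Promise<void> {
    const page = await this.pageProvider();
    if (page === this.attachedPage) return;

    this.detachListener();
    // Events from the old page can no longer arrive, so drop in-progress state.
    this.solving = false;
    this.attachedPage = page;
    page.on("console", this.listener);
  }

  dispose(): void {
    this.detachListener();
  }

  private detachListener(): void {
    this.attachedPage?.off("console", this.listener);
    this.attachedPage = null;
  }
}
```

Resolving the page through a callback on every step, instead of capturing it once, is what lets the solver observe captcha events on whichever page is currently active.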
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

1 issue found across 4 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/core/lib/v3/agent/utils/captchaSolver.ts">

<violation number="1" location="packages/core/lib/v3/agent/utils/captchaSolver.ts:54">
P1: Stale `solving` state after page change: when `ensureAttached()` detects a page switch and detaches the old listener, `this.solving` is not reset. If a solve was in progress on the old page, the agent will block for up to 90 seconds waiting for a finish/error event that can never arrive (since the old page's listener was removed). Reset `solving` and settle pending waiters when detaching from a changed page.</violation>
</file>

if (page === this.attachedPage) return;

// Detach from the old page
this.detachListener();
Contributor

@cubic-dev-ai cubic-dev-ai bot Feb 25, 2026

P1: Stale solving state after page change: when ensureAttached() detects a page switch and detaches the old listener, this.solving is not reset. If a solve was in progress on the old page, the agent will block for up to 90 seconds waiting for a finish/error event that can never arrive (since the old page's listener was removed). Reset solving and settle pending waiters when detaching from a changed page.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At packages/core/lib/v3/agent/utils/captchaSolver.ts, line 54:

<comment>Stale `solving` state after page change: when `ensureAttached()` detects a page switch and detaches the old listener, `this.solving` is not reset. If a solve was in progress on the old page, the agent will block for up to 90 seconds waiting for a finish/error event that can never arrive (since the old page's listener was removed). Reset `solving` and settle pending waiters when detaching from a changed page.</comment>

<file context>
@@ -29,13 +33,27 @@ export class CaptchaSolver {
+    if (page === this.attachedPage) return;
 
+    // Detach from the old page
+    this.detachListener();
+
+    this.attachedPage = page;
</file context>
Suggested change
this.detachListener();
this.detachListener();
// Reset solving state — we can no longer receive events from the old page,
// so any in-progress solve must be treated as resolved to avoid a 90s hang.
if (this.solving) {
this.solving = false;
this._lastSolveErrored = false;
this.settle();
}

derekmeegan and others added 6 commits February 25, 2026 12:18
The test was passing isBrowserbase: true but not captchaSolverEnabled,
so the auto-solve captcha messaging wasn't included in the prompt.
Updated test to pass captchaSolverEnabled: true and match new wording.
Added a new test case for isBrowserbase: true with captchaSolverEnabled:
false to verify the solver-off path.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>