
Fix observe context for file inputs; add upload eval/example #1718

Draft
shrey150 wants to merge 2 commits into main from codex/file-input-context-fix

@shrey150 shrey150 commented Feb 20, 2026

Summary

  • capture input[type] metadata in the DOM snapshot pipeline (domMapsForSession + buildSessionDomIndex)
  • thread input-type metadata into hybrid capture and a11y shaping
  • keep structural file inputs during a11y pruning and synthesize missing file-input nodes when AX omits them
  • add/extend snapshot tests to cover file-input metadata propagation, retention, and synthetic injection
  • add observe_file_input_upload eval task showing observe -> unpack selector -> page.locator(xpath).setInputFiles(...)
  • add matching core example: packages/core/examples/observe_file_input_upload.ts
  • register the new eval in packages/evals/evals.config.json
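
The observe -> unpack selector -> setInputFiles flow listed above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `ObservedAction`, `UploadablePage`, `unpackXPath`, and `uploadViaObservedSelector` are hypothetical names, and the assumption that an observed selector may carry an `xpath=` prefix should be checked against the actual observe() output.

```typescript
// Hypothetical shape of one observe() result; only `selector` matters here.
interface ObservedAction {
  selector: string;
  description?: string;
}

// Minimal structural type for a page: anything exposing
// locator(...).setInputFiles(...) fits. This is an assumption, not the real type.
interface UploadablePage {
  locator(selector: string): { setInputFiles(files: string): Promise<void> };
}

// Assumption: the selector is an XPath, optionally prefixed with "xpath=".
function unpackXPath(action: ObservedAction): string {
  const raw = action.selector.trim();
  const xpath = raw.startsWith("xpath=") ? raw.slice("xpath=".length) : raw;
  // Absolute/relative XPaths start with "/", grouped ones with "(".
  if (!(xpath.startsWith("/") || xpath.startsWith("("))) {
    throw new Error(`Expected an XPath selector, got: ${action.selector}`);
  }
  return xpath;
}

// Unpacks the observed selector and hands the file path to the file input.
async function uploadViaObservedSelector(
  page: UploadablePage,
  action: ObservedAction,
  filePath: string,
): Promise<string> {
  const xpath = unpackXPath(action);
  await page.locator(xpath).setInputFiles(filePath);
  return xpath;
}
```

Validating the selector before calling the locator keeps a bad observation from turning into a confusing upload failure later in the script.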

Testing

  • targeted unit tests (source-level vitest):
    • tests/snapshot-dom-session-builders.test.ts
    • tests/snapshot-a11y-tree-utils.test.ts
    • tests/snapshot-a11y-resolvers.test.ts
    • tests/snapshot-capture-orchestration.test.ts
    • result: all passing (48 tests)

Related


Summary by cubic

Fixes observe() context for file upload inputs so the correct input field is returned as a reliable XPath and works with setInputFiles. Adds an example and eval to validate file uploads end-to-end.

  • Bug Fixes

    • Capture input[type] metadata from the DOM and pass it through snapshot capture and a11y shaping.
    • Keep structural file inputs during pruning and synthesize missing nodes when AX omits them.
    • Expand snapshot tests to cover propagation, retention, and synthetic injection of file inputs.
  • New Features

    • Add observe_file_input_upload eval and register it in evals.config.json.
    • Add core example demonstrating observe() + setInputFiles for a resume upload.

Written for commit 3db73a0. Summary will update on new commits.

changeset-bot bot commented Feb 20, 2026

🦋 Changeset detected

Latest commit: 3db73a0

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 3 packages
| Name | Type |
| --- | --- |
| @browserbasehq/stagehand | Patch |
| @browserbasehq/stagehand-evals | Patch |
| @browserbasehq/stagehand-server | Patch |

greptile-apps bot commented Feb 20, 2026

Greptile Summary

Fixed file input observation by capturing input[type] metadata throughout the snapshot pipeline and ensuring file inputs remain visible in accessibility trees.

Key changes:

  • Added extractInputType helper in domTree.ts to parse input type attributes from CDP DOM nodes
  • Threaded inputTypeMap through SessionDomIndex, FrameDomMaps, and A11yOptions type definitions
  • Modified buildHierarchicalTree to preserve structural file inputs that would normally be pruned
  • Implemented appendMissingFileInputNodes to synthesize file-input nodes when the AX tree omits them
  • Extended the snapshot unit tests (48 passing) to verify metadata propagation, retention, and synthetic injection
  • Added working eval and example showing observe() → unpack selector → page.locator().setInputFiles() pattern
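
As a rough illustration of the first bullet, the sketch below reads the `type` attribute from a CDP-style DOM node. It relies on the fact that CDP reports attributes as a flat `[name, value, name, value, ...]` array; everything else (the `CdpDomNode` subset, the lowercase normalization, and the `"text"` fallback) is an assumption and may differ from the real `extractInputType` in domTree.ts.

```typescript
// Subset of the CDP DOM.Node shape used here. CDP reports attributes as a
// flat [name1, value1, name2, value2, ...] array.
interface CdpDomNode {
  nodeName: string;
  attributes?: string[];
}

// Hypothetical sketch: returns the normalized input type, or undefined for
// nodes that are not <input> elements.
function extractInputType(node: CdpDomNode): string | undefined {
  if (node.nodeName.toLowerCase() !== "input") return undefined;
  const attrs = node.attributes ?? [];
  // Walk the flat array two entries at a time: name, then value.
  for (let i = 0; i + 1 < attrs.length; i += 2) {
    const name = (attrs[i] ?? "").toLowerCase();
    if (name === "type") {
      const value = (attrs[i + 1] ?? "").trim().toLowerCase();
      return value === "" ? "text" : value;
    }
  }
  // Assumption: mirror HTML's default, where a missing type means "text".
  return "text";
}
```

A map from backend node id to this value is then enough for later stages to ask whether a given node is a file input.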

Confidence Score: 5/5

  • Safe to merge with high confidence
  • Clean implementation with comprehensive test coverage (48 passing tests), proper type threading, defensive coding (optional chaining, fallbacks), and working eval/example demonstrating the fix
  • No files require special attention

Important Files Changed

| Filename | Overview |
| --- | --- |
| packages/core/lib/v3/types/private/snapshot.ts | Added inputTypeMap field to type definitions for tracking input element types |
| packages/core/lib/v3/understudy/a11y/snapshot/domTree.ts | Implemented extractInputType helper and threaded input-type metadata through DOM indexing pipeline |
| packages/core/lib/v3/understudy/a11y/snapshot/a11yTree.ts | Preserved file inputs during a11y pruning and synthesized missing file-input nodes when AX tree omits them |
| packages/core/lib/v3/understudy/a11y/snapshot/capture.ts | Propagated inputTypeMap through hybrid capture orchestration to a11y shaping layer |
| packages/core/examples/observe_file_input_upload.ts | Added example demonstrating observe → unpack selector → setInputFiles workflow for file uploads |
| packages/evals/tasks/observe_file_input_upload.ts | Added eval task validating file input observation and upload functionality |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[CDP DOM.getDocument] --> B[extractInputType helper]
    B --> C[buildSessionDomIndex]
    C --> D[inputTypeByBe Map]
    D --> E[collectPerFrameMaps]
    E --> F[inputTypeMap per frame]
    F --> G[a11yForFrame]
    G --> H{isFileInputNode check}
    H -->|Yes| I[Keep structural node]
    H -->|No| J[Normal pruning]
    I --> K[buildHierarchicalTree]
    J --> K
    K --> L{File input in tree?}
    L -->|No| M[appendMissingFileInputNodes]
    L -->|Yes| N[Use existing]
    M --> O[Synthetic file input node]
    N --> P[Final outline with file inputs]
    O --> P
    P --> Q[observe returns selector]
    Q --> R[page.locator xpath .setInputFiles]

Last reviewed commit: ac7939f
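
The pruning branch of the flowchart (the `isFileInputNode` check feeding `buildHierarchicalTree`) could look roughly like the sketch below. The `OutlineNode` shape, the set of roles treated as structural, and `pruneOutline` are all hypothetical and only illustrate the idea of exempting file inputs from structural pruning.

```typescript
// Hypothetical simplified outline node; the real a11y node type is richer.
interface OutlineNode {
  role: string;
  name?: string;
  backendNodeId?: number;
  children: OutlineNode[];
}

// Assumption: these roles count as purely structural wrappers.
const STRUCTURAL_ROLES = ["generic", "none"];

// True when the DOM-side metadata says this node is <input type="file">.
function isFileInputNode(node: OutlineNode, inputTypeMap: Map<number, string>): boolean {
  return node.backendNodeId !== undefined && inputTypeMap.get(node.backendNodeId) === "file";
}

// Drops unnamed structural wrappers and hoists their children, except when
// the wrapper is itself a file input, which is kept in place.
function pruneOutline(node: OutlineNode, inputTypeMap: Map<number, string>): OutlineNode[] {
  const children: OutlineNode[] = [];
  for (const child of node.children) {
    children.push(...pruneOutline(child, inputTypeMap));
  }
  const isStructural = STRUCTURAL_ROLES.indexOf(node.role) !== -1 && !node.name;
  if (isStructural && !isFileInputNode(node, inputTypeMap)) {
    return children;
  }
  return [{ ...node, children }];
}
```

Without the exemption, a file input that the accessibility tree reports as an unnamed generic node would be dropped like any other wrapper, leaving observe() nothing to point at.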

@greptile-apps greptile-apps bot left a comment

11 files reviewed, no comments
@cubic-dev-ai cubic-dev-ai bot left a comment

1 issue found across 11 files

Confidence score: 3/5

  • Runtime risk: packages/evals/tasks/observe_file_input_upload.ts uses a relative ESM import without the .js extension, which can fail under native ESM resolution and break this task at runtime.
  • Score reflects a concrete user-impacting runtime failure in an eval task, though the issue is localized and straightforward to fix.
  • Pay close attention to packages/evals/tasks/observe_file_input_upload.ts - missing .js extension on a relative ESM import.
Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/evals/tasks/observe_file_input_upload.ts">

<violation number="1" location="packages/evals/tasks/observe_file_input_upload.ts:4">
P2: Missing `.js` extension on relative ESM import. All other eval task files in this directory use `"../types/evals.js"`. Omitting the extension will fail at runtime under native ESM resolution.

(Based on your team's feedback about requiring explicit .js extensions on all relative ESM import specifiers.) [FEEDBACK_USED]</violation>
</file>
Architecture diagram
sequenceDiagram
    participant U as User Script
    participant SH as Stagehand (v3)
    participant CAP as Capture Orchestrator
    participant DOM as DOM Tree Service
    participant AXS as A11y Tree Service
    participant CDP as Browser (CDP)

    Note over U,CDP: Observe & File Upload Flow

    U->>SH: observe(instruction)
    SH->>CAP: tryScopedSnapshot()
    
    CAP->>DOM: domMapsForSession()
    DOM->>CDP: DOM.getDocument (depth: -1)
    CDP-->>DOM: Full DOM Tree
    DOM->>DOM: NEW: extractInputType()
    Note right of DOM: Captures 'type' attribute<br/>for <input> elements
    DOM-->>CAP: { tagNameMap, inputTypeMap, xpathMap }

    CAP->>AXS: a11yForFrame(domMaps)
    AXS->>CDP: Accessibility.getFullAXTree()
    CDP-->>AXS: AXNodes
    
    AXS->>AXS: buildHierarchicalTree()
    Note right of AXS: CHANGED: isFileInputNode() check<br/>prevents pruning file inputs
    
    AXS->>AXS: NEW: appendMissingFileInputNodes()
    Note right of AXS: Synthesizes AX nodes if DOM has<br/>file inputs missing from AX tree

    AXS-->>CAP: Simplified A11y Tree (with file inputs)
    CAP-->>SH: Snapshot Data
    SH-->>U: Array of Observations (selectors)

    Note over U,CDP: Interaction Phase

    U->>U: Unpack observed XPath
    U->>CDP: page.locator(xpath).setInputFiles(path)
    CDP-->>U: File uploaded successfully
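
The synthesis step in the diagram (adding nodes when the DOM has file inputs that the AX tree lacks) might be approximated as below. This is a sketch under stated assumptions: `AxLikeNode`, the placeholder role and name given to synthetic nodes, and the choice to append them under the root are illustrative, not taken from the PR's `appendMissingFileInputNodes`.

```typescript
// Hypothetical simplified AX node used only for this illustration.
interface AxLikeNode {
  role: string;
  name?: string;
  backendNodeId?: number;
  children: AxLikeNode[];
}

// Collects every backend node id already present in the tree.
function collectBackendIds(node: AxLikeNode, seen: Set<number> = new Set<number>()): Set<number> {
  if (node.backendNodeId !== undefined) seen.add(node.backendNodeId);
  node.children.forEach((child) => collectBackendIds(child, seen));
  return seen;
}

// For each DOM-side file input missing from the tree, append a synthetic node.
// Returns the original root untouched when nothing is missing.
function appendMissingFileInputNodes(root: AxLikeNode, inputTypeMap: Map<number, string>): AxLikeNode {
  const present = collectBackendIds(root);
  const synthetic: AxLikeNode[] = [];
  inputTypeMap.forEach((type, backendNodeId) => {
    if (type === "file" && !present.has(backendNodeId)) {
      // Placeholder role/name; the real synthetic node may be labeled differently.
      synthetic.push({ role: "button", name: "file input", backendNodeId, children: [] });
    }
  });
  return synthetic.length === 0 ? root : { ...root, children: [...root.children, ...synthetic] };
}
```

Keying the synthetic node on the backend node id is what lets a later stage map it back to an XPath for the same element.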

import { promises as fs } from "fs";
import path from "path";
import crypto from "crypto";
import { EvalFunction } from "../types/evals";
@cubic-dev-ai cubic-dev-ai bot Feb 20, 2026

P2: Missing .js extension on relative ESM import. All other eval task files in this directory use "../types/evals.js". Omitting the extension will fail at runtime under native ESM resolution.

(Based on your team's feedback about requiring explicit .js extensions on all relative ESM import specifiers.)
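
To make the rule in this comment concrete, here is a small lint-style sketch that flags relative import specifiers lacking an explicit runtime extension. `needsExplicitExtension` and the extension list are hypothetical and are not part of the repository or of any existing lint rule.

```typescript
// Assumption: these are the extensions accepted on relative specifiers.
const RUNTIME_EXTENSIONS = [".js", ".mjs", ".cjs", ".json"];

// Returns true for relative specifiers such as "../types/evals" that native
// ESM resolution would not complete with an extension on its own.
function needsExplicitExtension(specifier: string): boolean {
  const isRelative = specifier.startsWith("./") || specifier.startsWith("../");
  // Bare specifiers like "fs" or package names resolve by other rules.
  if (!isRelative) return false;
  return !RUNTIME_EXTENSIONS.some((ext) => specifier.endsWith(ext));
}
```

Applied to the import flagged here, `"../types/evals"` would be reported, while the form used by the other eval tasks, `"../types/evals.js"`, would pass.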


@shrey150 shrey150 marked this pull request as draft February 20, 2026 16:51