Skip to content

Fix/agent improvements 2#65

Merged
aidenybai merged 103 commits intomainfrom
fix/agent-improvements-2
Apr 1, 2026
Merged

Fix/agent improvements 2#65
aidenybai merged 103 commits intomainfrom
fix/agent-improvements-2

Conversation

@NisargIO
Copy link
Copy Markdown
Member

@NisargIO NisargIO commented Mar 31, 2026


Summary by cubic

Expands cross‑browser testing and hardens the agent/CLI: installs all Playwright engines in CI/init/Action, adds health checks and screenshot evidence, simplifies the TUI to keyboard‑only, and improves Action setup with optional Claude token automation and version checks. Drops the iOS simulator and the deprecated security audit tool.

  • New Features

    • Browser MCP: open accepts browser (chromium | webkit | firefox); CDP limited to Chromium; Playwright exec gains snapshotAfter and viewport‑aware ARIA snapshots; screenshots use CSS scaling; CI reporter prints screenshot paths.
    • CI/init/GitHub Action: install Chromium/WebKit/Firefox with streamed output and a 5‑minute timeout; Action installs deps and @anthropic-ai/claude-code, checks out full history (fetch-depth: 0), skips the dev server when EXPECT_BASE_URL is set; fixed 8‑step website test plan; Action helper can generate a Claude token and set ANTHROPIC_API_KEY via gh.
    • CLI: detect nearby projects and show them in the port picker; pass base URLs via --url/-u and feed dev‑server hints into execution; website adds /api/version, and the CLI pings VERSION_API_URL on startup.
    • Tools: network_requests flags failed requests, near‑duplicate requests within 500ms, and mixed content; console_logs summarizes error/warning counts.
  • Refactors

    • Removed iOS Simulator/Appium and the security_audit MCP tool.
    • Split browser runtime into overlay, performance, and rrweb; added tracing with named services in expect-cli and expect-mcp; MCP tools resolve in dev/test; CI stability: fail fast on auth/fatal adapter errors, emit run start immediately, treat any session update as activity, and print recent debug logs on failures.
    • TUI: removed mouse/click support in favor of keyboard‑only; verbose logging is reactive.
    • Skill install: download the expect skill tarball into .agents/skills/expect and symlink into selected agents.

Written for commit ef736a9. Summary will update on new commits.


Note

Medium Risk
Moderate risk because it changes installation/setup flows, CI workflow behavior, and CLI navigation/port selection logic, which can break onboarding or automated runs across environments.

Overview
Streamlines and hardens Expect setup and CI runs. CI workflows now install all Playwright browsers (Chromium/WebKit/Firefox), and the generated expect.yml workflow is updated to fetch full git history, support overriding EXPECT_BASE_URL via repo variables (skipping local dev-server startup), and always wait on the resolved base URL.

Refactors the CLI onboarding and TUI behavior. init installs Playwright browsers with a timeout, add-github-action can optionally generate a Claude token and set ANTHROPIC_API_KEY via the GitHub CLI, and skill installation is reworked to download the skill tarball into .agents/skills/expect and symlink it into selected agent directories.

Improves test targeting and port picking. The watch command can accept a base URL, the main menu can pass CLI-provided base URLs into testing, and the port picker now surfaces detected (not-running) projects and passes devServerHints into the testing screen; the interactive UI is simplified by removing mouse/clickable support in favor of keyboard-driven selection.

Written by Cursor Bugbot for commit ef736a9. This will update automatically on new commits. Configure here.

ben-million and others added 8 commits March 30, 2026 20:10
.agents/skills/expect/SKILL.md is the source of truth. The two copies
had drifted apart: packages/ had Session Replay + Telemetry docs that
.agents/ lacked, while .agents/ had been rewritten with accessibility
(WCAG), performance (Web Vitals/LoAF), --cookies, and run_in_background
that packages/ never received.

This commit:
- Adds the missing Session Replay and Telemetry sections to .agents/
- Overwrites packages/ from .agents/ (preserving its extra frontmatter)
- Adds scripts/sync-skill.sh to copy .agents/ → packages/ or --check
  that they haven't drifted
- Wires the --check into CI so future divergence fails the build

Closes #63
The .agents/skills/ directory contains domain expertise (accessibility
rules, SEO checklists, design guidelines) that was only used for
developing expect itself. This wires them into the test agent so it
applies those criteria when evaluating UIs during browser tests.

Implementation:
- scripts/generate-evaluation-skills.sh reads fixing-accessibility,
  fixing-seo, and web-design-guidelines skills and emits a TypeScript
  module with their content inlined as string constants
- packages/shared/src/evaluation-skills.generated.ts is the output,
  committed so the CLI works without running the script first
- buildExecutionSystemPrompt() appends the skill content to the system
  prompt as additional evaluation criteria
- prebuild hook runs the codegen so the generated file stays current
- 3 new tests verify the skills appear in the system prompt
- Added snapshotAfter parameter to the Playwright execution context, allowing automatic ARIA snapshot capture after code execution.
- Updated input schema and documentation to reflect the new feature.
- Enhanced error handling to return structured results, including success status and error messages.
- Added tests to verify functionality of snapshotAfter, ensuring fresh snapshots are returned alongside execution results.
@vercel
Copy link
Copy Markdown
Contributor

vercel bot commented Mar 31, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
expect Ready Ready Preview, Comment Apr 1, 2026 11:35pm

@pkg-pr-new
Copy link
Copy Markdown

pkg-pr-new bot commented Mar 31, 2026

Open in StackBlitz

npm i https://pkg.pr.new/expect-cli@65

commit: 1cb67e8

Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

16 issues found across 110 files

Note: This PR contains a large number of files. cubic only reviews up to 75 files per PR, so some files may not have been reviewed.

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/expect-skill/fixing-animation/SKILL.md">

<violation number="1" location="packages/expect-skill/fixing-animation/SKILL.md:45">
P2: Avoid documenting `will-change` as a permanent style; this conflicts with your own guidance and can cause unnecessary layer promotion. Scope it to active animation only.</violation>
</file>

<file name="packages/expect-skill/fixing-animation/rules/gpu-performance.md">

<violation number="1" location="packages/expect-skill/fixing-animation/rules/gpu-performance.md:78">
P2: The example applies `will-change: transform` permanently, which contradicts the file’s own guidance to scope it only to active animations.</violation>
</file>

<file name="apps/cli/src/commands/add-skill.ts">

<violation number="1" location="apps/cli/src/commands/add-skill.ts:63">
P2: Add an empty-agent guard for non-interactive runs so `-y` does not proceed as if installation were valid when no supported agents are detected.</violation>

<violation number="2" location="apps/cli/src/commands/add-skill.ts:101">
P2: Do not mark installation as successful when no selected agent symlink was created.</violation>
</file>

<file name="packages/expect-skill/fixing-animation/rules/consistency.md">

<violation number="1" location="packages/expect-skill/fixing-animation/rules/consistency.md:81">
P2: `duration-fast` is defined with conflicting values (100ms vs 125ms) in the same rule, which can cause inconsistent implementation across components.</violation>
</file>

<file name="packages/expect-skill/react-best-practices/SKILL.md">

<violation number="1" location="packages/expect-skill/react-best-practices/SKILL.md:12">
P3: The summary count is inconsistent with the rules listed below (8 categories/58 rules vs a 9th category with an additional rule). Update the summary text to avoid contradictory guidance.</violation>
</file>

<file name="packages/expect-skill/react-best-practices/rules/bundle-conditional.md">

<violation number="1" location="packages/expect-skill/react-best-practices/rules/bundle-conditional.md:22">
P2: The example calls `setEnabled(false)` but no `setEnabled` is defined, so the snippet does not compile as written.</violation>
</file>

<file name="apps/cli/src/commands/skill-content.ts">

<violation number="1" location="apps/cli/src/commands/skill-content.ts:38">
P2: The skill content references `--base-url`, but this CLI flag is not supported. Users following this instruction will hit an unknown option error; keep the guidance to `EXPECT_BASE_URL` only.</violation>
</file>

<file name="packages/expect-skill/react-best-practices/rules/rendering-animate-svg-wrapper.md">

<violation number="1" location="packages/expect-skill/react-best-practices/rules/rendering-animate-svg-wrapper.md:38">
P3: The guidance line mixes CSS properties/functions/transition terminology, which makes the rule technically inaccurate and potentially confusing.</violation>
</file>

<file name="packages/expect-skill/fixing-animation/rules/accessibility.md">

<violation number="1" location="packages/expect-skill/fixing-animation/rules/accessibility.md:96">
P2: The example uses `motion.div` without importing `motion`, so the snippet is not runnable as written.</violation>
</file>

<file name="packages/expect-skill/react-best-practices/rules/js-set-map-lookups.md">

<violation number="1" location="packages/expect-skill/react-best-practices/rules/js-set-map-lookups.md:15">
P3: The example snippets are syntactically invalid because they use a bare `...` inside array literals.</violation>
</file>

<file name="packages/expect-skill/react-best-practices/rules/async-parallel.md">

<violation number="1" location="packages/expect-skill/react-best-practices/rules/async-parallel.md:20">
P2: The phrase "1 round trip" is inaccurate for `Promise.all`; this is still multiple requests executed concurrently, not a single round trip.</violation>
</file>

<file name="packages/expect-skill/react-best-practices/rules/js-cache-storage.md">

<violation number="1" location="packages/expect-skill/react-best-practices/rules/js-cache-storage.md:48">
P2: Cookie parsing splits on every `=`, which can truncate valid cookie values that contain `=`.</violation>

<violation number="2" location="packages/expect-skill/react-best-practices/rules/js-cache-storage.md:65">
P2: The invalidation example clears only `storageCache` and leaves `cookieCache` stale, despite mentioning external cookie changes.</violation>
</file>

<file name="packages/expect-skill/react-best-practices/rules/js-cache-function-results.md">

<violation number="1" location="packages/expect-skill/react-best-practices/rules/js-cache-function-results.md:33">
P2: The recommended module-level `Map` cache is unbounded and never invalidated, which can cause unbounded memory growth.</violation>
</file>

<file name="packages/expect-skill/react-best-practices/rules/client-swr-dedup.md">

<violation number="1" location="packages/expect-skill/react-best-practices/rules/client-swr-dedup.md:48">
P2: Use a default import for `useSWRMutation`; the current named import is incorrect for `swr/mutation`.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment thread packages/expect-skill/fixing-animation/SKILL.md
Comment thread packages/expect-skill/fixing-animation/rules/gpu-performance.md
Comment thread apps/cli/src/commands/add-skill.ts Outdated
Comment thread apps/cli/src/commands/add-skill.ts
Comment thread packages/expect-skill/fixing-animation/rules/consistency.md
Comment thread packages/expect-skill/react-best-practices/rules/client-swr-dedup.md Outdated
Comment thread packages/expect-skill/react-best-practices/SKILL.md Outdated
Comment thread packages/expect-skill/react-best-practices/rules/rendering-animate-svg-wrapper.md Outdated
Comment thread packages/expect-skill/react-best-practices/rules/js-set-map-lookups.md Outdated
- Enhanced the CI reporter to include screenshot paths in the artifacts output.
- Updated the artifact extraction logic to capture screenshot paths from execution events.
- Modified the run-test utility to pass screenshot paths to the CI reporter.
- Updated the MCP session to save and return screenshot paths as part of the close result.
- Adjusted the server logic to handle and display screenshot information in the results.
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 31, 2026

Test Results

❌ Website Test: failed

10 passed, 1 failed out of 12 steps — 535s

Step Status Duration
Homepage loads with hero section and install commands ✅ passed 26s
View demo navigates to /replay?demo=true and loads player ✅ passed 60s
Replay controls — play/pause, speed selector, step list ✅ passed 181s
Copy button — install command copied to clipboard ✅ passed 37s
Theme toggle — dark mode switch and back to light ✅ passed 35s
Footer links — GitHub and X have correct URLs and target="_blank" ✅ passed 11s
Legal pages — /terms, /privacy, /security each load with text content ✅ passed 31s
Mobile viewport — 375x812 renders without horizontal scrollbar, key content visible ✅ passed 61s
Accessibility audit (WCAG 2.1 AA) ⬜ not-run
Performance metrics ✅ passed 14s
Project healthcheck — pnpm check ❌ failed 23s
Cross-browser (WebKit/Safari) — primary flow recheck ✅ passed 50s

Session Recording

https://github.com/millionco/expect/releases/download/ci-pr-65/e3a44d78-742d-40b3-a444-a7bb5bee5902.webm


Workflow run #118 | 📎 Download all recordings

Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 issues found across 6 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/browser/src/mcp/mcp-session.ts">

<violation number="1" location="packages/browser/src/mcp/mcp-session.ts:148">
P2: Screenshot paths accumulate across session lifetimes, so `close()` may return stale paths from earlier runs.</violation>

<violation number="2" location="packages/browser/src/mcp/mcp-session.ts:168">
P2: `saveScreenshot` records a screenshot path even when writing the file fails, which can return non-existent screenshot artifacts.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment thread packages/browser/src/mcp/mcp-session.ts Outdated
Comment thread packages/browser/src/mcp/mcp-session.ts Outdated
- Revised the description to clarify usage for fixing accessibility violations and compliance with WCAG 2.1 AA.
- Enhanced rules section with detailed accessibility requirements for interactive elements, keyboard access, focus management, and semantics.
- Added examples for common accessibility issues and their corrections.
- Expanded the guidelines for forms, announcements, contrast, and content handling to improve overall clarity and usability.
Comment thread packages/expect-skill/SKILL.md
Comment thread packages/browser/src/mcp/mcp-session.ts Outdated
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 issues found across 7 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/expect-skill/fixing-accessibility/SKILL.md">

<violation number="1" location="packages/expect-skill/fixing-accessibility/SKILL.md:27">
P2: This line turns 44x44 touch targets into a strict WCAG 2.1 AA requirement, which conflicts with the repo’s own guidance and can produce false-positive accessibility findings. Reword it as a recommendation instead of a mandatory rule.</violation>
</file>

<file name="packages/expect-skill/web-design-guidelines/SKILL.md">

<violation number="1" location="packages/expect-skill/web-design-guidelines/SKILL.md:20">
P2: This rule is too broad: native interactive elements should not require manual keyboard handlers. Limit this requirement to custom non-native interactive elements.</violation>

<violation number="2" location="packages/expect-skill/web-design-guidelines/SKILL.md:62">
P3: The quote examples are incorrect: the line says to use curly quotes but only shows straight quotes, making the guidance ambiguous.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment thread packages/expect-skill/fixing-accessibility/SKILL.md Outdated
Comment thread packages/expect-skill/web-design-guidelines/SKILL.md Outdated
### Typography

- `…` not `...`
- Curly quotes `"` `"` not straight `"`
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai bot Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3: The quote examples are incorrect: the line says to use curly quotes but only shows straight quotes, making the guidance ambiguous.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At packages/expect-skill/web-design-guidelines/SKILL.md, line 62:

<comment>The quote examples are incorrect: the line says to use curly quotes but only shows straight quotes, making the guidance ambiguous.</comment>

<file context>
@@ -9,31 +9,173 @@ metadata:
+### Typography
+
+- `…` not `...`
+- Curly quotes `"` `"` not straight `"`
+- Non-breaking spaces: `10&nbsp;MB`, `⌘&nbsp;K`, brand names
+- Loading states end with `…`: `"Loading…"`, `"Saving…"`
</file context>
Suggested change
- Curly quotes `"` `"` not straight `"`
- Curly quotes `` `` not straight `"`
Fix with Cubic

- performance_metrics: add page health section with DOM node count,
  viewport overflow detection, broken images, images without
  dimensions, and duplicate ID detection
- console_logs: add error/warning count summary to response
- network_requests: detect failed requests (4xx/5xx), duplicate
  requests within 500ms window, and mixed content (HTTP on HTTPS)
Comment thread .github/workflows/code-quality.yml Outdated
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 5 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/browser/src/mcp/server.ts">

<violation number="1" location="packages/browser/src/mcp/server.ts:368">
P2: Mixed-content detection uses any historical HTTPS document in the filtered entries, which can produce false positives across multi-navigation sessions.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment thread packages/browser/src/mcp/server.ts
Comment thread .agents/skills/expect/SKILL.md Outdated
Comment thread packages/browser/src/mcp/mcp-session.ts Outdated
Comment thread apps/cli/src/commands/skill-content.ts Outdated
Comment thread apps/cli/src/commands/add-skill.ts
Comment thread packages/browser/src/mcp/mcp-session.ts Outdated
Comment thread apps/cli/src/commands/add-skill.ts
Comment on lines +368 to +375
const isHttps = entries.some(
(entry) => entry.resourceType === "document" && entry.url.startsWith("https://"),
);
const mixedContent = isHttps
? entries.filter(
(entry) => entry.resourceType !== "document" && entry.url.startsWith("http://"),
)
: [];
Copy link
Copy Markdown
Contributor

@vercel vercel bot Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mixed-content detection incorrectly flags HTTP resources as mixed content when ANY historical document was HTTPS, causing false positives in multi-navigation sessions

Fix on Vercel

- Added @expect/shared as a workspace dependency in pnpm-lock.yaml and package.json.
- Updated MCP runtime to include Tracing layer for improved observability.
- Enhanced various MCP server methods with tracing spans for better performance monitoring.
Added logic to the ACP client to monitor stderr for fatal errors reported by adapters, ensuring immediate failure of active session queues upon detection. This prevents consumers from hanging indefinitely and improves overall error handling in the system.
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 issues found across 17 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="apps/cli/src/commands/init-utils.ts">

<violation number="1" location="apps/cli/src/commands/init-utils.ts:119">
P2: `generateClaudeToken` now buffers all stdout and prints it only after process completion, which can hide interactive prompts and stall user-driven token setup.</violation>
</file>

<file name="packages/shared/src/observability/Tracing.ts">

<violation number="1" location="packages/shared/src/observability/Tracing.ts:12">
P1: Do not hard-code the Axiom API token in source. Read it from configuration/environment so credentials aren’t committed to the repo.</violation>
</file>

<file name="packages/agent/src/acp-client.ts">

<violation number="1" location="packages/agent/src/acp-client.ts:573">
P1: `Stream.take(1)` stops draining adapter stderr after the first fatal match, which can block the child process if stderr continues to be written.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

const dataset = "expect-cli";
const token = "xaat-a6ce2fdb-d378-444e-9d72-bb458867187a";
const AXIOM_DATASET = "expect-cli";
const AXIOM_TOKEN = "xaat-a6ce2fdb-d378-444e-9d72-bb458867187a";
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai bot Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: Do not hard-code the Axiom API token in source. Read it from configuration/environment so credentials aren’t committed to the repo.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At packages/shared/src/observability/Tracing.ts, line 12:

<comment>Do not hard-code the Axiom API token in source. Read it from configuration/environment so credentials aren’t committed to the repo.</comment>

<file context>
@@ -1,25 +1,22 @@
-    const dataset = "expect-cli";
-    const token = "xaat-a6ce2fdb-d378-444e-9d72-bb458867187a";
+const AXIOM_DATASET = "expect-cli";
+const AXIOM_TOKEN = "xaat-a6ce2fdb-d378-444e-9d72-bb458867187a";
 
-    return Otlp.layerJson({
</file context>
Suggested change
const AXIOM_TOKEN = "xaat-a6ce2fdb-d378-444e-9d72-bb458867187a";
const AXIOM_TOKEN = process.env.AXIOM_TOKEN ?? "";
Fix with Cubic

Stream.tap((line) => Effect.logDebug("ACP adapter stderr", { line })),
Stream.map(getAdapterSessionError),
Stream.filter((error): error is SessionQueueError => error !== undefined),
Stream.take(1),
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai bot Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: Stream.take(1) stops draining adapter stderr after the first fatal match, which can block the child process if stderr continues to be written.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At packages/agent/src/acp-client.ts, line 573:

<comment>`Stream.take(1)` stops draining adapter stderr after the first fatal match, which can block the child process if stderr continues to be written.</comment>

<file context>
@@ -560,18 +559,20 @@ export class AcpClient extends ServiceMap.Service<AcpClient>()("@expect/AcpClien
-      Stream.filterEffect(() =>
-        Ref.get(adapterSessionErrorRef).pipe(Effect.map((existing) => !existing)),
-      ),
+      Stream.take(1),
       Stream.tap((adapterSessionError) =>
         Effect.gen(function* () {
</file context>
Fix with Cubic

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you buggin

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Can you share what’s incorrect or the intended behavior here so I can refine future reviews?

Comment on lines +119 to +124
const [stdoutText, exitCode] = yield* Effect.all(
[Stream.mkString(Stream.decodeText(handle.stdout)), handle.exitCode],
{ concurrency: 2 },
);

process.stdout.write(stdoutText);
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai bot Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: generateClaudeToken now buffers all stdout and prints it only after process completion, which can hide interactive prompts and stall user-driven token setup.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At apps/cli/src/commands/init-utils.ts, line 119:

<comment>`generateClaudeToken` now buffers all stdout and prints it only after process completion, which can hide interactive prompts and stall user-driven token setup.</comment>

<file context>
@@ -115,33 +116,36 @@ export const generateClaudeToken = Effect.gen(function* () {
-        process.stdout.write(text);
-      }),
-    ),
+  const [stdoutText, exitCode] = yield* Effect.all(
+    [Stream.mkString(Stream.decodeText(handle.stdout)), handle.exitCode],
+    { concurrency: 2 },
</file context>
Suggested change
const [stdoutText, exitCode] = yield* Effect.all(
[Stream.mkString(Stream.decodeText(handle.stdout)), handle.exitCode],
{ concurrency: 2 },
);
process.stdout.write(stdoutText);
let stdoutText = "";
yield* handle.stdout.pipe(
Stream.decodeText(),
Stream.runForEach((text) =>
Effect.sync(() => {
stdoutText += text;
process.stdout.write(text);
}),
),
);
const exitCode = yield* handle.exitCode;
Fix with Cubic

…sion errors

Refactored the error handling logic in the ACP client to return an Option type for session errors, enhancing type safety and clarity. This change allows for better handling of potential errors reported by adapters, ensuring that only valid errors are processed and logged.
NisargIO added 2 commits April 1, 2026 02:11
Reverted all test body changes back to main. Only changed TEST_LAYERS
to add availability detection via isCommandAvailable so tests skip
when the agent binary isn't installed. Un-skipped session resume test
per review feedback.

Made-with: Cursor
Cleaned up the init-utils.ts and init-utils.test.ts files by removing trailing whitespace, improving code readability without altering functionality.
Copy link
Copy Markdown
Contributor

@skoshx skoshx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

round2

Comment thread .gitignore Outdated
.testie-agent-traces/
packages/video/build
.cursor/plans
.repos No newline at end of file
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this shouldn't be gitignored because then the agent cant get results from effect so it doesn't improve perf and become Effect god

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed!

}),
Effect.forEach(
Array.from(sessionUpdatesMap.values()),
(updatesQueue) => Queue.fail(updatesQueue, adapterSessionError),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a Stream is only failed once, so this sessionUpdatesMap isn't needed..

Just listen to stderr, parse the error, then Stream.tap(() => Queue.fail(updatesQueue, error),

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are adding multiple sessions per test to improve testing perf in the future

Stream.tap((line) => Effect.logDebug("ACP adapter stderr", { line })),
Stream.map(getAdapterSessionError),
Stream.filter((error): error is SessionQueueError => error !== undefined),
Stream.take(1),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you buggin

});
}),
).pipe(Layer.provide(NodeHttpClient.layerUndici), Layer.orDie);
export const layerAxiom = (serviceName = AXIOM_DATASET) =>
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks sus?? serviceName isn't the same as dataset name AFAIK ? the dataset should always be "expect-cli", but service name could be either "expect-mcp" or like "expect-clI"

Comment thread packages/shared/src/analytics/analytics.ts
}).pipe(Effect.orDie);
}).pipe(
Effect.catchTag("UnknownError", (cause) =>
Effect.logWarning("Failed to get machine ID, using fallback", { cause }).pipe(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is bad because if machineId fails, now well have noisy ahh analytics because same user will get a different UUID for each run

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

machineId is unlikely to fail only in extreme edge cases

Effect.logWarning("Git failed to get diff preview", {
error: error.message,
scope: "WorkingTree",
}).pipe(Effect.map(() => "")),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't swallow errors just let them bubble up.

this Git error should not happen, so swallowing it just means masking a potential bug

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These can throw an error i.e for example if the user doesn't add fetch-depth: 0 to GHA. I think it's fine instead of bubbling up. We don't want this to crash the session

Comment thread packages/supervisor/src/git/git.ts
Comment thread packages/supervisor/src/git/git.ts
Comment thread apps/cli/src/components/screens/main-menu-screen.tsx Outdated
aidenybai and others added 9 commits April 1, 2026 14:36
stepStatuses was creating Map entries for step events whose stepId
didn't match any step in the steps array, causing status to return
"failed" even when 0 visible steps failed.
The AI agent was inventing inconsistent test plans each run, leading to
flaky steps like cross-browser WebKit, accessibility audits, and font
loading checks. A fixed checklist gives deterministic step coverage.
Restored the .cursor/plans entry in the .gitignore file to ensure that the plans directory is ignored during version control, maintaining a clean repository.
Modified the layerAxiom function to use a default service name constant, AXIOM_DEFAULT_SERVICE_NAME, improving code readability and ensuring consistency in service naming.
Eliminated unnecessary imports from Tracing.ts, including NodeSocket and DevTools, to streamline the code and enhance readability. This change focuses on maintaining a cleaner codebase without altering existing functionality.
…ll container support

Updated the snapshot process to include viewport-aware options, allowing for more accurate representation of elements within scroll containers. Introduced new parameters in SnapshotOptions and adjusted the computeSnapshotStats function to account for hidden nodes. Added tests to validate the new behavior and ensure correct statistics are computed when scroll containers are present.
Refactored the scroll detection logic to improve readability and maintainability by introducing helper functions for hiding child elements and inserting markers. Updated the snapshot process to utilize a constant for no scroll containers, enhancing the handling of viewport-aware options. This change simplifies the code while preserving existing functionality.
…oration

Improved error handling in the viewport snapshot process by adding debug logging for failures during both preparation and restoration. This change provides better insights into issues that may arise, ensuring more robust handling of scroll container scenarios.
…etching

Introduced a new VERSION_API_URL constant to facilitate fetching the current version from the API. Updated the CLI entry point to make a request to this API, enhancing the update check functionality. Refactored the version fetching logic to utilize the new API endpoint, improving the overall update mechanism.
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 5 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="apps/cli/src/index.tsx">

<violation number="1" location="apps/cli/src/index.tsx:25">
P2: Guard this startup fetch behind the telemetry opt-out flags; currently it still sends a network request even when `NO_TELEMETRY=1`.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment thread apps/cli/src/index.tsx
…CP tool

The agent could only launch Chromium — cross-browser testing in the
ui_quality_rules always failed. Now the `open` tool accepts a `browser`
parameter (chromium | webkit | firefox), threading through McpSession
and Browser.createPage to Playwright's engine-specific launchers.

- Install all three Playwright browsers in CI, init, and generated GH Action
- CDP connections correctly restricted to chromium only
- Chromium-specific launch args skipped for non-chromium engines
- Agent prompt updated to document browser switching workflow
…l in init

The previous approach used tryRun (60s timeout, stdio ignored) which
would silently hang on slow connections downloading ~170MB of browsers.
Now streams download progress to the terminal and allows 5 minutes.
- Move PLAYWRIGHT_INSTALL_TIMEOUT_MS to constants.ts per magic number rule
- Use BrowserEngine type in McpSession OpenOptions instead of inline union
- Replace switch with record lookup in resolveBrowserType
Comment thread apps/cli/src/constants.ts Outdated
export const UPDATE_CHECK_TIMEOUT_MS = 5_000;
export const PLAYWRIGHT_INSTALL_TIMEOUT_MS = 300_000;

export type PackageManager = "npm" | "pnpm" | "yarn" | "bun" | "deno";
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate PackageManager types cause silent type divergence

Medium Severity

Two different PackageManager types with the same name are now exported from different modules. The one in constants.ts is "npm" | "pnpm" | "yarn" | "bun" | "deno", while the one in init-utils.ts is "npm" | "pnpm" | "yarn" | "bun" | "vp". The detect-projects.ts utility re-exports the constants version and uses it in DetectedProject, which can return "deno". Any consumer expecting the init-utils variant won't handle "deno", and vice versa for "vp". This creates a subtle type-safety gap that TypeScript won't flag when both types are imported under the same name.

Additional Locations (1)
Fix in Cursor Fix in Web

Copy link
Copy Markdown

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 12 total unresolved issues (including 11 from previous reviews).

Fix All in Cursor

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

pnpm: "pnpm dlx",
yarn: "npx",
bun: "bunx",
deno: "deno run -A npm:",
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deno DLX command produces invalid package specifier

Low Severity

The DLX_COMMANDS entry for deno is "deno run -A npm:". When interpolated into the workflow template as ${dlx} expect-cli@latest --ci, this produces deno run -A npm: expect-cli@latest --ci with a space between npm: and the package name. Deno requires npm: to be directly attached to the package name (e.g., npm:expect-cli@latest). This would produce an invalid command if deno is ever detected as the package manager.

Fix in Cursor Fix in Web

Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

4 issues found across 21 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/browser/src/mcp/server.ts">

<violation number="1" location="packages/browser/src/mcp/server.ts:102">
P2: Validate or normalize `cdp` when a non-Chromium browser is requested; current behavior can falsely report CDP connection even though CDP is ignored.</violation>
</file>

<file name="packages/browser/tests/browser-e2e.test.ts">

<violation number="1" location="packages/browser/tests/browser-e2e.test.ts:378">
P2: Close `chromiumSession` in a `finally` block so failures in snapshot/assertion do not leak a browser instance.</violation>
</file>

<file name="apps/cli/src/commands/add-github-action.ts">

<violation number="1" location="apps/cli/src/commands/add-github-action.ts:39">
P3: The Deno dlx command prefix leaves a space between `npm:` and the package name when interpolated, which makes the generated `deno run` command invalid. Generate the command as `npm:expect-cli@latest` without a separating space.</violation>

<violation number="2" location="apps/cli/src/commands/add-github-action.ts:48">
P3: The new Deno workflow command mapping is dead code because package-manager detection never returns `deno` here.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment on lines +102 to 110
const result = yield* session.open(url, {
headed,
cookies,
waitUntil,
cdpUrl,
browserType,
});
const engineSuffix = browserType && browserType !== "chromium" ? ` [${browserType}]` : "";
const cdpSuffix = cdpUrl ? ` (connected via CDP: ${cdpUrl})` : "";
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai bot Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Validate or normalize cdp when a non-Chromium browser is requested; current behavior can falsely report CDP connection even though CDP is ignored.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At packages/browser/src/mcp/server.ts, line 102:

<comment>Validate or normalize `cdp` when a non-Chromium browser is requested; current behavior can falsely report CDP connection even though CDP is ignored.</comment>

<file context>
@@ -93,10 +99,17 @@ export const createBrowserMcpServer = <E>(
           }
 
-          const result = yield* session.open(url, { headed, cookies, waitUntil, cdpUrl });
+          const result = yield* session.open(url, {
+            headed,
+            cookies,
</file context>
Suggested change
const result = yield* session.open(url, {
headed,
cookies,
waitUntil,
cdpUrl,
browserType,
});
const engineSuffix = browserType && browserType !== "chromium" ? ` [${browserType}]` : "";
const cdpSuffix = cdpUrl ? ` (connected via CDP: ${cdpUrl})` : "";
const effectiveCdpUrl =
!browserType || browserType === "chromium" ? cdpUrl : undefined;
const result = yield* session.open(url, {
headed,
cookies,
waitUntil,
cdpUrl: effectiveCdpUrl,
browserType,
});
const engineSuffix = browserType && browserType !== "chromium" ? ` [${browserType}]` : "";
const cdpSuffix = effectiveCdpUrl ? ` (connected via CDP: ${effectiveCdpUrl})` : "";
Fix with Cubic

});

it("can switch between chromium and webkit sessions", async () => {
const chromiumSession = await runBrowser((browser) =>
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai bot Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Close chromiumSession in a finally block so failures in snapshot/assertion do not leak a browser instance.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At packages/browser/tests/browser-e2e.test.ts, line 378:

<comment>Close `chromiumSession` in a `finally` block so failures in snapshot/assertion do not leak a browser instance.</comment>

<file context>
@@ -288,3 +289,107 @@ describe("browser e2e", () => {
+  });
+
+  it("can switch between chromium and webkit sessions", async () => {
+    const chromiumSession = await runBrowser((browser) =>
+      browser.createPage(origin, { waitUntil: "domcontentloaded" }),
+    );
</file context>
Fix with Cubic

pnpm: "pnpm install",
yarn: "yarn install --frozen-lockfile",
bun: "bun install",
deno: "deno install",
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai bot Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3: The new Deno workflow command mapping is dead code because package-manager detection never returns deno here.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At apps/cli/src/commands/add-github-action.ts, line 48:

<comment>The new Deno workflow command mapping is dead code because package-manager detection never returns `deno` here.</comment>

<file context>
@@ -43,6 +45,7 @@ const INSTALL_COMMANDS: Record<PackageManager, string> = {
   pnpm: "pnpm install",
   yarn: "yarn install --frozen-lockfile",
   bun: "bun install",
+  deno: "deno install",
   vp: "npm ci",
 };
</file context>
Fix with Cubic

pnpm: "pnpm dlx",
yarn: "npx",
bun: "bunx",
deno: "deno run -A npm:",
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai bot Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P3: The Deno dlx command prefix leaves a space between npm: and the package name when interpolated, which makes the generated deno run command invalid. Generate the command as npm:expect-cli@latest without a separating space.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At apps/cli/src/commands/add-github-action.ts, line 39:

<comment>The Deno dlx command prefix leaves a space between `npm:` and the package name when interpolated, which makes the generated `deno run` command invalid. Generate the command as `npm:expect-cli@latest` without a separating space.</comment>

<file context>
@@ -35,6 +36,7 @@ const DLX_COMMANDS: Record<PackageManager, string> = {
   pnpm: "pnpm dlx",
   yarn: "npx",
   bun: "bunx",
+  deno: "deno run -A npm:",
   vp: "npx",
 };
</file context>
Fix with Cubic

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants