Research findings for the `agent` binary (Cursor CLI):
- AGENT.md one-pager with hook payloads, transcript format, CLI flags
- Probe script for capturing hook payloads from the CLI
- Key finding: beforeSubmitPrompt and stop hooks don't fire in -p mode

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 44abbd4f67c4

Registers the `agent` binary (Cursor CLI) with the E2E framework as "cursor-cli". Maps to Entire agent "cursor" so existing hooks/strategy code is reused.
- RunPrompt: `agent -p <prompt> --force --trust --workspace <dir>`
- StartSession: tmux-based interactive mode with `agent --force`
- Bootstrap: validates CURSOR_API_KEY on CI
- TimeoutMultiplier: 1.5x (slightly conservative for API variability)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 44f3de857440

Handle workspace trust dialog (--trust only works in headless mode) and update prompt pattern to match Cursor's actual TUI input indicator.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 9093105ebe04

RunPrompt now uses tmux interactive sessions instead of headless -p mode, which ensures beforeSubmitPrompt and stop hooks fire correctly. Extracts shared helpers for session startup and workspace trust dialog handling.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 9a7ed0c8e36e
PR Summary
Medium Risk
Overview: Also adds a Cursor CLI integration one-pager (
Written by Cursor Bugbot for commit cadb34a.
Pull request overview
Adds initial end-to-end (E2E) support and exploratory tooling for running Entire against the Cursor CLI (agent) and documenting observed hook/transcript behavior.
Changes:
- Introduces an E2E agent implementation for Cursor CLI using tmux-based interactive mode.
- Adds a probe script to install Cursor hooks and capture hook payloads to disk for manual/automated investigation.
- Adds a Cursor CLI integration one-pager documenting hook semantics, transcript location/layout, and limitations.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| scripts/test-cursor-cli-agent-integration.sh | New probe script that wires .cursor/hooks.json, captures stdin payloads per hook event, and prints collected JSON. |
| e2e/agents/cursor_cli.go | New E2E agent adapter that runs Cursor CLI interactively via tmux and drives prompts based on a TUI prompt regex. |
| cmd/entire/cli/agent/cursor/AGENT.md | New documentation summarizing Cursor CLI hook behavior and transcript layout expectations. |
    timeout := 90 * time.Second
    if cfg.PromptTimeout > 0 {
    	timeout = cfg.PromptTimeout
    }

    displayCmd := a.Binary() + " --force --workspace " + dir + " (interactive prompt: " + prompt + ")"

    // Start an interactive tmux session so all hooks fire
    // (beforeSubmitPrompt and stop don't fire in headless -p mode).
    s, err := a.startInteractiveSession(dir)
    if err != nil {
    	return Output{Command: displayCmd, ExitCode: -1},
    		fmt.Errorf("start interactive session: %w", err)
    }
    defer s.Close()

    // Wait for trust dialog and accept it.
    if err := a.acceptTrustDialogIfNeeded(s); err != nil {
    	return Output{Command: displayCmd, Stdout: s.Capture(), ExitCode: -1}, err
    }

    // Wait for the TUI to be ready.
    if _, err := s.WaitFor(a.PromptPattern(), 30*time.Second); err != nil {
    	return Output{Command: displayCmd, Stdout: s.Capture(), ExitCode: -1},
    		fmt.Errorf("waiting for startup prompt: %w", err)
    }

    // Send the prompt.
    if err := s.Send(prompt); err != nil {
    	return Output{Command: displayCmd, Stdout: s.Capture(), ExitCode: -1},
    		fmt.Errorf("sending prompt: %w", err)
    }

    // Wait for the prompt pattern to reappear (agent finished processing).
    content, waitErr := s.WaitFor(a.PromptPattern(), timeout)
    if waitErr != nil {
    	// Check for deadline exceeded to allow transient error detection.
    	if ctx.Err() == context.DeadlineExceeded {
    		waitErr = fmt.Errorf("%w: %w", waitErr, context.DeadlineExceeded)
    	}
    	return Output{Command: displayCmd, Stdout: content, ExitCode: -1}, waitErr
    }
RunPrompt doesn’t respect ctx cancellation/deadline: the tmux WaitFor calls only use local timeouts, so a test context expiring won’t stop the wait and can run past the scenario budget. Bound the per-prompt timeout by ctx’s deadline (when set) and/or actively abort/close the session when ctx.Done() is triggered.
    // in interactive mode (no -p flag) so all hooks fire.
    func (a *CursorCLI) startInteractiveSession(dir string) (*TmuxSession, error) {
    	name := fmt.Sprintf("cursor-cli-test-%d", time.Now().UnixNano())
    	return NewTmuxSession(name, dir, nil, a.Binary(), "--force", "--workspace", dir)
startInteractiveSession launches the agent without the E2E env vars used elsewhere (notably ENTIRE_TEST_TTY=0, and typically ACCESSIBLE=1). Since Cursor runs hook commands as child processes, missing these env vars can make hook/test behavior diverge from the rest of the E2E suite (see e.g. e2e/agents/gemini.go:128, e2e/agents/claude.go:178-193). Consider wrapping the command with env ... like the other tmux-based agents.
    - return NewTmuxSession(name, dir, nil, a.Binary(), "--force", "--workspace", dir)
    + env := map[string]string{
    + 	"ENTIRE_TEST_TTY": "0",
    + }
    + if accessible := os.Getenv("ACCESSIBLE"); accessible != "" {
    + 	env["ACCESSIBLE"] = accessible
    + }
    + return NewTmuxSession(name, dir, env, a.Binary(), "--force", "--workspace", dir)
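The comment also mentions wrapping the command with `env ...`. How the other adapters do this is not shown in this thread, so the following is only a sketch of the general idea: prefix the argument list with `env KEY=VALUE` pairs so that the agent, and the hook commands it spawns, inherit the variables. `envWrap` is a hypothetical name.

```go
package main

import (
	"sort"
)

// envWrap returns argv prefixed with "env KEY=VALUE ..." so the wrapped
// command starts with the given variables set. Keys are sorted to keep the
// resulting command line deterministic. With no variables, argv is returned
// unchanged. Hypothetical helper, not part of the PR.
func envWrap(env map[string]string, argv ...string) []string {
	if len(env) == 0 {
		return argv
	}
	keys := make([]string, 0, len(env))
	for k := range env {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	out := make([]string, 0, len(env)+len(argv)+1)
	out = append(out, "env")
	for _, k := range keys {
		out = append(out, k+"="+env[k])
	}
	return append(out, argv...)
}
```

The first element of the result would then be passed as the command and the rest as its arguments. Whether this or the env-map form in the suggestion is preferable depends on how `NewTmuxSession` treats its third parameter, which this thread does not show.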
    - `--model <model>`: Model override (e.g., `sonnet-4`, `gpt-5`)
    - `--output-format <fmt>`: `text` (default), `json`, `stream-json`
    - Interactive mode: `agent --force` (launches TUI)
    - Prompt pattern for TUI ready: TBD (needs interactive probe)
This doc says the interactive-mode prompt pattern is “TBD”, but the new E2E agent implementation depends on a concrete PromptPattern() for readiness detection. Please update this section to reflect the actual prompt pattern being used (or document how it’s derived/validated) so the one-pager stays consistent with the code.
    - - Prompt pattern for TUI ready: TBD (needs interactive probe)
    + - Prompt pattern for TUI ready: Defined by `PromptPattern()` in the Cursor agent integration; derived from the `agent` startup banner and validated via interactive E2E tests.
24b8609 to ef0818d