wait_for_text matches stale scrollback immediately — needs baseline anchor #45

@tony

Summary

wait_for_text does not capture a baseline before polling. Any text matching pattern that is already on the pane when the tool is invoked therefore causes the very first poll (~50 ms later) to return found=True, regardless of whether the agent's command has produced any new output. The sibling tool wait_for_content_change already snapshots a baseline; wait_for_text doesn't, and so races against its own scrollback.

Reproduction

1. Create a fresh pane.
2. Send: echo READY
3. Wait until the prompt returns.
4. Call wait_for_text(pane_id=…, pattern="READY", timeout=2).
   Expected: timeout, found=False (nothing new has been written).
   Actual:   found=True within ~50 ms, matching the line from step 2.
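
The steps above can be modelled without a tmux server. The following is a minimal sketch, not the code in wait.py: FakePane and naive_wait_for_text are hypothetical names, the pane is reduced to a list of lines, and the loop polls immediately rather than after the first ~50 ms tick.

```python
import re
import time


class FakePane:
    """Hypothetical stand-in for a tmux pane: just a list of captured lines."""

    def __init__(self, lines):
        self.lines = list(lines)

    def capture_pane(self):
        return list(self.lines)


def naive_wait_for_text(pane, pattern, timeout, interval=0.05):
    """Poll the whole capture with no baseline, as this issue describes."""
    regex = re.compile(pattern)
    deadline = time.monotonic() + timeout
    while True:
        if any(regex.search(line) for line in pane.capture_pane()):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Pane state after steps 1-3: the typed command, its output, a fresh prompt.
pane = FakePane(["$ echo READY", "READY", "$ "])

found = naive_wait_for_text(pane, "READY", timeout=0.2)
print(found)  # True, although nothing was written after the call started
```

With no state recorded before the loop, the function cannot tell the READY from step 2 apart from a READY printed later.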

Relevant code paths:

  • src/libtmux_mcp/tools/pane_tools/wait.py lines 100–254 — wait_for_text. The poll loop at lines 202–226 calls pane.capture_pane(start=content_start, end=content_end) on each tick and runs the compiled regex over every returned line. No state is recorded before the loop starts.
  • src/libtmux_mcp/tools/pane_tools/wait.py lines 257–373 — wait_for_content_change. Line 327 (initial_content = await asyncio.to_thread(pane.capture_pane)) captures a baseline before the loop and compares each poll against it. This is the pattern wait_for_text should mirror.

Why this matters for MCP DX

Agents call wait_for_text to synchronise on command output. A tool whose name reads as "wait" but whose behaviour reads as "match anywhere in the current pane" violates the principle of least astonishment, and there is no signal in the result that the match came from stale scrollback rather than fresh output. The synchronous "is the pattern currently in the pane?" intent is already served by search_panes (src/libtmux_mcp/tools/pane_tools/search.py lines 76–142), so wait_for_text should mean wait for new appearance.

Options considered

(A) Baseline-anchored, no flag — recommended

At entry, read (history_size, cursor_y) via display-message (the existing pattern used by snapshot_pane / get_pane_info in src/libtmux_mcp/tools/pane_tools/meta.py lines 128–162). The pair (hs0, cy0) defines an absolute grid index baseline_abs = hs0 + cy0 that is stable across subsequent scrolling: tmux's capture-pane -S <n> interprets n against the live hsize (cmd-capture-pane.c lines 145–160, top = gd->hsize + n).

On each poll, read the current history_size (hs1) and capture only lines after the baseline: pane.capture_pane(start=baseline_abs - hs1 + 1, end=None). The + 1 skips the baseline line itself so we don't re-match the line the cursor sat on at entry.
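
The arithmetic can be checked on a toy grid. This is a sketch under stated assumptions: capture_start and lines_after_baseline are hypothetical helpers, the grid is a plain Python list indexed from the oldest history line, and no history limit is modelled.

```python
def capture_start(hs0, cy0, hs1):
    """Value for capture-pane -S that begins one line below the baseline.

    hs0, cy0: history_size and cursor_y read at entry.
    hs1: history_size read on the current poll.
    """
    baseline_abs = hs0 + cy0
    return baseline_abs - hs1 + 1


def lines_after_baseline(grid, hs1, start):
    """Toy capture: -S n maps to absolute index hs1 + n, clipped at the oldest line."""
    top = max(0, hs1 + start)
    return grid[top:]


# At entry: 3 history lines, a 3-row screen, cursor on row 2 (absolute index 5).
hs0, cy0 = 3, 2
grid = ["h0", "h1", "h2", "$ echo READY", "READY", "$ "]
assert grid[hs0 + cy0] == "$ "

# Later: two new lines were printed, so two more rows scrolled into history.
grid += ["building...", "READY"]
hs1 = 5

start = capture_start(hs0, cy0, hs1)
new_lines = lines_after_baseline(grid, hs1, start)
print(start, new_lines)  # 1 ['building...', 'READY']
```

The stale READY at absolute index 4 sits above the baseline and is never captured; only the READY printed after entry is visible to the regex.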

Why this works in practice: on the main screen, new content is monotonically appended at or near the cursor — screen-write.c:screen_write_linefeed is the only mover, and when it would advance past sy-1 it scrolls the top line into history (grid.c grid_scroll_history increments hsize). Full-screen TUIs run on the alternate screen, which has a fresh grid with hlimit=0 (screen.c:screen_alternate_on), so they don't interact with the baseline at all.

Pros: a docstring update plus a small implementation change; no new state to manage, no fd lifecycle; mirrors the in-tree pattern from wait_for_content_change. The change is breaking in name only: returning found=False where today's code returns found=True is what callers intended anyway, and the pre-alpha version (0.1.0a0) makes the cost negligible.

Cons: edge cases the docstring must call out — see the bottom of this issue.

(B) Baseline-anchored, opt-in flag skip_existing_matches=False

Same mechanism as (A), but gated behind a boolean parameter that defaults to today's unsafe behaviour. Preserves callers that have somehow come to rely on stale-match semantics.

Pros: no breakage. Cons: every caller has to know to set a flag to get the only sensible default. MCP guidance (and FastMCP's own example tools, e.g. examples/memory.py splitting remember() and read_profile() instead of one memory(mode=…) tool) favours narrow tools over mode flags. The flag is effectively a permanent tombstone for a bug.

(C) Sentinel-marker pattern, docs-only fix

Document that callers should send a unique marker (cmd; echo __WAIT_$RANDOM__) and wait for that marker. Works adversarially and addresses cases that even (A) can't fix (e.g. patterns the caller didn't originate, such as prompts or log lines).

Pros: no code change. Cons: pushes the cost onto every caller, doesn't help when the agent is waiting for output it didn't author (logs, third-party processes, prompts), and is easy to forget. Useful as a complement to (A), not a replacement.
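
A caller-side sketch of the marker pattern, assuming the marker is generated in Python with uuid rather than with the shell's $RANDOM; with_sentinel is a hypothetical helper, not part of libtmux_mcp. Because the marker text is known before the command is sent, the typed command line also contains it, so the pattern is anchored to a whole line.

```python
import re
import uuid


def with_sentinel(cmd):
    """Return (command_to_send, compiled_pattern) for a one-off marker."""
    marker = f"__WAIT_{uuid.uuid4().hex}__"
    command = f"{cmd}; echo {marker}"
    # Anchor to the whole line so the typed command line, which carries the
    # marker text after the prompt, does not count as a match.
    pattern = re.compile(rf"^{re.escape(marker)}$")
    return command, pattern


command, pattern = with_sentinel("make test")
marker = command.rsplit(" ", 1)[-1]

# Simulated pane contents once the command has finished.
pane_lines = [f"$ {command}", "ok", marker, "$ "]
matches = [line for line in pane_lines if pattern.search(line)]
```

The command string would be sent to the pane and the pattern's source passed to wait_for_text; only the line printed by echo matches.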

(D) pipe-pane side channel

Open a pipe-pane to a side channel at baseline time (cmd-pipe-pane.c lines 56–120), then read the pipe for new bytes. Strongest guarantee — sees only future bytes, immune to scrollback rollover, no polling.

Pros: most correct. Cons: heaviest — fd lifecycle, cleanup on pane death, a second reader loop, and changes the shape of the tool from "poll-and-return" to something stateful. Belongs in a future streaming tool (e.g. a stream_pane), tracked separately from this fix.

Recommendation

Adopt (A). The implementation is a small extension of wait_for_text that mirrors wait_for_content_change. The current content_start / content_end parameters can either be dropped (the baseline anchor supersedes them as a semantics knob) or repurposed as a performance cap only; (A) does not need them.

search_panes should be cross-referenced from the new docstring as the companion for "match anywhere right now." (C) should be mentioned in the docstring as an adversarial-safety pattern for callers who need stronger guarantees than tmux's grid model provides. (D) is deferred to a future tracking issue.

Edge cases the docstring must call out

Regardless of which option is chosen, the new docstring should be explicit about:

  • Scrollback truncation. If history-limit is small and the baseline line rolls out of history during the wait, start becomes more negative than -hsize_now. tmux clips -S to the oldest available line (cmd-capture-pane.c line 152), so the worst case degrades to today's behaviour on the surviving portion of history — not an infinite false-match loop.
  • Reverse-index sequences (\eM). Programs that scroll content downward at the top of the scrolling region can rewrite history below the baseline. Rare on the main screen — pagers (less, more) and other heavy users run on the alternate screen and aren't observed via this tool.
  • clear / reset. With the default scroll-on-clear option, cleared content scrolls into history (screen-write.c:screen_write_clearscreen), so baseline anchoring is unaffected.

Out of scope

  • wait_for_content_change — already anchored, no change needed.
  • A streaming pipe-pane-based tool — track separately if added to the roadmap.
  • search_panes semantics — already covers the synchronous case correctly.

Labels: bug, enhancement
