feat(desktop): client-side stall detection for chat tool calls by eulicesl · Pull Request #7555 · BasedHardware/omi

eulicesl · 2026-05-30T20:51:16Z

Summary

Problem: A running tool call had only two states — running / completed. When a tool is slow or hangs, the user sees an indefinite spinner with no signal that anything is wrong and no way to bail out. This is a common source of "the chat is stuck / broken" perception.
What changed: Adds client-side stall detection. A StallDetector actor watches two clocks per turn (the inter-event gap and each in-flight tool's own duration) and promotes UI state through .running → .slow → .stalled at configurable thresholds. The UI surfaces a "still working…" annotation on slow tools and a message-level Cancel banner on stalled ones, wired to the existing AgentBridge.interrupt().
What did NOT change (scope boundary): No bridge/protocol or backend changes — the detector reads the timestamped events the bridge already emits. No tool-cancellation timeouts (it only surfaces UI + offers manual Cancel). No telemetry changes.

Linked Issue / Bounty

N/A — independent contribution.

Root Cause (bug fixes only)

N/A — new feature (UX reliability affordance).

How It Works

StallDetector is pure logic: the current time is passed in as a parameter on every operation (step(kind:atMs:), tick(atMs:)), so it has no wall-clock dependency and is fully deterministic under test. Production wraps it in a periodic tick task while a turn is active so promotions still fire during silent gaps when no events arrive.
StallThresholds holds the promotion cutoffs (slowGapMs / stalledGapMs, default 8s / 20s) as a Sendable/Equatable struct so they can be tuned later with no behavioral churn.
ToolCallStatus is extended from { running, completed } to { running, slow, stalled, completed, failed } with an isInFlight helper. The exhaustive switches that consume it are updated across the chat, onboarding, and task-chat surfaces (this is the bulk of the changed-line count — mechanical enum-extension ripple).
ChatProvider feeds every bridge callback to the detector and maps detector transitions onto the tool-call rows; the stalled state drives the Cancel banner.

Change Type

Feature

Scope

Desktop app (Swift/macOS)

Security Impact

New permissions or capabilities introduced? No
Auth or token handling changed? No
Data access scope changed (screen data, OCR, user data)? No
New or changed network calls? No
Plugin or tool execution surface changed? No — Cancel routes to the existing AgentBridge.interrupt(). The detector observes only event timing, never content.

Testing

Built locally — xcrun swift build -c debug --package-path Desktop → Build complete!
Manual verification: compile + unit tests on macOS. See "Human Verification" for the runtime gap.
Existing tests pass
New tests added — StallDetectorTests (13 tests: inter-event promotion, per-tool timers tracked independently, reset-on-event, threshold guard) and ToolCallStatusTests (3 tests: the extended-enum + isInFlight contract).

Human Verification

Verified scenarios: clean debug build against current main; 16 new tests pass; full test target compiles.
Edge cases checked (via tests): a tool's own timer promoting independently of the inter-event gap; multiple parallel tools tracked separately; repeated tick at the same timestamp emitting no duplicate transition; thresholds precondition.
What I did not verify (and why): I did not drive the running app to screenshot each .slow / .stalled state. Deterministically forcing a real stall end-to-end needs a fake-bridge harness (out of scope for this PR), so runtime UI evidence is a known gap.

Evidence

Test output:

Test Suite 'StallDetectorTests' passed  — Executed 13 tests, 0 failures
Test Suite 'ToolCallStatusTests' passed — Executed 3 tests, 0 failures
Build complete!

No runtime screenshot — honest gap (see Human Verification). Happy to add one if a reviewer wants it.

Docs

N/A — no docs impact

Performance, Privacy, and Reliability

Performance: one actor + a 500ms tick task active only during an in-flight turn; negligible. Privacy: the detector sees event kind and timing only — never prompt, tool input, or output text. Reliability: strictly additive — when nothing stalls, behavior is identical to today.

Migration / Backward Compatibility

Backward compatible? Yes
Migration needed? No — the ToolCallStatus extension is additive; persistence treats the new in-flight states as transient and never relies on them being present in stored history.

Risks and Mitigations

Risk: Extending ToolCallStatus could miss an exhaustive switch. Mitigation: Swift's exhaustiveness checking forces every consumer to handle the new cases; the compile is green, and ToolCallStatusTests locks the isInFlight contract.
Risk: A periodic tick task leaking past turn end. Mitigation: the tick task is cancelled via defer on turn-scope exit (success or throw).
Known limitation: thresholds (8s / 20s) are first-pass defaults, intended to be tuned against real timing data later.

AI Disclosure

Tools used: Claude Code (Claude Opus).
AI-assisted portions: implementation and this PR description.
How I verified correctness: ran the local debug build and the unit-test suite (all green); manually reviewed every diff and confirmed the detector never touches message content.

chatgpt-codex-connector · 2026-05-30T20:51:21Z

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

greptile-apps · 2026-05-30T20:57:18Z

Greptile Summary

Adds client-side stall detection for chat tool calls on macOS. A new StallDetector actor watches two independent timers — an inter-event gap and a per-tool duration — and promotes ToolCallStatus through .running → .slow → .stalled at configurable thresholds (8 s / 20 s). The UI gains an amber spinner for slow tools and a Cancel banner for stalled ones, wired to the existing AgentBridge.interrupt().

StallDetector / StallThresholds: Pure-logic actor with deterministic time injection; well-covered by 13 new unit tests.
ToolCallStatus extension: Adds .slow, .stalled, .failed + isInFlight helper; exhaustive switch ripple updated across chat, onboarding, task-chat, and persistence surfaces.
ChatProvider integration: Feeds every bridge callback and a 500 ms tick task into the detector; one bug — stall promotions are silently dropped for tools whose bridge-provided toolUseId is nil because the synthetic \"untracked-\\(name)\" key never matches the nil stored in the content block.

Confidence Score: 3/5

The core stall detection logic and UI wiring are sound, but the integration between the detector's synthetic tool IDs and the content-block matching has a functional gap that prevents the banner from appearing for tools without a bridge-supplied toolUseId.

The applyStallTransitions method registers tools under a synthetic key untracked-name when toolUseId is nil, but then matches content blocks by tuid == id where tuid is also nil — so the comparison is always false. Any tool the bridge delivers without a toolUseId will never show the amber spinner or Cancel banner, which is the primary deliverable of this feature. The fix is a one-line change, but until it lands the feature is partially inoperative for that code path.

desktop/Desktop/Sources/Providers/ChatProvider.swift — specifically applyStallTransitions (the synthetic-ID / nil-tuid mismatch). All other changed files look correct.

Important Files Changed

Filename	Overview
desktop/Desktop/Sources/Chat/StallDetector.swift	New actor implementing pure-logic stall detection with deterministic time injection; two independent timers (inter-event gap + per-tool duration) producing typed Transition values; well-tested and idempotent.
desktop/Desktop/Sources/Providers/ChatProvider.swift	Integrates StallDetector into the chat turn; contains a bug in applyStallTransitions where tools registered under a synthetic "untracked-{name}" key never match their content blocks (toolUseId is nil), silently dropping all stall promotions for those tools.
desktop/Desktop/Sources/MainWindow/Pages/ChatPage.swift	Extends ToolCallsGroup with aggregateStatus, statusIcon helper, and ToolCallStalledBanner; Cancel button wired via onCancel closure chain to AgentBridge.interrupt(); exhaustive switches updated for new enum cases.
desktop/Desktop/Sources/Rewind/Core/TaskChatMessageStorage.swift	Persistence encoding updated to map .slow/.stalled to "running" and adds .failed round-trip; forward-compat default for unknown status codes.
desktop/Desktop/Tests/StallDetectorTests.swift	13 deterministic tests covering inter-event promotion, per-tool timer independence, reset-on-event, idempotent ticks, and parallel tools; no test for the untracked-id integration path.

Sequence Diagram

sequenceDiagram
    participant Bridge as AgentBridge
    participant CP as ChatProvider (MainActor)
    participant SD as StallDetector (actor)
    participant UI as ToolCallsGroup

    Bridge->>CP: toolActivityHandler("started", toolUseId)
    CP->>CP: addToolActivity(.running)
    CP->>SD: await step(.toolStarted(id), atMs)
    SD-->>CP: [] no transitions yet

    loop every 500ms stallTickTask
        CP->>SD: await tick(atMs now)
        alt gap under 8s
            SD-->>CP: []
        else gap reaches 8s
            SD-->>CP: [.tool(id, .running to .slow)]
            CP->>CP: "applyStallTransitions block status = .slow"
            CP-->>UI: amber spinner
        else gap reaches 20s
            SD-->>CP: [.tool(id, .slow to .stalled)]
            CP->>CP: "applyStallTransitions block status = .stalled"
            CP-->>UI: Cancel banner shown
        end
    end

    UI->>CP: onCancelTurn tapped
    CP->>Bridge: stopAgent interrupt

    Bridge->>CP: toolActivityHandler("completed", toolUseId)
    CP->>CP: addToolActivity(.completed)
    CP->>SD: await step(.toolCompleted(id), atMs)
    SD-->>CP: [.tool(id, .stalled to .running)]

_{Reviews (1): Last reviewed commit: "feat(desktop): client-side stall detecti..." | Re-trigger Greptile}

Greptile P1 on BasedHardware#7555: tools that arrive without a `toolUseId` were registered in the StallDetector under a synthetic `untracked-<name>` key, but their content block stored `toolUseId: nil`. The transition match used `tuid == id`, so `nil == "untracked-<name>"` was always false and every stall transition for those tools was silently dropped — the spinner never went amber and the Cancel banner never appeared, the exact case the feature exists to fix. Route both the registration site and the match site through a single `ChatProvider.stallTrackingId(toolUseId:name:)` helper so the key is derived identically and cannot diverge again. Add tests locking the derivation (real id passes through; nil falls back to the name key). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

kodjima33

Desktop feature: client-side stall detection, well-tested actor pattern. Approve only — feature >500 lines and mergeStateStatus BLOCKED, author needs to rebase.

A running tool call previously had only two states (running / completed), so a slow or hung tool showed an indefinite spinner with no signal and no way out. Adds a `StallDetector` actor that watches two clocks per turn — the inter-event gap and each in-flight tool's own duration — and promotes state through `.running → .slow → .stalled` at configurable thresholds (`StallThresholds`, default 8s / 20s). The detector is pure logic: time is passed in on every call, so it is fully deterministic and testable without wall-clock waits. `ToolCallStatus` is extended to `{ running, slow, stalled, completed, failed }` with an `isInFlight` helper, and the exhaustive switches that consume it are updated across the chat, onboarding, and task-chat surfaces. `ChatProvider` feeds every bridge event to the detector and runs a periodic tick task so promotions fire during silent gaps. The UI surfaces a "still working…" annotation on slow tools and a message-level Cancel banner on stalled ones, wired to `AgentBridge.interrupt()`. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Greptile P1 on BasedHardware#7555: tools that arrive without a `toolUseId` were registered in the StallDetector under a synthetic `untracked-<name>` key, but their content block stored `toolUseId: nil`. The transition match used `tuid == id`, so `nil == "untracked-<name>"` was always false and every stall transition for those tools was silently dropped — the spinner never went amber and the Cancel banner never appeared, the exact case the feature exists to fix. Route both the registration site and the match site through a single `ChatProvider.stallTrackingId(toolUseId:name:)` helper so the key is derived identically and cannot diverge again. Add tests locking the derivation (real id passes through; nil falls back to the name key). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

eulicesl · 2026-05-31T16:34:53Z

Rebased onto the latest main, checks green, and all review threads resolved — this one's ready to merge whenever convenient. Thanks for the review!

greptile-apps Bot reviewed May 30, 2026

View reviewed changes

Comment thread desktop/Desktop/Sources/Providers/ChatProvider.swift Outdated

kodjima33 approved these changes May 31, 2026

View reviewed changes

eulicesl and others added 2 commits May 31, 2026 11:08

eulicesl force-pushed the feat/desktop-chat-stall-detection branch from 147b9b7 to a10264a Compare May 31, 2026 15:10

eulicesl mentioned this pull request May 31, 2026

test(desktop): lock chat prompt ↔ tool-registry & schema consistency #7573

Closed

8 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(desktop): client-side stall detection for chat tool calls#7555

feat(desktop): client-side stall detection for chat tool calls#7555
eulicesl wants to merge 2 commits into
BasedHardware:mainfrom
eulicesl:feat/desktop-chat-stall-detection

eulicesl commented May 30, 2026

Uh oh!

chatgpt-codex-connector Bot commented May 30, 2026

Uh oh!

greptile-apps Bot commented May 30, 2026

Uh oh!

Uh oh!

kodjima33 left a comment

Uh oh!

eulicesl commented May 31, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

eulicesl commented May 30, 2026

Summary

Linked Issue / Bounty

Root Cause (bug fixes only)

How It Works

Change Type

Scope

Security Impact

Testing

Human Verification

Evidence

Docs

Performance, Privacy, and Reliability

Migration / Backward Compatibility

Risks and Mitigations

AI Disclosure

Uh oh!

chatgpt-codex-connector Bot commented May 30, 2026

Uh oh!

greptile-apps Bot commented May 30, 2026

Greptile Summary

Confidence Score: 3/5

Important Files Changed

Sequence Diagram

Uh oh!

Uh oh!

kodjima33 left a comment

Choose a reason for hiding this comment

Uh oh!

eulicesl commented May 31, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants