fix: thread abort signals through host waits by ThomasK33 · Pull Request #94 · coder/agent-tty

ThomasK33 · 2026-05-12T08:08:23Z

Summary

Thread AbortSignal through host-side wait, render polling, run-completion, lifecycle polling, and RPC request paths.
Register timer/socket/listener cleanup with ResourceScope so aborted or completed waits release resources deterministically.
Centralize scoped settlement/abort helpers in src/util/abort.ts, including preserving original operation errors when cleanup also fails.
Add targeted coverage for abort utilities, run-completion aborts, sendRpc aborts, and RPC request-context abort on client disconnect.
Address review follow-ups by making wait --exit cleanup cancellable, normalizing lifecycle delay abort errors, documenting shared contracts, and resetting unrelated mise.lock drift.

Fixes #84.

Notes

The stale RPC socket liveness probe now has a bounded 1s probe. If the connect attempt neither succeeds nor fails promptly, the socket is treated as stale so host startup does not hang indefinitely.

Validation

npm run typecheck -- --pretty false
npm run test -- test/unit/host/runCompletionCoordinator.test.ts test/unit/host/hostMain.test.ts
npm run lint
npm run format:check
npm run test -- test/integration/wait-render.test.ts
npm run test -- test/integration/lifecycle.test.ts
npm run test -- test/unit/util/abort.test.ts test/unit/host/rpcClient.test.ts test/unit/host/rpcServer.test.ts test/unit/host/runCompletionCoordinator.test.ts test/unit/host/hostMain.test.ts
npm run verify

Generated with mux • Model: openai:gpt-5.5 • Thinking: high

ThomasK33 · 2026-05-12T08:08:37Z

/coder-agents-review

coder-agents-review

The abort signal threading is a solid structural improvement: deterministic cleanup via ResourceScope, consistent abort propagation through host waits, and the run-completion test verifies sentinel bytes stay hidden after abort. Clean decomposition of the generic waitForScopedOperation helper.

This is a first-pass (Netero) review only. These are mechanical findings; the full review panel has not yet reviewed this PR and will review after these findings are addressed.

1 P1, 1 P2, 2 P3, 1 Note.

The P1 is coverage: src/util/abort.ts exports three functions used by five modules, including an invariant that crashes on misuse, with zero unit tests. Every other utility in src/util/ has a corresponding test file. The P2 is the broader test ratio: 416 new source lines across 6 files, 37 test lines covering one path.

ThomasK33 · 2026-05-12T08:53:22Z

/coder-agents-review

coder-agents-review

Strong response to R1: all five findings addressed in a single commit, the settlement pattern centralized into shared utilities, and test coverage went from 37 lines to 327 lines. The createResourceScopedSettlers + waitForScopedOperation composition is clean and well-tested. Multiple reviewers independently verified the settlement ordering, abort propagation, and scope cleanup semantics.

1 P2, 9 P3, 3 Nit. 19-reviewer panel.

The P2 is a resource accumulation issue: each aborted wait --exit request appends a .then() handler to the shared ptyExitPromise that is never cleaned up by the ResourceScope. Under automation with short client timeouts, this grows without bound. The idle-wait path correctly registers its poll interval on the scope; the exit path does not have an equivalent.

The P3s cluster around three themes: (1) test coverage for the pre-aborted signal fast path, which three reviewers independently flagged; (2) documentation gaps on the new shared utilities and MethodContext; (3) type-level enforcement and error identity consistency.

src/host/hostMain.ts:734

P2 [ATT94-6] The exit-wait path creates waitCondition = ptyExitPromise.then(...) without registering anything on waitScope. When the client disconnects and waitForScopedOperation rejects with AbortError, the scope closes empty, but the .then() handler on ptyExitPromise still holds a closure over state and fires when the PTY eventually exits.

For a single aborted wait this is trivial. But ptyExitPromise is a singleton for the host lifetime. Each aborted wait --exit appends another .then() handler. An agent polling with short client timeouts accumulates one closure per attempt. At thousands of abandoned waits, all closures fire synchronously on PTY exit, each calling state.snapshot() and allocating a result object nobody reads.

The idle-wait path (line 752) correctly adds checkInterval to waitScope, so scope close stops the poll. The exit path has no equivalent cleanup. Consider wrapping ptyExitPromise.then() in a cancellable pattern (e.g., racing against a scope-owned deferred that rejects on scope close). (Hisoka)

ThomasK33 · 2026-05-12T10:08:35Z

/coder-agents-review

coder-agents-review

All 13 R2 findings verified fixed across 13 reviewers. The R3 fixes are thorough: ptyExitWaiters Set eliminates the .then() leak, pollDelay normalizes abort error identity, runAbortCallback guards onAbort throws, requestScope.close() is wrapped in try/catch, and the TimeoutConfig discriminated union enforces the timeout pairing at compile time. Multiple reviewers (Hisoka, Takumi, Knov, Pariston) independently traced the settlement races and confirmed correctness.

2 P3, 2 Nit. 13-reviewer panel (9 clean).

The two P3s are minor: a test gap where the most common production path through waitForScopedOperation (operation resolves normally) has zero coverage, and the new makeAbortReason/makeAbortError naming distinction needs a one-liner JSDoc each so callers pick the right one.

PS. The commit subject "fix: clean up abort review findings" describes the process, not the change. Something like fix: guard scope cleanup, normalize abort reasons, add abort JSDoc names what shipped.

ThomasK33 · 2026-05-12T10:44:52Z

/coder-agents-review

coder-agents-review

All 4 R3 findings verified fixed. CI green (11/11). 9-reviewer panel, 7 clean. No correctness issues found across 4 rounds.

Four rounds, 32 findings total: 22 fixed, 7 dropped, 3 unposted Notes. Every posted finding was addressed within one round. The abort threading is structurally sound: multiple reviewers independently traced the settlement races, scope cleanup ordering, and resource lifecycle across all paths. Test coverage is healthy at 1.3:1 source-to-test ratio with 29 targeted tests.

One Nit: ptyExitWaiters could use a comment explaining why it exists alongside ptyExitPromise.

ThomasK33 · 2026-05-12T11:11:11Z

/coder-agents-review

coder-agents-review

ATT94-32 verified. All prior findings resolved. 9-reviewer panel, 8 clean. No correctness issues across 5 rounds.

2 Nit: both are JSDoc completeness on abort.ts exports. throwIfAborted is the only exported function without a comment, and the createResourceScopedSettlers JSDoc documents the reject-path AggregateError but omits the resolve-path close failure mode.

Five rounds, 34 findings total: 23 fixed, 7 dropped, 4 open (2 Nit posted, 3 unposted Notes). The abort signal threading is structurally sound and well-tested.

ThomasK33 · 2026-05-12T11:33:55Z

/coder-agents-review

ThomasK33 · 2026-05-12T11:44:39Z

/coder-agents-review

coder-agents-review

R5 Nits verified fixed. 9-reviewer panel, 8 clean. No correctness issues.

2 P3, both test coverage for edge paths in the abort utilities. The resolve-close-failure asymmetry (ATT94-35) is a re-raise of ATT94-17 (dropped R2) with new evidence: the existing reject-close-failure test at line 88 proves cleanup CAN throw, invalidating the original drop reasoning. The timeout-cleanup gap (ATT94-36) is the most common production path (command succeeds before timeout) with no assertion that the timer is cleaned up.

ThomasK33 · 2026-05-12T11:53:43Z

/coder-agents-review

coder-agents-review

Review blocked: the two open findings from R6 have no response and no code push.

ATT94-35 (P3): createResourceScopedSettlers resolve-close-failure path is untested. The reject-close-failure path has a test; the resolve path (which silently converts a success into a rejection) does not.
ATT94-36 (P3): No waitForScopedOperation test where the operation wins the race against a configured timeout. This is the normal production path for every successful run, wait, and waitForRender.

Further review is blocked until the author responds or pushes fixes for these findings.

ThomasK33 · 2026-05-12T11:58:47Z

/coder-agents-review

coder-agents-review

ATT94-35 and ATT94-36 verified fixed. 9-reviewer panel, all clean. Netero clean. No new findings.

Eight rounds, 36 findings: 27 fixed, 7 dropped, 3 unposted Notes. Every posted finding addressed. No correctness issues found by any reviewer at any point in the review. The abort signal threading is structurally sound, well-documented, and thoroughly tested.

fix: thread abort signals through host waits

1b3b990

coder-agents-review Bot reviewed May 12, 2026

View reviewed changes

Comment thread src/util/abort.ts

Comment thread src/host/hostMain.ts Outdated

Comment thread src/host/rpcServer.ts

Comment thread src/host/rpcServer.ts

Comment thread src/host/hostMain.ts Outdated

test: cover abort signal cleanup paths

b7c0143

coder-agents-review Bot reviewed May 12, 2026

View reviewed changes

Comment thread test/unit/util/abort.test.ts

Comment thread src/util/abort.ts

Comment thread src/host/rpcClient.ts Outdated

Comment thread src/util/abort.ts

ThomasK33 force-pushed the agent-tty-fqde branch from 8644f48 to 05b1461 Compare May 12, 2026 10:39

coder-agents-review Bot approved these changes May 12, 2026

View reviewed changes

Comment thread src/host/hostMain.ts

ThomasK33 force-pushed the agent-tty-fqde branch from 05b1461 to 8a22991 Compare May 12, 2026 11:10

coder-agents-review Bot approved these changes May 12, 2026

View reviewed changes

Comment thread src/util/abort.ts

Comment thread src/util/abort.ts

ThomasK33 force-pushed the agent-tty-fqde branch from 8a22991 to 9ae0b58 Compare May 12, 2026 11:33

coder-agents-review Bot reviewed May 12, 2026

View reviewed changes

Comment thread test/unit/util/abort.test.ts

Comment thread test/unit/util/abort.test.ts

fix: guard scope cleanup and abort reasons

9165f30

ThomasK33 force-pushed the agent-tty-fqde branch from 9ae0b58 to 9165f30 Compare May 12, 2026 11:53

coder-agents-review Bot reviewed May 12, 2026

View reviewed changes

coder-agents-review Bot approved these changes May 12, 2026

View reviewed changes

ThomasK33 merged commit 1da0a0e into main May 12, 2026
11 checks passed

ThomasK33 deleted the agent-tty-fqde branch May 12, 2026 12:16

Conversation

ThomasK33 commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Notes

Validation

Uh oh!

ThomasK33 commented May 12, 2026

Uh oh!

coder-agents-review Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

ThomasK33 commented May 12, 2026

Uh oh!

coder-agents-review Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

ThomasK33 commented May 12, 2026

Uh oh!

coder-agents-review Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

ThomasK33 commented May 12, 2026

Uh oh!

coder-agents-review Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

ThomasK33 commented May 12, 2026

Uh oh!

coder-agents-review Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

ThomasK33 commented May 12, 2026

Uh oh!

ThomasK33 commented May 12, 2026

Uh oh!

coder-agents-review Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

ThomasK33 commented May 12, 2026

Uh oh!

coder-agents-review Bot left a comment

Choose a reason for hiding this comment

Uh oh!

ThomasK33 commented May 12, 2026

Uh oh!

coder-agents-review Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

ThomasK33 commented May 12, 2026 •

edited

Loading