fix(desktop): close saveMessage vs poll race with a pending-save counter by eulicesl · Pull Request #7559 · BasedHardware/omi

eulicesl · 2026-05-31T03:13:31Z

Summary

Problem: A duplicate / "ghost" chat message can appear after a turn completes. isSending is released before the AI message's save finishes (deliberately, to unblock the next query), so pollForNewMessages can run during that save's await, observe the just-saved AI message before its local UUID has been reconciled to the server id, and append it again as if it arrived from another platform.
What changed: Adds PendingSaveCounter — a @MainActor multi-holder counter. Every saveMessage call site brackets its save with begin()/end(), and the poll guard gains !pendingSaves.isActive, suppressing the poll for the full lifetime of any in-flight save. This closes the race window directly instead of relying on text heuristics.
What did NOT change (scope boundary): The pre-existing 200-char text-prefix merge in pollForNewMessages stays as a secondary safety net. The five save sites are not consolidated (see Risks). No backend, telemetry, or protocol changes.

Linked Issue / Bounty

N/A — independent contribution.

Root Cause (bug fixes only)

Root cause: isSending is cleared before the AI-response save resolves, so the !isSending poll guard is already satisfied while the save is still in flight. The poll then reads the saved-but-not-yet-reconciled message as new-from-another-platform and duplicates it.
Why it wasn't caught before: An existing 200-char text-prefix merge (keyed on sender + unsynced flag) masks most occurrences, so the duplicate only surfaces on short, common replies ("Yes", "Got it", "Sure") that can repeat verbatim across turns — rare and hard to reproduce by hand.

How It Works

PendingSaveCounter mirrors the existing ReentrancyGate pattern but supports multiple concurrent holders (several saves can be in flight at once). end() is bounded at 0 so a stray call can't drive the count negative and wedge the poll off.
Each of the five saveMessage sites calls begin() before launching the save and end() on both the success and catch paths (via MainActor.run where the surrounding Task closure isn't @MainActor).
The poll's existing guard !isSending, !isLoading, !isLoadingSessions gains !pendingSaves.isActive.

Change Type

Bug fix

Scope

Desktop app (Swift/macOS)

Security Impact

New permissions or capabilities introduced? No
Auth or token handling changed? No
Data access scope changed (screen data, OCR, user data)? No
New or changed network calls? No — same saveMessage / poll calls, only their interleaving is gated.
Plugin or tool execution surface changed? No

Testing

Built locally — xcrun swift build -c debug --package-path Desktop → Build complete!
Manual verification: compile + unit tests on macOS. See "Human Verification".
Existing tests pass
New tests added — PendingSaveCounterTests (6 tests): fresh-is-inactive, begin activates, end decrements, multiple holders stack (the concurrent-save case), end() bounded at zero, and interleaved/out-of-order begin/end.

Human Verification

Verified scenarios: clean debug build against current main; all 6 counter tests pass.
Edge cases checked (via tests): concurrent saves keeping the poll suppressed until the last one lands; stray end() not driving the count negative; out-of-order begin/end pairs.
What I did not verify (and why): there is no end-to-end reproducer of the duplicate-on-race in this PR. Reproducing it deterministically needs a mockable APIClient.saveMessage delay driven through the full ChatProvider lifecycle, which is larger test-infrastructure work. The synchronization primitive's contract is verified in isolation; the cross-flow behavior is asserted by construction (the poll cannot run while any save holds the counter).

Evidence

Test output:

Test Suite 'PendingSaveCounterTests' passed — Executed 6 tests, 0 failures
Build complete!

N/A — change is not directly user-visible (it removes an intermittent duplicate); no screenshot.

Docs

N/A — no docs impact

Performance, Privacy, and Reliability

Performance: a single integer counter check added to the poll guard; negligible. Privacy: no content is inspected — the counter only tracks save in-flight state. Reliability: strictly tightens an existing race; when no save is in flight, behavior is identical to today.

Migration / Backward Compatibility

Backward compatible? Yes
Migration needed? No — purely in-memory synchronization; no schema or stored-data change.

Risks and Mitigations

Risk: An unbalanced begin() without a matching end() would suppress the poll indefinitely. Mitigation: every site calls end() on both success and catch paths; end() is bounded at 0; the counter is unit-tested for the interleaved/stray-end cases.
Risk: Five separate save sites are easy to under-bracket. Mitigation: each site is documented inline and brackets symmetrically; consolidating them is deliberately deferred because they persist different message types at different lifecycle points and collapsing them could mask other races.

AI Disclosure

Tools used: Claude Code (Claude Opus).
AI-assisted portions: implementation and this PR description.
How I verified correctness: ran the local debug build and the unit-test suite (all green); manually traced each of the five save sites to confirm balanced begin()/end() on success and error paths.

greptile-apps · 2026-05-31T03:19:11Z

Greptile Summary

This PR introduces PendingSaveCounter, a @MainActor integer counter that brackets all five saveMessage call sites in ChatProvider and adds !pendingSaves.isActive to the pollForNewMessages guard, closing the race window where isSending is cleared before the AI-message save resolves and the poll could duplicate the just-saved message.

PendingSaveCounter.swift: New synchronization primitive with begin()/end() (bounded at zero to prevent negative counts) and isActive for the poll guard; well-documented contract and six unit tests cover fresh/single/stacked/bounded/interleaved scenarios.
ChatProvider.swift: All five fire-and-forget and inline save sites are symmetrically bracketed; for the four async Task sites, end() is correctly dispatched through await MainActor.run {} to satisfy the @MainActor isolation requirement.
Site 4 (critical inline do/catch): end() is duplicated in the success and catch branches rather than using defer, which leaves a small maintenance surface if a future change adds a guard or early-exit that forgets to call end().

Confidence Score: 4/5

The change is purely additive in-memory synchronization and cannot regress existing behaviour when no saves are in flight.

All five saveMessage sites are correctly bracketed with matching begin()/end() on both success and error paths. The only gap is site 4 duplicating end() in the do/catch instead of using defer — a future edit adding an early exit could forget end() and suppress the poll indefinitely, but this is a maintenance concern rather than a current defect.

The inline do/catch at site 4 in ChatProvider.swift is the only spot worth a second look to ensure future edits maintain the end()/defer discipline.

Important Files Changed

Filename	Overview
desktop/Desktop/Sources/PendingSaveCounter.swift	New @mainactor counter primitive — clean, well-documented, bounded at zero; contract matches all five call sites.
desktop/Desktop/Sources/Providers/ChatProvider.swift	Adds pendingSaves to pollForNewMessages guard and brackets all five saveMessage sites; site 4 duplicates end() instead of using defer.
desktop/Desktop/Tests/PendingSaveCounterTests.swift	Six synchronous contract tests on @mainactor cover all counter states correctly.

_{Reviews (1): Last reviewed commit: "fix(desktop): close saveMessage vs poll ..." | Re-trigger Greptile}

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9fbcdd2656

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Addresses three review findings on BasedHardware#7559: - Use `defer { pendingSaves.end() }` at the critical inline AI-save site instead of calling end() in both the success and catch branches, so a future early-exit can't skip the release and permanently suppress the poll (Greptile P2). - Re-check `!pendingSaves.isActive` after getMessages returns. A proactive message appended via appendAssistantMessage while the fetch was in flight could otherwise be processed before its UUID reconciles — the top-of-poll guard alone doesn't cover saves that begin during the fetch (Codex P2). - Add a debug `assert(count > 0)` in PendingSaveCounter.end() so an unbalanced end() surfaces immediately in debug; the release guard still bounds it at zero. Updated the bound test to verify the counter returns to zero and stays usable across rounds without relying on a stray end() (Greptile P2). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ae0aec65a5

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

kodjima33

Desktop bug fix: closes saveMessage/poll race with a counter-gated guard, unit tested. Approve only — mergeStateStatus BLOCKED, author needs to rebase before merge.

`isSending` is released *before* the AI message save completes (so the next query isn't blocked), which opens a window where `pollForNewMessages` can observe the just-saved AI message before its local UUID is reconciled to the server id — and append it again as if it arrived from another platform. An existing 200-char text-prefix merge catches most cases but is fragile on short, common replies ("Yes", "Got it") that repeat verbatim across turns. Adds `PendingSaveCounter`, a `@MainActor` multi-holder counter (mirrors the existing `ReentrancyGate` pattern, bounded at 0 so stray `end()` calls can't go negative). Every saveMessage call site brackets its save with `begin()` / `end()`, and the poll guard gains `!pendingSaves.isActive` — suppressing the poll for the full lifetime of any in-flight save rather than relying on text heuristics. The pre-existing merge stays as a secondary safety net. The five save sites (user follow-up, AI-from-notification, user turn start, AI success — the critical one — and partial-AI-on-error) are each documented inline. They are intentionally not consolidated: they persist different message types at different lifecycle points, and collapsing them risks masking other races. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Addresses three review findings on BasedHardware#7559: - Use `defer { pendingSaves.end() }` at the critical inline AI-save site instead of calling end() in both the success and catch branches, so a future early-exit can't skip the release and permanently suppress the poll (Greptile P2). - Re-check `!pendingSaves.isActive` after getMessages returns. A proactive message appended via appendAssistantMessage while the fetch was in flight could otherwise be processed before its UUID reconciles — the top-of-poll guard alone doesn't cover saves that begin during the fetch (Codex P2). - Add a debug `assert(count > 0)` in PendingSaveCounter.end() so an unbalanced end() surfaces immediately in debug; the release guard still bounds it at zero. Updated the bound test to verify the counter returns to zero and stays usable across rounds without relying on a stray end() (Greptile P2). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Follow-up to the poll-suppression hardening: suppressing the poll while a save is in flight could permanently drop that cycle, since pollForNewMessages only fires on activation / Cmd+R (no periodic poll). Messages from other platforms sent while the app was inactive would then stay unfetched until the next activation (Codex P2). PendingSaveCounter gains an `onDrained` callback fired when the count returns to 0. ChatProvider sets `pollDeferredDuringSave` whenever a poll bails due to an in-flight save (top guard or post-fetch recheck) and, on drain, re-runs the poll exactly once — only when a cycle was actually deferred, so the common no-save path adds no extra polling. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f2ad021073

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Follow-up to the drain-retry: onDrained cleared pollDeferredDuringSave before the retry had passed pollForNewMessages's other guards. If the retry then bailed on isSending/streaming (a new turn started mid-flight), the deferral was lost and the original poll's other-platform messages weren't fetched until the next activation (Codex P2). Clear the flag only inside pollForNewMessages, once it gets past all the deferral-relevant guards and commits to a fetch — not in onDrained. A retry that bails early now leaves the flag set, so the next drain (e.g. the AI-response save, which completes after isSending is released) retries and succeeds. The post-fetch recheck still re-sets the flag if a save sneaks in during getMessages. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

eulicesl · 2026-05-31T15:46:36Z

Desktop bug fix: closes saveMessage/poll race with a counter-gated guard, unit tested. Approve only — mergeStateStatus BLOCKED, author needs to rebase before merge.

Rebase Completed

eulicesl · 2026-05-31T16:34:54Z

Rebased onto the latest main, checks green, and all review threads resolved — this one's ready to merge whenever convenient. Thanks for the review!

kodjima33

Desktop bug fix (pending-save counter closes saveMessage↔poll race). CLEAN, 308 lines, scoped tests. Approve only — author below the 3-prior-merged-PRs bar in the default gate.

greptile-apps Bot reviewed May 31, 2026

View reviewed changes

Comment thread desktop/Desktop/Sources/Providers/ChatProvider.swift

Comment thread desktop/Desktop/Sources/PendingSaveCounter.swift

chatgpt-codex-connector Bot reviewed May 31, 2026

View reviewed changes

Comment thread desktop/Desktop/Sources/Providers/ChatProvider.swift Outdated

chatgpt-codex-connector Bot reviewed May 31, 2026

View reviewed changes

Comment thread desktop/Desktop/Sources/Providers/ChatProvider.swift Outdated

kodjima33 approved these changes May 31, 2026

View reviewed changes

eulicesl and others added 3 commits May 31, 2026 11:05

eulicesl force-pushed the feat/desktop-chat-save-race-fix branch from 24ac362 to f2ad021 Compare May 31, 2026 15:07

chatgpt-codex-connector Bot reviewed May 31, 2026

View reviewed changes

Comment thread desktop/Desktop/Sources/Providers/ChatProvider.swift Outdated

eulicesl mentioned this pull request May 31, 2026

test(desktop): lock chat prompt ↔ tool-registry & schema consistency #7573

Closed

8 tasks

kodjima33 approved these changes Jun 1, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(desktop): close saveMessage vs poll race with a pending-save counter#7559

fix(desktop): close saveMessage vs poll race with a pending-save counter#7559
eulicesl wants to merge 4 commits into
BasedHardware:mainfrom
eulicesl:feat/desktop-chat-save-race-fix

eulicesl commented May 31, 2026

Uh oh!

greptile-apps Bot commented May 31, 2026

Uh oh!

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

Uh oh!

kodjima33 left a comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

Uh oh!

eulicesl commented May 31, 2026

Uh oh!

eulicesl commented May 31, 2026

Uh oh!

kodjima33 left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

eulicesl commented May 31, 2026

Summary

Linked Issue / Bounty

Root Cause (bug fixes only)

How It Works

Change Type

Scope

Security Impact

Testing

Human Verification

Evidence

Docs

Performance, Privacy, and Reliability

Migration / Backward Compatibility

Risks and Mitigations

AI Disclosure

Uh oh!

greptile-apps Bot commented May 31, 2026

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Uh oh!

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

kodjima33 left a comment

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

eulicesl commented May 31, 2026

Uh oh!

eulicesl commented May 31, 2026

Uh oh!

kodjima33 left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants