vscode: Tower-divergence toast re-pops during Restart Tower; version mismatch should warn, not block commands #1017

@amrmelsayed

Description

Two related concerns surfaced in v3.1.9 with the Tower-version divergence preflight from #983:

Bug 1: Restart Tower toast re-pops several times during the restart

Repro:

  1. Tower is running an older version than the installed CLI (e.g., Tower 3.1.7 with CLI 3.1.9 after an npm install -g without a Tower restart).
  2. The divergence toast appears: "Codev Tower is running 3.1.7, but 3.1.9 is installed. Restart Tower to load it."
  3. Click Restart Tower.
  4. The toast closes (correct), but then re-pops a couple of times during the afx tower stop && afx tower start window before the process settles and the toast finally goes away for good.

Likely root cause:

probeRunningTower in packages/vscode/src/preflight/preflight.ts is invoked from multiple trigger sites (initial preflight, Tower reconnect events, and the post-restart re-probe at the end of restartTowerAndReprobe). The existing towerDivergenceShownThisSession guard (around preflight.ts:395) suppresses duplicate toasts within a session but is reset to false whenever the probe transitions to ok (preflight.ts:367).

During afx tower stop && afx tower start:

  1. Tower stops → reconnect-triggered probe returns unreachable (silent, no toast, no flag change).
  2. New Tower starts → probe returns ok (versions match) → flag resets to false at line 367.
  3. If any further probe in the restart window momentarily reports stale, the guard is now cleared and the toast re-fires. Possible causes of a momentary stale reading: a race with version-endpoint readiness, the readiness barrier from #997 (tower: reconcile terminal sessions before serving requests) not being fully wired into the version probe path, or transient reconnect chatter.

The flag is per-session suppression, but a restart can legitimately transition through ok multiple times before the system fully settles, each transition clearing the guard.
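The sequence above can be replayed against a small model of the guard. This is a sketch of the suspected behavior, not the code in preflight.ts: only towerDivergenceShownThisSession comes from the source, while ProbeResult, DivergenceGuardModel, and replayProbes are hypothetical names, and the three probe outcomes are assumed from the description above.

```typescript
// Hypothetical model of the suspected guard behavior; not the real preflight.ts.
type ProbeResult = 'ok' | 'stale' | 'unreachable';

class DivergenceGuardModel {
  private towerDivergenceShownThisSession = false;
  readonly toasts: string[] = [];

  onProbe(result: ProbeResult): void {
    if (result === 'ok') {
      // Mirrors the reset described at preflight.ts:367.
      this.towerDivergenceShownThisSession = false;
      return;
    }
    if (result === 'stale' && !this.towerDivergenceShownThisSession) {
      this.towerDivergenceShownThisSession = true;
      this.toasts.push('divergence');
    }
    // 'unreachable' is silent: no toast, no flag change.
  }
}

// Feeds a probe sequence through a fresh guard and counts the toasts shown.
function replayProbes(sequence: ProbeResult[]): number {
  const guard = new DivergenceGuardModel();
  for (const result of sequence) {
    guard.onProbe(result);
  }
  return guard.toasts.length;
}

// Initial divergence, then a restart window with one transient stale reading
// after the new Tower first reports ok.
const toastCount = replayProbes(['stale', 'stale', 'unreachable', 'ok', 'stale', 'ok']);
console.log(toastCount); // 2: the ok transition re-armed the guard
```

In this model a second toast needs only one stale reading after any ok transition, which matches the re-pop seen in the repro if the root-cause hypothesis holds.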

Likely fix: introduce a restartInProgress gate set when restartTowerAndReprobe is called and cleared only after the final re-probe confirms success (or final failure). While the gate is true, suppress all toast surfacing AND skip the towerDivergenceShownThisSession = false reset on transient ok transitions, so a brief stale reading mid-restart doesn't re-arm the toast.
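A minimal sketch of the proposed gate, under the same assumptions as the diagnosis. restartInProgress and towerDivergenceShownThisSession are named in this issue; GatedDivergenceGuard, beginRestart, finishRestart, restartWithGate, and the toast labels are hypothetical, and the real restartTowerAndReprobe may be structured differently.

```typescript
// Hypothetical sketch of the restartInProgress gate; not the real preflight.ts.
type ProbeResult = 'ok' | 'stale' | 'unreachable';

class GatedDivergenceGuard {
  private towerDivergenceShownThisSession = false;
  private restartInProgress = false;
  readonly toasts: string[] = [];

  // Called from every trigger site (initial preflight, reconnect events).
  onProbe(result: ProbeResult): void {
    // While a restart is in flight, ignore the reading entirely:
    // no toast and no flag reset on a transient ok.
    if (this.restartInProgress) return;
    this.apply(result);
  }

  beginRestart(): void {
    this.restartInProgress = true;
  }

  // Clears the gate, then acts only on the final outcome of the restart.
  finishRestart(restarted: boolean, finalProbe: ProbeResult): void {
    this.restartInProgress = false;
    if (!restarted) {
      this.toasts.push('restart-failed'); // fallback toast, fired once
      return;
    }
    this.apply(finalProbe);
  }

  private apply(result: ProbeResult): void {
    if (result === 'ok') {
      this.towerDivergenceShownThisSession = false;
    } else if (result === 'stale' && !this.towerDivergenceShownThisSession) {
      this.towerDivergenceShownThisSession = true;
      this.toasts.push('divergence');
    }
  }
}

// Assumed wrapper shape: restartTower resolves to false on failure,
// probe performs the final post-restart re-probe.
async function restartWithGate(
  guard: GatedDivergenceGuard,
  restartTower: () => Promise<boolean>,
  probe: () => Promise<ProbeResult>,
): Promise<void> {
  guard.beginRestart();
  let restarted = false;
  let finalProbe: ProbeResult = 'unreachable';
  try {
    restarted = await restartTower();
    if (restarted) finalProbe = await probe();
  } finally {
    // Always clear the gate, even if restartTower or probe throws.
    guard.finishRestart(restarted, finalProbe);
  }
}
```

Clearing the gate in a finally block keeps a thrown restart from leaving toasts suppressed for the rest of the session. In this sketch a thrown restart is treated the same as a failed one.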

Bug 2: commands should warn on version mismatch, not block

When the Tower version doesn't match the installed CLI, certain commands appear to refuse to run at all. A version mismatch is a soft incompatibility signal, not a hard correctness problem. The current behavior (whatever the exact mechanism — investigate at plan time) is overly conservative.

Desired behavior:

  • Tower running an older version than the installed CLI: warn that some operations may behave unexpectedly or fail with cryptic errors, but let commands attempt. The user is in the best position to judge whether the operation they want is affected.
  • The warning shouldn't fire on every click. The existing divergence toast (Bug 1's surface) already names the situation. Augment the error path: when a Tower-side error response surfaces during a command, include a tail clause if cachedTowerStatus !== 'ok' — something like "Tower is running an older version; this may be the cause."
  • Investigate whether there's any explicit gating that hard-blocks commands on cachedTowerStatus. isCliReady() in preflight.ts:99 only consults the CLI status (cachedStatus), not the Tower status — so any command-blocking on Tower version is either elsewhere or is the user's experience of API-level errors being misread as refusals.

If no explicit blocking exists, this collapses to a UX polish: improve error messages to surface the divergence as the probable cause.
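The error-path augmentation could look roughly like the helper below. It is a sketch under stated assumptions: cachedTowerStatus and the tail wording come from this issue, while TowerStatus, DIVERGENCE_TAIL, and withDivergenceContext are hypothetical, and the set of status values is assumed from the probe outcomes described under Bug 1.

```typescript
// Hypothetical helper; the status values are assumed, not taken from preflight.ts.
type TowerStatus = 'ok' | 'stale' | 'unreachable';

const DIVERGENCE_TAIL = 'Tower is running an older version; this may be the cause.';

// Appends the divergence hint to a Tower-side error message when the cached
// Tower status is anything other than 'ok'; otherwise returns it unchanged.
function withDivergenceContext(message: string, cachedTowerStatus: TowerStatus): string {
  if (cachedTowerStatus === 'ok') return message;
  const base = message.trimEnd();
  if (base === '') return DIVERGENCE_TAIL;
  // Avoid doubling up punctuation when the original already ends a sentence.
  const separator = /[.!?]$/.test(base) ? ' ' : '. ';
  return base + separator + DIVERGENCE_TAIL;
}

console.log(withDivergenceContext('Request failed', 'stale'));
// Request failed. Tower is running an older version; this may be the cause.
```

The condition follows the wording above (cachedTowerStatus !== 'ok'). If an unreachable Tower should not be described as "an older version", the check could be narrowed to the stale case only.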

Acceptance criteria

  • Restart Tower action wraps in a restartInProgress gate that suppresses toast re-arming AND the ok-transition flag-reset until the final post-restart probe completes.
  • Toast appears once when divergence is first detected; clicking Restart Tower does not produce additional toasts during the restart window; if the restart succeeds and Tower is ok, the toast goes away and stays away.
  • If the restart fails (restartTower returns false), the existing fallback toast (Codev: Tower restart did not complete...) fires once, not in a loop.
  • Plan-phase investigation: identify whether any code path hard-blocks commands on cachedTowerStatus !== 'ok'. Document the finding in the plan.
  • If hard-blocking exists, replace with the warn-but-allow pattern (existing divergence toast + augmented error-path messaging).
  • If no hard-blocking exists, augment the error path so Tower-side errors surface the version-divergence context when applicable.
  • Unit tests for the toast suppression behavior during the restartInProgress window (including the failure path).

Out of scope

Suggested protocol

BUGFIX. Bug 1 is mechanical: a per-restart gate to wrap the existing per-session flag. Bug 2 is verification + small wording fixes. Both fit in loadBuilderPromptTemplate-adjacent territory (preflight.ts only), no design discussion needed. Plan-gate is unnecessary; the diagnosis and fix shape are both spelled out.

If the plan-phase investigation surfaces that the warn-but-allow change touches multiple call sites in a load-bearing way, the builder can escalate to PIR for the second part — but the default expectation is BUGFIX scope.

Related

Labels

area/vscode (Area: VS Code extension)