fix(devtools-proxy-support): reconnect SSH tunnel after unrecoverable client error COMPASS-8355#772
Merged
Conversation
… client error COMPASS-8355 - Add sshClient.on('error') handler in SSHAgent constructor to prevent unhandled 'error' events from crashing the process when the SSH session dies unexpectedly (e.g. after hibernate) - Add reinitializeClient flag to initialize() to recreate the ssh2 Client instance when it enters an unrecoverable "Instance unusable" state - Expand retryable error patterns to include "Instance unusable after fatal error", "read ECONNRESET", and "Socket closed" - Extract client setup into createSshClient() to ensure close/error handlers and forwardOut binding are consistent on both initial setup and recreation
addaleax
reviewed
Jun 2, 2026
| 'error', | ||
| new Error('some unexpected ssh error'), | ||
| ); | ||
| }).not.to.throw(); |
Collaborator
There was a problem hiding this comment.
We can keep this test, but I think a much more valuable test would be one in which the actual connection is interrupted – and luckily, the setup here does allow us to test this directly, because HTTPServerProxyTestSetup holds a reference to an SSH server, and that SSH server forwards getConnections to the underlying net.Server, and we can call .destroy() on those connections.
|
coverage: 78.676% (+0.05%) from 78.63% — COMPASS-8355 into main |
- Replace delete this.connectingPromise with = undefined assignment - Improve "Instance unusable" test: patch connect() to simulate broken client state, verify fresh client is created and auth happens twice - Track initializedSuccessfully in _connect() to detect when initialize() itself failed, requiring a new client on retry
… to clear keepalive timer COMPASS-8355 Without this, the old ssh2 Client's keepalive setInterval keeps firing after initialize(true) replaces it, preventing process exit. ssh2's end() guards with isWritable() so it is a no-op when the socket is already dead.
addaleax
approved these changes
Jun 3, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
Fixes the SSH tunnel failing to recover after a laptop resumes from hibernate (COMPASS-8355).
When the OS hibernates, the underlying TCP connection is killed mid-stream. The ssh2 packet parser receives truncated data and puts the
Clientinstance into a permanently broken "unusable" state. Two problems followed:sshClient.on('error', ...)handler inSSHAgent, so the'error'event emitted by the broken parser went completely unhandled → uncaught exception crashing the process._connect()only matched"Not connected"and"Channel open failure". Even if"Instance unusable after fatal error"had been added to that list, the retry would have calledinitialize()on the same broken instance → same error again. The ssh2Clientmust be recreated before reconnecting.Changes
createSshClient()(new private method) — extracts ssh2Clientconstruction and listener setup into a reusable method. Adds a persistentclient.on('error', ...)handler that setsconnected = falseand prevents unhandled error events from crashing the process.initialize(reinitializeClient = false)— the new flag triggerssshClient.end()+createSshClient()before reconnecting, discarding the broken instance and bindingforwardOutto the fresh one._connect()— expands retryable error patterns to cover"Instance unusable after fatal error","read ECONNRESET", and"Socket closed". WhenrequiresNewClientis true, passes the flag through toinitialize().New tests
'does not crash on unexpected sshClient error events'— verifies the new'error'handler absorbs the event without throwing.'reconnects with a fresh SSH client after "Instance unusable after fatal error"'— simulates a broken ssh2 client by patchingconnect()to emit that error, then verifies a successful reconnect with a fresh client (auth handler called twice).Open Questions
This is step 1 of a broader cleanup (see COMPASS-8355). Step 2 will update
devtools-connectto expose the tunnel reference or surface tunnel errors onDevtoolsConnectionState, so that Compass no longer needs to create and manage the tunnel itself.Checklist