fix(controlplane): bind Flight sessions to connection SNI org, not a username map#652
Merged
Conversation
…not a username map Addresses the blocker raised in review of #651: Flight auth resolved the org from SNI correctly, but stored the result in a process-global `orgRoutedSessionProvider.userOrg[username] -> orgID` map that `CreateSession` read back by bare username. Usernames are only unique within an org, so two tenants sharing a username could race: connection A (org-acme) writes `userOrg["alice"]=org-acme`, connection B (org-beta) overwrites it, and A's session then gets created against org-beta's worker stack. This reintroduced the exact username-collision class the SNI-only identity change was meant to remove. Fix: derive the org for a session from the connection's managed hostname (SNI) — the same immutable per-connection identity auth uses — re-resolved at session-create time, and delete the username map entirely: - orgRoutedSessionProvider gains an injected `resolveOrg(ctx) -> (orgID, ok)` and drops the `userOrg` map. CreateSession resolves the org from the request context's SNI and fails closed if it doesn't resolve. - Production wires resolveOrg to ControlPlane.flightOrgFromContext (extract SNI from the gRPC peer → resolveFlightOrgFromSNI). The Postgres-side auth resolution is unchanged; the Flight validator now only authenticates and stores no routing state. - Tests: prove two same-username connections route to their own org by context, and fail closed when the SNI doesn't resolve. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ession Follow-up review hardening for the Flight org-routing fix: - Export flightsqlingress.SNIFromContext and use it from both the auth path and controlplane session routing, deleting the duplicated SNI-extraction helper so auth and routing can never silently diverge on a connection's hostname. - Fail closed with a clear error if an orgRoutedSessionProvider is ever constructed without an org resolver (defensive; the one production wiring always sets it). - Add TestFlightOrgFromContextResolvesViaSNI: drives the real peer→TLS ServerName→extractOrgFromSNI→ResolveSNIPrefix chain (managed resolves; unknown prefix, unmanaged hostname, and missing peer all fail closed). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Follow-up to #651, addressing the review blocker from @EDsCODE.
Problem
#651 made Flight auth resolve the org from SNI correctly, but the result was stashed in a process-global map keyed by bare username:
Usernames are only unique within an org, so two tenants can both have
alice:alice@acme.<suffix>authenticates →userOrg["alice"] = org-acmealice@beta.<suffix>authenticates → overwrites →userOrg["alice"] = org-betaThis reintroduced the exact username-collision class the SNI-only identity change was meant to eliminate — the SNI-scoped principal was collapsed back to a username at the auth→session handoff.
Fix
Derive the session's org from the connection's managed hostname (SNI) — the same immutable per-connection identity auth uses — re-resolved at session-create time, and delete the username map entirely:
orgRoutedSessionProviderdropsuserOrgand gains an injectedresolveOrg(ctx) (orgID, ok).CreateSessionresolves the org from the request context's SNI and fails closed if it doesn't resolve.resolveOrgtoControlPlane.flightOrgFromContext(extract SNI from the gRPC peer →resolveFlightOrgFromSNI). The Flight validator now only authenticates (ValidateOrgUseragainst the SNI-resolved org) and stores no routing state. The Postgres-side resolution is untouched.ctxreachingCreateSessionis a timeout-child of the gRPC request ctx, sopeer.FromContextstill yields the TLSServerName.There is now no shared mutable username→org state anywhere in the Flight path; routing identity is a pure function of the connection's hostname.
Tests
TestOrgRoutedSessionProviderRoutesByContextSNINotUsername: two connections sharing usernamealicefrom different org contexts each route to their own org (StackForOrg(org-a)then(org-b)), proving the collision is gone.TestOrgRoutedSessionProviderFailsClosedWhenSNIUnresolved: no session created when the SNI doesn't resolve.userOrgassertion); the durable-reconnect path already carriedOrgIDcorrectly and is unchanged.Builds clean on both default and
-tags kubernetes; controlplane suite green on both.🤖 Generated with Claude Code