fix: avoid deadlock by moving connector refresh outside txn and adding per-token mutex#4312
Open
tsardelli-dh wants to merge 1 commit intodexidp:masterfrom
Open
fix: avoid deadlock by moving connector refresh outside txn and adding per-token mutex#4312tsardelli-dh wants to merge 1 commit intodexidp:masterfrom
tsardelli-dh wants to merge 1 commit intodexidp:masterfrom
Conversation
c9b79a5 to
fc00334
Compare
62941c5 to
9320c9c
Compare
…and adding per-token mutex Previously, `updateRefreshToken` executed `refreshWithConnector` inside the `UpdateRefreshToken` transaction. With SQL backends that enforce strict connection limits (e.g. SQLite), this blocked the only available connection while the connector call could indirectly trigger further storage access (e.g. when using PasswordDB), causing the system to hang. This patch moves connector refresh calls outside of the storage transaction and introduces a per-refresh-ID mutex to ensure only one concurrent request per token hits the external IdP. Other concurrent requests wait for the mutex and reuse the updated identity. Signed-off-by: Tommaso Sardelli <t.sardelli@datahow.ch>
9320c9c to
cccbebc
Compare
Author
|
Hey folks. Whenever you get a chance, I’d love to hear any thoughts or suggestions on this. I'm totally open to changes if this approach doesn’t feel right. Thanks! |
nabokihms
requested changes
Nov 3, 2025
|
|
||
| refreshTokenPolicy *RefreshTokenPolicy | ||
| // mutex to refresh the same token only once for concurrent requests | ||
| refreshLocks sync.Map |
Member
There was a problem hiding this comment.
How does it work when Dex is running in the HA mode, e.g., two instances of Dex are serving clients at the same time? For the Kubernetes storage, we did locking on the Kubernetes side, but in in-memory lock doesn't help.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Overview
This patch moves connector refresh calls outside of the storage transaction and introduces a per-refresh-ID mutex to ensure only one concurrent request per token hits the external IdP. Other concurrent requests wait for the mutex and reuse the updated identity.
What this PR does / why we need it
This is an initial attempt to fix #4209. The gist of the problem is that commit 4b5f1d52 moved the connection to the external IdP to update the refresh token, in the same DB transaction used to update the token in dex's storage. This causes a deadlock when the "external IdP" is still dex (e.g. when using PasswordDB) and the underlying storage sets a maximum number of open connections (e.g. SQLite or PgBouncer)
For a detailed overview of what triggers the issue, please see #4209 (comment)
Closes #4209
Special notes for your reviewer
I'm not sure how to properly test this change or if it's already covered by the existing test suite. If you have any suggestions please let me know.