Add retry logic to role synchronization #3100

Open
jorsol wants to merge 1 commit into zalando:master from jorsol:fix-sync-roles-recovery
jorsol commented Jun 4, 2026

Contributor

Prevents syncRoles() from failing during cluster startup, crash recovery, or primary failover windows.

While a PostgreSQL instance is starting up or replaying WAL, it accepts connections but rejects write operations with a 25006 (read_only_sql_transaction) error. Because role syncing runs inside a reconciliation loop, failing outright during this phase creates unnecessary error noise.

This adds retry logic around c.userSyncStrategy.ExecuteSyncRequests so that it does not fail immediately.

Closes #3099

FxKu added this to the 2.0.0 milestone Jun 4, 2026
FxKu added the minor label Jun 4, 2026
FxKu modified the milestones: 2.0.0, wishlist Jun 11, 2026
FxKu commented Jun 11, 2026

Member

Syncing roles is not the only time write operations happen. How about implementing retry logic around the db.Exec(query) calls in the users.go file, as you mentioned in your issue? Search for:

retryutil.Retry(
	constants.PostgresConnectTimeout,
	constants.PostgresConnectRetryTimeout,
	func() (bool, error) {
		...

to see how it was done in other places.

FxKu modified the milestones: wishlist, 2.0.0 Jun 11, 2026
FxKu moved this to Open Questions in Postgres Operator Jun 11, 2026
FxKu moved this from Open Questions to WIP / currently reviewed in Postgres Operator Jun 11, 2026
jorsol force-pushed the fix-sync-roles-recovery branch from 2cb0f47 to dc147e3 June 11, 2026 14:47
jorsol commented Jun 11, 2026

Contributor Author

Hi @FxKu, can you review again?

jorsol changed the title from "fix: skip role sync when database is in recovery mode" to "Add retry logic to role synchronization" Jun 11, 2026

Development

Successfully merging this pull request may close these issues.

Race-condition when doing syncRoles()
