DOC-6252 sections about failover behaviour when all endpoints are unhealthy #2768
andy-stark-redis merged 3 commits into main
Conversation
Thanks @dwdougherty !
ggivo
left a comment
LGTM from Jedis perspective
> in the [Retry configuration]({{< relref "#retry-configuration" >}}) section). However, if the client exhausts
> all the available failover attempts before any endpoint becomes healthy again, commands will throw a `JedisPermanentlyNotAvailableException`. The client won't recover automatically from this situation, so you
> should handle it by reconnecting with the `MultiDBClient` builder after a suitable delay (see
> [Failover configuration](#failover-configuration) for a connection example).
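The reconnect-after-delay handling that the quoted section recommends can be sketched in plain Java. This is a self-contained illustration only: the exception class below is a local stand-in for Jedis' `JedisPermanentlyNotAvailableException`, and `runCommand` is a hypothetical command that fails while all endpoints are unhealthy and then succeeds, so no Jedis dependency is needed to run it.

```java
public class FailoverRetrySketch {
    // Stand-in for JedisPermanentlyNotAvailableException, so this
    // sketch compiles and runs without the Jedis library.
    static class PermanentlyNotAvailable extends RuntimeException {}

    private int failuresLeft;

    FailoverRetrySketch(int failuresLeft) { this.failuresLeft = failuresLeft; }

    // Hypothetical command: throws while endpoints are "unhealthy", then succeeds.
    String runCommand() {
        if (failuresLeft-- > 0) throw new PermanentlyNotAvailable();
        return "OK";
    }

    // Retry the command with a pause between attempts. In a real app you
    // would rebuild the client with the MultiDBClient builder inside the
    // catch block instead of simply retrying the same command.
    String runWithRecovery(int maxAttempts, long delayMillis) throws InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return runCommand();
            } catch (PermanentlyNotAvailable e) {
                Thread.sleep(delayMillis); // wait before reconnecting
            }
        }
        return null; // gave up: surface the outage to the caller
    }

    public static void main(String[] args) throws InterruptedException {
        FailoverRetrySketch sketch = new FailoverRetrySketch(2);
        System.out.println(sketch.runWithRecovery(5, 10)); // prints OK
    }
}
```

The delay and attempt count are placeholders; in practice they should match the failover attempt/delay settings discussed elsewhere in this PR.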
On a second look, I don’t think this is technically correct.
Even after a JedisPermanentlyNotAvailableException, if an endpoint becomes healthy again, the client can recover.
JedisPermanentlyNotAvailableException just means that there were no healthy connections for a configured amount of time, so we treat it as a permanent error at that moment. It doesn’t necessarily mean the client is incapable of recovering later.
It also looks like we’re missing an integration test for this scenario — e.g. recovery after a JedisPermanentlyNotAvailableException has already been thrown.
@atakavci — any concerns if we clarify this behavior in the docs around JedisPermanentlyNotAvailableException, i.e. that the client can recover?
@ggivo , agreed.
JedisPermanentlyNotAvailableException is how Jedis signals to the application that the "all unhealthy" state has been stable for some period of time and the configured number of attempts (with the configured delay) is already exhausted. Upon receiving this type of exception, the application can decide how to react to a consistent/stable availability issue.
@atakavci @ggivo OK, so after the app gets a JedisPermanentlyNotAvailableException does Jedis still keep trying to find a healthy endpoint automatically in the background (so if you try a command again a bit later then it might succeed)? Or do you have to add some code to handle this explicitly from the app (eg, use isHealthy to check all the current endpoints and then use setActiveDatabase to start using a healthy endpoint if you can find one)?
@andy-stark-redis
thank you for raising the question; it looks like this was a bit of a gray area.
When a client instance hits the all-unhealthy case:
- If failback is enabled, it will automatically recover and switch to a healthy database on the first run of the periodic failback execution, without user intervention.
- If failback is disabled, the user will need to verify a healthy endpoint and explicitly call setActiveDatabase to switch to it.
You can check the test I am introducing with this PR.
@ggivo please let me know what you think of it.
Beyond the question, it also made me think it could be a good improvement to trigger a check of whether the all-unhealthy situation is resolved, either on a health state change, on any incoming command request, or maybe both. I'll take a closer look when I find the time.
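The failback-disabled path described above (verify a healthy endpoint, then explicitly call setActiveDatabase) could look roughly like the following. Only the method names `isHealthy` and `setActiveDatabase` come from this discussion; `ClientStub` and its constructor are hypothetical stand-ins, not the real MultiDBClient API, so the sketch stays self-contained.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class ManualFailbackSketch {
    // Hypothetical stand-in for a multi-database client; only the
    // isHealthy/setActiveDatabase method names mirror the discussion.
    static class ClientStub {
        private final Map<String, Boolean> health;
        String active;

        ClientStub(Map<String, Boolean> health, String active) {
            this.health = health;
            this.active = active;
        }

        boolean isHealthy(String endpoint) {
            return health.getOrDefault(endpoint, false);
        }

        void setActiveDatabase(String endpoint) { active = endpoint; }
    }

    // With failback disabled: scan the known endpoints and switch
    // to the first one that reports healthy, if any.
    static Optional<String> recover(ClientStub client, List<String> endpoints) {
        for (String endpoint : endpoints) {
            if (client.isHealthy(endpoint)) {
                client.setActiveDatabase(endpoint);
                return Optional.of(endpoint);
            }
        }
        return Optional.empty(); // still all unhealthy; try again later
    }

    public static void main(String[] args) {
        ClientStub client = new ClientStub(
                Map.of("east", false, "west", true), "east");
        recover(client, List.of("east", "west"));
        System.out.println(client.active); // prints west
    }
}
```

With failback enabled, none of this is needed: per the comment above, the periodic failback execution performs the equivalent switch automatically.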
Fixed. Thanks for the clarification.
🛡️ Jit Security Scan Results: ✅ No security findings were detected in this PR (security scan by Jit).
Added info about this based on customer feedback. The corresponding section for the Lettuce geo failover page will be added in a separate PR.
Note
Low Risk
Low risk documentation-only changes; the main risk is incorrect exception/option naming that could mislead users configuring failover.
Overview
Expands the client-side geographic failover docs to better describe health check strategies (ping, lag-aware via REST API, and custom) in the main overview.
Adds new guidance for Jedis and redis-py on what happens when all endpoints are unhealthy, including the exceptions thrown, how long the client keeps probing based on failover attempt/delay settings, and suggested retry/reconnect handling. Also clarifies redis-py troubleshooting advice around timeouts and `LagAwareHealthCheck` configuration.
Written by Cursor Bugbot for commit 817863f.