Skip to content

Conversation

@ivandika3
Copy link
Contributor

What changes were proposed in this pull request?

Please refer to https://issues.apache.org/jira/browse/RATIS-2382

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/RATIS-2382

How was this patch tested?

CI (https://github.com/ivandika3/ratis/actions/runs/20984539273)

| **Default** | false |


| **Property** | `raft.server.read.read-index.heartbeat.skip.enabled` |
Copy link
Contributor

@OneSizeFitsQuorum OneSizeFitsQuorum Jan 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for the idea and code.

Just want to discuss whether heartbeat.skip.enabled is a good name, I think maybe raft.server.read.read-index.leadership-check.skip.enabled

Copy link
Contributor Author

@ivandika3 ivandika3 Jan 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@OneSizeFitsQuorum Thanks for the review. I was also grappling with whether "leadership check" is a better name. However, my reasoning of choosing "heartbeat" is because we already have a LeaderStateImpl#checkLeadership to decide whether to leader need to step down due to lost majority heartbeats. LeaderStateImpl#checkLeadership does not send heartbeat, it simply checks the last RPC response time of the majority. Therefore, I picked "heartbeat" to explicitly mention the RPC that is going to be skipped.

That said, I'm ok if the community decide "leadership-check" is a more fitting name.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about raft.server.read.read-index.leadership-heartbeat-check.skip.enabled?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@OneSizeFitsQuorum Sounds good, updated. Thanks.

Copy link
Contributor

@szetszwo szetszwo Feb 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about raft.server.read.read-index.leadership-heartbeat-check.skip.enabled?

Sorry for coming late:

  • Since this conf is closely related to raft.server.read.leader.lease.enabled, they should be next to each other.
  • The default "skip.enabled == false" is a double negative.

So, how about raft.server.read.leader.heartbeat-check.enabled (default is true).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point to make it closer to lease since disabling heartbeat check essentially removed some of the leader lease guarantee.

Copy link
Contributor

@szetszwo szetszwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ivandika3 , thanks for working on this!

Please see #1334 (comment) and the comment inlined.

Comment on lines 230 to 251
| **Property** | `raft.server.read.read-index.leadership-heartbeat-check.skip.enabled` |
|:----------------|:----------------------------------------------------------------------|
| **Description** | whether to skip the leadership check heartbeat for read index. |
| **Type** | boolean |
| **Default** | false |

Note that although enabling `leadership-heartbeat-check.skip.enabled` reduce the RTT due to the leadership check heartbeat,
it causes reads to not be linearizable in some cases.
This is because without the leadership check heartbeats, the leader might not be the latest leader
and might serve stale reads.
There might be a small period of time when there is a split brain and two Raft peers that believe that
they are leaders (old leader and new leader where old leader's term < new leader's term).
The old leader might not have detected that it has lost majority heartbeats and stepped down or the old leader
has not received any RPC with higher term which forces the old leader to steps down.
Without the leadership check heartbeat which should detect that the old leader is no longer the latest leader and
force the old leader to step down, the old leader ReadIndex might return an index that is lower than
the new leader's applied index.
This means that the old leader might return stale data which is not linearizable by definition.
However, this situation should happen only abnormal situations when the Raft group encounters a leader election
due to network partition between the old leader and the other majority quorum (which will elect a new leader).
Therefore, this might be an acceptable tradeoff for applications that seek to improve the linearizable read
performance.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding the doc. Below is my suggestion:

| **Property**    | `raft.server.read.leader.heartbeat-check.enabled` |
|:----------------|:--------------------------------------------------|
| **Description** | whether to check heartbeat for read index.        |
| **Type**        | boolean                                           |
| **Default**     | true                                              |

Note that the original read index algorithm requires heartbeat check
in order to guarantee linearizable read.
By setting this property to false,
it reduces the RTT by eliminating the heartbeat check.
However, it might cause the reads not to be linearizable in a split-brain case.
Without the heartbeat check, a leader might not be the latest leader
and, as a result, it might serve stale reads.
When there is a split brain, there might be a small period of time
that the (old) leader has lost majority heartbeats but have not yet detected it.
As the same time, a new leader is elected by a majority of peers.
Then, the old leader might serve stale data
since it does not have the transactions committed by the new leaders.
Since such split-brain case is supposed to be rare,
it might be an acceptable tradeoff for applications that 
seek to improve the linearizable read performance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, updated.

Copy link
Contributor

@szetszwo szetszwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 the change looks good.

@szetszwo szetszwo merged commit bd06cf7 into apache:master Feb 9, 2026
16 checks passed
@szetszwo
Copy link
Contributor

szetszwo commented Feb 9, 2026

@ivandika3 , thanks for working on this!

@OneSizeFitsQuorum , thanks for reviewing this!

@ivandika3
Copy link
Contributor Author

Thanks @OneSizeFitsQuorum @szetszwo for the reviews.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants