Skip to content

Enable HATracker memberlist#7284

Open
SungJin1212 wants to merge 9 commits intocortexproject:masterfrom
SungJin1212:Enable-HATracker-memberlist
Open

Enable HATracker memberlist#7284
SungJin1212 wants to merge 9 commits intocortexproject:masterfrom
SungJin1212:Enable-HATracker-memberlist

Conversation

@SungJin1212
Copy link
Member

This PR enables memberlist on HATracker.
It includes the implementations of a mergeable interface of ReplicaDesc.

Which issue(s) this PR fixes:
Fixes #7220

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

EnableStartupSync bool `yaml:"enable_startup_sync"`

KVStore kv.Config `yaml:"kvstore" doc:"description=Backend storage to use for the ring. Please be aware that memberlist is not supported by the HA tracker since gossip propagation is too slow for HA purposes."`
KVStore kv.Config `yaml:"kvstore" doc:"description=Backend storage to use for the ring. Please be aware that memberlist support in the HA tracker is currently experimental since gossip propagation is too slow for HA purposes."`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should reword it again. since gossip propagation is too slow for HA purposes is more for why we didn't allow memberlist before


// Tracker kv store only supports consul and etcd.
storeAllowedList := []string{"consul", "etcd"}
if slices.Contains(storeAllowedList, cfg.KVStore.Store) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR should only just extend allow list to memberlist instead of allowing all stores

@SungJin1212
Copy link
Member Author

@yeya24
I updated the PR.
To allow for zero-downtime migrations when moving from consul/etcd to memberlist, I also included the multi.

Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
@SungJin1212 SungJin1212 force-pushed the Enable-HATracker-memberlist branch from 0a901c6 to b958c62 Compare February 25, 2026 05:03
Copy link
Member

@friedrichg friedrichg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing this!

expectedErr: fmt.Errorf("invalid HATracker KV store type: %s", "memberlist"),
expectedErr: nil,
},
"should pass with multi kv store": {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we also have an integration test for multi kv store?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated TestHATrackerWithMultiKV to the latest commit

CHANGELOG.md Outdated
* Metrics: Renamed `cortex_parquet_queryable_cache_*` to `cortex_parquet_cache_*`.
* Flags: Renamed `-querier.parquet-queryable-shard-cache-size` to `-querier.parquet-shard-cache-size` and `-querier.parquet-queryable-shard-cache-ttl` to `-querier.parquet-shard-cache-ttl`.
* Config: Renamed `parquet_queryable_shard_cache_size` to `parquet_shard_cache_size` and `parquet_queryable_shard_cache_ttl` to `parquet_shard_cache_ttl`.
* [FEATURE] HATracker: Add experimental support for memberlist as a KV store backend. #7284
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and multi. right?

expectedRingMembers, expectedTombstones)
}

func TestHATrackerWithMemberlist(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This integration test is fine. do you think another test with 2 cortex instances would be valuable? I am thinking sending samples to an instance and ensuring the other instance has the same HA leader


// RemoveTombstones is a no-op for ReplicaDesc.
func (d *ReplicaDesc) RemoveTombstones(_ time.Time) (total, removed int) {
// No-op: HATracker manages tombstones via cleanupOldReplicas
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think cleanOldReplicas calls delete that is not implemented for memberlist.

err = c.client.Delete(ctx, key)

return errors.New("memberlist does not support Delete")

can you add a test for cleaning old replicas?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, we need to implement it. thanks

Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Enable Memberlist as valid HA-tracker store

3 participants