Skip to content

feat(instances): cache server lookups to reduce API calls#1252

Draft
lukasmetzner wants to merge 10 commits into
mainfrom
fix-improve-api-usage
Draft

feat(instances): cache server lookups to reduce API calls#1252
lukasmetzner wants to merge 10 commits into
mainfrom
fix-improve-api-usage

Conversation

@lukasmetzner
Copy link
Copy Markdown
Contributor

Feature is marked as experimental.

@codecov
Copy link
Copy Markdown

codecov Bot commented May 26, 2026

Codecov Report

❌ Patch coverage is 92.50000% with 15 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.11%. Comparing base (547798d) to head (a2c56c8).
⚠️ Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
internal/config/config.go 23.07% 8 Missing and 2 partials ⚠️
hcloud/instances.go 63.63% 4 Missing ⚠️
internal/servercache/servercache.go 99.37% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1252      +/-   ##
==========================================
+ Coverage   71.01%   72.11%   +1.10%     
==========================================
  Files          24       26       +2     
  Lines        2677     2851     +174     
==========================================
+ Hits         1901     2056     +155     
- Misses        604      618      +14     
- Partials      172      177       +5     
Flag Coverage Δ
e2e 45.21% <53.00%> (+0.01%) ⬆️
unit 68.64% <91.00%> (+1.58%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Comment thread internal/servercache/allservercache.go Outdated
Copy link
Copy Markdown
Member

@jooola jooola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First review! Looks great, only need a bit of polishing.

Comment thread hcloud/cloud.go
nodeLister corelisters.NodeLister
client *hcloud.Client
robotClient hrobot.RobotClient
instanceCache *servercache.Cache[hcloud.Server]
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
instanceCache *servercache.Cache[hcloud.Server]
serverCache *servercache.Cache[hcloud.Server]

Comment thread internal/config/config.go
Comment on lines +33 to +34
hcloudCacheMode = "HCLOUD_CACHE_MODE"
hcloudCacheTTL = "HCLOUD_CACHE_TTL"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we still want a namespace like "servers"? With "Instances" I understand the instance controller "instance_v2". While with "servers" I understand the API Servers.

We also probably have different caching options for the robot servers.

Comment thread internal/config/config.go

if mode, ok := os.LookupEnv(hcloudCacheMode); ok {
klog.Warningf("Experimental: %s is experimental, breaking changes may occur within minor releases.", hcloudCacheMode)
cfg.Cache.Mode = servercache.Mode(mode)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we validate the value provided by the user?

}, []string{"op"})

CacheRequests = prometheus.NewCounterVec(prometheus.CounterOpts{
Name: "cloud_controller_manager_cache_requests_total",
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably want to say its a "server" cache.

Comment on lines +150 to +157
func (c *Cache[T]) getFromCache(
ctx context.Context,
subsystem string,
lookup func() *entry[T],
fetch func() (*T, error),
opts ...RefreshOption,
) (*T, error) {
refreshOpts := newCacheRefreshOpts(c, opts...)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am tempted to do the following for all functions that pass subsystem, just to unclutter the args:

Suggested change
func (c *Cache[T]) getFromCache(
ctx context.Context,
subsystem string,
lookup func() *entry[T],
fetch func() (*T, error),
opts ...RefreshOption,
) (*T, error) {
refreshOpts := newCacheRefreshOpts(c, opts...)
func (c *Cache[T]) getFromCache(
ctx context.Context,
lookup func() *entry[T],
fetch func() (*T, error),
opts ...RefreshOption,
) (*T, error) {
subsystem := GetSubsystem(ctx)
refreshOpts := newCacheRefreshOpts(c, opts...)

And if a ctx is missing, we can/should probably add it.

Comment on lines +208 to +212
func (c *Cache[T]) refreshOne(
subsystem string,
fetch func() (*T, error),
ttl time.Duration,
) error {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might want to pass a ctx here:

  1. Have access to the subsystem, without passing subsystem via the function args.
  2. Not a big fan of passing a context during in the function definition, instead of from the function caller. The context hierarchy is clearer when passed via the function caller.

@@ -0,0 +1,26 @@
# Instance Cache
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Instance Cache
# Server Cache

Comment thread internal/servercache/servercache_test.go Outdated
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants