feat(ingester): Add active series tracker for pattern-based monitoring #7476

Open
yeya24 wants to merge 1 commit into cortexproject:master from yeya24:feat/active-series-tracker

@yeya24 yeya24 commented May 4, 2026

Add a new active series tracker feature that counts active series by configurable label matchers (including regex) and exposes the counts as Prometheus metrics. This is designed for internal monitoring without enforcing any limits.

Key changes:

  • Add ActiveSeriesTrackersConfig type in validation package with PromQL matcher syntax support
  • Add ActiveSeriesTrackers field to Limits struct for per-tenant config with default fallback
  • Add ActiveForMatchers() method to ActiveSeries for counting matching series across all stripes
  • Add cortex_ingester_active_series_per_tracker gauge metric
  • Integrate into updateActiveSeries() periodic tick
  • Matchers are validated and compiled during config unmarshalling
  • Runtime hot-reloadable via existing runtime config overrides

Configuration example:

  overrides: 
    tenant-123:
      active_series_trackers:
      - name: api_metrics
        matchers: '{__name__=~"api_.*"}'

Metric emitted:

  cortex_ingester_active_series_per_tracker{user="tenant", name="api_metrics"} 42
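
To make the counting concrete, here is a minimal, standard-library sketch of how a tracker could count matching series and produce a line shaped like the metric above. The `series`, `tracker`, `newTracker`, and `countActive` names are assumptions made for illustration; the PR works on Prometheus label matchers and the ingester's striped `ActiveSeries` structure, which this does not reproduce.

```go
package main

import (
	"fmt"
	"regexp"
)

// series is a simplified label set (hypothetical; not the Cortex type).
type series map[string]string

// tracker is a hypothetical stand-in: one anchored regex per label name.
type tracker struct {
	name     string
	matchers map[string]*regexp.Regexp
}

// newTracker compiles each pattern fully anchored, mirroring how Prometheus
// regex matchers must match the whole label value.
func newTracker(name string, patterns map[string]string) (tracker, error) {
	t := tracker{name: name, matchers: make(map[string]*regexp.Regexp, len(patterns))}
	for label, p := range patterns {
		re, err := regexp.Compile("^(?:" + p + ")$")
		if err != nil {
			return tracker{}, fmt.Errorf("tracker %q, label %q: %w", name, label, err)
		}
		t.matchers[label] = re
	}
	return t, nil
}

// matches reports whether every matcher accepts the series. A missing label
// is treated as the empty string.
func (t tracker) matches(s series) bool {
	for label, re := range t.matchers {
		if !re.MatchString(s[label]) {
			return false
		}
	}
	return true
}

// countActive returns, per tracker name, how many active series match.
func countActive(active []series, trackers []tracker) map[string]int {
	counts := make(map[string]int, len(trackers))
	for _, t := range trackers {
		counts[t.name] = 0 // report zero rather than omit the tracker
		for _, s := range active {
			if t.matches(s) {
				counts[t.name]++
			}
		}
	}
	return counts
}

func main() {
	tr, err := newTracker("api_metrics", map[string]string{"__name__": "api_.*"})
	if err != nil {
		panic(err)
	}
	active := []series{
		{"__name__": "api_requests_total", "job": "web"},
		{"__name__": "api_errors_total", "job": "web"},
		{"__name__": "node_cpu_seconds_total", "job": "node"},
	}
	for name, n := range countActive(active, []tracker{tr}) {
		fmt.Printf("cortex_ingester_active_series_per_tracker{user=%q, name=%q} %d\n", "tenant-123", name, n)
	}
}
```

With the three sample series, the `api_metrics` tracker counts 2. The count is purely observational: nothing here rejects or limits series, in line with the stated goal of monitoring without enforcement.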

What this PR does:

Which issue(s) this PR fixes:
Fixes #

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]
  • docs/configuration/v1-guarantees.md updated if this PR introduces experimental flags

@dosubot dosubot Bot added component/ingester type/feature type/observability To help know what is going on inside Cortex labels May 4, 2026
@yeya24 yeya24 force-pushed the feat/active-series-tracker branch 4 times, most recently from c3dae35 to 48fbe69 Compare May 5, 2026 06:58
"default": [],
"description": "List of active series tracker configurations. Each tracker counts active series matching its matchers and exposes the count as a metric.",
"items": {
"type": "string"
Member

Suggested change
- "type": "string"
+ "$ref": "#/definitions/ActiveSeriesTrackerConfig"

Contributor Author

I would not modify the schema by hand, since it is generated by the build. We need to add support for this in the schema generator instead.

Comment thread pkg/ingester/metrics.go
Comment thread pkg/util/validation/active_series_tracker.go
@yeya24 yeya24 force-pushed the feat/active-series-tracker branch 3 times, most recently from 7d8668b to a55cd3e Compare May 6, 2026 04:45
Comment thread pkg/ingester/tracker_counter.go Outdated
Comment on lines +32 to +62
func (tc *trackerCounter) updateConfig(ctx context.Context, db *tsdb.DB, trackers validation.ActiveSeriesTrackersConfig) {
	tc.mu.Lock()
	defer tc.mu.Unlock()

	newMatchers := make(map[string][]*labels.Matcher, len(trackers))
	for i := range trackers {
		newMatchers[trackers[i].Name] = trackers[i].ParsedMatchers()
	}

	if trackerMatchersEqual(tc.matchers, newMatchers) {
		return
	}

	// Config changed: backfill counts from TSDB head.
	tc.matchers = newMatchers
	tc.counts = make(map[string]int, len(newMatchers))

	if db == nil || len(newMatchers) == 0 {
		return
	}

	ir, err := db.Head().Index()
	if err != nil {
		return
	}
	defer ir.Close()

	all, err := ir.Postings(ctx, "", "")
	if err != nil {
		return
	}
Member

@SungJin1212 SungJin1212 May 6, 2026

Overall, LGTM.
I have one question.
We scan the entire TSDB head index while holding the lock. Is that acceptable in high-cardinality environments, given that we also need to acquire this lock every time a new series is created?

Contributor Author

This is a good catch and definitely needs to be fixed. We should never scan all postings. I will fix it in the next commit.

That said, updateConfig runs as a single-threaded, asynchronous process. We don't need to scan the head index for every new series.
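
The thread does not show what the follow-up commit changed, so the following is only a generic sketch of one way to address the lock-scope half of the concern: do the expensive backfill without holding the mutex and take the lock only to publish the result. Everything here is hypothetical (`counter`, prefix matching in place of label matchers, a string slice in place of the TSDB head), and it does not address the separate point about avoiding a scan of all postings.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// counter is a hypothetical stand-in for the PR's trackerCounter: prefixes
// replace label matchers and a string slice replaces the TSDB head.
type counter struct {
	mu       sync.Mutex
	prefixes map[string]string // tracker name -> metric-name prefix
	counts   map[string]int
}

// updateConfig backfills counts for a new config. The scan runs with no lock
// held, so writers calling onSeriesCreated are blocked only for the final
// assignment. Caveat: series created between taking the snapshot and
// publishing are missed; a real implementation would have to reconcile them.
func (c *counter) updateConfig(prefixes map[string]string, snapshot []string) {
	// Expensive part: no lock held.
	fresh := make(map[string]int, len(prefixes))
	for name, p := range prefixes {
		fresh[name] = 0
		for _, s := range snapshot {
			if strings.HasPrefix(s, p) {
				fresh[name]++
			}
		}
	}

	// Cheap part: publish under the lock.
	c.mu.Lock()
	c.prefixes, c.counts = prefixes, fresh
	c.mu.Unlock()
}

// onSeriesCreated is the hot path: a short critical section per new series.
func (c *counter) onSeriesCreated(metric string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	for name, p := range c.prefixes {
		if strings.HasPrefix(metric, p) {
			c.counts[name]++
		}
	}
}

// get returns the current count for one tracker.
func (c *counter) get(name string) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[name]
}

func main() {
	c := &counter{}
	c.updateConfig(
		map[string]string{"api_metrics": "api_"},
		[]string{"api_requests_total", "node_load1"},
	)
	c.onSeriesCreated("api_errors_total")
	fmt.Println(c.get("api_metrics"))
}
```

The trade-off is consistency for latency: the write path never waits on a full scan, at the cost of a short window in which the backfilled count can lag behind newly created series.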

Contributor Author

Fixed

Signed-off-by: Ben Ye <benye@amazon.com>
@yeya24 yeya24 force-pushed the feat/active-series-tracker branch from a55cd3e to 9b57f52 Compare May 6, 2026 05:31
@dosubot dosubot Bot added the lgtm This PR has been approved by a maintainer label May 6, 2026
Labels

component/ingester, lgtm, size/XL, type/feature, type/observability

3 participants