
PHOENIX-7884 : Refactor tracking of IndexCdcConsumer lag #2506

Open
palashc wants to merge 2 commits into apache:master from palashc:PHOENIX-7884


palashc (Contributor) commented Jun 9, 2026
What changes were proposed in this pull request?

Decouples the cdcIndexUpdateLag metric from batch completion. A new package-private helper IndexCDCConsumerProgress holds the consumer's monotonic effective freshness watermark, advanced by:

  • a successful own-partition batch (sets lastProcessedTimestamp), or
  • an empty own-partition CDC poll (proves caught-up to pollEnd − timestampBufferMs).
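The two advancement rules above can be sketched as follows. This is only an illustration under stated assumptions, not the code in this PR: the class name `FreshnessWatermarkSketch` and its method names are hypothetical, and seeding the watermark with the consumer start time is inferred from the cold-start behaviour described in this PR.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of a monotonic freshness watermark; not the PR's actual class. */
final class FreshnessWatermarkSketch {
    private final long timestampBufferMs;
    // Highest timestamp the consumer is known to be caught up to.
    private final AtomicLong watermarkMs;

    FreshnessWatermarkSketch(long consumerStartTimeMs, long timestampBufferMs) {
        this.timestampBufferMs = timestampBufferMs;
        // Assumption: the consumer start time is the cold-start floor.
        this.watermarkMs = new AtomicLong(consumerStartTimeMs);
    }

    /** A successful own-partition batch proves freshness up to its last timestamp. */
    void onBatchProcessed(long lastProcessedTimestampMs) {
        advance(lastProcessedTimestampMs);
    }

    /** An empty own-partition poll proves caught-up to pollEnd minus the buffer. */
    void onEmptyPoll(long pollEndMs) {
        advance(pollEndMs - timestampBufferMs);
    }

    /** Lag relative to the given clock reading, clamped so it is never negative. */
    long lagMs(long nowMs) {
        return Math.max(0L, nowMs - watermarkMs.get());
    }

    // The watermark only moves forward; older candidates are ignored.
    private void advance(long candidateMs) {
        watermarkMs.accumulateAndGet(candidateMs, Math::max);
    }
}
```

Under these assumptions an idle table settles at a lag of roughly `timestampBufferMs`, because each empty poll moves the watermark to `pollEnd − timestampBufferMs`.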

A new sleepWithLagSampling(...) helper in IndexCDCConsumer replaces every consumer-thread Thread.sleep site (run loop, startup wait, parent-progress wait, failure backoff). It chunks every sleep into lagSampleIntervalMs slices (default 1000ms, configurable via new property phoenix.index.cdc.consumer.lag.sample.interval.ms) and emits one cdcIndexUpdateLag sample per slice. The previous per-batch updateCdcLag calls are removed so the sampler is the single emitter, and parent-replay timestamps no longer pollute the per-table histogram.
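A minimal sketch of the chunked-sleep idea, assuming a signature that takes the lag source and the metric sink as functional interfaces. The parameter list, the class name `LagSamplingSleepSketch`, the returned sample count, and the choice to emit after each slice rather than before it are all assumptions made for illustration; the actual helper in `IndexCDCConsumer` may differ.

```java
import java.util.function.LongConsumer;
import java.util.function.LongSupplier;

/** Hypothetical sketch of a sleep that emits one lag sample per slice. */
final class LagSamplingSleepSketch {

    /**
     * Sleeps for totalSleepMs in slices of at most sampleIntervalMs and
     * hands one lag value to lagSink after each slice.
     *
     * @return the number of samples emitted
     */
    static int sleepWithLagSampling(long totalSleepMs, long sampleIntervalMs,
            LongSupplier lagSupplier, LongConsumer lagSink) throws InterruptedException {
        if (sampleIntervalMs <= 0) {
            throw new IllegalArgumentException("sampleIntervalMs must be positive");
        }
        long remainingMs = totalSleepMs;
        int samples = 0;
        while (remainingMs > 0) {
            // The final slice may be shorter than the sample interval.
            long sliceMs = Math.min(remainingMs, sampleIntervalMs);
            Thread.sleep(sliceMs);
            remainingMs -= sliceMs;
            lagSink.accept(lagSupplier.getAsLong());
            samples++;
        }
        return samples;
    }
}
```

With this shape, a 25 ms sleep at a 10 ms interval is split into slices of 10, 10 and 5 ms and yields three samples, so the emission rate depends on the interval rather than on whether any batch completed.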

Files changed:

  • phoenix-core-server/.../IndexCDCConsumer.java — modified
  • phoenix-core-server/.../IndexCDCConsumerProgress.java — new
  • phoenix-core/src/test/.../IndexCDCConsumerProgressTest.java — new (9 unit tests)
  • phoenix-core/src/it/.../IndexCDCConsumerLagIT.java — new

Why are the changes needed?

cdcIndexUpdateLag is the primary freshness SLO signal for eventually consistent secondary indexes, but pre-fix it only emits inside if (!batchMutations.isEmpty()) blocks (IndexCDCConsumer.java:988 and :1109). This produces three distinct bugs:

  1. Silent during idle / sustained failure / startup. Histograms have no new samples; existing percentiles age out and dashboards look healthy while the EC index is silently falling behind.
  2. Silent during post-split parent replay. While replayAndCompleteParentRegions(...) runs (which can take hours), the child region's own writes accumulate unprocessed but the lag metric reports nothing.
  3. Mis-attribution. The per-batch emit also fires from processPartitionToCompletion during replay, so ancestor-partition timestamps pollute the per-data-table histogram with stale values.

Net effect: freshness alerts produce false negatives during the operational windows that matter most.

Does this PR introduce any user-facing change?

Yes. The behaviour of cdcIndexUpdateLag changes; its name and shape do not (it is still a histogram with the same per-table fanout).

| Scenario | Before | After |
| --- | --- | --- |
| Active traffic | Sample per batch | Sample per ~1s (configurable) |
| Idle table | Silent | Continuous samples, value ≈ timestampBufferMs baseline |
| Sustained failures | Silent | Continuous, value growing (visibly stuck) |
| Parent-region replay | Silent + polluted with ancestor timestamps | Continuous, value reflects child-region wall-clock lag |
| Cold start | Silent until first batch | Reports now − consumerStartTime |

One operational note for release notes: the metric now legitimately grows during post-split parent replay (the EC index is stale during that window). Alerts on cdcIndexUpdateLag should use a long-enough window (e.g. > 30 min) until a follow-up subtask ships a cdcParentReplayActiveGauge for suppression.

New config: phoenix.index.cdc.consumer.lag.sample.interval.ms (default 1000, floor 50).
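As a rough illustration of how the default and the floor could interact, the sketch below resolves the property from a plain `java.util.Properties` object. The class name and the use of `Properties` are stand-ins chosen to keep the example self-contained, and clamping values below the floor up to 50 (rather than rejecting them) is an assumption about what "floor 50" means here.

```java
import java.util.Properties;

/** Hypothetical sketch of resolving the lag sample interval with a floor. */
final class LagSampleIntervalConfigSketch {
    static final String KEY = "phoenix.index.cdc.consumer.lag.sample.interval.ms";
    static final long DEFAULT_MS = 1000L;
    static final long FLOOR_MS = 50L;

    /** Returns the configured interval, or the default, never below the floor. */
    static long resolve(Properties props) {
        String raw = props.getProperty(KEY);
        long configuredMs = (raw == null) ? DEFAULT_MS : Long.parseLong(raw.trim());
        // Assumption: values under the floor are raised to the floor.
        return Math.max(FLOOR_MS, configuredMs);
    }
}
```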

How was this patch tested?

Unit tests: IndexCDCConsumerProgressTest, 9 deterministic tests, all passing. Covers cold-start floor, monotonicity of both signals, empty-poll watermark math, processed-vs-empty interaction, idle bounded growth, negative-lag clamp, and pre-buffer empty-poll edge case.

Integration test: IndexCDCConsumerLagIT, 1 test, passing in ~11s. Verifies on a real MiniCluster that the sampler keeps emitting samples during a 5s idle window (binary flow check: the pre-fix delta would be 0). Uses awaitMinCount(1, 120s) rather than fixed sleeps for startup so the test is robust to slow CI / GC jitter. Numerical value-correctness is left to the unit tests, which are deterministic.
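The wait-until-count pattern mentioned above could look roughly like the sketch below. The PR only shows the call `awaitMinCount(1, 120s)`, so the signature, the poll-interval parameter, and the `LongSupplier` used to read the sample count are assumptions for illustration, not the test's actual helper.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

/** Hypothetical sketch of polling for a minimum sample count with a deadline. */
final class AwaitMinCountSketch {

    /**
     * Polls sampleCount until it reaches minCount or timeoutMs elapses.
     *
     * @return true if the count was reached before the deadline
     */
    static boolean awaitMinCount(LongSupplier sampleCount, long minCount,
            long timeoutMs, long pollIntervalMs) throws InterruptedException {
        long deadlineNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (true) {
            if (sampleCount.getAsLong() >= minCount) {
                return true;
            }
            long remainingNanos = deadlineNanos - System.nanoTime();
            if (remainingNanos <= 0) {
                return false;
            }
            // Sleep for the poll interval, but never past the deadline.
            long remainingMs = Math.max(1L, TimeUnit.NANOSECONDS.toMillis(remainingNanos));
            Thread.sleep(Math.min(pollIntervalMs, remainingMs));
        }
    }
}
```

Compared with a fixed sleep, this returns as soon as the condition holds and only spends the full timeout when the sampler never emits, which is what makes a generous 120s bound cheap on healthy runs.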

Regression coverage on existing EC index ITs:

  • MultiTenantEventualIndexIT#testBasicMultiTenantEventualIndex — PASS
  • MultiTenantEventualIndexGenerateIT#testBasicMultiTenantEventualIndex — PASS (covers the processCDCBatchGenerated path)
  • ConcurrentMutationsCoveredEventualIT — 3/3 PASS (exercises sibling metrics cdcBatchProcessTime / cdcBatchCount / cdcMutationCount that are deliberately preserved in the same blocks)

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude (Anthropic)

@palashc palashc requested a review from virajjasani June 9, 2026 21:40
