feat: Refactor reindexing timeline generation and add a preview viewer api #19511

Open
capistrant wants to merge 1 commit into apache:master from capistrant:cascadeReindex-timeline-api

@capistrant (Contributor) commented May 24, 2026

Description

Replacement for #19051. This PR splits the server-side code out of that PR and refactors it to achieve the same goal in a hopefully more architecturally sound manner. @vogievetsky has volunteered to help with the UI side.

New Supervisor API (cascading reindexing supervisors only)

GET /druid/indexer/v1/supervisor/{supervisorId}/reindexingTimeline

This generates a timeline of search intervals for reindexing, each paired with the effective set of rules used to create the underlying inline compaction config for that interval. This is the business end of the console UI described below.
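As a rough illustration of how a client might call this endpoint, the sketch below builds the request with the JDK's `java.net.http` API. Only the path comes from this description. The class name, the Overlord base URL, the `Accept` header, and the assumption that the supervisor ID needs no URL escaping are all hypothetical.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical client-side sketch; not part of this PR.
final class ReindexingTimelineRequestSketch
{
  // Builds the endpoint URI. The path is taken from the PR description;
  // the base URL is whatever address your Overlord listens on.
  static URI timelineUri(String overlordBaseUrl, String supervisorId)
  {
    // Drop a single trailing slash so the path is not doubled up.
    final String base = overlordBaseUrl.endsWith("/")
                        ? overlordBaseUrl.substring(0, overlordBaseUrl.length() - 1)
                        : overlordBaseUrl;
    // Assumption: supervisorId contains only URL-safe characters.
    return URI.create(base + "/druid/indexer/v1/supervisor/" + supervisorId + "/reindexingTimeline");
  }

  // Builds (but does not send) a GET request for the reindexing timeline.
  static HttpRequest timelineRequest(String overlordBaseUrl, String supervisorId)
  {
    return HttpRequest.newBuilder(timelineUri(overlordBaseUrl, supervisorId))
                      .GET()
                      .header("Accept", "application/json") // assumed response type
                      .build();
  }
}
```

Sending the request with `HttpClient.newHttpClient().send(...)` would return the serialized timeline view. Its exact JSON shape is defined by this PR's code and is not reproduced here.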

Reindexing Timeline Refactor

ReindexingPlanner and ReindexingPlan are new classes that extract the internals that take a datasource timeline and a rule provider (plus other reindex cascade config bits) and generate a sorted set of intervals, each paired with the reindexing rules that apply to it.

ReindexingTimelineView is the public DTO for the new API. The console UI will be built on it to present the visualized reindexing timeline to users/operators.

Release note

Adds a new compaction supervisor API for supervisors that use the reindexCompact template.

GET /druid/indexer/v1/supervisor/{supervisorId}/reindexingTimeline

This API allows viewing a timeline of intervals, each with the set of reindexing rules that will be applied to it.


Key changed/added classes in this PR
  • CascadingReindexingTemplate
  • CompactionScheduler
  • OverlordCompactionScheduler
  • ReindexingPlan
  • ReindexingPlanner
  • ReindexingTimelineView
  • SupervisorResource

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.


private static SegmentTimeline createTimeline(DateTime start, DateTime end)
{
final DataSegment segment = DataSegment.builder()
@FrankChen021 (Member) left a comment

| Severity | Findings |
|----------|----------|
| P0       | 0        |
| P1       | 0        |
| P2       | 2        |
| P3       | 0        |
| Total    | 2        |

Reviewed 13 of 13 changed files.


This is an automated review by Codex GPT-5.5

final IntervalPartitioningInfo originalInfo = searchIntervals.get(i);
final Interval originalInterval = originalInfo.getInterval();

if (!originalInterval.overlaps(dataRangeWithSkipOffset)) {

[P2] Skip-offset-only intervals are mislabeled as no-data

The no-overlap branch also catches intervals that are entirely after dataRangeWithSkipOffset because of skipOffsetFromNow or skipOffsetFromLatest. Those intervals still overlap the live timeline, but they are recorded as SKIPPED_NO_DATA; ReindexingTimelineView filters that disposition out, so the preview hides intervals skipped only by the skip-offset boundary. Please distinguish live intervals beyond the truncation boundary and mark them SKIPPED_BEYOND_BOUNDARY.
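One possible shape for that distinction is sketched below. It is not the PR's code: intervals are modeled as half-open epoch-millisecond ranges instead of Joda `Interval`s, and the class name, method names, and the `ACTIVE` disposition are hypothetical. Only `SKIPPED_NO_DATA` and `SKIPPED_BEYOND_BOUNDARY` come from the review.

```java
// Hypothetical sketch of the suggested classification; not the PR's implementation.
final class DispositionSketch
{
  // ACTIVE is an illustrative placeholder; the SKIPPED_* names come from the review.
  enum Disposition { ACTIVE, SKIPPED_NO_DATA, SKIPPED_BEYOND_BOUNDARY }

  // Overlap test for half-open ranges [start, end); empty ranges never overlap.
  static boolean overlaps(long aStart, long aEnd, long bStart, long bEnd)
  {
    return aStart < aEnd && bStart < bEnd && aStart < bEnd && bStart < aEnd;
  }

  // boundary: where the skip offset truncates the data range (assumed to cap its end).
  static Disposition classify(
      long searchStart,
      long searchEnd,
      long dataStart,
      long dataEnd,
      long boundary
  )
  {
    final long truncatedEnd = Math.min(dataEnd, boundary);
    if (overlaps(searchStart, searchEnd, dataStart, truncatedEnd)) {
      return Disposition.ACTIVE;
    }
    // No overlap with the truncated range, but live data exists here:
    // the interval was skipped only because of the skip-offset boundary.
    if (overlaps(searchStart, searchEnd, dataStart, dataEnd)) {
      return Disposition.SKIPPED_BEYOND_BOUNDARY;
    }
    return Disposition.SKIPPED_NO_DATA;
  }
}
```

With data covering `[0, 100)` and a boundary at `70`, a search interval `[80, 100)` classifies as `SKIPPED_BEYOND_BOUNDARY`, while `[100, 120)` remains `SKIPPED_NO_DATA`.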

LOG.warn("No search intervals generated for dataSource[%s], no reindexing jobs will be created", dataSource);
return Collections.emptyList();
}
final ReindexingPlan plan = new ReindexingPlanner(this).plan(jobParams.getScheduleStartTime(), timeline);

[P2] Job creation ignores planner validation errors

ReindexingPlanner.plan() now converts invalid granularity timelines into a PlanValidationError and an empty interval list for API preview, but the job creation path never checks plan.getValidationError(). Invalid cascading configs that previously failed job generation can now silently produce no jobs, masking a broken supervisor as an empty run. Please preserve the old failure or surface the validation error before returning jobs.
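A minimal sketch of the second option (surfacing the validation error before returning jobs) might look like the following. The `Plan` class here is a simplified stand-in for `ReindexingPlan`, with intervals as strings and the error as a nullable message. The real types, the choice of exception, and the method names are assumptions.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; Plan is a stand-in for ReindexingPlan, not its real API.
final class PlanValidationSketch
{
  static final class Plan
  {
    final List<String> intervals;
    final String validationError; // null when the plan is valid

    Plan(List<String> intervals, String validationError)
    {
      this.intervals = intervals;
      this.validationError = validationError;
    }
  }

  // Job-creation path: fail loudly on an invalid plan instead of returning no jobs.
  static List<String> createJobs(Plan plan)
  {
    if (plan.validationError != null) {
      // Exception type is an assumption; the point is that the error is not swallowed.
      throw new IllegalStateException("Invalid reindexing plan: " + plan.validationError);
    }
    return Collections.unmodifiableList(plan.intervals);
  }
}
```

The preview API could keep returning the validation error as data, while the job-creation path treats the same error as a failure.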
