Skip to content

Support columns_sorted in row_filters#20497

Merged
adriangb merged 2 commits intoapache:mainfrom
sdf-jkl:columns_sorted
Mar 15, 2026
Merged

Support columns_sorted in row_filters#20497
adriangb merged 2 commits intoapache:mainfrom
sdf-jkl:columns_sorted

Conversation

@sdf-jkl
Copy link
Contributor

@sdf-jkl sdf-jkl commented Feb 23, 2026

Which issue does this PR close?

Rationale for this change

Improving predicate ordering for predicate pushdown

What changes are included in this PR?

Building on changes from #3477 and #7528

  • Implement the columns_sorted function
  • Change should_enable_page_index to use index when choose to reorder predicates in config

Are these changes tested?

Yes, unit tests

Are there any user-facing changes?

No

@github-actions github-actions bot added the datasource Changes to the datasource crate label Feb 23, 2026
@sdf-jkl sdf-jkl marked this pull request as ready for review February 23, 2026 18:45
@sdf-jkl
Copy link
Contributor Author

sdf-jkl commented Feb 23, 2026

@alamb @Ted-Jiang please take a look when you are available.

@alamb
Copy link
Contributor

alamb commented Feb 27, 2026

I think @adriangb also looked at using the sorted mentadata recently

One big question I have is why are we proposing this change? Do we have any evidence this help performance (like benchmarks?)

@sdf-jkl
Copy link
Contributor Author

sdf-jkl commented Feb 27, 2026

Do you know if any of the benchmark queries evaluate filters on sorted cols? If not, I'll make a new one.

@sdf-jkl
Copy link
Contributor Author

sdf-jkl commented Feb 27, 2026

I think there are no optimizations for evaluating predicates on filtered arrays.

It would be cool to be able to support binary search for comparison predicates. Locate the relevant rows once, then operate only on that range:

=      → return only matching rows (skip everything else)
>, >=  → keep rows after the search value
<, <=  → keep rows before the search value
!=     → keep all rows except those equal to the value

@adriangb
Copy link
Contributor

It would be cool to be able to support binary search for comparison predicates. Locate the relevant rows once, then operate only on that range:

If the columns is sorted, won't row group / page pruning basically be doing this already?

@sdf-jkl
Copy link
Contributor Author

sdf-jkl commented Feb 28, 2026

It would be cool to be able to support binary search for comparison predicates. Locate the relevant rows once, then operate only on that range:

If the columns is sorted, won't row group / page pruning basically be doing this already?

You're right. Then, if a column is sorted it's a reason enough to use load page indices and prune. Due how data is distributed this could prune a lot right away.

@sdf-jkl
Copy link
Contributor Author

sdf-jkl commented Mar 12, 2026

I'll work on adding a query to the existing bench to capture the benefits and send the results later.

@adriangb
Copy link
Contributor

@sdf-jkl could you help me with some napkin math on how this optimization works? Is the idea that applying a row selection when a page index is present is more efficient? I'm not sure if that means we should filter columns that have a page index first or last, and how that would weigh against e.g. the size of the column or the selectivity of the filter 🤔

@sdf-jkl
Copy link
Contributor Author

sdf-jkl commented Mar 14, 2026

Sorry, I think I got things mixed up while working on this.

We consider a column sorted by checking page_index ordering (min/max) for that column across pages in each row group. If those pages are ordered, we treat that column as sorted.

Given that, this column is usually a strong candidate for row group/page pruning. So we prune.

After pruning, the remaining work goes to row_filter. For a range predicate on a sorted column, row_filter is then likely to trim mostly at kept-window boundaries (often a relatively small contiguous region, though it can still include full page(s) once we use the selection on heavier columns)

This should make the incremental benefit of using a predicate on this column early in Late Materialization likely marginal in many workloads, given most of the pruning value was already captured earlier.

@adriangb
Copy link
Contributor

This should make the incremental benefit of using a predicate on this column early in Late Materialization likely marginal in many workloads, given most of the pruning value was already captured earlier.

So the point is that these columns (sorted columns) were likely well pruned by row group / page min/max stats -> they're unlikely to be selective for row pruning -> they should be evaluated last?

@sdf-jkl
Copy link
Contributor Author

sdf-jkl commented Mar 14, 2026

They are unlikely to be highly selective for row pruning, but we can't reliably assume they are always less selective than other predicates.

My implementation here did the opposite and prioritized them in the evaluation order, which is a mistake.

At this point, I think the rule itself might be unnecessary, and we could consider closing the issue.

I can clean up the docs and the function placeholder in row_filter.rs in this PR.

This is the current doc saying we should prioritized sorted columns:

//! The basic algorithm for constructing the `RowFilter` is as follows
//!
//! 1. Break conjunctions into separate predicates. An expression
//! like `a = 1 AND (b = 2 AND c = 3)` would be
//! separated into the expressions `a = 1`, `b = 2`, and `c = 3`.
//! 2. Determine whether each predicate can be evaluated as an `ArrowPredicate`.
//! 3. Determine, for each predicate, the total compressed size of all
//! columns required to evaluate the predicate.
//! 4. Determine, for each predicate, whether all columns required to
//! evaluate the expression are sorted.
//! 5. Re-order the predicate by total size (from step 3).
//! 6. Partition the predicates according to whether they are sorted (from step 4)
//! 7. "Compile" each predicate `Expr` to a `DatafusionArrowPredicate`.
//! 8. Build the `RowFilter` with the sorted predicates followed by
//! the unsorted predicates. Within each partition, predicates are
//! still be sorted by size.

@adriangb
Copy link
Contributor

I agree updating the docs and removing this un-implemented heuristic makes sense.

@github-actions github-actions bot removed the datasource Changes to the datasource crate label Mar 15, 2026
@github-actions github-actions bot added the datasource Changes to the datasource crate label Mar 15, 2026
The sorted dataset is automatically generated from the ClickBench partitioned dataset. You can configure the memory used during the sorting process with the `DATAFUSION_MEMORY_GB` environment variable. The default memory limit is 12GB.
```bash
./bench.sh data data_sorted_clickbench
./bench.sh data clickbench_sorted
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

threw this in

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m assuming the new version is correct 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sdf-jkl
Copy link
Contributor Author

sdf-jkl commented Mar 15, 2026

@adriangb

@adriangb adriangb added this pull request to the merge queue Mar 15, 2026
Merged via the queue into apache:main with commit ab28234 Mar 15, 2026
30 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

datasource Changes to the datasource crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support columns_sorted in row_filters

3 participants