Support `columns_sorted` in row_filters by sdf-jkl · Pull Request #20497 · apache/datafusion

sdf-jkl · 2026-02-23T16:34:18Z

Which issue does this PR close?

Closes Support columns_sorted in row_filters #3476.

Rationale for this change

Improving predicate ordering for predicate pushdown

What changes are included in this PR?

Building on changes from #3477 and #7528

Implement the columns_sorted function
Change should_enable_page_index to load the page index when predicate reordering is enabled in the config

Are these changes tested?

Yes, unit tests

Are there any user-facing changes?

No

sdf-jkl · 2026-02-23T18:47:34Z

@alamb @Ted-Jiang please take a look when you are available.

alamb · 2026-02-27T19:44:31Z

I think @adriangb also looked at using the sorted metadata recently.

One big question I have is: why are we proposing this change? Do we have any evidence that it helps performance (e.g. benchmarks)?

sdf-jkl · 2026-02-27T20:43:51Z

Do you know if any of the benchmark queries evaluate filters on sorted cols? If not, I'll make a new one.

sdf-jkl · 2026-02-27T21:54:28Z

I think there are no optimizations for evaluating predicates on filtered arrays.

It would be cool to be able to support binary search for comparison predicates. Locate the relevant rows once, then operate only on that range:

=      → return only matching rows (skip everything else)
>, >=  → keep rows after the search value
<, <=  → keep rows before the search value
!=     → keep all rows except those equal to the value
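A minimal sketch of the idea above, not code from this PR: it assumes an ascending `i64` slice with no nulls, and the names `CmpOp` and `matching_ranges` are made up for illustration. Two binary searches (via `slice::partition_point`) locate the run of values equal to the target, and every operator then reduces to at most two index ranges.

```rust
use std::ops::Range;

/// Comparison operators from the list above (hypothetical enum).
#[derive(Clone, Copy, Debug)]
pub enum CmpOp {
    Eq,
    NotEq,
    Lt,
    LtEq,
    Gt,
    GtEq,
}

/// Returns the index ranges of `sorted` whose values satisfy `value <op> target`.
/// Assumes `sorted` is in ascending order and contains no nulls.
pub fn matching_ranges(sorted: &[i64], op: CmpOp, target: i64) -> Vec<Range<usize>> {
    // First index whose value is >= target.
    let lo = sorted.partition_point(|v| *v < target);
    // First index whose value is > target.
    let hi = sorted.partition_point(|v| *v <= target);
    let n = sorted.len();

    let ranges = match op {
        CmpOp::Eq => vec![lo..hi],
        CmpOp::NotEq => vec![0..lo, hi..n],
        CmpOp::Lt => vec![0..lo],
        CmpOp::LtEq => vec![0..hi],
        CmpOp::Gt => vec![hi..n],
        CmpOp::GtEq => vec![lo..n],
    };

    // Drop empty ranges so callers only see ranges that contain matching rows.
    ranges.into_iter().filter(|r| !r.is_empty()).collect()
}
```

Each call costs two `O(log n)` searches regardless of the operator, instead of comparing every row.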

adriangb · 2026-02-27T22:50:56Z

> It would be cool to be able to support binary search for comparison predicates. Locate the relevant rows once, then operate only on that range:

If the column is sorted, won't row group / page pruning basically be doing this already?

sdf-jkl · 2026-02-28T00:34:44Z

> > It would be cool to be able to support binary search for comparison predicates. Locate the relevant rows once, then operate only on that range:
>
> If the column is sorted, won't row group / page pruning basically be doing this already?

You're right. In that case, a sorted column is reason enough to load the page index and prune. Because of how the data is distributed, this could prune a lot right away.

sdf-jkl · 2026-03-12T16:04:22Z

I'll work on adding a query to the existing bench to capture the benefits and send the results later.

Add sorted data benchmark. #19042

adriangb · 2026-03-13T23:18:59Z

@sdf-jkl could you help me with some napkin math on how this optimization works? Is the idea that applying a row selection when a page index is present is more efficient? I'm not sure if that means we should filter columns that have a page index first or last, and how that would weigh against e.g. the size of the column or the selectivity of the filter 🤔

sdf-jkl · 2026-03-14T20:00:18Z

Sorry, I think I got things mixed up while working on this.

We consider a column sorted by checking page_index ordering (min/max) for that column across pages in each row group. If those pages are ordered, we treat that column as sorted.
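A rough sketch of that check, under stated assumptions: `PageStats`, `pages_ascending`, and `column_sorted` are hypothetical names, statistics are plain `i64` min/max pairs, and only ascending order is considered. It is not the PR's implementation and does not use the parquet crate's types.

```rust
/// Min/max statistics for one data page (hypothetical type for illustration).
#[derive(Clone, Copy, Debug)]
pub struct PageStats {
    pub min: i64,
    pub max: i64,
}

/// True if the pages of one row group are in ascending order:
/// every page's max is <= the next page's min.
pub fn pages_ascending(pages: &[PageStats]) -> bool {
    pages.iter().all(|p| p.min <= p.max)
        && pages.windows(2).all(|w| w[0].max <= w[1].min)
}

/// Treats the column as sorted if the pages are ordered within every row group.
/// Each row group is checked independently, as described above.
pub fn column_sorted(row_groups: &[Vec<PageStats>]) -> bool {
    row_groups.iter().all(|pages| pages_ascending(pages))
}
```

Note that ordered page boundaries say nothing about the order of values inside a page, so this is a heuristic signal rather than a proof that the column is sorted.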

Given that, this column is usually a strong candidate for row group/page pruning. So we prune.

After pruning, the remaining work goes to row_filter. For a range predicate on a sorted column, row_filter is then likely to trim rows mostly at the boundaries of the kept window (often a relatively small contiguous region, though it can still include full page(s) once we use the selection on heavier columns).

This should make the incremental benefit of using a predicate on this column early in Late Materialization likely marginal in many workloads, given most of the pruning value was already captured earlier.
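Some napkin math for this argument, as a sketch under assumptions: pages are modelled as plain `Vec<i64>`, the predicate is the half-open range `lo <= v < hi`, and `PruneSummary` and `summarize` are hypothetical names. The function counts how many rows survive min/max page pruning and how many of those actually match, so the gap between the two is the work left for the row filter.

```rust
/// Row counts at each stage (hypothetical type for illustration).
#[derive(Debug, Default, PartialEq)]
pub struct PruneSummary {
    pub total_rows: usize,
    pub rows_after_page_pruning: usize,
    pub rows_matching: usize,
}

/// Simulates min/max page pruning followed by a row-level filter
/// for the predicate `lo <= v < hi`.
pub fn summarize(pages: &[Vec<i64>], lo: i64, hi: i64) -> PruneSummary {
    let mut s = PruneSummary::default();
    for page in pages {
        s.total_rows += page.len();
        let (Some(min), Some(max)) = (page.iter().min(), page.iter().max()) else {
            continue; // empty page: nothing to prune or keep
        };
        // Page pruning: skip pages whose [min, max] cannot overlap [lo, hi).
        if *max < lo || *min >= hi {
            continue;
        }
        s.rows_after_page_pruning += page.len();
        // Row filter: only rows in surviving pages are evaluated.
        s.rows_matching += page.iter().filter(|v| **v >= lo && **v < hi).count();
    }
    s
}
```

For 100 sorted values split into 10 pages of 10 and the predicate `25 <= v < 47`, page pruning keeps 30 rows and 22 of them match, so the row filter removes only the 8 rows at the two window boundaries.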

adriangb · 2026-03-14T20:48:46Z

> This should make the incremental benefit of using a predicate on this column early in Late Materialization likely marginal in many workloads, given most of the pruning value was already captured earlier.

So the point is that these columns (sorted columns) were likely well pruned by row group / page min/max stats -> they're unlikely to be selective for row pruning -> they should be evaluated last?

sdf-jkl · 2026-03-14T21:55:36Z

They are unlikely to be highly selective for row pruning, but we can't reliably assume they are always less selective than other predicates.

My implementation here did the opposite and prioritized them in the evaluation order, which is a mistake.

At this point, I think the rule itself might be unnecessary, and we could consider closing the issue.

I can clean up the docs and the function placeholder in row_filter.rs in this PR.

This is the current doc saying we should prioritize sorted columns:

datafusion/datafusion/datasource-parquet/src/row_filter.rs

Lines 45 to 60 in 9b7d092

    
    //! The basic algorithm for constructing the `RowFilter` is as follows
    //!
    //! 1. Break conjunctions into separate predicates. An expression
    //!    like `a = 1 AND (b = 2 AND c = 3)` would be
    //!    separated into the expressions `a = 1`, `b = 2`, and `c = 3`.
    //! 2. Determine whether each predicate can be evaluated as an `ArrowPredicate`.
    //! 3. Determine, for each predicate, the total compressed size of all
    //!    columns required to evaluate the predicate.
    //! 4. Determine, for each predicate, whether all columns required to
    //!    evaluate the expression are sorted.
    //! 5. Re-order the predicate by total size (from step 3).
    //! 6. Partition the predicates according to whether they are sorted (from step 4)
    //! 7. "Compile" each predicate `Expr` to a `DatafusionArrowPredicate`.
    //! 8. Build the `RowFilter` with the sorted predicates followed by
    //!    the unsorted predicates. Within each partition, predicates are
    //!    still be sorted by size.
adriangb · 2026-03-15T14:34:00Z

I agree updating the docs and removing this un-implemented heuristic makes sense.

sdf-jkl · 2026-03-15T21:23:41Z

benchmarks/README.md

     The sorted dataset is automatically generated from the ClickBench partitioned dataset. You can configure the memory used during the sorting process with the `DATAFUSION_MEMORY_GB` environment variable. The default memory limit is 12GB.
     ```bash
    -./bench.sh data data_sorted_clickbench
    +./bench.sh data clickbench_sorted


threw this in

I’m assuming the new version is correct 👍

cde6dfa#diff-1769f5787dc11c8b1f1b48288cdf3c89d25a5b5cbc6be4740bfcc70a6313ba99R106

sdf-jkl · 2026-03-15T21:45:36Z

@adriangb

github-actions bot added the datasource Changes to the datasource crate label Feb 23, 2026

sdf-jkl marked this pull request as ready for review February 23, 2026 18:45

sdf-jkl force-pushed the columns_sorted branch from ff5e6a6 to 0a03e3c Compare March 13, 2026 22:52

fix stale doc

cab96a6

sdf-jkl force-pushed the columns_sorted branch from 0a03e3c to cab96a6 Compare March 15, 2026 21:19

github-actions bot removed the datasource Changes to the datasource crate label Mar 15, 2026

remove redundant heuristic

5995e8f

github-actions bot added the datasource Changes to the datasource crate label Mar 15, 2026

sdf-jkl commented Mar 15, 2026


adriangb approved these changes Mar 15, 2026


adriangb added this pull request to the merge queue Mar 15, 2026

Merged via the queue into apache:main with commit ab28234 Mar 15, 2026
30 checks passed

3 participants
