refactor: unify SQL planning for ORDER BY, HAVING, DISTINCT, etc #19974

lichuang · 2026-01-24T14:32:15Z

Which issue does this PR close?

Closes #10326 (Unify SQL planning for ORDER BY, HAVING, DISTINCT, etc)

This PR refactors ORDER BY planning to use the merged schema approach (similar to HAVING), unifying the SQL planning logic and simplifying the generated execution plans.

Currently, DataFusion has two different code paths for handling ORDER BY:

• add_missing_columns (in LogicalPlanBuilder::sort_with_limit): traverses the plan tree looking for Projection nodes and adds missing columns to them
• Merged schema approach (used by HAVING): uses a merged schema (SELECT list + FROM clause) to resolve expressions and adds missing columns directly to the SELECT list

Having both paths coexist leads to:

• Complex and hard-to-maintain code
• Non-intuitive handling of simple queries like SELECT x FROM foo ORDER BY y
• Generated execution plans with unnecessary subquery wrapping

Solution

Implement the approach proposed in #10326: handle ORDER BY similarly to HAVING by using the merged schema and adding missing columns directly to the SELECT list instead of traversing the plan tree.
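
To make the idea concrete, here is a minimal sketch of the merged schema approach using toy types. It does not use DataFusion's real API: PlannedSelect and plan_order_by are hypothetical names invented for illustration, and trimming the helper columns back to the user's SELECT list after the sort is an assumption about how the final output would be shaped.

```rust
/// Toy stand-in for a planned SELECT (hypothetical, not a DataFusion type).
#[derive(Debug, PartialEq)]
struct PlannedSelect {
    /// Columns computed by the projection: the SELECT list plus any added sort keys.
    projection: Vec<String>,
    /// Columns the sort operates on.
    sort_keys: Vec<String>,
    /// Columns the user asked for (assumed to be restored after sorting).
    output: Vec<String>,
}

/// Resolve ORDER BY keys against the merged schema (SELECT list, then FROM
/// clause) and append any missing keys directly to the SELECT list.
fn plan_order_by(
    select_list: &[&str],
    from_schema: &[&str],
    order_by: &[&str],
) -> Result<PlannedSelect, String> {
    let mut projection: Vec<String> = select_list.iter().map(|c| c.to_string()).collect();

    for key in order_by {
        // Already produced by the SELECT list: nothing to add.
        if projection.iter().any(|c| c.as_str() == *key) {
            continue;
        }
        // Only available from the FROM clause: add it to the SELECT list.
        if from_schema.contains(key) {
            projection.push(key.to_string());
        } else {
            return Err(format!(
                "column '{}' not found in SELECT list or FROM clause",
                key
            ));
        }
    }

    Ok(PlannedSelect {
        projection,
        sort_keys: order_by.iter().map(|c| c.to_string()).collect(),
        output: select_list.iter().map(|c| c.to_string()).collect(),
    })
}
```

For SELECT x FROM foo ORDER BY y, calling plan_order_by(&["x"], &["x", "y"], &["y"]) extends the projection with y so the sort can see it, without walking an already-built plan tree to find a Projection node.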

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

lichuang · 2026-01-28T07:28:58Z

@alamb @jonahgao

alamb · 2026-01-28T21:06:30Z

This seems to add more code (rather than unify the code paths as the PR comment suggests) 🤔

alamb · 2026-01-28T21:06:56Z

In other words, this PR seems to add more code paths, when the idea was to reduce the code / duplication

github-actions bot added the sql SQL Planner label Jan 24, 2026

lichuang force-pushed the issue-10326 branch 3 times, most recently from 618a289 to 6a0bfa1 Compare January 28, 2026 02:36

github-actions bot added the logical-expr Logical plan and expressions label Jan 28, 2026

refactor: unify SQL planning for ORDER BY, HAVING, DISTINCT, etc

2f22ece

lichuang force-pushed the issue-10326 branch from 6a0bfa1 to 2f22ece Compare January 28, 2026 06:45

lichuang marked this pull request as ready for review January 28, 2026 07:05
