Skip to content

fix: preserve duplicate GROUPING SETS rows#21058

Open
xiedeyantu wants to merge 3 commits intoapache:mainfrom
xiedeyantu:groupset
Open

fix: preserve duplicate GROUPING SETS rows#21058
xiedeyantu wants to merge 3 commits intoapache:mainfrom
xiedeyantu:groupset

Conversation

@xiedeyantu
Copy link
Member

Which issue does this PR close?

  • Closes #.

Rationale for this change

GROUPING SETS with duplicate grouping lists were incorrectly collapsed during execution. The internal grouping id only encoded the null mask, so repeated grouping sets shared the same key and were merged, which caused rows to be lost compared with PostgreSQL behavior.

I verified the expected result in PostgreSQL, where the same query returns 30 rows for the full emp example instead of being collapsed.

What changes are included in this PR?

  • Preserve duplicate grouping sets by adding a per-occurrence ordinal into the internal grouping id used during execution.
  • Keep GROUPING() semantics unchanged.
  • Add regression coverage for the duplicate GROUPING SETS case in:
    • datafusion/core/tests/sql/aggregates/basic.rs
    • datafusion/sqllogictest/test_files/group_by.slt

Are these changes tested?

  • cargo fmt --all
  • cargo test -p datafusion duplicate_grouping_sets_are_preserved -- --nocapture
  • PostgreSQL validation against the same query/result shape

Are there any user-facing changes?

  • Yes. Queries that contain duplicate GROUPING SETS entries now return the correct duplicated result rows, matching PostgreSQL behavior.

@xiedeyantu xiedeyantu marked this pull request as draft March 19, 2026 16:29
@github-actions github-actions bot added core Core DataFusion crate sqllogictest SQL Logic Tests (.slt) physical-plan Changes to the physical-plan crate labels Mar 19, 2026
@xiedeyantu xiedeyantu marked this pull request as ready for review March 19, 2026 16:40
@xiedeyantu
Copy link
Member Author

Hi @alamb , there's another one here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core Core DataFusion crate physical-plan Changes to the physical-plan crate sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant