Preserve recursive CTE static schema with plan-time schema alignment#22037
Open
kosiew wants to merge 5 commits intoapache:mainfrom
Open
Preserve recursive CTE static schema with plan-time schema alignment#22037kosiew wants to merge 5 commits intoapache:mainfrom
kosiew wants to merge 5 commits intoapache:mainfrom
Conversation
- Added `align_plan_to_schema` and `SchemaAlignExec` for improved schema alignment in execution plans. - Maintained strict behavior in `project_plan_to_schema` for projection-only cases. - Updated adapter to handle nullability narrowing while preserving SQL behavior. - Modified `RecursiveQueryExec` to preserve static/declared schema and aligned recursive term at plan construction. - Removed nullability-widening schema synthesis for cleaner execution. - Restored `0 AS` level in SQL logic test file `cte.slt`.
…ent behavior - Added direct tests for align_plan_to_schema: - Verified exact schema returns the same plan. - Ensured rename-only uses ProjectionExec. - Confirmed nullability narrowing uses SchemaAlignExec. - Tested count/type/field metadata/schema metadata errors. - Documented conservative property behavior in the adapter path.
- Refactored `align_plan_to_schema` function to store input schema in a variable, reducing redundant calls. - Updated validation and comparison logic for better clarity and performance. - Simplified partitioning handling in `SchemaAlignExec` by consolidating pattern matching. - Enhanced `DisplayAs` implementation to correctly handle `TreeRender` format.
…odules - Reuse `input_schema` in common.rs - Simplify projected return using `debug_assert_eq!` - Utilize `partition_count()` in common.rs - Modify TreeRender to return `Ok(())` - Reuse `static_schema` in tests for recursive_query.rs
- Removed redundant upfront align validation in common.rs. - Added test helpers in common.rs: - single_field_schema - single_i32_exec - metadata mismatch builders - Shortened repeated test setup in common.rs. - Added recursive_exec test helper in recursive_query.rs. - Simplified RecursiveQueryExec::try_new(...) in recursive_query.rs.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Which issue does this PR close?
Rationale for this change
RecursiveQueryExecwidened recursive CTE output nullability by reconciling the static and recursive term schemas. This caused the physical schema to diverge from the logical/static CTE schema and forced valid SQL such as0 AS levelto be rewritten as nullable expressions likeSUM(0) AS level.This change preserves the declared recursive CTE schema by treating the static/anchor term schema as authoritative and aligning the recursive term to that schema during plan construction.
What changes are included in this PR?
Added
align_plan_to_schema, a higher-level plan-time schema alignment helper that guarantees the resulting plan advertises the expected schema exactly.Kept
project_plan_to_schemaas the narrower projection-based helper and refactored shared validation intovalidate_schema_alignment.Added
SchemaAlignExec, an execution-plan adapter that:RecordBatchschemas inside the adapterUpdated
RecursiveQueryExec::try_newto:align_plan_to_schemaRestored the recursive CTE SLT coverage from
SUM(0) AS levelback to0 AS level.Are these changes tested?
Yes.
Added and updated tests covering:
align_plan_to_schema:ProjectionExecSchemaAlignExecproject_plan_to_schema:RecursiveQueryExec:Restored SQL logic test coverage in
cte.sltusing0 AS level.Validated with:
Are there any user-facing changes?
Yes.
Recursive CTEs now preserve the declared/static schema instead of widening nullability based on recursive expressions. Existing valid SQL such as:
continues to work without requiring nullable rewrites like:
LLM-generated code disclosure
This PR includes LLM-generated code and comments. All LLM-generated content has been manually reviewed and tested.