feat(substrait): implement consume_nested for List expressions #20821

Closed
dd-david-levin wants to merge 1 commit into apache:main from dd-david-levin:david.levin/nested-expression-substrait-support

@dd-david-levin
Summary

  • Implements consume_nested in the default SubstraitConsumer trait to handle Nested::List expressions, which represent inline array constructors like ARRAY['a', 'b'] in Substrait plans
  • Nested::List is converted to a make_array(...) scalar function call by recursively converting each element via from_substrait_rex
  • Nested::Struct and Nested::Map return not_impl_err for now (separate follow-ups)
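The conversion described above can be sketched with simplified stand-in types. `Rex`, `Nested`, and `Expr` below are hypothetical, cut-down enums, not the real Substrait or DataFusion types, and the sketch is synchronous, whereas the PR's `from_nested` is async and takes a consumer and an input schema.

```rust
// Simplified stand-ins for illustration only; NOT the real Substrait/DataFusion types.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Rex {
    Literal(String),
    Nested(Nested),
}

#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Nested {
    List(Vec<Rex>),
    Struct(Vec<Rex>),
    Map(Vec<(Rex, Rex)>),
}

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Literal(String),
    ScalarFunction { name: String, args: Vec<Expr> },
}

// Converts any expression; nested expressions are delegated to `from_nested`.
fn from_rex(rex: &Rex) -> Result<Expr, String> {
    match rex {
        Rex::Literal(s) => Ok(Expr::Literal(s.clone())),
        Rex::Nested(n) => from_nested(n),
    }
}

// A List becomes a `make_array(...)` call whose arguments are the recursively
// converted elements; Struct and Map report "not implemented".
fn from_nested(nested: &Nested) -> Result<Expr, String> {
    match nested {
        Nested::List(values) => {
            let args = values
                .iter()
                .map(from_rex)
                .collect::<Result<Vec<_>, _>>()?;
            Ok(Expr::ScalarFunction {
                name: "make_array".to_string(),
                args,
            })
        }
        Nested::Struct(_) => {
            Err("Nested Struct expression is not yet supported".to_string())
        }
        Nested::Map(_) => {
            Err("Nested Map expression is not yet supported".to_string())
        }
    }
}
```

Because elements are converted through `from_rex`, a list that contains another list turns into a `make_array` call nested inside a `make_array` call, and an unsupported element anywhere in the list fails the whole conversion.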

Motivation

Previously, any Substrait plan containing an inline array constructor (e.g. from SQL like SELECT f(col, ARRAY['x', 'y'])) would fail at the consumer with:

This feature is not implemented: Nested expression not supported

The fix follows the same pattern used by every other expression type in the consumer — a from_nested function in a dedicated expr/nested.rs module, called from the trait's default consume_nested method.
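As a rough illustration of that pattern, the sketch below pairs a trait default method with a free function that lives in its own module. Every name here (`Consumer`, `NestedList`, the toy `Expr`) is a hypothetical stand-in, and the real method is async and also receives a schema.

```rust
// Toy stand-ins for illustration; not the DataFusion or Substrait definitions.
#[derive(Debug, PartialEq)]
enum Expr {
    Lit(i64),
    Call(String, Vec<Expr>),
}

struct NestedList(Vec<i64>);

// Plays the role of a dedicated module such as expr/nested.rs.
mod nested {
    use super::{Consumer, Expr, NestedList};

    // Free function holding the actual conversion logic.
    pub(super) fn from_nested<C: Consumer + ?Sized>(
        _consumer: &C,
        list: &NestedList,
    ) -> Result<Expr, String> {
        let args = list.0.iter().map(|v| Expr::Lit(*v)).collect();
        Ok(Expr::Call("make_array".to_string(), args))
    }
}

trait Consumer {
    // The default implementation only delegates, so implementors get the
    // behaviour for free and can still override it.
    fn consume_nested(&self, list: &NestedList) -> Result<Expr, String> {
        nested::from_nested(self, list)
    }
}

// Uses the default behaviour.
struct DefaultConsumer;
impl Consumer for DefaultConsumer {}

// Overrides the default, e.g. to reject nested expressions entirely.
struct RejectingConsumer;
impl Consumer for RejectingConsumer {
    fn consume_nested(&self, _list: &NestedList) -> Result<Expr, String> {
        Err("nested expressions disabled".to_string())
    }
}
```

Keeping the logic in a free function means a custom consumer that overrides the method can still call the default conversion for the cases it does not want to handle itself.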

Test plan

  • Existing substrait tests pass
  • Manual verification: SELECT make_array('a', 'b') roundtrips through Substrait producer → consumer correctly

🤖 Generated with Claude Code

Converts Substrait Nested::List expressions into DataFusion
make_array(...) scalar function calls, enabling inline array
constructors like ARRAY['name', 'city'] to flow through the
Substrait consumer without error.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
github-actions bot added the substrait (Changes to the substrait crate) label on Mar 9, 2026
Comment on lines +53 to +58
NestedType::Struct(_) => {
not_impl_err!("Nested Struct expression is not yet supported")
}
NestedType::Map(_) => {
not_impl_err!("Nested Map expression is not yet supported")
}
Contributor

Should we support these types as well?

Contributor

I'd say this PR is well scoped enough that those can be follow-ups. Given that current support for NestedType is zero anyway, we aren't breaking anything.

Contributor

I see the substrait producer does not currently support Nested types either: it will never produce a substrait plan with a Nested type in it. This means roundtrips through DataFusion's substrait producer <-> consumer are not possible, but an external substrait producer can still emit a NestedType, so I think this is still a valid solution.

Comment on lines +346 to 349
input_schema: &DFSchema,
) -> datafusion::common::Result<Expr> {
- not_impl_err!("Nested expression not supported")
+ from_nested(self, expr, input_schema).await
}
Contributor

Do you think we should add a test in datafusion/substrait/tests/cases/logical_plans.rs?

NestedType::List(list) => {
let mut args = Vec::with_capacity(list.values.len());
for expr in &list.values {
args.push(from_substrait_rex(consumer, expr, input_schema).await?);
Contributor

This could just be consumed through the consumer instead of calling from_substrait_rex directly.
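One reading of this suggestion is that list elements should go back through the consumer's own entry point, so that a consumer overriding expression handling also sees the elements. A minimal sketch under that assumption follows; the trait, the method name `consume_expression` as used here, and the toy `Rex`/`Expr` types are illustrative stand-ins rather than the real API.

```rust
// Toy stand-ins for illustration; not the DataFusion or Substrait definitions.
#[derive(Debug, Clone, PartialEq)]
enum Rex {
    Lit(i64),
    List(Vec<Rex>),
}

#[derive(Debug, PartialEq)]
enum Expr {
    Lit(i64),
    Call(String, Vec<Expr>),
}

trait Consumer {
    // Overridable entry point for converting any expression.
    fn consume_expression(&self, rex: &Rex) -> Result<Expr, String> {
        from_rex(self, rex)
    }
}

// Default conversion logic, analogous in spirit to a free `from_*` function.
fn from_rex<C: Consumer + ?Sized>(consumer: &C, rex: &Rex) -> Result<Expr, String> {
    match rex {
        Rex::Lit(v) => Ok(Expr::Lit(*v)),
        Rex::List(values) => from_list(consumer, values),
    }
}

fn from_list<C: Consumer + ?Sized>(consumer: &C, values: &[Rex]) -> Result<Expr, String> {
    let mut args = Vec::with_capacity(values.len());
    for value in values {
        // Dispatch through the consumer rather than calling `from_rex`
        // directly, so an overriding consumer handles each element too.
        args.push(consumer.consume_expression(value)?);
    }
    Ok(Expr::Call("make_array".to_string(), args))
}

// Uses the default conversion everywhere.
struct Plain;
impl Consumer for Plain {}

// Hypothetical custom consumer that rewrites literals and defers the rest.
struct Doubling;
impl Consumer for Doubling {
    fn consume_expression(&self, rex: &Rex) -> Result<Expr, String> {
        match rex {
            Rex::Lit(v) => Ok(Expr::Lit(v * 2)),
            other => from_rex(self, other),
        }
    }
}
```

With `Plain`, a list of `1, 2` converts to `make_array(1, 2)`; with `Doubling`, the same list converts to `make_array(2, 4)`, which would not happen if `from_list` called `from_rex` on its elements directly.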

@alexanderbianchi
Contributor

#20953
FYI, I think we can redirect to the above PR, which includes tests and a couple of small code cleanliness improvements. It still doesn't implement the map and struct nested types (on purpose, since those are non-trivial compared to the array).
