feat(substrait): implement consume_nested for List expressions #20821
dd-david-levin wants to merge 1 commit into apache:main
Converts Substrait Nested::List expressions into DataFusion make_array(...) scalar function calls, enabling inline array constructors like ARRAY['name', 'city'] to flow through the Substrait consumer without error. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
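The conversion described above can be sketched with a toy model. This is only an illustration under stated assumptions: `SubstraitExpr`, `Expr`, and `from_substrait` below are hypothetical, simplified stand-ins and not the real Substrait or DataFusion types or functions. The real code is async and also takes an input schema.

```rust
// Hypothetical, simplified stand-in for a Substrait expression.
#[derive(Debug, Clone, PartialEq)]
enum SubstraitExpr {
    Literal(String),
    NestedList(Vec<SubstraitExpr>),
    NestedStruct(Vec<SubstraitExpr>),
}

// Hypothetical, simplified stand-in for a DataFusion logical expression.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Literal(String),
    ScalarFunction { name: String, args: Vec<Expr> },
}

// Convert an expression; a nested list becomes a `make_array(...)` call
// whose arguments are the recursively converted list elements.
fn from_substrait(expr: &SubstraitExpr) -> Result<Expr, String> {
    match expr {
        SubstraitExpr::Literal(s) => Ok(Expr::Literal(s.clone())),
        SubstraitExpr::NestedList(values) => {
            let mut args = Vec::with_capacity(values.len());
            for value in values {
                // Propagate the first conversion error, if any.
                args.push(from_substrait(value)?);
            }
            Ok(Expr::ScalarFunction {
                name: "make_array".to_string(),
                args,
            })
        }
        // Mirrors the "not yet supported" branch for the other nested kinds.
        SubstraitExpr::NestedStruct(_) => {
            Err("Nested Struct expression is not yet supported".to_string())
        }
    }
}
```

In this sketch, a list holding the literals 'name' and 'city' converts to a `make_array` scalar function with two literal arguments, while a nested struct anywhere in the tree surfaces as an error.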
| NestedType::Struct(_) => { | ||
| not_impl_err!("Nested Struct expression is not yet supported") | ||
| } | ||
| NestedType::Map(_) => { | ||
| not_impl_err!("Nested Map expression is not yet supported") | ||
| } |
Should we support these types as well?
I'd say this PR is well scoped enough that those can be follow-ups. Given that current support for NestedType is zero anyway, we aren't breaking anything.
I see the Substrait producer does not currently support Nested types either: it will never produce a Substrait plan with a Nested type in it. This means roundtrips through DataFusion's Substrait producer <-> consumer are not possible, but a NestedType can still arrive from an external Substrait producer, so I think this is still a valid solution.
| input_schema: &DFSchema, | ||
| ) -> datafusion::common::Result<Expr> { | ||
| not_impl_err!("Nested expression not supported") | ||
| from_nested(self, expr, input_schema).await | ||
| } |
Do you think we should add a test in datafusion/substrait/tests/cases/logical_plans.rs?
| NestedType::List(list) => { | ||
| let mut args = Vec::with_capacity(list.values.len()); | ||
| for expr in &list.values { | ||
| args.push(from_substrait_rex(consumer, expr, input_schema).await?); |
This could be just consumed with the consumer instead of directly calling from_substrait_rex
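A minimal sketch of why routing through the consumer can matter, assuming the consumer exposes an overridable expression-level entry point whose default simply delegates to a free function. Every name here (`Rex`, `Expr`, `Consumer`, `from_rex`, `list_direct`, `list_via_consumer`, `Offsetting`) is hypothetical and simplified; none is the real DataFusion API.

```rust
// Hypothetical input and output expression types.
#[derive(Debug, Clone, PartialEq)]
enum Rex {
    Literal(i64),
    List(Vec<Rex>),
}

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Literal(i64),
    MakeArray(Vec<Expr>),
}

// Hypothetical consumer trait: the default entry point delegates to `from_rex`.
trait Consumer {
    fn consume_expression(&self, rex: &Rex) -> Expr {
        from_rex(self, rex)
    }
}

// Default conversion logic, shared by all consumers.
fn from_rex<C: Consumer + ?Sized>(consumer: &C, rex: &Rex) -> Expr {
    match rex {
        Rex::Literal(v) => Expr::Literal(*v),
        Rex::List(items) => list_via_consumer(consumer, items),
    }
}

// Variant A: converts elements by calling the free function directly,
// so an overridden `consume_expression` never sees the elements.
fn list_direct<C: Consumer + ?Sized>(consumer: &C, items: &[Rex]) -> Expr {
    Expr::MakeArray(items.iter().map(|r| from_rex(consumer, r)).collect())
}

// Variant B: converts elements through the consumer, so overrides apply.
fn list_via_consumer<C: Consumer + ?Sized>(consumer: &C, items: &[Rex]) -> Expr {
    Expr::MakeArray(items.iter().map(|r| consumer.consume_expression(r)).collect())
}

// A custom consumer that rewrites literals on the way in.
struct Offsetting(i64);

impl Consumer for Offsetting {
    fn consume_expression(&self, rex: &Rex) -> Expr {
        match rex {
            Rex::Literal(v) => Expr::Literal(*v + self.0),
            other => from_rex(self, other),
        }
    }
}
```

With `Offsetting(10)` and the elements 1 and 2, `list_direct` leaves the literals unchanged, whereas `list_via_consumer` yields 11 and 12 because each element passes through the overridden method.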
|
#20953 |
Summary
- Implement consume_nested in the default SubstraitConsumer trait to handle Nested::List expressions, which represent inline array constructors like ARRAY['a', 'b'] in Substrait plans
- Nested::List is converted to a make_array(...) scalar function call by recursively converting each element via from_substrait_rex
- Nested::Struct and Nested::Map return not_impl_err for now (separate follow-ups)

Motivation
Previously, any Substrait plan containing an inline array constructor (e.g. from SQL like SELECT f(col, ARRAY['x', 'y'])) would fail at the consumer with a not-implemented error.

The fix follows the same pattern used by every other expression type in the consumer: a from_nested function in a dedicated expr/nested.rs module, called from the trait's default consume_nested method.

Test plan
- SELECT make_array('a', 'b') roundtrips through Substrait producer → consumer correctly

🤖 Generated with Claude Code