[WIP] HIVE-29641: Upgrade Calcite to 1.42.0 #6523

Draft
rubenada wants to merge 12 commits into apache:master from rubenada:HIVE-29641

Contributor

rubenada commented Jun 3, 2026

What changes were proposed in this pull request?

Upgrade Calcite to 1.42.0.

Why are the changes needed?

Upgrade to latest Calcite version.

Does this PR introduce any user-facing change?

tbd

How was this patch tested?

tbd

- AggCall construction adjustments in HiveAggregate and HiveSqlSumAggFunction, due to "ddb4200f8f Refactor: Add fields AggregateCall.rexList and RelBuilder.AggCall.preOperands" (1.35)
- Add visitLambda and visitLambdaRef in RexVisitor implementation (HiveCalciteUtil#ConstantFinder), due to "[CALCITE-3679] Allow lambda expressions in SQL queries" (1.37)
…umAggFunction, RexNodeConverter and ASTConverter; previous method deprecated as a consequence of "[CALCITE-5557] Add SAFE_CAST function (enabled in BigQuery library)" (1.35)
… of HepPlanner, otherwise we get IllegalArgumentException.

The reason is "7fc3e1b Refactor: Add RelNode.stripped" (Calcite 1.35), which added a new check to the HepRelVertex constructor: the inner rel cannot be another HepRelVertex. HiveHepExtractRelNodeRule violates this check. It takes the root node, which has already been extracted from its HepRelVertex, but the rest of the subtree still consists of HepRelVertex nodes. When the HepPlanner is initialized inside HiveHepExtractRelNodeRule, the root's children are therefore already HepRelVertex instances, and the constructor throws the new exception:

  HepRelVertex(RelNode rel) {
    super(rel.getCluster(), rel.getTraitSet());
    currentRel = requireNonNull(rel, "rel");
    checkArgument(!(rel instanceof HepRelVertex)); // <-- new check
  }

java.lang.IllegalArgumentException
	at org.apache.hive.com.google.common.base.Preconditions.checkArgument(Preconditions.java:120)
	at org.apache.calcite.plan.hep.HepRelVertex.<init>(HepRelVertex.java:54)
	at org.apache.calcite.plan.hep.HepPlanner.addRelToGraph(HepPlanner.java:837)
	at org.apache.calcite.plan.hep.HepPlanner.addRelToGraph(HepPlanner.java:813)
	at org.apache.calcite.plan.hep.HepPlanner.setRoot(HepPlanner.java:163)
	at org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveHepExtractRelNodeRule.execute(HiveHepExtractRelNodeRule.java:39)
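
The failure mode and the stripping fix can be illustrated with a minimal sketch. It does not use Calcite: `Node`, `Vertex`, `register` and `strip` are hypothetical stand-ins for RelNode, HepRelVertex, the planner's graph registration and an unwrapping helper, and they only mimic the behavior described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy stand-ins for RelNode / HepRelVertex / HepPlanner; all names here are hypothetical.
class HepVertexSketch {

  /** Minimal plan node: a label plus its inputs. */
  static class Node {
    final String label;
    final List<Node> inputs;

    Node(String label, List<Node> inputs) {
      this.label = label;
      this.inputs = inputs;
    }
  }

  /** Wrapper that mimics the new constructor check: it refuses to wrap another wrapper. */
  static class Vertex extends Node {
    final Node current;

    Vertex(Node rel) {
      super("Vertex(" + rel.label + ")", Collections.<Node>emptyList());
      if (rel instanceof Vertex) {
        throw new IllegalArgumentException("inner rel must not be a Vertex");
      }
      this.current = rel;
    }
  }

  /** Mimics planner registration: inputs first, then the node itself is wrapped. */
  static int register(Node node) {
    int created = 0;
    for (Node input : node.inputs) {
      created += register(input);
    }
    new Vertex(node); // throws if node is already a Vertex
    return created + 1;
  }

  /** Recursively replaces every Vertex by the node it wraps. */
  static Node strip(Node node) {
    Node unwrapped = node instanceof Vertex ? ((Vertex) node).current : node;
    List<Node> newInputs = new ArrayList<>();
    for (Node input : unwrapped.inputs) {
      newInputs.add(strip(input));
    }
    return new Node(unwrapped.label, newInputs);
  }

  /** An unwrapped root whose subtree still consists of wrapped nodes. */
  static Node sampleTree() {
    Node scan = new Node("Scan", Collections.<Node>emptyList());
    Node filter = new Node("Filter", Arrays.<Node>asList(new Vertex(scan)));
    return new Node("Project", Arrays.<Node>asList(new Vertex(filter)));
  }
}
```

Calling `register(sampleTree())` fails on the first wrapped child, while `register(strip(sampleTree()))` succeeds because no wrapper is left in the tree.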
…dencies to maven-shade-plugin, to avoid:

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-shade-plugin:3.6.0:shade (default) on project hive-druid-handler:
Error creating shaded jar: Problem shading JAR .../hive/druid-handler/target/hive-druid-handler-4.3.0-SNAPSHOT.jar entry
org/apache/hive/druid/org/apache/calcite/runtime/SqlFunctions.class: org.apache.maven.plugin.MojoExecutionException:
Error in ASM processing class org/apache/hive/druid/org/apache/calcite/runtime/SqlFunctions.class: Index 65536 out of bounds for length 334
…the default ISO-8859-1), otherwise we get "charset pollution" on char literals in explained plans (e.g. _UTF-16LE'1' instead of simply '1') due to "CALCITE-6006: RelToSqlConverter loses charset information" (introduced in 1.36)
… as part of the UnionRewritingPullProgram.

Due to "7fc3e1b Refactor: Add RelNode.stripped" (included in Calcite 1.35), which added a new check to the HepRelVertex constructor
(the inner rel cannot be another HepRelVertex), we cannot apply a HepProgram to a tree that already contains HepRelVertex nodes.
As a workaround, this change extends the MaterializedView* rules (as HiveMaterializedView*Rule), overrides rewriteQuery
(where the UnionRewritingPullProgram is called), and unwraps the HepRelVertex tree before calling super.rewriteQuery.
An auxiliary method for this unwrapping already existed (HiveCalciteUtil.stripHepVertices), so there is no need for a new
one inside HiveHepExtractRelNodeRule (which could in fact be removed).
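
The "override, unwrap, then delegate" shape of this workaround might look roughly like the sketch below. It is a toy model, not the Calcite or Hive code: `BaseRule`, `UnwrappingRule`, `Node` and `strip` are hypothetical names, and the `wrapped` flag merely stands in for a node sitting inside a HepRelVertex.

```java
// Toy model of "override, unwrap, then delegate"; every class and method name is hypothetical.
class RewriteOverrideSketch {

  /** Single-input plan node; `wrapped` marks a node that sits inside a planner vertex. */
  static final class Node {
    final String name;
    final boolean wrapped;
    final Node input; // null for a leaf

    Node(String name, boolean wrapped, Node input) {
      this.name = name;
      this.wrapped = wrapped;
      this.input = input;
    }
  }

  /** Returns a copy of the chain with every wrapper removed. */
  static Node strip(Node node) {
    return node == null ? null : new Node(node.name, false, strip(node.input));
  }

  /** Stand-in for the upstream rule: its rewrite step rejects wrapped nodes. */
  static class BaseRule {
    String rewriteQuery(Node plan) {
      for (Node n = plan; n != null; n = n.input) {
        if (n.wrapped) {
          throw new IllegalArgumentException("wrapped node: " + n.name);
        }
      }
      return "rewritten(" + plan.name + ")";
    }
  }

  /** Stand-in for the extended rule: unwrap first, then reuse the inherited logic. */
  static class UnwrappingRule extends BaseRule {
    @Override
    String rewriteQuery(Node plan) {
      return super.rewriteQuery(strip(plan));
    }
  }
}
```

The subclass changes nothing about the rewrite itself; it only guarantees that the inherited method sees an unwrapped tree.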
…avadoc "none of the relnode above aggregate refers to these [group] keys"

should not be checked via fieldsUsed.contains(aggregate.getGroupSet()) but rather via fieldsUsed.intersects(aggregate.getGroupSet()).
Also remove the unnecessary check aggregate.getIndicatorCount() > 0, which is always false (the method is deprecated and always returns zero).
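
The difference between the two checks shows up when only some of the group keys are referenced. The sketch below uses java.util.BitSet as a stand-in for Calcite's bit-set type; the helper names (`bits`, `containsAll`, `anyGroupKeyUsed`) are assumptions made for illustration only.

```java
import java.util.BitSet;

// Illustration with java.util.BitSet; helper names are hypothetical.
class GroupKeyCheckSketch {

  /** Builds a BitSet with the given bit indexes set. */
  static BitSet bits(int... indexes) {
    BitSet set = new BitSet();
    for (int index : indexes) {
      set.set(index);
    }
    return set;
  }

  /** "contains" semantics: true only if EVERY bit of `other` is also set in `set`. */
  static boolean containsAll(BitSet set, BitSet other) {
    BitSet missing = (BitSet) other.clone();
    missing.andNot(set);
    return missing.isEmpty();
  }

  /** "intersects" semantics: true if AT LEAST ONE group key is referenced. */
  static boolean anyGroupKeyUsed(BitSet fieldsUsed, BitSet groupSet) {
    return fieldsUsed.intersects(groupSet);
  }
}
```

With fieldsUsed = {0, 3} and groupSet = {0, 1}, `containsAll` is false even though key 0 is referenced from above, whereas `anyGroupKeyUsed` is true. Only the intersects form matches the javadoc condition that none of the nodes above the aggregate refers to the group keys.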
…n on Jenkins:

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-shade-plugin:3.6.0:shade (default) on project hive-druid-handler:
Error creating shaded jar: Problem shading JAR /home/jenkins/agent/workspace/hive-precommit_PR-6523/druid-handler/target/hive-druid-handler-4.3.0-SNAPSHOT.jar
entry org/apache/hive/druid/org/apache/calcite/runtime/SqlFunctions.class: org.apache.maven.plugin.MojoExecutionException:
Error in ASM processing class org/apache/hive/druid/org/apache/calcite/runtime/SqlFunctions.class: Index 65536 out of bounds for length 334
As a consequence of defining the Calcite default charset system property in a previous commit, RexLiteral#appendAsJava
(which already compared the literal charset against the default charset, the same check that CALCITE-6006 added to SqlImplementor in 1.36)
no longer prints the charset for literals that used to include it, e.g. _UTF-16LE'ten' => 'ten'.
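
The effect can be pictured with a small sketch of the "print the charset only when it differs from the default" rule. This is not the Calcite implementation: `render` is a hypothetical helper, and the quoting is simplified.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper mimicking the "literal charset vs default charset" comparison.
class CharLiteralSketch {

  /** Prefixes the quoted literal with _CHARSET only when it differs from the default. */
  static String render(String value, Charset literalCharset, Charset defaultCharset) {
    String quoted = "'" + value.replace("'", "''") + "'";
    if (literalCharset.equals(defaultCharset)) {
      return quoted;
    }
    return "_" + literalCharset.name() + quoted;
  }
}
```

With a UTF-16LE literal, `render("ten", StandardCharsets.UTF_16LE, StandardCharsets.ISO_8859_1)` yields `_UTF-16LE'ten'`, while passing UTF-16LE as the default yields plain `'ten'`.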
Due to "[CALCITE-5717] RelBuilder.project of literals on a single-row Aggregate should create a Values" (introduced in 1.35), cases like
TestMiniLlapLocalCliDriver with file explainuser_1.q failed with: "java.lang.UnsupportedOperationException: Values with non-empty tuples are not supported.
at org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter.convert(ASTConverter.java:264)".
The reason is that RelBuilder#project_ contains a simplification (only applied if config.simplifyValues is true):
"If the expressions are all literals, and the input is a Values with N rows [...], replace with a Values with same tuple N times".
CALCITE-5717 extended that simplification from "Values with N rows" to also cover "Aggregates with 1 row", which
would create a non-empty HiveValues; Hive does not support that.
The easiest way to prevent this is to disable simplifyValues in the HiveRelBuilder config.
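
The decision described above can be modeled in a few lines. The sketch is a toy, not RelBuilder code: the `Input` enum and the `project` method are hypothetical, and the strings "Values" and "Project" only name the kind of node that would be produced.

```java
// Toy decision model of the simplification; the enum and method are hypothetical.
class SimplifyValuesSketch {

  /** Kinds of input a projection of literals might sit on. */
  enum Input { VALUES, SINGLE_ROW_AGGREGATE, TABLE_SCAN }

  /** Returns which node kind the builder would create for the projection. */
  static String project(boolean simplifyValues, Input input, boolean allLiterals) {
    // After CALCITE-5717, a single-row Aggregate counts like a Values with a known row count.
    boolean knownRowCount = input == Input.VALUES || input == Input.SINGLE_ROW_AGGREGATE;
    if (simplifyValues && allLiterals && knownRowCount) {
      return "Values";
    }
    return "Project";
  }
}
```

With the flag enabled, a projection of literals over a single-row aggregate collapses into a Values; with the flag disabled, it stays a Project, which is the behavior the commit relies on.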