feat: [ANSI] ANSI SQL error messages #3580
parthchandra wants to merge 2 commits into apache:main from
Conversation
parthchandra (Contributor, Author):

@coderfender, fyi
Contributor:

Thank you @parthchandra. This is awesome
andygrove reviewed Feb 23, 2026
andygrove reviewed Feb 23, 2026

spark/src/main/spark-4.0/org/apache/spark/sql/comet/shims/ShimSparkErrorConverter.scala
andygrove reviewed Feb 23, 2026
parthchandra (Contributor, Author):

Changed to draft to figure out backward compatibility
parthchandra (Contributor, Author):

@andygrove @coderfender This is ready for review. I'll rebase after the review.
andygrove reviewed Mar 2, 2026
parthchandra (Contributor, Author):

Also fixed another failing test.
andygrove reviewed Mar 3, 2026

common/src/main/java/org/apache/comet/exceptions/CometQueryExecutionException.java
andygrove reviewed Mar 3, 2026
andygrove approved these changes Mar 4, 2026

andygrove (Member):

LGTM pending CI. Thanks for the huge effort with this one @parthchandra!
parthchandra (Contributor, Author):

Thank you Andy. Will wait for #3559 before rebasing.
Which issue does this PR close?
Closes parts of #551
Closes #2215
Closes #3375
Rationale for this change
With Spark 4.0 (and Spark 3.5 with ANSI mode enabled), Spark produces ANSI-compliant error messages that carry an error code and, in many cases, include the original SQL query. When we encounter errors in native code, Comet throws a `SparkException` or `CometNativeException` that does not conform to this error reporting standard.
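For illustration, a Spark ANSI error message is roughly of the form below (exact wording and layout vary by Spark version; this example is reconstructed, not taken from the PR):

```
[DIVIDE_BY_ZERO] Division by zero. Use `try_divide` to tolerate divisor being 0
and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false"
to bypass this error. SQLSTATE: 22012
== SQL (line 1, position 8) ==
SELECT a / 0 FROM t
       ^^^^^
```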
What changes are included in this PR?
This PR introduces a framework to report ANSI-compliant error messages from native code.
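The native side of the framework can be sketched as follows. This is an illustrative Rust sketch, not Comet's actual code: the type and method names (`SparkErrorWithContext`, `to_json()`, `exception_class()`) follow the description in this PR, but the fields, variants, and JSON shape shown here are assumptions.

```rust
// Hypothetical sketch of the native error types; field names and JSON
// layout are assumptions, not Comet's actual implementation.

/// Query context extracted from the expression's origin and registered
/// in the QueryContextMap keyed by expr_id.
#[allow(dead_code)]
#[derive(Debug, Clone)]
pub struct QueryContext {
    pub object_name: String,
    pub start_line: u32,
    pub start_column: u32,
    pub sql_text: String,
}

/// A couple of representative variants; the real enum covers the
/// Spark ANSI errors from QueryExecutionErrors.scala.
#[allow(dead_code)]
#[derive(Debug)]
pub enum SparkError {
    DivideByZero,
    ArithmeticOverflow { message: String },
}

impl SparkError {
    /// Java exception class the JVM side should instantiate.
    pub fn exception_class(&self) -> &'static str {
        "org.apache.spark.SparkArithmeticException"
    }

    /// ANSI error class name, matching Spark's error conditions.
    pub fn error_class(&self) -> &'static str {
        match self {
            SparkError::DivideByZero => "DIVIDE_BY_ZERO",
            SparkError::ArithmeticOverflow { .. } => "ARITHMETIC_OVERFLOW",
        }
    }
}

/// Wraps a SparkError with an optional QueryContext, as described in the PR.
pub struct SparkErrorWithContext {
    pub error: SparkError,
    pub context: Option<QueryContext>,
}

impl SparkErrorWithContext {
    /// Serialize for the JNI boundary; hand-rolled JSON here for brevity
    /// (no escaping of the SQL text) -- real code would use serde.
    pub fn to_json(&self) -> String {
        let ctx = match &self.context {
            Some(c) => format!(
                "{{\"sqlText\":\"{}\",\"line\":{},\"column\":{}}}",
                c.sql_text, c.start_line, c.start_column
            ),
            None => "null".to_string(),
        };
        format!(
            "{{\"errorClass\":\"{}\",\"exceptionClass\":\"{}\",\"context\":{}}}",
            self.error.error_class(),
            self.error.exception_class(),
            ctx
        )
    }
}
```

On the JVM side, the JSON message carried by `CometQueryExecutionException` would be parsed and converted back into the matching Spark exception class.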
Summary of error propagation:

- An `expr_id` is generated. If the expression's origin carries a `QueryContext` (SQL text, line, column, object name), it is extracted and attached to the protobuf. This is done for both `Expr` and `AggExpr`.
- `QueryContextMap`: when planning `Expr` and `AggExpr` nodes, if `expr_id` and `query_context` are present, the context is registered in the map. When creating physical expressions for Cast, CheckOverflow, ListExtract, SumDecimal, AvgDecimal, and arithmetic binary expressions, the relevant `QueryContext` is looked up and passed to the constructor.
- The `SparkError` enum is extended with new variants covering all the Spark ANSI errors (from https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala). A new `SparkErrorWithContext` type wraps a `SparkError` with a `QueryContext`. All affected expression implementations look up the context and produce a `SparkErrorWithContext` when available.
- The `SparkError` implementation also has new `to_json()` and `exception_class()` methods for JNI serialization. The `throw_exception` function now checks for `SparkErrorWithContext` or `SparkError` and throws `CometQueryExecutionException`, which carries the entire `SparkErrorWithContext` as a JSON message. On the Scala side, `CometExecIterator` catches this exception and calls `SparkErrorConverter.convertToSparkException()` to convert it to the appropriate Spark exception. If the JSON message contains the `QueryContext`, the exception will contain the query; otherwise it will not.

Notes:

- Not all expressions have been updated; only the expressions that failed unit tests as a result of incorrect error messages (Cast, CheckOverflow, ListExtract, SumDecimal, AvgDecimal, and binary arithmetic expressions). Binary arithmetic expressions are now represented as `CheckedBinaryExpr`, which also includes the query context.
- Most errors in `QueryExecutionErrors` are reproduced as-is on the native side. However, some errors, such as `INTERVAL_ARITHMETIC_OVERFLOW`, have one version with a user suggestion and one without; in such cases there are two variants on the native side.

How are these changes tested?
New unit tests, plus the previously failing tests listed in #551, #2215, and #3375.
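The two-variant pattern mentioned in the notes above (an error class that may or may not carry a user suggestion) could look like this on the native side. The variant names and message wording here are illustrative assumptions, not Comet's actual definitions:

```rust
// Hypothetical sketch: two native variants for one Spark error class,
// one carrying a suggested try_* function and one without.
#[derive(Debug)]
pub enum SparkError {
    /// Overflow message that suggests an alternative try_* function.
    IntervalArithmeticOverflowWithSuggestion { function_name: String },
    /// Same error class, but with no alternative function to suggest.
    IntervalArithmeticOverflow,
}

impl SparkError {
    pub fn message(&self) -> String {
        match self {
            SparkError::IntervalArithmeticOverflowWithSuggestion { function_name } => {
                format!(
                    "[INTERVAL_ARITHMETIC_OVERFLOW] Interval arithmetic overflow. \
                     Use {} to tolerate overflow and return NULL instead.",
                    function_name
                )
            }
            SparkError::IntervalArithmeticOverflow => {
                "[INTERVAL_ARITHMETIC_OVERFLOW] Interval arithmetic overflow.".to_string()
            }
        }
    }
}
```

Modeling the two message shapes as separate variants keeps each message template total (no optional fields to check at formatting time).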
This PR was produced with the generous assistance of Claude Code