fix(sqlglot): use Athena dialect for awsathena parsing #36747

ankitajhanwar2001 · 2025-12-18T19:25:01Z

User description

SUMMARY

This PR updates the SQLGlot dialect mapping for AWS Athena from PRESTO to ATHENA.
Athena-specific SQL syntax (e.g., USING EXTERNAL FUNCTION) now parses correctly in Superset, preventing “Unable to parse SQL” errors for queries that are valid in Athena but not in Presto.

Related Issue

Fixes #36717

Changes

Updated SQLGlot dialect mapping for awsathena in superset/sql/parse.py
Added unit tests in tests/unit_tests/sql/parse_tests.py covering Athena-specific SQL parsing
Verified parsing does not fail for USING EXTERNAL FUNCTION constructs
Tested with local Superset Athena database connection

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

N/A

TESTING INSTRUCTIONS

Unit tests

pytest tests/unit_tests/sql/parse_tests.py

Local Superset setup with Athena

Configure Athena database connection locally in Superset.

Execute a query using USING EXTERNAL FUNCTION, e.g.:

USING EXTERNAL FUNCTION decrypt(data varbinary) RETURNS VARCHAR LAMBDA 'arn:aws:lambda:ap-south-1:123456789111:function:lambda-test' SELECT 1

Verify parsing succeeds and no Superset errors occur.

ADDITIONAL INFORMATION

Has associated issue: Athena SQL parsing fails due to Presto dialect mapping in SQLGlot #36717
Required feature flags:
Changes UI
Includes DB Migration (follow approval process in SIP-59)
- Migration is atomic, supports rollback & is backwards-compatible
- Confirm DB migration upgrade and downgrade tested
- Runtime estimates and downtime expectations provided
Introduces new feature or API
Removes existing feature or API

CodeAnt-AI Description

Use Athena dialect for awsathena so Athena-specific SQL parses correctly

What Changed

Switched the SQL parser mapping for engine "awsathena" from Presto to Athena so Athena-only syntax parses correctly
Added a unit test that verifies queries using "USING EXTERNAL FUNCTION" parse and format without errors

Impact

✅ Fewer Athena parse failures for valid queries
✅ Clearer parsing for queries using Athena-specific syntax
✅ Prevents "Unable to parse SQL" errors for awsathena connections

💡 Usage Guide

Checking Your Pull Request

Every time you make a pull request, our system automatically looks through it. We check for security issues, mistakes in how you're setting up your infrastructure, and common code problems. We do this to make sure your changes are solid and won't cause any trouble later.

Talking to CodeAnt AI

Got a question or need a hand with something in your pull request? You can easily get in touch with CodeAnt AI right here. Just type the following in a comment on your pull request, and replace "Your question here" with whatever you want to ask:

@codeant-ai ask: Your question here

This lets you have a chat with CodeAnt AI about your pull request, making it easier to understand and improve your code.

Example

@codeant-ai ask: Can you suggest a safer alternative to storing this secret?

Preserve Org Learnings with CodeAnt

You can record team preferences so CodeAnt AI applies them in future reviews. Reply directly to the specific CodeAnt AI suggestion (in the same thread) and replace "Your feedback here" with your input:

@codeant-ai: Your feedback here

This helps CodeAnt AI learn and adapt to your team's coding style and standards.

Example

@codeant-ai: Do not flag unused imports.

Retrigger review

Ask CodeAnt AI to review the PR again, by typing:

@codeant-ai: review

Check Your Repository Health

To analyze the health of your code repository, visit our dashboard at https://app.codeant.ai. This tool helps you identify potential issues and areas for improvement in your codebase, ensuring your repository maintains high standards of code health.

codeant-ai-for-open-source · 2025-12-18T19:25:07Z

CodeAnt AI is reviewing your PR.

Thanks for using CodeAnt! 🎉

We're free for open-source projects. if you're enjoying it, help us grow by sharing.

Share on X ·
Reddit ·
LinkedIn

bito-code-review · 2025-12-18T19:25:12Z

Code Review Agent Run #0b9f57

Actionable Suggestions - 0

Additional Suggestions - 1

tests/unit_tests/sql/parse_tests.py - 1
- Missing type annotation · Line 2894-2894
  
  The test function `test_awsathena_engine_mapping` is missing the return type annotation `-> None`, which is inconsistent with other test functions in the file (e.g., `test_remove_quotes() -> None`). This violates the project's typing standards requiring type hints for all new Python code.

Review Details

Files reviewed - 2 · Commit Range: 0c417be..0c417be
- superset/sql/parse.py
- tests/unit_tests/sql/parse_tests.py
Files skipped - 0
Tools
- Whispers (Secret Scanner) - ✔︎ Successful
- Detect-secrets (Secret Scanner) - ✔︎ Successful
- MyPy (Static Code Analysis) - ✔︎ Successful
- Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

/review - Manually triggers a full AI review.
/pause - Pauses automatic reviews on this pull request.
/resume - Resumes automatic reviews.
/resolve - Marks all Bito-posted review comments as resolved.
/abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Default Agent You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by

codeant-ai-for-open-source · 2025-12-18T19:27:28Z

Nitpicks 🔍

🔒 No security issues identified
⚡ Recommended areas for review Dialect availability Mapping `awsathena` to `Dialects.ATHENA` assumes the installed `sqlglot` version exposes `Dialects.ATHENA`. If a runtime environment uses an older `sqlglot` release that doesn't provide ATHENA, importing/using this mapping may cause runtime failures or unexpected fallbacks. Validate CI/test environments and consider a safe fallback. Parsing behavior changes Switching from PRESTO to ATHENA will change sqlglot parsing/AST for some queries. Ensure the unit tests cover edge cases (e.g., constructs that were previously accepted under PRESTO but not under ATHENA) and that downstream codepaths (formatting, transpile, optimize) behave correctly. Missing Assertion The new test `test_awsathena_engine_mapping` calls `statement.format()` but does not assert any outcome. This makes the test weaker — it will only fail on exceptions during parsing, and won't catch regressions where the wrong dialect is used but parsing still succeeds. Consider adding explicit assertions that validate the mapping or the formatted output. Dialect Mapping Verification The goal of the change is to ensure `awsathena` maps to the ATHENA SQLGlot dialect. The test should explicitly verify the mapping (e.g., `SQLGLOT_DIALECTS["awsathena"] == Dialects.ATHENA`) or assert properties of the formatted SQL that demonstrate Athena-specific syntax was handled. Otherwise the test may pass without actually validating the intended configuration. Engine name variants The key used is `awsathena`. Database engine identifiers sometimes vary (e.g. `athena`, `aws_athena`, `aws-athena`). If other parts of the codebase or user configs use a different identifier the mapping won't apply and parsing will revert to base behavior. Consider adding common aliases or normalizing engine names before lookup.

codeant-ai-for-open-source · 2025-12-18T19:27:50Z

superset/sql/parse.py

@@ -58,7 +58,7 @@
 SQLGLOT_DIALECTS = {
    "base": Dialects.DIALECT,
    "ascend": Dialects.HIVE,
-    "awsathena": Dialects.PRESTO,
+    "awsathena": Dialects.ATHENA,


Suggestion: Compatibility with older sqlglot versions: directly referencing Dialects.ATHENA may raise an AttributeError if the installed sqlglot version lacks the ATHENA dialect; use a safe fallback (e.g. PRESTO) via getattr to avoid runtime attribute errors and ensure parsing still proceeds. [possible bug]

Severity Level: Critical 🚨

Suggested change

"awsathena": Dialects.ATHENA,

"awsathena": getattr(Dialects, "ATHENA", Dialects.PRESTO),

Why it matters? ⭐

Using getattr(Dialects, "ATHENA", Dialects.PRESTO) guards against environments with older sqlglot that lack an ATHENA entry and avoids an import-time AttributeError. The fallback to PRESTO is a reasonable defensive choice when ATHENA isn't available.

Prompt for AI Agent 🤖

This is a comment left during a code review. **Path:** superset/sql/parse.py **Line:** 61:61 **Comment:** *Possible Bug: Compatibility with older sqlglot versions: directly referencing `Dialects.ATHENA` may raise an AttributeError if the installed sqlglot version lacks the ATHENA dialect; use a safe fallback (e.g. PRESTO) via getattr to avoid runtime attribute errors and ensure parsing still proceeds. Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.

codeant-ai-for-open-source · 2025-12-18T19:27:50Z

tests/unit_tests/sql/parse_tests.py

+    )
+    statement = SQLStatement(sql, engine="awsathena")
+
+    # Should parse without errors using Athena dialect


Suggestion: The test calls statement.format() but contains no assertion that the parsed statement is a SELECT (or otherwise valid); add an explicit assertion like assert statement.is_select() to ensure the SQL is parsed as the expected statement type rather than silently passing without verification. [logic error]

Severity Level: Minor ⚠️

Suggested change

# Should parse without errors using Athena dialect

# Ensure the parsed statement is a SELECT (verifies parsing result)

assert statement.is_select()

Why it matters? ⭐

Reasonable and useful: asserting statement.is_select() makes the test explicit
about the expected kind of parsed AST instead of silently passing because
format() didn't raise. It's a low-risk, high-signal assertion that improves test quality.

Prompt for AI Agent 🤖

This is a comment left during a code review. **Path:** tests/unit_tests/sql/parse_tests.py **Line:** 2904:2904 **Comment:** *Logic Error: The test calls `statement.format()` but contains no assertion that the parsed statement is a SELECT (or otherwise valid); add an explicit assertion like `assert statement.is_select()` to ensure the SQL is parsed as the expected statement type rather than silently passing without verification. Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.

codeant-ai-for-open-source · 2025-12-18T19:27:55Z

CodeAnt AI finished reviewing your PR.

fix(sqlglot): use Athena dialect for awsathena parsing

0c417be

pull-request-size bot added the size/S label Dec 18, 2025

codeant-ai-for-open-source bot added the size:S This PR changes 10-29 lines, ignoring generated files label Dec 18, 2025

dosubot bot added the change:backend Requires changing the backend label Dec 18, 2025

codeant-ai-for-open-source bot reviewed Dec 18, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(sqlglot): use Athena dialect for awsathena parsing #36747

fix(sqlglot): use Athena dialect for awsathena parsing #36747

ankitajhanwar2001 commented Dec 18, 2025 •

edited by codeant-ai-for-open-source bot

Loading

Uh oh!

codeant-ai-for-open-source bot commented Dec 18, 2025

Uh oh!

bito-code-review bot commented Dec 18, 2025 •

edited

Loading

Code Review Agent Run #0b9f57

Uh oh!

codeant-ai-for-open-source bot commented Dec 18, 2025

Uh oh!

codeant-ai-for-open-source bot Dec 18, 2025

Uh oh!

codeant-ai-for-open-source bot Dec 18, 2025

Uh oh!

codeant-ai-for-open-source bot commented Dec 18, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

	"awsathena": Dialects.ATHENA,
	"awsathena": getattr(Dialects, "ATHENA", Dialects.PRESTO),

-    # Should parse without errors using Athena dialect
+    # Ensure the parsed statement is a SELECT (verifies parsing result)
+    assert statement.is_select()

fix(sqlglot): use Athena dialect for awsathena parsing #36747

Are you sure you want to change the base?

fix(sqlglot): use Athena dialect for awsathena parsing #36747

Conversation

ankitajhanwar2001 commented Dec 18, 2025 • edited by codeant-ai-for-open-source bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

User description

SUMMARY

Related Issue

Changes

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

CodeAnt-AI Description

What Changed

Impact

Checking Your Pull Request

Talking to CodeAnt AI

Example

Preserve Org Learnings with CodeAnt

Example

Retrigger review

Check Your Repository Health

Uh oh!

codeant-ai-for-open-source bot commented Dec 18, 2025

Thanks for using CodeAnt! 🎉

Uh oh!

bito-code-review bot commented Dec 18, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review Agent Run #0b9f57

Uh oh!

codeant-ai-for-open-source bot commented Dec 18, 2025

Nitpicks 🔍

Uh oh!

codeant-ai-for-open-source bot Dec 18, 2025

Choose a reason for hiding this comment

Uh oh!

codeant-ai-for-open-source bot Dec 18, 2025

Choose a reason for hiding this comment

Uh oh!

codeant-ai-for-open-source bot commented Dec 18, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

ankitajhanwar2001 commented Dec 18, 2025 •

edited by codeant-ai-for-open-source bot

Loading

bito-code-review bot commented Dec 18, 2025 •

edited

Loading