
[SPARK-55719][SQL] Remove deprecation warning for spark.sql.hive.convertCTAS #54521

Open
allisonwang-db wants to merge 1 commit into apache:master from allisonwang-db:remove-convertctas-deprecation

@allisonwang-db
Contributor

What changes were proposed in this pull request?

This PR removes the spark.sql.hive.convertCTAS configuration from deprecatedSQLConfigs in
SQLConf.scala.
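
For readers unfamiliar with this kind of change, here is a minimal stand-alone sketch of what removing an entry from a deprecation registry amounts to. It is not the actual contents of SQLConf.scala: the fields of `DeprecatedConfig`, the placeholder version `x.y`, the comment strings, the warning wording, and the key `spark.sql.some.otherConfig` are all assumptions made for illustration.

```scala
// Simplified stand-alone model; NOT the actual contents of SQLConf.scala.
// Field names, versions, and comments below are illustrative assumptions.
final case class DeprecatedConfig(key: String, version: String, comment: String)

object DeprecationSketch {
  // Hypothetical registry before the change, keyed by config name.
  val before: Map[String, DeprecatedConfig] = Seq(
    DeprecatedConfig("spark.sql.hive.convertCTAS", "x.y", "Illustrative: use another config instead."),
    // Made-up key standing in for entries that remain deprecated.
    DeprecatedConfig("spark.sql.some.otherConfig", "x.y", "Illustrative entry that stays deprecated.")
  ).map(c => c.key -> c).toMap

  // The change amounts to dropping one entry from the registry.
  val after: Map[String, DeprecatedConfig] = before - "spark.sql.hive.convertCTAS"

  // Returns the warning a user would see when setting `key`, if the key is registered.
  def warningFor(registry: Map[String, DeprecatedConfig], key: String): Option[String] =
    registry.get(key).map { c =>
      s"The SQL config '${c.key}' has been deprecated in v${c.version}. ${c.comment}"
    }
}
```

In this model, looking up `spark.sql.hive.convertCTAS` in `before` yields a warning, while looking it up in `after` yields `None`; the other entry is unaffected.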

Why are the changes needed?

The spark.sql.hive.convertCTAS configuration is actually useful and continues to serve a purpose in the codebase. Removing the deprecation warning prevents unnecessary confusion for users who rely on this configuration.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Existing UTs.

Was this patch authored or co-authored using generative AI tooling?

No

@allisonwang-db
Contributor Author

cc @cloud-fan

Member

@szehon-ho left a comment

Initially I was wondering why not just use the suggestion (LEGACY_CREATE_HIVE_TABLE_BY_DEFAULT_KEY), but that flag seems to be a bit broader than CONVERT_CTAS?

Maybe we could clarify the doc for CONVERT_CTAS so as not to lose this correlation, but otherwise this makes sense to me.

Contributor

Yeah, these two configs have different behaviors, and users can't simply use the new one to replace CONVERT_CTAS.

Member

@dongjoon-hyun left a comment

Hi, @allisonwang-db, @szehon-ho, @cloud-fan.

Given the long history of this config and the related area, this has been one of Apache Spark SQL's development directions rather than a description of the current status. I don't believe this is as simple an issue as this PR claims. I'd recommend keeping the existing long-standing decision until we reach a new, clear consensus on the future development direction.

@dongjoon-hyun
Member

dongjoon-hyun commented Mar 2, 2026

Simply put, we need to discuss this officially on the dev mailing list. I'd recommend sending an email announcing that you want to decide that the Apache Spark community no longer aims to deprecate spark.sql.hive.convertCTAS, from Apache Spark 4.2.0 onward. Then we can make this a part of the official Spark 4.2.0, @allisonwang-db.

"The spark.sql.hive.convertCTAS configuration is actually useful and continues to serve a purpose in the codebase. Removing the deprecation warning prevents unnecessary confusion for users who rely on this configuration."

When you send the email, it would be great if you could provide more concrete background (or examples) than the above.

Also, cc @huaxingao as the release manager of Apache Spark 4.2.0.

Contributor

This deprecation message is simply wrong, as these two configs have different behaviors and we have to use CONVERT_CTAS in our workloads to retain legacy behaviors.

To be more concrete, CONVERT_CTAS only affects CTAS, making it create data source tables instead of Hive tables, while LEGACY_CREATE_HIVE_TABLE_BY_DEFAULT affects all table creation cases.

To keep the legacy behavior in which only CTAS converts Hive tables to data source tables, CONVERT_CTAS is the only option.
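
The difference in scope between the two configs, as described in this comment, can be sketched with a small stand-alone model. This is not Spark's actual resolution code: `defaultProvider`, its parameters, and the `Provider` type are hypothetical names, and the model only covers CREATE TABLE statements with no explicit USING or STORED AS clause.

```scala
// Simplified model of the behavior described in this thread;
// NOT Spark's actual table-resolution code. All names are hypothetical.
object TableProviderSketch {
  sealed trait Provider
  case object HiveTable extends Provider
  case object DataSourceTable extends Provider

  // Picks the provider for a CREATE TABLE statement that names no provider.
  //   isCtas                   - the statement is CREATE TABLE ... AS SELECT
  //   convertCtas              - stands in for CONVERT_CTAS
  //   createHiveTableByDefault - stands in for LEGACY_CREATE_HIVE_TABLE_BY_DEFAULT
  def defaultProvider(
      isCtas: Boolean,
      convertCtas: Boolean,
      createHiveTableByDefault: Boolean): Provider =
    if (!createHiveTableByDefault) DataSourceTable   // broad: applies to every table creation
    else if (isCtas && convertCtas) DataSourceTable  // narrow: applies to CTAS only
    else HiveTable
}
```

Under these assumptions, the combination "plain CREATE TABLE stays a Hive table, CTAS becomes a data source table" is only reachable with `createHiveTableByDefault = true` and `convertCtas = true`, which is the case the comment says cannot be expressed with the broader flag alone.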

Contributor

@dongjoon-hyun do we have to make things complicated? This is only removing a deprecation message after all, and I believe most people on the dev list don't care; it would waste people's attention. With such a high bar, almost all PRs would need to go to the dev list for confirmation.

Contributor

also cc @huaxingao as the 4.2.0 release manager.
