use EXECUTE IMMEDIATE in snowflake clean test tables macro to avoid d…#1021
use EXECUTE IMMEDIATE in snowflake clean test tables macro to avoid d…#1021GuyEshdat wants to merge 2 commits into
Conversation
…bt-snowflake splitting the multi-statement query into individual round-trips, which significantly increases on-run-end duration
|
👋 @GuyEshdat |
📝 WalkthroughWalkthroughAdds adapter-dispatched stale-table discovery macros, parameterized cleanup entrypoints that accept discovered relations, a Snowflake wrapper that bundles DROP statements into a single EXECUTE IMMEDIATE block, and integration tests exercising stale-table cleanup. ChangesStale test table discovery & cleanup
🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@macros/edr/tests/test_utils/clean_elementary_test_tables.sql`:
- Line 45: The SQL file has formatting drift causing sqlfmt to modify it; run
the SQL formatter and commit the formatted file so pre-commit passes.
Specifically reformat the line that sets drop_queries (the call to
elementary.get_transactionless_clean_elementary_test_tables_queries with
test_table_relations) using your project's sqlfmt configuration, verify no other
formatting changes are pending, and commit the updated file.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 99b23ee7-304e-42e7-bbeb-dfebb7f6daf0
📒 Files selected for processing (1)
macros/edr/tests/test_utils/clean_elementary_test_tables.sql
dfaee3d to
e86ce47
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@macros/edr/tests/test_utils/clean_elementary_test_tables.sql`:
- Around line 39-54: The macro is passing fully-qualified table name strings
(returned by elementary.get_stale_test_tables) directly to
elementary.clean_elementary_test_tables, but downstream cleanup expects Relation
objects; convert/normalize each entry in the tables list into a Relation with
database/schema/identifier attributes (e.g., via the adapter's relation-creation
utility or elementary's relation helper) before calling
elementary.clean_elementary_test_tables so adapters like Athena receive
relations with .database/.schema/.identifier instead of plain strings (update
the tables variable transformation in this macro that uses
elementary.get_stale_test_tables and ensure
elementary.clean_elementary_test_tables receives the Relation list).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9b1c92a7-7329-46d9-9281-931b0e92e575
📒 Files selected for processing (5)
integration_tests/dbt_project/macros/test_cleanup_stale_test_tables.sqlintegration_tests/tests/test_cleanup_stale_test_tables.pymacros/edr/system/system_utils/clean_elementary_temp_tables.sqlmacros/edr/tests/test_utils/clean_elementary_test_tables.sqlmacros/edr/tests/test_utils/get_stale_test_tables.sql
| {% set tables = elementary.get_stale_test_tables( | ||
| elementary_database, elementary_schema, hours, table_name_pattern | ||
| ) %} | ||
| {% if tables | length == 0 %} | ||
| {% do elementary.edr_log( | ||
| "No temp tables older than " | ||
| ~ hours | ||
| ~ " hours found - nothing to clean up." | ||
| ) %} | ||
| {% else %} | ||
| {% do elementary.edr_log( | ||
| "Dropping " ~ tables | ||
| | length ~ " temp tables older than " ~ hours ~ " hours." | ||
| ) %} | ||
| {% do elementary.clean_elementary_test_tables(tables) %} | ||
| {% do elementary.edr_log("Done.") %} |
There was a problem hiding this comment.
Normalize discovered stale tables to Relation objects before cleanup.
elementary.get_stale_test_tables returns fully-qualified strings, but this macro forwards them as test_table_relations. Downstream adapter paths expect relation attributes (for example, Athena cleanup/query macros use .database/.schema/.identifier), so this can produce invalid DROP statements or adapter cleanup failures on non-Snowflake adapters.
Suggested direction
{% macro cleanup_stale_test_tables(hours=24) %}
@@
- {% set tables = elementary.get_stale_test_tables(
+ {% set tables = elementary.get_stale_test_tables(
elementary_database, elementary_schema, hours, table_name_pattern
) %}
+ {% set table_relations = [] %}
+ {% for table_name in tables %}
+ {% set parts = table_name.split('.') %}
+ {% if parts | length == 3 %}
+ {% do table_relations.append(
+ api.Relation.create(
+ database=parts[0], schema=parts[1], identifier=parts[2]
+ )
+ ) %}
+ {% elif parts | length == 2 %}
+ {% do table_relations.append(
+ api.Relation.create(schema=parts[0], identifier=parts[1])
+ ) %}
+ {% else %}
+ {% do exceptions.raise_compiler_error(
+ "Unexpected relation format from get_stale_test_tables: " ~ table_name
+ ) %}
+ {% endif %}
+ {% endfor %}
@@
- {% do elementary.clean_elementary_test_tables(tables) %}
+ {% do elementary.clean_elementary_test_tables(table_relations) %}
{% do elementary.edr_log("Done.") %}
{% endif %}
{% endmacro %}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@macros/edr/tests/test_utils/clean_elementary_test_tables.sql` around lines 39
- 54, The macro is passing fully-qualified table name strings (returned by
elementary.get_stale_test_tables) directly to
elementary.clean_elementary_test_tables, but downstream cleanup expects Relation
objects; convert/normalize each entry in the tables list into a Relation with
database/schema/identifier attributes (e.g., via the adapter's relation-creation
utility or elementary's relation helper) before calling
elementary.clean_elementary_test_tables so adapters like Athena receive
relations with .database/.schema/.identifier instead of plain strings (update
the tables variable transformation in this macro that uses
elementary.get_stale_test_tables and ensure
elementary.clean_elementary_test_tables receives the Relation list).
Problem
On Snowflake, the `clean_elementary_test_tables` on-run-end hook was executing each `DROP TABLE` as a separate round-trip to the warehouse. With many `elementary_source_schema_changes` tests, this adds up significantly (e.g. 74 tables × ~200ms = ~15–20s of pure overhead per run).
The root cause is in dbt-snowflake's `SnowflakeConnectionManager.execute()`, which calls `_split_queries` to split any SQL string at semicolons before execution — so even though `get_transaction_clean_elementary_test_tables_queries` returns a single `BEGIN TRANSACTION; ...; COMMIT;` string, it gets split into N+2 individual `cursor.execute()` calls.
Fix
Add a `snowflake__get_clean_elementary_test_tables_queries` override that wraps all drops in an `EXECUTE IMMEDIATE $$ BEGIN...END $$` Snowflake Scripting block. From the connector's perspective this is a single SQL statement, so `_split_queries` does not split it — all drops execute server-side in one round-trip.
The `$$`-quoting is respected by `snowflake-connector-python`'s `split_statements` parser, which tracks `in_double_dollars` state and skips semicolon detection inside `$$...$$` blocks.
Summary by CodeRabbit
New Features
Improvements
Tests