Skip to content

use EXECUTE IMMEDIATE in snowflake clean test tables macro to avoid d…#1021

Open
GuyEshdat wants to merge 2 commits into
masterfrom
avoid-query-split-for-snowflake-test-tables-cleanup
Open

use EXECUTE IMMEDIATE in snowflake clean test tables macro to avoid d…#1021
GuyEshdat wants to merge 2 commits into
masterfrom
avoid-query-split-for-snowflake-test-tables-cleanup

Conversation

@GuyEshdat
Copy link
Copy Markdown
Collaborator

@GuyEshdat GuyEshdat commented Jun 4, 2026

Problem

On Snowflake, the `clean_elementary_test_tables` on-run-end hook was executing each `DROP TABLE` as a separate round-trip to the warehouse. With many `elementary_source_schema_changes` tests, this adds up significantly (e.g. 74 tables × ~200ms = ~15–20s of pure overhead per run).

The root cause is in dbt-snowflake's `SnowflakeConnectionManager.execute()`, which calls `_split_queries` to split any SQL string at semicolons before execution — so even though `get_transaction_clean_elementary_test_tables_queries` returns a single `BEGIN TRANSACTION; ...; COMMIT;` string, it gets split into N+2 individual `cursor.execute()` calls.

Fix

Add a `snowflake__get_clean_elementary_test_tables_queries` override that wraps all drops in an `EXECUTE IMMEDIATE $$ BEGIN...END $$` Snowflake Scripting block. From the connector's perspective this is a single SQL statement, so `_split_queries` does not split it — all drops execute server-side in one round-trip.

The `$$`-quoting is respected by `snowflake-connector-python`'s `split_statements` parser, which tracks `in_double_dollars` state and skips semicolon detection inside `$$...$$` blocks.

Summary by CodeRabbit

  • New Features

    • Added a new entry-point to clean only the current run's temporary test tables.
    • Added a stale-test-table cleanup command that detects and removes old temporary tables across adapters.
  • Improvements

    • Per-adapter logic to identify stale test tables (age-aware where supported).
    • Snowflake optimization: batches transactionless DROPs into a single execution block for efficient teardown.
  • Tests

    • Integration test verifying stale-table detection and full cleanup.

…bt-snowflake splitting the multi-statement query into individual round-trips, which significantly increases on-run-end duration
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 4, 2026

👋 @GuyEshdat
Thank you for raising your pull request.
Please make sure to add tests and document all user-facing changes.
You can do this by editing the docs files in the elementary repository.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Jun 4, 2026

Review Change Stack

📝 Walkthrough

Walkthrough

Adds adapter-dispatched stale-table discovery macros, parameterized cleanup entrypoints that accept discovered relations, a Snowflake wrapper that bundles DROP statements into a single EXECUTE IMMEDIATE block, and integration tests exercising stale-table cleanup.

Changes

Stale test table discovery & cleanup

Layer / File(s) Summary
Stale table discovery (adapters)
macros/edr/tests/test_utils/get_stale_test_tables.sql
Dispatcher get_stale_test_tables and adapter implementations (Snowflake, BigQuery, Redshift, Postgres, Databricks, Spark, fabricspark, ClickHouse, Athena, Trino, DuckDB, Dremio) to return matching stale temp tables with optional age filtering.
Parameterized cleanup macros & Snowflake wrapper
macros/edr/tests/test_utils/clean_elementary_test_tables.sql, macros/edr/system/system_utils/clean_elementary_temp_tables.sql
Renames entrypoint to clean_current_invocation_test_tables(), adds clean_elementary_test_tables(test_table_relations), cleanup_stale_test_tables(hours=24), and snowflake__get_clean_elementary_test_tables_queries that returns a single EXECUTE IMMEDIATE $$ BEGIN ... END; $$ query for Snowflake. Also updates system util to call current-invocation cleaner.
Integration tests for cleanup
integration_tests/dbt_project/macros/test_cleanup_stale_test_tables.sql, integration_tests/tests/test_cleanup_stale_test_tables.py
DBT macro creates fake temp tables and invokes elementary.cleanup_stale_test_tables(hours=0); pytest runs the macro and asserts tables exist before and are removed after cleanup.

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • haritamar

"I hopped and I sniffed through SQL and logs,
Found dusty tables hidden in bogs.
One tidy drop wrapped neat and tight,
I nudged them gone in the moonlit night.
🐇✨"

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main technical change: using EXECUTE IMMEDIATE in Snowflake to optimize the clean test tables macro, which is the core fix addressing the performance overhead problem detailed in the PR objectives.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch avoid-query-split-for-snowflake-test-tables-cleanup

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@macros/edr/tests/test_utils/clean_elementary_test_tables.sql`:
- Line 45: The SQL file has formatting drift causing sqlfmt to modify it; run
the SQL formatter and commit the formatted file so pre-commit passes.
Specifically reformat the line that sets drop_queries (the call to
elementary.get_transactionless_clean_elementary_test_tables_queries with
test_table_relations) using your project's sqlfmt configuration, verify no other
formatting changes are pending, and commit the updated file.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 99b23ee7-304e-42e7-bbeb-dfebb7f6daf0

📥 Commits

Reviewing files that changed from the base of the PR and between 3dc104a and e8320c2.

📒 Files selected for processing (1)
  • macros/edr/tests/test_utils/clean_elementary_test_tables.sql

Comment thread macros/edr/tests/test_utils/clean_elementary_test_tables.sql Outdated
@GuyEshdat GuyEshdat force-pushed the avoid-query-split-for-snowflake-test-tables-cleanup branch from dfaee3d to e86ce47 Compare June 4, 2026 16:44
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@macros/edr/tests/test_utils/clean_elementary_test_tables.sql`:
- Around line 39-54: The macro is passing fully-qualified table name strings
(returned by elementary.get_stale_test_tables) directly to
elementary.clean_elementary_test_tables, but downstream cleanup expects Relation
objects; convert/normalize each entry in the tables list into a Relation with
database/schema/identifier attributes (e.g., via the adapter's relation-creation
utility or elementary's relation helper) before calling
elementary.clean_elementary_test_tables so adapters like Athena receive
relations with .database/.schema/.identifier instead of plain strings (update
the tables variable transformation in this macro that uses
elementary.get_stale_test_tables and ensure
elementary.clean_elementary_test_tables receives the Relation list).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9b1c92a7-7329-46d9-9281-931b0e92e575

📥 Commits

Reviewing files that changed from the base of the PR and between e86ce47 and dfaee3d.

📒 Files selected for processing (5)
  • integration_tests/dbt_project/macros/test_cleanup_stale_test_tables.sql
  • integration_tests/tests/test_cleanup_stale_test_tables.py
  • macros/edr/system/system_utils/clean_elementary_temp_tables.sql
  • macros/edr/tests/test_utils/clean_elementary_test_tables.sql
  • macros/edr/tests/test_utils/get_stale_test_tables.sql

Comment on lines +39 to +54
{% set tables = elementary.get_stale_test_tables(
elementary_database, elementary_schema, hours, table_name_pattern
) %}
{% if tables | length == 0 %}
{% do elementary.edr_log(
"No temp tables older than "
~ hours
~ " hours found - nothing to clean up."
) %}
{% else %}
{% do elementary.edr_log(
"Dropping " ~ tables
| length ~ " temp tables older than " ~ hours ~ " hours."
) %}
{% do elementary.clean_elementary_test_tables(tables) %}
{% do elementary.edr_log("Done.") %}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Normalize discovered stale tables to Relation objects before cleanup.

elementary.get_stale_test_tables returns fully-qualified strings, but this macro forwards them as test_table_relations. Downstream adapter paths expect relation attributes (for example, Athena cleanup/query macros use .database/.schema/.identifier), so this can produce invalid DROP statements or adapter cleanup failures on non-Snowflake adapters.

Suggested direction
 {% macro cleanup_stale_test_tables(hours=24) %}
@@
-    {% set tables = elementary.get_stale_test_tables(
+    {% set tables = elementary.get_stale_test_tables(
         elementary_database, elementary_schema, hours, table_name_pattern
     ) %}
+    {% set table_relations = [] %}
+    {% for table_name in tables %}
+        {% set parts = table_name.split('.') %}
+        {% if parts | length == 3 %}
+            {% do table_relations.append(
+                api.Relation.create(
+                    database=parts[0], schema=parts[1], identifier=parts[2]
+                )
+            ) %}
+        {% elif parts | length == 2 %}
+            {% do table_relations.append(
+                api.Relation.create(schema=parts[0], identifier=parts[1])
+            ) %}
+        {% else %}
+            {% do exceptions.raise_compiler_error(
+                "Unexpected relation format from get_stale_test_tables: " ~ table_name
+            ) %}
+        {% endif %}
+    {% endfor %}
@@
-        {% do elementary.clean_elementary_test_tables(tables) %}
+        {% do elementary.clean_elementary_test_tables(table_relations) %}
         {% do elementary.edr_log("Done.") %}
     {% endif %}
 {% endmacro %}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@macros/edr/tests/test_utils/clean_elementary_test_tables.sql` around lines 39
- 54, The macro is passing fully-qualified table name strings (returned by
elementary.get_stale_test_tables) directly to
elementary.clean_elementary_test_tables, but downstream cleanup expects Relation
objects; convert/normalize each entry in the tables list into a Relation with
database/schema/identifier attributes (e.g., via the adapter's relation-creation
utility or elementary's relation helper) before calling
elementary.clean_elementary_test_tables so adapters like Athena receive
relations with .database/.schema/.identifier instead of plain strings (update
the tables variable transformation in this macro that uses
elementary.get_stale_test_tables and ensure
elementary.clean_elementary_test_tables receives the Relation list).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants