Skip to content

added a macro that cleans stale test temp tables. Will be usefull for…#1022

Open
GuyEshdat wants to merge 3 commits into
masterfrom
delete-stale-test-tables
Open

added a macro that cleans stale test temp tables. Will be usefull for…#1022
GuyEshdat wants to merge 3 commits into
masterfrom
delete-stale-test-tables

Conversation

@GuyEshdat
Copy link
Copy Markdown
Collaborator

@GuyEshdat GuyEshdat commented Jun 4, 2026

… when the cleanup process takes too long in the on_run_end, and then a solution can be to set clean_elementary_temp_tables to false and use this macro in a dedicated job instead

Summary by CodeRabbit

  • New Features

    • Added configurable stale temporary test-table cleanup (default 24h) with per-adapter support and an explicit operation to remove stale temp tables.
    • Added an operation to clean current-invocation test tables separately from stale-table cleanup.
  • Tests

    • Added an integration test that creates temp tables, runs cleanup, and verifies before/after counts.
  • Chores

    • CI test run restricted to the new cleanup test during the workflow.

… when the cleanup process takes too long in the on_run_end, and then a solution can be to set clean_elementary_temp_tables to false and use this macro in a dedicated job instead
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Jun 4, 2026

Review Change Stack

📝 Walkthrough

Walkthrough

Adds adapter-dispatched stale-table discovery, refactors cleanup macros into current-invocation and parameterized cleanup, updates the system caller to the new entrypoint, and adds integration macros and a Python test that create and remove temporary test tables and report before/after counts.

Changes

Stale Test Table Cleanup Across Adapters

Layer / File(s) Summary
Adapter-specific stale table discovery
macros/edr/tests/test_utils/get_stale_test_tables.sql
New get_stale_test_tables adapter-dispatch macro plus per-adapter implementations; some adapters apply time-based filtering (Snowflake, BigQuery, Databricks, ClickHouse), others log warnings and return matches without age filtering.
Cleanup macro refactoring and stale-table cleanup
macros/edr/tests/test_utils/clean_elementary_test_tables.sql
Adds clean_current_invocation_test_tables(), changes clean_elementary_test_tables(test_table_relations) to perform cleanup of given relations, and adds cleanup_stale_test_tables(hours=24) which discovers stale tables and calls the parameterized cleanup.
Update system cleanup caller
macros/edr/system/system_utils/clean_elementary_temp_tables.sql
clean_elementary_temp_tables() now calls elementary.clean_current_invocation_test_tables() instead of the parameterless cleanup.
Integration tests for stale cleanup
integration_tests/dbt_project/macros/test_cleanup_stale_test_tables.sql, integration_tests/tests/test_cleanup_stale_test_tables.py
Macro test_cleanup_stale_test_tables creates two temp tables, calls cleanup_stale_test_tables(hours=0), and returns counts; Python integration test runs the macro, parses JSON, asserts >=2 tables before and 0 after cleanup.
CI test filter tweak
.github/workflows/test-warehouse.yml
Pytest invocation adjusted to include -k "test_cleanup_stale_test_tables" to run only the new cleanup test in that workflow step.

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers:

  • haritamar

🐰 I hopped through schemas late and bright,

Nosed out the test tables hidden from sight,
With tiny paws I tipped each stale heap,
Sent them away so the warehouses sleep,
Now clean fields gleam beneath morning light.

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title is incomplete and truncated with ellipsis, making it vague about the full intent. While it mentions the main feature (cleaning stale test temp tables), it's unclear what the macro is useful for. Complete the title to clearly state the purpose, e.g., 'Add macro to clean up stale test temporary tables for manual cleanup jobs' or provide a more descriptive complete sentence.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch delete-stale-test-tables

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 4, 2026

👋 @GuyEshdat
Thank you for raising your pull request.
Please make sure to add tests and document all user-facing changes.
You can do this by editing the docs files in the elementary repository.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@macros/edr/tests/test_utils/clean_elementary_test_tables.sql`:
- Around line 34-41: The macro cleanup_stale_test_tables currently accepts a
negative hours value which inverts the stale-window; add input validation at the
start of the macro (inside cleanup_stale_test_tables) to ensure the hours
parameter is non-negative: if hours is negative, raise/abort with a clear error
message (or coerce to 0) before calling elementary.get_stale_test_tables;
reference the macro name cleanup_stale_test_tables and the hours parameter so
reviewers can find and update the validation logic.
- Around line 39-54: The stale-table list returned by
elementary.get_stale_test_tables may contain raw strings (e.g.,
"database/schema/identifier") which break Athena DROP SQL; before calling
elementary.clean_elementary_test_tables(tables) normalize each item in tables
into a relation object with explicit fields (database, schema, identifier) or a
properly escaped/formatted relation string expected by the cleaner. Update the
code around the tables variable (result of elementary.get_stale_test_tables) to
detect string entries, parse them into {database, schema, identifier} (or
reformat/escape the path) and build a sanitized list, then pass that sanitized
list to elementary.clean_elementary_test_tables; ensure entries already in
proper structured form are left unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6f3294b0-6259-4b4c-86f7-427d76f8a3c9

📥 Commits

Reviewing files that changed from the base of the PR and between 3dc104a and 6d92e04.

📒 Files selected for processing (5)
  • integration_tests/dbt_project/macros/test_cleanup_stale_test_tables.sql
  • integration_tests/tests/test_cleanup_stale_test_tables.py
  • macros/edr/system/system_utils/clean_elementary_temp_tables.sql
  • macros/edr/tests/test_utils/clean_elementary_test_tables.sql
  • macros/edr/tests/test_utils/get_stale_test_tables.sql

Comment on lines +34 to +41
{% macro cleanup_stale_test_tables(hours=24) %}
{% set elementary_database, elementary_schema = (
elementary.get_package_database_and_schema()
) %}
{% set table_name_pattern = "test%__tmp_%" %}
{% set tables = elementary.get_stale_test_tables(
elementary_database, elementary_schema, hours, table_name_pattern
) %}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validate hours as non-negative before running cleanup.

Line 34 currently accepts negative values, which flips the stale-window logic and can clean recently-created temp tables.

Suggested patch
 {% macro cleanup_stale_test_tables(hours=24) %}
+    {% if hours | int < 0 %}
+        {% do exceptions.raise_compiler_error(
+            "cleanup_stale_test_tables: 'hours' must be >= 0."
+        ) %}
+    {% endif %}
     {% set elementary_database, elementary_schema = (
         elementary.get_package_database_and_schema()
     ) %}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
{% macro cleanup_stale_test_tables(hours=24) %}
{% set elementary_database, elementary_schema = (
elementary.get_package_database_and_schema()
) %}
{% set table_name_pattern = "test%__tmp_%" %}
{% set tables = elementary.get_stale_test_tables(
elementary_database, elementary_schema, hours, table_name_pattern
) %}
{% macro cleanup_stale_test_tables(hours=24) %}
{% if hours | int < 0 %}
{% do exceptions.raise_compiler_error(
"cleanup_stale_test_tables: 'hours' must be >= 0."
) %}
{% endif %}
{% set elementary_database, elementary_schema = (
elementary.get_package_database_and_schema()
) %}
{% set table_name_pattern = "test%__tmp_%" %}
{% set tables = elementary.get_stale_test_tables(
elementary_database, elementary_schema, hours, table_name_pattern
) %}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@macros/edr/tests/test_utils/clean_elementary_test_tables.sql` around lines 34
- 41, The macro cleanup_stale_test_tables currently accepts a negative hours
value which inverts the stale-window; add input validation at the start of the
macro (inside cleanup_stale_test_tables) to ensure the hours parameter is
non-negative: if hours is negative, raise/abort with a clear error message (or
coerce to 0) before calling elementary.get_stale_test_tables; reference the
macro name cleanup_stale_test_tables and the hours parameter so reviewers can
find and update the validation logic.

Comment on lines +39 to +54
{% set tables = elementary.get_stale_test_tables(
elementary_database, elementary_schema, hours, table_name_pattern
) %}
{% if tables | length == 0 %}
{% do elementary.edr_log(
"No temp tables older than "
~ hours
~ " hours found - nothing to clean up."
) %}
{% else %}
{% do elementary.edr_log(
"Dropping " ~ tables
| length ~ " temp tables older than " ~ hours ~ " hours."
) %}
{% do elementary.clean_elementary_test_tables(tables) %}
{% do elementary.edr_log("Done.") %}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Normalize stale table results before passing to cleanup formatter.

Line 53 passes raw tables through to cleanup. For Athena, downstream cleanup SQL reads test_relation.database/schema/identifier; string inputs from stale discovery can lead to malformed DROP targets.

Suggested patch
 {% macro cleanup_stale_test_tables(hours=24) %}
@@
     {% set tables = elementary.get_stale_test_tables(
         elementary_database, elementary_schema, hours, table_name_pattern
     ) %}
+    {% set table_relations = [] %}
+    {% for table in tables %}
+        {% if table is string %}
+            {% set parts = table.split(".") %}
+            {% if parts | length == 3 %}
+                {% do table_relations.append(
+                    api.Relation.create(
+                        database=parts[0], schema=parts[1], identifier=parts[2]
+                    )
+                ) %}
+            {% elif parts | length == 2 %}
+                {% do table_relations.append(
+                    api.Relation.create(schema=parts[0], identifier=parts[1])
+                ) %}
+            {% else %}
+                {% do exceptions.raise_compiler_error(
+                    "cleanup_stale_test_tables: unexpected table format: " ~ table
+                ) %}
+            {% endif %}
+        {% else %}
+            {% do table_relations.append(table) %}
+        {% endif %}
+    {% endfor %}
@@
-        {% do elementary.clean_elementary_test_tables(tables) %}
+        {% do elementary.clean_elementary_test_tables(table_relations) %}
         {% do elementary.edr_log("Done.") %}
     {% endif %}
 {% endmacro %}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
{% set tables = elementary.get_stale_test_tables(
elementary_database, elementary_schema, hours, table_name_pattern
) %}
{% if tables | length == 0 %}
{% do elementary.edr_log(
"No temp tables older than "
~ hours
~ " hours found - nothing to clean up."
) %}
{% else %}
{% do elementary.edr_log(
"Dropping " ~ tables
| length ~ " temp tables older than " ~ hours ~ " hours."
) %}
{% do elementary.clean_elementary_test_tables(tables) %}
{% do elementary.edr_log("Done.") %}
{% set tables = elementary.get_stale_test_tables(
elementary_database, elementary_schema, hours, table_name_pattern
) %}
{% set table_relations = [] %}
{% for table in tables %}
{% if table is string %}
{% set parts = table.split(".") %}
{% if parts | length == 3 %}
{% do table_relations.append(
api.Relation.create(
database=parts[0], schema=parts[1], identifier=parts[2]
)
) %}
{% elif parts | length == 2 %}
{% do table_relations.append(
api.Relation.create(schema=parts[0], identifier=parts[1])
) %}
{% else %}
{% do exceptions.raise_compiler_error(
"cleanup_stale_test_tables: unexpected table format: " ~ table
) %}
{% endif %}
{% else %}
{% do table_relations.append(table) %}
{% endif %}
{% endfor %}
{% if tables | length == 0 %}
{% do elementary.edr_log(
"No temp tables older than "
~ hours
~ " hours found - nothing to clean up."
) %}
{% else %}
{% do elementary.edr_log(
"Dropping " ~ tables
| length ~ " temp tables older than " ~ hours ~ " hours."
) %}
{% do elementary.clean_elementary_test_tables(table_relations) %}
{% do elementary.edr_log("Done.") %}
{% endif %}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@macros/edr/tests/test_utils/clean_elementary_test_tables.sql` around lines 39
- 54, The stale-table list returned by elementary.get_stale_test_tables may
contain raw strings (e.g., "database/schema/identifier") which break Athena DROP
SQL; before calling elementary.clean_elementary_test_tables(tables) normalize
each item in tables into a relation object with explicit fields (database,
schema, identifier) or a properly escaped/formatted relation string expected by
the cleaner. Update the code around the tables variable (result of
elementary.get_stale_test_tables) to detect string entries, parse them into
{database, schema, identifier} (or reformat/escape the path) and build a
sanitized list, then pass that sanitized list to
elementary.clean_elementary_test_tables; ensure entries already in proper
structured form are left unchanged.

GuyEshdat added 2 commits June 4, 2026 20:35
…test-fixes iteration, will revert this commit once this test passes for all warehouses
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/test-warehouse.yml:
- Around line 279-282: Remove the temporary test filter that scopes pytest to a
single test: delete the -k "test_cleanup_stale_test_tables" argument (and the
preceding TODO comment) from the py.test invocation (the line containing py.test
-n"$PYTEST_PARALLEL" -vvv --target "$WAREHOUSE") so the workflow runs the full
test suite and restores CI coverage across warehouses and versions.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 73979cab-7d7c-4328-868d-48a49b15f0c8

📥 Commits

Reviewing files that changed from the base of the PR and between fb10ced and cae3659.

📒 Files selected for processing (1)
  • .github/workflows/test-warehouse.yml

Comment on lines +279 to 282
# TODO: remove -k filter before merging — temporarily scoped to speed up iteration on test_cleanup_stale_test_tables
py.test -n"$PYTEST_PARALLEL" -vvv --target "$WAREHOUSE" \
-k "test_cleanup_stale_test_tables" \
--junit-xml=test-results.xml \
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Remove the temporary -k filter before merge to avoid CI coverage regression.

Limiting this workflow to a single test disables broad integration coverage across warehouses and versions, so regressions outside stale-table cleanup can slip through.

Suggested fix
-          # TODO: remove -k filter before merging — temporarily scoped to speed up iteration on test_cleanup_stale_test_tables
           py.test -n"$PYTEST_PARALLEL" -vvv --target "$WAREHOUSE" \
-            -k "test_cleanup_stale_test_tables" \
             --junit-xml=test-results.xml \
             --html="detailed_report_${WAREHOUSE}_dbt_${DBT_VERSION}.html" \
             --self-contained-html --clear-on-end $FUSION_RUNNER_FLAG
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/test-warehouse.yml around lines 279 - 282, Remove the
temporary test filter that scopes pytest to a single test: delete the -k
"test_cleanup_stale_test_tables" argument (and the preceding TODO comment) from
the py.test invocation (the line containing py.test -n"$PYTEST_PARALLEL" -vvv
--target "$WAREHOUSE") so the workflow runs the full test suite and restores CI
coverage across warehouses and versions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant