added a macro that cleans stale test temp tables. Will be usefull for…#1022
added a macro that cleans stale test temp tables. Will be usefull for…#1022GuyEshdat wants to merge 3 commits into
Conversation
… when the cleanup process takes too long in the on_run_end, and then a solution can be to set clean_elementary_temp_tables to false and use this macro in a dedicated job instead
📝 WalkthroughWalkthroughAdds adapter-dispatched stale-table discovery, refactors cleanup macros into current-invocation and parameterized cleanup, updates the system caller to the new entrypoint, and adds integration macros and a Python test that create and remove temporary test tables and report before/after counts. ChangesStale Test Table Cleanup Across Adapters
🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested reviewers:
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
👋 @GuyEshdat |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@macros/edr/tests/test_utils/clean_elementary_test_tables.sql`:
- Around line 34-41: The macro cleanup_stale_test_tables currently accepts a
negative hours value which inverts the stale-window; add input validation at the
start of the macro (inside cleanup_stale_test_tables) to ensure the hours
parameter is non-negative: if hours is negative, raise/abort with a clear error
message (or coerce to 0) before calling elementary.get_stale_test_tables;
reference the macro name cleanup_stale_test_tables and the hours parameter so
reviewers can find and update the validation logic.
- Around line 39-54: The stale-table list returned by
elementary.get_stale_test_tables may contain raw strings (e.g.,
"database/schema/identifier") which break Athena DROP SQL; before calling
elementary.clean_elementary_test_tables(tables) normalize each item in tables
into a relation object with explicit fields (database, schema, identifier) or a
properly escaped/formatted relation string expected by the cleaner. Update the
code around the tables variable (result of elementary.get_stale_test_tables) to
detect string entries, parse them into {database, schema, identifier} (or
reformat/escape the path) and build a sanitized list, then pass that sanitized
list to elementary.clean_elementary_test_tables; ensure entries already in
proper structured form are left unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 6f3294b0-6259-4b4c-86f7-427d76f8a3c9
📒 Files selected for processing (5)
integration_tests/dbt_project/macros/test_cleanup_stale_test_tables.sqlintegration_tests/tests/test_cleanup_stale_test_tables.pymacros/edr/system/system_utils/clean_elementary_temp_tables.sqlmacros/edr/tests/test_utils/clean_elementary_test_tables.sqlmacros/edr/tests/test_utils/get_stale_test_tables.sql
| {% macro cleanup_stale_test_tables(hours=24) %} | ||
| {% set elementary_database, elementary_schema = ( | ||
| elementary.get_package_database_and_schema() | ||
| ) %} | ||
| {% set table_name_pattern = "test%__tmp_%" %} | ||
| {% set tables = elementary.get_stale_test_tables( | ||
| elementary_database, elementary_schema, hours, table_name_pattern | ||
| ) %} |
There was a problem hiding this comment.
Validate hours as non-negative before running cleanup.
Line 34 currently accepts negative values, which flips the stale-window logic and can clean recently-created temp tables.
Suggested patch
{% macro cleanup_stale_test_tables(hours=24) %}
+ {% if hours | int < 0 %}
+ {% do exceptions.raise_compiler_error(
+ "cleanup_stale_test_tables: 'hours' must be >= 0."
+ ) %}
+ {% endif %}
{% set elementary_database, elementary_schema = (
elementary.get_package_database_and_schema()
) %}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| {% macro cleanup_stale_test_tables(hours=24) %} | |
| {% set elementary_database, elementary_schema = ( | |
| elementary.get_package_database_and_schema() | |
| ) %} | |
| {% set table_name_pattern = "test%__tmp_%" %} | |
| {% set tables = elementary.get_stale_test_tables( | |
| elementary_database, elementary_schema, hours, table_name_pattern | |
| ) %} | |
| {% macro cleanup_stale_test_tables(hours=24) %} | |
| {% if hours | int < 0 %} | |
| {% do exceptions.raise_compiler_error( | |
| "cleanup_stale_test_tables: 'hours' must be >= 0." | |
| ) %} | |
| {% endif %} | |
| {% set elementary_database, elementary_schema = ( | |
| elementary.get_package_database_and_schema() | |
| ) %} | |
| {% set table_name_pattern = "test%__tmp_%" %} | |
| {% set tables = elementary.get_stale_test_tables( | |
| elementary_database, elementary_schema, hours, table_name_pattern | |
| ) %} |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@macros/edr/tests/test_utils/clean_elementary_test_tables.sql` around lines 34
- 41, The macro cleanup_stale_test_tables currently accepts a negative hours
value which inverts the stale-window; add input validation at the start of the
macro (inside cleanup_stale_test_tables) to ensure the hours parameter is
non-negative: if hours is negative, raise/abort with a clear error message (or
coerce to 0) before calling elementary.get_stale_test_tables; reference the
macro name cleanup_stale_test_tables and the hours parameter so reviewers can
find and update the validation logic.
| {% set tables = elementary.get_stale_test_tables( | ||
| elementary_database, elementary_schema, hours, table_name_pattern | ||
| ) %} | ||
| {% if tables | length == 0 %} | ||
| {% do elementary.edr_log( | ||
| "No temp tables older than " | ||
| ~ hours | ||
| ~ " hours found - nothing to clean up." | ||
| ) %} | ||
| {% else %} | ||
| {% do elementary.edr_log( | ||
| "Dropping " ~ tables | ||
| | length ~ " temp tables older than " ~ hours ~ " hours." | ||
| ) %} | ||
| {% do elementary.clean_elementary_test_tables(tables) %} | ||
| {% do elementary.edr_log("Done.") %} |
There was a problem hiding this comment.
Normalize stale table results before passing to cleanup formatter.
Line 53 passes raw tables through to cleanup. For Athena, downstream cleanup SQL reads test_relation.database/schema/identifier; string inputs from stale discovery can lead to malformed DROP targets.
Suggested patch
{% macro cleanup_stale_test_tables(hours=24) %}
@@
{% set tables = elementary.get_stale_test_tables(
elementary_database, elementary_schema, hours, table_name_pattern
) %}
+ {% set table_relations = [] %}
+ {% for table in tables %}
+ {% if table is string %}
+ {% set parts = table.split(".") %}
+ {% if parts | length == 3 %}
+ {% do table_relations.append(
+ api.Relation.create(
+ database=parts[0], schema=parts[1], identifier=parts[2]
+ )
+ ) %}
+ {% elif parts | length == 2 %}
+ {% do table_relations.append(
+ api.Relation.create(schema=parts[0], identifier=parts[1])
+ ) %}
+ {% else %}
+ {% do exceptions.raise_compiler_error(
+ "cleanup_stale_test_tables: unexpected table format: " ~ table
+ ) %}
+ {% endif %}
+ {% else %}
+ {% do table_relations.append(table) %}
+ {% endif %}
+ {% endfor %}
@@
- {% do elementary.clean_elementary_test_tables(tables) %}
+ {% do elementary.clean_elementary_test_tables(table_relations) %}
{% do elementary.edr_log("Done.") %}
{% endif %}
{% endmacro %}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| {% set tables = elementary.get_stale_test_tables( | |
| elementary_database, elementary_schema, hours, table_name_pattern | |
| ) %} | |
| {% if tables | length == 0 %} | |
| {% do elementary.edr_log( | |
| "No temp tables older than " | |
| ~ hours | |
| ~ " hours found - nothing to clean up." | |
| ) %} | |
| {% else %} | |
| {% do elementary.edr_log( | |
| "Dropping " ~ tables | |
| | length ~ " temp tables older than " ~ hours ~ " hours." | |
| ) %} | |
| {% do elementary.clean_elementary_test_tables(tables) %} | |
| {% do elementary.edr_log("Done.") %} | |
| {% set tables = elementary.get_stale_test_tables( | |
| elementary_database, elementary_schema, hours, table_name_pattern | |
| ) %} | |
| {% set table_relations = [] %} | |
| {% for table in tables %} | |
| {% if table is string %} | |
| {% set parts = table.split(".") %} | |
| {% if parts | length == 3 %} | |
| {% do table_relations.append( | |
| api.Relation.create( | |
| database=parts[0], schema=parts[1], identifier=parts[2] | |
| ) | |
| ) %} | |
| {% elif parts | length == 2 %} | |
| {% do table_relations.append( | |
| api.Relation.create(schema=parts[0], identifier=parts[1]) | |
| ) %} | |
| {% else %} | |
| {% do exceptions.raise_compiler_error( | |
| "cleanup_stale_test_tables: unexpected table format: " ~ table | |
| ) %} | |
| {% endif %} | |
| {% else %} | |
| {% do table_relations.append(table) %} | |
| {% endif %} | |
| {% endfor %} | |
| {% if tables | length == 0 %} | |
| {% do elementary.edr_log( | |
| "No temp tables older than " | |
| ~ hours | |
| ~ " hours found - nothing to clean up." | |
| ) %} | |
| {% else %} | |
| {% do elementary.edr_log( | |
| "Dropping " ~ tables | |
| | length ~ " temp tables older than " ~ hours ~ " hours." | |
| ) %} | |
| {% do elementary.clean_elementary_test_tables(table_relations) %} | |
| {% do elementary.edr_log("Done.") %} | |
| {% endif %} |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@macros/edr/tests/test_utils/clean_elementary_test_tables.sql` around lines 39
- 54, The stale-table list returned by elementary.get_stale_test_tables may
contain raw strings (e.g., "database/schema/identifier") which break Athena DROP
SQL; before calling elementary.clean_elementary_test_tables(tables) normalize
each item in tables into a relation object with explicit fields (database,
schema, identifier) or a properly escaped/formatted relation string expected by
the cleaner. Update the code around the tables variable (result of
elementary.get_stale_test_tables) to detect string entries, parse them into
{database, schema, identifier} (or reformat/escape the path) and build a
sanitized list, then pass that sanitized list to
elementary.clean_elementary_test_tables; ensure entries already in proper
structured form are left unchanged.
…test-fixes iteration, will revert this commit once this test passes for all warehouses
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/test-warehouse.yml:
- Around line 279-282: Remove the temporary test filter that scopes pytest to a
single test: delete the -k "test_cleanup_stale_test_tables" argument (and the
preceding TODO comment) from the py.test invocation (the line containing py.test
-n"$PYTEST_PARALLEL" -vvv --target "$WAREHOUSE") so the workflow runs the full
test suite and restores CI coverage across warehouses and versions.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 73979cab-7d7c-4328-868d-48a49b15f0c8
📒 Files selected for processing (1)
.github/workflows/test-warehouse.yml
| # TODO: remove -k filter before merging — temporarily scoped to speed up iteration on test_cleanup_stale_test_tables | ||
| py.test -n"$PYTEST_PARALLEL" -vvv --target "$WAREHOUSE" \ | ||
| -k "test_cleanup_stale_test_tables" \ | ||
| --junit-xml=test-results.xml \ |
There was a problem hiding this comment.
Remove the temporary -k filter before merge to avoid CI coverage regression.
Limiting this workflow to a single test disables broad integration coverage across warehouses and versions, so regressions outside stale-table cleanup can slip through.
Suggested fix
- # TODO: remove -k filter before merging — temporarily scoped to speed up iteration on test_cleanup_stale_test_tables
py.test -n"$PYTEST_PARALLEL" -vvv --target "$WAREHOUSE" \
- -k "test_cleanup_stale_test_tables" \
--junit-xml=test-results.xml \
--html="detailed_report_${WAREHOUSE}_dbt_${DBT_VERSION}.html" \
--self-contained-html --clear-on-end $FUSION_RUNNER_FLAG🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.github/workflows/test-warehouse.yml around lines 279 - 282, Remove the
temporary test filter that scopes pytest to a single test: delete the -k
"test_cleanup_stale_test_tables" argument (and the preceding TODO comment) from
the py.test invocation (the line containing py.test -n"$PYTEST_PARALLEL" -vvv
--target "$WAREHOUSE") so the workflow runs the full test suite and restores CI
coverage across warehouses and versions.
… when the cleanup process takes too long in the on_run_end, and then a solution can be to set clean_elementary_temp_tables to false and use this macro in a dedicated job instead
Summary by CodeRabbit
New Features
Tests
Chores