-
Notifications
You must be signed in to change notification settings - Fork 1
NO-TICKET: Consolidate backfill into transfers #364
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR consolidates legacy backfill transfer scripts into the main transfer pipeline, removing the separate backfill orchestrator and integrating five legacy datasets (Chemistry SampleInfo, NGWMN views, SurfaceWaterData, WaterLevelsContinuous_Pressure_Daily, and WeatherData) into the standard transfer workflow with new feature flags.
Changes:
- Moved and renamed legacy transfer modules from
transfers/backfill/totransfers/, updating class names to standard*Transfererconvention - Integrated legacy transfers into main transfer orchestration with new feature flags (
TRANSFER_SURFACE_WATER_DATA,TRANSFER_CHEMISTRY_SAMPLEINFO,TRANSFER_NGWMN_VIEWS,TRANSFER_WATERLEVELS_PRESSURE_DAILY,TRANSFER_WEATHER_DATA) - Added comprehensive unit tests for legacy models and ERD generator script
Reviewed changes
Copilot reviewed 21 out of 23 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| transfers/well_transfer.py | Refactored JSON blob operations to use new utility functions |
| transfers/weather_data.py | Renamed class from WeatherDataBackfill to WeatherDataTransferer |
| transfers/waterlevelscontinuous_pressure_daily.py | Renamed class to NMAWaterLevelsContinuousPressureDailyTransferer and removed unused import |
| transfers/util.py | Added retry logic for GCS operations and JSON blob helpers |
| transfers/transfer.py | Integrated five legacy transferers with new feature flags in parallel and sequential flows |
| transfers/surface_water_data.py | Renamed class to SurfaceWaterDataTransferer |
| transfers/ngwmn_views.py | Renamed base and derived classes from *Backfill to *Transferer |
| transfers/metrics.py | Added metrics methods for all newly-integrated legacy tables |
| transfers/chemistry_sampleinfo.py | Renamed class to ChemistrySampleInfoTransferer |
| transfers/backfill/backfill.py | Emptied backfill orchestrator steps |
| tests/test_waterlevelscontinuous_pressure_daily_legacy.py | Added CRUD tests for pressure daily legacy model |
| tests/test_surface_water_data_legacy.py | Added CRUD tests for surface water data legacy model |
| tests/test_ngwmn_views_legacy.py | Added CRUD tests for NGWMN view legacy models |
| tests/test_chemistry_sampleinfo_legacy.py | Added CRUD tests for chemistry sample info legacy model |
| services/util.py | Refactored HTTP calls to use new retry helper with backoff |
| scripts/generate_erd.py | Added ERD generator script for SQLAlchemy schema visualization |
| schema_erd.dot | Generated Graphviz DOT file with current schema ERD |
| run_backfill.sh | Added GitHub workflow reference as documentation |
| pyproject.toml | Added empty line (formatting) |
| .github/workflows/CD_staging.yml | Removed backfill step from staging deployment |
| .github/workflows/CD_production.yml | Removed backfill step from production deployment |
…ensitive information Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 23 out of 25 changed files in this pull request and generated 1 comment.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…fallback handling
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 23 out of 25 changed files in this pull request and generated 1 comment.
…and improve schema reset logic
chasetmartin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me given my previous knowledge of the transfer process, I do have one comment about removing the unused run_backfill.sh file if it is not relevant.
…ess and its distinction from data transfer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 24 out of 26 changed files in this pull request and generated 3 comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 47 out of 49 changed files in this pull request and generated 4 comments.
…y in various modules
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 49 out of 52 changed files in this pull request and generated 6 comments.
# Conflicts: # db/nma_legacy.py # transfers/chemistry_sampleinfo.py # transfers/metrics.py # transfers/transfer.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 51 out of 54 changed files in this pull request and generated 6 comments.
…enhance data validation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 52 out of 55 changed files in this pull request and generated 8 comments.
| z = convert_ngvd29_to_navd88( | ||
| z, transformed_point.x, transformed_point.y | ||
| ) | ||
| if z is None: |
Copilot
AI
Jan 13, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The nested if-else structure for handling NGVD29 conversion failures is difficult to follow. Consider extracting the fallback logic into a helper function (e.g., _try_datum_conversion_with_fallback) that takes the original z value, point coordinates, and elevations cache, returning the final z value and whether EPQS was used. This would make the main make_location function more readable and the fallback logic easier to test independently.
| ) | ||
| stmt = insert_stmt.values(chunk).on_conflict_do_update( | ||
| index_elements=["OBJECTID"], | ||
| index_elements=["SamplePtID"], |
Copilot
AI
Jan 13, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The upsert uses index_elements=["SamplePtID"] for conflict resolution, but the excluded update includes both SamplePtID and OBJECTID. Since SamplePtID is now the primary key and is used for conflict detection, updating it in the SET clause is redundant. Consider removing SamplePtID from the SET clause to avoid unnecessary operations.
…ensitive information Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 52 out of 55 changed files in this pull request and generated 6 comments.
Comments suppressed due to low confidence (1)
transfers/chemistry_sampleinfo.py:1
- The
thing_idfield inChemistrySampleInfois added to the upsert set clause, which could update an existing record to point to a different Thing. This breaks the relationship integrity asSamplePointIDis unique and should always map to the same Thing. Removething_idfrom the upsertset_clause to prevent accidental relationship changes.
# ===============================================================================
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 52 out of 55 changed files in this pull request and generated 2 comments.
jacob-a-brown
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall I think it looks good! I have some minor comments scattered throughout. Most of my comments are about Starlette-Admin, so I've compiled them below.
Comments on Starlette-Admin
- I don't think that Starlette-Admin has a separate list view and form view. Looking at its current implementation I think that you can get rid of all of the attributes in the "List View" sections and then keep the following:
fieldssortable_fieldsfields_default_sortsearchable_fieldspage_sizepage_size_optionsexclude_fields_from_createexclude_fields_from_edit
You can also define export types as desired (if you want all views to have the same export types the export_type attribute can be defined in OcotilloModelView). For example:
rom starlette_admin import BaseField
from starlette_admin import ExportType
from admin.views.base import OcotilloModelView
class ExampleModelAdmin(OcotilloModelView):
fields = [...]
sortable_fields = [...]
fields_default_sort = [...]
searchable_fields = [...]
page_size = ...
page_size_options = [...]
exclude_fields_from_create = [...]
exclude_fields_from_edit = [...]
export_type = [ExportType.CSV, ExportType.EXCEL]- There may be some discrepancies between what is enumerated in
list_fieldsandfields. These should be resolved and then put in the intended order in thefieldsattribute (e.g. puttingsample_pt_idfirst forChemistrySampleInfosince that's the PK) - Unfortunately there isn't a way to list out all of the labels in a dictionary or similar object. To have labels, each field in the
fieldslist needs to be defined as a BaseField or a particular field type (likeIntegerFieldorStringField, but I think that'd be overkill) and then each label needs to be defined one-at-a-time. The same goes for help text. For example:
from starlette_admin import BaseField
from admin.views.base import OcotilloModelView
class ExampleModelAdmin(OcotilloModelView):
fields = [
BaseField("field_one", label="FieldOne", help_text="Help for FieldOne"),
BaseField("field_two", label="FieldTwo", help_text="Help for FieldTwo")
]
What
WaterLevelsContinuous_Pressure_Daily, and WeatherData from transfers/backfill/ into the main transfers/ package and
renamed their classes to standard *Transferer.
sequential flows, with new feature flags to toggle them.
Why
as the rest of the transfer logic.
Key Changes
NMAWaterLevelsContinuousPressureDailyTransferer, and WeatherDataTransferer.
Testing
Notes