Replace regions.dataset_id FK with many-to-many join table and add simulation year #94

@anth-volk

Description

Problem

Each region currently has a single dataset_id foreign key pointing to one Dataset record, but each geographic area has up to 12 yearly datasets (one per year, 2024–2035). As a result:

  • Only one year's dataset is accessible per region
  • There's no way to run analyses for different years
  • The year is not directly stored on simulations

Requirements

  1. Replace regions.dataset_id FK with a region_datasets join table (many-to-many) so each region can link to multiple yearly datasets
  2. Add year column to simulations table so the simulation year is directly queryable
  3. Add year to analysis request schemas so the API can select the correct yearly dataset for a region
  4. Make seed_regions.py the sole source of truth for region-to-dataset wiring; import_state_datasets.py should only upload dataset files and create DB records
  5. Update dataset resolution in analysis and simulation endpoints to query the join table, with optional year filtering (latest year used when omitted)
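
To make requirements 1 and 5 concrete, here is a minimal sketch of the join table and the year-aware lookup. It uses the standard-library sqlite3 module rather than the project's ORM. It assumes each dataset row carries a `year` column, which this issue does not confirm. The function `resolve_dataset_id`, the helper `demo_connection`, the region code, and the sample filepaths are all hypothetical names chosen for illustration.

```python
import sqlite3
from typing import Optional

# Assumed schema: only region_datasets and its composite PK come from this issue.
# The `year` column on datasets is an assumption made for this sketch.
SCHEMA = """
CREATE TABLE regions (id INTEGER PRIMARY KEY, code TEXT NOT NULL UNIQUE);
CREATE TABLE datasets (
    id INTEGER PRIMARY KEY,
    filepath TEXT NOT NULL,
    year INTEGER NOT NULL
);
CREATE TABLE region_datasets (
    region_id INTEGER NOT NULL REFERENCES regions(id),
    dataset_id INTEGER NOT NULL REFERENCES datasets(id),
    PRIMARY KEY (region_id, dataset_id)
);
"""


def resolve_dataset_id(
    conn: sqlite3.Connection, region_id: int, year: Optional[int] = None
) -> Optional[int]:
    """Return the dataset linked to a region for `year`, or the latest one if year is None."""
    query = (
        "SELECT d.id FROM datasets d "
        "JOIN region_datasets rd ON rd.dataset_id = d.id "
        "WHERE rd.region_id = ?"
    )
    params = [region_id]
    if year is not None:
        # Optional year filter: restrict to the requested year only.
        query += " AND d.year = ?"
        params.append(year)
    # Without a year filter this picks the most recent linked dataset.
    query += " ORDER BY d.year DESC LIMIT 1"
    row = conn.execute(query, params).fetchone()
    return row[0] if row else None


def demo_connection() -> sqlite3.Connection:
    """Build an in-memory database with one region linked to two yearly datasets."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO regions (id, code) VALUES (1, 'example-region')")
    conn.executemany(
        "INSERT INTO datasets (id, filepath, year) VALUES (?, ?, ?)",
        [(10, "example/region_2024.h5", 2024), (11, "example/region_2025.h5", 2025)],
    )
    conn.executemany(
        "INSERT INTO region_datasets (region_id, dataset_id) VALUES (?, ?)",
        [(1, 10), (1, 11)],
    )
    return conn
```

In this sketch a request for a year with no linked dataset returns None. Whether the real endpoints should return an error or fall back to another year in that case is left open by this issue.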

Acceptance criteria

  • region_datasets join table exists with composite PK (region_id, dataset_id)
  • regions.dataset_id column removed
  • simulations.year column added
  • Alembic migration handles data migration from old FK to join table
  • POST /analysis/economic-impact accepts optional year parameter
  • POST /simulations/economy accepts optional year parameter
  • seed_regions.py creates RegionDatasetLink entries based on dataset filepath patterns
  • import_state_datasets.py no longer touches regions
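
The data-migration criterion above could look roughly like the following. This is a sketch of the copy step only, written against sqlite3 so it runs standalone; it is not the actual Alembic migration. The function names `migrate_fk_to_join_table` and `build_legacy_db` are hypothetical, and the legacy table is reduced to the two columns that matter here.

```python
import sqlite3


def build_legacy_db() -> sqlite3.Connection:
    """Create an in-memory database shaped like the old schema (simplified)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE regions (id INTEGER PRIMARY KEY, dataset_id INTEGER);
        INSERT INTO regions (id, dataset_id) VALUES (1, 10), (2, NULL), (3, 12);
        """
    )
    return conn


def migrate_fk_to_join_table(conn: sqlite3.Connection) -> int:
    """Copy each non-null regions.dataset_id into region_datasets.

    Returns the total number of links in the join table afterwards.
    """
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS region_datasets (
            region_id INTEGER NOT NULL,
            dataset_id INTEGER NOT NULL,
            PRIMARY KEY (region_id, dataset_id)
        )
        """
    )
    # INSERT OR IGNORE keeps the step idempotent: re-running it does not
    # violate the composite primary key or duplicate links.
    conn.execute(
        """
        INSERT OR IGNORE INTO region_datasets (region_id, dataset_id)
        SELECT id, dataset_id FROM regions WHERE dataset_id IS NOT NULL
        """
    )
    conn.commit()
    # The real migration would drop regions.dataset_id only after this copy succeeds.
    return conn.execute("SELECT COUNT(*) FROM region_datasets").fetchone()[0]
```

Copying before dropping the column means no existing region loses its dataset link, and regions with no dataset simply get no row in the join table.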
