
Use ACS 2024 donor without changing CPS scaffold #184

Merged: MaxGhenis merged 1 commit into main from codex/cps-scaffold-with-acs-donor-20260602 on Jun 2, 2026

@MaxGhenis
Contributor

Summary

  • keep CPS/ASEC as the scaffold whenever the PUF support clone is enabled, even if ACS donor rows add more variable coverage
  • default the PE-US-data rebuild ACS donor to 2024
  • add an ACS H5 fallback so local acs_2024.h5 can load even though the current PE-US-data Python module only exposes ACS_2022
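
A minimal sketch of the two behaviors described above, not the code merged in this PR. The function signatures, the `coverage` mapping, and the `(kind, location)` return value are assumptions made for illustration; only the CPS-over-ACS preference and the `acs_2024.h5` / `ACS_2022` names come from the summary.

```python
from pathlib import Path


def select_scaffold(coverage: dict[str, int], puf_support_clone: bool) -> str:
    """Pick the scaffold survey (hypothetical signature).

    `coverage` maps a survey name to the number of variables it covers.
    """
    # With the PUF support clone enabled, CPS/ASEC stays the scaffold
    # even when another survey (e.g. ACS) covers more variables.
    if puf_support_clone and "cps" in coverage:
        return "cps"
    # Otherwise fall back to the survey with the widest variable coverage.
    return max(coverage, key=coverage.get)


def resolve_acs_source(
    year: int, module_years: set[int], storage_dir: Path
) -> tuple[str, str]:
    """Decide where ACS rows for `year` come from (hypothetical helper)."""
    # Prefer a dataset class exposed by the Python module, e.g. ACS_2022.
    if year in module_years:
        return ("module", f"ACS_{year}")
    # Fallback: a local H5 file such as acs_2024.h5 in the storage directory.
    h5_path = storage_dir / f"acs_{year}.h5"
    if h5_path.exists():
        return ("h5", str(h5_path))
    raise FileNotFoundError(f"No ACS module class or {h5_path.name} for {year}")
```

In this sketch, `select_scaffold({"cps": 10, "acs": 25}, puf_support_clone=True)` returns `"cps"`, while the same call with `puf_support_clone=False` returns `"acs"`.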

Validation

  • uv run --python 3.13 --extra dev --extra policyengine pytest tests/test_donor_survey_source_providers.py -k acs_source_provider
  • uv run --python 3.13 --extra dev --extra policyengine pytest tests/pipelines/test_pe_us_data_rebuild.py -k source_providers
  • uv run --python 3.13 --extra dev --extra policyengine pytest tests/pipelines/test_us.py -k select_scaffold_prefers_cps_when_puf_support_clone_enabled
  • uv run --python 3.13 --extra dev --extra policyengine ruff check ...
  • uv run --python 3.13 --extra dev --extra policyengine ruff format --check ...
  • local ACS 2024 storage smoke with ACSSourceProvider(year=2024, policyengine_us_data_repo=...)

@MaxGhenis MaxGhenis merged commit a303bd4 into main Jun 2, 2026
5 checks passed
@MaxGhenis MaxGhenis deleted the codex/cps-scaffold-with-acs-donor-20260602 branch June 2, 2026 18:53
MaxGhenis added a commit that referenced this pull request Jun 3, 2026
… 2024

Toward making the source year a single key rather than a value smeared across
the provider/checkpoint/CLI/scripts/names:

- Profiles are addressed by (dataset, model_year): DatasetProfile.key and .name
  (mp_ecps_2024), plus resolve_profile(dataset, year).
- version_id(variant, commit, build_date) derives the canonical build name from
  the profile, so the asec{cps}-calendar{model} years in the name cannot drift
  from the data (e.g. mp-ecps-shaped-asec2025-calendar2024-...). Names become an
  output of the profile, never hand-typed.
- source_years() exposes the per-source years from one place, so callers thread
  a profile instead of five loose year args.
- Correct MP_2024.acs to the native 2024 release (codex #184 default + the
  ACS-2024 donor H5 fallback). Drops the earlier 2022 gap that wrongly anchored
  on the stale manifest/scripts. ACS is excluded from the manifest-tie because
  MP loads a local acs_2024.h5 beyond the module's ACS_2022 baseline.
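
The profile idea in the list above could be sketched roughly as follows. This is an illustrative stand-in, not the repository's implementation: the field names, the registry dict, and the commit/date suffix of the build name are assumptions; only `key`, `name`, `resolve_profile`, `version_id`, `source_years`, and the `asec{cps}-calendar{model}` fragment appear in the message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetProfile:
    """Hypothetical profile: one place that owns every source year."""

    dataset: str     # e.g. "mp_ecps"
    model_year: int  # calendar year the build models
    cps: int         # ASEC survey year used as the scaffold
    acs: int         # ACS release used as the donor

    @property
    def key(self) -> tuple[str, int]:
        return (self.dataset, self.model_year)

    @property
    def name(self) -> str:
        return f"{self.dataset}_{self.model_year}"

    def source_years(self) -> dict[str, int]:
        # Callers read years from here instead of passing loose year args.
        return {"cps": self.cps, "acs": self.acs}

    def version_id(self, variant: str, commit: str, build_date: str) -> str:
        # The years in the name come from the profile, so they cannot drift.
        # The "-{commit}-{build_date}" suffix layout is an assumption.
        prefix = self.dataset.replace("_", "-")
        return (
            f"{prefix}-{variant}-asec{self.cps}-calendar{self.model_year}"
            f"-{commit}-{build_date}"
        )


MP_2024 = DatasetProfile(dataset="mp_ecps", model_year=2024, cps=2025, acs=2024)
_PROFILES = {MP_2024.key: MP_2024}  # hypothetical registry


def resolve_profile(dataset: str, year: int) -> DatasetProfile:
    return _PROFILES[(dataset, year)]
```

Here `resolve_profile("mp_ecps", 2024).name` gives `mp_ecps_2024`, and any `version_id(...)` call on that profile yields a name containing `asec2025-calendar2024`.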

Next (same PR or follow-up): thread `--profile` / source_years() through the CLI
and build scripts, derive the artifact version_id from the profile, and retire
the per-year args and `--*-year` flags.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>