Enable app v2 to run on API v2 by anth-volk · Pull Request #97 · PolicyEngine/policyengine-api-v2-alpha

anth-volk · 2026-03-03T22:19:43Z

Fixes #96

Summary

This PR adds all API capabilities required for the PolicyEngine app v2 frontend to run on top of the API v2 backend. It introduces 86 commits across 123 files (~19,000 lines added) with 517 tests covering all new functionality.

New API Endpoints (40+ endpoints across 13 routers)

User association endpoints (anonymous user support via client-generated UUIDs):

/user-policies — CRUD (7 endpoints)
/user-simulations — CRUD (7 endpoints)
/user-reports — CRUD + /full composite read (8 endpoints)
/user-household-associations — CRUD (5 endpoints)

Household endpoints:

POST/GET/DELETE /households — Persistent household storage
POST /analysis/household-impact + GET /{report_id} — Baseline vs reform household comparison

Standalone simulation endpoints:

POST/GET /simulations/household — Household simulation jobs
POST/GET /simulations/economy — Economy simulation jobs with region support

Enhanced analysis:

POST /analysis/economic-impact — Now supports region param and year selection
GET /analysis/options — List available computation modules
POST /analysis/economy-custom — Run analysis with custom module selection

Metadata bulk lookups:

POST /parameters/by-name — Bulk parameter lookup
GET /parameters/children — Lazy parameter tree loading
POST /variables/by-name — Bulk variable lookup
GET /tax-benefit-models/by-country/{country_id} — Model + latest version

Region endpoints:

GET /regions, GET /regions/{id}, GET /regions/by-code/{code}

New Database Tables (12 tables, 17 Alembic migrations)

households — Persistent household definitions
user_policies, user_simulation_associations, user_report_associations, user_household_associations — Anonymous user associations
regions + region_datasets — Geographic areas with M:N dataset links
budget_summary, intra_decile_impacts — Economy comparison outputs
constituency_impacts, local_authority_impacts, congressional_district_impacts — Geographic impacts

Architectural Changes

Computation module extraction: 14 composable modules extracted from monolithic analysis functions into computation_modules.py
Module registry: Central registry with metadata for discovery (/analysis/options) and selective execution (/analysis/economy-custom)
Deterministic simulation IDs: UUID v5 for automatic deduplication
Region-dataset join table: M:N relationship supports multiple dataset years per region
Modular seed scripts: Split into seed_models.py, seed_datasets.py, seed_regions.py, seed_policies.py with --preset=testing for fast CI
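
To illustrate why UUID v5 gives deduplication for free, here is a minimal sketch: identical inputs always map to the same ID, so a repeated request resolves to an existing row. The namespace, the function name `deterministic_simulation_id`, and the key layout are assumptions for illustration, not the PR's actual implementation.

```python
import uuid
from typing import Optional

# Hypothetical namespace; the real project may derive its own.
SIMULATION_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "simulations.example.org")


def deterministic_simulation_id(
    dataset_id: str,
    model_version_id: str,
    policy_id: Optional[str] = None,
    filter_field: Optional[str] = None,
    filter_value: Optional[str] = None,
) -> uuid.UUID:
    """Derive a stable ID from the inputs that define a simulation."""
    # Join the inputs into one canonical string; None becomes an empty slot.
    parts = (dataset_id, model_version_id, policy_id, filter_field, filter_value)
    key = "|".join("" if part is None else str(part) for part in parts)
    return uuid.uuid5(SIMULATION_NAMESPACE, key)


first = deterministic_simulation_id("dataset-a", "version-1", policy_id="reform-x")
second = deterministic_simulation_id("dataset-a", "version-1", policy_id="reform-x")
other = deterministic_simulation_id("dataset-a", "version-1", policy_id="reform-y")
```

Here `first` and `second` are equal while `other` differs, so a lookup by primary key is enough to detect an already-requested simulation.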

Code Quality (from PR review)

Type-safe enums: RegionType, DecileType, ReportType
model_validator on request schemas for field co-dependency
Typed PolicyParameterValueInput (replaces list[dict])
GeographicImpactBase shared model for geographic impacts
Ownership verification on user association update/delete
Narrowed exception handling with logfire warnings
Pagination on list endpoints
IntegrityError handling for race conditions
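
The race-condition handling listed above can be sketched roughly as follows. This uses the standard-library `sqlite3` module rather than the project's SQLModel/Postgres stack, and the table layout and `get_or_create_simulation` name are assumptions; the point is only the pattern of attempting the insert and falling back to a read when a concurrent writer has already created the row.

```python
import sqlite3


def get_or_create_simulation(conn: sqlite3.Connection, sim_id: str) -> tuple[str, bool]:
    """Return (status, created); tolerate another writer inserting the same ID."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO simulations (id, status) VALUES (?, 'pending')",
                (sim_id,),
            )
        return "pending", True
    except sqlite3.IntegrityError:
        # The primary key already exists: reuse the row instead of failing.
        row = conn.execute(
            "SELECT status FROM simulations WHERE id = ?", (sim_id,)
        ).fetchone()
        return row[0], False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE simulations (id TEXT PRIMARY KEY, status TEXT NOT NULL)")
first = get_or_create_simulation(conn, "sim-1")
second = get_or_create_simulation(conn, "sim-1")
```

Combined with deterministic IDs, this keeps duplicate requests from producing either duplicate rows or a 500 response.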

Economy Comparison Response — New Fields

Field	Description
`region`	Region metadata (code, label, type)
`budget_summary`	Net cost, program budgets
`intra_decile`	5-band income change distribution per decile
`detailed_budget`	Per-program budget impacts
`wealth_decile`	Wealth-based decile impacts
`intra_wealth_decile`	Within wealth-decile distributions
`congressional_district_impact`	US congressional district impacts
`constituency_impact`	UK parliamentary constituency impacts
`local_authority_impact`	UK local authority impacts

Test plan

🤖 Generated with Claude Code

- Replace synchronous inline calculation with async trigger pattern
- Add _trigger_household_impact() mirroring _trigger_economy_comparison()
- Add _run_local_household_impact() for local execution (blocking)
- Add _run_simulation_in_session() for running individual simulations
- Update POST endpoint to trigger and return immediately
- Add test script for manual end-to-end testing

Note: Local execution blocks the request (same as economic impact). True async requires Modal functions (household_impact_uk/us).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

…lue serialization

- Remove Enum/date serialization from seed_common.py since policyengine.py now pre-serializes default_value for JSON compatibility
- Change default_value type from `str | None` to `Any` in Variable model since it stores JSON values (bool, int, float, str)

Depends on policyengine.py feat/add-variable-default-value branch

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Add tests for Variable with int, float, bool, and string default values
- Add test for null default_value handling
- Add test for Household model creation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

session.get(Report, report_id) expects a UUID, but report_id was passed as a string. This caused 'str' object has no attribute 'hex' errors. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
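
The commit message does not show the fix itself. One plausible shape, sketched here with a hypothetical `parse_report_id` helper, is to convert the incoming string to a `uuid.UUID` before it reaches `session.get`. The example ID is made up.

```python
import uuid

report_id_from_path = "3f2b8c1e-9d4a-4e6f-8b7a-1c2d3e4f5a6b"

# A plain string has no .hex attribute, which UUID-typed columns rely on.
string_has_hex = hasattr(report_id_from_path, "hex")


def parse_report_id(raw: str) -> uuid.UUID:
    """Convert a path or query string to a UUID; raises ValueError if malformed."""
    return uuid.UUID(raw)


report_id = parse_report_id(report_id_from_path)
```

Declaring the path parameter as `UUID` in the route signature would achieve the same conversion at the framework boundary; which approach the PR took is not stated here.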

… reform bug is fixed

This commit removes the workaround code that was added to bypass policyengine.py's Simulation class. The workaround was needed because policyengine.py's US simulation applied reforms via p.update() after Microsimulation construction, which didn't work due to the US package's shared singleton TaxBenefitSystem.

That bug has now been fixed in policyengine.py (issue #232), so we can use policyengine.py's Simulation class directly again.

Changes:
- Revert household.py to use policyengine.core.Simulation instead of manually building Microsimulation with reform dicts
- Revert modal_app.py to use PESimulation instead of custom helper functions (_pe_policy_to_reform_dict, _merge_reform_dicts, _run_us_economy_simulation, _run_uk_economy_simulation)
- Remove now-obsolete test files for the workaround functions

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Add tests to verify that US policy reforms are applied correctly to household calculations.

These tests cover:
- Integration tests via API endpoints (TestUSPolicyReform, TestUKPolicyReform)
- Unit tests for the calculation functions directly (test_household_calculation.py)

The tests verify:
1. Baseline calculations work correctly
2. Reforms change household net income as expected
3. Running a reform doesn't pollute subsequent baseline calculations (regression test for the singleton pollution bug)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
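
As a toy illustration of what the pollution regression test guards against, the sketch below contrasts applying a reform to a shared object with applying it to a private copy. Everything here (the `_SHARED_SYSTEM` dict, the flat 10% tax, both function names) is invented for the example and does not reflect policyengine.py internals.

```python
import copy
from typing import Optional

# Stand-in for a shared, module-level tax-benefit system (hypothetical).
_SHARED_SYSTEM = {"standard_deduction": 10_000}


def _net_income(gross: int, system: dict) -> int:
    # Toy rule: 10% tax on income above the standard deduction.
    return gross - max(gross - system["standard_deduction"], 0) // 10


def net_income_buggy(gross: int, reform: Optional[dict] = None) -> int:
    """Mutates the shared object, so the reform leaks into later calls."""
    if reform:
        _SHARED_SYSTEM.update(reform)
    return _net_income(gross, _SHARED_SYSTEM)


def net_income_isolated(gross: int, reform: Optional[dict] = None) -> int:
    """Applies the reform to a private copy, leaving the shared object alone."""
    system = copy.deepcopy(_SHARED_SYSTEM)
    if reform:
        system.update(reform)
    return _net_income(gross, system)


baseline_before = net_income_isolated(50_000)
reformed = net_income_isolated(50_000, {"standard_deduction": 20_000})
baseline_after = net_income_isolated(50_000)  # unchanged: no pollution

# The buggy variant leaks the reform into the next baseline call.
net_income_buggy(50_000, {"standard_deduction": 20_000})
polluted_baseline = net_income_buggy(50_000)
```

A regression test in this spirit asserts that a baseline computed after a reform run equals the baseline computed before it.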

The PyPI release (3.1.15) has a bug where US reforms silently fail due to the shared singleton TaxBenefitSystem (policyengine.py#232). The fix exists on the app-v2-migration branch but hasn't been released. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Hatchling rejects git URL dependencies unless explicitly opted in. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ions Test scripts in scripts/ were ad-hoc debugging aids, not part of the test suite. Nevada seed is no longer needed. Archived Supabase migrations are superseded by Alembic. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Add Region SQLModel with filtering fields (code, label, region_type, requires_filter, filter_field, filter_value, dataset_id, etc.)
- Add Alembic migration for regions table
- Add GET /regions/ endpoint with filters by model and region type
- Add GET /regions/{region_id} and GET /regions/by-code/{code} endpoints
- Add region parameter to analysis endpoint with dataset/region resolution

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Add filter_field and filter_value to Simulation model
- Include filter params in deterministic simulation ID generation
- Pass filter params from region to simulation creation
- Pass filter params to policyengine.py PESimulation when running

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Wire filter_field/filter_value through Modal functions to policyengine.py:
  - simulate_economy_uk, simulate_economy_us
  - economy_comparison_uk, economy_comparison_us
- Add fixtures_regions.py with factory functions for test data
- Add 25 unit tests for region resolution and filtering:
  - test__given_region_with_filter__then_filter_params_included.py
  - test__given_region_without_filter__then_filter_params_none.py
  - test__given_dataset_id__then_region_is_none.py
  - test__given_same_params__then_deterministic_id.py
  - test__given_invalid_region__then_404_error.py
  - test__given_existing_simulation__then_reuses_existing.py

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Add seed_regions.py to populate the regions table with geographic data from policyengine.py's region registries:
- US: National + 51 states (DC included)
- UK: National + 4 countries (England, Scotland, Wales, NI)

Optional flags:
- --include-places: Add US cities (333 places over 100K population)
- --include-districts: Add US congressional districts (436)
- --us-only / --uk-only: Seed only one country

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Add --skip-regions, --include-places, and --include-districts CLI options to seed.py. Regions are now seeded as part of the standard database setup process, sourcing region definitions from policyengine.py's registries. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Default behavior now seeds all US regions (national, states, districts, places). Use --skip-places and --skip-districts to exclude specific region types. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Break up monolithic seed.py into focused subscripts:
- seed_utils.py: Shared utilities (get_session, bulk_insert, console)
- seed_models.py: TaxBenefitModel, Version, Variables, Parameters, ParameterValues
- seed_datasets.py: Dataset seeding and S3 upload
- seed_policies.py: Example policy reforms
- seed_regions.py: Geographic regions (updated to use seed_utils)

Main seed.py is now an orchestrator with preset configurations:
- full: Everything (default)
- lite: Both countries, 2026 only, skip state params, core regions
- minimal: Both countries, 2026 only, no policies/regions
- uk-lite, uk-minimal: UK-only variants
- us-lite, us-minimal: US-only variants

Each subscript can also run standalone with its own CLI.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Tests were written before _get_or_create_simulation and _get_deterministic_simulation_id gained the simulation_type parameter. Add SimulationType.ECONOMY and use keyword args for dataset_id/filter params to match the current function signatures. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Merge 6 separate test__given_* files into test_analysis.py organized by function tested: TestResolveDatasetAndRegion, TestGetDeterministicSimulationId, TestGetOrCreateSimulation. Fix pre-existing test_missing_dataset_id assertion (400 not 422). Move @pytest.mark.integration from file-level to class-level. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

This test hits the real database (valid request passes validation), so it needs a running Supabase instance like the other integration tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Adds ConstituencyImpact DB model, Alembic migration, and wires constituency computation into both local and Modal UK economy comparison paths. Uses GCS-hosted weight matrix and constituency CSV. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Adds LocalAuthorityImpact DB model, Alembic migration, and wires computation into both local and Modal UK economy comparison paths. Uses GCS-hosted weight matrix and local authority CSV. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ne.py Migrate all intra-decile computation (UK+US, local+Modal) from the API's inline intra_decile.py helper to policyengine.py's new IntraDecileImpact output class. Add wealth decile impact and intra-wealth-decile impact for UK economy comparisons, using DecileImpact with decile_variable= "household_wealth_decile". Add decile_type column to intra_decile_impacts table to distinguish income vs wealth records. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Extend fixtures with factory functions for congressional district, constituency, local authority, wealth decile, and intra-wealth-decile records. Add test classes verifying _build_response() populates all new fields correctly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Define ComputationModule dataclass and MODULE_REGISTRY with 10 modules (decile, program_statistics, poverty, inequality, budget_summary, intra_decile, congressional_district, constituency, local_authority, wealth_decile) plus helper functions for country filtering and validation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
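
A registry of this kind might look roughly like the sketch below. Only the module names come from the commit message; the dataclass fields, labels, and helper names (`modules_for_country`, `validate_modules`) are assumptions for illustration, and the registry is cut down to three entries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComputationModule:
    name: str
    label: str
    countries: tuple[str, ...]        # country models that support the module
    response_fields: tuple[str, ...]  # response keys the module populates


# Three-entry excerpt; labels and response field names are illustrative.
MODULE_REGISTRY = {
    module.name: module
    for module in (
        ComputationModule("decile", "Decile impacts", ("uk", "us"), ("decile_impacts",)),
        ComputationModule("constituency", "UK constituencies", ("uk",), ("constituency_impact",)),
        ComputationModule(
            "congressional_district",
            "US congressional districts",
            ("us",),
            ("congressional_district_impact",),
        ),
    )
}


def modules_for_country(country_id: str) -> list[ComputationModule]:
    """Return the modules applicable to one country, in registry order."""
    return [m for m in MODULE_REGISTRY.values() if country_id in m.countries]


def validate_modules(names: list[str], country_id: str) -> None:
    """Raise ValueError if any requested module is unknown for the country."""
    allowed = {m.name for m in modules_for_country(country_id)}
    unknown = [name for name in names if name not in allowed]
    if unknown:
        raise ValueError(f"Modules not available for {country_id!r}: {unknown}")
```

Keeping applicability on the module record lets the same registry drive both discovery (list what a country supports) and request validation.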

Returns available economy analysis modules from the registry, with optional country query param to filter by UK/US applicability. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Accepts same inputs as /analysis/economic-impact plus a modules list. Validates module names against the registry for the given country, triggers computation, and filters the response to only include fields for the requested modules. Includes GET polling endpoint with optional modules query param. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…tions Move each module's computation logic (decile, poverty, inequality, budget_summary, program_statistics, intra_decile, constituency, local_authority, wealth_decile, congressional_district) into standalone functions in computation_modules.py with UK/US dispatch tables. The local economy comparison functions now call run_modules() with an optional modules list, enabling selective computation from the /analysis/economy-custom endpoint. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
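
The dispatch-table idea can be sketched as below. The function bodies are placeholders, the argument shapes are invented, and only the names `run_modules`, the module names, and the two income variables (mentioned elsewhere in this PR) come from the source.

```python
from typing import Callable, Optional


def compute_budget_summary(baseline: dict, reform: dict, **_kwargs) -> dict:
    return {"budget_summary": {"net_cost": reform["spending"] - baseline["spending"]}}


def compute_decile(baseline: dict, reform: dict, country_id: str = "", **_kwargs) -> dict:
    # Country-specific income variable, per the decile fix described in this PR.
    variables = {"us": "household_net_income", "uk": "equiv_household_net_income"}
    return {"decile": {"income_variable": variables.get(country_id, "")}}


def compute_constituency(baseline: dict, reform: dict, **_kwargs) -> dict:
    return {"constituency_impact": []}  # placeholder result


# Per-country dispatch tables: module name -> function.
DISPATCH: dict[str, dict[str, Callable[..., dict]]] = {
    "uk": {
        "budget_summary": compute_budget_summary,
        "decile": compute_decile,
        "constituency": compute_constituency,
    },
    "us": {
        "budget_summary": compute_budget_summary,
        "decile": compute_decile,
    },
}


def run_modules(
    country_id: str,
    baseline: dict,
    reform: dict,
    modules: Optional[list[str]] = None,
) -> dict:
    """Run the requested modules, or every module for the country if None."""
    table = DISPATCH[country_id]
    selected = list(table) if modules is None else modules
    results: dict = {}
    for name in selected:
        results.update(table[name](baseline, reform, country_id=country_id))
    return results
```

For example, `run_modules("us", {"spending": 100}, {"spending": 130}, ["budget_summary"])` computes only the budget summary, which is the selective behaviour the custom endpoint relies on.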

Add comprehensive test coverage for module registry, analysis options, economy-custom endpoint, and computation module dispatch system. Includes lint/format fixes from ruff. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Allows looking up parameters by their exact names for a given model, returning ParameterRead[] for matches. Enables the app to fetch metadata for specific parameters (e.g. those in a saved policy) without loading the entire parameter catalog. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Accepts country_id and parent_path, returns direct children as nodes (with child_count) or leaf parameters (with full metadata). Adds COUNTRY_MODEL_NAMES mapping to constants for country_id resolution. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
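
One way to derive direct children from dotted parameter names is sketched below. The response shape (`path`, `type`, `child_count`), the function name, and the sample parameter names are assumptions; the real endpoint presumably queries the database rather than an in-memory list.

```python
def direct_children(parameter_names: list[str], parent_path: str) -> list[dict]:
    """List the immediate children of parent_path as nodes or leaf parameters."""
    prefix = f"{parent_path}." if parent_path else ""
    grandchildren: dict[str, set[str]] = {}  # node path -> names one level below
    leaves: set[str] = set()
    for name in parameter_names:
        if not name.startswith(prefix):
            continue
        segments = name[len(prefix):].split(".")
        child_path = prefix + segments[0]
        if len(segments) == 1:
            leaves.add(child_path)  # the child is itself a parameter
        else:
            grandchildren.setdefault(child_path, set()).add(segments[1])
    children = []
    for path in sorted(leaves | set(grandchildren)):
        if path in grandchildren:
            children.append(
                {"path": path, "type": "node", "child_count": len(grandchildren[path])}
            )
        else:
            children.append({"path": path, "type": "parameter"})
    return children


# Illustrative names only.
names = [
    "gov.irs.credits.ctc.amount",
    "gov.irs.credits.eitc.max",
    "gov.irs.credits.eitc.phase_in_rate",
    "gov.irs.income.exemption",
]
credit_children = direct_children(names, "gov.irs.credits")
income_children = direct_children(names, "gov.irs.income")
```

Returning `child_count` with each node lets a client decide whether to show an expand control without fetching the next level.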

Accepts a list of variable names and country_id, returns matching VariableRead objects. Mirrors the parameters/by-name pattern for targeted variable fetching without bulk loading. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Returns the model and its latest version in a single response, keyed by country_id (us/uk). Used on page load for model version checking and cache invalidation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ers/by-name Aligns the POST /parameters/by-name endpoint with GET /parameters/children by accepting country_id ("us" or "uk") and resolving the model name internally via COUNTRY_MODEL_NAMES. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add a 'testing' preset to seed.py that seeds only ~100 curated US variables and ~100 parameters (by prefix), enabling fast local database setup for integration testing.

Changes:
- SeedConfig: add variable_whitelist and parameter_prefixes fields
- seed_models.py: add whitelist/prefix filtering before row construction
- seed.py: add TESTING_VARIABLES, TESTING_PARAMETER_PREFIXES constants
- Wire new params through run_seed → seed_us_model → seed_model

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
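
The whitelist and prefix filtering could be as simple as the sketch below. The constant names match the commit message, but their contents, the row shape, and the two function names are illustrative assumptions.

```python
from typing import Optional

# Illustrative contents; the real lists hold roughly 100 entries each.
TESTING_VARIABLES = {"employment_income", "household_net_income"}
TESTING_PARAMETER_PREFIXES = ["gov.irs.credits"]


def filter_variables(variables: list[dict], whitelist: Optional[set[str]]) -> list[dict]:
    """Keep only whitelisted variables; None means no filtering."""
    if whitelist is None:
        return variables
    return [v for v in variables if v["name"] in whitelist]


def filter_parameters(parameters: list[dict], prefixes: Optional[list[str]]) -> list[dict]:
    """Keep only parameters whose name starts with one of the prefixes."""
    if prefixes is None:
        return parameters
    return [p for p in parameters if p["name"].startswith(tuple(prefixes))]


variables = [{"name": "employment_income"}, {"name": "snap"}]
parameters = [{"name": "gov.irs.credits.ctc.amount"}, {"name": "gov.hhs.snap.max"}]
kept_variables = filter_variables(variables, TESTING_VARIABLES)
kept_parameters = filter_parameters(parameters, TESTING_PARAMETER_PREFIXES)
```

Filtering before row construction, as the commit describes, avoids building rows that would be discarded anyway.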

The Policy model requires tax_benefit_model_id (NOT NULL), but seed_policies.py was not passing it to the constructor. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Add environment_name param to all modal.Function.from_name calls so functions resolve against the testing environment (not main)
- Add modal_environment setting to Pydantic Settings (default: testing)
- Install policyengine from app-v2-migration branch instead of PyPI
- Add pydantic-settings and git to Modal base image dependencies
- Use SUPABASE_SERVICE_KEY for storage access (private bucket)
- Add US variable pre-calculation step matching the UK function
- Use household_net_income for US decile impacts (matches v1 API)
- Bump economy_comparison_us memory to 24GB for full dataset processing
- Update .env.example with Modal environment documentation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Add region_id foreign key to Simulation model for region provenance
- Alembic migration 963e91da9298 applied to deployed Supabase
- Pass region_id when creating economy simulations
- Fix polling endpoints to look up region from simulation.region_id
- Simplify _build_region_info to use direct FK instead of filter matching
- Add /reports/{id}/full and /user-reports/{id}/full composite endpoints
- Extend PolicyRead with parameter_values (ParameterValueWithName)
- Update policy endpoints to eager-load parameter values with names

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- POST /analysis/rerun/{report_id}: resets report + simulations to PENDING, deletes all result records, and re-triggers computation. Works for both economy and household reports.
- computation_modules: use country-specific income variable for decile impacts (household_net_income for US, equiv_household_net_income for UK) instead of hardcoded UK variable. Accepts country_id via kwargs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add output_filepath() helper in storage.py and use it across all 6 upload sites in modal_app.py so new simulation output datasets are stored at outputs/output_{sim_id}.h5 instead of the bucket root. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
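
Given the path format stated above, the helper is probably close to this sketch. The signature and the accepted argument type are assumptions.

```python
import uuid
from typing import Union


def output_filepath(simulation_id: Union[str, uuid.UUID]) -> str:
    """Build the storage key for a simulation's output dataset."""
    return f"outputs/output_{simulation_id}.h5"


example_key = output_filepath("abc123")
```

Centralising the format in one function means a later change of prefix touches a single line rather than each upload site.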

The computation module functions now include a 7th parameter for country_id (passed as kwarg by run_modules). Some functions accept it explicitly while others use **_kwargs.

Updated tests to:
- Expect 7 parameters instead of 6
- Accept either 'country_id' or '_kwargs' as the 7th param name
- Add **kwargs to tracker functions in mock tests
- Include country_id='' in mock assertion expectations

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
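
A signature check of the kind described can be written with `inspect.signature`. In this sketch the two functions are hypothetical stand-ins and the six leading parameter names are placeholders, since the real names are not given here.

```python
import inspect


# Hypothetical stand-ins; arg1..arg6 are placeholder names.
def compute_decile(arg1, arg2, arg3, arg4, arg5, arg6, country_id=""):
    return country_id


def compute_poverty(arg1, arg2, arg3, arg4, arg5, arg6, **_kwargs):
    return None


def seventh_parameter(fn) -> inspect.Parameter:
    """Return the 7th parameter, failing loudly if the arity is not 7."""
    params = list(inspect.signature(fn).parameters.values())
    if len(params) != 7:
        raise AssertionError(f"{fn.__name__} has {len(params)} parameters, expected 7")
    return params[6]


seventh_names = [seventh_parameter(fn).name for fn in (compute_decile, compute_poverty)]
```

Note that `inspect` reports `**_kwargs` as a single parameter named `_kwargs` with kind `VAR_KEYWORD`, which is why a test can accept either name in seventh position.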

Each region can now link to multiple datasets (one per year, 2024-2035) via a many-to-many join table, and simulations store their year directly. seed_regions.py is the sole source of truth for region-to-dataset wiring.

- Add RegionDatasetLink model (composite PK: region_id + dataset_id)
- Remove dataset_id FK from Region, add datasets list relationship
- Add year column to Simulation model
- Alembic migration with data migration from old FK to join table
- Update analysis.py and simulations.py to resolve datasets from join table with optional year filtering
- Update seed_regions.py to create RegionDatasetLink entries based on dataset filepath patterns (states/, districts/, CPS fallback)
- Remove region update code from import_state_datasets.py

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
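
The join-table resolution with optional year filtering can be illustrated with plain `sqlite3`. The column names, sample rows, region code format, and the "latest year when none is given" fallback are all assumptions; the project itself uses SQLModel with a `RegionDatasetLink` model.

```python
import sqlite3
from typing import Optional

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE regions (id INTEGER PRIMARY KEY, code TEXT UNIQUE NOT NULL);
    CREATE TABLE datasets (id INTEGER PRIMARY KEY, filepath TEXT NOT NULL, year INTEGER NOT NULL);
    CREATE TABLE region_datasets (
        region_id INTEGER NOT NULL REFERENCES regions(id),
        dataset_id INTEGER NOT NULL REFERENCES datasets(id),
        PRIMARY KEY (region_id, dataset_id)  -- composite key: one link per pair
    );
    INSERT INTO regions VALUES (1, 'state/ca');
    INSERT INTO datasets VALUES (10, 'states/CA_2024.h5', 2024);
    INSERT INTO datasets VALUES (11, 'states/CA_2025.h5', 2025);
    INSERT INTO region_datasets VALUES (1, 10);
    INSERT INTO region_datasets VALUES (1, 11);
    """
)


def resolve_dataset(
    conn: sqlite3.Connection, region_code: str, year: Optional[int] = None
) -> Optional[tuple]:
    """Return (dataset_id, year) for a region, optionally pinned to a year."""
    query = (
        "SELECT d.id, d.year FROM datasets d "
        "JOIN region_datasets rd ON rd.dataset_id = d.id "
        "JOIN regions r ON r.id = rd.region_id "
        "WHERE r.code = ?"
    )
    params: list = [region_code]
    if year is not None:
        query += " AND d.year = ?"
        params.append(year)
    # Assumed fallback: prefer the most recent year when none is requested.
    query += " ORDER BY d.year DESC LIMIT 1"
    return conn.execute(query, params).fetchone()
```

A missing year for a region yields `None`, which the caller can turn into a 404 or a validation error as appropriate.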

Test fixtures were still setting dataset_id on Region (removed column). Now they create RegionDatasetLink entries via the join table instead. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…safety, API design

Implements all 17 issues and 8 type design recommendations from PR review:

Phase 1 - Critical bugs:
- Fix session.commit() outside context manager in modal_app.py
- Remove debug JWT decoding block from production code

Phase 2 - Security:
- Add user_id ownership checks on user-household association update/delete

Phase 3 - Error handling:
- Narrow bare except blocks to specific exceptions (FileNotFoundError, KeyError)
- Add logfire warnings for expected failures
- Raise ValueError instead of silent returns for missing reports/policies
- Wrap Modal fn.spawn() with error handling, mark reports FAILED on failure
- Add IntegrityError handling for simulation/report race conditions
- Add logging to household simulation error handler

Phase 4 - Type safety:
- Rewrite SimulationCreate with model_validator for type consistency
- Create PolicyParameterValueInput typed schema
- Add RegionType, DecileType, ReportType enums
- Add field constraints to IntraDecileImpact
- Add model_validator to RegionCreate for filter co-dependency
- Create GeographicImpactBase to reduce duplication across impact models
- Rewrite ReportCreate as standalone schema

Phase 5 - API design:
- Add pagination (limit/offset) to list_simulations
- Add model_validators to request schemas for dataset/region requirement

Also applies ruff formatting across codebase.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
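
The filter co-dependency rule on RegionCreate can be illustrated without Pydantic. The sketch below swaps in a standard-library dataclass with `__post_init__` in place of a `model_validator`; the field names come from the Region model described earlier in this PR, while the exact rules and error messages are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RegionCreate:
    code: str
    label: str
    requires_filter: bool = False
    filter_field: Optional[str] = None
    filter_value: Optional[str] = None

    def __post_init__(self) -> None:
        # The two filter fields only make sense as a pair.
        if (self.filter_field is None) != (self.filter_value is None):
            raise ValueError("filter_field and filter_value must be set together")
        # A region that requires filtering must say how to filter.
        if self.requires_filter and self.filter_field is None:
            raise ValueError("requires_filter=True needs filter_field and filter_value")


valid = RegionCreate(
    code="state/ca",
    label="California",
    requires_filter=True,
    filter_field="state_code",
    filter_value="CA",
)
```

In Pydantic the same checks would live in a method decorated with `@model_validator(mode="after")`, so that invalid combinations are rejected at request parsing time.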

…bles When a variable has possible_values (from policyengine), populate the possible_values field with JSON-encoded values and set data_type to "Enum". This allows the frontend to render dropdowns for enum variables like state_name. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

fix: Populate possible_values and set data_type='Enum' for enum variables

anth-volk and others added 30 commits February 7, 2026 01:24

feat: First household CRUD setup

51b4a17

feat: User-household associations

58cd691

feat: Household analysis

77c5a3d

fix: Improve code quality

3c28466

test: Add tests

78f8780

feat: Use Alembic for db migrations

6f90fbe

fix: Break up Alembic; add smaller seed scripts

dbb6534

test: Tests

cf8511f

fix: Fix household user models; add variable default values

34c34c4

fix: FINALLY use the ACTUAL Alembic script to generate migrations

a97f473

fix: Allow direct references in hatch metadata

6e77d4d

fix: Mark TestEconomicImpactNotFound as integration test

73b0db2

feat: add user-policy associations

5cdd5c2

anth-volk and others added 24 commits March 3, 2026 21:14

feat: Add GET /analysis/options endpoint

d3ef57c

test: Expand Phase 4 unit tests from 43 to 119

89ba96f

feat: Add POST /variables/by-name endpoint

c2efa1e

feat: Add GET /tax-benefit-models/by-country/{country_id} endpoint

fc60e7e

fix: Set tax_benefit_model_id when seeding example policies

4f66e73

fix: Update test fixtures to use RegionDatasetLink instead of dataset_id

b495485

anth-volk marked this pull request as ready for review March 3, 2026 22:22

anth-volk requested a review from nikhilwoodruff March 3, 2026 22:22

SakshiKekre and others added 2 commits March 4, 2026 13:37

Merge pull request #91 from PolicyEngine/fix/enum-possible-values-v2

2dc1bee

fix: Populate possible_values and set data_type='Enum' for enum variables

anth-volk merged commit 2099677 into main Mar 5, 2026
1 check passed

anth-volk deleted the app-v2-migration branch March 5, 2026 18:54

anth-volk merged 88 commits into main from app-v2-migration
