Skip to content

Add AOTC eligibility-input imputations (G3)#132

Merged
MaxGhenis merged 3 commits into
mainfrom
g3-aotc
Jun 1, 2026
Merged

Add AOTC eligibility-input imputations (G3)#132
MaxGhenis merged 3 commits into
mainfrom
g3-aotc

Conversation

@MaxGhenis
Copy link
Copy Markdown
Contributor

Summary

Implements the eight American Opportunity Tax Credit (AOTC) factual eligibility-input columns the eCPS export contract requires, mirroring the enhanced-CPS baseline.

  • is_pursuing_credential_for_american_opportunity_credit
  • attends_eligible_educational_institution_for_american_opportunity_credit
  • is_enrolled_at_least_half_time_for_american_opportunity_credit
  • has_american_opportunity_credit_1098_t_or_exception
  • has_american_opportunity_credit_institution_ein
  • has_completed_first_four_years_of_postsecondary_education
  • american_opportunity_credit_claimed_prior_years
  • has_felony_drug_conviction

eCPS source mirrored

PolicyEngine/policyengine-us-data policyengine_us_data/datasets/cps/extended_cps.py:1204-1369 (ExtendedCPS._impute_aotc_eligibility_inputs), column tuple AOTC_ELIGIBILITY_INPUTS at extended_cps.py:61-71, and the back-solve helpers in policyengine_us_data/utils/aotc.py. The driving signal is the PUF-imputed american_opportunity_credit (PUF E87521; policyengine_us_data datasets/puf/puf.py:707).

Dependency availability (verified)

Microplex already produces every input the credit-driven eCPS path needs, all of which land on the person table keyed by tax_unit_id after build_policyengine_entity_tables runs the tax-unit split:

signal source in MP
american_opportunity_credit PUF E87521 (manifests/puf.json, microdata_roles.py); a PUF calculated-tax output broadcast per tax unit
qualified_tuition_expenses PUF E03230
is_full_time_college_student carried on the person frame (default False when absent)
is_tax_unit_dependent computed in the CPS pipeline

So the full credit-driven construction runs (not the fallback). No PUF signal is missing; nothing is fabricated.

Implementation

USMicroplexPipeline._construct_aotc_eligibility_inputs (in src/microplex_us/pipelines/us.py) is called inside build_policyengine_entity_tables immediately after _build_policyengine_tax_units. Because Microplex carries the eCPS dict-of-arrays signals as columns on a single person DataFrame keyed by tax_unit_id, the per-tax-unit back-solve is the same algorithm applied to one frame:

  1. For each tax unit with positive american_opportunity_credit, back-solve per-student qualified_tuition_expenses from the credit using the PolicyEngine-US AOTC amount schedule (src/microplex_us/policyengine/aotc.py, a logic-for-logic port of utils/aotc.py).
  2. Select students by the eCPS priority: positive tuition -> full-time college student -> tax-unit dependent -> any member, until the credit is exhausted.
  3. With no credit signal, fall back to the eCPS rule aotc_student = qualified_tuition_expenses > 0.
  4. Selected students get the five factual flags True; has_completed_first_four_years_of_postsecondary_education and has_felony_drug_conviction False; american_opportunity_credit_claimed_prior_years clamped to a max of 3.

All eight columns are added to SAFE_POLICYENGINE_US_EXPORT_VARIABLES and given False/0 defaults in POLICYENGINE_US_EXPORT_DEFAULTS (src/microplex_us/policyengine/us.py) so the contract-required columns always export even when no positive AOTC signal exists. american_opportunity_credit remains a PUF calculated-tax output and is not exported; PolicyEngine-US recomputes the credit from these inputs.

Per-column provenance

column value basis
5 factual flags (pursuing / attends / half-time / 1098-T / EIN) True for selected students, else default False real selection driven by the PUF credit signal; eCPS sets these True for AOTC students
has_completed_first_four_years_of_postsecondary_education False constant in eCPS (extended_cps.py:1345-1349)
has_felony_drug_conviction False constant in eCPS (extended_cps.py:1345-1349)
american_opportunity_credit_claimed_prior_years min(existing, 3) else 0 eCPS prior_years[aotc_student] = np.minimum(prior_years[aotc_student], 3) (extended_cps.py:1351-1359)

The student selection is real (the actual PUF credit / tuition signal); only the documented eCPS constants are constant.

PolicyEngine-US storability (installed policyengine_us==1.715.2)

All eight leaves are storable inputs (def formula count = 0, no formulas/adds/subtracts), entity = Person: seven bool, plus american_opportunity_credit_claimed_prior_years (int). No default_value override, so PE-US type defaults (False / 0) match the export defaults.

Tests

New tests/pipelines/test_us_aotc_eligibility_inputs.py exercises the real construction and export path: safe-export/defaults registration, the credit-driven back-solve ($2,500 -> $4,000; $1,250 -> $1,250 overwriting a reported $2,000), full-time-student selection when no member has tuition, the no-credit tuition fallback, prior-year clamping to 3, the zero-credit early return, and the full export emitting all eight columns (with defaults when no signal is present) while excluding american_opportunity_credit. Credit-driven cases importorskip policyengine_us.

Results (env with policyengine_us + microunit): 218 passed across the new file, test_us.py, test_check_export_columns.py, test_mp300k_artifact_gates.py, and test_forbidden_export_block.py; ruff check clean on all changed files.

Two pre-existing failures on origin/main are unrelated to this change and reproduce on the clean base: the 7 microunit-dependent entity-table tests fail only when microunit is not installed, and test_default_policyengine_us_export_surface_avoids_formula_aggregates already fails on origin/main for social_security_retirement (not any AOTC column).

🤖 Generated with Claude Code

MaxGhenis and others added 3 commits June 1, 2026 06:47
Populate the eight American Opportunity Tax Credit factual eligibility
inputs the eCPS export contract requires, driven by the PUF-imputed
american_opportunity_credit signal (PUF E87521). Mirrors the enhanced-CPS
baseline PolicyEngine/policyengine-us-data
policyengine_us_data/datasets/cps/extended_cps.py:1204-1369
(_impute_aotc_eligibility_inputs) and policyengine_us_data/utils/aotc.py.

- src/microplex_us/policyengine/aotc.py: port of utils/aotc.py
  back-solve helpers (max credit per student; minimum qualifying
  expenses generating a given credit) off PolicyEngine-US parameters.
- pipelines/us.py: _construct_aotc_eligibility_inputs runs in
  build_policyengine_entity_tables after the tax-unit split, where the
  person table already carries american_opportunity_credit,
  qualified_tuition_expenses, is_full_time_college_student and
  is_tax_unit_dependent keyed by tax_unit_id. Per tax unit with positive
  credit it back-solves per-student tuition and selects students by the
  eCPS priority (tuition>0 -> full-time college student -> tax-unit
  dependent -> any member); falls back to qualified_tuition_expenses>0
  when no credit signal is present. Selected students get the five
  factual flags True, has_completed_first_four_years and
  has_felony_drug_conviction False, and claimed_prior_years clamped to 3.
- policyengine/us.py: register the eight columns in
  SAFE_POLICYENGINE_US_EXPORT_VARIABLES and add False/0 defaults so the
  contract-required columns always export even with no positive signal.

american_opportunity_credit remains a PUF calculated-tax output and is
not exported; PolicyEngine-US recomputes the credit from these inputs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Cover USMicroplexPipeline._construct_aotc_eligibility_inputs and the
export path: the eight contract-required AOTC columns are registered as
safe export variables with False/0 defaults; the credit-driven back-solve
selects students by the eCPS priority and rewrites per-student tuition
($2,500 -> $4,000; $1,250 -> $1,250); the no-credit fallback marks
tuition holders; prior-year claims clamp to 3; and the export emits all
eight columns (with defaults when no signal is present) while excluding
the PUF american_opportunity_credit driver. Credit-driven cases
importorskip policyengine_us so they run where PE-US is installed and
skip otherwise.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…t members

Cycle review found _construct_aotc_eligibility_inputs diverged from the eCPS
baseline it mirrors. eCPS (_impute_aotc_eligibility_inputs in
PolicyEngine/policyengine-us-data, unmerged branch codex/fix-aotc-eligibility):
  - if any tax-unit member reports positive qualified tuition, flag ALL such
    members and leave their reported tuition unchanged (no back-solve, no
    overwrite);
  - otherwise select one student by priority (full-time -> tax-unit dependent
    -> any member) and back-solve only that student's tuition to the
    credit-implied minimum.

The PR instead ran a credit-exhaustion loop that overwrote reported tuition and
could flag only a subset of tuition-positive members, changing the exported
qualified_tuition_expenses (a contract-required column) relative to eCPS on the
common positive-credit-with-tuition case. Ported the eCPS branch structure
exactly; the numeric back-solve helper (verified equal to eCPS's inverse across
14 credit values) is unchanged. Dropped the now-unused
maximum_american_opportunity_credit_per_student import.

Also corrected fabricated provenance citations (puf.py:768, extended_cps.py
:1204-1369 / :61-71, AOTC_ELIGIBILITY_INPUTS, utils/aotc.py -- none exist) to
cite the real unmerged eCPS branch, and updated the tests accordingly:
preserve-existing-tuition, flag-all-tuition-positive-members (new), and the
no-tuition back-solve path.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@MaxGhenis MaxGhenis marked this pull request as ready for review June 1, 2026 10:59
@MaxGhenis MaxGhenis merged commit e205c35 into main Jun 1, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant