fix: parity harness compares against hbai_household_net_income#61
Open
vahid-ahmadi wants to merge 9 commits into
Conversation
Adds a classmethod to the Python wrapper that accepts the PolicyEngine web-app situation-JSON format (people / benunits / households with `members` lists and period-keyed values) and converts it into the three input DataFrames the Rust engine consumes.
Closes #51 in part: this is the small, low-risk piece. Datasets-from-URL and a direct dataframe entry point can follow in subsequent PRs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
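As a rough illustration of the conversion described above, the sketch below flattens a situation dictionary into three row lists, one each for people, benunits and households. It uses plain dictionaries rather than DataFrames to stay dependency-free. The function name `situation_to_rows`, the `*_id` columns and the `year` argument are assumptions made for this example, not the wrapper's actual API.

```python
from typing import Any


def situation_to_rows(situation: dict[str, Any], year: str) -> dict[str, list[dict]]:
    """Flatten a situation dict into person / benunit / household rows.

    Hypothetical helper: names and column layout are illustrative only.
    """

    def value_for(entry: dict, key: str) -> Any:
        # Period-keyed values look like {"2025": 50000}; pick the requested year.
        raw = entry[key]
        return raw.get(year) if isinstance(raw, dict) else raw

    def membership(group: str) -> dict[str, int]:
        # Map each person name to the index of the group that lists them.
        return {
            name: idx
            for idx, entity in enumerate(situation.get(group, {}).values())
            for name in entity.get("members", [])
        }

    def entity_rows(group: str, id_col: str) -> list[dict]:
        rows = []
        for idx, entry in enumerate(situation.get(group, {}).values()):
            row = {id_col: idx}
            row.update({k: value_for(entry, k) for k in entry if k != "members"})
            rows.append(row)
        return rows

    person_benunit = membership("benunits")
    person_household = membership("households")

    people = []
    for pid, (name, entry) in enumerate(situation["people"].items()):
        row = {
            "person_id": pid,
            "benunit_id": person_benunit.get(name, 0),
            "household_id": person_household.get(name, 0),
        }
        row.update({k: value_for(entry, k) for k in entry})
        people.append(row)

    return {
        "people": people,
        "benunits": entity_rows("benunits", "benunit_id"),
        "households": entity_rows("households", "household_id"),
    }
```

The `members` lists only drive the id columns on the person rows; they are dropped from the benunit and household rows themselves.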
Adds `scripts/parity.py`, which runs a fixed set of synthetic households through both the Python `policyengine-uk` package and the Rust `policyengine_uk_compiled` wrapper, diffs key tax / benefit / net-income outputs cell-for-cell, and prints a summary. Skips the Python comparison gracefully when the Python package isn't installed.
Wired into CI as a non-failing smoke step so it surfaces drift on every PR without breaking on the divergences that already exist (currently up to £3,276 on couple-with-children scenarios). Tolerance can be tightened once those gaps close.
Stacked on top of #52 (Simulation.from_situation).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
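The cell-for-cell diff can be pictured roughly as follows. This is a minimal sketch, not the contents of `scripts/parity.py`: the function name `diff_outputs`, the dictionary layout and the default tolerance of £1 are all assumptions.

```python
def diff_outputs(
    rust: dict[str, float],
    python: dict[str, float],
    tolerance: float = 1.0,  # hypothetical default, in pounds
) -> dict[str, float]:
    """Return {variable: rust - python} for variables that differ beyond tolerance.

    Only variables present on both sides are compared.
    """
    gaps = {}
    for name in sorted(rust.keys() & python.keys()):
        gap = rust[name] - python[name]
        if abs(gap) > tolerance:
            gaps[name] = gap
    return gaps
```

An empty result means the two engines agree on every shared variable to within the tolerance.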
Adds `policyengine_uk_compiled.yaml_tests` — a runner that mirrors the
format used by `policyengine_uk/tests/policy/` so cases can be ported one
at a time.
The runner accepts either single-person flat input
(`input: { employment_income: 50000 }`) or full-situation input
(`input: { people: ..., benunits: ..., households: ... }`), supports
absolute and relative error margins, and writes outputs against the Rust
microdata column names (`baseline_income_tax`, `baseline_universal_credit`,
`baseline_net_income`, etc.).
This PR ships:
- The runner module with CLI: `python -m policyengine_uk_compiled.yaml_tests tests/policy`
- 11 hand-written YAML cases under `tests/policy/` covering income tax,
employee NI, and Child Benefit (single + multi-person)
- A pytest module that auto-discovers and parametrizes the YAML cases
- 21 unit tests for the runner itself (input mapping, tolerance, parsing)
- pyyaml added to the package's runtime dependencies
Stacked on #53 (parity harness) which is itself stacked on #52
(Simulation.from_situation). Future PRs port more of the 196 Python YAML
tests that already exist in `policyengine_uk/tests/policy/`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
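The absolute and relative error margins mentioned in the runner description could be checked along these lines. This is a sketch under assumptions: the parameter names `absolute_error_margin` and `relative_error_margin`, the "either margin passes" rule, and the exact-match default when neither is given are illustrative and may differ from the runner's behaviour.

```python
from typing import Optional


def within_margin(
    actual: float,
    expected: float,
    absolute_error_margin: Optional[float] = None,
    relative_error_margin: Optional[float] = None,
) -> bool:
    """Return True if `actual` is acceptably close to `expected`."""
    diff = abs(actual - expected)
    if absolute_error_margin is None and relative_error_margin is None:
        # Assumption: with no margin configured, require an exact match.
        return diff == 0
    if absolute_error_margin is not None and diff <= absolute_error_margin:
        return True
    if relative_error_margin is not None and diff <= relative_error_margin * abs(expected):
        return True
    return False
```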
Property-transaction tax now dispatches by region:
- Scotland → LBTT (LBTT (Scotland) Act 2013)
- Wales → LTT (LTT and Anti-avoidance of Devolved Taxes (Wales) Act 2017)
- elsewhere → SDLT (Finance Act 2003 s.55, unchanged)
2025/26 residential bands per:
- SSI 2015/126 (Scotland)
- WSI 2018/128 (Wales)
Adds:
- `lbtt` and `ltt` parameter blocks in `parameters/2025_26.yaml`
- `Parameters.lbtt`/`Parameters.ltt` Rust fields and Python wrapper exposure
- `calculate_property_transaction_tax` dispatch function in `src/variables/wealth_taxes.rs`
- New `baseline_property_transaction_tax` and `reform_property_transaction_tax` per-household microdata columns
- Six Rust unit tests covering LBTT/LTT/SDLT dispatch and nil-band edges
- Six YAML policy-test cases (`tests/policy/property_transaction_tax.yaml`)
Stacked on #54 (YAML test harness).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
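The region dispatch can be sketched in Python as below. The band schedules are deliberately made-up placeholders, not the statutory 2025/26 bands, and the region strings and function names are assumptions; the real logic lives in the Rust `calculate_property_transaction_tax`.

```python
def banded_tax(price: float, bands: list[tuple[float, float]]) -> float:
    """Marginal-rate tax; `bands` is a list of (lower_threshold, rate), ascending."""
    tax = 0.0
    for i, (lower, rate) in enumerate(bands):
        upper = bands[i + 1][0] if i + 1 < len(bands) else float("inf")
        if price > lower:
            # Tax only the slice of the price that falls inside this band.
            tax += (min(price, upper) - lower) * rate
    return tax


# Placeholder schedules for illustration only, NOT the real LBTT/LTT/SDLT bands.
ILLUSTRATIVE_BANDS = {
    "lbtt": [(0, 0.0), (100_000, 0.02), (200_000, 0.05)],
    "ltt": [(0, 0.0), (150_000, 0.06)],
    "sdlt": [(0, 0.0), (120_000, 0.02)],
}


def property_transaction_tax(price: float, region: str) -> float:
    """Dispatch to LBTT in Scotland, LTT in Wales, SDLT elsewhere."""
    if region == "SCOTLAND":  # hypothetical region labels
        scheme = "lbtt"
    elif region == "WALES":
        scheme = "ltt"
    else:
        scheme = "sdlt"
    return banded_tax(price, ILLUSTRATIVE_BANDS[scheme])
```

A price at or below the first non-zero threshold falls entirely in the nil band and yields zero tax, which is the edge the nil-band tests exercise.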
Until now, PIP amount fields (`pip_daily_living`, `pip_mobility`) were
only populated from FRS recorded values; setting an eligibility flag on
a synthetic household built via `from_situation` produced £0 PIP, and
PIP-rate reforms had no effect even on FRS data when the recorded amount
sat outside the modelled rate.
This change adds:
- `PipParams` Rust struct (and Python wrapper class) with the four PIP
weekly rates: daily-living standard/enhanced and mobility standard/enhanced
- 2025/26 rates per gov.uk/pip/what-youll-get sourced under Welfare
Reform Act 2012 s.79 / SI 2013/377
- `pip_daily_living_amount` and `pip_mobility_amount` helpers in
`src/variables/benefits.rs` that:
- Pass through any FRS-recorded amount unchanged (preserves existing
calibration behaviour)
- Otherwise compute from the eligibility flag × the rate parameter
- Return 0 when neither holds or `params.pip` is unset
- `passthrough_benefits` now uses these helpers, so PIP from flags flows
into total_benefits and downstream household net income
Tests:
- 8 Rust unit tests covering the std/enh/recorded-override/no-flag/no-
params/reform-scaling paths
- 4 YAML policy-test cases covering the same paths end-to-end
Stacked on #55 (LBTT/LTT).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
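The three branches of the PIP helpers (recorded pass-through, flag × rate, zero) might look like this in Python. It is a sketch only: the function name, the `rates` dictionary standing in for `params.pip`, the flat 52-week annualisation and the "enhanced beats standard" ordering are assumptions, and the rates used in any call are up to the caller.

```python
from typing import Optional

WEEKS_PER_YEAR = 52  # assumption: weekly rates are annualised with a flat 52


def pip_component_amount(
    recorded: float,
    standard_flag: bool,
    enhanced_flag: bool,
    rates: Optional[dict[str, float]],
) -> float:
    """Annual amount for one PIP component (daily living or mobility)."""
    if recorded > 0:
        # A recorded amount passes through unchanged.
        return recorded
    if rates is None:
        # Mirrors "params.pip is unset": nothing to compute from.
        return 0.0
    if enhanced_flag:
        return rates["enhanced"] * WEEKS_PER_YEAR
    if standard_flag:
        return rates["standard"] * WEEKS_PER_YEAR
    return 0.0
```

Because the flag branch reads the rate from the parameters, a reform that changes the weekly rate changes the computed amount, while recorded amounts are left alone.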
Extends the pattern from #56 (PIP) to:
- DLA care component (low/mid/high): SSCBA 1992 Sch.2 para.2
- DLA mobility component (low/high): SSCBA 1992 Sch.2 para.3
- Attendance Allowance (low/high): SSCBA 1992 s.64
Synthetic households that set `dla_care_*` / `dla_mob_*` / `aa_*` eligibility flags now produce non-zero amounts via the new `DlaParams` and `AaParams` structs (with 2025/26 weekly rates from gov.uk). FRS-recorded amounts continue to pass through unchanged.
Adds:
- 2025/26 rates in `parameters/2025_26.yaml`: DLA care low/mid/high £29.20/£73.90/£110.40 weekly, DLA mob low/high £29.20/£77.05 weekly, AA low/high £73.90/£110.40 weekly
- Helpers `dla_care_amount`, `dla_mobility_amount`, `attendance_allowance_amount` in `src/variables/benefits.rs`
- 10 Rust unit tests (recorded-override / no-flag / per-band-rate / passthrough flow)
- 4 YAML policy-test cases under `tests/policy/dla_aa.yaml`
- Python wrapper exposure (`DlaParams`, `AaParams`, `Parameters.dla`, `Parameters.aa`)
Stacked on #56.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
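For a sense of scale, the weekly rates quoted in this commit can be turned into annual amounts with a small lookup. The table keys, the function name and the flat 52-week annualisation are assumptions for illustration; only the weekly figures come from the commit message.

```python
from typing import Optional

# Weekly 2025/26 rates as quoted in the commit message.
WEEKLY_RATES = {
    "dla_care": {"low": 29.20, "mid": 73.90, "high": 110.40},
    "dla_mobility": {"low": 29.20, "high": 77.05},
    "attendance_allowance": {"low": 73.90, "high": 110.40},
}


def annual_amount(benefit: str, band: Optional[str], recorded: float = 0.0) -> float:
    """Recorded amount if present, else weekly band rate x 52, else zero."""
    if recorded > 0:
        return recorded
    if band is None:
        return 0.0
    return WEEKLY_RATES[benefit][band] * 52  # assumption: flat 52 weeks
```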
Households with exactly one adult (18+) now receive a 25% discount on the calculated council tax (Local Government Finance Act 1992 s.11(1)(a)).
Adds:
- `single_person_discount_rate` field on `CouncilTaxParams` (default 0.25)
- Updates `calculate_council_tax(hh, params, is_single_adult)` to apply the discount
- Counts adults via `Person::is_adult()` (age >= 18) in `simulation.rs`
- New `baseline_council_tax_calculated` / `reform_council_tax_calculated` per-household microdata columns
- First-time exposure of `CouncilTaxParams` in the Python wrapper
- 3 new Rust unit tests (band D + band A discount, zero-discount-rate edge)
- 4 new YAML policy-test cases (`tests/policy/council_tax.yaml`)
The baseline run still uses the FRS-recorded `hh.council_tax` for net income; the calculated value is for reform modelling, where reforms to either the band-D rate or the discount fraction now take effect.
Stacked on #57.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
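The single-person discount rule reduces to a short calculation. The sketch below is a Python paraphrase with hypothetical names and an illustrative band amount; the actual implementation is the Rust `calculate_council_tax`.

```python
def calculated_council_tax(
    band_amount: float,
    ages: list[int],
    single_person_discount_rate: float = 0.25,
) -> float:
    """Apply the discount when exactly one household member is aged 18 or over."""
    adults = sum(1 for age in ages if age >= 18)
    if adults == 1:
        return band_amount * (1 - single_person_discount_rate)
    return band_amount
```

Setting `single_person_discount_rate` to 0 leaves the band amount untouched, which corresponds to the zero-discount-rate edge case in the unit tests.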
Mirrors the existing old-SP scaling pattern for the new-SP cohort:
- If `person.state_pension > 0`: pass through, scaled by `(new_state_pension_weekly / baseline_new_sp_weekly)` for reform correctness
- Else: fall back to `new_state_pension_weekly × 52`
Previously the new-SP branch always returned the full parameter rate × 52, ignoring any recorded amount. This over-stated SP for partial-year claimants and broke parity for the pensioner_couple synthetic scenario in PR #53's parity harness (£946 diff).
Implementation:
- Plumb `baseline_new_sp_weekly` through `Simulation`, `calculate_benunit`, `calculate_state_pension`, and `person_state_pension`, parallel to the existing `baseline_old_sp_weekly` field
- 3 new Rust unit tests (recorded-amount preserved, fallback to param when no record, recorded amount scales under reform)
Parity-harness impact (synthetic pensioner_couple scenario):
  state_pension rust=23,000 py=23,000 diff=£0 (was £946)
  household_net_income diff=£-41 (was £905)
Stacked on #58. Closes #59 (filed today as a follow-up to PR #53).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
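The two branches described above translate to a few lines of Python. This is a paraphrase of the rule, not the Rust `person_state_pension`; the function name is hypothetical and the weekly rates in any call are illustrative.

```python
def new_state_pension_annual(
    recorded_annual: float,
    new_state_pension_weekly: float,
    baseline_new_sp_weekly: float,
) -> float:
    """Recorded amount scaled by the reform/baseline ratio, else full rate x 52."""
    if recorded_annual > 0:
        # Under the baseline the ratio is 1, so the recorded amount is preserved.
        return recorded_annual * (new_state_pension_weekly / baseline_new_sp_weekly)
    return new_state_pension_weekly * 52
```

With a recorded amount of £10,000 and a reform that lifts an illustrative weekly rate from £200 to £220, the result scales by 1.1 to about £11,000 rather than jumping to the full-rate figure.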
Rust's `baseline_net_income` is the HBAI net-income definition (gross minus direct taxes plus benefits, excluding council tax / TV licence / transaction taxes). The parity harness was comparing it against Python's broader `household_net_income`, which subtracts council_tax, TV licence, expected_sdlt/lbtt/ltt, etc., on top.
Net effect: every single scenario showed a £159 diff that was exclusively the TV licence (£174.50 × ~0.911 take-up). That diff masked the real, smaller divergences and made the harness's output look worse than it was.
Switching to `hbai_household_net_income` reveals:
- single/couple scenarios: £1.20 / £2.40 diffs (just employer-NI rounding)
- lone_parent_2kids: £554 (real UC entitlement gap)
- pensioner_couple: £200 (Winter Fuel Allowance: Python includes it, Rust doesn't yet)
- scotland_single_45k: £1.20
The headline "couple_2kids £3,276 UC gap" from the original PR #53 description was an artefact of this measurement bug: that scenario now shows £2.40, well within tolerance.
Stacked on #60.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
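The £159 figure can be checked directly from the two numbers quoted above. The variable and function names here are made up for the check; the take-up value is the approximate 0.911 from the commit message.

```python
def expected_tv_licence_cost(fee: float, take_up: float) -> float:
    """Expected annual TV licence cost per household: fee x take-up rate."""
    return fee * take_up


gap = expected_tv_licence_cost(174.50, 0.911)
print(f"{gap:.2f}")  # 158.97, which rounds to the £159 diff seen in every scenario
```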
Superseded — recommend closing. The fix here (comparing against This PR also can't stand alone on |
Summary
Quick fix for the parity harness from PR #53. Rust's `baseline_net_income` is the HBAI net-income definition (gross minus direct taxes plus benefits, excluding council tax, TV licence, and transaction taxes). The harness was comparing it against Python's broader `household_net_income`, which subtracts those extras on top.
Net effect: every single scenario showed a £159 diff that was exclusively the TV licence (£174.50 × ~0.911 take-up). That noise masked the real, smaller divergences.
Before / after
The headline "couple_2kids £3,276 UC gap" claim from the original parity-harness PR was an artefact of this measurement bug — that scenario now shows £2.40, well within tolerance.
Real gaps remaining
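After this and #60 (state-pension recorded amount), the meaningful parity divergences are:
- lone_parent_2kids: £554 (real UC entitlement gap)
- pensioner_couple: £200 (Winter Fuel Allowance: Python includes it, Rust doesn't yet)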
Both are concrete next-PR candidates.
What's included
- `scripts/parity.py`: `household_net_income` → `hbai_household_net_income`
- `interfaces/python/tests/test_parity_harness.py`
- `changelog.d/fixed/`
Verified locally
- `cargo test`: 165 passing (unchanged)
- `pytest interfaces/python/tests`: 87 passing
- `python -m policyengine_uk_compiled.yaml_tests tests/policy`: 29/29
Stacking
`vahid/parity-harness-hbai` ← `vahid/state-pension-recorded` (#60) ← `vahid/council-tax-spd` (#58) ← `vahid/dla-aa-from-flags` (#57) ← `vahid/pip-from-flags` (#56) ← `vahid/lbtt-ltt` (#55) ← `vahid/yaml-test-harness` (#54) ← `vahid/parity-harness` (#53) ← `vahid/from-situation` (#52). Nine-deep stack.
🤖 Generated with Claude Code