fix: parity harness compares against hbai_household_net_income by vahid-ahmadi · Pull Request #61 · PolicyEngine/policyengine-uk-rust

vahid-ahmadi · 2026-05-01T11:07:11Z

Summary

Quick fix for the parity harness from PR #53. Rust's baseline_net_income is the HBAI net-income definition (gross minus direct taxes plus benefits — excluding council tax, TV licence, transaction taxes). The harness was comparing it against Python's broader household_net_income, which subtracts those extras on top.

Net effect: every single scenario showed a £159 diff that was exclusively the TV licence (£174.50 × ~0.911 take-up). That noise masked the real, smaller divergences.
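
The £159 figure can be reproduced from the two numbers quoted above. This is only a back-of-envelope check; the constant names are illustrative and do not come from the codebase.

```python
# Figures quoted in the PR summary; the names are illustrative only.
TV_LICENCE_FEE = 174.50   # annual fee in pounds
TAKE_UP_RATE = 0.911      # approximate take-up ("~0.911" in the text)

# Expected per-household gap if the only difference between the two
# net-income definitions is the TV licence.
expected_diff = TV_LICENCE_FEE * TAKE_UP_RATE
print(f"expected TV-licence diff: £{expected_diff:.2f}")
```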

Before / after

                                        BEFORE       AFTER
single_£0 / £12k / … / £150k     diff   £159         £1.20
couple_no_kids_40k_25k           diff   £159         £2.40
couple_2kids_30k_15k             diff   £3,276       £2.40   ← biggest "fix"
lone_parent_2kids_18k            diff   £2,722       £554
pensioner_couple                 diff   £905         £200
scotland_single_45k              diff   £159         £1.20

The headline "couple_2kids £3,276 UC gap" claim from the original parity-harness PR was an artefact of this measurement bug — that scenario now shows £2.40, well within tolerance.

Real gaps remaining

After this and #60 (state-pension recorded amount), the meaningful parity divergences are:

scenario                 diff    likely cause
lone_parent_2kids_18k    £554    Real UC entitlement gap (Python > Rust)
pensioner_couple         £200    Winter Fuel Allowance (Python includes, Rust doesn't yet)
everyone else            £1–2    employer-NI £-1 rounding

Both are concrete next-PR candidates.

What's included

One-line variable swap in scripts/parity.py: household_net_income → hbai_household_net_income
Updated test assertion in interfaces/python/tests/test_parity_harness.py
Changelog fragment under changelog.d/fixed/

Verified locally

cargo test: 165 passing (unchanged)
pytest interfaces/python/tests: 87 passing
python -m policyengine_uk_compiled.yaml_tests tests/policy: 29/29
Parity harness: 5 scenarios within tolerance (vs 0 before), 3 real gaps (was 11)

Stacking

vahid/parity-harness-hbai ← vahid/state-pension-recorded (#60) ← vahid/council-tax-spd (#58) ← vahid/dla-aa-from-flags (#57) ← vahid/pip-from-flags (#56) ← vahid/lbtt-ltt (#55) ← vahid/yaml-test-harness (#54) ← vahid/parity-harness (#53) ← vahid/from-situation (#52). Nine-deep stack.

🤖 Generated with Claude Code

Adds a classmethod to the Python wrapper that accepts the PolicyEngine web-app situation-JSON format (people / benunits / households with `members` lists and period-keyed values) and converts it into the three input DataFrames the Rust engine consumes. Closes #51 in part — the small, low-risk piece. Datasets-from-URL and a direct dataframe entry point can follow in subsequent PRs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
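
A minimal sketch of the kind of flattening described above, assuming a situation dict with `people`, `benunits` and `households` keys, `members` lists, and period-keyed values. The function name `situation_to_rows`, the output column names, and the use of plain row lists instead of DataFrames are all assumptions for illustration, not the wrapper's actual API.

```python
def situation_to_rows(situation: dict, period: str):
    """Flatten a situation dict into people / benunit / household row lists."""

    def value(entity: dict, name: str, default=0.0):
        # Period-keyed values look like {"2025": 30000}.
        return entity.get(name, {}).get(period, default)

    # One row per person, keyed by name so membership can be attached below.
    people = {
        name: {
            "person_id": i,
            "age": value(p, "age"),
            "employment_income": value(p, "employment_income"),
        }
        for i, (name, p) in enumerate(situation["people"].items())
    }

    benunits = []
    for i, bu in enumerate(situation["benunits"].values()):
        benunits.append({"benunit_id": i})
        for member in bu["members"]:
            people[member]["benunit_id"] = i

    households = []
    for i, hh in enumerate(situation["households"].values()):
        households.append({"household_id": i, "region": value(hh, "region", None)})
        for member in hh["members"]:
            people[member]["household_id"] = i

    return list(people.values()), benunits, households


# Hypothetical two-adult household, for illustration only.
example = {
    "people": {
        "adult_1": {"age": {"2025": 40}, "employment_income": {"2025": 40_000}},
        "adult_2": {"age": {"2025": 38}, "employment_income": {"2025": 25_000}},
    },
    "benunits": {"bu": {"members": ["adult_1", "adult_2"]}},
    "households": {"hh": {"members": ["adult_1", "adult_2"],
                          "region": {"2025": "LONDON"}}},
}
people_rows, benunit_rows, household_rows = situation_to_rows(example, "2025")
```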

Adds `scripts/parity.py`, which runs a fixed set of synthetic households through both the Python `policyengine-uk` package and the Rust `policyengine_uk_compiled` wrapper, diffs key tax / benefit / net-income outputs cell-for-cell, and prints a summary. Skips Python comparison gracefully when the Python package isn't installed. Wired into CI as a non-failing smoke step so it surfaces drift on every PR without breaking on the divergences that already exist (currently up to £3,276 on couple-with-children scenarios). Tolerance can be tightened once those gaps close. Stacked on top of #52 (Simulation.from_situation). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
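
The cell-for-cell diff could look roughly like the sketch below. It assumes each engine's outputs have already been collected into a dict per scenario; `diff_outputs`, the `rust - python` sign convention, the tolerance value and the sample numbers are all hypothetical and not taken from `scripts/parity.py`.

```python
def diff_outputs(rust: dict, python: dict, tolerance: float = 5.0):
    """Compare the variables both engines report and flag out-of-tolerance cells."""
    rows = []
    for variable in sorted(rust.keys() & python.keys()):
        diff = rust[variable] - python[variable]  # assumed sign convention
        rows.append({
            "variable": variable,
            "rust": rust[variable],
            "python": python[variable],
            "diff": diff,
            "within_tolerance": abs(diff) <= tolerance,
        })
    return rows


# Illustrative numbers only, not harness output.
rust_out = {"income_tax": 5_486.0, "net_income": 33_000.0}
python_out = {"income_tax": 5_486.0, "net_income": 32_841.0}

report = diff_outputs(rust_out, python_out)
for row in report:
    status = "ok" if row["within_tolerance"] else "DIFF"
    print(f'{row["variable"]:<12} diff={row["diff"]:>8.2f}  {status}')
```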

Adds `policyengine_uk_compiled.yaml_tests`, a runner that mirrors the format used by `policyengine_uk/tests/policy/` so cases can be ported one at a time. The runner accepts either single-person flat input (`input: { employment_income: 50000 }`) or full-situation input (`input: { people: ..., benunits: ..., households: ... }`), supports absolute and relative error margins, and writes outputs against the Rust microdata column names (`baseline_income_tax`, `baseline_universal_credit`, `baseline_net_income`, etc.).

This PR ships:
- The runner module with CLI: `python -m policyengine_uk_compiled.yaml_tests tests/policy`
- 11 hand-written YAML cases under `tests/policy/` covering income tax, employee NI, and Child Benefit (single + multi-person)
- A pytest module that auto-discovers and parametrizes the YAML cases
- 21 unit tests for the runner itself (input mapping, tolerance, parsing)
- pyyaml added to the package's runtime dependencies

Stacked on #53 (parity harness) which is itself stacked on #52 (Simulation.from_situation). Future PRs port more of the 196 Python YAML tests that already exist in `policyengine_uk/tests/policy/`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
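
One way the absolute and relative error margins might be checked is sketched below. The function name `within_margin`, its parameter names, and the rule that satisfying either margin is enough are assumptions; the runner's real semantics may differ.

```python
def within_margin(actual, expected, absolute_margin=None, relative_margin=None):
    """Return True when `actual` is acceptably close to `expected`."""
    error = abs(actual - expected)
    if absolute_margin is None and relative_margin is None:
        # No margin configured: require an exact match.
        return error == 0
    ok_abs = absolute_margin is not None and error <= absolute_margin
    ok_rel = relative_margin is not None and error <= relative_margin * abs(expected)
    # Assumption: passing either configured margin is sufficient.
    return ok_abs or ok_rel


print(within_margin(7_486.2, 7_486.0, absolute_margin=1.0))
print(within_margin(7_600.0, 7_486.0, relative_margin=0.01))
```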

Property-transaction tax now dispatches by region:
- Scotland → LBTT (LBTT (Scotland) Act 2013)
- Wales → LTT (LTT and Anti-avoidance of Devolved Taxes (Wales) Act 2017)
- elsewhere → SDLT (Finance Act 2003 s.55, unchanged)

2025/26 residential bands per:
- SSI 2015/126 (Scotland)
- WSI 2018/128 (Wales)

Adds:
- `lbtt` and `ltt` parameter blocks in `parameters/2025_26.yaml`
- `Parameters.lbtt`/`Parameters.ltt` Rust fields and Python wrapper exposure
- `calculate_property_transaction_tax` dispatch function in `src/variables/wealth_taxes.rs`
- New `baseline_property_transaction_tax` and `reform_property_transaction_tax` per-household microdata columns
- Six Rust unit tests covering LBTT/LTT/SDLT dispatch and nil-band edges
- Six YAML policy-test cases (`tests/policy/property_transaction_tax.yaml`)

Stacked on #54 (YAML test harness).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
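
The region dispatch plus a marginal-band calculation can be sketched in Python as follows. The real implementation is the Rust function named above; the region keys, helper names and especially the band thresholds and rates here are placeholders chosen for easy arithmetic, not the statutory 2025/26 values.

```python
def banded_tax(price: float, bands) -> float:
    """Marginal-rate tax. `bands` is an ascending list of (lower_threshold, rate)."""
    tax = 0.0
    for i, (lower, rate) in enumerate(bands):
        upper = bands[i + 1][0] if i + 1 < len(bands) else float("inf")
        if price > lower:
            # Only the slice of the price that falls inside this band is taxed.
            tax += (min(price, upper) - lower) * rate
    return tax


# PLACEHOLDER bands for illustration; not the real LBTT / LTT / SDLT schedules.
ILLUSTRATIVE_BANDS = {
    "SCOTLAND": [(0, 0.0), (100_000, 0.02), (200_000, 0.05)],  # stands in for LBTT
    "WALES": [(0, 0.0), (150_000, 0.05)],                      # stands in for LTT
    "DEFAULT": [(0, 0.0), (120_000, 0.02)],                    # stands in for SDLT
}


def property_transaction_tax(region: str, price: float) -> float:
    """Pick the schedule by region, falling back to the default elsewhere."""
    bands = ILLUSTRATIVE_BANDS.get(region, ILLUSTRATIVE_BANDS["DEFAULT"])
    return banded_tax(price, bands)


scotland_tax = property_transaction_tax("SCOTLAND", 250_000)
wales_tax = property_transaction_tax("WALES", 100_000)   # inside the nil band
england_tax = property_transaction_tax("LONDON", 220_000)
```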

Until now, PIP amount fields (`pip_daily_living`, `pip_mobility`) were only populated from FRS recorded values; setting an eligibility flag on a synthetic household built via `from_situation` produced £0 PIP, and PIP-rate reforms had no effect even on FRS data when the recorded amount sat outside the modelled rate.

This change adds:
- `PipParams` Rust struct (and Python wrapper class) with the four PIP weekly rates: daily-living standard/enhanced and mobility standard/enhanced
- 2025/26 rates per gov.uk/pip/what-youll-get sourced under Welfare Reform Act 2012 s.79 / SI 2013/377
- `pip_daily_living_amount` and `pip_mobility_amount` helpers in `src/variables/benefits.rs` that:
  - Pass through any FRS-recorded amount unchanged (preserves existing calibration behaviour)
  - Otherwise compute from the eligibility flag × the rate parameter
  - Return 0 when neither holds or `params.pip` is unset
- `passthrough_benefits` now uses these helpers, so PIP from flags flows into total_benefits and downstream household net income

Tests:
- 8 Rust unit tests covering the std/enh/recorded-override/no-flag/no-params/reform-scaling paths
- 4 YAML policy-test cases covering the same paths end-to-end

Stacked on #55 (LBTT/LTT).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
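
The three-way rule for the helpers (recorded amount, else flag × rate, else zero) might be expressed like this in Python. The real helpers are Rust; the field names, the weekly-to-annual conversion by 52, the precedence of a recorded amount over unset params, and the sample rates are assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import Optional

WEEKS_PER_YEAR = 52  # assumption: amounts are annualised from weekly rates


@dataclass
class PipParams:
    # Hypothetical field names; weekly rates in pounds.
    daily_living_standard: float
    daily_living_enhanced: float


def pip_daily_living_amount(recorded: float, standard_flag: bool,
                            enhanced_flag: bool,
                            params: Optional[PipParams]) -> float:
    if recorded > 0:
        return recorded            # recorded amount passes through unchanged
    if params is None:
        return 0.0                 # no PIP parameters configured
    if enhanced_flag:
        return params.daily_living_enhanced * WEEKS_PER_YEAR
    if standard_flag:
        return params.daily_living_standard * WEEKS_PER_YEAR
    return 0.0                     # no record and no eligibility flag


# Placeholder rates, not the 2025/26 values.
demo_params = PipParams(daily_living_standard=70.0, daily_living_enhanced=100.0)
from_flag = pip_daily_living_amount(0.0, True, False, demo_params)
from_record = pip_daily_living_amount(4_000.0, True, False, demo_params)
```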

Extends the pattern from #56 (PIP) to:
- DLA care component (low/mid/high): SSCBA 1992 Sch.2 para.2
- DLA mobility component (low/high): SSCBA 1992 Sch.2 para.3
- Attendance Allowance (low/high): SSCBA 1992 s.64

Synthetic households that set `dla_care_*` / `dla_mob_*` / `aa_*` eligibility flags now produce non-zero amounts via the new `DlaParams` and `AaParams` structs (with 2025/26 weekly rates from gov.uk). FRS-recorded amounts continue to pass through unchanged.

Adds:
- 2025/26 rates in `parameters/2025_26.yaml`: DLA care low/mid/high £29.20/£73.90/£110.40 weekly, DLA mob low/high £29.20/£77.05 weekly, AA low/high £73.90/£110.40 weekly
- Helpers `dla_care_amount`, `dla_mobility_amount`, `attendance_allowance_amount` in `src/variables/benefits.rs`
- 10 Rust unit tests (recorded-override / no-flag / per-band-rate / passthrough flow)
- 4 YAML policy-test cases under `tests/policy/dla_aa.yaml`
- Python wrapper exposure (`DlaParams`, `AaParams`, `Parameters.dla`, `Parameters.aa`)

Stacked on #56.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Households with exactly one adult (18+) now receive a 25% discount on the calculated council tax (Local Government Finance Act 1992 s.11(1)(a)).

Adds:
- `single_person_discount_rate` field on `CouncilTaxParams` (default 0.25)
- Updates `calculate_council_tax(hh, params, is_single_adult)` to apply the discount
- Counts adults via `Person::is_adult()` (age >= 18) in `simulation.rs`
- New `baseline_council_tax_calculated` / `reform_council_tax_calculated` per-household microdata columns
- First-time exposure of `CouncilTaxParams` in the Python wrapper
- 3 new Rust unit tests (band D + band A discount, zero-discount-rate edge)
- 4 new YAML policy-test cases (`tests/policy/council_tax.yaml`)

The baseline run still uses the FRS-recorded `hh.council_tax` for net income; the calculated value is for reform modelling, where reforms to either the band-D rate or the discount fraction now take effect.

Stacked on #57.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
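
The discount rule reduces to a few lines. The sketch below is a Python illustration of the logic described above rather than the Rust code; the function signature (taking a pre-computed band amount and a list of ages) and the £2,000 band amount are hypothetical.

```python
def is_adult(age: int) -> bool:
    return age >= 18


def calculate_council_tax(band_amount: float, ages: list,
                          single_person_discount_rate: float = 0.25) -> float:
    """Apply the single-person discount when exactly one resident is an adult."""
    is_single_adult = sum(1 for age in ages if is_adult(age)) == 1
    if is_single_adult:
        return band_amount * (1 - single_person_discount_rate)
    return band_amount


# Hypothetical band amount of £2,000 a year.
lone_parent = calculate_council_tax(2_000.0, [35, 6, 3])   # one adult, two children
couple = calculate_council_tax(2_000.0, [40, 38])          # two adults, no discount
no_discount = calculate_council_tax(2_000.0, [35], single_person_discount_rate=0.0)
```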

Mirrors the existing old-SP scaling pattern for the new-SP cohort:
- If `person.state_pension > 0`: pass through, scaled by `(new_state_pension_weekly / baseline_new_sp_weekly)` for reform correctness
- Else: fall back to `new_state_pension_weekly × 52`

Previously the new-SP branch always returned the full parameter rate × 52, ignoring any recorded amount. This over-stated SP for partial-year claimants and broke parity for the pensioner_couple synthetic scenario in PR #53's parity harness (£946 diff).

Implementation:
- Plumb `baseline_new_sp_weekly` through `Simulation`, `calculate_benunit`, `calculate_state_pension`, and `person_state_pension`, parallel to the existing `baseline_old_sp_weekly` field
- 3 new Rust unit tests (recorded-amount preserved, fallback to param when no record, recorded amount scales under reform)

Parity-harness impact (synthetic pensioner_couple scenario):
- state_pension rust=23,000 py=23,000 diff=£0 (was £946)
- household_net_income diff=£-41 (was £905)

Stacked on #58. Closes #59 (filed today as a follow-up to PR #53).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
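
The two branches can be illustrated with a short Python function. The parameter names follow the commit message, but the function itself and the weekly rates used below (£200 baseline, £220 reform) are hypothetical.

```python
def new_state_pension(recorded: float, new_state_pension_weekly: float,
                      baseline_new_sp_weekly: float) -> float:
    """Annual new state pension for one person."""
    if recorded > 0:
        # Keep the recorded amount, scaled so that a reform to the weekly
        # rate moves it proportionally.
        return recorded * (new_state_pension_weekly / baseline_new_sp_weekly)
    # No recorded amount: fall back to the full parameter rate for 52 weeks.
    return new_state_pension_weekly * 52


# Hypothetical rates for illustration.
baseline_run = new_state_pension(8_000.0, 200.0, 200.0)   # recorded amount preserved
reform_run = new_state_pension(8_000.0, 220.0, 200.0)     # scaled by 220 / 200
fallback = new_state_pension(0.0, 200.0, 200.0)           # full rate x 52
```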

Rust's `baseline_net_income` is the HBAI net-income definition (gross minus direct taxes plus benefits, excluding council tax / TV licence / transaction taxes). The parity harness was comparing it against Python's broader `household_net_income`, which subtracts council_tax, TV licence, expected_sdlt/lbtt/ltt, etc., on top.

Net effect: every single scenario showed a £159 diff that was exclusively the TV licence (£174.50 × ~0.911 take-up). That diff masked the real, smaller divergences and made the harness's output look worse than it was.

Switching to `hbai_household_net_income` reveals:
- single/couple scenarios: £1.20 / £2.40 diffs (just employer-NI rounding)
- lone_parent_2kids: £554 (real UC entitlement gap)
- pensioner_couple: £200 (Winter Fuel Allowance: Python includes it, Rust doesn't yet)
- scotland_single_45k: £1.20

The headline "couple_2kids £3,276 UC gap" from the original PR #53 description was an artefact of this measurement bug; that scenario now shows £2.40, well within tolerance.

Stacked on #60.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

vahid-ahmadi · 2026-05-29T12:17:07Z

Superseded — recommend closing.

The fix here (comparing against hbai_household_net_income rather than household_net_income, to exclude the indirect/transaction taxes the Rust baseline omits) has been folded directly into the reworked #53. As part of decoupling the stack, #53 was rebased onto main and rewritten to compare FRS microdata outputs from both engines using hbai_household_net_income, so this one-line follow-up is no longer needed as a separate PR.

This PR also can't stand alone on main, since it only patches scripts/parity.py, which is introduced by #53. Suggest closing in favour of #53.

vahid-ahmadi and others added 9 commits April 30, 2026 13:24

vahid-ahmadi force-pushed the vahid/state-pension-recorded branch from 89b4655 to 10c89d3 Compare May 29, 2026 09:17

vahid-ahmadi mentioned this pull request May 29, 2026

feat: add Python ↔ Rust parity harness #53

Merged

4 tasks

vahid-ahmadi wants to merge 9 commits into vahid/state-pension-recorded from vahid/parity-harness-hbai
