Bug: SIPP tip income imputation includes allocation flag columns #524

## Summary

The tip income construction in `sipp.py` sums every column whose name contains `TXAMT`. That pattern picks up both the actual tip dollar amounts (`TJB*_TXAMT`) and the Census allocation flags (`AJB*_TXAMT`). The allocation flags are small integers (0, 1, 2) indicating whether Census imputed the value; they are not dollar amounts, so they should not be part of the sum.

## Current code

`policyengine_us_data/datasets/sipp/sipp.py`, lines ~69-72:

```python
df["tip_income"] = (
    df[df.columns[df.columns.str.contains("TXAMT")]].fillna(0).sum(axis=1)
    * 12
)
```

## Fix

Filter to only the actual tip amount columns:

```python
df["tip_income"] = (
    df[df.columns[df.columns.str.match(r"TJB\d_TXAMT")]].fillna(0).sum(axis=1)
    * 12
)
```
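To see the difference between the two column filters, here is a minimal sketch on a synthetic frame. The column names follow the `TJB*_TXAMT` / `AJB*_TXAMT` pattern described above, but the values (and the unrelated `TJB1_MSUM` column) are made up for illustration and are not real SIPP data.

```python
import pandas as pd

# Synthetic rows: values are invented, only the naming pattern matters.
df = pd.DataFrame(
    {
        "TJB1_TXAMT": [100.0, None, 50.0],  # monthly tip amounts, job 1
        "TJB2_TXAMT": [None, 20.0, 0.0],    # monthly tip amounts, job 2
        "AJB1_TXAMT": [1, 0, 2],            # allocation flags, job 1
        "AJB2_TXAMT": [0, 1, 0],            # allocation flags, job 2
        "TJB1_MSUM": [2000.0, 1500.0, 900.0],  # unrelated column
    }
)

# Current filter: any column containing "TXAMT" (amounts and flags).
broad = df.columns[df.columns.str.contains("TXAMT")]
# Proposed filter: only columns starting with TJB<digit>_TXAMT.
narrow = df.columns[df.columns.str.match(r"TJB\d_TXAMT")]

old_tip_income = df[broad].fillna(0).sum(axis=1) * 12
new_tip_income = df[narrow].fillna(0).sum(axis=1) * 12

print(list(broad))
print(list(narrow))
print((old_tip_income - new_tip_income).tolist())
```

On this toy data the broad filter selects all four `*_TXAMT` columns, while the narrow one keeps only the two `TJB` columns. The difference between the two totals is exactly 12 times each row's flag sum.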

## Impact

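Likely minor, since the allocation flags are small integers while tips are dollar amounts. With the flag values listed above (0, 1, 2) and the ×12 annualization, each flag column adds at most $24 per year to a person's tip income. The result is still incorrect and should be fixed.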
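One way to check the size of the effect on the real data is to annualize the flag columns on their own, since that is exactly what the current code adds to `tip_income`. The helper below is a hypothetical sketch (the name `tip_income_overstatement` does not exist in the repository) and assumes the flag columns are named `AJB<digit>_TXAMT`.

```python
import pandas as pd


def tip_income_overstatement(df: pd.DataFrame) -> pd.Series:
    """Annualized amount the allocation flags add to tip_income.

    Hypothetical helper for illustration; assumes flag columns are
    named AJB<digit>_TXAMT.
    """
    flag_cols = df.columns[df.columns.str.match(r"AJB\d_TXAMT")]
    # Mirror the existing construction: missing -> 0, sum across jobs, x12.
    return df[flag_cols].fillna(0).sum(axis=1) * 12


# Toy input with invented values.
example = pd.DataFrame(
    {
        "TJB1_TXAMT": [100.0, 0.0],
        "AJB1_TXAMT": [2, 0],
        "AJB2_TXAMT": [1, None],
    }
)
overstatement = tip_income_overstatement(example)
print(overstatement.tolist())
```

Summing this series (with survey weights, if applicable) over the SIPP frame would give the total overstatement to compare against the aggregate tip income.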

## Context

This was identified while comparing PolicyEngine's tip income deduction revenue estimate ($4.7B) against JCT's score ($10.0B for FY2026). See related issues for other improvements to close this gap.
