1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -8,6 +8,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## [Unreleased]

### Added
- **`StackedDiD` covariate balancing (CBWSDID; Ustyuzhanin 2026, arXiv:2604.02293).** New constructor parameter `balance="entropy"` plus `fit(..., covariates=[...])` add a within-sub-experiment design stage: entropy balancing (Hainmueller 2012) reweights the clean controls toward the treated cohort's covariate means (read at the last pre-treatment period), and the resulting design weights `b_sa` compose with the Wing et al. (2024) corrective weights via the effective control mass into the final stacked weights `W_sa`. This is **control-only reweighting**, so it estimates untreated trends under *conditional* parallel trends while preserving the trimmed-aggregate-ATT estimand (at `b_sa=1` it reduces to the paper's unit-count weighted stacked DID, equal to `StackedDiD(weighting="aggregate")` on balanced event windows). Inference reuses the existing conditional-on-weights cluster-robust path. Scope: requires `weighting="aggregate"` and **balanced event windows** (ragged windows raise — the unit-count vs observation-count convention is unresolved off balanced panels); `population`/`sample_share`/`survey_design=` and matching-based balancing / the repeated-treatment extension are not supported (raise `NotImplementedError`). Infeasible cohorts fail closed with a clear error. New `diff_diff/balancing.py` (entropy-balancing solver). Estimand validated end-to-end against the closed-form CBWSDID formula (`tests/test_methodology_stacked_did.py`).
- **`SyntheticControl` conformal inference (Chernozhukov, Wüthrich & Zhu 2021, *JASA* 116(536)).** Three opt-in `SyntheticControlResults` methods give valid p-values for the post-period effect trajectory and pointwise confidence intervals — what the in-space placebo / Firpo-Possebom test-inversion paths cannot. Unlike the Firpo path (which re-ranks the cross-unit placebo gaps), the conformal layer fits its **own** time-permutation-invariant constrained-LS synthetic-control proxy (CWZ §2.3 eqs 3–4 — simplex weights on raw outcomes over **all** periods under the null, no `V`-matrix, no intercept) and permutes residuals **over time** for the single treated unit (CWZ's exactness theory requires a time-symmetric proxy, which the headline ADH `V`-matrix fit is not). **`conformal_test(effect, q=1, scheme="moving_block", n_iid=10000, seed=None)`** computes the joint sharp-null permutation p-value (eqs 1–2) of `S_q(û) = ((1/√T*)·Σ_{t>T0}|û_t|^q)^{1/q}` (`q ∈ {1, 2, ∞}`); the proxy is fit once and only residuals are permuted (footnote 7). **`conformal_confidence_intervals(alpha=0.1, scheme="moving_block", bounds=None, n_grid=100, seed=None)`** returns pointwise per-period CIs by test inversion (Algorithm 1 — each period `t` uses `Z = (pre-periods, t)` with the other post-periods dropped, a clean `T*=1` test). **`conformal_average_effect(alpha=0.1, scheme="moving_block", bounds=None, n_grid=200, seed=None)`** returns a CI for the average post-period effect by collapsing the panel into non-overlapping `T*`-blocks and permuting the block residuals (Appendix A.1). Permutation schemes: `"moving_block"` (`Π_→` cyclic shifts, valid under serial dependence — the default) and `"iid"` (`Π_all`, sampled, finer p-values); both include the identity so the p-value floor is `1/|Π|` (no extra `+1`). 
Fail-closed handling for `<1` donor / unpickled result / non-finite panel / non-converged grid points (treated as indeterminate, not rejected) / grid-limited / empty / unbounded sets; a single donor and `T*≥T0` warn. Surfaced under `conformal_inference` / `get_conformal_grid_df()` and `DiagnosticReport`'s `estimator_native_diagnostics`; the analytical `se`/`t_stat`/`p_value`/`conf_int`/`is_significant` stay NaN throughout. Core in the new `diff_diff/conformal.py` (reuses the Frank-Wolfe simplex solver). *Deferred:* one-sided variants (§7), covariates folded into the proxy, and the AR/innovation-permutation path (Lemmas 5–7).
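As a rough illustration of the `"moving_block"` scheme described in the conformal entry above, the following minimal Python sketch computes a permutation p-value from an already-computed residual vector. The function names `s_q` and `moving_block_p_value` are hypothetical and are not the `diff_diff` API. Fitting the constrained-LS proxy is assumed to have happened elsewhere, the post-period residuals are assumed to sit at the end of the vector, and the `q = inf` branch uses the maximum absolute residual as an assumed limiting form.

```python
import math


def s_q(resid, n_post, q=1.0):
    """S_q statistic on the last n_post residuals (the post-period block)."""
    post = [abs(u) for u in resid[-n_post:]]
    if math.isinf(q):
        # Assumption: q = infinity is taken as the max absolute residual.
        return max(post)
    return (sum(u ** q for u in post) / math.sqrt(n_post)) ** (1.0 / q)


def moving_block_p_value(resid, n_post, q=1.0):
    """Share of cyclic shifts whose statistic is at least the observed one."""
    resid = list(resid)
    n = len(resid)
    observed = s_q(resid, n_post, q)
    hits = 0
    for j in range(n):  # j = 0 is the identity shift, so hits >= 1
        shifted = resid[j:] + resid[:j]
        if s_q(shifted, n_post, q) >= observed:
            hits += 1
    return hits / n


# Four small pre-period residuals and one large post-period residual.
p = moving_block_p_value([0.1, -0.2, 0.1, 0.0, 3.0], n_post=1, q=1.0)
print(p)
```

Because the identity shift is always counted, the smallest attainable p-value in this sketch is `1/len(resid)`, which matches the `1/|Π|` floor stated in the entry.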

### Changed
2 changes: 1 addition & 1 deletion README.md
@@ -112,7 +112,7 @@ Full guide: `diff_diff.get_llm_guide("practitioner")`.
- [TripleDifference](https://diff-diff.readthedocs.io/en/stable/api/triple_diff.html) - triple difference (DDD) estimator for designs requiring two criteria for treatment eligibility
- [ContinuousDiD](https://diff-diff.readthedocs.io/en/stable/api/continuous_did.html) - Callaway, Goodman-Bacon & Sant'Anna (2024) continuous treatment DiD with dose-response curves
- [HeterogeneousAdoptionDiD](https://diff-diff.readthedocs.io/en/stable/api/had.html) - de Chaisemartin, Ciccia, D'Haultfœuille & Knau (2026) for designs where **no unit remains untreated**; local-linear estimator at the dose support boundary returning Weighted Average Slope (WAS) on Design 1' (`d̲ = 0` / QUG) or `WAS_{d̲}` on Design 1 (`d̲ > 0`, continuous-near-d̲ or mass-point), with a multi-period event-study extension (last-treatment cohort, pointwise CIs). **Panel-only** in this release - repeated cross-sections rejected by the validator. Alias `HAD`.
- [StackedDiD](https://diff-diff.readthedocs.io/en/stable/api/stacked_did.html) - Wing, Freedman & Hollingsworth (2024) stacked DiD with Q-weights and sub-experiments
- [StackedDiD](https://diff-diff.readthedocs.io/en/stable/api/stacked_did.html) - Wing, Freedman & Hollingsworth (2024) stacked DiD with Q-weights and sub-experiments; optional covariate balancing (Ustyuzhanin 2026)
- [EfficientDiD](https://diff-diff.readthedocs.io/en/stable/api/efficient_did.html) - Chen, Sant'Anna & Xie (2025) efficient DiD with optimal weighting for tighter SEs
- [TROP](https://diff-diff.readthedocs.io/en/stable/api/trop.html) - Triply Robust Panel estimator (Athey et al. 2025) with nuclear norm factor adjustment
- [StaggeredTripleDifference](https://diff-diff.readthedocs.io/en/stable/api/staggered.html#staggeredtripledifference) - Ortiz-Villavicencio & Sant'Anna (2025) staggered DDD with group-time ATT
1 change: 1 addition & 0 deletions TODO.md
@@ -74,6 +74,7 @@ Deferred items from PR reviews that were not addressed before merge.

| Issue | Location | PR | Priority |
|-------|----------|----|----------|
| CBWSDID covariate balancing (`StackedDiD(balance="entropy")`) v1 supports only balanced event windows + `weighting="aggregate"`; unbalanced/ragged panels fail closed (the unit-count vs observation-count corrector convention is unresolved off balanced panels). Matching-based balancing and the repeated `0→1`/`1→0` episode extension are also deferred (out-of-scope guards raise). Documented in REGISTRY.md StackedDiD "Covariate balancing (CBWSDID)" Notes. | `stacked_did.py`, `balancing.py`, `docs/methodology/REGISTRY.md` | follow-up | Low |
| `SyntheticControl` cv: `in_space_placebo()` / `leave_one_out()` report a cv refit excluded for STRUCTURAL infeasibility (donor-indistinguishable re-aggregated window) with the generic `status="failed"` — same machine-readable status as a genuine inner-solver non-convergence. The failure warnings now distinguish the two causes (and the correct remediation) under cv, and `in_time_placebo()` already splits structural→`"infeasible"` vs `"failed"`, but in-space/LOO do not yet emit a separate machine-readable status/reason-code. Thread a reason code from `_outer_solve_V_cv()`/`_placebo_fit_unit()` and add an `"infeasible"` status + count to the in-space/LOO outputs (mirror the in-time split). | `synthetic_control.py`, `synthetic_control_results.py` | follow-up | Low |
| dCDH: Phase 1 per-period placebo DID_M^pl has NaN SE (no IF derivation for the per-period aggregation path). Multi-horizon placebos (L_max >= 1) have valid SE. | `chaisemartin_dhaultfoeuille.py` | #294 | Low |
| dCDH: Survey cell-period allocator's post-period attribution is a library convention, not derived from the observation-level survey linearization. MC coverage is empirically close to nominal on the test DGP; a formal derivation (or a covariance-aware two-cell alternative) is deferred. Documented in REGISTRY.md survey IF expansion Note. | `chaisemartin_dhaultfoeuille.py`, `docs/methodology/REGISTRY.md` | #408 | Medium |
52 changes: 52 additions & 0 deletions benchmarks/R/generate_cbwsdid_golden.R
@@ -0,0 +1,52 @@
#!/usr/bin/env Rscript
# Generate the cross-language golden fixture for StackedDiD's covariate-balancing
# (CBWSDID) path against the reference R package `cbwsdid` (Ustyuzhanin 2026).
#
# Unlike generate_stacked_did_golden.R (which operates on a PRE-stacked CSV so the
# R side is independent of Python stacking logic), `cbwsdid` does its OWN stacking
# + balancing, so this harness hands it the raw panel and dumps the dynamic
# event-study ATTs. The Python side (StackedDiD(balance="entropy", ...)) reproduces
# them via its independent entropy-balancing solver + effective-mass W_sa.
#
# Refinement: refinement.method="weightit", method="ebal" = entropy balancing
# (Hainmueller 2012) on covs.formula=~x, matching StackedDiD(balance="entropy",
# covariates=["x"]). Install: remotes::install_github("vadvu/cbwsdid").
#
# Usage: Rscript benchmarks/R/generate_cbwsdid_golden.R

suppressMessages({
  library(cbwsdid)
  library(jsonlite)
})

# Run from the repository root: Rscript benchmarks/R/generate_cbwsdid_golden.R
panel_csv <- "benchmarks/data/cbwsdid_balance_panel.csv"
out_json <- "benchmarks/data/cbwsdid_golden.json"

df <- read.csv(panel_csv)

m <- cbwsdid(
  data = df, y = "y", d = "d", id = c("unit", "time"),
  kappa = c(-2, 2), design = "absorbing", post_path = "stable",
  refinement.method = "weightit", covs.formula = ~x,
  refinement.args = list(method = "ebal"), pooled = TRUE
)
qoi <- cbwsdid_qoi(m, type = "dynamic")

golden <- list(
  meta = list(
    package = "cbwsdid",
    R_version = R.version.string,
    panel = "benchmarks/data/cbwsdid_balance_panel.csv",
    estimator = "cbwsdid(design='absorbing', refinement.method='weightit', method='ebal', covs.formula=~x)",
    kappa = c(-2L, 2L)
  ),
  dynamic = list(
    event_time = as.integer(qoi$et),
    estimate = as.numeric(qoi$estimate),
    std_error = as.numeric(qoi$std.error)
  )
)
write_json(golden, out_json, auto_unbox = TRUE, digits = 15, pretty = TRUE)
cat("wrote", out_json, "\n")
print(data.frame(et = qoi$et, estimate = qoi$estimate, se = qoi$std.error))
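
For readers unfamiliar with the balancing step this fixture exercises, here is a minimal Python sketch of entropy balancing on a single covariate, mirroring `covs.formula = ~x`. It assumes uniform base weights and weights normalized to sum to 1. The function name `entropy_balance_1d` is hypothetical; this is not the solver in `diff_diff/balancing.py`, and the scaling of the library's `b_sa` weights may differ. With one mean constraint, the entropy-balancing solution is an exponential tilt `w_i ∝ exp(λ·x_i)`, so the sketch only has to find the scalar `λ`, here by plain Newton steps without a line search.

```python
import math


def entropy_balance_1d(x_control, target_mean, tol=1e-10, max_iter=100):
    """Reweight controls so their weighted mean of x equals target_mean."""
    lo, hi = min(x_control), max(x_control)
    if not (lo < target_mean < hi):
        # No positive weights can reach a mean outside the control range.
        raise ValueError("target mean outside control range: infeasible")
    lam = 0.0  # lam = 0 gives uniform weights
    for _ in range(max_iter):
        # Exponential tilt, centered at the target for numerical stability.
        raw = [math.exp(lam * (x - target_mean)) for x in x_control]
        total = sum(raw)
        w = [r / total for r in raw]
        mean = sum(wi * x for wi, x in zip(w, x_control))
        gap = mean - target_mean
        if abs(gap) < tol:
            return w
        # The derivative of the weighted mean in lam is the weighted variance.
        var = sum(wi * (x - mean) ** 2 for wi, x in zip(w, x_control))
        lam -= gap / var
    raise RuntimeError("entropy balancing did not converge")


x = [0.0, 1.0, 2.0, 3.0]
w = entropy_balance_1d(x, target_mean=2.0)
print(sum(wi * xi for wi, xi in zip(w, x)))
```

The feasibility check mirrors the fail-closed behavior described for infeasible cohorts: when the treated mean lies outside the range of the control covariate, the sketch raises rather than returning degenerate weights.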