phaseR

what before-and-after comparisons miss

Joint Bayesian inference for longitudinal change, fit by a native NUTS sampler in C++.

phaseR treats "before" and "after" as latent regimes and models two things at once: the transition that moves a unit between regimes, and the outcome dynamics within each regime. The sampler marginalizes over which regime each observation belongs to, so estimates never condition on the realized transition. Splitting your data at the observed change point and comparing the two halves does condition on it, which biases the comparison whenever the chance of transitioning depends on the same covariates that drive the outcome.

library(phaseR)

dat <- sim_phaseR(n_units = 100, n_times = 5, seed = 42)

model <- phase_model(
  phases("inactive", "active"),
  transition("inactive", "active", ~ x),   # what moves a unit between regimes
  dynamics("inactive", y ~ x),             # outcome model in each regime
  dynamics("active",   y ~ x)
)

fit <- fit_phaseR(model, data = dat, chains = 2, iter = 1000, seed = 42)
summary(fit)

Why before/after comparison is biased

Compare the outcomes of units that transitioned, before versus after, and you have stratified on a realized outcome. When units with certain covariate values are both more likely to transition and have different outcomes, the difference you measure mixes the regime effect with that selection. The bias survives any amount of data.

phaseR writes down the full generative process instead: a logistic transition model for regime membership and a per-regime outcome model, fit jointly. The likelihood integrates over the latent regime of every observation (a log-sum-exp over phase assignments), so the selection mechanism is part of the model rather than a thing you condition away. The estimand-failure vignette runs the biased before/after estimate and the phaseR estimate against a known truth on simulated data.

Counterfactuals once you have a fit

Because the transition and dynamics are modeled together, you can ask what outcomes would have looked like under covariate values you never observed, integrating over phase uncertainty:

ate(fit, treatment_var = "x")                          # average treatment effect, with a credible interval

counterfactual(fit, newdata = transform(dat, x = 1))   # predicted outcomes under x = 1
ite(fit, treatment_var = "x")                          # per-unit effects

What the sampler handles

k regimes: arbitrary sequential phases, declared with phases("susceptible", "infected", "recovered") and one transition() per consecutive pair.
GLM outcomes: Gaussian, Poisson, and Binomial dynamics, set per phase.
Random effects: random intercepts (1 | group), crossed (1|site) + (1|year), random slopes (1 + x | group), and nested (1|site/plot).
Custom priors: prior(beta ~ normal(0, 2.5), sigma ~ half_cauchy(1)), with normal(), student_t(), cauchy(), half_normal(), half_cauchy(), and lkj().
Model comparison: waic() / compare_waic(), and LOO-CV via PSIS with loo() / loo_compare().

The NUTS sampler is written from scratch in C++ via Rcpp. There is no Stan, JAGS, or other external inference engine to install.

Specifying a model

phases("susceptible", "infected", "recovered")        # ordered regimes

transition("susceptible", "infected", ~ exposure + age)
transition("infected", "recovered", ~ treatment)      # one per consecutive pair

dynamics("susceptible", y ~ 1)
dynamics("infected",    y ~ time_since_infection)      # outcome model per regime
dynamics("recovered",   y ~ 1)

model <- phase_model(
  phases("susceptible", "infected", "recovered"),
  transition("susceptible", "infected", ~ exposure + age),
  transition("infected", "recovered", ~ treatment),
  dynamics("susceptible", y ~ 1),
  dynamics("infected", y ~ time_since_infection),
  dynamics("recovered", y ~ 1)
)

Where it applies

Any longitudinal setting with a latent before/after where the chance of crossing over depends on covariates that also shape the outcome: disease onset and recovery, treatment adoption, technology uptake, behavioral change, ecological regime shifts.

Installation

# install.packages("pak")
pak::pak("gcol33/phaseR")

Documentation

Why conditioning on realized change fails

License

MIT (see the LICENSE file)

Citation

@software{phaseR,
  title  = {phaseR: Phase-Structured Inference for Longitudinal Change},
  author = {Colling, Gilles},
  year   = {2025},
  url    = {https://github.com/gcol33/phaseR}
}

Name		Name	Last commit message	Last commit date
Latest commit History 28 Commits
.github/workflows		.github/workflows
R		R
man		man
src		src
tests		tests
vignettes		vignettes
.Rbuildignore		.Rbuildignore
.gitignore		.gitignore
DESCRIPTION		DESCRIPTION
LICENSE		LICENSE
LICENSE.md		LICENSE.md
NAMESPACE		NAMESPACE
NEWS.md		NEWS.md
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

phaseR

Why before/after comparison is biased

Counterfactuals once you have a fit

What the sampler handles

Specifying a model

Where it applies

Installation

Documentation

License

Citation

About

Licenses found

Uh oh!

Releases 1

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

phaseR

Why before/after comparison is biased

Counterfactuals once you have a fit

What the sampler handles

Specifying a model

Where it applies

Installation

Documentation

License

Citation

About

Topics

Resources

License

Licenses found

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages