causalis.scenarios.cuped.model.CUPEDModel
CUPED-style regression adjustment estimator for ATE/ITT in randomized experiments.
The CUPED estimator uses pre-experiment data (covariates) to reduce the variance of the treatment effect estimate without introducing bias. While the canonical CUPED estimator uses a single variance-reduction parameter $\theta$, this implementation follows Lin (2013) and uses a fully interacted OLS specification.
Notes
The canonical CUPED-adjusted outcome is defined as:
$$Y_{cuped} = Y - \theta (X - \bar{X})$$
where $\theta = \frac{Cov(Y, X)}{Var(X)}$ minimizes $Var(Y_{cuped})$.
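The single-parameter adjustment can be illustrated with a minimal NumPy sketch. The data-generating numbers below are arbitrary assumptions, not values from this library, and the sketch does not use the CUPEDModel class itself. It computes $\theta$ from the sample covariance and variance and shows that the adjusted outcome keeps the mean of $Y$ while its variance shrinks by the factor $1 - \rho^2$, where $\rho$ is the sample correlation between $Y$ and $X$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical data: x is a pre-experiment metric, y the in-experiment outcome.
x = rng.normal(loc=10.0, scale=2.0, size=n)
y = 0.8 * x + rng.normal(scale=1.0, size=n)

# theta = Cov(Y, X) / Var(X), with the same degrees of freedom in both terms.
theta = np.cov(y, x, ddof=1)[0, 1] / np.var(x, ddof=1)

# CUPED-adjusted outcome: same mean as y, smaller variance.
y_cuped = y - theta * (x - x.mean())

var_ratio = np.var(y_cuped, ddof=1) / np.var(y, ddof=1)
rho = np.corrcoef(y, x)[0, 1]
print(f"theta={theta:.3f}, variance ratio={var_ratio:.3f}, 1 - rho^2={1 - rho**2:.3f}")
```

The stronger the correlation between the pre-experiment covariate and the outcome, the larger the variance reduction.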
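This model implements the Lin (2013) specification, which is equivalent to saturated OLS and robust to heterogeneous treatment effects:
$$Y = \alpha + \tau D + \beta^\top (X - \bar{X}) + \gamma^\top D (X - \bar{X}) + \varepsilon$$
where:
- $D$ is the binary treatment indicator ($D=1$ for treatment, $D=0$ for control).
- $X$ are the pre-treatment covariates, centered at their global mean $\bar{X}$.
- $\tau$ is the Average Treatment Effect (ATE).
- $\alpha$, $\beta$, and $\gamma$ are the intercept, the covariate slopes, and the treatment-covariate interaction slopes, and $\varepsilon$ is the error term.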
Centering covariates at their global mean $\bar{X}$ ensures that the coefficient $\tau$ on the treatment indicator $D$ directly estimates the ATE.
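As a rough illustration of the fully interacted regression, the sketch below fits it with plain NumPy least squares on simulated data. The simulation settings and variable names are assumptions made for this example, and the code is not the library's implementation. It also checks the equivalent view of the same estimator: fit a separate line in each arm and compare the two predictions at the global covariate mean.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4_000

# Hypothetical randomized experiment with one pre-treatment covariate.
d = rng.integers(0, 2, size=n)   # treatment indicator D
x = rng.normal(size=n)           # pre-treatment covariate X
# Heterogeneous effect: the unit-level effect is 2 + 0.5 * x, so the ATE is about 2.
y = 1.0 + 1.5 * x + d * (2.0 + 0.5 * x) + rng.normal(size=n)

xc = x - x.mean()                # center at the global (full-sample) mean

# Design matrix columns: intercept, D, centered X, D * centered X.
design = np.column_stack([np.ones(n), d, xc, d * xc])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
tau_hat = coef[1]                # coefficient on D


def predict_at_global_mean(mask):
    """Fit a line within one arm and predict the outcome at the global mean of x."""
    slope, intercept = np.polyfit(x[mask], y[mask], deg=1)
    return intercept + slope * x.mean()


tau_by_arm = predict_at_global_mean(d == 1) - predict_at_global_mean(d == 0)
print(f"tau_hat={tau_hat:.4f}, per-arm difference={tau_by_arm:.4f}")
```

The two numbers agree up to floating-point error, because interacting every covariate with $D$ lets each arm have its own intercept and slope.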
Examples
Generate synthetic data with pre-treatment covariate
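A minimal sketch of such a data set, using NumPy only. The helper name `make_synthetic_experiment`, the dictionary keys, and all numeric settings are illustrative assumptions. The sketch stops at an unadjusted baseline rather than guessing at the CUPEDModel call signature, which this section does not show.

```python
import numpy as np


def make_synthetic_experiment(n=5_000, true_effect=0.5, seed=42):
    """Hypothetical helper: simulate a randomized experiment with one pre-treatment covariate."""
    rng = np.random.default_rng(seed)
    x_pre = rng.normal(loc=20.0, scale=5.0, size=n)   # pre-experiment metric
    d = rng.binomial(1, 0.5, size=n)                  # randomized assignment
    noise = rng.normal(scale=2.0, size=n)
    y = 3.0 + 0.9 * x_pre + true_effect * d + noise   # outcome correlated with x_pre
    return {"d": d, "x_pre": x_pre, "y": y}


data = make_synthetic_experiment()
treated = data["y"][data["d"] == 1]
control = data["y"][data["d"] == 0]

# Unadjusted baseline: difference in means with an unpooled standard error.
naive_ate = treated.mean() - control.mean()
naive_se = np.sqrt(treated.var(ddof=1) / treated.size + control.var(ddof=1) / control.size)
corr = np.corrcoef(data["x_pre"], data["y"])[0, 1]
print(f"naive ATE={naive_ate:.3f} (SE={naive_se:.3f}), corr(x_pre, y)={corr:.3f}")
```

A strongly correlated pre-treatment covariate like `x_pre` is the situation in which regression adjustment is expected to shrink the standard error most.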
Parameters
- cov_type : str, default="HC2"
Covariance estimator passed to statsmodels (e.g., "nonrobust", "HC0", "HC1", "HC2", "HC3"). Note: cluster-randomized designs need cluster-robust standard errors, which are not implemented here.
- alpha : float, default=0.05
Significance level for confidence intervals.
- use_t : bool | None, default=None
If bool, passed directly to statsmodels .fit(..., use_t=use_t). If None, an automatic policy is used: for robust HC* covariances, use_t=True when n < use_t_auto_n_threshold, else False. For non-robust covariance, use_t=True.
- use_t_auto_n_threshold : int, default=5000
Sample-size threshold for automatic use_t selection when use_t=None and the covariance is HC* robust.
- relative_ci_method : {"delta", "bootstrap"}, default="delta"
Method for the relative CI of 100 * tau / denominator.
  - "delta": joint delta method that accounts for covariance between the adjusted ATE and the selected denominator.
  - "bootstrap": percentile bootstrap CI on the relative effect.
- relative_denominator : {"adjusted_control", "raw_control"}, default="adjusted_control"
Denominator used for relative effects.
  - "adjusted_control": model-implied control mean at the full-sample covariate mean.
  - "raw_control": observed control-group outcome mean.
- relative_ci_bootstrap_draws : int, default=1000
Number of bootstrap resamples used when relative_ci_method="bootstrap".
- relative_ci_bootstrap_seed : int | None, default=None
RNG seed used for the bootstrap relative CI.
- refutation_config : CUPEDRefutationConfig | None, default=None
Grouped configuration for regression checks, refutation thresholds, and check actions.
- covariate_variance_min : float, default=1e-12
Minimum variance threshold for retaining a CUPED covariate. Covariates with variance less than or equal to this threshold are dropped before fitting.
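The automatic use_t policy described in the parameter list can be written out as a small decision function. The function name `resolve_use_t` is hypothetical and the code is a sketch of the documented rule, not the library's source. It treats any covariance type whose name starts with "HC" as robust, which is an assumption about how the HC* family is detected.

```python
def resolve_use_t(use_t, cov_type, n, use_t_auto_n_threshold=5000):
    """Hypothetical helper mirroring the documented use_t policy."""
    if use_t is not None:
        return bool(use_t)                  # explicit choice is passed through
    if cov_type.upper().startswith("HC"):   # assumed test for the robust HC* family
        return n < use_t_auto_n_threshold   # t-distribution only below the threshold
    return True                             # non-robust covariance: always use t


print(resolve_use_t(None, "HC2", n=1_200))         # True
print(resolve_use_t(None, "HC2", n=50_000))        # False
print(resolve_use_t(None, "nonrobust", n=50_000))  # True
print(resolve_use_t(False, "nonrobust", n=100))    # False
```

With the default threshold, a sample of exactly 5000 units falls on the normal-approximation side, because the documented comparison is strict (n < use_t_auto_n_threshold).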
Notes
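Validity requires that every covariate is measured before treatment assignment. Post-treatment covariates can themselves be affected by the treatment, so adjusting for them can bias the estimate.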
Covariates are globally centered over the full sample only. This centering convention is required so the treatment coefficient in the Lin specification remains the ATE/ITT.
The Lin (2013) specification is recommended as a robust regression-adjustment default in RCTs.
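The role of the global centering convention noted above can be checked numerically. The sketch below is an illustration on simulated data with assumed settings, not library code. It fits the same fully interacted regression twice, once with the covariate centered at its full-sample mean and once uncentered, and compares the coefficient on the treatment indicator.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 20_000
d = rng.integers(0, 2, size=n)
x = rng.normal(loc=5.0, scale=1.0, size=n)   # covariate with a nonzero mean
# Unit-level effect is 1 + 0.4 * x, so the ATE is about 1 + 0.4 * 5 = 3.
y = 2.0 + x + d * (1.0 + 0.4 * x) + rng.normal(size=n)


def treatment_coefficient(cov):
    """Coefficient on D in the fully interacted regression using covariate `cov`."""
    design = np.column_stack([np.ones(n), d, cov, d * cov])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]


coef_centered = treatment_coefficient(x - x.mean())   # global centering
coef_raw = treatment_coefficient(x)                   # no centering
print(f"centered: {coef_centered:.3f}, uncentered: {coef_raw:.3f}")
```

With centering, the coefficient lands near the ATE of about 3. Without it, the coefficient estimates the effect for a unit with $X = 0$, which is about 1 in this simulation and is not the quantity of interest.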