causalis.scenarios.cuped.model.CUPEDModel
CUPED-style regression adjustment estimator for ATE/ITT in randomized experiments.
Fits an outcome regression with pre-treatment covariates (always centered over the full sample, never within treatment groups) implemented as Lin (2013) fully interacted OLS:
Y ~ 1 + D + X^c + D * X^c
The reported effect is the coefficient on D, with robust covariance as requested.
This specification ensures the coefficient on D is the ATE/ITT even if the
treatment effect is heterogeneous with respect to covariates.
This is broader than canonical single-theta CUPED (Y - theta*(X - mean(X))).
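The specification above can be sketched with plain NumPy. This is a minimal illustration, not the library's implementation: the function name `lin_ate_hc2` and the simulated data are hypothetical, and the HC2 covariance is computed by hand from the textbook sandwich formula rather than through statsmodels.

```python
import numpy as np

def lin_ate_hc2(y, d, x):
    """Return (tau_hat, hc2_se) for the coefficient on D in Y ~ 1 + D + Xc + D:Xc.

    Hypothetical helper for illustration; not part of causalis.
    """
    y = np.asarray(y, dtype=float)
    d = np.asarray(d, dtype=float)
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]

    # Center covariates over the full sample (not within treatment groups).
    xc = x - x.mean(axis=0)

    # Fully interacted design: intercept, D, Xc, D * Xc.
    design = np.column_stack([np.ones(len(y)), d, xc, d[:, None] * xc])

    xtx_inv = np.linalg.inv(design.T @ design)
    beta = xtx_inv @ design.T @ y
    resid = y - design @ beta

    # HC2 sandwich: squared residuals are scaled by 1 / (1 - leverage).
    leverage = np.einsum("ij,jk,ik->i", design, xtx_inv, design)
    weights = resid**2 / (1.0 - leverage)
    meat = (design * weights[:, None]).T @ design
    cov = xtx_inv @ meat @ xtx_inv

    return float(beta[1]), float(np.sqrt(cov[1, 1]))

# Simulated experiment with a heterogeneous effect whose sample average is 2.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(5.0, 2.0, size=n)
d = rng.integers(0, 2, size=n)
xc_true = x - x.mean()
y = 1.0 + 2.0 * d + 3.0 * xc_true + 4.0 * d * xc_true + rng.normal(size=n)

tau_hat, se = lin_ate_hc2(y, d, x)
print(tau_hat, se)
```

Even though the unit-level effect varies with the covariate, the coefficient on D targets the sample-average effect because the interaction term uses globally centered covariates.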
Parameters
- cov_type : str, default="HC2"
Covariance estimator passed to statsmodels (e.g., "nonrobust", "HC0", "HC1", "HC2", "HC3"). Note: for cluster-randomized designs, use cluster-robust SEs (not implemented here).
- alpha : float, default=0.05
Significance level for confidence intervals.
- strict_binary_treatment : bool, default=True
If True, require treatment to be binary {0,1}.
- use_t : bool | None, default=None
If bool, passed to statsmodels .fit(..., use_t=use_t) directly. If None, automatic policy is used: for robust HC* covariances, use_t=True when n < use_t_auto_n_threshold, else False. For non-robust covariance, use_t=True.
- use_t_auto_n_threshold : int, default=5000
Sample-size threshold for automatic use_t selection when use_t=None and covariance is HC* robust.
- relative_ci_method : {"delta_nocov", "bootstrap"}, default="delta_nocov"
Method for relative CI of 100 * tau / mu_c.
  - "delta_nocov": delta method using robust Var(tau) and Var(mu_c) while setting Cov(tau, mu_c)=0 (safe fallback without unsupported hybrid IF covariance).
  - "bootstrap": percentile bootstrap CI on the relative effect.
- relative_ci_bootstrap_draws : int, default=1000
Number of bootstrap resamples used when relative_ci_method="bootstrap".
- relative_ci_bootstrap_seed : int | None, default=None
RNG seed used for bootstrap relative CI.
- covariate_variance_min : float, default=1e-12
Minimum variance threshold for retaining a CUPED covariate. Covariates with variance less than or equal to this threshold are dropped before fitting.
- condition_number_warn_threshold : float, default=1e8
Triggers a diagnostics signal when the design matrix condition number exceeds this threshold.
- run_regression_checks : bool, default=True
Whether to compute regression diagnostics payload during fit().
- check_action : {"ignore", "raise"}, default="ignore"
Action used when a diagnostics threshold is violated.
- raise_on_yellow : bool, default=False
When check_action="raise", also raise on YELLOW assumption flags.
- corr_near_one_tol : float, default=1e-10
Correlation tolerance used to mark near-duplicate centered covariates.
- vif_warn_threshold : float, default=20.0
VIF threshold that triggers a diagnostics signal.
- winsor_q : float | None, default=0.01
Quantile used for winsor sensitivity refit. Set None to disable.
- tiny_one_minus_h_tol : float, default=1e-8
Threshold for flagging near-degenerate 1 - leverage terms in HC2/HC3.
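The "delta_nocov" option described above can be illustrated with a short first-order delta-method calculation. This is a sketch under stated assumptions, not the library's code: the function name `relative_ci_delta_nocov` is hypothetical, and it uses a normal critical value, whereas the library may use a t quantile depending on use_t.

```python
from math import sqrt
from statistics import NormalDist

def relative_ci_delta_nocov(tau, var_tau, mu_c, var_mu_c, alpha=0.05):
    """Delta-method CI for 100 * tau / mu_c with Cov(tau, mu_c) set to 0.

    Hypothetical helper for illustration; not part of causalis.
    """
    if mu_c == 0:
        raise ValueError("mu_c must be non-zero for a relative effect")

    rel = 100.0 * tau / mu_c

    # Gradient of tau / mu_c is (1 / mu_c, -tau / mu_c**2).
    # With the covariance term dropped, only the two variance terms remain.
    var_rel = 100.0**2 * (var_tau / mu_c**2 + tau**2 * var_mu_c / mu_c**4)

    # Normal critical value (an assumption of this sketch).
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * sqrt(var_rel)
    return rel, rel - half_width, rel + half_width

# Example inputs are made up for illustration.
rel, lo, hi = relative_ci_delta_nocov(tau=0.5, var_tau=0.01, mu_c=10.0, var_mu_c=0.04)
print(rel, lo, hi)
```

Dropping the covariance term keeps the calculation simple but can make the interval somewhat too wide or too narrow when tau and mu_c are correlated; the "bootstrap" option avoids that approximation at a higher computational cost.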
Notes
Validity requires covariates be pre-treatment. Post-treatment covariates can bias estimates.
Covariates are globally centered over the full sample only. This centering convention is required so the treatment coefficient in the Lin specification remains the ATE/ITT.
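A small noise-free example shows why the centering convention matters. The data and the helper `coef_on_d` are hypothetical and use NumPy least squares only; the average effect in this sample is 2 by construction.

```python
import numpy as np

def coef_on_d(y, d, x_used):
    """Coefficient on D in the OLS fit of y on [1, D, x_used, D * x_used]."""
    design = np.column_stack([np.ones(len(y)), d, x_used, d * x_used])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return float(beta[1])

x = np.arange(1.0, 11.0)        # covariate with sample mean 5.5
d = np.tile([0.0, 1.0], 5)      # alternating control / treated
xc = x - x.mean()               # global centering

# Unit-level effect is 2 + 4 * xc, so the sample-average effect is 2.
y = 1.0 + 2.0 * d + 3.0 * xc + 4.0 * d * xc

# Within-group centering: subtract each arm's own covariate mean.
x_within = x - np.where(d == 1.0, x[d == 1.0].mean(), x[d == 0.0].mean())

tau_global = coef_on_d(y, d, xc)        # approx 2.0, the average effect
tau_raw = coef_on_d(y, d, x)            # approx -20.0, the effect at x = 0
tau_within = coef_on_d(y, d, x_within)  # approx 7.0, the raw difference in means
print(tau_global, tau_raw, tau_within)
```

Only the globally centered fit returns the average effect. The uncentered fit reports the effect at x = 0, and within-group centering collapses to the unadjusted difference in means, discarding the covariate adjustment.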
The Lin (2013) specification is recommended as a robust regression-adjustment default in RCTs.