Submodule
causalis.scenarios.cuped.model

model

Submodule causalis.scenarios.cuped.model with no child pages and 6 documented members.

Classes

class
causalis.scenarios.cuped.model.CUPEDModel

CUPEDModel

CUPED-style regression adjustment estimator for ATE/ITT in randomized experiments.

The CUPED estimator uses pre-experiment data (covariates) to reduce the variance of the treatment effect estimate without introducing bias. While the canonical CUPED estimator uses a single variance-reduction parameter $\theta$, this implementation follows Lin (2013) and uses a fully interacted OLS specification.

Notes

The canonical CUPED adjusted outcome is defined as:

$$Y_{cuped} = Y - \theta (X - E[X])$$

where $\theta = \frac{Cov(Y, X)}{Var(X)}$ minimizes $Var(Y_{cuped})$.
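The canonical adjustment can be illustrated with a minimal NumPy sketch. It does not use this class; the synthetic data, the seed, and all variable names are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical synthetic data: x is a pre-experiment metric, y the outcome.
x = rng.normal(loc=10.0, scale=2.0, size=n)
y = 1.5 * x + rng.normal(scale=1.0, size=n)

# theta = Cov(Y, X) / Var(X), using the same ddof for both moments.
theta = np.cov(y, x, ddof=1)[0, 1] / np.var(x, ddof=1)

# CUPED-adjusted outcome: same mean as y, smaller variance.
y_cuped = y - theta * (x - x.mean())

var_ratio = np.var(y_cuped, ddof=1) / np.var(y, ddof=1)
print(f"theta={theta:.3f}, variance ratio={var_ratio:.3f}")
```

Because the adjustment subtracts a mean-zero term, the sample mean of the outcome is unchanged. The variance ratio is approximately $1 - \rho^2$, where $\rho$ is the correlation between $Y$ and $X$.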

This model implements the Lin (2013) specification, which is equivalent to saturated OLS and robust to heterogeneous treatment effects:

$$Y = \alpha + \tau D + \beta (X - \bar{X}) + \gamma D (X - \bar{X}) + \epsilon$$

where:

  • $D$ is the binary treatment indicator ($D=1$ for treatment, $D=0$ for control).

  • $X$ are the pre-treatment covariates (centered globally).

  • $\tau$ is the Average Treatment Effect (ATE).

Centering covariates at their global mean $\bar{X}$ ensures that the coefficient $\tau$ on the treatment indicator $D$ directly estimates the ATE.
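The specification above can be sketched with plain least squares. This is an illustration of the regression itself, not the class's internal code; the data-generating process and variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000
true_ate = 2.0

x = rng.normal(size=n)            # pre-treatment covariate
d = rng.integers(0, 2, size=n)    # randomized binary treatment
# Heterogeneous effect: the unit-level effect is true_ate + 0.5 * x,
# which averages to true_ate because E[x] = 0.
y = 1.0 + true_ate * d + 0.8 * x + 0.5 * d * x + rng.normal(size=n)

xc = x - x.mean()                 # center at the full-sample mean
design = np.column_stack([np.ones(n), d, xc, d * xc])  # [1, D, Xc, D*Xc]

coef, *_ = np.linalg.lstsq(design, y, rcond=None)
alpha_hat, tau_hat, beta_hat, gamma_hat = coef
print(f"tau_hat={tau_hat:.3f}, gamma_hat={gamma_hat:.3f}")
```

If the covariate were left uncentered, the coefficient on $D$ would instead estimate the treatment effect at $X = 0$, which generally differs from the ATE when $\gamma \neq 0$.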

Examples

Generate synthetic data with pre-treatment covariate

Parameters

cov_type : str, default="HC2"

Covariance estimator passed to statsmodels (e.g., “nonrobust”, “HC0”, “HC1”, “HC2”, “HC3”). Note: for cluster-randomized designs, use cluster-robust SEs (not implemented here).
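For readers unfamiliar with the default, the sketch below computes the textbook HC2 sandwich covariance with NumPy. The class delegates this to statsmodels; the helper `hc2_covariance` and the synthetic data are hypothetical and shown only to make the formula concrete.

```python
import numpy as np

def hc2_covariance(design: np.ndarray, y: np.ndarray) -> np.ndarray:
    """HC2 heteroskedasticity-robust covariance of OLS coefficients."""
    xtx_inv = np.linalg.inv(design.T @ design)
    coef = xtx_inv @ design.T @ y
    resid = y - design @ coef
    # Leverage h_ii = x_i' (X'X)^{-1} x_i, the diagonal of the hat matrix.
    leverage = np.einsum("ij,jk,ik->i", design, xtx_inv, design)
    # HC2 rescales each squared residual by 1 / (1 - h_ii).
    weights = resid**2 / (1.0 - leverage)
    meat = design.T @ (design * weights[:, None])
    return xtx_inv @ meat @ xtx_inv

rng = np.random.default_rng(2)
n = 500
d = rng.integers(0, 2, size=n)
# Outcome noise is larger in the treated arm (heteroskedastic by design).
y = 1.0 + 0.5 * d + rng.normal(scale=1.0 + d, size=n)

design = np.column_stack([np.ones(n), d])
cov = hc2_covariance(design, y)
se_tau = np.sqrt(cov[1, 1])

# Sanity check: with only an intercept and a binary regressor, the HC2
# variance of the slope equals the unequal-variance two-sample formula.
welch_var = (y[d == 1].var(ddof=1) / (d == 1).sum()
             + y[d == 0].var(ddof=1) / (d == 0).sum())
print(se_tau, np.sqrt(welch_var))
```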

alpha : float, default=0.05

Significance level for confidence intervals.

use_t : bool | None, default=None

If a bool, it is passed directly to statsmodels .fit(..., use_t=use_t). If None, an automatic policy applies: for robust HC* covariances, use_t=True when n < use_t_auto_n_threshold and use_t=False otherwise; for non-robust covariance, use_t=True.

use_t_auto_n_threshold : int, default=5000

Sample-size threshold for automatic use_t selection when use_t=None and covariance is HC* robust.
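The automatic policy described for use_t can be written as a small decision function. The helper name `resolve_use_t` is hypothetical, and the handling of covariance types other than "nonrobust" and HC* is an assumption, since the text does not describe them.

```python
def resolve_use_t(use_t, cov_type, n_obs, use_t_auto_n_threshold=5000):
    """Return the effective use_t flag under the documented policy (sketch)."""
    if use_t is not None:
        # An explicit bool always wins.
        return bool(use_t)
    if cov_type.upper().startswith("HC"):
        # Robust HC* covariance: t-distribution only for smaller samples.
        return n_obs < use_t_auto_n_threshold
    # Non-robust covariance: t-distribution (assumed for any other type too).
    return True

print(resolve_use_t(None, "HC2", 1_200))      # small sample, robust
print(resolve_use_t(None, "HC2", 50_000))     # large sample, robust
print(resolve_use_t(None, "nonrobust", 50_000))
```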

relative_ci_method : {"delta", "bootstrap"}, default="delta"

Method for the relative CI of 100 * tau / denominator.

  • “delta”: joint delta method that accounts for covariance between the adjusted ATE and the selected denominator.

  • “bootstrap”: percentile bootstrap CI on the relative effect.
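A joint delta-method interval for a ratio can be sketched as follows. This shows the general first-order approximation, not necessarily the exact computation inside the class; the function name, its arguments, and the use of a normal critical value are assumptions.

```python
import math
from statistics import NormalDist

def relative_effect_delta_ci(tau, mu, var_tau, var_mu, cov_tau_mu, alpha=0.05):
    """Delta-method CI for 100 * tau / mu (illustrative sketch)."""
    rel = 100.0 * tau / mu
    # Gradient of g(tau, mu) = 100 * tau / mu.
    g_tau = 100.0 / mu
    g_mu = -100.0 * tau / mu**2
    # First-order variance, including the covariance cross term.
    var_rel = (g_tau**2 * var_tau
               + g_mu**2 * var_mu
               + 2.0 * g_tau * g_mu * cov_tau_mu)
    se = math.sqrt(max(var_rel, 0.0))
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return rel, rel - z * se, rel + z * se

# Hypothetical inputs: ATE of 1.0 on a control mean of 10.0.
rel, lo, hi = relative_effect_delta_ci(
    tau=1.0, mu=10.0, var_tau=0.04, var_mu=0.01, cov_tau_mu=0.0
)
print(f"relative effect {rel:.2f}% with CI [{lo:.2f}, {hi:.2f}]")
```

Dropping the cross term treats the numerator and denominator as independent, which misstates the interval width whenever the two estimates are correlated.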

relative_denominator : {"adjusted_control", "raw_control"}, default="adjusted_control"

Denominator used for relative effects.

  • “adjusted_control”: model-implied control mean at the full-sample covariate mean.

  • “raw_control”: observed control-group outcome mean.

relative_ci_bootstrap_draws : int, default=1000

Number of bootstrap resamples used when relative_ci_method="bootstrap".
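A percentile bootstrap for the relative effect might look like the sketch below. It resamples rows, refits the interacted regression, and takes empirical quantiles. The function names, the re-centering of the covariate within each resample, and the use of the intercept as the denominator are assumptions, not confirmed details of the class.

```python
import numpy as np

def lin_relative_effect(y, d, x):
    """100 * tau / alpha from the interacted OLS with a centered covariate."""
    xc = x - x.mean()
    design = np.column_stack([np.ones(len(y)), d, xc, d * xc])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    # With centered covariates the intercept is the model-implied control mean.
    return 100.0 * coef[1] / coef[0]

def bootstrap_relative_ci(y, d, x, draws=1000, alpha=0.05, seed=None):
    rng = np.random.default_rng(seed)
    n = len(y)
    stats = np.empty(draws)
    for b in range(draws):
        idx = rng.integers(0, n, size=n)   # resample rows with replacement
        stats[b] = lin_relative_effect(y[idx], d[idx], x[idx])
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)

# Hypothetical data with a true relative effect of 10%.
rng = np.random.default_rng(4)
n = 2_000
x = rng.normal(size=n)
d = rng.integers(0, 2, size=n)
y = 10.0 + 1.0 * d + 0.8 * x + rng.normal(size=n)

lo, hi = bootstrap_relative_ci(y, d, x, draws=500, seed=123)
print(f"bootstrap CI: [{lo:.2f}%, {hi:.2f}%]")
```

Fixing the seed makes the interval reproducible across runs, which is the purpose of a seed parameter such as relative_ci_bootstrap_seed.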

relative_ci_bootstrap_seed : int | None, default=None

RNG seed used for bootstrap relative CI.

refutation_config : CUPEDRefutationConfig | None, default=None

Grouped configuration for regression checks, refutation thresholds, and check actions.

covariate_variance_min : float, default=1e-12

Minimum variance threshold for retaining a CUPED covariate. Covariates with variance less than or equal to this threshold are dropped before fitting.
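The filtering rule can be sketched as below. The helper `keep_varying_columns`, the column names, and the use of the population variance (ddof=0) are assumptions; the text only states that covariates with variance at or below the threshold are dropped.

```python
import numpy as np

def keep_varying_columns(x, names, variance_min=1e-12):
    """Drop columns whose variance is <= variance_min (illustrative sketch)."""
    variances = np.var(x, axis=0)
    keep = variances > variance_min
    kept_names = [name for name, flag in zip(names, keep) if flag]
    return x[:, keep], kept_names

rng = np.random.default_rng(5)
n = 100
x = np.column_stack([
    rng.normal(size=n),   # varies across units
    np.full(n, 3.0),      # constant: carries no information
])

x_kept, kept = keep_varying_columns(x, ["pre_spend", "constant_flag"])
print(kept, x_kept.shape)
```

A constant covariate becomes an all-zero column after centering, so keeping it would make the design matrix rank deficient.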

Notes

  • Validity requires covariates be pre-treatment. Post-treatment covariates can bias estimates.

  • Covariates are globally centered over the full sample only. This centering convention is required so the treatment coefficient in the Lin specification remains the ATE/ITT.

  • The Lin (2013) specification is recommended as a robust regression-adjustment default in RCTs.

method
causalis.scenarios.cuped.model.CUPEDModel.fit

fit

Fit CUPED-style regression adjustment (Lin-interacted OLS) on a CausalData object.

Parameters

data : CausalData

Validated dataset with columns: outcome (post), treatment, and confounders (pre covariates).

covariates : Sequence[str], required

Explicit subset of data_contracts.confounders_names to use as CUPED covariates. Pass [] for an unadjusted (naive) fit.
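As a point of reference for the unadjusted case, the sketch below assumes that with no covariates the design reduces to an intercept plus the treatment indicator. Under that assumption the treatment coefficient is exactly the difference in group means. The data and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1_000
d = rng.integers(0, 2, size=n)
y = 5.0 + 0.3 * d + rng.normal(size=n)

# Design with no covariates: intercept and treatment indicator only.
design = np.column_stack([np.ones(n), d])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
tau_naive = coef[1]

diff_in_means = y[d == 1].mean() - y[d == 0].mean()
print(tau_naive, diff_in_means)
```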

run_checks : bool | None, optional

Override whether regression checks are computed in this fit call. If None, uses self.refutation_config.run_regression_checks.

Returns

CUPEDModel

Fitted estimator.

Raises

ValueError

If covariates is omitted, not a sequence of strings, contains columns missing from the DataFrame, contains columns outside data_contracts.confounders_names, or the design matrix is rank deficient.

method
causalis.scenarios.cuped.model.CUPEDModel.estimate

estimate

Return the adjusted ATE/ITT estimate and inference.

Parameters

alpha : float, optional

Override the instance significance level for confidence intervals.

diagnostic_data : bool, default=True

Whether to include diagnostic data in the result.

Returns

CausalEstimate

A results object containing effect estimates and inference.

method
causalis.scenarios.cuped.model.CUPEDModel.summary_dict

summary_dict

Convenience JSON/logging output.

Parameters

alpha : float, optional

Override the instance significance level for confidence intervals.

Returns

dict

Dictionary with estimates, inference, and refutation checks.

method
causalis.scenarios.cuped.model.CUPEDModel.assumptions_table

assumptions_table

Return fitted regression assumptions table (GREEN/YELLOW/RED) when available.

method
causalis.scenarios.cuped.model.CUPEDModel.__repr__

__repr__
