Submodule

causalis.scenarios.cuped.model

Submodule causalis.scenarios.cuped.model with no child pages and 6 documented members.

Classes

class

causalis.scenarios.cuped.model.CUPEDModel

CUPEDModel

CUPED-style regression adjustment estimator for ATE/ITT in randomized experiments.

The CUPED estimator uses pre-experiment data (covariates) to reduce the variance of the treatment effect estimate without introducing bias. While the canonical CUPED estimator uses a single variance-reduction parameter $\theta$, this implementation follows Lin (2013) and uses a fully interacted OLS specification.

Notes

The canonical CUPED adjusted outcome is defined as:

$$Y_{cuped} = Y - \theta (X - E[X])$$

where $\theta = \frac{Cov(Y, X)}{Var(X)}$ minimizes $Var(Y_{cuped})$.
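
The canonical adjustment can be checked numerically. The sketch below uses only the Python standard library and hypothetical simulated data (the coefficients and sample size are illustrative assumptions, and it is not the causalis implementation). It estimates $\theta$ from sample moments, builds $Y_{cuped}$, and compares variances.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical data: x is a pre-experiment metric, y the in-experiment outcome.
n = 2000
x = [random.gauss(10.0, 2.0) for _ in range(n)]
y = [5.0 + 1.5 * xi + random.gauss(0.0, 1.0) for xi in x]

x_bar = statistics.fmean(x)
y_bar = statistics.fmean(y)
var_x = statistics.variance(x)
var_y = statistics.variance(y)

# theta = Cov(Y, X) / Var(X), using sample moments (n - 1 denominators).
cov_yx = sum((yi - y_bar) * (xi - x_bar) for xi, yi in zip(x, y)) / (n - 1)
theta = cov_yx / var_x

# Y_cuped = Y - theta * (X - E[X]); the sample mean stands in for E[X].
y_cuped = [yi - theta * (xi - x_bar) for xi, yi in zip(x, y)]
var_y_cuped = statistics.variance(y_cuped)

# Squared sample correlation between Y and X.
corr_sq = cov_yx ** 2 / (var_x * var_y)

print(f"theta = {theta:.3f}")
print(f"Var(Y) = {var_y:.3f}, Var(Y_cuped) = {var_y_cuped:.3f}")
print(f"variance ratio = {var_y_cuped / var_y:.3f}, 1 - corr^2 = {1 - corr_sq:.3f}")
```

With this choice of $\theta$ the adjusted outcome keeps the same mean as $Y$, and its variance equals $Var(Y)(1 - \rho^2)$, where $\rho$ is the correlation between $Y$ and $X$. The last printed line shows the two quantities agreeing.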

This model implements the Lin (2013) specification, a fully interacted OLS regression (treatment, centered covariates, and their interactions) that remains consistent for the ATE under heterogeneous treatment effects:

$$Y = \alpha + \tau D + \beta (X - \bar{X}) + \gamma D(X - \bar{X}) + \epsilon$$

where:

$D$ is the binary treatment indicator ($D=1$ for treatment, $D=0$ for control).
$X$ are the pre-treatment covariates (centered globally).
$\tau$ is the Average Treatment Effect (ATE).

Centering covariates at their global mean $\bar{X}$ ensures that the coefficient $\tau$ on the treatment indicator $D$ directly estimates the ATE.
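
The role of global centering can be illustrated with a single covariate. The sketch below is standard-library Python on hypothetical simulated data, not the causalis implementation. It relies on the fact that the fully interacted OLS fit reproduces two separate per-arm least-squares lines, so $\tau$ is the gap between the two arm intercepts after subtracting the chosen center from $X$. The function names and the data-generating coefficients are illustrative assumptions, and only point estimates are computed (no robust standard errors).

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible


def simple_ols(xs, ys):
    """Return (intercept, slope) of the least-squares line ys ~ xs."""
    x_bar, y_bar = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def lin_coefficients(y, d, x, center):
    """Coefficients (alpha, tau, beta, gamma) of Y ~ 1 + D + Xc + D*Xc, Xc = X - center.

    The interacted fit is obtained from one simple regression per arm.
    """
    xc = [xi - center for xi in x]
    arms = {}
    for arm in (0, 1):
        xs = [v for v, di in zip(xc, d) if di == arm]
        ys = [v for v, di in zip(y, d) if di == arm]
        arms[arm] = simple_ols(xs, ys)
    (a0, b0), (a1, b1) = arms[0], arms[1]
    return a0, a1 - a0, b0, b1 - b0


# Hypothetical experiment with a heterogeneous effect: 1.0 + 0.5 * (x - 10).
# Since E[X] = 10, the true ATE is 1.0, while the effect at x = 0 is -4.0.
n = 4000
x = [random.gauss(10.0, 2.0) for _ in range(n)]
d = [1 if random.random() < 0.5 else 0 for _ in range(n)]
y = [2.0 + 1.0 * xi + di * (1.0 + 0.5 * (xi - 10.0)) + random.gauss(0.0, 1.0)
     for xi, di in zip(x, d)]

x_bar = statistics.fmean(x)
alpha, tau, beta, gamma = lin_coefficients(y, d, x, center=x_bar)
_, tau_uncentered, _, _ = lin_coefficients(y, d, x, center=0.0)

print(f"tau with global centering: {tau:.3f}")
print(f"tau without centering:     {tau_uncentered:.3f}")
```

With centering at $\bar{X}$ the coefficient on $D$ lands near the true ATE of 1.0. Without centering, the same coefficient estimates the effect at $X = 0$, which is far from the ATE when effects vary with $X$.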

Examples

Generate synthetic data with pre-treatment covariate
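
A minimal sketch of such data, using only the Python standard library rather than any causalis helper. The function name `make_synthetic_experiment`, the column names (`x_pre`, `d`, `y`), and all coefficients are illustrative assumptions, not part of the library.

```python
import random
import statistics


def make_synthetic_experiment(n=1000, true_effect=2.0, seed=42):
    """Simulate a randomized experiment with one pre-treatment covariate.

    Column names (x_pre, d, y) are illustrative, not a causalis schema.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x_pre = rng.gauss(50.0, 10.0)        # measured before assignment
        d = 1 if rng.random() < 0.5 else 0   # randomized, independent of x_pre
        y = 10.0 + 0.8 * x_pre + true_effect * d + rng.gauss(0.0, 5.0)
        rows.append({"x_pre": x_pre, "d": d, "y": y})
    return rows


rows = make_synthetic_experiment()
treated = [r for r in rows if r["d"] == 1]
control = [r for r in rows if r["d"] == 0]

# Unadjusted difference in means, the baseline that covariate adjustment improves on.
naive_ate = (statistics.fmean([r["y"] for r in treated])
             - statistics.fmean([r["y"] for r in control]))

# Randomization check: the pre-treatment covariate should be balanced across arms.
x_gap = (statistics.fmean([r["x_pre"] for r in treated])
         - statistics.fmean([r["x_pre"] for r in control]))

print(f"treated={len(treated)}, control={len(control)}")
print(f"naive ATE = {naive_ate:.3f}, covariate gap = {x_gap:.3f}")
```

Because `x_pre` explains most of the outcome variance in this setup, it is the kind of covariate for which regression adjustment is expected to tighten the estimate.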

Parameters

cov_type (str, default="HC2"): Covariance estimator passed to statsmodels (e.g., "nonrobust", "HC0", "HC1", "HC2", "HC3"). Note: for cluster-randomized designs, use cluster-robust SEs (not implemented here).
alpha (float, default=0.05): Significance level for confidence intervals.
use_t (bool | None, default=None): If bool, passed to statsmodels .fit(..., use_t=use_t) directly. If None, automatic policy is used: for robust HC* covariances, use_t=True when n < use_t_auto_n_threshold, else False. For non-robust covariance, use_t=True.
use_t_auto_n_threshold (int, default=5000): Sample-size threshold for automatic use_t selection when use_t=None and covariance is HC* robust.
relative_ci_method ({"delta", "bootstrap"}, default="delta"): Method for relative CI of 100 * tau / denominator.
  - "delta": joint delta method that accounts for covariance between the adjusted ATE and the selected denominator.
  - "bootstrap": percentile bootstrap CI on the relative effect.
relative_denominator ({"adjusted_control", "raw_control"}, default="adjusted_control"): Denominator used for relative effects.
  - "adjusted_control": model-implied control mean at the full-sample covariate mean.
  - "raw_control": observed control-group outcome mean.
relative_ci_bootstrap_draws (int, default=1000): Number of bootstrap resamples used when relative_ci_method="bootstrap".
relative_ci_bootstrap_seed (int | None, default=None): RNG seed used for bootstrap relative CI.
refutation_config (CUPEDRefutationConfig | None, default=None): Grouped configuration for regression checks, refutation thresholds, and check actions.
covariate_variance_min (float, default=1e-12): Minimum variance threshold for retaining a CUPED covariate. Covariates with variance less than or equal to this threshold are dropped before fitting.
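
Two of the rules in this parameter list can be written out as small functions: the automatic use_t policy and the delta-method standard error for the relative effect 100 * tau / denominator. This is a sketch under stated assumptions, not the library's code. The function names are hypothetical, any cov_type that does not start with "HC" is treated as non-robust, and the variances and covariance passed to the second function are assumed to come from the fitted model.

```python
import math


def resolve_use_t(use_t, cov_type, n, use_t_auto_n_threshold=5000):
    """Resolve use_t following the policy described for the use_t parameter."""
    if use_t is not None:
        return bool(use_t)                    # explicit value is passed through
    if cov_type.upper().startswith("HC"):
        return n < use_t_auto_n_threshold     # robust HC*: t only for smaller samples
    return True                               # assumption: everything else is non-robust


def relative_effect_delta_se(tau, mu_c, var_tau, var_mu_c, cov_tau_mu_c):
    """Return (relative effect in percent, delta-method standard error).

    rel = 100 * tau / mu_c, with gradient
    (d rel / d tau, d rel / d mu_c) = (100 / mu_c, -100 * tau / mu_c**2).
    """
    rel = 100.0 * tau / mu_c
    g_tau = 100.0 / mu_c
    g_mu = -100.0 * tau / mu_c ** 2
    var_rel = (g_tau ** 2 * var_tau
               + g_mu ** 2 * var_mu_c
               + 2.0 * g_tau * g_mu * cov_tau_mu_c)
    return rel, math.sqrt(var_rel)


print(resolve_use_t(None, "HC2", n=1200))        # True: n is below the threshold
print(resolve_use_t(None, "HC2", n=5000))        # False: n is not below the threshold
print(resolve_use_t(None, "nonrobust", n=10**6))  # True: non-robust covariance

# Hypothetical inputs: tau and control mean with their (co)variances.
rel, se_rel = relative_effect_delta_se(
    tau=2.0, mu_c=50.0, var_tau=0.04, var_mu_c=0.09, cov_tau_mu_c=-0.01
)
print(f"relative effect = {rel:.2f}%, delta-method SE = {se_rel:.3f}")
```

The cross term is what makes the delta method "joint": dropping `cov_tau_mu_c` would treat the numerator and denominator as independent.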

Notes

Validity requires covariates be pre-treatment. Post-treatment covariates can bias estimates.
Covariates are centered at their full-sample mean (pooled over both arms), never within arm. This centering convention is required so the treatment coefficient in the Lin specification remains the ATE/ITT.
The Lin (2013) specification is recommended as a robust regression-adjustment default in RCTs.

method

causalis.scenarios.cuped.model.CUPEDModel.fit

fit

Fit CUPED-style regression adjustment (Lin-interacted OLS) on a CausalData object.

Parameters

data (CausalData): Validated dataset with columns: outcome (post), treatment, and confounders (pre covariates).
covariates (Sequence[str], required): Explicit subset of data_contracts.confounders_names to use as CUPED covariates. Pass [] for an unadjusted (naive) fit.
run_checks (bool | None, optional): Override whether regression checks are computed in this fit call. If None, uses self.refutation_config.run_regression_checks.

Returns

CUPEDModel

Fitted estimator.

Raises

ValueError

If covariates is omitted, not a sequence of strings, contains columns missing from the DataFrame, contains columns outside data_contracts.confounders_names, or the design matrix is rank deficient.
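
The covariate checks listed above, together with the covariate_variance_min screening that the class applies before fitting, can be sketched as plain functions. This is an illustration, not the library's code. The function names are hypothetical, `None` stands in for an omitted argument, a bare string is rejected even though it is technically a sequence, population variance is assumed for the threshold, and the rank-deficiency check is left out.

```python
import statistics
from collections.abc import Sequence


def validate_covariates(covariates, frame_columns, confounders_names):
    """Raise ValueError for the covariate problems listed under Raises.

    Rank deficiency of the design matrix is not checked in this sketch.
    """
    if covariates is None:
        raise ValueError("covariates must be given explicitly; pass [] for a naive fit")
    if isinstance(covariates, str) or not isinstance(covariates, Sequence):
        raise ValueError("covariates must be a sequence of strings")
    if not all(isinstance(c, str) for c in covariates):
        raise ValueError("covariates must be a sequence of strings")
    missing = [c for c in covariates if c not in frame_columns]
    if missing:
        raise ValueError(f"columns missing from the data: {missing}")
    undeclared = [c for c in covariates if c not in confounders_names]
    if undeclared:
        raise ValueError(f"columns outside confounders_names: {undeclared}")
    return list(covariates)


def drop_low_variance(columns, covariate_variance_min=1e-12):
    """Keep only covariates whose variance is strictly above the threshold."""
    return {name: values for name, values in columns.items()
            if statistics.pvariance(values) > covariate_variance_min}


# Hypothetical column store: one informative covariate and one constant column.
columns = {"x1": [1.0, 2.0, 3.0], "const": [5.0, 5.0, 5.0]}

accepted = validate_covariates(["x1"], columns, confounders_names=["x1", "const"])
naive = validate_covariates([], columns, confounders_names=["x1"])  # unadjusted fit

rejected = []
for bad in (None, "x1", [1], ["nope"], ["const"]):
    try:
        validate_covariates(bad, columns, confounders_names=["x1"])
    except ValueError as err:
        rejected.append(str(err))

kept = drop_low_variance(columns)
print(accepted, naive, len(rejected), list(kept))
```

An empty list passes validation, which matches the documented way to request an unadjusted fit.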

method

causalis.scenarios.cuped.model.CUPEDModel.estimate

estimate

Return the adjusted ATE/ITT estimate and inference.

Parameters

alpha (float, optional): Override the instance significance level for confidence intervals.
diagnostic_data (bool, default=True): Whether to include diagnostic data in the result.

Returns

CausalEstimate

A results object containing effect estimates and inference.
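
How a per-call alpha can override an instance-level default is sketched below with a hypothetical stand-in class (`IntervalSketch` is not part of causalis). The interval uses a normal critical value from the standard library, which corresponds to the use_t=False case. A t critical value would be needed when use_t=True and is not shown here.

```python
from statistics import NormalDist


class IntervalSketch:
    """Hypothetical stand-in: instance-level alpha with a per-call override."""

    def __init__(self, estimate, std_error, alpha=0.05):
        self.estimate = estimate
        self.std_error = std_error
        self.alpha = alpha

    def confidence_interval(self, alpha=None):
        # Fall back to the instance level when no override is given.
        level = self.alpha if alpha is None else alpha
        z = NormalDist().inv_cdf(1.0 - level / 2.0)  # normal critical value
        half_width = z * self.std_error
        return self.estimate - half_width, self.estimate + half_width


# Hypothetical point estimate and standard error.
sketch = IntervalSketch(estimate=1.8, std_error=0.4)
ci_default = sketch.confidence_interval()             # uses alpha=0.05
ci_override = sketch.confidence_interval(alpha=0.10)  # narrower 90% interval

print(f"95% CI: ({ci_default[0]:.3f}, {ci_default[1]:.3f})")
print(f"90% CI: ({ci_override[0]:.3f}, {ci_override[1]:.3f})")
```

The override applies to that call only and leaves the stored instance level unchanged.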

method

causalis.scenarios.cuped.model.CUPEDModel.summary_dict

summary_dict

Return a plain dictionary summary of the fitted model, intended for JSON serialization or logging.

Parameters

alpha (float, optional): Override the instance significance level for confidence intervals.

Returns

dict

Dictionary with estimates, inference, and refutation checks.

method

causalis.scenarios.cuped.model.CUPEDModel.assumptions_table

assumptions_table

Return fitted regression assumptions table (GREEN/YELLOW/RED) when available.

method

causalis.scenarios.cuped.model.CUPEDModel.__repr__

__repr__
