Foundations

0. Introduction to Causal Inference

Get to know the notation and intuition of causal inference within the potential outcomes framework. Learn which effects can be estimated and how.

Read the full article

Classic

1. Classic RCT

The Randomized Controlled Trial (RCT) is widely considered the gold standard for establishing causal relationships. By randomly assigning subjects to treatment and control groups, we make the groups comparable in expectation on every characteristic, observed or unobserved, so that systematic differences in outcomes can be attributed to the treatment.

We call this setup classic because no pre-treatment data on the units is available.

Scenario Notebook

Best for

Clean randomization, first experiments, binary or revenue outcomes.

Outputs

ATE via diff-in-means with classic inference (t-test, z-test, or bootstrap).

DGPs

classic_rct_gamma_26

Revenue outcome (Gamma) with randomized assignment.

generate_classic_rct_26

Conversion outcome (binary) with randomized assignment.

Model Math

ATE definition, diff-in-means estimator, and inference for continuous or conversion outcomes.

Open model math
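
The diff-in-means estimator and its normal-approximation inference can be sketched in a few lines. This is a minimal illustration, not the library's implementation: the function name `diff_in_means`, the unpooled standard error, and the toy data are all assumptions made for the example.

```python
import math
import statistics

def diff_in_means(y_treated, y_control, z_crit=1.96):
    """Difference-in-means ATE estimate with a normal-approximation 95% CI."""
    n1, n0 = len(y_treated), len(y_control)
    ate = statistics.fmean(y_treated) - statistics.fmean(y_control)
    # Unpooled standard error: the two arms may have different variances.
    se = math.sqrt(statistics.variance(y_treated) / n1
                   + statistics.variance(y_control) / n0)
    z = ate / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return {"ate": ate, "se": se,
            "ci": (ate - z_crit * se, ate + z_crit * se),
            "p_value": p_value}

# Toy data, invented for illustration only.
treated = [12.0, 15.0, 11.0, 14.0, 13.0]
control = [10.0, 11.0, 9.0, 12.0, 8.0]
result = diff_in_means(treated, control)
```

With samples this small a t-distribution or bootstrap would be the safer choice; the z-based interval is used here only to keep the sketch dependency-free.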

Monitoring

A Sample Ratio Mismatch (SRM) check is the first health check for an experiment: it flags arms whose observed sizes deviate from the planned split.

Open SRM monitoring
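
A common way to run an SRM check is a chi-square goodness-of-fit test of observed arm sizes against the planned split. The sketch below assumes a two-arm experiment; the function name `srm_p_value`, the counts, and the 0.001 alert threshold are illustrative assumptions rather than values taken from this project.

```python
import math

def srm_p_value(n_control, n_treated, expected_treated_share=0.5):
    """Chi-square goodness-of-fit test (1 degree of freedom) for a two-arm split."""
    n = n_control + n_treated
    exp_treated = n * expected_treated_share
    exp_control = n - exp_treated
    chi2 = ((n_treated - exp_treated) ** 2 / exp_treated
            + (n_control - exp_control) ** 2 / exp_control)
    # With 1 degree of freedom the chi-square survival function
    # equals erfc(sqrt(chi2 / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2))

# Hypothetical counts for a planned 50/50 split.
chi2, p = srm_p_value(n_control=5000, n_treated=5200)
srm_detected = p < 0.001  # strict threshold, assumed here as a convention
```

A strict threshold is often preferred because the check is typically run repeatedly and a false alarm blocks the analysis.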

Assumptions

Unconfoundedness: random assignment makes treatment independent of the potential outcomes.
Overlap: each unit has a non-zero probability of assignment to every arm.
SUTVA: no interference and consistent treatment definitions.

Data contract

user_id, treatment, outcome, confounders columns

Variance Reduction

2. CUPED

Controlled-experiment Using Pre-Experiment Data (CUPED) is a powerful variance reduction technique. By leveraging data from before the experiment started, we can reduce the variance of our metric and detect smaller effects with the same sample size.

We use CUPED when pre-treatment outcomes are available for the same units and are predictive of post-treatment outcomes.

Scenario Notebook

Best for

Randomized experiments with strong pre-period signals and noisy outcomes.

Outputs

CUPED-adjusted ATE with tighter confidence intervals than raw diff-in-means.

DGPs

generate_cuped_tweedie_26

Tweedie-like outcome with pre-period covariates calibrated for CUPED.

generate_cuped_binary

Binary conversion outcome with pre-experiment data for CUPED adjustment.

Model Math

CUPED adjustment, optimal theta estimation, and adjusted treatment-effect estimator.

Open model math
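
The CUPED adjustment subtracts the part of the outcome that is explained by the pre-period metric, using theta = cov(y, y_pre) / var(y_pre). The sketch below is a minimal illustration on synthetic data; the helper names `cuped_adjust` and `arm_diff`, the pooled estimate of theta, and the simulated effect of 1.0 are assumptions made for the example.

```python
import random
import statistics

def cuped_adjust(y, y_pre):
    """Return CUPED-adjusted outcomes and theta = cov(y, y_pre) / var(y_pre)."""
    mean_y, mean_pre = statistics.fmean(y), statistics.fmean(y_pre)
    cov = sum((a - mean_y) * (b - mean_pre)
              for a, b in zip(y, y_pre)) / (len(y) - 1)
    theta = cov / statistics.variance(y_pre)
    # Centering y_pre keeps the mean of the adjusted outcome equal to mean(y).
    adjusted = [a - theta * (b - mean_pre) for a, b in zip(y, y_pre)]
    return adjusted, theta

# Synthetic data, invented for illustration: y_pre predicts y, true effect is 1.0.
rng = random.Random(0)
n = 2000
treatment = [rng.random() < 0.5 for _ in range(n)]
y_pre = [rng.gauss(10.0, 2.0) for _ in range(n)]
y = [p + 1.0 * t + rng.gauss(0.0, 1.0) for p, t in zip(y_pre, treatment)]

y_adj, theta = cuped_adjust(y, y_pre)

def arm_diff(values):
    """Difference in means between treated and control units."""
    treated = [v for v, t in zip(values, treatment) if t]
    control = [v for v, t in zip(values, treatment) if not t]
    return statistics.fmean(treated) - statistics.fmean(control)

raw_ate, cuped_ate = arm_diff(y), arm_diff(y_adj)
```

Because y_pre is measured before assignment, it is independent of treatment under randomization, so the adjustment lowers variance without biasing the effect estimate.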

Monitoring

Sample Ratio Mismatch (SRM) checks assignment integrity before CUPED analysis.

Open SRM monitoring

Assumptions

Unconfoundedness: assignment remains independent of potential outcomes.
Overlap: each unit has a non-zero probability of assignment to every arm.
SUTVA: no interference and consistent treatment definition.
Pre-treatment outcomes are measured before treatment and predictive of post-period outcomes.

Data contract

user_id, treatment, outcome, confounders columns, y_pre (or y_pre, y_pre_2)

Observational Client-level Study

3. Unconfoundedness

Unconfoundedness is the assumption that we have measured all variables that influence both the treatment assignment and the outcome. This allows us to estimate causal effects from observational data by adjusting for these measured confounders.

We use this setup when treatment is not randomized and we rely on rich confounders plus overlap checks for valid identification.

Scenario Notebook

Best for

Non-randomized treatment settings with a rich set of observed confounders and sufficient overlap.

Outputs

Debiased ATE estimates with robust confidence intervals under unconfoundedness and overlap.

DGPs

generate_obs_hte_26_rich

Continuous-outcome observational DGP with nonlinear confounding and HTE.

generate_obs_hte_binary_26

Binary-outcome observational DGP with confounding and heterogeneous effects.

Model Math

DML-IRM orthogonal score, cross-fitting, nuisance estimation, and robust ATE inference.

Open model math
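
The orthogonal score for the ATE in an interactive regression model is the doubly robust (AIPW) score. The sketch below only evaluates that score; it assumes the nuisance predictions (outcome regressions `g0_hat`, `g1_hat` and propensity `m_hat`) were already produced out-of-fold by cross-fitted models, which are not shown. The function name `irm_ate` and the toy numbers are hypothetical.

```python
import math
import statistics

def irm_ate(y, d, g0_hat, g1_hat, m_hat):
    """ATE from the doubly robust (AIPW) score, given nuisance predictions.

    y: outcomes; d: binary treatment (0/1);
    g0_hat, g1_hat: predicted outcomes under control / treatment;
    m_hat: predicted treatment probabilities (assumed strictly inside (0, 1)).
    """
    scores = []
    for yi, di, g0, g1, m in zip(y, d, g0_hat, g1_hat, m_hat):
        psi = (g1 - g0
               + di * (yi - g1) / m
               - (1 - di) * (yi - g0) / (1 - m))
        scores.append(psi)
    ate = statistics.fmean(scores)
    # The standard error comes from the sample variance of the score.
    se = statistics.stdev(scores) / math.sqrt(len(scores))
    return ate, se

# Toy inputs, invented for illustration only.
ate, se = irm_ate(
    y=[3.5, 1.0, 4.0, 1.5],
    d=[1, 0, 1, 0],
    g0_hat=[1.0, 1.0, 2.0, 2.0],
    g1_hat=[3.0, 3.0, 4.0, 4.0],
    m_hat=[0.5, 0.5, 0.5, 0.5],
)
```

The residual correction terms are what make the score insensitive to small errors in either nuisance model, which is why cross-fitting plus this score supports valid inference with flexible learners.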

Refutation

Overlap and refutation diagnostics validate support and robustness in observational settings.

Open refutation flow
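
One simple overlap diagnostic is to compare the ranges of estimated propensity scores across arms and count how many fall near 0 or 1. The sketch below is an illustration under assumptions: the function name `overlap_summary`, the cutoff `eps=0.01`, and the example scores are hypothetical and not part of this project's refutation flow.

```python
def overlap_summary(m_hat, d, eps=0.01):
    """Summarize estimated propensity scores per arm and flag extreme values."""
    treated = [m for m, di in zip(m_hat, d) if di == 1]
    control = [m for m, di in zip(m_hat, d) if di == 0]
    # Scores within eps of 0 or 1 inflate inverse-propensity weights.
    n_extreme = sum(1 for m in m_hat if m < eps or m > 1 - eps)
    return {
        "treated_range": (min(treated), max(treated)),
        "control_range": (min(control), max(control)),
        "share_extreme": n_extreme / len(m_hat),
    }

# Hypothetical propensity estimates for five units.
summary = overlap_summary(m_hat=[0.2, 0.6, 0.995, 0.4, 0.8], d=[0, 1, 1, 0, 1])
```

Ranges that barely intersect, or a large share of extreme scores, suggest that some units have no comparable counterpart in the other arm.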

Assumptions

Unconfoundedness: all confounders affecting treatment and outcome are observed.
Overlap: each unit has non-zero treatment probability conditional on covariates.
SUTVA: no interference and consistent treatment definitions.
Nuisance models are sufficiently accurate for orthogonalized estimation.

Data contract

user_id, treatment, outcome, confounders columns

Multi-arm Observational Study

4. Multi Unconfoundedness

Multi Unconfoundedness extends observational identification to multiple treatment arms. We estimate causal contrasts across arms by adjusting for observed confounders and modeling generalized propensity scores.

We use this setup when assignment is non-random and treatment has three or more levels.

Scenario Notebook

Best for

Multi-arm non-randomized interventions with rich covariates and sufficient overlap across arms.

Outputs

Pairwise and baseline-referenced treatment effects with orthogonalized robust inference.

DGPs

generate_multitreatment_gamma_26

Multi-treatment observational DGP with Gamma outcome and correlated confounding.

generate_multitreatment_binary_26

Multi-treatment observational DGP with binary outcome and heterogeneous effects.

Model Math

Multi-treatment IRM score construction, cross-fitting, and robust effect inference across arms.

Open model math
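
With several arms, the same doubly robust idea is applied per arm: each arm gets a score for its mean potential outcome, built from an outcome model and a generalized propensity score, and effects are contrasts between those scores. The sketch below assumes the nuisance predictions are already available per arm; the names `arm_mean_scores` and `contrast`, the dict-keyed-by-arm layout, and the toy numbers are hypothetical.

```python
import statistics

def arm_mean_scores(y, d, arm, g_hat, p_hat):
    """Per-unit AIPW scores for E[Y(arm)], given nuisance predictions for that arm."""
    return [g + (1.0 if di == arm else 0.0) * (yi - g) / p
            for yi, di, g, p in zip(y, d, g_hat, p_hat)]

def contrast(y, d, arm_a, arm_b, g_hat, p_hat):
    """Estimate E[Y(arm_a)] - E[Y(arm_b)]; g_hat and p_hat are dicts keyed by arm."""
    s_a = arm_mean_scores(y, d, arm_a, g_hat[arm_a], p_hat[arm_a])
    s_b = arm_mean_scores(y, d, arm_b, g_hat[arm_b], p_hat[arm_b])
    diffs = [a - b for a, b in zip(s_a, s_b)]
    estimate = statistics.fmean(diffs)
    se = statistics.stdev(diffs) / len(diffs) ** 0.5
    return estimate, se

# Toy inputs, invented for illustration: three units, three arms (0 is the baseline).
y = [1.0, 2.0, 4.0]
d = [0, 1, 2]
g_hat = {0: [1.0] * 3, 1: [2.0] * 3, 2: [3.0] * 3}
p_hat = {0: [0.25] * 3, 1: [0.25] * 3, 2: [0.5] * 3}
est, se = contrast(y, d, arm_a=2, arm_b=0, g_hat=g_hat, p_hat=p_hat)
```

Setting `arm_b` to the baseline arm gives baseline-referenced effects; any other pair gives a pairwise contrast.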

Refutation

Overlap and balance diagnostics ensure identification support across all treatment arms.

Open refutation flow

Assumptions

Multi-arm unconfoundedness: all confounders affecting arm assignment and outcomes are observed.
Overlap: each unit has positive probability for every treatment arm.
SUTVA: no interference and consistent treatment definitions across arms.
Nuisance models for outcome and generalized propensity are sufficiently accurate.

Data contract

user_id, treatment (multi-arm), outcome, confounders columns