Explore Causal Inference Scenarios
Start from the setup that looks closest to your problem. Each card jumps to a full workflow below with data assumptions, diagnostics, and model links.
How to choose a scenario
Introduction to Causal Inference
Potential outcomes, notation, estimands, and the core intuition behind causal effects.
Classic RCT
Clean treatment-control experiments without pre-treatment outcome data.
CUPED
Randomized experiments that use pre-period signal to tighten treatment-effect estimates.
Unconfoundedness
Client-level observational studies with rich confounders, overlap checks, and robust ATE estimation.
Instrumental Variables
Estimate local treatment effects when treatment is endogenous but a valid instrument is available.
GATE
Subgroup treatment effects built on top of an observational IRM workflow.
Multi Unconfoundedness
Observational settings with three or more treatment arms and pairwise effect contrasts.
Synthetic Control
Single treated-unit panel setups matched against a weighted synthetic donor pool.
Difference in Difference
Estimate causal effects by comparing the changes in outcomes over time between treated and control groups.
Uplift
Predict individual-level treatment effects to optimize targeting and maximize incremental impact.
How to choose a causal scenario
Start with the strongest design your data can support. Randomized assignment comes first; when randomization is not available, move to observed-confounder adjustment or a design-based panel strategy.
Classic RCT: Start here when treatment can be randomized between users or units. This is the default and most trusted scenario because assignment breaks the link between treatment and potential outcomes.
CUPED: Use this RCT variant when you also have strong pre-treatment signal. A predictive pre-treatment outcome or other pre-treatment covariates can reduce variance and tighten the estimate.
Unconfoundedness: Use this when treatment cannot be randomized, but the important confounders are observed. The model can control for those covariates, then support average effects and heterogeneity workflows.
Multi Unconfoundedness: Use this when non-randomized treatment has multiple discrete categories or arms. It extends the same observed-confounder logic to pairwise or baseline-referenced contrasts across arms.
Use a design-based scenario instead of relying only on covariate adjustment.
- Instrumental Variables: Useful when treatment is endogenous but you have a credible instrument that shifts treatment without directly shifting the outcome.
- Difference in Difference: Useful when treated and control units are observed before and after treatment. It relies on a parallel trends assumption.
- Synthetic Control: Useful when treatment is assigned at the aggregate unit level, often with one or a small number of treated units and many potential controls. It relies on strong pre-treatment fit.
When unconfoundedness is credible, it also unlocks heterogeneous treatment effects such as GATE for sub-groups and predictive Uplift.
Parallel trends is not simply weaker than unconfoundedness. It is a different identification assumption. DiD can handle some unobserved time-invariant confounding, while unconfoundedness requires all relevant confounders to be observed.
0. Introduction to Causal Inference
Get to know the notation and intuition of causal inference through the potential outcomes framework, and learn which effects can be estimated.
1. Classic RCT
The Randomized Controlled Trial (RCT) is widely considered the gold standard for establishing causal relationships. By randomly assigning subjects to treatment and control groups, we ensure that the groups are comparable on average in everything except the treatment itself.
We call it classic because no pre-treatment outcome data is available for the units.
ATE definition, diff-in-means estimator, and inference for continuous or conversion outcomes.
- Unconfoundedness: random assignment makes treatment independent of potential outcomes.
- Overlap: each unit has a non-zero probability of assignment to every arm.
- SUTVA: no interference and consistent treatment definitions.
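The difference-in-means estimator mentioned above can be sketched in a few lines. This is a minimal illustration rather than the linked model's implementation: the function name `diff_in_means`, the toy outcome values, and the normal-approximation interval (`z=1.96`) are all assumptions made for the example. The same formula applies to conversion outcomes coded as 0/1.

```python
import math
from statistics import mean, variance


def diff_in_means(treated, control, z=1.96):
    """Return (ate, standard_error, (ci_low, ci_high)) for a two-arm RCT."""
    ate = mean(treated) - mean(control)
    # Unpooled standard error: the sample variance is estimated separately per arm.
    se = math.sqrt(variance(treated) / len(treated) + variance(control) / len(control))
    return ate, se, (ate - z * se, ate + z * se)


# Hypothetical toy outcomes (for example, revenue per user).
treated = [12.0, 15.0, 11.0, 14.0, 13.0]
control = [10.0, 11.0, 9.0, 12.0, 8.0]

ate, se, ci = diff_in_means(treated, control)
print(f"ATE = {ate:.2f}, SE = {se:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
# prints: ATE = 3.00, SE = 1.00, 95% CI = (1.04, 4.96)
```

With samples this small the normal approximation is rough; it is used here only to keep the sketch dependency-free.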
2. CUPED
Controlled-experiment Using Pre-Experiment Data (CUPED) is a powerful variance reduction technique. By leveraging data from before the experiment started, we can reduce the variance of our metric and detect smaller effects with the same sample size.
We use CUPED when pre-treatment outcomes are available for the same units and are predictive of post-treatment outcomes.
CUPED adjustment, optimal theta estimation, and adjusted treatment-effect estimator.
- Unconfoundedness: assignment remains independent of potential outcomes.
- Overlap: each unit has non-zero assignment probability.
- SUTVA: no interference and consistent treatment definition.
- Pre-treatment outcomes are measured before treatment and predictive of post-period outcomes.
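As a rough sketch of the adjustment described above, the coefficient theta can be computed as cov(post, pre) / var(pre), and each outcome is then shifted by theta times the centered pre-period value. The function name `cuped_adjust`, the choice to pool theta across both arms, and the toy data are assumptions for illustration only.

```python
from statistics import mean, variance


def cuped_adjust(post, pre):
    """Return (theta, adjusted) with adjusted = post - theta * (pre - mean(pre))."""
    pre_bar, post_bar = mean(pre), mean(post)
    cov = sum((x - pre_bar) * (y - post_bar) for x, y in zip(pre, post)) / (len(pre) - 1)
    theta = cov / variance(pre)  # variance-minimizing coefficient
    return theta, [y - theta * (x - pre_bar) for x, y in zip(pre, post)]


# Hypothetical toy data: the first four units are treated, the last four are control.
treated_flag = [1, 1, 1, 1, 0, 0, 0, 0]
pre_metric = [10.0, 20.0, 30.0, 40.0, 10.0, 20.0, 30.0, 40.0]
post_metric = [13.0, 22.0, 33.0, 44.0, 11.0, 20.0, 29.0, 40.0]

theta, adjusted = cuped_adjust(post_metric, pre_metric)


def arm(values, flag):
    """Select the values that belong to one arm (1 = treated, 0 = control)."""
    return [v for v, t in zip(values, treated_flag) if t == flag]


raw_effect = mean(arm(post_metric, 1)) - mean(arm(post_metric, 0))
cuped_effect = mean(arm(adjusted, 1)) - mean(arm(adjusted, 0))
raw_var = variance(arm(post_metric, 1))
cuped_var = variance(arm(adjusted, 1))

print(f"theta = {theta:.2f}")
print(f"raw effect = {raw_effect:.2f}, CUPED effect = {cuped_effect:.2f}")
print(f"treated-arm variance: raw = {raw_var:.2f}, CUPED = {cuped_var:.2f}")
```

In this toy data the pre-period metric is perfectly balanced across arms, so the point estimate is unchanged while the within-arm variance drops sharply. That is the behavior CUPED aims for.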
Sample Ratio Mismatch (SRM) checks assignment integrity before CUPED analysis.
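An SRM check for a two-arm experiment can be sketched as a chi-square goodness-of-fit test against the planned split. The function name `srm_check`, the 50/50 default split, and the `alpha=0.001` threshold are assumptions chosen for this example. For one degree of freedom, the chi-square tail probability equals `erfc(sqrt(chi2 / 2))`, which keeps the sketch free of external libraries.

```python
import math


def srm_check(n_treated, n_control, expected_treated_share=0.5, alpha=0.001):
    """Return (chi2, p_value, flagged) for a two-arm sample ratio check."""
    total = n_treated + n_control
    expected_treated = total * expected_treated_share
    expected_control = total - expected_treated
    chi2 = ((n_treated - expected_treated) ** 2 / expected_treated
            + (n_control - expected_control) ** 2 / expected_control)
    # Chi-square survival function with one degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value, p_value < alpha


balanced = srm_check(5000, 5000)
skewed = srm_check(5300, 4700)
print("balanced:", balanced)
print("skewed:  ", skewed)
```

A flagged result suggests investigating assignment or logging before trusting any downstream effect estimate.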
3. Unconfoundedness
Unconfoundedness is the assumption that we have measured all variables that influence both the treatment assignment and the outcome. This allows us to estimate causal effects from observational data by adjusting for these measured confounders.
We use this setup when treatment is not randomized and we rely on rich confounders plus overlap checks for valid identification.
DML-IRM orthogonal score, cross-fitting, nuisance estimation, and robust ATE inference.
Overlap and refutation diagnostics validate support and robustness in observational settings.
- Unconfoundedness: all confounders affecting treatment and outcome are observed.
- Overlap: each unit has non-zero treatment probability conditional on covariates.
- SUTVA: no interference and consistent treatment definitions.
- Nuisance models are sufficiently accurate for orthogonalized estimation.
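To make the orthogonal score and cross-fitting concrete, here is a simplified sketch using the doubly robust (AIPW) form of the ATE score. It is not the linked model's implementation: the nuisance "models" are plain cell averages over a single binary confounder, the data is simulated with a true effect of 2.0, and every name (`fit_nuisances`, `aipw_scores`, the two-fold split) is an assumption for illustration.

```python
import math
import random
from statistics import mean, stdev


def fit_nuisances(rows):
    """Cell averages: propensity[x] = P(D=1 | x), outcome[(d, x)] = E[Y | D=d, x]."""
    propensity, outcome = {}, {}
    for x in {r[0] for r in rows}:
        cell = [r for r in rows if r[0] == x]
        propensity[x] = mean(d for _, d, _ in cell)
        for arm in (0, 1):
            outcome[(arm, x)] = mean(y for _, d, y in cell if d == arm)
    return propensity, outcome


def aipw_scores(rows, propensity, outcome):
    """Doubly robust score: outcome-model contrast plus inverse-propensity residuals."""
    scores = []
    for x, d, y in rows:
        g1, g0, m = outcome[(1, x)], outcome[(0, x)], propensity[x]
        scores.append(g1 - g0 + d * (y - g1) / m - (1 - d) * (y - g0) / (1 - m))
    return scores


# Simulated observational data (hypothetical): x confounds treatment d and outcome y.
random.seed(0)
rows = []
for _ in range(4000):
    x = int(random.random() < 0.5)
    d = int(random.random() < (0.8 if x else 0.2))
    y = 2.0 * d + 3.0 * x + random.gauss(0.0, 1.0)
    rows.append((x, d, y))

# Two-fold cross-fitting: nuisances are fit on one fold and scored on the other.
folds = [rows[0::2], rows[1::2]]
scores = []
for k in (0, 1):
    propensity, outcome = fit_nuisances(folds[1 - k])
    scores.extend(aipw_scores(folds[k], propensity, outcome))

ate = mean(scores)
se = stdev(scores) / math.sqrt(len(scores))
naive = mean(y for _, d, y in rows if d == 1) - mean(y for _, d, y in rows if d == 0)
print(f"naive difference = {naive:.2f}, AIPW ATE = {ate:.2f} (SE {se:.2f})")
```

The naive difference absorbs the confounder's effect, while the adjusted estimate should land near the simulated effect. In practice the cell averages would be replaced by flexible learners, which is where nuisance accuracy starts to matter.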
4. Instrumental Variables
Instrumental Variables (IV) estimate causal effects when treatment is not randomized and unobserved confounders may affect both treatment and outcome. A valid instrument shifts treatment assignment while affecting the outcome only through that treatment.
Use this setup for offer eligibility, encouragement designs, compliance gaps, and other cases where the estimand is a Local Average Treatment Effect (LATE) for compliers.
DoubleML-style Interactive Instrumental Variable Model for binary treatment and binary instrument LATE estimation.
- Relevance: the instrument changes treatment take-up.
- Exclusion: the instrument affects outcome only through treatment.
- Monotonicity: the instrument never moves any unit's treatment in the opposite direction (no defiers).
Offer eligibility example with `offer_eligible` as the instrument, `accepted_offer` as treatment, and `net_revenue_90d` as outcome.
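For intuition, the covariate-free special case of LATE is the Wald ratio: the instrument's effect on the outcome divided by its effect on treatment take-up. The sketch below reuses the variable names from the example above, but the data values and the function name `wald_late` are hypothetical, and the linked model additionally adjusts for covariates.

```python
from statistics import mean


def wald_late(instrument, treatment, outcome):
    """Return (late, itt, first_stage) for a binary instrument and binary treatment."""

    def arm_mean(values, z):
        return mean(v for v, zi in zip(values, instrument) if zi == z)

    itt = arm_mean(outcome, 1) - arm_mean(outcome, 0)              # effect of eligibility
    first_stage = arm_mean(treatment, 1) - arm_mean(treatment, 0)  # change in take-up
    if first_stage == 0:
        raise ValueError("instrument does not change take-up (relevance fails)")
    return itt / first_stage, itt, first_stage


# Hypothetical toy data: half of the eligible users accept; ineligible users cannot.
offer_eligible = [1, 1, 1, 1, 0, 0, 0, 0]
accepted_offer = [1, 1, 0, 0, 0, 0, 0, 0]
net_revenue_90d = [30.0, 30.0, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0]

late, itt, first_stage = wald_late(offer_eligible, accepted_offer, net_revenue_90d)
print(f"ITT = {itt:.1f}, first stage = {first_stage:.1f}, LATE = {late:.1f}")
# prints: ITT = 10.0, first stage = 0.5, LATE = 20.0
```

The intent-to-treat effect of 10 is spread over all eligible users, but only half of them comply, so the effect among compliers is twice as large.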
5. GATE
GATE extends the observational IRM workflow from one average effect to subgroup-level treatment effects. After fitting IRM, we project the orthogonal signal onto pre-defined client groups to measure how impact differs across segments.
We use this setup when subgroup definitions are chosen before treatment and we want interpretable, group-level heterogeneity instead of a single pooled estimate.
Orthogonal IRM signal averaged within groups, with subgroup regression and robust inference.
- Groups must be pre-defined and aligned to the fitted observations through `user_id`.
- The subgroup basis should be mutually exclusive and exhaustive.
- Every estimable group needs at least one treated and one control observation.
- SUTVA / consistency: observed outcomes correspond to the realized treatment, with no interference across units.
- Unconfoundedness: conditional on observed covariates, treatment assignment is independent of potential outcomes.
- Overlap / positivity holds for relevant covariates and within each estimable group.
- Group membership must be pre-treatment, not defined by treatment, outcomes, or post-treatment covariates.
- Cross-fitted nuisance models are accurate enough for the orthogonal score to behave like a stable pseudo-outcome.
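The group-averaging step can be sketched as follows, assuming the orthogonal signal has already been computed per user by a fitted IRM. The function name `gate_by_group`, the signal values, and the group labels are hypothetical. The per-group standard error shown is the simple standard error of a mean, not necessarily the robust variant the linked model reports.

```python
import math
from collections import defaultdict
from statistics import mean, stdev


def gate_by_group(signal_by_user, group_by_user):
    """Average the orthogonal signal within groups, aligned through user_id."""
    missing = set(signal_by_user) - set(group_by_user)
    if missing:
        raise ValueError(f"users without a group label: {sorted(missing)}")
    buckets = defaultdict(list)
    for user_id, psi in signal_by_user.items():
        buckets[group_by_user[user_id]].append(psi)
    result = {}
    for group, values in buckets.items():
        se = stdev(values) / math.sqrt(len(values)) if len(values) > 1 else float("nan")
        result[group] = (mean(values), se)
    return result


# Hypothetical orthogonal signal per user_id and a pre-treatment group label.
signal = {1: 2.0, 2: 4.0, 3: 3.0, 4: -1.0, 5: 1.0, 6: 0.0}
groups = {1: "new", 2: "new", 3: "new", 4: "loyal", 5: "loyal", 6: "loyal"}

gates = gate_by_group(signal, groups)
for group, (effect, se) in sorted(gates.items()):
    print(f"{group}: GATE = {effect:.2f} (SE {se:.2f})")
```

With mutually exclusive and exhaustive groups, these group means match the coefficients from regressing the signal on group indicators without an intercept.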
6. Multi Unconfoundedness
Multi Unconfoundedness extends observational identification to multiple treatment arms. We estimate causal contrasts across arms by adjusting for observed confounders and modeling generalized propensity scores.
We use this setup when assignment is non-random and treatment has three or more levels.
Multi-treatment IRM score construction, cross-fitting, and robust effect inference across arms.
Overlap and balance diagnostics ensure identification support across all treatment arms.
- Multi-arm unconfoundedness: all confounders affecting arm assignment and outcomes are observed.
- Overlap: each unit has positive probability for every treatment arm.
- SUTVA: no interference and consistent treatment definitions across arms.
- Nuisance models for outcome and generalized propensity are sufficiently accurate.
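As a simplified illustration of multi-arm adjustment, the sketch below estimates each arm's mean potential outcome by inverse weighting with a generalized propensity score taken from cell frequencies, then forms baseline-referenced contrasts. This is plain inverse propensity weighting, not the multi-treatment IRM score itself. The function name `ipw_arm_means`, the arm labels, and the data are hypothetical.

```python
from collections import Counter


def ipw_arm_means(rows, arms):
    """rows: (x, arm, y) tuples. Return arm -> IPW estimate of E[Y(arm)]."""
    n = len(rows)
    n_x = Counter(x for x, _, _ in rows)
    n_xa = Counter((x, a) for x, a, _ in rows)
    estimates = {}
    for arm in arms:
        # Overlap: every stratum must contain at least one unit from this arm.
        if any(n_xa[(x, arm)] == 0 for x in n_x):
            raise ValueError(f"overlap violated: arm {arm!r} is missing in some stratum")
        # Generalized propensity e_arm(x) = n_xa / n_x, so y / e = y * n_x / n_xa.
        estimates[arm] = sum(y * n_x[x] / n_xa[(x, a)] for x, a, y in rows if a == arm) / n
    return estimates


# Hypothetical data: stratum x=1 has higher outcomes and favors arm "C".
rows = (
    [(0, "A", 1.0)] * 3 + [(0, "B", 2.0)] * 2 + [(0, "C", 4.0)]
    + [(1, "A", 3.0)] + [(1, "B", 4.0)] * 2 + [(1, "C", 6.0)] * 3
)

arm_means = ipw_arm_means(rows, ["A", "B", "C"])
contrasts = {arm: arm_means[arm] - arm_means["A"] for arm in ("B", "C")}
print("arm means:", arm_means)
print("contrasts vs A:", contrasts)
```

The unadjusted gap between arms "C" and "A" in this data is 4.0, while the weighted contrast recovers the 3.0 that holds within each stratum.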
7. Synthetic Control
Synthetic Control estimates treatment effects for one treated unit by constructing a weighted combination of untreated units that matches pre-treatment dynamics.
We use this setup when a single unit becomes treated at a known point in time and pre-treatment trajectories are rich enough to build a credible synthetic counterpart.
Augmented Synthetic Control formulation with bias correction and post-treatment effect aggregation.
Placebo and pre-period fit checks validate synthetic control quality and support interpretation.
- There is one treated unit, and the untreated donor units are unaffected by treatment spillovers.
- Pre-treatment outcomes are informative enough to approximate the treated counterfactual.
- No structural break unrelated to treatment occurs after treatment starts, since it would invalidate post-treatment comparisons.
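The weighting idea can be sketched with the plain (non-augmented) synthetic control: non-negative donor weights that sum to one are chosen to minimize pre-treatment squared error. The sketch uses a small Frank-Wolfe loop with exact line search so that it needs no external solver. The bias correction of the augmented variant is not included, and all names and data are hypothetical.

```python
import math


def fit_simplex_weights(treated_pre, donors_pre, n_iter=2000):
    """Minimize sum_t (treated_t - sum_j w_j * donor_jt)^2 over the simplex."""
    k, periods = len(donors_pre), len(treated_pre)
    w = [1.0 / k] * k  # start from equal weights
    for _ in range(n_iter):
        synth = [sum(w[j] * donors_pre[j][t] for j in range(k)) for t in range(periods)]
        resid = [s - y for s, y in zip(synth, treated_pre)]
        grad = [2.0 * sum(r * donors_pre[j][t] for t, r in enumerate(resid)) for j in range(k)]
        best = min(range(k), key=lambda j: grad[j])  # donor to move toward
        direction = [donors_pre[best][t] - synth[t] for t in range(periods)]
        denom = sum(v * v for v in direction)
        if denom == 0.0:
            break
        # Exact line search for a quadratic objective, clipped to [0, 1].
        gamma = max(0.0, min(1.0, -sum(r * v for r, v in zip(resid, direction)) / denom))
        if gamma == 0.0:
            break
        w = [(1.0 - gamma) * wj for wj in w]
        w[best] += gamma
    return w


def combine(weights, donors):
    return [sum(w * s[t] for w, s in zip(weights, donors)) for t in range(len(donors[0]))]


def rmse(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Hypothetical donor pool: pre-period (6 points) and post-period (3 points) outcomes.
donors_pre = [[10, 11, 12, 13, 14, 15], [20, 19, 21, 20, 22, 21], [5, 7, 6, 8, 7, 9]]
donors_post = [[16, 17, 18], [22, 21, 23], [8, 9, 10]]
# The treated unit is built as 0.6 * donor 0 + 0.4 * donor 1, plus a post-period lift of 5.
treated_pre = [0.6 * a + 0.4 * b for a, b in zip(donors_pre[0], donors_pre[1])]
treated_post = [0.6 * a + 0.4 * b + 5.0 for a, b in zip(donors_post[0], donors_post[1])]

weights = fit_simplex_weights(treated_pre, donors_pre)
fit_rmse = rmse(combine(weights, donors_pre), treated_pre)
equal_rmse = rmse(combine([1 / 3] * 3, donors_pre), treated_pre)
gaps = [y - s for y, s in zip(treated_post, combine(weights, donors_post))]
avg_effect = sum(gaps) / len(gaps)
print("weights:", [round(w, 3) for w in weights])
print(f"pre-period RMSE = {fit_rmse:.3f}, average post-period gap = {avg_effect:.2f}")
```

The pre-period RMSE doubles as the fit check mentioned above: a large value signals that the donor pool cannot reproduce the treated unit's trajectory.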
8. Difference in Difference
Difference in Difference (DiD) estimates causal effects by comparing the changes in outcomes over time between a treatment group and a control group.
It accounts for time-invariant differences between groups by subtracting out the baseline difference, relying on the parallel trends assumption to identify the treatment effect.
- Parallel Trends: In the absence of treatment, the difference between groups would have remained constant.
- No Anticipation: Units do not change behavior before treatment starts.
- SUTVA: No spillovers between treated and control units.
Requires panel data with clear unit identifiers and indicators for treatment assignment and timing.
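In the simplest two-group, two-period case, the estimator is the change in the treated group minus the change in the control group. The sketch below assumes a long-format panel with hypothetical field names (`unit_id`, `treated`, `post`, `y`) and toy values.

```python
from statistics import mean


def did_2x2(rows):
    """Return (treated change) - (control change) from long-format panel rows."""

    def cell(treated, post):
        return mean(r["y"] for r in rows if r["treated"] == treated and r["post"] == post)

    treated_change = cell(1, 1) - cell(1, 0)
    control_change = cell(0, 1) - cell(0, 0)
    return treated_change - control_change


# Hypothetical panel: two treated and two control units, observed before and after.
panel = [
    {"unit_id": "a", "treated": 1, "post": 0, "y": 10.0},
    {"unit_id": "a", "treated": 1, "post": 1, "y": 16.0},
    {"unit_id": "b", "treated": 1, "post": 0, "y": 12.0},
    {"unit_id": "b", "treated": 1, "post": 1, "y": 18.0},
    {"unit_id": "c", "treated": 0, "post": 0, "y": 8.0},
    {"unit_id": "c", "treated": 0, "post": 1, "y": 11.0},
    {"unit_id": "d", "treated": 0, "post": 0, "y": 10.0},
    {"unit_id": "d", "treated": 0, "post": 1, "y": 13.0},
]

effect = did_2x2(panel)
print(f"DiD estimate = {effect:.1f}")
# prints: DiD estimate = 3.0
```

The treated group starts 2 units higher than the control group. That baseline gap cancels out, and only the extra change of 3 is attributed to treatment under parallel trends.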
9. Uplift
Uplift modeling estimates the Conditional Average Treatment Effect (CATE) for individual units, enabling optimal targeting and personalized decision making.
While ATE gives the average effect, Uplift helps identify "Persuadables" (those who respond positively only when treated) while avoiding "Sure Things" (those who convert regardless of treatment) and "Sleeping Dogs" (those who respond negatively to treatment).
- Conditional Independence: All variables affecting both treatment and outcome are observed.
- Positivity: Every individual has a non-zero probability of receiving either treatment or control.
- Model Stability: The relationship between features and effects holds for the scoring population.
Requires training data with treatment and outcome, and scoring data with pre-treatment features.
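A deliberately simple way to illustrate the train-then-score flow is a segment-level two-rate estimate: within each pre-treatment segment, uplift is the treated conversion rate minus the control conversion rate, and new users inherit their segment's score. Real uplift models use richer features and learners. The function names, segment labels, and counts below are hypothetical.

```python
from collections import defaultdict


def fit_segment_uplift(train):
    """train: (segment, treated, converted) tuples. Return segment -> uplift."""
    # counts[segment][treated] = [number of users, number of conversions]
    counts = defaultdict(lambda: [[0, 0], [0, 0]])
    for segment, treated, converted in train:
        counts[segment][treated][0] += 1
        counts[segment][treated][1] += converted
    uplift = {}
    for segment, (control, treat) in counts.items():
        if control[0] == 0 or treat[0] == 0:
            continue  # positivity fails: this segment cannot be scored
        uplift[segment] = treat[1] / treat[0] - control[1] / control[0]
    return uplift


def make_rows(segment, treated, n, conversions):
    """Helper for the toy data: n users, the first `conversions` of them convert."""
    return [(segment, treated, 1 if i < conversions else 0) for i in range(n)]


# Hypothetical training data with treatment and outcome.
train = (
    make_rows("new", 1, 4, 3) + make_rows("new", 0, 4, 1)                  # persuadable
    + make_rows("loyal", 1, 4, 3) + make_rows("loyal", 0, 4, 3)            # sure things
    + make_rows("churn_risk", 1, 4, 1) + make_rows("churn_risk", 0, 4, 2)  # sleeping dogs
)
uplift = fit_segment_uplift(train)

# Hypothetical scoring data: only pre-treatment features (here, the segment).
scoring = {"u1": "new", "u2": "loyal", "u3": "churn_risk"}
targeted = sorted(u for u, seg in scoring.items() if uplift.get(seg, 0.0) > 0)
print("uplift by segment:", uplift)
print("users to target:", targeted)
```

Only the segment with positive estimated uplift is targeted. Treating the zero-uplift segment would waste budget, and treating the negative-uplift segment would reduce conversions.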