Decision guide

How to choose a causal scenario

Start with the strongest design your data can support. Randomized assignment comes first; when randomization is not available, move to observed-confounder adjustment or a design-based panel strategy.

1
Classic RCT

Start here when treatment can be randomized between users or units.

This is the default and most trusted scenario because random assignment breaks the link between treatment and potential outcomes.

2
CUPED

Use this RCT variant when you also have strong pre-treatment signal.

A predictive pre-treatment outcome or other pre-treatment covariates can reduce variance and tighten the estimate.

3
Unconfoundedness

Use this when treatment cannot be randomized, but the important confounders are observed.

The model adjusts for those covariates and then supports both average-effect and heterogeneity workflows.

4
Multi Unconfoundedness

Use this when non-randomized treatment has multiple discrete categories or arms.

It extends the same observed-confounder logic to pairwise or baseline-referenced contrasts across arms.

Heterogeneity and scoring

When unconfoundedness is credible, it also unlocks heterogeneous treatment effect workflows, such as GATE for subgroups and predictive Uplift.

Assumption check

Parallel trends is not simply weaker than unconfoundedness. It is a different identification assumption. DiD can handle some unobserved time-invariant confounding, while unconfoundedness requires all relevant confounders to be observed.

Foundations

0. Introduction to Causal Inference

Get to know the notation and intuition of causal inference under the potential outcomes framework, and learn which effects can be estimated.

Classic

1. Classic RCT

The Randomized Controlled Trial (RCT) is widely considered the gold standard for establishing causal relationships. By randomly assigning subjects to treatment and control groups, we ensure that the groups are comparable in expectation on both observed and unobserved characteristics, so systematic differences in outcomes can be attributed to the treatment.

We call it classic because it uses no pre-treatment data on the units.

Model Math

ATE definition, diff-in-means estimator, and inference for continuous or conversion outcomes.
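
As a rough illustration of the diff-in-means estimator mentioned above, the sketch below computes the difference in arm means with an unpooled (Welch-style) standard error and a normal-approximation interval. The function name `diff_in_means` and the toy numbers are assumptions made for this example, not part of any library.

```python
import math
from statistics import mean, variance

def diff_in_means(treated, control):
    """Return (ate_hat, standard_error, ci95) for two lists of outcomes."""
    ate_hat = mean(treated) - mean(control)
    # Unpooled variance: each arm contributes its own sample variance / n.
    se = math.sqrt(variance(treated) / len(treated) + variance(control) / len(control))
    z = 1.96  # normal approximation for a 95% interval
    return ate_hat, se, (ate_hat - z * se, ate_hat + z * se)

# Hypothetical outcomes for illustration only.
treated = [12.0, 15.0, 11.0, 14.0, 13.0]
control = [10.0, 11.0, 9.0, 12.0, 8.0]
ate_hat, se, ci95 = diff_in_means(treated, control)
print(ate_hat, se, ci95)
```

For a conversion outcome the same function applies to 0/1 values, since the mean of a binary outcome is the conversion rate.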

Assumptions
  • Unconfoundedness: random assignment makes treatment independent of potential outcomes.
  • Overlap: each unit has a non-zero probability of assignment to every arm.
  • SUTVA: no interference and consistent treatment definitions.
Monitoring

Sample Ratio Mismatch (SRM) is the first alert for experiment health.
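
One common way to check for SRM is a chi-square goodness-of-fit test on the arm counts. The sketch below assumes a two-arm experiment with a planned split; the function name `srm_check` and the counts are hypothetical. With one degree of freedom, the p-value can be computed from the standard library via `math.erfc`.

```python
import math

def srm_check(n_control, n_treated, expected_treated_share=0.5):
    """Chi-square goodness-of-fit test (1 degree of freedom) on arm counts."""
    total = n_control + n_treated
    expected_treated = total * expected_treated_share
    expected_control = total - expected_treated
    chi2 = ((n_treated - expected_treated) ** 2 / expected_treated
            + (n_control - expected_control) ** 2 / expected_control)
    # For 1 df, P(chi2_1 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical counts: a planned 50/50 split that came out 5000 vs 5300.
chi2, p_value = srm_check(5000, 5300)
print(chi2, p_value)
```

A small p-value suggests that the observed split is unlikely under the planned ratio; the alert threshold is a team convention rather than something this sketch fixes.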

Data contract
user_id, treatment, outcome, confounders columns
Variance Reduction

2. CUPED

Controlled-experiment Using Pre-Experiment Data (CUPED) is a powerful variance reduction technique. By leveraging data from before the experiment started, we can reduce the variance of our metric and detect smaller effects with the same sample size.

We use CUPED when pre-treatment outcomes are available for the same units and are predictive of post-treatment outcomes.

Model Math

CUPED adjustment, optimal theta estimation, and adjusted treatment-effect estimator.
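
The CUPED adjustment can be sketched as follows, assuming a single pre-period covariate and the usual choice theta = cov(y, y_pre) / var(y_pre) estimated on the pooled sample. The helper names and toy data are assumptions for illustration.

```python
from statistics import mean, variance

def cuped_adjust(y, y_pre):
    """Return (theta, adjusted outcomes) with theta = cov(y, y_pre) / var(y_pre)."""
    y_bar, x_bar = mean(y), mean(y_pre)
    cov = sum((a - y_bar) * (b - x_bar) for a, b in zip(y, y_pre))
    var = sum((b - x_bar) ** 2 for b in y_pre)
    theta = cov / var
    # Subtract the part of the outcome explained by the centered pre-period value.
    adjusted = [a - theta * (b - x_bar) for a, b in zip(y, y_pre)]
    return theta, adjusted

def arm_difference(values, treatment):
    """Difference in means between treated (1) and control (0) units."""
    treated = [v for v, d in zip(values, treatment) if d == 1]
    control = [v for v, d in zip(values, treatment) if d == 0]
    return mean(treated) - mean(control)

# Toy data built as y = 2 * y_pre + treatment, so the true effect is 1.
y_pre = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
treatment = [0, 1, 0, 1, 0, 1]
y = [2.0, 5.0, 6.0, 9.0, 10.0, 13.0]

theta, y_cuped = cuped_adjust(y, y_pre)
raw_effect = arm_difference(y, treatment)
cuped_effect = arm_difference(y_cuped, treatment)
print(theta, raw_effect, cuped_effect, variance(y), variance(y_cuped))
```

In this toy sample the adjusted metric has much lower variance than the raw one, and the adjusted estimate removes the chance pre-period imbalance between arms.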

Assumptions
  • Unconfoundedness: assignment remains independent of potential outcomes.
  • Overlap: each unit has non-zero assignment probability.
  • SUTVA: no interference and consistent treatment definition.
  • Pre-treatment outcomes are measured before treatment and predictive of post-period outcomes.
Monitoring

Sample Ratio Mismatch (SRM) checks assignment integrity before CUPED analysis.

Data contract
user_id, treatment, outcome, confounders columns, y_pre (or y_pre, y_pre_2)
Observational Client-level Study

3. Unconfoundedness

Unconfoundedness is the assumption that we have measured all variables that influence both the treatment assignment and the outcome. This allows us to estimate causal effects from observational data by adjusting for these measured confounders.

We use this setup when treatment is not randomized and we rely on rich confounders plus overlap checks for valid identification.

Model Math

DML-IRM orthogonal score, cross-fitting, nuisance estimation, and robust ATE inference.
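
To make the orthogonal score concrete, the sketch below evaluates the doubly robust (AIPW) score that underlies IRM-style ATE estimation. It assumes the nuisance predictions are already available: `g0` and `g1` are predicted outcomes under control and treatment, and `m` is the predicted propensity. In practice these would come from cross-fitted models; here they are hand-written hypothetical values, and `aipw_ate` is a name made up for this example.

```python
import math
from statistics import mean, stdev

def aipw_ate(y, d, g0, g1, m):
    """ATE and standard error from the AIPW score, given nuisance predictions."""
    psi = [
        (mu1 - mu0)
        + di * (yi - mu1) / mi
        - (1 - di) * (yi - mu0) / (1 - mi)
        for yi, di, mu0, mu1, mi in zip(y, d, g0, g1, m)
    ]
    ate_hat = mean(psi)
    se = stdev(psi) / math.sqrt(len(psi))
    return ate_hat, se, psi

# Hypothetical data and nuisance predictions for illustration only.
y = [3.5, 1.0, 4.0, 1.5]
d = [1, 0, 1, 0]
g1 = [3.0, 2.0, 4.0, 3.0]   # predicted outcome if treated
g0 = [2.0, 1.0, 3.0, 2.0]   # predicted outcome if not treated
m = [0.5, 0.5, 0.5, 0.5]    # predicted propensity, must lie strictly in (0, 1)

ate_hat, se, psi = aipw_ate(y, d, g0, g1, m)
print(ate_hat, se, psi)
```

The division by `m` and `1 - m` is where the overlap assumption matters: propensities near 0 or 1 inflate individual scores and the resulting standard error.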

Refutation

Overlap and refutation diagnostics validate support and robustness in observational settings.

Assumptions
  • Unconfoundedness: all confounders affecting treatment and outcome are observed.
  • Overlap: each unit has non-zero treatment probability conditional on covariates.
  • SUTVA: no interference and consistent treatment definitions.
  • Nuisance models are sufficiently accurate for orthogonalized estimation.
Data contract
user_id, treatment, outcome, confounders columns
Unobserved Confounding

4. Instrumental Variables

Instrumental Variables (IV) estimate causal effects when treatment is not randomized and unobserved confounders may affect both treatment and outcome. A valid instrument shifts treatment assignment while affecting the outcome only through that treatment.

Use this setup for offer eligibility, encouragement designs, compliance gaps, and other cases where the estimand is a Local Average Treatment Effect (LATE) for compliers.

Model

DoubleML-style Interactive Instrumental Variable Model for binary treatment and binary instrument LATE estimation.
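
For intuition only, the simplest LATE estimator with a binary instrument and binary treatment is the Wald ratio: the instrument's effect on the outcome divided by its effect on treatment take-up. The sketch below is that simpler estimator without covariates, not the interactive IV model named above; the function name `wald_late` and the data are hypothetical.

```python
from statistics import mean

def wald_late(z, d, y):
    """Wald ratio: (E[Y|Z=1] - E[Y|Z=0]) / (E[D|Z=1] - E[D|Z=0])."""
    y1 = mean([yi for zi, yi in zip(z, y) if zi == 1])
    y0 = mean([yi for zi, yi in zip(z, y) if zi == 0])
    d1 = mean([di for zi, di in zip(z, d) if zi == 1])
    d0 = mean([di for zi, di in zip(z, d) if zi == 0])
    first_stage = d1 - d0
    if first_stage == 0:
        raise ValueError("Instrument does not change take-up (relevance fails).")
    return (y1 - y0) / first_stage, first_stage

# Hypothetical data: z is eligibility, d is acceptance, y is revenue.
z = [1, 1, 1, 1, 0, 0, 0, 0]
d = [1, 1, 1, 0, 0, 0, 0, 1]
y = [20.0, 22.0, 18.0, 10.0, 10.0, 12.0, 8.0, 20.0]

late_hat, first_stage = wald_late(z, d, y)
print(late_hat, first_stage)
```

A first stage close to zero makes the ratio unstable, which is why relevance is usually checked before the effect is interpreted.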

Assumptions
  • Relevance: the instrument changes treatment take-up.
  • Exclusion: the instrument affects outcome only through treatment.
  • Monotonicity: the instrument never pushes any unit away from treatment (no defiers).
DGP

Offer eligibility example with `offer_eligible` as the instrument, `accepted_offer` as treatment, and `net_revenue_90d` as outcome.

Data contract
user_id, instrument, treatment, outcome, confounders columns
Heterogeneous Effects

5. GATE

GATE extends the observational IRM workflow from one average effect to subgroup-level treatment effects. After fitting IRM, we project the orthogonal signal onto pre-defined client groups to measure how impact differs across segments.

We use this setup when subgroup definitions are chosen before treatment and we want interpretable, group-level heterogeneity instead of a single pooled estimate.

Model Math

Orthogonal IRM signal averaged within groups, with subgroup regression and robust inference.
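
Averaging the orthogonal signal within groups can be sketched as below. The example assumes that a per-unit orthogonal score `psi` has already been produced by a fitted IRM and that each unit carries one pre-defined group label; both inputs are hypothetical values, and `gate_by_group` is a name invented for this illustration.

```python
import math
from collections import defaultdict
from statistics import mean, stdev

def gate_by_group(psi, groups):
    """Return {group: (gate_hat, standard_error)} from per-unit orthogonal scores."""
    by_group = defaultdict(list)
    for score, group in zip(psi, groups):
        by_group[group].append(score)
    # Each group needs at least two units for stdev to be defined.
    return {
        group: (mean(scores), stdev(scores) / math.sqrt(len(scores)))
        for group, scores in by_group.items()
    }

# Hypothetical per-unit scores and mutually exclusive group labels.
psi = [2.0, 1.0, 3.0, 0.5, 1.5, 1.0]
groups = ["new", "new", "new", "loyal", "loyal", "loyal"]

gates = gate_by_group(psi, groups)
print(gates)
```

With mutually exclusive and exhaustive groups, this group mean coincides with regressing the signal on group indicators without an intercept.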

Group Design
  • Groups must be pre-defined and aligned to the fitted observations through `user_id`.
  • The subgroup basis should be mutually exclusive and exhaustive.
  • Every estimable group needs at least one treated and one control observation.
Assumptions
  • SUTVA / consistency: observed outcomes correspond to the realized treatment, with no interference across units.
  • Unconfoundedness: conditional on observed covariates, treatment assignment is independent of potential outcomes.
  • Overlap / positivity holds for relevant covariates and within each estimable group.
  • Group membership must be pre-treatment, not defined by treatment, outcomes, or post-treatment covariates.
  • Cross-fitted nuisance models are accurate enough for the orthogonal score to behave like a stable pseudo-outcome.
Data contract
user_id, treatment, outcome, confounders columns
Multi-arm Observational Study

6. Multi Unconfoundedness

Multi Unconfoundedness extends observational identification to multiple treatment arms. We estimate causal contrasts across arms by adjusting for observed confounders and modeling generalized propensity scores.

We use this setup when assignment is non-random and treatment has three or more levels.

Model Math

Multi-treatment IRM score construction, cross-fitting, and robust effect inference across arms.

Refutation

Overlap and balance diagnostics ensure identification support across all treatment arms.

Assumptions
  • Multi-arm unconfoundedness: all confounders affecting arm assignment and outcomes are observed.
  • Overlap: each unit has positive probability for every treatment arm.
  • SUTVA: no interference and consistent treatment definitions across arms.
  • Nuisance models for outcome and generalized propensity are sufficiently accurate.
Data contract
user_id, treatment (multi-arm), outcome, confounders columns
Panel Time Series

7. Synthetic Control

Synthetic Control estimates treatment effects for one treated unit by constructing a weighted combination of untreated units that matches pre-treatment dynamics.

We use this setup when treatment happens at a unit-time level and pre-treatment trajectories are rich enough to build a credible synthetic counterpart.

Model Math

Augmented Synthetic Control formulation with bias correction and post-treatment effect aggregation.

Diagnostics

Placebo and pre-period fit checks validate synthetic control quality and support interpretation.

Assumptions
  • There is a single treated unit, and the untreated donor units are unaffected by treatment spillovers.
  • Pre-treatment outcomes are informative enough to approximate the treated counterfactual.
  • No structural break unrelated to treatment invalidates post-treatment comparisons.
Data contract
unit_id, time, is_treated_unit, post_treatment, outcome
Quasi-Experimental

8. Difference-in-Differences

Difference-in-Differences (DiD) estimates causal effects by comparing the changes in outcomes over time between a treatment group and a control group.

It accounts for time-invariant differences between groups by subtracting out the baseline difference, relying on the parallel trends assumption to identify the treatment effect.
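
The subtraction described above can be sketched for the simplest two-group, two-period case. The example assumes rows shaped like the data contract, with `post_treatment` flagging post-period rows for every unit, treated or not; that reading of the column, the function name `did_2x2`, and the numbers are assumptions for illustration.

```python
from statistics import mean

def did_2x2(rows):
    """(treated post - treated pre) - (control post - control pre)."""
    def cell_mean(treated, post):
        return mean([
            r["outcome"] for r in rows
            if r["is_treated_unit"] == treated and r["post_treatment"] == post
        ])
    treated_change = cell_mean(1, 1) - cell_mean(1, 0)
    control_change = cell_mean(0, 1) - cell_mean(0, 0)
    return treated_change - control_change

# Hypothetical panel: one treated unit (A) and one control unit (B), two periods.
rows = [
    {"unit_id": "A", "time": 0, "is_treated_unit": 1, "post_treatment": 0, "outcome": 10.0},
    {"unit_id": "A", "time": 1, "is_treated_unit": 1, "post_treatment": 1, "outcome": 15.0},
    {"unit_id": "B", "time": 0, "is_treated_unit": 0, "post_treatment": 0, "outcome": 8.0},
    {"unit_id": "B", "time": 1, "is_treated_unit": 0, "post_treatment": 1, "outcome": 10.0},
]

did_hat = did_2x2(rows)
print(did_hat)
```

The baseline gap of 2 between the groups drops out of the estimate, while the control group's change of 2 stands in for what the treated group would have done without treatment.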

Assumptions
  • Parallel Trends: In the absence of treatment, the difference between groups would have remained constant.
  • No Anticipation: Units do not change behavior before treatment starts.
  • SUTVA: No spillovers between treated and control units.
Data contract
unit_id, time, is_treated_unit, post_treatment, outcome

Requires panel data format with clear unit identifiers and indicators for treatment assignment and timing.

Targeting

9. Uplift

Uplift modeling estimates the Conditional Average Treatment Effect (CATE) for individual units, enabling optimal targeting and personalized decision making.

While ATE gives the average effect, Uplift helps identify "Persuadables" (those who respond positively only when treated) while avoiding "Sure Things," who convert regardless of treatment, and "Sleeping Dogs," who respond negatively to it.

Assumptions
  • Conditional Independence: All variables affecting both treatment and outcome are observed.
  • Positivity: Every individual has a non-zero probability of receiving either treatment or control.
  • Model Stability: The relationship between features and effects holds for the scoring population.
Data contract
user_id, treatment, outcome, features...

Requires training data with treatment and outcome, and scoring data with pre-treatment features.