A/B tests
Design, monitor, and analyze randomized controlled trials
USE CASES
Design, monitor, and analyze randomized controlled trials
Estimate effects when there is no random assignment, using causal inference techniques
Check robustness and accuracy so your decisions are trustworthy
Each scenario maps to a modern estimator with defensible inference, implemented through a consistent fit() → estimate() API.
Backed by the literature
Guardrails included: SRM checks (Fabijan et al., 2019) and sensitivity analysis (Chernozhukov et al., 2024).
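An SRM check boils down to a chi-square goodness-of-fit test on the assignment counts. A minimal self-contained sketch (the `srm_check` helper is illustrative, not Causalis's API):

```python
import math

def srm_check(n_treated: int, n_control: int,
              expected_ratio: float = 0.5, alpha: float = 1e-3):
    """Flag a Sample Ratio Mismatch when observed assignment counts
    deviate from the design ratio (chi-square test, 1 df)."""
    total = n_treated + n_control
    exp_t = total * expected_ratio
    exp_c = total - exp_t
    stat = (n_treated - exp_t) ** 2 / exp_t + (n_control - exp_c) ** 2 / exp_c
    # survival function of chi-square with 1 df via the complementary error function
    p_value = math.erfc(math.sqrt(stat / 2))
    return p_value, p_value < alpha  # (p-value, SRM detected?)

# 3454 vs 6546 is a severe mismatch under a 50/50 design...
p_bad, srm_bad = srm_check(3454, 6546, expected_ratio=0.5)
# ...while 5010 vs 4990 is well within noise.
p_ok, srm_ok = srm_check(5010, 4990, expected_ratio=0.5)
```

The strict default `alpha = 1e-3` is conventional for SRM alarms: a mismatch this test flags almost always indicates a broken randomizer or logging pipeline, not chance.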
Causalis is built like a real analytics system: typed data contracts, test coverage, deterministic estimators, and performance defaults that scale.
Validate schema, types, nullability, and column roles before any estimator runs.
CausalData enforces outcome/treatment/confounders
Consistent API docs with assumptions, inputs, outputs, and failure modes.
Docstring-first public API
Unit + integration tests over synthetic DGPs to prevent silent regressions.
Deterministic seeds + known ground truth
Approximation options when bootstrap is too expensive at scale.
Analytic / IF-based SEs where appropriate
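For a smooth estimator like a difference in means, the analytic standard error matches the bootstrap at a fraction of the cost, which is why a closed form is the sensible default at scale. A self-contained comparison (illustrative, not Causalis internals):

```python
import math
import random

random.seed(3)
t = [random.gauss(1.0, 1.0) for _ in range(1000)]  # treated outcomes
c = [random.gauss(0.0, 1.0) for _ in range(1000)]  # control outcomes

def mean(xs):
    return sum(xs) / len(xs)

def var(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# closed-form (analytic) SE of the difference in means: one pass over the data
se_analytic = math.sqrt(var(t) / len(t) + var(c) / len(c))

# nonparametric bootstrap SE: resample each arm, recompute the estimate
boot = []
for _ in range(300):
    tb = [random.choice(t) for _ in t]
    cb = [random.choice(c) for _ in c]
    boot.append(mean(tb) - mean(cb))
se_boot = math.sqrt(var(boot))
```

The bootstrap needs hundreds of full re-estimations to get the same number the analytic formula produces in one pass; the gap widens when each re-estimation itself fits nuisance models.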
Strong out-of-the-box nuisance models for modern causal ML workflows.
CatBoost default for DML/IRM
Guardrails that catch common experiment and data issues early.
SRM, balance checks, sensitivity hooks
Typed contracts. Tested estimators. Fast uncertainty. Strong defaults.
Every scenario ships with synthetic generators that encode the causal effect, so you can validate estimators against ground truth, test uncertainty, and benchmark robustness.
Minimal DGP to verify pipelines and baseline estimators end-to-end.
No confounding • constant effect
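A minimal DGP of this kind fits on one screen: random assignment, a constant effect, and a difference in means that recovers it. A sketch with assumed parameters (`tau = 2.0`), not the library's generator:

```python
import random

random.seed(0)
tau = 2.0   # known ground-truth effect encoded in the DGP
n = 20_000

d = [random.random() < 0.5 for _ in range(n)]        # pure random assignment
y = [random.gauss(1.0, 1.0) + tau * di for di in d]  # constant additive lift

treated = [yi for yi, di in zip(y, d) if di]
control = [yi for yi, di in zip(y, d) if not di]
ate_hat = sum(treated) / len(treated) - sum(control) / len(control)
# with no confounding, ate_hat lands near the ground-truth tau
```

Because the effect is baked into the generator, any estimator in the pipeline can be checked end-to-end against the known answer.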
Prognostic signal + optional y_pre to benchmark CUPED / Lin adjustment.
add_pre • pre_corr • prognostic_scale
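The CUPED adjustment itself is a one-liner: shift the outcome by `theta * (y_pre - mean(y_pre))` with `theta = cov(y, y_pre) / var(y_pre)`, then run the usual difference in means. A self-contained sketch with assumed parameters, not the library's implementation:

```python
import random

random.seed(1)
n = 20_000
tau = 0.5
y_pre = [random.gauss(0.0, 1.0) for _ in range(n)]   # pre-period metric
d = [random.random() < 0.5 for _ in range(n)]        # random assignment
# post-period outcome correlated with the pre-period metric
y = [0.8 * yp + random.gauss(0.0, 0.6) + tau * di for yp, di in zip(y_pre, d)]

mean_pre = sum(y_pre) / n
mean_y = sum(y) / n
cov = sum((yp - mean_pre) * (yi - mean_y) for yp, yi in zip(y_pre, y)) / n
var_pre = sum((yp - mean_pre) ** 2 for yp in y_pre) / n
theta = cov / var_pre
y_adj = [yi - theta * (yp - mean_pre) for yi, yp in zip(y, y_pre)]

def diff_means(y, d):
    t = [yi for yi, di in zip(y, d) if di]
    c = [yi for yi, di in zip(y, d) if not di]
    return sum(t) / len(t) - sum(c) / len(c)

def var_of(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# both estimators target tau; CUPED shrinks the outcome variance
ate_raw, ate_cuped = diff_means(y, d), diff_means(y_adj, d)
```

Under randomization the adjustment leaves the estimand untouched and only removes pre-period variance, which is why a prognostic `y_pre` is worth benchmarking against.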
Confounding through X only: DML/IRM should recover the effect with valid inference.
beta_d ≠ 0 • beta_y ≠ 0 • u_strength_* = 0
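To see why adjustment matters here, compare the naive difference in means with a backdoor adjustment on the observed confounder. The sketch below stratifies on a binary X for transparency (Causalis uses DML/IRM for the general case); the parameter values are assumptions:

```python
import random

random.seed(2)
n = 40_000
tau = 1.0
rows = []
for _ in range(n):
    x = random.random() < 0.5
    d = random.random() < (0.8 if x else 0.2)   # treatment depends on X
    y = tau * d + 2.0 * x + random.gauss(0.0, 1.0)  # outcome depends on X too
    rows.append((x, d, y))

def diff_means(rows):
    t = [y for _, d, y in rows if d]
    c = [y for _, d, y in rows if not d]
    return sum(t) / len(t) - sum(c) / len(c)

naive = diff_means(rows)  # biased: mixes tau with the effect of X
# backdoor adjustment: per-stratum contrasts, weighted by stratum size
adjusted = sum(
    diff_means([r for r in rows if r[0] == x])
    * sum(1 for r in rows if r[0] == x) / n
    for x in (True, False)
)
```

With confounding through X only, the stratified estimate recovers `tau` while the naive contrast absorbs the X effect; that is exactly the regime where DML/IRM should deliver valid inference.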
Copula-correlated X, nonlinearities, heterogeneity, multiple outcome families.
use_copula • g_y/g_d • tau(X) • binary/poisson
Assumptions violated on purpose to show where identification breaks.
u_strength_d ≠ 0 AND u_strength_y ≠ 0 → guarded
Oracle Columns
m, g0, g1, cate
Outcome Families
continuous / binary / poisson
Positivity Control
propensity_sharpness
Ground truth ATE is 0.9509353818962034
| | y | d | tenure_months | avg_sessions_week | spend_last_month | premium_user | urban_resident |
|---|---|---|---|---|---|---|---|
| 0 | -1.983895 | 1.0 | 28.814654 | 1.0 | 84.100761 | 1.0 | 0.0 |
| 1 | 7.527126 | 0.0 | 7.444181 | 0.0 | 30.890847 | 0.0 | 1.0 |
| 2 | 6.696842 | 1.0 | 23.759279 | 2.0 | 93.693180 | 0.0 | 0.0 |
| 3 | 10.337161 | 0.0 | 24.969929 | 9.0 | 127.974978 | 0.0 | 1.0 |
| 4 | 6.071955 | 0.0 | 29.943261 | 2.0 | 96.998539 | 0.0 | 1.0 |
| field | value |
|---|---|
| estimand | ATE |
| model | IRM |
| value | 0.9788 (ci_abs: 0.8074, 1.1503) |
| value_relative | 40.3375 (ci_rel: 33.2610, 47.4139) |
| alpha | 0.0500 |
| p_value | 0.0000 |
| is_significant | True |
| n_treated | 3454 |
| n_control | 6546 |
| treatment_mean | 3.6747 |
| control_mean | 2.3230 |
| time | 2026-02-08 |