Uncofoundedness
We call 'Uncofoundedness' a scenario where a treatment is not randomly assigned to participants, so confounders effect on treatment assignment and outcome.
Treatment - purchase in one category.
We will test hypothesis:
- There is no difference in LTV between treatment and control groups.
- There is a difference in LTV between treatment and control groups.
| y | d | tenure_months | avg_sessions_week | spend_last_month | age_years | income_monthly | prior_purchases_12m | support_tickets_90d | premium_user | mobile_user | urban_resident | referred_user | m | m_obs | tau_link | g0 | g1 | cate | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.000000 | 0.0 | 28.814654 | 1.0 | 77.936767 | 50.234101 | 1926.698301 | 1.0 | 2.0 | 1.0 | 1.0 | 1.0 | 0.0 | 0.047970 | 0.047970 | 1.330764 | 8.137981 | 35.177086 | 27.039105 |
| 1 | 559.364158 | 1.0 | 25.913345 | 3.0 | 53.777740 | 28.115859 | 5104.271509 | 3.0 | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.049695 | 0.049695 | 2.190209 | 60.459257 | 584.580685 | 524.121427 |
| 2 | 26.143003 | 1.0 | 24.969929 | 10.0 | 134.764322 | 22.907062 | 5267.938255 | 8.0 | 3.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.077087 | 0.077087 | 1.570177 | 7.712855 | 38.297992 | 30.585137 |
| 3 | 19.283585 | 1.0 | 40.655089 | 5.0 | 59.517074 | 31.970490 | 6597.327018 | 3.0 | 2.0 | 1.0 | 1.0 | 1.0 | 0.0 | 0.069481 | 0.069481 | 1.933844 | 25.386510 | 189.737828 | 164.351318 |
| 4 | 0.000000 | 1.0 | 18.560899 | 3.0 | 74.370930 | 39.237248 | 4930.009628 | 5.0 | 1.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.047097 | 0.047097 | 1.818265 | 15.359250 | 102.433597 | 87.074347 |
Ground truth ATE is 617.0712367740982 Ground truth ATTE is 837.4043605736649
CausalData(df=(100000, 13), treatment='d', outcome='y', confounders=['tenure_months', 'avg_sessions_week', 'spend_last_month', 'age_years', 'income_monthly', 'prior_purchases_12m', 'support_tickets_90d', 'premium_user', 'mobile_user', 'urban_resident', 'referred_user'])
| treatment | count | mean | std | min | p10 | p25 | median | p75 | p90 | max | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.0 | 95015 | 73.966492 | 238.503707 | 0.0 | 0.0 | 0.0 | 7.448482 | 62.190137 | 184.701873 | 21396.007575 |
| 1 | 1.0 | 4985 | 907.471726 | 2545.077996 | 0.0 | 0.0 | 0.0 | 143.706288 | 730.651759 | 2269.458998 | 48466.747037 |


(<Figure size 1540x880 with 1 Axes>, <Figure size 1540x880 with 1 Axes>)
| confounders | mean_d_0 | mean_d_1 | abs_diff | smd | ks_pvalue | |
|---|---|---|---|---|---|---|
| 0 | spend_last_month | 86.467107 | 117.559365 | 31.092259 | 0.329735 | 0.00000 |
| 1 | avg_sessions_week | 4.945293 | 6.188967 | 1.243674 | 0.287878 | 0.00000 |
| 2 | premium_user | 0.738199 | 0.852357 | 0.114158 | 0.285782 | 0.00000 |
| 3 | prior_purchases_12m | 3.863548 | 4.291675 | 0.428127 | 0.200809 | 0.00000 |
| 4 | income_monthly | 4496.964003 | 4921.775317 | 424.811314 | 0.169596 | 0.00000 |
| 5 | age_years | 36.446101 | 34.628006 | 1.818095 | -0.162591 | 0.00000 |
| 6 | referred_user | 0.270599 | 0.323771 | 0.053172 | 0.116536 | 0.00000 |
| 7 | mobile_user | 0.872546 | 0.908726 | 0.036180 | 0.116115 | 0.00001 |
| 8 | urban_resident | 0.597169 | 0.638114 | 0.040945 | 0.084327 | 0.00000 |
| 9 | support_tickets_90d | 0.994138 | 1.074423 | 0.080286 | 0.077965 | 0.00002 |
| 10 | tenure_months | 28.523084 | 29.718502 | 1.195418 | 0.064588 | 0.00000 |
Inference
Math Explanation of the IRM Model and ATTE Estimand
The Interactive Regression Model (IRM) is a flexible framework used in Double Machine Learning (DML) to estimate treatment effects. Unlike linear models, it allows the treatment effect to vary with confounders (interaction) and makes no parametric assumptions about the functional forms of the outcomes.
We write for an observation, where is treatment and is the observed outcome.
1. Nuisance Functions
The IRM framework relies on three "nuisance" components estimated from the data:
- Outcome Regression (Control):
- Outcome Regression (Treated):
- Propensity Score:
Let denote the overall treatment rate (estimated by the sample mean of ).
In the provided implementation (irm.py), these are estimated using cross-fitting (splitting data into folds) to avoid overfitting bias.
2. ATTE (Average Treatment Effect on the Treated)
The Average Treatment Effect on the Treated (ATTE) measures the impact of the treatment specifically on those individuals who received it:
Under unconfoundedness, , and overlap , this is identified from observed data.
3. The Orthogonal Score
DML uses a Neyman-orthogonal score to ensure the estimator is robust to small errors in the nuisance function estimates. The score for ATTE is defined as:
To match the implementation in irm.py, define:
- Residuals: ,
- IPW terms: ,
- Weights (ATTE): and (the normalized form with )
Then: \begin{aligned} \psi_a(W;\eta) &= -w = -\frac{D}{p} \ \psi_b(W;\eta) &= w,(g_1(X)-g_0(X)) + \bar{w},(u_1 h_1 - u_0 h_0) \end{aligned}
(If normalize_ipw=True, the code rescales and to have mean 1.)
4. Final Estimation (Step-by-step simplification)
For brevity, write , , and . Plug in :
\begin{aligned} \psi_b &= \frac{D}{p}(g_1-g_0)
- \frac{m}{p}\left[\frac{D}{m}(Y-g_1) - \frac{1-D}{1-m}(Y-g_0)\right] \ &= \frac{D}{p}(g_1-g_0) + \frac{D}{p}(Y-g_1) - \frac{m}{p}\frac{1-D}{1-m}(Y-g_0) \ &= \frac{D}{p}(Y-g_0) - \frac{m}{p}\frac{1-D}{1-m}(Y-g_0). \end{aligned}
So the terms cancel, and the ATTE score depends only on and .
The estimator solves : \begin{aligned} \hat{\theta}_{ATTE} &= \frac{\mathbb{E}[\psi_b]}{\mathbb{E}[-\psi_a]} = \frac{\mathbb{E}[\psi_b]}{\mathbb{E}[D/p]} = \mathbb{E}[\psi_b]. \end{aligned}
Equivalently,
| estimand | coefficient | p_val | lower_ci | upper_ci | relative_diff_% | is_significant | |
|---|---|---|---|---|---|---|---|
| 0 | ATTE | 816.966741 | 0.0 | 749.154822 | 884.77866 | 893.17824 | True |
CausalEstimate(estimand='ATTE', model='IRM', model_options={'n_folds': 5, 'n_rep': 1, 'normalize_ipw': False, 'trimming_rule': 'truncate', 'trimming_threshold': 0.01, 'random_state': None, 'std_error': 34.59855341330873, 't_stat': 23.61274273910593}, value=816.9667408936743, ci_upper_absolute=884.7786595009447, ci_lower_absolute=749.1548222864039, value_relative=893.1782401420442, ci_upper_relative=967.3160563964027, ci_lower_relative=819.0404238876857, alpha=0.05, p_value=0.0, is_significant=True, n_treated=4985, n_control=95015, outcome='y', treatment='d', confounders=['tenure_months', 'avg_sessions_week', 'spend_last_month', 'age_years', 'income_monthly', 'prior_purchases_12m', 'support_tickets_90d', 'premium_user', 'mobile_user', 'urban_resident', 'referred_user'], time=datetime.datetime(2026, 1, 27, 8, 26, 46, 441822), diagnostic_data=UnconfoundednessDiagnosticData(m_hat=array([0.05620082, 0.06597108, 0.12947216, ..., 0.03993944, 0.06856774, 0.0686461 ], shape=(100000,)), d=array([0, 1, 1, ..., 0, 0, 0], shape=(100000,)), y=array([ 0. , 559.3641575 , 26.14300299, ..., 86.88646582, 169.67753671, 0. ], shape=(100000,)), x=array([[ 28.81465403, 1. , 77.9367668 , ..., 1. ,
- , 0. ], [ 25.91334462, 3. , 53.7777399 , ..., 1. ,
- , 1. ], [ 24.9699287 , 10. , 134.76432201, ..., 1. ,
- , 0. ], ..., [ 18.95058854, 2. , 49.18443354, ..., 1. ,
- , 0. ], [ 22.87615781, 6. , 46.8461344 , ..., 1. ,
- , 1. ], [ 38.81380133, 4. , 149.87138917, ..., 1. ,
- , 1. ]], shape=(100000, 11)), g0_hat=array([ -1.88039973, 70.12097778, -2.32580324, ..., 85.09028141, 219.30257652, 284.15408662], shape=(100000,)), g1_hat=array([ 3.52125899, 626.31094544, 255.36485828, ..., 556.25218828, 1510.46557774, 2935.52765031], shape=(100000,)), psi_b=array([-2.24619835e+00, 9.81430651e+03, 5.71089393e+02, ..., -1.49895669e+00, 7.32831695e+01, 4.20136010e+02], shape=(100000,)), folds=array([0, 1, 4, ..., 3, 2, 3], shape=(100000,)), trimming_threshold=0.01, sigma2=230611.39428973786, nu2=20.51450972522466, psi_sigma2=array([-230607.8583866 , -226129.52187451, -178068.73534978, ..., -230608.1680113 , -228148.74971339, -149867.84934463], shape=(100000,)), psi_nu2=array([ 25.98365281, -366.08026856, -303.22555781, ..., 12.27041473, 36.55198755, 36.61933817], shape=(100000,)), riesz_rep=array([-1.19453237, 20.06018054, 20.06018054, ..., -0.83452271, -1.47673775, -1.47854995], shape=(100000,)), m_alpha=array([23.96253506, 28.42254226, 59.84989764, ..., 16.74067631, 29.62362582, 29.65997892], shape=(100000,)), psi=array([-2.24619835e+00, -6.57419380e+03, -1.58174109e+04, ..., -1.49895669e+00, 7.32831695e+01, 4.20136010e+02], shape=(100000,)), score='ATTE'), sensitivity_analysis={})
| metric | value | flag | |
|---|---|---|---|
| 0 | edge_0.01_below | 0.000000 | GREEN |
| 1 | edge_0.01_above | 0.000000 | GREEN |
| 2 | edge_0.02_below | 0.183620 | RED |
| 3 | edge_0.02_above | 0.000000 | RED |
| 4 | KS | 0.193386 | GREEN |
| 5 | AUC | 0.623803 | GREEN |
| 6 | ESS_treated_ratio | 0.639887 | GREEN |
| 7 | ESS_control_ratio | 0.998503 | GREEN |
| 8 | tails_w1_q99/med | 5.610889 | GREEN |
| 9 | tails_w0_q99/med | 1.307781 | GREEN |
| 10 | ATT_identity_relerr | 0.012851 | GREEN |
| 11 | clip_m_total | 0.049540 | YELLOW |
| 12 | calib_ECE | 0.006803 | GREEN |
| 13 | calib_slope | 0.626749 | YELLOW |
| 14 | calib_intercept | -1.036712 | RED |

| metric | value | flag | |
|---|---|---|---|
| 0 | se_plugin | 34.598553 | NA |
| 1 | psi_p99_over_med | 828.340166 | RED |
| 2 | psi_kurtosis | 1769.053579 | RED |
| 3 | max_|t|_g1 | 0.000000 | GREEN |
| 4 | max_|t|_g0 | 0.872006 | GREEN |
| 5 | max_|t|_m | 0.417469 | GREEN |
1.) Are your clients independent (i)? 2.) Do you measure confounders, treatment, and outcome in the same intervals? 3.) Do you measure confounders before treatment and outcome after? 4.) Do you have a consistent label of treatment, such as if a person does not receive a treatment, he has a label 0?
| metric | value | flag | |
|---|---|---|---|
| 0 | balance_max_smd | 0.018554 | GREEN |
| 1 | balance_frac_violations | 0.000000 | GREEN |
{'theta': 816.9667408936743, 'se': 34.59855341330873, 'alpha': 0.05, 'z': 1.959963984540054, 'H0': 0.0, 'sampling_ci': (749.1548222864039, 884.7786595009447), 'theta_bounds_cofounding': (794.9964525215009, 838.9370292658477), 'bias_aware_ci': (728.8783636080927, 908.4863956227268), 'max_bias': 21.97028837217341, 'sigma2': 230611.39428973786, 'nu2': 20.51450972522466, 'rv': 0.2730480733887724, 'rva': 0.2561902047512034, 'params': {'r2_y': 0.01, 'r2_d': 0.01, 'rho': 1.0, 'use_signed_rr': False}}
| r2_y | r2_d | rho | theta_long | theta_short | delta | |
|---|---|---|---|---|---|---|
| d | 0.000261 | 0.000496 | -1.0 | 816.966741 | 816.176488 | 0.790253 |