Benchmark MultiTreatmentIRM vs DoubleML Multitreatment (APOS)
This notebook benchmarks MultiTreatmentIRM from Causalis against the DoubleML multitreatment implementation (DoubleMLAPOS) on the same 3-arm DGP, and compares both sets of estimates with the oracle effects.
DGP
Use generate_multitreatment_irm_26() with oracle effects so we can benchmark each active arm against control.
| | y | d_0 | d_1 | d_2 | tenure_months | avg_sessions_week | spend_last_month | premium_user | urban_resident | support_tickets_q | ... | m_obs_d_1 | tau_link_d_1 | m_d_2 | m_obs_d_2 | tau_link_d_2 | g_d_0 | g_d_1 | g_d_2 | cate_d_1 | cate_d_2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.541272 | 1.0 | 0.0 | 0.0 | 27.656605 | 3.198667 | 89.609464 | 0.0 | 1.0 | 0.0 | ... | 0.246062 | -0.352005 | 0.220606 | 0.220606 | 0.494166 | 3.279384 | 2.306314 | 5.375338 | -0.973070 | 2.095954 |
| 1 | 6.802333 | 1.0 | 0.0 | 0.0 | 23.798386 | 3.362415 | 102.337236 | 0.0 | 0.0 | 3.0 | ... | 0.178897 | -0.307360 | 0.236716 | 0.236716 | 0.420278 | 2.807850 | 2.064853 | 4.274630 | -0.742997 | 1.466780 |
| 2 | 8.079449 | 1.0 | 0.0 | 0.0 | 28.425009 | 3.391819 | 102.660712 | 0.0 | 1.0 | 1.0 | ... | 0.210001 | -0.320189 | 0.218040 | 0.218040 | 0.502415 | 3.069919 | 2.228798 | 5.073677 | -0.841121 | 2.003758 |
| 3 | 2.136820 | 1.0 | 0.0 | 0.0 | 18.860066 | 4.071175 | 83.593417 | 0.0 | 0.0 | 2.0 | ... | 0.176239 | -0.316241 | 0.237394 | 0.237394 | 0.441677 | 2.716805 | 1.980234 | 4.225485 | -0.736571 | 1.508680 |
| 4 | 1.555391 | 0.0 | 1.0 | 0.0 | 17.853087 | 3.140075 | 79.209870 | 0.0 | 1.0 | 1.0 | ... | 0.231904 | -0.350130 | 0.246832 | 0.246832 | 0.493624 | 3.224354 | 2.271869 | 5.282273 | -0.952485 | 2.057919 |
5 rows × 26 columns
| | mean |
|---|---|
| d_0 | 0.501040 |
| d_1 | 0.245720 |
| d_2 | 0.253240 |
| y | 4.308808 |
{'d_1 vs d_0': -1.199206862416331, 'd_2 vs d_0': 2.5379024492441777}
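In the five rows previewed above, `cate_d_k` equals `g_d_k - g_d_0` to the printed precision, which suggests the oracle ATE dictionary is the sample average of these per-unit contrasts. That reading is an assumption rather than something the notebook states. The sketch below checks the identity on the displayed rows only; `head_rows` and `oracle_ate` are hypothetical names, and averages over five rows will not reproduce the full-sample values.

```python
# Oracle columns of the five rows shown in the preview table.
head_rows = [
    {"g_d_0": 3.279384, "g_d_1": 2.306314, "g_d_2": 5.375338, "cate_d_1": -0.973070, "cate_d_2": 2.095954},
    {"g_d_0": 2.807850, "g_d_1": 2.064853, "g_d_2": 4.274630, "cate_d_1": -0.742997, "cate_d_2": 1.466780},
    {"g_d_0": 3.069919, "g_d_1": 2.228798, "g_d_2": 5.073677, "cate_d_1": -0.841121, "cate_d_2": 2.003758},
    {"g_d_0": 2.716805, "g_d_1": 1.980234, "g_d_2": 4.225485, "cate_d_1": -0.736571, "cate_d_2": 1.508680},
    {"g_d_0": 3.224354, "g_d_1": 2.271869, "g_d_2": 5.282273, "cate_d_1": -0.952485, "cate_d_2": 2.057919},
]


def oracle_ate(rows, arm):
    """Average the per-unit oracle contrast g_<arm> - g_d_0 over the given rows."""
    return sum(r[f"g_{arm}"] - r["g_d_0"] for r in rows) / len(rows)


for arm in ("d_1", "d_2"):
    for r in head_rows:
        # The stored CATE matches the difference of oracle outcome means.
        assert abs((r[f"g_{arm}"] - r["g_d_0"]) - r[f"cate_{arm}"]) < 1e-5
    print(arm, "vs d_0 (first five rows only):", round(oracle_ate(head_rows, arm), 6))
```

On the full 25,000-row frame the same average would be taken over the `cate_d_1` and `cate_d_2` columns directly.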
MultiCausalData(df=(25000, 12), treatment_names=['d_0', 'd_1', 'd_2'], control_treatment='d_0', outcome='y', confounders=['tenure_months', 'avg_sessions_week', 'spend_last_month', 'premium_user', 'urban_resident', 'support_tickets_q', 'discount_eligible', 'credit_utilization'], user_id=None)
Causalis: MultiTreatmentIRM
| field | d_1 vs d_0 | d_2 vs d_0 |
|---|---|---|
| estimand | ATE | ATE |
| model | MultiTreatmentIRM | MultiTreatmentIRM |
| value | -1.2818 (ci_abs: -1.3781, -1.1855) | 2.3674 (ci_abs: 2.1994, 2.5354) |
| value_relative | -32.0434 (ci_rel: -34.0579, -30.0289) | 59.1812 (ci_rel: 54.4420, 63.9205) |
| alpha | 0.0500 | 0.0500 |
| p_value | 0.0000 | 0.0000 |
| is_significant | True | True |
| n_treated | 6143 | 6331 |
| n_control | 12526 | 12526 |
| treatment_mean | 2.9329 | 6.6318 |
| control_mean | 3.8095 | 3.8095 |
| time | 2026-02-22 | 2026-02-22 |
DoubleML: multitreatment implementation (APOS)
| | coef | std err | t | P>\|t\| | 2.5 % | 97.5 % |
|---|---|---|---|---|---|---|
| 1 vs 0 | -1.319629 | 0.044631 | -29.567224 | 0.0 | -1.407105 | -1.232153 |
| 2 vs 0 | 2.471794 | 0.079695 | 31.015554 | 0.0 | 2.315594 | 2.627994 |
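The rows labelled `1 vs 0` and `2 vs 0` are contrasts of average potential outcomes (APOs): the estimated mean outcome under an active arm minus the estimated mean under control. As a rough illustration of the idea, and not DoubleML's own code, the sketch below computes such contrasts from a doubly robust (AIPW-type) score on simulated data. The toy DGP, the `apo_score` helper, and the use of oracle nuisance functions in place of cross-fitted learners are all assumptions made for brevity.

```python
import numpy as np


def apo_score(y, d, level, g_hat, m_hat):
    """AIPW score for E[Y(level)]: outcome prediction plus an
    inverse-propensity-weighted residual for units observed at `level`."""
    treated = (d == level).astype(float)
    return g_hat + treated * (y - g_hat) / m_hat


rng = np.random.default_rng(0)
n = 20_000
probs = np.array([0.5, 0.25, 0.25])   # toy assignment probabilities (assumption)
effects = np.array([0.0, -1.0, 2.0])  # toy arm effects vs control (assumption)

x = rng.normal(size=n)
d = rng.choice(3, size=n, p=probs)
y = 1.0 + x + effects[d] + rng.normal(size=n)

# Oracle nuisances for brevity; a real run would plug in cross-fitted ML predictions.
psi = {
    k: apo_score(y, d, k, g_hat=1.0 + x + effects[k], m_hat=probs[k])
    for k in range(3)
}

contrasts = {}
for k in (1, 2):
    diff = psi[k] - psi[0]              # per-unit score difference vs control
    est = diff.mean()                   # contrast estimate
    se = diff.std(ddof=1) / np.sqrt(n)  # plug-in standard error
    contrasts[f"{k} vs 0"] = (est, est - 1.96 * se, est + 1.96 * se)
    print(f"{k} vs 0: {est:.3f} [{est - 1.96 * se:.3f}, {est + 1.96 * se:.3f}]")
```

Taking the difference of the per-unit scores before averaging keeps the correlation between the two APO estimates in the standard error, which is why the contrast is not computed from two separately summarized means.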
| | contrast | oracle_ate | causalis_multitreatment_irm | doubleml_apos_contrast | abs_diff_causalis_vs_dml |
|---|---|---|---|---|---|
| 0 | d_1 vs d_0 | -1.199207 | -1.281841 | -1.319629 | 0.037788 |
| 1 | d_2 vs d_0 | 2.537902 | 2.367443 | 2.471794 | 0.104351 |
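The table compares the two estimators with each other. A natural follow-up is to compare each with the oracle. The sketch below does that using only the point estimates and 95% intervals reported above (Causalis values as printed, to four decimals); the dictionary layout and variable names are illustrative.

```python
# Oracle ATEs and (estimate, ci_lower, ci_upper) triples copied from the outputs above.
oracle = {"d_1 vs d_0": -1.199207, "d_2 vs d_0": 2.537902}
reported = {
    "causalis_multitreatment_irm": {
        "d_1 vs d_0": (-1.2818, -1.3781, -1.1855),
        "d_2 vs d_0": (2.3674, 2.1994, 2.5354),
    },
    "doubleml_apos_contrast": {
        "d_1 vs d_0": (-1.319629, -1.407105, -1.232153),
        "d_2 vs d_0": (2.471794, 2.315594, 2.627994),
    },
}

summary = []
for model, by_contrast in reported.items():
    for contrast, (est, lower, upper) in by_contrast.items():
        truth = oracle[contrast]
        summary.append({
            "model": model,
            "contrast": contrast,
            "abs_error": abs(est - truth),           # distance from the oracle ATE
            "ci_covers_oracle": lower <= truth <= upper,
        })

for row in summary:
    print(row)
```

This is a single draw from the DGP, so the absolute errors and coverage flags describe this run only; a coverage rate would need repeated simulations.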