DML IRM vs CausalML NearestNeighborMatch
This notebook presents the matching research workflow and key analysis steps.
Causal inference has two main parts: identification assumptions and model specification. SUTVA, unconfoundedness, and overlap are strong assumptions that must hold before we can call our inference causal. Observational studies and quasi-experiments often struggle with the identification assumptions, so in practice most of the effort goes into defending them, not into model specification.
In this notebook, however, I focus on model specification. Propensity score matching is a classical non-parametric ML approach for estimating the ATTE. It should perform worse than the DML approach, because DML:

- uses both the propensity model and the outcome model, not just propensity scores
- is more robust to small model misspecification through orthogonalization
- uses cross-fitting, which reduces overfitting bias from ML nuisance models
- does not throw away as much data as matching often does
- provides more principled statistical inference (standard errors, confidence intervals)
- can estimate the ATE or ATTE cleanly, not just the treated-group effect by default
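To make the orthogonalization and cross-fitting points concrete, here is a minimal sketch of a cross-fitted AIPW (doubly robust) estimator in the spirit of IRM. This is an illustration on synthetic data, not the Causalis implementation; the DGP, learners, and fold count below are my own assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
n = 4000
X = rng.normal(size=(n, 3))
D = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))  # confounded treatment
Y = 2.0 * D + X[:, 0] + rng.normal(size=n)       # true effect = 2.0

# Cross-fitting: each nuisance prediction comes from a model
# that never saw that observation during training
m_hat = np.zeros(n)   # E[D | X]
g0_hat = np.zeros(n)  # E[Y | X, D=0]
g1_hat = np.zeros(n)  # E[Y | X, D=1]
for train, test in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    ps = RandomForestClassifier(n_estimators=100, random_state=0).fit(X[train], D[train])
    m_hat[test] = ps.predict_proba(X[test])[:, 1]
    for d, g in ((0, g0_hat), (1, g1_hat)):
        idx = train[D[train] == d]
        g[test] = RandomForestRegressor(n_estimators=100, random_state=0) \
            .fit(X[idx], Y[idx]).predict(X[test])

m_hat = np.clip(m_hat, 0.01, 0.99)  # trim extreme propensities for stable weights
# Orthogonal (AIPW) score: outcome-model part plus propensity-weighted residuals
psi = g1_hat - g0_hat + D * (Y - g1_hat) / m_hat - (1 - D) * (Y - g0_hat) / (1 - m_hat)
ate = psi.mean()
se = psi.std(ddof=1) / np.sqrt(n)
print(f"ATE estimate: {ate:.2f} +/- {1.96 * se:.2f}")
```

Because the score combines both nuisance models, a small error in either the propensity or the outcome model only enters the estimate at second order, which is the robustness the list above refers to.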
We will compare absolute estimation errors on DGPs from Causalis between the IRM DML model implemented in Causalis and the NearestNeighborMatch estimator implemented in CausalML.
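For contrast with the DML sketch, here is a toy version of what propensity score matching does: fit a propensity model, match each treated unit to its nearest control on the score, and average the outcome differences. This is a simplified 1-nearest-neighbor sketch on synthetic data I made up, not CausalML's NearestNeighborMatch.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 5000
X = rng.normal(size=(n, 3))
D = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))  # confounded treatment
Y = 2.0 * D + X[:, 0] + rng.normal(size=n)       # true ATT = 2.0

# 1) propensity scores, 2) nearest control per treated unit (with replacement),
# 3) ATT as the mean treated-minus-matched-control difference
ps = LogisticRegression().fit(X, D).predict_proba(X)[:, 1]
treated, control = np.flatnonzero(D == 1), np.flatnonzero(D == 0)
nearest = control[np.abs(ps[treated][:, None] - ps[control][None, :]).argmin(axis=1)]
att = (Y[treated] - Y[nearest]).mean()
print(f"matched ATT estimate: {att:.2f}")
```

Note that only the propensity model is used and untreated units outside the matched set contribute nothing, which is exactly where the DML advantages listed above come from.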
generate_obs_hte_26_rich()
Read more about the DGP at https://causalis.causalcraft.com/articles/generate_obs_hte_26_rich
Running n=10,000 ... Running n=100,000 ... Running n=1,000,000 ...
| n | ground_truth_atte | irm_atte | matching_atte | irm_abs_error | matching_abs_error | irm_runtime_sec | matching_runtime_sec | matched_n |
|---|---|---|---|---|---|---|---|---|
| 10000 | 11.454404 | 6.256087 | 35.978189 | 5.198317 | 24.523785 | 3.325442 | 1.350460 | 618 |
| 100000 | 10.914991 | 12.106856 | 28.721069 | 1.191864 | 17.806077 | 16.588017 | 3.740718 | 9238 |
| 1000000 | 11.028129 | 10.340542 | 16.589285 | 0.687587 | 5.561156 | 220.501116 | 35.604447 | 99218 |
n=10,000: ground truth ATTE=11.454404, IRM ATTE=6.256087, matching ATTE=35.978189
n=100,000: ground truth ATTE=10.914991, IRM ATTE=12.106856, matching ATTE=28.721069
n=1,000,000: ground truth ATTE=11.028129, IRM ATTE=10.340542, matching ATTE=16.589285
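As a quick sanity check on the table, the error columns are simply the absolute difference between each estimate and the ground truth (values copied from the table above):

```python
# (n, ground_truth_atte, irm_atte, matching_atte) copied from the table above
rows = [(10_000, 11.454404, 6.256087, 35.978189),
        (100_000, 10.914991, 12.106856, 28.721069),
        (1_000_000, 11.028129, 10.340542, 16.589285)]
for n, gt, irm, matching in rows:
    irm_err, matching_err = abs(irm - gt), abs(matching - gt)
    print(f"n={n:>9,}: irm_abs_error={irm_err:.6f}, matching_abs_error={matching_err:.6f}")
```

The recomputed values agree with the irm_abs_error and matching_abs_error columns up to rounding of the reported estimates.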
DML IRM outperforms NearestNeighborMatch
generate_obs_hte_binary_26()
Read more about the DGP at https://causalis.causalcraft.com/articles/generate_obs_hte_binary_26
Running n=10,000 ... Running n=100,000 ... Running n=1,000,000 ...
| n | ground_truth_atte | irm_atte | matching_atte | irm_abs_error | matching_abs_error | irm_runtime_sec | matching_runtime_sec | matched_n |
|---|---|---|---|---|---|---|---|---|
| 10000 | 0.103885 | 0.102344 | 0.115652 | 0.001541 | 0.011767 | 4.476367 | 1.970728 | 2594 |
| 100000 | 0.101238 | 0.103547 | 0.074051 | 0.002309 | 0.027187 | 27.549454 | 4.830420 | 29304 |
| 1000000 | 0.101411 | 0.103282 | 0.094218 | 0.001871 | 0.007192 | 221.720278 | 35.615880 | 298710 |
n=10,000: ground truth ATTE=0.103885, IRM ATTE=0.102344, matching ATTE=0.115652
n=100,000: ground truth ATTE=0.101238, IRM ATTE=0.103547, matching ATTE=0.074051
n=1,000,000: ground truth ATTE=0.101411, IRM ATTE=0.103282, matching ATTE=0.094218
DML IRM outperforms NearestNeighborMatch
Conclusion
I recommend using DML IRM as the default model specification in the unconfoundedness scenario.