
`generate_scm_gamma_26()`

This notebook walks through the `generate_scm_gamma_26()` research workflow and its key analysis steps.


This scenario produces synthetic panel data with one treated unit and multiple donors. It uses a low-level Gamma DGP to simulate realistic outcomes with time-varying exposure and latent rates.

1. DGP math

The Data Generating Process (DGP) for the Gamma SCM scenario follows a hierarchical log-linear model for the mean $\mu$, with observations $y$ sampled from a Gamma distribution.

1.1 Donor units

For each donor unit $j$ at time $t$:

  1. Exposure $E_{tj}$:

     $$\log(E_{tj}) = \alpha_j + \gamma_j (t - \bar{t}) + \epsilon_t^{\text{common\_exp}} + \epsilon_{tj}^{\text{donor\_exp}}$$

     where $\epsilon_t^{\text{common\_exp}}$ and $\epsilon_{tj}^{\text{donor\_exp}}$ are AR(1) processes.
  2. Mean $\mu_{tj}$:

     $$\mu_{tj} = E_{tj} \cdot \exp(\eta_{tj})$$

     $$\eta_{tj} = \beta_j + \delta_j (t - \bar{t}) + \lambda_j S_t + L_t + \sum_{k} \phi_{jk} F_{tk} + \epsilon_{tj}^{\text{donor\_noise}}$$

     where:
    • $S_t$: monthly seasonality signal
    • $L_t$: common factor (macro log index, AR(1))
    • $F_{tk}$: latent factors (AR(1))
    • $\epsilon_{tj}^{\text{donor\_noise}}$: donor-specific AR(1) noise
  3. Outcome $y_{tj}$:

     $$y_{tj} \sim \text{Gamma}\left(k, \frac{\mu_{tj}}{k}\right)$$

     where $k$ is the gamma_shape parameter.
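The three donor steps above can be sketched in NumPy. This is a minimal illustration and not the notebook's generator: the helper names `ar1` and `simulate_donor`, every coefficient value, the AR(1) parameters, and the sinusoidal seasonality are assumptions made for the example. It also assumes the second Gamma argument is a scale, which is the reading under which the mean equals $\mu_{tj}$.

```python
import numpy as np


def ar1(n, rho, sigma, rng):
    """Zero-mean AR(1) path: x[t] = rho * x[t-1] + Normal(0, sigma) shock."""
    x = np.zeros(n)
    x[0] = rng.normal(0.0, sigma)
    for t in range(1, n):
        x[t] = rho * x[t - 1] + rng.normal(0.0, sigma)
    return x


def simulate_donor(n_periods, gamma_shape, rng):
    """Return (mu, y) for one donor. All coefficient values are hypothetical."""
    t = np.arange(n_periods)
    tc = t - t.mean()  # centered time, t - t_bar

    # Step 1: log exposure = alpha_j + gamma_j * (t - t_bar) + common AR(1) + donor AR(1).
    # In a full panel the common term would be drawn once and shared by all donors.
    log_exposure = 0.0 + 0.001 * tc + ar1(n_periods, 0.6, 0.02, rng) + ar1(n_periods, 0.6, 0.02, rng)

    # Step 2: linear predictor eta_tj and mean mu_tj = E_tj * exp(eta_tj).
    seasonality = np.sin(2 * np.pi * t / 12)  # stand-in for S_t
    macro = ar1(n_periods, 0.8, 0.03, rng)  # L_t
    factors = np.column_stack([ar1(n_periods, 0.7, 0.05, rng) for _ in range(2)])  # F_tk
    phi = np.array([0.3, -0.2])  # loadings phi_jk
    noise = ar1(n_periods, 0.5, 0.02, rng)  # donor-specific AR(1) noise
    eta = np.log(10.0) + 0.002 * tc + 0.05 * seasonality + macro + factors @ phi + noise
    mu = np.exp(log_exposure) * np.exp(eta)

    # Step 3: Gamma draw with shape k and scale mu / k, so that E[y] = mu.
    y = rng.gamma(shape=gamma_shape, scale=mu / gamma_shape)
    return mu, y


rng = np.random.default_rng(26)
mu, y = simulate_donor(n_periods=43, gamma_shape=50.0, rng=rng)
```

With a shape/scale parameterization the coefficient of variation of $y_{tj}$ is $1/\sqrt{k}$, so a larger gamma_shape gives outcomes that track $\mu_{tj}$ more closely.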

1.2 Treated unit

The counterfactual mean $\mu_{t, cf}$ is a weighted combination of donors, potentially with a pre-fit mismatch:

$$\mu_{t, cf} = \left( \sum_j w_j \mu_{tj} \right) \cdot \exp(\epsilon_t^{\text{mismatch}})$$

where $w \sim \text{Dirichlet}(\alpha)$.
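As a rough sketch of the counterfactual mean, the snippet below draws Dirichlet weights and applies a multiplicative mismatch. The function name `counterfactual_mean`, the stand-in donor means, the concentration value 0.5, and the i.i.d. normal mismatch are all assumptions for illustration; the source does not state how $\epsilon_t^{\text{mismatch}}$ is generated.

```python
import numpy as np


def counterfactual_mean(mu_donors, weights, eps_mismatch):
    """mu_cf[t] = (sum_j w_j * mu_donors[t, j]) * exp(eps_mismatch[t])."""
    return (mu_donors @ weights) * np.exp(eps_mismatch)


rng = np.random.default_rng(0)
n_periods, n_donors = 43, 40

# Stand-in donor means with shape (time, donor); a real run would use the simulated mu_tj.
mu_donors = rng.uniform(8.0, 30.0, size=(n_periods, n_donors))

# Dirichlet weights are non-negative and sum to one.
weights = rng.dirichlet(np.full(n_donors, 0.5))

# Hypothetical mismatch term; setting it to zero gives an exact convex combination.
eps_mismatch = rng.normal(0.0, 0.02, size=n_periods)

mu_cf = counterfactual_mean(mu_donors, weights, eps_mismatch)
mu_cf_exact = counterfactual_mean(mu_donors, weights, np.zeros(n_periods))
```

With zero mismatch, each `mu_cf_exact[t]` lies between the smallest and largest donor mean at time `t`, which is the property a synthetic control estimator relies on.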

Treatment is active only in post periods. For post step $k = 0, 1, \dots, n_{post} - 1$ (here $k$ indexes the post step, not the Gamma shape):

$$r_k = 1 - e^{-(k+1)/2.5}, \qquad \tau_k^{\text{rate}} = (\beta_0 + \beta_1 k)\, r_k$$

where $\beta_0$ is treatment_effect_rate and $\beta_1$ is treatment_effect_slope.

Equivalently, in calendar time, $\tau_t^{\text{rate}} = 0$ in pre/anchor periods and $\tau_t^{\text{rate}} = \tau_k^{\text{rate}}$ in post period $k$, so that

$$\mu_{t, treated} = \mu_{t, cf} \cdot (1 + \tau_t^{\text{rate}})$$
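The ramped treatment rate can be written out directly from the formulas above. In this sketch the function name `treatment_rate_path` and the example values 0.1 and 0.01 are hypothetical; the ramp constant 2.5 and the parameter names treatment_effect_rate and treatment_effect_slope come from the text.

```python
import numpy as np


def treatment_rate_path(n_pre, n_post, treatment_effect_rate, treatment_effect_slope, ramp_scale=2.5):
    """Return tau_t^rate over calendar time: zeros in pre periods, ramped effect in post periods."""
    k = np.arange(n_post)  # post step index 0, 1, ..., n_post - 1
    ramp = 1.0 - np.exp(-(k + 1) / ramp_scale)  # r_k, rises toward 1
    tau_post = (treatment_effect_rate + treatment_effect_slope * k) * ramp
    return np.concatenate([np.zeros(n_pre), tau_post])


# 37 pre and 6 post periods match the panel summary; the effect sizes are illustrative.
tau_rate = treatment_rate_path(n_pre=37, n_post=6, treatment_effect_rate=0.1, treatment_effect_slope=0.01)

# Treated mean for a flat, made-up counterfactual mean of 20.
mu_cf = np.full(43, 20.0)
mu_treated = mu_cf * (1.0 + tau_rate)
```

Because $r_0 = 1 - e^{-0.4} \approx 0.33$, only about a third of the base rate applies in the first post period, and the ramp approaches 1 as $k$ grows.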

The outcomes are:

$$y_{t, cf} \sim \text{Gamma}\left(k, \frac{\mu_{t, cf}}{k}\right)$$

$$y_{t, treated} = y_{t, cf} \cdot (1 + \tau_t^{\text{rate}})$$

So for each $t$, $y_{t, treated}$ remains Gamma-distributed with mean $\mu_{t, treated}$, and $y_{t, treated} - y_{t, cf}$ is exactly the realized treatment effect.
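The claim that scaling preserves the Gamma family can be checked numerically: multiplying a Gamma draw by a positive constant keeps the shape and rescales the scale. The sketch below uses made-up values for the shape, counterfactual mean, and rate, and again assumes the shape/scale parameterization.

```python
import numpy as np

rng = np.random.default_rng(0)
gamma_shape = 50.0  # illustrative value for k
mu_cf = 20.0  # illustrative counterfactual mean at one period
tau_rate = 0.1  # illustrative treatment rate at that period
n_draws = 200_000

# Counterfactual draws, then the treated outcome as a deterministic rescaling.
y_cf = rng.gamma(shape=gamma_shape, scale=mu_cf / gamma_shape, size=n_draws)
y_treated = y_cf * (1.0 + tau_rate)

# Realized effect per draw; it equals tau_rate * y_cf by construction.
tau_realized = y_treated - y_cf

# Empirical checks: the mean moves to mu_treated, the coefficient of variation stays 1 / sqrt(k).
mu_treated = mu_cf * (1.0 + tau_rate)
cv_cf = y_cf.std() / y_cf.mean()
cv_treated = y_treated.std() / y_treated.mean()
```

Since the realized effect is `tau_rate * y_cf`, it inherits the noise of the counterfactual draw, which is why the realized and mean versions of the effect differ period by period.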

2. Oracle Treatment Effects (ATT)

The Average Treatment Effect on the Treated (ATT) is the average impact of the intervention across all post-treatment periods. In this synthetic scenario, we can calculate it in two ways:

  1. Realized ATT: Based on observed vs. counterfactual outcomes.

     $$\text{ATT}_{realized} = \frac{1}{T_{post}} \sum_{t \in \text{Post}} (Y_t - Y_t^{(0)})$$

     In the data, this is the mean of tau_realized_true for the treated unit in post-periods.

  2. Mean ATT: Based on the underlying population means (the "signal").

     $$\text{ATT}_{mean} = \frac{1}{T_{post}} \sum_{t \in \text{Post}} (\mu_t^{(1)} - \mu_t^{(0)})$$

     In the data, this is the mean of tau_mean_true for the treated unit in post-periods.
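Both oracle quantities reduce to a filtered column mean. The sketch below assumes the long-format columns shown in this notebook (unit_id, calendar_time, tau_realized_true, tau_mean_true), a treated unit named `treated`, and a treatment start of 2003-02 as reported in the panel summary. The helper name `oracle_att`, the way post periods are selected, and the toy values are assumptions for illustration only.

```python
import pandas as pd


def oracle_att(df, treated_unit="treated", treatment_start="2003-02"):
    """Average the true effect columns over the treated unit's post periods."""
    period = pd.to_datetime(df["calendar_time"]).dt.to_period("M")
    post = (df["unit_id"] == treated_unit) & (period >= pd.Period(treatment_start, freq="M"))
    return {
        "att_realized": df.loc[post, "tau_realized_true"].mean(),
        "att_mean": df.loc[post, "tau_mean_true"].mean(),
    }


# Toy panel with made-up effects: one pre period and three post periods per unit.
toy = pd.DataFrame(
    {
        "unit_id": ["treated"] * 4 + ["donor_1"] * 4,
        "calendar_time": ["2003-01", "2003-02", "2003-03", "2003-04"] * 2,
        "tau_realized_true": [0.0, 1.0, 2.0, 3.0, 0.0, 0.0, 0.0, 0.0],
        "tau_mean_true": [0.0, 1.5, 2.5, 3.5, 0.0, 0.0, 0.0, 0.0],
    }
)
att = oracle_att(toy)
```

On the toy panel the two averages are 2.0 and 2.5; they differ because the realized version includes the sampling noise of the outcome draws.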

Result

Ground-truth ATTE is 1.919269

Result
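| | unit_id | calendar_time | treated_time | y | y_cf | tau_realized_true | mu_cf | mu_treated | tau_mean_true |
|---|---|---|---|---|---|---|---|---|---|
| 0 | donor_1 | 2000-01 | 0 | 9.971611 | 9.971611 | 0.0 | 10.059081 | 10.059081 | 0.0 |
| 1 | donor_1 | 2000-02 | 0 | 10.344379 | 10.344379 | 0.0 | 10.097467 | 10.097467 | 0.0 |
| 2 | donor_1 | 2000-03 | 0 | 10.998498 | 10.998498 | 0.0 | 11.354098 | 11.354098 | 0.0 |
| 3 | donor_1 | 2000-04 | 0 | 11.508717 | 11.508717 | 0.0 | 11.277716 | 11.277716 | 0.0 |
| 4 | donor_1 | 2000-05 | 0 | 11.125281 | 11.125281 | 0.0 | 10.098791 | 10.098791 | 0.0 |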

EDA

Result

PanelDataSCM(df=(1763, 4), y='y', unit_col='unit_id', time_col='calendar_time', treated_time='treated_time', time_freq='M', treated_unit='treated', treatment_start=Period('2003-02', 'M'), last_post_period=Period('2003-07', 'M'), n_pre_periods=37, n_post_periods=6, donor_units=['donor_1', 'donor_10', 'donor_11', 'donor_12', 'donor_13', 'donor_14', 'donor_15', 'donor_16', 'donor_17', 'donor_18', 'donor_19', 'donor_2', 'donor_20', 'donor_21', 'donor_22', 'donor_23', 'donor_24', 'donor_25', 'donor_26', 'donor_27', 'donor_28', 'donor_29', 'donor_3', 'donor_30', 'donor_31', 'donor_32', 'donor_33', 'donor_34', 'donor_35', 'donor_36', 'donor_37', 'donor_38', 'donor_39', 'donor_4', 'donor_40', 'donor_5', 'donor_6', 'donor_7', 'donor_8', 'donor_9'])

Result
| | donor | pre_mean | pre_std | pre_slope | corr_with_treated_pre | rmse_to_treated_pre | rmse_to_treated_pre_standardized | mean_diff_pre | slope_diff_pre | max_abs_gap_pre | is_never_treated | n_missing_pre | corr_rank | std_rmse_rank | slope_rank | composite_similarity_score | rank_by_similarity | notes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | donor_20 | 20.756877 | 4.235256 | 0.065672 | 0.699298 | 3.050574 | 1.177242 | 0.055800 | 0.001045 | 7.515230 | True | 0 | 6 | 2 | 1 | 0.950000 | 1 | high_std_rmse |
| 1 | donor_23 | 18.859982 | 3.459160 | 0.109325 | 0.792879 | 2.802849 | 1.081642 | -1.841096 | 0.044698 | 6.193337 | True | 0 | 1 | 1 | 19 | 0.850000 | 2 | high_std_rmse |
| 2 | donor_17 | 27.374399 | 5.004251 | 0.062159 | 0.694945 | 7.633294 | 2.945751 | 6.673322 | -0.002467 | 15.022205 | True | 0 | 7 | 23 | 3 | 0.750000 | 3 | high_std_rmse |
| 3 | donor_35 | 12.890196 | 2.478484 | 0.054896 | 0.774341 | 7.995072 | 3.085364 | -7.810882 | -0.009730 | 11.244939 | True | 0 | 2 | 25 | 7 | 0.741667 | 4 | high_std_rmse |
| 4 | donor_27 | 12.138052 | 2.561773 | 0.065732 | 0.710107 | 8.784934 | 3.390178 | -8.563025 | 0.001106 | 12.226100 | True | 0 | 4 | 29 | 2 | 0.733333 | 5 | high_std_rmse |
| 5 | donor_39 | 15.933235 | 3.493479 | 0.101601 | 0.756099 | 5.287926 | 2.040654 | -4.767843 | 0.036974 | 9.398086 | True | 0 | 3 | 16 | 16 | 0.733333 | 6 | high_std_rmse |
| 6 | donor_2 | 22.023910 | 3.752940 | 0.025593 | 0.599773 | 3.299044 | 1.273128 | 1.322833 | -0.039033 | 8.913091 | True | 0 | 19 | 4 | 17 | 0.691667 | 7 | high_std_rmse |
| 7 | donor_34 | 23.225266 | 4.258988 | 0.116799 | 0.685302 | 4.012347 | 1.548398 | 2.524189 | 0.052173 | 8.782578 | True | 0 | 8 | 11 | 21 | 0.691667 | 8 | high_std_rmse |
| 8 | donor_26 | 15.531548 | 2.367240 | 0.058569 | 0.584382 | 5.645636 | 2.178697 | -5.169529 | -0.006057 | 10.132082 | True | 0 | 21 | 18 | 5 | 0.658333 | 9 | high_std_rmse |
| 9 | donor_9 | 20.198346 | 3.894696 | 0.185173 | 0.609819 | 3.134843 | 1.209762 | -0.502731 | 0.120546 | 10.776096 | True | 0 | 15 | 3 | 31 | 0.616667 | 10 | high_std_rmse |
| 10 | donor_31 | 15.974505 | 3.809154 | 0.046519 | 0.573221 | 5.678805 | 2.191497 | -4.726572 | -0.018107 | 12.461670 | True | 0 | 22 | 19 | 9 | 0.608333 | 11 | high_std_rmse |
| 11 | donor_3 | 22.984470 | 4.678813 | 0.162331 | 0.639562 | 4.279210 | 1.651382 | 2.283392 | 0.097704 | 12.714594 | True | 0 | 12 | 13 | 26 | 0.600000 | 12 | high_std_rmse |
| 12 | donor_7 | 19.888000 | 3.565032 | 0.048846 | 0.434870 | 3.471405 | 1.339644 | -0.813077 | -0.015780 | 8.749577 | True | 0 | 37 | 6 | 8 | 0.600000 | 13 | high_std_rmse |
| 13 | donor_33 | 22.467401 | 3.890376 | 0.061634 | 0.437267 | 4.019136 | 1.551018 | 1.766323 | -0.002993 | 9.071825 | True | 0 | 36 | 12 | 4 | 0.591667 | 14 | high_std_rmse |
| 14 | donor_6 | 10.817057 | 1.619471 | 0.030223 | 0.705904 | 10.055180 | 3.880377 | -9.884020 | -0.034404 | 15.388984 | True | 0 | 5 | 33 | 14 | 0.591667 | 15 | high_std_rmse |
| 15 | donor_21 | 21.001540 | 4.037106 | -0.018281 | 0.542038 | 3.429640 | 1.323526 | 0.300463 | -0.082908 | 10.637405 | True | 0 | 27 | 5 | 25 | 0.550000 | 16 | high_std_rmse |
| 16 | donor_13 | 28.400664 | 5.292688 | 0.037227 | 0.605290 | 8.798181 | 3.395290 | 7.699587 | -0.027399 | 17.004455 | True | 0 | 18 | 30 | 12 | 0.525000 | 17 | high_std_rmse |
| 17 | donor_15 | 21.647789 | 4.210829 | 0.182164 | 0.554289 | 3.639493 | 1.404510 | 0.946712 | 0.117537 | 13.105415 | True | 0 | 24 | 7 | 29 | 0.525000 | 18 | high_std_rmse |
| 18 | donor_12 | 9.964996 | 1.894391 | 0.090093 | 0.607253 | 10.936409 | 4.220450 | -10.736082 | 0.025466 | 14.965497 | True | 0 | 17 | 34 | 10 | 0.516667 | 19 | high_std_rmse |
| 19 | donor_8 | 8.772232 | 2.098724 | 0.104214 | 0.647465 | 12.098529 | 4.668922 | -11.928845 | 0.039588 | 17.063234 | True | 0 | 10 | 35 | 18 | 0.500000 | 20 | high_std_rmse |
| 20 | donor_30 | 23.684935 | 4.356977 | 0.116176 | 0.497646 | 4.833674 | 1.865355 | 2.983858 | 0.051549 | 10.057305 | True | 0 | 30 | 15 | 20 | 0.483333 | 21 | high_std_rmse |
| 21 | donor_4 | 21.801573 | 4.207243 | 0.179567 | 0.524523 | 3.766945 | 1.453695 | 1.100496 | 0.114941 | 8.288093 | True | 0 | 28 | 9 | 28 | 0.483333 | 22 | high_std_rmse |
| 22 | donor_10 | 22.140468 | 4.508805 | 0.224646 | 0.558779 | 4.007346 | 1.546468 | 1.439391 | 0.160019 | 9.740145 | True | 0 | 23 | 10 | 33 | 0.475000 | 23 | high_std_rmse |
| 23 | donor_5 | 25.412361 | 4.374273 | 0.057271 | 0.323830 | 6.379970 | 2.462083 | 4.711284 | -0.007355 | 14.186181 | True | 0 | 40 | 21 | 6 | 0.466667 | 24 | high_std_rmse |
| 24 | donor_18 | 15.375511 | 2.894278 | 0.142564 | 0.551353 | 5.931533 | 2.289027 | -5.325566 | 0.077938 | 11.432539 | True | 0 | 25 | 20 | 23 | 0.458333 | 25 | high_std_rmse |
| 25 | donor_25 | 25.638145 | 4.719678 | 0.100007 | 0.491059 | 6.430663 | 2.481646 | 4.937068 | 0.035381 | 13.366836 | True | 0 | 33 | 22 | 15 | 0.441667 | 26 | high_std_rmse |
| 26 | donor_36 | 17.556815 | 3.510483 | 0.132070 | 0.490282 | 4.472671 | 1.726041 | -3.144262 | 0.067444 | 10.183903 | True | 0 | 34 | 14 | 22 | 0.441667 | 27 | high_std_rmse |
| 27 | donor_1 | 11.442375 | 2.525462 | 0.146053 | 0.608511 | 9.531624 | 3.678332 | -9.258702 | 0.081427 | 14.575727 | True | 0 | 16 | 31 | 24 | 0.433333 | 28 | high_std_rmse |
| 28 | donor_40 | 11.363776 | 1.958071 | 0.090476 | 0.506775 | 9.622439 | 3.713378 | -9.337301 | 0.025850 | 14.358223 | True | 0 | 29 | 32 | 11 | 0.425000 | 29 | high_std_rmse |
| 29 | donor_32 | 18.875009 | 3.180241 | -0.049527 | 0.404701 | 3.673285 | 1.417551 | -1.826069 | -0.114153 | 9.321772 | True | 0 | 38 | 8 | 27 | 0.416667 | 30 | high_std_rmse |
| 30 | donor_28 | 27.821562 | 5.599506 | 0.336526 | 0.645032 | 8.369695 | 3.229934 | 7.120484 | 0.271899 | 16.331011 | True | 0 | 11 | 26 | 37 | 0.408333 | 31 | high_std_rmse |
| 31 | donor_16 | 26.230599 | 5.760959 | 0.095907 | 0.370933 | 7.707416 | 2.974355 | 5.529522 | 0.031281 | 18.330394 | True | 0 | 39 | 24 | 13 | 0.391667 | 32 | high_std_rmse |
| 32 | donor_11 | 31.898384 | 5.748397 | 0.278022 | 0.627211 | 12.101775 | 4.670174 | 11.197307 | 0.213395 | 24.414887 | True | 0 | 13 | 36 | 35 | 0.325000 | 33 | high_std_rmse |
| 33 | donor_38 | 23.341229 | 5.607884 | 0.327023 | 0.473125 | 5.602050 | 2.161877 | 2.640152 | 0.262396 | 13.541965 | True | 0 | 35 | 17 | 36 | 0.291667 | 34 | high_std_rmse |
| 34 | donor_19 | 64.642074 | 13.285880 | 0.416100 | 0.666003 | 45.477279 | 17.550056 | 43.940997 | 0.351474 | 87.274045 | True | 0 | 9 | 40 | 40 | 0.283333 | 35 | high_std_rmse |
| 35 | donor_29 | 32.069172 | 6.680389 | 0.392642 | 0.623576 | 12.608994 | 4.865914 | 11.368095 | 0.328015 | 20.594296 | True | 0 | 14 | 37 | 38 | 0.283333 | 36 | high_std_rmse |
| 36 | donor_24 | 12.604643 | 3.169244 | 0.199055 | 0.493436 | 8.614321 | 3.324337 | -8.096434 | 0.134428 | 14.907594 | True | 0 | 31 | 28 | 32 | 0.266667 | 37 | high_std_rmse |
| 37 | donor_14 | 27.342267 | 6.286205 | 0.393100 | 0.547435 | 8.514798 | 3.285931 | 6.641190 | 0.328473 | 15.190467 | True | 0 | 26 | 27 | 39 | 0.258333 | 38 | high_std_rmse |
| 38 | donor_37 | 34.405731 | 7.663463 | 0.231876 | 0.590564 | 15.159349 | 5.850117 | 13.704654 | 0.167249 | 33.571787 | True | 0 | 20 | 39 | 34 | 0.250000 | 39 | high_std_rmse |
| 39 | donor_22 | 34.353351 | 5.235590 | 0.183591 | 0.492258 | 14.392843 | 5.554316 | 13.652274 | 0.118964 | 27.762663 | True | 0 | 32 | 38 | 30 | 0.191667 | 40 | high_std_rmse |
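The notebook does not show how composite_similarity_score is computed. The displayed values are consistent with one minus the average of the three ranks, each shifted to start at zero and divided by the number of donors. The sketch below encodes that reading as an assumption; the function name `score_from_ranks` is hypothetical, and the library's actual formula may differ.

```python
def score_from_ranks(corr_rank, std_rmse_rank, slope_rank, n_donors=40):
    """Assumed formula: 1 - mean((rank - 1) / n_donors) over the three rank columns."""
    ranks = (corr_rank, std_rmse_rank, slope_rank)
    return 1.0 - sum((r - 1) / n_donors for r in ranks) / len(ranks)


# Rank triples copied from the table for donor_20, donor_23, and donor_22.
displayed = {
    "donor_20": ((6, 2, 1), 0.950000),
    "donor_23": ((1, 1, 19), 0.850000),
    "donor_22": ((32, 38, 30), 0.191667),
}
scores = {name: score_from_ranks(*ranks) for name, (ranks, _) in displayed.items()}
```

Under this reading, a donor that ranks first on all three criteria would score 1.0, and the score weights correlation, standardized RMSE, and slope difference equally.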