Scenario: Classic RCT

Automated conversion of classic_rct.ipynb

We call a scenario a 'Classic Randomized Controlled Trial' (RCT) when a treatment is randomly assigned to participants and no pre-experiment data about them, such as a pre-treatment outcome, is available.

Treatment: a new onboarding flow for new users.

We will test the following hypotheses:

$H_0$: There is no difference in conversion rate between the treatment and control groups.

$H_a$: There is a difference in conversion rate between the treatment and control groups.

Data

We will use a data-generating process (DGP) from Causalis. You can read more at https://causalis.causalcraft.com/articles/generate_classic_rct_26

Result
   user_id  conversion    d  platform_ios  country_usa  source_paid    m  m_obs  tau_link        g0        g1      cate
0    01fc4         0.0  0.0           1.0          0.0          1.0  0.5    0.5  0.106483  0.310620  0.333868  0.023249
1    0204c         0.0  1.0           0.0          0.0          1.0  0.5    0.5  0.106483  0.198257  0.215727  0.017471
2    002cf         0.0  0.0           1.0          1.0          0.0  0.5    0.5  0.106483  0.231969  0.251479  0.019509
3    0202d         0.0  1.0           1.0          1.0          0.0  0.5    0.5  0.106483  0.231969  0.251479  0.019509
4    011cb         0.0  1.0           0.0          1.0          0.0  0.5    0.5  0.106483  0.142189  0.155678  0.013489

Result

Ground truth ATE is 0.01719144406311028

Result

CausalData(df=(10000, 5), treatment='d', outcome='conversion', confounders=['platform_ios', 'country_usa', 'source_paid'])

Result
   treatment  count      mean       std  min  p10  p25  median  p75  p90  max
0        0.0   4955  0.198991  0.399281  0.0  0.0  0.0     0.0  0.0  1.0  1.0
1        1.0   5045  0.232904  0.422723  0.0  0.0  0.0     0.0  0.0  1.0  1.0

Monitoring

Our system splits users randomly: half of them should receive the new onboarding and the other half should not. We should monitor the split with a sample ratio mismatch (SRM) test. Read more at https://causalis.causalcraft.com/articles/srm
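
To make the SRM check concrete, here is a minimal sketch of a chi-square goodness-of-fit test on the group sizes from the summary table (4955 control, 5045 treatment). It is an illustration only: the function name `srm_test` is hypothetical, and the article does not show which test the Causalis SRM check uses internally.

```python
import math


def srm_test(n_control, n_treatment, expected_share_treatment=0.5):
    """Chi-square goodness-of-fit test (1 degree of freedom) for the traffic split."""
    total = n_control + n_treatment
    expected_treatment = total * expected_share_treatment
    expected_control = total - expected_treatment
    chi2 = (
        (n_control - expected_control) ** 2 / expected_control
        + (n_treatment - expected_treatment) ** 2 / expected_treatment
    )
    # With 1 degree of freedom, the chi-square survival function
    # equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Group sizes taken from the outcome summary table.
chi2, p_value = srm_test(4955, 5045)
print(f"chi2={chi2:.3f}, p_value={p_value:.3f}")
```

A large p-value means the observed split is compatible with the intended 50/50 allocation; a very small one suggests a problem in the assignment or logging pipeline.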

Check the confounders balance

Are the groups balanced in terms of confounders? We need to choose the confounders using domain and business knowledge and then check their balance. By the standard benchmarks, a confounder is flagged as imbalanced when:

  • $SMD > 0.1$ (standardized mean difference)
  • ks_pvalue < 0.05

As the balance check shows, the system split users randomly.
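
The SMD benchmark can be illustrated with a short sketch. It assumes the common definition of the standardized mean difference, the difference in group means divided by the square root of the average of the two group variances; the article does not state which variant Causalis computes, and the data below is made up for illustration.

```python
import math
from statistics import mean, variance


def smd(treated, control):
    """Standardized mean difference using the average of the two sample variances."""
    diff = mean(treated) - mean(control)
    pooled_sd = math.sqrt((variance(treated) + variance(control)) / 2)
    if pooled_sd == 0:
        # Both groups are constant: balanced if equal, maximally imbalanced otherwise.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / pooled_sd


# Hypothetical values of one binary confounder (not taken from the article's data).
treated = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
control = [1, 0, 1, 0, 0, 0, 1, 0, 1, 0]

value = smd(treated, control)
print(f"SMD={value:.3f}, flagged={abs(value) > 0.1}")
```

With samples this small, a large SMD is expected by chance; in the 10,000-user dataset the same check is far more informative.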

Estimation with Diff-in-Means

Inference methods

The Causalis.DiffInMeans model implements three inference methods: ttest, conversion_ztest and bootstrap.

  • use conversion_ztest when users < 100k and the outcome is binary
  • use bootstrap when users < 10k, or the outcome is a ratio metric, or the metric is highly skewed
  • in all other cases use ttest

We will use conversion_ztest for our scenario.
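
The selection rules above can be written as a small helper. This is a sketch, not part of Causalis: the function `choose_inference_method` and its parameters are hypothetical, and the order in which overlapping rules are applied (for example a binary outcome with fewer than 10k users) is an assumption, since the text does not specify it.

```python
def choose_inference_method(n_users, is_binary, is_ratio_or_skewed=False):
    """Return the name of the inference method suggested by the rules of thumb."""
    # Assumption: metric shape takes precedence over sample size.
    if is_ratio_or_skewed:
        return "bootstrap"
    # Assumption: a binary outcome prefers the z-test even below 10k users.
    if is_binary and n_users < 100_000:
        return "conversion_ztest"
    if n_users < 10_000:
        return "bootstrap"
    return "ttest"


# Our scenario: 10,000 users and a binary conversion outcome.
print(choose_inference_method(10_000, is_binary=True))
```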

conversion_ztest

The conversion_ztest performs a statistical comparison of conversion rates between two groups (Treatment and Control). It provides a p-value for the hypothesis test, and robust confidence intervals for both absolute and relative differences.

1. Observed Metrics

For each group (Control $0$, Treatment $1$):

  • $n_0, n_1$: Total number of observations.
  • $x_0, x_1$: Number of successes (conversions).
  • $p_0 = \frac{x_0}{n_0}, \;\; p_1 = \frac{x_1}{n_1}$: Observed conversion rates.

2. Hypothesis Test (P-value)

The test evaluates $H_0: p_1 = p_0$ (no difference).

  • Pooled Proportion: $\hat{p} = \frac{x_0 + x_1}{n_0 + n_1}$
  • Pooled Standard Error: $SE_{pooled} = \sqrt{\hat{p}(1 - \hat{p}) \left(\frac{1}{n_0} + \frac{1}{n_1}\right)}$
  • Z-Statistic: $Z = \frac{p_1 - p_0}{SE_{pooled}}$
  • P-value: $2 \times (1 - \Phi(|Z|))$, where $\Phi$ is the standard normal CDF.
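
The formulas in steps 1 and 2 can be checked by hand with a few lines of standard-library Python. This is a sketch that mirrors the formulas, not the Causalis implementation; the function name `two_proportion_ztest` is hypothetical, and the conversion counts are recovered from the summary table as count × mean (4955 × 0.198991 ≈ 986 and 5045 × 0.232904 ≈ 1175).

```python
import math
from statistics import NormalDist


def two_proportion_ztest(x0, n0, x1, n1):
    """Two-sided pooled z-test for the difference of two conversion rates."""
    p0, p1 = x0 / n0, x1 / n1
    p_pooled = (x0 + x1) / (n0 + n1)
    se_pooled = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n0 + 1 / n1))
    z = (p1 - p0) / se_pooled
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p0, p1, z, p_value


# Counts recovered from the summary table (count * mean, rounded).
p0, p1, z, p_value = two_proportion_ztest(x0=986, n0=4955, x1=1175, n1=5045)
print(f"p0={p0:.4f}, p1={p1:.4f}, z={z:.2f}, p_value={p_value:.2g}")
```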

3. Absolute Difference (Newcombe CI)

To calculate the confidence interval for the difference $\Delta = p_1 - p_0$, we use the Newcombe method, which is more robust than standard Wald intervals for conversion rates.

  1. Wilson Score Interval for each group: $CI_{Wilson, i} = (l_i, u_i) = \frac{p_i + \frac{z^2}{2n_i} \pm z \sqrt{\frac{p_i(1 - p_i)}{n_i} + \frac{z^2}{4n_i^2}}}{1 + \frac{z^2}{n_i}}$
  2. Combined Interval: $CI_{\Delta} = (l_1 - u_0, \;\; u_1 - l_0)$ (where $z$ is the critical value for the chosen $\alpha$)
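
The two steps can be sketched as follows. The code follows the combined interval exactly as written above, which is at least as wide as the square-and-add form often associated with Newcombe's hybrid score interval. The function names are hypothetical, $\alpha = 0.05$ is an assumed default, and the counts are again recovered from the summary table.

```python
import math
from statistics import NormalDist


def wilson_interval(x, n, z):
    """Wilson score interval (lower, upper) for a single proportion."""
    p = x / n
    center = p + z**2 / (2 * n)
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    denominator = 1 + z**2 / n
    return (center - half_width) / denominator, (center + half_width) / denominator


def difference_interval(x0, n0, x1, n1, alpha=0.05):
    """Interval for p1 - p0 built from the two Wilson intervals as in the text."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    l0, u0 = wilson_interval(x0, n0, z)
    l1, u1 = wilson_interval(x1, n1, z)
    return l1 - u0, u1 - l0


ci_low, ci_high = difference_interval(x0=986, n0=4955, x1=1175, n1=5045)
print(f"CI for the absolute difference: ({ci_low:.4f}, {ci_high:.4f})")
```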

4. Relative Difference (Lift)

Lift measures the percentage change: $\text{Lift} = (\frac{p_1}{p_0} - 1) \times 100\%$. The confidence interval uses a delta-method approximation on the lift scale:

  • $\text{Var}(p_1) = \frac{p_1(1 - p_1)}{n_1}, \; \text{Var}(p_0) = \frac{p_0(1 - p_0)}{n_0}$
  • $SE_{\text{lift}} = 100 \times \sqrt{(\frac{1}{p_0})^2 \text{Var}(p_1) + (\frac{p_1}{p_0^2})^2 \text{Var}(p_0)}$
  • Relative CI: $\text{Lift} \pm z \times SE_{\text{lift}}$

(If $p_0$ is extremely close to 0, the lift is undefined; the implementation returns inf/0 and NaN for the CI.)
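
A minimal sketch of the lift calculation, again using the counts recovered from the summary table. The function name `lift_with_ci` is hypothetical, $\alpha = 0.05$ is an assumed default, and the guard for $p_0 = 0$ is a simplification: the exact threshold and return values used by the library are not shown in the text.

```python
import math
from statistics import NormalDist


def lift_with_ci(x0, n0, x1, n1, alpha=0.05):
    """Relative lift in percent with a delta-method confidence interval."""
    p0, p1 = x0 / n0, x1 / n1
    if p0 == 0:
        # Simplified guard (assumption): lift is undefined without a baseline rate.
        return math.inf, (math.nan, math.nan)
    lift = (p1 / p0 - 1) * 100
    var_p1 = p1 * (1 - p1) / n1
    var_p0 = p0 * (1 - p0) / n0
    se_lift = 100 * math.sqrt((1 / p0) ** 2 * var_p1 + (p1 / p0**2) ** 2 * var_p0)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    return lift, (lift - z * se_lift, lift + z * se_lift)


lift, (lift_low, lift_high) = lift_with_ci(x0=986, n0=4955, x1=1175, n1=5045)
print(f"lift={lift:.1f}%, CI=({lift_low:.1f}%, {lift_high:.1f}%)")
```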