Submodule
causalis.scenarios.multi_unconfoundedness.dgp

dgp

Submodule causalis.scenarios.multi_unconfoundedness.dgp with no child pages and 5 documented members.

function
causalis.scenarios.multi_unconfoundedness.dgp.generate_multitreatment_gamma_26

generate_multitreatment_gamma_26

Pre-configured multi-treatment dataset with Gamma-distributed outcome.

  • 3 treatment classes: d_0 (control), d_1, d_2

  • 8 confounders with realistic marginals sampled through a Gaussian copula

  • Gamma outcome with log-link confounding and heterogeneous arm effects

Examples

Notes

Let $X = (\text{tenure}, \text{sessions}, \text{spend}, \text{premium}, \text{urban}, \text{tickets}, \text{discount}, \text{credit})$ denote the 8 observed confounders. The treatment assignment mechanism is a multinomial logit with calibrated marginal arm rates near $(0.50, 0.25, 0.25)$:

$$s_k(X) = \alpha_{d,k} + \beta_{d,k}^{\top} X, \qquad \Pr(D = k \mid X) = \frac{\exp(s_k(X))}{\sum_{j=0}^{2} \exp(s_j(X))}.$$
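As a minimal sketch of the multinomial-logit step, the snippet below turns per-arm scores into assignment probabilities. The function name `arm_probabilities` and the score values are hypothetical and chosen only for illustration; they are not the calibrated coefficients used by the library.

```python
import math


def arm_probabilities(scores):
    """Softmax over per-arm scores s_k(X); returns Pr(D = k | X) for each arm."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for one unit, ordered as (d_0, d_1, d_2).
probs = arm_probabilities([0.0, -0.7, -0.7])
print([round(p, 3) for p in probs])
```

With these illustrative scores the probabilities land close to the (0.50, 0.25, 0.25) target rates, which is the kind of balance the calibration aims for on average.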

The confounders are jointly sampled through a Gaussian copula whose correlation matrix is Toeplitz, with $\mathrm{Corr}(X_i, X_j) = 0.3^{|i-j|}$.
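One way to picture the copula step is the sketch below. It assumes the correlation applies on the latent Gaussian scale and uses an AR(1) recursion, which reproduces the $0.3^{|i-j|}$ Toeplitz pattern; the helper names and the two example marginals are hypothetical and do not come from the library.

```python
import math
import random
from statistics import NormalDist

RHO = 0.3  # Corr(Z_i, Z_j) = RHO ** |i - j| on the latent Gaussian scale


def sample_latent(n_features, rng):
    """Draw one latent Gaussian vector with Toeplitz (AR(1)) correlation."""
    z = [rng.gauss(0.0, 1.0)]
    for _ in range(n_features - 1):
        noise = rng.gauss(0.0, 1.0)
        z.append(RHO * z[-1] + math.sqrt(1.0 - RHO ** 2) * noise)
    return z


def to_uniforms(z):
    """Map latent normals to dependent Uniform(0, 1) values via the normal CDF."""
    std_normal = NormalDist()
    return [std_normal.cdf(v) for v in z]


rng = random.Random(0)
u = to_uniforms(sample_latent(8, rng))

# Hypothetical marginals, pushed through their inverse CDFs.
tenure = -12.0 * math.log(1.0 - u[0])  # Exponential with mean 12
premium = 1 if u[3] < 0.25 else 0      # Bernoulli with rate 0.25
```

Each marginal keeps its own distribution while the dependence between features is inherited from the latent Gaussian vector.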

The outcome uses a log link. For arm $k$,

$$\log \mu_k(X) = \alpha_y + \beta_y^{\top} X + \theta_k + \tau_k(X), \qquad Y(k) \mid X \sim \Gamma(\text{shape}=2, \text{scale}=\mu_k(X)/2).$$
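The shape and scale are paired so that the Gamma mean equals $\mu_k(X)$ (shape times scale). A minimal sketch of that draw, using the standard library's `random.gammavariate` and a hypothetical value for the linear predictor:

```python
import math
import random


def sample_gamma_outcome(log_mu, rng):
    """Draw Y ~ Gamma(shape=2, scale=mu/2), so that E[Y] = mu."""
    mu = math.exp(log_mu)
    return rng.gammavariate(2.0, mu / 2.0)  # arguments are (shape, scale)


rng = random.Random(0)
log_mu = 3.0  # hypothetical value of alpha_y + beta_y'X + theta_k + tau_k(X)
draws = [sample_gamma_outcome(log_mu, rng) for _ in range(50_000)]
sample_mean = sum(draws) / len(draws)
print(round(sample_mean / math.exp(log_mu), 2))  # ratio of sample mean to mu
```

The printed ratio should sit close to 1, reflecting that the parameterization keeps the conditional mean at $\mu_k(X)$.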

This scenario fixes $\theta = (0, -0.05, 0.10)$ and uses the heterogeneous shifts

$$\tau_1(X) = \min \left\{ -0.22 - 0.0010 \, \text{tenure} - 0.006 \, \text{sessions} - 0.05 \, \text{premium} - 0.04 \, \text{discount} - 0.10 \, (\text{credit} - 0.45), \; -0.02 \right\},$$

$$\tau_2(X) = \max \left\{ 0.16 + 0.014 \, \text{sessions} + 0.030 \, \log(1 + \text{spend}) + 0.06 \, \text{urban} - 0.006 \, \text{tickets} + 0.12 \, (\text{credit} - 0.45), \; 0.02 \right\}.$$

The clipping forces $\tau_1(X) \le -0.02$ and $\tau_2(X) \ge 0.02$. So d_1 is always worse than control on the log-mean scale, with $\theta_1 + \tau_1(X) \le -0.07$, while d_2 is always better than control, with $\theta_2 + \tau_2(X) \ge 0.12$.
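The two clipped shifts can be transcribed directly from the formulas above. The sketch below does that with a plain dictionary per unit; the dictionary keys and the example feature values are hypothetical, since the page does not state the units or ranges of the confounders.

```python
import math


def tau_1(x):
    """Heterogeneous shift for d_1, capped from above at -0.02."""
    raw = (-0.22 - 0.0010 * x["tenure"] - 0.006 * x["sessions"]
           - 0.05 * x["premium"] - 0.04 * x["discount"]
           - 0.10 * (x["credit"] - 0.45))
    return min(raw, -0.02)


def tau_2(x):
    """Heterogeneous shift for d_2, floored from below at 0.02."""
    raw = (0.16 + 0.014 * x["sessions"] + 0.030 * math.log1p(x["spend"])
           + 0.06 * x["urban"] - 0.006 * x["tickets"]
           + 0.12 * (x["credit"] - 0.45))
    return max(raw, 0.02)


# Hypothetical unit; values are illustrative only.
unit = {"tenure": 24, "sessions": 10, "spend": 100.0, "premium": 1,
        "urban": 1, "tickets": 2, "discount": 0, "credit": 0.6}
print(tau_1(unit), tau_2(unit))
```

Whatever the feature values, `tau_1` never exceeds -0.02 and `tau_2` never drops below 0.02, which is what fixes the sign of each arm's effect.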

function
causalis.scenarios.multi_unconfoundedness.dgp.generate_multitreatment_binary_26

generate_multitreatment_binary_26

Pre-configured multi-treatment dataset with binary outcome.

  • 3 treatment classes: d_0 (control), d_1, d_2

  • 8 confounders with realistic marginals sampled through a Gaussian copula

  • Binary outcome with a logistic baseline and heterogeneous arm effects

Examples

Notes

Let $X = (\text{tenure}, \text{active days}, \text{income}, \text{premium}, \text{family}, \text{complaints}, \text{discount}, \text{engagement})$ denote the 8 confounders. Treatment assignment again follows a calibrated multinomial logit with target arm rates near $(0.50, 0.25, 0.25)$:

$$s_k(X) = \alpha_{d,k} + \beta_{d,k}^{\top} X, \qquad \Pr(D = k \mid X) = \frac{\exp(s_k(X))}{\sum_{j=0}^{2} \exp(s_j(X))}.$$

The outcome uses a logistic link with $\alpha_y = -1.1$:

$$\operatorname{logit}\Pr(Y(k)=1 \mid X) = -1.1 + \beta_y^{\top} X + \theta_k + \tau_k(X).$$

This scenario fixes $\theta = (0, -0.18, 0.26)$ and uses

$$\tau_1(X) = \min \left\{ -0.16 - 0.0008 \, \text{tenure} - 0.020 \, \text{active days} - 0.08 \, \text{premium} - 0.03 \, \text{complaints} - 0.10 \, (\text{engagement} - 0.60), \; -0.02 \right\},$$

$$\tau_2(X) = \max \left\{ 0.14 + 0.020 \, \text{active days} + 0.028 \, \log(1 + \text{income}) + 0.05 \, \text{family} - 0.010 \, \text{complaints} + 0.12 \, (\text{engagement} - 0.60), \; 0.02 \right\}.$$

The clipping keeps d_1 uniformly below control and d_2 uniformly above control on the log-odds scale, while the Gaussian copula with $\mathrm{Corr}(X_i, X_j) = 0.3^{|i-j|}$ induces cross-feature dependence.
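To see the ordering on the probability scale, the sketch below evaluates the three potential-outcome probabilities for one unit. It assumes the control arm has no heterogeneous shift, and the `baseline` argument stands in for $\beta_y^{\top} X$, whose coefficients are not listed on this page; the dictionary keys and feature values are hypothetical.

```python
import math

ALPHA_Y = -1.1
THETA = (0.0, -0.18, 0.26)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def tau_1(x):
    raw = (-0.16 - 0.0008 * x["tenure"] - 0.020 * x["active_days"]
           - 0.08 * x["premium"] - 0.03 * x["complaints"]
           - 0.10 * (x["engagement"] - 0.60))
    return min(raw, -0.02)


def tau_2(x):
    raw = (0.14 + 0.020 * x["active_days"] + 0.028 * math.log1p(x["income"])
           + 0.05 * x["family"] - 0.010 * x["complaints"]
           + 0.12 * (x["engagement"] - 0.60))
    return max(raw, 0.02)


def potential_outcome_probs(x, baseline):
    """Pr(Y(k) = 1 | X) for k = 0, 1, 2; baseline is a stand-in for beta_y'X."""
    shifts = (0.0, tau_1(x), tau_2(x))  # assumption: no shift for control
    return [sigmoid(ALPHA_Y + baseline + THETA[k] + shifts[k]) for k in range(3)]


# Hypothetical unit and baseline, for illustration only.
unit = {"tenure": 12, "active_days": 5, "income": 50.0, "premium": 0,
        "family": 1, "complaints": 1, "engagement": 0.6}
probs = potential_outcome_probs(unit, baseline=0.3)
print([round(p, 3) for p in probs])
```

Because the sigmoid is monotone, the log-odds ordering carries over: the d_1 probability sits below control and the d_2 probability above it for any unit.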

function
causalis.scenarios.multi_unconfoundedness.dgp.generate_multitreatment_irm_26

generate_multitreatment_irm_26

function
causalis.scenarios.multi_unconfoundedness.dgp.generate_multi_dml_cx_26

generate_multi_dml_cx_26

The notebook simulates overlapping contact and repeat actions. This packaged DGP resolves them into a mutually exclusive one-hot treatment:

  • control

  • neg_contact_flg

  • error_flg

  • neg_contact_flg_error_flg

Treatment assignment matches the notebook’s independent Bernoulli contact and repeat mechanisms exactly after overlap resolution, but is exposed through the shared multi-treatment generator so that it integrates with MultiCausalData and the scenario tooling.

Examples

Notes

Write $a(X)$ for the contact logit and $b(X)$ for the repeat logit. The notebook first draws two conditionally independent Bernoulli actions,

$$C \mid X \sim \operatorname{Bernoulli}(\sigma(a(X))), \qquad R \mid X \sim \operatorname{Bernoulli}(\sigma(b(X))),$$

where $\sigma(z) = 1 / (1 + e^{-z})$. In this packaged benchmark the pair $(C, R)$ is re-encoded as a one-hot treatment:

$$D = \begin{cases} \text{control} & (C, R) = (0, 0), \\ \text{neg\_contact\_flg} & (C, R) = (1, 0), \\ \text{error\_flg} & (C, R) = (0, 1), \\ \text{neg\_contact\_flg\_error\_flg} & (C, R) = (1, 1). \end{cases}$$
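The re-encoding can be sketched as follows: draw the two Bernoulli actions, then map the pair to one of four mutually exclusive arms. The helper names `encode_arm` and `draw_treatment` and the logit values are hypothetical; this is an illustration of the case table, not the library's internal code.

```python
import math
import random

ARMS = ("control", "neg_contact_flg", "error_flg", "neg_contact_flg_error_flg")


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def encode_arm(c, r):
    """Map (C, R) to an arm index: (0,0)->0, (1,0)->1, (0,1)->2, (1,1)->3."""
    return c + 2 * r


def draw_treatment(a_x, b_x, rng):
    """Draw C and R independently given X, then return (label, one-hot vector)."""
    c = 1 if rng.random() < sigmoid(a_x) else 0
    r = 1 if rng.random() < sigmoid(b_x) else 0
    k = encode_arm(c, r)
    one_hot = [1 if j == k else 0 for j in range(len(ARMS))]
    return ARMS[k], one_hot


rng = random.Random(0)
label, one_hot = draw_treatment(0.5, -1.0, rng)  # hypothetical logit values
print(label, one_hot)
```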

Let $p_c = \sigma(a(X))$ and $p_r = \sigma(b(X))$. Then the arm probabilities are

$$\Pr(D=\text{control}\mid X) = (1-p_c)(1-p_r),$$

$$\Pr(D=\text{neg\_contact\_flg}\mid X) = p_c (1-p_r),$$

$$\Pr(D=\text{error\_flg}\mid X) = (1-p_c) p_r,$$

$$\Pr(D=\text{neg\_contact\_flg\_error\_flg}\mid X) = p_c p_r.$$

Equivalently, this is exactly the softmax model with class scores $(0, a(X), b(X), a(X)+b(X))$, because the softmax denominator factorizes as $1 + e^{a} + e^{b} + e^{a+b} = (1 + e^{a})(1 + e^{b})$. This is why the implementation passes g_d=[None, _cx_contact_logit, _cx_repeat_logit, lambda x: _cx_contact_logit(x) + _cx_repeat_logit(x)].

The observed outcome uses a binary logit baseline $g_y(X)$ plus a class effect

$$\operatorname{logit}\Pr(Y=1 \mid X, D) = g_y(X) + \theta(D),$$

with $\theta(\text{control}) = \theta(\text{neg\_contact\_flg}) = 0$ and $\theta(\text{error\_flg}) = \theta(\text{neg\_contact\_flg\_error\_flg}) = -0.65$.
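A short sketch of this outcome model: only the two arms that include the repeat (error) action shift the log-odds. The baseline value `g_y` below is hypothetical, since the form of $g_y(X)$ is not given on this page.

```python
import math

# Class effects on the log-odds scale, as stated above.
THETA = {
    "control": 0.0,
    "neg_contact_flg": 0.0,
    "error_flg": -0.65,
    "neg_contact_flg_error_flg": -0.65,
}


def outcome_probability(g_y, arm):
    """Pr(Y = 1 | X, D = arm) for a given baseline logit g_y(X)."""
    return 1.0 / (1.0 + math.exp(-(g_y + THETA[arm])))


g_y = 0.4  # hypothetical baseline logit for one unit
probs = {arm: outcome_probability(g_y, arm) for arm in THETA}
for arm, p in probs.items():
    print(f"{arm}: {p:.3f}")
```

Under this specification the contact action alone leaves the outcome probability unchanged, so any estimated effect for neg_contact_flg should be close to zero.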

Worked overlap example: if $a(X) = 0.8$ and $b(X) = -0.2$, then $p_c \approx 0.690$ and $p_r \approx 0.450$, giving arm probabilities approximately (0.170, 0.379, 0.140, 0.311) for (control, neg_contact_flg, error_flg, neg_contact_flg_error_flg).
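The worked example can be checked numerically. The sketch below computes the arm probabilities both as products of the two Bernoulli rates and as a softmax over the scores $(0, a, b, a+b)$; the helper names are illustrative.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


a, b = 0.8, -0.2  # contact and repeat logits from the worked example
p_c, p_r = sigmoid(a), sigmoid(b)

# Order: control, neg_contact_flg, error_flg, neg_contact_flg_error_flg
product_form = [(1 - p_c) * (1 - p_r), p_c * (1 - p_r), (1 - p_c) * p_r, p_c * p_r]
softmax_form = softmax([0.0, a, b, a + b])

print([f"{p:.3f}" for p in product_form])
print([f"{p:.3f}" for p in softmax_form])
```

Both forms agree to floating-point precision and reproduce the approximate figures quoted above.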

data
causalis.scenarios.multi_unconfoundedness.dgp.multi_dml_cx_26

multi_dml_cx_26

Value: None
