Submodule
causalis.scenarios.cuped.dgp

dgp

Submodule causalis.scenarios.cuped.dgp with no child pages and 2 documented members.

Functions

function
causalis.scenarios.cuped.dgp.generate_cuped_tweedie_26

generate_cuped_tweedie_26

Gold-standard Tweedie-like data-generating process (DGP) with mixed marginals and structured heterogeneous treatment effects (HTE).

This DGP generates a dataset representing a typical e-commerce scenario with many zeros and a heavy right tail (e.g., revenue). It features pre-treatment covariates (‘y_pre’, ‘y_pre_2’) that are calibrated to have a specific correlation with the post-treatment outcome.

Notes

The outcome $Y$ follows a Tweedie-like distribution where the mean is a function of treatment $D$ and latent factors $A$ and $X$:

$$E[Y \mid D, A, X] = \exp(\alpha + \theta D + \beta X + \gamma D X + A)$$
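
The mean structure above can be illustrated with a minimal NumPy sketch. This is not the library's implementation: the coefficient values, the latent-factor scale, and the compound Poisson-gamma construction are all assumptions chosen to produce many zeros and a heavy right tail while keeping the conditional mean equal to the log-link expression.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000

# Hypothetical coefficients, chosen for illustration only.
alpha, theta, beta, gamma = 0.0, 0.38, 0.5, 0.2

d = rng.integers(0, 2, size=n)       # randomized treatment D
x = rng.normal(size=n)               # observed covariate X
a = rng.normal(scale=0.5, size=n)    # latent factor A

# Log-link mean: E[Y | D, A, X] = exp(alpha + theta*D + beta*X + gamma*D*X + A)
mu = np.exp(alpha + theta * d + beta * x + gamma * d * x + a)

# Compound Poisson-gamma draw (assumed construction):
# N ~ Poisson(lam) purchase events, each Gamma(shape, scale) in size.
lam, shape = 0.5, 2.0
scale = mu / (lam * shape)           # gives E[Y | mu] = lam * shape * scale = mu
counts = rng.poisson(lam, size=n)

# A sum of N iid Gamma(shape, scale) draws is Gamma(N * shape, scale);
# rows with N = 0 are set to exactly zero.
y = np.where(counts > 0, rng.gamma(shape * np.maximum(counts, 1), scale), 0.0)

zero_share = (y == 0).mean()
mean_control = y[d == 0].mean()
mean_treated = y[d == 1].mean()
print(f"zeros: {zero_share:.2%}, control mean: {mean_control:.3f}, treated mean: {mean_treated:.3f}")
```

At $X = 0$ the treatment multiplies the conditional mean by $\exp(\theta)$, so $\theta = 0.38$ corresponds to a factor of roughly 1.46.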

The pre-period covariate $X_{pre}$ (e.g., y_pre) is generated to satisfy:

$$\mathrm{Corr}(X_{pre}, Y \mid D=0) \approx \rho$$

where $\rho$ is the pre_target_corr. This allows for realistic testing of variance reduction techniques.
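
To see why the target correlation matters for variance reduction, here is a small sketch of the CUPED adjustment on synthetic data. It assumes a bivariate-normal stand-in for the pre-period covariate and the outcome rather than the heavy-tailed outcome this DGP produces, and the variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
rho = 0.82  # plays the role of pre_target_corr in this sketch

# Gaussian stand-in for (X_pre, Y) in the control group, with Corr = rho.
x_pre = rng.normal(size=n)
y = rho * x_pre + np.sqrt(1 - rho**2) * rng.normal(size=n)

# CUPED: remove the part of Y that is linearly explained by X_pre.
cov = np.cov(x_pre, y)
coef = cov[0, 1] / cov[0, 0]
y_cuped = y - coef * (x_pre - x_pre.mean())

# Share of the original variance that remains after adjustment.
ratio = y_cuped.var() / y.var()
print(f"remaining variance share: {ratio:.3f}")
```

The remaining variance share equals one minus the squared sample correlation, so with $\rho = 0.82$ it should land close to $1 - \rho^2 \approx 0.33$.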

Examples

Generate data with two pre-period covariates

Parameters

n : int, default=10000

Number of samples to generate.

seed : int, default=42

Random seed.

add_pre : bool, default=True

Whether to add pre-period covariates.

pre_name : str, default="y_pre"

Name of the first pre-period covariate column.

pre_name_2 : str, optional

Name of the second pre-period covariate column. Defaults to f"{pre_name}_2".

pre_target_corr : float, default=0.82

Target correlation between the first pre-period covariate and the post-treatment outcome y in the control group.

pre_target_corr_2 : float, optional

Target correlation for the second pre covariate. Defaults to a moderate value based on pre_target_corr to reduce collinearity.

pre_spec : PreCorrSpec, optional

Detailed specification for pre-period calibration (transform, method, etc.).

include_oracle : bool, default=False

Whether to include oracle ground-truth columns like ‘cate’, ‘propensity’, etc.

return_causal_data : bool, default=True

Whether to return a CausalData object.

theta_log : float, default=0.38

The log-uplift theta parameter for the treatment effect.

Returns

pd.DataFrame or CausalData

function
causalis.scenarios.cuped.dgp.make_cuped_binary_26

make_cuped_binary_26

Binary CUPED benchmark with richer confounders and structured HTE.

This DGP generates a binary outcome (e.g., conversion) with a calibrated pre-period covariate.

Notes

The outcome $Y$ is binary and follows:

$$P(Y=1 \mid D, X) = \text{logit}^{-1}(\alpha + \theta D + \beta X)$$

The pre-period covariate is calibrated such that its correlation with $Y$ in the control group matches pre_target_corr.
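
The logistic outcome model above can be sketched in a few lines of NumPy. This is an illustration under assumed coefficient values and a single Gaussian covariate, not the library's implementation, and it leaves out the pre-period calibration step.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# Hypothetical coefficients, chosen for illustration only.
alpha, theta, beta = -2.0, 0.38, 0.8

d = rng.integers(0, 2, size=n)   # randomized treatment D
x = rng.normal(size=n)           # covariate X


def inv_logit(z):
    """Inverse logit (sigmoid), mapping log-odds to probabilities."""
    return 1.0 / (1.0 + np.exp(-z))


# P(Y = 1 | D, X) = logit^{-1}(alpha + theta*D + beta*X)
p = inv_logit(alpha + theta * d + beta * x)
y = rng.binomial(1, p)

rate_control = y[d == 0].mean()
rate_treated = y[d == 1].mean()
print(f"control rate: {rate_control:.4f}, treated rate: {rate_treated:.4f}")
```

Because $\theta$ acts on the log-odds scale, the absolute lift in conversion rate depends on the baseline rate and is not constant across values of $X$.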

Examples

Parameters

n : int, default=10000

Number of samples to generate.

seed : int, default=42

Random seed.

add_pre : bool, default=True

Whether to add a pre-period covariate ‘y_pre’.

pre_name : str, default="y_pre"

Name of the pre-period covariate column.

pre_target_corr : float, default=0.65

Target correlation between y_pre and post-outcome y in the control group.

pre_spec : PreCorrSpec, optional

Detailed specification for pre-period calibration (transform, method, etc.).

include_oracle : bool, default=True

Whether to include oracle columns like ‘cate’, ‘g0’, and ‘g1’.

return_causal_data : bool, default=True

Whether to return a CausalData object.

theta_logit : float, default=0.38

Baseline log-odds uplift scale for heterogeneous treatment effects.

Returns

pd.DataFrame or CausalData
