causalis.dgp.multicausaldata.base.MultiCausalDatasetGenerator
Generate synthetic causal datasets with multi-class (one-hot) treatments.
Treatment assignment is modeled via a multinomial logistic (softmax) model: P(D=k | X, U) = softmax_k(alpha_d[k] + f_k(X) + u_strength_d[k] * U)
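The softmax assignment rule above can be sketched in plain NumPy. This is an illustrative sketch, not the library's implementation: it assumes a linear score f_k(X) = X @ beta_d[k], and the parameter values, the helper `softmax`, and the sampling step are made up for the example.

```python
import numpy as np

def softmax(scores):
    """Row-wise softmax; subtracting the row max keeps exp() stable."""
    shifted = scores - scores.max(axis=1, keepdims=True)
    expd = np.exp(shifted)
    return expd / expd.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
n, k, K = 1000, 5, 3              # rows, confounders, treatment classes

X = rng.normal(size=(n, k))       # observed confounders
U = rng.normal(size=n)            # unobserved confounder

# Hypothetical per-class parameters; index 0 is the control class.
alpha_d = np.array([0.0, 0.2, -0.1])
beta_d = np.vstack([np.zeros(k), rng.normal(scale=0.5, size=(K - 1, k))])
u_strength_d = np.array([0.0, 0.3, 0.3])

# scores[i, c] = alpha_d[c] + f_c(X_i) + u_strength_d[c] * U_i  (linear f_c assumed)
scores = alpha_d + X @ beta_d.T + np.outer(U, u_strength_d)
probs = softmax(scores)           # P(D = c | X, U), one row per unit

# Draw one class per row by inverting the cumulative distribution.
cum = probs.cumsum(axis=1)
draws = rng.random(n)[:, None]
classes = np.minimum((draws > cum).sum(axis=1), K - 1)  # guard against rounding

# Full one-hot encoding: each row has exactly one 1.
D = np.eye(K, dtype=int)[classes]
```

Each row of `probs` sums to 1 and each row of `D` contains a single 1, matching the one-hot convention described for n_treatments.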
Outcome depends on confounders and the assigned treatment class:
- outcome_type = "continuous": Y = alpha_y + f_y(X) + u_strength_y * U + sum_k D_k * (theta_k + tau_k(X)) + eps
- outcome_type = "binary": logit P(Y=1|X,D,U) = alpha_y + f_y(X) + u_strength_y * U + sum_k D_k * (theta_k + tau_k(X))
- outcome_type = "poisson": log E[Y|X,D,U] = alpha_y + f_y(X) + u_strength_y * U + sum_k D_k * (theta_k + tau_k(X))
- outcome_type = "gamma": log E[Y|X,D,U] = alpha_y + f_y(X) + u_strength_y * U + sum_k D_k * (theta_k + tau_k(X))
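All four outcome families share the same linear predictor and differ only in the link and the noise model. The sketch below illustrates that structure under stated assumptions: a linear baseline f_y(X) = X @ beta_y, Gaussian noise for the continuous case, and a gamma parameterization with mean = shape * scale. The function `sample_outcome` and all parameter values are hypothetical and may differ from what the library does internally.

```python
import numpy as np

def sample_outcome(eta, outcome_type, rng, sigma_y=1.0, gamma_shape=2.0):
    """Draw Y given the linear predictor eta (on the link scale)."""
    if outcome_type == "continuous":
        # Identity link; Gaussian noise is an assumption of this sketch.
        return eta + rng.normal(scale=sigma_y, size=eta.shape)
    if outcome_type == "binary":
        # Logit link: P(Y=1) = 1 / (1 + exp(-eta)).
        return rng.binomial(1, 1.0 / (1.0 + np.exp(-eta)))
    if outcome_type == "poisson":
        # Log link: E[Y] = exp(eta).
        return rng.poisson(np.exp(eta))
    if outcome_type == "gamma":
        # Log link; scale chosen so that shape * scale = exp(eta) (assumed).
        return rng.gamma(shape=gamma_shape, scale=np.exp(eta) / gamma_shape)
    raise ValueError(f"unknown outcome_type: {outcome_type!r}")

rng = np.random.default_rng(0)
n, k, K = 500, 4, 3
X = rng.normal(size=(n, k))
U = rng.normal(size=n)
D = np.eye(K, dtype=int)[rng.integers(0, K, size=n)]  # random one-hot, for illustration

alpha_y, u_strength_y = 0.1, 0.0
beta_y = rng.normal(scale=0.3, size=k)
theta = np.array([0.0, 0.5, 1.0])   # control effect is 0
tau = np.zeros((n, K))              # tau_k(X) = 0: homogeneous effects

# eta = alpha_y + f_y(X) + u_strength_y * U + sum_k D_k * (theta_k + tau_k(X))
eta = alpha_y + X @ beta_y + u_strength_y * U + (D * (theta + tau)).sum(axis=1)

Y = {t: sample_outcome(eta, t, rng)
     for t in ("continuous", "binary", "poisson", "gamma")}
```

Because D is one-hot, the sum over k selects exactly one theta_k + tau_k(X) per row.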
Parameters
- n_treatments : int, default=3
Number of treatment classes (including control). Column 0 is treated as control. Generated treatment columns are a full one-hot encoding that sums to 1.
- d_names : list of str, optional
Names of treatment columns. If None, uses ["d_0", "d_1", …].
- theta : float or array-like, optional
Constant treatment effects on the link scale for each class. If scalar, applied to all non-control classes (control effect = 0). If length K-1, prepends 0 for control. If length K, uses as provided.
- tau : callable or list of callables, optional
Heterogeneous effects for each class. If callable, applied to non-control classes. Effects are additive with theta on the link scale: tau_link_k(X) = theta_k + tau_k(X).
- beta_y : array-like, optional
Linear coefficients for baseline outcome f_y(X).
- g_y : callable, optional
Nonlinear baseline outcome function g_y(X).
- alpha_y : float, default=0.0
Outcome intercept on link scale.
- sigma_y : float, default=1.0
Std dev for continuous outcomes.
- outcome_type : {"continuous", "binary", "poisson", "gamma"}, default="continuous"
Outcome family.
- gamma_shape : float, default=2.0
Shape parameter for gamma outcomes.
- u_strength_y : float, default=0.0
Strength of unobserved confounder in outcome.
- confounder_specs : list of dict, optional
Schema for generating confounders (same format as CausalDatasetGenerator).
- k : int, default=5
Number of confounders if confounder_specs is None.
- x_sampler : callable, optional
Custom sampler (n, k, seed) -> X ndarray.
- use_copula : bool, default=False
If True and confounder_specs is provided, use a Gaussian copula for X.
- copula_corr : array-like, optional
Correlation matrix for copula.
- beta_d : array-like or list, optional
Linear coefficients for treatment assignment. If array of shape (k,), applies to all non-control classes. If shape (K, k), uses per class.
- g_d : callable or list of callables, optional
Nonlinear treatment score per class. If callable, applies to non-control classes.
- alpha_d : float or array-like, optional
Intercepts for treatment scores. If scalar, applies to non-control classes.
- u_strength_d : float or array-like, default=0.0
Unobserved confounder strength in treatment assignment. If scalar, interpreted as [0, c, c, …] so latent U perturbs non-control classes relative to control (and does not cancel in softmax).
- propensity_sharpness : float, default=1.0
Scales treatment scores to adjust overlap.
- target_d_rate : array-like, optional
Target marginal class probabilities (length K). Calibrates alpha_d using iterative scaling (approximate when u_strength_d != 0).
- include_oracle : bool, default=True
Whether to include oracle columns for propensities and potential outcomes.
- seed : int, optional
Random seed.
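The broadcasting rules stated for theta and for a scalar u_strength_d can be sketched as follows. The helper names `expand_theta` and `expand_u_strength_d` are hypothetical and written only to mirror the rules in the parameter descriptions; they are not part of the documented API. The last lines show why the scalar is expanded to [0, c, c, …] rather than [c, c, c, …]: adding the same shift to every class score leaves the softmax unchanged.

```python
import numpy as np

def expand_theta(theta, n_treatments):
    """Return a length-K effect vector following the documented theta rules."""
    theta = np.atleast_1d(np.asarray(theta, dtype=float))
    K = n_treatments
    if theta.size == 1:
        # Scalar: same effect for every non-control class, 0 for control.
        return np.concatenate([[0.0], np.full(K - 1, theta[0])])
    if theta.size == K - 1:
        # Length K-1: prepend 0 for control.
        return np.concatenate([[0.0], theta])
    if theta.size == K:
        # Length K: used as provided.
        return theta
    raise ValueError(f"theta must be scalar or have length {K - 1} or {K}")

def expand_u_strength_d(c, n_treatments):
    """Scalar c becomes [0, c, c, ...]: control is unaffected by U."""
    return np.concatenate([[0.0], np.full(n_treatments - 1, float(c))])

def softmax(scores):
    """Row-wise softmax with max-subtraction for numerical stability."""
    expd = np.exp(scores - scores.max(axis=1, keepdims=True))
    return expd / expd.sum(axis=1, keepdims=True)

print(expand_theta(0.5, 3))          # [0.  0.5 0.5]
print(expand_theta([1.0, 2.0], 3))   # [0. 1. 2.]
print(expand_u_strength_d(0.3, 3))   # [0.  0.3 0.3]

scores = np.array([[0.1, -0.2, 0.4]])
U = 1.7
# Same shift on every class cancels in the softmax ...
same_shift = softmax(scores + 0.3 * U)
# ... while shifting only non-control classes changes the probabilities.
noncontrol_shift = softmax(scores + expand_u_strength_d(0.3, 3) * U)
```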