generate_classic_rct
Generate a classic RCT dataset with three binary confounders: platform_ios, country_usa, and source_paid.
Parameters
- n (int) – Number of samples to generate.
- split (float) – Proportion of samples assigned to the treatment group.
- random_state (int) – Random seed for reproducibility.
- outcome_params (dict) – Parameters defining baseline rates/means and treatment effects, e.g. {"p": {"A": 0.1, "B": 0.15}} for a binary outcome.
- add_pre (bool) – Whether to generate a pre-period covariate (y_pre).
- beta_y (array-like) – Linear coefficients for the confounders in the outcome model.
- outcome_depends_on_x (bool) – Whether to add default effects for the confounders if beta_y is None.
- prognostic_scale (float) – Scale of the nonlinear prognostic signal (passed to generate_rct).
- pre_corr (float) – Target correlation for y_pre (passed to generate_rct).
- return_causal_data (bool) – Whether to return a CausalData object instead of a pandas.DataFrame.
- add_ancillary (bool) – Whether to add standard ancillary columns (age, platform, etc.).
- deterministic_ids (bool) – Whether to generate deterministic user IDs.
- include_oracle (bool) – Whether to include oracle ground-truth columns such as 'cate' and 'propensity'.
- **kwargs – Additional arguments passed to generate_rct.
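As a rough illustration of how n, split, random_state, and a binary outcome_params interact, the following standard-library sketch draws a randomized assignment and a binary outcome. It is not the library's implementation: the function name sketch_classic_rct, the column names user_id, group, and y, the 50% prevalence of each confounder, and the reading of "A" as control and "B" as treatment are all assumptions made for the example.

```python
import random


def sketch_classic_rct(n=1000, split=0.5, random_state=42, p=None):
    """Illustrative stand-in only; NOT the library's generate_classic_rct."""
    # Assumed reading of outcome_params["p"]: per-group conversion rates,
    # with "A" as control and "B" as treatment.
    p = p or {"A": 0.10, "B": 0.15}
    rng = random.Random(random_state)  # seeded, so repeated calls match
    rows = []
    for i in range(n):
        # Randomized assignment: treatment with probability `split`.
        group = "B" if rng.random() < split else "A"
        row = {
            "user_id": i,  # hypothetical ID column
            "group": group,
            # Three binary confounders; the 0.5 prevalence is an assumption.
            "platform_ios": int(rng.random() < 0.5),
            "country_usa": int(rng.random() < 0.5),
            "source_paid": int(rng.random() < 0.5),
        }
        # Binary outcome drawn at the rate configured for the unit's group.
        row["y"] = int(rng.random() < p[group])
        rows.append(row)
    return rows


rows = sketch_classic_rct(n=20000, split=0.5, random_state=7)
treated = [r for r in rows if r["group"] == "B"]
control = [r for r in rows if r["group"] == "A"]
rate_b = sum(r["y"] for r in treated) / len(treated)
rate_a = sum(r["y"] for r in control) / len(control)
print(f"share treated: {len(treated) / len(rows):.3f}")
print(f"conversion A: {rate_a:.3f}, B: {rate_b:.3f}")
```

Because assignment is independent of the three confounders, the difference between the two printed conversion rates should land near the configured gap of 0.05 for a sample this large. The sketch leaves out options such as beta_y, add_pre, and include_oracle.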
Returns
DataFrame or CausalData – Synthetic classic RCT dataset.