MultiCausalData
Bases: BaseModel
Data contract for cross-sectional causal data with multi-class one-hot treatments.
Parameters
- df (DataFrame) – The DataFrame containing the causal data.
- outcome (str) – The name of the outcome column.
- treatment_names (List[str]) – The names of the treatment columns.
- confounders (List[str]) – The names of the confounder columns, by default [].
- user_id (Optional[str]) – The name of the user ID column, by default None.
- control_treatment (str) – Name of the control/baseline treatment column.
Notes
This class enforces several constraints on the data, including:
- Maximum number of treatment_names (default 15).
- No duplicate column names in the input DataFrame.
- Disjoint roles for columns (outcome, treatment_names, confounders, user_id).
- Non-empty normalized names for outcome and user_id (if provided).
- Existence of all specified columns in the DataFrame.
- Numeric or boolean types for outcome and confounders.
- Finite values for outcome, confounders, and treatment_names.
- Non-constant values for outcome, treatment_names, and confounders.
- No NaN values in used columns.
- Binary (0/1) encoding for treatment columns.
- One-hot treatment assignment (exactly one active treatment per row).
- A stable control treatment in position 0.
- No two columns with identical values.
- Unique values for user_id (if specified).
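The treatment-related constraints above (binary encoding, one-hot assignment, control in position 0) can be illustrated with a small pandas sketch. This is not the library's validation code: `check_one_hot` is a hypothetical helper, the column names are made up, and "position 0" is interpreted here as "the control column is listed first in `treatment_names`", which is an assumption.

```python
from typing import List

import pandas as pd


def check_one_hot(df: pd.DataFrame, treatment_names: List[str], control_treatment: str) -> None:
    """Hypothetical helper: raise ValueError if the treatment columns break the contract."""
    # Assumption: "control in position 0" means it is listed first.
    if treatment_names[0] != control_treatment:
        raise ValueError("control treatment must be listed first")
    block = df[treatment_names]
    # Binary encoding: every cell must be 0 or 1.
    if not block.isin([0, 1]).all().all():
        raise ValueError("treatment columns must be 0/1 encoded")
    # One-hot assignment: exactly one active treatment per row.
    if not (block.sum(axis=1) == 1).all():
        raise ValueError("each row must have exactly one active treatment")


good = pd.DataFrame({"control": [1, 0, 0], "arm_a": [0, 1, 0], "arm_b": [0, 0, 1]})
check_one_hot(good, ["control", "arm_a", "arm_b"], "control")  # passes silently

bad = good.assign(arm_a=[1, 1, 0])  # first row now has two active treatments
try:
    check_one_hot(bad, ["control", "arm_a", "arm_b"], "control")
except ValueError as exc:
    error_message = str(exc)
```

Running a check like this before constructing the data object can help locate which constraint a DataFrame violates.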
Functions
- from_df – Create a MultiCausalData instance from a pandas DataFrame.
- get_df – Get a subset of the underlying DataFrame.
FLOAT_TOL
MAX_TREATMENTS
X
Return the confounder columns as a pandas DataFrame.
Returns
DataFrame – The confounder columns.
confounders
control_treatment
df
from_df
Create a MultiCausalData instance from a pandas DataFrame.
Parameters
- df (DataFrame) – The input DataFrame.
- outcome (str) – The name of the outcome column.
- treatment_names (Union[str, List[str]]) – The name(s) of the treatment column(s).
- confounders (Union[str, List[str]]) – The name(s) of the confounder column(s), by default None.
- user_id (str) – The name of the user ID column, by default None.
- control_treatment (str) – Name of the control treatment column.
- **kwargs (Any) – Additional keyword arguments passed to the constructor.
Returns
MultiCausalData – An instance of MultiCausalData.
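Because from_df accepts either a single name or a list of names for treatment_names and confounders, some normalization to a list presumably happens before the constructor is called. The sketch below shows one way such coercion could look. `as_name_list` is a hypothetical helper, not part of the documented API, and the library's actual handling may differ.

```python
from typing import List, Optional, Union


def as_name_list(names: Optional[Union[str, List[str]]]) -> List[str]:
    """Hypothetical helper: coerce a name, a list of names, or None to a list."""
    if names is None:
        return []
    if isinstance(names, str):
        # A bare string is one column name, not a sequence of characters.
        return [names]
    return list(names)


single = as_name_list("age")
several = as_name_list(["age", "tenure"])
missing = as_name_list(None)
```

The explicit `str` check matters because a string is itself iterable; without it, `list("age")` would split the name into characters.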
get_df
Get a subset of the underlying DataFrame.
Parameters
- columns (List[str]) – Specific columns to include, by default None.
- include_outcome (bool) – Whether to include the outcome column, by default True.
- include_confounders (bool) – Whether to include confounder columns, by default True.
- include_treatments (bool) – Whether to include treatment columns, by default True.
- include_user_id (bool) – Whether to include the user ID column, by default False.
Returns
DataFrame – A copy of the requested DataFrame subset.
Raises
ValueError – If any of the requested columns do not exist.
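Two documented behaviours of get_df are that it returns a copy and that it raises ValueError for unknown columns. The pandas sketch below illustrates only those two points. `select_columns` is a hypothetical stand-in, the column names are invented, and the interaction between `columns` and the `include_*` flags is not modelled because the text above does not specify it.

```python
from typing import List, Optional

import pandas as pd


def select_columns(df: pd.DataFrame, columns: Optional[List[str]] = None) -> pd.DataFrame:
    """Hypothetical helper: return a copy of the requested columns."""
    requested = list(df.columns) if columns is None else list(columns)
    missing = [name for name in requested if name not in df.columns]
    if missing:
        raise ValueError(f"Columns not found in DataFrame: {missing}")
    # .copy() keeps later edits to the subset from touching the source frame.
    return df[requested].copy()


data = pd.DataFrame({"y": [1.0, 2.0], "age": [30, 41], "control": [1, 0], "arm_a": [0, 1]})
subset = select_columns(data, ["y", "age"])
subset.loc[0, "y"] = 99.0  # modifies only the copy

try:
    select_columns(data, ["y", "not_a_column"])
except ValueError as exc:
    missing_error = str(exc)
```

Returning a copy means callers can transform the subset freely without invalidating the checks that were run on the original data.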
model_config
outcome
treatment
Return the single treatment column as a pandas Series.
Returns
Series – The treatment column.
Raises
AttributeError – If there is more than one treatment column.
treatment_names
treatments
Return the treatment columns as a pandas DataFrame.
Returns
DataFrame – The treatment columns.