API Reference

causaldata

Reference details for causaldata in causalis.dgp.

Causalis dataclass for storing a cross-sectional DataFrame and column metadata for causal inference.

Classes
  • CausalData – Container for causal inference datasets.
CausalData

Bases: BaseModel

Container for causal inference datasets.

Wraps a pandas DataFrame and stores the names of the treatment, outcome, and optional confounder columns. The stored DataFrame is restricted to only those columns. Uses Pydantic for validation and serves as a data contract.

Attributes
  • df (DataFrame) – The DataFrame containing the data, restricted to the outcome, treatment, and confounder columns. NaN values are not allowed in the used columns.
  • treatment_name (str) – Column name representing the treatment variable.
  • outcome_name (str) – Column name representing the outcome variable.
  • confounders_names (List[str]) – Names of the confounder columns (may be empty).
  • user_id_name (str, optional) – Column name representing the unique identifier for each observation/user.
Functions
  • from_df – Friendly constructor for CausalData.
  • get_df – Get a DataFrame with specified columns.
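The contract described above (keep only the role columns, reject NaN in them) can be illustrated with plain pandas. This is a minimal sketch, not the library's implementation: the helper `restrict_to_roles`, the column order, and the use of `ValueError` are all assumptions made for illustration.

```python
import pandas as pd


def restrict_to_roles(df, treatment, outcome, confounders=()):
    """Hypothetical stand-in for the documented contract (not causalis code)."""
    used = [outcome, treatment, *confounders]
    missing = [c for c in used if c not in df.columns]
    if missing:
        raise ValueError(f"Columns not found: {missing}")
    # Keep only the role columns; work on a copy so the input is untouched.
    restricted = df[used].copy()
    # Assumption: only the used columns are checked for missing values.
    if restricted.isna().any().any():
        raise ValueError("NaN values are not allowed in the used columns")
    return restricted


raw = pd.DataFrame(
    {
        "y": [1.0, 2.5, 0.3],
        "d": [0, 1, 1],
        "age": [31, 45, 27],
        "notes": ["a", None, "c"],  # unused column, dropped before the NaN check
    }
)
restricted = restrict_to_roles(raw, treatment="d", outcome="y", confounders=["age"])
print(list(restricted.columns))  # ['y', 'd', 'age']
```

In this sketch the missing value in `notes` is harmless because that column has no role and is dropped before the check.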
X

Design matrix of confounders.

Returns
  • DataFrame – The DataFrame containing only confounder columns.
confounders

List of confounder column names.

Returns
  • List[str] – Names of the confounder columns.
confounders_names
df
from_df

Friendly constructor for CausalData.

Parameters
  • df (DataFrame) – The DataFrame containing the data.
  • treatment (str) – Column name representing the treatment variable.
  • outcome (str) – Column name representing the outcome variable.
  • confounders (Union[str, List[str]]) – Column name(s) representing the confounders/covariates.
  • user_id (str) – Column name representing the unique identifier for each observation/user.
  • **kwargs (Any) – Additional arguments passed to the Pydantic model constructor.
Returns
  • CausalData – The constructed CausalData instance.
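Because `confounders` is typed as `Union[str, List[str]]`, a single column name presumably gets treated as a one-element list. The sketch below shows one way such normalization could look. The helper `normalize_confounders` is hypothetical, and treating `None` as "no confounders" is an assumption based on the note that the confounder list may be empty.

```python
from typing import List, Optional, Union


def normalize_confounders(
    confounders: Optional[Union[str, List[str]]],
) -> List[str]:
    """Hypothetical helper: coerce the documented input forms to a list."""
    if confounders is None:
        return []  # assumption: omitted confounders mean an empty list
    if isinstance(confounders, str):
        return [confounders]  # a single column name becomes a one-item list
    return list(confounders)


print(normalize_confounders("age"))              # ['age']
print(normalize_confounders(["age", "income"]))  # ['age', 'income']
print(normalize_confounders(None))               # []
```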
get_df

Get a DataFrame with specified columns.

Parameters
  • columns (List[str]) – Specific column names to include.
  • include_treatment (bool) – Whether to include the treatment column.
  • include_outcome (bool) – Whether to include the outcome column.
  • include_confounders (bool) – Whether to include confounder columns.
  • include_user_id (bool) – Whether to include the user_id column.
Returns
  • DataFrame – A copy of the internal DataFrame with selected columns.
Raises
  • ValueError – If any specified columns do not exist.
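The selection behaviour documented for get_df can be approximated in plain pandas as follows. This is a sketch only: the helper `select_columns`, the `roles` dictionary, the flag defaults, and the rule that explicit `columns` are combined with the flagged role columns are all assumptions, since the reference does not state them.

```python
import pandas as pd


def select_columns(
    df,
    roles,
    columns=None,
    include_treatment=True,   # defaults here are assumptions
    include_outcome=True,
    include_confounders=True,
    include_user_id=False,
):
    """Hypothetical helper mimicking the documented parameters of get_df."""
    selected = list(columns or [])
    if include_treatment:
        selected.append(roles["treatment"])
    if include_outcome:
        selected.append(roles["outcome"])
    if include_confounders:
        selected.extend(roles["confounders"])
    if include_user_id and roles.get("user_id"):
        selected.append(roles["user_id"])
    # Drop duplicates while preserving the order of first appearance.
    selected = list(dict.fromkeys(selected))
    missing = [c for c in selected if c not in df.columns]
    if missing:
        raise ValueError(f"Columns not found: {missing}")
    # Return a copy so callers cannot mutate the stored frame.
    return df[selected].copy()


data = pd.DataFrame(
    {"uid": [1, 2, 3], "y": [1.0, 2.5, 0.3], "d": [0, 1, 1], "age": [31, 45, 27]}
)
roles = {"treatment": "d", "outcome": "y", "confounders": ["age"], "user_id": "uid"}

subset = select_columns(data, roles, include_outcome=False, include_user_id=True)
print(list(subset.columns))  # ['d', 'age', 'uid']

subset.loc[0, "age"] = 99    # edits the copy only
print(data.loc[0, "age"])    # 31
```

Returning a copy matches the documented return value ("a copy of the internal DataFrame"), so edits to the result do not propagate back to the source frame.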
model_config
outcome

Outcome column as a Series.

Returns
  • Series – The outcome column.
outcome_name
treatment

Treatment column as a Series.

Returns
  • Series – The treatment column.
treatment_name
user_id

user_id column as a Series.

Returns
  • Series – The user_id column.
user_id_name
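
In pandas terms, the X, outcome, and treatment accessors documented above correspond roughly to the selections below. The frame and column names are made up for illustration, and the sketch only mirrors the documented return types (a DataFrame for X, a Series for outcome and treatment); it is not taken from the library's source.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "y": [1.0, 2.5, 0.3],
        "d": [0, 1, 1],
        "age": [31, 45, 27],
        "income": [40.0, 55.5, 38.2],
    }
)
treatment_name, outcome_name = "d", "y"
confounders_names = ["age", "income"]

X = df[confounders_names]       # list selection yields a DataFrame
outcome = df[outcome_name]      # single-label selection yields a Series
treatment = df[treatment_name]  # single-label selection yields a Series

print(X.shape, outcome.name, treatment.name)  # (3, 2) y d
```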