API Reference

cate

Reference details for cate in causalis.scenarios.

cate

Modules
  • blp
  • cate – Conditional Average Treatment Effect (CATE) inference methods for causalis.
  • gate – Group Average Treatment Effect (GATE) inference methods for causalis.
Classes
  • BLP – Best linear predictor (BLP) with orthogonal signals.
BLP

Best linear predictor (BLP) with orthogonal signals. Mainly used for CATE and GATE estimation for IRM models.

The Best Linear Predictor (BLP) targets the coefficient vector :math:`\beta_0` that minimizes the mean squared error between the true treatment effect function :math:`\tau(X)` and a linear combination of basis functions :math:`b(X)`:

.. math:: \beta_0 = \arg\min_{\beta \in \mathbb{R}^K} \mathbb{E}\Big[\big(\tau(X) - b(X)^\top \beta \big)^2\Big].

This is characterized by the moment condition:

.. math:: \mathbb{E}[b(X)\psi] = \mathbb{E}[b(X)b(X)^\top]\beta_0,

where :math:`\psi` is the orthogonal signal such that :math:`\mathbb{E}[\psi \mid X] = \tau(X)`.

The estimator is obtained via OLS of the orthogonal signal on the basis:

.. math:: \hat{\beta} = (B^\top B)^{-1}B^\top\psi.
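The closed-form estimator above can be sketched with plain NumPy. This is an illustration only, not the library's implementation: the data are synthetic, and the basis (an intercept plus one linear term) is an assumed choice.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs = 500
x = rng.uniform(-1.0, 1.0, size=n_obs)

# Basis b(X) = (1, x): an intercept plus one linear term (illustrative choice).
B = np.column_stack([np.ones(n_obs), x])

# Synthetic orthogonal signal with E[psi | X] = 1 + 2x, plus noise.
psi = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=n_obs)

# Solve the normal equations (B'B) beta = B'psi instead of inverting B'B.
beta_hat = np.linalg.solve(B.T @ B, B.T @ psi)

# Cross-check against NumPy's least-squares solver.
beta_lstsq, *_ = np.linalg.lstsq(B, psi, rcond=None)
```

Solving the normal equations with ``np.linalg.solve`` is numerically preferable to forming the explicit inverse, although both match the formula.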

GATE (Group Average Treatment Effect)

When is_gate=True, the basis consists of group indicators (dummy variables). In this case, the BLP coefficients correspond to the group means of the orthogonal signal, which approximate the GATEs:

.. math:: \hat{\beta}_k = \frac{1}{n_k}\sum_{i:G_i=k}\psi_i \approx \text{GATE}_k.
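The equivalence between OLS on group indicators and group means can be checked numerically. The sketch below uses synthetic data and hypothetical names (``groups``, ``psi``); it assumes mutually exclusive groups and a basis without an intercept.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n_obs = 300

# Three mutually exclusive groups and a synthetic orthogonal signal
# whose mean differs by group.
groups = pd.Series(rng.integers(0, 3, size=n_obs), name="group")
psi = rng.normal(loc=groups.to_numpy(), scale=1.0)

# Dummy basis: one indicator column per group, no intercept.
B = pd.get_dummies(groups, prefix="g").to_numpy(dtype=float)

# OLS coefficients of psi on the dummy basis.
beta_hat = np.linalg.solve(B.T @ B, B.T @ psi)

# Group means of psi, ordered by group label.
group_means = pd.Series(psi).groupby(groups.to_numpy()).mean().to_numpy()
```

Because the indicator columns are orthogonal, ``B.T @ B`` is diagonal with the group sizes on the diagonal, which is why the OLS solution reduces to the per-group averages.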

Confidence Intervals

Confidence intervals for any linear combination :math:`\hat{g} = A\hat{\beta}` are computed using the estimated covariance matrix :math:`\widehat{\Omega}`:

.. math:: \widehat{\operatorname{Var}}(\hat{g}) \approx A\widehat{\Omega}A^\top.

Pointwise and joint confidence intervals (via Gaussian multiplier bootstrap) are supported.
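As a rough sketch of the pointwise case, the variance formula above can be evaluated by hand. The example assumes that the covariance matrix is the HC0 sandwich estimator of the coefficient covariance and that intervals use normal quantiles; the data and the evaluation matrix ``A`` are made up for illustration.

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(2)
n_obs = 400
x = rng.uniform(-1.0, 1.0, size=n_obs)
B = np.column_stack([np.ones(n_obs), x])
psi = 0.5 + 1.5 * x + rng.normal(size=n_obs)

# OLS fit and residuals.
bread = np.linalg.inv(B.T @ B)
beta_hat = bread @ B.T @ psi
resid = psi - B @ beta_hat

# HC0 sandwich: (B'B)^-1 [sum_i e_i^2 b_i b_i'] (B'B)^-1.
meat = (B * resid[:, None] ** 2).T @ B
omega_hat = bread @ meat @ bread

# Evaluate the fitted linear predictor at three new points.
A = np.column_stack([np.ones(3), np.array([-0.5, 0.0, 0.5])])
g_hat = A @ beta_hat
se = np.sqrt(np.diag(A @ omega_hat @ A.T))

# Pointwise two-sided intervals at level 1 - alpha.
alpha = 0.05
z = NormalDist().inv_cdf(1 - alpha / 2)
ci_lower = g_hat - z * se
ci_upper = g_hat + z * se
```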

Parameters
  • orth_signal (:class:`numpy.ndarray`) – The orthogonal signal to be predicted. Must have shape (n_obs,), where n_obs is the number of observations.
  • basis (:class:`pandas.DataFrame`) – The basis for estimating the best linear predictor. Must have shape (n_obs, d), where n_obs is the number of observations and d is the number of predictors.
  • is_gate (bool) – Indicates whether the basis is constructed for GATEs (dummy-basis). Default is False.
Functions
  • confint – Confidence intervals for the BLP model.
  • fit – Estimate BLP models.
basis

Basis.

blp_model

Best-Linear-Predictor model.

blp_omega

Covariance matrix.

confint

Confidence intervals for the BLP model.

Parameters
  • basis (:class:`pandas.DataFrame`) – The basis for constructing the confidence interval. Must have the same form as the basis used at construction. If None, the result depends on how the model was built: for a GATE basis, the GATEs are returned; otherwise, pointwise confidence intervals for the basis coefficients are returned. Default is None.
  • joint (bool) – Indicates whether joint confidence intervals are computed. Default is False.
  • alpha (float) – The significance level. Default is 0.05.
  • n_rep_boot (int) – The number of bootstrap repetitions (only relevant for joint confidence intervals). Default is 500.
Returns
  • df_ci (DataFrame) – A data frame with the confidence interval(s).
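The joint option can be illustrated with a hand-rolled Gaussian multiplier bootstrap. This is a sketch under stated assumptions, not the library's code: ``beta_hat``, ``omega_hat`` and ``A`` are made-up inputs, and the critical value is taken as the (1 - alpha) quantile of the maximum absolute standardized draw.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical fitted coefficients and their covariance matrix.
beta_hat = np.array([1.0, 2.0])
omega_hat = np.array([[0.04, 0.01],
                      [0.01, 0.09]])

# Rows of A are the basis evaluated at five points of interest.
A = np.column_stack([np.ones(5), np.linspace(-1.0, 1.0, 5)])
g_hat = A @ beta_hat
se = np.sqrt(np.diag(A @ omega_hat @ A.T))

n_rep_boot = 500
alpha = 0.05

# Draw coefficient perturbations from N(0, omega_hat) via Cholesky.
chol = np.linalg.cholesky(omega_hat)
draws = chol @ rng.standard_normal((2, n_rep_boot))

# Standardized perturbations of each linear combination.
t_stats = (A @ draws) / se[:, None]

# Critical value: (1 - alpha) quantile of the max |t| across rows.
max_abs_t = np.abs(t_stats).max(axis=0)
crit_joint = np.quantile(max_abs_t, 1 - alpha)

joint_lower = g_hat - crit_joint * se
joint_upper = g_hat + crit_joint * se
```

The joint critical value is at least as large as any single-row pointwise value from the same draws, so the joint band is never narrower than the pointwise intervals.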
fit

Estimate BLP models.

Parameters
  • cov_type (str) – The covariance type to be used in the estimation. Default is 'HC0'. See :meth:`statsmodels.regression.linear_model.OLS.fit` for more information.
  • diagnostic_data (bool) – Whether to include diagnostic data. (Currently not used for BLP.)
  • **kwargs – Additional keyword arguments to be passed to :meth:`statsmodels.regression.linear_model.OLS.fit`.
Returns
orth_signal

Orthogonal signal.

summary

A summary for the best linear predictor effect after calling :meth:`fit`.

blp
Classes
  • BLP – Best linear predictor (BLP) with orthogonal signals.

cate

Conditional Average Treatment Effect (CATE) inference methods for causalis.

This submodule provides methods for estimating conditional average treatment effects.

Modules
  • cate_esimand – IRM-based implementation for estimating CATE (per-observation orthogonal signals).
cate_esimand

IRM-based implementation for estimating CATE (per-observation orthogonal signals).

This module provides a function that, given a CausalData object, fits the internal IRM model and augments the data with a new column 'cate' that contains the orthogonal signals (an estimate of the conditional average treatment effect for each unit).

Functions
  • cate_esimand – Estimate per-observation CATEs using IRM and return a DataFrame with a new 'cate' column.
cate_esimand

Estimate per-observation CATEs using IRM and return a DataFrame with a new 'cate' column.

Parameters
  • data (CausalData) – A CausalData object with defined outcome (outcome), treatment (binary 0/1), and confounders.
  • ml_g (estimator) – ML learner for outcome regression g(D, X) = E[Y | D, X] supporting fit/predict. Defaults to CatBoostRegressor if None.
  • ml_m (classifier) – ML learner for propensity m(X) = P[D=1 | X] supporting fit/predict_proba. Defaults to CatBoostClassifier if None.
  • n_folds (int) – Number of folds for cross-fitting.
  • n_rep (int) – Number of repetitions for sample splitting.
  • use_blp (bool) – If True and X_new is provided, fits a BLP on the orthogonal signal and predicts CATE for X_new. If False (default), uses the in-sample orthogonal signal and appends it to data.
  • X_new (DataFrame) – New covariate matrix for out-of-sample CATE prediction via best linear predictor. Must contain the same feature columns as the confounders in data.
Returns
  • DataFrame – If use_blp is False: returns a copy of data with a new column 'cate'. If use_blp is True and X_new is provided: returns a DataFrame with 'cate' column for X_new rows.
Raises
  • ValueError – If treatment is not binary 0/1 or required metadata is missing.
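The reference does not spell out how the per-observation orthogonal signal is formed. A common choice for IRM-type estimators is the doubly robust (AIPW) score, sketched below on synthetic data. Treat the formula, the propensity clipping bounds, and all variable names as assumptions rather than the library's exact behavior; the nuisance values here are the true functions, not cross-fitted predictions.

```python
import numpy as np

rng = np.random.default_rng(5)
n_obs = 2000
x = rng.normal(size=n_obs)

# Propensity m(X) = P[D=1 | X] and a binary 0/1 treatment.
m_hat = 1.0 / (1.0 + np.exp(-x))
d = rng.binomial(1, m_hat)

# True effect tau(X) = 1 + 0.5 x and outcome Y.
tau = 1.0 + 0.5 * x
y = x + d * tau + rng.normal(size=n_obs)

# Outcome regressions g(0, X) and g(1, X).
g0_hat = x
g1_hat = x + tau

# Clip propensities away from 0 and 1 to keep the weights bounded.
m_clipped = np.clip(m_hat, 0.01, 0.99)

# AIPW signal: plug-in difference plus inverse-probability-weighted residuals.
psi = (g1_hat - g0_hat
       + d * (y - g1_hat) / m_clipped
       - (1 - d) * (y - g0_hat) / (1 - m_clipped))
```

Averaging ``psi`` gives an estimate of the average treatment effect, while regressing it on a basis yields the BLP of the CATE.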
gate

Group Average Treatment Effect (GATE) inference methods for causalis.

This submodule provides methods for estimating group average treatment effects.

Modules
  • gate_esimand – Group Average Treatment Effect (GATE) estimation using local DML IRM and BLP.
gate_esimand

Group Average Treatment Effect (GATE) estimation using local DML IRM and BLP.

Functions
  • gate_esimand – Estimate Group Average Treatment Effects (GATEs).
gate_esimand

Estimate Group Average Treatment Effects (GATEs).

If groups is None, observations are grouped by quantiles of the plugin CATE proxy (g1_hat - g0_hat).
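The default grouping can be pictured as quantile binning of the plugin proxy. The sketch below uses ``pandas.qcut`` on synthetic predictions; the number of groups (``n_groups``) and the column prefix are hypothetical choices, not values taken from the library.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(6)
n_obs = 200

# Hypothetical outcome-regression predictions under control and treatment.
g0_hat = rng.normal(size=n_obs)
g1_hat = g0_hat + rng.normal(loc=1.0, size=n_obs)

# Plugin CATE proxy.
cate_proxy = g1_hat - g0_hat

# Assign each observation to a quantile bin of the proxy.
n_groups = 5  # assumed for illustration
group_labels = pd.Series(pd.qcut(cate_proxy, q=n_groups, labels=False),
                         name="group")

# Dummy basis suitable for a GATE-style BLP (one indicator per group).
basis = pd.get_dummies(group_labels, prefix="Group")
```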