Causalis

`cate`

Modules

blp – Best linear predictor (BLP) with orthogonal signals.
cate – Conditional Average Treatment Effect (CATE) inference methods for causalis.
gate – Group Average Treatment Effect (GATE) inference methods for causalis.

Classes

BLP – Best linear predictor (BLP) with orthogonal signals.

`BLP`

Best linear predictor (BLP) with orthogonal signals. Mainly used for CATE and GATE estimation for IRM models.

The Best Linear Predictor (BLP) targets the coefficient vector :math:`\beta_0` that minimizes the mean squared error between the true treatment effect function :math:`\tau(X)` and a linear combination of basis functions :math:`b(X)`:

.. math:: \beta_0 = \arg\min_{\beta \in \mathbb{R}^K} \mathbb{E}\Big[\big(\tau(X) - b(X)^\top \beta \big)^2\Big].

This is characterized by the moment condition:

.. math:: \mathbb{E}[b(X)\psi] = \mathbb{E}[b(X)b(X)^\top]\beta_0,

where :math:`\psi` is the orthogonal signal such that :math:`\mathbb{E}[\psi \mid X] = \tau(X)`.

The estimator is obtained via OLS of the orthogonal signal on the basis:

.. math:: \hat{\beta} = (B^\top B)^{-1}B^\top\psi.
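The OLS step above can be sketched with NumPy alone. This is a minimal illustration, not the library's implementation: the data are simulated, and the names ``x``, ``B``, ``psi``, and ``beta_hat`` are assumptions chosen to mirror the notation of the formula.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs = 500

# Hypothetical covariate and basis b(X) = (1, X).
x = rng.uniform(-1.0, 1.0, size=n_obs)
B = np.column_stack([np.ones(n_obs), x])

# Simulated orthogonal signal: tau(X) = 1 + 2 X plus mean-zero noise,
# so that E[psi | X] = tau(X) holds by construction.
psi = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=n_obs)

# beta_hat = (B'B)^{-1} B'psi, solved without forming the inverse explicitly.
beta_hat = np.linalg.solve(B.T @ B, B.T @ psi)

# Cross-check against NumPy's least-squares solver.
beta_lstsq, *_ = np.linalg.lstsq(B, psi, rcond=None)
```

Because the simulated effect function is exactly linear in the basis, ``beta_hat`` should land close to the coefficients (1, 2) used to generate ``psi``.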

GATE (Group Average Treatment Effect)

When is_gate=True, the basis consists of group indicators (dummy variables). In this case, the BLP coefficients correspond to the group means of the orthogonal signal, which approximate the GATEs:

.. math:: \hat{\beta}_k = \frac{1}{n_k}\sum_{i:G_i=k}\psi_i \approx \text{GATE}_k.
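The equivalence between OLS on group indicators and group means can be checked on a toy example. The sketch below uses made-up values for the orthogonal signal and group labels; it assumes mutually exclusive groups and does not use the library itself.

```python
import numpy as np

# Hypothetical orthogonal signal and group labels for six observations.
psi = np.array([1.0, 3.0, 2.0, 6.0, 4.0, 8.0])
groups = np.array([0, 0, 1, 1, 1, 2])

# Dummy basis: one indicator column per group (mutually exclusive groups).
labels = np.unique(groups)
B = (groups[:, None] == labels[None, :]).astype(float)

# OLS coefficients of psi on the dummy basis.
beta_hat = np.linalg.solve(B.T @ B, B.T @ psi)

# Direct group means of psi for comparison.
group_means = np.array([psi[groups == k].mean() for k in labels])
```

With non-overlapping indicators, ``B.T @ B`` is diagonal with the group sizes on the diagonal, which is why the OLS solution reduces to the per-group averages.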

Confidence Intervals

Confidence intervals for any linear combination :math:`\hat{g} = A\hat{\beta}` are computed using the estimated covariance matrix :math:`\widehat{\Omega}`:

.. math:: \widehat{\operatorname{Var}}(\hat{g}) \approx A\widehat{\Omega}A^\top.

Pointwise and joint confidence intervals (via Gaussian multiplier bootstrap) are supported.
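As an illustration of the pointwise case, the sketch below builds an HC0 sandwich estimate of the covariance of the coefficients and forms normal-approximation intervals for a few linear combinations. It is a sketch under stated assumptions: the data are simulated, the matrix ``A`` is hypothetical, and treating :math:`\widehat{\Omega}` as the HC0 covariance of :math:`\hat{\beta}` is an assumption motivated by the default ``cov_type`` of ``fit``.

```python
from statistics import NormalDist

import numpy as np

rng = np.random.default_rng(1)
n_obs = 400

# Simulated basis b(X) = (1, X) and orthogonal signal.
x = rng.uniform(-1.0, 1.0, size=n_obs)
B = np.column_stack([np.ones(n_obs), x])
psi = 0.5 + 1.5 * x + rng.normal(size=n_obs)

# OLS fit and residuals.
bread = np.linalg.inv(B.T @ B)
beta_hat = bread @ B.T @ psi
resid = psi - B @ beta_hat

# HC0 sandwich: (B'B)^{-1} [sum_i e_i^2 b_i b_i'] (B'B)^{-1}.
meat = (B * resid[:, None] ** 2).T @ B
omega_hat = bread @ meat @ bread

# Hypothetical evaluation points x = -0.5, 0, 0.5 expressed in the basis.
A = np.array([[1.0, -0.5], [1.0, 0.0], [1.0, 0.5]])
g_hat = A @ beta_hat
se = np.sqrt(np.diag(A @ omega_hat @ A.T))

# Pointwise (1 - alpha) intervals from the normal approximation.
alpha = 0.05
z = NormalDist().inv_cdf(1 - alpha / 2)
ci_lower = g_hat - z * se
ci_upper = g_hat + z * se
```

Each row of ``A`` yields one interval; joint intervals would replace the normal quantile ``z`` with a larger, simulation-based critical value.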

Parameters

orth_signal (:class:`numpy.array`) – The orthogonal signal to be predicted. Must have shape (n_obs,), where n_obs is the number of observations.
basis (:class:`pandas.DataFrame`) – The basis for estimating the best linear predictor. Must have shape (n_obs, d), where n_obs is the number of observations and d is the number of predictors.
is_gate (bool) – Indicates whether the basis is constructed for GATEs (dummy-basis). Default is False.

Functions

confint – Confidence intervals for the BLP model.
fit – Estimate BLP models.

`basis`

Basis.

`blp_model`

Best-Linear-Predictor model.

`blp_omega`

Covariance matrix.

`confint`

Confidence intervals for the BLP model.

Parameters

basis (:class:`pandas.DataFrame`) – The basis for constructing the confidence interval. Must have the same form as the basis used at construction. If None, the result depends on is_gate: for a GATE basis, the GATEs are returned; otherwise, pointwise confidence intervals for the basis coefficients are returned. Default is None.
joint (bool) – Indicates whether joint confidence intervals are computed. Default is False.
alpha (float) – The significance level. Default is 0.05.
n_rep_boot (int) – The number of bootstrap repetitions (only relevant for joint confidence intervals). Default is 500.

Returns

df_ci (DataFrame) – A data frame with the confidence interval(s).
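To make the roles of ``joint``, ``alpha``, and ``n_rep_boot`` concrete, here is a minimal sketch of one common form of the Gaussian multiplier bootstrap for a joint critical value. It is not the library's code: the data are simulated, ``A`` is a hypothetical matrix of linear combinations, and the influence-function construction is an assumption about how such a bootstrap is usually set up.

```python
import numpy as np

rng = np.random.default_rng(2)
n_obs, n_rep_boot, alpha = 400, 500, 0.05

# Simulated basis b(X) = (1, X) and orthogonal signal.
x = rng.uniform(-1.0, 1.0, size=n_obs)
B = np.column_stack([np.ones(n_obs), x])
psi = 0.5 + 1.5 * x + rng.normal(size=n_obs)

beta_hat = np.linalg.solve(B.T @ B, B.T @ psi)
resid = psi - B @ beta_hat

# Influence-function values of beta_hat, one row per observation.
J_inv = np.linalg.inv(B.T @ B / n_obs)
infl = (B * resid[:, None]) @ J_inv.T

# Influence of each linear combination g = A beta (hypothetical A).
A = np.array([[1.0, -0.5], [1.0, 0.0], [1.0, 0.5]])
infl_g = infl @ A.T
sigma = np.sqrt(np.mean(infl_g ** 2, axis=0))
se = sigma / np.sqrt(n_obs)

# Multiplier bootstrap: perturb the influence values with N(0, 1) weights
# and record the largest absolute studentized statistic in each repetition.
xi = rng.normal(size=(n_rep_boot, n_obs))
boot_t = (xi @ infl_g) / (np.sqrt(n_obs) * sigma)
max_abs_t = np.abs(boot_t).max(axis=1)
crit_joint = np.quantile(max_abs_t, 1 - alpha)

# Joint band: the same critical value is applied to every component.
g_hat = A @ beta_hat
joint_lower = g_hat - crit_joint * se
joint_upper = g_hat + crit_joint * se
```

The joint critical value is typically larger than the pointwise normal quantile, so the resulting band is wider; more bootstrap repetitions reduce the simulation noise in ``crit_joint``.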

`fit`

Estimate BLP models.

Parameters

cov_type (str) – The covariance type to be used in the estimation. Default is 'HC0'. See :meth:`statsmodels.regression.linear_model.OLS.fit` for more information.
diagnostic_data (bool) – Whether to include diagnostic data. (Currently not used for BLP).
**kwargs – Additional keyword arguments to be passed to :meth:`statsmodels.regression.linear_model.OLS.fit`.

Returns

self (object) –

`orth_signal`

Orthogonal signal.

`summary`

A summary for the best linear predictor effect after calling :meth:`fit`.

`blp`

Classes

BLP – Best linear predictor (BLP) with orthogonal signals.

`BLP`

Best linear predictor (BLP) with orthogonal signals. Mainly used for CATE and GATE estimation for IRM models.

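The Best Linear Predictor (BLP) targets the coefficient vector :math:`\beta_0` that minimizes the mean squared error between the true treatment effect function :math:`\tau(X)` and a linear combination of basis functions :math:`b(X)`:

.. math:: \beta_0 = \arg\min_{\beta \in \mathbb{R}^K} \mathbb{E}\Big[\big(\tau(X) - b(X)^\top \beta \big)^2\Big].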

This is characterized by the moment condition:

.. math:: \mathbb{E}[b(X)\psi] = \mathbb{E}[b(X)b(X)^\top]\beta_0,

where :math:`\psi` is the orthogonal signal such that :math:`\mathbb{E}[\psi \mid X] = \tau(X)`.

The estimator is obtained via OLS of the orthogonal signal on the basis:

.. math:: \hat{\beta} = (B^\top B)^{-1}B^\top\psi.

GATE (Group Average Treatment Effect)

When is_gate=True, the basis consists of group indicators (dummy variables). In this case, the BLP coefficients correspond to the group means of the orthogonal signal, which approximate the GATEs:

.. math:: \hat{\beta}_k = \frac{1}{n_k}\sum_{i:G_i=k}\psi_i \approx \text{GATE}_k.

Confidence Intervals

Confidence intervals for any linear combination :math:`\hat{g} = A\hat{\beta}` are computed using the estimated covariance matrix :math:`\widehat{\Omega}`:

.. math:: \widehat{\operatorname{Var}}(\hat{g}) \approx A\widehat{\Omega}A^\top.

Pointwise and joint confidence intervals (via Gaussian multiplier bootstrap) are supported.

Parameters

orth_signal (:class:`numpy.array`) – The orthogonal signal to be predicted. Must have shape (n_obs,), where n_obs is the number of observations.
basis (:class:`pandas.DataFrame`) – The basis for estimating the best linear predictor. Must have shape (n_obs, d), where n_obs is the number of observations and d is the number of predictors.
is_gate (bool) – Indicates whether the basis is constructed for GATEs (dummy-basis). Default is False.

Functions

confint – Confidence intervals for the BLP model.
fit – Estimate BLP models.

`basis`

Basis.

`blp_model`

Best-Linear-Predictor model.

`blp_omega`

Covariance matrix.

`confint`

Confidence intervals for the BLP model.

Parameters

basis (:class:`pandas.DataFrame`) – The basis for constructing the confidence interval. Must have the same form as the basis used at construction. If None, the result depends on is_gate: for a GATE basis, the GATEs are returned; otherwise, pointwise confidence intervals for the basis coefficients are returned. Default is None.
joint (bool) – Indicates whether joint confidence intervals are computed. Default is False.
alpha (float) – The significance level. Default is 0.05.
n_rep_boot (int) – The number of bootstrap repetitions (only relevant for joint confidence intervals). Default is 500.

Returns

df_ci (DataFrame) – A data frame with the confidence interval(s).

`fit`

Estimate BLP models.

Parameters

cov_type (str) – The covariance type to be used in the estimation. Default is 'HC0'. See :meth:`statsmodels.regression.linear_model.OLS.fit` for more information.
diagnostic_data (bool) – Whether to include diagnostic data. (Currently not used for BLP).
**kwargs – Additional keyword arguments to be passed to :meth:`statsmodels.regression.linear_model.OLS.fit`.

Returns

self (object) –

`orth_signal`

Orthogonal signal.

`summary`

A summary for the best linear predictor effect after calling :meth:`fit`.

`cate`

Conditional Average Treatment Effect (CATE) inference methods for causalis.

This submodule provides methods for estimating conditional average treatment effects.

Modules

cate_esimand – IRM-based implementation for estimating CATE (per-observation orthogonal signals).

`cate_esimand`

IRM-based implementation for estimating CATE (per-observation orthogonal signals).

This module provides a function that, given a CausalData object, fits the internal IRM model and augments the data with a new column 'cate' that contains the orthogonal signals (an estimate of the conditional average treatment effect for each unit).
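The source does not spell out the formula for the orthogonal signal. For a binary treatment in an IRM, the usual choice is the doubly robust (AIPW) score, and the sketch below assumes that form. The function name ``irm_orthogonal_signal`` and the arrays of fitted nuisance values are hypothetical and only illustrate the computation.

```python
import numpy as np


def irm_orthogonal_signal(y, d, g0_hat, g1_hat, m_hat):
    """Doubly robust (AIPW) signal for a binary treatment (assumed form).

    g0_hat, g1_hat: predicted outcomes under control and treatment.
    m_hat: predicted propensity scores P[D=1 | X], strictly inside (0, 1).
    """
    y = np.asarray(y, dtype=float)
    d = np.asarray(d, dtype=float)
    return (
        g1_hat - g0_hat
        + d * (y - g1_hat) / m_hat
        - (1.0 - d) * (y - g0_hat) / (1.0 - m_hat)
    )


# Made-up outcomes, treatments, and nuisance predictions for four units.
y = np.array([3.0, 1.0, 4.0, 2.0])
d = np.array([1, 0, 1, 0])
g0_hat = np.array([1.0, 1.5, 2.0, 2.0])
g1_hat = np.array([2.5, 2.0, 3.0, 3.5])
m_hat = np.array([0.5, 0.25, 0.8, 0.5])

psi = irm_orthogonal_signal(y, d, g0_hat, g1_hat, m_hat)
```

In practice the nuisance predictions would come from cross-fitted learners, and propensity scores close to 0 or 1 make the signal very noisy.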

Functions

cate_esimand – Estimate per-observation CATEs using IRM and return a DataFrame with a new 'cate' column.

`cate_esimand`

Estimate per-observation CATEs using IRM and return a DataFrame with a new 'cate' column.

Parameters

data (CausalData) – A CausalData object with defined outcome (outcome), treatment (binary 0/1), and confounders.
ml_g (estimator) – ML learner for outcome regression g(D, X) = E[Y | D, X] supporting fit/predict. Defaults to CatBoostRegressor if None.
ml_m (classifier) – ML learner for propensity m(X) = P[D=1 | X] supporting fit/predict_proba. Defaults to CatBoostClassifier if None.
n_folds (int) – Number of folds for cross-fitting.
n_rep (int) – Number of repetitions for sample splitting.
use_blp (bool) – If True, and X_new is provided, fits a BLP on the orthogonal signal and predicts CATE for X_new. If False (default), uses the in-sample orthogonal signal and appends to data.
X_new (DataFrame) – New covariate matrix for out-of-sample CATE prediction via best linear predictor. Must contain the same feature columns as the confounders in data.

Returns

DataFrame – If use_blp is False: returns a copy of data with a new column 'cate'. If use_blp is True and X_new is provided: returns a DataFrame with 'cate' column for X_new rows.

Raises

ValueError – If treatment is not binary 0/1 or required metadata is missing.

`gate`

Group Average Treatment Effect (GATE) inference methods for causalis.

This submodule provides methods for estimating group average treatment effects.

Modules

gate_esimand – Group Average Treatment Effect (GATE) estimation using local DML IRM and BLP.

`gate_esimand`

Group Average Treatment Effect (GATE) estimation using local DML IRM and BLP.

Functions

gate_esimand – Estimate Group Average Treatment Effects (GATEs).

`gate_esimand`

Estimate Group Average Treatment Effects (GATEs).

If groups is None, observations are grouped by quantiles of the plug-in CATE proxy (g1_hat - g0_hat).
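The default grouping can be illustrated as follows. This is a minimal sketch with made-up nuisance predictions; the number of groups (``n_groups = 4``) and the exact cut-point rule are assumptions, since the source does not state them.

```python
import numpy as np

# Hypothetical outcome-regression predictions under control and treatment.
g0_hat = np.full(8, 1.0)
g1_hat = np.array([1.1, 1.9, 1.4, 1.7, 1.2, 1.6, 1.3, 1.8])

# Plug-in CATE proxy.
cate_proxy = g1_hat - g0_hat

# Interior quantile cut points for n_groups equally sized groups (assumed rule).
n_groups = 4
probs = np.linspace(0.0, 1.0, n_groups + 1)[1:-1]
edges = np.quantile(cate_proxy, probs)

# Group label 0 holds the smallest proxy values, n_groups - 1 the largest.
group_id = np.digitize(cate_proxy, edges, right=True)
```

Turning ``group_id`` into indicator columns gives a dummy basis of the kind that BLP expects when ``is_gate=True``.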