cate
Modules
- blp – Best linear predictor (BLP) with orthogonal signals.
- cate – Conditional Average Treatment Effect (CATE) inference methods for causalis.
- gate – Group Average Treatment Effect (GATE) inference methods for causalis.
Classes
- BLP – Best linear predictor (BLP) with orthogonal signals.
BLP
Best linear predictor (BLP) with orthogonal signals. Mainly used for CATE and GATE estimation for IRM models.
The Best Linear Predictor (BLP) targets the coefficient vector :math:`\beta_0` that minimizes the mean squared error
between the true treatment effect function :math:`\tau(X)` and a linear combination of basis functions :math:`b(X)`:
.. math:: \beta_0 = \arg\min_{\beta \in \mathbb{R}^K} \mathbb{E}\Big[\big(\tau(X) - b(X)^\top \beta \big)^2\Big].
This is characterized by the moment condition:
.. math:: \mathbb{E}[b(X)\psi] = \mathbb{E}[b(X)b(X)^\top]\beta_0,
where :math:`\psi` is the orthogonal signal such that :math:`\mathbb{E}[\psi \mid X] = \tau(X)`.
The estimator is obtained via OLS of the orthogonal signal on the basis:
.. math:: \hat{\beta} = (B^\top B)^{-1}B^\top\psi.
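As a minimal sketch of the OLS step above (not the library's implementation), the following NumPy snippet simulates a signal whose conditional mean is linear in a single covariate and recovers the coefficients. The data-generating process, the variable names, and the use of `np.linalg.lstsq` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Assumed toy setup: one covariate, true effect tau(X) = 1 + 2 * X.
x = rng.uniform(0.0, 1.0, size=n)
tau = 1.0 + 2.0 * x

# Noisy signal with E[psi | X] = tau(X), standing in for the orthogonal signal.
psi = tau + rng.normal(scale=1.0, size=n)

# Basis b(X) = (1, X), stacked row-wise into the n x K matrix B.
B = np.column_stack([np.ones(n), x])

# beta_hat = (B'B)^{-1} B' psi, computed by least squares for numerical stability.
beta_hat, *_ = np.linalg.lstsq(B, psi, rcond=None)

# The same estimate from the normal equations, for comparison.
beta_normal_eq = np.linalg.solve(B.T @ B, B.T @ psi)

print(beta_hat)
```

Both routes give the same coefficients; the least-squares solver is usually preferable when the basis columns are nearly collinear.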
GATE (Group Average Treatment Effect)
When is_gate=True, the basis consists of group indicators (dummy variables).
In this case, the BLP coefficients correspond to the group means of the orthogonal signal,
which approximate the GATEs:
.. math:: \hat{\beta}_k = \frac{1}{n_k}\sum_{i:G_i=k}\psi_i \approx \text{GATE}_k.
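The equivalence between OLS on group dummies and per-group means of the signal can be checked numerically. This is a sketch on simulated data; the group labels, the signal values, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 600, 3

# Hypothetical group labels G_i in {0, 1, 2} and a noisy signal psi_i.
groups = rng.integers(0, k, size=n)
psi = np.array([0.5, 1.0, 2.0])[groups] + rng.normal(size=n)

# Dummy basis: one indicator column per group, no intercept.
B = np.eye(k)[groups]

# OLS of psi on the dummies.
beta_hat = np.linalg.solve(B.T @ B, B.T @ psi)

# Group means of psi computed directly.
group_means = np.array([psi[groups == g].mean() for g in range(k)])

print(np.allclose(beta_hat, group_means))
```

The result holds because the dummy columns are mutually orthogonal, so `B.T @ B` is diagonal with the group sizes on its diagonal.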
Confidence Intervals
Confidence intervals for any linear combination :math:`\hat{g} = A\hat{\beta}` are computed using the estimated covariance matrix :math:`\widehat{\Omega}`:
.. math:: \widehat{\operatorname{Var}}(\hat{g}) \approx A\widehat{\Omega}A^\top.
Pointwise and joint confidence intervals (via Gaussian multiplier bootstrap) are supported.
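The sketch below illustrates both interval types on simulated GATE-style data. It assumes an HC0 sandwich estimate for the covariance matrix and one common form of the Gaussian multiplier bootstrap (draws from a normal distribution with the estimated covariance, then the quantile of the maximum absolute t-statistic). The library's internal computation may differ in detail, and all names here are illustrative.

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(42)
n, k = 900, 3

# Simulated dummy basis and signal (assumed toy data).
groups = rng.integers(0, k, size=n)
B = np.eye(k)[groups]
psi = np.array([0.5, 1.0, 2.0])[groups] + rng.normal(size=n)

# OLS fit and residuals.
beta_hat = np.linalg.solve(B.T @ B, B.T @ psi)
resid = psi - B @ beta_hat

# HC0 sandwich: (B'B)^{-1} B' diag(e^2) B (B'B)^{-1}.
bread = np.linalg.inv(B.T @ B)
meat = B.T @ (B * resid[:, None] ** 2)
omega = bread @ meat @ bread

# Linear combinations g_hat = A beta_hat; here A = I returns the coefficients.
A = np.eye(k)
g_hat = A @ beta_hat
se = np.sqrt(np.diag(A @ omega @ A.T))

# Pointwise intervals use the standard normal quantile.
alpha = 0.05
z_point = NormalDist().inv_cdf(1 - alpha / 2)
ci_point = np.column_stack([g_hat - z_point * se, g_hat + z_point * se])

# Joint intervals: simulate N(0, omega), studentize, take the max over rows of A.
n_rep_boot = 2000
draws = rng.multivariate_normal(np.zeros(k), omega, size=n_rep_boot)
t_stats = (draws @ A.T) / se
crit_joint = np.quantile(np.max(np.abs(t_stats), axis=1), 1 - alpha)
ci_joint = np.column_stack([g_hat - crit_joint * se, g_hat + crit_joint * se])

print(ci_point)
print(ci_joint)
```

Because the joint critical value covers all rows of `A` simultaneously, the joint intervals are wider than the pointwise ones.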
Parameters
- orth_signal (:class:`numpy.array`) – The orthogonal signal to be predicted. Has to be of shape ``(n_obs,)``, where ``n_obs`` is the number of observations.
- basis (:class:`pandas.DataFrame`) – The basis for estimating the best linear predictor. Has to have the shape ``(n_obs, d)``, where ``n_obs`` is the number of observations and ``d`` is the number of predictors.
- is_gate (bool) – Indicates whether the basis is constructed for GATEs (dummy basis). Default is ``False``.
Functions
basis
Basis.
blp_model
Best-Linear-Predictor model.
blp_omega
Covariance matrix.
confint
Confidence intervals for the BLP model.
Parameters
- basis (:class:`pandas.DataFrame`) – The basis for constructing the confidence interval. Has to have the same form as the basis from the construction. If ``None`` is passed and the basis is constructed for GATEs, the GATEs are returned. Otherwise, the confidence intervals for the basis coefficients are returned (with pointwise confidence intervals). Default is ``None``.
- joint (bool) – Indicates whether joint confidence intervals are computed. Default is ``False``.
- alpha (float) – The significance level. Default is ``0.05``.
- n_rep_boot (int) – The number of bootstrap repetitions (only relevant for joint confidence intervals). Default is ``500``.
Returns
- df_ci (DataFrame) – A data frame with the confidence interval(s).
fit
Estimate BLP models.
Parameters
- cov_type (str) – The covariance type to be used in the estimation. Default is ``'HC0'``. See :meth:`statsmodels.regression.linear_model.OLS.fit` for more information.
- diagnostic_data (bool) – Whether to include diagnostic data. (Currently not used for BLP.)
- **kwargs – Additional keyword arguments to be passed to :meth:`statsmodels.regression.linear_model.OLS.fit`.
Returns
- self (object) – The fitted BLP instance.
orth_signal
Orthogonal signal.
summary
A summary for the best linear predictor effect after calling :meth:`fit`.
blp
Classes
- BLP – Best linear predictor (BLP) with orthogonal signals.
cate
Conditional Average Treatment Effect (CATE) inference methods for causalis.
This submodule provides methods for estimating conditional average treatment effects.
Modules
- cate_esimand – IRM-based implementation for estimating CATE (per-observation orthogonal signals).
cate_esimand
IRM-based implementation for estimating CATE (per-observation orthogonal signals).
This module provides a function that, given a CausalData object, fits the internal IRM model and augments the data with a new column 'cate' that contains the orthogonal signals (an estimate of the conditional average treatment effect for each unit).
Functions
- cate_esimand – Estimate per-observation CATEs using IRM and return a DataFrame with a new 'cate' column.
cate_esimand
Estimate per-observation CATEs using IRM and return a DataFrame with a new 'cate' column.
Parameters
- data (CausalData) – A CausalData object with a defined outcome, treatment (binary 0/1), and confounders.
- ml_g (estimator) – ML learner for outcome regression g(D, X) = E[Y | D, X] supporting fit/predict. Defaults to CatBoostRegressor if None.
- ml_m (classifier) – ML learner for propensity m(X) = P[D=1 | X] supporting fit/predict_proba. Defaults to CatBoostClassifier if None.
- n_folds (int) – Number of folds for cross-fitting.
- n_rep (int) – Number of repetitions for sample splitting.
- use_blp (bool) – If True and X_new is provided, fits a BLP on the orthogonal signal and predicts CATE for X_new. If False (default), uses the in-sample orthogonal signal and appends it to data.
- X_new (DataFrame) – New covariate matrix for out-of-sample CATE prediction via best linear predictor. Must contain the same feature columns as the confounders in data.
Returns
DataFrame – If use_blp is False, returns a copy of data with a new column 'cate'. If use_blp is True and X_new is provided, returns a DataFrame with a 'cate' column for the X_new rows.
Raises
ValueError – If treatment is not binary 0/1 or required metadata is missing.
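To make "per-observation orthogonal signal" concrete, the sketch below computes the standard doubly robust (AIPW) signal for a binary treatment. It assumes this textbook form; the library may add steps such as propensity trimming or normalization. The helper name `orthogonal_signal` is hypothetical, and oracle nuisance functions stand in for the cross-fitted predictions that ml_g and ml_m would produce.

```python
import numpy as np

def orthogonal_signal(y, d, g0_hat, g1_hat, m_hat):
    """Doubly robust (AIPW) signal; hypothetical helper, not the library's API."""
    return (
        g1_hat - g0_hat
        + d * (y - g1_hat) / m_hat
        - (1 - d) * (y - g0_hat) / (1 - m_hat)
    )

rng = np.random.default_rng(7)
n = 5000

# Assumed toy data-generating process with a known effect function.
x = rng.uniform(size=n)
m = 0.3 + 0.4 * x                 # true propensity P[D=1 | X]
d = rng.binomial(1, m)            # binary 0/1 treatment
tau = 1.0 + 2.0 * x               # true CATE, so the ATE equals 2
g0 = x                            # E[Y | D=0, X]
g1 = x + tau                      # E[Y | D=1, X]
y = g0 + d * tau + rng.normal(size=n)

# Oracle nuisances are used here in place of cross-fitted estimates.
psi = orthogonal_signal(y, d, g0, g1, m)
ate_hat = psi.mean()

print(ate_hat)
```

Each `psi` value is a noisy per-unit quantity whose conditional mean given X is the CATE, which is why averaging it (overall or within groups) or regressing it on a basis yields ATE, GATE, or BLP estimates.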
gate
Group Average Treatment Effect (GATE) inference methods for causalis.
This submodule provides methods for estimating group average treatment effects.
Modules
- gate_esimand – Group Average Treatment Effect (GATE) estimation using local DML IRM and BLP.
gate_esimand
Group Average Treatment Effect (GATE) estimation using local DML IRM and BLP.
Functions
- gate_esimand – Estimate Group Average Treatment Effects (GATEs).
gate_esimand
Estimate Group Average Treatment Effects (GATEs).
If groups is None, observations are grouped by quantiles of the
plugin CATE proxy (g1_hat - g0_hat).
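A minimal sketch of quantile-based grouping on the plugin proxy follows. The helper name `quantile_groups`, the `n_groups` argument, and the simulated predictions are assumptions for illustration; the actual function's arguments and defaults may differ.

```python
import numpy as np

def quantile_groups(proxy, n_groups=5):
    """Assign each observation to a quantile bin of the proxy (labels 0..n_groups-1)."""
    # Interior quantile levels, e.g. 0.2, 0.4, 0.6, 0.8 for five groups.
    inner_q = np.linspace(0.0, 1.0, n_groups + 1)[1:-1]
    edges = np.quantile(proxy, inner_q)
    # Count how many edges lie at or below each proxy value.
    return np.searchsorted(edges, proxy, side="right")

rng = np.random.default_rng(3)
n = 1000

# Simulated outcome-regression predictions under control and treatment.
g0_hat = rng.normal(size=n)
g1_hat = g0_hat + rng.normal(loc=1.0, size=n)
proxy = g1_hat - g0_hat

groups = quantile_groups(proxy, n_groups=5)
counts = np.bincount(groups, minlength=5)

print(counts)
```

The resulting labels can then be one-hot encoded into a dummy basis, so that group means of the orthogonal signal estimate the GATEs.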