Submodule
causalis.scenarios.unconfoundedness.model

model

Submodule causalis.scenarios.unconfoundedness.model with no child pages and 13 documented members.

Classes

class
causalis.scenarios.unconfoundedness.model.IRM

IRM

Bases: sklearn.base.BaseEstimator

Interactive Regression Model (IRM) with cross-fitting using CausalData.

Parameters

data : CausalData

Data container with outcome, binary treatment (0/1), and confounders.

ml_g : estimator

Learner for E[Y|X,D]. If ml_g is a classifier and Y is binary, predict_proba() is used; otherwise predict() is used.

ml_m : classifier

Learner for E[D|X] (the propensity score). Must support predict_proba(), or predict() returning values in (0,1).

n_folds : int, default 5

Number of cross-fitting folds.

n_rep : int, default 1

Number of repetitions of sample splitting. Currently only 1 is supported.

normalize_ipw : bool, default False

Whether to normalize IPW terms within the score. Applied to ATE only. For ATTE, normalization is ignored to preserve the canonical ATTE efficient influence function (EIF).

trimming_rule : {"truncate"}, default "truncate"

Trimming approach for propensity scores.

trimming_threshold : float, default 1e-2

Threshold for trimming if rule is “truncate”.

weights : Optional[np.ndarray or Dict], default None

Optional weights.
- If array of shape (n,), used as ATE weights (w). Assumed E[w|X] = w.
- If dict, can contain 'weights' (w) and 'weights_bar' (E[w|X]).
- For ATTE, computed internally (w = D/P(D=1), w_bar = m(X)/P(D=1)).
Note: If weights depend on treatment or outcome, E[w|X] must be provided for correct sensitivity analysis.

relative_baseline_min : float, default 1e-8

Minimum absolute baseline value used for relative effects. If |mu_c| is below this threshold, relative estimates are set to NaN with a warning.

random_state : Optional[int], default None

Random seed for fold creation.

n_jobs : int, default 1

Number of parallel jobs for fold-level cross-fitting. Use -1 to use all available CPUs. Practical guidance:
- Start with n_jobs=1 for stable, low-contention defaults.
- Increase to n_jobs=2/4/-1 when cross-fitting is the bottleneck.
- If nuisance learners are already multithreaded (e.g. CatBoost with thread_count=-1), keep n_jobs=1 or set learner threads to 1 to avoid CPU oversubscription.
- On shared machines, prefer a bounded value (for example 2 or 4) instead of -1.

store_diagnostics : bool, default True

Whether to retain raw fit-time arrays and diagnostic-only artifacts on the fitted model. Set to False for a lighter-weight estimator that still supports effect estimation, while only retaining immutable outcome and treatment snapshots. In lightweight mode the estimator no longer keeps the confounder matrix, raw propensities, or fold assignments in memory after fit().

Examples

Notes

The IRM model targets binary-treatment causal effects under unconfoundedness. Let $W = (Y, D, X)$ with $D \in \{0, 1\}$ and define

$$g_0(d, x) = \mathbb{E}[Y \mid D=d, X=x], \qquad m_0(x) = \mathbb{P}(D=1 \mid X=x).$$

Under conditional ignorability and overlap,

$$(Y(0), Y(1)) \perp D \mid X, \qquad 0 < m_0(X) < 1 \ \text{a.s.},$$

the target functionals are identified as

$$\theta_0^{ATE} = \mathbb{E}[g_0(1, X) - g_0(0, X)]$$

and

$$\theta_0^{ATTE} = \mathbb{E}[g_0(1, X) - g_0(0, X) \mid D=1].$$

This implementation cross-fits three nuisance objects: $\hat g_1(x) \approx \mathbb{E}[Y \mid D=1, X=x]$, $\hat g_0(x) \approx \mathbb{E}[Y \mid D=0, X=x]$, and $\hat m(x) \approx \mathbb{P}(D=1 \mid X=x)$. Propensities are trimmed via

$$\tilde m(x) = \min\{1-\varepsilon, \max(\hat m(x), \varepsilon)\},$$

where $\varepsilon =$ trimming_threshold.
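
The truncation rule above can be sketched in a few lines of NumPy. This is an illustration only, not the library's code: the helper name trim_propensity is hypothetical, and the check that eps lies in (0, 0.5) is an added assumption so that the lower bound stays below the upper bound.

```python
import numpy as np

def trim_propensity(m_hat, eps=1e-2):
    """Clip estimated propensities to [eps, 1 - eps] (hypothetical helper)."""
    if not 0.0 < eps < 0.5:
        raise ValueError("eps must lie in (0, 0.5)")
    m_hat = np.asarray(m_hat, dtype=float)
    # Equivalent to min{1 - eps, max(m_hat, eps)} applied elementwise.
    return np.clip(m_hat, eps, 1.0 - eps)

m_hat = np.array([0.001, 0.2, 0.5, 0.995])
m_tilde = trim_propensity(m_hat, eps=0.01)
```

Values already inside the interval pass through unchanged; only the extreme propensities are moved to the boundary.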

Estimation solves the sample moment equation

$$\mathbb{E}_n[\psi_a(W_i; \hat\eta)\theta + \psi_b(W_i; \hat\eta)] = 0,$$

giving the closed-form estimator

$$\hat\theta = -\frac{\mathbb{E}_n[\psi_b(W_i; \hat\eta)]}{\mathbb{E}_n[\psi_a(W_i; \hat\eta)]}.$$
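
Because the score is linear in $\theta$, the moment equation has the closed-form solution shown above. A minimal sketch, with a hypothetical function name and toy arrays standing in for the per-observation score components:

```python
import numpy as np

def solve_linear_score(psi_a, psi_b):
    """Return theta solving mean(psi_a * theta + psi_b) = 0 (hypothetical helper)."""
    psi_a = np.asarray(psi_a, dtype=float)
    psi_b = np.asarray(psi_b, dtype=float)
    return -psi_b.mean() / psi_a.mean()

# Toy inputs: psi_a = -1 for every observation, as in the ATE case.
psi_b = np.array([1.0, 2.0, 3.0])
psi_a = -np.ones_like(psi_b)
theta_hat = solve_linear_score(psi_a, psi_b)

# The sample moment evaluated at theta_hat should vanish.
moment = np.mean(psi_a * theta_hat + psi_b)
```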

For both ATE and ATTE, the orthogonal score component used here is

$$\psi_b = w \, (\hat g_1(X) - \hat g_0(X)) + \bar w \left[ (Y - \hat g_1(X)) \frac{D}{\tilde m(X)} - (Y - \hat g_0(X)) \frac{1-D}{1-\tilde m(X)} \right].$$

The score derivative differs by estimand:

$$\psi_a = -1 \quad \text{for ATE}, \qquad \psi_a = -w \quad \text{for ATTE}.$$

The corresponding weights are

$$w = \bar w = 1 \quad \text{for unweighted ATE},$$

while for ATTE this implementation uses normalized treated weights

$$w_i = \frac{D_i}{\mathbb{E}_n[D]}, \qquad \bar w_i = \frac{\tilde m(X_i)}{\mathbb{E}_n[D]}.$$
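
The score components and weights above translate directly into array arithmetic. The sketch below is an illustration under stated assumptions, not the library's implementation: the function irm_score is hypothetical, the data are simulated with a constant treatment effect of 2, and the true nuisance functions are plugged in instead of cross-fitted estimates.

```python
import numpy as np

def irm_score(y, d, g0_hat, g1_hat, m_tilde, score="ATE"):
    """Return (psi_a, psi_b) for the ATE or ATTE score (hypothetical helper)."""
    y, d, g0_hat, g1_hat, m_tilde = (
        np.asarray(a, dtype=float) for a in (y, d, g0_hat, g1_hat, m_tilde)
    )
    if score == "ATE":
        w = np.ones_like(y)
        w_bar = np.ones_like(y)
        psi_a = -np.ones_like(y)
    elif score == "ATTE":
        p_treated = d.mean()          # E_n[D]
        w = d / p_treated             # normalized treated weights
        w_bar = m_tilde / p_treated
        psi_a = -w
    else:
        raise ValueError("score must be 'ATE' or 'ATTE'")
    # Inverse-probability-weighted residual correction.
    correction = (y - g1_hat) * d / m_tilde - (y - g0_hat) * (1.0 - d) / (1.0 - m_tilde)
    psi_b = w * (g1_hat - g0_hat) + w_bar * correction
    return psi_a, psi_b

# Simulated data (assumption): randomized treatment, true effect equal to 2.
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
d = rng.binomial(1, 0.5, size=n).astype(float)
y = 2.0 * d + x + rng.normal(size=n)
g0, g1, m = x, x + 2.0, np.full(n, 0.5)   # true nuisances used as stand-ins

estimates = {}
for s in ("ATE", "ATTE"):
    psi_a, psi_b = irm_score(y, d, g0, g1, m, score=s)
    estimates[s] = -psi_b.mean() / psi_a.mean()
```

With a constant effect, both estimands coincide, so the two estimates should land close to 2 up to sampling noise.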

If normalize_ipw=True, the inverse-probability factors $D / \tilde m(X)$ and $(1-D) / (1-\tilde m(X))$ are additionally stabilized by their sample means (a Hajek-style normalization). This option is applied to ATE only; for ATTE it is intentionally ignored to preserve the canonical ATTE efficient influence function used by the estimator.
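
One common reading of "stabilized by their sample means" is to divide each inverse-probability factor by its own sample average, so that the normalized factors average to one. The sketch below follows that reading. It is an assumption about the form of the normalization, and hajek_ipw_factors is a hypothetical name, not a function of this package.

```python
import numpy as np

def hajek_ipw_factors(d, m_tilde):
    """Return Hajek-normalized factors for treated and control units (hypothetical helper)."""
    d = np.asarray(d, dtype=float)
    m_tilde = np.asarray(m_tilde, dtype=float)
    h1 = d / m_tilde                  # D / m~(X)
    h0 = (1.0 - d) / (1.0 - m_tilde)  # (1 - D) / (1 - m~(X))
    # Dividing by the sample mean forces each factor to average to 1.
    return h1 / h1.mean(), h0 / h0.mean()

d = np.array([1, 0, 1, 0, 1])
m_tilde = np.array([0.8, 0.3, 0.6, 0.4, 0.5])
h1_norm, h0_norm = hajek_ipw_factors(d, m_tilde)
```

The sketch requires at least one treated and one control unit; otherwise a sample mean is zero and the division is undefined.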

Initialization

Initialize the estimator and validate configuration options.

method
causalis.scenarios.unconfoundedness.model.IRM.fit

fit

Fit nuisance models via cross-fitting.

Parameters

data : Optional[CausalData], default None

CausalData container. If None, uses self.data.

store_diagnostics : Optional[bool], default None

Optional override for whether the fitted model should retain diagnostics-oriented arrays and expose diagnostic payloads from subsequent estimate() calls. Outcome and treatment snapshots are always retained to keep post-fit estimation deterministic.

Returns

self : IRM

Fitted estimator.
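
To make the cross-fitting step concrete, the sketch below produces out-of-fold predictions for the three nuisance functions using only NumPy. It is a simplified illustration, not what fit() does internally: cross_fit and ols_fit_predict are hypothetical names, ordinary least squares stands in for ml_g, and a linear probability model stands in for the classifier ml_m (so raw predictions may fall outside (0,1) before trimming).

```python
import numpy as np

def ols_fit_predict(X_train, y_train, X_test):
    """Toy learner (assumption): least squares with an intercept."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    beta, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_test)), X_test]) @ beta

def cross_fit(X, y, d, n_folds=5, seed=0):
    """Return out-of-fold g0_hat, g1_hat, m_hat and the fold labels."""
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = rng.permutation(n) % n_folds   # balanced random fold labels
    g0_hat, g1_hat, m_hat = np.empty(n), np.empty(n), np.empty(n)
    for k in range(n_folds):
        test = folds == k
        train = ~test
        tr0 = train & (d == 0)             # controls in the training folds
        tr1 = train & (d == 1)             # treated in the training folds
        g0_hat[test] = ols_fit_predict(X[tr0], y[tr0], X[test])
        g1_hat[test] = ols_fit_predict(X[tr1], y[tr1], X[test])
        m_hat[test] = ols_fit_predict(X[train], d[train], X[test])
    return g0_hat, g1_hat, m_hat, folds

# Simulated data (assumption): two confounders, true effect equal to 2.
rng = np.random.default_rng(1)
n = 2000
X = rng.normal(size=(n, 2))
d = rng.binomial(1, 0.5, size=n).astype(float)
y = 1.0 + 2.0 * d + X @ np.array([1.0, -1.0]) + rng.normal(size=n)
g0_hat, g1_hat, m_hat, folds = cross_fit(X, y, d, n_folds=5, seed=0)
```

Each observation's predictions come from models trained on the other folds only, which is the property the orthogonal score relies on.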

method
causalis.scenarios.unconfoundedness.model.IRM.estimate

estimate

Compute treatment effects using stored nuisance predictions.

Parameters

score : {"ATE", "ATTE", "GATE", "GATET"}, default "ATE"

Target estimand.

alpha : float, default 0.05

Significance level for intervals. Diagnostic payloads are included only when the model was fitted with store_diagnostics=True.

groups : Optional[pd.DataFrame | pd.Series], default None

Group labels/indicators for score="GATE" or score="GATET". If None, falls back to self.data.gate_groups when present. GATE/GATET requires CausalData.user_id and aligns groups to those fit-time observation ids. Row-indexed groups are accepted only when the fit-time row-to-user_id mapping is unchanged.

cov_type : {"HC0", "HC1", "HC2", "HC3"}, default "HC3"

Robust covariance type for score="GATE" / score="GATET" inference.

cov_kwds : Optional[Dict[str, Any]], default None

Additional covariance keyword arguments requested for subgroup inference. These are currently ignored because GATE/GATET use closed-form HCx covariance formulas rather than delegating to statsmodels.

Returns

CausalEstimate or GateEstimate

Result container for the estimated effect. For subgroup scores, the returned GateEstimate supports summary() for subgroup-vs-zero inference, contrast(...) for formal group-vs-group tests, and pairwise_summary(...) for a broader comparison table.
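
For orientation, the textbook closed-form HC0 to HC3 covariance estimators can be sketched for a regression of an orthogonal signal on a group-dummy basis. This is a generic illustration of those formulas, not the library's subgroup code: gate_ols is a hypothetical name, and the assumption that subgroup effects are obtained by such a regression is mine rather than the source's.

```python
import numpy as np

def gate_ols(signal, G, cov_type="HC3"):
    """OLS of signal on dummy basis G with a robust covariance (hypothetical helper)."""
    signal = np.asarray(signal, dtype=float)
    G = np.asarray(G, dtype=float)
    n, k = G.shape
    bread = np.linalg.inv(G.T @ G)
    beta = bread @ G.T @ signal
    resid = signal - G @ beta
    leverage = np.einsum("ij,jk,ik->i", G, bread, G)  # diagonal of the hat matrix
    if cov_type == "HC0":
        u2 = resid**2
    elif cov_type == "HC1":
        u2 = resid**2 * n / (n - k)
    elif cov_type == "HC2":
        u2 = resid**2 / (1.0 - leverage)
    elif cov_type == "HC3":
        u2 = resid**2 / (1.0 - leverage) ** 2
    else:
        raise ValueError("cov_type must be one of HC0, HC1, HC2, HC3")
    meat = G.T @ (G * u2[:, None])
    return beta, bread @ meat @ bread

# Toy signal and two disjoint groups of three observations each.
signal = np.array([1.0, 3.0, 2.0, 4.0, 6.0, 8.0])
labels = np.array([0, 0, 0, 1, 1, 1])
G = (labels[:, None] == np.arange(2)).astype(float)
beta_hc3, cov_hc3 = gate_ols(signal, G, cov_type="HC3")
beta_hc0, cov_hc0 = gate_ols(signal, G, cov_type="HC0")
```

With disjoint dummies the coefficients are the within-group means of the signal. HC2 and HC3 divide by one minus the leverage, so a group with a single observation would cause a division by zero in this sketch.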

property
causalis.scenarios.unconfoundedness.model.IRM.diagnostics_

diagnostics_

Return diagnostic data.

Returns

dict

Dictionary containing ‘m_hat’, ‘g0_hat’, ‘g1_hat’, and ‘folds’.

property
causalis.scenarios.unconfoundedness.model.IRM.coef

coef

Return the estimated coefficient.

Returns

np.ndarray

The estimated coefficient.

property
causalis.scenarios.unconfoundedness.model.IRM.se

se

Return the standard error of the estimate.

Returns

np.ndarray

The standard error.
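
For a linear score, a standard plug-in standard error uses the empirical second moment of the score, scaled by the squared mean of psi_a. The sketch below shows that textbook formula. Whether se is computed exactly this way is an assumption, and score_standard_error is a hypothetical name.

```python
import numpy as np

def score_standard_error(psi_a, psi_b, theta):
    """Plug-in standard error for a linear score (hypothetical helper)."""
    psi_a = np.asarray(psi_a, dtype=float)
    psi_b = np.asarray(psi_b, dtype=float)
    psi = psi_a * theta + psi_b      # score evaluated at theta
    jacobian = psi_a.mean()          # derivative of the moment in theta
    n = psi.size
    variance = np.mean(psi**2) / jacobian**2 / n
    return float(np.sqrt(variance))

psi_b = np.array([1.0, 2.0, 3.0, 4.0])
psi_a = -np.ones_like(psi_b)
theta_hat = -psi_b.mean() / psi_a.mean()
se_hat = score_standard_error(psi_a, psi_b, theta_hat)
```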

property
causalis.scenarios.unconfoundedness.model.IRM.pvalues

pvalues

Return the p-values for the estimate.

Returns

np.ndarray

The p-values.
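
Under a normal approximation, a two-sided p-value follows from the coefficient and its standard error. The sketch below uses only the standard library. The normal approximation and the function name two_sided_pvalue are assumptions for illustration, not a statement about how pvalues is computed.

```python
import math

def two_sided_pvalue(coef, se, h0=0.0):
    """Two-sided p-value for H0: theta = h0 under a normal approximation."""
    z = (coef - h0) / se
    # 2 * (1 - Phi(|z|)) written with the complementary error function.
    return math.erfc(abs(z) / math.sqrt(2.0))

p_null = two_sided_pvalue(0.0, 1.0)          # estimate exactly at the null
p_border = two_sided_pvalue(1.959964, 1.0)   # roughly the 5% critical value
```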

property
causalis.scenarios.unconfoundedness.model.IRM.summary

summary

Return a summary DataFrame of the results.

Returns

pd.DataFrame

The results summary.

property
causalis.scenarios.unconfoundedness.model.IRM.orth_signal

orth_signal

Return the cross-fitted orthogonal signal (psi_b).

Returns

np.ndarray

The orthogonal signal.

method
causalis.scenarios.unconfoundedness.model.IRM.gate

gate

Convenience wrapper for estimate(score="GATE", ...).

Parameters

groups : pd.DataFrame or pd.Series

Subgroup labels or a strict dummy basis. GATE requires CausalData.user_id and aligns groups to those fit-time observation ids.

alpha : float, default 0.05

Significance level for confidence intervals.

cov_type : {"HC0", "HC1", "HC2", "HC3"}, default "HC3"

Robust covariance type for subgroup inference.

cov_kwds : Optional[Dict[str, Any]], default None

Additional covariance keyword arguments requested by the caller. These are currently ignored by the closed-form GATE implementation.

Returns

GateEstimate

Estimated subgroup effects and diagnostics. The returned result also supports contrast(...) and pairwise_summary(...) for formal post-estimation group comparisons.

method
causalis.scenarios.unconfoundedness.model.IRM.gatet

gatet

Convenience wrapper for estimate(score="GATET", ...).

method
causalis.scenarios.unconfoundedness.model.IRM.sensitivity_analysis

sensitivity_analysis

Compute a sensitivity analysis following Chernozhukov et al. (2022).

Parameters

r2_y : float

Sensitivity parameter for outcome equation (R^2 form, R_Y^2; converted to odds form internally).

r2_d : float

Sensitivity parameter for treatment equation (R^2 form, R_D^2).

rho : float, default 1.0

Correlation between unobserved components.

H0 : float, default 0.0

Null hypothesis for robustness values.

alpha : float, default 0.05

Significance level for CI bounds.

method
causalis.scenarios.unconfoundedness.model.IRM.confint

confint

Compute confidence intervals for the estimated coefficient.

Parameters

alpha : float, default 0.05

Significance level.

Returns

pd.DataFrame

DataFrame with confidence intervals.
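
A symmetric normal-approximation interval at level 1 - alpha can be sketched with the standard library alone. The normal approximation and the name normal_confint are assumptions for illustration. The method itself returns a DataFrame, whereas this sketch returns a plain tuple.

```python
from statistics import NormalDist

def normal_confint(coef, se, alpha=0.05):
    """Return (lower, upper) bounds of a two-sided normal interval."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # standard normal critical value
    return coef - z * se, coef + z * se

lower, upper = normal_confint(2.0, 0.5, alpha=0.05)
```

A smaller alpha gives a larger critical value and therefore a wider interval.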
