Causalis

`unconfoundedness`

Modules

cate – Conditional Average Treatment Effect (CATE) inference methods for causalis.
dgp – Data-generating processes for synthetic observational datasets.
gate – Group Average Treatment Effect (GATE) inference methods for causalis.
model – IRM estimator consuming CausalData.
refutation – Refutation and robustness utilities for Causalis.

Classes

IRM – Interactive Regression Model (IRM) with cross-fitting using CausalData.

`IRM`

Bases: BaseEstimator

Interactive Regression Model (IRM) with cross-fitting using CausalData.

Parameters

data (CausalData) – Data container with outcome, binary treatment (0/1), and confounders.
ml_g (estimator) – Learner for E[Y|X,D]. If it is a classifier and Y is binary, predict_proba() is used; otherwise predict() is used.
ml_m (classifier) – Learner for E[D|X] (propensity). Must support predict_proba(), or predict() returning values in (0,1).
n_folds (int) – Number of cross-fitting folds.
n_rep (int) – Number of repetitions of sample splitting. Currently only 1 is supported.
normalize_ipw (bool) – Whether to normalize IPW terms within the score. Applied to ATE only. For ATTE, normalization is ignored to preserve the canonical ATTE EIF.
trimming_rule ('truncate') – Trimming approach for propensity scores.
trimming_threshold (float) – Threshold for trimming if rule is "truncate".
weights (Optional[ndarray or Dict]) – Optional weights.
If array of shape (n,), used as ATE weights (w). Assumed E[w|X] = w.
If dict, can contain 'weights' (w) and 'weights_bar' (E[w|X]).
For ATTE, computed internally (w=D/P(D=1), w_bar=m(X)/P(D=1)). Note: If weights depend on treatment or outcome, E[w|X] must be provided for correct sensitivity analysis.
relative_baseline_min (float) – Minimum absolute baseline value used for relative effects. If |mu_c| is below this threshold, relative estimates are set to NaN with a warning.
random_state (Optional[int]) – Random seed for fold creation.
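
To illustrate what trimming_rule="truncate" and normalize_ipw are described as doing, here is a minimal sketch that clips propensity scores into [threshold, 1 - threshold] and rescales the treated inverse-propensity weights to have mean one. The helper names and the Hajek-style normalization are assumptions made for illustration; the library's internal implementation may differ.

```python
def truncate_propensity(m_hat, threshold=0.01):
    """Clip each propensity score into [threshold, 1 - threshold]."""
    return [min(max(m, threshold), 1.0 - threshold) for m in m_hat]


def treated_ipw(d, m_hat, normalize=False):
    """Inverse-propensity weights d / m for treated units.

    With normalize=True the weights are rescaled to have mean one
    (a Hajek-style normalization, assumed here for illustration).
    """
    w = [di / mi for di, mi in zip(d, m_hat)]
    if normalize:
        mean_w = sum(w) / len(w)
        w = [wi / mean_w for wi in w]
    return w


d = [1, 0, 1, 0]
m_raw = [0.001, 0.30, 0.80, 0.999]

# Extreme scores are pulled back to the truncation bounds.
m_clipped = truncate_propensity(m_raw, threshold=0.01)

# Normalized weights average to one across the sample.
w = treated_ipw(d, m_clipped, normalize=True)
```

Without truncation, the first unit would receive a raw weight of 1000, which is the kind of instability the threshold is meant to limit.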

Functions

confint – Compute confidence intervals for the estimated coefficient.
estimate – Compute treatment effects using stored nuisance predictions.
fit – Fit nuisance models via cross-fitting.
gate – Estimate Group Average Treatment Effects via BLP on orthogonal signal.
sensitivity_analysis – Compute a sensitivity analysis following Chernozhukov et al. (2022).

`coef`

Return the estimated coefficient.

Returns

ndarray – The estimated coefficient.

`confint`

Compute confidence intervals for the estimated coefficient.

Parameters

alpha (float) – Significance level.

Returns

DataFrame – DataFrame with confidence intervals.
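
A minimal sketch of a two-sided interval at level 1 - alpha, assuming the usual normal approximation for the estimator. The function name is hypothetical, and the real method returns a DataFrame rather than a tuple.

```python
from statistics import NormalDist


def normal_confint(coef, se, alpha=0.05):
    """Two-sided (1 - alpha) interval: coef +/- z_{1 - alpha/2} * se."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return coef - z * se, coef + z * se


lower, upper = normal_confint(coef=1.5, se=0.2, alpha=0.05)
```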

`data`

`diagnostics_`

Return diagnostic data.

Returns

dict – Dictionary containing 'm_hat', 'g0_hat', 'g1_hat', and 'folds'.

`estimate`

Compute treatment effects using stored nuisance predictions.

Parameters

score (('ATE', 'ATTE')) – Target estimand.
alpha (float) – Significance level for intervals.
diagnostic_data (bool) – Whether to include diagnostic data in the result.

Returns

CausalEstimate – Result container for the estimated effect.
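
The ATTE weights given in the class parameters (w = D/P(D=1), w_bar = m(X)/P(D=1)) can be plugged into a weighted doubly robust signal. The sketch below assumes the weighted form w*(g1 - g0) + w_bar*(D(Y - g1)/m - (1 - D)(Y - g0)/(1 - m)), which is one common way to write such scores and is not confirmed to be what the library computes. All function names are hypothetical.

```python
def atte_weights(d, m):
    """ATTE weights: w = D / P(D=1) and w_bar = m(X) / P(D=1)."""
    p_treated = sum(d) / len(d)
    w = [di / p_treated for di in d]
    w_bar = [mi / p_treated for mi in m]
    return w, w_bar


def weighted_aipw_signal(y, d, g0, g1, m, w, w_bar):
    """Per-unit signal w*(g1 - g0) + w_bar*(IPW residual correction)."""
    psi = []
    for yi, di, g0i, g1i, mi, wi, wbi in zip(y, d, g0, g1, m, w, w_bar):
        correction = di * (yi - g1i) / mi - (1 - di) * (yi - g0i) / (1.0 - mi)
        psi.append(wi * (g1i - g0i) + wbi * correction)
    return psi


# Toy nuisance predictions (made-up numbers for illustration).
y = [3.0, 1.0, 4.0, 2.0]
d = [1, 0, 1, 0]
g0 = [1.5, 1.0, 2.5, 1.5]
g1 = [3.0, 2.0, 4.5, 2.5]
m = [0.6, 0.4, 0.7, 0.3]

w, w_bar = atte_weights(d, m)
psi = weighted_aipw_signal(y, d, g0, g1, m, w, w_bar)
atte_hat = sum(psi) / len(psi)
```

With these weights the g1 terms cancel for treated units, so each per-unit value equals D(Y - g0)/p - m(1 - D)(Y - g0)/(p(1 - m)) with p = P(D=1), the familiar ATTE score.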

`fit`

Fit nuisance models via cross-fitting.

Parameters

data (Optional[CausalData]) – CausalData container. If None, uses self.data.

Returns

self (IRM) – Fitted estimator.
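
The idea behind cross-fitting can be sketched in a few lines: every observation is predicted by a model trained only on the other folds. The toy "learner" here is just the training-fold mean, and both helper names are hypothetical. The real estimator fits ml_g and ml_m instead.

```python
import random


def make_folds(n, n_folds=5, random_state=None):
    """Assign each of n observations to one of n_folds balanced random folds."""
    rng = random.Random(random_state)
    indices = list(range(n))
    rng.shuffle(indices)
    folds = [0] * n
    for position, idx in enumerate(indices):
        folds[idx] = position % n_folds
    return folds


def cross_fit_mean(y, folds):
    """Out-of-fold predictions from a toy learner: the mean of y outside the fold."""
    preds = [0.0] * len(y)
    for k in set(folds):
        train = [yi for yi, f in zip(y, folds) if f != k]
        fold_mean = sum(train) / len(train)
        for i, f in enumerate(folds):
            if f == k:
                preds[i] = fold_mean
    return preds


y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
folds = make_folds(len(y), n_folds=3, random_state=0)
y_hat = cross_fit_mean(y, folds)
```

Fixing random_state makes the fold assignment reproducible, which is the role the parameter plays in the class signature.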

`gate`

Estimate Group Average Treatment Effects via BLP on orthogonal signal.

Parameters

groups (DataFrame or Series) – Group indicators or labels.
If a single column (Series or 1-col DataFrame) with non-boolean values, it is treated as categorical labels and one-hot encoded.
If multiple columns or boolean/int indicators, it is used as the basis directly.
alpha (float) – Significance level for intervals (passed to BLP).

Returns

BLP – Fitted Best Linear Predictor model.
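
When the groups are mutually exclusive, a least-squares projection of the orthogonal signal on a one-hot basis (without intercept) reduces to within-group means of the signal. The sketch below shows that special case only; the helper names are hypothetical and the fitted BLP object additionally provides inference that is omitted here.

```python
from collections import defaultdict


def one_hot(labels):
    """One-hot encode categorical labels; returns (sorted levels, indicator rows)."""
    levels = sorted(set(labels))
    rows = [[1 if lab == lev else 0 for lev in levels] for lab in labels]
    return levels, rows


def gate_by_group(orth_signal, labels):
    """Group average effects as within-group means of the orthogonal signal."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for psi_i, lab in zip(orth_signal, labels):
        sums[lab] += psi_i
        counts[lab] += 1
    return {lab: sums[lab] / counts[lab] for lab in sums}


psi = [1.0, 3.0, 2.0, 6.0, 4.0]          # toy orthogonal signal
labels = ["a", "a", "b", "b", "b"]       # single categorical column

levels, basis = one_hot(labels)
gates = gate_by_group(psi, labels)
```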

`ml_g`

`ml_m`

`n_folds`

`n_rep`

`normalize_ipw`

`normalize_ipw_effective_`

`orth_signal`

Return the cross-fitted orthogonal signal (psi_b).

Returns

ndarray – The orthogonal signal.
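
For orientation, a sketch of how a point estimate and standard error can be derived from such a signal, assuming the standard doubly robust (AIPW) form for the ATE and a score whose derivative in theta is -1. Function names are hypothetical and weights, trimming and normalization are left out.

```python
import math


def aipw_signal(y, d, g0, g1, m):
    """Per-unit signal g1 - g0 + d*(y - g1)/m - (1 - d)*(y - g0)/(1 - m)."""
    return [
        g1i - g0i + di * (yi - g1i) / mi - (1 - di) * (yi - g0i) / (1.0 - mi)
        for yi, di, g0i, g1i, mi in zip(y, d, g0, g1, m)
    ]


def estimate_from_signal(psi):
    """Point estimate (mean of the signal) and its plug-in standard error."""
    n = len(psi)
    theta = sum(psi) / n
    variance = sum((p - theta) ** 2 for p in psi) / n
    return theta, math.sqrt(variance / n)


# Toy cross-fitted nuisance predictions (made-up numbers).
y = [2.0, 1.0]
d = [1, 0]
g0 = [1.0, 0.5]
g1 = [1.5, 1.5]
m = [0.5, 0.5]

psi = aipw_signal(y, d, g0, g1, m)
ate_hat, se_hat = estimate_from_signal(psi)
```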

`pvalues`

Return the p-values for the estimate.

Returns

ndarray – The p-values.

`random_state`

`relative_baseline_min`

`score`

`se`

Return the standard error of the estimate.

Returns

ndarray – The standard error.

`sensitivity_analysis`

Compute a sensitivity analysis following Chernozhukov et al. (2022).

Parameters

r2_y (float) – Sensitivity parameter for outcome equation (R^2 form, R_Y^2; converted to odds form internally).
r2_d (float) – Sensitivity parameter for treatment equation (R^2 form, R_D^2).
rho (float) – Correlation between unobserved components.
H0 (float) – Null hypothesis for robustness values.
alpha (float) – Significance level for CI bounds.
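
A rough sketch of how these parameters might combine into a bias bound. Only the odds conversion of r2_y is taken from the parameter description above. The product form, the inputs sigma2 and nu2 (assumed to be the two variance-type components of the score), and all function names are assumptions in the spirit of Chernozhukov et al. (2022), not the library's confirmed internals.

```python
import math


def r2_to_odds(r2):
    """Convert an R^2-type parameter to its odds form r2 / (1 - r2)."""
    if not 0.0 <= r2 < 1.0:
        raise ValueError("r2 must lie in [0, 1)")
    return r2 / (1.0 - r2)


def bias_bound(sigma2, nu2, cf_y, cf_d, rho):
    """Assumed bound: |rho| * sqrt(sigma2 * nu2) * sqrt(cf_y * cf_d)."""
    max_bias = math.sqrt(sigma2 * nu2)
    return abs(rho) * max_bias * math.sqrt(cf_y * cf_d)


theta = 1.0
cf_y = r2_to_odds(0.2)   # outcome strength in odds form
cf_d = 0.04              # treatment strength kept in R^2 form (assumption)

width = bias_bound(sigma2=4.0, nu2=2.25, cf_y=cf_y, cf_d=cf_d, rho=1.0)
theta_bounds = (theta - width, theta + width)
```

Setting rho below 1 in absolute value shrinks the bound proportionally, which is why rho = 1 is the conservative choice.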

`summary`

Return a summary DataFrame of the results.

Returns

DataFrame – The results summary.

`trimming_rule`

`trimming_threshold`

`weights`

`cate`

Conditional Average Treatment Effect (CATE) inference methods for causalis.

This submodule provides methods for estimating conditional average treatment effects.

Modules

cate_esimand – IRM-based implementation for estimating CATE (per-observation orthogonal signals).

`cate_esimand`

IRM-based implementation for estimating CATE (per-observation orthogonal signals).

This module provides a function that, given a CausalData object, fits the internal IRM model and augments the data with a new column 'cate' that contains the orthogonal signals (an estimate of the conditional average treatment effect for each unit).

Functions

cate_esimand – Estimate per-observation CATEs using IRM and return a DataFrame with a new 'cate' column.

`cate_esimand`

Estimate per-observation CATEs using IRM and return a DataFrame with a new 'cate' column.

Parameters

data (CausalData) – A CausalData object with defined outcome (outcome), treatment (binary 0/1), and confounders.
ml_g (estimator) – ML learner for outcome regression g(D, X) = E[Y | D, X] supporting fit/predict. Defaults to CatBoostRegressor if None.
ml_m (classifier) – ML learner for propensity m(X) = P[D=1 | X] supporting fit/predict_proba. Defaults to CatBoostClassifier if None.
n_folds (int) – Number of folds for cross-fitting.
n_rep (int) – Number of repetitions for sample splitting.
use_blp (bool) – If True, and X_new is provided, fits a BLP on the orthogonal signal and predicts CATE for X_new. If False (default), uses the in-sample orthogonal signal and appends to data.
X_new (DataFrame) – New covariate matrix for out-of-sample CATE prediction via best linear predictor. Must contain the same feature columns as the confounders in data.

Returns

DataFrame – If use_blp is False: returns a copy of data with a new column 'cate'. If use_blp is True and X_new is provided: returns a DataFrame with 'cate' column for X_new rows.

Raises

ValueError – If treatment is not binary 0/1 or required metadata is missing.
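
The use_blp path can be pictured as a least-squares fit of the orthogonal signal on covariates, followed by prediction on new rows. The sketch below uses a single covariate with an intercept to stay dependency-free. The function names and the one-dimensional setup are hypothetical simplifications.

```python
def fit_blp_1d(x, psi):
    """Least-squares fit psi ~ a + b * x; returns (a, b)."""
    n = len(x)
    x_bar = sum(x) / n
    psi_bar = sum(psi) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (pi - psi_bar) for xi, pi in zip(x, psi))
    b = sxy / sxx
    a = psi_bar - b * x_bar
    return a, b


def predict_blp_1d(coefs, x_new):
    """Predicted CATE for new covariate values."""
    a, b = coefs
    return [a + b * xi for xi in x_new]


x = [0.0, 1.0, 2.0, 3.0]
psi = [1.0, 3.0, 5.0, 7.0]        # toy signal lying exactly on 1 + 2x

coefs = fit_blp_1d(x, psi)
cate_new = predict_blp_1d(coefs, [4.0, 5.0])
```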

`dgp`

Functions

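generate_obs_hte_26 – Observational dataset with nonlinear outcome model, nonlinear treatment assignment, and a heterogeneous (nonlinear) treatment effect tau(X).
generate_obs_hte_26_rich – Observational dataset with richer confounding, nonlinear outcome model, nonlinear treatment assignment, and heterogeneous treatment effects.
generate_obs_hte_binary_26 – Observational binary-outcome dataset with nonlinear confounding and heterogeneous treatment effects.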
obs_linear_26_dataset – A pre-configured observational linear dataset with 5 standard confounders.

`generate_obs_hte_26`

Observational dataset with nonlinear outcome model, nonlinear treatment assignment, and a heterogeneous (nonlinear) treatment effect tau(X). Based on the scenario in notebooks/cases/dml_atte.ipynb.

Parameters

n (int) – Number of samples.
seed (int) – Random seed.
include_oracle (bool) – Whether to include oracle ground-truth columns like 'cate', 'propensity', etc.
return_causal_data (bool) – If True, returns a CausalData object. If False, returns a pandas DataFrame.
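
To make the ingredients concrete (nonlinear assignment, nonlinear outcome, heterogeneous effect, optional oracle columns), here is a toy generator with a single confounder. Every functional form and the function name are invented for illustration and do not reproduce the library's scenario.

```python
import math
import random


def toy_obs_hte(n=1000, seed=0, include_oracle=True):
    """Toy observational DGP with hypothetical functional forms."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.uniform(-2.0, 2.0)
        # Nonlinear treatment assignment through a logistic link.
        propensity = 1.0 / (1.0 + math.exp(-(0.8 * math.sin(x) + 0.5 * x)))
        d = 1 if rng.random() < propensity else 0
        # Heterogeneous, nonlinear treatment effect tau(x).
        cate = 1.0 + 0.5 * x ** 2
        # Nonlinear baseline outcome plus Gaussian noise.
        y = math.cos(x) + x + cate * d + rng.gauss(0.0, 1.0)
        row = {"y": y, "d": d, "x": x}
        if include_oracle:
            row["cate"] = cate
            row["propensity"] = propensity
        rows.append(row)
    return rows


rows = toy_obs_hte(n=500, seed=26)
```

Because x drives both the propensity and the outcome, a naive difference in means between treated and control rows is confounded, which is what makes such data useful for testing estimators against the oracle columns.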

`generate_obs_hte_26_rich`

Observational dataset with richer confounding, nonlinear outcome model, nonlinear treatment assignment, and heterogeneous treatment effects. Adds additional realistic covariates and dependencies to mimic real data.

Parameters

n (int) – Number of samples.
seed (int) – Random seed.
include_oracle (bool) – Whether to include oracle ground-truth columns like 'cate', 'propensity', etc.
return_causal_data (bool) – If True, returns a CausalData object. If False, returns a pandas DataFrame.

`generate_obs_hte_binary_26`

Observational binary-outcome dataset with nonlinear confounding and heterogeneous treatment effects.

This scenario follows the structure of generate_obs_hte_26_rich, but uses a binary outcome model and a modified confounder set.

Parameters

n (int) – Number of samples.
seed (int) – Random seed.
include_oracle (bool) – Whether to include oracle columns like 'cate', 'propensity', etc.
return_causal_data (bool) – If True, returns a CausalData object. If False, returns a pandas DataFrame.

`obs_linear_26_dataset`

A pre-configured observational linear dataset with 5 standard confounders. Based on the scenario in docs/cases/dml_ate.ipynb.

Parameters

n (int) – Number of samples.
seed (int) – Random seed.
include_oracle (bool) – Whether to include oracle ground-truth columns like 'cate', 'propensity', etc.
return_causal_data (bool) – If True, returns a CausalData object. If False, returns a pandas DataFrame.

`gate`

Group Average Treatment Effect (GATE) inference methods for causalis.

This submodule provides methods for estimating group average treatment effects.

Modules

gate_esimand – Group Average Treatment Effect (GATE) estimation using local DML IRM and BLP.

`gate_esimand`

Group Average Treatment Effect (GATE) estimation using local DML IRM and BLP.

Functions

gate_esimand – Estimate Group Average Treatment Effects (GATEs).

`gate_esimand`

Estimate Group Average Treatment Effects (GATEs).

If groups is None, observations are grouped by quantiles of the plugin CATE proxy (g1_hat - g0_hat).
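
A minimal sketch of the default grouping described above: compute the plugin proxy g1_hat - g0_hat and split units into rank-based quantile groups. The function name, the number of groups and the tie handling are assumptions.

```python
def quantile_groups(values, n_groups=4):
    """Assign each value to one of n_groups rank-based quantile groups (0 = lowest)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    groups = [0] * len(values)
    for rank, idx in enumerate(order):
        groups[idx] = rank * n_groups // len(values)
    return groups


g1_hat = [2.0, 1.0, 3.5, 0.5]
g0_hat = [1.0, 0.8, 1.5, 0.9]

# Plugin CATE proxy per unit.
proxy = [a - b for a, b in zip(g1_hat, g0_hat)]

# Two groups: below-median and above-median proxy values.
groups = quantile_groups(proxy, n_groups=2)
```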

`model`

IRM estimator consuming CausalData.

Implements cross-fitted nuisance estimation for g0, g1 and m, and supports ATE/ATTE scores.

Classes

IRM – Interactive Regression Model (IRM) with cross-fitting using CausalData.

`HAS_CATBOOST`

`IRM`

Bases: BaseEstimator

Interactive Regression Model (IRM) with cross-fitting using CausalData.

Parameters

data (CausalData) – Data container with outcome, binary treatment (0/1), and confounders.
ml_g (estimator) – Learner for E[Y|X,D]. If it is a classifier and Y is binary, predict_proba() is used; otherwise predict() is used.
ml_m (classifier) – Learner for E[D|X] (propensity). Must support predict_proba(), or predict() returning values in (0,1).
n_folds (int) – Number of cross-fitting folds.
n_rep (int) – Number of repetitions of sample splitting. Currently only 1 is supported.
normalize_ipw (bool) – Whether to normalize IPW terms within the score. Applied to ATE only. For ATTE, normalization is ignored to preserve the canonical ATTE EIF.
trimming_rule ('truncate') – Trimming approach for propensity scores.
trimming_threshold (float) – Threshold for trimming if rule is "truncate".
weights (Optional[ndarray or Dict]) – Optional weights.
If array of shape (n,), used as ATE weights (w). Assumed E[w|X] = w.
If dict, can contain 'weights' (w) and 'weights_bar' (E[w|X]).
For ATTE, computed internally (w=D/P(D=1), w_bar=m(X)/P(D=1)). Note: If weights depend on treatment or outcome, E[w|X] must be provided for correct sensitivity analysis.
relative_baseline_min (float) – Minimum absolute baseline value used for relative effects. If |mu_c| is below this threshold, relative estimates are set to NaN with a warning.
random_state (Optional[int]) – Random seed for fold creation.

Functions

confint – Compute confidence intervals for the estimated coefficient.
estimate – Compute treatment effects using stored nuisance predictions.
fit – Fit nuisance models via cross-fitting.
gate – Estimate Group Average Treatment Effects via BLP on orthogonal signal.
sensitivity_analysis – Compute a sensitivity analysis following Chernozhukov et al. (2022).

`coef`

Return the estimated coefficient.

Returns

ndarray – The estimated coefficient.

`confint`

Compute confidence intervals for the estimated coefficient.

Parameters

alpha (float) – Significance level.

Returns

DataFrame – DataFrame with confidence intervals.

`data`

`diagnostics_`

Return diagnostic data.

Returns

dict – Dictionary containing 'm_hat', 'g0_hat', 'g1_hat', and 'folds'.

`estimate`

Compute treatment effects using stored nuisance predictions.

Parameters

score (('ATE', 'ATTE')) – Target estimand.
alpha (float) – Significance level for intervals.
diagnostic_data (bool) – Whether to include diagnostic data in the result.

Returns

CausalEstimate – Result container for the estimated effect.

`fit`

Fit nuisance models via cross-fitting.

Parameters

data (Optional[CausalData]) – CausalData container. If None, uses self.data.

Returns

self (IRM) – Fitted estimator.

`gate`

Estimate Group Average Treatment Effects via BLP on orthogonal signal.

Parameters

groups (DataFrame or Series) – Group indicators or labels.
If a single column (Series or 1-col DataFrame) with non-boolean values, it is treated as categorical labels and one-hot encoded.
If multiple columns or boolean/int indicators, it is used as the basis directly.
alpha (float) – Significance level for intervals (passed to BLP).

Returns

BLP – Fitted Best Linear Predictor model.

`ml_g`

`ml_m`

`n_folds`

`n_rep`

`normalize_ipw`

`normalize_ipw_effective_`

`orth_signal`

Return the cross-fitted orthogonal signal (psi_b).

Returns

ndarray – The orthogonal signal.

`pvalues`

Return the p-values for the estimate.

Returns

ndarray – The p-values.

`random_state`

`relative_baseline_min`

`score`

`se`

Return the standard error of the estimate.

Returns

ndarray – The standard error.

`sensitivity_analysis`

Compute a sensitivity analysis following Chernozhukov et al. (2022).

Parameters

r2_y (float) – Sensitivity parameter for outcome equation (R^2 form, R_Y^2; converted to odds form internally).
r2_d (float) – Sensitivity parameter for treatment equation (R^2 form, R_D^2).
rho (float) – Correlation between unobserved components.
H0 (float) – Null hypothesis for robustness values.
alpha (float) – Significance level for CI bounds.

`summary`

Return a summary DataFrame of the results.

Returns

DataFrame – The results summary.

`trimming_rule`

`trimming_threshold`

`weights`

`refutation`

Refutation and robustness utilities for Causalis.

Importing this package exposes the public functions from all refutation submodules (overlap, score, unconfoundedness) so you can access commonly used helpers directly via causalis.refutation.

Modules

Functions

get_sensitivity_summary – Render a single, unified bias-aware summary string.
interpret_sensitivity_analysis – Run sensitivity analysis and return a structured interpretation.
plot_influence_instability – Plot instability diagnostics for per-unit score (EIF moment).
plot_m_overlap – Overlap plot for m(x)=P(D=1|X) with high-res rendering.
plot_propensity_reliability – Plot a propensity calibration reliability diagram.
plot_residual_diagnostics – Plot residual diagnostics for nuisance models.
run_overlap_diagnostics – Run overlap diagnostics from CausalData and CausalEstimate.
run_score_diagnostics – Run score diagnostics from CausalData and CausalEstimate.
run_unconfoundedness_diagnostics – Run unconfoundedness diagnostics from CausalData and CausalEstimate.
sensitivity_analysis – Compute bias-aware components and cache them.
sensitivity_benchmark – Computes a benchmark for a given set of features by refitting a short IRM model (excluding the provided features) and contrasting it with the original (long) model.

`get_sensitivity_summary`

Render a single, unified bias-aware summary string.

If bias-aware components are missing, shows a sampling-only variant with max_bias=0 and then formats via format_bias_aware_summary for consistency.

Parameters

effect_estimation (Dict[str, Any] or Any) – The effect estimation object.
label (str) – The label for the estimand.

Returns

Optional[str] – Formatted summary string or None if extraction fails.

`interpret_sensitivity_analysis`

Run sensitivity analysis and return a structured interpretation.

Parameters

effect_estimation (Dict[str, Any] or Any) – The effect estimation object.
r2_y (float) – Sensitivity parameter for outcome residual confounding strength.
r2_d (float) – Sensitivity parameter for treatment residual confounding strength.
rho (float) – Correlation parameter for unobserved confounding.
H0 (float) – Null hypothesis used for significance checks.
alpha (float) – Significance level.
use_signed_rr (bool) – Whether to use signed rr in the quadratic sensitivity combination.

Returns

Dict[str, Any] – Dictionary with:
- raw: the output of sensitivity_analysis(...)
- interpretation: machine-readable interpretation fields
- summary: compact human-readable interpretation

`overlap`

Modules

overlap_plot – Overlap plot for the propensity score m(x)=P(D=1|X).
overlap_validation – Overlap diagnostics focused on positivity and propensity calibration.
reliability_plot – Reliability diagram for propensity calibration diagnostics.

Functions

plot_m_overlap – Overlap plot for m(x)=P(D=1|X) with high-res rendering.
plot_propensity_reliability – Plot a propensity calibration reliability diagram.
run_overlap_diagnostics – Run overlap diagnostics from CausalData and CausalEstimate.

`overlap_plot`

Functions

plot_m_overlap – Overlap plot for m(x)=P(D=1|X) with high-res rendering.

`plot_m_overlap`

Overlap plot for m(x)=P(D=1|X) with high-res rendering.

Propensity values x lie in [0,1].
Uses a stable NumPy KDE with boundary reflection (no SciPy warnings).
Uses Matplotlib default colors unless color_t/color_c are provided.

Parameters

diag (UnconfoundednessDiagnosticData or CausalEstimate) – Diagnostic data directly, or an estimate containing diagnostic_data with m_hat and d.
clip (tuple) – Quantiles to clip for KDE range.
bins (str or int) – Histogram bins.
kde (bool) – Whether to show KDE.
shade_overlap (bool) – Whether to shade the overlap area.
ax (Axes) – Existing axes to plot on.
figsize (tuple) – Figure size.
dpi (int) – Dots per inch.
font_scale (float) – Font scaling factor.
save (str) – Path to save the figure.
save_dpi (int) – DPI for saving.
transparent (bool) – Whether to save with transparency.
color_t (color) – Color for treated group.
color_c (color) – Color for control group.

Returns

Figure – The generated figure.
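
Boundary reflection for a density on [0, 1] can be sketched as follows: each sample contributes its own Gaussian kernel plus mirror images about 0 and 1, so mass that would leak past the boundaries is folded back in. This pure-Python version with a fixed bandwidth is an illustrative assumption, not the function's NumPy implementation.

```python
import math


def reflected_kde(samples, grid, bandwidth=0.05):
    """Gaussian KDE on [0, 1] with reflection at both boundaries."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    density = []
    for x in grid:
        total = 0.0
        for s in samples:
            # The sample itself plus its mirror images about 0 and about 1.
            for center in (s, -s, 2.0 - s):
                u = (x - center) / bandwidth
                total += math.exp(-0.5 * u * u)
        density.append(norm * total)
    return density


m_hat_treated = [0.02, 0.55, 0.60, 0.72, 0.97]   # toy propensity scores
grid = [i / 200 for i in range(201)]
density = reflected_kde(m_hat_treated, grid)
```

Without the mirrored terms, the samples at 0.02 and 0.97 would lose a visible share of their mass outside [0, 1] and the curve would dip artificially near the edges.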

`overlap_validation`

Overlap diagnostics focused on positivity and propensity calibration.

Functions

run_overlap_diagnostics – Run overlap diagnostics from CausalData and CausalEstimate.

`run_overlap_diagnostics`

Run overlap diagnostics from CausalData and CausalEstimate.

`plot_m_overlap`

Overlap plot for m(x)=P(D=1|X) with high-res rendering.

Propensity values x lie in [0,1].
Uses a stable NumPy KDE with boundary reflection (no SciPy warnings).
Uses Matplotlib default colors unless color_t/color_c are provided.

Parameters

diag (UnconfoundednessDiagnosticData or CausalEstimate) – Diagnostic data directly, or an estimate containing diagnostic_data with m_hat and d.
clip (tuple) – Quantiles to clip for KDE range.
bins (str or int) – Histogram bins.
kde (bool) – Whether to show KDE.
shade_overlap (bool) – Whether to shade the overlap area.
ax (Axes) – Existing axes to plot on.
figsize (tuple) – Figure size.
dpi (int) – Dots per inch.
font_scale (float) – Font scaling factor.
save (str) – Path to save the figure.
save_dpi (int) – DPI for saving.
transparent (bool) – Whether to save with transparency.
color_t (color) – Color for treated group.
color_c (color) – Color for control group.

Returns

Figure – The generated figure.

`plot_propensity_reliability`

Plot a propensity calibration reliability diagram.

Parameters

estimate (CausalEstimate) – Estimate with diagnostic data (m_hat; optionally m_hat_raw, d).
data (CausalData) – Optional fallback source for treatment d when not stored in diagnostic data.
n_bins (int) – Number of calibration bins used to build the reliability table.
show_recalibration (bool) – Overlay logistic recalibration curve sigmoid(alpha + beta * logit(p)) when parameters are available.
annotate_metrics (bool) – Annotate ECE and logistic recalibration parameters on the figure.
ax (Axes) – Existing axes to plot on.
figsize (tuple) – Figure size.
dpi (int) – Dots per inch.
font_scale (float) – Font scaling factor.
point_color (color) – Marker color for binned reliability points.
diagonal_color (color) – Color for the perfect calibration diagonal.
recalibration_color (color) – Color for the logistic recalibration curve.
min_marker_size (float) – Base marker area for non-empty bins.
marker_size_scale (float) – Additional marker area scaled by bin count share.
save (str) – Path to save the figure.
save_dpi (int) – DPI for saving.
transparent (bool) – Whether to save with transparency.

Returns

Figure – The generated figure.
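
The reliability table behind such a diagram, and the ECE it annotates, can be sketched as below. Equal-width bins and the count-weighted absolute gap are assumptions about the binning and metric, and the function name is hypothetical.

```python
def reliability_table(m_hat, d, n_bins=10):
    """Bin predicted propensities and compare them with observed treatment rates.

    Returns (rows, ece), where ece is the count-weighted mean absolute gap
    between the observed treated share and the mean prediction per bin.
    """
    sums_p = [0.0] * n_bins
    sums_d = [0.0] * n_bins
    counts = [0] * n_bins
    for p, di in zip(m_hat, d):
        b = min(int(p * n_bins), n_bins - 1)   # p == 1.0 goes into the last bin
        sums_p[b] += p
        sums_d[b] += di
        counts[b] += 1

    n = len(m_hat)
    rows, ece = [], 0.0
    for b in range(n_bins):
        if counts[b] == 0:
            continue                            # empty bins are skipped
        mean_pred = sums_p[b] / counts[b]
        frac_treated = sums_d[b] / counts[b]
        rows.append({"bin": b, "count": counts[b],
                     "mean_pred": mean_pred, "frac_treated": frac_treated})
        ece += counts[b] / n * abs(frac_treated - mean_pred)
    return rows, ece


m_hat = [0.1, 0.2, 0.8, 0.9]
d = [0, 0, 1, 1]
rows, ece = reliability_table(m_hat, d, n_bins=2)
```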

`reliability_plot`

Reliability diagram for propensity calibration diagnostics.

Functions

plot_propensity_reliability – Plot a propensity calibration reliability diagram.

`plot_propensity_reliability`

Plot a propensity calibration reliability diagram.

Parameters

estimate (CausalEstimate) – Estimate with diagnostic data (m_hat; optionally m_hat_raw, d).
data (CausalData) – Optional fallback source for treatment d when not stored in diagnostic data.
n_bins (int) – Number of calibration bins used to build the reliability table.
show_recalibration (bool) – Overlay logistic recalibration curve sigmoid(alpha + beta * logit(p)) when parameters are available.
annotate_metrics (bool) – Annotate ECE and logistic recalibration parameters on the figure.
ax (Axes) – Existing axes to plot on.
figsize (tuple) – Figure size.
dpi (int) – Dots per inch.
font_scale (float) – Font scaling factor.
point_color (color) – Marker color for binned reliability points.
diagonal_color (color) – Color for the perfect calibration diagonal.
recalibration_color (color) – Color for the logistic recalibration curve.
min_marker_size (float) – Base marker area for non-empty bins.
marker_size_scale (float) – Additional marker area scaled by bin count share.
save (str) – Path to save the figure.
save_dpi (int) – DPI for saving.
transparent (bool) – Whether to save with transparency.

Returns

Figure – The generated figure.

`run_overlap_diagnostics`

Run overlap diagnostics from CausalData and CausalEstimate.

`plot_influence_instability`

Plot instability diagnostics for per-unit score (EIF moment).

Panels

Histogram of |psi_i| (optional log-x scale).
Scatter of |psi_i| versus clipped propensity m_i.
(optional) Histogram of IPW terms.
(optional) ESS ratio bars for treated/control weights.

Parameters

estimate (CausalEstimate) – Estimate with diagnostic data (m_hat, g0_hat; optionally y, d, g1_hat, psi).
data (CausalData) – Optional fallback source for y and d if not stored in diagnostic data.
trimming_threshold (float) – Propensity clipping threshold. If omitted, uses diagnostic/model defaults.
use_estimator_psi (bool) – Use estimator-provided diagnostic_data.psi when available; otherwise reconstruct score.
include_ipw (bool) – Add IPW-term histogram and ESS ratio bar panels.
bins (Any) – Histogram bins for non-log histograms.
log_hist (bool) – Use log-scaled x-axis bins for |psi_i| histogram when possible.
scatter_log_y (bool) – Plot |psi_i| on log scale in the scatter panel.
top_k (int) – Highlight top-k largest |psi_i| in the scatter panel.
figsize (tuple) – Figure size. Defaults to (12, 8) with IPW panels, (11, 4.6) otherwise.
dpi (int) – Dots per inch.
font_scale (float) – Font scaling factor.
save (str) – Path to save the figure.
save_dpi (int) – DPI for saving.
transparent (bool) – Whether to save with transparency.

Returns

Figure – The generated figure.
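
The ESS ratio shown in the optional panel can be understood through the Kish effective sample size, (sum of weights)^2 / (sum of squared weights), divided by the number of units in the group. Whether the plot uses exactly this definition is an assumption, and the function name is hypothetical.

```python
def ess_ratio(weights):
    """Kish effective sample size divided by the number of weights."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return (total * total / total_sq) / len(weights)


# Inverse-propensity weights 1 / m for treated units (toy values).
balanced = [1.0 / m for m in (0.5, 0.5, 0.5, 0.5)]
unstable = [1.0 / m for m in (0.5, 0.5, 0.5, 0.05)]

ratio_balanced = ess_ratio(balanced)
ratio_unstable = ess_ratio(unstable)
```

Equal weights give a ratio of 1, while a single unit with a propensity near zero dominates the sum and pulls the ratio well below 1.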

`plot_m_overlap`

Overlap plot for m(x)=P(D=1|X) with high-res rendering.

Propensity values x lie in [0,1].
Uses a stable NumPy KDE with boundary reflection (no SciPy warnings).
Uses Matplotlib default colors unless color_t/color_c are provided.

Parameters

diag (UnconfoundednessDiagnosticData or CausalEstimate) – Diagnostic data directly, or an estimate containing diagnostic_data with m_hat and d.
clip (tuple) – Quantiles to clip for KDE range.
bins (str or int) – Histogram bins.
kde (bool) – Whether to show KDE.
shade_overlap (bool) – Whether to shade the overlap area.
ax (Axes) – Existing axes to plot on.
figsize (tuple) – Figure size.
dpi (int) – Dots per inch.
font_scale (float) – Font scaling factor.
save (str) – Path to save the figure.
save_dpi (int) – DPI for saving.
transparent (bool) – Whether to save with transparency.
color_t (color) – Color for treated group.
color_c (color) – Color for control group.

Returns

Figure – The generated figure.

`plot_propensity_reliability`

Plot a propensity calibration reliability diagram.

Parameters

estimate (CausalEstimate) – Estimate with diagnostic data (m_hat; optionally m_hat_raw, d).
data (CausalData) – Optional fallback source for treatment d when not stored in diagnostic data.
n_bins (int) – Number of calibration bins used to build the reliability table.
show_recalibration (bool) – Overlay logistic recalibration curve sigmoid(alpha + beta * logit(p)) when parameters are available.
annotate_metrics (bool) – Annotate ECE and logistic recalibration parameters on the figure.
ax (Axes) – Existing axes to plot on.
figsize (tuple) – Figure size.
dpi (int) – Dots per inch.
font_scale (float) – Font scaling factor.
point_color (color) – Marker color for binned reliability points.
diagonal_color (color) – Color for the perfect calibration diagonal.
recalibration_color (color) – Color for the logistic recalibration curve.
min_marker_size (float) – Base marker area for non-empty bins.
marker_size_scale (float) – Additional marker area scaled by bin count share.
save (str) – Path to save the figure.
save_dpi (int) – DPI for saving.
transparent (bool) – Whether to save with transparency.

Returns

Figure – The generated figure.

`plot_residual_diagnostics`

Plot residual diagnostics for nuisance models.

Panels

Treated-only: u1 = y - g1 vs g1.
Control-only: u0 = y - g0 vs g0.
Binned calibration error: E[d - m | m in bin] vs binned m.

Parameters

estimate (CausalEstimate) – Estimate with diagnostic data (m_hat, g0_hat; optionally g1_hat, y, d).
data (CausalData) – Optional fallback source for y and d when missing in diagnostic data.
clip_propensity (float) – Clipping epsilon for propensity values in the treatment-residual panel.
n_bins (int) – Number of quantile bins for the binned-mean trend overlays.
marker_size (float) – Scatter marker size.
alpha (float) – Scatter opacity.
figsize (tuple) – Figure size.
dpi (int) – Dots per inch.
font_scale (float) – Font scaling factor.
save (str) – Path to save the figure.
save_dpi (int) – DPI for saving.
transparent (bool) – Whether to save with transparency.

Returns

Figure – The generated figure.

`run_overlap_diagnostics`

Run overlap diagnostics from CausalData and CausalEstimate.

`run_score_diagnostics`

Run score diagnostics from CausalData and CausalEstimate.

`run_unconfoundedness_diagnostics`

Run unconfoundedness diagnostics from CausalData and CausalEstimate.

`score`

Modules

influence_plot – Influence/instability plots for score-based diagnostics.
residual_plots – Residual diagnostic plots for nuisance models g0/g1 and m.
score_validation – Score diagnostics focused on orthogonality and EIF stability.

Functions

plot_influence_instability – Plot instability diagnostics for per-unit score (EIF moment).
plot_residual_diagnostics – Plot residual diagnostics for nuisance models.
run_score_diagnostics – Run score diagnostics from CausalData and CausalEstimate.

`influence_plot`

Influence/instability plots for score-based diagnostics.

Functions

plot_influence_instability – Plot instability diagnostics for per-unit score (EIF moment).

`plot_influence_instability`

Plot instability diagnostics for per-unit score (EIF moment).

Panels

Histogram of |psi_i| (optional log-x scale).
Scatter of |psi_i| versus clipped propensity m_i.
(optional) Histogram of IPW terms.
(optional) ESS ratio bars for treated/control weights.

Parameters

estimate (CausalEstimate) – Estimate with diagnostic data (m_hat, g0_hat; optionally y, d, g1_hat, psi).
data (CausalData) – Optional fallback source for y and d if not stored in diagnostic data.
trimming_threshold (float) – Propensity clipping threshold. If omitted, uses diagnostic/model defaults.
use_estimator_psi (bool) – Use estimator-provided diagnostic_data.psi when available; otherwise reconstruct score.
include_ipw (bool) – Add IPW-term histogram and ESS ratio bar panels.
bins (Any) – Histogram bins for non-log histograms.
log_hist (bool) – Use log-scaled x-axis bins for |psi_i| histogram when possible.
scatter_log_y (bool) – Plot |psi_i| on log scale in the scatter panel.
top_k (int) – Highlight top-k largest |psi_i| in the scatter panel.
figsize (tuple) – Figure size. Defaults to (12, 8) with IPW panels, (11, 4.6) otherwise.
dpi (int) – Dots per inch.
font_scale (float) – Font scaling factor.
save (str) – Path to save the figure.
save_dpi (int) – DPI for saving.
transparent (bool) – Whether to save with transparency.

Returns

Figure – The generated figure.

`plot_influence_instability`

Plot instability diagnostics for per-unit score (EIF moment).

Panels

Histogram of |psi_i| (optional log-x scale).
Scatter of |psi_i| versus clipped propensity m_i.
(optional) Histogram of IPW terms.
(optional) ESS ratio bars for treated/control weights.

Parameters

estimate (CausalEstimate) – Estimate with diagnostic data (m_hat, g0_hat; optionally y, d, g1_hat, psi).
data (CausalData) – Optional fallback source for y and d if not stored in diagnostic data.
trimming_threshold (float) – Propensity clipping threshold. If omitted, uses diagnostic/model defaults.
use_estimator_psi (bool) – Use estimator-provided diagnostic_data.psi when available; otherwise reconstruct score.
include_ipw (bool) – Add IPW-term histogram and ESS ratio bar panels.
bins (Any) – Histogram bins for non-log histograms.
log_hist (bool) – Use log-scaled x-axis bins for |psi_i| histogram when possible.
scatter_log_y (bool) – Plot |psi_i| on log scale in the scatter panel.
top_k (int) – Highlight top-k largest |psi_i| in the scatter panel.
figsize (tuple) – Figure size. Defaults to (12, 8) with IPW panels, (11, 4.6) otherwise.
dpi (int) – Dots per inch.
font_scale (float) – Font scaling factor.
save (str) – Path to save the figure.
save_dpi (int) – DPI for saving.
transparent (bool) – Whether to save with transparency.

Returns

Figure – The generated figure.

`plot_residual_diagnostics`

Plot residual diagnostics for nuisance models.

Panels

Treated-only: u1 = y - g1 vs g1.
Control-only: u0 = y - g0 vs g0.
Binned calibration error: E[d - m | m in bin] vs binned m.

Parameters

estimate (CausalEstimate) – Estimate with diagnostic data (m_hat, g0_hat; optionally g1_hat, y, d).
data (CausalData) – Optional fallback source for y and d when missing in diagnostic data.
clip_propensity (float) – Clipping epsilon for propensity values in the treatment-residual panel.
n_bins (int) – Number of quantile bins for the binned-mean trend overlays.
marker_size (float) – Scatter marker size.
alpha (float) – Scatter opacity.
figsize (tuple) – Figure size.
dpi (int) – Dots per inch.
font_scale (float) – Font scaling factor.
save (str) – Path to save the figure.
save_dpi (int) – DPI for saving.
transparent (bool) – Whether to save with transparency.

Returns

Figure – The generated figure.

`residual_plots`

Residual diagnostic plots for nuisance models g0/g1 and m.

Functions

plot_residual_diagnostics – Plot residual diagnostics for nuisance models.

`plot_residual_diagnostics`

Plot residual diagnostics for nuisance models.

Panels

Treated-only: u1 = y - g1 vs g1.
Control-only: u0 = y - g0 vs g0.
Binned calibration error: E[d - m | m in bin] vs binned m.

Parameters

estimate (CausalEstimate) – Estimate with diagnostic data (m_hat, g0_hat; optionally g1_hat, y, d).
data (CausalData) – Optional fallback source for y and d when missing in diagnostic data.
clip_propensity (float) – Clipping epsilon for propensity values in the treatment-residual panel.
n_bins (int) – Number of quantile bins for the binned-mean trend overlays.
marker_size (float) – Scatter marker size.
alpha (float) – Scatter opacity.
figsize (tuple) – Figure size.
dpi (int) – Dots per inch.
font_scale (float) – Font scaling factor.
save (str) – Path to save the figure.
save_dpi (int) – DPI for saving.
transparent (bool) – Whether to save with transparency.

Returns

Figure – The generated figure.

`run_score_diagnostics`

Run score diagnostics from CausalData and CausalEstimate.

`score_validation`

Score diagnostics focused on orthogonality and EIF stability.

Functions

run_score_diagnostics – Run score diagnostics from CausalData and CausalEstimate.

`run_score_diagnostics`

Run score diagnostics from CausalData and CausalEstimate.

`sensitivity_analysis`

Compute bias-aware components and cache them.

Parameters

effect_estimation (Dict[str, Any] or Any) – The effect estimation object.
r2_y (float) – Sensitivity parameter for the outcome (R^2 form, R_Y^2; converted to odds form internally).
r2_d (float) – Sensitivity parameter for the treatment (R^2 form, R_D^2).
rho (float) – Correlation parameter.
H0 (float) – Null hypothesis for robustness values.
alpha (float) – Significance level.
use_signed_rr (bool) – Whether to use signed rr in the quadratic combination of sensitivity components. If True and m_alpha/rr are available, the bias bound is computed via the per-unit quadratic form and RV/RVa are not reported.

Returns

dict – Dictionary with bias-aware results:
- theta, se, alpha, z
- sampling_ci
- theta_bounds_cofounding = (theta - bound_width, theta + bound_width)
- bias_aware_ci = faithful CI for the bounds
- max_bias and components (sigma2, nu2)
- params (r2_y, r2_d, rho, use_signed_rr)
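
The r2_y description mentions an internal conversion from R^2 form to odds form. The reference does not spell out the formula, so the sketch below assumes the common convention odds = R^2 / (1 - R^2); the helper name `r2_to_odds` is hypothetical and is not part of the documented API.

```python
def r2_to_odds(r2: float) -> float:
    """Convert an R^2-style strength in [0, 1) to odds form r2 / (1 - r2).

    Assumption: this is the conventional odds transform; the library's
    exact internal formula is not shown in this reference.
    """
    if not 0.0 <= r2 < 1.0:
        raise ValueError("r2 must lie in [0, 1)")
    return r2 / (1.0 - r2)
```

Under this convention a value such as r2_y = 0.2 maps to 0.25, and the transform grows without bound as R^2 approaches 1, which is why values at or above 1 are rejected.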

`sensitivity_benchmark`

Computes a benchmark for a given set of features by refitting a short IRM model (excluding the provided features) and contrasting it with the original (long) model. Returns a DataFrame containing r2_y, r2_d, rho and the change in estimates.

Parameters

effect_estimation (dict) – A dictionary containing the fitted IRM model under the key 'model'.
benchmarking_set (list[str]) – List of confounder names to be used for benchmarking (to be removed in the short model).
fit_args (dict) – Legacy name for additional keyword arguments passed to IRM.estimate(...) on the short model. If score is omitted, the long-model score is reused.

Returns

DataFrame – A one-row DataFrame indexed by the treatment name with columns:
r2_y, r2_d, rho: residual-based benchmarking strengths
theta_long, theta_short, delta: effect estimates and their change (long - short)

`unconfoundedness`

Modules

sensitivity – Sensitivity functions refactored into a dedicated module.
unconfoundedness_validation – Unconfoundedness diagnostics focused on covariate balance (SMD).

Functions

compute_bias_aware_ci – Compute bias-aware confidence intervals.
get_sensitivity_summary – Render a single, unified bias-aware summary string.
run_unconfoundedness_diagnostics – Run unconfoundedness diagnostics from CausalData and CausalEstimate.
sensitivity_analysis – Compute bias-aware components and cache them.
sensitivity_benchmark – Computes a benchmark for a given set of features by refitting a short IRM model (excluding the provided features) and contrasting it with the original (long) model.

`compute_bias_aware_ci`

Compute bias-aware confidence intervals.

Returns a dict with:

theta, se, alpha, z
sampling_ci
theta_bounds_cofounding = [theta_lower, theta_upper] = theta ± bound_width
bias_aware_ci = [theta - (bound_width + z*se), theta + (bound_width + z*se)]
max_bias_base, max_bias, bound_width and components (sigma2, nu2)
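
The interval arithmetic listed above can be sketched as follows. This is a minimal illustration rather than the library's implementation: the function name `bias_aware_interval` is hypothetical, bound_width is taken as already computed, and z is assumed to be the two-sided standard normal quantile at level alpha.

```python
from statistics import NormalDist


def bias_aware_interval(theta, se, bound_width, alpha=0.05):
    """Combine sampling uncertainty (z*se) with a bias bound (bound_width)."""
    # Assumption: z is the two-sided standard normal critical value.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = bound_width + z * se
    return {
        "theta": theta,
        "se": se,
        "alpha": alpha,
        "z": z,
        # Sampling-only interval: ignores confounding bias entirely.
        "sampling_ci": (theta - z * se, theta + z * se),
        # Range of point estimates compatible with the bias bound.
        "theta_bounds_cofounding": (theta - bound_width, theta + bound_width),
        # Widest interval: bias bound plus sampling uncertainty.
        "bias_aware_ci": (theta - half_width, theta + half_width),
    }
```

With bound_width = 0 the bias-aware interval collapses to the sampling interval, which matches the sampling-only fallback described for get_sensitivity_summary.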

Parameters

effect_estimation (Dict[str, Any] or Any) – The effect estimation object.
r2_y (float) – Sensitivity parameter for the outcome (R^2 form, R_Y^2).
r2_d (float) – Sensitivity parameter for the treatment (R^2 form, R_D^2).
rho (float) – Correlation parameter.
H0 (float) – Null hypothesis for robustness values.
alpha (float) – Significance level.
use_signed_rr (bool) – Whether to use signed rr in the quadratic combination of sensitivity components. If True and m_alpha/rr are available, the bias bound is computed via the per-unit quadratic form and RV/RVa are not reported.

Returns

dict – Dictionary with bias-aware results.

`get_sensitivity_summary`

Render a single, unified bias-aware summary string.

If bias-aware components are missing, shows a sampling-only variant with max_bias=0 and then formats via format_bias_aware_summary for consistency.

Parameters

effect_estimation (Dict[str, Any] or Any) – The effect estimation object.
label (str) – The label for the estimand.

Returns

Optional[str] – Formatted summary string or None if extraction fails.

`run_unconfoundedness_diagnostics`

Run unconfoundedness diagnostics from CausalData and CausalEstimate.

`sensitivity`

Sensitivity functions refactored into a dedicated module.

This module centralizes bias-aware sensitivity helpers and the public entry points used by refutation utilities for unconfoundedness.

Functions

get_sensitivity_summary – Render a single, unified bias-aware summary string.
interpret_sensitivity_analysis – Run sensitivity analysis and return a structured interpretation.
sensitivity_analysis – Compute bias-aware components and cache them.
sensitivity_benchmark – Computes a benchmark for a given set of features by refitting a short IRM model (excluding the provided features) and contrasting it with the original (long) model.

`get_sensitivity_summary`

Render a single, unified bias-aware summary string.

If bias-aware components are missing, shows a sampling-only variant with max_bias=0 and then formats via format_bias_aware_summary for consistency.

Parameters

effect_estimation (Dict[str, Any] or Any) – The effect estimation object.
label (str) – The label for the estimand.

Returns

Optional[str] – Formatted summary string or None if extraction fails.

`interpret_sensitivity_analysis`

Run sensitivity analysis and return a structured interpretation.

Parameters

effect_estimation (Dict[str, Any] or Any) – The effect estimation object.
r2_y (float) – Sensitivity parameter for outcome residual confounding strength.
r2_d (float) – Sensitivity parameter for treatment residual confounding strength.
rho (float) – Correlation parameter for unobserved confounding.
H0 (float) – Null hypothesis used for significance checks.
alpha (float) – Significance level.
use_signed_rr (bool) – Whether to use signed rr in the quadratic sensitivity combination.

Returns

Dict[str, Any] – Dictionary with:
- raw: the output of sensitivity_analysis(...)
- interpretation: machine-readable interpretation fields
- summary: compact human-readable interpretation

`sensitivity_analysis`

Compute bias-aware components and cache them.

Parameters

effect_estimation (Dict[str, Any] or Any) – The effect estimation object.
r2_y (float) – Sensitivity parameter for the outcome (R^2 form, R_Y^2; converted to odds form internally).
r2_d (float) – Sensitivity parameter for the treatment (R^2 form, R_D^2).
rho (float) – Correlation parameter.
H0 (float) – Null hypothesis for robustness values.
alpha (float) – Significance level.
use_signed_rr (bool) – Whether to use signed rr in the quadratic combination of sensitivity components. If True and m_alpha/rr are available, the bias bound is computed via the per-unit quadratic form and RV/RVa are not reported.

Returns

dict – Dictionary with bias-aware results:
- theta, se, alpha, z
- sampling_ci
- theta_bounds_cofounding = (theta - bound_width, theta + bound_width)
- bias_aware_ci = faithful CI for the bounds
- max_bias and components (sigma2, nu2)
- params (r2_y, r2_d, rho, use_signed_rr)

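`sensitivity_benchmark`

Computes a benchmark for a given set of features by refitting a short IRM model (excluding the provided features) and contrasting it with the original (long) model. Returns a DataFrame containing r2_y, r2_d, rho and the change in estimates.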

Parameters

effect_estimation (dict) – A dictionary containing the fitted IRM model under the key 'model'.
benchmarking_set (list[str]) – List of confounder names to be used for benchmarking (to be removed in the short model).
fit_args (dict) – Legacy name for additional keyword arguments passed to IRM.estimate(...) on the short model. If score is omitted, the long-model score is reused.

Returns

DataFrame – A one-row DataFrame indexed by the treatment name with columns:
r2_y, r2_d, rho: residual-based benchmarking strengths
theta_long, theta_short, delta: effect estimates and their change (long - short)

`sensitivity_analysis`

Compute bias-aware components and cache them.

Parameters

effect_estimation (Dict[str, Any] or Any) – The effect estimation object.
r2_y (float) – Sensitivity parameter for the outcome (R^2 form, R_Y^2; converted to odds form internally).
r2_d (float) – Sensitivity parameter for the treatment (R^2 form, R_D^2).
rho (float) – Correlation parameter.
H0 (float) – Null hypothesis for robustness values.
alpha (float) – Significance level.
use_signed_rr (bool) – Whether to use signed rr in the quadratic combination of sensitivity components. If True and m_alpha/rr are available, the bias bound is computed via the per-unit quadratic form and RV/RVa are not reported.

Returns

dict – Dictionary with bias-aware results:
- theta, se, alpha, z
- sampling_ci
- theta_bounds_cofounding = (theta - bound_width, theta + bound_width)
- bias_aware_ci = faithful CI for the bounds
- max_bias and components (sigma2, nu2)
- params (r2_y, r2_d, rho, use_signed_rr)

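`sensitivity_benchmark`

Computes a benchmark for a given set of features by refitting a short IRM model (excluding the provided features) and contrasting it with the original (long) model. Returns a DataFrame containing r2_y, r2_d, rho and the change in estimates.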

Parameters

effect_estimation (dict) – A dictionary containing the fitted IRM model under the key 'model'.
benchmarking_set (list[str]) – List of confounder names to be used for benchmarking (to be removed in the short model).
fit_args (dict) – Legacy name for additional keyword arguments passed to IRM.estimate(...) on the short model. If score is omitted, the long-model score is reused.

Returns

DataFrame – A one-row DataFrame indexed by the treatment name with columns:
r2_y, r2_d, rho: residual-based benchmarking strengths
theta_long, theta_short, delta: effect estimates and their change (long - short)

`unconfoundedness_validation`

Unconfoundedness diagnostics focused on covariate balance (SMD).

Functions

run_unconfoundedness_diagnostics – Run unconfoundedness diagnostics from CausalData and CausalEstimate.

`run_unconfoundedness_diagnostics`

Run unconfoundedness diagnostics from CausalData and CausalEstimate.
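
As background for the balance metric named above, the standardized mean difference (SMD) of a single covariate can be sketched as below. This is the textbook unweighted definition and an assumption about what the diagnostics report; the function name `standardized_mean_difference` is hypothetical, and any weighting the library applies is not reproduced here.

```python
import math
from statistics import mean, variance


def standardized_mean_difference(x, d):
    """Unweighted SMD of covariate x between treated (d=1) and control (d=0).

    Uses the pooled standard deviation sqrt((var_t + var_c) / 2).
    Each group needs at least two observations for a sample variance.
    """
    treated = [x_i for x_i, d_i in zip(x, d) if d_i == 1]
    control = [x_i for x_i, d_i in zip(x, d) if d_i == 0]
    pooled_sd = math.sqrt((variance(treated) + variance(control)) / 2.0)
    if pooled_sd == 0.0:
        # No spread in either group: the ratio is undefined.
        return math.nan
    return (mean(treated) - mean(control)) / pooled_sd
```

A common rule of thumb treats an absolute SMD below roughly 0.1 as adequate balance, though the thresholds used by the diagnostics are not stated in this reference.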