Submodule
causalis.scenarios.did.refutation.diagnostics

diagnostics

Submodule causalis.scenarios.did.refutation.diagnostics with no child pages and 7 documented members.

data
causalis.scenarios.did.refutation.diagnostics.BasePeriod

BasePeriod

Value: None


function
causalis.scenarios.did.refutation.diagnostics.did_support_table

did_support_table

Return Callaway & Sant’Anna ATT(g,t) support under the requested model policy.

This function identifies the cohort-time cells available for estimation and checks whether each cell has enough units to form complete treated/control pairs. A unit is "complete" if it is observed in both the base period and the target period.

Parameters

data : PanelDataDID

The validated panel data object.

control_group : {"not_yet_or_never", "never_treated"}, default "not_yet_or_never"

The definition of the comparison group.

anticipation : int, default 0

Number of periods before treatment to exclude from the control group due to potential anticipation effects.

base_period : {"universal", "varying"}, default "universal"

Whether to use a fixed base period (universal) or a period-specific one (varying) for each target period.

include_pre_periods : bool, default False

Whether to include pre-treatment periods (useful for placebo tests).

Returns

pd.DataFrame

A table of support metrics for each cohort-time cell:

- cohort: The treatment group.
- time: The calendar period.
- base_time: The period used as a baseline for the difference.
- is_supported: Whether the cell has sufficient data for estimation.
- n_treated_complete: Number of treated units observed in both periods.
- n_control_complete: Number of control units observed in both periods.
- treated_completion_rate: Share of cohort units that are complete.
- control_completion_rate: Share of control units that are complete.

Notes

The Callaway & Sant'Anna (2021) estimator requires that, for each target parameter ATT(g,t), there exists a set of units in the comparison group that are also observed in the base period g-1 (or t-1 for varying base periods).

Examples
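
The completeness rule described above can be illustrated without the library. The sketch below is not the implementation of did_support_table: the tuple layout, the helper name support_for_cell, the "first treated period after t" control rule (anticipation 0), and the "both sides non-empty" support rule are all assumptions made for illustration.

```python
# Toy long-format panel: (unit_id, cohort, period).
# cohort is the first treated period; None marks a never-treated unit.
OBSERVATIONS = [
    ("a", 3, 2), ("a", 3, 3), ("a", 3, 4),
    ("b", 3, 2), ("b", 3, 3),                  # b is not observed in period 4
    ("c", None, 2), ("c", None, 4),
    ("d", None, 2), ("d", None, 3), ("d", None, 4),
    ("e", 5, 2), ("e", 5, 4),                  # first treated in 5: not yet treated at t=4
    ("f", 4, 2), ("f", 4, 4),                  # first treated in 4: already treated at t=4
]


def support_for_cell(observations, g, t, base_time):
    """Count complete treated/control units for one (g, t) cell (hypothetical helper)."""
    cohort_of, periods_of = {}, {}
    for unit, cohort, period in observations:
        cohort_of[unit] = cohort
        periods_of.setdefault(unit, set()).add(period)

    def is_complete(unit):
        # Complete = observed in both the base period and the target period.
        return {base_time, t} <= periods_of[unit]

    treated = [u for u, c in cohort_of.items() if c == g]
    # Assumed "not_yet_or_never" rule with anticipation = 0:
    # never-treated units plus units whose first treated period is after t.
    control = [u for u, c in cohort_of.items() if c is None or c > t]

    n_treated = sum(is_complete(u) for u in treated)
    n_control = sum(is_complete(u) for u in control)
    return {
        "cohort": g,
        "time": t,
        "base_time": base_time,
        "n_treated_complete": n_treated,
        "n_control_complete": n_control,
        "treated_completion_rate": n_treated / len(treated) if treated else 0.0,
        "control_completion_rate": n_control / len(control) if control else 0.0,
        # Assumption: a cell counts as supported when both sides are non-empty.
        "is_supported": n_treated > 0 and n_control > 0,
    }


# Universal base period for cohort g=3 is g-1 = 2.
cell = support_for_cell(OBSERVATIONS, g=3, t=4, base_time=2)
print(cell)
```

In this toy panel, unit b drops out of the treated side because it is missing in period 4, and unit f is excluded from the comparison group because it is already treated at t=4.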

function
causalis.scenarios.did.refutation.diagnostics.raw_did_event_study_table

raw_did_event_study_table

Return unadjusted DID event-study cells from the validated panel.

This function calculates simple mean differences between treated and control groups across different event-time periods. These are “raw” estimates without covariate adjustment or IPW/DR weighting.

Parameters

data : PanelDataDID

The validated panel data object.

control_group : {"not_yet_or_never", "never_treated"}, default "not_yet_or_never"

The definition of the comparison group.

anticipation : int, default 0

Number of periods before treatment to exclude.

base_period : {"varying", "universal"}, default "varying"

The base period policy for the event study.

include_pre_periods : bool, default True

Whether to include pre-treatment (placebo) periods.

Returns

pd.DataFrame

A table of raw DID estimates:

- event_time: Periods relative to treatment (t - g).
- raw_did: The unadjusted Difference-in-Differences estimate.
- se: Naive standard error of the mean difference.
- t_stat: t-statistic for the null of zero difference.
- n_treated, n_control: Sample sizes in the cell.

Notes

The raw DID for a cohort-time cell (g, t) is calculated as:

.. math:: \Delta_{raw}(g,t) = [E[Y_t | G=g] - E[Y_{base} | G=g]] - [E[Y_t | C] - E[Y_{base} | C]]

where C is the comparison group. These are useful for visual inspection of parallel trends before applying more complex estimators.

Examples
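
The formula in the Notes can be worked through by hand on a single cell. This is a minimal sketch, not the library's code: the outcome values are toy numbers, raw_did_cell is a hypothetical helper, and the standard error shown (a two-sample formula on unit-level changes with sample variances) is only one plausible reading of "naive standard error".

```python
from math import sqrt
from statistics import mean, variance

# Toy outcomes for complete units, paired by position (same unit in both periods).
treated_base = [10.0, 12.0, 11.0]
treated_t = [13.0, 17.0, 15.0]
control_base = [9.0, 10.0, 11.0, 10.0]
control_t = [10.0, 12.0, 14.0, 12.0]


def raw_did_cell(treated_base, treated_t, control_base, control_t):
    """Unadjusted DID for one cell from unit-level changes (hypothetical helper)."""
    d_treated = [y1 - y0 for y0, y1 in zip(treated_base, treated_t)]
    d_control = [y1 - y0 for y0, y1 in zip(control_base, control_t)]

    # With complete units, the difference of mean changes equals
    # [E[Y_t | G=g] - E[Y_base | G=g]] - [E[Y_t | C] - E[Y_base | C]].
    raw_did = mean(d_treated) - mean(d_control)

    # Assumed naive SE: independent-samples formula with sample variances.
    se = sqrt(variance(d_treated) / len(d_treated)
              + variance(d_control) / len(d_control))
    return {
        "raw_did": raw_did,
        "se": se,
        "t_stat": raw_did / se,
        "n_treated": len(d_treated),
        "n_control": len(d_control),
    }


cell = raw_did_cell(treated_base, treated_t, control_base, control_t)
print(cell)
```

Here the treated group changes by 4 on average and the comparison group by 2, so the raw DID is 2.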

function
causalis.scenarios.did.refutation.diagnostics.did_covariate_balance_table

did_covariate_balance_table

Return unweighted base-period covariate balance for Callaway & Sant’Anna cells.

Calculates the standardized mean difference (SMD) for each covariate between the treated and control groups in the base period.

Parameters

data : PanelDataDID

The validated panel data object.

control_group : {"not_yet_or_never", "never_treated"}, default "not_yet_or_never"

The definition of the comparison group.

anticipation : int, default 0

Anticipation periods to exclude.

base_period : {"universal", "varying"}, default "universal"

Base period policy.

include_pre_periods : bool, default False

Whether to include pre-treatment cells.

post_only : bool, default True

If True, only checks balance for cells used in post-treatment estimation.

Returns

pd.DataFrame

A balance table with columns:

- covariate: Name of the covariate.
- treated_mean: Average value in the treated group.
- control_mean: Average value in the control group.
- smd: Standardized Mean Difference.
- abs_smd: Absolute value of the SMD.

Notes

The Standardized Mean Difference is defined as:

.. math:: SMD = \frac{\bar{X}_{treated} - \bar{X}_{control}}{\sqrt{(s^2_{treated} + s^2_{control}) / 2}}

Thresholds of |SMD| > 0.1 or |SMD| > 0.25 are often used to indicate potential imbalance that requires adjustment.

Examples
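
The SMD formula in the Notes can be evaluated directly for one covariate. This is a minimal sketch with toy values; standardized_mean_difference is a hypothetical helper, and the use of sample variances (n - 1 denominator) is an assumption, since the page does not state which variance estimator the library uses.

```python
from math import sqrt
from statistics import mean, variance


def standardized_mean_difference(treated, control):
    """SMD = (mean_treated - mean_control) / sqrt((s2_treated + s2_control) / 2)."""
    # Assumption: sample variances. A zero pooled variance would raise
    # ZeroDivisionError here; a real implementation has to handle that case.
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2)
    return (mean(treated) - mean(control)) / pooled_sd


# Toy base-period covariate values for one cell.
treated_x = [2.0, 4.0, 6.0]   # mean 4, sample variance 4
control_x = [1.0, 2.0, 3.0]   # mean 2, sample variance 1

smd = standardized_mean_difference(treated_x, control_x)
abs_smd = abs(smd)
exceeds_strict = abs_smd > 0.1    # stricter of the two common thresholds
exceeds_lenient = abs_smd > 0.25  # more lenient threshold
print(round(smd, 4), exceeds_strict, exceeds_lenient)
```

The difference in means is 2 and the pooled standard deviation is sqrt(2.5), so this covariate is far outside either threshold.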

function
causalis.scenarios.did.refutation.diagnostics.did_base_design_table

did_base_design_table

Return base-period control-design rank diagnostics for Callaway & Sant’Anna cells.

Checks the numerical stability of the propensity score and outcome regression designs in the comparison group. High condition numbers or rank deficiency indicate potential multicollinearity or insufficient variation in covariates.

Parameters

data : PanelDataDID

The validated panel data object.

control_group : {"not_yet_or_never", "never_treated"}, default "not_yet_or_never"

The definition of the comparison group.

anticipation : int, default 0

Anticipation periods to exclude.

base_period : {"universal", "varying"}, default "universal"

Base period policy.

include_pre_periods : bool, default False

Whether to include pre-treatment cells.

post_only : bool, default True

If True, only checks cells used in post-treatment estimation.

Returns

pd.DataFrame

A table of design diagnostics:

- n_control: Number of units in the control pool for the cell.
- n_parameters: Number of covariates including the intercept.
- control_design_rank: Matrix rank of the covariate design.
- condition_number: The L2 condition number of the design matrix.
- is_rank_deficient: Whether the matrix is not full rank.

Examples
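
Rank and condition-number checks of this kind can be reproduced with NumPy on a toy control design. The sketch below is an illustration, not the library's implementation: design_diagnostics is a hypothetical helper, the covariate values are made up, and prepending an intercept column is assumed from the description of n_parameters.

```python
import numpy as np


def design_diagnostics(covariates):
    """Rank and 2-norm condition number of [intercept, covariates] (hypothetical helper)."""
    covariates = np.asarray(covariates, dtype=float)
    # Assumed design: an intercept column followed by the covariate columns.
    design = np.column_stack([np.ones(len(covariates)), covariates])
    n_parameters = design.shape[1]
    rank = int(np.linalg.matrix_rank(design))
    return {
        "n_control": design.shape[0],
        "n_parameters": n_parameters,
        "control_design_rank": rank,
        # np.linalg.cond defaults to the 2-norm (ratio of extreme singular values).
        "condition_number": float(np.linalg.cond(design)),
        "is_rank_deficient": rank < n_parameters,
    }


x1 = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([1.0, 0.0, 1.0, 0.0, 1.0, 0.0])

good = design_diagnostics(np.column_stack([x1, x2]))
# The second covariate is an exact multiple of the first: perfectly collinear.
bad = design_diagnostics(np.column_stack([x1, 2.0 * x1]))

print(good)
print(bad)
```

The collinear design loses one rank and its condition number becomes extremely large (or infinite), which is the pattern the is_rank_deficient and condition_number columns are meant to expose.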

function
causalis.scenarios.did.refutation.diagnostics.run_did_diagnostics

run_did_diagnostics

Run compact pre-fit diagnostics for Callaway & Sant’Anna estimation readiness.

This function performs a battery of “smoke tests” on the data before fitting the model. It checks for sufficient sample size, parallel trends in pre-treatment periods, covariate balance, and numerical stability of the design.

Parameters

data : PanelDataDID

The validated panel data object.

control_group : {"not_yet_or_never", "never_treated"}, default "not_yet_or_never"

The definition of the comparison group.

anticipation : int, default 0

Anticipation periods to exclude.

base_period : {"universal", "varying"}, default "universal"

Base period policy.

include_pre_periods : bool, default False

Whether to include pre-treatment cells in the diagnostics.

min_treated_per_cell : int, default 30

Minimum number of treated units required in each ATT(g,t) cell.

min_control_per_cell : int, default 30

Minimum number of control units required in each ATT(g,t) cell.

min_control_to_treated_ratio : float, default 1.0

Minimum ratio of control units to treated units.

min_pair_completion_rate : float, default 0.80

Minimum share of units that must be observed in both base and target periods.

min_control_pool_retention : float, default 0.25

Minimum share of the original control pool that must be available for estimation.

max_unsupported_cell_share : float, default 0.25

Maximum allowable share of cohort-time cells that cannot be estimated.

min_pre_periods : int, default 2

Minimum number of pre-treatment periods required for placebo tests.

max_abs_pretrend_t_stat : float, default 2.0

Maximum absolute t-statistic allowed for raw pre-treatment differences.

max_abs_covariate_smd : float, default 0.25

Maximum allowable absolute SMD for covariates in the base period.

max_condition_number : float, default 1e6

Maximum allowable condition number for the control design matrix.

min_clusters : int, default 2

Minimum number of clusters required if clustering is used.

Returns

pd.DataFrame

A diagnostic report with columns:

- test: Name of the diagnostic check.
- flag: Status (GREEN, YELLOW, RED).
- value: Observed value of the metric.
- threshold: The threshold used for the check.
- message: Descriptive result message.

Examples

data
causalis.scenarios.did.refutation.diagnostics.__all__

__all__

Value: ['did_support_table', 'raw_did_event_study_table', 'did_covariate_balance_table', 'did_base_design_t...

