Systematic Leading Indicator Discovery Across 96 Distress Indicators

Abstract

The American Distress Index (ADI) tracks household financial distress through 96 economic indicators organized into five components. This paper documents a systematic scanner that tests all pairwise combinations of these indicators for statistically validated leading relationships — instances where one indicator’s movement reliably precedes another’s by a measurable lag.

The pipeline applies five sequential filters to eliminate spurious correlations:

Cross-correlation on raw levels (FDR-corrected at α = 0.05)
Cross-correlation on first-differenced series (eliminates trend-driven false positives)
Multi-crisis validation across GFC (2007–2009), COVID (2020), and the 2001 recession
Granger causality testing (p < 0.05)
Out-of-sample replication (calibrate 2000–2012, validate 2013–2025, minimum |r| > 0.3)

Of 57,541 raw pairs tested, 3,416 unique indicator combinations survived the dual correlation filter. 1,264 validated during at least one crisis. The top 30 were tested for Granger causality; 11 passed. Of those, 6 replicated out-of-sample — a survival rate of 0.01%.

Data

The scanner draws from 96 indicator time series stored in data/indicators/. Each indicator has quarterly or monthly observations resampled to a common quarterly frequency using last-observation-carried-forward. The baseline period spans 2000–2025, with most series beginning between 2000 and 2005.

Sources include the Federal Reserve Economic Data (FRED) system, the Bureau of Labor Statistics (BLS), the National Bureau of Economic Research, the NY Federal Reserve, and administrative data from mortgage servicer reports. All source data is publicly available and attributable.

Key preprocessing steps:

Quarterly resampling: Monthly series are resampled to quarterly using end-of-quarter values
Minimum overlap: Pairs require at least 12 quarters of overlapping observations
Winsorization: Values are capped at the 95th percentile to prevent COVID-era outliers from dominating the correlation structure
Maximum lag: Cross-correlations are computed at lags 0 through 16 quarters (4 years)
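
The preprocessing steps above can be illustrated with a minimal sketch. This is not the scanner's code: the function names, the dictionary-based series layout, the nearest-rank percentile definition, and the one-sided (upper) cap are all assumptions made for illustration.

```python
from math import ceil

def to_quarterly(monthly):
    """Keep the last available observation in each quarter.

    `monthly` maps (year, month) to a value; the result maps (year, quarter).
    """
    quarterly = {}
    for (year, month), value in sorted(monthly.items()):
        quarter = (month - 1) // 3 + 1
        quarterly[(year, quarter)] = value  # later months overwrite earlier ones
    return quarterly

def winsorize_upper(values, pct=95):
    """Cap values at the pct-th percentile (nearest-rank definition, an assumption)."""
    ranked = sorted(values)
    rank = max(1, ceil(pct * len(ranked) / 100))
    cap = ranked[rank - 1]
    return [min(v, cap) for v in values]

def shared_quarters(a, b, min_quarters=12):
    """Return the quarters present in both series, or None if fewer than the minimum."""
    shared = sorted(set(a) & set(b))
    return shared if len(shared) >= min_quarters else None

# Toy monthly series covering January to June 2020.
monthly = {(2020, m): float(m) for m in range(1, 7)}
quarterly = to_quarterly(monthly)

# Toy values 1..20: the nearest-rank 95th percentile is 19, so 20 is capped.
capped = winsorize_upper([float(v) for v in range(1, 21)])

# Two quarters of overlap is below the 12-quarter minimum, so the pair is skipped.
overlap = shared_quarters(quarterly, quarterly)
```

In this toy run the quarterly series keeps the March and June values, the largest value is pulled down to the cap, and the pair is rejected for insufficient overlap.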

Methods

Filter 1: Raw Cross-Correlation with FDR Correction

For each pair of indicators (A, B), we compute the Pearson cross-correlation at every lag from 0 to 16 quarters. The null hypothesis is that A at time t and B at time t + lag are uncorrelated.
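
As a rough illustration of the lagged correlation described above, the sketch below pairs the leader at time t with the follower at time t + lag. The function names, the plain-list series layout, and the toy data are assumptions; the scanner's own implementation may differ.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def cross_correlation(leader, follower, max_lag=16, min_overlap=12):
    """Map each lag to the Pearson r between leader[t] and follower[t + lag]."""
    results = {}
    for lag in range(max_lag + 1):
        x = leader[: len(leader) - lag]
        y = follower[lag:]
        if len(x) >= min_overlap:
            results[lag] = pearson(x, y)
    return results

# Toy data: the follower is the leader delayed by two quarters.
leader = [float((t * t) % 17) for t in range(40)]
follower = [0.0, 0.0] + leader[:-2]

ccf = cross_correlation(leader, follower)
best_lag = max(ccf, key=lambda lag: abs(ccf[lag]))
```

Here the correlation peaks at the two-quarter lag built into the toy data, which is the lag a scan of this kind would carry forward to the later filters.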

Testing 57,541 pairs generates a massive multiple comparisons problem. Even if every null hypothesis were true, a single unadjusted test per pair at α = 0.05 would be expected to flag roughly 2,900 pairs (57,541 × 0.05) by chance alone. We apply the Benjamini-Hochberg procedure for False Discovery Rate (FDR) control at α = 0.05, which bounds the expected proportion of false discoveries among the rejected hypotheses rather than the probability of making even one false discovery. This is more appropriate than Bonferroni correction for exploratory analysis, where we expect a nonzero proportion of true relationships.
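
The Benjamini-Hochberg step-up rule can be sketched in a few lines. The function name and the eight toy p-values are illustrative assumptions and are not drawn from the scanner's output.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p-value: True where the null is rejected under BH."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose sorted p-value satisfies p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below that cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected

p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]
flags = benjamini_hochberg(p_values)
unadjusted = [p < 0.05 for p in p_values]
```

With these toy values, five tests fall below the unadjusted 0.05 threshold but only the two smallest p-values survive the FDR correction.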

3,416 unique indicator pairs survive this filter.

Filter 2: First-Differenced Cross-Correlation

Two economic time series that both trend upward over a decade will show spurious correlation even if they’re mechanistically unrelated. This is the “spurious regression” problem documented by Granger and Newbold (1974).

We recompute cross-correlations on first-differenced series — the change from one quarter to the next — which strips out shared trends and tests whether changes in A predict changes in B. Only pairs that are significant in both raw levels and first differences survive.

All 3,416 pairs survived this filter in the current dataset, which reflects the fact that most indicator pairs in a coherent distress framework do share genuine short-term dynamics, not just shared trends.
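
A small constructed example shows why differencing matters. Both toy series below trend upward, so their levels are almost perfectly correlated, yet their quarter-to-quarter changes are built to be uncorrelated. The series and function names are assumptions for illustration only.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def first_difference(series):
    """Change from one quarter to the next."""
    return [later - earlier for earlier, later in zip(series, series[1:])]

quarters = range(41)
# Series A: a linear trend plus a two-quarter wiggle.
a = [t + (t % 2) for t in quarters]
# Series B: a different linear trend plus an unrelated four-quarter wiggle.
b = [0.5 * t + (1 if t % 4 >= 2 else 0) for t in quarters]

level_r = pearson(a, b)
diff_r = pearson(first_difference(a), first_difference(b))
```

A pair like this one would pass a levels-only filter on the strength of the shared trend and then fail the first-difference filter.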

Filter 3: Multi-Crisis Validation

A leading relationship that holds during tranquil periods but breaks during crises has limited practical value — crises are precisely when leading indicators matter most.

We define three crisis windows:

Great Financial Crisis (GFC): 2007-Q3 through 2009-Q2
COVID recession: 2020-Q1 through 2020-Q3
2001 recession: 2001-Q1 through 2001-Q4

For each pair, we check whether the leading relationship (correct sign, plausible lag) holds during each crisis window. Pairs must validate during at least one crisis to pass. Of 3,416 pairs, 1,264 validated during one or more crises.

The strongest relationships validate across multiple crises. Initial Claims → Continuing Claims validated in all three — the GFC, COVID, and 2001 — reflecting the structural mechanics of unemployment insurance rather than a crisis-specific artifact.
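
One possible reading of the crisis check is sketched below: restrict the lagged correlation to the quarters around each crisis window and ask whether it keeps the expected sign. The padding of four quarters on each side, the sign-only criterion, and every function name are assumptions; the text above does not specify how "correct sign, plausible lag" is scored.

```python
from math import sqrt

# Crisis windows as (start, end) in (year, quarter) form, taken from the list above.
CRISIS_WINDOWS = {
    "GFC": ((2007, 3), (2009, 2)),
    "COVID": ((2020, 1), (2020, 3)),
    "2001": ((2001, 1), (2001, 4)),
}

def quarter_index(year, quarter, base_year=2000):
    """Position of a quarter in a series that starts at base_year Q1."""
    return (year - base_year) * 4 + (quarter - 1)

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0

def validates_in_window(leader, follower, lag, expected_sign, window, pad=4):
    """Hypothetical check: does the lagged correlation keep its sign near a crisis?"""
    start = max(0, quarter_index(*window[0]) - pad)
    end = min(len(leader) - lag, quarter_index(*window[1]) + pad + 1)
    x = leader[start:end]
    y = follower[start + lag:end + lag]
    if len(x) < 3:
        return False
    return pearson(x, y) * expected_sign > 0

# Toy data for 2000-Q1 through 2025-Q4: the follower trails the leader by one quarter.
leader = [float((t * t) % 17) for t in range(104)]
follower = [0.0] + leader[:-1]

passed = [name for name, window in CRISIS_WINDOWS.items()
          if validates_in_window(leader, follower, 1, 1, window)]
```

Under the rule stated above, a pair passes Filter 3 when this list contains at least one crisis.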

Filter 4: Granger Causality

Cross-correlation establishes co-movement with a time offset. Granger causality goes further: it tests whether lagged values of A contain information about B that is not already contained in B’s own lagged values. In other words, does knowing A’s past improve our forecast of B beyond what B’s own history provides?

We apply grangercausalitytests from the statsmodels library at the optimal lag identified in Filter 1. The null hypothesis is that A does not Granger-cause B. We reject at p < 0.05.

Of the top 30 candidates tested, 11 passed Granger causality. The 19 that failed were pairs where the cross-correlation was genuine but the leader’s information was already embedded in the follower’s own autoregressive structure — redundant, not predictive.
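
To make the test concrete, the sketch below computes the F statistic that compares a restricted regression (the follower on its own lags) with an unrestricted one (adding the leader's lags). It is a hand-rolled illustration and not the statsmodels implementation; the function names and simulated data are assumptions. As we understand the statsmodels documentation, grangercausalitytests takes a two-column array and tests whether the second column Granger-causes the first, so the follower would go in the first column.

```python
import random

def ols_rss(rows, targets):
    """Residual sum of squares from least squares, solved via the normal equations."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    aug = [xtx[i] + [xty[i]] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    beta = [aug[i][k] / aug[i][i] for i in range(k)]
    return sum((y - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, y in zip(rows, targets))

def granger_f(leader, follower, lag=1):
    """F statistic for H0: lagged leader values add nothing beyond the follower's own lags."""
    targets, restricted, unrestricted = [], [], []
    for t in range(lag, len(follower)):
        own = [follower[t - j] for j in range(1, lag + 1)]
        other = [leader[t - j] for j in range(1, lag + 1)]
        targets.append(follower[t])
        restricted.append([1.0] + own)
        unrestricted.append([1.0] + own + other)
    rss_r = ols_rss(restricted, targets)
    rss_u = ols_rss(unrestricted, targets)
    df_resid = len(targets) - (2 * lag + 1)
    return ((rss_r - rss_u) / lag) / (rss_u / df_resid)

# Simulated data: B depends on its own past and on last period's A.
rng = random.Random(0)
a = [rng.gauss(0, 1) for _ in range(100)]
b = [0.0]
for t in range(1, 100):
    b.append(0.5 * b[t - 1] + a[t - 1] + rng.gauss(0, 0.2))

f_a_to_b = granger_f(a, b)
f_b_to_a = granger_f(b, a)
```

In the simulated data the A to B direction yields a very large F statistic, while the reverse direction does not, because A is pure noise with no dependence on B's past.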

Filter 5: Out-of-Sample Replication

The four preceding filters use the full sample period. This creates a risk of overfitting: the pipeline might identify relationships that fit historical data perfectly but fail on new data.

We split the sample at Q1 2013:

Calibration window: 2000–2012 (includes the GFC and 2001 recession)
Validation window: 2013–2025 (includes the 2022 inflation shock and current distress cycle)

We re-estimate the cross-correlation using only the calibration window, then compute the correlation at the same lag on the validation window. Pairs must achieve an out-of-sample correlation |r| > 0.3 to pass.

Of 11 Granger-validated pairs, 6 replicated out-of-sample.
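
The split-sample check can be sketched as follows: choose the lag on the calibration window only, then score that same lag on the validation window. The function names, the index-based split (52 quarters for 2000–2012), and the toy series are assumptions for illustration.

```python
from math import sqrt

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def lagged_r(leader, follower, lag):
    """Pearson r between leader[t] and follower[t + lag]."""
    return pearson(leader[: len(leader) - lag], follower[lag:])

def replicate_out_of_sample(leader, follower, split, max_lag=16, threshold=0.3):
    """Pick the lag on the calibration window, then test it on the validation window."""
    cal_l, cal_f = leader[:split], follower[:split]
    val_l, val_f = leader[split:], follower[split:]
    best_lag = max(range(max_lag + 1),
                   key=lambda lag: abs(lagged_r(cal_l, cal_f, lag)))
    oos_r = lagged_r(val_l, val_f, best_lag)
    return best_lag, oos_r, abs(oos_r) > threshold

# Toy data for 104 quarters: the follower is the negated leader, three quarters later.
leader = [float((t * t) % 17) for t in range(104)]
follower = [0.0, 0.0, 0.0] + [-v for v in leader[:-3]]

best_lag, oos_r, passed = replicate_out_of_sample(leader, follower, split=52)
```

Using the absolute value in the threshold lets a negative relationship, such as the toy one here, pass on the same terms as a positive one.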

Results

The Six Validated Leading Relationships

Leader	Follower	Lag	r	Crises	Granger F	OOS r
Initial Claims	Continuing Claims	1q	0.95	3 (GFC, COVID, 2001)	78.4	0.94
Initial Claims	Unemployment Rate	1q	0.79	2 (GFC, COVID)	25.6	0.79
CPI Inflation	Auto Insurance CPI	3q	0.77	2 (GFC, COVID)	9.9	0.87
CC Delinquency	All-Loan Charge-Offs	3q	0.76	2 (GFC, 2001)	10.3	0.83
Energy CPI	Wage-Inflation Gap	1q	−0.73	2 (GFC, COVID)	6.0	−0.81
Initial Claims	U-6 Underemployment	1q	0.72	2 (GFC, COVID)	16.1	0.70

Three of the six pairs share the same leader: Initial Claims. This is not coincidence — initial filings are a real-time proxy for labor market deterioration, and they feed mechanistically into continuing claims, the unemployment rate, and broader underemployment. The 1-quarter lag reflects the administrative pipeline: a worker files an initial claim, then appears in the continuing claims count the following quarter.

The CPI → Auto Insurance relationship has a longer 3-quarter lag and an interesting economic interpretation. Broad CPI increases in raw materials, labor, and repair costs take roughly 9 months to work through the insurance industry’s rate-filing and approval cycle. The out-of-sample r of 0.87 — higher than the in-sample r of 0.77 — suggests this relationship may actually be strengthening as insurance rate adjustments have become more formulaic.

The Credit Card Delinquency → All-Loan Charge-Off pair captures the distress cascade within the debt system. Households that miss credit card payments (unsecured debt carrying the highest interest rates) typically progress to broader charge-offs across auto, personal, and eventually mortgage loans within 3 quarters.

The Energy CPI → Wage-Inflation Gap relationship is negative: energy price spikes compress the gap between wage growth and inflation (r = −0.73), with wages taking roughly one quarter to begin adjusting. This pair tracks the cost pressure mechanism — when energy costs surge, wages haven’t caught up yet, squeezing household budgets.

The Pipeline Funnel

Stage	Pairs	Survival Rate
Raw pairs tested	57,541	—
Unique pairs (dual filter)	3,416	5.9% of raw
Crisis-validated (1+ crisis)	1,264	37.0% of dual
Granger causality tested	30	Top candidates
Granger passed	11	36.7% of tested
Out-of-sample validated	6	54.5% of Granger
Overall survival	6	0.01% of raw
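
The survival rates in the funnel follow directly from the stage counts, as this short check shows. Only the counts from the table are used; the variable names are illustrative.

```python
# Stage counts copied from the funnel table.
raw_pairs = 57_541
dual_filter = 3_416
crisis_validated = 1_264
granger_tested = 30
granger_passed = 11
oos_validated = 6

def pct(part, whole, digits=1):
    """Percentage of `whole` represented by `part`, rounded."""
    return round(100 * part / whole, digits)

dual_rate = pct(dual_filter, raw_pairs)
crisis_rate = pct(crisis_validated, dual_filter)
granger_rate = pct(granger_passed, granger_tested)
oos_rate = pct(oos_validated, granger_passed)
overall_rate = pct(oos_validated, raw_pairs, digits=2)
```

The stage rates do not multiply through to the overall figure, because only the top 30 of the 1,264 crisis-validated pairs were carried into Granger testing.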

Cascade Sequences

Beyond pairwise relationships, we detect transitive chains where A leads B and B leads C, implying A leads C with a cumulative lag. The cascade detector identified 40 such chains, 29 of which span multiple ADI components.

Top cascade sequences:

Chain	Cumulative Lag	Score
NFCI Leverage → Initial Claims → Continuing Claims	11 quarters	82.1
Savings Rate → Initial Claims → Continuing Claims	17 quarters	81.6
CC Delinquency → Initial Claims → Continuing Claims	5 quarters	80.9
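
Chain detection of this kind can be sketched as a join over pairwise edges. In the example below, the Initial Claims → Continuing Claims lag of 1 quarter comes from the results table; the three upstream lags are not reported directly and are inferred here by assuming the cumulative lag is the simple sum of the two pairwise lags. The function name is hypothetical, and the Score column is not reproduced because its formula is not given.

```python
def find_chains(edges):
    """Return (A, B, C, cumulative_lag) for every A -> B -> C found in `edges`."""
    chains = []
    for (a, b), lag_ab in edges.items():
        for (b2, c), lag_bc in edges.items():
            if b == b2 and c != a:
                chains.append((a, b, c, lag_ab + lag_bc))
    return sorted(chains, key=lambda chain: chain[3])

edges = {
    ("Initial Claims", "Continuing Claims"): 1,  # lag from the results table
    # Inferred lags (assumption): cumulative lag minus the 1-quarter second link.
    ("CC Delinquency", "Initial Claims"): 4,
    ("NFCI Leverage", "Initial Claims"): 10,
    ("Savings Rate", "Initial Claims"): 16,
}

chains = find_chains(edges)
cumulative_lags = [chain[3] for chain in chains]
```

Under the additive assumption this reproduces the 5, 11, and 17 quarter cumulative lags listed in the table.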

The Savings Rate → Initial Claims → Continuing Claims chain is particularly significant for ADI interpretation. It spans two components, Buffer Depletion and Labor Market, suggesting that savings erosion eventually manifests as labor market deterioration. The 17-quarter cumulative lag is nearly twice the manually discovered 9-quarter Buffer Depletion → Debt Stress lag documented in What the Savings Rate Told Us Nine Quarters Before the Last Crisis, and the two results link different downstream components, so they agree on the ordering (buffers erode first) rather than on the timing.

Discussion

What This Analysis Establishes

The scanner provides systematic evidence that household financial distress unfolds in a predictable sequence, not simultaneously. Certain indicators — particularly initial unemployment claims, credit card delinquency, and broad CPI inflation — contain structural information about what will happen 1 to 3 quarters later in downstream indicators.

This is not forecasting. We are documenting historical lag structures that have held across multiple economic regimes (expansion, recession, recovery) and multiple crisis types (financial, pandemic, dot-com). The appropriate interpretation: if the leader indicator is currently elevated, historical patterns suggest the follower will move in the same direction after the documented lag (or in the opposite direction for a negative relationship), though the timing and magnitude are not guaranteed.

Limitations

Correlation is not causation. Initial claims and continuing claims share a mechanical link through the unemployment insurance system, so the 1-quarter lag has a clear causal interpretation. But other relationships (e.g., CPI → Auto Insurance) may reflect common upstream causes rather than direct transmission.

Regime dependence. The relationships validated during the GFC and COVID recessions may not hold during a fundamentally different type of crisis — a currency crisis, a trade war, or a technology-driven displacement. The 2001 recession provides some evidence of robustness to crisis type, but the sample of recessions is inherently small.

Small crisis sample. Three crises is the maximum available for the post-2000 data period. Statistical power for crisis-specific validation is limited. A relationship that fails in one crisis may still be structurally valid.

COVID distortion. The COVID recession produced unprecedented speed and magnitude in many indicators (initial claims reached 6.1 million in a single week). While winsorization at the 95th percentile mitigates this, some validated relationships may be influenced by the sheer scale of COVID-era movements.

The 9-quarter gap. The original Buffer Depletion → Debt Stress relationship (r = 0.69, 9-quarter lag) predates this scanner and was validated through manual cross-correlation analysis. It operates at the component level (weighted averages of multiple indicators) rather than indicator level. The scanner’s individual indicator results are complementary, not a replacement for the component-level finding.

Relationship to the ADI

The ADI’s component weights were originally derived from PCA factor analysis and calibrated against the GFC backtest. The leading indicator research provides independent support for the weighting structure: Buffer Depletion’s 30% weight is justified by its role as the earliest warning signal, and the cascade analysis confirms that buffer-related indicators (savings rate) structurally precede labor market indicators (claims) by substantial lags.

The six validated pairs also inform the frontpage scoring algorithm: indicators that are part of a validated leading relationship receive additional weight in the newsworthiness score when their current values are elevated.

Appendix: Data Access

The full scanner output, including all 3,416 pairs that passed the dual correlation filter, filter-stage results, and the 6 fully validated relationships, is available via the API:

JSON: /api/research/leading-indicators.json
Cascade sequences: Available in data/research/cascade_sequences.json
Research charts: Static PNGs at /research/heatmap.png, /research/ccf_plots.png, /research/cascade_diagram.png, /research/funnel.png

To reproduce the analysis:

PYTHONPATH=. python3 scripts/research/leading_indicator_scanner.py
PYTHONPATH=. python3 scripts/research/detect_cascades.py
PYTHONPATH=. python3 scripts/research/generate_research_charts.py

All source code is available in the project repository.
