Chapter 16: Causal Inference with Time Series

Opening Question

When your data is primarily temporal—aggregate outcomes observed repeatedly over time—how can you identify causal effects rather than just correlations?


Chapter Overview

Most causal inference methods in this book exploit cross-sectional variation: comparing treated and control units, using instrumental variation across observations, or leveraging threshold discontinuities. But many important questions are inherently time series: What are the effects of monetary policy? How do oil price shocks affect the economy? Did a structural reform change an economy's trajectory?

Time series data present distinctive challenges for causal inference. With only one country, one economy, or one market observed over time, there are no control units. Serial correlation violates independence assumptions. Non-stationarity creates spurious correlations. And the fundamental problem of causal inference—we cannot observe counterfactuals—is acute when we have a single time series.

This chapter develops methods for causal inference when time is the primary dimension of variation. These methods have different identification strategies than cross-sectional approaches, but the core goal is the same: distinguishing correlation from causation by exploiting exogenous variation.

What you will learn:

  • Why time series creates special challenges for causal inference

  • What Granger causality does and doesn't tell us

  • How structural VAR identifies causal effects through timing assumptions, long-run restrictions, or sign restrictions

  • Local projections as a flexible alternative to VAR

  • External instruments and proxy SVAR methods

  • Interrupted time series as the time series analogue of RD

Prerequisites: Chapter 7 (Time Series Foundations), Chapter 9 (Causal Framework), Chapter 12 (Instrumental Variables)


16.1 Special Challenges of Time Series

The n=1 Problem

Cross-sectional causal inference compares units: what happened to the treated versus the control? With time series, we often have only one unit observed over time. There is no "control" economy that didn't experience the oil shock, no parallel universe where the Fed didn't raise rates.

Implications:

  • Treatment effects must be identified from temporal variation

  • Control must be constructed implicitly (counterfactual modeling) or via design (interrupted time series)

  • External validity is particularly challenging—different time periods may have different structural relationships

Serial Correlation

Time series observations are not independent. This quarter's GDP is correlated with last quarter's GDP, which is correlated with the quarter before that. This creates:

  • Spurious regression: Two trending series appear correlated even with no causal relationship

  • Inference problems: Standard errors assuming independence are too small

  • Confounding: Any omitted variable that trends over time confounds the relationship

Non-Stationarity

Many economic series are non-stationary: they trend upward, exhibit structural breaks, or have time-varying volatility.

Definition 16.1 (Stationarity): A time series is (weakly) stationary if its mean and autocovariance do not depend on time: $E[Y_t] = \mu$ for all $t$, and $\mathrm{Cov}(Y_t, Y_{t-k}) = \gamma_k$ depends only on the lag $k$, not on $t$.

Non-stationary series require careful handling:

  • Unit roots: Series that are integrated of order 1 (I(1)) must be differenced before standard regression

  • Cointegration: Some I(1) series share common trends, allowing level regressions under specific conditions

  • Structural breaks: Parameters may change at unknown points

Warning: Spurious Regression Is Not a Minor Nuisance—It Is a Fatal Threat

If you regress one trending series on another, you will find a statistically significant relationship, even if the series are completely independent. This is not a small-sample problem; it gets worse as $T$ grows.

Simulation: Generate two independent random walks (cumulative sums of random noise). Regress one on the other. You will find:

  • $R^2 \approx 0.5$ or higher

  • $t$-statistics of 5, 10, or more

  • "Significant" relationship with $p < 0.001$

This is entirely spurious—the series have no causal relationship by construction.
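A minimal version of this simulation in Python (the seed, sample length, and use of statsmodels are illustrative choices, not part of the text above):

```python
# Two independent random walks regressed on one another: a spurious "relationship".
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
T = 500
x = np.cumsum(rng.standard_normal(T))   # random walk 1
y = np.cumsum(rng.standard_normal(T))   # random walk 2, generated independently of x

res = sm.OLS(y, sm.add_constant(x)).fit()
print(f"R^2 = {res.rsquared:.2f}, t-stat on x = {res.tvalues[1]:.1f}")
# Typical output: a sizable R^2 and a |t| far above 2, despite no true relationship.
```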

Why it happens: Trending series share a common property (they wander). Regression attributes this shared wandering to a relationship between the series. Standard asymptotics fail because the series don't satisfy the conditions for consistency.

The solution: Before running any time series regression:

| Step | Action | Tool |
| --- | --- | --- |
| 1 | Plot the series | Visual inspection for trends |
| 2 | Test for unit roots | ADF, PP, or KPSS tests |
| 3a | If I(0): proceed with levels | Standard regression valid |
| 3b | If I(1): difference the data | Regress $\Delta Y$ on $\Delta X$ |
| 3c | If cointegrated: use an ECM | Test with Engle-Granger or Johansen |

Implementation:
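A minimal sketch of this decision flow using statsmodels (the simulated `x` and `y` stand in for your own series; the 5% thresholds and `autolag` choice are illustrative):

```python
# Pre-regression workflow: unit root tests, then levels, differences, or an ECM.
import numpy as np
import statsmodels.api as sm
from statsmodels.tsa.stattools import adfuller, coint

rng = np.random.default_rng(0)
x = np.cumsum(rng.standard_normal(300))      # simulated I(1) regressor
y = 0.5 * x + rng.standard_normal(300)       # cointegrated with x by construction

def has_unit_root(series, alpha=0.05):
    """ADF test: failing to reject the unit-root null => treat the series as I(1)."""
    return adfuller(series, autolag="AIC")[1] > alpha

if not has_unit_root(x) and not has_unit_root(y):
    res = sm.OLS(y, sm.add_constant(x)).fit()            # step 3a: levels regression is valid
elif coint(y, x)[1] < 0.05:
    print("Cointegrated: estimate an error-correction model")   # step 3c
else:
    dy, dx = np.diff(y), np.diff(x)                       # step 3b: difference the data
    res = sm.OLS(dy, sm.add_constant(dx)).fit()
```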

Bottom line: A "significant" relationship between two trending series is meaningless until you've addressed non-stationarity. This is not optional.

Dynamic Effects

Causal effects in time series typically unfold over time. A monetary policy shock today affects output not just immediately, but over the following months and years. We want to trace out the impulse response function—the dynamic path of the effect.

Definition 16.2 (Impulse Response Function): The impulse response of variable $Y$ to a shock $\varepsilon$ at horizon $h$ is $\partial Y_{t+h} / \partial \varepsilon_t$: the effect of a shock today on outcomes $h$ periods in the future.

This distinguishes time series causal inference from cross-sectional methods where we typically estimate a single average treatment effect.


16.2 Granger "Causality": Predictive Precedence

The Concept

Clive Granger's (1969) framework asks: does knowing $X$'s past help predict $Y$'s future, beyond what $Y$'s own past tells us?

Definition 16.3 (Granger Predictability): $X$ Granger-causes $Y$ if past values of $X$ contain information useful for forecasting $Y$ beyond what's contained in past values of $Y$ alone:

$$MSE[E(Y_t \mid Y_{t-1}, Y_{t-2}, \ldots)] > MSE[E(Y_t \mid Y_{t-1}, Y_{t-2}, \ldots, X_{t-1}, X_{t-2}, \ldots)]$$

Terminological Warning: The term "Granger causality" is deeply entrenched but profoundly misleading. Granger himself acknowledged this, and many econometricians now prefer "Granger predictability" or "incremental predictability." What the test actually establishes is whether $X$ helps forecast $Y$: a statement about prediction, not causation. We use the standard terminology throughout (because you'll encounter it everywhere) but urge readers to mentally substitute "Granger-predicts" for "Granger-causes."

Testing for Granger Causality

Bivariate test: Estimate

$$Y_t = \alpha + \sum_{j=1}^{p} \beta_j Y_{t-j} + \sum_{j=1}^{p} \gamma_j X_{t-j} + \varepsilon_t$$

Test $H_0: \gamma_1 = \gamma_2 = \ldots = \gamma_p = 0$ using an F-test.

Multivariate test: In a VAR system, test whether lags of $X$ enter the equation for $Y$.
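A minimal sketch of the bivariate test with statsmodels (the simulated series, coefficients, and lag choice are illustrative; `grangercausalitytests` reports, for each lag order, an F-test of the null that all lags of the second column are zero in the equation for the first):

```python
# Granger predictability test: do lags of x help forecast y beyond y's own lags?
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import grangercausalitytests

rng = np.random.default_rng(1)
T = 400
x = rng.standard_normal(T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.5 * y[t - 1] + 0.4 * x[t - 1] + rng.standard_normal()

data = pd.DataFrame({"y": y, "x": x})
# Column order matters: the test asks whether the second column Granger-causes the first.
results = grangercausalitytests(data[["y", "x"]], maxlag=4)
```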

What Granger Causality Tells Us

Predictive content: If $X$ Granger-causes $Y$, then $X$'s past values help forecast $Y$. This is useful for forecasting and establishing temporal precedence.

Temporal ordering: Granger causality implies that changes in $X$ precede changes in $Y$. This rules out $Y$ causing $X$ (in the predictive sense).

What Granger Causality Does NOT Tell Us

Not true causation: Granger causality is predictive, not causal. A third variable $Z$ that causes both $X$ and $Y$, with $Z \to X$ happening before $Z \to Y$, will make $X$ appear to Granger-cause $Y$ even though $X$ has no causal effect on $Y$.

Example: Suppose Christmas shopping (Z) causes increased credit card debt (X) and then, with a lag, causes increased January spending (Y). Credit card debt Granger-causes January spending—but paying off your credit card doesn't affect your January spending. The common cause creates the predictive relationship.

Not contemporaneous effects: Granger causality uses lagged values. If $X$ affects $Y$ contemporaneously (within the same period), this won't show up as Granger causality.

Not policy-relevant without structure: Even if money supply Granger-causes output, this doesn't tell us what happens if the Fed changes policy. The historical relationship may reflect endogenous policy responses, not the causal effect of exogenous policy changes.

When Granger Causality Is Useful

  • Establishing temporal precedence: As preliminary evidence that $X$ precedes $Y$

  • Forecasting: If you care about prediction rather than causation

  • Model specification: Identifying which variables belong in a forecasting model

  • Ruling out certain causal structures: If $X$ doesn't Granger-cause $Y$, some causal structures are implausible

When Granger Causality Is Misleading

  • Confounded relationships: Common causes create spurious Granger causality

  • Simultaneous systems: Contemporaneous causation is invisible

  • Policy evaluation: The Lucas critique applies—historical correlations may not hold under policy intervention

Box: The LSE Tradition and General-to-Specific Modeling

Macroeconometrics has its own tradition of thinking about model specification and structure, centered at the London School of Economics and particularly associated with David Hendry.

The LSE approach emphasizes:

General-to-specific (Gets) modeling: Start with a general unrestricted model embedding all theoretically relevant variables and dynamics. Then simplify systematically:

  1. Estimate the general model

  2. Apply diagnostic tests (serial correlation, heteroskedasticity, parameter stability)

  3. Remove insignificant terms, testing restrictions

  4. Verify the simplified model passes diagnostics

  5. Iterate until a parsimonious, well-specified model is reached

This contrasts with "specific-to-general" approaches that start simple and add variables.

Congruence: A model should be "congruent" with the data-generating process—matching its statistical properties. Diagnostic tests assess congruence.

Encompassing: A good model should explain ("encompass") the results of rival models.

The Autometrics algorithm (Doornik & Hendry) automates Gets modeling, systematically exploring the model space and applying batteries of tests. It represents algorithmic model selection disciplined by econometric theory.

Connection to causal inference: The LSE tradition is about model selection and dynamic specification rather than causal identification in the modern credibility revolution sense. It asks: Given a set of candidate variables, which belong in the model? It does not directly address: Is the variation exogenous? Yet the traditions are complementary. Gets modeling can be seen as searching for a well-specified reduced form. Identification—whether coefficients have causal interpretation—requires additional arguments about exogeneity that SVAR methods (Section 16.3-16.6) address.

See Hendry (1995), Dynamic Econometrics, for the comprehensive treatment.


16.3 Structural Vector Autoregression (SVAR)

The VAR Foundation

A vector autoregression (VAR) models a set of variables as depending on their own lags:

$$Y_t = A_1 Y_{t-1} + A_2 Y_{t-2} + \ldots + A_p Y_{t-p} + u_t$$

where $Y_t$ is a $k \times 1$ vector of variables and $u_t$ is a $k \times 1$ vector of reduced-form errors with covariance matrix $\Sigma = E[u_t u_t']$.

VAR is descriptive: It summarizes the data's dynamic correlations but doesn't distinguish cause from effect. The reduced-form shocks $u_t$ are correlated across equations, so we can't interpret them as causal "shocks."

From VAR to SVAR

Structural VAR imposes additional assumptions to identify causal shocks. The structural model is:

$$B_0 Y_t = B_1 Y_{t-1} + \ldots + B_p Y_{t-p} + \varepsilon_t$$

where $\varepsilon_t$ are mutually uncorrelated structural shocks with diagonal covariance matrix.

The relationship between reduced-form and structural:

  • $u_t = B_0^{-1} \varepsilon_t$

  • $\Sigma = B_0^{-1} (B_0^{-1})'$

The identification problem: $\Sigma$ has $k(k+1)/2$ unique elements, while $B_0$ has $k^2$ elements. We need $k(k-1)/2$ additional restrictions to identify $B_0$.

Figure 16.1: Three approaches to SVAR identification. Short-run (Cholesky) restrictions impose a recursive ordering. Long-run restrictions (Blanchard-Quah) distinguish permanent from transitory shocks. Sign restrictions impose theoretically motivated constraints on the direction of effects.

Definition 16.4 (Structural Shock): A structural shock is an exogenous, economically interpretable impulse to a variable—such as a monetary policy shock, oil supply shock, or technology shock—uncorrelated with other structural shocks.

Identification via Short-Run Restrictions

Recursive (Cholesky) identification: Assume $B_0$ is lower triangular. This means:

  • First variable responds contemporaneously only to its own shock

  • Second variable responds to first and its own shock

  • And so on...

The ordering matters: variables ordered first are contemporaneously exogenous to variables ordered later.

Example: Monetary policy VAR (Christiano, Eichenbaum, and Evans 1999):

Order: Output → Prices → Fed funds rate → Money → Other

This assumes:

  • Output and prices respond to monetary policy only with a lag (plausible at quarterly frequency, since production and pricing decisions adjust slowly)

  • The Fed responds contemporaneously to output and prices (realistic)

  • Money responds to the interest rate immediately (portfolio rebalancing)

Assessing recursive identification: Is the ordering defensible? The key assumption is that "slow-moving" variables like output don't respond within the period (typically a quarter) to monetary shocks.
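A minimal sketch of recursive identification with statsmodels, where the column order of the data is the causal ordering (white-noise data stands in here for actual output, price, and policy-rate series; the lag cap is illustrative):

```python
# Recursive (Cholesky) SVAR: estimate the reduced-form VAR, then orthogonalize the IRFs.
import numpy as np
import pandas as pd
from statsmodels.tsa.api import VAR

rng = np.random.default_rng(2)
df = pd.DataFrame(rng.standard_normal((300, 3)),
                  columns=["output", "prices", "fedfunds"])   # stand-in for real data

res = VAR(df).fit(maxlags=8, ic="aic")      # lag length chosen by AIC
irf = res.irf(periods=24)                   # impulse responses out to 24 periods
irf.plot(orth=True)                         # orth=True applies the Cholesky factorization
# By construction, output and prices show zero impact response to a fedfunds shock,
# because they are ordered before the policy rate.
```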

Identification via Long-Run Restrictions

Blanchard and Quah (1989) identify shocks by their long-run effects rather than contemporaneous structure.

Example: Identify supply vs. demand shocks

  • Supply shocks have permanent effects on output

  • Demand shocks have only temporary effects on output

Mathematically, impose that demand shocks' cumulative effect on output is zero:

$$\sum_{h=0}^{\infty} \frac{\partial Y_{t+h}}{\partial \varepsilon_t^{demand}} = 0$$

Applications: Distinguishing permanent from transitory shocks, identifying technology shocks, decomposing trend and cycle.

Limitations: Identification depends on the model correctly capturing the shock transmission. If the economy has rich dynamics, long-run restrictions may not uniquely identify shocks.

Identification via Sign Restrictions

Uhlig (2005) and others propose identifying shocks by the sign of their effects, without imposing zero restrictions.

Example: Monetary policy shock

  • Contractionary monetary shock raises interest rates

  • Should reduce output (or at least not increase it)

  • Should reduce prices (or at least not increase them)

Procedure (a minimal code sketch follows these steps):

  1. Draw rotations of the reduced-form orthogonalization

  2. Keep only rotations where impulse responses satisfy sign restrictions

  3. Report the set of identified impulse responses
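A minimal accept/reject sketch of the procedure above (white-noise data stands in for a real [output, prices, rate] system; real applications typically check the restrictions over several horizons, not just on impact):

```python
# Sign-restriction identification by drawing random rotations of a Cholesky factor.
import numpy as np
from statsmodels.tsa.api import VAR

rng = np.random.default_rng(3)
data = rng.standard_normal((300, 3))            # ordered [output, prices, rate] (stand-in)
res = VAR(data).fit(2)
P = np.linalg.cholesky(res.sigma_u)             # any factorization of Sigma works as a base

accepted = []
for _ in range(2000):
    Q, R = np.linalg.qr(rng.standard_normal((3, 3)))
    Q = Q @ np.diag(np.sign(np.diag(R)))        # normalize the rotation
    impact = P @ Q                              # candidate impact matrix B0^{-1}
    shock = impact[:, 2]                        # impact responses to the candidate policy shock
    if shock[2] < 0:
        shock = -shock                          # normalize: the shock raises the rate
    if shock[0] <= 0 and shock[1] <= 0:         # output and prices must not rise on impact
        accepted.append(shock)

accepted = np.array(accepted)                   # the identified SET of impact responses
```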

Advantages:

  • Less restrictive than zero restrictions

  • Reflects genuine uncertainty about contemporaneous effects

Disadvantages:

  • Produces set identification, not point identification

  • Results depend on choice of sign restrictions

  • May include very different structural models in the identified set


16.4 Impulse Response Analysis

Computing Impulse Responses

Given a structural identification, impulse responses trace out effects of a one-unit shock over time.

For a VAR(1) with structural shock $\varepsilon_t$:

$$Y_{t+h} = A^h B_0^{-1} \varepsilon_t + \text{(effects of future shocks)}$$

The impulse response at horizon $h$ is

$$\Theta_h = A^h B_0^{-1},$$

where the $(i,j)$ element gives the effect of shock $j$ on variable $i$ after $h$ periods.

Cumulative Responses

For some questions, cumulative effects matter:

$$\Theta^{cum}_h = \sum_{s=0}^{h} \Theta_s$$

This gives the total effect through horizon $h$.
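A minimal numerical sketch of these two formulas for a bivariate VAR(1) (the coefficient matrix and impact matrix below are illustrative values, not estimates):

```python
# Evaluate Theta_h = A^h B0^{-1} and its cumulative sum for a small VAR(1).
import numpy as np

A = np.array([[0.5, 0.1],
              [0.2, 0.4]])                              # VAR(1) coefficient matrix
B0_inv = np.linalg.cholesky(np.array([[1.0, 0.3],
                                      [0.3, 0.5]]))     # impact matrix from a Cholesky factor

H = 12
Theta = np.array([np.linalg.matrix_power(A, h) @ B0_inv for h in range(H + 1)])
# Theta[h, i, j] = response of variable i to structural shock j after h periods.
Theta_cum = np.cumsum(Theta, axis=0)                    # cumulative responses through each h
```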

Visualization

Impulse response figures show:

  • The dynamic path of each variable in response to each shock

  • Confidence intervals (typically 68% and 90%)

  • Often normalized to a one-standard-deviation shock

Figure 16.2: Impulse response functions to a monetary policy shock. A contractionary monetary shock raises interest rates, reduces output (with a hump-shaped response), gradually lowers prices, and appreciates the exchange rate. Shaded bands show 95% confidence intervals, which widen at longer horizons.

Inference

Asymptotic inference: Delta method standard errors for impulse responses.

Bootstrap inference: More reliable in small samples (a minimal sketch follows the steps below):

  1. Estimate VAR, obtain residuals

  2. Resample residuals, construct pseudo-samples

  3. Re-estimate VAR and impulse responses

  4. Construct confidence intervals from bootstrap distribution
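A minimal sketch of steps 1-4 for a bivariate VAR (simulated data stands in for the real sample; the number of replications and the 90% percentile band are illustrative choices):

```python
# Residual bootstrap for orthogonalized impulse responses.
import numpy as np
from statsmodels.tsa.api import VAR

rng = np.random.default_rng(4)
data = rng.standard_normal((300, 2))                 # stand-in for the actual dataset
p, H, n_boot = 2, 12, 200

res = VAR(data).fit(p)                               # step 1: estimate, keep residuals
boot_irfs = []
for _ in range(n_boot):
    resid = res.resid[rng.integers(0, len(res.resid), size=len(res.resid))]  # step 2
    pseudo = list(data[:p])                          # initial conditions
    for e in resid:                                  # rebuild a pseudo-sample recursively
        y_next = res.intercept + e
        for i in range(p):
            y_next = y_next + res.coefs[i] @ pseudo[-1 - i]
        pseudo.append(y_next)
    boot_res = VAR(np.array(pseudo)).fit(p)          # step 3: re-estimate VAR and IRFs
    boot_irfs.append(boot_res.irf(H).orth_irfs)

lower, upper = np.percentile(np.array(boot_irfs), [5, 95], axis=0)   # step 4: 90% bands
```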

Bayesian inference: Draws from posterior distribution of VAR parameters translate to posterior distribution of impulse responses.


16.5 Local Projections

Motivation

VAR-based impulse responses require:

  • Correct lag length specification

  • Correct model (linear, time-invariant)

  • Full system estimation

Jordà (2005) proposes local projections (LP) as a more flexible alternative.

The Local Projection Estimator

Instead of iterating a VAR forward, estimate the effect at each horizon directly:

$$Y_{t+h} = \alpha^h + \beta^h X_t + \gamma^h Z_t + \varepsilon_{t+h}$$

where:

  • $Y_{t+h}$ is the outcome $h$ periods ahead

  • $X_t$ is the shock or treatment of interest

  • $Z_t$ includes controls (lags of $Y$, other variables)

  • $\beta^h$ is the impulse response at horizon $h$

Run separate regressions for $h = 0, 1, 2, \ldots, H$ to trace out the impulse response.

Definition 16.5 (Local Projection): A local projection estimates the effect of a variable on outcomes at horizon $h$ by regressing $Y_{t+h}$ directly on the variable and controls, rather than iterating a one-step-ahead model.

Comparison with VAR

| Aspect | VAR | Local Projection |
| --- | --- | --- |
| Efficiency | More efficient if VAR correctly specified | Less efficient, more robust |
| Specification | Requires full system | Equation-by-equation |
| Misspecification | Errors compound at long horizons | Each horizon estimated separately |
| Nonlinearities | Hard to incorporate | Easy to incorporate |
| Confidence intervals | Often too narrow | More reliable |

Figure 16.3: Comparing VAR and Local Projection approaches. VAR produces tighter confidence bands when correctly specified, but LP is more robust to misspecification. The key trade-off: VAR offers efficiency; LP offers robustness.

Practical Implementation

Controls: Include lags of dependent variable, lags of shock, and other relevant variables. Common choices:

  • $p$ lags of $Y$

  • $p$ lags of $X$

  • Trend or time effects if needed

Standard errors: Use Newey-West HAC standard errors to account for serial correlation in $\varepsilon_{t+h}$.

Horizon: Choose $H$ based on the economic question (monetary policy effects: 16-20 quarters; long-run effects: longer).
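A minimal sketch of the full LP loop (the simulated shock and outcome stand in for real data; the lag count, horizon, and HAC bandwidth are illustrative choices):

```python
# Local projections: one regression per horizon, with Newey-West (HAC) standard errors.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(5)
T, H, p = 400, 12, 4
shock = rng.standard_normal(T)                       # stand-in for an identified shock series
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.6 * y[t - 1] + 0.5 * shock[t] + rng.standard_normal()

df = pd.DataFrame({"y": y, "shock": shock})
for lag in range(1, p + 1):                          # controls: lags of y and of the shock
    df[f"y_l{lag}"] = df["y"].shift(lag)
    df[f"shock_l{lag}"] = df["shock"].shift(lag)

irf, se = [], []
for h in range(H + 1):
    tmp = df.copy()
    tmp["y_lead"] = tmp["y"].shift(-h)               # outcome h periods ahead
    tmp = tmp.dropna()
    X = sm.add_constant(tmp.drop(columns=["y", "y_lead"]))
    fit = sm.OLS(tmp["y_lead"], X).fit(cov_type="HAC", cov_kwds={"maxlags": h + 1})
    irf.append(fit.params["shock"])                  # beta^h: impulse response at horizon h
    se.append(fit.bse["shock"])
```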

Local Projection IV (LP-IV)

When $X_t$ is endogenous, combine local projections with instrumental variables:

$$Y_{t+h} = \alpha^h + \beta^h X_t + \gamma^h Z_t + \varepsilon_{t+h}$$

with $X_t$ instrumented by $W_t$.

Application: Stock and Watson (2018) use LP-IV with external instruments for monetary policy shocks (high-frequency surprises) to estimate effects on output and prices.


16.6 External Instruments and Proxy SVAR

The Problem with Internal Identification

Traditional SVAR identification (timing, long-run restrictions) relies on assumptions within the model. These assumptions are often controversial and difficult to test.

External identification: Use information from outside the VAR—external instruments—to identify structural shocks.

External Instruments Approach

An external instrument $Z_t$ is correlated with the structural shock of interest $\varepsilon_t^1$ but uncorrelated with the other structural shocks $\varepsilon_t^2, \ldots, \varepsilon_t^k$.

Identification conditions:

  1. Relevance: $E[Z_t \varepsilon_t^1] \neq 0$

  2. Exogeneity: $E[Z_t \varepsilon_t^j] = 0$ for $j \neq 1$

Figure 16.4: External instruments (proxy SVAR) identification. The external instrument (e.g., a narrative monetary shock measure) must be correlated with the shock of interest but uncorrelated with other structural shocks. This allows identification without relying on timing or long-run restrictions within the model.

Estimation (Stock and Watson 2012, Mertens and Ravn 2013), with a code sketch after the steps:

  1. Estimate the reduced-form VAR, obtain residuals $u_t$

  2. Use IV regression of $u_t^1$ on $u_t^{-1}$ (the other equations' residuals) with instrument $Z_t$

  3. Recover structural parameters from IV estimates
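A minimal sketch of the covariance-ratio logic behind steps 1-3 (random data and a random instrument stand in for real residuals and, e.g., high-frequency surprises; the first column is taken to be the policy variable on which the shock is normalized):

```python
# Proxy-SVAR IV step: recover the impact column of the shock of interest up to scale.
import numpy as np
from statsmodels.tsa.api import VAR

rng = np.random.default_rng(6)
data = rng.standard_normal((300, 3))        # stand-in for [rate, output, prices]
z = rng.standard_normal(300)                # stand-in for the external instrument

res = VAR(data).fit(4)                      # step 1: reduced-form VAR
u = res.resid                               # residuals, shape (T - p, k)
z = z[-len(u):]                             # align the instrument with the residual sample

# Step 2: because z is correlated only with the policy shock, Cov(z, u_j) / Cov(z, u_1)
# identifies the relative impact of that shock on variable j.
cov_zu = u.T @ (z - z.mean()) / len(u)
impact_col = cov_zu / cov_zu[0]             # impact responses, normalized to a unit
                                            # impact on the first (policy) variable
# Step 3: combine impact_col with the VAR's MA coefficients to trace dynamic responses.
```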

Sources of External Instruments

High-frequency identification (Gertler and Karadi 2015):

  • Monetary policy surprises measured by asset price changes in narrow windows around FOMC announcements

  • Identifying assumption: only monetary news moves markets in 30-minute windows

Narrative identification (Romer and Romer 2004):

  • Read FOMC minutes and identify policy changes not driven by forecasts of future economic conditions

  • Construct series of "exogenous" policy changes

Natural experiments:

  • Oil supply disruptions due to geopolitical events

  • Weather shocks affecting agricultural production

Example: Monetary Policy Effects

Gertler and Karadi (2015) proxy SVAR approach:

Instrument: Federal funds futures price changes in 30-minute windows around FOMC announcements

Identification logic:

  • In a 30-minute window, the only systematic information is the policy announcement

  • Futures price changes reflect monetary policy surprises

Results:

  • Contractionary monetary shock raises interest rates, reduces output and prices

  • Credit spreads rise (financial accelerator)

  • Effects last several years

This external identification produces results qualitatively similar to, but more precisely estimated than, recursive SVAR.


16.7 Interrupted Time Series

Design Overview

Interrupted time series (ITS) is the time series analogue of regression discontinuity: something changes at a known point in time, and we compare outcomes before and after.

Setup:

  • Outcome observed over time: $Y_t$ for $t = 1, \ldots, T$

  • Intervention at time $T_0$

  • Pre-intervention: $t < T_0$

  • Post-intervention: $t \geq T_0$

Estimation

Segmented regression:

$$Y_t = \beta_0 + \beta_1 t + \beta_2 D_t + \beta_3 (t - T_0) D_t + \varepsilon_t$$

where $D_t = 1$ if $t \geq T_0$.

Interpretation:

  • $\beta_0$: pre-intervention intercept

  • $\beta_1$: pre-intervention trend

  • $\beta_2$: immediate level change at the intervention (jump)

  • $\beta_3$: change in trend after the intervention (slope change)
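A minimal sketch of the segmented regression above (the simulated series, intervention date, and HAC bandwidth are illustrative; in applications the real data replace the simulated `y`):

```python
# Interrupted time series via segmented regression with HAC standard errors.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(7)
T, T0 = 120, 60                                        # monthly series, intervention at t = 60
t = np.arange(T)
D = (t >= T0).astype(float)
y = 2.0 + 0.05 * t + 1.5 * D + 0.08 * (t - T0) * D + rng.standard_normal(T)  # simulated outcome

X = sm.add_constant(pd.DataFrame({"t": t, "D": D, "t_post": (t - T0) * D}))
fit = sm.OLS(y, X).fit(cov_type="HAC", cov_kwds={"maxlags": 6})
print(fit.params)   # const: pre level, t: pre trend, D: level jump, t_post: trend change
```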

Identification Assumption

Continuity in the counterfactual: Absent intervention, the pre-intervention trend would have continued.

This is analogous to RD's continuity assumption but applied temporally:

$$\lim_{t \to T_0^-} E[Y_t(0)] = \lim_{t \to T_0^+} E[Y_t(0)]$$

Threats to Validity

Concurrent events: Other changes occurring at $T_0$ confound the effect.

Anticipation: If actors anticipate the intervention, pre-intervention trends are affected.

Maturation: Natural changes over time (learning, aging) could cause trends independent of intervention.

Instrumentation: Measurement changes coinciding with the intervention.

Strengthening ITS

Multiple pre-intervention periods: More data before $T_0$ allows better characterization of the counterfactual trend.

Control series: A similar series unaffected by the intervention provides a "synthetic" counterfactual (comparative ITS).

Multiple interventions: If the intervention is implemented at different times in different places, variation in timing aids identification.

Example: Policy Implementation

Suppose a country implements a new tax policy on January 1, 2015. ITS analysis would:

  1. Collect monthly tax revenue data for 2010-2019

  2. Estimate the pre-2015 trend

  3. Test for a level shift and trend change in January 2015

  4. Control for other economic variables (GDP, inflation) that might coincidentally change

If the timing of implementation was determined by factors unrelated to revenue trends (e.g., political calendar), the ITS estimate is credible.


16.8 Running Example: Monetary Policy Shocks

The Question

What happens to output and prices when the Federal Reserve unexpectedly tightens monetary policy?

This question is central to macroeconomic policy but challenging because:

  • Monetary policy responds to economic conditions (endogeneity)

  • Effects are dynamic, unfolding over years

  • The U.S. economy is observed only once (no control group)

Evolution of Methods

1. Atheoretic VARs (Sims 1980)

  • Estimate reduced-form VAR of output, prices, interest rate, money

  • Problem: Cannot separate monetary shocks from systematic policy responses

2. Recursive SVAR (Christiano, Eichenbaum, and Evans 1999)

  • Order variables so Fed funds rate responds contemporaneously to output and prices

  • Output and prices respond only with a lag

  • Finding: Contractionary shock raises rates, eventually lowers output and prices

3. Narrative identification (Romer and Romer 2004)

  • Read FOMC transcripts to identify policy changes not responding to forecasts

  • Construct "Romer dates" of exogenous policy shifts

  • Finding: Larger effects than VAR, more precise

4. High-frequency identification (Gertler and Karadi 2015)

  • Use Fed funds futures changes around FOMC as instrument

  • Finding: Consistent with VAR but sharper identification, reveals financial channel

5. Local projections with external instruments (Stock and Watson 2018)

  • Combine LP flexibility with high-frequency instruments

  • Finding: Robust to VAR misspecification, reveals state-dependence

Current Consensus

Across methods, contractionary monetary policy:

  • Raises short-term interest rates (by construction)

  • Lowers output with peak effect after 12-18 months

  • Lowers prices (with some debate about the "price puzzle" in VARs)

  • Tightens financial conditions, widens credit spreads

The agreement across methods strengthens confidence in causal interpretation.

Lessons for Practitioners

  1. Multiple identification strategies: When possible, use several (timing restrictions, external instruments, narrative). Agreement builds credibility.

  2. Dynamic effects: Report full impulse responses, not just contemporaneous effects.

  3. Robustness: Check sensitivity to lag length, sample period, variable set.

  4. State dependence: Effects may differ in recessions vs. expansions (can test with state-dependent LP).


Practical Guidance

When to Use Each Method

| Method | Best For | Caution |
| --- | --- | --- |
| Granger causality | Forecasting, preliminary analysis | Not causal inference |
| Recursive SVAR | Well-understood timing structure | Ordering assumptions may be wrong |
| Long-run SVAR | Distinguishing permanent vs. transitory | Strong theory needed |
| Sign-restricted SVAR | When zeros are too strong | Set identification, wide bounds |
| Local projections | Flexible estimation, state dependence | Less efficient than VAR |
| External instruments | When external shocks available | Instrument validity required |
| Interrupted time series | Clear policy discontinuity | Concurrent confounders |

Common Pitfalls

Pitfall 1: Confusing Granger causality with true causation. Granger causality is predictive precedence, not a causal effect. A common cause that affects $X$ before $Y$ will create Granger causality without true causation.

How to avoid: Use Granger causality for forecasting; use SVAR/LP for causal inference.

Pitfall 2: Ignoring non-stationarity. Running VARs on non-stationary data without cointegration can produce spurious results.

How to avoid: Test for unit roots. Use differences or error-correction models for I(1) data.

Pitfall 3: Over-confidence in timing assumptions. Recursive SVAR requires that "slow-moving" variables don't respond within the period. This may fail with monthly data or high-frequency interactions.

How to avoid: Consider whether identification assumptions match the data frequency. Use external instruments when available.

Pitfall 4: Undersized confidence intervals. VAR-based inference often produces too-narrow confidence intervals, especially at long horizons.

How to avoid: Use bootstrap inference. Compare with LP confidence intervals.

Pitfall 5: Structural breaks. Time series relationships may not be stable over long samples. Oil price effects in the 1970s may differ from today.

How to avoid: Test for breaks. Report subsample results. Use rolling-window estimation.

Implementation Checklist

  • Plot the series and test for unit roots before any regression (ADF, PP, or KPSS)

  • Choose an identification strategy and state its assumptions (timing, long-run, sign restrictions, or external instruments)

  • Select lag length with information criteria and check residual diagnostics

  • Report full impulse responses with bootstrap or HAC-based confidence intervals, not just point estimates

  • Check robustness to lag length, sample period, variable set, and structural breaks


Qualitative Bridge

The Limits of Time Series Causal Inference

Time series methods identify dynamic effects, but they cannot tell us:

  • What mechanisms transmit shocks to outcomes

  • Why policy responded as it did

  • Whether the identified relationships are structural or regime-specific

When to Combine

Understanding policy: Narrative methods like Romer and Romer (2004) blend quantitative coding with qualitative reading of FOMC transcripts. The identification comes from understanding why policy changed.

Mechanism exploration: SVAR tells us that monetary tightening reduces output. Case studies of specific episodes can reveal channels: Was it through credit constraints? Exchange rates? Expectations?

Historical context: Time series estimates may be unstable across regimes. Economic historians can explain why—different monetary frameworks, financial structures, or global conditions.

Example: The Volcker Disinflation

The 1979-1982 Volcker disinflation is a natural experiment in monetary policy. Time series analysis estimates the output cost of reducing inflation. But understanding the episode requires qualitative analysis:

  • Why did Volcker act? Political constraints, credibility concerns, inflation psychology

  • Why did it "work"? Credibility restoration, expectation coordination, changed labor relations

  • Was it necessary? Counterfactual reasoning about alternative approaches

This integration of quantitative estimation with historical understanding is essential for learning from major policy episodes.


Integration Note

Connections to Other Methods

| Method | Relationship | See Chapter |
| --- | --- | --- |
| Difference-in-Differences | ITS is the time series analogue | Ch. 13 |
| Instrumental Variables | External instruments use IV logic | Ch. 12 |
| Synthetic Control | Constructs counterfactual for single unit | Ch. 15 |
| Regression Discontinuity | ITS shares discontinuity logic | Ch. 14 |

Triangulation Strategies

Time series causal estimates gain credibility when:

  1. Multiple identification strategies agree: Recursive SVAR, sign restrictions, and external instruments yield similar impulse responses

  2. Different data frequencies: Monthly and quarterly estimates align

  3. Cross-country evidence: Effects are consistent across countries with similar institutions

  4. Narrative consistency: Quantitative estimates match historical accounts of specific episodes

  5. Structural models: Estimates are consistent with theoretical predictions


Running Example: China's Post-1978 Growth (Time Series Perspective)

Time Series Challenges for China

Can time series methods help identify what drove China's growth? The challenges are severe:

  • n=1: Only one China

  • Regime change: 1978 was a structural break; pre-1978 data may be irrelevant

  • Multiple simultaneous reforms: Liberalization, SEZs, trade opening, agricultural reform occurred together

  • Data quality: Chinese statistics are controversial, especially for pre-reform periods

What Time Series Methods Can Offer

Growth accounting: Decompose growth into contributions from capital accumulation, labor, and TFP. This is descriptive (Ch. 6) but frames causal questions.

Structural breaks: Test for trend breaks in 1978, 1992, 2008 to identify reform effects on the growth trajectory.

Interrupted time series: Treat specific reforms as interventions:

  • Agricultural reform (1978-1984)

  • SOE reform (1990s)

  • WTO accession (2001)

Compare actual outcomes to extrapolated pre-reform trends.

Comparative time series: Construct synthetic counterfactuals using other countries (Ch. 15), then analyze the gap using time series methods.

Limitations

Time series alone cannot answer "what caused China's growth" because:

  • Reforms occurred simultaneously and interactively

  • No pre-reform control period establishes counterfactual trajectories

  • External factors (global trade expansion, technology transfer) confound internal reforms

The China question ultimately requires triangulation across methods: descriptive growth accounting, quasi-experimental evidence on specific policies (SEZs, SOEs), time series structural break analysis, and comparative case study with other developing economies.


Summary

Key takeaways:

  1. Time series causal inference faces distinctive challenges: Serial correlation, non-stationarity, and the n=1 problem require different tools than cross-sectional analysis.

  2. Granger causality is not causation: It's predictive precedence, useful for forecasting and temporal ordering, but not for identifying causal effects.

  3. SVAR identifies structural shocks through timing assumptions, long-run restrictions, or sign restrictions. Each requires different identifying assumptions; none is universally superior.

  4. Local projections provide a flexible alternative to VAR, estimating impulse responses horizon-by-horizon. They're robust to VAR misspecification but less efficient.

  5. External instruments bring cross-sectional identification logic to time series, using outside information (high-frequency data, narrative events) to isolate exogenous variation.

  6. Interrupted time series applies discontinuity logic temporally, comparing trends before and after a discrete intervention.

Returning to the opening question: When data is primarily temporal, causal inference requires exploiting the timing of shocks and interventions. Structural VARs impose economic restrictions on contemporaneous or long-run effects; local projections flexibly estimate dynamic responses; external instruments bring exogenous variation from outside the model. The best practice combines multiple approaches, building credibility through agreement across identification strategies.


Further Reading

Essential

  • Stock and Watson (2001), "Vector Autoregressions" - Accessible introduction to VAR methodology

  • Ramey (2016), "Macroeconomic Shocks and Their Propagation" - Comprehensive survey of SVAR identification

For Deeper Understanding

  • Kilian and Lütkepohl (2017), Structural Vector Autoregressive Analysis - Definitive textbook

  • Jordà (2005), "Estimation and Inference of Impulse Responses by Local Projections" - Original LP paper

  • Stock and Watson (2018), "Identification and Estimation of Dynamic Causal Effects in Macroeconomics" - LP-IV methods

Advanced/Specialized

  • Mertens and Ravn (2013), "The Dynamic Effects of Personal and Corporate Income Tax Changes" - Proxy SVAR with external instruments

  • Gertler and Karadi (2015), "Monetary Policy Surprises, Credit Costs, and Economic Activity" - High-frequency identification

  • Uhlig (2005), "What Are the Effects of Monetary Policy on Output?" - Sign restrictions

Applications

  • Christiano, Eichenbaum, and Evans (1999), "Monetary Policy Shocks" - Canonical recursive SVAR

  • Romer and Romer (2004), "A New Measure of Monetary Shocks" - Narrative identification

  • Blanchard and Quah (1989), "The Dynamic Effects of Aggregate Demand and Supply Disturbances" - Long-run identification


Exercises

Conceptual

  1. Explain the difference between Granger causality and true causation. Construct an example where $X$ Granger-causes $Y$ but $X$ has no causal effect on $Y$.

  2. In a bivariate VAR of output and interest rates, why does the ordering matter for recursive identification? What economic assumptions justify ordering output before interest rates?

  3. Compare VAR-based impulse responses with local projection impulse responses. What are the relative advantages and disadvantages? When would you prefer one over the other?

Applied

  1. Download monthly data on the Federal funds rate, industrial production, and CPI from FRED. Estimate a recursive SVAR with the ordering: industrial production, CPI, Fed funds. Plot impulse responses to a monetary policy shock. How sensitive are results to the ordering?

  2. Using the Romer and Romer (2004) monetary shock series (available online), estimate local projections of industrial production on monetary shocks. Compare your results to SVAR-based estimates.

Discussion

  1. The Lucas critique argues that historical correlations between policy and outcomes may not hold under alternative policy rules. How do modern time series causal methods address (or fail to address) this critique?


Appendix 16A: VAR Estimation Details

Reduced-Form VAR

For a VAR(p) with $k$ variables:

$$Y_t = c + A_1 Y_{t-1} + \ldots + A_p Y_{t-p} + u_t$$

Estimation: Equation-by-equation OLS is consistent and efficient.

Lag selection: Use AIC, BIC, or likelihood ratio tests.
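A minimal sketch with statsmodels (simulated data stands in for the $k$-variable system; the maximum lag is an illustrative choice):

```python
# Reduced-form VAR: lag selection by information criteria, then OLS estimation.
import numpy as np
from statsmodels.tsa.api import VAR

rng = np.random.default_rng(8)
data = rng.standard_normal((300, 3))             # stand-in for a k = 3 system
model = VAR(data)
print(model.select_order(maxlags=8).summary())   # AIC, BIC, HQIC, FPE by lag length
res = model.fit(maxlags=8, ic="bic")             # or model.fit(p) for a fixed lag length
Sigma = res.sigma_u                              # reduced-form residual covariance
P = np.linalg.cholesky(Sigma)                    # recursive factor: Sigma = P P'
```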

Structural Identification

Recursive (Cholesky): $B_0^{-1} = P$, where $\Sigma = PP'$ (Cholesky decomposition).

Long-run restrictions: Impose constraints on the cumulative (long-run) responses $\sum_h \Theta_h$.

Sign restrictions: Search over rotations $Q$ satisfying $QQ' = I$, where $B_0^{-1} = PQ$.


Appendix 16B: Newey-West Standard Errors

For local projections at horizon $h$, the residuals $\varepsilon_{t+h}$ are serially correlated. Use Newey-West HAC standard errors with bandwidth $M \geq h$:

$$\hat{V} = \hat{\Omega}_0 + \sum_{j=1}^{M} w_j (\hat{\Omega}_j + \hat{\Omega}_j')$$

where $\hat{\Omega}_j = \frac{1}{T} \sum_t \hat{u}_t \hat{u}_{t-j}' X_t X_{t-j}'$ and $w_j = 1 - j/(M+1)$ (Bartlett kernel).
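In practice these HAC standard errors need not be coded by hand; a minimal sketch with statsmodels (the regression and bandwidth are illustrative, and the Bartlett kernel is the default):

```python
# Newey-West (HAC) standard errors for a horizon-h local projection regression.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(9)
h = 4                                            # projection horizon
x = rng.standard_normal(300)
y = 0.5 * x + rng.standard_normal(300)           # stand-in regression

fit = sm.OLS(y, sm.add_constant(x)).fit(cov_type="HAC", cov_kwds={"maxlags": h + 1})
print(fit.bse)                                   # HAC standard errors with bandwidth M = h + 1
```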
