Chapter 17: When Point Identification Fails

Opening Question

What can we learn about causal effects when our identifying assumptions are too weak to pin down a single number?


Chapter Overview

The previous chapters have emphasized identification: conditions under which data reveal a causal parameter. Instrumental variables identify effects when exclusion restrictions hold; difference-in-differences identifies them when parallel trends holds; regression discontinuity identifies them when units cannot manipulate the running variable. But what if these assumptions are implausible? What if we're unwilling to assume what identification requires?

The traditional response is to abandon the question or make stronger assumptions. This chapter develops a third option: partial identification. Rather than accepting a dubious point estimate or giving up entirely, we can characterize the set of values consistent with weaker, more defensible assumptions.

Partial identification yields bounds rather than point estimates. The treatment effect might be anywhere from 0.05 to 0.25—a range that still provides useful information. Bounds are wider than point estimates but more honest about what the data can and cannot tell us.

What you will learn:

  • The logic of partial identification and when it's preferable to point identification

  • Manski bounds for missing data and selection

  • Lee bounds for sample selection in experiments

  • How sensitivity analysis connects to bounded identification

  • When bounds are informative and when they're too wide to be useful

  • The intellectual virtue of honest uncertainty

Prerequisites: Chapter 9 (Causal Framework), Chapter 11 (Selection on Observables), Chapter 12 (Instrumental Variables)


17.1 The Logic of Partial Identification

From Point to Partial

Consider estimating the effect of college on earnings. We observe:

  • $Y_i^{obs}$: observed earnings

  • $D_i$: college attendance indicator

  • $X_i$: covariates

Under selection on observables (Chapter 11), we assume: $Y_i(0), Y_i(1) \perp D_i | X_i$

This is strong: it requires that, conditional on $X$, college attendance is as-if random. If unmeasured ability affects both college choice and earnings, this assumption fails.

Traditional response: Either (1) assume conditional independence anyway and report the biased estimate, or (2) find an instrument and impose exclusion restrictions, or (3) abandon causal interpretation entirely.

Partial identification response: What can we learn without assuming conditional independence? What bounds on the treatment effect are logically implied by the data and minimal assumptions?

The Identification Region

Definition 17.1 (Identification Region): The identification region $\Theta^*$ is the set of parameter values consistent with the data and maintained assumptions: $\Theta^* = \{\theta : \text{data and assumptions are compatible with } \theta\}$

Point identification means $\Theta^*$ contains a single value. Partial identification means $\Theta^*$ is a set (an interval, a union of intervals, or something more complex).

Why Partial Identification?

Intellectual honesty: Point estimates convey false precision when identifying assumptions are questionable. Bounds honestly represent uncertainty.

Assumption transparency: Partial identification makes assumptions visible. The width of bounds reveals how much identifying assumptions "buy."

Robustness: A policy conclusion that holds across the entire bounds interval is robust to identification concerns.

Decision-making: Bounds can still guide decisions. If the effect is positive across the entire interval, the policy implication is clear even without point identification.


17.2 Manski Bounds

The Missing Data Problem

The fundamental problem of causal inference is a missing data problem: we never observe both $Y_i(0)$ and $Y_i(1)$ for the same unit. Charles Manski's work formalizes what this missing data implies for identification.

Bounds on the Average Treatment Effect

Consider the simplest setting: we want to know $E[Y(1) - Y(0)]$ but observe only $Y_i(D_i)$, where $D_i$ is treatment status.

We can write: $E[Y(1)] = E[Y(1)|D=1]P(D=1) + E[Y(1)|D=0]P(D=0)$

We observe $E[Y|D=1]$ and $E[Y|D=0]$. Under no assumptions:

  • $E[Y(1)|D=1]$ is identified: it equals $E[Y|D=1]$

  • $E[Y(1)|D=0]$ is not identified: it is the counterfactual outcome for the untreated

No-Assumptions Bounds

If $Y$ is bounded in $[Y_L, Y_U]$, then: $E[Y(1)|D=0] \in [Y_L, Y_U]$

This yields bounds on $E[Y(1)]$: $E[Y|D=1]P(D=1) + Y_L \cdot P(D=0) \leq E[Y(1)] \leq E[Y|D=1]P(D=1) + Y_U \cdot P(D=0)$

Similarly for E[Y(0)]E[Y(0)]. The bounds on the ATE are:

Theorem 17.1 (Manski No-Assumptions Bounds): If $Y \in [Y_L, Y_U]$, the ATE is bounded by $\Delta_L \leq E[Y(1) - Y(0)] \leq \Delta_U$, where: $\Delta_L = E[Y|D=1]P(D=1) + Y_L \cdot P(D=0) - E[Y|D=0]P(D=0) - Y_U \cdot P(D=1)$ and $\Delta_U = E[Y|D=1]P(D=1) + Y_U \cdot P(D=0) - E[Y|D=0]P(D=0) - Y_L \cdot P(D=1)$

Intuition: The worst case against a positive effect is that untreated units would have had the lowest possible outcomes under treatment, while treated units would have had the highest possible outcomes under control. The lower bound encodes this scenario; the upper bound encodes its mirror image.

Width of No-Assumptions Bounds

The bounds width is $\Delta_U - \Delta_L = Y_U - Y_L$: the entire range of the outcome. For many applications, this is too wide to be informative.

Example: Estimating returns to college on earnings

  • $Y_L = 0$ (earnings cannot be negative)

  • $Y_U = \$1{,}000{,}000$ (a practical maximum)

  • $P(D=1) = 0.3$ (30% attend college)

The no-assumptions bounds span nearly the entire earnings range—uninformative without additional restrictions.
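
To make Theorem 17.1 concrete, here is a minimal Python sketch of the no-assumptions bounds. The conditional means ($60{,}000 for the treated, $40{,}000 for the untreated) are hypothetical numbers chosen for illustration, not estimates from any dataset:

```python
def manski_bounds(y1_mean, y0_mean, p_treat, y_min, y_max):
    """No-assumptions bounds on the ATE (Theorem 17.1).

    Missing counterfactuals are imputed at the extremes of the
    outcome's logical range [y_min, y_max]."""
    p_control = 1 - p_treat
    # Bounds on E[Y(1)]: observed for the treated, worst/best case for controls
    ey1_lo = y1_mean * p_treat + y_min * p_control
    ey1_hi = y1_mean * p_treat + y_max * p_control
    # Bounds on E[Y(0)]: observed for controls, worst/best case for the treated
    ey0_lo = y0_mean * p_control + y_min * p_treat
    ey0_hi = y0_mean * p_control + y_max * p_treat
    return ey1_lo - ey0_hi, ey1_hi - ey0_lo

# College example: Y_L = 0, Y_U = $1,000,000, P(D=1) = 0.3,
# with hypothetical observed-group means
lo, hi = manski_bounds(60_000, 40_000, p_treat=0.3, y_min=0, y_max=1_000_000)
print(f"ATE in [{lo:,.0f}, {hi:,.0f}]")  # width is exactly y_max - y_min
```

The width equals $Y_U - Y_L$ regardless of the observed means, which is why the next subsection turns to assumptions that shrink it.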

Tightening Bounds with Assumptions

Monotone treatment response (MTR): Assume treatment doesn't hurt anyone: $Y_i(1) \geq Y_i(0)$ for all $i$

This rules out negative treatment effects: the lower bound of the ATE rises to zero, cutting away the negative half of the no-assumptions region.

Monotone treatment selection (MTS): Assume people who select treatment have weakly better untreated outcomes: $E[Y(0)|D=1] \geq E[Y(0)|D=0]$

This rules out selection driven by low baseline outcomes.

Monotone instrumental variables (MIV): For an instrument $Z$, assume: $E[Y(d)|Z=z_1] \geq E[Y(d)|Z=z_2]$ when $z_1 > z_2$

This imposes monotonicity in the instrument without requiring exclusion.

Box: Understanding MTR, MTS, and MIV—Returns to Education

These assumptions are subtle. A concrete example helps distinguish them.

Setting: We want to bound the returns to a college degree ($D$) on earnings ($Y$).

| Assumption | What It Says | When Credible |
| --- | --- | --- |
| MTR | $Y_i(\text{college}) \geq Y_i(\text{no college})$ for all $i$ | College never hurts anyone's earnings. Plausible if education only adds skills, not if signaling crowds out experience. |
| MTS | $E[Y(0) \mid D=1] \geq E[Y(0) \mid D=0]$ | College-goers would earn more than non-goers even without college. Captures positive selection on ability; violated if low-ability people attend due to affirmative action. |
| MIV | $E[Y(d) \mid Z=z_1] \geq E[Y(d) \mid Z=z_2]$ when $z_1 > z_2$ | Using parental education as $Z$: people with more-educated parents have higher potential earnings at any education level. Does not assume parents' education affects child earnings only through the child's own education. |

Key distinction:

  • MTR restricts individual treatment effects (no one is harmed)

  • MTS restricts selection patterns (who chooses treatment)

  • MIV restricts how potential outcomes vary with an observable (without requiring exclusion)

Combining assumptions: Manski and Pepper (2000) show that combining MTR + MTS + MIV with parental education tightens returns-to-schooling bounds dramatically—from nearly [−100%, +100%] to approximately [6%, 15%].

The tradeoff: Tighter bounds require stronger assumptions. A researcher uncomfortable with MTR ("maybe some people learn best on the job") gets wider bounds. The partial identification framework makes this tradeoff explicit.

Each assumption tightens bounds. The researcher chooses which assumptions are credible and reports the resulting bounds.
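
The following sketch shows how the identification region shrinks as assumptions are layered on. It is a simplified illustration in the spirit of Manski and Pepper (2000): MTS is imposed on both potential outcomes, and the inputs are the same hypothetical numbers used above:

```python
def bounds_by_assumption(y1_mean, y0_mean, p_treat, y_min, y_max):
    """Identification regions under progressively stronger assumptions.
    A simplified sketch: MTS is applied to both Y(0) and Y(1)."""
    p_c = 1 - p_treat
    naive = y1_mean - y0_mean
    # No assumptions (Theorem 17.1): worst-case imputation
    lo = y1_mean * p_treat + y_min * p_c - (y0_mean * p_c + y_max * p_treat)
    hi = y1_mean * p_treat + y_max * p_c - (y0_mean * p_c + y_min * p_treat)
    return {
        "no assumptions": (lo, hi),
        # MTR: Y_i(1) >= Y_i(0) for all i rules out negative effects
        "MTR": (0.0, hi),
        # MTR + MTS: positive selection makes the observed gap an upper bound
        "MTR + MTS": (0.0, naive),
    }

for name, (lo, hi) in bounds_by_assumption(
        60_000, 40_000, 0.3, 0, 1_000_000).items():
    print(f"{name:>15}: [{lo:>10,.0f}, {hi:>10,.0f}]")
```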

Figure 17.1 (Identification Regions): How assumptions narrow the identification region. With no assumptions, bounds span a wide range. Adding MTR, MTS, or both progressively tightens the identified set. Combining multiple assumptions with an instrumental variable can approach point identification. The width of each bar shows the remaining uncertainty under that assumption set.


17.3 Lee Bounds for Sample Selection

The Problem

Experiments often suffer from differential attrition: treated and control groups have different dropout rates. If dropouts differ systematically from completers, comparing observed outcomes is biased.

Example: A job training RCT

  • Treatment: job training program

  • Control: no training

  • Outcome: employment at 12 months

  • Problem: 80% of treatment group completes follow-up, but only 60% of control group

The 80% vs. 60% difference could reflect the program helping people stay in the study (good) or the program selecting different types into the sample (bad for identification).

Lee (2009) Bounds

David Lee's bounds address sample selection by trimming the sample to make selection rates equal.

Key assumption: Treatment affects selection only by changing who is observed, not by adding "new types."

Assumption 17.1 (Monotonicity in Selection): For all units, $S_i(1) \geq S_i(0)$ (treatment never causes exit) or $S_i(1) \leq S_i(0)$ (treatment never causes retention).

Trimming procedure: If treatment increases retention (more treated observed than control), trim the treatment group to match the control selection rate:

  1. Calculate the proportion excess: $p = 1 - P(S=1|D=0)/P(S=1|D=1)$

  2. Trim the fraction $p$ of the treatment group, from either the top or the bottom of the outcome distribution

  3. Lower bound: trim from above. Upper bound: trim from below.

Theorem 17.2 (Lee Bounds): Under monotonicity in selection, the ATE for always-observed units is bounded by: $[\bar{Y}_{D=1}^{trim,lower} - \bar{Y}_{D=0},\; \bar{Y}_{D=1}^{trim,upper} - \bar{Y}_{D=0}]$

Intuition: We don't know which treated units are "marginal" (would have dropped out absent treatment). We bound the effect by assuming they have extreme outcomes (highest or lowest).

Example: Job Training Evaluation

Continuing the example:

  • Treatment completion: 80%

  • Control completion: 60%

  • Proportion excess: $p = 1 - 60/80 = 0.25$

We trim 25% of the treatment group. For the lower bound, drop the top 25% of earners among treated. For the upper bound, drop the bottom 25%.

If observed mean earnings are:

  • Control: $25,000

  • Treatment: $30,000

  • Treatment (trim high): $27,000

  • Treatment (trim low): $33,000

Lee bounds: [$27,000 - $25,000, $33,000 - $25,000] = [$2,000, $8,000]

The point estimate without bounding is $5,000, comfortably inside the bounds.
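
Below is a minimal sketch of the trimming procedure on simulated data calibrated to this example (80% vs. 60% retention; the outcome distributions are hypothetical). It assumes treatment weakly increases retention, per Assumption 17.1:

```python
import numpy as np

def lee_bounds(y, d):
    """Lee (2009) trimming bounds. y holds np.nan for attriters; d is the
    randomized assignment. Assumes retention is weakly higher under
    treatment (otherwise swap the roles of the two arms)."""
    y, d = np.asarray(y, float), np.asarray(d, int)
    obs = ~np.isnan(y)
    p = 1 - obs[d == 0].mean() / obs[d == 1].mean()  # excess treated share
    yt = np.sort(y[(d == 1) & obs])                  # observed treated, ascending
    k = int(round(p * len(yt)))                      # number of units to trim
    y0_mean = y[(d == 0) & obs].mean()
    return (yt[:len(yt) - k].mean() - y0_mean,       # trim top -> lower bound
            yt[k:].mean() - y0_mean)                 # trim bottom -> upper bound

rng = np.random.default_rng(0)
n = 20_000
d = rng.integers(0, 2, n)
y = rng.normal(25_000 + 5_000 * d, 6_000)           # true effect: $5,000
y[rng.random(n) >= np.where(d == 1, 0.80, 0.60)] = np.nan  # differential attrition
print(lee_bounds(y, d))  # an interval bracketing $5,000; width reflects the 25% trim
```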

When Are Lee Bounds Informative?

Lee bounds are informative when:

  • Selection rate differences are small

  • Outcome variation is moderate

  • Sample sizes are large enough for precise trimmed means

They are uninformative when:

  • Selection rates differ dramatically

  • Outcome distributions have fat tails

  • The trimmed portions contain most of the signal


17.4 Bounds in Instrumental Variables

Relaxing Exclusion

Standard IV requires the exclusion restriction: the instrument affects outcomes only through treatment. When this is questionable, we can partially identify effects by bounding the direct effect of the instrument.

Relaxed exclusion: Instead of assuming zero direct effect, assume the direct effect is bounded: $|\gamma| \leq \delta$

where $\gamma$ is the instrument's direct effect on the outcome.

Resulting bounds: $\beta \in \left[\frac{Cov(Y,Z)}{Cov(D,Z)} - \frac{\delta}{|Cov(D,Z)/Var(Z)|},\; \frac{Cov(Y,Z)}{Cov(D,Z)} + \frac{\delta}{|Cov(D,Z)/Var(Z)|}\right]$

As $\delta \to 0$, the bounds collapse to the IV point estimate. As $\delta$ increases, they widen.
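
A sketch of these bounds in the spirit of Conley, Hansen, and Rossi (2012), ignoring sampling uncertainty (their paper also provides inference methods). The simulated instrument and coefficients are hypothetical:

```python
import numpy as np

def relaxed_exclusion_bounds(y, d, z, delta):
    """Bounds on beta when the instrument's direct effect gamma is only
    assumed to satisfy |gamma| <= delta. Sampling error is ignored."""
    y, d, z = (np.asarray(a, float) for a in (y, d, z))
    beta_iv = np.cov(y, z)[0, 1] / np.cov(d, z)[0, 1]  # usual IV estimand
    pi = np.cov(d, z)[0, 1] / np.var(z, ddof=1)        # first-stage slope
    return beta_iv - delta / abs(pi), beta_iv + delta / abs(pi)

rng = np.random.default_rng(1)
n = 50_000
z = rng.integers(0, 2, n).astype(float)        # hypothetical binary instrument
d = 12 + 2.0 * z + rng.normal(0, 2, n)         # first stage: z adds 2 years
y = 0.08 * d + 0.01 * z + rng.normal(0, 1, n)  # small direct effect violates exclusion
print(relaxed_exclusion_bounds(y, d, z, delta=0.02))  # interval covering 0.08
```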

Bounds with Weak Instruments

Weak instruments create another identification problem. Rather than compute unreliable point estimates, we can report Anderson-Rubin confidence sets that remain valid regardless of instrument strength.

These confidence sets are bounds: they include all parameter values not rejected by the data.


17.5 Sensitivity Analysis as Partial Identification

The Connection

Sensitivity analysis (Chapter 11) asks: how much unmeasured confounding would be needed to explain away an estimated effect? This implicitly produces bounds.

Oster (2019) bounds: Compare how coefficients change when adding observed controls. Bound the effect of unobservables by extrapolating from observables.

Rosenbaum bounds: In matched studies, bound the treatment effect under varying levels of unmeasured confounding.

E-values: Report the minimum confounding strength needed to explain away the effect.

From Sensitivity to Bounds

A sensitivity analysis answers: "If confounding had strength $\Gamma$, what effects would be consistent with the data?"

This maps to partial identification:

  • For each $\Gamma$, compute the set of compatible effects

  • The union over plausible $\Gamma$ values gives the identification region

Example: If the observed effect is $\hat{\beta} = 0.15$ and sensitivity analysis shows:

  • $\Gamma = 1$ (no confounding): $\beta \in [0.10, 0.20]$

  • $\Gamma = 1.5$: $\beta \in [0.02, 0.28]$

  • $\Gamma = 2$: $\beta \in [-0.05, 0.35]$

If we're willing to assume $\Gamma \leq 1.5$, the bounds are $[0.02, 0.28]$: positive throughout.
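
The mapping from a sensitivity analysis to an identification region is mechanical, as this small sketch shows using the intervals from the example above:

```python
def identification_region(intervals_by_gamma, gamma_max):
    """Union of effect intervals over all confounding strengths the
    analyst considers plausible. With nested intervals (as here), the
    union is simply the widest retained interval."""
    kept = [iv for g, iv in intervals_by_gamma.items() if g <= gamma_max]
    return min(lo for lo, _ in kept), max(hi for _, hi in kept)

intervals = {1.0: (0.10, 0.20), 1.5: (0.02, 0.28), 2.0: (-0.05, 0.35)}
print(identification_region(intervals, gamma_max=1.5))  # (0.02, 0.28)
```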


17.6 Proximal Causal Inference

Using Proxies for Unobserved Confounders

When confounders are unobserved, proxies may help. Proximal causal inference (Tchetgen Tchetgen et al., 2020) formalizes conditions under which proxies for unmeasured confounding enable identification or tighten bounds.

Setup:

  • $U$: unmeasured confounder

  • $W$: proxy for $U$ (affected by $U$, not by $D$ or $Y$ directly)

  • $Z$: proxy for $U$ (a different proxy from $W$)

Identification result: Under conditions relating proxies to the confounder, treatment effects can be identified or bounded even without observing UU.

Intuition

If we have two proxies for the same confounder, each provides partial information about $U$. Together, they may provide enough information to control for $U$'s confounding influence.

Example: Estimating the effect of air pollution ($D$) on health ($Y$)

  • $U$: socioeconomic status (unmeasured)

  • $W$: neighborhood housing values (a proxy for SES)

  • $Z$: car ownership (another proxy for SES)

Neither proxy perfectly measures SES, but together they may triangulate its confounding influence.

Limitations

Proximal causal inference requires:

  • Multiple proxies for the same confounder

  • Specific independence conditions

  • Sufficient proxy quality

When conditions are only approximately satisfied, the method produces bounds rather than point identification.


17.7 When to Report Bounds

The Trade-off

Point estimates are precise but may be wrong if assumptions fail. Bounds are honest about uncertainty but may be too wide for policy guidance.

Factors favoring bounds:

  • Identifying assumptions are questionable

  • Bounds are informative (narrow enough to guide decisions)

  • Audience values honesty over precision

  • Stakes are high enough to warrant extra caution

Factors favoring point estimates:

  • Assumptions are widely accepted

  • Bounds are so wide they're uninformative

  • The estimate is understood as one of many inputs

  • Consumers of research prefer precise (even if wrong) numbers

Reporting Strategy

Best practice: Report both.

  1. Main point estimate under standard assumptions

  2. Sensitivity analysis or bounds under weaker assumptions

  3. Discussion of what assumptions are required for each

This lets readers with different credences in assumptions draw different conclusions.

Example: Returns to Education

Point estimate: IV using compulsory schooling laws finds ~8% return per year of schooling.

Concerns: Exclusion restriction (compulsory schooling may affect outcomes through channels other than years of schooling), LATE interpretation (effect for compliers may differ from population average).

Bounds approach:

  • Under monotonicity alone (education doesn't hurt earnings): $[0, +\infty)$

  • Under monotonicity + bounded effect heterogeneity: $[0.03, 0.15]$

  • Under relaxed exclusion ($|\gamma| \leq 0.02$): $[0.05, 0.11]$

The bounds narrow as assumptions strengthen. Readers can choose which assumptions to believe.


17.8 Inference for Partially Identified Parameters

Confidence Intervals for Bounds

Bounds are estimated, not known. We need confidence intervals for the identification region itself.

Imbens and Manski (2004): Confidence intervals for interval-identified parameters: $CI = [\hat{\theta}_L - c \cdot \hat{se}_L,\; \hat{\theta}_U + c \cdot \hat{se}_U]$

where $c$ is chosen to achieve correct coverage for the true parameter anywhere in the identified set, not merely at the endpoints.
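
A sketch of the Imbens-Manski critical value, which solves $\Phi(c + \hat{\Delta}/\max(\hat{se}_L, \hat{se}_U)) - \Phi(-c) = 1 - \alpha$ with $\hat{\Delta} = \hat{\theta}_U - \hat{\theta}_L$; the standard errors below are hypothetical:

```python
from scipy.optimize import brentq
from scipy.stats import norm

def imbens_manski_ci(theta_lo, theta_hi, se_lo, se_hi, alpha=0.05):
    """CI for an interval-identified parameter (Imbens and Manski, 2004).
    c interpolates between the one-sided z_{1-alpha} (wide identified set)
    and the two-sided z_{1-alpha/2} (point identification)."""
    delta = (theta_hi - theta_lo) / max(se_lo, se_hi)
    c = brentq(lambda x: norm.cdf(x + delta) - norm.cdf(-x) - (1 - alpha),
               1e-8, 10.0)
    return theta_lo - c * se_lo, theta_hi + c * se_hi

# Bounds of [0.02, 0.28] with hypothetical standard errors of 0.03
print(imbens_manski_ci(0.02, 0.28, se_lo=0.03, se_hi=0.03))
```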

Issues with Inference

Empty intersection: With multiple maintained assumptions, the estimated bounds may have an empty intersection in finite samples even when every assumption is correct.

Conservative inference: Standard methods for bounds are conservative—actual coverage often exceeds nominal.

Bootstrap: Resampling methods can construct confidence sets for bounds, but require care with set-valued parameters.


17.9 Running Example: Returns to Education

The Identification Challenge

The returns to education question illustrates partial identification themes:

What we want: The causal effect of an additional year of schooling on earnings.

Why point identification is hard:

  • Selection: More able people get more schooling

  • Omitted variables: Family background, innate ability, motivation

  • Measurement: Years of schooling may not capture quality

IV approach (Chapter 12): Use compulsory schooling laws as instruments. But:

  • Exclusion restriction is questionable (schooling laws may affect outcomes through peer effects, credential effects)

  • LATE is for compliers, not the population

Bounds Analysis

No-assumptions bounds: If we only assume earnings are non-negative and bounded by $1 million, bounds span nearly the entire range—uninformative.

Monotonicity bounds: Assuming more education doesn't reduce earnings:

  • Lower bound: 0 (education has no positive effect)

  • Upper bound: Observed college premium (upper bound on the causal effect)

This tells us: the effect is non-negative but could be anywhere from zero to the observed correlation.

IV bounds with relaxed exclusion: Following Conley et al. (2012), allow the instrument to have a small direct effect:

  • If $|\gamma| \leq 0.01$, bounds are approximately $[0.06, 0.10]$

  • If $|\gamma| \leq 0.03$, bounds are approximately $[0.03, 0.13]$

Sensitivity analysis (Altonji et al. 2005, Oster 2019):

  • How much would selection on unobservables need to exceed selection on observables to explain away the return?

  • Estimates suggest selection on unobservables would need to be 2-3 times stronger than selection on observables: implausibly strong

What We Learn

Combining approaches:

  • Returns to education are almost certainly positive

  • Plausible range: 3-15% per year

  • Point estimate of ~8% is within the plausible range

  • Uncertainty is real but doesn't reverse conclusions

This is more honest than reporting "8% ± 2%" when the true uncertainty is much larger.


Practical Guidance

When to Use Partial Identification

| Situation | Recommended Approach |
| --- | --- |
| Point-identifying assumptions are credible | Report point estimate + sensitivity analysis |
| Some assumptions questionable | Report bounds under alternative assumptions |
| Major assumption clearly false | Use bounds or weakest defensible assumptions |
| Policy decision required | Report bounds; check if decision is robust |
| Academic credibility paramount | Report both; let readers choose assumptions |

Common Pitfalls

Pitfall 1: Dismissing bounds as uninformative

Wide bounds still provide information—they tell you what you don't know. "The effect is somewhere between -0.1 and 0.5" is more honest than "the effect is 0.2 ± 0.05 (assuming my model is correct)."

How to avoid: Report bounds even when wide. Discuss what additional assumptions or data would narrow them.

Pitfall 2: Cherry-picking bounds

Choosing assumptions to get narrow bounds defeats the purpose.

How to avoid: Report bounds under multiple assumption sets. Be transparent about which assumptions you find most credible and why.

Pitfall 3: Ignoring sampling uncertainty in bounds

Bounds are estimated quantities with their own standard errors.

How to avoid: Report confidence intervals for bounds endpoints. Use methods designed for partially identified parameters.

Pitfall 4: Confusing identification failure with estimation failure

Wide bounds may reflect weak identification (a data limitation) or weak assumptions (a modeling choice). These have different implications.

How to avoid: Distinguish between "the data are insufficient" and "I'm unwilling to assume enough."


Qualitative Bridge

The Value of Honest Uncertainty

Partial identification embodies a commitment to honest uncertainty—acknowledging the limits of what data can tell us. This connects to qualitative research traditions that:

  • Emphasize the complexity of social phenomena

  • Resist oversimplified causal claims

  • Value thick description over point estimates

When to Combine

Understanding assumptions: Qualitative knowledge helps assess which identifying assumptions are credible. Fieldwork, interviews, and institutional analysis reveal whether exclusion restrictions hold, whether selection is monotone, whether treatment effects are bounded.

Interpreting bounds: Wide bounds indicate genuine uncertainty, but don't tell us why identification is hard. Qualitative analysis can explain the sources of confounding and suggest what additional data or design would narrow bounds.

Communicating uncertainty: Policy audiences may find bounds confusing. Case studies and narrative can convey what uncertain estimates mean for real decisions.

Example: Evaluating Education Interventions

Bounds on educational intervention effects may span positive and negative. Qualitative evidence helps interpret this:

  • Process observation: Does the intervention seem to work in classrooms?

  • Teacher interviews: What are implementation challenges?

  • Student focus groups: How do students experience the intervention?

This evidence doesn't narrow statistical bounds but helps judge where in the bounds the true effect likely lies.


Integration Note

Connections to Other Methods

| Method | Relationship | See Chapter |
| --- | --- | --- |
| Selection on Observables | Sensitivity analysis produces bounds | Ch. 11 |
| Instrumental Variables | Relaxed exclusion yields bounds | Ch. 12 |
| RDD | Extrapolation from cutoff produces bounds | Ch. 14 |
| DiD | Sensitivity to parallel trends gives bounds | Ch. 13 |

Triangulation Strategies

Bounds from different methods may overlap, reinforcing conclusions:

  1. Different bounding assumptions: Do bounds from MTR, MTS, and MIV intersect?

  2. Different research designs: Do IV bounds overlap with selection-on-observables bounds?

  3. Different datasets: Are bounds consistent across data sources?

Overlap of multiple bounds provides stronger evidence than any single bound.
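
A tiny helper makes the triangulation logic explicit. Intersecting bounds is a valid summary only if every maintained assumption set is correct; an empty intersection is itself diagnostic. The intervals below are hypothetical:

```python
def intersect_bounds(intervals):
    """Intersection of bounds from different designs or assumption sets.
    Returns None when the intervals are incompatible, which signals that
    at least one assumption set is wrong (or sampling error is large)."""
    lo = max(l for l, _ in intervals)
    hi = min(h for _, h in intervals)
    return (lo, hi) if lo <= hi else None

# Hypothetical returns-to-schooling bounds from three designs
print(intersect_bounds([(0.00, 0.15), (0.03, 0.13), (0.05, 0.11)]))  # (0.05, 0.11)
```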


Summary

Key takeaways:

  1. Partial identification produces bounds instead of point estimates when identifying assumptions are too weak for point identification.

  2. Manski bounds show what can be learned from data alone (often very little) and how monotonicity or other assumptions tighten bounds.

  3. Lee bounds address sample selection in experiments by trimming to equalize selection rates, under monotonicity assumptions.

  4. Sensitivity analysis is implicit partial identification—it maps assumptions about confounding strength to sets of compatible effects.

  5. Bounds provide honest uncertainty about causal effects. They're wider than point estimates but don't rely on questionable assumptions.

  6. Report both when possible: point estimates under standard assumptions, bounds under weaker assumptions. Let readers choose their credence.

Returning to the opening question: When identifying assumptions are too weak to pin down a single number, we can still learn from data. Partial identification characterizes the set of parameter values consistent with the data and minimal assumptions. These bounds may be wide, but they honestly represent what we know—and don't know—about causal effects.


Further Reading

Essential

  • Manski (2003), Partial Identification of Probability Distributions - The foundational treatment

  • Tamer (2010), "Partial Identification in Econometrics" - Accessible survey

For Deeper Understanding

  • Manski (1990), "Nonparametric Bounds on Treatment Effects" - Original treatment effect bounds

  • Lee (2009), "Training, Wages, and Sample Selection" - Lee bounds derivation and application

  • Imbens and Manski (2004), "Confidence Intervals for Partially Identified Parameters" - Inference methods

Advanced/Specialized

  • Conley, Hansen, and Rossi (2012), "Plausibly Exogenous" - Bounds with imperfect instruments

  • Tchetgen Tchetgen et al. (2020), "Introduction to Proximal Causal Inference" - Proxies for confounders

  • Molinari (2020), "Microeconometrics with Partial Identification" - Comprehensive treatment

Applications

  • Manski and Pepper (2000), "Monotone Instrumental Variables" - Returns to schooling bounds

  • Blundell et al. (2007), "Changes in the Distribution of Male and Female Wages" - Bounds on wage distribution changes

  • Kline and Santos (2012), "A Score Based Approach to Wild Bootstrap Inference" - Inference for bounds


Exercises

Conceptual

  1. Explain the difference between point identification and partial identification. When is partial identification preferable to (a) imposing additional assumptions for point identification, or (b) abandoning causal inference entirely?

  2. In Lee bounds, what is the monotonicity assumption? Construct an example where this assumption fails and explain why Lee bounds would be invalid.

  3. How does sensitivity analysis relate to partial identification? Show how Oster (2019) bounds can be reframed as partial identification under assumptions about selection.

Applied

  1. Using data from a job training program evaluation:

    • Calculate naive treatment effect estimates

    • If there is differential attrition, compute Lee bounds

    • Discuss what the bounds tell you about the program's effectiveness

  2. For the returns to education question:

    • State Manski's no-assumptions bounds (given plausible outcome bounds)

    • Add monotonicity (education doesn't reduce earnings) and compute tighter bounds

    • Discuss whether these bounds are narrow enough to be policy-relevant

Discussion

  1. A policymaker says: "I need a number, not a range. Bounds are useless for decision-making." How would you respond? When are bounds useful for policy decisions, and when are they genuinely uninformative?
