Chapter 25: Research Practice

Opening Question

Beyond knowing the right methods, what makes the difference between research that contributes to knowledge and research that misleads or is forgotten?


Chapter Overview

The preceding chapters have covered methods: how to identify causal effects, describe patterns, combine evidence, and handle uncertainty. But methods are tools. Using them well requires attention to research practice---the organization, execution, communication, and ethics of empirical work.

This chapter addresses what separates good research from technically correct but ultimately unhelpful work. Good research practice is reproducible, transparent, and honest about uncertainty. It communicates clearly, both to specialists and broader audiences. And it considers the ethical dimensions of working with data about human beings.

These issues have taken on new urgency as replication crises in multiple fields have revealed how easily smart researchers using valid methods can produce misleading results. The problem is not usually fraud. It's the accumulation of small choices---which specifications to report, how to frame results, what to emphasize---that systematically biases what we learn.

What you will learn:

  • How to organize research projects for reproducibility and collaboration

  • When and how to pre-register analyses

  • How to write and present empirical research effectively

  • How to present uncertainty honestly, avoiding the pitfalls of null hypothesis significance testing

Prerequisites: General familiarity with empirical methods from earlier chapters


Historical Context: From Heroic Research to Reproducible Science

For much of the 20th century, empirical social science followed what we might call the "heroic" model. A researcher would develop a question, collect or access data, analyze it using their judgment about appropriate methods, and publish findings. Replication was rare. Data sharing was optional. The researcher's expertise and reputation served as the primary guarantee of quality.

This model came under sustained challenge beginning in the late 1980s. Dewald, Thursby, and Anderson (1986) found that only 7% of economics papers could be replicated with the authors' data. The "credibility revolution" in economics (Angrist and Pischke 2010) emphasized research design over researcher authority. And the broader "replication crisis"---sparked by spectacular failures to replicate in psychology (Open Science Collaboration 2015), medicine (Ioannidis 2005), and eventually economics---forced a reckoning with how research is actually conducted.

The response has included new infrastructure (data repositories, pre-registration platforms), new incentives (journals requiring data availability, badges for open practices), and new norms (distinguishing exploratory from confirmatory analysis). Economics has been slower to adopt some of these changes than psychology, but movement is visible. The AEA's data and code availability policy (2019), the growth of pre-analysis plans for experiments, and increased attention to robustness and specification curves all reflect evolving standards.

This chapter distills emerging best practices while acknowledging ongoing debates about what's appropriate for different types of research.

Box: What Do We Know About Peer Review?

Peer review is the gatekeeping mechanism for scientific knowledge, yet we have surprisingly little evidence about how well it works. Li (2017) provides one of the most rigorous studies, examining NIH grant review using a clever identification strategy.

Her key findings: Reviewers are both more informed and more biased when evaluating research related to their own work. They can better assess quality in their area of expertise, but they also favor proposals similar to their own research. The net effect? Expertise dominates bias---reviewers' specialized knowledge improves selection more than their conflicts of interest harm it.

The identification exploits variation in whether experts are permanent committee members (who evaluate many related proposals) versus temporary members (who serve occasionally). This variation in reviewer composition is plausibly unrelated to proposal quality, allowing causal identification.

For research practice, this suggests: Seek informed reviewers despite potential bias. And for your own work: Understand that review is imperfect but not arbitrary. Rejection doesn't necessarily mean your work is bad, and acceptance doesn't guarantee it's good.


25.1 Project Workflow

Organizing for Reproducibility

A reproducible research project is one where another researcher (or your future self) can understand what was done and verify the results. This requires organization from the start, not cleanup at the end.

Principle 25.1: Reproducibility by Design Build reproducibility into project workflow from day one. It is far easier to maintain clean organization than to impose it on a messy project.

Project Structure

A well-organized empirical project follows a consistent structure that separates raw data, code, and outputs. The key principles:

  1. Separation of raw and processed data: Raw data is never modified. All cleaning is done in code that can be re-run.

  2. Numbered scripts: Code runs in order, making dependencies clear.

  3. Clear input/output: Each script has defined inputs and outputs.

  4. Self-documenting: README files explain what's what.

See Chapter 26 (Programming Companion: Project Management) for detailed folder structure templates, dependency management with Make/targets, and containerization for reproducibility.
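As a concrete illustration of the "numbered scripts" principle, the sketch below is a minimal pipeline driver that runs scripts in order and stops at the first failure. The folder name code/ and the script names (01_clean.py, 02_estimate.py, and so on) are placeholders rather than a prescribed convention.

# run_all.py: minimal sketch of a driver for numbered analysis scripts.
# Assumes scripts live in code/ and are named 01_clean.py, 02_estimate.py, ...
import subprocess
import sys
from pathlib import Path

def main():
    scripts = sorted(Path("code").glob("[0-9][0-9]_*.py"))
    for script in scripts:
        print(f"Running {script} ...")
        result = subprocess.run([sys.executable, str(script)])
        if result.returncode != 0:
            sys.exit(f"{script} failed; stopping the pipeline.")

if __name__ == "__main__":
    main()

Running the whole project from a single entry point like this makes the dependency order explicit and gives replicators one command to execute.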

Version Control

Version control (typically Git) tracks changes over time:

Benefits:

  • Complete history of what changed when

  • Easy to revert mistakes

  • Collaboration without overwriting

  • Branching for experiments

Basic workflow:

  • Edit files, then stage the changes (git add)

  • Commit with a descriptive message (git commit -m "describe the change")

  • Push to the shared remote (git push) and pull collaborators' changes (git pull)

  • Branch for experiments (git checkout -b new-idea) and merge back when stable

Even for solo projects, version control provides invaluable safety and history.

Documentation

Documentation serves multiple audiences:

For yourself:

  • Analysis notes explaining decisions

  • Code comments for complex logic

  • README files for project navigation

For collaborators:

  • Data dictionaries/codebooks

  • Dependency documentation

  • Installation instructions

For replicators:

  • Complete instructions to reproduce results

  • Software versions and environment specification

  • Known issues and limitations

Data Management

Principle 25.2: Raw Data Immutability Never modify raw data files. All transformations should be done in code, creating new files for processed data.
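A minimal sketch of Principle 25.2 in practice: read the raw file, do all cleaning in code, and write the result to a separate processed file. The file paths and the wage variable are hypothetical.

# clean_data.py: raw data stays untouched; all cleaning produces a new file.
import numpy as np
import pandas as pd
from pathlib import Path

RAW = Path("data/raw/survey.csv")                  # raw file: never modified
PROCESSED = Path("data/processed/survey_clean.csv")

df = pd.read_csv(RAW)

# All transformations happen here, in code, and produce a new file.
df = df.dropna(subset=["wage"])                    # drop records missing the outcome
df = df[df["wage"] > 0]                            # keep positive wages only
df["log_wage"] = np.log(df["wage"])

PROCESSED.parent.mkdir(parents=True, exist_ok=True)
df.to_csv(PROCESSED, index=False)                  # raw file is left untouched

Because the script can be re-run at any time, the processed file is disposable; the raw file plus the code are the objects worth protecting.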

Data management checklist:

  • Store raw data read-only and never edit it by hand

  • Document data sources, access dates, and any licensing or confidentiality restrictions

  • Maintain a codebook describing each variable and its construction

  • Keep processed data fully reproducible from raw data plus code

  • Back up data securely, with extra care for sensitive files

Collaboration

Research is increasingly collaborative. Effective collaboration requires:

Communication:

  • Regular check-ins

  • Shared documentation

  • Clear division of labor

Technical infrastructure:

  • Shared repositories (GitHub, GitLab)

  • Cloud storage for large data

  • Common computing environment

Attribution and credit:

  • Clear authorship agreements early

  • Documented contributions

  • Acknowledgment of all contributors


25.2 Transparency and Pre-Registration

The Case for Transparency

Transparency means others can see what you did and verify your claims. It's fundamental to science but historically neglected in social science.

Dimensions of transparency:

  1. Data availability: Can others access the data?

  2. Code availability: Is the analysis code public?

  3. Material availability: Are instruments, protocols, supplementary materials accessible?

  4. Analysis transparency: Is it clear what choices were made?

Data and Code Sharing

Most major economics journals now require data and code availability:

AEA policy (2019):

  • Data and code must be deposited

  • Must be sufficient to reproduce results

  • Exceptions for proprietary/confidential data (but code still required)

Practical considerations:

  • Use persistent identifiers (DOI) for data deposits

  • Include dependencies and environment specification

  • Test that replication package actually works

  • For confidential data, provide as much as possible (summary statistics, simulated data)

Pre-Registration

Pre-registration commits researchers to an analysis plan before seeing results:

Definition 25.1: Pre-Registration A time-stamped, publicly available research plan documenting hypotheses, data, and analysis methods before results are known.

What to pre-register:

  • Primary research questions and hypotheses

  • Data source and sample definition

  • Variable construction and measurement

  • Primary specifications (estimation method, controls)

  • Treatment of outliers and missing data

  • Multiple testing adjustments

What need not be pre-registered:

  • Exploratory analyses (but label them as such)

  • Robustness checks

  • Secondary specifications

When Pre-Registration Makes Sense

Pre-registration is most valuable when:

Situation                                        Value of Pre-Registration
Prospective RCT                                  High - can plan before data collection
Survey with primary outcome                      High - commit before seeing responses
Secondary analysis of existing data              Moderate - can pre-register before accessing
Observational study with new data collection     Moderate - plan before analysis
Reanalysis of publicly available data            Lower - others can check your work directly
Exploratory analysis                             Lower - exploration is the point

Registered Reports

Registered reports take pre-registration further:

  1. Stage 1 review: Reviewers evaluate research design before data collection

  2. In-principle acceptance: If design is sound, paper will be published regardless of results

  3. Stage 2 review: After data collection, verify pre-registered plan was followed

This eliminates publication bias at the source: results are irrelevant to the publication decision.

Criticisms and Limitations

Pre-registration is debated in economics:

Arguments against:

  • Much economics is observational---can't pre-register before data exist

  • Over-emphasis on confirmatory analysis discourages valuable exploration

  • Reviewers can still reject at Stage 2

  • Administrative burden may not be worthwhile

Responses:

  • Pre-registration is for confirmatory claims, not all analysis

  • Exploratory work remains valuable but should be labeled

  • Burden decreases with practice

  • Specific to research context---not one-size-fits-all

Balanced Approach

A pragmatic approach:

  1. For experiments: Pre-register primary hypotheses and analysis

  2. For observational work: Pre-register when possible (before accessing data)

  3. Always: Distinguish confirmatory from exploratory analysis

  4. Always: Report what you did (specification curves help)

  5. Accept: Some research is inherently exploratory, and that's fine

Specification Curves and Multiverse Analysis

Pre-registration constrains ex ante choices. Specification curves reveal ex post how conclusions depend on analytical choices.

Definition 25.2: Specification Curve A visualization showing how estimated effects vary across all defensible combinations of analytical choices (variable definitions, sample restrictions, controls, estimation methods).

Construction:

  1. Identify all reasonable analytical choices (e.g., which controls, which sample, which functional form)

  2. Run all possible combinations

  3. Plot estimates sorted by effect size

  4. Show which choices produce which estimates

Worked Example: Specification Curve for Returns to Education

Analytical choices:

  • Sample: Men only, women only, both

  • Measure: Years of schooling, highest degree

  • Controls: None, demographics, family background, ability proxy

  • Method: OLS, IV (compulsory schooling), IV (college proximity)

This yields 3 × 2 × 4 × 3 = 72 specifications.
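A minimal sketch of enumerating this grid, in which a placeholder estimate() function stands in for whatever routine fits one specification and returns the estimated return to education (here it simply simulates a number):

# Enumerate the specification grid and sort estimates for the curve.
import itertools
import random

random.seed(42)

samples  = ["men", "women", "both"]
measures = ["years_of_schooling", "highest_degree"]
controls = ["none", "demographics", "family_background", "ability_proxy"]
methods  = ["ols", "iv_compulsory_schooling", "iv_college_proximity"]

def estimate(sample, measure, control, method):
    # Placeholder: in practice, fit the model implied by this combination
    # of choices and return the estimated return to education.
    return random.uniform(0.05, 0.14)

specs = list(itertools.product(samples, measures, controls, methods))
results = [(estimate(*s), s) for s in specs]
results.sort(key=lambda r: r[0])          # sort by effect size for the curve

print(f"{len(specs)} specifications")     # 3 x 2 x 4 x 3 = 72
print("smallest estimate:", results[0])
print("largest estimate:", results[-1])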

The specification curve plots all 72 estimates, showing:

  • Range of estimates (e.g., 5% to 14% returns)

  • Which choices drive variation (IV estimates tend higher; controlling for ability lowers OLS)

  • Robustness of main conclusions (returns are positive across all specifications)

Interpretation:

  • If estimates cluster tightly, findings are robust to analytical choices

  • If estimates vary wildly, honest reporting requires acknowledging this fragility

  • The pattern of variation can be informative: which choices matter most?

Multiverse analysis extends this logic to the full "multiverse" of possible analyses, including data processing choices:

  • Missing data handling (listwise deletion, imputation, bounds)

  • Outlier treatment (winsorize, trim, include)

  • Variable construction (alternative measures, different aggregations)

  • Sample restrictions (age range, time period, geography)

The full multiverse can contain thousands of specifications. The goal is not to run all of them but to understand how conclusions depend on choices.

When to use specification curves:

  • When analytical choices are genuinely debatable

  • When you want to demonstrate robustness (or honestly reveal fragility)

  • When different specifications have been used in the literature

  • When reviewers might question your main specification

Limitations:

  • Computationally intensive for many choices

  • Not all specifications are equally credible---some choices may be clearly wrong

  • May give false confidence if all specifications share the same flaw

  • Cannot substitute for getting the identification right


25.3 Writing and Communication

Structure of Empirical Papers

The standard economics paper follows a predictable structure:

  1. Introduction (~2-4 pages)

    • Research question

    • Why it matters

    • What you do

    • What you find

    • Contribution

  2. Background/Literature (~2-3 pages)

    • Relevant prior work

    • Where you fit

  3. Data (~2-4 pages)

    • Sources

    • Construction

    • Summary statistics

  4. Empirical Strategy (~3-5 pages)

    • Identification strategy

    • Estimating equations

    • Key assumptions

  5. Results (~5-10 pages)

    • Main findings

    • Robustness

    • Heterogeneity

  6. Discussion/Conclusion (~2-3 pages)

    • Interpretation

    • Limitations

    • Implications

Writing for Clarity

Principle 25.3: The Reader's Time is Valuable Write for a busy reader who will skim before deciding whether to read carefully. Make your contribution clear quickly.

Practical guidance:

Front-load key information:

  • First sentence should hint at the question

  • First paragraph should convey the main finding

  • Readers should understand your contribution without reading the whole paper

Use structure:

  • Clear section headings

  • Topic sentences for paragraphs

  • Transitions between sections

Be precise:

  • Define terms

  • Specify what you mean (which population, which parameter)

  • Avoid vague qualifiers ("significant relationship")

Be concise:

  • Cut unnecessary words

  • Every paragraph should serve a purpose

  • One point per paragraph

Tables and Figures

Tables and figures often convey results more effectively than prose.

Table principles:

  • Informative titles that describe content

  • Clear column headers

  • Standard errors in parentheses (or brackets for confidence intervals)

  • Note significance levels and sample sizes

  • Don't include too many columns---split complex tables

  • Round appropriately (3-4 significant digits usually sufficient)

Example: Well-Formatted Regression Table

Table 3: Returns to Education

                          (1)          (2)          (3)
                          OLS          IV           IV
Years of schooling        0.103        0.089        0.112
                          (0.008)      (0.024)      (0.019)
Controls                  No           No           Yes
First-stage F             -            12.4         18.7
N                         24,531       24,531       24,531

Notes: Standard errors in parentheses, clustered by state. * p<0.10, ** p<0.05, *** p<0.01. Controls include age, age squared, race, and region fixed effects.

Figure principles:

  • Clear, informative titles

  • Labeled axes with units

  • Legends when needed

  • Source notes

  • Not too cluttered

  • Consider colorblind-friendly palettes

[Figure 25.1 appears here: panel (a) Poor Design, panel (b) Good Design]

Figure 25.1: Good vs. Poor Figure Design. Both panels show identical data on GDP per capita across regions. The poor design (top) uses garish colors, a cluttered bar chart, and removes helpful gridlines. The good design (bottom) uses a line chart appropriate for time series, accessible colors, clean labels, and minimal visual clutter. Good visualization makes patterns immediately apparent.
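The "good design" choices described in the caption take only a few lines to implement. The sketch below uses made-up GDP figures purely for illustration:

# Minimal line-chart sketch: labeled axes, legend, no chart junk.
import matplotlib.pyplot as plt

years = list(range(2000, 2021))
regions = {
    "Region A": [30 + 0.8 * t for t in range(len(years))],
    "Region B": [20 + 0.6 * t for t in range(len(years))],
    "Region C": [12 + 0.5 * t for t in range(len(years))],
}

fig, ax = plt.subplots(figsize=(7, 4))
for name, series in regions.items():
    ax.plot(years, series, label=name, linewidth=2)

ax.set_xlabel("Year")
ax.set_ylabel("GDP per capita (thousands of dollars)")
ax.set_title("GDP per Capita by Region, 2000-2020")
ax.legend(frameon=False)
ax.spines["top"].set_visible(False)       # remove visual clutter
ax.spines["right"].set_visible(False)
fig.tight_layout()
fig.savefig("figure_25_1_good_design.png", dpi=300)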

Writing for Different Audiences

Research often needs to reach multiple audiences:

Academic specialists:

  • Full technical detail

  • Extensive robustness

  • Positioning in literature

Policy audiences:

  • Lead with implications

  • Minimize jargon

  • Focus on magnitudes and uncertainty

  • Explicit about what results do/don't show

General audiences:

  • Plain language

  • Concrete examples

  • Clear visualizations

  • Acknowledge limitations without burying the finding


25.4 Presenting Uncertainty Honestly

The Problem with P-Values

The null hypothesis significance testing (NHST) framework has dominated empirical research but produces systematic problems:

Issues with p-values:

  1. Dichotomization: Treats p = 0.049 differently from p = 0.051

  2. Misinterpretation: P-values don't measure probability the null is true

  3. Publication bias: Incentivizes p-hacking to cross thresholds

  4. Effect size neglect: Statistical significance ≠ practical importance

Definition 25.3: What a P-Value Actually Is The probability of observing data as extreme as or more extreme than what was observed, if the null hypothesis were true and the study were repeated many times. It is not the probability that the null is true.
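A small simulation makes the definition concrete: generate many datasets under the null, and the p-value is the share whose test statistic is at least as extreme as the observed one. All numbers below are illustrative.

# Simulated two-sided p-value for H0: mean = 0.
import numpy as np

rng = np.random.default_rng(0)

# "Observed" sample and its t-statistic
x = rng.normal(loc=0.3, scale=1.0, size=50)
t_obs = x.mean() / (x.std(ddof=1) / np.sqrt(len(x)))

# Repeat the study many times with the null actually true (mean = 0)
t_null = []
for _ in range(100_000):
    z = rng.normal(loc=0.0, scale=1.0, size=50)
    t_null.append(z.mean() / (z.std(ddof=1) / np.sqrt(len(z))))

p_value = np.mean(np.abs(t_null) >= abs(t_obs))
print(f"simulated two-sided p-value: {p_value:.3f}")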

Beyond Significance Stars

Moving beyond binary significance requires:

1. Report effect sizes with uncertainty

  • Report estimates with confidence intervals, not just significance

  • Interpret magnitude, not just sign

  • Consider practical significance, not just statistical

2. Use confidence intervals

  • 95% CI conveys more than stars

  • Readers can assess whether effects of various sizes are compatible with data

  • Multiple significance levels implicit in CI

3. Consider Bayesian approaches

  • Posterior probabilities directly address "probability effect is real"

  • Prior specification makes assumptions explicit

  • Credible intervals have intuitive interpretation

Practical Guidance on Presenting Uncertainty

Principle 25.4: Honest Uncertainty Report what you learned, including what you didn't learn. Overstating precision harms credibility and misleads users of research.

Do:

  • Report confidence intervals for key estimates

  • Discuss sensitivity of results to specification choices

  • Note sample size and power considerations

  • Distinguish statistically significant from economically meaningful

  • Acknowledge what assumptions are required

Don't:

  • Make binary claims based on p-value thresholds

  • Hide imprecision behind stars

  • Dismiss insignificant results as "no effect"

  • Over-interpret point estimates when intervals are wide

  • Claim certainty you don't have

Communicating to Non-Specialists

Policy audiences and the public need accessible communication of uncertainty:

Strategies:

  • Use natural frequencies ("1 in 20") rather than percentages or decimals

  • Visualize uncertainty (error bars, ranges, distributions)

  • Use plain language ("we can't rule out effects anywhere from -5% to +10%")

  • Provide context for magnitudes ("similar to the effect of X")

  • Be explicit about confidence level ("we're fairly confident that...")

Worked Example: Communicating Minimum Wage Results

Technical: "We estimate an employment elasticity of -0.073 (SE = 0.022, p < 0.01)."

Policy audience: "Our results suggest that a 10% minimum wage increase would reduce employment by about 0.7%, plus or minus about 0.4 percentage points."

General audience: "We find small negative effects on employment. A typical minimum wage increase might reduce jobs by less than 1%---meaningful but modest. Some studies find even smaller effects or no effect at all, so there's genuine uncertainty about the exact impact."
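The translation from the technical statement to the policy statement is simple arithmetic; a minimal sketch using the numbers from this example:

# Back-of-the-envelope translation of the technical estimate above.
elasticity, se = -0.073, 0.022
wage_increase = 0.10                         # a 10% minimum wage increase

effect = elasticity * wage_increase * 100    # employment change, in percentage points
margin = 1.96 * se * wage_increase * 100     # 95% margin of error

print(f"employment change: {effect:.1f} pp, plus or minus {margin:.1f} pp")
# roughly -0.7 pp, plus or minus about 0.4 pp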

The Role of Prior Knowledge

Pure frequentist inference ignores prior information, but researchers and readers always have priors. Bayesian approaches make this explicit:

P(θ | data) ∝ P(data | θ) × P(θ)

Practical implications:

  • Extraordinary claims require extraordinary evidence

  • Accumulated prior evidence matters for interpretation

  • A single study rarely should change beliefs dramatically

  • Meta-analytic thinking applies informally even without formal meta-analysis
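A minimal normal-normal sketch of this updating rule shows why a single noisy study should move beliefs only modestly. The prior and study numbers are illustrative, not drawn from any real literature.

# Conjugate normal-normal update: posterior precision is the sum of precisions.
prior_mean, prior_sd = 0.0, 0.02        # skeptical prior: effect near zero
estimate, se = 0.08, 0.05               # one study: large point estimate, imprecise

prior_prec = 1 / prior_sd**2
data_prec = 1 / se**2

post_prec = prior_prec + data_prec
post_mean = (prior_prec * prior_mean + data_prec * estimate) / post_prec
post_sd = post_prec ** -0.5

print(f"posterior mean {post_mean:.3f}, posterior sd {post_sd:.3f}")
# A single imprecise study moves the skeptical prior only part of the way
# toward its point estimate.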


25.5 Ethics in Empirical Research

Dimensions of Research Ethics

Research ethics extends beyond IRB compliance:

1. Human subjects protection

  • Informed consent

  • Privacy and confidentiality

  • Minimizing harm

  • Special protections for vulnerable populations

2. Data ethics

  • Responsible use of administrative data

  • Privacy in the age of big data

  • Algorithmic fairness when research informs policy

3. Professional integrity

  • Honest reporting

  • Appropriate attribution

  • Conflicts of interest

4. Social responsibility

  • Considering who benefits from research

  • Engaging affected communities

  • Thinking about misuse of findings

Common Ethical Challenges

Re-identification risk: Even "anonymized" data can sometimes be re-identified. Consider:

  • What harm could come from re-identification?

  • What safeguards are appropriate?

  • When is data too sensitive to share?

Research on vulnerable populations: Development economics often studies the global poor. Consider:

  • Power dynamics between researchers and subjects

  • Benefit sharing with studied communities

  • Avoiding "extractive" research

Dual use: Research findings can be used for purposes researchers didn't intend. Consider:

  • Who might use your findings?

  • Could findings be misused?

  • Do you have responsibility for downstream use?

Box: Concrete Ethics Cases in Empirical Research

Case 1: Algorithmic Fairness in Criminal Justice

Predictive algorithms used in bail and sentencing decisions (like COMPAS) have been shown to exhibit racial disparities. Researchers face tensions:

  • Algorithms may reduce overall detention rates (benefit)

  • But may systematically disadvantage Black defendants (harm)

  • "Fairness" has multiple incompatible definitions (equal false positive rates? equal accuracy? calibration?)

Researcher responsibility: If your work informs algorithmic tools, assess disparate impact across protected groups. Report fairness metrics alongside accuracy.

Case 2: Targeting in Development Programs

ML-based targeting can identify who benefits most from interventions. But:

  • Optimization for efficiency may conflict with equity

  • Targeting on predicted outcomes can exclude those most in need if they have lower predicted gains

  • Communities may perceive targeting as unfair even if statistically justified

Researcher responsibility: Be explicit about the welfare function being optimized. Consider who is left out and why.

Case 3: Re-identification of "Anonymous" Data

Researchers have demonstrated that individuals can be re-identified from "anonymous" datasets:

  • Sweeney showed 87% of Americans are uniquely identified by zip + birthdate + gender

  • Genetic data linked to public genealogy databases identified thousands

  • Location data from phones can identify individuals from patterns

Researcher responsibility: Assume determined adversaries. Use formal privacy protections (differential privacy) for sensitive data. Don't release data that could enable harm even if IRB approved.

Case 4: Research in Authoritarian Contexts

Field experiments and surveys in non-democracies raise special issues:

  • Enumerators may face retaliation for certain questions

  • Governments may demand data access

  • Findings could be used to target dissidents

Researcher responsibility: Consider whether the research can be done safely and ethically in context. Have data destruction protocols. Limit what data is collected.

The common thread: Ethical research requires imagination about downstream consequences, not just compliance with formal rules.

Ethical Research Practice

Principle 25.5: Ethics Throughout the Research Process Ethical considerations should inform all stages of research, not just IRB review at the beginning.

Planning stage:

  • Is the question worth asking?

  • Who benefits from the research?

  • What are the risks to participants?

Data collection:

  • Informed consent

  • Privacy protections

  • Fair compensation

Analysis:

  • Honest reporting

  • No selective presentation

  • Appropriate caveats

Communication:

  • Accurate representation

  • Accessible to affected communities

  • Consideration of how findings might be used


25.6 AI-Assisted Research Workflow

Large language models (LLMs) and AI coding assistants have become integral to empirical research workflows. Using them effectively requires understanding their capabilities, limitations, and ethical implications.

AI Tools in the Research Process

Code generation and debugging:

  • LLMs can write boilerplate code, data cleaning scripts, and visualization functions

  • Particularly useful for syntax you don't use daily (e.g., complex regex, SQL queries, LaTeX tables)

  • Effective for debugging—paste error messages and code for suggested fixes

Literature review and synthesis:

  • AI can summarize papers, identify themes across literature, and suggest relevant citations

  • Useful for rapid orientation in unfamiliar fields

  • Cannot replace careful reading of key papers

Writing assistance:

  • Drafting, editing, and restructuring prose

  • Generating first drafts of methods sections from analysis code

  • Translation between technical and non-technical registers

Critical Limitations

Warning: AI Does Not Understand Your Research

LLMs are pattern-matching systems trained on text. They do not:

  • Understand causal identification strategies

  • Know whether your instrument is valid

  • Verify that code produces correct results

  • Check whether claims are supported by evidence

AI assistance complements but never replaces domain expertise and careful verification.

Common failure modes:

Failure                      Example                                        Mitigation
Plausible but wrong code     Correct syntax, wrong algorithm                Always test on known cases
Hallucinated citations       Cites papers that don't exist                  Verify every reference
Confident nonsense           Authoritative-sounding but factually wrong     Cross-check key claims
Subtle statistical errors    Misapplies methods in edge cases               Review statistical logic carefully
Training data cutoff         Doesn't know recent methods/papers             Supplement with current sources
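The first mitigation in the table, testing on known cases, is worth making routine. A minimal sketch, in which weighted_mean() stands in for whatever function an assistant drafted:

# Before trusting an AI-drafted function, check it against hand-computed cases.
def weighted_mean(values, weights):
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight

# Hand-checked cases: equal weights recover the simple mean; a zero weight
# drops the corresponding observation.
assert weighted_mean([1, 2, 3], [1, 1, 1]) == 2
assert weighted_mean([10, 99], [1, 0]) == 10
print("known-case checks passed")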

Ethical Considerations

Transparency and disclosure:

  • Journals increasingly require disclosure of AI use

  • Document which parts of your workflow used AI assistance

  • Maintain human accountability for all claims and code

Authorship and credit:

  • AI is a tool, not an author—humans bear responsibility

  • Acknowledge AI assistance in methods or acknowledgments

  • Do not misrepresent AI-generated text as entirely your own work

Data privacy:

  • Do not paste sensitive or confidential data into cloud-based AI tools

  • IRB protocols may restrict AI use with human subjects data

  • Consider local/on-premise AI tools for sensitive projects

Best Practices

Verification is mandatory:

  • Test AI-generated code on cases with known answers before using it

  • Check every citation, quotation, and factual claim the model produces

  • Re-derive or cross-check any statistical reasoning it supplies

Maintain intellectual ownership:

  • Use AI to accelerate, not replace, your thinking

  • Understand every line of code you commit

  • Be able to explain and defend every methodological choice

Version control for AI interactions:

  • Save prompts and responses for reproducibility

  • Document which AI model and version was used

  • Note when AI suggestions were modified

Integration with Traditional Workflow

Task                  Traditional Approach                    AI-Augmented Approach
Literature search     Database queries, citation chains       AI synthesis + targeted deep reading
Code writing          Write from scratch, adapt examples      AI draft + verification + refinement
Debugging             Stack Overflow, documentation           AI diagnosis + manual verification
Writing               Outline → draft → revise                AI draft → heavy revision → human voice
Peer review           Read and comment                        AI summary + focused human critique

The key principle: AI handles the mechanical while humans provide judgment, creativity, and accountability. The division of labor should leave all substantive decisions—identification strategy, interpretation, claims—in human hands.


Practical Guidance

When to Do What

Stage                    Key Practices
Project start            Set up reproducible structure, version control
Before data access       Pre-register (where appropriate)
During analysis          Document choices, maintain code quality
Writing                  Clear structure, honest uncertainty
Submission               Complete replication package
Publication              Archive data and code

Common Pitfalls

Pitfall 1: Cleanup Later Planning to clean up messy code/organization after the project is done. It never happens.

How to avoid: Build reproducibility in from the start. It's easier to maintain good practices than to impose them retroactively.

Pitfall 2: Over-Selling Overstating certainty or importance of findings to increase publication chances or media attention.

How to avoid: Report results honestly, including uncertainty. Your reputation is built over a career, not a single paper.

Pitfall 3: Under-Documenting Assuming you'll remember why you made analysis decisions.

How to avoid: Document decisions in real time. Use meaningful variable names and code comments. Write README files.

Pitfall 4: Significance Chasing Running specifications until one is significant, then reporting only that.

How to avoid: Pre-register primary specifications. Report specification curves. Distinguish confirmatory from exploratory.

Implementation Checklist

Project setup:

  • Create the directory structure, initialize version control, and write an initial README

  • Separate raw from processed data before any analysis begins

Analysis:

  • Keep all transformations in code; never edit raw data by hand

  • Document analytical decisions as they are made, and label exploratory analyses as such

Writing:

  • Front-load the question, the finding, and the contribution

  • Report effect sizes with confidence intervals, not just significance stars

Archiving:

  • Assemble and test a complete replication package

  • Deposit data and code with a persistent identifier


Summary

Key takeaways:

  1. Reproducibility requires organization from project start---clear directory structure, version control, documentation, and immutable raw data.

  2. Pre-registration and transparency help distinguish confirmatory from exploratory findings, though their value depends on research context.

  3. Good communication front-loads key findings, uses tables and figures effectively, and honestly presents uncertainty without hiding behind p-values.

Returning to the opening question: The difference between research that contributes to knowledge and research that misleads lies not primarily in technical sophistication but in research practice. Organized, transparent, reproducible work that honestly communicates uncertainty is more valuable than brilliant analysis buried in messy projects with selective reporting. These practices benefit not just science but researchers themselves---you will thank your past self for that well-organized project and that carefully documented decision.


Further Reading

Essential

  • Christensen, G., J. Freese, and E. Miguel (2019). "Transparent and Reproducible Social Science Research." University of California Press.

  • Angrist, J.D. and J.-S. Pischke (2010). "The Credibility Revolution in Empirical Economics." Journal of Economic Perspectives.

For Deeper Understanding

  • Wilson, G., J. Bryan, K. Cranston, et al. (2017). "Good Enough Practices in Scientific Computing." PLOS Computational Biology.

  • Gentzkow, M. and J. Shapiro (2014). "Code and Data for the Social Sciences: A Practitioner's Guide." [Online resource]

Advanced/Specialized

  • McCloskey, D. and S. Ziliak (1996). "The Standard Error of Regressions." Journal of Economic Literature. [Critique of significance testing]

  • Gelman, A. and J. Carlin (2014). "Beyond Power Calculations: Assessing Type S (Sign) and Type M (Magnitude) Errors." Perspectives on Psychological Science.

Applications

  • Nosek, B., et al. (2015). "Promoting an Open Research Culture." Science. [Open science guidelines]

  • AEA (2019). "Data and Code Availability Policy." American Economic Association.

  • Li, D. (2017). "Expertise versus Bias in Evaluation: Evidence from the NIH." American Economic Journal: Applied Economics. [Rigorous analysis of peer review showing expertise dominates bias]


Exercises

Conceptual

  1. A colleague argues that pre-registration is unnecessary for observational research because the data already exist. How would you respond? Under what circumstances might pre-registration still be valuable?

  2. Explain why a 95% confidence interval conveys more information than a significance star. Give an example where knowing the interval would change interpretation.

Applied

  1. Find a published empirical economics paper. Evaluate its replicability based on available information: Is data available? Is code available? Could you reproduce the results? What's missing?

  2. Take one of your own past analyses (or a homework assignment). Reorganize it following the project structure principles in this chapter. Document what you had to add or clarify.

Discussion

  1. Some argue that requiring data and code sharing imposes unfair burdens on researchers who collected expensive original data, giving free-riders access to years of work. Others argue openness is essential to science. Where do you come down, and how would you design policies to balance these concerns?


Appendix 25A: Resources

Pre-Registration Platforms

  • AEA RCT Registry (experiments in economics)

  • OSF Registries (Open Science Framework)

  • EGAP Registry (governance and politics)

  • AsPredicted (simple, rapid registration)

Data Repositories

  • ICPSR (social science data)

  • Harvard Dataverse

  • openICPSR

  • Zenodo (general purpose)

  • Journal-specific repositories

Software and Tools

  • Git/GitHub (version control)

  • Docker (computational reproducibility)

  • R Markdown/Jupyter (literate programming)

  • Make/Snakemake (pipeline management)

Style Guides

  • Gentzkow and Shapiro code guide

  • Stata coding guidelines (SSC)

  • Google R style guide

  • Journal-specific guidelines
