Statistics in a social context: Opening remarks. Maximilian Kasy, May 10, 2019.


SLIDE 1

Statistics in a social context Opening remarks

Maximilian Kasy May 10, 2019

SLIDE 2

Introduction

  • Current debates across the social and life sciences:
      • publication bias and p-hacking,
      • replicability and replications,
      • pre-analysis plans and other reform proposals, ...
  • Motivation of this conference: these debates raise a number of
    foundational questions, which, I believe, are not well addressed using
    textbook frameworks, and which require input from several disciplines.

SLIDE 3

Roadmap for these opening remarks

  • 1. Where I am coming from:
      a. Research with Isaiah Andrews on “which findings get published.”
      b. Research with Alex Frankel on “which findings should be published.”
  • 2. Three alternative perspectives on statistics:
      a. decision problems,
      b. (optimal) communication,
      c. research as a social process.
  • 3. Brief preview of conference.

SLIDE 4

Which findings get published?

Andrews, I. and Kasy, M. (2018). Identification of and correction for publication bias

  • 1. Published research is selected in various ways
    (significance at different levels, sign, ...):
      • Lab experiments in economics and psychology: statistical significance.
      • Effect of minimum wages on employment: statistical significance, sign.
      • Deworming: inconclusive.
  • 2. How do we know?
    Form and magnitude of selection are nonparametrically identified:
      • using systematic replication studies,
      • using meta-studies.

SLIDE 5

Evidence on selective publication

[Figure: empirical distributions of the z-statistic W, the replication
z-statistic Wr, and the estimates |X|; vertical axis: density.]

  • Data from the systematic replication study of Camerer et al. (2016).
  • Notation: z-statistic W, replication z-statistic Wr, estimate X,
    standard error Σ.
  • Absent selection:
      • 1. z-statistics should be continuously distributed.
      • 2. Original and replication estimates should be symmetrically distributed.
      • 3. Estimates from studies with larger standard errors should be more
        dispersed, but not shifted.
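The first of these diagnostics can be illustrated with a small simulation. This is a hypothetical sketch, not the Camerer et al. (2016) data; the effect distribution, the significance cutoff, and the 0.1 publication probability for insignificant results are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical simulation: latent studies with true effects theta,
# normalized estimates with standard error 1, z-statistic z = |estimate|.
n = 200_000
theta = rng.normal(0.0, 1.0, size=n)
z = np.abs(theta + rng.normal(0.0, 1.0, size=n))

# Selective publication: significant results (z > 1.96) are always published,
# insignificant ones only with probability 0.1 (illustrative assumption).
publish = (z > 1.96) | (rng.uniform(size=n) < 0.1)
z_pub = z[publish]

# Diagnostic 1: absent selection, z-statistics are continuously distributed,
# so the published density should be smooth through 1.96. Under selection,
# it jumps at the threshold.
mass_below = np.mean((z_pub > 1.76) & (z_pub <= 1.96))
mass_above = np.mean((z_pub > 1.96) & (z_pub <= 2.16))
print(mass_below, mass_above)  # much more mass just above 1.96
```

Comparing the mass of published z-statistics in equal-width windows on either side of the cutoff is the simulation analogue of testing for a density discontinuity at 1.96.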

SLIDE 6

Selection implies publication bias

[Figure: three panels as functions of the true mean θ: bias (“bias” vs.
“no bias”), coverage (“true coverage” vs. “nominal coverage”), and posterior
density (“Bayesian default belief” vs. “naive default belief”).]

  • Suppose only findings with z-statistics > 1.96 are published.
  • The figures plot, as a function of the true mean θ:
      • 1. the bias of Z as an estimator of θ,
      • 2. the coverage of Z ± 1.96 as a confidence interval for θ,
      • 3. the naive and the correct Bayesian posterior density,
        for a normal prior, when no finding is published.
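The bias and undercoverage under this selection rule can be reproduced with a short Monte Carlo. This sketch assumes, as on the slide, that Z ~ N(θ, 1) and only draws with Z > 1.96 are published; the function name and the choice θ = 1 are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

def published_bias_and_coverage(theta, n=1_000_000):
    """Bias of Z and coverage of Z +/- 1.96, conditional on publication.

    Sketch of the slide's setup: Z ~ N(theta, 1), and only findings
    with Z > 1.96 are published.
    """
    z = rng.normal(theta, 1.0, size=n)
    pub = z[z > 1.96]                                 # published draws only
    bias = pub.mean() - theta                         # bias of Z for theta
    coverage = np.mean(np.abs(pub - theta) <= 1.96)   # true CI coverage
    return bias, coverage

bias, coverage = published_bias_and_coverage(theta=1.0)
# Conditional on publication, Z overestimates theta (bias > 0) and the
# nominal 95% confidence interval undercovers (coverage < 0.95).
```

Running this for a grid of θ values traces out the bias and coverage curves in the figure: both distortions are largest when the true θ is small relative to the significance threshold.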

SLIDE 7

Which findings should be published?

Frankel, A. and Kasy, M. (2018). Which findings should be published?

  • Publication bias motivates calls for reform:
    publication should not select on findings.
  • But: is eliminating bias the right objective?
    How does it relate to informing decision makers?
  • We characterize optimal publication rules from an instrumental perspective:
      • A study might inform the public about some state of the world.
      • Then the public chooses a policy action.
      • We take as given that not all findings get published (prominently).

SLIDE 8

Key findings

  • 1. Optimal rules selectively publish surprising findings.
    In leading examples: similar to two-sided or one-sided tests.
  • 2. But: selective publication always distorts inference.
    There is a trade-off between policy relevance and statistical credibility.

SLIDE 9

Example of optimal publication region

[Figure: optimal publication region (shaded), plotted against the estimate X
(left panel, thresholds μ0 ± c) and the t-statistic t (right panel, critical
value involving c/σ0), each as a function of the standard error S.]

  • Optimal publication region (shaded). Axes:
      • left: estimate X against standard error S;
      • right: “t-statistic” t = (X − μ0)/S against standard error S.
  • Note:
      • Given S, publish outside a symmetric interval around μ0.
      • The critical value for the t-statistic is non-monotonic in S.

SLIDE 10

A standard foundation of statistics: Decision theory

[Diagram of a statistical decision problem:
state of the world θ → statistical model X ~ f(x, θ) → observed data X →
decision function a = δ(X) → decision a → loss L(a, θ).]

Questions to ask in this framework:

  • Objective function?
  • Set of feasible actions?
  • Prior information?
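The framework can be instantiated concretely. A minimal sketch with illustrative choices that are not from the slide: model X_i ~ N(θ, 1) i.i.d., squared-error loss L(a, θ) = (a − θ)², and two candidate decision functions, one of which uses prior information:

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative decision problem: state theta, data X_1..X_n ~ N(theta, 1),
# action a = point estimate of theta, loss L(a, theta) = (a - theta)^2.
N_OBS = 10

def risk(delta, theta, sims=100_000):
    """Frequentist risk R(delta, theta) = E[L(delta(X), theta)], by simulation."""
    x = rng.normal(theta, 1.0, size=(sims, N_OBS))
    return np.mean((delta(x) - theta) ** 2)

def sample_mean(x):
    # Decision function using no prior information.
    return x.mean(axis=1)

def shrinkage(x):
    # Posterior mean under a N(0, 1) prior on theta: uses prior information.
    return N_OBS / (N_OBS + 1.0) * x.mean(axis=1)

# Near the prior mean, the rule that exploits prior information has lower risk.
r_mean = risk(sample_mean, theta=0.0)    # approx. 1 / N_OBS = 0.1
r_shrink = risk(shrinkage, theta=0.0)    # approx. (N_OBS / (N_OBS + 1))**2 / N_OBS
```

Comparing risk functions across θ is how decision rules are evaluated in this framework; the shrinkage rule accepts worse risk far from the prior mean in exchange for better risk near it, which is exactly where the "prior information?" question on this slide bites.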

SLIDE 11

Is this an appropriate description of empirical research?

  • Some questions:
      • Why do we not just print all the data?
      • Why do we need researchers?
      • What is the purpose of pre-committing to a research design?
      • Does commitment make sense without conflicts of interest?
      • How do we cumulatively learn from published research?
  • Can we make sense of publication bias, pre-analysis plans, etc.,
    using textbook foundations of statistics?
  • Or do we need alternative foundations,
    taking into account the social dimension of research?

SLIDE 12

Alternative foundations

Different ways of thinking about statistics / econometrics:

  • 1. Statistics as decision theory (the standard framework of the previous slide).
  • 2. Statistics as (optimal) communication.
      • Not just “you and the data.”
      • What do we communicate to whom?
      • Subject to what costs and benefits?
      • Why not publish everything? Attention?
  • 3. Statistics / research as a social process.
      • Researchers, editors and referees, policymakers.
      • Incentives, information, strategic behavior.
      • Social learning, paradigm changes.

SLIDE 13

Proposed agenda

  • Derive optimal methodological recommendations,
  • assuming the goal is to promote some notion of collective learning
    through communication of summaries of empirical findings,
  • taking into account the constraints of human psychology
    and the social organization of research.

To better understand these constraints, draw on

  • 1. psychology,
  • 2. sociology and history of science,
  • 3. microeconomic theory and information economics.

SLIDE 14

Conference outline

  • Applied perspectives
      • Katherine Casey (development economics):
        Comments on pre-specification and analysis plans
      • Simine Vazire (psychology):
        The Credibility Revolution in Psychological Science
      • Ben Olken (development economics):
        Promises and Perils of Pre-Analysis Plans
      • Daniel Mellow (meta studies)
  • Microeconomic models
      • Jann Spiess (econometrics):
        Optimal Estimation when Researcher and Social Preferences are Misaligned
      • Alex Frankel (economic theory): Which findings should be published
      • Isaiah Andrews (econometrics): Statistical Reports for Remote Agents
      • Marco Ottaviani (economic theory): Strategic Sample Selection
  • Philosophical and historical perspectives
      • Theodore Porter (history of statistics): Statistics, a Tool of Science?
      • Deborah Mayo (philosophy of statistics):
        3D Statistics: 7 Responses to Challenges for Statistical Testers
      • Zoe Hitzig (economic theory, philosophy):
        The Problem of New Evidence: P-hacking and Pre-analysis Plans
