How to Win a Forecasting Tournament? Philip E. Tetlock, Wharton (PowerPoint presentation)



SLIDE 1


SLIDE 2

How to Win a Forecasting Tournament? Philip E. Tetlock Wharton School

CFA Asset Management Forum Montreal, October 8, 2015

SLIDE 3
WHAT ARE FORECASTING TOURNAMENTS?

  • level-playing-field competitions to determine who knows what
  • a disruptive technology that destabilizes stale status hierarchies

SLIDE 4

How Did GJP Win the Tournament?

  • By assigning the most accurate probability estimates to over 500 outcomes of “national security relevance”
  • But how did GJP do that?
SLIDE 5

Winning requires picking battles wisely:

  • Where the pendulum swings (more predictable)
  • Where the ball stops
  • Where the hurricane meanders (less predictable)


SLIDE 6

Winning Requires Skill at:

  • Discounting pseudo-diagnostic news to which the crowd over-reacts
  • Spotting subtly-diagnostic news to which the crowd under-reacts

[Figure: crowd subjective probability at Time 1 for events E1-E3]


SLIDE 7

And winning requires moving beyond blame-game ping-pong:

  • False-negatives: 9/11 (under-connecting the dots)
  • False-positives: WMD (over-connecting the dots)
  • Finding Osama bin Laden


SLIDE 8

But How Exactly Did GJP Pull It Off?

Get Right People on Bus
  • Spotting/cultivating superforecasters (40% boost)

Teaming
  • Anti-groupthink groups (10% boost)

Training
  • Debiasing exercises (10% boost)

Elitist Algorithms
  • Aggregation algorithms that up-weight shrewd forecasters AND extremize to compensate for the conservatism of aggregates (25%-plus boost)


SLIDE 9

Obama’s Osama Decision: Through a GJP Lens

  • Hollywood vs. History (the myth and reality of Zero Dark Thirty)
  • Two Thought-Experiment Variations on Reality
  • Clones vs. Silos
  • National Security vs. March Madness
SLIDE 10

OPTOMETRY TRUMPS PROPHECY

  • GJP’s methods improve foresight using tested tools: personnel selection, training, teaming, incentives, and algorithms
  • Still a blurry world, just less so: GJP’s best methods assign probabilities of 24-28% to things that don’t happen and 72-76% to things that do


SLIDE 11

END

SLIDE 12

Ungar’s log-odds model beat all comers (including several prediction markets)

  • Log-odds with shrinkage + noise
  • mj = a log(pj / (1 - pj)) + e
  • The amount of transformation, a, depends on the sophistication and diversity of the forecaster pool

[Figure: transformed probability vs. raw probability under the log-odds transformation]
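As a sketch of how this transformation works, the Python below (a minimal illustration: the function name and the value a = 2.5 are my assumptions, not GJP's published parameters) averages forecasts in log-odds space, applies m = a·log(p/(1-p)), and maps back to a probability:

```python
import math

def extremize(probs, a=2.5):
    """Aggregate probability forecasts by averaging in log-odds
    space and applying m = a * log(p / (1 - p)), then mapping back
    to a probability.  a > 1 extremizes: it pushes the aggregate
    away from 0.5 to offset the conservatism of simple averages."""
    mean_log_odds = sum(math.log(p / (1 - p)) for p in probs) / len(probs)
    m = a * mean_log_odds
    return 1 / (1 + math.exp(-m))

# Three forecasters lean the same way; the simple mean is 0.70,
# but the extremized aggregate is noticeably more confident.
print(round(extremize([0.6, 0.7, 0.8]), 3))  # 0.9
```

When forecasters are diverse and each holds partly independent evidence, their agreement carries more information than the raw average reflects, which is why a larger a suits a more sophisticated, diverse pool.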


SLIDE 13

Measuring the accuracy of probability judgments

Day   Probability of Rain   Outcome of Rain   Brier Score
1     90%                   Yes = 100%        (1 - .9)^2 + (0 - .1)^2 = 0.02
2     50%                   Yes = 100%        (1 - .5)^2 + (0 - .5)^2 = 0.50
3     50%                   No = 0%           (0 - .5)^2 + (1 - .5)^2 = 0.50
4     80%                   Yes = 100%        (1 - .8)^2 + (0 - .2)^2 = 0.08
Mean  68%                   75%               0.28
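The table's arithmetic can be reproduced in a few lines of Python (the helper name `brier` is mine; the two-component form scores both the event and its complement):

```python
def brier(p, outcome):
    """Two-component Brier score for a binary event: squared error
    on the event and on its complement (0 = perfect, 2 = worst)."""
    return (outcome - p) ** 2 + ((1 - outcome) - (1 - p)) ** 2

# The four days from the table: (forecast probability of rain, outcome)
days = [(0.9, 1), (0.5, 1), (0.5, 0), (0.8, 1)]
scores = [round(brier(p, o), 2) for p, o in days]
print(scores)                               # [0.02, 0.5, 0.5, 0.08]
print(round(sum(scores) / len(scores), 2))  # 0.28
```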


SLIDE 14

Measuring Accuracy: Brier Scoring

  • 0.0: best possible (a perfect theory of a deterministic system)
  • 0.5: random (just guessing)
  • 2.0: worst possible (reverse clairvoyance)
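These three anchor points follow directly from the two-component definition; a quick check (using the same illustrative `brier` helper as in the rain example):

```python
def brier(p, outcome):
    # Two-component Brier score, as in the rain-forecast table.
    return (outcome - p) ** 2 + ((1 - outcome) - (1 - p)) ** 2

print(brier(1.0, 1))  # 0.0 -- best possible: full confidence, correct
print(brier(0.5, 1))  # 0.5 -- just guessing
print(brier(0.0, 1))  # 2.0 -- worst possible: full confidence, wrong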


SLIDE 15

Breaking Brier Scores Down Into Two Key Metrics:

  • Calibration
  • Resolution
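One common way to compute these two metrics is the Murphy decomposition of the Brier score. The sketch below (function name and the choice to group forecasts by exact probability value are my simplifications; real scoring bins nearby forecasts) measures calibration as "reliability" (lower is better) and resolution as spread of group frequencies around the base rate (higher is better):

```python
from collections import defaultdict

def calibration_resolution(forecasts, outcomes):
    """Murphy-decomposition terms for binary forecasts.
    reliability: how far each forecast value sits from the observed
    frequency in its group (0 = perfectly calibrated).
    resolution: how far group frequencies sit from the overall base
    rate (larger = forecasts discriminate better)."""
    n = len(forecasts)
    base_rate = sum(outcomes) / n
    groups = defaultdict(list)
    for p, o in zip(forecasts, outcomes):
        groups[p].append(o)
    reliability = sum(len(os) * (p - sum(os) / len(os)) ** 2
                      for p, os in groups.items()) / n
    resolution = sum(len(os) * (sum(os) / len(os) - base_rate) ** 2
                     for os in groups.values()) / n
    return reliability, resolution

# Perfectly calibrated but zero resolution: always forecast the base rate.
print(calibration_resolution([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0]))  # (0.0, 0.0)

# Perfectly calibrated with maximal resolution: confident and right.
print(calibration_resolution([1.0, 0.0, 1.0, 0.0], [1, 0, 1, 0]))  # (0.0, 0.25)
```

The two examples mirror the slides that follow: identical calibration, very different resolution.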


SLIDE 16

Examples of Calibration & Resolution

[Figure: calibration plot of subjective probability vs. observed frequency]

Best Possible Calibration with Poor Resolution


SLIDE 17

Examples of Calibration & Resolution

[Figure: calibration plot of subjective probability vs. observed frequency]

Best Possible Calibration with Good Resolution


SLIDE 18

Examples of Calibration & Resolution

[Figure: calibration plot of subjective probability vs. observed frequency]

Best Possible Calibration with Best Possible Resolution


SLIDE 19
Benchmarking (what should count as a good Brier score?)

  • Minimalist:
    • Dart-throwing chimp
    • Simple extrapolation/time-series models
  • Moderately aggressive:
    • Unweighted mean/median of the wisdom of the crowd
    • Expert consensus panels (central banks, EIU, Bloomberg, …)
  • Maximalist:
    • Most advanced statistical/Big-Data models
    • Beating deep, liquid markets


SLIDE 20

Other Take-Aways from the Tournaments

  • We discovered:
    • Just how vague “vague verbiage” can be, and how it makes it impossible to keep score
    • The personality/behavioral profiles of superforecasters
    • The group-dynamics profiles of superteams
    • How to design debiasing training that boosts real-world accuracy
SLIDE 21

Vague verbiage can be very vague

Watch what happens when we translate words into quant-equivalence ranges:

  • it might happen (0.09 to 0.64)
  • it could happen (0.02 to 0.56)
  • it’s a possibility (0.001 to 0.45)
  • it’s a real possibility (0.22 to 0.89)
  • it’s probable (0.55 to 0.90)
  • maybe (0.31 to 0.69)
  • distinct possibility (0.21 to 0.84)
  • risky (0.11 to 0.83)
  • some chance (0.05 to 0.42)
  • slam dunk or sure thing (0.95 to 1.0)

[Figure: the phrases arrayed on a scale from less certain to more certain, from “impossible” to “slam dunk / sure thing”]


SLIDE 22

How Accurate Are Today’s Thought Leaders?

[Pictured: Wolf, Krugman, Ferguson, Bremmer, Friedman, Kristol]


SLIDE 23

Profiling Superforecasters

  • Fluid intelligence helps but without…
  • Active open-mindedness helps but without…
  • And both combined count for little unless:
  • You believe probability estimation is a skill that can be cultivated, and is worth cultivating

SLIDE 24

Profiling Superteams

  • Somehow manage to check groupthink via precision questioning and constructive confrontation without degrading into factionalism

SLIDE 25

Yet Goliath Decided to Lend David Slingshot Money

  • In 2010, IARPA challenged five $5M-per-year research programs to out-predict a $5B-per-year bureaucracy in a 4-year tournament
  • One of these programs, GJP, won the tournament, by big margins