
How to Win a Forecasting Tournament? Philip E. Tetlock, Wharton



  1. 1

  2. How to Win a Forecasting Tournament? Philip E. Tetlock Wharton School CFA Asset Management Forum Montreal, October 8, 2015

  3. WHAT ARE FORECASTING TOURNAMENTS? • Level-playing-field competitions to determine who knows what • A disruptive technology that destabilizes stale status hierarchies

  4. How Did GJP Win the Tournament? • By assigning the most accurate probability estimates to over 500 outcomes of “national security relevance” • But how did GJP do that?

  5. Winning requires picking battles wisely: • Where the ball stops • Where the pendulum swings • Where the hurricane meanders [Figure: the three cases placed on an axis running from More Predictable to Less Predictable]

  6. Winning Requires Skill at: • Discounting Pseudo-Diagnostic News to Which Crowd Over-Reacts • Spotting Subtly-Diagnostic News to Which Crowd Under-Reacts [Figure: two panels plotting Subjective Probability (0 to 1) against Time, showing Crowd Beliefs around events E1, E2, E3]

  7. And winning requires moving beyond blame-game ping-pong • WMD: Over-Connecting the Dots (False-Positives) • 9/11: Under-Connecting the Dots (False-Negatives) • Finding Osama Bin Laden

  8. But How Exactly Did GJP Pull It Off? • Get Right People on Bus: spotting/cultivating superforecasters (40% boost) • Teaming: anti-groupthink groups (10% boost) • Training: debiasing exercises (10% boost) • Elitist Algorithms: aggregation algorithms that up-weight shrewd forecasters AND extremize to compensate for conservatism of aggregates (25%-plus boost)

  9. Obama’s Osama Decision: Through a GJP Lens • Hollywood vs. History (the myth and reality of Zero Dark Thirty) • Two Thought-Experiment Variations on Reality • Clones vs. Silos • National Security vs. March Madness

  10. OPTOMETRY TRUMPS PROPHECY • GJP’s methods improve foresight using tested tools: personnel selection, training, teaming, incentives and algorithms • Still a blurry world, just less so: GJP’s best methods assign probabilities of 24-28% to things that don’t happen and 72-76% to things that do

  11. END

  12. Ungar’s log-odds model beat all comers (including several prediction markets) • Log-odds with shrinkage + noise: $m_j = a \log\big(p_j / (1 - p_j)\big) + e$ • Amount of transformation, $a$, depends on sophistication and diversity of forecaster pool [Figure: Transformed Probability (0 to 1) plotted against Probability (0 to 1)]
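
The log-odds transformation on slide 12 can be illustrated with a minimal sketch. This is not the GJP implementation: it assumes the transformation means scaling a forecast's log-odds by `a` and mapping the result back to a probability, it omits the noise term `e`, and the function names, the clipping constant `EPS`, and the optional `weights` argument are hypothetical choices made for illustration.

```python
import math

EPS = 1e-6  # hypothetical clipping constant to avoid log(0)


def logit(p):
    """Log-odds of a probability, clipped away from 0 and 1."""
    p = min(max(p, EPS), 1.0 - EPS)
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Map log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def extremize(p, a):
    """Scale the log-odds of p by a; a > 1 pushes p away from 0.5."""
    return inv_logit(a * logit(p))


def aggregate(probs, a, weights=None):
    """Weighted mean of log-odds, scaled by a, returned as a probability.

    Equal weights are assumed when none are given; up-weighting a
    forecaster simply means giving that forecast a larger weight.
    """
    if weights is None:
        weights = [1.0] * len(probs)
    total = sum(weights)
    mean_log_odds = sum(w * logit(p) for w, p in zip(weights, probs)) / total
    return inv_logit(a * mean_log_odds)
```

With `a = 1` the forecast is unchanged, and with `a > 1` a pooled forecast moves toward 0 or 1, which is one way to offset the conservatism of averaged forecasts that the earlier slides mention. For instance, `extremize(0.7, 2)` is roughly 0.845, since 0.49 / (0.49 + 0.09) ≈ 0.845.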

  13. Measuring the accuracy of probability judgments
      | Day  | Probability of Rain | Outcome of Rain | Brier Score                        |
      |------|---------------------|-----------------|------------------------------------|
      | 1    | 90%                 | Yes = 100%      | $(1 - 0.9)^2 + (0 - 0.1)^2 = 0.02$ |
      | 2    | 50%                 | Yes = 100%      | $(1 - 0.5)^2 + (0 - 0.5)^2 = 0.50$ |
      | 3    | 50%                 | No = 0%         | $(0 - 0.5)^2 + (1 - 0.5)^2 = 0.50$ |
      | 4    | 80%                 | Yes = 100%      | $(1 - 0.8)^2 + (0 - 0.2)^2 = 0.08$ |
      | Mean | 68%                 | 50%             | 0.28                               |
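
The rain example on slide 13 can be reproduced with a short sketch. It assumes the two-sided Brier score the slide's arithmetic implies, with squared errors summed over both the "rain" and "no rain" outcomes; the function and variable names are illustrative.

```python
def brier_score(p_yes, occurred):
    """Two-sided Brier score for a binary event.

    Sums the squared error on the 'yes' outcome and on the 'no' outcome,
    so scores range from 0 (best) to 2 (worst).
    """
    o_yes = 1.0 if occurred else 0.0
    o_no = 1.0 - o_yes
    p_no = 1.0 - p_yes
    return (o_yes - p_yes) ** 2 + (o_no - p_no) ** 2


# (probability of rain, did it rain?) for the four days on the slide
days = [(0.9, True), (0.5, True), (0.5, False), (0.8, True)]

scores = [brier_score(p, rained) for p, rained in days]
mean_score = sum(scores) / len(scores)

for day, score in enumerate(scores, start=1):
    print(f"Day {day}: {score:.2f}")
print(f"Mean: {mean_score:.3f}")
```

The daily scores come out at 0.02, 0.50, 0.50 and 0.08, and the mean is 0.275, which the slide rounds to 0.28. The same function returns 0 for a fully confident correct forecast, 0.5 for any 50/50 forecast, and 2.0 for a fully confident wrong forecast.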

  14. Measuring Accuracy: Brier Scoring • Best Possible: 0 (perfect theory of deterministic system) • Random: 0.5 (just guessing) • Worst Possible: 2.0 (reverse clairvoyance)

  15. Breaking Brier Scores Down Into Two Key Metrics: • Calibration • Resolution

  16. Examples of Calibration & Resolution: Best Possible Calibration with Poor Resolution [Figure: chart with Subjective Probability on the horizontal axis; both axes run from 0 to 1]

  17. Examples of Calibration & Resolution: Best Possible Calibration with Good Resolution [Figure: chart with Subjective Probability on the horizontal axis; both axes run from 0 to 1]

  18. Examples of Calibration & Resolution: Best Possible Calibration with Best Possible Resolution [Figure: chart with Subjective Probability on the horizontal axis; both axes run from 0 to 1]
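
The slides name calibration and resolution without giving formulas. The sketch below uses the standard Murphy decomposition as an assumed stand-in, which may differ in detail from what GJP computed. It works on the one-sided (0 to 1) Brier scale, where Brier = calibration - resolution + uncertainty, and it groups forecasts by their exact stated probability. All names and the toy data are illustrative.

```python
from collections import defaultdict


def calibration_resolution(forecasts, outcomes):
    """Return (calibration, resolution) from the Murphy decomposition.

    forecasts: stated probabilities of the event
    outcomes:  1 if the event happened, 0 otherwise
    Calibration is better when lower (0 is perfect);
    resolution is better when higher.
    """
    n = len(forecasts)
    base_rate = sum(outcomes) / n

    # Group outcomes by the probability that was stated for them.
    groups = defaultdict(list)
    for p, o in zip(forecasts, outcomes):
        groups[p].append(o)

    calibration = 0.0
    resolution = 0.0
    for p, observed in groups.items():
        freq = sum(observed) / len(observed)  # how often the event occurred
        calibration += len(observed) * (p - freq) ** 2
        resolution += len(observed) * (freq - base_rate) ** 2
    return calibration / n, resolution / n


# Always saying 50% when the base rate is 50%: calibrated, no resolution.
poor = calibration_resolution([0.5] * 4, [1, 0, 1, 0])

# Saying 75% or 25% and being right that often: calibrated, some resolution.
good = calibration_resolution([0.75] * 4 + [0.25] * 4,
                              [1, 1, 1, 0, 0, 0, 0, 1])

# Saying 100% or 0% and always being right: calibrated, maximal resolution.
best = calibration_resolution([1.0, 0.0, 1.0, 0.0], [1, 0, 1, 0])
```

All three toy forecasters are perfectly calibrated (calibration 0), but resolution rises from 0 to 0.0625 to 0.25 as the forecasts move away from the base rate. This mirrors the progression from poor to good to best possible resolution in the three example slides.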

  19. Benchmarking (what should count as a good Brier score?) • Minimalist: dart-throwing chimp; simple extrapolation/time-series models • Moderately aggressive: unweighted mean/median of wisdom of the crowd; expert consensus panels (Central Banks, EIU, Bloomberg, …) • Maximalist: most advanced statistical/Big-Data models; beating deep liquid markets

  20. Other Take-Aways from the Tournaments • We discovered: just how vague “vague verbiage” can be, and how it makes it impossible to keep score; the personality/behavioral profiles of superforecasters; the group-dynamics profiles of superteams; how to design debiasing training that boosts real-world accuracy

  21. Vague verbiage can be very vague • Watch what happens when we translate words into quant-equivalence ranges (QER): • it might happen (0.09 to 0.64) • maybe (0.31 to 0.69) • it could happen (0.02 to 0.56) • distinct possibility (0.21 to 0.84) • it's a possibility (0.001 to 0.45) • risky (0.11 to 0.83) • it’s a real possibility (0.22 to 0.89) • some chance (0.05 to 0.42) • it's probable (0.55 to 0.90) • slam dunk or sure thing (0.95 to 1.0) [Figure: the phrases arranged along a scale from 0 (less certain) to 1 (more certain)]

  22. How Accurate Are Today’s Thought Leaders? Wolf, Krugman, Bremmer, Ferguson, Friedman, Kristol

  23. Profiling Superforecasters • Fluid intelligence helps but without… • Active open-mindedness helps but without… • And both combined count for little unless: • You believe probability estimation is a skill that can be cultivated, and is worth cultivating

  24. Profiling Superteams • Somehow manage to check groupthink via precision questioning and constructive confrontation without degrading into factionalism

  25. Yet Goliath Decided to Lend David Slingshot Money • In 2010, IARPA challenged five $5M-per-year research programs to out-predict a $5B-per-year bureaucracy in a 4-year tournament • One of these programs, GJP, won the tournament by big margins
