Calibrate p-values by taking the square root


Calibrate p-values by taking the square root. Rutgers Foundations of Probability Seminar, September 12, 2016. Glenn Shafer. 1. Significance levels and p-values 2. Game-theoretic probability 3. The dynamic nature of game-theoretic testing


  1. Calibrate p-values by taking the square root. Rutgers Foundations of Probability Seminar, September 12, 2016. Glenn Shafer. Outline: 1. Significance levels and p-values 2. Game-theoretic probability 3. The dynamic nature of game-theoretic testing 4. Calibrating p-values 5. Insuring against loss of evidence

  2. See Working Papers at www.probabilityandfinance.com: 33. Test martingales, Bayes factors, and p-values, by Glenn Shafer, Alexander Shen, Nikolai Vereshchagin, and Vladimir Vovk. Statistical Science 26, 84–101, 2011. 34. Insuring against loss of evidence in game-theoretic probability, by A. Philip Dawid, Steven de Rooij, Glenn Shafer, Alexander Shen, Nikolai Vereshchagin, and Vladimir Vovk. Statistics and Probability Letters 81, 157–162, 2011.

  3. For nearly 100 years, researchers have persisted in using p-values in spite of fierce criticism. Both Bayesians and Neyman-Pearson purists contend that use of a p-value is cheating even in the simplest case, where the hypothesis to be tested and a test statistic are specified in advance. Bayesians point out that a small p-value often does not translate into a strong Bayes factor against the hypothesis. Neyman-Pearson purists insist that you should state a significance level in advance and stick with it, even if the p-value turns out to be much smaller than this significance level. But many applied statisticians persist in feeling that a p-value much smaller than the significance level is meaningful evidence. In the game-theoretic approach to probability (see my 2001 book with Vladimir Vovk, described at www.probabilityandfinance.com), you test a statistical hypothesis by using its probabilities to bet. You reject at a significance level of 0.01, say, if you succeed in multiplying the capital you risk by 100. In this picture, we can calibrate small p-values so as to measure their meaningfulness while absolving them of cheating. There are various ways to implement this calibration, but one of them leads to a very simple rule of thumb: take the square root of the p-value. Thus rejection at a significance level of 0.01 requires a p-value of one in 10,000.
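
The rule of thumb on this slide can be written out as a small Python sketch. The function names `sqrt_calibrate` and `p_value_needed` are hypothetical and chosen only for illustration; the only thing taken from the slide is the rule itself (replace the p-value by its square root and compare that with the significance level).

```python
import math

def sqrt_calibrate(p: float) -> float:
    """Rule of thumb from the slide: replace a p-value by its square root."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return math.sqrt(p)

def p_value_needed(level: float) -> float:
    """p-value whose square root equals the target significance level."""
    if not 0.0 <= level <= 1.0:
        raise ValueError("level must lie in [0, 1]")
    return level ** 2

# Hypothetical p-values, listed only to show how quickly the evidence shrinks.
for p in (0.05, 0.01, 0.001, 0.0001):
    print(f"p = {p:<7} calibrated level = {sqrt_calibrate(p):.4f}")
```

Under this rule a conventional p-value of 0.05 calibrates to roughly 0.22, and only a p-value of 0.0001 reaches the 0.01 level, matching the "one in 10,000" figure above.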

  4. Part 1. Significance levels and p-values • Is use of p-values cheating? Part 2. Game-theoretic probability • Use a game to define probability • Game-theoretic justification of significance testing Part 3. The dynamic nature of game-theoretic testing • Evidence can go up and then back down. • Pretending you had stopped earlier = using p-value Part 4. Calibrating p-values • Averaging stopped versions of Skeptic's play • The square-root calibrator

  5. Part 1. Significance levels and p-values. Is use of p-values cheating? The emphasis on p-values began with Karl Pearson and R. A. Fisher. (Pictured: Karl Pearson, 1857–1936; R. A. Fisher, 1890–1962.)

  6. Part 1. Significance levels and p-values

  7. Part 1. Significance levels and p-values. Twentieth-century questions

  8. Part 1. Significance levels and p-values

  9. Part 1. Significance levels and p-values

  10. Part 1. Significance levels and p-values

  11. Part 2. Game-theoretic probability • Use a game to define probability • Game-theoretic justification of significance testing. Fermat: probability = measure of cases that produce event. Pascal: probability = capital you risk to get 1 if event happens. (Pictured: Pierre Fermat, 1601–1665; Blaise Pascal, 1623–1662.)

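  12. Part 2. Game-theoretic probability. Pascal's question to Fermat in 1654: Paul needs 2 points to win. Peter needs only one. If the game must be broken off, how much of the stake should Paul get? [Figure: game tree showing Paul's payoffs. If Peter wins the first round, Paul gets 0. If Paul wins the first round, a second round is played: Paul gets 0 if Peter wins it and 64 if Paul wins it.]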

  13. Part 2. Game-theoretic probability. Fermat's answer: Suppose they play two rounds. There are 4 possible outcomes: 1. Peter wins first, Peter wins second 2. Peter wins first, Paul wins second 3. Paul wins first, Peter wins second 4. Paul wins first, Paul wins second. Paul wins only in outcome 4. So his share should be ¼, or 16 pistoles. Pascal didn't like the argument. [Figure: the game tree with Paul's payoffs 0, 0, and 64.]
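
Fermat's counting argument can be reproduced with a short enumeration. This is only a sketch: the names `STAKE` and `paul_wins` are hypothetical, and the code assumes, as the counting argument does, that the four two-round outcomes are equally likely.

```python
from fractions import Fraction
from itertools import product

STAKE = 64  # total stake in pistoles, as on the slide

def paul_wins(outcome):
    """Paul needs both remaining points; a single win by Peter ends it."""
    return all(winner == "Paul" for winner in outcome)

# All ways two (possibly hypothetical) rounds could go.
outcomes = list(product(["Peter", "Paul"], repeat=2))
favorable = [o for o in outcomes if paul_wins(o)]

paul_fraction = Fraction(len(favorable), len(outcomes))
paul_share = paul_fraction * STAKE
print(paul_fraction, paul_share)
```

The enumeration counts outcomes 1 and 2 separately even though the second round would never be played once Peter wins the first.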

  14. Part 2. Game-theoretic probability. Pascal's answer (game theory). [Figure: the game tree with values added: 16 at the start, 32 after Paul wins the first round, and terminal payoffs 0, 0, and 64.]

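  15. Part 2. Game-theoretic probability. [Figure: two copies of the game tree. In pistoles: 16 at the start, 32 after Paul wins the first round, terminal payoffs 0, 0, and 64. As fractions of the stake: 1/4 at the start, 1/2 after Paul wins the first round, terminal values 0, 0, and 1.]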
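
The values 32 and 16 in the tree can be obtained by working backward from the final payoffs. The sketch below assumes even odds on every round and represents the tree as nested tuples `(peter_branch, paul_branch)`; both the representation and the function name `value` are hypothetical choices made for illustration, not something shown on the slides.

```python
from fractions import Fraction

def value(node):
    """Fair value of a position: a terminal payoff, or the average of the
    two successor values when each round is played at even odds."""
    if isinstance(node, tuple):
        peter_branch, paul_branch = node
        return (value(peter_branch) + value(paul_branch)) / 2
    return Fraction(node)

after_paul_wins_first = (0, 64)      # Peter wins second -> 0, Paul wins second -> 64
game = (0, after_paul_wins_first)    # Peter wins first -> 0

print(value(after_paul_wins_first), value(game))
print(value(after_paul_wins_first) / 64, value(game) / 64)
```

Dividing each value by the 64-pistole stake gives the second tree, with 1/2 after Paul's first win and 1/4 at the start.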

  16. Part 2. Game-theoretic probability. Measure-theoretic probability: • Classical: elementary events with probabilities adding to one. • Modern: space with filtration and probability measure. Probability of A = total probability of elementary events favoring A. Game-theoretic probability: • Forecaster offers prices for uncertain payoffs. • Skeptic decides what to buy. Probability of A = stake Skeptic must risk to get 1 if A happens. Upper probability of A = stake Skeptic must risk to get at least 1 if A happens.

  17. Part 2. Game-theoretic probability. Ville revived game-theoretic probability (e.g., martingales) in 1939. (Pictured: Jean Ville, 1910–1989.) Ville's Picture. On each round: 1. Skeptic decides which offers to accept. 2. Reality decides the outcome. Sequential nature of the game is fundamental.

  18. Part 2. Game-theoretic probability

  19. Part 2. Game-theoretic probability. Ville's game-theoretic foundation for classical probability: • An event has probability zero if and only if Skeptic can multiply his capital infinitely if the event happens. • An event has probability < 1/K if and only if Skeptic can multiply his capital by K if the event happens. Vovk and I generalize in two ways: 1. We say upper probability instead of probability when too few bets are offered to construct an exact 0/1 payoff. 2. We allow bets to be offered in the course of the game by a forecaster. Vovk's Picture. On each round: 1. Forecaster offers bets. 2. Skeptic decides which offers to accept. 3. Reality decides the outcome.
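
The bound behind the 1/K statement is Ville's inequality: a nonnegative capital process that starts at 1 reaches K with probability at most 1/K. It can be checked exactly on a toy example. The sketch below makes assumptions that are not on the slide: Forecaster always offers even odds on a coin, Skeptic stakes a fixed fraction of his capital on heads every round, and the function name `fraction_reaching` is hypothetical.

```python
from fractions import Fraction
from itertools import product

def fraction_reaching(K, n_rounds, bet_fraction):
    """Exact fraction of fair-coin paths on which Skeptic's capital,
    starting at 1, reaches K at some round."""
    hits = 0
    for path in product((+1, -1), repeat=n_rounds):
        capital = Fraction(1)
        for outcome in path:
            # Win or lose the staked fraction of current capital.
            capital *= 1 + bet_fraction * outcome
            if capital >= K:
                hits += 1
                break
    return Fraction(hits, 2 ** n_rounds)

for K in (2, 4, 8):
    prob = fraction_reaching(K, n_rounds=12, bet_fraction=Fraction(1, 2))
    print(K, float(prob), prob <= Fraction(1, K))
```

Because every path of the 12-round game is enumerated, the comparison with 1/K is exact rather than a simulation estimate.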

  20. Part 2. Game-theoretic probability. Game-theoretic justification of significance testing

  21. Part 2. Game-theoretic probability. Game-theoretic justification of significance testing: The gambling picture justifies significance testing. Don't try to understand game-theoretic probability in terms of classical statistics. The logic goes in the other direction.

  22. Part 2. Game-theoretic probability. Game-theoretic explanation of why p-values are less convincing: Strategy depends on p!!

  23. Part 3. Dynamic nature of game-theoretic testing • Evidence can go up and then back down. • Pretending you stopped earlier = using p-value
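
A small numerical illustration of the two bullets: the sketch below follows one betting path on which Skeptic's capital rises and then falls, and compares the final capital with the largest capital reached along the way. The outcome sequence, the fixed betting fraction, and the name `capital_path` are all hypothetical choices for illustration.

```python
from itertools import accumulate

def capital_path(outcomes, bet_fraction=0.5):
    """Capital after each round, starting at 1, when a fixed fraction
    of current capital is staked at even odds."""
    capital = 1.0
    path = [capital]
    for won in outcomes:
        capital *= (1 + bet_fraction) if won else (1 - bet_fraction)
        path.append(capital)
    return path

# Three wins followed by three losses (a made-up sequence).
path = capital_path([True, True, True, False, False, False])
running_max = list(accumulate(path, max))

print("final capital:", path[-1])
print("largest capital so far:", running_max[-1])
```

Reporting the largest capital instead of the final capital is what "pretending you stopped earlier" amounts to; it overstates what Skeptic actually holds at the end, which is why such a figure needs calibrating.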

  24. Part 3. Dynamic nature of game-theoretic testing. Classical Cournot principle: Meaning of probability model = Event of small probability 1/K selected in advance will not happen. Game-theoretic Cournot principle: Meaning of forecasts = Skeptic will not multiply capital risked by large factor. Classic principle as special case of game-theoretic principle: 1. Assume forecast on each round is probability distribution for Reality's next move. 2. Fix a strategy for Forecaster, thus defining a classical probability model for Reality's moves. 3. Fix a strategy for Skeptic (including a stopping time). 4. Fix a factor K by which Skeptic aims to multiply capital. In this special case, the two principles are equivalent, because [equation not preserved in the transcript]

  25. Part 3. Dynamic nature of game-theoretic testing. Classical Cournot principle: Meaning of probability model = Event of small probability 1/K selected in advance will not happen. Game-theoretic Cournot principle: Meaning of forecasts = Skeptic will not multiply capital risked by large factor. The scope of the generalization: 1. Forecast on each round may fall short of a complete probability distribution for Reality's next move. 2. Forecaster need not follow a strategy. 3. Skeptic need not follow a strategy. 4. Skeptic need not set a goal for multiplying his capital. 5. But the stopping time must be fixed.

  26. Part 3. Dynamic nature of game-theoretic testing. Skeptic need not follow a strategy. But we study strategies for Skeptic in order to see what he can accomplish. The capital process for a strategy for Skeptic is called a martingale. (This usage is due to Jean Ville.)

  27. Part 3. Dynamic nature of game-theoretic testing • Forecaster gives 50-50 odds on Paul each time. • Skeptic's strategy: Start with 16 pistoles and bet all his money on Paul each time. • The numbers in red constitute the martingale for this strategy. [Figure: the game tree with the martingale values 16 at the start and 32 after Paul wins the first round, and terminal values 0, 0, and 64.]
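
The martingale for this strategy can be traced round by round. A minimal sketch, assuming even odds as stated in the first bullet; the function name `capital_process` is hypothetical.

```python
def capital_process(initial, winners):
    """Skeptic's capital after each round when he stakes everything
    on Paul at even odds: doubled if Paul wins, lost if Peter wins."""
    capital = initial
    history = [capital]
    for winner in winners:
        capital = 2 * capital if winner == "Paul" else 0
        history.append(capital)
    return history

print(capital_process(16, ["Paul", "Paul"]))   # [16, 32, 64]
print(capital_process(16, ["Paul", "Peter"]))  # [16, 32, 0]
print(capital_process(16, ["Peter"]))          # [16, 0]
```

The three calls cover the three branches of the tree, so the printed lists contain exactly the values 16, 32, 64, and 0 that appear in the figure.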

  28. Part 3. Dynamic nature of game-theoretic testing. Selecting a strategy for Skeptic (martingale): In probability case, the martingale is a likelihood ratio. This is how the notion of an alternative hypothesis enters the picture.
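
The claim that a likelihood ratio is a martingale can be checked directly in a simple case. The sketch below assumes a null hypothesis of a fair coin and an alternative with heads probability 3/4; these numbers, the sample history, and the function names are hypothetical choices for illustration only.

```python
from fractions import Fraction

def likelihood_ratio(flips, q_alt, q_null=Fraction(1, 2)):
    """Probability of the 0/1 sequence under the alternative divided by
    its probability under the null (1 = heads, 0 = tails)."""
    ratio = Fraction(1)
    for x in flips:
        ratio *= (q_alt if x else 1 - q_alt) / (q_null if x else 1 - q_null)
    return ratio

def expected_next(flips, q_alt, q_null=Fraction(1, 2)):
    """Expected value of the ratio after one more flip, computed under the null."""
    heads = likelihood_ratio(flips + [1], q_alt, q_null)
    tails = likelihood_ratio(flips + [0], q_alt, q_null)
    return q_null * heads + (1 - q_null) * tails

history = [1, 1, 0, 1]
q = Fraction(3, 4)
print(likelihood_ratio(history, q), expected_next(history, q))
```

The two printed values agree: under the null, the expected ratio after the next flip equals the current ratio, which is the defining property of a martingale. Choosing the alternative is the same as choosing how Skeptic bets.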

  29. Part 3. Dynamic nature of game-theoretic testing. Classical Cournot principle: Meaning of probability model = Event of small probability 1/K selected in advance will not happen. Game-theoretic Cournot principle: Meaning of forecasts = Skeptic will not multiply capital risked by large factor. The scope of the generalization: 1. Forecast on each round may fall short of a complete probability distribution for Reality's next move. 2. Forecaster need not follow a strategy. 3. Skeptic need not follow a strategy. 4. Skeptic need not set a goal for multiplying his capital. 5. But the stopping time must be fixed.
