

SLIDE 1

2011 Rao Prize Conference, Penn State, May 19


Bayesian Adjustment for Multiplicity

Jim Berger

Duke University

with James Scott

University of Texas

2011 Rao Prize Conference, Department of Statistics, Penn State University, May 19, 2011

SLIDE 2

SLIDE 3

Outline

  • Background on multiplicity
  • Illustration of the Bayesian approach through simpler examples

  – Multiple testing under exclusivity
  – Multiple testing under non-exclusivity
  – Sequence multiple testing

  • The general Bayesian approach to multiplicity adjustment
  • Multiple models
  • Variable selection (including comparison with empirical Bayes)
  • Subgroup analysis

SLIDE 4

Some Multiplicity Problems in SAMSI Research Programs

  • Stochastic Computation / Data Mining and Machine Learning

– Example: Microarrays, with 100,000 mean gene expression differentials µi, and testing H0 : µi = 0 versus H1 : µi ≠ 0. Multiplicity problem: Even if all µi = 0, one would expect roughly αm = 5,000 tests to reject at, say, level α = 0.05, so a correction for this effect is needed.

  • Astrostatistics and Phystat

– Example: 1.6 million tests of Cosmic Microwave Background radiation for non-Gaussianity in its spatial distribution.
– Example: At the LHC, up to 10^12 tests are being considered for each particle event to try to detect particles such as the Higgs boson. And recently (pre-LHC), there was an 8σ event that didn't replicate.

  • Multiplicity and Reproducibility in Scientific Studies

– In the USA, drug compounds entering Phase I development today have an 8% chance of reaching market, versus a 14% chance 15 years ago.
– 70% Phase III failure rates, versus a 20% failure rate 10 years ago.
– Reports that 30% of Phase III successes fail to replicate.

SLIDE 5

Simple Examples of the Bayesian Approach to Multiplicity Adjustment

Key Fact: Bayesian analysis deals with multiplicity adjustment solely through the assignment of prior probabilities to models or hypotheses.

Example: Multiple Testing under Exclusivity. Suppose one is testing mutually exclusive hypotheses Hi, i = 1, . . . , m, so each hypothesis is a separate model. If the hypotheses are viewed as exchangeable, choose P(Hi) = 1/m.

Example: 1000 energy channels are searched for a signal:

  • if the signal is known to exist and occupy only one channel, but no channel is theoretically preferred, each channel can be assigned prior probability 0.001.
  • if the signal is not known to exist (e.g., it is the prediction of a non-standard physics theory), prior probability 1/2 should be given to 'no signal,' and probability 0.0005 to each channel.

This is the Bayesian solution regardless of the structure of the data.

SLIDE 6

In contrast, frequentist solutions depend on the structure of the data.

Example: For each channel, test H0i : µi = 0 versus H1i : µi > 0.

Data: Xi, i = 1, ..., m, are normally distributed with mean µi, variance 1, and correlation ρ.

If ρ = 0, one can just do individual tests at level α/m (Bonferroni) to obtain an overall error probability of α.

If ρ > 0, harder work is needed:

  • Choose an overall decision rule, e.g., "declare channel i to have the signal if Xi is the largest value and Xi > K."
  • Compute the corresponding error probability, which can be shown to be

    α = Pr( max_i Xi > K | µ1 = . . . = µm = 0 ) = E_Z[ 1 − Φ( (K − √ρ Z)/√(1 − ρ) )^m ],

where Φ is the standard normal cdf and Z is standard normal. Note that this gives (essentially) the Bonferroni correction when ρ = 0, and converges to 1 − Φ(K) as ρ → 1 (the one-dimensional solution).

SLIDE 7

An example of non-mutually-exclusive Bayesian multiple testing

(Scott and Berger, 2006 JSPI; other, more sophisticated full Bayesian analyses are in Gönen et al. (03), Do, Müller and Tang (02), Newton et al. (01), Newton and Kendziorski (03), Müller et al. (03), Guindani, Zhang and Müller (2007), . . .; many empirical Bayes, such as Storey, Dai and Leek (2007))

  • Suppose xi ∼ N(µi, σ²), i = 1, . . . , m, are observed, σ² known, and test H0i : µi = 0 versus H1i : µi ≠ 0.
  • Most of the µi are thought to be zero; let p denote the unknown common prior probability that µi is zero.
  • Assume that the nonzero µi follow a N(0, V) distribution, with V unknown.
  • Assign p the uniform prior on (0, 1) and V the prior density π(V) = σ²/(σ² + V)².

SLIDE 8

  • Then the posterior probability that µi ≠ 0 is

    pi = 1 − [ ∫₀¹ ∫₀¹ p ∏_{j≠i} ( p + (1 − p) √(1 − w) e^{w xj²/(2σ²)} ) dp dw ] / [ ∫₀¹ ∫₀¹ ∏_{j=1}^m ( p + (1 − p) √(1 − w) e^{w xj²/(2σ²)} ) dp dw ],

    where w = V/(σ² + V), so that π(V) transforms to the uniform density on (0, 1).

  • (p1, p2, . . . , pm) can be computed numerically; for large m, it is most efficient to use importance sampling, with a common importance sample for all pi.

Example: Consider the following ten 'signal' observations:

  • 8.48, −5.43, −4.81, −2.64, −2.40, 3.32, 4.07, 4.81, 5.81, 6.24
  • Generate n = 10, 50, 500, and 5000 N(0, 1) noise observations.
  • Mix them together and try to identify the signals.
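For small m the double integral can be evaluated on a grid rather than by importance sampling; a sketch with σ = 1 and a hypothetical mix of two large 'signals' and three noise-sized values:

```python
import math

def factor(x, p, w):
    """One term of the product: p + (1 - p) sqrt(1 - w) e^{w x^2 / 2}  (sigma = 1)."""
    return p + (1.0 - p) * math.sqrt(1.0 - w) * math.exp(w * x * x / 2.0)

def posterior_nonzero_probs(xs, grid=200):
    """p_i = Pr(mu_i != 0 | x), via midpoint-rule integration over (p, w) in (0,1)^2."""
    pts = [(k + 0.5) / grid for k in range(grid)]
    den = 0.0
    nums = [0.0] * len(xs)
    for p in pts:
        for w in pts:
            f = [factor(x, p, w) for x in xs]
            prod_all = math.prod(f)
            den += prod_all                      # denominator integrand
            for i, fi in enumerate(f):
                nums[i] += p * prod_all / fi     # p times product over j != i
    return [1.0 - num / den for num in nums]

xs = [4.2, -3.7, 0.3, -0.8, 1.1]  # hypothetical data: two 'signals', three noise values
probs = posterior_nonzero_probs(xs)
```

Since each pi depends on xi only through xi², the computed probabilities are ordered by |xi|, which is the automatic multiplicity penalty in action.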

SLIDE 9
The ten 'signal' observations:

| #noise n | 8.5 | −5.4 | −4.8 | −2.6 | −2.4 | 3.3 | 4.1 | 4.8 | 5.8 | 6.2 | # noise pi > .6 |
| 10   | 1 | 1   | 1   | .94 | .89 | .99 | 1   | 1   | 1 | 1 | 1 |
| 50   | 1 | 1   | 1   | .71 | .59 | .94 | 1   | 1   | 1 | 1 |   |
| 500  | 1 | 1   | 1   | .26 | .17 | .67 | .96 | 1   | 1 | 1 | 2 |
| 5000 | 1 | 1.0 | .98 | .03 | .02 | .16 | .67 | .98 | 1 | 1 | 1 |

Table 1: The posterior probabilities of being nonzero for the ten 'signal' means.

Note 1: The penalty for multiple comparisons is automatic.

Note 2: Theorem: E[#i : pi > .6 | all µj = 0] = O(1) as m → ∞, so the Bayesian procedure exerts medium-strong control over false positives. (In comparison, E[#i : Bonferroni rejects | all µj = 0] = α.)

SLIDE 10

[Figure: four panels of posterior densities for µi over (−10, 10), labeled −5.65, −5.56, −2.98, and −2.62, with vertical bars of heights including 0.32 and 0.45.]

Figure 1: For four of the observations, 1 − pi = Pr(µi = 0 | y) (the vertical bar), and the posterior densities for µi ≠ 0.

SLIDE 11

Sequence Multiple Testing

SLIDE 12

Hypotheses and Data:

  • Alvac had shown no effect
  • Aidsvax had shown no effect

Question: Would Alvac as a primer and Aidsvax as a booster work?

The Study: Conducted in Thailand with 16,395 individuals from the general (not high-risk) population:

  • 74 HIV cases reported in the 8198 individuals receiving placebos
  • 51 HIV cases reported in the 8197 individuals receiving the treatment

SLIDE 13

The test that was performed:

  • Let p1 and p2 denote the probability of HIV in the placebo and treatment populations, respectively.
  • Test H0 : p1 = p2 versus H1 : p1 ≠ p2.
  • Normal approximation okay, so

    z = (p̂1 − p̂2)/σ̂_{p̂1−p̂2} = (.009027 − .006222)/.001359 = 2.06

    is approximately N(θ, 1), where θ = (p1 − p2)/.001359. We thus test H0 : θ = 0 versus H1 : θ ≠ 0, based on z.

  • Observed z = 2.06, so the p-value is 0.04.
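The statistic can be reproduced directly from the trial counts (74/8198 versus 51/8197); the standard error shown, .001359, matches the usual unpooled normal approximation, a sketch of which is:

```python
from math import erf, sqrt

def two_proportion_z(x1, n1, x2, n2):
    """z = (p1_hat - p2_hat) / se_hat, with the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / se

z = two_proportion_z(74, 8198, 51, 8197)
p_value = 2.0 * (1.0 - 0.5 * (1.0 + erf(z / sqrt(2.0))))  # two-sided normal p-value
```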

Questions:

  • Is the p-value usable as a direct measure of vaccine efficacy?
  • Should the fact that there were two previous similar trials be taken into account (the multiple testing part of the story)?

SLIDE 14

Bayesian Analysis of the Single Trial:

Prior distribution:

  • Pr(Hi) = prior probability that Hi is true, i = 0, 1.
  • On H1 : θ > 0, let π(θ) be the prior density for θ.

Note: H0 must be believable (at least approximately) for this to be reasonable (i.e., no fake nulls).

Subjective Bayes: choose these based on personal beliefs.

Objective (or default) Bayes: choose

  • Pr(H0) = Pr(H1) = 1/2,
  • π(θ) = Uniform(0, 6.46), which arises from assigning
    – uniform for p2 on 0 < p2 < p1,
    – plug-in for p1.

SLIDE 15

Posterior probability of hypotheses:

    Pr(H0 | z) = probability that H0 is true, given data z
               = f(z | θ = 0) Pr(H0) / [ Pr(H0) f(z | θ = 0) + Pr(H1) ∫₀^∞ f(z | θ) π(θ) dθ ]

For the objective prior, Pr(H0 | z = 2.06) ≈ 0.33 (recall, p-value ≈ .04).

The posterior density on H1 : θ > 0 is

    π(θ | z = 2.06, H1) ∝ π(θ) f(2.06 | θ) = (0.413) e^{−(2.06−θ)²/2}  for 0 < θ < 6.46.
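A sketch of the posterior-probability calculation with f(z | θ) = N(θ, 1); the constants behind the slide's 0.33 are not fully reproducible here, so the number this produces is illustrative rather than a reproduction:

```python
from math import exp, pi, sqrt

def phi(x):
    """Standard normal density."""
    return exp(-x * x / 2.0) / sqrt(2.0 * pi)

def posterior_null_prob(z, C, prior_h0=0.5, n=20000):
    """Pr(H0 | z) with f(z | theta) = N(theta, 1) and pi(theta) = Uniform(0, C) on H1,
    averaging the H1 likelihood by a midpoint Riemann sum."""
    avg_lik_h1 = sum(phi(z - (k + 0.5) * C / n) for k in range(n)) / n
    num = prior_h0 * phi(z)
    return num / (num + (1.0 - prior_h0) * avg_lik_h1)

p0 = posterior_null_prob(2.06, 6.46)
```

Whatever the exact prior width, the resulting Pr(H0 | z) is several times larger than the p-value of 0.04, which is the point of the comparison on this slide.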

SLIDE 16

[Figure: the posterior density for θ over (−2, 6), with a vertical bar of height 0.337 = Pr(H0 | z = 2.06) at θ = 0.]

SLIDE 17

Robust Bayes: Report the Bayes factor (the odds of H0 to H1) as a function of π_C(θ) ≡ Uniform(0, C):

    B01(C) = [likelihood of H0 for observed data] / [average likelihood of H1]
           = (1/√(2π)) e^{−(2.06)²/2} / ∫₀^C (1/√(2π)) e^{−(2.06−θ)²/2} C^{−1} dθ.

[Figure: B01(C) plotted against C for 1 ≤ C ≤ 6.]

Note: min_C B01(C) = 0.265 (while B01(6.46) = 0.51).
Note: The robustness analysis applies to all nonincreasing priors.

SLIDE 18

Incorporating information from multiple tests: To adjust for the two previous similar failed trials, the (exchangeable) Bayesian solution

  • assigns each trial common unknown probability p of success, with p having a uniform distribution;
  • computes the resulting posterior probability that the current trial exhibits no efficacy:

    Pr(H0 | x1, x2, x3) = ( 1 + [B01(x1)B01(x2) + B01(x1) + B01(x2) + 3] / [3 B01(x1)B01(x2) + B01(x1) + B01(x2) + 1] × 1/B01(x3) )^{−1},

where B01(xi) is the Bayes factor of "no effect" to "effect" for trial i. The result is Pr(H0 | x1, x2, x3) = 0.54.
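The combination rule is easy to package as a function. The Bayes-factor inputs below are hypothetical stand-ins (the slide's 0.54 comes from the actual Bayes factors of the three trials):

```python
def combined_null_prob(b1, b2, b3):
    """Pr(H0 for the current trial | all three trials), with each trial's null
    indicator i.i.d. Bernoulli given a common p ~ Uniform(0, 1);
    b_i is the Bayes factor B01 of 'no effect' to 'effect' for trial i."""
    ratio = (b1 * b2 + b1 + b2 + 3.0) / (3.0 * b1 * b2 + b1 + b2 + 1.0)
    return 1.0 / (1.0 + ratio / b3)

# hypothetical inputs: two earlier 'failed' trials (large B01), current trial B01 near 0.5
p_h0 = combined_null_prob(10.0, 10.0, 0.51)
```

Two sanity checks: with b1 = b2 = 1 (uninformative earlier trials) the formula collapses to the single-trial answer b3/(1 + b3), and strongly null earlier trials push Pr(H0) above the single-trial value.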

SLIDE 19

General Approach to Bayesian Multiplicity Adjustment

1. Represent the problem as a model uncertainty problem: models Mi, with densities fi(x | θi) for data x, given unknown parameters θi; prior distributions πi(θi); and marginal likelihoods mi(x) = ∫ fi(x | θi) πi(θi) dθi.

2. Specify prior probabilities P(Mi) of models to reflect the multiplicity issues; Bayesian analysis controls multiplicity through the P(Mi).ᵃ

   • Subjective Bayesian analysis: If the P(Mi) are real subjective probabilities, that's it: multiplicity correction has been done.
   • Objective Bayesian analysis: One has to be careful to make choices of the P(Mi) that ensure multiplicity correction (e.g., specifying equal prior probabilities does not generally control multiplicity)!

3. Implement Bayesian model averaging (model selection?), based on

    P(Mi | x) = P(Mi) mi(x) / Σ_{j=1}^k P(Mj) mj(x).

ᵃ See, e.g., Jeffreys 1961; Waller and Duncan 1969; Meng and Dempster 1987; Berry 1988; Westfall, Johnson and Utts 1997; Carlin and Louis 2000.

SLIDE 20

Choice of transformation/model

Bayesian solution: model averaging.

  • Assign each model/transformation a prior probability.
  • Compute model/transformation posterior probabilities.
  • Perform inference with weighted averages over the models/transformations. (An overwhelmingly supported model/transformation will receive weight near one.)

SLIDE 21

Example: From i.i.d. vehicle emission data X = (X1, . . . , Xn), one desires to determine the probability that the vehicle type will meet regulatory standards. Traditional models for this type of data are Weibull and lognormal distributions, given respectively by

    M1 : fW(x; β, γ) = (γ/β) (x/β)^{γ−1} exp[ −(x/β)^γ ]

    M2 : fL(x; µ, σ²) = (1/(x √(2πσ²))) exp[ −(log x − µ)²/(2σ²) ].

Note that both distributions are in the location-scale family (the Weibull being so after a log transformation).

SLIDE 22

Model Averaging Analysis:

  • Assign each model prior probability 1/2.
  • Because of the common location-scale invariance structures, assign the right-Haar prior densities πW(β, γ) = 1/(βγ) and πL(µ, σ) = 1/σ, respectively (Berger, Pericchi and Varshavsky, 1998 Sankhyā).
  • The posterior probabilities (and conditional frequentist error probabilities) of the two models are then

    P(M1 | x) = 1 − P(M2 | x) = B(x)/(1 + B(x)),

    where zi = log xi, z̄ = (1/n) Σᵢ zi, s_z² = (1/n) Σᵢ (zi − z̄)², and

    B(x) = [ 2 Γ(n) n^{n/2} π^{(n−1)/2} / Γ((n−1)/2) ] ∫₀^∞ y^{n−2} [ Σ_{i=1}^n exp( (zi − z̄) y / s_z ) ]^{−n} dy.

  • For the studied data set, P(M1 | x) = .712. Hence,

    P(meeting standard) = .712 P(meeting standard | M1) + .288 P(meeting standard | M2).
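The Bayes factor reduces to a one-dimensional quadrature. The closed form coded below was re-derived from the two right-Haar marginal likelihoods to match the structure of the slide's expression, so its leading constant is an assumption, and the data vector is hypothetical (the studied emissions data set is not reproduced in the talk):

```python
import math

def weibull_vs_lognormal_B(x):
    """B(x) = m_Weibull(x) / m_lognormal(x) under right-Haar priors 1/(beta*gamma)
    and 1/sigma; depends on the data only through c_i = (z_i - zbar)/s_z."""
    n = len(x)
    z = [math.log(v) for v in x]
    zbar = sum(z) / n
    s_z = math.sqrt(sum((zi - zbar) ** 2 for zi in z) / n)
    c = [(zi - zbar) / s_z for zi in z]

    def log_integrand(y):
        # log of y^(n-2) [sum_i exp(c_i y)]^(-n), via log-sum-exp for stability
        m_ = max(ci * y for ci in c)
        lse = m_ + math.log(sum(math.exp(ci * y - m_) for ci in c))
        return (n - 2) * math.log(y) - n * lse

    upper, steps = 60.0, 60000  # integrand decays like exp(-n * max(c_i) * y)
    h = upper / steps
    integral = sum(math.exp(log_integrand((k + 0.5) * h)) for k in range(steps)) * h
    const = (2.0 * math.gamma(n) * n ** (n / 2.0)
             * math.pi ** ((n - 1) / 2.0) / math.gamma((n - 1) / 2.0))
    return const * integral

x = [0.8, 1.3, 2.1, 0.6, 1.7, 1.1, 0.9, 2.5, 1.4, 1.0]  # hypothetical emissions data
B = weibull_vs_lognormal_B(x)
p_weibull = B / (1.0 + B)  # P(M1 | x)
```

Because B(x) depends only on the standardized logs, it is invariant to rescaling the data, consistent with the invariance structure that justifies the right-Haar priors.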

SLIDE 23

Variable Selection

Problem: Data X arises from a normal linear regression model, with m possible regressors having associated unknown regression coefficients βi, i = 1, . . . , m, and unknown variance σ².

Models: Consider selection from among the submodels Mi, i = 1, . . . , 2^m, having only ki regressors with coefficients βᵢ (a subset of (β1, . . . , βm)) and resulting density fi(x | βi, σ²).

Prior density under Mi: Zellner–Siow priors πi(βi, σ²).

Marginal likelihood of Mi: mi(x) = ∫ fi(x | βi, σ²) πi(βi, σ²) dβi dσ².

Prior probability of Mi: P(Mi).

Posterior probability of Mi: P(Mi | x) = P(Mi) mi(x) / Σⱼ P(Mj) mj(x).

SLIDE 24

Common Choices of the P(Mi)

Equal prior probabilities: P(Mi) = 2^{−m}.

Bayes exchangeable variable inclusion:

  • Each variable, βi, is independently in the model with unknown probability p (called the prior inclusion probability).
  • p has a Beta(p | a, b) distribution. (We use a = b = 1, the uniform distribution, as did Jeffreys 1961, who also suggested alternative choices of the P(Mi). Probably a = b = 1/2 is better.)
  • Then, since ki is the number of variables in model Mi,

    P(Mi) = ∫₀¹ p^{ki} (1 − p)^{m−ki} Beta(p | a, b) dp = Beta(a + ki, b + m − ki) / Beta(a, b).

Empirical Bayes exchangeable variable inclusion: Find the MLE p̂ by maximizing the marginal likelihood of p, Σⱼ p^{kj}(1 − p)^{m−kj} mj(x), and use P(Mi) = p̂^{ki}(1 − p̂)^{m−ki} as the prior model probabilities.
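The P(Mi) formula is easy to compute and to sanity-check: summed over all 2^m models it must give 1. A sketch, with a hypothetical m = 8:

```python
from math import comb, gamma

def beta_fn(a, b):
    """Beta function B(a, b) = Gamma(a) Gamma(b) / Gamma(a + b)."""
    return gamma(a) * gamma(b) / gamma(a + b)

def prior_model_prob(k, m, a=1.0, b=1.0):
    """P(M_i) for a model containing k of the m candidate variables, under
    exchangeable Beta(a, b) variable inclusion."""
    return beta_fn(a + k, b + m - k) / beta_fn(a, b)

m = 8  # hypothetical number of candidate variables
# summing over model sizes, weighted by the number of models of each size
total = sum(comb(m, k) * prior_model_prob(k, m) for k in range(m + 1))
```

With a = b = 1, each model size k receives total prior mass 1/(m + 1), split evenly among the C(m, k) models of that size; models of middling size are therefore individually penalized, which is the source of the multiplicity adjustment.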

SLIDE 25

Controlling for multiplicity in variable selection

Equal prior probabilities: P(Mi) = 2^{−m} does not control for multiplicity here (as it did in the simpler examples); it corresponds to fixed prior inclusion probability p = 1/2 for each variable.

Empirical Bayes exchangeable variable inclusion does control for multiplicity, in that p̂ will be small if there are many βi that are zero.

Bayes exchangeable variable inclusion also controls for multiplicity (see Scott and Berger, 2008), although the P(Mi) are fixed.

Note: The control of multiplicity by Bayes and EB variable inclusion usually reduces model complexity, but is different from the usual Bayesian Ockham's razor effect that reduces model complexity.

  • The Bayesian Ockham's razor operates through the effect of model priors πi(βi, σ²) on mi(x), penalizing models with more parameters.
  • Multiplicity correction occurs through the choice of the P(Mi).

SLIDE 26

Posterior inclusion probabilities as the number of noise variables grows (column headings give the number of noise variables):

|            | Equal model probabilities |      |      |      | Bayes variable inclusion |      |      |      |
| Signal     | 1    | 10   | 40   | 90   | 1    | 10   | 40   | 90   |
| β1 : −1.08 | .999 | .999 | .999 | .999 | .999 | .999 | .999 | .999 |
| β2 : −0.84 | .999 | .999 | .999 | .999 | .999 | .999 | .999 | .988 |
| β3 : −0.74 | .999 | .999 | .999 | .999 | .999 | .999 | .999 | .998 |
| β4 : −0.51 | .977 | .977 | .999 | .999 | .991 | .948 | .710 | .345 |
| β5 : −0.30 | .292 | .289 | .288 | .127 | .552 | .248 | .041 | .008 |
| β6 : +0.07 | .259 | .286 | .055 | .008 | .519 | .251 | .039 | .011 |
| β7 : +0.18 | .219 | .248 | .244 | .275 | .455 | .216 | .033 | .009 |
| β8 : +0.35 | .773 | .771 | .994 | .999 | .896 | .686 | .307 | .057 |
| β9 : +0.41 | .927 | .912 | .999 | .999 | .969 | .861 | .567 | .222 |
| β10 : +0.63 | .995 | .995 | .999 | .999 | .996 | .990 | .921 | .734 |

False positives: 2 5 10 1

Table 2: Posterior inclusion probabilities for 10 real variables in a simulated data set.

SLIDE 27

Comparison of Bayes and Empirical Bayes Approaches

Theorem 1. In the variable-selection problem, if the null model (or full model) has the largest marginal likelihood m(x) among all models, then the MLE of p is p̂ = 0 (or p̂ = 1). (The naive EB approach, which assigns P(Mi) = p̂^{ki}(1 − p̂)^{m−ki}, concludes that the null (full) model has probability 1.)

A simulation with 10,000 repetitions to gauge the severity of the problem:

  • m = 14 covariates, orthogonal design matrix;
  • p drawn from U(0, 1); regression coefficients are 0 with probability p and drawn from a Zellner–Siow prior with probability (1 − p);
  • n = 16, 60, and 120 observations drawn from the given regression model.

| Case    | p̂ = 0 | p̂ = 1 |
| n = 16  | 820   | 781   |
| n = 60  | 783   | 766   |
| n = 120 | 723   | 747   |

SLIDE 28

Is empirical Bayes at least accurate asymptotically as m → ∞?

Posterior model probabilities, given p:

    P(Mi | x, p) = p^{ki}(1 − p)^{m−ki} mi(x) / Σⱼ p^{kj}(1 − p)^{m−kj} mj(x)

Posterior distribution of p:

    π(p | x) = K Σⱼ p^{kj}(1 − p)^{m−kj} mj(x)

This does concentrate about the true p as m → ∞, so one might expect that

    P(Mi | x) = ∫₀¹ P(Mi | x, p) π(p | x) dp ≈ P(Mi | x, p̂) ∝ mi(x) p̂^{ki}(1 − p̂)^{m−ki}.

This is not necessarily true; indeed,

    ∫₀¹ P(Mi | x, p) π(p | x) dp = ∫₀¹ [ p^{ki}(1 − p)^{m−ki} mi(x) / (π(p | x)/K) ] π(p | x) dp ∝ mi(x) ∫₀¹ p^{ki}(1 − p)^{m−ki} dp ∝ mi(x) P(Mi).

Caveat: Some EB techniques have been justified; see Efron and Tibshirani (2001), Johnstone and Silverman (2004), Cui and George (2006), and Bogdan et al. (2008).

SLIDE 29
Theorem 2. Suppose the true model size kT satisfies kT/m → pT as m → ∞, where 0 < pT < 1. Consider all models Mi such that kT − ki = O(√m), and consider the optimal situation for EB, in which p̂ = pT + O(1/√m) as m → ∞. Then the ratio of the prior probabilities assigned to such models by the Bayes approach and the empirical Bayes approach satisfies

    P^B(Mi) / P^{EB}(Mi) = ∫₀¹ p^{ki}(1 − p)^{m−ki} π(p) dp / [ p̂^{ki}(1 − p̂)^{m−ki} ] = O(1/√m),

provided π(·) is continuous and nonzero.

SLIDE 30

Subgroup Analysis

SLIDE 31

SLIDE 32

Frequentist adjustment for performing 26 hypothesis tests

  • Split the data into one part to suggest a subgroup and another part to confirm (or confirm with a new experiment).
  • Bonferroni correction:
    – To achieve an overall error probability level of 0.05 when conducting 26 tests, one would need to use a per-test rejection level of α = 0.05/26 ≈ 0.002.
    – This is likely much too conservative because of the dependence in the 26 tests.
  • Various bootstrap types of correction to try to account for dependence.
SLIDE 33

Bayesian adjustment

Let v be the vector of 25 zeroes and ones indicating subgroup characteristics. For each possible such vector, let µv denote the mean of the intersected subgroup (e.g., young, male, diabetic, non-smoker, . . .).

Data: x ∼ f(x | {µv, all possible v}).

Two classes of approaches:

  • Factor-based approaches
  • Aggregation-based approaches

SLIDE 34

An example factor-based approach

Model the intersected subgroup means additively as

    µv = µ + vβ,  β = (β1, . . . , β25)′,

where µ is an overall mean and βi is the effect corresponding to the ith subgroup factor.

Conversion to model selection:

  • Let γ = (γ0, γ∗) = (γ0, γ1, . . . , γ25) be the vector of zeroes and ones indicating whether µ (corresponding to γ0) and each factor βi is zero or not.
  • This defines the model Mγ.

SLIDE 35

A reasonable objective choice of prior model probabilities:

  • P(γ0 = 0) = P(µ = 0) = 3/4.
  • Independently, P(γ∗ = 0) = 2/3, and each γ∗ ≠ 0 has probability

    P(γ∗) = (26/75) × Beta(1 + r, 1 + 25 − r) / Beta(1, 1),  where r = # zeroes in γ∗.

  • Note that then
    – P(no effect) = P(µ = 0, γ∗ = 0) = 1/2
    – P(µ ≠ 0, γ∗ = 0) = 1/6
    – P(µ = 0, γ∗ ≠ 0) = 1/4
    – P(µ ≠ 0, γ∗ ≠ 0) = 1/12
    – P(γi ≠ 0) = 13/75

The experimenter could (pre-experimentally) make different choices here, as long as P(no effect) is kept at 1/2. Post-experimentally, one would need to utilize an objective choice such as the above.
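The stated probabilities can be verified numerically from the P(γ∗) formula; by the symmetry Beta(1 + r, 26 − r) = Beta(26 − r, 1 + r), it does not matter whether r counts the zeroes or the ones of γ∗. A sketch:

```python
from math import comb, gamma

def beta_fn(a, b):
    """Beta function B(a, b) = Gamma(a) Gamma(b) / Gamma(a + b)."""
    return gamma(a) * gamma(b) / gamma(a + b)

m = 25  # number of subgroup factors

def p_gamma_star(r):
    """Prior mass of one nonzero gamma* vector with r ones (equivalently r zeroes)."""
    return (26.0 / 75.0) * beta_fn(1 + r, 1 + m - r) / beta_fn(1, 1)

# total mass over all nonzero gamma*; should equal P(gamma* != 0) = 1/3
total_nonzero = sum(comb(m, r) * p_gamma_star(r) for r in range(1, m + 1))

p_no_effect = (3.0 / 4.0) * (2.0 / 3.0)  # P(mu = 0) * P(gamma* = 0)

# P(gamma_i != 0): sum over the nonzero gamma* vectors that include factor i
p_factor_i = sum(comb(m - 1, r - 1) * p_gamma_star(r) for r in range(1, m + 1))
```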

SLIDE 36

Possible Bayesian outputs of interest:

  • P(effect of factor i ≠ 0 | x) = Σ_{γ : γi = 1} P(Mγ | x).
  • P(effect in subgroup i ≠ 0 | x) = Σ_{γ : γ0 = 1 or γi = 1} P(Mγ | x).
  • P(a constant effect ≠ 0 | x) = P(M(1,0) | x).

Of course, posterior densities for all effects, conditional on their being nonzero, are also available.

SLIDE 37

Aggregation-based approaches

Basic idea: Recall that for every intersected subgroup (e.g., young, male, diabetic, non-smoker, . . .) there is an unknown mean µv. Plausible models involve aggregation of these means into common effects, e.g., µv1 = µv2. There are a number of ways to aggregate means, including:

  • Product partition models (Hartigan and Berry)
  • Dirichlet process models (Gopalan and Berry use for multiplicity control)
  • Generalized partition models
  • Species sampling models
  • Tree-based models (our current favorite)

Surmountable problem: Any of these aggregate means could be zero; with some work, this can typically be handled by adding "zero" to the list.

Harder problem: Not all (not even most) aggregations are sensible (e.g., µF1G1 = µF2G2 ≠ µF1G2 = µF2G1 versus µF1G1 = µF2G1 ≠ µF1G2 = µF2G2).

SLIDE 38

Summary

  • Developing methods for controlling for multiplicity is a dramatically increasing need in science.
  • Approaching multiplicity control from the Bayesian perspective has the attractions that
    – there is a single approach that can be applied in any situation;
    – since multiplicity is controlled solely through prior probabilities of models, it does not depend on the error structure of the model;
    – there is flexibility in the assignment of prior probabilities to hypotheses, from pure objective assignments to (pre-experimental) subjective assignments favoring scientifically preferred hypotheses;
    – objective Bayesian control can even be implemented retroactively.
  • Associated empirical Bayes analysis exhibits multiplicity control, but cannot be assumed to be an approximation to the Bayesian analysis.
  • Bayesian implementation of subgroup analysis is promising.

SLIDE 39

Thanks!
