Which Multiple Testing Methods are Optimal? Peter H. Westfall, Texas Tech University

Background
• The scientific literature has recently experienced an embarrassment of contradictory results:
  • Ioannidis, J.P. (2005), "Contradicted and Initially Stronger Effects in Highly Cited Clinical Research," J. Amer. Med. Assoc. 294, 218--228.
  • Bertram, L., McQueen, M. B., Mullin, K., Blacker, D., and Tanzi, R. E. (2007), "Systematic Meta-analyses of Alzheimer Disease Genetic Association Studies: the AlzGene Database," Nature Genetics 39, 17--23.
  • Boffetta, P., McLaughlin, J.K., La Vecchia, C., Tarone, R.E., Lipworth, L., Blot, W. J. (2008), "False-Positive Results in Cancer Epidemiology: A Plea for Epistemological Modesty," J. Nat. Cancer Inst. 100, 988--995.

Goals
• Compare fixed critical value methods in terms of loss.
Q: Does m matter? Do data correlations matter?
A: It depends on how you feel about Type I versus Type II errors (i.e., their relative costs).

Background • “Lehmann (1957a,b) was the first to consider multiple comparisons from a decision-theoretic viewpoint.” – Hochberg and Tamhane (1987), Multiple Comparisons Procedures (Wiley)

Data Setup of this Talk
Data: $z \mid \theta \sim N_m(\theta, \rho)$, with $\rho$ a correlation matrix.
Model: $\theta_i \sim$ iid $N(0, \sigma^2)$, with $\sigma^2$ known.

Decision Theory
• Lehmann (1957a,b), Annals
• Hochberg and Tamhane (1987)
• Three-decision problem: decide either
  – GT: $\theta_i > 0$,
  – LT: $\theta_i < 0$, or
  – NI: $\theta_i \approx 0$ (or “EM”)

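A Component Loss Function
• $L_{GT}(\theta)$, $L_{LT}(\theta)$, $L_{NI}(\theta)$; for example:
[Figure: example component losses L_NI and L_GT plotted as Loss against $\theta$ (horizontal axis from −1 to 1), with a level A marked on the loss axis.]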

Actual and Expected Loss
• Actual loss using method “M”: $L_i^{(M)}(\theta_i, z) = I_{GT}(i \mid z)\, L_{GT}(\theta_i) + I_{LT}(i \mid z)\, L_{LT}(\theta_i) + I_{NI}(i \mid z)\, L_{NI}(\theta_i)$
• Expected Loss: $\Psi_i^{(M)} = E_{\theta_i, z}\{ L_i^{(M)}(\theta_i, z) \}$
• Combined Loss: $\Psi^{(M)} = \sum_i \Psi_i^{(M)}$ (additive!?)

Decision Rules
• Decide
  – LT if $z_i < -c$
  – GT if $z_i > c$
  – NI if $-c \le z_i \le c$
• If $\rho = I$, then $c = (1 + 1/\sigma^2)\, z_{1-A}$ is optimal.
⇒ For Bonferroni-like procedures to be optimal, A must depend on m: $A = A(m)$.
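The three-way rule above is simple enough to state as code. The following is a minimal sketch, not the author's implementation; the function name `decide`, the sample z-values, and the critical value `c = 2.0` are all hypothetical and chosen only for illustration.

```python
def decide(z_i: float, c: float) -> str:
    """Three-decision rule with a fixed critical value c > 0.

    Returns "LT" if z_i < -c, "GT" if z_i > c, and "NI" otherwise
    (the boundary values -c and c fall in "NI").
    """
    if z_i < -c:
        return "LT"
    if z_i > c:
        return "GT"
    return "NI"


# Hypothetical data and critical value, for illustration only.
z = [-3.1, -0.4, 0.0, 1.9, 2.7]
c = 2.0
decisions = [decide(z_i, c) for z_i in z]
print(decisions)
```

Every fixed critical value method compared in the talk has this form; the methods differ only in how c is chosen.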

Does m Matter?
• Theorem: If $A(m) = o(1)$ and $1/A(m) = o(m\{\ln(m)\}^{1/2})$, then $\Psi^{(Bon)} \sim \Psi^{(Optimal)}$.
⇒ If the loss of a single Type I error equals that of $\beta m$ Type II errors ($0 < \beta < 1$), then Bonferroni is optimal and fixed significance level procedures (like FDR) are inadmissible.
Lu, Y., and Westfall, P. (2009). Is Bonferroni Admissible for Large m? American Journal of Mathematical and Management Sciences, Vol. 29 (1&2), 51-69.
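To make the dependence on m concrete, the sketch below tabulates the usual two-sided Bonferroni critical value, $c = z_{1-\alpha/(2m)}$, for a few values of m. The familywise level `alpha = 0.05` and the function name `bonferroni_c` are assumptions made for illustration; they do not come from the slides. If A(m) is read as the per-comparison tail level $\alpha/(2m)$, then A(m) goes to 0 and 1/A(m) grows linearly in m, which is within the rate the theorem allows.

```python
from statistics import NormalDist


def bonferroni_c(m: int, alpha: float = 0.05) -> float:
    """Two-sided Bonferroni critical value: reject H_i when |z_i| > c."""
    if m < 1:
        raise ValueError("m must be a positive integer")
    return NormalDist().inv_cdf(1.0 - alpha / (2.0 * m))


# alpha = 0.05 is an assumed familywise level, not a value from the talk.
for m in (1, 10, 100, 1000, 10000):
    print(f"m = {m:>5d}   c = {bonferroni_c(m):.3f}")
```

The critical value rises slowly with m, roughly like $\{2\ln(m)\}^{1/2}$ for large m, in contrast to a fixed per-test level, which keeps c constant.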

From Lu, Y., and Westfall, P. (2009). Is Bonferroni Admissible for Large m?

Do Data Correlations Matter?
“Reject $H_i$” if $|z_i| > c$, $i = 1, \ldots, m$. Let $V$ = number of false discoveries.
With higher correlations among the $z$’s:
• $E(V)$ is unaffected
• $P(V > 0)$ is lower (smaller FWER)
• $\mathrm{Var}(V)$ is higher (potentially high number of false discoveries)
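These three claims can be checked with a small simulation. The sketch below is an illustration under simplifying assumptions that are not from the slides: every $\theta_i = 0$ (so each rejection is a false discovery), the z's are equicorrelated with common correlation r through a single shared factor, and c is the unadjusted two-sided 5% critical value. The names `simulate_v` and `results` are hypothetical.

```python
import random
from statistics import NormalDist, pvariance


def simulate_v(m, r, c, n_rep, seed=0):
    """Simulate V = #{i : |z_i| > c} when every theta_i = 0.

    z_i = sqrt(r) * w + sqrt(1 - r) * e_i has unit variance and common
    pairwise correlation r (an assumed equicorrelated structure).
    """
    rng = random.Random(seed)
    a, b = r ** 0.5, (1.0 - r) ** 0.5
    counts = []
    for _ in range(n_rep):
        w = rng.gauss(0.0, 1.0)  # factor shared by all m statistics
        v = sum(abs(a * w + b * rng.gauss(0.0, 1.0)) > c for _ in range(m))
        counts.append(v)
    return counts


m = 50
c = NormalDist().inv_cdf(0.975)  # assumed: unadjusted two-sided 5% level
results = {}
for r in (0.0, 0.5, 0.9):
    v = simulate_v(m, r, c, n_rep=4000)
    results[r] = {
        "mean_V": sum(v) / len(v),                   # estimates E(V)
        "p_any": sum(x > 0 for x in v) / len(v),     # estimates P(V > 0)
        "var_V": pvariance(v),                       # estimates Var(V)
    }
    print(r, results[r])
```

Under these assumptions the estimated mean of V should stay near m × 0.05 = 2.5 for every r, while the estimated P(V > 0) falls and the estimated Var(V) grows as r increases.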

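Effect of Correlation with Additive Loss
• No effect on the expected value of the loss ⇒ the c that minimizes expected loss is not affected
• Affects the percentiles of the loss ⇒ the c that minimizes a percentile of the loss is affected
VaR = “Value at Risk” = 95th percentile of the loss (a term from finance)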

A Model for Studying Effect of Correlation
Suppose $z \mid \theta \sim N_m(\theta, \rho)$, with $\rho = \lambda\lambda' + \psi^2$, $\lambda$ ($m \times 1$) and $\psi^2$ diagonal. Then $\rho_{ij} = \lambda_i \lambda_j$ for $i \ne j$.
Let $\lambda_i = U_i / (U_i^2 + s^2)^{1/2}$, where $U_i \sim$ iid $U(-1, 1)$.
Then $E(\rho_{ij}) = 0$ and $\mathrm{rmsc} \equiv \{E(\rho_{ij}^2)\}^{1/2} = 1 - s \tan^{-1}(1/s)$.
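The closed form for the root mean squared correlation, rmsc = 1 − s·arctan(1/s) with loadings λ_i = U_i/(U_i² + s²)^{1/2} and ρ_ij = λ_i λ_j, can be checked by Monte Carlo. The sketch below is only a numerical sanity check; the function names and the value `s = 0.5` are hypothetical choices for illustration.

```python
import math
import random


def rmsc_closed_form(s: float) -> float:
    """rmsc = 1 - s * arctan(1/s), the closed form stated for this model."""
    return 1.0 - s * math.atan(1.0 / s)


def rmsc_monte_carlo(s: float, n_pairs: int = 200_000, seed: int = 0) -> float:
    """Estimate {E(rho_ij^2)}^(1/2) by drawing independent loading pairs."""
    rng = random.Random(seed)

    def draw_lambda() -> float:
        u = rng.uniform(-1.0, 1.0)
        return u / math.sqrt(u * u + s * s)

    total = 0.0
    for _ in range(n_pairs):
        rho_ij = draw_lambda() * draw_lambda()  # off-diagonal correlation
        total += rho_ij * rho_ij
    return math.sqrt(total / n_pairs)


s = 0.5  # hypothetical value; smaller s gives stronger correlations
closed = rmsc_closed_form(s)
estimate = rmsc_monte_carlo(s)
print(f"closed form: {closed:.4f}   Monte Carlo: {estimate:.4f}")
```

The agreement follows from $E(\lambda_i^2) = \int_0^1 u^2/(u^2 + s^2)\,du = 1 - s\tan^{-1}(1/s)$ together with the independence of $\lambda_i$ and $\lambda_j$, so s acts as a single knob for the overall level of correlation.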

Waller-Duncan Loss
$L_{GT}(\theta) = -(K+1)\theta$ for $\theta < 0$; $L_{GT}(\theta) = 0$ otherwise.
$L_{NI}(\theta) = |\theta|$.
[Figure: Loss(NI) and Loss(GT) plotted against $\theta$, with $\theta = 0$ marked.]
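The component losses above translate directly into code. In the sketch below, `loss_gt` and `loss_ni` follow the formulas on the slide; `loss_lt` is not given on the slide and is assumed here to be the mirror image of `loss_gt`. All function names are hypothetical.

```python
def loss_gt(theta: float, K: float) -> float:
    """Loss for deciding GT: -(K + 1) * theta when theta < 0, else 0."""
    return -(K + 1.0) * theta if theta < 0 else 0.0


def loss_lt(theta: float, K: float) -> float:
    """Loss for deciding LT. ASSUMPTION: mirror image of loss_gt."""
    return (K + 1.0) * theta if theta > 0 else 0.0


def loss_ni(theta: float) -> float:
    """Loss for deciding NI: |theta|."""
    return abs(theta)


def component_loss(decision: str, theta: float, K: float) -> float:
    """Loss incurred by one decision ("GT", "LT" or "NI") at a given theta."""
    if decision == "GT":
        return loss_gt(theta, K)
    if decision == "LT":
        return loss_lt(theta, K)
    if decision == "NI":
        return loss_ni(theta)
    raise ValueError(f"unknown decision: {decision!r}")


K = 100.0
for theta in (-0.5, 0.0, 0.5):
    print(theta, [component_loss(d, theta, K) for d in ("LT", "NI", "GT")])
```

With this loss, a wrong directional claim at θ costs (K + 1)|θ|, against |θ| for a “not interesting” call, so K is the extra cost of a directional error relative to a missed effect of the same size.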

90th Percentile-Minimizing Optimal c, K = 100

Should Loss Be Additive?
• Is the cost difference between 10 and 11 Extraterrestrial Intelligence claims the same as the cost difference between 0 and 1?
• Is the cost difference between 10 and 11 shouts of “fire” in a crowded theater the same as the cost difference between 0 and 1?

‘Fire-In-The-Theater’ Loss Function
Let $n_1$ = # Directional Errors
Let $n_2$ = # “Not Interesting” claims
$L_1 = n_1/(n_1 + 1)$
$L_2 = 1/(m - n_2 + 1) - 1/(m + 1)$
“Fire in the Theater” Loss = $L_1 + L_2$
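A short sketch makes the non-additivity of this loss visible. The function name `fire_in_theater_loss` and the input checks are hypothetical additions; the two formulas are the ones stated above, and m = 100 is used only as an example value.

```python
def fire_in_theater_loss(n1: int, n2: int, m: int):
    """Return (L1, L2, L1 + L2) for n1 directional errors and
    n2 "not interesting" claims out of m comparisons."""
    if n1 < 0 or n2 < 0 or n1 + n2 > m:
        raise ValueError("need n1 >= 0, n2 >= 0 and n1 + n2 <= m")
    l1 = n1 / (n1 + 1)
    l2 = 1 / (m - n2 + 1) - 1 / (m + 1)
    return l1, l2, l1 + l2


m = 100
# Cost of the first directional error versus the eleventh.
step_first = fire_in_theater_loss(1, 0, m)[0] - fire_in_theater_loss(0, 0, m)[0]
step_late = fire_in_theater_loss(11, 0, m)[0] - fire_in_theater_loss(10, 0, m)[0]
print(f"L1 step 0 -> 1:   {step_first:.4f}")
print(f"L1 step 10 -> 11: {step_late:.4f}")

# L2 stays near 0 until almost every comparison is declared "not interesting".
for n2 in (0, 50, 90, 99, 100):
    print(n2, round(fire_in_theater_loss(0, n2, m)[1], 4))
```

The first directional error costs 0.5 while the eleventh adds under 0.01, which is the “first shout of fire” behavior; L2 works the other way, penalizing heavily only when nearly all m comparisons are declared “not interesting”.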

Fire-In-The-Theater Loss Function Components, m = 100

Expected Value-Minimizing Optimal c for Fire-In-The-Theater Loss Function

Conclusions
If Type I errors are serious, then:
1. m matters: a larger c is needed with larger m.
2. Data correlation matters: a smaller c is allowed with higher data correlation.
