
On the Optimum Number of Hypotheses to Test when the Number of Observations is Limited
A. Futschik and M. Posch
Vienna University & Medical Univ. of Vienna

A Main Goal in Statistics
Extract as much information as possible from a limited number of observations.
In the context of multiple hypothesis testing: reject (correctly!) as many null hypotheses as possible while still ensuring some global control of the type I error.
Much work has been done to derive multiple test procedures that achieve this goal!
We address the issue from a different point of view than usual ...

Our Framework
Consider a situation where ...
- multiple hypotheses are to be tested
- there is control at the design stage concerning how many hypotheses will be tested
- the overall number of observations is limited by some constant $m$
- there is control at the design stage concerning the allocation of the observations among the hypotheses to be tested


Some Applications
- Clinical trials with subgroups defined by age, treatment etc.
- Crop variety selection
- Microarrays
- Discrete event systems

Our Goal
Given
- a maximum overall number of observations,
- a certain multiple test procedure,
maximize (in the number $k$ of considered hypotheses) the expected number of correct rejections.

Outline
- Framework of optimization problem
- Optimization w.r.t. a reference alternative
- Optimum number of hypotheses when controlling the family-wise error (Bonferroni, Bonferroni–Holm, Dunnett)
- Optimum number of hypotheses when controlling the false discovery rate (Benjamini–Hochberg)
- Optimization w.r.t. a composite alternative
- Classification Procedures

The Optimization Problem
A total of $m$ observations and $K$ potential hypothesis pairs are available.
Focus on hypotheses of the type $H_{0,i}: \theta_i = 0$ vs. $H_{1,i}: \theta_i > 0$ $(1 \le i \le K)$.
If $k$ hypothesis pairs are selected at random, $m/k$ observations are available for each hypothesis pair (up to round-off differences).
Choose $k$ to maximize the expected number of correct rejections $E N_k$.

General Observations
If no correction for multiplicity is applied, choosing $k$ as large as possible is often optimal.
With correction for multiplicity, there is usually a unique optimum $k$.

Bonferroni Tests
Define
$$\Delta_m := \frac{\theta^{(1)}\sqrt{m}}{\sigma}.$$
Then, for normally distributed data ($N(0, \sigma^2)$ under the null hypothesis) and one-sided Bonferroni z-tests:
$$E(N_k) = q\,k\left[1 - \Phi_{(\Delta_m/\sqrt{k},\,1)}\left(z_{\alpha/k}\right)\right],$$
where $q$ is the expected proportion of incorrect null hypotheses, $\Phi_{(\mu,1)}$ is the c.d.f. of $N(\mu, 1)$, and $z_\gamma$ is the $1-\gamma$ quantile of the standard normal distribution.
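The formula can be explored numerically. The following is a minimal sketch, not the authors' code: it reads the formula as $E(N_k) = q\,k\,[1 - \Phi(z_{\alpha/k} - \Delta_m/\sqrt{k})]$ and finds the best $k$ by brute force. The function names and the search bound `k_max` are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal N(0, 1)


def power_bonferroni(k, delta_m, alpha):
    """Power of a single one-sided Bonferroni z-test when k hypotheses share m observations."""
    z_crit = -STD_NORMAL.inv_cdf(alpha / k)  # upper alpha/k quantile z_{alpha/k}
    return 1.0 - STD_NORMAL.cdf(z_crit - delta_m / sqrt(k))


def expected_rejections(k, delta_m, alpha, q):
    """E(N_k) = q * k * power."""
    return q * k * power_bonferroni(k, delta_m, alpha)


def best_k(delta_m, alpha, q, k_max):
    """Brute-force search over k = 1..k_max (k_max is a hypothetical search bound)."""
    return max(range(1, k_max + 1),
               key=lambda k: expected_rejections(k, delta_m, alpha, q))


# Illustrative values: theta = sigma = 1 and m = 25 give delta_m = 5.
delta_m = 1.0 * sqrt(25) / 1.0
k_opt = best_k(delta_m, alpha=0.01, q=0.5, k_max=25)
print(k_opt, round(power_bonferroni(k_opt, delta_m, 0.01), 2))
```

For $\Delta_m = 5$ and $\alpha = 0.01$ this search should select $k = 3$ with an individual power of roughly 57%. Note that $q$ only rescales $E(N_k)$ and does not move the maximizer.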

Example: Bonferroni z- and t-tests
[Figure: $E(N_k)$ plotted against $k$ (0 to 30000) for the z-test and the t-test.]
The expected number of correctly rejected null hypotheses for given $k$ and the parameters $m = 100000$, $q = 0.01$, $\alpha = 0.05$, and $\theta = \sigma$ under $H_1$.

Optimum Number of Hypotheses
Theorem: Define
$$k_m := \frac{\Delta_m^2}{2\log(\Delta_m^2)}.$$
Then, as $m \to \infty$, the optimum number of hypotheses to test is
$$k_m^* = k_m\,[1 + o(1)],$$
with the remainder term being negative.
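As a quick illustration of the theorem, the sketch below evaluates the first-order approximation, read here as $k_m = \Delta_m^2 / (2\log \Delta_m^2)$. The function name is an assumption made for illustration.

```python
from math import log


def asymptotic_k(delta_m):
    """First-order approximation k_m = delta_m^2 / (2 * log(delta_m^2))."""
    d2 = delta_m ** 2
    if d2 <= 1.0:
        raise ValueError("delta_m^2 must exceed 1 so that log(delta_m^2) > 0")
    return d2 / (2.0 * log(d2))


for delta_m in (5, 10, 20, 50, 100, 1000):
    print(delta_m, round(asymptotic_k(delta_m), 1))
```

Because the remainder term is negative, the exact optimum is expected to lie below these values for large $\Delta_m$; for moderate $\Delta_m$ the approximation can be rough.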

Numerical Example
The optimum number of hypotheses $k_m^*$ and, in parentheses, the power (in %) to reject an individual incorrect null hypothesis:

α \ Δ_m    5        10       20       50        100       1000
0.01       3 (57)   8 (70)   25 (74)  124 (76)  425 (78)  28908 (82)
0.025      3 (69)   9 (71)   29 (72)  138 (75)  469 (77)  30883 (81)
0.05       4 (60)   11 (66)  33 (70)  152 (74)  508 (76)  32564 (81)

Bonferroni–Holm Tests
[Figure: $E(N_k)$ plotted against $k$ (0 to 60) for the Bonferroni and the Holm procedure.]
Bonferroni vs. Bonferroni–Holm tests: $\theta = 1$, $m = 200$, $\alpha = 0.025$, and $q = 0.5$.

Control of False Discovery Rate
Benjamini–Hochberg:
$$\mathrm{FDR} = E\left(\frac{V}{\max(R, 1)}\right)$$
Asymptotically equivalent problem (see Genovese and Wasserman (2002)):
$$E(N_k) = q\,k\left[1 - \Phi_{(\Delta_m/\sqrt{k},\,1)}(z_u)\right] \to \max_k$$

Benjamini–Hochberg
Theorem: Asymptotically, the optimum solution is
$$k_m^* = \frac{\Delta_m^2}{\left(z_{u^*_\beta} - z_{\beta u^*_\beta}\right)^2},$$
where $u^*_\beta$ maximizes
$$\frac{u}{(z_u - z_{\beta u})^2}.$$
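A numerical sketch of this theorem follows, under stated assumptions. It reads the result as $k_m^* = \Delta_m^2 / (z_{u^*} - z_{\beta u^*})^2$ with $u^*$ maximizing $u / (z_u - z_{\beta u})^2$, and takes $z_p$ to be the $1-p$ quantile of the standard normal. The slides do not define $\beta$, so it is treated here as a given input with $\beta > 1$. The grid search and all function names are illustrative choices, not the authors' method.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def upper_quantile(p):
    """z_p: the (1 - p) quantile of the standard normal distribution."""
    return -STD_NORMAL.inv_cdf(p)


def bh_objective(u, beta):
    """u / (z_u - z_{beta u})^2, the quantity that u*_beta maximizes."""
    gap = upper_quantile(u) - upper_quantile(beta * u)
    return u / gap ** 2


def bh_optimum(delta_m, beta, grid=20000):
    """Grid search for u*_beta on (0, 1/beta); returns (u_star, k_star)."""
    if beta <= 1.0:
        raise ValueError("this sketch assumes beta > 1")
    # beta * u stays strictly inside (0, 1), so both quantiles are defined.
    candidates = (i / ((grid + 1) * beta) for i in range(1, grid + 1))
    u_star = max(candidates, key=lambda u: bh_objective(u, beta))
    gap = upper_quantile(u_star) - upper_quantile(beta * u_star)
    return u_star, (delta_m / gap) ** 2


# beta = 2 and delta_m = 10 are hypothetical values chosen only for illustration.
u_star, k_star = bh_optimum(delta_m=10.0, beta=2.0)
print(u_star, k_star)
```

At the returned $k^*$ we have $\Delta_m/\sqrt{k^*} = z_{u^*} - z_{\beta u^*}$, so the individual power $1 - \Phi(z_{u^*} - \Delta_m/\sqrt{k^*})$ equals $\beta u^*$, which gives a simple consistency check on the output.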

Asymptotic vs. Simulated Objective Function
[Figure: $E(N_k)$ plotted against $k$ (0 to 60) for the Benjamini–Hochberg procedure, asymptotic approximation vs. simulation.]
The parameters: $\theta = 1$, $m = 200$, $\alpha = 0.025$, and $q = 0.5$.

t-Tests I
Bonferroni tests:
$$E N_k^{(t)} = q\,k\left[1 - F^{(t)}_{m/k,\;\Delta_m/\sqrt{k}}\left(t_{\alpha/k,\,m/k}\right)\right],$$
with $F^{(t)}_{\nu,\delta}$ the non-central t c.d.f. with $\nu - 1$ degrees of freedom and noncentrality parameter $\delta$, and $t_{\gamma,\nu}$ the $1-\gamma$ quantile of the standard t-distribution with $\nu - 1$ degrees of freedom.
Benjamini–Hochberg procedure:
$$E N_k^{(t)} = q\,k\left[1 - F^{(t)}_{m/k,\;\Delta_m/\sqrt{k}}\left(t_{u,\,m/k}\right)\right].$$

t-Tests II
Theorem: Let $\theta^{(1)} > 0$, and define $\theta_m = \theta^{(1)}/\sqrt{m}$. Assume that
$$\Delta_m = \theta_m \frac{\sqrt{m}}{\sigma} = \frac{\theta^{(1)}}{\sigma}.$$
Then, for $m \to \infty$, the optimum solution for t-tests converges to that for z-tests.

Possible Rejections for z- and t-Test
[Figure: $E(N_k)$ plotted against $k$ (0 to 20000) for the z-test and the t-test.]
Parameters: $m = 100000$, $q = 0.01$, $\alpha = 0.05$ and $\theta^{(1)}/\sigma = 1$.

Composite Alternatives I
Bonferroni z-tests:
$$E N_k = q\,k \int_0^\infty \left[1 - \Phi\left(z_{\alpha/k} - \frac{\Delta_m(\theta)}{\sqrt{k}}\right)\right] dF(\theta),$$
where $F$ is the conditional c.d.f. of $\theta$ given $\theta > 0$, $q = P(\theta > 0)$, and $\Delta_m(\theta) = \theta\sqrt{m}/\sigma$.

Composite Alternatives II
Theorem: Assume that $F$ is continuous and define
$$k_{m,F} := \frac{m\,d_F^2/\sigma^2}{2\log\left(m\,d_F^2/\sigma^2\right)},$$
where $d_F$ maximizes $d^2\,[1 - F(d)]$. Assuming that $d^2\,(1 - F(d)) \to 0$ as $d \to \infty$, the optimum solution $k^*_{m,F}$ satisfies
$$k^*_{m,F} = k_{m,F}\,(1 + o(1)).$$
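To make the role of $d_F$ concrete, here is a small sketch that locates the maximizer of $d^2[1 - F(d)]$ on a grid and plugs it into $k_{m,F} = (m d_F^2/\sigma^2) / (2\log(m d_F^2/\sigma^2))$. The half-normal choice for $F$ (effect sizes $N(0, \tau^2)$, conditioned on $\theta > 0$), the grid bounds, and all function names are assumptions made for illustration.

```python
from math import log
from statistics import NormalDist


def find_d_f(survival, d_max, steps=20000):
    """Grid search on (0, d_max] for the d that maximizes d^2 * [1 - F(d)]."""
    grid = (d_max * i / steps for i in range(1, steps + 1))
    return max(grid, key=lambda d: d * d * survival(d))


def k_m_f(m, sigma, d_f):
    """k_{m,F} = (m d_F^2 / sigma^2) / (2 log(m d_F^2 / sigma^2))."""
    x = m * d_f ** 2 / sigma ** 2
    return x / (2.0 * log(x))


def half_normal_survival(tau):
    """1 - F(d) for theta given theta > 0, assuming theta ~ N(0, tau^2)."""
    std = NormalDist()
    return lambda d: 2.0 * (1.0 - std.cdf(d / tau))


# tau = 1, sigma = 1 and m = 100000 are illustrative values.
d_f = find_d_f(half_normal_survival(tau=1.0), d_max=10.0)
print(round(d_f, 2), round(k_m_f(m=100000, sigma=1.0, d_f=d_f)))
```

For this assumed half-normal $F$, $d_F$ scales linearly with $\tau$, so a wider spread of effect sizes increases $k_{m,F}$ through the factor $d_F^2$.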

Composite Alternatives III
[Figure: $E(N_k)$ plotted against $k$ (0 to 30000) for the z-test and the t-test.]
Parameters: $m = 100000$, $q = 0.01$, $\alpha = 0.05$. Effect size under the alternative is $N(0, 1.2)$ distributed.

Composite Alternatives IV
A similar result can be obtained for the Benjamini–Hochberg procedure ...

Classification Procedures I
Classification between $\theta = \theta_0$ and $\theta = \theta_1$.
Minimize
$$k\left(w_1\,q\,[1 - g_k(\theta_1)] + w_0\,(1 - q)\,g_k(\theta_0)\right),$$
with $g_k(\theta)$ the probability of deciding for $\theta_1$ under $\theta$.
For fixed $k$, the problem is equivalent to maximizing
$$U(k) = k\left(w_1\,q\,g_k(\theta_1) - w_0\,(1 - q)\,g_k(\theta_0)\right).$$

Classification Procedures II
Theorem: For the Bayes classifier, normal data and $r = w_0(1 - q)/(w_1 q)$: if $r > 1$, then the optimum $k$ satisfies
$$k = \left(\frac{\Delta_m}{x_r}\right)^2,$$
where $x_r$ is the solution of
$$0 = \frac{x\,\varphi[x - c(r, x)]}{2} - \Phi[x - c(r, x)] + r\,\Phi[-c(r, x)],$$
with $c(r, x) = \log(r)/x + x/2$, and
$$\Delta_m = (\theta_1 - \theta_0)\frac{\sqrt{m}}{\sigma}.$$
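The equation for $x_r$ has no closed form, but it is easy to solve numerically. The sketch below uses plain bisection, reading the theorem as $k = (\Delta_m/x_r)^2$ with $0 = x\varphi[x - c]/2 - \Phi[x - c] + r\Phi[-c]$ and $c = \log(r)/x + x/2$. The bracket $[0.5, 10]$, the function names, and the choice $\theta_0 = 0$ in the example are assumptions made for illustration.

```python
from math import log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def threshold(r, x):
    """c(r, x) = log(r) / x + x / 2."""
    return log(r) / x + x / 2.0


def stationarity(r, x):
    """Right-hand side of the equation that x_r solves."""
    c = threshold(r, x)
    return (x * STD_NORMAL.pdf(x - c) / 2.0
            - STD_NORMAL.cdf(x - c)
            + r * STD_NORMAL.cdf(-c))


def solve_x_r(r, lo=0.5, hi=10.0, tol=1e-10):
    """Bisection on [lo, hi]; assumes the bracket contains exactly one sign change."""
    f_lo = stationarity(r, lo)
    if f_lo * stationarity(r, hi) > 0:
        raise ValueError("bracket does not contain a sign change")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if stationarity(r, mid) * f_lo > 0:
            lo = mid  # same sign as the left end: root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2.0


def optimum_k(delta_m, r):
    """k = (delta_m / x_r)^2, left unrounded."""
    return (delta_m / solve_x_r(r)) ** 2


# Illustrative values: m = 100, q = 0.5, w0 = 3, w1 = 1, (theta_1 - theta_0)/sigma = 1/2.
r = 3 * (1 - 0.5) / (1 * 0.5)
delta_m = 0.5 * sqrt(100)
print(round(solve_x_r(r), 3), round(optimum_k(delta_m, r), 1))
```

With these values ($r = 3$, $\Delta_m = 5$) the root should be close to $x_r \approx 1.5$, which puts the unrounded optimum near $k \approx 11$. The default bracket may need adjusting for other values of $r$.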

Objective Function U(k)
[Figure: $U(k)$ plotted against $k$ (0 to 100); legend entries: correct, incorrect.]
Parameters: $m = 100$, $q = 0.5$, $w_0 = 3$, $w_1 = 1$, and $\theta^{(1)}/\sigma = 1/2$.
