
Inference following aggregate-level hypothesis testing in large scale genomic data
Ruth Heller, www.math.tau.ac.il/~ruheller
Joint work with Nilanjan Chatterjee, Abba Krieger, and Jianxin Shi
Ruth Heller (TAU), December 9, 2016

Outline
1. A brief review of the multiple comparisons problem.
2. Inference following selection by aggregate level testing:
   (i) Goal.
   (ii) The conditional approach.
   (iii) An existing alternative.
   (iv) An empirical comparison.
   (v) Conclusions.

The multiple comparisons problem
A family of m null hypotheses is considered: $H_1, \ldots, H_m$.
$P_1, \ldots, P_m$ are the p-values for testing $H_1, \ldots, H_m$, respectively.
The hypotheses can be divided into two types:
1. $m_0$ true null hypotheses: $P_i \sim U(0,1)$.
2. $m_1 = m - m_0$ false null hypotheses: $\Pr(P_i \le x) \ge x$ for all $x \in [0,1]$.
A discovery is made if a null hypothesis is rejected.
A false discovery is made if a true null hypothesis is rejected.

The two most common error rates
R = the number of discoveries.
V = the number of false discoveries.
The familywise error rate (FWER) is $\Pr(V > 0)$.
The false discovery rate (FDR¹) is $E\left[\frac{V}{\max(R,1)}\right]$.
The two error rates coincide if $m_0 = m$.
Procedures that control the FWER also control the FDR:
$$E\left[\frac{V}{\max(R,1)}\right] \le E(I[V > 0]) = \Pr(V > 0).$$
¹ Benjamini and Hochberg, 1995. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing.
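
The inequality between the two error rates follows from a short pointwise argument, spelled out here as a worked step. It uses only the definitions of V and R on this slide, plus the fact that every false discovery is a discovery, so $V \le R$.

```latex
\[
V = 0:\quad \frac{V}{\max(R,1)} = 0 = I[V>0],
\qquad
V \ge 1:\quad \frac{V}{\max(R,1)} = \frac{V}{R} \le 1 = I[V>0].
\]
\[
\text{If } m_0 = m \text{ then } V = R, \text{ so }
\frac{V}{\max(R,1)} = I[V>0]
\text{ and hence } \mathrm{FDR} = \mathrm{FWER}.
\]
```

Taking expectations of the first line gives FDR ≤ FWER. The second line is why the two rates coincide when all null hypotheses are true.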

The Bonferroni Procedure
Reject $H_i$ if $p_i \le \alpha/m$.
Properties:
FWER is controlled at level $\alpha$:
$$\Pr(V > 0) = \Pr\left(\bigcup_{i \in I_0} \{P_i \le \alpha/m\}\right) \le \sum_{i \in I_0} \Pr(P_i \le \alpha/m) = m_0\,\alpha/m \le \alpha,$$
where $I_0 \subseteq \{1, \ldots, m\}$ is the subset of true null hypotheses.
The FWER error control is valid for any type of dependency across the p-values $P_1, \ldots, P_m$.
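
The rejection rule on this slide can be sketched in a few lines of Python. This is a minimal illustration. The function name `bonferroni_reject` and the example p-values are assumptions made for the sketch and are not part of the talk.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Return a list of booleans, True where H_i is rejected (p_i <= alpha / m)."""
    m = len(p_values)
    threshold = alpha / m  # the same cutoff is applied to every hypothesis
    return [p <= threshold for p in p_values]


# Hypothetical p-values, chosen only to illustrate the rule.
p = [0.2, 0.001, 0.6, 0.025, 0.012]
rejected = bonferroni_reject(p, alpha=0.05)
print(rejected)
```

With m = 5 and α = 0.05 the cutoff is 0.01, so only the hypothesis with p-value 0.001 is rejected.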

The BH procedure
1. Sort the p-values $p_{(1)} \le \ldots \le p_{(m)}$, with corresponding $H_{(1)}, \ldots, H_{(m)}$.
2. Find $R = \max\{ j \in \{1, \ldots, m\} : p_{(j)} \le \alpha j/m \}$ (no rejections if the set is empty).
3. Reject $H_{(1)}, \ldots, H_{(R)}$.
Properties:
FDR $= \frac{m_0}{m}\alpha$ if the p-values are independent¹.
FDR $\le \frac{m_0}{m}\alpha$ if the p-values are positively dependent².
FDR $\le (1 + 1/2 + \ldots + 1/m)\frac{m_0}{m}\alpha \approx \log(m)\frac{m_0}{m}\alpha$ for any type of dependence across the p-values².
¹ Benjamini and Hochberg, 1995. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing.
² Benjamini and Yekutieli, 2001. The control of the false discovery rate in multiple testing under dependency.
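
The three steps of the BH procedure translate directly into code. The sketch below is one straightforward way to write the step-up rule in plain Python. The function name `bh_reject` and the example p-values are assumptions made for illustration.

```python
def bh_reject(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up rule; returns booleans in the input order."""
    m = len(p_values)
    # Step 1: indices of the hypotheses, ordered by increasing p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step 2: largest rank j with p_(j) <= alpha * j / m (0 if there is none).
    r = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            r = rank
    # Step 3: reject the hypotheses with the r smallest p-values.
    rejected = [False] * m
    for idx in order[:r]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values, chosen only to illustrate the rule.
p = [0.2, 0.001, 0.6, 0.025, 0.012]
rejected = bh_reject(p, alpha=0.05)
print(rejected)
```

For these values the sorted p-values 0.001, 0.012, 0.025 fall below the thresholds 0.01, 0.02, 0.03, while 0.2 and 0.6 exceed 0.04 and 0.05, so R = 3. Because the rule keeps the largest qualifying rank, a p-value above its own threshold is still rejected when a larger p-value qualifies at a higher rank.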

The adjusted p-values
The adjusted p-value of a hypothesis under a multiple comparison procedure is the smallest nominal level at which the hypothesis would be rejected, given $p_1, \ldots, p_m$.
The Bonferroni-adjusted p-value for $H_i$ is $m \times p_i$.
The Bonferroni procedure at level $\alpha$ rejects $H_i$ if and only if $m \times p_i \le \alpha$.
The BH-adjusted p-value for $H_{(i)}$ is
$$\min_{j \ge i} \frac{m \times p_{(j)}}{j}.$$
The BH procedure at level $\alpha$ rejects $H_{(i)}$ if and only if
$$\min_{j \ge i} \frac{m \times p_{(j)}}{j} \le \alpha.$$
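
Both adjustments can be computed in one pass over the sorted p-values. The following is a minimal sketch with hypothetical function names and example data. Capping the adjusted values at 1 is a common convention added here as an assumption, since the slide states the Bonferroni adjustment simply as m × p_i.

```python
def bonferroni_adjust(p_values):
    """Bonferroni-adjusted p-values m * p_i, capped at 1 (the cap is an added convention)."""
    m = len(p_values)
    return [min(1.0, m * p) for p in p_values]


def bh_adjust(p_values):
    """BH-adjusted p-values: min over j >= i of m * p_(j) / j, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0  # starting at 1 also caps the adjusted values at 1
    # Walk from the largest p-value down, carrying the minimum over j >= i.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, m * p_values[idx] / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values, chosen only to illustrate the formulas.
p = [0.2, 0.001, 0.6, 0.025, 0.012]
bonf = bonferroni_adjust(p)
bh = bh_adjust(p)
print(bonf)
print(bh)
```

Comparing each adjusted p-value with α reproduces the corresponding procedure's decisions. At α = 0.05, the BH-adjusted values of the second, fourth, and fifth hypotheses fall below α.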

Final remarks
The BH-adjusted p-values are at most as large as the Bonferroni-adjusted p-values.
Bonferroni provides simultaneous inference: the FWER guarantee is valid for any subset of $\{1, \ldots, m\}$.
BH provides selective inference: the FDR guarantee is for the selected set of rejected hypotheses.
More generally, with simultaneous inference the guarantee is for every possible subset, whereas with selective inference the guarantee is for the specific subset selected.
Methods that assure simultaneous inference also assure selective inference, but not vice versa³.
³ Benjamini, 2010. Simultaneous and selective inference: Current successes and future challenges.


Multiple studies testing similar hypotheses
Examine m features in each of n studies.
For feature (row) i:
$H_{ij}$, $j = 1, \ldots, n$, are the n null hypotheses.
$H_{iG} = \bigcap_{j=1}^{n} H_{ij}$ is the meta-analysis (global) null hypothesis.
We have m × n hypotheses for inference:
$$\begin{array}{cccc}
H_{11} & \ldots & H_{1n} & H_{1G} \\
\vdots & \ddots & \vdots & \vdots \\
H_{m1} & \ldots & H_{mn} & H_{mG}
\end{array}$$

Inference following aggregate level testing
In meta-analysis, aggregate-level hypothesis testing is performed for powerful identification of features with signal¹.
A natural follow-up question is which studies contain signal within a discovered feature.
Testing $H_{i1}, \ldots, H_{in}$ following rejection of $H_{iG}$, without accounting for the fact that $H_{iG}$ was rejected using an aggregate-level test statistic, will produce biased inference².
¹ Bhattacharjee et al., 2012. A subset-based approach improves power and interpretation for the combined analysis of genetic association studies of heterogeneous traits.
² Bogomolov and Benjamini, 2014. Selective inference on multiple families of hypotheses.
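
A small simulation can make the bias concrete. The sketch below is an illustration under stated assumptions and is not the analysis from the talk: every hypothesis in a row is a true null with an independent U(0,1) p-value, rows are selected when a Fisher-type combined p-value is at most t, and the first study in each selected row is then tested naively at level α. The function names, n = 3 studies, and the thresholds are all hypothetical choices.

```python
import math
import random


def fisher_p(row):
    """Combined p-value Pr(chi2_{2n} >= -2 * sum(log p)), via the closed form for even df."""
    n = len(row)
    half = -sum(math.log(p) for p in row)  # half of the chi-square statistic
    term, total = 1.0, 1.0
    for k in range(1, n):
        term *= half / k
        total += term
    return math.exp(-half) * total


def naive_rate_after_selection(n_studies=3, n_rows=20000, t=0.05, alpha=0.05, seed=0):
    """Fraction of selected all-null rows in which study 1 has p <= alpha."""
    rng = random.Random(seed)
    selected = 0
    naive_rejections = 0
    for _ in range(n_rows):
        # 1 - random() lies in (0, 1], which avoids log(0).
        row = [1.0 - rng.random() for _ in range(n_studies)]
        if fisher_p(row) <= t:
            selected += 1
            if row[0] <= alpha:
                naive_rejections += 1
    rate = naive_rejections / selected if selected else float("nan")
    return selected, rate


selected, rate = naive_rate_after_selection()
print(selected, rate)
```

In this setting every rejection of the first study is a false one, yet among selected rows it occurs far more often than the nominal 5%. This is the selection effect that the original p-values ignore.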

Goal for inference
Our goal is to develop multiple testing procedures that guarantee control of FWER/FDR conditional on the row being selected.
This type of false positive control is particularly important if a researcher conducts different follow-up studies for each selected row.
A related goal: controlling the average FWER/FDR over the selected rows¹.
¹ Bogomolov and Benjamini, 2014. Selective inference on multiple families of hypotheses.

A large scale genomic application
Expression quantitative trait loci (eQTLs) are genomic regions with genetic variants that influence the expression level of genes.
Gene regulation is tissue specific, but an analysis within a single tissue may lack power due to small sample size.
The power to discover eQTL SNPs predictive of gene expression across multiple tissues may be increased by aggregate testing across tissue types.
For the n = 17 tumor tissues in The Cancer Genome Atlas (TCGA) Project, we aggregate the 17 eQTL test statistics to select eQTL SNPs influencing gene expression in at least one tissue, out of m = 7,732,750 candidate cis-eQTL SNPs.
We aim to discover the non-null tissues within selected eQTL SNPs.

Notation
$S \subseteq \{1, \ldots, m\}$ is the set of selected rows, e.g., all hypotheses rejected by Bonferroni/BH on the global null p-values.
$V_i$ = number of false discoveries for row i.
$R_i$ = number of discoveries for row i.
The conditional FWER for row i is $E(I[V_i > 0] \mid i \in S)$.
The conditional FDR for row i is $E(V_i / \max\{R_i, 1\} \mid i \in S)$.

Notation
For feature (row) i:
$P_{ij}$, $j = 1, \ldots, n$, are the p-values.
$P_{iG}$ is the global null p-value. Examples¹:
$$p_{iG} = \Pr\left(\chi^2_{2n} \ge -2 \sum_{j=1}^{n} \log p_{ij}\right).$$
$$p_{iG} = 2 \Pr\left(\chi^2_{2n} \ge \max\left\{ -2 \sum_{j=1}^{n} \log p^{L}_{ij},\; -2 \sum_{j=1}^{n} \log\left(1 - p^{L}_{ij}\right) \right\}\right).$$
Our data matrix for analysis is:
$$\begin{array}{cccc}
p_{11} & \ldots & p_{1n} & p_{1G} \\
\vdots & \ddots & \vdots & \vdots \\
p_{m1} & \ldots & p_{mn} & p_{mG}
\end{array}$$
¹ Owen, 2009. Karl Pearson's meta-analysis revisited.
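
The first example formula, commonly known as Fisher's combination test, can be evaluated without a statistics library because the chi-square tail probability has a closed form when the degrees of freedom are even. The sketch below is a minimal illustration. The function names and the example row are assumptions, and all p-values are assumed to be strictly positive.

```python
import math


def chi2_sf_even_df(x, n):
    """Pr(chi-square with 2n degrees of freedom >= x), using the even-df closed form."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    term, total = 1.0, 1.0
    # exp(-x/2) * sum_{k=0}^{n-1} (x/2)^k / k!
    for k in range(1, n):
        term *= half / k
        total += term
    return math.exp(-half) * total


def fisher_global_p(row):
    """Global null p-value for one row: Pr(chi2_{2n} >= -2 * sum_j log p_ij)."""
    n = len(row)
    stat = -2.0 * sum(math.log(p) for p in row)  # requires every p > 0
    return chi2_sf_even_df(stat, n)


# Hypothetical row of n = 2 study-level p-values.
p_global = fisher_global_p([0.1, 0.1])
print(p_global)
```

For n = 2 the result equals q(1 − log q), where q is the product of the two p-values, which gives a quick sanity check on the implementation.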


Our approach for inference following row-selection
1. Compute the conditional p-values, conditional on being selected.
2. Apply a valid FWER/FDR controlling procedure on the conditional p-values.
Questions we address:
1. The row may contain both null and non-null p-values, so the probability of selection is not known even for the simplest rule $\{P_{iG} \le \alpha/m\}$. How can the conditional p-values be computed?
2. Even though the original p-values in a row are independent, the conditional p-values will be dependent. What is a valid FDR controlling procedure?
