Ranking candidate genes from perturbation experiments
Niko Beerenwinkel

Gene ranking
- Goal: Identify (or prioritize) genes that affect the readout, i.e., are involved in the biological process of interest
- Issues
  - Noise (readout, siRNA specificity)
  - Design (siRNA library, replicates, validation screens)
  - Limited resources
- Procedure
  - Normalization: quantiles, z-score, error models
  - Rank by normalized readout or p-value
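The normalization and ranking steps can be sketched as follows. This is a minimal illustration, not the pipeline used for the screens: the function names, the population standard deviation, the arbitrary handling of ties, and the convention that the lowest score ranks first are all assumptions.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Center and scale one plate or replicate (assumes non-constant values)."""
    mu = mean(values)
    sd = pstdev(values)
    return [(v - mu) / sd for v in values]

def quantile_normalize(columns):
    """Give every column (replicate) the same empirical distribution.

    columns: list of equal-length lists, one per replicate.
    Ties within a column are broken by position (an arbitrary choice).
    """
    n = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: mean of the i-th smallest value across columns.
    reference = [mean(col[i] for col in sorted_cols) for i in range(n)]
    normalized = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized

def rank_genes(gene_ids, scores):
    """Order genes by normalized readout, lowest score first (assumed direction)."""
    return [g for _, g in sorted(zip(scores, gene_ids))]
```

For a screen where up-regulation is of interest, the sort direction in `rank_genes` would be reversed.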

RSA: Redundant siRNA activity analysis
- Rank all siRNAs (wells) by readout
- Assign a p-value to each gene based on the rank distribution of all siRNAs targeting it (hypergeometric model)
König et al., 2007
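A simplified sketch of a hypergeometric score in the spirit of RSA is given here. It is one reading of the two bullets above, not the published implementation: for each gene, every rank at which one of its siRNAs appears is tried as a cutoff, and the smallest hypergeometric tail probability is kept. The function names are hypothetical, and the minimum over cutoffs is a score rather than a calibrated p-value.

```python
from math import comb

def hypergeom_sf(k, N, n, r):
    """P(X >= k) when the top r of N wells are taken and n wells target the gene."""
    total = comb(N, r)
    hits = sum(comb(n, j) * comb(N - n, r - j) for j in range(k, min(n, r) + 1))
    return hits / total

def rsa_like_score(ranked_genes):
    """ranked_genes: gene label of each well, ordered from strongest to weakest readout."""
    N = len(ranked_genes)
    positions = {}
    for rank, gene in enumerate(ranked_genes, start=1):
        positions.setdefault(gene, []).append(rank)
    scores = {}
    for gene, ranks in positions.items():
        n = len(ranks)
        # Cutting at the rank r of the gene's k-th best siRNA puts k of its n siRNAs in the top r.
        scores[gene] = min(hypergeom_sf(k, N, n, r) for k, r in enumerate(ranks, start=1))
    return scores
```

A gene whose siRNAs all sit near the top of the list receives a small score even if no single well is extreme, which is the redundancy idea behind the method.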

Comparing gene rankings
- Intersection metric
- Spearman’s footrule
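The formulas for these two measures did not survive on this slide, so the sketch below uses common textbook definitions as an assumption: Spearman's footrule as the sum of absolute rank differences, and the intersection metric as the average, over depths 1 to k, of the normalized symmetric difference of the two top-lists. Both rankings are assumed to contain the same genes.

```python
def spearman_footrule(ranking_a, ranking_b):
    """Sum over genes of |position in a - position in b|."""
    pos_b = {g: i for i, g in enumerate(ranking_b)}
    return sum(abs(i - pos_b[g]) for i, g in enumerate(ranking_a))

def intersection_metric(ranking_a, ranking_b, k):
    """Average normalized symmetric difference of the top-d lists for d = 1..k."""
    total = 0.0
    for depth in range(1, k + 1):
        top_a, top_b = set(ranking_a[:depth]), set(ranking_b[:depth])
        total += len(top_a ^ top_b) / (2 * depth)
    return total / k
```

The footrule weighs every position equally, while the intersection metric emphasizes agreement at the top of the list, which is usually what matters for hit selection.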

Stable variables
- Let $\Lambda$ be the set of all (reasonable) cut-offs for a given ranking (i.e., $\lambda \in \Lambda$ is a regularization parameter).
- The set of selected genes $\hat{S}^{\lambda} = \hat{S}^{\lambda}(I)$ is a function of the samples $I$.
- For a given threshold $\pi$, the set of stable variables is
  $\hat{S}^{\text{stable}} = \left\{ k : \max_{\lambda \in \Lambda} P\left( k \in \hat{S}^{\lambda} \right) \ge \pi \right\}$
- $P$ can be estimated by sub- or re-sampling.
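The definition above can be sketched with subsampling of replicates. The slides do not say how the subsamples were drawn or how replicates were aggregated, so the choices here are assumptions: half of the replicates per subsample, the mean as the aggregated score, and low scores counting as hits. All names are hypothetical.

```python
import random
from statistics import mean

def top_k(scores, k):
    """Genes with the k lowest aggregated scores (assumed: low = strong hit)."""
    return set(sorted(scores, key=scores.get)[:k])

def stable_set(data, cutoffs, pi, n_subsamples=100, seed=0):
    """data: {gene: [readout per replicate]}; cutoffs: the set Lambda of top-k sizes."""
    rng = random.Random(seed)
    n_rep = len(next(iter(data.values())))
    half = max(1, n_rep // 2)
    counts = {lam: dict.fromkeys(data, 0) for lam in cutoffs}
    for _ in range(n_subsamples):
        cols = rng.sample(range(n_rep), half)
        scores = {g: mean(v[c] for c in cols) for g, v in data.items()}
        for lam in cutoffs:
            for g in top_k(scores, lam):
                counts[lam][g] += 1
    # Estimated max over lambda of P(k in S^lambda), per gene.
    max_freq = {g: max(counts[lam][g] for lam in cutoffs) / n_subsamples for g in data}
    return {g for g, f in max_freq.items() if f >= pi}, max_freq
```

A gene enters the stable set only if it is selected in at least a fraction $\pi$ of the subsamples for some cut-off in $\Lambda$.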

Stability selection (Meinshausen & Bühlmann, 2010)
- Under certain assumptions, and for $\pi > 1/2$, the expected number of falsely selected variables $V$ is bounded by
  $E(V) \le \frac{1}{2\pi - 1} \, \frac{q_{\Lambda}^{2}}{p}$
  where $p$ is the total number of genes, and
  $q_{\Lambda} = E\left( \left| \bigcup_{\lambda \in \Lambda} \hat{S}^{\lambda}(I) \right| \right)$
  is the expected number of genes selected for at least one $\lambda \in \Lambda$.
- In practice, we can set $\pi$ and $\Lambda$ to control false positives.
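As a small worked example of the bound, the function below evaluates $q_{\Lambda}^{2} / ((2\pi - 1)\,p)$. The function name is hypothetical, and the values in the usage note are illustrative only, not results from the screens.

```python
def expected_false_positives_bound(q, p, pi):
    """Upper bound on E(V): q^2 / ((2*pi - 1) * p), valid for 0.5 < pi <= 1."""
    if not 0.5 < pi <= 1:
        raise ValueError("pi must lie in (0.5, 1]")
    return q ** 2 / ((2 * pi - 1) * p)
```

With p = 12,000 genes, an assumed q = 300 and π = 0.9, the bound evaluates to about 9.4 expected false positives. Raising π or shrinking Λ (and hence q) tightens it.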

Data sets
- Hardt lab
  - Salmonella screen in human cells
  - 19,000 genes
  - Read-out: infection rate
  - ~4 different siRNAs per gene, no replicates
- Merdes lab (Saj et al., Dev Cell, 2010)
  - Notch screen in Drosophila
  - 12,000 genes
  - Read-out: Notch activity
  - 4 replicates
  - Secondary and in vivo validation screens

Salmonella screen: ranking
- Quantile normalization
- Rankings (Kendall’s tau distance)
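Kendall's tau distance between two rankings counts the gene pairs that appear in opposite order. A minimal quadratic-time sketch follows; whether the slides report the raw or the normalized distance is not stated, so the `normalize` flag is an assumption, and both rankings are assumed to contain the same genes without ties.

```python
from itertools import combinations

def kendall_tau_distance(ranking_a, ranking_b, normalize=True):
    """Number (or fraction) of gene pairs ordered differently in the two rankings."""
    pos_a = {g: i for i, g in enumerate(ranking_a)}
    pos_b = {g: i for i, g in enumerate(ranking_b)}
    discordant = sum(
        1
        for g, h in combinations(ranking_a, 2)
        if (pos_a[g] - pos_a[h]) * (pos_b[g] - pos_b[h]) < 0
    )
    n = len(ranking_a)
    return discordant / (n * (n - 1) / 2) if normalize else discordant
```

For genome-scale rankings a merge-sort based count would be preferable to this pairwise loop.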

Salmonella screen: stability
[Figure: axis label Λ]

Notch screen: raw data

Notch screen: quantile normalization

Notch screen: normalization
- Raw data
- Quantiles
- Z-scores
[Figure: axis label "correlation"]

Notch screen: ranking
- Quantile-normalized
- Ranking distance (Kendall’s tau)

Notch screen: reproducibility
- Leave-one-out: Ranking based on three replicates validated with fourth replicate
- Cut-off 300 for validation

Notch screen: average leave-one-out ROC curves for different normalizations
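A leave-one-out ROC analysis of this kind might be sketched as below. The interpretation is an assumption: genes are ranked by the mean of the retained replicates, the top `cutoff` genes of the held-out replicate are treated as positives, and the area under the ROC curve is averaged over the held-out replicates. Function names are hypothetical and low scores are assumed to indicate hits.

```python
from statistics import mean

def auc(ranked_genes, positives):
    """Area under the ROC curve when walking down a ranking (best first)."""
    n_pos = sum(g in positives for g in ranked_genes)
    n_neg = len(ranked_genes) - n_pos
    tp = 0
    area = 0
    for g in ranked_genes:
        if g in positives:
            tp += 1
        else:
            area += tp  # each negative contributes the positives ranked above it
    return area / (n_pos * n_neg)

def leave_one_out_auc(data, cutoff):
    """data: {gene: [readout per replicate]}; cutoff: number of held-out top genes."""
    n_rep = len(next(iter(data.values())))
    aucs = []
    for held_out in range(n_rep):
        train = {
            g: mean(v[c] for c in range(n_rep) if c != held_out)
            for g, v in data.items()
        }
        ranking = sorted(train, key=train.get)
        positives = set(sorted(data, key=lambda g: data[g][held_out])[:cutoff])
        aucs.append(auc(ranking, positives))
    return mean(aucs)
```

Running this once per normalization (raw, quantile, z-score) would give one average AUC per method for comparison.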

Notch screen: ROC analysis of validation screen
- Secondary screen of 900 genes
- Focus on down regulation
[Figure panels: All 12,000 genes; Top 254 genes]

Notch screen: stability, in vivo validation
- Median ranking of top 2000 genes
[Figure: legible labels Λ, Median, 20]

Conclusions
- Both quantile and z-score normalization improve correlation and reproducibility.
- Selecting stable genes complements selection of high-scoring genes.
  - Stable sets quantify reproducibility of being among the top k in the ranking
  - Upper bound on expected number of false positives
- RSA produced fairly unstable sets

Acknowledgements
- Computational Biology Group, www.cbg.ethz.ch
  - Juliane Siebourg
  - Edgar Delgado-Eckert
- Collaborators
  - Gunter Merdes (D-BSSE, ETH Zurich)
  - Wolf-Dietrich Hardt (D-BIOL, ETH Zurich)
  - InfectX consortium
- Funding
  - InfectX, SystemsX.ch
