Harnessing Crowd-Sourcing to Assess Genes based on Effect Size

Harnessing Crowd-Sourcing to Assess Genes based on Effect Size Using Visual Inference Methods
Di Cook, Monash University
Joint work with Niladri Roy Chowdhury, Eric Hare, Mahbub Majumder, Michelle Graham, Tengfei Yin, Heike Hofmann

Outline
- Analysis outline, edgeR, … background
- Our top genes: good, maybe, ugly
- Why - video of dispersion
- First experiment, is there any structure
- Re-analysis of published study
VicBioStat 2016, Melbourne, Australia

Our Data
- RNA libraries sequenced by Illumina HiSeq2000
- Alignment by bowtie
- Rsamtools to import bam files, rtracklayer to import gff files
- GenomicRanges to count reads
- Negative binomial model using edgeR to compute differential expression
- FDR yields ~2000 significantly expressed genes
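The FDR step can be illustrated with a minimal sketch of the Benjamini-Hochberg adjustment. The slides do not say which FDR procedure was used, so Benjamini-Hochberg is an assumption here; the helper name `bh_adjust` and the p-values are made up for illustration.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values (hypothetical helper; assumed FDR method)."""
    n = len(pvalues)
    # indices of the p-values from smallest to largest
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # walk from the largest p-value down so adjusted values stay monotone
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# made-up p-values, one per gene
pvals = [0.001, 0.008, 0.039, 0.041, 0.6]
adj = bh_adjust(pvals)
# genes declared significant at an FDR threshold of 0.05
significant = [a < 0.05 for a in adj]
```

Counting the `True` entries in `significant` over all genes gives the kind of total the slide reports.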

TOP 25 GENES
The Good (✔), Maybe (?) & Ugly (✘): ordered list of genes. Do you agree?
[Figure: 5 x 5 grid of plots, one panel per gene, numbered 1 to 25, each marked good, maybe or ugly. x-axis: Fe (insufficient, sufficient); y-axis: log2(normalized counts + 1); legend: geno (Emptyvector, RPA). Panel titles as extracted: Glyma13g12080, Glyma13g11960, Glyma13g12010, Glyma06g03100, Glyma10g36890, Glyma16g29220, Glyma18g10330, Glyma03g06420, Glyma09g28100, Glyma16g05640, Glyma09g03270, Glyma09g29370, Glyma09g24780, Glyma14g34080, Glyma02g39150, Glyma02g03290, Glyma08g36390, Glyma20g26600, Glyma01g38130, Glyma18g01720, Glyma05g16350, Glyma18g07090, Glyma12g36140, Glyma12g03280, Glyma02g13850.]

Why? Dispersion

Why? Level N inflates dispersion

Why? Gene B inflates dispersion

Why? In reality, gene B here inflates dispersion, making gene A not significant.
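A toy sketch of this effect, under loud assumptions: the negative binomial variance is taken as mu + phi * mu^2, dispersion phi is estimated by a simple method of moments, and "sharing" dispersion across genes is cartooned as a plain average. This is not edgeR's estimator, and the counts for genes A and B are invented.

```python
from math import sqrt
from statistics import mean, variance


def mom_dispersion(counts):
    """Method-of-moments dispersion for one group: (s^2 - mean) / mean^2, floored at 0."""
    m = mean(counts)
    return max((variance(counts) - m) / m ** 2, 0.0)


def gene_dispersion(group1, group2):
    """Average of the two within-group dispersion estimates (toy choice)."""
    return (mom_dispersion(group1) + mom_dispersion(group2)) / 2


def z_stat(group1, group2, phi):
    """Difference in means scaled by a negative binomial standard error."""
    m1, m2 = mean(group1), mean(group2)
    v1 = (m1 + phi * m1 ** 2) / len(group1)
    v2 = (m2 + phi * m2 ** 2) / len(group2)
    return (m2 - m1) / sqrt(v1 + v2)


# invented counts: gene A is clean, gene B is very noisy within groups
gene_a = ([100, 105, 95], [150, 160, 140])
gene_b = ([20, 200, 60], [30, 250, 90])

phi_a = gene_dispersion(*gene_a)
phi_b = gene_dispersion(*gene_b)
phi_pooled = (phi_a + phi_b) / 2  # cartoon of a shared dispersion

z_own = z_stat(*gene_a, phi_a)          # gene A judged on its own dispersion
z_pooled = z_stat(*gene_a, phi_pooled)  # gene A judged with gene B's noise mixed in
```

With these numbers the statistic for gene A is much smaller under the pooled dispersion than under its own, which is the mechanism the slide describes.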

[Figure, built up over several slides: tagwise dispersion plotted against log(counts per million), where each point is one gene, with the trended dispersion overlaid, shown alongside a classical interaction plot of one gene. The plots are linked: clicking on a point in the left plot shows the interaction plot for that gene. Labels: cranvas, ggplot2.]

So we ran a little experiment
- Compare the results with random results
- Take the experimental design, 2x2x3, and permute the labels
- Re-run the analysis, record most significant gene
- Plot the results
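The permutation steps above can be sketched as follows. This is a stand-in under stated assumptions: the data are simulated, the 2x2x3 design is taken to be geno x Fe x 3 replicates, and a two-sample t-like statistic on geno replaces the edgeR analysis used in the talk. All names are hypothetical.

```python
import random
from math import sqrt
from statistics import mean, variance

GENO = ["Emptyvector", "RPA"]
FE = ["insufficient", "sufficient"]
# assumed 2 x 2 x 3 design: 12 samples, each labelled (geno, Fe)
design = [(g, f) for g in GENO for f in FE for _ in range(3)]


def geno_t(values, labels):
    """t-like statistic for the geno contrast (stand-in for the real analysis)."""
    a = [v for v, (g, _) in zip(values, labels) if g == "Emptyvector"]
    b = [v for v, (g, _) in zip(values, labels) if g == "RPA"]
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(b) - mean(a)) / se


def most_significant(expr, labels):
    """Return (gene, |t|) for the gene with the largest absolute statistic."""
    scored = ((gene, abs(geno_t(vals, labels))) for gene, vals in expr.items())
    return max(scored, key=lambda pair: pair[1])


rng = random.Random(2016)
# simulated expression values: 200 genes with no effect ...
expr = {f"gene{i}": [rng.gauss(5, 1) for _ in design] for i in range(200)}
# ... plus a shift planted in one gene for the RPA samples
expr["gene0"] = [v + (3 if g == "RPA" else 0)
                 for v, (g, _) in zip(expr["gene0"], design)]

observed = most_significant(expr, design)

null_tops = []
for _ in range(19):
    shuffled = design[:]
    rng.shuffle(shuffled)  # permute the sample labels
    null_tops.append(most_significant(expr, shuffled))
```

Plotting the gene in `observed` next to the 19 genes in `null_tops` gives the raw material for a 20-panel lineup.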

In which of these plots do the two groups have the most vertical difference?
[Figure: lineup of 20 plots, numbered 1 to 20. x-axis: geno (Emptyvector, RPA); y-axis: log2(normalized counts + 1). geno_1_5, 5/7]

In which of these plots is the green line the steepest, and the spread of the green points relatively small?
[Figure: lineup of 20 plots, numbered 1 to 20. x-axis: Fe (i, s); y-axis: log2(normalized counts + 1). interaction_2_1, 4/5]

Experiment
- Five different sets of null plots
- Five different locations of true data plot inside the lineup
- Shown to a sample of Amazon Turk workers
- Overwhelmingly in both cases, the true data is picked, slightly less so for interaction
Data has SOME SIGNAL!
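One simple way to quantify "overwhelmingly picked" is a binomial tail probability: if the data plot were indistinguishable from the nulls, each observer would pick it with probability 1/20. The sketch below assumes independent observers and a single lineup, which is a simplification; the function name and the observer counts are hypothetical, not results from the talk.

```python
from math import comb


def lineup_p_value(n_correct, n_observers, n_plots=20):
    """P(at least n_correct of n_observers pick the data plot by chance)."""
    p = 1 / n_plots
    return sum(
        comb(n_observers, k) * p ** k * (1 - p) ** (n_observers - k)
        for k in range(n_correct, n_observers + 1)
    )


# hypothetical counts: 5 of 7 observers pick the true data plot
p_value = lineup_p_value(5, 7)
```

A small `p_value` means that agreement among observers is hard to explain by guessing.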

Human vs chimp
- Data from “Sex-specific and lineage-specific alternative splicing in primates”, Blekhman, Marioni, Zumbo, Stephens, Gilad, Genome Research, 2010, 20: 180-189, http://genome.cshlp.org/content/suppl/2009/12/16/gr.099226.109.DC1.html
- Human, chimp (and rhesus) liver RNA
- 3x2(M/F) individuals, 2 reps for each species
Image from son’s T-shirt!

Human vs chimp
- Pairwise comparisons of species
- Likelihoods compared, FDR<0.05

Human vs chimp
- Re-analyzed using edgeR, exactTest (Yes, not taking dependencies into account - but a quick re-do of analysis wanted)
- Just Human-Chimp
- Yields 3630 differentially expressed genes, at FDR<0.01, mostly overlapping with published results

Visual testing
- Create multiple sets of permutations of the labels of human, chimp
- Conduct edgeR/exactTest on each of the permutations
- Record the top 2500 genes based on p-value
- Make lineups of j’th ordered gene of actual data against those of permuted data
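The last step, assembling a lineup for the j'th ordered gene, can be sketched like this. It assumes the ranked gene lists have already been produced by the testing step; the gene identifiers, the helper `make_lineup`, and the choice of 19 permutations are illustrative assumptions.

```python
import random


def make_lineup(actual_ranked, permuted_ranked, j, rng):
    """Place the j'th gene of the actual data among the j'th genes of each permutation.

    Returns the list of panels and the 1-based position of the actual data.
    """
    panels = [ranked[j] for ranked in permuted_ranked]
    position = rng.randrange(len(panels) + 1)  # random slot for the true data
    panels.insert(position, actual_ranked[j])
    return panels, position + 1


rng = random.Random(14)
# hypothetical ranked gene lists (most significant first)
actual_ranked = [f"actual_gene_{r}" for r in range(2500)]
permuted_ranked = [
    [f"perm{p}_gene_{r}" for r in range(2500)] for p in range(19)
]

# lineup for the gene ranked 100th (index 99) in each analysis
panels, true_position = make_lineup(actual_ranked, permuted_ranked, 99, rng)
```

Keeping `true_position` hidden from observers and recording their picks is what turns the 20 panels into a visual test.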

You try
- Pick one plot among the 20
- “Which plot has the largest vertical difference between the two groups?”
- Point your mobile device to this web page goo.gl/gG60uR

Human-chimp 1, Human-chimp 2, Human-chimp 3
[Figures: three lineups of 20 plots each, panels numbered 1 to 20. x-axis: HS, PT; y-axis: log10(cpm).]
