Gene set testing in limma
COMBINE RNA-seq Workshop
Why?
Sometimes after differential expression testing, we have a long list of 1000s of genes
Too difficult to go through one by one
Or there may be very few / no genes that make significance (small effect sizes + experimental noise)
system being studied
– goana() function
– kegga() function
– camera() function
– mroast() / fry() functions
70 significant genes, 190 genes in gene set
Is an overlap of 10 significant?
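The overlap question above is usually answered with a one-sided hypergeometric test (equivalent to Fisher's exact test), which is the kind of calculation that overlap tools such as goana() and kegga() perform. The sketch below works through the arithmetic in plain Python, purely for illustration. The slide does not give the total number of genes tested, so the universe size of 20,000 is an assumption, as are the function and variable names.

```python
from math import comb


def overlap_p_value(universe, set_size, n_sig, overlap):
    """One-sided hypergeometric tail probability P(X >= overlap).

    X is the number of gene-set members among n_sig genes drawn at
    random, without replacement, from a universe of `universe` genes.
    """
    total = comb(universe, n_sig)
    upper = min(set_size, n_sig)
    tail = sum(
        comb(set_size, k) * comb(universe - set_size, n_sig - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


UNIVERSE = 20_000  # ASSUMPTION: total genes tested (not stated on the slide)
SET_SIZE = 190     # genes in the gene set
N_SIG = 70         # significant genes
OVERLAP = 10       # significant genes that are also in the gene set

# Overlap expected by chance alone under the assumed universe
expected = N_SIG * SET_SIZE / UNIVERSE
p_value = overlap_p_value(UNIVERSE, SET_SIZE, N_SIG, OVERLAP)

print(f"expected overlap by chance: {expected:.3f}")
print(f"P(overlap >= {OVERLAP}): {p_value:.3g}")
```

Under this assumed universe, fewer than one gene would be expected to overlap by chance, so an overlap of 10 is far more than expected. A smaller universe would make the same overlap less extreme, which is why the choice of background genes matters.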
Oshlack and Wakefield (2009) Transcript length bias in RNA-seq data confounds systems biology. Biology Direct, 4:14.
GOseq, Young et al., 2010
gene length using the “covariate” argument
independent
relative to the other genes in the experiment, while taking into account inter-gene correlations
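The idea of a competitive test that allows for inter-gene correlation can be sketched as a two-sample t-statistic (genes in the set versus all other genes) whose variance is inflated by a factor 1 + (m - 1)ρ, where m is the set size and ρ is an assumed average correlation between genes in the set. This is the general approach described for camera(), but the Python code below is only a simplified illustration: the function name, the toy statistics and the value of ρ are all hypothetical, and it is not limma's implementation.

```python
from math import sqrt
from statistics import mean


def competitive_t(stats, in_set, rho=0.0):
    """Two-sample t-statistic for set genes vs. all other genes.

    The variance of the set mean is inflated by 1 + (m - 1) * rho,
    where rho is an ASSUMED average inter-gene correlation in the set.
    """
    set_vals = [s for s, flag in zip(stats, in_set) if flag]
    rest_vals = [s for s, flag in zip(stats, in_set) if not flag]
    m, m2 = len(set_vals), len(rest_vals)

    mean_set, mean_rest = mean(set_vals), mean(rest_vals)

    # Pooled variance of the gene-level statistics
    ss = sum((s - mean_set) ** 2 for s in set_vals)
    ss += sum((s - mean_rest) ** 2 for s in rest_vals)
    pooled_var = ss / (m + m2 - 2)

    vif = 1 + (m - 1) * rho  # variance inflation factor
    return (mean_set - mean_rest) / sqrt(pooled_var * (vif / m + 1 / m2))


# Hypothetical gene-level statistics (e.g. moderated t); first 4 are in the set
gene_stats = [2.1, 1.8, 2.5, 1.2, 0.3, -0.4, 0.1, -1.0, 0.6, -0.2, 0.0, -0.7]
in_set = [True] * 4 + [False] * 8

t_independent = competitive_t(gene_stats, in_set, rho=0.0)
t_correlated = competitive_t(gene_stats, in_set, rho=0.1)

print(f"t assuming independent genes: {t_independent:.2f}")
print(f"t with rho = 0.1:             {t_correlated:.2f}")
```

With ρ = 0 this reduces to an ordinary pooled two-sample t-statistic. Any positive ρ shrinks the statistic, which is why treating correlated genes as independent tends to overstate significance.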
Rank genes by differential expression
Gene 1 Gene 2 Gene 3 Gene 4 Gene 5 Gene 6 Gene 7 Gene 8 Gene 9 Gene 11 Gene 14 Gene 15 Gene 10 Gene 12 Gene 13 Gene 16
Positive signature genes Negative signature genes
Slide courtesy of Gordon Smyth
Rank genes by differential expression
Gene 1 Gene 2 Gene 3 Gene 4 Gene 5 Gene 6 Gene 7 Gene 8 Gene 9 Gene 11 Gene 14 Gene 15 Gene 10 Gene 12 Gene 13 Gene 16
Genome-wide barcode plot
Slide courtesy of Gordon Smyth
Data courtesy of Mark McKenzie
set tend to be differentially expressed?”
the gene set are differentially expressed it will be significant
preserve gene-gene dependence in the data.
constant gene-wise variance
(overlap analysis) to quite complex (CAMERA and ROAST)
hypothesis is