A Grammar of Graphics for Genomics
The ggbio Package Michael Lawrence
Genentech
August 29, 2012
Michael Lawrence (Genentech) A Grammar of Graphics for Genomics August 29, 2012 1 / 18
Outline
1 Motivation
2 High-level Plots
3 Grammar
[Figure: genomic tracks; x-axis 120.928 Mb to 120.938 Mb (120928000 to 120938000), y-axis ticks 10 to 60]
Exon table (columns: seqnames, start, end, strand, exon id, tx id; the strand and exon id values are missing from this copy):

seqnames  start      end        tx id
10        120927215  120928045  14886,14887
10        120928689  120928854  14886,14887
10        120931894  120931997  14886,14887
10        120933249  120933384  14886,14887
10        120933963  120934069  14886
10        120933963  120934104  14887
10        120936533  120936665  14887
10        120936552  120936665  14886
10        120938267  120938345  14886,14887
[Figure: genomic tracks; x-axis 120.928 Mb to 120.938 Mb, y-axis ticks 10 to 60]
[Figure: genome-scale view; x-axis 0 Mb to 100 Mb, y-axis ticks 50000 to 300000]
[Figure: genomic tracks; x-axis 120.928 Mb to 120.938 Mb (120928000 to 120938000), y-axis ticks 10 to 60]
[Figure: zoomed view; x-axis 120928700 to 120928850; legend A, C, G, T]
[Figure: base-level view; x-axis 120928690 to 120928740; sequence CGTAGGAGAATCCGGTGTCCAGTTCGCTGGGCAGACTTCTCCATGTGTTT]
[Figure: overview by chromosome (1 to 22, X); position ticks 5.0e+07 to 2.0e+08; legend seqReg: Exon, Intron, Other]
[Figure: genome-wide view of chromosomes 1 to 22 with position ticks from 0M to 200M; legend rearrangements: interchromosomal, intrachromosomal; legend tumreads: 6 to 12]
[Figure: Counts (y-axis ticks 5 to 15) by read base (A, C, G, T); x-axis 25235720 to 25235755; sequence TGAAAGTACCGTGTGACATCACAGGCTGGGAGCTTGA; legend mismatch, snp, reference]
[Figure: Expression (y-axis ticks 200 to 1000) by group (GM12878, K562); transcripts uc002rau.2, uc010yjg.1, uc002rav.2, uc010yjh.1, uc002raw.2; x-axis 10930000 to 10980000]
[Figure: annotated plot; x-axis 48.245 Mb to 48.270 Mb, y-axis levels 1 and 2, color legend strand (+, −)]
Annotations on the plot:
statistical transformation: stepping
geometric object: alignment
geometric object: chevron
geometric object: rect
X scale: sequence
Y scale: discrete from stepping
color scale: discrete from strand
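The "stepping" transformation named on this slide, which feeds the discrete Y scale, can be illustrated with a small standalone sketch. This is plain Python, not ggbio code, and it is only one plausible greedy layout: each range goes on the lowest row where it does not overlap a range already placed there. The function name `assign_steps` is hypothetical; the coordinates are the exon ranges from the table earlier in the deck, used here purely as sample input.

```python
# Exon ranges (start, end), copied from the exon table earlier in the deck.
EXONS = [
    (120927215, 120928045),
    (120928689, 120928854),
    (120931894, 120931997),
    (120933249, 120933384),
    (120933963, 120934069),
    (120933963, 120934104),
    (120936533, 120936665),
    (120936552, 120936665),
    (120938267, 120938345),
]


def assign_steps(ranges):
    """Return a 1-based row ("step") per range so overlapping ranges never share a row.

    Hypothetical helper for illustration; ggbio's own algorithm may differ.
    """
    row_ends = []                  # end coordinate of the last range placed on each row
    steps = [0] * len(ranges)
    # Visit ranges from left to right while remembering their original positions.
    for i in sorted(range(len(ranges)), key=lambda k: ranges[k]):
        start, end = ranges[i]
        for row, last_end in enumerate(row_ends):
            if start > last_end:   # fits to the right of everything on this row
                row_ends[row] = end
                steps[i] = row + 1
                break
        else:                      # overlaps the tail of every row: open a new row
            row_ends.append(end)
            steps[i] = len(row_ends)
    return steps


steps = assign_steps(EXONS)
print(steps)
```

For these nine exons the sketch needs two rows, since only the two alternative exon pairs overlap. The resulting row number is what a discrete Y scale would then display.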
[Figure: transcript models NM_006793 (GeneID:10935) and NM_014098 (GeneID:10935); x-axis 120.928 Mb to 120.938 Mb]
[Figure: reduced; x-axis 120.928 Mb to 120.938 Mb]
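As a rough illustration of what a "reduced" view computes, the sketch below merges overlapping or directly adjacent ranges into single ranges. It is standalone Python rather than ggbio or IRanges code. The helper name `reduce_ranges` is hypothetical, and merging adjacent ranges is an assumption about the intended behavior. The input is the exon table from earlier in the deck.

```python
# Exon ranges (start, end), copied from the exon table earlier in the deck.
EXONS = [
    (120927215, 120928045),
    (120928689, 120928854),
    (120931894, 120931997),
    (120933249, 120933384),
    (120933963, 120934069),
    (120933963, 120934104),
    (120936533, 120936665),
    (120936552, 120936665),
    (120938267, 120938345),
]


def reduce_ranges(ranges):
    """Merge overlapping or adjacent closed integer ranges.

    Hypothetical helper for illustration; treating adjacent ranges as
    mergeable is an assumption.
    """
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or touches the previous merged range: extend it.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


reduced = reduce_ranges(EXONS)
for start, end in reduced:
    print(start, end)
```

With this input the nine exons collapse to seven ranges: the two pairs of alternative exons starting at 120933963 and ending at 120936665 each merge into one.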
[Figure: stepping; x-axis 120.928 Mb to 120.938 Mb]
[Figure: Coverage; x-axis 120928000 to 120938000, y-axis ticks 10 to 60]
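Coverage counts how many ranges overlap each position. The sketch below is standalone Python, not ggbio code, and computes it as runs of constant depth with a sweep over range boundaries. The helper name `coverage_runs` is hypothetical. The slide plots read coverage, but the reads are not available here, so the exon ranges from the table earlier in the deck stand in as sample input.

```python
from collections import Counter

# Exon ranges (start, end) from the exon table earlier in the deck,
# used as a stand-in for aligned reads.
EXONS = [
    (120927215, 120928045),
    (120928689, 120928854),
    (120931894, 120931997),
    (120933249, 120933384),
    (120933963, 120934069),
    (120933963, 120934104),
    (120936533, 120936665),
    (120936552, 120936665),
    (120938267, 120938345),
]


def coverage_runs(ranges):
    """Return (start, end, depth) runs where depth > 0, for closed integer ranges.

    Hypothetical helper for illustration only.
    """
    delta = Counter()
    for start, end in ranges:
        delta[start] += 1        # depth rises where a range begins
        delta[end + 1] -= 1      # and falls just past where it ends
    runs = []
    depth = 0
    positions = sorted(delta)
    for pos, next_pos in zip(positions, positions[1:]):
        depth += delta[pos]
        if depth > 0:
            runs.append((pos, next_pos - 1, depth))
    return runs


runs = coverage_runs(EXONS)
for start, end, depth in runs:
    print(start, end, depth)
```

Storing runs rather than one value per base keeps the result compact over a 10 kb window. A depth of 2 appears only where the alternative exons overlap.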
[Figure: Counts (y-axis ticks 10 to 60) by base (A, C, G, T); x-axis 120928000 to 120938000 (120.928 Mb to 120.938 Mb)]
[Figure: truncated]
[Figure: panels normal and tumor; axis ticks 500 to 2000; legends score (500 to 1500) and novel (FALSE, TRUE)]
[Figure: Coverage for panels 10 and 11; axis ticks 0e+00 to 6e+05 and 0e+00 to 1e+08]
[Figure: axes Samples and Features (ticks 50 to 200 and 1 to 6); legend value (−5000, 5000)]
[Figure: chr10 tracks; Counts (y-axis ticks 10 to 60) by base (A, C, G, T); x-axis 120.928 Mb to 120.938 Mb]