SESSION 4 / LINKS AND RULES GENOME VISUALIZATION WITH CIRCOS LINKS AND RULES Martin Krzywinski martink@bcgsc.ca
LESSON PLAN
1. ideogram, tick, grid and label layout
2. drawing links
3. rules: coloring links by position, part 1
4. rules: coloring links by position, part 2
5. bundling links
6. density histograms
7. scatter plots
8. rules: coloring links by size
Examples of how rules and links are combined. (A) Original data set. (B) Color of certain links is modified using rules. (C) Geometry of nearby intra-chromosomal links has been adjusted to point the link outwards. (D) Rules were used to change the thickness of links.
Links are defined by two regions, which can be of any size.
Top: By default, links are drawn as lines (with adjustable, but constant, thickness). The lines start and end in the middle of the regions that define the link.
Bottom: When the regions that define the link are large, it is helpful to use the thickness of the link to reflect the region size. To do this, links can be drawn as ribbons whose ends take on the thickness [...] thickness is not necessarily constant across the link. Depending on the [...]
LESSON 1
The figure template showing mouse chromosomes 1-19 and human chromosome 1. We will be rescaling the human chromosome so that it occupies half of the figure in order to reveal detail.
<<include ideogram.conf>>
<<include ticks.conf>>

<image>
  file = s4-1.png
  <<include ../../etc/image.conf>>
</image>

karyotype = ../data/karyotype.human_mouse_labels.txt

chromosomes_units           = 1000000
chromosomes_display_default = no
chromosomes         = mm1;mm2;mm3;mm4;mm5;mm6;mm7;mm8;mm9;mm10;mm11;mm12;mm13;mm14;mm15;mm16;mm17;mm18;mm19;hs1
chromosomes_reverse = mm1;mm2;mm3;mm4;mm5;mm6;mm7;mm8;mm9;mm10;mm11;mm12;mm13;mm14;mm15;mm16;mm17;mm18;mm19
chromosomes_breaks  = -hs1:120-140
#chromosomes_scale   = hs1:11.8

<highlights>
  <highlight>
    show = yes
    file = ../data/highlight.txt
    r0   = 0.99r
    r1   = 0.999r
  </highlight>
</highlights>

sessions/4/1/etc/circos.conf
chromosomes_scale = hs1:11.8

Mouse chromosomes 1-19 occupy half of the figure and human chromosome 1 is shown in the other half. The human chromosome has an axis break at 120-140Mb to remove the centromere from the display (there is no data for this region). Notice that the scale of the mouse chromosomes runs counter-clockwise.
LESSON 2
The links show 2,300 top alignments between human chromosome 1 and mouse chromosomes 1-19. When transparency is used for link lines, it is possible to discern regions where the links are denser. The color for each link line here is black_a5. Compare this figure with Figure 6, where transparency was not used.

<links>
  <link chain>
    file          = ../data/links.txt
    bezier_radius = 0r
    radius        = 0.85r
    thickness     = 1p
    color         = black_a5
  </link>
</links>

sessions/4/2/etc/circos.conf
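A little arithmetic shows why transparency reveals density. The sketch below assumes ordinary alpha compositing, in which each stroke of opacity a leaves a fraction (1 - a) of what lies beneath it visible. The value 0.2 is an illustrative assumption, not the opacity that Circos assigns to black_a5.

```python
def stacked_opacity(alpha: float, n: int) -> float:
    """Combined opacity of n overlapping strokes, each with opacity alpha.

    Assumes standard "over" compositing of identical strokes.
    """
    return 1.0 - (1.0 - alpha) ** n


# alpha = 0.2 is a made-up value used only for illustration.
for n in (1, 2, 5, 10, 20):
    print(n, round(stacked_opacity(0.2, n), 3))
```

With opaque lines (alpha = 1) a single stroke already saturates the pixel, so one link and twenty overlapping links look the same. With partial opacity the darkness keeps growing with the number of overlapping links.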
When transparency is not used for link lines, dense links form a solid shape, making it impossible to discern regions where the links are denser.
LESSON 3
Rules are used to color all links that impinge on mouse chromosome 11.

<rule>
  condition = _CHR2_ eq "mm11"
  color     = chr11_a3
  z         = 10
  thickness = 2p
</rule>

sessions/4/3/etc/circos.conf

...
link2246 hs1 395264 426112
link2246 mm1 185661118 185692077
...

sessions/4/links.txt
A second rule is added to uniquely color all mm11 links that start at 20-50Mb of hs1.

<rule>
  importance = 100
  condition  = _CHR2_ eq "mm11"
  color      = chr11_a3
  z          = 10
  thickness  = 2p
  # links that pass are tested by remaining rules
  flow       = continue
</rule>
<rule>
  importance = 90
  condition  = _COLOR_ eq "chr11_a3" && _START1_ > 20e6 && _END1_ < 50e6
  color      = red_a3
  z          = 20
  thickness  = 3p
</rule>

sessions/4/3/etc/circos.conf
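The way the two rules interact can be sketched in Python. This is a simplified model of the rule chain, not Circos code: it assumes rules are tested in order of decreasing importance, that a matching rule stops the chain unless it sets flow = continue, and that the link starts with the track defaults. The function, the dictionary layout and the sample link are all hypothetical.

```python
def apply_rules(link, rules):
    """Apply rules in order of decreasing importance.

    Simplified model: a rule whose condition is true sets its parameters;
    testing then stops unless that rule has flow == "continue".
    """
    link = dict(link)  # do not modify the caller's data
    for rule in sorted(rules, key=lambda r: r["importance"], reverse=True):
        if rule["condition"](link):
            link.update(rule["set"])
            if rule.get("flow") != "continue":
                break
    return link


RULES = [
    {
        "importance": 100,
        "condition": lambda ln: ln["chr2"] == "mm11",
        "set": {"color": "chr11_a3", "z": 10, "thickness": "2p"},
        "flow": "continue",
    },
    {
        "importance": 90,
        "condition": lambda ln: (ln["color"] == "chr11_a3"
                                 and ln["start1"] > 20e6
                                 and ln["end1"] < 50e6),
        "set": {"color": "red_a3", "z": 20, "thickness": "3p"},
    },
]

# Hypothetical link: hs1 30.0-30.1 Mb to mm11, carrying the track defaults.
link = {"chr2": "mm11", "start1": 30_000_000, "end1": 30_100_000,
        "color": "black_a5", "z": 0, "thickness": "1p"}
print(apply_rules(link, RULES))
```

The second rule tests the color assigned by the first, which is why the first rule needs flow = continue. Without it, no mm11 link would ever reach the second rule.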
LESSON 4
Links within 10-20Mb on hs1 are colored by their destination chromosome.

<rules>
  <rule>
    condition = _START1_ > 10e6 && _END1_ < 20e6
    color     = eval("chr" . substr(_CHR2_,rindex(_CHR2_,"m")+1) . "_a4")
    z         = 10
    thickness = 2p
  </rule>
</rules>

sessions/4/4/etc/circos.conf
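The eval expression builds a color name from the mouse chromosome name. A Python equivalent of the same string manipulation, given here only to make the steps explicit (the function name is hypothetical):

```python
def chromosome_color(chr2: str, alpha_suffix: str = "_a4") -> str:
    """Mirror of: "chr" . substr(CHR2, rindex(CHR2, "m") + 1) . "_a4"."""
    number = chr2[chr2.rfind("m") + 1:]  # text after the last "m"
    return "chr" + number + alpha_suffix


print(chromosome_color("mm11"))
print(chromosome_color("mm4", "_a2"))
```

For mm11 the text after the last "m" is 11, so the link is given the color chr11_a4.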
Using a rule, all links are colored by the chromosome associated with their ends.

<rule>
  condition = 1
  color     = eval("chr" . substr(_CHR2_,rindex(_CHR2_,"m")+1) . "_a4")
  #z         = 10
  #thickness = 2p
</rule>

sessions/4/4/etc/circos.conf
LESSON 5
The bundlelinks tool is used to logically group adjacent links together, forming larger links. Links are bundled based on their size and distance to each other. Bundles are ideally drawn as ribbons, rather than lines, because bundle ends typically span a significant section of an ideogram.
The result of bundling the links shown in Figure 10. Using radius2, the ends of the links are drawn closer to the mouse chromosomes.

<links>
  <link chain>
    ribbon        = yes
    file          = ../data/bundles.txt
    bezier_radius = 0r
    radius        = 0.85r
    thickness     = 0p
    color         = black_a10
    <rules>
      <rule>
        condition = 1
        color     = eval("chr" . substr(_CHR2_,rindex(_CHR2_,"m")+1) . "_a2")
        radius2   = 0.99r
        z         = eval(_SIZE1_)
      </rule>
    </rules>
  </link>
</links>

sessions/4/5/etc/circos.conf
By setting the z value to the negative of the link size, small links are drawn on top.

z = eval(-_SIZE1_)

sessions/4/5/etc/circos.conf
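The effect of the sign of z on drawing order can be checked with a small sketch. It assumes that links with a lower z are drawn first, so that the link with the highest z ends up on top. The bundle names and sizes are made up.

```python
# Hypothetical bundles: (name, size on hs1 in bases).
bundles = [("a", 12_000_000), ("b", 500_000), ("c", 3_000_000)]


def draw_order(items, z_of):
    """Return names in drawing order, assuming lower z is drawn first."""
    return [name for name, _ in sorted(items, key=lambda it: z_of(it[1]))]


print(draw_order(bundles, lambda size: size))   # z = size  -> ['b', 'c', 'a']
print(draw_order(bundles, lambda size: -size))  # z = -size -> ['a', 'c', 'b']
```

With z = size the largest bundle is drawn last and can hide smaller ones. With z = -size the smallest bundle is drawn last and stays visible.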
Varying the minimum number of links per bundle changes the sensitivity of bundling. When a large number of links is required (e.g. n = 20), only those regions that are connected by a large number of links are turned into bundles. When this number is decreased (e.g. n = 10, n = 5, ...), the number of bundles increases. If the cutoff is small (e.g. n = 2, 3), it is possible to create a large number of bundles, because fewer links are required to form a bundle.
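The effect of the minimum-links cutoff can be illustrated with a toy bundler. This is a greedy stand-in written for illustration and is not the algorithm used by the bundlelinks tool: it assumes links to the same mouse chromosome are chained while consecutive links stay within a hypothetical max_gap at both ends. All names, parameters and data are invented.

```python
from itertools import groupby


def bundle_links(links, max_gap=1_000_000, min_links=2):
    """Greedy bundling of links given as (start1, end1, chr2, start2, end2).

    Illustration only, not the bundlelinks algorithm. Chains with fewer
    than min_links members are discarded.
    """
    chains = []
    by_target = sorted(links, key=lambda k: (k[2], k[0]))
    for _, group in groupby(by_target, key=lambda k: k[2]):
        chain = []
        for link in group:
            prev = chain[-1] if chain else None
            if prev and (link[0] - prev[1] > max_gap
                         or abs(link[3] - prev[3]) > max_gap):
                chains.append(chain)   # gap too large: close the chain
                chain = []
            chain.append(link)
        chains.append(chain)
    return [
        {"chr2": c[0][2], "n": len(c),
         "start1": min(k[0] for k in c), "end1": max(k[1] for k in c),
         "start2": min(k[3] for k in c), "end2": max(k[4] for k in c)}
        for c in chains if len(c) >= min_links
    ]


# Made-up links from hs1 to two mouse chromosomes.
links = [
    (1_000_000, 1_010_000, "mm4", 5_000_000, 5_010_000),
    (1_200_000, 1_210_000, "mm4", 5_300_000, 5_310_000),
    (1_500_000, 1_520_000, "mm4", 5_600_000, 5_620_000),
    (9_000_000, 9_010_000, "mm4", 80_000_000, 80_010_000),
    (2_000_000, 2_010_000, "mm11", 40_000_000, 40_010_000),
    (2_100_000, 2_110_000, "mm11", 40_200_000, 40_210_000),
]
for n in (2, 3, 4):
    print(n, len(bundle_links(links, min_links=n)))
```

Raising min_links from 2 to 3 to 4 reduces the number of bundles in this toy data set, which is the trend described above.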
LESSON 6
hs1 81000000 81999999 1.0000,0.0000,0.0000,1.0000,0.0000,0.0000,1.0000,0.0000,1.0000,0.0000,0.0000,0.0000,0.0000,0.0000,1.0000,2.0000,0.0000,0.0000,0.0000

sessions/4/data/histogram.hs.stacked.txt
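A line in this layout could be produced along the following lines. The sketch assumes that the 19 comma-separated values are link counts for mm1 to mm19 in that order, and that a link is assigned to the 1Mb bin containing its hs1 start position. Neither assumption is stated in the slides, and the function name and input data are hypothetical.

```python
from collections import Counter

MOUSE = [f"mm{i}" for i in range(1, 20)]  # assumed column order: mm1..mm19


def stacked_bins(links, bin_size=1_000_000):
    """Count links per mouse chromosome in each hs1 bin.

    links: iterable of (start1, chr2), where start1 is the hs1 start.
    Returns lines of the form: hs1 start end v1,...,v19
    """
    counts = {}
    for start1, chr2 in links:
        counts.setdefault(start1 // bin_size, Counter())[chr2] += 1
    lines = []
    for b in sorted(counts):
        values = ",".join(f"{counts[b][m]:.4f}" for m in MOUSE)
        lines.append(f"hs1 {b * bin_size} {(b + 1) * bin_size - 1} {values}")
    return lines


# Made-up links that all start in the 81-82 Mb bin of hs1.
example = [(81_100_000, "mm1"), (81_200_000, "mm4"),
           (81_900_000, "mm16"), (81_950_000, "mm16")]
print(stacked_bins(example)[0])
```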
Three density histograms summarize information about the synteny between human chromosome 1 and the mouse genome. The links in this figure are drawn as bundles, but the density histograms are calculated based on the individual links.

<plot>
  show       = yes
  type       = histogram
  min        = 0
  max        = 50e3
  #r0        = 0.9r
  #r1        = 0.98r
  r0         = 1r
  r1         = 1r+43p
  file       = ../data/histogram.mm.txt
  fill_under = yes
  fill_color = black
</plot>
<plot>
  show       = yes
  type       = histogram
  min        = 0
  max        = 50e3
  r0         = 0.9r
  r1         = 0.98r
  file       = ../data/histogram.hs.txt
  fill_under = yes
  fill_color = black
</plot>
<plot>
  show       = yes
  type       = histogram
  min        = 0
  max        = 25
  r0         = 1r
  r1         = 1r+43p
  file       = ../data/histogram.hs.stacked.txt
  # when using the normalized histogram, set max=1 and use this file instead
  #file      = ../data/histogram.hs.stacked.norm.txt
  fill_under = yes
  fill_color = chr1,chr2,chr3,chr4,chr5,chr6,chr7,chr8,chr9,chr10,chr11,chr12,chr13,chr14,chr15,chr16,chr17,chr18,chr19
</plot>

sessions/4/6/etc/circos.conf
The normalized histogram is further modified by sorting the contributions to each bin by value. The first contribution (bottom-most) is the largest and recapitulates the inner density histogram, which shows the same information but in absolute terms (and based on size of links, rather than number). The second contribution (second bottom-most) represents the next most similar chromosome, and so on.
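The two steps described here, normalizing a bin and then sorting its contributions, can be sketched as follows. The sketch assumes that normalization means scaling the values of a bin so that they sum to 1, which would be consistent with setting max=1 for the normalized histogram. The function name is hypothetical.

```python
def normalize_and_sort(values):
    """Scale one bin's values to sum to 1, then order them largest first."""
    total = sum(values)
    if total == 0:
        return [0.0] * len(values)
    return sorted((v / total for v in values), reverse=True)


# Made-up bin with four contributions.
print(normalize_and_sort([1, 0, 1, 2]))
```

After sorting, the first value of each bin is the share of the best-represented chromosome, so the bottom layer of the stack traces the dominant contribution along hs1.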
LESSON 7
A scatter plot is added to the figure to show average conservation within 1Mb bins on hs1. Rules are applied to the plot to color glyphs based on value.

<plot>
  type       = scatter
  file       = ../data/scatter.cons.txt
  min        = 0.39
  max        = 0.55
  r0         = 0.80r
  r1         = 0.90r
  glyph      = square
  glyph_size = 3
  fill_color = grey
  ...

sessions/4/7/etc/circos.conf
<rule>
  importance = 110
  condition  = 1
</rule>

sessions/4/7/etc/circos.conf
Glyph size is made proportional to the deviation of the data point (distance to average).

<rule>
  # 90% percentile
  importance = 100
  condition  = _VALUE_ >= 0.489
  fill_color = green
  flow       = continue
</rule>
<rule>
  # 10% percentile
  importance = 90
  condition  = _VALUE_ <= 0.416
  fill_color = red
  flow       = continue
</rule>
<rule>
  # within 1 std of mean
  importance = 80
  condition  = abs(_VALUE_ - 0.455) < 0.01
  fill_color = dgrey
  flow       = continue
</rule>
<rule>
  importance = 70
  condition  = 1
  glyph_size = eval( abs(_VALUE_ - 0.455)/0.005)
  fill_color = eval(_fill_color_ . "_a3")
  flow       = continue
</rule>

sessions/4/7/etc/circos.conf
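Because every rule sets flow = continue, each data point passes through all four rules. The combined effect on one value can be sketched in Python. The thresholds, the mean of 0.455 and the divisor 0.005 are taken from the rules above. The function itself is hypothetical, and the defaults (grey, size 3) are assumed to come from the scatter plot block.

```python
def style_glyph(value, mean=0.455, step=0.005):
    """Apply the four rules above in order of decreasing importance.

    All rules have flow = continue, so every rule is tested for each point.
    """
    fill_color, glyph_size = "grey", 3       # assumed plot-level defaults
    if value >= 0.489:                       # importance 100
        fill_color = "green"
    if value <= 0.416:                       # importance 90
        fill_color = "red"
    if abs(value - mean) < 0.01:             # importance 80
        fill_color = "dgrey"
    glyph_size = abs(value - mean) / step    # importance 70, condition = 1
    fill_color = fill_color + "_a3"
    return fill_color, glyph_size


for v in (0.40, 0.43, 0.455, 0.50):
    print(v, style_glyph(v))
```

A point at the mean gets a glyph of size 0, and the size grows by one unit for every 0.005 of deviation, so outliers in either direction are the most prominent.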
By mapping value onto glyph size and then placing all the glyphs at the same radial position (by changing data values), a glyph track is created. Stacking such glyph tracks can create very interesting (and attractive) visualizations.

<rule>
  importance = 60
  condition  = 1
  value      = 0.47
</rule>

sessions/4/7/etc/circos.conf
LESSON 8
Bundles are shaded in proportion to their size on hs1.

<rule>
  importance = 100
  condition  = 1
  color      = eval( "black_a" . int(max(1,6-_SIZE1_/5e6)) )
  #flow      = continue
</rule>
<rule>
  ...

sessions/4/8/etc/circos.conf
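The color expression maps bundle size onto a transparency index. The same arithmetic in Python, with a hypothetical function name and made-up sizes:

```python
def shade(size1: float) -> str:
    """Mirror of: "black_a" . int(max(1, 6 - SIZE1 / 5e6))."""
    return "black_a" + str(int(max(1, 6 - size1 / 5e6)))


for size in (2_000_000, 12_000_000, 30_000_000):
    print(size, shade(size))
```

A 2Mb bundle gives 6 - 0.4 = 5.6, truncated to 5, hence black_a5. A 12Mb bundle gives 3.6, hence black_a3. Any bundle larger than 20Mb falls below 2 and is truncated or clamped to black_a1.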
Rules help create three groups of links. Links on mm8 and mm11 are drawn on top, in order of link size, and colored by mouse chromosome color. Links on mm5 are drawn next, with a subtler red tint. All other links are drawn below and shaded in proportion to their size.

<rule>
  importance = 90
  condition  = _CHR2_ eq "mm8" || _CHR2_ eq "mm11"
  color      = eval("chr" . substr(_CHR2_,rindex(_CHR2_,"m")+1) . "_a1")
  z          = eval(_SIZE1_)
</rule>

sessions/4/8/etc/circos.conf
8/etc/circos.conf

<rule>
  importance = 80
  condition  = _CHR2_ eq "mm5"
  color      = eval("chr" . substr(_CHR2_,rindex(_CHR2_,"m")+1) . "_a3")
  z          = 10
</rule>

sessions/4/8/etc/circos.conf