Self Study: Yeast Genome Comparison, Session 4, Martin Krzywinski



SLIDE 1

GENOME VISUALIZATION WITH CIRCOS v20160503

SESSION 4

Self Study: Yeast Genome Comparison

MARTIN KRZYWINSKI

Genome Sciences Centre, BC Cancer Agency, Vancouver, Canada

EMBO PRACTICAL COURSE: BIOINFORMATICS GENOME ANALYSES

Izmir Biomedicine and Genome Center, Izmir, Turkey, May 2–14, 2016

SLIDE 2

SESSION SETUP

Use what you have learned to create an image using data from the previous day.

Input data is available in session/4/data.

Each lesson starts you off with a template configuration, 4/*/etc/circos.conf.

Follow the detailed handout (handouts/session-4.pdf) for this session to create the full configuration file. The instructions are also included in the template.

Answers are provided in 4/*.solution/. Try your best before referring to them!

2 GENOME VISUALIZATION WITH CIRCOS · Session 4 · Yeast Genome Comparison

SLIDE 3

SESSION IMAGES

SLIDE 4

LESSON 1

Yeast species comparison: drawing ideograms

SLIDE 5

IDEOGRAM LAYOUT

Generate the image shown here, which displays all three genomes: SACE (green), CAGL (orange) and ZYRO (blue).
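A minimal configuration sketch for a three-genome layout, offered as a starting point rather than the official solution. The karyotype file names are hypothetical placeholders for the files supplied with the session data. The regular-expression form of chromosomes_color is assumed to be supported by your Circos version. It also assumes ideogram names contain the genome prefix (e.g. cagl-l).

```conf
# Hypothetical file names; substitute the karyotype files from the session data.
karyotype = karyotype.sace.txt,karyotype.cagl.txt,karyotype.zyro.txt

# Color ideograms by genome (assumes regular expressions are accepted here).
chromosomes_color = /sace/=green,/cagl/=orange,/zyro/=blue

<ideogram>
<spacing>
default = 0.005r
</spacing>
radius    = 0.90r
thickness = 20p
fill      = yes
</ideogram>

<image>
<<include etc/image.conf>>
</image>

<<include etc/colors_fonts_patterns.conf>>
<<include etc/housekeeping.conf>>
```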

SLIDE 6

IDEOGRAM LAYOUT

Generate a version that shows only the CAGL genome.
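One way to restrict the display to a single genome is sketched below. It assumes all CAGL ideogram names contain the string cagl and that these lines sit at the root of the configuration file.

```conf
# Do not draw every ideogram defined in the karyotype files...
chromosomes_display_default = no
# ...only those whose name matches this regular expression.
chromosomes = /cagl/
```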

SLIDE 7

IDEOGRAM LAYOUT

Generate a version that shows only the cagl-l and cagl-m chromosomes, each occupying 1/2 of the image.
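A possible sketch of the two-ideogram layout, using relative scaling so that each ideogram takes up a fixed fraction of the circle. This is an illustration, not necessarily how the provided solution does it.

```conf
chromosomes_display_default = no
chromosomes       = cagl-l;cagl-m
# A relative scale of 0.5r asks for half of the circle per ideogram.
chromosomes_scale = cagl-l=0.5r,cagl-m=0.5r
```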

SLIDE 8

LESSON 2

Yeast duplication: interior links

SLIDE 9

GENOME DUPLICATIONS

Draw the ZYRO genome with blue ideograms.

SLIDE 10

GENOME DUPLICATIONS

Draw links from the file CIRCOS/DUPLICATION/link_zyro_zyro with thickness 1, colored black with transparency level 5. Use the record_limit parameter in the <link> block to load only a subset of links, which speeds up image generation during debugging, e.g. record_limit = 500.
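A sketch of the corresponding link block. The radius and bezier_radius values are assumptions (the slide does not specify them), and the file path is written as on the slide, so it may need adjusting relative to your configuration file.

```conf
<links>
<link>
file          = CIRCOS/DUPLICATION/link_zyro_zyro
# Assumed placement: link ends just inside the ideograms, curves through the center.
radius        = 0.98r
bezier_radius = 0r
thickness     = 1
# black at transparency level 5
color         = black_a5
# Load only the first 500 records while debugging; remove for the final image.
record_limit  = 500
</link>
</links>
```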

SLIDE 11

GENOME DUPLICATIONS

Add a rule that hides all links whose start coordinate span is smaller than 4 kb. You can access the size of the start coordinate using var(size1).
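A hedged sketch of such a rule, meant to be placed inside the <link> block. The threshold is written in bases (4000) to avoid relying on unit suffixes.

```conf
<rules>
<rule>
# size1 is the size of the link's start coordinate span, in bases.
condition = var(size1) < 4000
show      = no
</rule>
</rules>
```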

SLIDE 12

GENOME DUPLICATIONS

Add another rule that changes the color of links based on their size. Use the spectral-11-div palette and map the size range 4-6 kb onto color indices 1-11. Use the remap_int() function for this:

remap_int(x,min,max,range_min,range_max)

Make the color transparent (e.g. level 5). Set the z parameter in the rule so that larger links are drawn on top.
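One way this rule might look, as a sketch under the assumption that it follows the hiding rule inside the same <rules> block. The color name is assembled as palette name, index and transparency suffix.

```conf
<rule>
# Applies to every link still being tested at this point.
condition = 1
# Map 4000-6000 bases onto palette indices 1-11, then add the _a5 transparency suffix.
color     = eval(sprintf("spectral-11-div-%d_a5",remap_int(var(size1),4000,6000,1,11)))
# Elements with higher z are drawn later, so larger links end up on top.
z         = eval(var(size1))
</rule>
```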

SLIDE 13

LESSON 3

Yeast duplication: exterior links

SLIDE 14

FOCUS ON DUPLICATIONS

Draw an image of the cagl-k and cagl-m ideograms, each occupying 1/2 of the image. Make the ideograms grey. Set the ideogram radius to 0.5r. Set the ideogram label radius to 1.9r. Reverse the orientation of cagl-m.
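A sketch of the layout parameters this slide describes. Only the lines relevant to the exercise are shown; spacing, thickness and label font are assumed to come from the lesson template.

```conf
chromosomes_display_default = no
chromosomes         = cagl-k;cagl-m
chromosomes_scale   = cagl-k=0.5r,cagl-m=0.5r
chromosomes_reverse = cagl-m
chromosomes_color   = cagl-k=grey,cagl-m=grey

<ideogram>
# A small ideogram circle leaves room for links drawn outside it.
radius       = 0.5r
show_label   = yes
label_radius = 1.9r
</ideogram>
```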

SLIDE 15

FOCUS ON DUPLICATIONS

Draw duplications from CIRCOS/DUPLICATION/link_cagl_cagl as black links of thickness 2 with transparency level 5. Set radius to 1r. Set bezier_radius_purity to 0.50. Set crest to 0.5. Experiment with the last two parameters. What do they do?
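A possible link block for this step, as a sketch. The bezier_radius line is an assumption added so that the curve-shaping parameters have something to act on; the slide itself does not give a value.

```conf
<links>
<link>
file                 = CIRCOS/DUPLICATION/link_cagl_cagl
radius               = 1r
# Assumed starting value; not specified on the slide.
bezier_radius        = 0r
thickness            = 2
color                = black_a5
bezier_radius_purity = 0.50
crest                = 0.5
</link>
</links>
```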

SLIDE 16

FOCUS ON DUPLICATIONS

Create a rule that changes the bezier_radius for intrachromosomal links. Check this status in the rule condition using var(intrachr). Remap the absolute difference between start2 and start1 (min=0, max=1e6) onto the range (1.25,6). Use remap():

remap(x,min,max,range_min,range_max)

To continue processing the next rule even when this rule matches, set flow = continue.
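A hedged sketch of the intrachromosomal rule. The trailing r, which expresses the remapped value relative to the ideogram radius, is an assumption about the intended units. With values above 1r the curve bends outside the ideogram circle.

```conf
<rules>
<rule>
condition     = var(intrachr)
# Distance between the two link ends (0 to 1e6 bases) mapped onto 1.25-6.
bezier_radius = eval(sprintf("%.2fr",remap(abs(var(start2)-var(start1)),0,1e6,1.25,6)))
# Keep testing later rules even though this one matched.
flow          = continue
</rule>
</rules>
```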

SLIDE 17

FOCUS ON DUPLICATIONS

Add another rule that changes the color, thickness and z parameters of the link. Assign color based on the start1 position of the link: remap the start position (0,1e6) onto color index (1,11) and use the spectral-11-div palette. Assign thickness based on the size of the link start coordinate: map the range (1000,5000) onto thickness (1,3). Set the z parameter to the start1 position.
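A sketch of this second rule, assuming it sits after the bezier_radius rule in the same <rules> block. Using remap_int() for the thickness is one choice; remap() would give fractional thickness instead.

```conf
<rule>
condition = 1
# Start position 0-1e6 mapped onto palette indices 1-11.
color     = eval(sprintf("spectral-11-div-%d",remap_int(var(start1),0,1e6,1,11)))
# Start coordinate size 1000-5000 bases mapped onto thickness 1-3.
thickness = eval(remap_int(var(size1),1000,5000,1,3))
z         = eval(var(start1))
</rule>
```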

SLIDE 18

FOCUS ON DUPLICATIONS

Define a parameter genome in the root of the configuration file. You can access the value of this parameter using conf(genome) anywhere in the file. Wherever you referred to cagl directly, use conf(genome) instead. Change the parameter from cagl to sace to draw the corresponding chromosomes in sace. Now change the parameter to zyro. Did you see an error message? Try to figure out what it means. How would you fix the problem?
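A sketch of how the parameter might be threaded through the configuration. Only lines that previously named cagl are shown, and the remaining link parameters are assumed unchanged.

```conf
# Defined at the root of the file, outside any block.
genome = cagl

chromosomes         = conf(genome)-k;conf(genome)-m
chromosomes_scale   = conf(genome)-k=0.5r,conf(genome)-m=0.5r
chromosomes_reverse = conf(genome)-m
chromosomes_color   = conf(genome)-k=grey,conf(genome)-m=grey

<links>
<link>
file = CIRCOS/DUPLICATION/link_conf(genome)_conf(genome)
</link>
</links>
```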

SLIDE 19

LESSON 4

Yeast conservation

SLIDE 20

GENOME CONSERVATION

Create a script in data/CIRCOS/CONSERVATION that extracts the 250 largest links from each link_* file (use the size of the start coordinate) and collects them into the file links.top250.txt. Use a bash for loop:

for f in link_* ; do ... done

For the command, use awk to append the size of the start coordinate (end minus start) to each line, sort by this new field, use head to keep only the top part of the file, then remove the added field with cut. The answer is in data/CIRCOS/CONSERVATION/topN.

SLIDE 21

GENOME CONSERVATION

Draw the cagl-m, zyro-g and sace-f ideograms. Make each occupy 1/3 of the image. Draw the links from the links.top250.txt file you created.
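A sketch of the three-ideogram layout with the filtered links. It assumes the karyotype parameter already loads all three genomes. The link path assumes the file was written to data/CIRCOS/CONSERVATION, and the radius values are illustrative.

```conf
chromosomes_display_default = no
chromosomes       = cagl-m;zyro-g;sace-f
# Roughly one third of the circle for each ideogram.
chromosomes_scale = cagl-m=0.333r,zyro-g=0.333r,sace-f=0.333r

<links>
<link>
# Adjust this path relative to your configuration file.
file          = data/CIRCOS/CONSERVATION/links.top250.txt
radius        = 0.98r
bezier_radius = 0r
</link>
</links>
```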

SLIDE 22

GENOME CONSERVATION

Set up rules that change the color of each link depending on which genome it originates from. Use the from(RX) function in the rule condition to check whether the link starts on an ideogram that matches the regular expression RX. Make links from CAGL orange, from SACE green and from ZYRO blue. Set flow=continue globally for all rules. How does this help?
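A sketch of the three color rules. Passing a bare pattern to from() follows the slide's wording and is an assumption about the exact argument form. Placing flow = continue directly in the <rules> block is likewise assumed to be what "globally" means here.

```conf
<rules>
# Default for every rule in this block: keep testing later rules after a match.
flow = continue

<rule>
condition = from(cagl)
color     = orange
</rule>

<rule>
condition = from(sace)
color     = green
</rule>

<rule>
condition = from(zyro)
color     = blue
</rule>
</rules>
```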

SLIDE 23

GENOME CONSERVATION

Add a rule that changes the color to a transparent version by adding _a4 to the end of the color name.
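One possible form of the transparency rule, as a sketch. It relies on an earlier rule having already set the link color and on flow = continue letting this rule be reached.

```conf
<rule>
condition = 1
# var(color) is the color assigned so far, e.g. orange becomes orange_a4.
color     = eval(sprintf("%s_a4",var(color)))
</rule>
```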

SLIDE 24

GENOME CONSERVATION

Add a rule that colors red any link whose start and end coordinates are both larger than 5 kb. Use var(size1) and var(size2) to access the link coordinate sizes.
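A sketch of the size-based rule, with the threshold written in bases. Where it sits relative to the transparency rule is a design choice. If it comes after that rule, the red links stay opaque.

```conf
<rule>
# Both ends of the link must be larger than 5 kb.
condition = var(size1) > 5000 && var(size2) > 5000
color     = red
</rule>
```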
