Alper Sarikaya 1 , Michael Correll 2 , Jorge M. Dinis 1 , David H. - - PowerPoint PPT Presentation

Alper Sarikaya (1), Michael Correll (2), Jorge M. Dinis (1), David H. O’Connor (1,3), and Michael Gleicher (1)
(1) University of Wisconsin-Madison; (2) University of Washington; (3) Wisconsin National Primate Center


SLIDE 1

Alper Sarikaya¹, Michael Correll², Jorge M. Dinis¹, David H. O’Connor¹,³, and Michael Gleicher¹

¹ University of Wisconsin-Madison
² University of Washington
³ Wisconsin National Primate Center

http://graphics.cs.wisc.edu/Vis/CoocurViewer/ @yelperalp http://cs.wisc.edu/~sarikaya/

SLIDE 2

Biological Background
Displaying occurrence relationships (in biology)
MatrixViewer
CooccurViewer
Case Study, Future Work

SLIDE 3

RNA viruses are highly error-prone in replication
Viruses accumulate variation that helps their survival
Influenza, H1N1, and Zika are hard to eliminate

SLIDE 5

Discover where functional shifts are occurring

SLIDE 6

Identify ‘co-occurrences’ of mutations in the genome

SLIDE 7

Identify groups of like-behaving subpopulations

SLIDE 11

Identify pairs of positions where mutations co-occur
Analysis requires sifting through up to (# positions)² correlations
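
To get a feel for how quickly that search space grows, here is a rough back-of-the-envelope sketch. The genome lengths are illustrative assumptions, and `pair_counts` is a hypothetical helper, not part of the tools described in the talk.

```python
def pair_counts(num_positions):
    """Return (cells in the full pairwise matrix, distinct unordered pairs)."""
    full_matrix = num_positions ** 2
    distinct_pairs = num_positions * (num_positions - 1) // 2
    return full_matrix, distinct_pairs


# Illustrative genome lengths (assumed for this example only).
for n in (1_000, 10_000, 20_000):
    full, distinct = pair_counts(n)
    print(f"{n:>6} positions: {full:,} matrix cells, {distinct:,} distinct pairs")
```

Even when symmetry halves the work, the number of candidate pairs still grows quadratically with genome length.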

SLIDE 12

Biological Background
Displaying occurrence relationships (in biology)
MatrixViewer
CooccurViewer
Case Study, Future Work

SLIDE 21

Biological Background
Displaying occurrence relationships (in biology)
MatrixViewer
CooccurViewer
Case Study, Future Work

SLIDE 22

Collect counts of bases (A, C, T, G) for each pair of positions
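
A minimal sketch of this counting step, assuming reads arrive as already-aligned `(start, sequence)` tuples with no insertions or deletions. That input format and the function name `joint_base_counts` are assumptions made for illustration, not the authors' pipeline.

```python
from collections import Counter, defaultdict
from itertools import combinations


def joint_base_counts(reads):
    """Count which bases are seen together on the same read.

    reads: iterable of (start, sequence) tuples, where start is the
    0-based genome position of the first base (hypothetical format).
    Returns {(i, j): Counter({(base_i, base_j): n})} with i < j.
    """
    counts = defaultdict(Counter)
    for start, seq in reads:
        # Keep only unambiguous bases, tagged with their genome position.
        covered = [(start + k, b) for k, b in enumerate(seq) if b in "ACTG"]
        # A read contributes only to pairs of positions it spans.
        for (i, base_i), (j, base_j) in combinations(covered, 2):
            counts[(i, j)][(base_i, base_j)] += 1
    return counts


reads = [(0, "ACG"), (0, "ACG"), (0, "TCA"), (1, "CA")]
counts = joint_base_counts(reads)
print(dict(counts[(0, 2)]))
```

Only reads that span both positions contribute to a pair's counts, so the last read above adds nothing to the pair (0, 2).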

SLIDE 23

Compute co-occurrence strength between every pair of genomic positions
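
The slides do not spell out the metric, so the sketch below uses a stand-in: the difference between the variant frequency at position j among reads with a variant at position i and the variant frequency at j among reads matching the reference at i. Both the formula and the name `cooccurrence_strength` are assumptions for illustration and may differ from what the tools actually compute.

```python
def cooccurrence_strength(n_vv, n_vr, n_rv, n_rr):
    """Stand-in metric: P(variant at j | variant at i) - P(variant at j | reference at i).

    n_vv: reads with a variant at both positions
    n_vr: reads with a variant at i and the reference base at j
    n_rv: reads with the reference base at i and a variant at j
    n_rr: reads with the reference base at both positions
    Returns a value in [-1, 1], or None when either condition has no reads.
    """
    variant_i = n_vv + n_vr
    reference_i = n_rv + n_rr
    if variant_i == 0 or reference_i == 0:
        return None  # not enough evidence to compare the two conditions
    return n_vv / variant_i - n_rv / reference_i


print(cooccurrence_strength(10, 0, 0, 10))  # variants always appear together
print(cooccurrence_strength(5, 5, 5, 5))    # variants are independent
```

Values near 1 mean the two variants tend to appear on the same reads, values near -1 mean they tend to exclude each other, and values near 0 suggest no relationship.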

SLIDE 24

Color shows the co-occurrence strength
Show co-occurrences in full pairwise genomic space, in a web browser
Scale up to 20,000 x 20,000

Figure labels: Overview, Super-zoom, Key, Pairwise genomic space

SLIDE 25

Color shows the co-occurrence strength
Show co-occurrences in full pairwise genomic space, in a web browser
Scale up to 20,000 x 20,000

SLIDE 26

Too much data to sift through
Alignment errors produce false positives
Difficult to get an overview

SLIDE 27

Always present data in genomic sequence order
Display annotations alongside the genome
Scaffold to navigate the space of all pairwise correlations
Support identifying synonymy

SLIDE 28

Biological Background
Displaying occurrence relationships (in biology)
MatrixViewer
CooccurViewer
Case Study, Future Work

SLIDE 29

Coverage (read depth)
Variation (mutations)
Co-occurrence strength
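
The first two metrics can be sketched per position as follows. This assumes the same hypothetical `(start, sequence)` read format with a non-negative start, plus a reference string. Treating "variation" as the fraction of covering reads that differ from the reference is an assumption for illustration.

```python
def coverage_and_variation(reads, reference):
    """Return (depth, variation) lists with one entry per reference position.

    depth[p]     = number of reads covering position p
    variation[p] = fraction of those reads whose base differs from the reference
    """
    depth = [0] * len(reference)
    variants = [0] * len(reference)
    for start, seq in reads:
        for k, base in enumerate(seq):
            pos = start + k
            if pos >= len(reference):
                break  # ignore bases that run past the end of the reference
            depth[pos] += 1
            if base != reference[pos]:
                variants[pos] += 1
    # Positions with no coverage get a variation of 0.0 rather than dividing by zero.
    variation = [v / d if d else 0.0 for v, d in zip(variants, depth)]
    return depth, variation


depth, variation = coverage_and_variation(
    [(0, "ACG"), (0, "TCG"), (1, "CA")], reference="ACG"
)
print(depth, variation)
```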

SLIDE 30

http://graphics.cs.wisc.edu/Vis/CooccurViewer

SLIDE 31

User-controlled metrics

SLIDE 32

Annotations

Positions with significant co-occurrences
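
One way such a list of positions could be derived is by applying analyst-chosen thresholds to per-pair metrics. In the sketch below, the data layout, the threshold names, and the function `significant_positions` are all hypothetical. It illustrates threshold-based filtering in general, not the viewer's actual logic.

```python
def significant_positions(pair_metrics, min_depth, min_variation, min_strength):
    """Return sorted positions that take part in at least one pair passing all thresholds.

    pair_metrics: {(i, j): (depth, variation, strength)} (hypothetical layout)
    """
    keep = set()
    for (i, j), (depth, variation, strength) in pair_metrics.items():
        if (
            depth >= min_depth
            and variation >= min_variation
            and abs(strength) >= min_strength  # strong in either direction
        ):
            keep.update((i, j))
    return sorted(keep)


# Made-up example values for illustration only.
pair_metrics = {
    (10, 42): (500, 0.20, 0.90),   # passes every threshold
    (10, 77): (500, 0.20, 0.05),   # co-occurrence too weak
    (33, 42): (12, 0.30, -0.95),   # too few reads
    (5, 90): (800, 0.15, -0.80),   # strong negative co-occurrence
}
print(significant_positions(pair_metrics, min_depth=100, min_variation=0.1, min_strength=0.5))
```

Raising or lowering the thresholds changes how many positions remain to inspect.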

SLIDE 33

Pairwise co-occurrences with a particular position

SLIDE 34

Reads that do not overlap with the paired position

SLIDE 37

Biological Background
Displaying occurrence relationships (in biology)
MatrixViewer
CooccurViewer
Case Study, Future Work

SLIDE 38

Sample of SIV: simian equivalent of HIV
Large cluster of correlated mutations in the Nef protein to evade T cell recognition
Nearly no co-occurrences in structural proteins Gag & Pol

SLIDE 39

Use analyst-controlled metrics to focus exploration
Displaying the full space does not necessarily empower analysts
Providing usable context and scaffolding

SLIDE 40

Support comparison between multiple samples, and multi-step co-occurrence
Data aggregation and filtering techniques to support larger data sizes
Application to other event-driven sequences

SLIDE 41

Funding from the NIH and NSF
Feedback from colleagues, virologists, and reviewers
Code and working demo available online!

@yelperalp http://cs.wisc.edu/~sarikaya/