Comp/Phys/Mtsc 715 Bioinformatics Visualization 4/12/2012


Bioinformatics Visualization

Comp/Phys/Mtsc 715

4/12/2012 Bioinformatics Comp/Phys/Mtsc 715 Taylor

Example Videos

  • Vis 2005, Bertram

– Visualizing sound wavefront propagation

  • Vis 2005, Cantarel (tighten.mov)

– Visualizing self contact in tightening knots


Administrative

  • Presentations next week

– Brief data and goal intro
– Describe ideal design

  • What perceptual characteristics help user do task?
  • Why parameters chosen (color map, viewpoint)?
  • Consider second-best approach

– Describe implementation if any (and demo)
– Evaluation plan


Introduction

  • Bioinformatics

– Applying CS algorithms to biological problems

  • Examples

– Protein folding – Gene mapping

  • Gigantic data sets


What's in this lecture

  • IEEE InfoVis special issue on Bioinformatics Visualization

– 2005, volume 4, no. 3

  • Visualization of:

– Microarray data (***)
– Gene sequences
– Taxonomies
– Biological pathways


Microarray Data

  • Warning: IANAB

– I am not a biologist

  • Array of probes (e.g. bits of genes)
  • Measure expression level of probes in a sample.

– relative or absolute

  • youtube1,youtube2


Microarray Data + Score

  • Gehlenborg et al.
  • Default red-black-green map for expression vs. condition.
  • Blue channel for relevance/score

– Uncertainty vis-ish.

  • Height by gene score.
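
The channel assignment above can be sketched in a few lines. This is a minimal illustration, not Gehlenborg et al.'s implementation: the function name, the clamp range `max_abs`, and the choice of red for over-expression and green for under-expression are all assumptions made for the example.

```python
def expression_to_rgb(log_ratio, score, max_abs=2.0):
    """Map a log expression ratio and a score in [0, 1] to an (r, g, b) tuple.

    Assumed convention: red = over-expressed, green = under-expressed,
    black = unchanged. The blue channel carries the relevance score.
    """
    # Clamp the log ratio to [-max_abs, max_abs] and normalize to [-1, 1].
    t = max(-max_abs, min(max_abs, log_ratio)) / max_abs
    red = t if t > 0 else 0.0
    green = -t if t < 0 else 0.0
    # Clamp the score to [0, 1] and place it in the blue channel.
    blue = max(0.0, min(1.0, score))
    return (red, green, blue)


if __name__ == "__main__":
    for ratio, score in [(0.0, 0.0), (2.0, 1.0), (-1.0, 0.5)]:
        print(ratio, score, expression_to_rgb(ratio, score))
```

Because blue is independent of the red/green axis, a high score tints a cell toward magenta or cyan without changing which direction the expression moved.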


A) Extra cols.
B) Overview, color coding for categorization.
C) PC plots
D) Height scaling


Log scaling

  • Most visualizations of microarray data are log-scaled

– Changes in expression level are smaller for smaller values
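
A small worked example of why the log scale helps, assuming expression is compared as a ratio against a reference (the function name and the sample numbers are illustrative only): on a log2 scale a doubling covers the same distance whether it happens at a low or a high expression level.

```python
import math


def log2_ratio(sample, reference):
    """Return the log2 fold change of a sample relative to a reference."""
    return math.log2(sample / reference)


if __name__ == "__main__":
    # Absolute changes of 10 and 1000 are both a doubling.
    print(log2_ratio(20, 10), log2_ratio(2000, 1000))
    # Halving is the mirror image of doubling.
    print(log2_ratio(50, 100))
```

On a linear axis the 10-to-20 change would be invisible next to the 1000-to-2000 change; after the transform both plot at the same position.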


Animated Scatter Plots(1)

  • Parallel Coordinates at one time


Animated Scatter Plots(2)

2) Pick a time interval; scatter plot X and Y are derived from it


Animated Scatter Plots(3)

3) Compute derivative scatter plot


Animated Scatter Plots(4)

4) Animate (move interval)
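
Steps 2 to 4 can be sketched as follows. The slides do not say exactly how X and Y are derived, so this sketch assumes X is the mean expression over the interval and Y is the change across it (last value minus first); the function names and the dictionary layout are hypothetical.

```python
def interval_points(series, start, width):
    """Return {gene: (x, y)} for one time interval.

    Assumption for this sketch: x = mean value in the interval,
    y = change across the interval (a discrete derivative).
    """
    points = {}
    for gene, values in series.items():
        window = values[start:start + width]
        x = sum(window) / len(window)
        y = window[-1] - window[0]
        points[gene] = (x, y)
    return points


def animate(series, width):
    """Slide the interval one time step at a time; one scatter plot per frame."""
    n_times = len(next(iter(series.values())))
    return [interval_points(series, s, width)
            for s in range(n_times - width + 1)]


if __name__ == "__main__":
    data = {"g1": [1.0, 2.0, 4.0, 4.0], "g2": [3.0, 3.0, 1.0, 0.0]}
    for frame in animate(data, width=2):
        print(frame)
```

Each frame is a full scatter plot, so genes that rise or fall together move together across frames.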


Animated Scatter Plots(5)


Hierarchical Cluster Explorer

  • Seo et al.

– Find genes that have similar function


HCE: minimum similarity slider


HCE: linked scatter plot


HCE: Detail Cutoff Bar

  • How to deal with too much detail?

– Merge clusters below a size threshold
– Represent w/ average color
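
The merge rule can be sketched on a toy cluster tree. This is an illustration under stated assumptions, not HCE's code: the tree is a nested tuple of leaf intensities, `min_size` stands in for the cutoff, and "average color" is reduced to a mean intensity.

```python
def leaves(tree):
    """Return the leaf values of a binary tree given as nested 2-tuples."""
    if isinstance(tree, tuple):
        return leaves(tree[0]) + leaves(tree[1])
    return [tree]


def collapse(tree, min_size):
    """Return left-to-right blocks of (leaf_count, mean_value).

    Any subtree with fewer than min_size leaves is merged into one block
    drawn with the average of its leaves; larger subtrees are expanded.
    """
    values = leaves(tree)
    if not isinstance(tree, tuple) or len(values) < min_size:
        return [(len(values), sum(values) / len(values))]
    return collapse(tree[0], min_size) + collapse(tree[1], min_size)


if __name__ == "__main__":
    toy = (((1.0, 3.0), 5.0), (2.0, 4.0))
    print(collapse(toy, 3))
```

Raising `min_size` trades detail for a shorter display: fewer blocks are drawn, each summarizing more leaves.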


HCE: algorithm comparison


aCGH Visualization

  • Array Comparative Genomic Hybridization
  • Genome-wide, high resolution copy numbers
  • Copy number variation:

– Segment of DNA with different numbers of copies between genomes.
– Within patient (two halves of diploid)
– Between patient (tumor vs. non-tumor)


Visualizing an entire genome

Genome Gene Chromosome Probe


Chromosome View (1)

  • Thin = centromeres, variables, cytobands, other
  • White = 0-1SD
  • Light Gray = 1-2SD
  • Dark Gray = 2-3SD
  • Dots = samples

– x=scaled ratio

  • Line = windowed moving average
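
A minimal sketch of a windowed moving average over probe ratios ordered along a chromosome. The slide does not give the window shape or edge handling, so a centered window that shrinks at the ends is assumed here, and the function name is hypothetical.

```python
def moving_average(values, window):
    """Centered moving average; the window is truncated at both ends."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


if __name__ == "__main__":
    ratios = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(moving_average(ratios, 3))
```

A wider window gives a smoother line but blurs short aberrations, so the window size is a parameter worth exposing to the user.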


Chromosome View (2)

  • Light blue bars are Z scores

– # SDs from mean
– ~ # outliers / inliers
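
As a reminder of the definition, a Z score is the distance from the mean measured in standard deviations. The sketch below uses the population standard deviation, which is an assumption; the slides do not say which estimator the tool uses, and the sample values are made up.

```python
import statistics


def z_scores(values):
    """Return (value - mean) / SD for each value (population SD assumed)."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


if __name__ == "__main__":
    sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
    print(z_scores(sample))
```

The same quantity underlies SD bands such as 0-1SD, 1-2SD, and 2-3SD: a band is a range of absolute Z score.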


Aberration Map

  • 17 breast cancer cell lines


How do we know if they work?

  • Discussion


Insight User Study

  • Count # of “insights” made by users
  • Insight:

– “an individual observation about the data by the participant, a unit of discovery”

  • Characteristics:

– Time, domain value, hypotheses, expectedness, correctness, breadth, category

  • Quantification via expert


Experimental Setup

  • 5 Tools

– Clusterview, TimeSearcher, HCE, Spotfire, GeneSpring

  • 3 Microarray Data sets

– Timeseries data set: five time-points
– Virus data set (Categorical): three viral strains
– Lupus data set (Multicategorical): 42 healthy, 48 patients

  • Participants only used tools they hadn't seen before.


ClusterView


TimeSearcher


(H)ierarchical (C)luster (E)xplorer


GeneSpring


SpotFire


Results


Learning Curves


Anecdotal Results

  • Winner was specific to data set

– Clusterview: Lupus
– TimeSearcher: time series
– HCE: viral
– SpotFire decent for all

  • Specific/free vs. general/commercial

– General == no biological context
– Tying in literature search is good

  • Poor usability can break good visualization
  • Motivation!

– People learn faster if they care.


Where to go from here

  • Lit search +++
  • Standardization
  • High throughput data

– Microarray data needs pathway data for context

  • Focus+context


Other topics

  • Biological pathway visualization
  • Sequence visualization
  • Taxonomy visualization


Biological Pathways

  • Networks of complex reactions at the molecular level in living cells


Survey of Popular Techniques

  • Saraiya et al.
  • Requirements analysis
  • Anecdotal system evaluations
  • Research agenda (future work)


General Goals

  • recognition of changes between experiment vs control or between time points
  • detection of changes in relationship between components of a pathway or between entire pathways
  • identification of global patterns across a pathway
  • mapping pathway state to phenotype (observable effects at the physical level in living organisms) or other biological information


Detailed Requirements

  • Construct and update
  • Context
  • Uncertainty
  • Collaboration
  • Pathway node and edge info.

  • Source
  • Spatial information
  • Temporal information
  • High-throughput data
  • Overview
  • Interconnectivity
  • Multi-scale
  • Notebook


BioCarta


GenMAPP

  • Building pathways

– Easy to use

  • But nobody wants to
  • Statistical pathway comparison for different treatments

– microarray data

  • Animated node color

– Different treatments


Cytoscape

  • Microarray + pathway data
  • Customizable everything

  • CS-centric

– Generic network vis

  • UI complaints


GScope

  • Fish-eye lens

– confusing

  • Heat-map microarray table icons
  • Distortions made condition comparison hard


PathwayAssist: Literature Search

  • Manual pathway building
  • Automatic pathway building

– NLP over PubMed or ResNet

– Requires curation

  • Scientific refs.


Patika

  • Small database

– Geared toward cells

  • Regions make biological sense

– Nucleus, cytoplasm, etc.


GeneSpring


Conclusions

  • Not enough domain-specific info access

– important for construction (NLP)

  • Context in visualization

– cell structures, molecular state

  • No standardization
  • Better microarray incorporation


Near-optimal Protein Alignment

  • Smoot et al.

– animate relationships between two proteins


Path Graphs


All Together Now


Taxonomy Visualization

  • Graham and Kennedy - Synonymy, Structural Markers


Revisions


Comparing Selections


Selection % Area Slider

