Practical Bioinformatics, Mark Voorhies, 5/26/2015

SLIDE 1

Practical Bioinformatics

Mark Voorhies 5/26/2015

Mark Voorhies Practical Bioinformatics

SLIDE 2

"Habits are things you get for free, without requiring any special work."
Cory Doctorow, Advice to Writers, 4/5/2012

SLIDE 3

Why compare sequences?

SLIDE 4

EM: Training an HMM

SLIDE 5

EM: Estimating transcript abundances

[Figure: workflow of the online EM algorithm.
 Input: capture target sequences; fragment and sequence; align to target references.
 Online EM algorithm: get next read pair; calculate assignment probabilities; update masses; update parameters (error probabilities, bias, targets).
 Constrain estimated counts.
 Output: relative abundances; estimated counts; effective counts; augmented alignment file.]

Roberts and Pachter, Nature Methods 10:71
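
The idea of EM for transcript abundances can be sketched in a few lines. This is a plain batch EM, not the online algorithm of Roberts and Pachter, and it assumes equal transcript lengths and no error or bias model. The function name `em_abundances` and the input layout (one list of compatible transcript indices per read) are illustrative choices, not taken from the slides.

```python
def em_abundances(compat, n_transcripts, n_iter=200):
    """Estimate relative transcript abundances by batch EM.

    compat: one list per read, holding the indices of the transcripts
            that read aligns to (hypothetical input layout).
    Assumes equal transcript lengths and ignores sequencing errors and bias.
    """
    theta = [1.0 / n_transcripts] * n_transcripts  # uniform starting guess
    for _ in range(n_iter):
        counts = [0.0] * n_transcripts
        # E-step: split each read among its compatible transcripts
        # in proportion to the current abundance estimates.
        for hits in compat:
            total = sum(theta[t] for t in hits)
            for t in hits:
                counts[t] += theta[t] / total
        # M-step: new abundances are the normalized expected counts.
        n_assigned = sum(counts)
        theta = [c / n_assigned for c in counts]
    return theta


# 6 reads unique to transcript 0, 2 unique to transcript 1, 4 ambiguous.
reads = [[0]] * 6 + [[1]] * 2 + [[0, 1]] * 4
print(em_abundances(reads, 2))
```

With these reads the estimates converge to 0.75 and 0.25: the ambiguous reads end up split 3:1, matching the ratio of the uniquely assigned reads.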

SLIDE 6

Evolution implies a self-consistent model

Distances (pairwise relationships)
Topology (evolutionary history)

SLIDE 7

Measure all pairwise distances by dynamic programming

SLIDE 8

Measure all pairwise distances by dynamic programming
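
A minimal sketch of this step, assuming unit mismatch and gap costs (the slides do not give a scoring scheme). It uses the global-alignment recurrence in its cost-minimizing form, which with these costs is the edit distance. The names `alignment_cost` and `pairwise_distances` are illustrative.

```python
from itertools import combinations


def alignment_cost(a, b, mismatch=1, gap=1):
    """Minimum global alignment cost between a and b by dynamic programming."""
    # prev[j] holds the cost of aligning the prefix of a seen so far with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]  # aligning a[:i] against an empty prefix of b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (0 if a[i - 1] == b[j - 1] else mismatch)
            cur.append(min(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def pairwise_distances(seqs):
    """Return {(name_x, name_y): cost} for every unordered pair of sequences."""
    return {
        (x, y): alignment_cost(seqs[x], seqs[y])
        for x, y in combinations(sorted(seqs), 2)
    }


seqs = {"s1": "ACGT", "s2": "AGT", "s3": "ACGGT"}
print(pairwise_distances(seqs))
```

Only two rows of the table are kept because the distance needs the final cell, not the traceback.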

SLIDE 9

Generate a guide tree by UPGMA

SLIDE 10

Generate a guide tree by UPGMA

SLIDE 11

Generate a guide tree by UPGMA

SLIDE 12

Generate a guide tree by UPGMA

SLIDE 13

Generate a guide tree by UPGMA
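
The UPGMA steps can be sketched as below: repeatedly join the two closest clusters and set the distance from the merged cluster to every other cluster to the size-weighted average. The function name `upgma`, the matrix input, and the nested-tuple output are assumptions made for illustration; branch lengths are omitted.

```python
def upgma(names, D):
    """Build a guide tree by UPGMA.

    names: leaf labels; D: symmetric distance matrix (list of lists).
    Returns the tree as nested tuples of leaf labels.
    """
    trees = {i: name for i, name in enumerate(names)}  # cluster id -> subtree
    sizes = {i: 1 for i in trees}                      # cluster id -> leaf count
    d = {(i, j): D[i][j] for i in trees for j in trees if i < j}
    next_id = len(names)
    while len(trees) > 1:
        i, j = min(d, key=d.get)  # closest pair of clusters
        # Distance from the merged cluster to each remaining cluster:
        # average of the two old distances, weighted by cluster size.
        for k in trees:
            if k not in (i, j):
                dik = d[min(i, k), max(i, k)]
                djk = d[min(j, k), max(j, k)]
                d[k, next_id] = (sizes[i] * dik + sizes[j] * djk) / (sizes[i] + sizes[j])
        # Drop every distance that involves one of the merged clusters.
        d = {pair: v for pair, v in d.items() if i not in pair and j not in pair}
        trees[next_id] = (trees.pop(i), trees.pop(j))
        sizes[next_id] = sizes.pop(i) + sizes.pop(j)
        next_id += 1
    return next(iter(trees.values()))


D = [
    [0, 2, 6, 10],
    [2, 0, 6, 10],
    [6, 6, 0, 10],
    [10, 10, 10, 0],
]
print(upgma(["A", "B", "C", "D"], D))  # ('D', ('C', ('A', 'B')))
```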

SLIDE 14

Progressive alignment following the guide tree

SLIDE 15

Progressive alignment following the guide tree

SLIDE 16

Progressive alignment following the guide tree

SLIDE 17

Measure distances directly from the alignment
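
One simple way to read a distance off an alignment is the p-distance: the fraction of mismatched columns among the columns where neither sequence has a gap. The slides do not say which distance is used, so this is only an illustrative choice, and the name `p_distance` is hypothetical.

```python
def p_distance(row_a, row_b, gap="-"):
    """Fraction of mismatches over the columns where both rows are ungapped."""
    pairs = [(x, y) for x, y in zip(row_a, row_b) if x != gap and y != gap]
    if not pairs:
        raise ValueError("the two rows share no ungapped columns")
    return sum(x != y for x, y in pairs) / len(pairs)


# Five ungapped columns, one mismatch (G vs C).
print(p_distance("ACGT-A", "ACCTTA"))  # 0.2
```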

SLIDE 18

Generate neighbor-joining tree from new distances

SLIDE 19

Generate neighbor-joining tree from new distances

SLIDE 20

Generate neighbor-joining tree from new distances
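
A compact sketch of neighbor joining, assuming a symmetric distance matrix as input. The function name, the edge-list output, and the internal node labels `n0`, `n1`, ... are illustrative (leaf names are assumed not to collide with them). At each step the pair minimizing Q(a, b) = (n - 2) d(a, b) - r(a) - r(b) is joined, where r is the row sum of the current distances.

```python
def neighbor_joining(names, D):
    """Return an unrooted tree as a list of (node, node, branch_length) edges."""
    nodes = list(names)
    d = {(a, b): D[i][j]
         for i, a in enumerate(names) for j, b in enumerate(names) if i != j}
    edges = []
    k = 0  # counter for internal node labels
    while len(nodes) > 2:
        n = len(nodes)
        r = {a: sum(d[a, b] for b in nodes if b != a) for a in nodes}
        # Pick the pair with the smallest Q value.
        a, b = min(
            ((x, y) for i, x in enumerate(nodes) for y in nodes[i + 1:]),
            key=lambda p: (n - 2) * d[p] - r[p[0]] - r[p[1]],
        )
        u = f"n{k}"
        k += 1
        # Branch lengths from the joined pair to the new internal node.
        len_a = 0.5 * d[a, b] + (r[a] - r[b]) / (2 * (n - 2))
        edges.append((a, u, len_a))
        edges.append((b, u, d[a, b] - len_a))
        # Distances from the new node to every remaining node.
        for c in nodes:
            if c not in (a, b):
                d[u, c] = d[c, u] = 0.5 * (d[a, c] + d[b, c] - d[a, b])
        nodes = [c for c in nodes if c not in (a, b)] + [u]
    a, b = nodes
    edges.append((a, b, d[a, b]))  # connect the last two nodes
    return edges


names = ["a", "b", "c", "d", "e"]
D = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
for edge in neighbor_joining(names, D):
    print(edge)
```

The example matrix is additive, so the path lengths in the resulting tree reproduce the input distances exactly; with real data the tree is only an approximation.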

SLIDE 21

Generate bootstrap values from subsets of the alignment
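
A sketch of the resampling step, assuming the usual bootstrap: draw alignment columns with replacement until the replicate is as long as the original. The name `bootstrap_alignment` and the dict-of-rows layout are illustrative. A bootstrap value for a clade would then be the fraction of replicate trees that contain it; tree building is left out here.

```python
import random


def bootstrap_alignment(rows, rng=random):
    """Resample alignment columns with replacement.

    rows: dict mapping sequence name -> aligned string (all the same length).
    rng:  source of randomness; pass random.Random(seed) for repeatable runs.
    """
    length = len(next(iter(rows.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    # Every row uses the same sampled columns, so columns stay intact.
    return {name: "".join(seq[c] for c in cols) for name, seq in rows.items()}


rows = {"s1": "ACGT", "s2": "AGGT", "s3": "TCGA"}
replicate = bootstrap_alignment(rows, random.Random(0))
print(replicate)
```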

SLIDE 22

Generating a multiple alignment in CLUSTALX

SLIDE 23

Generating a multiple alignment in CLUSTALX

SLIDE 24

Generating a neighbor joining tree in CLUSTALX

SLIDE 25

Viewing the alignment and tree in JALVIEW

SLIDE 26

Related tools

Protein Multiple Alignment

  • MUSCLE
  • Clustal Omega
  • ProbCons
  • hmmalign (HMMER3)

Tree Building

  • MrBayes (Bayesian MCMC)
  • PhyML (maximum likelihood)
  • RAxML (fast maximum likelihood)
  • FastTree 2 (very large heuristic trees)

SLIDE 27

Homework

Finish your dynamic programming implementation.
