Multiple Sequence Alignment Sequences > Yeast YOR020c > - - PDF document


SLIDE 1

Multiple Sequence Alignment

Sequences

> Yeast YOR020c

mstllksaksivplmdrvlvqrikaqaktasglylpe knveklnqaevvavgpgftdangnkvvpqvkvgdqvl ipqfggstiklgnddevilfrdaeilakiakd

> Neurospora crassa

mattvrsvksliplldrvlvqrvkaeaktasgiflpe ssvkdlneakvlavgpgaldkdgkrlpmgvnagdrvl ipqyggspvkvgeeeytlfrdseilakiae

> Aspergillus nidulans

msllrnvknlaplldrvlvqrvkpeaktasgiflpes svkeqneakvlavgpgavdrngqripmgvaagdrvlv pqfggsplkigeeeyhlfrdseilakine

> Schizosaccharomyces pombe (fission yeast)

matklksaksivplldrilvqrikadtktasgiflpe ksveklsegrvisvgkggynkegklaqpsvavgdrvl lpayggsnikvgeeeyslyrdhellaiike

> Mortierella alpina

masritkfsktivpmmdrvlvqrikpqqktasgiyip ekaqealnegyvvavgkglttqegkvvpselaegdkv llppyggsvvkvdneelilfreseilakiq

> Crypthecodinium cohnii

matgiakrftplldrvlvqrlkpeaktasglflpesa akapnyatvlavgpggrtrdgdilpmnvkvgdkvvvp eyggmtlkfedeefqvfrdadimgilne

> Drosophila melanogaster

maaaikkiipmldriliqraealtktkggivlpekav gkvlegtvlavgpgtrnastgnhipigvkegdrvllp efggtkvnlegdqkelflfresdilakle

> Homo sapiens

agqafrkflplfdrvlversaaetvtkggimlpeksq gkvlqatvvavgsgskgkggeiqpvsvkvgdkvllpe yggtkvvlddkdyflfrdgxilgky

> Geobacillus stearothermophilus

vlkplgdrvvievieteektasgivlpdtakekpqeg rvvavgkgrvldsgervapevevgdriifskyagtev kydgkeylilresdilavig

> Mycobacterium tuberculosis

makvnikpledkilvqaneaetttasglvipdtakek pqegtvvavgpgrwdedgekripldvaegdtviysky ggteikyngeeylilsardvlavvsk

> Mus musculus (house mouse)

magqafrkflllfdrvlversaaetvtkggimlpeks qgkvlqatvvavgsggkgksgeiepvsvkvgdkvllp eyggtkvvlddkdyflfrdsdilgkyvn

SLIDE 2

Multiple Sequence Alignment (MSA)

Why MSA?

  • Proteins are often related to a larger group (i.e., a family) of proteins
  • Multiple sequence alignment is more sensitive than pairwise alignment for detecting homologs
  • MSAs can elucidate conserved residues, motifs, or other functional regions in a protein
  • MSA is critical for phylogenetic analysis
      – Selection of sequences
      – Multiple sequence alignment of sequences
      – Tree building
      – Tree evaluation
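One way to see how an MSA exposes conserved residues is to scan the alignment column by column. The sketch below is a minimal illustration; the function name `conserved_columns` and the toy alignment are assumptions for the example and do not come from the slides.

```python
def conserved_columns(alignment):
    """Return 0-based indices of columns where every row has the same residue."""
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("aligned sequences must all have the same length")
    conserved = []
    for i in range(length):
        column = {row[i] for row in alignment}
        # A column counts as conserved only if it holds one residue and no gap.
        if len(column) == 1 and "-" not in column:
            conserved.append(i)
    return conserved


# Hypothetical toy alignment (made up for illustration, not from the slides).
toy_alignment = [
    "MKT-LLD",
    "MRTALLD",
    "MKTALID",
]
print(conserved_columns(toy_alignment))  # [0, 2, 4, 6]
```

Real alignments are usually scored with a substitution matrix rather than strict identity, so this strict test is only the simplest possible notion of conservation.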

SLIDE 3

Pairwise Alignment

[Figure: pairwise alignment grid]

3-sequence Alignment

[Figure: alignment of the three sequences AGA, AGT, and TCC]
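As a rough sketch of what a pairwise alignment computes, the code below scores a global alignment with the classic dynamic-programming recurrence (Needleman-Wunsch). The scoring values (match +1, mismatch -1, gap -1) and the function name are assumptions chosen for illustration; the slide does not state a scoring scheme.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best global alignment score of strings a and b (assumed linear gap cost)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against nothing costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # a[i-1] against a gap
                score[i][j - 1] + gap,       # b[j-1] against a gap
            )
    return score[-1][-1]


print(global_alignment_score("AGA", "AGT"))  # 1
print(global_alignment_score("AGA", "TCC"))  # -3
```

Aligning three sequences at once uses the same idea on a three-dimensional grid, where each cell has seven predecessors instead of three. That growth in cost is one reason practical MSA tools align progressively along a guide tree.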

SLIDE 4

Sequences

[The eleven sequences from Slide 1, Yeast YOR020c through Mus musculus (house mouse), repeated]

Multiple Sequence Alignment

SLIDE 5

Pairwise Alignment Scores

                      Neu  Asp  Sch  Mor  Cry  Dro  Hom  Geo  Myc  Mus
Yeast                  49   46   78   45   55   54   44   38   37   42
Neurospora                  52   41   40   43   46   44   41   39   43
Aspergillus                      43   48   45   45   40   40   38   39
Schizosaccharomyces                   42   53   55   41   41   40   40
Mortierella                                43   46   40   43   38   39
Crypthecodinium                                 61   43   34   36   45
Drosophila                                           49   42   36   49
Homo                                                      37   32   93
Geobacillus                                                    59   38
Mycobacterium                                                       32

(Column abbreviations follow the row labels: Neu = Neurospora, Asp = Aspergillus, Sch = Schizosaccharomyces, Mor = Mortierella, Cry = Crypthecodinium, Dro = Drosophila, Hom = Homo, Geo = Geobacillus, Myc = Mycobacterium.)

Guide Tree

[Figure: guide tree; leaf order: Neurospora, Aspergillus, Yeast, Schizosaccharomyces, Crypthecodinium, Drosophila, Geobacillus, Mycobacterium, Mortierella, Homo, Mus]

SLIDE 6

Constructing a Guide Tree

  • Unweighted pair group method with arithmetic mean (UPGMA)
  • Neighbor joining (NJ)

Unweighted Pair Group Method with Arithmetic mean (UPGMA)

  • Assume each organism is its own group
  • Repeat the following step
      – Merge together the two closest groups
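The UPGMA loop described on this slide can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the deck's own implementation: the function name `upgma`, the nested-tuple tree format, and the four-taxon distance table are all hypothetical. It works on distances (smaller means closer), so similarity scores such as those on the previous slide would first have to be converted to distances.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster labels bottom-up and return the tree as nested 2-tuples.

    dist maps frozenset({x, y}) to the distance between labels x and y.
    """
    sizes = {name: 1 for name in names}  # every organism starts as its own group
    d = dict(dist)
    while len(sizes) > 1:
        # Pick the two closest groups.
        a, b = min(combinations(sizes, 2), key=lambda pair: d[frozenset(pair)])
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        merged = (a, b)
        # Distance to the merged group: size-weighted mean of the two old
        # distances, which equals the arithmetic mean over all member pairs.
        for other in sizes:
            d[frozenset((merged, other))] = (
                size_a * d[frozenset((a, other))]
                + size_b * d[frozenset((b, other))]
            ) / (size_a + size_b)
        sizes[merged] = size_a + size_b
    return next(iter(sizes))


# Hypothetical distances among four made-up taxa (not the slide's scores).
toy = {
    frozenset(("A", "B")): 2, frozenset(("A", "C")): 5,
    frozenset(("B", "C")): 7, frozenset(("A", "D")): 10,
    frozenset(("B", "D")): 10, frozenset(("C", "D")): 9,
}
tree = upgma(["A", "B", "C", "D"], toy)
print(tree)  # ('D', ('C', ('A', 'B')))
```

Here A and B merge first (distance 2), C joins them next (mean distance 6), and D is attached last.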

SLIDES 7-12

Unweighted Pair Group Method with Arithmetic mean (UPGMA)

[Figure sequence: animation frames in which UPGMA repeatedly merges the two closest groups among the eleven sequences of the pairwise score matrix]

Guide Tree

[Figure: resulting guide tree; leaf order: Neurospora, Aspergillus, Yeast, Schizosaccharomyces, Crypthecodinium, Drosophila, Geobacillus, Mycobacterium, Mortierella, Homo, Mus]

SLIDE 13

Neighbor Joining (NJ)

  • Generate full tree with starlike structure
  • Repeat the following step
      – Connect two closest groups (i.e., neighbors) through a single node

[Figure: starlike tree joining Yeast, Neurospora, Aspergillus, Schizosaccharomyces, Mortierella, Crypthecodinium, Drosophila, Homo, Geobacillus, Mycobacterium, and Mus]
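The neighbor-joining steps can be sketched as follows. This is an illustrative version under assumptions: it uses the standard Q-criterion to decide which pair counts as "closest", records topology only (no branch lengths), and stops when three nodes remain around the central node. The function name and the five-taxon distance table are hypothetical and not taken from the slides.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Return (joins, remaining) for an unrooted neighbor-joining tree.

    dist maps frozenset({x, y}) to a distance. Each join is a 2-tuple of
    nodes; a node is a leaf label or an earlier join.
    """
    nodes = list(names)  # the initial star: every leaf hangs off one centre
    d = dict(dist)
    joins = []
    while len(nodes) > 3:
        n = len(nodes)
        total = {x: sum(d[frozenset((x, y))] for y in nodes if y != x)
                 for x in nodes}

        def q(pair):
            # Q-criterion: pairwise distance corrected for how far each
            # member is from everything else.
            i, j = pair
            return (n - 2) * d[frozenset(pair)] - total[i] - total[j]

        i, j = min(combinations(nodes, 2), key=q)
        u = (i, j)  # new internal node connecting the two neighbors
        for k in nodes:
            if k != i and k != j:
                d[frozenset((u, k))] = (
                    d[frozenset((i, k))] + d[frozenset((j, k))]
                    - d[frozenset((i, j))]
                ) / 2
        nodes = [k for k in nodes if k != i and k != j] + [u]
        joins.append(u)
    return joins, tuple(nodes)


# Hypothetical distances among five made-up taxa (not from the slides).
pairs = {"ab": 5, "ac": 9, "ad": 9, "ae": 8, "bc": 10,
         "bd": 10, "be": 9, "cd": 8, "ce": 7, "de": 3}
toy = {frozenset(p): v for p, v in pairs.items()}
joins, remaining = neighbor_joining("abcde", toy)
print(joins)      # [('a', 'b'), ('c', ('a', 'b'))]
print(remaining)  # ('d', 'e', ('c', ('a', 'b')))
```

Unlike UPGMA, the result is unrooted: the three remaining nodes all attach to one final internal node.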

SLIDES 14-18

Neighbor Joining

[Figure sequence: animation frames of neighbor joining applied step by step to Yeast, Neurospora, Aspergillus, Schizosaccharomyces, Mortierella, Crypthecodinium, Drosophila, Homo, Geobacillus, Mycobacterium, and Mus]

Multiple Sequence Alignment

SLIDE 19

Multiple Sequence Alignment

SLIDE 20

What can phylogeny do for you?

Why do we care about evolution and the evolutionary history of organisms?
OR
How do we benefit from phylogeny?
AND
How is bioinformatics related to any of this?

What are the goals of phylogeny?

1) Deduce correct trees of life for all species
2) Infer or estimate divergence times

All life forms share a common origin and are part of the Tree of Life

How can we use phylogenetic analyses?

SLIDE 21

Revolutionizing the Tree of Life

Carl Woese: rRNA identifies Archaea as a separate branch of the Tree of Life

Discovering new life forms

SLIDE 22

Developing effective snakebite antivenins

Identifying emergent diseases

SLIDE 23

Protecting ecosystems from invasive species

  • Caulerpa taxifolia
  • Purple loosestrife
  • Eurasian water milfoil

Common phylogenetic tree terminology

[Figure: tree with terminal nodes A B C D E]

  • Ancestral Node or ROOT of the Tree
  • Internal Nodes: hypothetical taxonomic units (HTUs)
  • Branches or Lineages
  • Terminal Nodes: operational taxonomic units (OTUs); represent the TAXA (genes, populations, species, etc.) used to infer the phylogeny

SLIDE 24

Phylogenetic trees can be drawn many ways

[Figure: tree of taxa A B C D E]

Clade: a single common ancestor and all of its descendants

“B-C clade” “D-E clade” “A-B-C clade”
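The clade definition can be made concrete by listing the leaf set under every internal node of a tree. The sketch below assumes the tree ((A,(B,C)),(D,E)), which is the rooted binary topology implied by the three clades named on the slide; the nested-tuple encoding and function names are hypothetical choices for the example.

```python
def leaves(tree):
    """Set of terminal nodes (OTUs) under a node of a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def clades(tree):
    """Leaf set of every internal node: each one is a clade."""
    if isinstance(tree, str):
        return []  # a single leaf has no internal node
    found = [leaves(tree)]
    for child in tree:
        found.extend(clades(child))
    return found


# Assumed topology, inferred from the clades named on the slide.
tree = (("A", ("B", "C")), ("D", "E"))
names = ["-".join(sorted(c)) for c in clades(tree)]
print(names)  # ['A-B-C-D-E', 'A-B-C', 'B-C', 'D-E']
```

A group such as {C, D} never appears in the output because no single internal node has exactly those two leaves as its descendants.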

SLIDE 25

Phylogenetic trees can be rooted or unrooted

Rooted [Figure: rooted tree of A B C D]
  • Specifies evolutionary path
  • Root node is most recent common ancestor of all TUs; specifies time flow

Unrooted [Figure: unrooted tree of A B C D]
  • Shows degree of kinship
  • Doesn’t make assumptions or require knowledge of common ancestor

Phylogenetic trees can be scaled or unscaled

Unscaled: Cladogram [Figure: tree of A B C D]
  • Branch length not proportional to number of changes/distance

Scaled: Phylogram [Figure: tree of A B C D]
  • Branch length proportional to number of changes/distance

SLIDE 26

Phylogenetic trees diagram evolutionary relationships

No meaning to the spacing between the taxa, or to the order in which they appear from top to bottom.

Branch lengths:
1) No scale (cladograms)
2) Proportional to genetic distance (phylograms)
3) Proportional to time (ultrametric trees)

Rotating clades: same meanings

[Figure: two drawings of one tree of taxa A B C D E, with clades rotated, joined by "="]
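Because rotating a clade only changes the drawing, two trees can be compared by ignoring the order of children at every node. The sketch below does this by converting each node to an unordered set; the nested-tuple encoding, the function names, and the example topologies are assumptions for illustration.

```python
def canonical(tree):
    """Order-independent form of a rooted tree given as nested tuples."""
    if isinstance(tree, str):
        return tree
    # A frozenset forgets child order, so rotated clades compare equal.
    return frozenset(canonical(child) for child in tree)


def same_tree(t1, t2):
    """True if the two trees differ only by rotating clades."""
    return canonical(t1) == canonical(t2)


# Hypothetical example trees over taxa A to E.
original = (("A", ("B", "C")), ("D", "E"))
rotated = (("E", "D"), (("C", "B"), "A"))    # same tree, clades flipped
different = ((("A", "B"), "C"), ("D", "E"))  # A now groups with B first

print(same_tree(original, rotated))    # True
print(same_tree(original, different))  # False
```

This comparison covers topology only; it assumes leaf labels are unique and says nothing about branch lengths.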

SLIDE 27

Interpreting phylogenetic trees

Is the frog more closely related to the fish or the human?

How are phylogenetic trees built?

Traditionally: use homologous structures

Caveats:

  • Closely related organisms don’t always look similar
  • Similar-looking organisms are not always closely related
  • How do you decide the importance of traits?

SLIDE 28

Structural analogy can result from convergent evolution

Classification based on traits can be tricky

  • cell number
  • organelles

SLIDE 29

Molecular phylogenetic trees

  • Large molecular data sets: Bioinformatics!
  • Caveat: Gene divergence may not correlate with species divergence
  • Result: great improvement on classical phylogenies
  • Molecular clock vs. punctuated equilibrium
  • Eliminates analogy and trait selection issues

Molecular phylogenies can be constructed using different elements

  • Nuclear genes
  • Mitochondrial DNA
  • Genome structure

Usually integrate analyses of multiple different genes

Reasonably well conserved, present in common ancestors

SLIDE 30

Molecular comparisons vs. body plans

Which species are the closest living relatives of modern humans?

MitoDNA, most nuclear genes, and DNA hybridization

Bonobos and chimpanzees are related more closely to humans than either is to gorillas.

[Figure: tree of chimpanzees, orangutans, humans, bonobos, and gorillas; time axis in MYA, labeled 14]

Pre-molecular view

Great apes (chimpanzees, gorillas and orangutans) formed a clade separate from humans.

[Figure: tree of humans, bonobos, gorillas, orangutans, and chimpanzees; time axis in MYA, labeled 15-30]

SLIDE 31

What is the closest living relative of whales?

Phylogenetic trees are hypotheses

  • How do you construct phylogenetic trees?
  • How do you test the robustness of hypotheses?
  • What computational strategies are used?