Sequence alignments



slide-1
SLIDE 1

Sequence alignments

slide-2
SLIDE 2

Genetic sequences change over time

Over time:

    LRGGD  -> (deletion) ->  LRGD  -> (mutation) ->  LRCD  -> (mutation) ->  ARCD

Relationship between original and final sequence:

    LRGGD          LRGGD
    AR-CD    or    ARC-D

slide-3
SLIDE 3

In practice: we only know sequences from extant organisms

    ancestor  (sequence unknown)
    human     LRGDDC
    mouse     LGDCC

slide-4
SLIDE 4

We need to align these sequences to compare them

    human  LRGDDC
    mouse  LGDCC

Candidate alignments:

    LRGDDC      LRGDDC-      LRGDDC
    L-GDCC      L-GD-CC      -LGDCC

Which alignment is correct?

slide-5
SLIDE 5

We need to score the alignment

Example:

  • match = +1
  • mismatch = -1
  • gap = 0

    LRGDDC
    L-GDCC      score = 1+0+1+1-1+1 = 3

    LRGDDC-
    L-GD-CC     score = 1+0+1+1+0+1+0 = 4

    LRGDDC
    -LGDCC      score = 0-1+1+1-1+1 = 1

slide-6
SLIDE 6

We need to score the alignment

Example:

  • match = +1
  • mismatch = -1
  • gap = -2

    LRGDDC
    L-GDCC      score = 1-2+1+1-1+1 = 1

    LRGDDC-
    L-GD-CC     score = 1-2+1+1-2+1-2 = -2

    LRGDDC
    -LGDCC      score = -2-1+1+1-1+1 = -1
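The two scoring examples can be checked with a few lines of Python. This is a minimal sketch, not part of the original slides: the function name `score_alignment` and its parameters are chosen here for illustration, and an alignment is assumed to be given as two equal-length strings with "-" marking a gap.

```python
def score_alignment(top, bottom, match=1, mismatch=-1, gap=0):
    """Sum the column scores of a pairwise alignment (two equal-length strings)."""
    if len(top) != len(bottom):
        raise ValueError("aligned sequences must have equal length")
    total = 0
    for a, b in zip(top, bottom):
        if a == "-" or b == "-":
            total += gap        # a column containing a gap
        elif a == b:
            total += match      # identical residues
        else:
            total += mismatch   # different residues
    return total


# The three candidate alignments of human LRGDDC and mouse LGDCC.
candidates = [("LRGDDC", "L-GDCC"), ("LRGDDC-", "L-GD-CC"), ("LRGDDC", "-LGDCC")]

for gap in (0, -2):
    scores = [score_alignment(t, b, gap=gap) for t, b in candidates]
    print(f"gap = {gap}: {scores}")
# gap = 0:  [3, 4, 1]
# gap = -2: [1, -2, -1]
```

With gap = 0 the two-gap alignment scores best; with gap = -2 the single-gap alignment wins. The choice of scoring system decides which alignment is considered best.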

slide-7
SLIDE 7

We often score by amino-acid similarity

http://commons.wikimedia.org/wiki/File:BLOSUM62.gif

BLOSUM62 Matrix

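\[
\text{score} = \log \frac{p_{ij}}{p_i \, p_j}
\]

where p_ij is the observed frequency with which amino acids i and j are aligned, and p_i and p_j are the individual frequencies of the two amino acids.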
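A small numerical sketch of the log-odds score may help. The frequencies below are made up for illustration and are not taken from the BLOSUM data. The use of base-2 logarithms is an assumption of this sketch, since the slide does not state a base.

```python
import math


def log_odds(p_ij, p_i, p_j):
    """Log-odds score: observed pair frequency versus the frequency expected by chance."""
    return math.log2(p_ij / (p_i * p_j))


# Hypothetical frequencies, chosen only to make the arithmetic easy to follow.
p_i, p_j = 0.25, 0.25

# Pair seen twice as often as expected by chance: positive score.
print(log_odds(0.125, p_i, p_j))

# Pair seen half as often as expected by chance: negative score.
print(log_odds(0.03125, p_i, p_j))
```

Pairs that are aligned more often than chance predicts get positive scores, and pairs aligned less often than chance get negative scores.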

slide-8
SLIDE 8

Gaps in alignments are called “indels”

    LRGDDC
    L-GDCC
     ^
     indel

slide-9
SLIDE 9

How do we find the best alignment given a scoring system?

Global alignment: Needleman-Wunsch algorithm
Example: align GCAT and GAT
Scoring: match = 1, mismatch = -1, gap = -1

         -    G    C    A    T
    -
    G
    A
    T

slide-10
SLIDE 10

How do we find the best alignment given a scoring system?

Global alignment: Needleman-Wunsch algorithm
Example: align GCAT and GAT
Scoring: match = 1, mismatch = -1, gap = -1

         -    G    C    A    T
    -    0
    G
    A
    T

Alignment:

slide-11
SLIDE 11

How do we find the best alignment given a scoring system?

Global alignment: Needleman-Wunsch algorithm
Example: align GCAT and GAT
Scoring: match = 1, mismatch = -1, gap = -1

         -    G    C    A    T
    -    0   -1
    G
    A
    T

Alignment:

    G
    -

slide-12
SLIDE 12

How do we find the best alignment given a scoring system?

Global alignment: Needleman-Wunsch algorithm
Example: align GCAT and GAT
Scoring: match = 1, mismatch = -1, gap = -1

         -    G    C    A    T
    -    0   -1   -2
    G
    A
    T

Alignment:

    GC
    --

slide-13
SLIDE 13

How do we find the best alignment given a scoring system?

Global alignment: Needleman-Wunsch algorithm
Example: align GCAT and GAT
Scoring: match = 1, mismatch = -1, gap = -1

         -    G    C    A    T
    -    0   -1   -2   -3   -4
    G
    A
    T

Alignment:

    GCAT
    ----

slide-14
SLIDE 14

How do we find the best alignment given a scoring system?

Global alignment: Needleman-Wunsch algorithm
Example: align GCAT and GAT
Scoring: match = 1, mismatch = -1, gap = -1

         -    G    C    A    T
    -    0   -1   -2   -3   -4
    G   -1
    A
    T

Alignment:

    -
    G

slide-15
SLIDE 15

How do we find the best alignment given a scoring system?

Global alignment: Needleman-Wunsch algorithm
Example: align GCAT and GAT
Scoring: match = 1, mismatch = -1, gap = -1

         -    G    C    A    T
    -    0   -1   -2   -3   -4
    G   -1
    A   -2
    T   -3

Alignment:

    ---
    GAT

slide-16
SLIDE 16

How do we find the best alignment given a scoring system?

Global alignment: Needleman-Wunsch algorithm
Example: align GCAT and GAT
Scoring: match = 1, mismatch = -1, gap = -1

         -    G    C    A    T
    -    0   -1   -2   -3   -4
    G   -1    ?
    A   -2
    T   -3

slide-17
SLIDE 17

How do we find the best alignment given a scoring system?

Global alignment: Needleman-Wunsch algorithm
Example: align GCAT and GAT
Scoring: match = 1, mismatch = -1, gap = -1

         -    G    C    A    T
    -    0   -1   -2   -3   -4
    G   -1   -2
    A   -2
    T   -3

Alignment:

    G-
    -G

slide-18
SLIDE 18

How do we find the best alignment given a scoring system?

Global alignment: Needleman-Wunsch algorithm
Example: align GCAT and GAT
Scoring: match = 1, mismatch = -1, gap = -1

         -    G    C    A    T
    -    0   -1   -2   -3   -4
    G   -1   -2
    A   -2
    T   -3

Alignment:

    -G
    G-

slide-19
SLIDE 19

How do we find the best alignment given a scoring system?

Global alignment: Needleman-Wunsch algorithm
Example: align GCAT and GAT
Scoring: match = 1, mismatch = -1, gap = -1

         -    G    C    A    T
    -    0   -1   -2   -3   -4
    G   -1    1
    A   -2
    T   -3

Alignment:

    G
    G

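Each cell is filled by comparing three candidate scores, one for each neighboring cell that is already known. The sketch below recomputes the three candidates for the G/G cell. The function name `cell_candidates` and the labels "top", "left" and "diagonal" are illustrative choices made here, not part of the slides.

```python
MATCH, MISMATCH, GAP = 1, -1, -1


def cell_candidates(top, left, diagonal, a, b):
    """Return the three candidate scores for one cell of the alignment matrix."""
    return {
        "top": top + GAP,            # come from the cell above: gap in the top sequence
        "left": left + GAP,          # come from the cell to the left: gap in the side sequence
        "diagonal": diagonal + (MATCH if a == b else MISMATCH),  # align a with b
    }


# Neighbors of the G/G cell: -1 above, -1 to the left, 0 on the diagonal.
candidates = cell_candidates(top=-1, left=-1, diagonal=0, a="G", b="G")
best = max(candidates, key=candidates.get)
print(candidates)   # {'top': -2, 'left': -2, 'diagonal': 1}
print(best)         # diagonal
```

The two gapped alignments both score -2, while aligning G with G scores 1, so the cell receives the value 1.
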
slide-20
SLIDE 20

How do we find the best alignment given a scoring system?

Global alignment: Needleman-Wunsch algorithm
Example: align GCAT and GAT
Scoring: match = 1, mismatch = -1, gap = -1

         -    G    C    A    T
    -    0   -1   -2   -3   -4
    G   -1    1    0
    A   -2
    T   -3

Alignment:

    GC
    G-

slide-21
SLIDE 21

How do we find the best alignment given a scoring system?

Global alignment: Needleman-Wunsch algorithm
Example: align GCAT and GAT
Scoring: match = 1, mismatch = -1, gap = -1

         -    G    C    A    T
    -    0   -1   -2   -3   -4
    G   -1    1    0
    A   -2    0
    T   -3

Alignment:

    G-
    GA

slide-22
SLIDE 22

How do we find the best alignment given a scoring system?

Global alignment: Needleman-Wunsch algorithm
Example: align GCAT and GAT
Scoring: match = 1, mismatch = -1, gap = -1

         -    G    C    A    T
    -    0   -1   -2   -3   -4
    G   -1    1    0
    A   -2    0   -1
    T   -3

Alignment:

    GC-
    G-A

slide-23
SLIDE 23

How do we find the best alignment given a scoring system?

Global alignment: Needleman-Wunsch algorithm
Example: align GCAT and GAT
Scoring: match = 1, mismatch = -1, gap = -1

         -    G    C    A    T
    -    0   -1   -2   -3   -4
    G   -1    1    0
    A   -2    0   -1
    T   -3

Alignment:

    G-C
    GA-

slide-24
SLIDE 24

How do we find the best alignment given a scoring system?

Global alignment: Needleman-Wunsch algorithm
Example: align GCAT and GAT
Scoring: match = 1, mismatch = -1, gap = -1

         -    G    C    A    T
    -    0   -1   -2   -3   -4
    G   -1    1    0
    A   -2    0    0
    T   -3

Alignment:

    GC
    GA

slide-25
SLIDE 25

How do we find the best alignment given a scoring system?

Global alignment: Needleman-Wunsch algorithm
Example: align GCAT and GAT
Scoring: match = 1, mismatch = -1, gap = -1

         -    G    C    A    T
    -    0   -1   -2   -3   -4
    G   -1    1    0   -1   -2
    A   -2    0    0    1    0
    T   -3   -1   -1    0    2

slide-26
SLIDE 26

How do we find the best alignment given a scoring system?

Global alignment: Needleman-Wunsch algorithm
Example: align GCAT and GAT
Scoring: match = 1, mismatch = -1, gap = -1

         -    G    C    A    T
    -    0   -1   -2   -3   -4
    G   -1    1    0   -1   -2
    A   -2    0    0    1    0
    T   -3   -1   -1    0    2

Alignment:

    GCAT
    G-AT

slide-27
SLIDE 27

Needleman-Wunsch algorithm, mathematical form

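\[
M(i,j) = \max
\begin{cases}
M(i-1,\,j) + p & \text{(top)} \\
M(i,\,j-1) + p & \text{(left)} \\
M(i-1,\,j-1) + s(a_j, b_i) & \text{(diagonal)}
\end{cases}
\]

\[
M(0,j) = j \times p \quad \text{(first row)}, \qquad
M(i,0) = i \times p \quad \text{(first column)}
\]

Here p is the gap penalty, and s(a_j, b_i) is the match/mismatch score for sites j and i in sequences a and b.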
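The recurrence translates almost line by line into code. The following is a minimal sketch under the simple match/mismatch/gap scoring used in the example, not a reference implementation. The function name `needleman_wunsch` and the traceback tie-breaking order (diagonal, then top, then left) are choices made here. Sequence a runs along the columns (index j) and sequence b along the rows (index i), as in the formula.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align a (columns, index j) with b (rows, index i).

    Returns (score, aligned_a, aligned_b).
    """
    n_rows, n_cols = len(b) + 1, len(a) + 1
    M = [[0] * n_cols for _ in range(n_rows)]

    # Boundary conditions: M(0, j) = j * p and M(i, 0) = i * p.
    for j in range(1, n_cols):
        M[0][j] = j * gap
    for i in range(1, n_rows):
        M[i][0] = i * gap

    def s(j, i):
        """Match/mismatch score for site j of a and site i of b (1-based)."""
        return match if a[j - 1] == b[i - 1] else mismatch

    # Fill the matrix row by row using the three-way maximum.
    for i in range(1, n_rows):
        for j in range(1, n_cols):
            M[i][j] = max(
                M[i - 1][j] + gap,           # top
                M[i][j - 1] + gap,           # left
                M[i - 1][j - 1] + s(j, i),   # diagonal
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    i, j = len(b), len(a)
    out_a, out_b = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and M[i][j] == M[i - 1][j - 1] + s(j, i):
            out_a.append(a[j - 1])
            out_b.append(b[i - 1])
            i, j = i - 1, j - 1
        elif i > 0 and M[i][j] == M[i - 1][j] + gap:
            out_a.append("-")
            out_b.append(b[i - 1])
            i -= 1
        else:
            out_a.append(a[j - 1])
            out_b.append("-")
            j -= 1

    return M[len(b)][len(a)], "".join(reversed(out_a)), "".join(reversed(out_b))


score, top, bottom = needleman_wunsch("GCAT", "GAT")
print(score)    # 2
print(top)      # GCAT
print(bottom)   # G-AT
```

When several alignments share the best score, this sketch returns only one of them. Which one depends on the tie-breaking order in the traceback.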

slide-28
SLIDE 28

Now try on your own

Align ATGCT and ATTACA
Scoring: match = 1, mismatch = -1, gap = -1

         -    A    T    T    A    C    A
    -
    A
    T
    G
    C
    T

slide-29
SLIDE 29

Multiple sequence alignment (MSA)

slide-30
SLIDE 30

Software to generate MSAs

  • MAFFT (very good, very fast)
    http://mafft.cbrc.jp/alignment/software/
  • Clustal Omega (very good, very fast)
    http://www.ebi.ac.uk/Tools/msa/clustalo/
  • PRANK (extremely good, very slow)
    http://wasabiapp.org/software/prank/
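These tools are usually run from the command line. As a rough sketch, MAFFT could be called from Python as shown below. It assumes that a `mafft` executable is installed and on the PATH; the helper names `mafft_command` and `run_mafft` and the input file name are hypothetical. MAFFT reads unaligned sequences in FASTA format and writes the alignment to standard output, and its `--auto` option lets the program choose an alignment strategy.

```python
import shutil
import subprocess


def mafft_command(fasta_path):
    """Build the MAFFT command line for an unaligned FASTA file."""
    return ["mafft", "--auto", fasta_path]


def run_mafft(fasta_path):
    """Run MAFFT and return the aligned sequences as FASTA text."""
    if shutil.which("mafft") is None:
        raise RuntimeError("mafft executable not found on PATH")
    result = subprocess.run(
        mafft_command(fasta_path),
        capture_output=True,   # the alignment arrives on stdout, progress messages on stderr
        text=True,
        check=True,            # raise an error if MAFFT exits with a failure code
    )
    return result.stdout


# Example call (hypothetical file name):
# aligned = run_mafft("sequences.fasta")
```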