Dynamic Programming Formula

  • Divide a problem into a polynomial number of smaller subproblems.
  • Solve each subproblem, recording its answer in an array.
  • Do a case analysis where each case uses the subproblems in a different way.
  • Compare the cases to find the optimal solution for the current problem.

Slides18 - Sequence Alignment.key - April 10, 2019

String Similarity

How similar are two strings?

  • ocurrance
  • occurrence

1 mismatch, 1 gap:

    o c - u r r a n c e
    o c c u r r e n c e

6 mismatches, 1 gap:

    o c u r r a n c e -
    o c c u r r e n c e

0 mismatches, 3 gaps:

    o c - u r r - a n c e
    o c c u r r e - n c e
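The mismatch and gap counts for the alignments of "ocurrance" and "occurrence" can be checked mechanically. The sketch below is an illustration rather than part of the slides: `alignment_stats` is a hypothetical helper, and it assumes each alignment is written as two equal-length rows with `-` marking a gap.

```python
def alignment_stats(top: str, bottom: str, gap: str = "-") -> tuple[int, int]:
    """Return (mismatches, gaps) for two equal-length gapped rows."""
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have the same length")
    mismatches = 0
    gaps = 0
    for a, b in zip(top, bottom):
        if a == gap or b == gap:
            gaps += 1          # one character faces a gap
        elif a != b:
            mismatches += 1    # two different characters face each other
    return mismatches, gaps


print(alignment_stats("oc-urrance", "occurrence"))    # (1, 1)
print(alignment_stats("ocurrance-", "occurrence"))    # (6, 1)
print(alignment_stats("oc-urr-ance", "occurre-nce"))  # (0, 3)
```

Which of the three alignments counts as most similar depends on how gaps are priced relative to mismatches.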

Edit Distance

Applications.

  • Basis for Unix diff.
  • Speech recognition.
  • Computational biology.
  • Spam filters.

Edit distance.

  • Gap penalty δ; mismatch penalty α_pq.
  • Cost = sum of gap and mismatch penalties.

    C T G A C C T A C C T
    C C T G A C T A C A T

    Cost = α_TC + α_GT + α_AG + 2α_CA

    - C T G A C C T A C C T
    C C T G A C - T A C A T

    Cost = 2δ + α_CA

Sequence Alignment

Goal: Given two strings X = x_1 x_2 ... x_m and Y = y_1 y_2 ... y_n, find an alignment of minimum cost.

An alignment M is a set of ordered pairs x_i-y_j such that each item occurs in at most one pair and there are no crossings. The pairs x_i-y_j and x_i'-y_j' cross if i < i' but j > j'.

Crossing (pairing the transposed letters e and r with their equal letters, so not an alignment):

    o c c u r e r n c e
    o c c u r r e n c e

2 mismatches (pairing the letters position by position):

    o c c u r e r n c e
    o c c u r r e n c e
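The two conditions in the definition of an alignment can be tested directly. This is a minimal sketch, assuming an alignment is given as a list of 1-based index pairs (i, j) standing for x_i-y_j; the name `is_alignment` is made up for the illustration.

```python
def is_alignment(pairs: list[tuple[int, int]]) -> bool:
    """Check that pairs (i, j), meaning x_i-y_j, form a valid alignment."""
    xs = [i for i, _ in pairs]
    ys = [j for _, j in pairs]
    # Each item may occur in at most one pair.
    if len(set(xs)) != len(xs) or len(set(ys)) != len(ys):
        return False
    # No crossings: once sorted by i, the j values must increase as well.
    ordered = sorted(pairs)
    return all(a[1] < b[1] for a, b in zip(ordered, ordered[1:]))


# "occurernce" vs. "occurrence": position-by-position pairing.
straight = [(k, k) for k in range(1, 11)]
# Same pairing, except e (x_6) goes to e (y_7) and r (x_7) goes to r (y_6).
crossed = [(k, k) for k in range(1, 11) if k not in (6, 7)] + [(6, 7), (7, 6)]

print(is_alignment(straight))  # True
print(is_alignment(crossed))   # False
```

Checking only neighbouring pairs after sorting is enough, because if the j values increase from each pair to the next they increase across the whole list.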


Sequence Alignment

Example: CTACCG vs. TACATG.
Solution: M = x_2-y_1, x_3-y_2, x_4-y_3, x_5-y_4, x_6-y_6.

    x_1 x_2 x_3 x_4 x_5     x_6
    C   T   A   C   C   -   G
    -   T   A   C   A   T   G
        y_1 y_2 y_3 y_4 y_5 y_6
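To see how a set of pairs M determines the gapped picture, the sketch below rebuilds the two rows for CTACCG vs. TACATG from M. It assumes M is a valid alignment given as 1-based index pairs; `render_alignment` is a hypothetical helper, not something defined in the slides.

```python
def render_alignment(x: str, y: str, pairs: list[tuple[int, int]],
                     gap: str = "-") -> tuple[str, str]:
    """Build the two gapped rows for a valid alignment of x and y."""
    top = ""
    bottom = ""
    i = j = 0  # how many characters of x and y have been emitted so far
    for pi, pj in sorted(pairs):
        # Unmatched characters before this pair each face a gap.
        skipped_x = x[i:pi - 1]
        skipped_y = y[j:pj - 1]
        top += skipped_x + gap * len(skipped_y) + x[pi - 1]
        bottom += gap * len(skipped_x) + skipped_y + y[pj - 1]
        i, j = pi, pj
    # Unmatched tails after the last pair.
    top += x[i:] + gap * len(y[j:])
    bottom += gap * len(x[i:]) + y[j:]
    return top, bottom


M = [(2, 1), (3, 2), (4, 3), (5, 4), (6, 6)]
top, bottom = render_alignment("CTACCG", "TACATG", M)
print(top)     # CTACC-G
print(bottom)  # -TACATG
```

In this alignment x_1 and y_5 are unmatched and x_5-y_4 pairs C with A, so its cost is 2δ + α_CA.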

Sequence Alignment

  • What are the subproblems?
  • What are the cases?
  • What is the solution for each case?
  • How do you find the optimal solution from the cases?

Sequence Alignment Case Analysis

Consider the last characters of the strings X and Y. Call them x_m and y_n.

  • Case 1: x_m and y_n are aligned.
  • Case 2: x_m is not matched.
  • Case 3: y_n is not matched.
  • Case 4: Neither x_m nor y_n is matched.


Solution 1

    B      O      G         U         S
    B      O      N         G         O
    match  match  mismatch  mismatch  mismatch

Cost = 3 mismatches

Solution 2

    B      O      -     G      U         S
    B      O      N     G      O         -
    match  match  skip  match  mismatch  skip

Cost = 1 mismatch + 2 skips

Solution 3

    B      O      -     G      U     S     -
    B      O      N     G      -     -     O
    match  match  skip  match  skip  skip  skip

Cost = 4 skips


Which is best?

3 mismatches:

    BONGO
    BOGUS

1 mismatch + 2 skips:

    BONGO-
    BO-GUS

4 skips:

    BONG--O
    BO-GUS-
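Which solution is best depends on the penalties. The sketch below prices the three BONGO/BOGUS alignments under two penalty settings. It assumes a single mismatch penalty `alpha` for every pair of different letters and a gap penalty `delta`; the helper `alignment_cost` and both sets of numbers are chosen for illustration only.

```python
def alignment_cost(top: str, bottom: str, delta: float, alpha: float,
                   gap: str = "-") -> float:
    """Sum the gap and mismatch penalties of two equal-length gapped rows."""
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have the same length")
    cost = 0
    for a, b in zip(top, bottom):
        if a == gap or b == gap:
            cost += delta   # skip
        elif a != b:
            cost += alpha   # mismatch
    return cost


candidates = {
    "3 mismatches": ("BONGO", "BOGUS"),
    "1 mismatch + 2 skips": ("BONGO-", "BO-GUS"),
    "4 skips": ("BONG--O", "BO-GUS-"),
}

# Assumed penalties: first a setting with costly skips, then one with cheap skips.
for delta, alpha in [(2, 1), (0.5, 2)]:
    costs = {name: alignment_cost(t, b, delta, alpha)
             for name, (t, b) in candidates.items()}
    best = min(costs, key=costs.get)
    print(f"delta={delta}, alpha={alpha}: {costs} -> best: {best}")
```

With δ = 2 and α = 1 the three costs are 3, 5 and 8, so the all-mismatch alignment wins; with δ = 0.5 and α = 2 they are 6, 3 and 2, so the four-skip alignment wins.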

Sequence Alignment Cost Analysis

Consider the last characters of the strings X and Y. Call them x_m and y_n.

  • Case 1: x_m and y_n are aligned.
    OPT(X, Y) = α_{x_m y_n} + OPT(x_1...x_{m-1}, y_1...y_{n-1})
  • Case 2: x_m is not matched.
    OPT(X, Y) = δ + OPT(x_1...x_{m-1}, y_1...y_n)
  • Case 3: y_n is not matched.
    OPT(X, Y) = δ + OPT(x_1...x_m, y_1...y_{n-1})
  • Case 4: Neither x_m nor y_n is matched. Covered by cases 2 and 3.
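The case analysis translates almost literally into a memoized recursion that takes the minimum over the three cases. This is a sketch under assumptions: the mismatch penalty is 0 for equal letters and a constant otherwise, the base case charges one gap per remaining character, and the names `opt_cost` and `opt` are illustrative.

```python
from functools import lru_cache


def opt_cost(x: str, y: str, delta: int = 2, mismatch: int = 1) -> int:
    """Minimum alignment cost of x and y via the three-case recurrence."""

    def alpha(p: str, q: str) -> int:
        # Assumed penalty: free match, constant-cost mismatch.
        return 0 if p == q else mismatch

    @lru_cache(maxsize=None)
    def opt(i: int, j: int) -> int:
        # opt(i, j) is the cost of aligning x_1...x_i with y_1...y_j.
        if i == 0:
            return j * delta  # only gaps remain
        if j == 0:
            return i * delta
        return min(
            alpha(x[i - 1], y[j - 1]) + opt(i - 1, j - 1),  # case 1
            delta + opt(i - 1, j),                          # case 2
            delta + opt(i, j - 1),                          # case 3
        )

    return opt(len(x), len(y))


print(opt_cost("boit", "boot"))    # 1
print(opt_cost("bogus", "bongo"))  # 3
```

Memoization keeps the number of distinct subproblems at (m + 1)(n + 1), one per pair of prefix lengths, which is the polynomial bound the dynamic programming formula asks for.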

Sequence Alignment: Algorithm

Alignment(m, n, x_1 x_2 ... x_m, y_1 y_2 ... y_n, δ, α) {
    for i = 0 to m
        M[i, 0] = iδ
    for j = 0 to n
        M[0, j] = jδ
    for i = 1 to m
        for j = 1 to n
            M[i, j] = min(α[x_i, y_j] + M[i-1, j-1],
                          δ + M[i-1, j],
                          δ + M[i, j-1])
    return M[m, n]
}
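A runnable rendering of the Alignment pseudocode might look like the following. It is a sketch, not the course's reference code: it assumes `alpha` is passed as a function of two characters, returns the whole table instead of only M[m, n], and uses the hypothetical names `alignment_table` and `unit_mismatch`.

```python
from typing import Callable


def alignment_table(x: str, y: str, delta: int,
                    alpha: Callable[[str, str], int]) -> list[list[int]]:
    """Fill M so that M[i][j] is the minimum cost of aligning x[:i] with y[:j]."""
    m, n = len(x), len(y)
    M = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        M[i][0] = i * delta  # x[:i] against the empty string: i gaps
    for j in range(n + 1):
        M[0][j] = j * delta  # empty string against y[:j]: j gaps
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            M[i][j] = min(
                alpha(x[i - 1], y[j - 1]) + M[i - 1][j - 1],  # align x_i with y_j
                delta + M[i - 1][j],                          # x_i unmatched
                delta + M[i][j - 1],                          # y_j unmatched
            )
    return M


def unit_mismatch(p: str, q: str) -> int:
    # Assumed penalty: 0 for a match, 1 for a mismatch.
    return 0 if p == q else 1


M = alignment_table("bogus", "bongo", delta=2, alpha=unit_mismatch)
print(M[0])                  # [0, 2, 4, 6, 8, 10]
print([row[0] for row in M]) # [0, 2, 4, 6, 8, 10]
print(M[5][5])               # 3
```

The two nested loops visit each of the (m + 1)(n + 1) cells once and do constant work per cell, so both time and space are O(mn).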


Sequence Alignment Example

x = boit; y = boot
α = 0, for match
α = 1, for mismatch
δ = 2

Table M with the base cases filled in (rows: bogus; columns: bongo):

              b    o    n    g    o
         0    2    4    6    8   10
    b    2
    o    4
    g    6
    u    8
    s   10
