SLIDE 1
Chapter 15: Dynamic Programming. The rod-cutting problem.

Given: a rod of integer length n inches and a table of retail values (dollars) for rods of integer lengths.
Problem: cut the rod to maximize retail value.

For example:

  length i  :  1  2  3  4   5   6   7   8   9  10
  price p_i :  1  5  8  9  10  17  17  20  24  30

Note: a problem with rod length n is specified by a table of 2n entries; that is, the input size is Θ(n). Solutions have the form i_1 + i_2 + ... + i_k = n, for 1 ≤ k ≤ n, with each 1 ≤ i_j ≤ n. Such forms are (non-zero) integer partitions of n. How large is the search space? That is, how large is the collection of partitions of n? How large for a particular k?
SLIDE 2
The star-bar visualization.

  | | * * * *   004
  | * | * * *   013
  | * * | * *   022
  | * * * | *   031
  | * * * * |   040
  * | | * * *   103
  * | * | * *   112
  * | * * | *   121
  * | * * * |   130
  * * | | * *   202
  * * | * | *   211
  * * | * * |   220
  * * * | | *   301
  * * * | * |   310
  * * * * | |   400

For n = 4, k = 3, and allowing elements of size zero, we obtain the

  \binom{n + k - 1}{k - 1} = \binom{4 + 3 - 1}{3 - 1} = \binom{6}{2} = 15

configurations listed above. If we reserve 3 of the 4 stars to ensure that each element is at least one:

  | | *   001 ⇒ 112
  | * |   010 ⇒ 121
  * | |   100 ⇒ 211

  \binom{n - k + (k - 1)}{k - 1} = \binom{n - 1}{k - 1} = \binom{4 - 1}{3 - 1} = \binom{3}{2} = 3.

If N_n is the number of non-zero partitions of n, then

  N_n = \sum_{k=1}^{n} \binom{n - 1}{k - 1} = \sum_{k=0}^{n - 1} \binom{n - 1}{k} = 2^{n-1} = \frac{1}{2} \cdot 2^n,

an exponential search space.
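These counts can be checked by brute-force enumeration. A minimal Python sketch (the function name `compositions` is mine; it enumerates the ordered non-zero solution forms):

```python
from itertools import product
from math import comb

def compositions(n):
    """All ordered sums of positive integers totaling n."""
    if n == 0:
        return [[]]
    return [[first] + rest
            for first in range(1, n + 1)
            for rest in compositions(n - first)]

n, k = 4, 3
# Zeros allowed: all (i1, i2, i3) with i1 + i2 + i3 = 4, the 15 star-bar rows.
with_zeros = [t for t in product(range(n + 1), repeat=k) if sum(t) == n]
print(len(with_zeros), comb(n + k - 1, k - 1))         # 15 15
# Each element at least one: C(n - 1, k - 1) of them.
print(sum(1 for c in compositions(n) if len(c) == k))  # 3
# All non-zero solution forms: N_n = 2^(n - 1).
print(len(compositions(n)))                            # 8
```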
SLIDE 3
A recursive solution. Let r_n = the maximum retail value obtainable from a rod of length n:

  r_0 = 0
  r_n = \max_{1 \le i \le n} (p_i + r_{n-i}).

Note the optimal substructure.

RecursiveCutRod(p, n) {
    if n = 0 return 0;
    q = −∞;
    for i = 1 to n
        q = max(q, p[i] + RecursiveCutRod(p, n − i));
    return q;
}

What is the instruction-count growth rate as n → ∞?
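A direct Python transcription of RecursiveCutRod, using the price table from Slide 1 (the snake_case names are mine):

```python
# prices[i] is p_i from the table; prices[0] is unused padding.
prices = [0, 1, 5, 8, 9, 10, 17, 17, 20, 24, 30]

def recursive_cut_rod(p, n):
    if n == 0:
        return 0
    q = float("-inf")
    for i in range(1, n + 1):
        q = max(q, p[i] + recursive_cut_rod(p, n - i))
    return q

print(recursive_cut_rod(prices, 4))    # 10, via the cut 4 = 2 + 2
print(recursive_cut_rod(prices, 10))   # 30, no cut at all
```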
SLIDE 4
For a simple lower bound, let T(n) count only the recursive calls, with no work done therein.

  T(0) = 0
  T(n) = n + \sum_{i=1}^{n} T(n - i) = n + \sum_{i=0}^{n-1} T(i)

  T(1) = 1 + \sum_{i=0}^{0} T(i) = 1 + T(0) = 1
  T(2) = 2 + \sum_{i=0}^{1} T(i) = 2 + 1 + 0 = 3
  T(3) = 3 + \sum_{i=0}^{2} T(i) = 3 + 3 + 1 + 0 = 7
  T(4) = 4 + \sum_{i=0}^{3} T(i) = 4 + 7 + 3 + 1 + 0 = 15

Conjecture: T(n) = 2^n − 1. Verify:

  n + \sum_{i=0}^{n-1} T(i) = n + \sum_{i=0}^{n-1} (2^i - 1) = n - n + \sum_{i=0}^{n-1} 2^i = \frac{2^n - 1}{2 - 1} = 2^n - 1.

So T(n), which undercounts the instruction count of RecursiveCutRod, is Ω(2^n), which is as bad as a brute-force search of the entire solution space.
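The conjecture T(n) = 2^n − 1 can be confirmed by instrumenting the recursion to count its own calls. A sketch, with names of my choosing:

```python
prices = [0, 1, 5, 8, 9, 10, 17, 17, 20, 24, 30]

def counted_cut_rod(p, n):
    """Return (optimal value, number of recursive calls made below this one)."""
    if n == 0:
        return 0, 0
    q, calls = float("-inf"), 0
    for i in range(1, n + 1):
        sub, sub_calls = counted_cut_rod(p, n - i)
        calls += 1 + sub_calls        # one direct call plus the calls it spawned
        q = max(q, p[i] + sub)
    return q, calls

for n in range(1, 8):
    value, calls = counted_cut_rod(prices, n)
    print(n, calls, 2 ** n - 1)       # the two counts agree
```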
SLIDE 5
Why? The same recursive subproblem occurs many times under different activation records. Better approach: keep an array of subproblem solutions.

History: In the early days of computation, "programming" meant filling in a grid (array). Bellman (1940s-50s) used "dynamic" to conceal the mathematics done at RAND from an unenlightened Secretary of Defense (Wilson) (Wikipedia).

BottomUpCutRod(p, n) {
    r[0 . . n] = 0;                        // n + 1
    for j = 1 to n {                       // n + 1
        q = −∞;                            // n
        for i = 1 to j                     // \sum_{j=1}^{n} (j + 1)
            q = max(q, p[i] + r[j − i]);   // \sum_{j=1}^{n} j
        r[j] = q;                          // n
    }
    return r[n];                           // 1
}

  T(n) = 5n + 3 + 2 \sum_{j=1}^{n} j = 5n + 3 + n(n + 1) = \Theta(n^2).
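BottomUpCutRod translates directly to Python; the sketch below reproduces the optimal values for the slide's price table (names are mine):

```python
prices = [0, 1, 5, 8, 9, 10, 17, 17, 20, 24, 30]

def bottom_up_cut_rod(p, n):
    r = [0] * (n + 1)                     # r[j] = best value for a rod of length j
    for j in range(1, n + 1):
        q = float("-inf")
        for i in range(1, j + 1):
            q = max(q, p[i] + r[j - i])   # subproblem r[j - i] is already on hand
        r[j] = q
    return r[n]

print([bottom_up_cut_rod(prices, n) for n in range(1, 11)])
# [1, 5, 8, 10, 13, 17, 18, 22, 25, 30]
```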
SLIDE 6
If T′(n) counts the number of array references, then

  T'(n) = \sum_{j=1}^{n} j = \frac{n(n+1)}{2} = \Theta(n^2),

implying T(n) = Θ(n²), because each reference involves Θ(1) work.
SLIDE 7
Definitions.
1. Memoization: record intermediate results for later use.
2. Bottom-up: compute all required sub-problems before using them.
3. Top-down: compute the overall solution via a recursive call that computes only the necessary sub-problems (lazy evaluation).

MemoizedCutRod(p, n) {
    r[0 . . n] = 0;
    return Aux(p, n, r);
}

Aux(p, n, r) {
    if r[n] > 0 return r[n];
    if n = 0
        q = 0;
    else {
        q = −∞;
        for i = 1 to n
            q = max(q, p[i] + Aux(p, n − i, r));
    }
    r[n] = q;
    return q;
}

Counting Aux calls in the sketch (n = 5): 5 + 4 + 3 + 2 + 1 = 15. In general,

  T(n) = \sum_{j=1}^{n} j = \frac{n(n+1)}{2} = \Theta(n^2).
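A Python sketch of the memoized top-down version. Like the pseudocode, it relies on prices being positive, so that r[j] = 0 can mean "not yet computed" (the helper name `_aux` is mine):

```python
prices = [0, 1, 5, 8, 9, 10, 17, 17, 20, 24, 30]

def memoized_cut_rod(p, n):
    r = [0] * (n + 1)       # 0 marks "not yet computed"; optimal values are positive
    return _aux(p, n, r)

def _aux(p, n, r):
    if r[n] > 0:
        return r[n]         # already memoized
    if n == 0:
        return 0
    q = float("-inf")
    for i in range(1, n + 1):
        q = max(q, p[i] + _aux(p, n - i, r))
    r[n] = q
    return q

print(memoized_cut_rod(prices, 10))   # 30
```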
SLIDE 8
Reconstituting the cuts.

ExtendedBottomUpCutRod(p, n) {
    r[0 . . n] = 0;                    // (n + 1)
    for j = 1 to n {                   // (n + 1)
        q = −∞;                        // (n)
        for i = 1 to j {               // (\sum_{j=1}^{n} (j + 1))
            x = p[i] + r[j − i];       // (\sum_{j=1}^{n} j)
            if q < x {                 // (\sum_{j=1}^{n} 3j)
                q = x;
                s[j] = i;              // parallel structure records the decision
            }
        }
        r[j] = q;                      // (n)
    }
    return (r, s);                     // (1)
}

  T(n) = 5n + 3 + 5 \sum_{j=1}^{n} j = 5n + 3 + \frac{5n(n + 1)}{2} = \Theta(n^2).

  i     1   2   3   4   5   6   7   8   9  10
  p[i]  1   5   8   9  10  17  17  20  24  30
  r[i]  1   5   8  10  13  17  18  22  25  30
  s[i]  1   2   3   2   2   6   1   2   3  10

e.g. 5 → 2 → 3 → 0. In general, n → s[n] → s[n − s[n]] → s[n − s[n] − s[n − s[n]]] → ... → 0.
SLIDE 9
PrintCutRodSolution(p, n) {
    (r, s) = ExtendedBottomUpCutRod(p, n);
    while n > 0 {
        print(s[n]);
        n = n − s[n];
    }
}

Variation that foreshadows the greedy approach (Chapter 16). Let the value density be the retail dollars per inch, ρ_i = p_i / i. A greedy approach cuts the segment with the highest density and continues in the same fashion on the remainder.

  i     1     2     3     4     5     6     7     8     9    10
  p[i]  1     5     8     9    10    17    17    20    24    30
  r[i]  1     5     8    10    13    17    18    22    25    30
  s[i]  1     2     3     2     2     6     1     2     3    10
  ρ[i]  1.00  2.50  2.67  2.25  2.00  2.83  2.43  2.50  2.67  3.00

Note that for n = 4, the greedy approach finds max{1.00, 2.50, 2.67, 2.25} = 2.67 and chooses the decomposition 4 = 3 + 1, yielding $9.00. The DP approach gives 4 = 2 + 2 for a yield of $10.00.
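The DP-versus-greedy comparison can be reproduced in Python: the extended routine records its decisions in s, while the greedy heuristic chases value density (function names are mine):

```python
prices = [0, 1, 5, 8, 9, 10, 17, 17, 20, 24, 30]

def extended_bottom_up_cut_rod(p, n):
    r = [0] * (n + 1)
    s = [0] * (n + 1)       # s[j] = size of the first piece in an optimal cut of j
    for j in range(1, n + 1):
        q = float("-inf")
        for i in range(1, j + 1):
            x = p[i] + r[j - i]
            if q < x:
                q, s[j] = x, i
        r[j] = q
    return r, s

def cuts(s, n):
    """Recover the piece sizes: n -> s[n] -> s[n - s[n]] -> ... -> 0."""
    pieces = []
    while n > 0:
        pieces.append(s[n])
        n -= s[n]
    return pieces

def greedy_cut_rod(p, n):
    """Repeatedly cut the piece of highest value density rho_i = p_i / i."""
    total, pieces = 0, []
    while n > 0:
        i = max(range(1, n + 1), key=lambda m: p[m] / m)
        total, pieces, n = total + p[i], pieces + [i], n - i
    return total, pieces

r, s = extended_bottom_up_cut_rod(prices, 4)
print(r[4], cuts(s, 4))           # 10 [2, 2], the DP decomposition 4 = 2 + 2
print(greedy_cut_rod(prices, 4))  # (9, [3, 1]), greedy takes density 2.67 first
```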
SLIDE 10
Identifying characteristics of a dynamic programming opportunity.
1. Exponential search space.
2. Optimal substructure property: an optimal solution contains optimal solutions of independent subproblems.
3. A recursive solution repeats subproblem solutions.
4. Dynamic programming stores subproblem solutions for re-use. DP requires the appropriate storage structure and a scan order that ensures each subproblem is available when needed.
5. The optimal objective function value typically appears in the last cell computed in the storage structure.
6. The corresponding search-space object can be constructed from a parallel storage structure that records the optimal decisions.
SLIDE 11
Example regarding independent subproblems. Let G = (V, E) be an unweighted graph. For u, v ∈ V, define

  δ(u, v) = ∞ if there is no path from u to v;
  δ(u, v) = min{# of links in p : p is a path from u to v} if ∃ p : u ⇝ v.

If p : (u = v_0, v_1, ..., v_k = v) is a shortest path from u to v, then p′ : (v_i, v_{i+1}, ..., v_j) is a shortest path from v_i to v_j for 0 ≤ i < j ≤ k. This optimal substructure property follows from a cut-and-paste proof.

However, suppose δ(u, v) = # of links in the longest simple path from u to v. In the example graph on vertices q, r, s, t, δ(q, t) = 2 via q → r → t. However, q → r is not the longest simple path from q to r, as its path length (1) is beaten by q → s → t → r, with a length of 3. Also, it is not possible to assemble longest paths of subproblems to obtain a longest path for the overall problem. For example, q → s → t → r and r → q → s → t are the longest simple paths from q to r and from r to t, respectively. But q → s → t → r → q → s → t is not even a simple path. Independent subproblems means no sharing of resources (edges, in this example).
SLIDE 12
Matrix Chain Multiplication. Consider matrices A1 (10 × 100), A2 (100 × 5), A3 (5 × 50).

  A1A2A3 via (A1A2)A3 ⇒ (10)(100)(5) + (10)(5)(50) = 7500 multiplications.
  A1A2A3 via A1(A2A3) ⇒ (100)(5)(50) + (10)(100)(50) = 75000 multiplications.

We seek an optimal grouping (parenthesization) of A1A2...An to multiply with the smallest number of multiplications.

  A1 : (p_0 × p_1)   A2 : (p_1 × p_2)   ...   An : (p_{n−1} × p_n)   ⇒   (p_0 × p_n)

How many groupings? Search space size = ? Let P(n) be the number of parenthesis configurations for n multiplication-compatible matrices. P(1) = P(2) = 1. A1A2A3 ⇒ A1(A2A3) or (A1A2)A3, so P(3) = 2. In general, the outermost grouping is A1...Ak and A_{k+1}...An for 1 ≤ k ≤ n − 1. Hence

  P(n) = \sum_{k=1}^{n-1} P(k) P(n - k).

Let b_k = P(k + 1). Then

  b_n = P(n + 1) = \sum_{k=1}^{n} P(k) P(n - k + 1) = \sum_{k=1}^{n} b_{k-1} b_{n-k} = \sum_{k=0}^{n-1} b_k b_{n-k-1}.

Define the generating function

  B(x) = \sum_{n=0}^{\infty} b_n x^n.

Then

  B^2(x) = (b_0 + b_1 x + b_2 x^2 + \cdots)(b_0 + b_1 x + b_2 x^2 + \cdots)
         = b_0^2 + x(b_0 b_1 + b_1 b_0) + x^2(b_0 b_2 + b_1 b_1 + b_2 b_0) + x^3(b_0 b_3 + b_1 b_2 + b_2 b_1 + b_3 b_0) + \cdots
SLIDE 13

  B^2(x) = c_0 + c_1 x + c_2 x^2 + \cdots + c_n x^n + \cdots,  where  c_n = \sum_{k=0}^{n} b_k b_{n-k}.

In particular,

  c_{n-1} = \sum_{k=0}^{n-1} b_k b_{n-k-1} = b_n.

So

  B^2(x) = b_1 + b_2 x + b_3 x^2 + \cdots
  x B^2(x) = b_1 x + b_2 x^2 + b_3 x^3 + \cdots = B(x) - b_0.

Since b_0 = P(1) = 1,

  x B^2(x) = B(x) - 1
  x B^2(x) - B(x) + 1 = 0
  B(x) = \frac{1 \pm \sqrt{1 - 4x}}{2x}.

We want \lim_{x \to 0} B(x) = b_0 = 1. We have

  \lim_{x \to 0^+} \frac{1 + \sqrt{1 - 4x}}{2x} = \infty,   \lim_{x \to 0^-} \frac{1 + \sqrt{1 - 4x}}{2x} = -\infty,

while

  \lim_{x \to 0} \frac{1 - \sqrt{1 - 4x}}{2x} = \lim_{x \to 0} \frac{-(1/2)(1 - 4x)^{-1/2}(-4)}{2} = \lim_{x \to 0} \frac{1}{(1 - 4x)^{1/2}} = 1.

We conclude

  B(x) = \frac{1 - \sqrt{1 - 4x}}{2x}.

Expand \sqrt{1 - 4x} as a power series about x = 0.
SLIDE 14

  f(x) = (1 - 4x)^{1/2} = d_0 + d_1 x + d_2 x^2 + \cdots

  d_0 = f(0) = 1
  d_1 = f'(0) = \left.\tfrac{1}{2}(1 - 4x)^{-1/2}(-4)\right|_{x=0} = \frac{-4}{2}
  d_2 = \frac{f''(0)}{2!} = \frac{1}{2!}\cdot\left.\tfrac{1}{2}\cdot\tfrac{-1}{2}(1 - 4x)^{-3/2}(-4)^2\right|_{x=0} = \frac{(-4)^2 (1/2)(-1/2)}{2!}
  ...
  d_k = \frac{(-4)^k (1/2)(-1/2)(-3/2) \cdots ((3 - 2k)/2)}{k!} = \frac{(-4)^k (-1)^{k-1} (1)(3)(5) \cdots (2k-3)}{2^k k!}

Therefore

  (1 - 4x)^{1/2} = 1 + \frac{1}{2}(-4x) + \sum_{k=2}^{\infty} \frac{(-1)^{k-1}(-4x)^k}{2^k k!}\,(1)(3)(5)\cdots(2k-3)

  = 1 - 2x - \sum_{k=2}^{\infty} \frac{2^{2k}}{2^k k!} \cdot \frac{(2k-3)!\, x^k}{2 \cdot 4 \cdot 6 \cdot 8 \cdots (2k-4)}
    by resolving the numerator minus signs and powers of 2, and by inserting 2 · 4 · 6 ··· (2k − 4) in numerator and denominator

  = 1 - 2x - \sum_{k=2}^{\infty} \frac{2^k (2k-3)!\, x^k}{k!\, 2^{k-2}\, (1 \cdot 2 \cdot 3 \cdots (k-2))}
    by factoring (k − 2) 2's from the even denominator factors

  = 1 - 2x - \sum_{k=2}^{\infty} \frac{4(2k-3)!}{k!\,(k-2)!}\, x^k = 1 - 2x - 4 \sum_{k=0}^{\infty} \frac{(2k+1)!\, x^{k+2}}{(k+2)!\, k!}
    by cancelling powers of 2, factoring out the remaining 2^2 = 4, and shifting the summation index by 2
SLIDE 15

  B(x) = \frac{1 - \sqrt{1 - 4x}}{2x} = \frac{1}{2x}\left(2x + 4 \sum_{k=0}^{\infty} \frac{(2k+1)!}{(k+2)!\,k!}\, x^{k+2}\right)
    by substituting the above result for \sqrt{1 - 4x}

  = 1 + 2x \sum_{k=0}^{\infty} \frac{(2k+1)!}{(k+2)!\,k!}\, x^k
    by dividing by 2x and factoring x^2 out of the summation

  = 1 + \sum_{k=0}^{\infty} \frac{2(2k+1)!}{(k+2)!\,k!}\, x^{k+1}
    by multiplying the x term into the summation

  = 1 + \sum_{k=1}^{\infty} \frac{2(2k-1)!}{(k+1)!\,(k-1)!}\, x^k
    by shifting the summation index

  = 1 + \sum_{k=1}^{\infty} \frac{2\,[(2k)!/(2k)]}{(k+1)\,k!\,[(k!)/k]}\, x^k = 1 + \sum_{k=1}^{\infty} \frac{2k\,(2k)!}{(2k)\,(k!)^2\,(k+1)}\, x^k

  = 1 + \sum_{k=1}^{\infty} \frac{1}{k+1}\binom{2k}{k} x^k = \sum_{k=0}^{\infty} \frac{1}{k+1}\binom{2k}{k} x^k,  since 0! = 1 by definition.

Since b_n is the coefficient of x^n, we have

  b_n = \frac{1}{n+1}\binom{2n}{n}   and   P(n) = b_{n-1} = \frac{1}{n}\binom{2(n-1)}{n-1}.
SLIDE 16
Check:

  P(1) = \frac{1}{1}\binom{2(0)}{0} = 1
  P(2) = \frac{1}{2}\binom{2(1)}{1} = 1
  P(3) = \frac{1}{3}\binom{2(2)}{2} = 2
  P(4) = \frac{1}{4}\binom{2(3)}{3} = 5.

The five groupings for n = 4:

  A1((A2A3)A4)   A1(A2(A3A4))   (A1A2)(A3A4)   (A1(A2A3))A4   ((A1A2)A3)A4
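The closed form can be checked against the recurrence numerically; a minimal Python sketch (names are mine):

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def P(n):
    """Number of parenthesizations, via P(n) = sum_k P(k) P(n - k)."""
    if n <= 2:
        return 1
    return sum(P(k) * P(n - k) for k in range(1, n))

def P_closed(n):
    """The closed form (1/n) * C(2(n - 1), n - 1); the division is exact."""
    return comb(2 * (n - 1), n - 1) // n

print([P(n) for n in range(1, 9)])   # [1, 1, 2, 5, 14, 42, 132, 429]
print(all(P(n) == P_closed(n) for n in range(1, 15)))   # True
```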
SLIDE 17
Note that

  \binom{2n}{n} = \frac{(2n)!}{(n!)^2} = \frac{1}{n!} \cdot \frac{(2n)(2n-1)(2n-2)\cdots(2)(1)}{n(n-1)(n-2)\cdots(1)}
  = \frac{2^n}{n!}\,(2n-1)(2n-3)\cdots(1)
  = 2^n \cdot \frac{2n-1}{n} \cdot \frac{2(n-1)-1}{n-1} \cdot \frac{2(n-2)-1}{n-2} \cdots \frac{2[n-(n-1)]-1}{1}
  = 2^n \left(2 - \frac{1}{n}\right)\left(2 - \frac{1}{n-1}\right)\cdots\left(2 - \frac{1}{1}\right) \ge 2^n.

So,

  P(n) = \frac{1}{n}\binom{2(n-1)}{n-1} \ge \frac{2^{n-1}}{n} = \frac{2^n}{2n} > \frac{2^n}{2 \cdot 2^{n/2}} = \frac{2^{n/2}}{2} = \Omega(2^{n/2}),

the strict inequality holding once n < 2^{n/2} (n ≥ 5). We again have an exponential search space.

Notes:
1. The quantity \binom{2n}{n}/(n+1) is the nth Catalan number, via Eugène Catalan, Belgian mathematician (1814-1894).
2. Prior use by Euler in the 1700s.
SLIDE 18
Recursive solution: Let A_{i..j} = A_i A_{i+1} ··· A_j. Let m[i, j] = the minimum number of multiplications to compute A_{i..j}. Recall that A_i has dimensions p_{i−1} × p_i, and that m[i, i] = 0. Break A_{i..j} = A_i A_{i+1} A_{i+2} ··· A_j into A_{i..k} and A_{k+1..j} for i ≤ k ≤ j − 1. If the solutions to these subproblems are available, we have a (p_{i−1} × p_k) matrix and a (p_k × p_j) matrix, requiring p_{i−1} p_k p_j multiplications to unite them.

  m[i, j] = 0                                                             if i = j,
  m[i, j] = min_{i ≤ k ≤ j−1} {m[i, k] + m[k + 1, j] + p_{i−1} p_k p_j}   if i < j.

RecursiveMatrixChain(p, i, j) {
    if i = j return 0;
    m[i, j] = ∞;
    for k = i to j − 1 {
        q = RecursiveMatrixChain(p, i, k) + RecursiveMatrixChain(p, k + 1, j) + p_{i−1} p_k p_j;
        if q < m[i, j]
            m[i, j] = q;
    }
    return m[i, j];
}

Time for m[1, n]? Counting only the recursive calls and ignoring the work done therein:
SLIDE 19

  T(n) = \sum_{k=1}^{n-1} [T(k) + T(n-k)]
       = [T(1) + T(n-1)] + [T(2) + T(n-2)] + [T(3) + T(n-3)] + \cdots + [T(n-2) + T(2)] + [T(n-1) + T(1)]
       = 2 \sum_{k=1}^{n-1} T(k)
       \ge 2T(n-1) \ge 4T(n-2) \ge \cdots \ge 2^{n-1} T(1) = \Omega(2^n).
SLIDE 20
Recall the recursion

  m[i, j] = 0                                                             if i = j,
  m[i, j] = min_{i ≤ k ≤ j−1} {m[i, k] + m[k + 1, j] + p_{i−1} p_k p_j}   if i < j.

When computing m[i, j], we need m[i, k] and m[k + 1, j] already computed for i ≤ k ≤ j − 1. That is,

  k = i      needs (i, i),      (i + 1, j)
  k = i + 1  needs (i, i + 1),  (i + 2, j)
  k = i + 2  needs (i, i + 2),  (i + 3, j)
  ...
  k = j − 2  needs (i, j − 2),  (j − 1, j)
  k = j − 1  needs (i, j − 1),  (j, j)

⇒ Fill by diagonals.
SLIDE 21
MatrixChainOrder(p) {
    n = length(p) − 1;
    for i = 1 to n
        m[i, i] = 0;
    for d = 2 to n {                  // chain length
        for i = 1 to n − d + 1 {      // left anchor
            j = i + d − 1;            // right anchor
            m[i, j] = ∞;
            for k = i to j − 1 {      // break points
                q = m[i, k] + m[k + 1, j] + p[i − 1] p[k] p[j];
                if q < m[i, j] {
                    m[i, j] = q;
                    s[i, j] = k;      // to recover parenthesis schemes
                }
            }
        }
    }
    return (m, s);
}
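A Python transcription of MatrixChainOrder, applied to the three-matrix example of Slide 12; the `parenthesization` helper recovers the grouping from s (names are mine):

```python
import sys

def matrix_chain_order(p):
    """A_i has dimensions p[i-1] x p[i]; m[i][j] = min multiplications for A_i..A_j."""
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for d in range(2, n + 1):              # chain length
        for i in range(1, n - d + 2):      # left anchor
            j = i + d - 1                  # right anchor
            m[i][j] = sys.maxsize
            for k in range(i, j):          # break points
                q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if q < m[i][j]:
                    m[i][j], s[i][j] = q, k
    return m, s

def parenthesization(s, i, j):
    if i == j:
        return f"A{i}"
    k = s[i][j]
    return f"({parenthesization(s, i, k)}{parenthesization(s, k + 1, j)})"

# A1 is 10 x 100, A2 is 100 x 5, A3 is 5 x 50.
m, s = matrix_chain_order([10, 100, 5, 50])
print(m[1][3], parenthesization(s, 1, 3))   # 7500 ((A1A2)A3)
```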
SLIDE 22
Upper bound: Θ(n²) cells are filled (see the sketch to the left above), each involving at most n competitors. Each competitor requires Θ(1) time to determine its score.

  T(n) ≤ K n² · n = O(n³).

Lower bound: the upper triangle in the right sketch above involves (1/2)(n/2)(n/2) cells, each of which involves at least n/2 competitors, since the distance from any of these cells to the diagonal is at least n/2.

  T(n) ≥ K (1/2)(n/2)²(n/2) = K n³/16 = Ω(n³).

We conclude that T(n) = Θ(n³).

Structure recovery: suppose the parallel array s is

  j = 6:  3  3  3  5  5  x
  j = 5:  3  3  3  4  x
  j = 4:  3  3  3  x
  j = 3:  1  2  x
  j = 2:  1  x
  j = 1:  x
     i =  1  2  3  4  5  6
SLIDE 23
MatrixChainMultiply(A, s, i, j) {    // input is A = (A1, A2, ..., An)
    if j = i return A_i;
    X = MatrixChainMultiply(A, s, i, s[i, j]);
    Y = MatrixChainMultiply(A, s, s[i, j] + 1, j);
    return X ∗ Y;
}

DP-Multiply(A) {                     // input is A = (A1, A2, ..., An)
    n = length(A);
    p_0 = rows(A_1);
    for i = 1 to n
        p_i = columns(A_i);
    (m, s) = MatrixChainOrder(p);
    return MatrixChainMultiply(A, s, 1, n);
}
SLIDE 24
Memoization: compute globals m(i, j), s(i, j) as needed.

LookupChain(p, i, j) {
    if m(i, j) < ∞ return m(i, j);
    if i = j {
        m(i, j) = 0;
        return m(i, j);
    }
    for k = i to j − 1 {
        q = LookupChain(p, i, k) + LookupChain(p, k + 1, j) + p_{i−1} p_k p_j;
        if q < m(i, j) {
            m(i, j) = q;
            s(i, j) = k;
        }
    }
    return m(i, j);
}

Memoized-DP-Multiply(p) {
    n = length(p) − 1;
    for i = 1 to n
        for j = 1 to n
            m(i, j) = ∞;
    LookupChain(p, 1, n);
}
SLIDE 25
CYK Parsing Algorithm (Cocke-Younger-Kasami). We assume a grammar (V, T, S, P) in Chomsky Normal Form, where
1. T is a set of terminals: tokens that appear in strings generated by the grammar,
2. V is a set of nonterminals, variables that denote certain subgroupings of terminals,
3. S ∈ V is the starting symbol,
4. P is a collection of production rules that provide alternatives for rewriting strings (sentential forms) and are restricted to two formats: (a) A → BC or (b) A → a.

Note the conventions:
1. Upper-case Roman toward the beginning of the alphabet: a single nonterminal.
2. Upper-case Roman toward the end of the alphabet (except S): a string of nonterminals.
3. Lower-case Roman toward the beginning of the alphabet: a single terminal.
4. Lower-case Roman toward the end of the alphabet: a string of terminals.
5. Lower-case Greek: a sentential form (a mixture of terminals and nonterminals).

Problem: Given a terminal string x = a_{i_1} a_{i_2} ... a_{i_n} ∈ T∗, find a parse tree yielding x, if such a parse tree exists. For example, the following parse tree derives a_1 ... a_7.
SLIDE 26
Ignoring the last stage, which derives the terminals, a parse tree for a terminal string of length n is a full binary tree with n leaves. A full binary tree is a binary tree in which each internal node has two children. Let P(n) denote the number of full binary trees having n leaves. We have P(1) = P(2) = 1 and, in general,

  P(n) = \sum_{k=1}^{n-1} P(k) P(n-k),

corresponding to k ≥ 1 leaves deriving from the left subtree of the root, and the remaining n − k leaves deriving from the right subtree. This is precisely the recursion that we encountered in the matrix-chain multiplication problem:

  P(n) = \frac{1}{n}\binom{2(n-1)}{n-1},

which we have shown to be exponential.

Given a particular tree, we have few choices for the leaf nonterminals, as they must directly deliver the known terminals. However, the grammar may provide a great
SLIDE 27
number of rules for choosing the parent nonterminal associated with a pair of leaf siblings. Consequently, this exponential lower bound may be much too low.

Define X_{i..j} = {A ∈ V : A ⇒∗ a_i ... a_j}. That is, X_{i..j} is the collection of nonterminals A for which there is a parse tree rooted at A that derives a_i ... a_j. For each i,

  X_{i..i} = {A ∈ V : (A → a_i) ∈ P}.

Also, for j > i,

  X_{i..j} = \bigcup_{k=i}^{j-1} {A ∈ V : (A → BC) ∈ P and B ∈ X_{i..k}, C ∈ X_{k+1..j}}.

Finally, a parse for a_1 ... a_n exists if and only if S ∈ X_{1..n}.

RecursiveParse(G, a, i, j) {    // G = (V, T, S, P), a = a1 ... an
    if i = j return {A ∈ V : (A → a_i) ∈ P};
    X = ∅;
    for k = i to j − 1 {
        L = RecursiveParse(G, a, i, k);
        R = RecursiveParse(G, a, k + 1, j);
        X = X ∪ {A ∈ V : (A → BC) ∈ P and B ∈ L, C ∈ R};
    }
    return X;
}
SLIDE 28
Let T(n) count only the recursive calls within RecursiveParse(G, a, 1, n). We have

  T(1) = 0
  T(2) = 2
  T(3) = T(1) + T(2) + T(2) + T(1) = 2[T(1) + T(2)] = 4
  T(4) = T(1) + T(3) + T(2) + T(2) + T(3) + T(1) = 2[T(1) + T(2) + T(3)] = 12
  T(n) = 2 \sum_{k=1}^{n-1} T(k) > 2T(n-1) > 2^2 T(n-2) > \cdots > 2^{n-1},

exponential growth. However, if we store the X_{i..j} solutions, computing them by rising diagonals, we have

CYKparse(G, a) {    // G = (V, T, S, P), a = a1 ... an
    for i = 1 to n {
        X[i, i] = {A ∈ V : (A → a_i) ∈ P};
        for j = i + 1 to n
            X[i, j] = ∅;
    }
    for d = 1 to n − 1
        for i = 1 to n − d {
            j = i + d;
            for k = i to j − 1
                for (A → BC) ∈ P
                    if (B ∈ X[i, k] and C ∈ X[k + 1, j])
                        X[i, j] = X[i, j] ∪ {A};
        }
    if S ∈ X[1, n] return true;
    return false;
}

CYKparse is manifestly Θ(n³), as the grammar is of finite size, and we can assume set operations of amortized Θ(1) time, as in the analysis of the Kruskal spanning-tree algorithm.
SLIDE 29
Consider the grammar

  1  S → AB
  2  S → BC
  3  A → BA
  4  A → a
  5  B → CC
  6  B → b
  7  C → AB
  8  C → a.

The following table details the derivation of baaba in this grammar. The notation B52 in (column 2, row 4), for example, means that nonterminal B is placed in set X_{2..4} because rule 5 is B → CC, because C appears in (column 2, row 2), and because C appears in (column 3, row 4). That is, C ∈ X_{2..2} and C ∈ X_{3..4}.

  5 | S21 A31 C72 S12 | S12 C72 A33 S24 A34 | B54     | A34 S24 | A4 C8
  4 |                 | B52                 | S13 C73 | B6      |
  3 |                 | B52                 | A4 C8   |         |
  2 | A31 S21         | A4 C8               |         |         |
  1 | B6              |                     |         |         |
    |  b              |  a                  |  a      |  b      |  a

Since S appears in (column 1, row 5), corresponding to X_{1..5}, we have at least one derivation. These subscripts enable us to recover the derivations.

  S21 → B6C72 → bC72 → bA4B54 → baB54 → baC73C8 → baA4B6C8 → baaB6C8 → baabC8 → baaba
  S12 → A31B54 → B6A4B54 → bA4B54 → baB54 → baC73C8 → baA4B6C8 → baaB6C8 → baabC8 → baaba
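A Python sketch of CYKparse run on this grammar; representing the binary rules and terminal rules as tuple lists is my choice:

```python
def cyk_parse(rules, terminals, a):
    """rules: (A, (B, C)) for A -> BC; terminals: (A, ch) for A -> ch.
    Returns the table X with X[i][j] = X_{i..j} (1-based, j >= i)."""
    n = len(a)
    X = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i in range(1, n + 1):
        X[i][i] = {A for (A, ch) in terminals if ch == a[i - 1]}
    for d in range(1, n):                  # rising diagonals
        for i in range(1, n - d + 1):
            j = i + d
            for k in range(i, j):
                for (A, (B, C)) in rules:
                    if B in X[i][k] and C in X[k + 1][j]:
                        X[i][j].add(A)
    return X

rules = [("S", ("A", "B")), ("S", ("B", "C")), ("A", ("B", "A")),
         ("B", ("C", "C")), ("C", ("A", "B"))]
terminals = [("A", "a"), ("B", "b"), ("C", "a")]

X = cyk_parse(rules, terminals, "baaba")
print("S" in X[1][5])   # True: baaba is derivable
print(sorted(X[1][5]))  # ['A', 'C', 'S']
```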
SLIDE 30
Longest Common Subsequence. Basis of the BLAST algorithm (NIH, 1990), which was essential for the first DNA sequencing results.

Definitions.
1. For a string X, let |X| denote the length of X.
2. Given a string X = x_1 x_2 ... x_m, the selection X̂ = x_{i_1} x_{i_2} ... x_{i_n} is a subsequence of X if i_1 < i_2 < ... < i_n.
3. Given strings X = x_1 x_2 x_3 ... x_m and Y = y_1 y_2 y_3 ... y_n, a common subsequence of X and Y is a string Z such that Z is a subsequence of X and Z is a subsequence of Y.
4. The longest common subsequence of X and Y is a common subsequence Z of X and Y such that |Z| = max{|W| : W is a common subsequence of X and Y}. We denote the longest common subsequence of X and Y by LCS(X, Y).

If |X| = n, there are
  1 subsequence of size 0,
  n subsequences of size 1,
  \binom{n}{2} subsequences of size 2,
  ...
  \binom{n}{k} subsequences of size k,
  ...
SLIDE 31
Consequently, there are

  \sum_{k=0}^{n} \binom{n}{k} = 2^n

subsequences of X. The search space for the longest common subsequence of X and Y is therefore exponential. A brute-force examination of this search space generates each subsequence of X and checks to see whether it is a subsequence of Y; if so, its length updates a running maximum, initially zero. This approach is Ω(2^{|X|}).

Let X_i = x_1 x_2 ... x_i, a prefix of X. Note that X_0 = ε is the empty prefix. Given strings X and Y, consider the subproblem LCS(X_i, Y_j). We observe
1. If i = 0 or j = 0, then LCS(X_i, Y_j) = ε.
2. Otherwise, i ≥ 1 and j ≥ 1, and one of the following two cases prevails.
   (a) x_i = y_j, and LCS(X_i, Y_j) = LCS(X_{i−1}, Y_{j−1}) · x_i.
   (b) x_i ≠ y_j, and LCS(X_i, Y_j) = the longer of LCS(X_{i−1}, Y_j) and LCS(X_i, Y_{j−1}).

To verify case (a), we reason as follows. We have x_i = y_j, and we let the LCS be x_{i_1} x_{i_2} ... x_{i_t} = y_{j_1} y_{j_2} ... y_{j_t}. If both i_t < i and j_t < j, then we can append x_i = y_j to the subsequence to obtain a longer common subsequence, a contradiction. The sketch below illustrates this situation. We conclude that at least one of i_t = i or j_t = j must hold. If i_t = i but j_t < j, we can advance j_t to j because y_{j_t} = y_j = x_i, without changing the LCS. Similarly, if j_t = j but i_t < i, we can advance i_t to i, again without changing the LCS. The sketch below depicts the scenario in which we advance j_t.
SLIDE 32
Via one of these variations, we arrive at a longest common subsequence in which the last character is x_i = y_j. Optimal substructure then forces x_{i_1} x_{i_2} ... x_{i_{t−1}} and y_{j_1} y_{j_2} ... y_{j_{t−1}} to constitute a longest common subsequence for X_{i−1} and Y_{j−1}. That is, LCS(X_i, Y_j) = LCS(X_{i−1}, Y_{j−1}) · x_i.

To verify case (b), we reason as follows. We now have x_i ≠ y_j, and we again denote the LCS as x_{i_1} x_{i_2} ... x_{i_t} = y_{j_1} y_{j_2} ... y_{j_t}. In this situation, we must have i_t < i or j_t < j or both, because the two sequences cannot both use the last character position. The possibilities are as follows. In these scenarios, the LCS is also a longest common subsequence of X_{i−1} and Y_j, or of X_i and Y_{j−1}, or of both.

We now have the following recurrence, where c[i, j] = |LCS(X_i, Y_j)|:

  c[i, j] = 0                               if i = 0 or j = 0,
  c[i, j] = c[i − 1, j − 1] + 1             if i, j > 0 and x_i = y_j,
  c[i, j] = max(c[i, j − 1], c[i − 1, j])   if i, j > 0 and x_i ≠ y_j.
SLIDE 33
We call the following recursive routine as Recursive-LCS(X.length, Y.length).

Recursive-LCS(i, j) {
    if (i = 0 or j = 0) return 0;
    if (X[i] = Y[j]) return 1 + Recursive-LCS(i − 1, j − 1);
    a = Recursive-LCS(i, j − 1);
    b = Recursive-LCS(i − 1, j);
    if a ≥ b return a;
    return b;
}

Given inputs X = aaa...a and Y = bbb...b, where |X| = |Y| = n, we note that the test if (X[i] = Y[j]) will always be false. Consequently,

  T(n, n) ≥ 2 + T(n, n − 1) + T(n − 1, n)
          ≥ 2 + [2 + T(n, n − 2) + T(n − 1, n − 1)] + [2 + T(n − 1, n − 1) + T(n − 2, n)]
          = 6 + T(n, n − 2) + 2T(n − 1, n − 1) + T(n − 2, n)
          ≥ 2T(n − 1, n − 1) ≥ 4T(n − 2, n − 2) ≥ ... ≥ 2^{n−1} T(1, 1) = 2^{n−1} · 2 ≥ 2^n.

We conclude that T(n, n) = Ω(2^n), an exponential algorithm. However, the recurrence

  c[i, j] = 0                               if i = 0 or j = 0,
  c[i, j] = c[i − 1, j − 1] + 1             if i, j > 0 and x_i = y_j,
  c[i, j] = max(c[i, j − 1], c[i − 1, j])   if i, j > 0 and x_i ≠ y_j,

can be calculated by storing the c[i, j] values in a two-dimensional array, as follows.
SLIDE 34
[Sketch: the (m + 1) × (n + 1) array c, rows i = 0..m, columns j = 0..n; each interior cell references its northwest (տ), north (↑), and west (←) neighbors.]

The initial condition, c[i, j] = 0 if either i = 0 or j = 0, enables filling the top row and leftmost column. Subsequent cells reference the cells to the north, west, and northwest to compute the desired recurrence. Along with the c[i, j] value, we can retain in each cell an arrow that indicates the predecessor cell whose value determined c[i, j]. The following example computes the LCS of ABCBDAB and BDCABA.

           B    D    C    A    B    A
       0   0    0    0    0    0    0
  A    0*  ↑0   ↑0   ↑0   տ1   ←1   տ1
  B    0   տ1*  ←1*  ←1   ↑1   տ2   ←2
  C    0   ↑1   ↑1   տ2*  ←2*  ↑2   ↑2
  B    0   տ1   ↑1   ↑2   ↑2   տ3*  ←3
  D    0   ↑1   տ2   ↑2   ↑2   ↑3*  ↑3
  A    0   ↑1   ↑2   ↑2   տ3   ↑3   տ4*
  B    0   տ1   ↑2   ↑2   ↑3   տ4   ↑4*

The length of the LCS appears in the lower-right cell, in this case 4. The actual sequence is recovered by tracing a path through the arrows from the lower-right cell (the * marks the cells visited). A symbol is recorded, in reverse order, reflecting the source of each diagonal arrow. In this example, we recover BCBA.

Multiple LCSs may exist, and this fact will be indicated by ties between the upper and left neighbors. A back-tracking algorithm could chase down all such sequences. The LCS procedure, coded below, performs Θ(1) work for each cell in the array, leading to a runtime of T(n, m) = Θ(nm). The scan for recovering the LCS is Θ(m + n).
SLIDE 35
LCS(X, Y) {
    m = X.length;
    n = Y.length;
    for i = 0 to m
        c[i, 0] = 0;
    for j = 0 to n
        c[0, j] = 0;
    for i = 1 to m
        for j = 1 to n
            if X[i] = Y[j] {
                c[i, j] = c[i − 1, j − 1] + 1;
                b[i, j] = տ;
            } else if c[i − 1, j] ≥ c[i, j − 1] {
                c[i, j] = c[i − 1, j];
                b[i, j] = ↑;
            } else {
                c[i, j] = c[i, j − 1];
                b[i, j] = ←;
            }
    return (c, b);
}

Print-LCS(b, X) {
    m = b.length;
    n = b[0].length;
    Print-Recursive(b, X, m, n);
}

Print-Recursive(b, X, i, j) {
    if (i = 0 or j = 0) return;
    if b[i, j] = տ {
        Print-Recursive(b, X, i − 1, j − 1);
        print X[i];
    } else if b[i, j] = ↑
        Print-Recursive(b, X, i − 1, j);
    else
        Print-Recursive(b, X, i, j − 1);
}
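A Python transcription of LCS and the recovery scan, run on the Slide 34 example; the strings "NW", "N", "W" stand in for the arrows տ, ↑, ← (names are mine):

```python
def lcs(X, Y):
    """Fill c[i][j] = |LCS(X_i, Y_j)| and the arrow table b."""
    m, n = len(X), len(Y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    b = [[None] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                c[i][j], b[i][j] = c[i - 1][j - 1] + 1, "NW"
            elif c[i - 1][j] >= c[i][j - 1]:     # ties go to the upper neighbor
                c[i][j], b[i][j] = c[i - 1][j], "N"
            else:
                c[i][j], b[i][j] = c[i][j - 1], "W"
    return c, b

def recover(b, X, i, j):
    """Trace the arrows from (i, j); record X[i] at each diagonal step."""
    if i == 0 or j == 0:
        return ""
    if b[i][j] == "NW":
        return recover(b, X, i - 1, j - 1) + X[i - 1]
    if b[i][j] == "N":
        return recover(b, X, i - 1, j)
    return recover(b, X, i, j - 1)

c, b = lcs("ABCBDAB", "BDCABA")
print(c[7][6], recover(b, "ABCBDAB", 7, 6))   # 4 BCBA
```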