CSC373 Week 3: Dynamic Programming
373F20 - Nisarg Shah 1
Recap: Greedy algorithms
➢ Interval scheduling
➢ Interval partitioning
➢ Minimizing lateness
➢ Huffman encoding
➢ …
"The 1950s were not good years for mathematical research. We had a very interesting gentleman in Washington named Wilson. He was Secretary of Defense, and he actually had a pathological fear and hatred of the word 'research'. I'm not using the term lightly; I'm using it precisely. His face would suffuse, he would turn red, and he would get violent if people used the term 'research' in his presence. You can imagine how he felt, then, about the term 'mathematical'. The RAND Corporation was employed by the Air Force, and the Air Force had Wilson as its boss, essentially. Hence, I felt I had to do something to shield Wilson and the Air Force from the fact that I was really doing mathematics inside the RAND Corporation. What title, what name, could I choose?"
— Richard Bellman, on the origin of his term 'dynamic programming' (1984)
Dynamic programming:
➢ Breaking the problem down into simpler subproblems, solving each subproblem just once, and storing their solutions
➢ The next time the same subproblem occurs, instead of recomputing its solution, simply look up the previously computed solution
➢ Hopefully, we save a lot of computation at the expense of a modest increase in storage space
➢ Also called "memoization"
Weighted interval scheduling:
➢ Job j starts at time s_j and finishes at time f_j
➢ Each job j has a weight w_j > 0
➢ Two jobs are compatible if they don't overlap
➢ Goal: find a set S of mutually compatible jobs with the highest total weight Σ_{j ∈ S} w_j
➢ Note: if every w_j = 1, then this is simply the interval scheduling problem from last week
➢ Recall: the greedy algorithm based on earliest finish time ordering was optimal for that special case
➢ What if we run the earliest-finish-time greedy on the weighted problem?
➢ It fails spectacularly! A single light job that finishes first can block a heavy overlapping job.
➢ Other natural greedy heuristics fail too:
➢ By weight: choose jobs with the highest w_j first
➢ Maximum weight per unit time: choose jobs with the highest w_j / (f_j − s_j) first
➢ ...
➢ They're all arbitrarily worse than the optimal solution
➢ In fact, under a certain formalization, "no greedy algorithm" can solve this problem optimally
Notation:
➢ Jobs are sorted by finish time: f_1 ≤ f_2 ≤ … ≤ f_n
➢ p[j] = largest index i < j such that job i is compatible with job j (i.e., f_i < s_j); p[j] = 0 if no such i exists
➢ Among the jobs before job j, the ones compatible with it are precisely 1, …, p[j]
➢ E.g. p[8] = 1, p[7] = 3, p[2] = 0
➢ Let OPT be an optimal solution
➢ Two options regarding job n:
➢ If job n is in OPT, then jobs p[n]+1, …, n−1 are not, and the rest of OPT must be an optimal solution for jobs 1, …, p[n]
➢ If job n is not in OPT, then OPT is an optimal solution for jobs 1, …, n−1
➢ OPT is the best of both options
➢ Notice that in both options, we need to solve the same type of problem on a prefix of the jobs
➢ OPT(j) = max total weight of compatible jobs from 1, …, j
➢ Base case: OPT(0) = 0
➢ Two cases regarding job j: either it is not chosen, giving OPT(j − 1), or it is chosen, giving w_j + OPT(p[j])
➢ Bellman equation:

OPT(j) = max{ OPT(j − 1), w_j + OPT(p[j]) } for j ≥ 1
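The Bellman equation translates directly into a memoized recursion. Below is a minimal Python sketch (illustrative, not from the slides); the function name and input format are assumptions, and it treats jobs whose endpoints touch as compatible (f_i ≤ s_j rather than a strict inequality):

```python
import bisect
from functools import lru_cache

def max_weight_schedule(jobs):
    """jobs: list of (start, finish, weight) tuples.
    Returns the maximum total weight of mutually compatible jobs."""
    jobs = sorted(jobs, key=lambda job: job[1])   # sort by finish time
    n = len(jobs)
    finishes = [f for _, f, _ in jobs]
    # p[j] (1-indexed): largest i < j with f_i <= s_j, found by binary search
    p = [0] * (n + 1)
    for j in range(1, n + 1):
        s_j = jobs[j - 1][0]
        p[j] = bisect.bisect_right(finishes, s_j, 0, j - 1)

    @lru_cache(maxsize=None)
    def opt(j):
        if j == 0:
            return 0                               # base case
        w_j = jobs[j - 1][2]
        return max(opt(j - 1), w_j + opt(p[j]))    # Bellman equation

    return opt(n)
```

The n binary searches for p[j] account for the O(n log n) preprocessing discussed below.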
➢ Without memoization, plain recursion can be exponential
➢ It is possible that p[j] = j − 1 for each j
➢ Then calling COMPUTE-OPT(j − 1) and COMPUTE-OPT(p[j]) separately doubles the work at every level
➢ We can slightly optimize, e.g. by making only one call when p[j] = j − 1, but the worst-case running time remains exponential
➢ Some solutions, e.g. COMPUTE-OPT(3), are being computed many, many times in the recursion tree
➢ Simply remember what you've already computed, and re-use it instead of recomputing
Running time:
➢ Sorting jobs takes O(n log n)
➢ It also takes O(n log n) to do n binary searches to compute p[j] for every j
➢ M-COMPUTE-OPT(j) is called at most once for each j
➢ Each such call takes O(1) time, not considering the time spent inside its recursive calls
➢ So M-COMPUTE-OPT(n) takes only O(n) time
➢ Overall time is O(n log n)
➢ Top-down (memoized recursion) may be preferred…
➢ …when not all sub-solutions need to be computed on some inputs
➢ …because one does not need to think of the "right order" in which to solve the subproblems
➢ Bottom-up (iterative table-filling) may be preferred…
➢ …when all sub-solutions will need to be computed anyway
➢ …because it is faster, as it prevents recursive call overhead
➢ …because sometimes we can free up memory early
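As a sketch of the bottom-up alternative (illustrative, not from the slides), the same recurrence can be evaluated iteratively in increasing order of j; jobs touching at endpoints are again treated as compatible:

```python
import bisect

def max_weight_schedule_bottom_up(jobs):
    """Bottom-up version: fill OPT[0..n] in increasing order of j."""
    jobs = sorted(jobs, key=lambda job: job[1])    # sort by finish time
    n = len(jobs)
    finishes = [f for _, f, _ in jobs]
    OPT = [0] * (n + 1)                            # OPT[0] = 0 is the base case
    for j in range(1, n + 1):
        s_j, _, w_j = jobs[j - 1]
        p_j = bisect.bisect_right(finishes, s_j, 0, j - 1)  # binary search for p[j]
        OPT[j] = max(OPT[j - 1], w_j + OPT[p_j])            # Bellman equation
    return OPT[n]
```

Here the "right order" is simply j = 1, …, n, and every subproblem is computed exactly once.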
➢ Idea: maintain the optimal value and an optimal solution together
➢ So, we compute two quantities:

OPT(j) = 0 if j = 0
OPT(j) = max{ OPT(j − 1), w_j + OPT(p[j]) } if j > 0

S(j) = ∅ if j = 0
S(j) = S(j − 1) if j > 0 ∧ OPT(j − 1) ≥ w_j + OPT(p[j])
S(j) = {j} ∪ S(p[j]) if j > 0 ∧ OPT(j − 1) < w_j + OPT(p[j])
➢ This works with both top-down (memoization) and bottom-up approaches
➢ In this problem, we can do something simpler: just compute OPT first, and later recover an optimal solution using only the OPT values
➢ Alternatively, store only the decision made at each j:

D(j) = ⊥ if j = 0
D(j) = M if j > 0 ∧ OPT(j − 1) ≥ w_j + OPT(p[j])
D(j) = S if j > 0 ∧ OPT(j − 1) < w_j + OPT(p[j])

➢ To recover a solution, trace back from j = n:
➢ If D(j) = M, update j ← j − 1
➢ If D(j) = S, add j to the solution and update j ← p[j]
➢ If D(j) = ⊥, stop
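A sketch of this traceback (illustrative names, not from the slides): instead of storing D(j) explicitly, it re-compares the two options at each j, which is equivalent and needs only the OPT table.

```python
import bisect

def recover_schedule(jobs):
    """Computes OPT bottom-up, then traces back through the table to
    recover one optimal set of jobs (returned as 1-based indices)."""
    jobs = sorted(jobs, key=lambda job: job[1])
    n = len(jobs)
    finishes = [f for _, f, _ in jobs]
    p = [0] * (n + 1)
    OPT = [0] * (n + 1)
    for j in range(1, n + 1):
        s_j, _, w_j = jobs[j - 1]
        p[j] = bisect.bisect_right(finishes, s_j, 0, j - 1)
        OPT[j] = max(OPT[j - 1], w_j + OPT[p[j]])
    # Trace back from j = n, re-deriving the decision D(j) on the fly
    chosen, j = [], n
    while j > 0:
        w_j = jobs[j - 1][2]
        if OPT[j - 1] >= w_j + OPT[p[j]]:
            j -= 1                      # decision M: skip job j
        else:
            chosen.append(j)            # decision S: take job j
            j = p[j]
    return OPT[n], sorted(chosen)
```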
Optimal substructure:
➢ Optimal solution to a problem can be computed easily given optimal solutions to its subproblems
➢ Divide-and-conquer is a special case in which the subproblems do not overlap, so each is solved at most once
➢ So there's no need for memoization
➢ In dynamic programming, two of the subproblems may in turn need to solve a common subproblem, which is exactly where memoization pays off
Knapsack problem:
➢ n items: item i provides value v_i > 0 and has weight w_i > 0
➢ Knapsack has weight capacity W
➢ Assumption: W and all the v_i-s and w_i-s are integers
➢ Goal: pack the knapsack with a collection of items of maximum total value whose total weight is at most W
First attempt, with one parameter:
➢ OPT(w) = maximum value we can pack using capacity w
➢ Goal: compute OPT(W)
➢ Claim: a packing for capacity w must use at least one item i with weight w_i ≤ w (if it uses any item at all)
➢ Let w* = min_i w_i
➢ OPT(w) = 0 if w < w*
➢ OPT(w) = max_{i : w_i ≤ w} { v_i + OPT(w − w_i) } if w ≥ w*
➢ Problem: this recurrence might use an item more than once!
Second attempt, with two parameters:
➢ OPT(i, w) = maximum value we can pack using only items 1, …, i and capacity w
➢ Goal: compute OPT(n, W)
➢ If w_i > w, then we can't choose item i; just use OPT(i − 1, w)
➢ If w_i ≤ w, there are two cases: either skip item i, giving OPT(i − 1, w), or take it, giving v_i + OPT(i − 1, w − w_i)
➢ Bellman equation: OPT(0, w) = 0, and for i > 0:

OPT(i, w) = OPT(i − 1, w) if w_i > w
OPT(i, w) = max{ OPT(i − 1, w), v_i + OPT(i − 1, w − w_i) } if w_i ≤ w
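The two-parameter recurrence can be sketched bottom-up as follows (illustrative function name, not from the slides); the two nested loops make the O(n ⋅ W) running time visible:

```python
def knapsack(values, weights, W):
    """0-1 knapsack: OPT[i][w] = max value using items 1..i with capacity w.
    Runs in O(n * W) time (pseudo-polynomial)."""
    n = len(values)
    OPT = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        v_i, w_i = values[i - 1], weights[i - 1]
        for w in range(W + 1):
            if w_i > w:
                OPT[i][w] = OPT[i - 1][w]                     # item i doesn't fit
            else:
                OPT[i][w] = max(OPT[i - 1][w],                # skip item i
                                v_i + OPT[i - 1][w - w_i])    # take item i
    return OPT[n][W]
```

Because row i reads only row i − 1, an item is never counted twice, which fixes the flaw of the one-parameter attempt.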
➢ i ∈ {1, …, n}, w ∈ {1, …, W} (recall that weights and capacity are integers)
➢ There are O(n ⋅ W) possible evaluations of OPT
➢ Each is evaluated at most once (memoization)
➢ Each takes O(1) time to evaluate
➢ So the total running time is O(n ⋅ W)
➢ Q: Is this polynomial in the input size?
➢ A: No! The capacity W takes only log W bits to write down. But the algorithm is pseudo-polynomial: polynomial when the numbers in the input are small.
➢ What if the weights are huge but the values are small integers?
➢ Then we can use a different dynamic programming formulation that swaps the roles of the values and the weights
➢ OPT(i, v) = minimum capacity needed to get value at least v from items 1, …, i
➢ Goal: compute max{ v : OPT(n, v) ≤ W }
➢ If we choose item i, we need capacity w_i + OPT(i − 1, v − v_i)
➢ If we don't choose item i, we need capacity OPT(i − 1, v)
➢ Bellman equation:

OPT(i, v) = 0 if v ≤ 0
OPT(i, v) = ∞ if v > 0 ∧ i = 0
OPT(i, v) = min{ w_i + OPT(i − 1, v − v_i), OPT(i − 1, v) } if v > 0 ∧ i > 0
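A bottom-up sketch of this value-indexed DP (illustrative, not from the slides; it stores one row of the table at a time, where V = Σ_i v_i bounds the achievable value):

```python
import math

def knapsack_by_value(values, weights, W):
    """OPT(i, v) = min total weight needed to achieve value >= v using items
    1..i. The answer is the largest v with OPT(n, v) <= W. O(n * V) time."""
    n, V = len(values), sum(values)
    INF = math.inf
    OPT = [0] + [INF] * V          # row i = 0: value v > 0 is impossible
    for i in range(n):
        v_i, w_i = values[i], weights[i]
        new = OPT[:]
        for v in range(1, V + 1):
            # taking item i leaves value max(v - v_i, 0) for items 1..i-1
            take = w_i + OPT[max(v - v_i, 0)]
            new[v] = min(OPT[v], take)
        OPT = new
    return max(v for v in range(V + 1) if OPT[v] <= W)
```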
➢ The running time is now O(n ⋅ V), where V = Σ_i v_i: pseudo-polynomial in the values instead of the weights
➢ Q: Can we get a truly polynomial-time algorithm?
➢ Not likely.
➢ The knapsack problem is NP-complete (we'll see later).
Approximation:
➢ For any ε > 0, we can get a value that is within a (1 + ε) factor of the optimal value in time O(poly(n, log W, log V, 1/ε))
➢ Such algorithms are known as fully polynomial-time approximation schemes (FPTAS)
➢ Core idea behind the FPTAS for knapsack: round the values to multiples of a coarser unit (losing only an ε fraction of the optimum), and run the value-based DP on the rounded, much smaller values
Single-source shortest paths:
➢ Input: a directed graph G = (V, E) with edge lengths ℓ_{vw}
➢ Goal: compute the length of the shortest path from s to t
➢ Dijkstra's algorithm can be used for this purpose
➢ But it fails when some edge lengths can be negative
➢ What do we do in this case?
➢ If some s ⇝ t walk contains a negative-length cycle, you can traverse the cycle arbitrarily many times to get an arbitrarily small total length
➢ So the shortest path from s to t is not well-defined in this case
➢ Shortest paths are well-defined even when some of the edge lengths are negative, as long as there is no negative cycle
➢ In that case, some shortest s ⇝ t path is simple, so it has at most n − 1 edges
➢ Proof: consider the shortest s ⇝ t path with the fewest edges
➢ If it has a cycle, removing the cycle creates a path with no greater length (the cycle's length is ≥ 0) and fewer edges, a contradiction
➢ Consider a shortest s ⇝ t path P
➢ It could be just a single edge
➢ But if P has more than one edge, consider the vertex u which appears just before t on P
➢ If P is a shortest s ⇝ t path, its s ⇝ u prefix must be a shortest s ⇝ u path as well, and it uses fewer edges than P
➢ OPT(v, j) = length of the shortest s ⇝ v path using at most j edges
➢ Either this path uses at most j − 1 edges ⇒ OPT(v, j − 1)
➢ Or it uses exactly j edges ⇒ min_u { OPT(u, j − 1) + ℓ_{uv} }, minimizing over edges (u, v)
➢ Bellman equation:

OPT(v, j) = 0 if j = 0 ∧ v = s
OPT(v, j) = ∞ if j = 0 ∧ v ≠ s
OPT(v, j) = min{ OPT(v, j − 1), min_u { OPT(u, j − 1) + ℓ_{uv} } } if j > 0

➢ The answer is OPT(t, n − 1)
➢ Running time: O(n²) calls, each takes O(n) time ⇒ O(n³)
➢ Q: What do you need to store to also get the actual paths?
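The Bellman equation can be evaluated bottom-up over j = 1, …, n − 1, keeping only two rows of the table. A minimal sketch (illustrative names, not from the slides), assuming vertices are numbered 0, …, n−1 and there are no negative cycles:

```python
import math

def shortest_paths_dp(n, edges, s):
    """Bellman-Ford style DP. n vertices (0..n-1); edges: list of
    (u, v, length) with negative lengths allowed, but no negative cycles.
    Returns dist[v] = length of shortest s ~> v path (math.inf if unreachable)."""
    INF = math.inf
    # prev[v] = OPT(v, j-1): shortest s ~> v path using at most j-1 edges
    prev = [INF] * n
    prev[s] = 0                                   # base case: j = 0
    for j in range(1, n):                         # at most n-1 edges suffice
        cur = prev[:]                             # option: use <= j-1 edges
        for u, v, length in edges:
            if prev[u] + length < cur[v]:         # option: last edge is (u, v)
                cur[v] = prev[u] + length
        prev = cur
    return prev
```

Keeping only two rows already hints at the O(n) space bound mentioned next.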
Bellman–Ford–Moore algorithm:
➢ Improvement over the naive implementation of this DP
➢ Running time O(m ⋅ n), where m is the number of edges
➢ Space complexity O(n)
➢ Our DP doesn’t work because its path from 𝑡 to 𝑢 might
➢ But path from 𝑡 to 𝑣 might in turn go through 𝑢 ➢ The path may no longer remain simple
➢ Hamiltonian path problem (i.e. is there a path of length
373F20 - Nisarg Shah 41
All-pairs shortest paths:
➢ Input: a directed graph G = (V, E) with edge lengths ℓ_{vw} and no negative cycles
➢ Goal: compute the length of the shortest path from all s to all t
➢ Simple idea: run single-source shortest paths from each source s
➢ Running time is O(n⁴)
➢ Actually, we can do this in O(n³) as well
➢ Input: a directed graph G = (V, E) with edge lengths ℓ_{vw} and no negative cycles
➢ Goal: compute the length of the shortest path from all s to all t, in O(n³) time, using a DP whose subproblems are indexed by which vertices may appear as intermediate stops
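The standard O(n³) all-pairs algorithm of this kind is Floyd–Warshall; a minimal sketch (illustrative names, vertices numbered 0, …, n−1):

```python
import math

def floyd_warshall(n, edges):
    """All-pairs shortest paths in O(n^3). edges: list of (u, v, length).
    Assumes no negative cycles."""
    INF = math.inf
    dist = [[0 if u == v else INF for v in range(n)] for u in range(n)]
    for u, v, length in edges:
        dist[u][v] = min(dist[u][v], length)
    # Invariant: after iteration k, dist[u][v] = length of the shortest
    # u ~> v path whose intermediate vertices all lie in {0, ..., k}
    for k in range(n):
        for u in range(n):
            for v in range(n):
                if dist[u][k] + dist[k][v] < dist[u][v]:
                    dist[u][v] = dist[u][k] + dist[k][v]
    return dist
```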
Chain matrix multiplication:
➢ Input: matrices M_1, …, M_n, where the dimension of M_i is d_{i−1} × d_i
➢ Goal: compute M_1 ⋅ M_2 ⋅ … ⋅ M_n using as few scalar operations as possible
➢ Matrix multiplication is associative: (A ⋅ B) ⋅ C = A ⋅ (B ⋅ C)
➢ So isn't every order of multiplication going to produce the same result?
➢ Yes, the same result, but not at the same cost
➢ Insight: the time it takes to multiply two matrices depends on their dimensions
➢ We use the brute-force approach for matrix multiplication
➢ So multiplying a q × r matrix with an r × s matrix requires q ⋅ r ⋅ s operations
➢ Example: M_1 is 5 × 10, M_2 is 10 × 100, M_3 is 100 × 50
➢ (M_1 ⋅ M_2) ⋅ M_3 → 5 ⋅ 10 ⋅ 100 + 5 ⋅ 100 ⋅ 50 = 30,000 ops
➢ M_1 ⋅ (M_2 ⋅ M_3) → 10 ⋅ 100 ⋅ 50 + 5 ⋅ 10 ⋅ 50 = 52,500 ops
➢ Our input is simply the dimensions d_0, d_1, …, d_n (such that M_i is d_{i−1} × d_i)
➢ Optimal substructure property:
➢ Think of the final product computed, say A ⋅ B
➢ A is the product of some prefix M_1 ⋯ M_k, and B is the product of the remaining suffix M_{k+1} ⋯ M_n
➢ For the overall computation to be optimal, each of A and B must itself be computed optimally
➢ OPT(i, j) = minimum number of operations needed to compute the product M_i ⋅ … ⋅ M_j
➢ Here, 1 ≤ i ≤ j ≤ n
➢ Q: Why do we not just care about prefixes and suffixes?
➢ A: e.g., M_1 ⋅ (M_2 ⋅ M_3 ⋅ M_4) ⋅ M_5 ⇒ we need to know the optimal solution for M_2 ⋅ M_3 ⋅ M_4, which is neither a prefix nor a suffix
➢ Bellman equation:

OPT(i, j) = 0 if i = j
OPT(i, j) = min{ OPT(i, k) + OPT(k + 1, j) + d_{i−1} ⋅ d_k ⋅ d_j : i ≤ k < j } if i < j

➢ Running time: O(n²) calls, O(n) time per call ⇒ O(n³)
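The Bellman equation can be memoized directly; a short sketch (illustrative name, not from the slides), taking the dimension list d_0, …, d_n as input:

```python
from functools import lru_cache

def matrix_chain_cost(d):
    """d = [d0, d1, ..., dn]: matrix M_i has dimensions d[i-1] x d[i].
    Returns the minimum number of scalar operations to compute M_1 ... M_n."""
    n = len(d) - 1

    @lru_cache(maxsize=None)
    def opt(i, j):
        if i == j:
            return 0                           # a single matrix: nothing to do
        # try every split point: (M_i .. M_k) * (M_{k+1} .. M_j)
        return min(opt(i, k) + opt(k + 1, j) + d[i - 1] * d[k] * d[j]
                   for k in range(i, j))

    return opt(1, n)
```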
Can we do better than O(n³)?
➢ Surprisingly, yes. But not by a DP algorithm (that I know of)
➢ Hu & Shing (1981) developed an O(n log n) time algorithm by reducing the problem to optimally triangulating a convex polygon (source: Wikipedia)
➢ This slide is not in the scope of the course
Edit distance:
➢ How similar are strings X = x_1, …, x_m and Y = y_1, …, y_n?
➢ Allowed operations: delete a symbol, or replace one symbol with another
➢ We can do these operations on any symbol in either string
➢ How many deletions & replacements does it take to match the two strings?
➢ Example: one way to match the two strings uses 6 replacements and 1 deletion; a better way uses only 1 replacement and 1 deletion
➢ Input: two strings, a cost d(a) of deleting symbol a, and a cost r(a, b) of replacing symbol a with b
➢ Goal: compute the minimum total cost of deletions and replacements needed to match the two strings
➢ Want to delete/replace at one end and recurse on the rest
➢ Goal: match x_1, …, x_m and y_1, …, y_n
➢ Consider the last symbols x_m and y_n
➢ Three options: delete x_m and match the rest; delete y_n and match the rest; or replace x_m with y_n (if needed) and match the rest
➢ Hence in the DP, we need to compute the optimal solutions for matching every prefix of X with every prefix of Y
➢ E[i, j] = minimum cost of matching x_1, …, x_i with y_1, …, y_j:

E[i, j] = 0 if i = j = 0
E[i, j] = d(y_j) + E[i, j − 1] if i = 0 ∧ j > 0
E[i, j] = d(x_i) + E[i − 1, j] if i > 0 ∧ j = 0
E[i, j] = min{ A, B, C } if i > 0 ∧ j > 0

where A = d(x_i) + E[i − 1, j], B = d(y_j) + E[i, j − 1], C = r(x_i, y_j) + E[i − 1, j − 1]
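A bottom-up sketch of this recurrence (illustrative, not from the slides); the default costs are an assumption that makes it the classic unit-cost edit distance, where replacing a symbol with an equal one is free:

```python
def edit_distance(x, y, d=lambda a: 1, r=lambda a, b: 0 if a == b else 1):
    """E[i][j] = min cost of matching x[:i] with y[:j], filled bottom-up.
    d(a) = cost of deleting a; r(a, b) = cost of replacing a with b."""
    m, n = len(x), len(y)
    E = [[0] * (n + 1) for _ in range(m + 1)]
    for j in range(1, n + 1):
        E[0][j] = d(y[j - 1]) + E[0][j - 1]          # delete all of y[:j]
    for i in range(1, m + 1):
        E[i][0] = d(x[i - 1]) + E[i - 1][0]          # delete all of x[:i]
        for j in range(1, n + 1):
            E[i][j] = min(d(x[i - 1]) + E[i - 1][j],              # delete x_i
                          d(y[j - 1]) + E[i][j - 1],              # delete y_j
                          r(x[i - 1], y[j - 1]) + E[i - 1][j - 1])  # replace
    return E[m][n]
```

Note that "inserting into X" is the same operation as "deleting from Y", so deletions in either string cover insertions too.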
Space complexity:
➢ While computing column E[⋅, j], we only need to store columns E[⋅, j] and E[⋅, j − 1]
➢ So the additional space required is O(m)
➢ By storing two rows at a time instead, we can make it O(n)
➢ Usually people include the storage of the inputs, so it's O(m + n)
➢ But this is not enough if we want to compute the actual solution, not just its cost
Hirschberg's algorithm (this slide is not in the scope of the course):
➢ View edit distance as a shortest-path problem on a grid graph with vertices (i, j): a vertical step costs d(x_i), a horizontal step costs d(y_j), and a diagonal step costs r(x_i, y_j)
➢ E[i, j] = length of the shortest path from (0, 0) to (i, j)
➢ The shortest path from (0, 0) to (m, n) passes through (q, n/2) for some q
➢ Its length = length of the shortest path from (0, 0) to (q, n/2) + length of the shortest path from (q, n/2) to (m, n)
➢ Find q using divide-and-conquer
➢ Then find the shortest paths from (0, 0) to (q, n/2) and from (q, n/2) to (m, n) recursively
➢ This recovers the actual sequence of operations using only linear space