CSC373 Week 3: Dynamic Programming (Nisarg Shah, 373F20) - PowerPoint PPT Presentation


SLIDE 1

CSC373 Week 3: Dynamic Programming

373F20 - Nisarg Shah 1

Nisarg Shah

SLIDE 2

Recap

• Greedy Algorithms
  ➢ Interval scheduling
  ➢ Interval partitioning
  ➢ Minimizing lateness
  ➢ Huffman encoding
  ➢ …

SLIDE 3

Jeff Erickson on greedy algorithms…

SLIDE 4

"The 1950s were not good years for mathematical research. We had a very interesting gentleman in Washington named Wilson. He was Secretary of Defense, and he actually had a pathological fear and hatred of the word 'research'. I'm not using the term lightly; I'm using it precisely. His face would suffuse, he would turn red, and he would get violent if people used the term 'research' in his presence. You can imagine how he felt, then, about the term 'mathematical'. The RAND Corporation was employed by the Air Force, and the Air Force had Wilson as its boss, essentially. Hence, I felt I had to do something to shield Wilson and the Air Force from the fact that I was really doing mathematics inside the RAND Corporation. What title, what name, could I choose?" — Richard Bellman, on the origin of his term 'dynamic programming' (1984)

Richard Bellman's quote from Jeff Erickson's book

SLIDE 5

Dynamic Programming

• Outline
  ➢ Break the problem down into simpler subproblems, solve each subproblem just once, and store the solutions.
  ➢ The next time the same subproblem occurs, instead of recomputing its solution, simply look up the previously computed solution.
  ➢ Hopefully, we save a lot of computation at the expense of a modest increase in storage space.
  ➢ Also called "memoization"

• How is this different from divide & conquer?
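As a minimal illustration of this look-up idea (not from the slides; the function is the Fibonacci recurrence that reappears later in the deck), a Python sketch:

```python
from functools import lru_cache

def fib_naive(n):
    # Recomputes the same subproblems many times: exponential time.
    return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)

@lru_cache(maxsize=None)
def fib_memo(n):
    # Each subproblem is solved once, then looked up: linear number of calls.
    return n if n < 2 else fib_memo(n - 1) + fib_memo(n - 2)

print(fib_naive(10), fib_memo(10))  # both 55
```

The stored table is the "modest increase in storage space" the slide mentions.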
SLIDE 6

Weighted Interval Scheduling

• Problem
  ➢ Job j starts at time s_j and finishes at time f_j
  ➢ Each job j has a weight w_j
  ➢ Two jobs are compatible if they don't overlap
  ➢ Goal: find a set S of mutually compatible jobs with the highest total weight Σ_{j∈S} w_j

• Recall: If all w_j = 1, then this is simply the interval scheduling problem from last week
  ➢ The greedy algorithm based on earliest-finish-time ordering was optimal for this case

SLIDE 7

Recall: Interval Scheduling

• What if we simply try to use it again?
  ➢ Fails spectacularly!

SLIDE 8

Weighted Interval Scheduling

• What if we use other orderings?
  ➢ By weight: choose jobs with the highest w_j first
  ➢ Maximum weight per unit time: choose jobs with the highest w_j / (f_j − s_j) first
  ➢ ...

• None of them work!
  ➢ They can be arbitrarily worse than the optimal solution
  ➢ In fact, under a certain formalization, "no greedy algorithm" can produce any "decent approximation" in the worst case (beyond this course!)

SLIDE 9

Weighted Interval Scheduling

• Convention
  ➢ Jobs are sorted by finish time: f_1 ≤ f_2 ≤ ⋯ ≤ f_n
  ➢ p(j) = largest index i < j such that job i is compatible with job j (i.e. f_i < s_j)

Among the jobs before job j, the ones compatible with it are precisely 1, …, p(j)

E.g. p[8] = 1, p[7] = 3, p[2] = 0

SLIDE 10

Weighted Interval Scheduling

• The DP approach
  ➢ Let OPT be an optimal solution
  ➢ Two options regarding job n:
    • Option 1: Job n is in OPT
      • Can't use the incompatible jobs p(n) + 1, …, n − 1
      • Must select an optimal subset of jobs from {1, …, p(n)}
    • Option 2: Job n is not in OPT
      • Must select an optimal subset of jobs from {1, …, n − 1}
  ➢ OPT is the better of the two options
  ➢ Notice that in both options, we need to solve the problem on a prefix of our ordering

SLIDE 11

Weighted Interval Scheduling

• The DP approach
  ➢ OPT(j) = max total weight of compatible jobs from 1, …, j
  ➢ Base case: OPT(0) = 0
  ➢ Two cases regarding job j:
    • Job j is selected: optimal weight is w_j + OPT(p(j))
    • Job j is not selected: optimal weight is OPT(j − 1)
  ➢ Bellman equation:

    OPT(j) = 0                                     if j = 0
    OPT(j) = max{ OPT(j − 1), w_j + OPT(p(j)) }    if j > 0

SLIDE 12

Brute Force Solution
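The pseudocode on this slide is an image in the original deck and did not survive extraction. A minimal Python sketch of the brute-force recursion from the previous slide, using hypothetical jobs (start, finish, weight) sorted by finish time, might look like:

```python
import bisect

# Hypothetical jobs (start, finish, weight), sorted by finish time.
jobs = [(0, 3, 2), (1, 5, 4), (4, 6, 4), (2, 8, 7), (7, 9, 2)]
starts = [s for s, f, w in jobs]
finishes = [f for s, f, w in jobs]

def p(j):
    # Largest index i < j (1-based) with f_i < s_j, i.e. job i compatible with job j.
    return bisect.bisect_left(finishes, starts[j - 1])

def compute_opt(j):
    # Max total weight using jobs 1..j; exponential time without memoization.
    if j == 0:
        return 0
    return max(compute_opt(j - 1), jobs[j - 1][2] + compute_opt(p(j)))

print(compute_opt(len(jobs)))  # 8 on this data (jobs 1, 3, 5)
```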

SLIDE 13

Brute Force Solution

• Q: Worst-case running time of COMPUTE-OPT(n)?
  a) Θ(n)
  b) Θ(n log n)
  c) Θ(1.618^n)
  d) Θ(2^n)

SLIDE 14

Brute Force Solution

• Brute force running time
  ➢ It is possible that p(j) = j − 1 for each j
  ➢ Calling COMPUTE-OPT(j − 1) and COMPUTE-OPT(p(j)) separately would take 2^n steps
  ➢ We can slightly optimize:
    • If p(j) = j − 1, call it just once; else call them separately
    • Now the worst case is when p(j) = j − 2 for each j
    • Running time: T(n) = T(n − 1) + T(n − 2)
    • Fibonacci, golden ratio, … ☺
    • T(n) = O(φ^n), where φ ≈ 1.618
SLIDE 15

Dynamic Programming

• Why is the runtime high?
  ➢ Some solutions are being computed many, many times
    • E.g. if p[5] = 3, then COMPUTE-OPT(5) calls COMPUTE-OPT(4) and COMPUTE-OPT(3)
    • But COMPUTE-OPT(4) in turn calls COMPUTE-OPT(3) again

• Memoization trick
  ➢ Simply remember what you've already computed, and re-use the answer if it is needed in the future

SLIDE 16

Dynamic Program: Top-Down

• Let's store COMPUTE-OPT(j) in M[j]
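One possible top-down rendering in Python (the job data is hypothetical; the dictionary M plays the role of the M[j] array on the slide):

```python
import bisect

jobs = [(0, 3, 2), (1, 5, 4), (4, 6, 4), (2, 8, 7), (7, 9, 2)]  # hypothetical (s, f, w), sorted by f
finishes = [f for _, f, _ in jobs]
# p[j] = largest index i < j with f_i < s_j (1-based jobs; p[0] is a placeholder)
p = [0] + [bisect.bisect_left(finishes, s) for s, _, _ in jobs]

M = {0: 0}  # M[j] stores COMPUTE-OPT(j) once it has been computed

def m_compute_opt(j):
    if j not in M:  # compute each value at most once, then look it up
        M[j] = max(m_compute_opt(j - 1), jobs[j - 1][2] + m_compute_opt(p[j]))
    return M[j]

print(m_compute_opt(len(jobs)))  # 8 on this data
```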
slide-17
SLIDE 17

Dynamic Program: Top-Down

373F20 - Nisarg Shah 17

  • Claim: This memoized version takes 𝑃 𝑜 log 𝑜 time

➢ Sorting jobs takes 𝑃 𝑜 log 𝑜 ➢ It also takes 𝑃(𝑜 log 𝑜) to do 𝑜 binary searches to

compute 𝑞(𝑘) for each 𝑘

➢ M-Compute-OPT(𝑘) is called at most once for each 𝑘 ➢ Each such call takes 𝑃(1) time, not considering the time

taken by any subroutine calls

➢ So M-Compute-OPT(𝑜) takes only 𝑃 𝑜 time ➢ Overall time is 𝑃 𝑜 log 𝑜

SLIDE 18

Dynamic Program: Bottom-Up

• Find an order in which to call the functions so that the sub-solutions are ready when needed
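A bottom-up sketch in Python (hypothetical data; processing jobs in increasing order of j guarantees that M[j − 1] and M[p[j]] are already filled in when needed):

```python
import bisect

jobs = [(0, 3, 2), (1, 5, 4), (4, 6, 4), (2, 8, 7), (7, 9, 2)]  # hypothetical (s, f, w), sorted by f
n = len(jobs)
finishes = [f for _, f, _ in jobs]
p = [0] + [bisect.bisect_left(finishes, s) for s, _, _ in jobs]  # p[j]: last job compatible with j

M = [0] * (n + 1)                  # M[0] = 0 is the base case
for j in range(1, n + 1):          # increasing j: sub-solutions are ready
    M[j] = max(M[j - 1], jobs[j - 1][2] + M[p[j]])
print(M[n])  # 8 on this data
```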

SLIDE 19

Top-Down vs Bottom-Up

• Top-down may be preferred…
  ➢ …when not all sub-solutions need to be computed on some inputs
  ➢ …because one does not need to think of the "right order" in which to compute sub-solutions

• Bottom-up may be preferred…
  ➢ …when all sub-solutions will need to be computed anyway
  ➢ …because it is faster, as it avoids recursive call overheads and unnecessary random memory accesses
  ➢ …because sometimes we can free up memory early

SLIDE 20

Optimal Solution

• This approach gave us the optimal value
• What about the actual solution (subset of jobs)?
  ➢ Idea: Maintain the optimal value and an optimal solution
  ➢ So, we compute two quantities:

    OPT(j) = 0                                     if j = 0
    OPT(j) = max{ OPT(j − 1), w_j + OPT(p(j)) }    if j > 0

    S(j) = ∅                  if j = 0
    S(j) = S(j − 1)           if j > 0 ∧ OPT(j − 1) ≥ w_j + OPT(p(j))
    S(j) = {j} ∪ S(p(j))      if j > 0 ∧ OPT(j − 1) < w_j + OPT(p(j))

SLIDE 21

Optimal Solution

    OPT(j) = 0                                     if j = 0
    OPT(j) = max{ OPT(j − 1), w_j + OPT(p(j)) }    if j > 0

    S(j) = ∅                  if j = 0
    S(j) = S(j − 1)           if j > 0 ∧ OPT(j − 1) ≥ w_j + OPT(p(j))
    S(j) = {j} ∪ S(p(j))      if j > 0 ∧ OPT(j − 1) < w_j + OPT(p(j))

This works with both top-down (memoization) and bottom-up approaches. In this problem, we can do something simpler: just compute OPT first, and later compute S using only OPT.

SLIDE 22

Optimal Solution

    OPT(j) = 0                                     if j = 0
    OPT(j) = max{ OPT(j − 1), w_j + OPT(p(j)) }    if j > 0

    S(j) = ⊥         if j = 0
    S(j) = "skip"    if j > 0 ∧ OPT(j − 1) ≥ w_j + OPT(p(j))
    S(j) = "take"    if j > 0 ∧ OPT(j − 1) < w_j + OPT(p(j))

• Save space by storing only one bit of information for each j: which option yielded the max weight
• To reconstruct the optimal solution, start with j = n
  ➢ If S(j) = "skip", update j ← j − 1
  ➢ If S(j) = "take", add j to the solution and update j ← p[j]
  ➢ If S(j) = ⊥, stop
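The two passes (fill the table, then trace back one stored bit per job) can be sketched in Python as follows; the job data is hypothetical:

```python
import bisect

jobs = [(0, 3, 2), (1, 5, 4), (4, 6, 4), (2, 8, 7), (7, 9, 2)]  # hypothetical (s, f, w), sorted by f
n = len(jobs)
finishes = [f for _, f, _ in jobs]
p = [0] + [bisect.bisect_left(finishes, s) for s, _, _ in jobs]

OPT = [0] * (n + 1)
take = [False] * (n + 1)  # the "one bit" per j: did including job j yield the max?
for j in range(1, n + 1):
    include = jobs[j - 1][2] + OPT[p[j]]
    take[j] = include > OPT[j - 1]
    OPT[j] = include if take[j] else OPT[j - 1]

solution, j = [], n  # trace back from j = n using only the stored bits
while j > 0:
    if take[j]:
        solution.append(j)
        j = p[j]
    else:
        j -= 1
print(OPT[n], sorted(solution))  # 8 [1, 3, 5] on this data
```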

SLIDE 23

Optimal Substructure Property

• Dynamic programming applies well to problems that have the optimal substructure property
  ➢ The optimal solution to a problem can be computed easily given optimal solutions to subproblems

• Recall: divide-and-conquer also uses this property
  ➢ Divide-and-conquer is a special case in which the subproblems don't "overlap"
  ➢ So there's no need for memoization
  ➢ In dynamic programming, two of the subproblems may in turn require access to the solution of the same subproblem

SLIDE 24

Knapsack Problem

• Problem
  ➢ n items: item i provides value v_i > 0 and has weight w_i > 0
  ➢ The knapsack has weight capacity W
  ➢ Assumption: W and all the v_i's and w_i's are integers
  ➢ Goal: pack the knapsack with a collection of items of the highest total value, given that their total weight is at most W

SLIDE 25

A First Attempt

• Let OPT(w) = maximum value we can pack with a knapsack of capacity w
  ➢ Goal: Compute OPT(W)
  ➢ Claim? OPT(w) must use at least one item i with weight w_i ≤ w, and then optimally pack the remaining capacity of w − w_i
  ➢ Let w* = min_i w_i

    OPT(w) = 0                                      if w < w*
    OPT(w) = max_{i : w_i ≤ w} { v_i + OPT(w − w_i) }   if w ≥ w*

• This is wrong!
  ➢ It might use an item more than once!

SLIDE 26

A Refined Attempt

• OPT(i, w) = maximum value we can pack using only items 1, …, i given capacity w
  ➢ Goal: Compute OPT(n, W)

• Consider item i
  ➢ If w_i > w, then we can't choose i. Just use OPT(i − 1, w)
  ➢ If w_i ≤ w, there are two cases:
    • If we choose i, the best is v_i + OPT(i − 1, w − w_i)
    • If we don't choose i, the best is OPT(i − 1, w)
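A sketch of this recurrence in Python (the item data is hypothetical):

```python
def knapsack(values, weights, W):
    # OPT[i][w] = max value achievable using items 1..i with capacity w (integer weights).
    n = len(values)
    OPT = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for w in range(W + 1):
            OPT[i][w] = OPT[i - 1][w]  # case 1: don't choose item i
            if weights[i - 1] <= w:    # case 2: choose item i, if it fits
                OPT[i][w] = max(OPT[i][w],
                                values[i - 1] + OPT[i - 1][w - weights[i - 1]])
    return OPT[n][W]

# Hypothetical items: values 60, 100, 120 with weights 1, 2, 3; capacity 5
print(knapsack([60, 100, 120], [1, 2, 3], 5))  # 220
```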
SLIDE 27

Running Time

• Consider the possible evaluations OPT(i, w)
  ➢ i ∈ {1, …, n}
  ➢ w ∈ {1, …, W} (recall that the weights and the capacity are integers)
  ➢ There are O(n ⋅ W) possible evaluations of OPT
  ➢ Each is evaluated at most once (memoization)
  ➢ Each takes O(1) time to evaluate
  ➢ So the total running time is O(n ⋅ W)

• Q: Is this polynomial in the input size?
  ➢ A: No! But it's pseudo-polynomial.
  ➢ (W is encoded using O(log W) bits, so n ⋅ W can be exponential in the input length)

SLIDE 28

What if…?

• Note that this algorithm runs in polynomial time when the value of W is polynomially bounded in the length of the input

• Q: What if, instead of the weights being small integers, we were told that the values are small integers?
  ➢ Then we can use a different dynamic programming approach!

SLIDE 29

A Different DP

• OPT(i, v) = minimum capacity needed to pack a total value of at least v using items 1, …, i
  ➢ Goal: Compute max{ v : OPT(n, v) ≤ W }

• Consider item i
  ➢ If we choose i, we need capacity w_i + OPT(i − 1, v − v_i)
  ➢ If we don't choose i, we need capacity OPT(i − 1, v)

    OPT(i, v) = 0                                                  if v ≤ 0
    OPT(i, v) = ∞                                                  if v > 0, i = 0
    OPT(i, v) = min{ w_i + OPT(i − 1, v − v_i), OPT(i − 1, v) }    if v > 0, i > 0

SLIDE 30

A Different DP

• OPT(i, v) = minimum capacity needed to pack a total value of at least v using items 1, …, i
  ➢ Goal: Compute max{ v : OPT(n, v) ≤ W }

• This approach has running time O(n ⋅ V), where V = v_1 + ⋯ + v_n

• So we can get O(n ⋅ W) or O(n ⋅ V)
• Can we remove the dependence on both W and V?
  ➢ Not likely.
  ➢ The knapsack problem is NP-complete (we'll see later).
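A sketch of this value-indexed DP in Python, on the same hypothetical items one might use for the weight-indexed DP:

```python
def knapsack_by_value(values, weights, W):
    # OPT[i][v] = minimum capacity needed to pack total value >= v using items 1..i.
    n, V = len(values), sum(values)
    INF = float("inf")
    OPT = [[0] * (V + 1) for _ in range(n + 1)]
    for v in range(1, V + 1):
        OPT[0][v] = INF  # no items but a positive value target: impossible
    for i in range(1, n + 1):
        for v in range(1, V + 1):
            take = weights[i - 1] + OPT[i - 1][max(0, v - values[i - 1])]
            OPT[i][v] = min(OPT[i - 1][v], take)
    # Answer: the largest total value achievable within capacity W.
    return max(v for v in range(V + 1) if OPT[n][v] <= W)

# Hypothetical items: values 60, 100, 120 with weights 1, 2, 3; capacity 5
print(knapsack_by_value([60, 100, 120], [1, 2, 3], 5))  # 220
```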

SLIDE 31

Looking Ahead: FPTAS

• While we cannot hope to solve the problem exactly in time O(poly(n, log W, log V))…
  ➢ For any ε > 0, we can get a value that is within a (1 + ε) multiplicative factor of the optimal value in time O(poly(n, log W, log V, 1/ε))
  ➢ Such algorithms are known as fully polynomial-time approximation schemes (FPTAS)
  ➢ Core idea behind the FPTAS for knapsack:
    • Approximate all weights and values up to the desired precision
    • Solve knapsack on the approximate input using DP
SLIDE 32

Single-Source Shortest Paths

• Problem
  ➢ Input: A directed graph G = (V, E) with length ℓ_vw on each edge (v, w), and a source vertex s
  ➢ Goal: Compute the length of the shortest path from s to every vertex t

• When ℓ_vw ≥ 0 for each (v, w)…
  ➢ Dijkstra's algorithm can be used for this purpose
  ➢ But it fails when some edge lengths can be negative
  ➢ What do we do in this case?

SLIDE 33

Single-Source Shortest Paths

• Cycle length = sum of the lengths of the edges in the cycle
• If there is a negative-length cycle, shortest paths are not even well defined…
  ➢ You can traverse the cycle arbitrarily many times to get arbitrarily "short" paths

SLIDE 34

Single-Source Shortest Paths

• But if there are no negative cycles…
  ➢ Shortest paths are well defined even when some of the edge lengths are negative

• Claim: With no negative cycles, there is always a shortest path from any vertex to any other vertex that is simple
  ➢ Consider the shortest s ⇝ t path with the fewest edges among all shortest s ⇝ t paths
  ➢ If it has a cycle, removing the cycle creates a path with fewer edges that is no longer than the original path

SLIDE 35

Optimal Substructure Property

• Consider a simple shortest s ⇝ t path P
  ➢ It could be just a single edge
  ➢ But if P has more than one edge, consider the vertex u which immediately precedes t on the path
  ➢ If s ⇝ t is shortest, then s ⇝ u must be shortest as well, and it must use one fewer edge than the s ⇝ t path

SLIDE 36

Optimal Substructure Property

• OPT(t, i) = length of the shortest path from s to t using at most i edges

• Then:
  ➢ Either this path uses at most i − 1 edges ⇒ OPT(t, i − 1)
  ➢ Or it uses exactly i edges ⇒ min_u { OPT(u, i − 1) + ℓ_ut }

SLIDE 37

Optimal Substructure Property

• OPT(t, i) = length of the shortest path from s to t using at most i edges

• Then:
  ➢ Either this path uses at most i − 1 edges ⇒ OPT(t, i − 1)
  ➢ Or it uses exactly i edges ⇒ min_u { OPT(u, i − 1) + ℓ_ut }

    OPT(t, i) = 0                                                      if i = 0 ∧ t = s
    OPT(t, i) = ∞                                                      if i = 0 ∧ t ≠ s
    OPT(t, i) = min{ OPT(t, i − 1), min_u { OPT(u, i − 1) + ℓ_ut } }   otherwise

  ➢ Running time: O(n^2) calls, each takes O(n) time ⇒ O(n^3)
  ➢ Q: What do you need to store to also get the actual paths?
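The recurrence can be sketched bottom-up in Python as follows (the graph is hypothetical; vertices are numbered 0 to n − 1, and negative edge lengths are fine as long as there are no negative cycles):

```python
def shortest_paths(n, edges, s):
    # OPT[i][t] = length of the shortest s->t path using at most i edges.
    INF = float("inf")
    OPT = [[INF] * n for _ in range(n)]
    OPT[0][s] = 0
    for i in range(1, n):
        for t in range(n):
            OPT[i][t] = OPT[i - 1][t]            # use at most i-1 edges
        for u, v, length in edges:               # or end with edge (u, v)
            OPT[i][v] = min(OPT[i][v], OPT[i - 1][u] + length)
    return OPT[n - 1]  # a simple shortest path has at most n-1 edges

# Hypothetical graph with a negative edge (but no negative cycle)
edges = [(0, 1, 4), (0, 2, 2), (1, 2, -3), (2, 3, 2), (1, 3, 5)]
print(shortest_paths(4, edges, 0))  # [0, 4, 1, 3]
```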

SLIDE 38

Side Notes

• Bellman-Ford-Moore algorithm
  ➢ An improvement over this DP
  ➢ Running time O(mn) for n vertices and m edges
  ➢ Space complexity reduces to O(m + n)

SLIDE 39

Maximum Length Paths?

• Can we use a similar DP to compute maximum-length paths from s to all other vertices?

• This is well defined when there are no positive cycles, in which case, yes.

• What if there are positive cycles, but we want maximum-length simple paths?

SLIDE 40

Maximum Length Paths?

• What goes wrong?
  ➢ Our DP doesn't work because its path from s to t might use a path from s to u and the edge from u to t
  ➢ But the path from s to u might in turn go through t
  ➢ The path may no longer remain simple

• In fact, finding a maximum-length simple path is NP-hard
  ➢ The Hamiltonian path problem (i.e. is there a path of length n − 1 in a given undirected graph?) is a special case

SLIDE 41

All-Pairs Shortest Paths

• Problem
  ➢ Input: A directed graph G = (V, E) with length ℓ_vw on each edge (v, w) and no negative cycles
  ➢ Goal: Compute the length of the shortest path from every vertex s to every other vertex t

• Simple idea:
  ➢ Run single-source shortest paths from each source s
  ➢ Running time is O(n^4)
  ➢ Actually, we can do this in O(n^3) as well

SLIDE 42

All-Pairs Shortest Paths

• Problem
  ➢ Input: A directed graph G = (V, E) with length ℓ_vw on each edge (v, w) and no negative cycles
  ➢ Goal: Compute the length of the shortest path from every vertex s to every other vertex t

• OPT(u, v, k) = length of the shortest simple path from u to v in which the intermediate nodes are from {1, …, k}

• Exercise: Write down the recursion formula for OPT such that, given the subsolutions, it requires O(1) time

• Running time: O(n^3) calls, O(1) per call ⇒ O(n^3)
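For reference (this resolves the exercise, so treat it as a spoiler): collapsing the k dimension of this recursion yields the Floyd-Warshall algorithm. A Python sketch on a hypothetical graph:

```python
def all_pairs_shortest_paths(n, edges):
    # dist starts as OPT(u, v, 0); round k upgrades it to OPT(u, v, k) in place,
    # since round k only reads round k-1 values through vertex k itself.
    INF = float("inf")
    dist = [[INF] * n for _ in range(n)]
    for u in range(n):
        dist[u][u] = 0
    for u, v, length in edges:
        dist[u][v] = length
    for k in range(n):                       # allow vertex k as an intermediate node
        for u in range(n):
            for v in range(n):
                if dist[u][k] + dist[k][v] < dist[u][v]:
                    dist[u][v] = dist[u][k] + dist[k][v]
    return dist

edges = [(0, 1, 4), (0, 2, 2), (1, 2, -3), (2, 3, 2), (1, 3, 5)]
D = all_pairs_shortest_paths(4, edges)
print(D[0])  # [0, 4, 1, 3]
```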
SLIDE 43

Chain Matrix Product

• Problem
  ➢ Input: Matrices M_1, …, M_n, where the dimension of M_i is d_{i−1} × d_i
  ➢ Goal: Compute M_1 ⋅ M_2 ⋅ … ⋅ M_n

• But matrix multiplication is associative
  ➢ (A ⋅ B) ⋅ C = A ⋅ (B ⋅ C)
  ➢ So isn't the optimal solution going to call the algorithm for multiplying two matrices exactly n − 1 times?
  ➢ Insight: the time it takes to multiply two matrices depends on their dimensions

SLIDE 44

Chain Matrix Product

• Assume
  ➢ We use the brute-force approach for matrix multiplication
  ➢ So multiplying p × q and q × r matrices requires p ⋅ q ⋅ r operations

• Example: compute M_1 ⋅ M_2 ⋅ M_3
  ➢ M_1 is 5 × 10
  ➢ M_2 is 10 × 100
  ➢ M_3 is 100 × 50
  ➢ (M_1 ⋅ M_2) ⋅ M_3 → 5 ⋅ 10 ⋅ 100 + 5 ⋅ 100 ⋅ 50 = 30000 ops
  ➢ M_1 ⋅ (M_2 ⋅ M_3) → 10 ⋅ 100 ⋅ 50 + 5 ⋅ 10 ⋅ 50 = 52500 ops

SLIDE 45

Chain Matrix Product

• Note
  ➢ Our input is simply the dimensions d_0, d_1, …, d_n (such that each M_i is d_{i−1} × d_i), and not the actual matrices

• Why is DP right for this problem?
  ➢ Optimal substructure property
  ➢ Think of the final product computed, say A ⋅ B
  ➢ A is the product of some prefix, and B is the product of the remaining suffix
  ➢ For the overall computation to be optimal, each of A and B should be computed optimally

SLIDE 46

Chain Matrix Product

• OPT(i, j) = min ops required to compute M_i ⋅ … ⋅ M_j
  ➢ Here, 1 ≤ i ≤ j ≤ n
  ➢ Q: Why do we not just care about prefixes and suffixes?
    • M_1 ⋅ (M_2 ⋅ M_3 ⋅ M_4) ⋅ M_5 ⇒ need to know the optimal solution for M_2 ⋅ M_3 ⋅ M_4
  ➢ Running time: O(n^2) calls, O(n) time per call ⇒ O(n^3)

    OPT(i, j) = 0                                                               if i = j
    OPT(i, j) = min{ OPT(i, k) + OPT(k + 1, j) + d_{i−1} d_k d_j : i ≤ k < j }  if i < j
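A memoized sketch of this recurrence in Python, checked against the example dimensions from the earlier slide:

```python
from functools import lru_cache

def matrix_chain_cost(d):
    # d = dimensions d_0..d_n; matrix M_i is d[i-1] x d[i].
    @lru_cache(maxsize=None)
    def opt(i, j):
        # Min scalar multiplications to compute M_i ... M_j.
        if i == j:
            return 0
        return min(opt(i, k) + opt(k + 1, j) + d[i - 1] * d[k] * d[j]
                   for k in range(i, j))
    return opt(1, len(d) - 1)

print(matrix_chain_cost((5, 10, 100, 50)))  # 30000, matching the slide's example
```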

SLIDE 47

Chain Matrix Product

• Can we do better?
  ➢ Surprisingly, yes. But not by a DP algorithm (that I know of)
  ➢ Hu & Shing (1981) developed an O(n log n) time algorithm by reducing chain matrix product to the problem of "optimally" triangulating a regular polygon

Example (source: Wikipedia)
  • A is 10 × 30, B is 30 × 5, C is 5 × 60
  • The cost of each triangle is the product of its vertices
  • Want to minimize the total cost of all triangles

This slide is not in the scope of the course

SLIDE 48

Edit Distance

• Edit distance (aka sequence alignment) problem
  ➢ How similar are strings X = x_1, …, x_m and Y = y_1, …, y_n?

• Suppose we can delete or replace symbols
  ➢ We can do these operations on any symbol in either string
  ➢ How many deletions & replacements does it take to match the two strings?

SLIDE 49

Edit Distance

• Example: ocurrance vs occurrence
  ➢ One alignment: 6 replacements, 1 deletion
  ➢ A better alignment: 1 replacement, 1 deletion

SLIDE 50

Edit Distance

• Edit distance problem
  ➢ Input
    • Strings X = x_1, …, x_m and Y = y_1, …, y_n
    • Cost d(a) of deleting symbol a
    • Cost r(a, b) of replacing symbol a with b
      • Assume r is symmetric, so r(a, b) = r(b, a)
  ➢ Goal
    • Compute the minimum total cost for matching the two strings

• Optimal substructure?
  ➢ Want to delete/replace at one end and recurse

SLIDE 51

Edit Distance

• Optimal substructure
  ➢ Goal: match x_1, …, x_m and y_1, …, y_n
  ➢ Consider the last symbols x_m and y_n
  ➢ Three options:
    • Delete x_m, and optimally match x_1, …, x_{m−1} and y_1, …, y_n
    • Delete y_n, and optimally match x_1, …, x_m and y_1, …, y_{n−1}
    • Match x_m and y_n, and optimally match x_1, …, x_{m−1} and y_1, …, y_{n−1}
      • We incur cost r(x_m, y_n)
      • Extend the definition of r so that r(a, a) = 0 for any symbol a
  ➢ Hence, in the DP we need to compute the optimal solutions for matching x_1, …, x_i with y_1, …, y_j for all (i, j)

SLIDE 52

Edit Distance

• E[i, j] = edit distance between x_1, …, x_i and y_1, …, y_j
• Bellman equation

    E[i, j] = 0                        if i = j = 0
    E[i, j] = d(y_j) + E[i, j − 1]     if i = 0 ∧ j > 0
    E[i, j] = d(x_i) + E[i − 1, j]     if i > 0 ∧ j = 0
    E[i, j] = min{ A, B, C }           otherwise

    where A = d(x_i) + E[i − 1, j], B = d(y_j) + E[i, j − 1], C = r(x_i, y_j) + E[i − 1, j − 1]

• O(m ⋅ n) time, O(m ⋅ n) space
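A bottom-up sketch of this Bellman equation in Python, with unit deletion/replacement costs as hypothetical defaults:

```python
def edit_distance(x, y, d=lambda a: 1, r=lambda a, b: 0 if a == b else 1):
    # E[i][j] = edit distance between x[:i] and y[:j].
    m, n = len(x), len(y)
    E = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        E[i][0] = E[i - 1][0] + d(x[i - 1])        # delete all of x's prefix
    for j in range(1, n + 1):
        E[0][j] = E[0][j - 1] + d(y[j - 1])        # delete all of y's prefix
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            E[i][j] = min(d(x[i - 1]) + E[i - 1][j],                 # delete x_i
                          d(y[j - 1]) + E[i][j - 1],                 # delete y_j
                          r(x[i - 1], y[j - 1]) + E[i - 1][j - 1])   # match/replace
    return E[m][n]

print(edit_distance("ocurrance", "occurrence"))  # 2: 1 replacement + 1 deletion
```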

SLIDE 53

Edit Distance

    E[i, j] = 0                        if i = j = 0
    E[i, j] = d(y_j) + E[i, j − 1]     if i = 0 ∧ j > 0
    E[i, j] = d(x_i) + E[i − 1, j]     if i > 0 ∧ j = 0
    E[i, j] = min{ A, B, C }           otherwise

    where A = d(x_i) + E[i − 1, j], B = d(y_j) + E[i, j − 1], C = r(x_i, y_j) + E[i − 1, j − 1]

• The space complexity can be reduced in the bottom-up approach
  ➢ While computing E[⋅, j], we only need to store E[⋅, j] and E[⋅, j − 1]
  ➢ So the additional space required is O(m)
  ➢ By storing two rows at a time instead, we can make it O(n)
  ➢ Usually people include the storage of the inputs, so it's O(m + n)
  ➢ But this is not enough if we want to compute the actual solution

SLIDE 54

Hirschberg's Algorithm

• The optimal solution can be computed in O(m ⋅ n) time and O(m + n) space too!

This slide is not in the scope of the course

SLIDE 55

Hirschberg's Algorithm

• The key idea nicely combines divide & conquer with DP
• Edit distance graph
  (figure: a grid graph with deletion edges of cost d(x_i) and d(y_j) and diagonal replacement edges)

This slide is not in the scope of the course

SLIDE 56

Hirschberg's Algorithm

• Observation (can be proved by induction)
  ➢ E[i, j] = length of the shortest path from (0, 0) to (i, j) in the edit distance graph

This slide is not in the scope of the course

SLIDE 57

Hirschberg's Algorithm

• Lemma
  ➢ The shortest path from (0, 0) to (m, n) passes through (q, n/2), where q minimizes the length of the shortest path from (0, 0) to (q, n/2) plus the length of the shortest path from (q, n/2) to (m, n)

This slide is not in the scope of the course

SLIDE 58

Hirschberg's Algorithm

• Idea
  ➢ Find q using divide-and-conquer
  ➢ Find the shortest paths from (0, 0) to (q, n/2) and from (q, n/2) to (m, n) using DP

This slide is not in the scope of the course

SLIDE 59

Application: Protein Matching