

SLIDE 1

COMP 3403 — Algorithm Analysis Part 4 — Chapter 8

Jim Diamond, CAR 409, Jodrey School of Computer Science, Acadia University

SLIDE 2

Chapter 8

Dynamic Programming

SLIDE 3

Dynamic Programming: Introduction

  • The word “programming” here refers to the concept of “planning”,

rather than the concept of “coding in a computer language”

  • Idea: we have seen that it is a common idea to break down a larger

problem into sub-problems – in some problems, however, the sub-problems overlap

  • Example: consider F(n) = F(n − 1) + F(n − 2)

– the two sub-problems overlap, since to calculate F(n − 1) we will need to calculate F(n − 2) (which in this particular case is the entirety of the second sub-problem)
– if we choose the “obvious” recursive implementation of F(n), the number of sub-problems solved is exponential in n (!)
– the obvious recursive algorithm is very, very inefficient (see the sketch below)
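A minimal sketch in Python (the call counter is mine, added for illustration) makes the blow-up concrete:

call_count = 0

def fib(n):
    # naive recursion: re-solves the same overlapping sub-problems
    global call_count
    call_count += 1
    if n < 2:
        return n          # F(0) = 0, F(1) = 1
    return fib(n - 1) + fib(n - 2)

fib(30)
print(call_count)         # 2692537 calls just for n = 30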

SLIDE 4

Dynamic Programming for Mr. Fibonacci

  • As mentioned, the “obvious” recursive algorithm to compute Fibonacci

numbers is horribly inefficient

  • A dynamic programming approach would arrange the calculations so

that no sub-problem is solved more than once

  • Two approaches:

– bottom-up: compute in the following order:

F(0) = 0, F(1) = 1, F(2) = 1 + 0 = 1, F(3) = 1 + 1 = 2, . . . , F(n) = F(n − 1) + F(n − 2)

(the iterative approach, using an array)

– top-down: record the solution to each sub-problem when calculated; when the solution to a sub-problem is desired, see if the particular sub-problem has already been solved

(so-called “memory functions”)

  • Top-down may be more efficient for some problems, since the solution

to some “smaller” sub-problems may not be required (see the sketch below)
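A sketch of both approaches in Python (function names are mine):

from functools import lru_cache

def fib_bottom_up(n):
    # fill an array from F(0) upward; each sub-problem is solved exactly once
    if n < 2:
        return n
    f = [0] * (n + 1)
    f[1] = 1
    for i in range(2, n + 1):
        f[i] = f[i - 1] + f[i - 2]
    return f[n]

@lru_cache(maxsize=None)
def fib_top_down(n):
    # memory function: the cache ensures each F(k) is computed at most once
    if n < 2:
        return n
    return fib_top_down(n - 1) + fib_top_down(n - 2)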

SLIDE 5

Dynamic Programming: General Concept

  • Recall: many problem solving techniques involve

– breaking the big problem into sub-problems,
– solving each of the sub-problems, and
– assembling the solutions to the sub-problems into a solution of the big problem

  • In some problems (such as calculating F(n)) the sub-problems may

“overlap” and/or themselves have sub-problems in common

– in such cases a naive approach ends up solving the same sub-problem repeatedly

  • Idea: store the solution to a given sub-problem in a table the first time

it is computed

– any later request for that solution then becomes a (cheap) table look-up

  • The “trick” to using dynamic programming is to figure out

– what the overlapping/repeated calculations are, and
– how to arrange the calculations so as to avoid repeatedly solving the same sub-problem(s)

SLIDE 6

Example: Binomial Coefficients

  • The binomial coefficients are the coefficients of the binomial formula

(writing C(n, k) for “n choose k”):

(a + b)ⁿ = C(n, 0) aⁿb⁰ + · · · + C(n, k) aⁿ⁻ᵏbᵏ + · · · + C(n, n) a⁰bⁿ

  • Recurrence:

C(n, 0) = C(n, n) = 1   for n ≥ 0
C(n, k) = C(n − 1, k − 1) + C(n − 1, k)   for n > k > 0

Why?

  • The value of C(n, k) can be computed by filling a table (rows 0 . . . n, columns 0 . . . k) as follows:

          0   1   2   · · ·   k − 1          k
  0       1
  1       1   1
  2       1   ·   1
  ·       ·               ·
  n − 1   1   · · ·       C(n−1, k−1)   C(n−1, k)
  n       1   · · ·                     C(n, k)

  • Q: Do we need all of this table filled in?
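A sketch of the table fill in Python (my code, not the textbook’s); note that the inner loop only runs to min(i, k), which hints at the answer to the question above:

def binomial(n, k):
    # c[i][j] holds C(i, j); row i only needs columns 0 .. min(i, k)
    c = [[0] * (k + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(min(i, k) + 1):
            if j == 0 or j == i:
                c[i][j] = 1               # base cases C(i, 0) = C(i, i) = 1
            else:
                c[i][j] = c[i - 1][j - 1] + c[i - 1][j]
    return c[n][k]

print(binomial(5, 2))                     # 10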

SLIDE 7

Example: The “Coin-Row” Problem: 1

  • Given: there are n coins in a row, of values c1, c2, . . . , cn
  • Goal: pick up the maximum amount of money, without taking two

adjacent coins

  • Observation 1: starting with the largest coin won’t work

  • Q: how to proceed?
  • Observation 2: either the optimum solution uses the first coin or it

doesn’t

  • Observation 2′: either the optimum solution uses the last coin or it

doesn’t

– this second form turns out to be the handier one for setting up a recurrence

SLIDE 8

Example: The “Coin-Row” Problem: 2

  • Define F(k) to be the best solution using only the first k coins
  • Apply Observation 2′:

F(n) = max{cn + F(n − 2), F(n − 1)}   (*)

  • Write down the base cases (“initial conditions”):

F(0) = 0, F(1) = c1

  • Now fill in a one-row table of F(i) values from left to right using

formula (*)

– the last entry computed, F(n), is the optimal amount

  • Problem for the diligent student: F(n) just gives us the optimal

amount; how do we know which specific coins to take? GEQ!

  • Note that for this problem, to calculate only F(n), we don’t really need

the whole array of size n: we could get away with just three memory locations (see the sketch below)
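A sketch in Python using just rolling variables instead of the full array (the function name and the sample coin row are mine):

def coin_row(coins):
    # prev2 and prev1 play the roles of F(k - 2) and F(k - 1)
    prev2, prev1 = 0, 0
    for c in coins:
        prev2, prev1 = prev1, max(c + prev2, prev1)   # formula (*)
    return prev1

print(coin_row([5, 1, 2, 10, 6, 2]))                  # 17 (take 5, 10 and 2)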

SLIDE 9

Collecting Objects from an n × m Board: 1

  • Given: there is an n × m board with objects at some of the board

positions; for example: <board figure: asterisks mark the positions holding objects>

  • Rules: you start in the upper left corner, and at each turn you can move

right or down (but not off the board); you collect the object from any location you move onto

  • Goal: collect as many objects as possible
  • Observation 1: a wrong choice at the beginning, in the middle, or near the end can produce a sub-optimal solution

SLIDE 10

Collecting Objects from an n × m Board: 2

  • As with many (most?) dynamic programming problems, the first trick is

to figure out what function you are optimizing

  • The second trick is to figure out how you can relate the optimal

solution of smaller problems to the optimal solution of larger problems

  • Once you know these two things, the rest is often “easy”
  • Idea 1: maximize F(k), the number of objects collected after k steps
  • Problem: relating F(k) to F(k − 1) is difficult because there are (in

general) many places you can be after k steps, and for each of those there are (for this problem) usually two places you might have come from (to the left or above)

  • So that is not a good choice of function to optimize

SLIDE 11

Collecting Objects from an n × m Board: 3

  • Idea 2: maximize F(i, j), the number of objects which can be collected

when you get to board position (i, j)

  • This definition of F() can be “easily” related to “smaller” problems

– F(i, j) = (number of objects at (i, j)) + max{F(i − 1, j), F(i, j − 1)}
– (and the F(i − 1, j) and F(i, j − 1) terms apply only if i and/or j is larger than 0, respectively)

  • Base case: F(0, 0) is the number of objects at (0, 0)

<board example: the starred board from the previous slide, with the resulting table of F(i, j) values filled in from the top-left corner>

  • Keep track of how we did it with another matrix (of size O(n · m)) of ←/↑ arrows, recording for each position whether the optimal path arrived from the left or from above (see the sketch below)
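A sketch in Python (names mine); board[i][j] counts the objects at position (i, j), and came_from records the arrows:

def collect(board):
    n, m = len(board), len(board[0])
    F = [[0] * m for _ in range(n)]
    came_from = [[None] * m for _ in range(n)]         # '←' or '↑'
    for i in range(n):
        for j in range(m):
            best = 0
            if i > 0:                                  # can come from above
                best, came_from[i][j] = F[i - 1][j], '↑'
            if j > 0 and F[i][j - 1] >= best:          # or from the left
                best, came_from[i][j] = F[i][j - 1], '←'
            F[i][j] = best + board[i][j]
    return F[n - 1][m - 1], came_from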

SLIDE 12

The 0–1 Knapsack Problem: 1

  • Problem: given n items of known weights w1, . . . , wn and values

v1, . . . , vn and a knapsack of capacity W, find the most valuable subset of the items that fit into the knapsack

– weights (and the capacity W) are assumed to be positive integers
– values can be non-integer

  • A brute force solution is discussed in Section 3.4

(Ω(2ⁿ) time!)

  • Q: how can we formulate this as a dynamic programming problem?

  • Idea: consider just the first i items, for 1 ≤ i ≤ n
  • That by itself doesn’t give us the necessary recurrence to allow us to

state “bigger” instances in terms of “smaller” instances

  • Note: a more general formulation of the knapsack problem allows any number of each

item in the knapsack, not just 0 or 1

SLIDE 13

The 0–1 Knapsack Problem: 2

  • Idea 2: to be able to state “bigger” instances in terms of “smaller”

instances, we must not only consider different numbers of objects, but also consider different target weights

  • Idea: allow the recurrence to be a function of not only the first i items,

but also the allowed weight:

– define F(i, j) to be the value of an optimal solution for items with weights w1, . . . , wi and values v1, . . . , vi in a knapsack of capacity j

  • Now apply the following amazing observation: there are two categories

of subsets of the first i items that fit into a knapsack of capacity j:

– those that do not contain item i, and
– those that do contain item i

  • The subsets that don’t contain i have optimal value F(i − 1, j)
  • The subsets that contain i have optimal value vi + F(i − 1, j − wi)

but only if j − wi ≥ 0

SLIDE 14

The 0–1 Knapsack Problem: 3

  • These ideas give the following recurrence:

F(i, j) = max{F(i − 1, j), vi + F(i − 1, j − wi)}   if wi ≤ j
F(i, j) = F(i − 1, j)                               otherwise
  • As usual, we need some base cases (initial conditions):

F(0, j) = 0 for j ≥ 0

and

F(i, 0) = 0 for i ≥ 0

  • To perform the calculations, we fill a matrix in using a technique very

similar to the binomial coefficients example: row by row, left to right (see the sketch below)
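A sketch of the bottom-up table fill in Python (names and the small test instance are mine):

def knapsack(weights, values, W):
    n = len(weights)
    F = [[0] * (W + 1) for _ in range(n + 1)]          # F(0, j) = F(i, 0) = 0
    for i in range(1, n + 1):
        wi, vi = weights[i - 1], values[i - 1]
        for j in range(1, W + 1):
            if wi <= j:
                F[i][j] = max(F[i - 1][j], vi + F[i - 1][j - wi])
            else:
                F[i][j] = F[i - 1][j]
    return F[n][W]

print(knapsack([2, 1, 3, 2], [12, 10, 20, 15], 5))     # 37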

SLIDE 15

Memory Functions

  • In the previous two examples, we used a bottom-up approach to filling

in the matrix

– this has the disadvantage of possibly calculating values which are never needed for the specific desired solution
– it also doesn’t match the recursive definition of the functions used to describe the solution

  • Instead, we can apply the following idea:

– initialize every entry of the solution array to some “invalid” value

(if all possible values are valid, use a corresponding array of bit values to record which entries have been computed)

– when the (recursive) algorithm needs a given value, it checks the validity of that location in the array
– if valid, use the stored value directly
– if invalid, recursively calculate it and store it in the array

  • This way no value is ever calculated twice
  • See algorithm MFKnapsack in the textbook
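A sketch in the same spirit as MFKnapsack (this is my paraphrase, not the textbook’s code), using −1 as the “invalid” marker:

def mf_knapsack(weights, values, W):
    n = len(weights)
    F = [[-1] * (W + 1) for _ in range(n + 1)]   # -1 means "not yet computed"
    for j in range(W + 1):
        F[0][j] = 0                              # base cases are valid from the start
    for i in range(n + 1):
        F[i][0] = 0

    def mf(i, j):
        if F[i][j] < 0:                          # invalid: compute and store
            if weights[i - 1] > j:
                F[i][j] = mf(i - 1, j)
            else:
                F[i][j] = max(mf(i - 1, j),
                              values[i - 1] + mf(i - 1, j - weights[i - 1]))
        return F[i][j]

    return mf(n, W)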

SLIDE 16

Optimal Binary Search Trees

  • Suppose you have n keys

a1 < · · · < an

and experience has shown that the probabilities of searching for each of the ai’s are (respectively)

p1, . . . , pn

Problem: find a BST with a minimum average number of comparisons in a successful search

  • Idea: different binary search trees (for a given set of probabilities) will

have different average search costs

Q: Can we just generate all BSTs and pick the best?

A: Since the total number of BSTs with n nodes is given by

C(2n, n) · 1/(n + 1)

(the n-th Catalan number), which grows exponentially, brute force is (generally) hopeless

SLIDE 17

Optimal Binary Search Trees: 2

  • Example: What is an optimal BST for keys A, B, C, and D (= a1, a2,

a3, a4) with search probabilities 0.1, 0.2, 0.4, and 0.3, respectively?

<figure: two candidate BSTs; in the first, A is the root with B, C, D forming a chain below it; in the second, B is the root with A as its left child and C, then D, on the right>

  • The cost (average number of comparisons) for the first tree is

Cost[1, 4] = Σ_{i=1..n} pi · level(ai) = 0.1 · 1 + 0.2 · 2 + 0.4 · 3 + 0.3 · 4 = 2.9

and the cost for the second is

0.1 · 2 + 0.2 · 1 + 0.4 · 2 + 0.3 · 3 = 2.1

  • Is either of these optimal?

SLIDE 18

Optimal Binary Search Trees: 3

  • To find an efficient solution for this, we employ the following amazingly

brilliant observation:

– one of the keys, say ak, must be at the root of an optimal BST

  • If that’s not enough for you, consider this piece of cleverness:

– the left subtree in an optimal BST should be optimal for the set of keys it contains

as well as this piece:

– the right subtree in an optimal BST should be optimal for the set of keys it contains

  • Admittedly, the latter two pieces are not quite as obvious as the first

part

Q1: can you prove these two claims?
Q2: if you can’t prove these two claims, can you at least convince yourself they are true?

SLIDE 19

Optimal Binary Search Trees: 4

  • Consider some subtree of the optimal BST, with some key ak at its root:
  • Observations:

– all of the keys less than ak are in the left sub-subtree
– all of the keys greater than ak are in the right sub-subtree

SLIDE 20

Optimal Binary Search Trees: 5

  • Define C[i, j] to be the cost of the optimal BST T(i, j) made up of the keys ai, . . . , aj, for any 1 ≤ i ≤ j ≤ n

  • Using the usual dynamic programming principle, we first find (optimal) costs C[i′, j′] for sub-problems T(i′, j′) where i′ > i and/or j′ < j

  • Suppose I have some optimal BST T(i, j) for the keys ai, . . . , aj

– if T(i, j) is (say) the left subtree of a tree T′ with root aj+1, then to access some key in {ai, . . . , aj} we use one more comparison than we used to find the key in T(i, j)

<board example>

  • Thus we can relate the cost of a BST to the costs of its two sub-BSTs
  • E.g., suppose we have T(i, j) with root ak for i ≤ k ≤ j:

C[i, j] = pk + C′[i, k − 1] + C′[k + 1, j]

where C′[l, m] is similar to C[l, m], but for when subtree T(l, m) is one level deeper in the complete tree

SLIDE 21

Optimal Binary Search Trees: 6

  • How do we compute C′[. . .]? Recall that

C[i, j] = Σ_{l=i..j} pl · level(al)

  • If the whole subtree T(i, j) is moved one level down, then the number of

comparisons to reach a given key in T(i, j) is increased by one:

C′[i, j] = Σ_{l=i..j} pl · (level(al) + 1) = Σ_{l=i..j} pl · level(al) + Σ_{l=i..j} pl = C[i, j] + Σ_{l=i..j} pl

so the cost of a tree grows by its weight (the sum of the probabilities of the keys in that tree) whenever the tree is moved down one level

– this allows us to easily compute the overall cost of a tree, based upon knowing the costs of its subtrees

SLIDE 22

Optimal Binary Search Trees: 7

  • We now know how to compute the cost of a tree knowing its root

vertex and the costs (and weights) of its subtrees

  • Denote Σ_{l=i..j} pl, the weight of T(i, j), by W[i, j]

  • To find the optimal BST for the keys ai, . . . , aj we can try each possible

key at the root:

C[i, j] = min_{i≤k≤j} (pk · 1 + C′[i, k − 1] + C′[k + 1, j])
        = min_{i≤k≤j} (pk + C[i, k − 1] + W[i, k − 1] + C[k + 1, j] + W[k + 1, j])
        = W[i, j] + min_{i≤k≤j} (C[i, k − 1] + C[k + 1, j])

  • Note that we need to define C[i, i − 1] = 0 for 1 ≤ i ≤ n + 1 to handle

the cases where the first or last node is the root

  • Also note that C[i, i] = pi, the weight of a one-node tree
  • Q: is this computationally efficient?
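A sketch of the computation in Python (names mine); this is the straightforward version of the recurrence, filling one diagonal at a time:

def optimal_bst_cost(p):
    # p[1..n] are the search probabilities (p[0] is unused)
    n = len(p) - 1
    C = [[0.0] * (n + 2) for _ in range(n + 2)]   # C[i][i - 1] = 0 by default
    W = [[0.0] * (n + 2) for _ in range(n + 2)]
    for i in range(1, n + 1):
        C[i][i] = W[i][i] = p[i]                  # one-node trees: C[i, i] = pi
    for size in range(2, n + 1):                  # diagonals of growing size
        for i in range(1, n - size + 2):
            j = i + size - 1
            W[i][j] = W[i][j - 1] + p[j]
            C[i][j] = W[i][j] + min(C[i][k - 1] + C[k + 1][j]
                                    for k in range(i, j + 1))
    return C[1][n]

print(optimal_bst_cost([0, 0.1, 0.2, 0.4, 0.3]))  # ≈ 1.7 for the A, B, C, D example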

SLIDE 23

Optimal Binary Search Trees: 8

  • Sample computation table: <figure: the C[i, j] table, filled in one diagonal at a time>
  • Note that there is a column indexed with 0 and a row indexed with n + 1

SLIDE 24

Optimal Binary Search Trees: 9

  • How much computation is required to fill in the table?

– there are n + 1 zeroes (the C[i, i − 1] entries) on one diagonal
– there are n pk’s (the C[i, i] entries) next to that
– each of the above can be written down in constant time

  • There are (n − 1) + (n − 2) + · · · + 2 + 1 entries left to be computed

– each of these is computed by computing O(n) sums (each sum is the sum of three numbers) and finding the minimum of the sums

  • In total, Θ(n³)

. . . bah!

  • We can improve this to Θ(n²) with the following observation:

– adding a weight to the left can not move the optimal root to the right
– that is, if the root for the optimal subtree [ai, . . . , aj] is ak, then adding aj+1 can not give an optimal subtree with a root al where l < k
– a careful counting shows that computing each diagonal can be done in Θ(n) time, thus a total of Θ(n²) time is required

Proof: GEQ

SLIDE 25

Optimal Binary Search Trees: 10

  • W[i, j] computation:

– in the algorithm in the textbook, each time a W[i, j] is needed the algorithm uses a for loop to compute the value (W[i, j] = Σ_{l=i..j} pl)
– instead of doing this, a matrix of W values can be computed in a manner similar to the C[i, j] matrix, using W[i, j] = W[i, j − 1] + pj

  • Space complexity:

– the C[. . .] matrix requires Θ(n²) space
– only O(n) space is needed for W[. . .] if you are careful
– GEQ: how?

SLIDE 26

The Transitive Closure of a Directed Graph

  • Recall that while the adjacency matrix A for an undirected graph is equal

to its transpose Aᵀ, this is not (in general) the case for a directed graph

  • Given a digraph, suppose you want to know which vertices can be

reached from a given vertex with a path of length 1 or more

  • Define the transitive closure of a digraph G = (V, E) to be the graph

TC(G) = (V, E′) where (v1, v2) ∈ E′ iff there is a path of length greater than 0 from v1 to v2

SLIDE 27

Computing the Transitive Closure of a Digraph G

  • Idea 1: for each v ∈ V , do a BFS or DFS starting at v, recording the

reachable vertices in TC(G)’s adjacency matrix

  • Problem: each vertex and each edge will be examined more times than

necessary

– e.g., if G has (a, b), (b, c), (c, d) and (d, e), then the path from c to e will be “discovered” (at least) 3 times: once when searching from a, once from b and once from c
  • Q: Can we do better?

A: Yes! Idea: we make use of three “dimensions” here: “from vertex”, “to vertex” and “intermediate vertex”

Note that the adjacency matrix of G shows the paths from vi to vj with no intermediate vertex numbered higher than 0

SLIDE 28

Transitive Closure: Warshall’s Algorithm

  • Given an n-vertex digraph G, define a series of n × n Boolean matrices

R(0), . . . , R(k−1), R(k), . . . , R(n)

where R(k)[i, j] = 1 iff there is a path from vi to vj with no intermediate vertex numbered higher than k

  • R(0) is the adjacency matrix of G: a path which goes from (say) vi to vj

and has no intermediate vertices numbered higher than 0 must be a single edge

  • R(1) indicates which vertices are connected either directly or through the

intermediate vertex v1

  • Similarly, R(2) indicates which vertices are connected either directly or

through the intermediate vertices v1 and/or v2

SLIDE 29

Transitive Closure: Warshall’s Algorithm 2

  • Warshall’s algorithm iteratively computes the R(k) matrices

– R(1) is computed from R(0), R(2) from R(1), and so on
– to compute R(k) given R(k−1), observe that if

R(k−1)[i, k] = 1 and R(k−1)[k, j] = 1,

then there is a path from vi to vj that goes through vk (so R(k)[i, j] = 1)
– and, of course, R(k)[i, j] = 1 whenever R(k−1)[i, j] = 1

SLIDE 30

Transitive Closure: Warshall’s Algorithm 3

Applying Warshall’s Algorithm: new 1’s are in bold face <worked example: the sequence of matrices R(0) through R(n) is not reproduced here>

SLIDE 31

Transitive Closure: Warshall’s Algorithm 4

/*
 * Warshall’s algorithm computes the transitive closure
 * Input: the adjacency matrix of a digraph with n vertices
 * Returns: the transitive closure of the input
 */
Warshall(A[1..n][1..n])
    R(0) ← A
    for k ← 1 to n
        for i ← 1 to n
            for j ← 1 to n
                R(k)[i, j] ← R(k−1)[i, j] || ( R(k−1)[i, k] && R(k−1)[k, j] )
    return R(n)

  • Time complexity: Θ(n³)

Bah!

  • Space complexity: only one matrix is needed. . . we can just update the

input matrix in-place! (see the sketch below)
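A minimal Python sketch of the in-place variant (name mine); overwriting is safe because entries only ever change from 0 to 1, and row k and column k do not change during pass k:

def warshall_in_place(A):
    n = len(A)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # A[i][k] and A[k][j] are stable during pass k
                A[i][j] = A[i][j] or (A[i][k] and A[k][j])
    return A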

SLIDE 32

Shortest Paths

  • Given a weighted, connected graph G, the all-pairs shortest-path

problem is to find the lengths of the minimum-length paths between each pair of vertices

  • Let D be an n × n matrix where D[i, j] is the length of the shortest path

from vi to vj

  • Example: <figure: a small weighted digraph and its distance matrix D>

SLIDE 33

Shortest Paths: Floyd’s Algorithm

  • Assume the graph does not contain a cycle with a negative length

  • Similar to Warshall’s algorithm, Floyd’s algorithm calculates a series of

distance matrices:

D(0), . . . , D(k−1), D(k), . . . , D(n)

where D(k) is the matrix of shortest-path distances in which no path uses an intermediate vertex numbered higher than k

  • As before, D(k) can be computed from D(k−1)

– similar to before, D(0) is the graph’s edge weight matrix

  • Idea: given the shortest path from vi to vj with no intermediate vertices

numbered larger than k, we can say that

– either the shortest path goes through vk, in which case its length is D(k−1)[i, k] + D(k−1)[k, j], or
– the shortest path does not go through vk, in which case its length is D(k−1)[i, j]

SLIDE 34

Shortest Paths: Floyd’s Algorithm: 2

  • Note: it turns out that we can over-write the D(k−1) matrix with the

D(k) matrix, so we don’t actually need n different matrices

/*
 * Floyd’s alg computes the lengths of all shortest paths
 * Input: the weight matrix of a graph with n vertices
 * Returns: the matrix of shortest path lengths
 */
Floyd(W[1..n][1..n])
    D ← W
    for k ← 1 to n
        for i ← 1 to n
            for j ← 1 to n
                D[i, j] ← min(D[i, j], D[i, k] + D[k, j])
    return D

  • Time complexity: Θ(n³)

Bah!

  • GEQ: what “easy” addition can be made to this algorithm so that we

can compute the paths, not just their lengths?
