COMP 3403 Algorithm Analysis
Part 5: Chapter 9
Jim Diamond, CAR 409, Jodrey School of Computer Science, Acadia University
Chapter 9
Greedy Techniques
Greedy Approaches
- One technique to solve a problem is to make a series of decisions which
seem to be the best at the instant each decision is made
- More specifically, these decisions have three properties:
  – they are feasible: that is, each decision satisfies the problem's constraints
  – they are locally optimal: that is, among all options available at the time the decision is made, the best (or a best) decision is made
  – they are irrevocable: that is, having made a decision, the decision can not be taken back or reversed (you can’t later change your mind)
- Applying these criteria often yields a relatively simple algorithm
  – sometimes the overall solution is optimal, sometimes it isn’t
  – in cases where the solution is not optimal, a greedy approach can still be useful: it provides a bound, which may be useful for other purposes
Change-Making Problem
- Given an “unlimited” supply of coins of denominations d1 > · · · > dm,
make change for an amount n using the smallest number of coins
- Example: d1 = 25¢, d2 = 10¢, d3 = 5¢, d4 = 1¢ and n = 62¢
  – greedy chooses 25¢ + 25¢ + 10¢ + 1¢ + 1¢, i.e., five coins, which is optimal here
- The greedy solution:
  – may not be optimal for “unnormal” coin denominations
  – e.g., suppose n = 15¢ and the available coin denominations are 10¢, 7¢ and 1¢; the optimal solution (7¢ + 7¢ + 1¢) uses three coins, but the greedy solution (10¢ + 1¢ + 1¢ + 1¢ + 1¢ + 1¢) uses six coins
- In the case of “unnormal” coin denominations, the greedy solution gives
an upper bound on the number of coins needed, but the upper bound is not (in general) “tight”
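As a quick illustration, here is a minimal Python sketch of the greedy strategy (the function name and structure are assumptions for illustration, not from the text):

    def greedy_change(n, denominations):
        """Greedily make change for n; denominations must be in decreasing order."""
        coins = []
        for d in denominations:
            count, n = divmod(n, d)      # take as many of this coin as fit
            coins.extend([d] * count)
        return coins                     # n is 0 here whenever the smallest coin is 1

    print(greedy_change(62, [25, 10, 5, 1]))  # [25, 25, 10, 1, 1]: optimal (5 coins)
    print(greedy_change(15, [10, 7, 1]))      # [10, 1, 1, 1, 1, 1]: 6 coins, not optimal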
Minimum(-Cost/Weight) Spanning Trees
- Given a connected graph G, a spanning tree T of G is a connected,
acyclic subgraph of G which contains all of G’s vertices
- Given a weighted, connected graph G, a minimum-cost spanning tree T
of G is a spanning tree of G which has minimum cost over all of G’s
spanning trees
- Example
A graph and its 3 spanning trees; T1 has min cost
Prim’s MST Algorithm
- Start with a tree T1 consisting of one (any) vertex and “grow” the tree
one vertex at a time to produce an MST through a series of expanding
subtrees T1, T2, . . . , Tn
- On each iteration, construct Ti+1 from Ti by adding a vertex not in Ti
that is closest to some vertex already in Ti (this is a “greedy” step!)
- Stop when all vertices are included
/*
 * Prim’s algorithm to find a minimum-cost spanning tree.
 * Input: a weighted, connected graph G = (V, E)
 * Returns: ET, the set of edges of an MST of G
 */
Prim(G)
    VT ← {v0}
    ET ← Ø
    for i ← 1 to |V| - 1 do
        find a min-cost edge e* = {u*, v*} with v* ∈ VT and u* ∉ VT
        VT ← VT ∪ {u*}
        ET ← ET ∪ {e*}
    return ET
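A direct (if naive) Python rendering of this pseudocode, re-scanning every edge on each iteration, is sketched below; the graph representation (a dict from two-vertex frozensets to weights) is an assumption, not from the text:

    def prim(vertices, edges):
        """Prim's MST.  `edges` maps frozenset({u, v}) -> weight.
        Scanning all edges per iteration gives O(|V| * |E|) overall."""
        vt = {next(iter(vertices))}   # VT: start from an arbitrary vertex
        et = set()                    # ET
        while len(vt) < len(vertices):
            # greedy step: cheapest edge with exactly one endpoint in the tree
            e = min((e for e in edges if len(e & vt) == 1), key=edges.get)
            vt |= e                   # the set union adds the new endpoint u*
            et.add(e)
        return et

A faster priority-queue formulation is sketched after the efficiency discussion below.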
Example of Prim’s Algorithm
- Arbitrarily start with the first vertex
(which is a); this gives V (T1) = {a}
- Find a min-cost edge with a at one
end: (a, b); this gives V (T2) = {a, b}
- Find a min-cost edge with a or b at
one end, the other end not in V (T2):
(b, c); this gives V (T3) = {a, b, c}
- Next we pick (b, f); this gives
V (T4) = {a, b, c, f}
- Next we pick (e, f); this gives
V (T5) = {a, b, c, e, f}
- Next we pick (d, f); this gives
V (T6) = {a, b, c, d, e, f}
- The min cost tree has E(T6) =
{(a, b), (b, c), (b, f), (d, f), (e, f)}
Prim’s Algorithm: Is the Tree Optimal?
- The algorithm obeys the criteria: feasible, locally optimal, irrevocable; but is the overall answer optimal? Yes!
- Proof by contradiction: let T be a tree generated by Prim, where e1 was the first edge added, e2 the second, and so on. Assume T is not optimal; then there are optimal trees {T∗(k)} which have smaller costs than T. For each such k, let eik = (v, u) be the first edge added to T which is not in T∗(k) (i.e., e1, . . . , eik−1 are all in T∗(k), but eik ∉ T∗(k)). Choose T∗ to be an optimal tree which maximizes ik.
- Example: suppose T is Prim’s tree for some G, with edges e1, e2, e3, . . . , en−1 (in order of addition), and T∗(1), T∗(2), T∗(3) and T∗(4) are MSTs for G, where
  – {e1, e2, e3, e4} ⊂ T∗(1), but e5 ∉ T∗(1); thus ei1 = e5
  – {e1, e2, e3, . . . , e8} ⊂ T∗(2), but e9 ∉ T∗(2); thus ei2 = e9
  – {e1, e2} ⊂ T∗(3), but e3 ∉ T∗(3); thus ei3 = e3
  – {e1, e2, e3, . . . , e6} ⊂ T∗(4), but e7 ∉ T∗(4); thus ei4 = e7
- Here we would choose T∗ = T∗(2)
Prim’s Algorithm: Is the Tree Optimal? (2)
- Consider the graph T∗ ∪ {eik}, where eik = (v, u):
  [figure: the cycle created by adding eik = (v, u) to T∗, showing the other crossing edge e′]
- This graph must have some cycle C, and C must have some edge e′ ≠ eik connecting some vertex in Tik−1 to some vertex not in Tik−1
- If e′ had less cost than eik, Prim’s algorithm would have chosen it; thus c(e′) ≥ c(eik)
- By deleting e′ from T∗ ∪ {eik} we produce a tree T′ whose weight is no larger than T∗’s
- Since T∗ is assumed to be optimal, c(T′) = c(T∗). But then T′ is an optimal tree with a larger associated ik value than T∗, contradicting the choice of T∗. Therefore T is optimal.
Prim’s Algorithm: Efficiency
- We should be able to do operations like “add e∗ to ET ” and “add u∗ to
VT ” efficiently (in constant time?)
- But how about “find a min-cost edge (u∗, v∗) with v∗ ∈ VT and u∗ ∉ VT”?
- There are various data structures we could use to solve this
- E.g., we can use a priority queue where each element in the queue is an edge
  – the book’s idea uses a priority queue, but for vertices
  – when a new vertex u∗ is added to the tree, we add all edges from u∗ to non-tree vertices to the priority queue
- GEQ #1: is this a valid solution?
- GEQ #2: if so, how efficient is this? Is it better or worse than the
description in the book?
- GEQ #3: if better, does every edge removed from the priority queue
end up in the tree?
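One possible Python sketch of the edge-based priority-queue idea follows (an illustration under assumed names, not the book’s version; `adj` maps each vertex to a list of (weight, neighbour) pairs). Note how the stale-entry check bears on GEQ #3 for this variant:

    import heapq

    def lazy_prim(adj):
        """Prim's MST with a priority queue of edges; roughly O(|E| log |E|)."""
        start = next(iter(adj))
        in_tree = {start}
        heap = [(w, start, u) for w, u in adj[start]]
        heapq.heapify(heap)
        mst = []
        while heap and len(in_tree) < len(adj):
            w, v, u = heapq.heappop(heap)
            if u in in_tree:      # stale entry: both endpoints already in tree,
                continue          # so not every popped edge joins the tree
            in_tree.add(u)
            mst.append((v, u, w))
            for wu, x in adj[u]:
                if x not in in_tree:
                    heapq.heappush(heap, (wu, u, x))
        return mst

    adj = {"a": [(3, "b"), (5, "c")], "b": [(3, "a"), (1, "c"), (4, "d")],
           "c": [(5, "a"), (1, "b"), (6, "d")], "d": [(4, "b"), (6, "c")]}
    print(lazy_prim(adj))   # [('a', 'b', 3), ('b', 'c', 1), ('b', 'd', 4)]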
Kruskal’s MST Algorithm
- Idea:
  – grow the tree one edge at a time to produce an MST through a series of forests F1, F2, . . . , Fn−1
  – on each iteration, add the next edge on the sorted list unless this would create a cycle; if it would, skip the edge
// Kruskal’s algorithm to find an MST
// Input: a weighted, connected graph G = (V, E)
// Returns: ET, the set of edges of an MST of G
Kruskal(G)
    sort E in nondecreasing order of weights: ei1, ei2, ei3, . . .
    ET ← Ø; edge_count ← 0; k ← 0
    while edge_count < |V| - 1
        k++
        if ET ∪ {eik} is acyclic
            ET ← ET ∪ {eik}; edge_count++
    return ET
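A runnable Python sketch of this algorithm follows; the cycle test keeps a component id per vertex and relabels one component on each merge (an idea examined on the following slides; all names here are assumptions for illustration):

    def kruskal(vertices, edges):
        """Kruskal's MST.  `edges` is a list of (weight, u, v) triples."""
        comp = {v: i for i, v in enumerate(vertices)}  # component id per vertex
        mst = []
        for w, u, v in sorted(edges):                  # nondecreasing weight
            if comp[u] != comp[v]:                     # adding (u, v) is acyclic
                mst.append((u, v, w))
                old, new = comp[v], comp[u]
                for x in comp:                         # merge: relabel v's component
                    if comp[x] == old:
                        comp[x] = new
                if len(mst) == len(vertices) - 1:
                    break
        return mst

    edges = [(3, "a", "b"), (1, "b", "c"), (5, "a", "c"), (6, "c", "d"), (4, "b", "d")]
    print(kruskal(["a", "b", "c", "d"], edges))  # [('b','c',1), ('a','b',3), ('b','d',4)]

The relabelling loop costs Θ(|V|) per accepted edge; the union-find structure developed below does substantially better.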
Kruskal Example
Kruskal’s Algorithm Implementation Considerations
- In some respects, this algorithm appears simpler than Prim’s algorithm
- However, answering the question “is ET ∪ {eik} acyclic?” is easier said than done
- Idea 1: if eik = (u, v), search in the current graph to see if v is reachable from u
  – can we do better?
- Idea 2: can we come up with an efficient way of answering the question “is u in the same tree as v?”
  – Q: what if each tree in the forest had a unique id, and we could find the id of u’s tree (and v’s tree) efficiently?
  – Q′: when an edge is added, two trees are merged: how can we efficiently update one tree’s id?
The Union-Find Algorithm
- Suppose we have some algorithm which requires the following abstract operations:
  – makeset(v) — create a set with the element v
  – find(v) — return (a unique identifier for) the set containing v
  – union(u, v) — move all elements of the set containing v to the set containing u
- These operations allow us to implement Kruskal’s algorithm:
  – first call makeset(v) for each vertex; this creates a forest of |V| trees, each with just one vertex
  – to answer the question “is ET ∪ {eik} acyclic?” where eik = (u, v), merely compare find(u) to find(v)
  – to implement “ET ← ET ∪ {eik}” where eik = (u, v), merely call union(u, v)
The Union-Find Algorithm: 2
- What data structure(s) can we pick to allow us to efficiently implement
these operations?
- A1: use a linked list for each set
  – union(): constant time
  – find(): ummm. . . ahhh. . . O(|V|) time
- A2: store the set number for each vertex in a vector
  – find(): constant time!
  – union(): ummm. . . ahhh. . . Θ(|V|) time
- A3: use both a linked list and a vector
  – find(): constant time!
  – union(): ummm. . . ahhh. . . O(|V|) time worst case (why?)
- None of these is good enough!
The Union-Find Algorithm: 3
- Idea: represent each set as a tree, with (directed) edges pointing
towards the root:
- makeset(): still constant time
- union(): constant time (assuming we have the roots of the two trees already, which would be the case for Kruskal’s algorithm)
  – Why? GEQ?
- find(): still a problem, since the paths from a vertex to a root can be “long”
The Union-Find Algorithm: 4
- Path compression: every time a find(x) operation is done, make all of the nodes on the path from x up to the immediate child of the root into children of the root
- With this path compression operation, a sequence of n union()s and m find()s is only very very very very slightly worse than linear in m + n
  – the analysis is fairly difficult. . . multiple papers were presented showing (incorrectly) that the overall time is linear
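A compact Python sketch of the tree representation with path compression follows; the union-by-size heuristic included here is a standard companion that the slides do not discuss, and all names are assumptions for illustration:

    class UnionFind:
        """Disjoint sets as trees with parent pointers."""
        def __init__(self, elements):
            self.parent = {v: v for v in elements}  # makeset: each element is a root
            self.size = {v: 1 for v in elements}

        def find(self, v):
            root = v
            while self.parent[root] != root:        # walk up to the root
                root = self.parent[root]
            while self.parent[v] != root:           # path compression pass
                self.parent[v], v = root, self.parent[v]
            return root

        def union(self, u, v):
            ru, rv = self.find(u), self.find(v)
            if ru == rv:
                return False                        # u, v already in the same set
            if self.size[ru] < self.size[rv]:
                ru, rv = rv, ru
            self.parent[rv] = ru                    # hang smaller tree off larger root
            self.size[ru] += self.size[rv]
            return True

In Kruskal’s loop, the acyclicity test and the merge then collapse into a single union(u, v) call, which returns False exactly when the edge would close a cycle.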
Single-Source Shortest Path
- Problem: find the shortest path from a given vertex s to all other vertices
  – here we assume all edge weights are non-negative
  – Dijkstra’s algorithm solves the problem only when we have this restriction
- Dijkstra’s algorithm works by first finding the path to the closest vertex to s, then to the next closest vertex, and so on
  – at any step the vertices adjacent to Ti but not in Ti are considered for inclusion
  – the vertex with minimum path length to s is added to Ti to produce Ti+1
  – the candidate vertices adjacent to Ti+1 and their total distance from s are then updated
Dijkstra’s Algorithm
- Dijkstra’s algorithm is similar to Prim’s MST algorithm, but with a different way of computing numerical labels: among vertices not already in the tree, it finds a vertex u with the smallest sum
      dv + w(v, u)
  where
  – v is a vertex for which a shortest path has already been found on preceding iterations (such vertices form a tree), and
  – dv is the length of a shortest path from the source s to v
- Algorithm: see text
  – each vertex carries two labels:
    – the length of the shortest path from s, and
    – the previous vertex on the path from s
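Since the slide defers to the text for the algorithm itself, here is one common min-heap rendering in Python (a sketch under assumed names, not the book’s pseudocode; `adj` maps each vertex to a list of (weight, neighbour) pairs):

    import heapq

    def dijkstra(adj, s):
        """Single-source shortest paths for non-negative edge weights.
        Returns dist (shortest-path length from s) and prev (previous
        vertex on that path): exactly the two labels described above."""
        dist, prev = {s: 0}, {s: None}
        heap = [(0, s)]
        done = set()
        while heap:
            d, v = heapq.heappop(heap)
            if v in done:
                continue                  # stale queue entry
            done.add(v)
            for w, u in adj[v]:
                if u not in dist or d + w < dist[u]:   # relax edge (v, u)
                    dist[u], prev[u] = d + w, v
                    heapq.heappush(heap, (d + w, u))
        return dist, prev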
Dijkstra’s Algorithm: Example
Dijkstra’s Algorithm: Comments
- As mentioned previously, Dijkstra’s algorithm does not work for graphs
with negative edge weights
- Dijkstra’s algorithm is applicable to both undirected and directed graphs
- Efficiency:
  – O(|E| log |V|) for graphs represented by adjacency lists and a min-heap implementation of the priority queue
  – GEQ: prove this
  – for sparse graphs the adjacency list representation is preferable
Information Theory
- The information content of a symbol is equivalent to the amount of “surprise” one experiences upon receiving it
  – e.g., suppose the symbols Thursda have been received:
  – there is little surprise if the next symbol is “y”
  – there is much surprise if the next symbol is “q”
- Claim: more information is transmitted by an “unlikely” symbol than by a “likely” one
  – example above: after Thursda has been sent, we could transmit “0” (the 0 bit) for y, or “1e(q)” for q, where 1 is the 1 bit and “e(q)” is the encoding of q
- Idea: a likely symbol can be transmitted with fewer bits than an unlikely
symbol
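This intuition has a standard quantification (a textbook fact, not stated on the slide): the information content of a symbol si received with probability pi is

    I(si) = log2(1/pi) = −log2 pi bits

so a symbol with probability 1/2 carries 1 bit of information while one with probability 1/8 carries 3 bits, matching the code lengths in the Huffman example later in this chapter.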
Data Compression
- Fewer bits are needed for likely (i.e., probable) symbols
- “Expectation” is based upon the knowledge of sender and receiver
- Compression (using this idea) requires:
  – a non-uniform probability distribution of the next symbol to be transmitted
- If, given “all” knowledge,
      P(si) = P(sj), ∀ i, j ∈ {1 . . . n}
  then, on average, at least log2(n) bits are required to send the next symbol
Information Theory (continued)
- Example: walk up to someone† and ask them to complete this sentence: “Peter Piper picked a peck of pickled ”. They will look you straight in the eye and say: “0”
† A native English speaker, anyway
Coding
- Given some set of symbols S = {s1, s2, . . . , sn}, a coding is an
assignment of bit strings to symbols
- The codes can be either
  – fixed-length, or
  – variable-length
- One desirable property of a variable-length code is that no code is the prefix of another code
- Suppose the symbols have different occurrence probabilities; one coding of the symbols might produce a shorter encoding of a message than another coding
- Problem: If frequencies of the character occurrences are known, what is
the best binary prefix-free code?
Huffman Coding: 1
- Concept: replace each (fixed-length) encoding e(si) with a
variable-length bit string bi, transmit bi instead of e(si)
- Assume each symbol si has a certain probability pi of being transmitted
- Example:
  – S = {a, b, c, d}, pa = 1/2, pb = 1/4, pc = 1/8, pd = 1/8
  – ba = 0, bb = 10, bc = 110, bd = 111
  – a fixed-length code needs 2m bits to send a message of m symbols
  – Huffman coding requires (on average)
      Σi p(si) × |bi| × m = (1 × 1/2 + 2 × 1/4 + 3 × 1/8 + 3 × 1/8) m = 7m/4 bits,
    a 12.5% saving
- Note: more significant savings are possible on larger examples
Huffman Coding: 2
- We need to know probabilities; how?
  (a) …
  (b) examine the entire message before sending
      – then we must also transmit the frequency distribution to the receiver
  (c) …
      – we must also transmit the frequency distribution in this case
- Other related possibilities:
  – cope with non-stationarity by periodically re-computing the distribution
  – use 2nd-order probabilities pi|j: i.e., the probability of seeing si given that the previous symbol was sj
  – e.g., the probability that a “random” symbol in English text is “u” is fairly low, but if the previous symbol was “q” the probability is quite high
Construction of Huffman Codes
- Q: how can we find these codes?
- Huffman’s algorithm:
  – start with n one-node trees, one per symbol, each weighted with its symbol’s probability
  – repeat the following step n − 1 times:
    – find two trees T¹ᵢ and T²ᵢ with smallest weights (break ties arbitrarily)
    – join T¹ᵢ and T²ᵢ into one (as left and right subtrees) and make its weight equal the sum of the weights of T¹ᵢ and T²ᵢ
  – mark edges leading to left and right subtrees with 0’s and 1’s, respectively
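A short Python sketch of this construction follows (an illustration under assumed names, not the book’s code); ties are broken arbitrarily, so the exact bit strings may differ from the example on the next slide even though the average code length will not:

    import heapq
    from itertools import count

    def huffman(probabilities):
        """Build a Huffman code; maps each symbol to its bit string."""
        tiebreak = count()   # heapq needs a total order; dicts are not comparable
        heap = [(p, next(tiebreak), {s: ""}) for s, p in probabilities.items()]
        heapq.heapify(heap)
        while len(heap) > 1:
            p1, _, left = heapq.heappop(heap)      # two lightest trees
            p2, _, right = heapq.heappop(heap)
            merged = {s: "0" + c for s, c in left.items()}          # 0 = left edge
            merged.update({s: "1" + c for s, c in right.items()})   # 1 = right edge
            heapq.heappush(heap, (p1 + p2, next(tiebreak), merged))
        return heap[0][2]

    print(huffman({"A": 0.35, "B": 0.10, "C": 0.20, "D": 0.20, " ": 0.15}))

Here each “tree” is represented only by the partial codes of its leaves; merging two trees prepends 0 to every code in the left subtree and 1 to every code in the right.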
Huffman Example
- There are 5 symbols, ’A’, ’B’, ’C’, ’D’
and ’ ’
- They have the probabilities 0.35, 0.1,
0.2, 0.2 and 0.15, respectively
- The lightest-weight trees at the first step are ’B’ and ’ ’, so they are joined first
- The Huffman codes are 11, 100, 00, 01
and 101, respectively
Huffman Results
- Huffman codes are guaranteed to be the best possible (that is, no other code mapping one input symbol to one code has better performance on average)
- There are many variations on the Huffman theme
  – probabilities may not be known in advance at all; “adaptive Huffman” encoding modifies the probability estimates as each symbol is encoded
  – the decoder duplicates this process
  – the Huffman tree must be updated as the probability estimates change, which means the codes change over the course of encoding a message
  – pairs or triples of symbols can be encoded at once