Greedy Algorithms, Chapter 16. CPTR 430 Algorithms (PowerPoint presentation)
SLIDE 1

Greedy Algorithms

Chapter 16

CPTR 430 Algorithms Greedy Algorithms

1

slide-2
SLIDE 2

Greedy Algorithms

■ For some optimization problems, dynamic programming algorithms are overkill
■ A greedy algorithm always makes a choice that looks best at the moment
■ The strategy: a locally optimal choice will ultimately lead to a globally optimal solution
■ Greedy algorithms do not always yield optimal solutions
■ For some kinds of problems, though, greedy algorithms always yield optimal solutions:
  ❚ Dijkstra’s shortest path in a graph
  ❚ Prim’s minimum spanning tree of a graph
  ❚ Plus many others . . .

SLIDE 3

Example: Activity Selection

■ Need to schedule several activities that compete for a common resource to which they require exclusive access
  For example, a classroom in which only one class can be conducted at a time
■ Each activity ai has a start time si and a finish time fi, where si < fi
■ Activity ai takes place during the half-open interval [si, fi)
■ Activities ai and aj are compatible if the intervals [si, fi) and [sj, fj) do not overlap
  That is, si ≥ fj or sj ≥ fi

SLIDE 4

The Activity Selection Problem

■ Select a maximum-size subset of mutually compatible activities
■ Example: consider the set of activities (sorted in monotonically increasing order of finish time):

  i    1  2  3  4  5  6  7  8  9  10  11
  si   1  3  0  5  3  5  6  8  8   2  12
  fi   4  5  6  7  8  9 10 11 12  13  14

■ The subset { a3, a9, a11 } contains mutually compatible activities but is not maximal
■ The subsets { a1, a4, a8, a11 } and { a1, a4, a9, a11 } are both larger (and both are maximal)

SLIDE 5

A Dynamic Programming Solution

■ Find the optimal substructure
■ Define set Sij:

  Sij = { ak ∈ S : fi ≤ sk < fk ≤ sj }

  Sij is the set of activities in S that can start after activity ai finishes and finish before activity aj starts
■ Sij is the set of all activities that are compatible with both ai and aj
■ To make a nice, regular algorithm, create dummy activities a0 and an+1 (to bracket the real activities in which we are interested): f0 = 0 and sn+1 = ∞
■ S, the set of all activities, can thus be written S0,n+1
■ 0 ≤ i < j ≤ n + 1

SLIDE 6

Finding Subproblems

■ Sort the activities in monotonically increasing order of finish time:

  f0 ≤ f1 ≤ f2 ≤ . . . ≤ fn < fn+1

■ Claim: i ≥ j ⇒ Sij = ∅
  ❚ Prove by contradiction
  ❚ Suppose ak ∈ Sij for some i ≥ j, so that ai follows aj in sorted order (MIOFT)
  ❚ This means fi ≤ sk < fk ≤ sj < fj, so fi < fj, a contradiction to the assumption that ai follows aj in sorted order (MIOFT)
■ Thus, for subproblems we choose a maximal-size subset of mutually compatible activities from Sij where 0 ≤ i < j ≤ n + 1
  All other Sij will be empty

SLIDE 7

Substructure

■ Let a solution to Sij contain activity ak
■ fi ≤ sk < fk ≤ sj
■ Use ak to split the problem into two subproblems:
  ❚ Sik, the set of activities that start after ai finishes and before ak starts
  ❚ Skj, the set of activities that start after ak finishes and before aj starts
  ❚ Let the optimal solution to Spq be expressed as Apq
    This means that no other set A′pq contains more activities than Apq
  ❚ The solution for Sij, Aij, is then:

    Aij = Aik ∪ { ak } ∪ Akj
SLIDE 8

Optimal Substructure

■ If ak ∈ Aij, then
  Aik is an optimal solution to Sik
  Akj is an optimal solution to Skj
■ To show, suppose otherwise
■ Let |A′ik| > |Aik| (A′ik contains more activities than Aik)
  Then Aij is not optimal, since substituting A′ik for Aik yields a better solution to Sij
■ Let |A′kj| > |Akj| (A′kj contains more activities than Akj)
  Then Aij is not optimal, since substituting A′kj for Akj yields a better solution to Sij

SLIDE 9

DP Solution

Build a maximum-size subset of mutually compatible activities in Sij by:

■ Choose an activity, ak, that is a member of Aij, an optimal solution
■ Find the optimal solutions to Sik and Skj (that is, Aik and Akj)
■ Combine the results:

  Aij = Aik ∪ { ak } ∪ Akj

■ To solve the entire problem, S0,n+1, find A0,n+1

SLIDE 10

Recursive Solution

■ Let c[i, j] be the number of activities in a maximal-size subset of mutually compatible activities in Sij
■ Sij = ∅ ⇒ c[i, j] = 0
■ i ≥ j ⇒ c[i, j] = 0 (Why?)
■ c[i, j] = c[i, k] + 1 + c[k, j]
■ What is k?
  ❚ i + 1 ≤ k ≤ j − 1
  ❚ Check all these possible values for k (all j − i − 1 possibilities)
  ❚ c[i, j] = 0                                            if Sij = ∅
    c[i, j] = max { c[i, k] + 1 + c[k, j] : i < k < j }    if Sij ≠ ∅
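The recurrence for c[i, j] can be sketched as a memoized Java method. This is a hypothetical illustration, not code from the slides: the class name ActivityDP and the sentinel values used for the dummy activities a0 and an+1 are assumptions; the data is the 11-activity example with the dummies added at each end.

```java
// Hypothetical sketch of the DP recurrence (not from the slides):
// c[i][j] = max over i < k < j of c[i][k] + 1 + c[k][j], memoized.
public class ActivityDP {
    static int[] s, f;          // start and finish times, including dummies
    static Integer[][] memo;    // memo[i][j] caches c[i][j]

    static int c(int i, int j) {
        if (memo[i][j] != null) return memo[i][j];
        int best = 0; // c[i][j] = 0 when S_ij is empty
        for (int k = i + 1; k < j; k++) {
            // a_k is in S_ij only if it starts at or after a_i finishes
            // and finishes at or before a_j starts
            if (s[k] >= f[i] && f[k] <= s[j]) {
                best = Math.max(best, c(i, k) + 1 + c(k, j));
            }
        }
        return memo[i][j] = best;
    }

    public static void main(String[] args) {
        // Dummy a0 (f0 = 0) and a12 (s12 = "infinity") bracket the 11 activities
        s = new int[] {0, 1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12, Integer.MAX_VALUE};
        f = new int[] {0, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, Integer.MAX_VALUE};
        memo = new Integer[s.length][s.length];
        System.out.println(c(0, s.length - 1)); // size of a maximum-size subset: 4
    }
}
```

For the example data this prints 4, matching the maximal subsets { a1, a4, a8, a11 } and { a1, a4, a9, a11 } on the earlier slide.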

SLIDE 11

Making the Choice of k Easy

Theorem 16.1:

■ Let Sij ≠ ∅, and let am be the activity with the earliest finish time in Sij:

  fm = min { fk : ak ∈ Sij }

Then:
  1. am ∈ Aij for some maximum-size subset Aij
  2. Subproblem Sim = ∅

  ❚ The first item means that am is in a maximal-size subset of mutually compatible activities in Sij
  ❚ The second item means that of the two subproblems formed by choosing am, only Smj is nonempty

SLIDE 12

Proof of Item 1

■ Let Aij be a maximal-size subset of mutually compatible activities in Sij
■ Order the elements of Aij in monotonically increasing order of finish time
■ Let ak be the first element in Aij:
  ❚ If ak = am, then am ∈ Aij and we are done
  ❚ If ak ≠ am, then make a new set A′ij = Aij − { ak } ∪ { am }
  ❚ The activities in A′ij are mutually compatible because:
    ❙ The activities in Aij are mutually compatible
    ❙ ak is the first activity to finish in Aij
    ❙ fm ≤ fk
  ❚ Since |A′ij| = |Aij|, A′ij must also be a maximal-size subset of mutually compatible activities in Sij, and am ∈ A′ij
SLIDE 13

Proof of Item 2

■ Suppose Sim ≠ ∅
■ Sim ≠ ∅ ⇒ there exists activity ak such that fi ≤ sk < fk ≤ sm < fm
■ fk < fm ⇒ fm ≠ min { fk : ak ∈ Sij }, contradicting our choice of am

SLIDE 14

Value of Theorem 16.1

■ DP solution: two subproblems
■ DP solution: j − i − 1 choices for solving subproblem Sij
■ Greedy solution: one subproblem
■ Greedy solution: one choice for selecting the subproblem

SLIDE 15

Greedy Approach is Top-down

■ DP is naturally bottom-up:
  Recursively solve the subproblems to determine the best choice for am, then combine the results to obtain the solution to the overall problem
■ The greedy approach is naturally top-down:
  To solve Sij, choose am first, then solve the subproblem Smj

SLIDE 16

Being Greedy

■ In the activity selection problem, we always choose the activity with the earliest finish time that is mutually compatible with activities already in the solution
■ The choice is “greedy” since it leaves more activities to be considered
■ This maximizes the amount of unscheduled time so that more activities may be crammed into the solution

SLIDE 17

Recursive Activity Scheduler

First, some of the supporting code:

public class ActivitySelector {
    private static class Time {
        public int start;
        public int finish;

        public Time(int s, int f) {
            start = s;
            finish = f;
        }

        public String toString() {
            return "[" + start + "," + finish + "]";
        }
    }

    private static final int INFINITY = Integer.MAX_VALUE;
    // . . .
}

SLIDE 18

More Supporting Code

public class ActivitySelector {
    // . . .
    private static Time[] concat(Time[] array1, Time[] array2) {
        Time[] result = null; // Default value
        if ( array1 == null ) {
            result = array2;
        } else if ( array2 == null ) {
            result = array1;
        } else {
            result = new Time[array1.length + array2.length];
            System.arraycopy(array1, 0, result, 0, array1.length);
            System.arraycopy(array2, 0, result, array1.length, array2.length);
        }
        return result;
    }
    // . . .
}

SLIDE 19

The Method

public class ActivitySelector {
    // . . .
    private static Time[] recursiveActivitySelector(Time[] t, int i, int j) {
        int m = i + 1;
        // Search for an activity t[m] that is compatible with t[i]
        while ( m < j && t[m].start < t[i].finish ) {
            m++;
        }
        if ( m < j ) {
            // Add the activity, if one was found
            return concat(new Time[] { t[m] },
                          recursiveActivitySelector(t, m, j));
        } else {
            // None was found
            return null;
        }
    }
    // . . .
}

SLIDE 20

Running Time

■ The method recursiveActivitySelector() assumes that the array of activities is ordered by monotonically increasing finish time
■ If it is not sorted, it can be sorted in O(n lg n) time
■ The recursiveActivitySelector() method itself runs in Θ(n) time
  ❚ Each activity is considered exactly once, in the while loop
  ❚ A recursive call never examines an activity that has already been considered

SLIDE 21

Eliminating the Recursion

■ The method recursiveActivitySelector() uses tail recursion
  ❚ Tail recursive methods can be easily rewritten to use iteration instead of recursion
  ❚ Compilers/interpreters for some programming languages automatically transform tail recursive constructs into iterative form
  ❚ Use a local variable to maintain the cumulative computation that the stack normally would handle implicitly
  ❚ The iterative version is more efficient in practice in terms of time and space

SLIDE 22

The Iterative Version

public class ActivitySelector {
    // . . .
    private static Time[] greedyActivitySelector(Time[] t) {
        int n = t.length - 1;
        Time[] result = new Time[] { t[1] };
        int i = 1;
        // Search for an activity t[m] that is compatible with t[i]
        for ( int m = 2; m <= n; m++ ) {
            if ( t[m].start >= t[i].finish ) { // t[m] compatible?
                result = concat(result, new Time[] { t[m] });
                i = m;
            }
        }
        return result;
    }
    // . . .
}

SLIDE 23

Summary of the Greedy Strategy

Just like in DP, a choice must be made. In a greedy algorithm, the choice that seems best at the time (the greedy choice) is chosen:

  1. Find optimal substructure of the problem
  2. Devise a recursive solution
  3. Prove that at any stage of the recursion the greedy choice is an optimal choice
  4. Show that all but one of the subproblems induced by the greedy choice are empty
  5. Create a recursive algorithm that implements the greedy strategy
  6. Convert the recursive algorithm into an iterative algorithm

SLIDE 24

DP → Greedy Algorithm

Greedy algorithms are a restricted application of the DP approach:

  1. Express the optimization problem in a way so its solution consists of making a choice and then solving a remaining subproblem
  2. Prove that a greedy choice will produce an optimal solution
  3. Show that making the greedy choice leaves a subproblem whose solution combined with the greedy choice constitutes an optimal solution to the original problem

Some DP problems can be solved with the greedy approach

SLIDE 25

Key Greedy Algorithm Ingredients

■ Greedy-choice property
■ Optimal substructure

In general, there is no test that can show if a given optimization problem can be solved by a greedy algorithm

SLIDE 26

Greedy-choice Property

A locally optimal (greedy) choice ⇒ a globally optimal solution

■ Consider the best choice for the given problem without considering the results of subproblems
■ In DP, the choice depends on the results obtained from subproblem solutions
■ DP is therefore bottom-up
■ Greedy algorithms are top-down

SLIDE 27

Establishing the Greedy-choice Property

■ Must prove that the greedy choice at each step yields a globally optimal solution
  This requires some clever insight
■ For the choice to be made efficiently, we often order the data in some special way (as in the Activity Selection problem)
  A priority queue, for example

SLIDE 28

Optimal Substructure

■ A problem exhibits optimal substructure if an optimal solution contains optimal solutions to subproblems
■ To separate greedy algorithms from DP, the greedy choice leaves us with a subproblem:
  ❚ We must show that the greedy choice plus an optimal solution to that subproblem yields a globally optimal solution
  ❚ This requires an inductive proof, in one form or another

SLIDE 29

Is Dynamic Programming Necessary?

Can a greedy algorithm be devised for all problems solvable by DP? No.
To see why, consider two problems:

■ 0-1 knapsack problem
■ Fractional knapsack problem

SLIDE 30

The 0-1 Knapsack Problem

■ n items
■ Each item is indivisible
■ Item i is worth vi dollars
■ Item i weighs wi kilograms
■ A knapsack (backpack) can hold W kilograms
■ Stuff the knapsack with as many items as possible to maximize the value of the contents

SLIDE 31

The Fractional Knapsack Problem

Same as the 0-1 knapsack problem, except that fractions of items can be selected.
0-1 means the decision is binary: you either take the item or you do not.
In the 0-1 knapsack problem, think of gold bars; in the fractional knapsack problem, think of boxes containing gold dust.
Which one yields to a greedy algorithm, and which does not?

SLIDE 32

Optimal Substructure

Both knapsack problems exhibit the optimal substructure property:

■ 0-1:
  ❚ Consider the most valuable load that weighs at most W kilograms
  ❚ Remove item j from the knapsack
  ❚ The remaining load must be the most valuable load of items (excluding item j) that weighs at most W − wj (Why?)
■ Fractional:
  ❚ Consider the most valuable load that weighs at most W kilograms
  ❚ Remove w kilograms of one item j from the knapsack
  ❚ The remaining load must be the most valuable load weighing at most W − w that can be taken from the other items plus the remaining wj − w kilograms of item j (Why?)

SLIDE 33

Solving the Fractional Knapsack Problem

■ Compute the value per kilogram of each item, vi / wi
■ Take as much as possible of the item with the greatest value-to-weight ratio
■ If this item is exhausted and the knapsack can hold more, choose the item with the next highest value-to-weight ratio
■ Continue this strategy until the knapsack can hold no more

Does this sound like a greedy algorithm?
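The strategy above can be sketched in Java. This is a hypothetical illustration, not code from the slides: the class name FractionalKnapsack and its method signature are assumptions; the sort is the O(n lg n) step and the scan the O(n) step discussed on a later slide.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical sketch of the greedy fractional-knapsack strategy (not from
// the slides): sort by value-to-weight ratio, then take as much as fits.
public class FractionalKnapsack {
    static double maxValue(int[] value, int[] weight, int capacity) {
        Integer[] idx = new Integer[value.length];
        for (int i = 0; i < idx.length; i++) idx[i] = i;
        // Sort item indices by decreasing value-to-weight ratio: O(n lg n)
        Arrays.sort(idx, Comparator.comparingDouble(i -> -(double) value[i] / weight[i]));
        double total = 0.0;
        int remaining = capacity;
        for (int i : idx) {                       // O(n) greedy choices
            if (remaining == 0) break;
            int take = Math.min(weight[i], remaining);      // possibly a fraction
            total += (double) value[i] * take / weight[i];  // of item i
            remaining -= take;
        }
        return total;
    }

    public static void main(String[] args) {
        // The items from the upcoming example: $60/10 kg, $100/20 kg, $120/30 kg, W = 50
        System.out.println(maxValue(new int[] {60, 100, 120},
                                    new int[] {10, 20, 30}, 50)); // 240.0
    }
}
```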

SLIDE 34

Example

[Figure: a knapsack of capacity 50 kg and three items: item 1 (10 kg, $60), item 2 (20 kg, $100), item 3 (30 kg, $120)]

How would you fill the knapsack?

SLIDE 35

Solution

[Figure: take all of item 1 (10 kg at $6/kg = $60), all of item 2 (20 kg at $5/kg = $100), and 20 kg of item 3 (at $4/kg = $80), for a total of $60 + $100 + $80 = $240]

Sorting the value-to-weight ratios takes O(n lg n) time, and the choices thereafter take O(n) time, yielding an O(n lg n) algorithm

SLIDE 36

Greedy Algorithm for 0-1 Knapsack Problem?

[Figure: the same knapsack of capacity 50 kg and items: item 1 (10 kg, $60), item 2 (20 kg, $100), item 3 (30 kg, $120)]

How would you fill the knapsack?

SLIDE 37

Greedy Algorithm for 0-1 Knapsack Problem?

[Figure: greedy by value-to-weight ratio takes item 1 first: item 1 + item 2 = $160, item 1 + item 3 = $180; the optimal load is item 2 + item 3 = $220]

Great for the fractional problem, not for 0-1
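The gap between the greedy load and the optimum can be checked by brute force over all subsets. This is a hypothetical sketch, not from the slides; the class name and method are assumptions.

```java
// Hypothetical sketch (not from the slides): find the true 0-1 optimum by
// enumerating all 2^n subsets, to compare against the ratio-greedy choice.
public class ZeroOneKnapsack {
    static int best(int[] v, int[] w, int cap) {
        int n = v.length, best = 0;
        for (int mask = 0; mask < (1 << n); mask++) { // every subset of items
            int value = 0, weight = 0;
            for (int i = 0; i < n; i++) {
                if ((mask & (1 << i)) != 0) {
                    value += v[i];
                    weight += w[i];
                }
            }
            if (weight <= cap) best = Math.max(best, value);
        }
        return best;
    }

    public static void main(String[] args) {
        int[] v = {60, 100, 120}, w = {10, 20, 30};
        // Ratio-greedy takes item 1 first (highest $/kg), reaching only $160;
        // the optimum skips item 1 entirely and takes items 2 and 3
        System.out.println(best(v, w, 50)); // 220
    }
}
```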

SLIDE 38

0-1 Knapsack Problem

■ The empty space left over in the knapsack lowers the effective value per kilogram
■ When considering the inclusion of an item, we must consider the two resulting subproblems:
  ❚ the result when the item is added
  ❚ the result when the item is not added
■ There obviously will be multiple overlapping subproblems, as we saw in DP
■ DP can be used to solve the 0-1 knapsack problem

SLIDE 39

Character Encoding

■ Consider a data file consisting of 100,000 characters
■ The characters a, b, c, d, e, and f, in various quantities each, are stored in the file
■ For example, the file might begin:

  cffeadabeeab . . .

What is the minimum amount of storage required to hold the given data?

SLIDE 40

“Standard” Approach

■ Use three bits to represent the six different characters:

  Character  Code
  a          000
  b          001
  c          010
  d          011
  e          100
  f          101

■ 100,000 characters × 3 bits = a 300,000-bit file

Can we do better?

SLIDE 41

Fixed-length Encoding

■ The “standard” approach uses fixed-length encoding
■ Each character code uses the same number of bits
■ No delimiters needed; just grab the next three bits to get the next character

SLIDE 42

Variable-length Encoding

If we have a statistical profile of the data, we can do better with a variable-length encoding scheme:

■ more frequently appearing characters use fewer bits
■ less frequently occurring characters use more bits

SLIDE 43

Character Frequencies for Our Data

  Character  Frequency  Code
  a          45%        0
  b          13%        101
  c          12%        100
  d          16%        111
  e           9%        1101
  f           5%        1100

  45,000 a's in the file × 1 bit each  =  45,000 bits
  13,000 b's in the file × 3 bits each =  39,000 bits
  12,000 c's in the file × 3 bits each =  36,000 bits
  16,000 d's in the file × 3 bits each =  48,000 bits
   9,000 e's in the file × 4 bits each =  36,000 bits
   5,000 f's in the file × 4 bits each =  20,000 bits
                                         224,000 bits for the file

SLIDE 44

Storage Savings

224,000 / 300,000 ≈ 75%

Variable-length encoding reduces the file size by about 25%

SLIDE 45

Prefix Codes

■ Since the characters have variable bit lengths, do we now need delimiters? How do we know to grab one, three, or four bits for the next character?
■ No code word must be the prefix of any other code word
■ Such codes are called prefix codes
■ The file

  001011101

  would be decoded unambiguously as

  0 · 0 · 101 · 1101 = a a b e
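The unambiguous decoding above can be sketched in Java. This is a hypothetical illustration, not code from the slides: the class name PrefixDecoder and the map-based code table are assumptions; the codes are those of the earlier frequency table.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch (not from the slides): decode a bit string against a
// prefix code by repeatedly matching the code word that begins the remaining
// bits. Because no code word is a prefix of another, at most one can match.
public class PrefixDecoder {
    static String decode(String bits, Map<String, Character> codes) {
        StringBuilder out = new StringBuilder();
        int pos = 0;
        while (pos < bits.length()) {
            boolean matched = false;
            for (Map.Entry<String, Character> e : codes.entrySet()) {
                if (bits.startsWith(e.getKey(), pos)) {
                    out.append(e.getValue());
                    pos += e.getKey().length();
                    matched = true;
                    break;
                }
            }
            if (!matched) {
                throw new IllegalArgumentException("invalid encoding at bit " + pos);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The variable-length code from the frequency table
        Map<String, Character> codes = new LinkedHashMap<>();
        codes.put("0", 'a');
        codes.put("101", 'b');
        codes.put("100", 'c');
        codes.put("111", 'd');
        codes.put("1101", 'e');
        codes.put("1100", 'f');
        System.out.println(decode("001011101", codes)); // aabe
    }
}
```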

SLIDE 46

Binary Tree Representation

■ A binary tree can be used to conveniently represent prefix codes
■ Leaves represent the characters
■ The bit string encodes a path to a leaf

[Figure: two code trees over the leaves a:45, b:13, c:12, d:16, e:9, f:5; the fixed-length code tree (internal node weights 14, 28, 58, 86, 100) and the variable-length code tree (internal node weights 14, 25, 30, 55, 100)]

SLIDE 47

Binary Tree Representation

[Figure: the same two code trees, for the fixed-length and variable-length codes]

■ An optimal code for a file is represented by a full binary tree
  In a full binary tree, every non-leaf node has two children
■ The fixed-length code is non-optimal
■ The given variable-length code may be optimal (as it does form a full binary tree)

SLIDE 48

Binary Tree Representation

Given a full binary tree T representing a prefix code:

■ C is the alphabet of characters
■ T has |C| leaves (for an optimal code), one leaf for each character
■ T has |C| − 1 internal nodes
■ If c ∈ C, then f(c) is the frequency of c in the file
■ If c ∈ C, then dT(c) is the depth of c’s leaf in the tree
  dT(c) is also the length of c’s code word
■ The number of bits required to encode a file given tree T is

  B(T) = Σ_{c ∈ C} f(c) · dT(c)

■ B(T) is called the cost of T
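The cost B(T) is a plain weighted sum, which can be checked for the earlier variable-length code. This is a hypothetical sketch, not from the slides; the class name and arrays are assumptions taken from the frequency table.

```java
// Hypothetical sketch (not from the slides): compute B(T) = sum of
// f(c) * dT(c) over the alphabet, for the earlier variable-length code.
public class TreeCost {
    static long cost(int[] freq, int[] depth) {
        long total = 0;
        for (int i = 0; i < freq.length; i++) {
            total += (long) freq[i] * depth[i]; // f(c) * dT(c)
        }
        return total;
    }

    public static void main(String[] args) {
        // a: 45,000 chars at depth 1; b, c, d at depth 3; e, f at depth 4
        int[] freq  = {45_000, 13_000, 12_000, 16_000, 9_000, 5_000};
        int[] depth = {1, 3, 3, 3, 4, 4};
        System.out.println(cost(freq, depth)); // 224000
    }
}
```

This reproduces the 224,000-bit figure from the frequency-table slide.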

SLIDE 49

Huffman Codes

■ David Huffman in 1952 devised an algorithm that creates optimal prefix codes
■ Each character is wrapped in an object with the character’s associated frequency
■ The two objects with the lowest frequencies are merged into an internal node object with their combined frequencies; this new object replaces the two objects it combined
■ This process continues until only one object remains: the internal node which is the root of the tree
  |C| − 1 merging operations are required to build the Huffman tree
■ Huffman’s algorithm uses a min-priority queue to identify the two nodes to merge into a new internal node

SLIDE 50

The huffman() Method

private static HuffmanNode huffman(HuffmanNode[] C) {
    int n = C.length;
    MinHeap Q = new MinHeap(n, C);
    Q.printAsHeap();
    Q.print();
    for ( int i = 0; i < n - 1; i++ ) {
        HuffmanNode left = Q.extractMinimum(),
                    right = Q.extractMinimum();
        InteriorNode z = new InteriorNode(left.getFrequency() + right.getFrequency(),
                                          left, right);
        Q.insert(z);
        Q.print();
    }
    return Q.extractMinimum();
}

SLIDE 51

The huffman() Method

■ A HuffmanNode is an object that can be a node in a Huffman tree
  ❚ Subclass CharNode is an object that holds a character and its associated frequency (it is a leaf node)
  ❚ Subclass InteriorNode is an interior Huffman tree node that contains references to left and right children
■ C is a set (array) of HuffmanNodes
■ Q is a min-heap of HuffmanNode objects

[Class diagram: HuffmanNode with subclasses CharNode and InteriorNode]

SLIDE 52

Time Complexity of huffman()

■ The creation of the min-heap Q

      int n = C.length;
      MinHeap Q = new MinHeap(n, C);

  requires O(n) time
■ The for loop is executed n − 1 times

      for ( int i = 0; i < n - 1; i++ ) {
          HuffmanNode left = Q.extractMinimum(),
                      right = Q.extractMinimum();
          InteriorNode z = new InteriorNode(left.getFrequency() + right.getFrequency(),
                                            left, right);
          Q.insert(z);
      }

  and each heap operation (extractMinimum() and insert()) requires O(lg n) time
■ Thus, for n characters, huffman() takes O(n lg n) time
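The same O(n lg n) merging loop can be sketched with java.util.PriorityQueue in place of the custom MinHeap. This is a hypothetical illustration, not code from the slides: it computes only the cost B(T), using the fact that each merge deepens both merged subtrees by one level, so summing the merged frequencies yields B(T).

```java
import java.util.PriorityQueue;

// Hypothetical sketch (not from the slides): compute the optimal prefix-code
// cost by Huffman merging with java.util.PriorityQueue. Each of the |C| - 1
// merges adds the combined frequency to the running total, which equals B(T).
public class HuffmanCost {
    static long cost(long[] freq) {
        PriorityQueue<Long> q = new PriorityQueue<>();
        for (long f : freq) q.add(f);
        long total = 0;
        while (q.size() > 1) {                 // |C| - 1 merging operations
            long merged = q.poll() + q.poll(); // two lowest frequencies
            total += merged;                   // both subtrees get one level deeper
            q.add(merged);
        }
        return total;
    }

    public static void main(String[] args) {
        // Frequencies (in thousands of characters) from the earlier example
        System.out.println(cost(new long[] {45, 13, 12, 16, 9, 5})); // 224
    }
}
```

The result, 224 thousand bits, matches the cost of the variable-length code computed earlier.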

SLIDE 53

Correctness of huffman()

To prove huffman() correct, we must show that the two key greedy algorithm properties hold:

■ the greedy-choice property
■ the optimal-substructure property

SLIDE 54

Greedy-choice Property

Lemma 16.2: Let C be an alphabet in which each character c ∈ C has frequency f(c). Let x and y be two characters in C having the lowest frequencies. Then there exists an optimal prefix code for C in which the code words for x and y have the same length and differ only in the last bit.

Why is this lemma significant?

SLIDE 55

Significance of Lemma 16.2

■ We must be able to take a tree T which represents an arbitrary optimal prefix code and modify it to produce a new tree that represents another optimal prefix code and has characters x and y as sibling leaves of maximum depth
■ The code words for x and y will thus have the same length and differ only in the last bit
■ If we can do this, since the Huffman tree is built in a bottom-up manner merging nodes with the lowest frequencies, we can legitimately make the greedy choice (the two lowest-frequency nodes) and build a tree representing a correct optimal prefix code

SLIDE 56

Proof of Lemma 16.2

■ Let a and b be two characters that are sibling leaves of maximum depth in T
■ Without loss of generality, assume f(a) ≤ f(b) and f(x) ≤ f(y)
■ It follows that f(x) ≤ f(a) (since x has the lowest frequency) and f(y) ≤ f(b) (since y has the next lowest frequency after x)
■ Transform T to produce T′ by exchanging the positions of x and a
■ Next, transform T′ to produce T″ by exchanging the positions of y and b
SLIDE 57

Proof of Lemma 16.2

[Figure: tree T with x and y at shallow depth and siblings a and b at maximum depth; tree T′ after exchanging x and a; tree T″ after also exchanging y and b, leaving x and y as sibling leaves of maximum depth]

Do these transformations produce a non-optimal tree?

SLIDE 58

Comparing the Costs of T, T′, and T″

B(T) − B(T′)
  = Σ_{c ∈ C} f(c) dT(c) − Σ_{c ∈ C} f(c) dT′(c)
  = f(x) dT(x) + f(a) dT(a) − f(x) dT′(x) − f(a) dT′(a)
  = f(x) dT(x) + f(a) dT(a) − f(x) dT(a) − f(a) dT(x)
  = (f(a) − f(x)) (dT(a) − dT(x))
  ≥ 0

since

■ f(x) ≤ f(a) (since x has minimum frequency)
■ dT(x) ≤ dT(a) (since a is a leaf at maximum depth in T)

Similar analysis yields B(T′) − B(T″) ≥ 0
SLIDE 59

Comparing the Costs of T, T′, and T″

■ B(T) − B(T′) ≥ 0 and B(T′) − B(T″) ≥ 0, so B(T″) ≤ B(T)
■ Since T is optimal, B(T) ≤ B(T″)
■ B(T″) ≤ B(T) and B(T) ≤ B(T″) ⇒ B(T″) = B(T)
■ Thus, T″ is an optimal tree in which x and y appear as sibling leaves at maximum depth

This proves Lemma 16.2

SLIDE 60

So, What have We Accomplished?

■ Lemma 16.2 shows that we can begin building the Huffman tree by making the greedy choice: merging together the two characters with the lowest frequencies
■ This merger choice produces an internal node with the lowest cost

SLIDE 61

Optimal-substructure Property

Lemma 16.3: Let C be an alphabet in which each character c ∈ C has frequency f(c). Let x and y be two characters in C having the lowest frequencies. Let C′ be the alphabet with characters x and y removed and new character z added, so that C′ = C − { x, y } ∪ { z }. Define f for C′ as for C, except that f(z) = f(x) + f(y). Let T′ be any tree representing an optimal prefix code for the alphabet C′. Then the tree T, obtained from T′ by replacing the leaf node for z with an internal node having x and y as children, represents an optimal prefix code for the alphabet C.

Why is this lemma significant?

SLIDE 62

Proof of Lemma 16.3

■ For each c ∈ C − { x, y }, dT(c) = dT′(c)
■ Thus, f(c) dT(c) = f(c) dT′(c), for c ∈ C − { x, y }
■ dT(x) = dT(y) = dT′(z) + 1
■ So,

  f(x) dT(x) + f(y) dT(y)
    = (f(x) + f(y)) (dT′(z) + 1)
    = (f(x) + f(y)) dT′(z) + (f(x) + f(y))
    = f(z) dT′(z) + (f(x) + f(y))
SLIDE 63

Proof of Lemma 16.3 (cont.)

So we have

  f(x) dT(x) + f(y) dT(y) = f(z) dT′(z) + (f(x) + f(y))

which implies

  B(T) = B(T′) + f(x) + f(y)

or, equivalently,

  B(T′) = B(T) − f(x) − f(y)
SLIDE 64

Proof of Lemma 16.3 (cont.)

Proceed by contradiction

■ Suppose T does not represent an optimal prefix code for C
■ Therefore, there exists a tree T″ such that B(T″) < B(T)
■ Using the results of Lemma 16.2 and without loss of generality, let x and y be siblings in T″ (recall that x and y have the lowest frequencies in C)
■ Let T‴ be tree T″ with the common parent of x and y replaced with leaf z with frequency f(z) = f(x) + f(y)
■ It follows that

  B(T‴) = B(T″) − f(x) − f(y)
        < B(T) − f(x) − f(y)
        = B(T′)

SLIDE 65

Proof of Lemma 16.3 (cont.)

So B(T‴) < B(T′)

■ This contradicts the assumption that T′ represents an optimal prefix code for C′
■ T therefore must represent an optimal prefix code for C
