Lecture 14: Greedy algorithms!


slide-1
SLIDE 1

Lecture 14

Greedy algorithms!

slide-2
SLIDE 2

Announcements

  • HW6 Due Friday!
  • TONS OF PRACTICE ON DYNAMIC PROGRAMMING
  • Sometimes I have hidden slides in PowerPoint
  • These are usually rough drafts or ways I was thinking of

presenting things that I didn’t end up going with.

  • I just realized that when I export to PDF the hidden

slides show up with no distinction from normal slides.

  • I tried to clear these out, but if you find a slide that looks

like it doesn’t belong or is full of garbage, let me know.

slide-3
SLIDE 3

Last week

slide-4
SLIDE 4

This week

  • Greedy algorithms!
  • Builds on our ideas from dynamic programming
slide-5
SLIDE 5

Today

  • Three examples of greedy algorithms:
  • Activity Selection
  • Job Scheduling
  • Huffman Coding
slide-6
SLIDE 6

Example

Activity selection

Frisbee Practice, Orchestra, CS161 study group, Sleep, CS110 Class, Theory Lunch, Theory Seminar, Combinatorics Seminar, Underwater basket weaving class, Math 51 Class, CS 161 Class, CS 166 Class, CS 161 Section, CS 161 Office Hours, Swimming lessons, Programming team meeting, Social activity time

You can only do one activity at a time, and you want to maximize the number of activities that you do.

What to choose?

slide-7
SLIDE 7

Activity selection

  • Input:
  • Activities a1, a2, …, an
  • Start times s1, s2, …, sn
  • Finish times f1, f2, …, fn
  • Output:
  • How many activities can you do today?
slide-8
SLIDE 8

Greedy Algorithm

[Figure: activities a1–a7 laid out on a timeline]

  • Pick the activity you can add
  • that has the smallest finish time.
  • Include it in your activity list.
  • Repeat.
slide-16
SLIDE 16

That seems like a reasonable thing to do…

  • Running time:
  • O(n) if the activities are already sorted by finish time.
  • Otherwise O(n log n), if you have to sort them first.
  • Does it work?
  • We’ll see soon.
slide-17
SLIDE 17

This is an example of a greedy algorithm

  • At each step in the algorithm, make a choice.
  • Hey, I can increase my activity set by one,
  • And leave lots of room for future choices,
  • Let’s do that and hope for the best!!!
  • Hope that at the end of the day, this results in a

globally optimal solution.

slide-18
SLIDE 18

Three questions

  • 1. Does this greedy algorithm for activity selection work?
  • 2. In general, when are greedy algorithms a good idea?
  • 3. The “greedy” approach is often the first you’d think of…
  • Why are we getting to it now, in Week 8?
slide-19
SLIDE 19

Answers

  • 1. Does this greedy algorithm for activity selection work?
  • Yes.
  • 2. In general, when are greedy algorithms a good idea?
  • When they exhibit especially nice optimal substructure.
  • 3. The “greedy” approach is often the first you’d think of…
  • Why are we getting to it now, in Week 8?
  • Like dynamic programming!
  • (Which we did in Week 7).
  • Proving that greedy algorithms work is often not so easy.
slide-20
SLIDE 20

DP view of activity selection

slide-21
SLIDE 21

Recipe for applying Dynamic Programming

  • Step 1: Identify optimal substructure.
  • Step 2: Find a recursive formulation for the value of

the optimal solution.

  • Step 3: Use dynamic programming to find the value of the optimal solution.
  • Step 4: If needed, keep track of some additional

info so that the algorithm from Step 3 can find the actual solution.

  • Step 5: If needed, code this up like a reasonable

person.

slide-22
SLIDE 22

Optimal substructure

  • Subproblems:
  • A[i,j] = number of activities you can squeeze in after

Activity i finishes and before Activity j starts

[Figure: activities on a timeline]

slide-23
SLIDE 23

Optimal substructure

  • Subproblems:
  • A[i,j] = number of activities you can squeeze in after

Activity i finishes and before Activity j starts

  • A[5,7] = solution to this subproblem

[Figure: activities on a timeline; A[5,7] covers the gap after a5 finishes and before a7 starts]

slide-24
SLIDE 24

Recipe for applying Dynamic Programming

  • Step 1: Identify optimal substructure.
  • Step 2: Find a recursive formulation for the value of

the optimal solution.

  • Step 3: Use dynamic programming to find the value of the optimal solution.
  • Step 4: If needed, keep track of some additional

info so that the algorithm from Step 3 can find the actual solution.

  • Step 5: If needed, code this up like a reasonable

person.

slide-25
SLIDE 25

This satisfies a nice recursive relationship

  • A[i,j] = max_k { A[i,k] + 1 + A[k,j] }
  • The maximum is over all k so that Activity k fits in

between Activities i and j.

[Figure: activity ak fits between ai and aj on the timeline]

A[i,j] = number of activities you can squeeze in after Activity i finishes and before Activity j starts


slide-27
SLIDE 27

We could turn this into a DP algorithm

  • Would take time something like O(n³):
  • Fill out an n-by-n table.
  • For each entry, search over maybe n possibilities for k.
  • But this would be wasteful!
  • We just saw an algorithm that takes time O(n log n), if it’s correct…

Try it! It builds character!
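If you do try it, the O(n³) dynamic program sketched above could look something like this. This is a hypothetical sketch (function and variable names are mine): A(i, j) is the maximum number of activities that fit after activity i finishes and before activity j starts, with sentinel activities 0 and n+1 playing the roles described on the next slide.

```python
from functools import lru_cache

def max_activities_dp(starts, finishes):
    """A(i, j) = max number of activities that fit strictly between
    the finish of activity i and the start of activity j."""
    n = len(starts)
    # Sentinels: activity 0 finished "yesterday", activity n+1 starts "tomorrow".
    s = [float("-inf")] + list(starts) + [float("inf")]
    f = [float("-inf")] + list(finishes) + [float("inf")]

    @lru_cache(maxsize=None)
    def A(i, j):
        best = 0
        for k in range(1, n + 1):
            # Activity k must fit between the finish of i and the start of j.
            if s[k] >= f[i] and f[k] <= s[j]:
                best = max(best, A(i, k) + 1 + A(k, j))
        return best

    return A(0, n + 1)
```

Each of the O(n²) subproblems scans over n candidate activities k, giving the O(n³) bound from the slide.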

slide-28
SLIDE 28

The thing that’s wasteful

  • Actually, we should know in advance what

subproblem to look at.

  • Lemma:
  • Suppose that k is the activity you can squeeze in

after i with the smallest finishing time.

  • Then there is an optimal solution to A[i..n+1]

that extends the optimal solution to A[k..n+1].

Let’s add an additional activity an+1 that starts “tomorrow”.

A[i,j] = number of activities you can squeeze in after Activity i finishes and before Activity j starts

This is abusing notation…technically A[i..n+1] is a value, not a problem.

slide-29
SLIDE 29

Lemma

  • Suppose that k is the activity you can squeeze in

after i with the smallest finishing time.

  • Then there is an optimal solution to A[i..n+1]

that extends the optimal solution to A[k..n+1].

[Figure: activities on a timeline; ak is the activity after ai with the smallest finish time]

A[i,j] = number of activities you can squeeze in after Activity i finishes and before Activity j starts


slide-31
SLIDE 31

Proof

  • Suppose that k is the activity you can squeeze in

after i with the smallest finishing time.

  • Then there is an optimal solution to A[i..n+1]

that extends the optimal solution to A[k..n+1].

[Figure: activities on a timeline]

slide-32
SLIDE 32

Proof

[Figure: activities on a timeline]

  • Suppose that this is an optimal solution to A[i..n+1]
  • Doesn’t involve ak


  • Suppose that k is the activity you can squeeze in

after i with the smallest finishing time.

  • Then there is an optimal solution to A[i..n+1]

that extends the optimal solution to A[k..n+1].

slide-33
SLIDE 33

Proof

[Figure: activities on a timeline]

  • Suppose that this is an optimal solution to A[i..n+1]
  • Doesn’t involve ak
  • Swap ak in for whatever had the smallest finishing

time in that solution.


  • Suppose that k is the activity you can squeeze in

after i with the smallest finishing time.

  • Then there is an optimal solution to A[i..n+1]

that extends the optimal solution to A[k..n+1].


slide-35
SLIDE 35

Proof

[Figure: an optimal schedule of activities after ai]

  • Suppose that this is an optimal solution to A[i..n+1]
  • Doesn’t involve ak
  • Swap ak in for whatever had the smallest finishing

time in that solution.

  • This is still a legit schedule, and it involves ak


  • Suppose that k is the activity you can squeeze in

after i with the smallest finishing time.

  • Then there is an optimal solution to A[i..n+1]

that extends the optimal solution to A[k..n+1].


slide-37
SLIDE 37

This means that DP would have been wasteful.

[Figure: the (n+1)-by-(n+1) DP table A[i,j]]


slide-43
SLIDE 43

This means that DP would have been wasteful.

[Figure: the (n+1)-by-(n+1) DP table A[i,j]]

A[0,n+1] is the return value we wanted. We should know ahead of time that it only depends on A[2,n+1], etc. There’s no reason we have to look at the whole table!

slide-44
SLIDE 44

Instead, let’s use this insight to make a greedy algorithm

  • Suppose the activities are sorted by finishing time
  • if not, sort them.
  • mySchedule = []
  • for k = 1,…,n:
  • if I can fit in Activity k after the last thing in mySchedule:
  • mySchedule.append(Activity k)
  • return mySchedule

This is the same thing we saw before
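Fleshed out as runnable Python (a sketch; the function name and the (start, finish) tuple representation are my own):

```python
def greedy_activity_selection(activities):
    """activities: list of (start, finish) pairs.
    Returns a largest set of pairwise non-overlapping activities."""
    schedule = []
    last_finish = float("-inf")
    # Sort by finish time: the O(n log n) step (skip it if already sorted).
    for start, finish in sorted(activities, key=lambda a: a[1]):
        # Greedy choice: smallest finish time among activities we can still add.
        if start >= last_finish:
            schedule.append((start, finish))
            last_finish = finish
    return schedule
```

After sorting, the loop is a single O(n) pass, matching the running time claimed earlier.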

slide-53
SLIDE 53

Why does this work?

  • At each step, we make a choice
  • Include activity k
  • We can show that this choice will never rule out an optimal solution.
  • Formally: There is an optimal solution to A[i..n+1] that contains A[k..n+1].

  • So when we reach the end of the algorithm:
  • we haven’t ruled out an optimal solution
  • and we only have one solution left
  • so it must be optimal.
slide-54
SLIDE 54

Answers

  • 1. Does this greedy algorithm for activity selection work?
  • Yes.
  • 2. In general, when are greedy algorithms a good idea?
  • When they exhibit especially nice optimal substructure.
  • 3. The “greedy” approach is often the first you’d think of…
  • Why are we getting to it now, in Week 8?
  • Like dynamic programming!
  • (Which we did in Week 7).
  • Proving that greedy algorithms work is often not so easy.

In particular, when each big problem depends on only one sub-problem.

slide-55
SLIDE 55

Sub-problem graph view

  • Divide-and-conquer:

[Figure: the big problem splits into sub-problems, each splitting into sub-sub-problems]

slide-56
SLIDE 56

Sub-problem graph view

  • Dynamic Programming:

[Figure: the big problem splits into sub-problems, with sub-sub-problems shared between them]

slide-57
SLIDE 57

Sub-problem graph view

  • Greedy algorithms:

[Figure: a single chain: big problem, then one sub-problem, then one sub-sub-problem]

slide-58
SLIDE 58

Sub-problem graph view

  • Greedy algorithms:

[Figure: a single chain: big problem, then one sub-problem, then one sub-sub-problem]

  • Not only is there optimal sub-structure:
  • optimal solutions to a problem are made up

from optimal solutions of sub-problems

  • but each problem depends on only one

sub-problem.

slide-59
SLIDE 59

What have we learned?

  • If we come up with a DP solution, and it turns out

that we really only care about one sub-problem, then maybe we can use a greedy algorithm.

  • One example was activity selection.
  • In order to come up with a greedy algorithm, we:
  • Made a series of choices
  • Proved that our choices will never rule out an optimal

solution.

  • Conclude that our solution at the end is optimal.
slide-60
SLIDE 60

Let’s see a few more examples

slide-61
SLIDE 61

Another example:

Scheduling

Overcommitted Stanford Student CS161 HW! Call your parents! Math HW! Econ HW! Practice musical instrument! Read CLRS! Have a social life! Sleep! Administrative stuff for your student club! Do laundry! Meditate!

slide-62
SLIDE 62

Scheduling

  • n tasks
  • Task i takes ti hours
  • Everything is already late!
  • For every hour that passes until task i is done, pay ci
  • CS161 HW, then Sleep: costs 10 ⋅ 2 + (10 + 8) ⋅ 3 = 74 units
  • Sleep, then CS161 HW: costs 8 ⋅ 3 + (10 + 8) ⋅ 2 = 60 units

CS161 HW: 10 hours, cost 2 units per hour until it’s done. Sleep: 8 hours, cost 3 units per hour until it’s done.
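The two costs above can be checked directly. A small sketch (the function name `schedule_cost` is mine):

```python
def schedule_cost(jobs):
    """jobs: list of (hours, cost_per_hour), performed in the given order.
    Each job is charged its per-hour cost for every hour until it finishes."""
    elapsed = 0
    total = 0
    for hours, cost_per_hour in jobs:
        elapsed += hours                  # this job finishes at time `elapsed`
        total += elapsed * cost_per_hour  # pay for every hour until then
    return total

cs161_hw = (10, 2)  # 10 hours, 2 units per hour
sleep = (8, 3)      # 8 hours, 3 units per hour
```

With these inputs, `schedule_cost([cs161_hw, sleep])` gives 74 and `schedule_cost([sleep, cs161_hw])` gives 60, matching the slide.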

slide-63
SLIDE 63

Optimal substructure

  • This problem breaks up nicely into sub-problems:

Job A Job B Job C Job D Suppose this is the optimal schedule:

Then this must be the optimal schedule on just jobs A and B.

slide-64
SLIDE 64

How to use this optimal sub-structure to design a greedy algorithm?

  • We make a series of choices.
  • We show that, at each step, our choice won’t rule out an optimal solution at the end of the day.
  • After we’ve made all our choices, we haven’t ruled out an optimal solution, so we must have found one.

Job A Job B Job C Job D Of all these jobs, which one(s) is it safe to choose first? Which won’t rule out an optimal solution?

slide-65
SLIDE 65

Head-to-head

  • Of these two jobs, which should we do first?
  • Cost( A then B ) = x ⋅ z + (x + y) ⋅ w
  • Cost( B then A ) = y ⋅ w + (x + y) ⋅ z

Job A: x hours, cost z units per hour until it’s done. Job B: y hours, cost w units per hour until it’s done.

A then B is better than B then A when:

xz + (x + y)w ≤ yw + (x + y)z
xz + xw + yw ≤ yw + xz + yz
xw ≤ yz
w/y ≤ z/x

What matters is the ratio (cost of delay) / (time it takes). Do the job with the biggest ratio first.

slide-66
SLIDE 66

Lemma

  • Given jobs so that Job i takes time ti with cost ci,
  • There is an optimal schedule so that the first job is the one that maximizes the ratio ci/ti.
  • Proof:
  • Say Job B maximizes this ratio, and it’s not first:
  • Switch A and B! Nothing else will change, and we showed on

the previous slide that the cost won’t increase.

  • Repeat until B is first.

[Figure: schedules (Job A, Job B, Job C, Job D) before and after swapping A and B]

slide-67
SLIDE 67

Greedy Scheduling Solution

  • scheduleJobs( JOBS ):
  • Sort JOBS by the ratio r_i = c_i / t_i = (cost of delaying job i) / (time job i takes to complete)
  • Say that sorted_JOBS[i] is the job with the i’th biggest r_i
  • Return sorted_JOBS

The running time is O(n log n)
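In Python the whole algorithm is essentially one sort. A sketch (names are mine), representing each job as an (hours, cost_per_hour) pair:

```python
def schedule_jobs(jobs):
    """jobs: list of (hours, cost_per_hour) pairs.
    Returns the jobs ordered to minimize total delay cost:
    biggest ratio cost_per_hour / hours first."""
    return sorted(jobs, key=lambda job: job[1] / job[0], reverse=True)
```

For the earlier example, Sleep (ratio 3/8) beats CS161 HW (ratio 2/10), so Sleep comes first, which was the cheaper of the two orders.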

slide-68
SLIDE 68

Formally, we’d use induction

to prove this works

  • Inductive hypothesis:
  • There is an optimal ordering so that the first t jobs are

sorted_JOBS[1..t].

  • Base case:
  • When t=0, this reads: “There is an optimal ordering so that

the first 0 jobs are []”

  • That’s true.
  • Inductive Step:
  • Boils down to: there is an optimal ordering on sorted_JOBS[t+1..n] so that sorted_JOBS[t+1] is first.

  • This follows from the Lemma.
  • Conclusion:
  • When t=n, this reads: “There is an optimal ordering so that

the first n jobs are sorted_JOBS.”

  • aka, what we returned is an optimal ordering.
slide-69
SLIDE 69

What have we learned?

  • We saw that scheduling is another example where a

greedy algorithm works.

  • This followed the same outline as the previous example:
  • Identify optimal substructure:
  • Find a way to make “safe” choices that won’t rule out an optimal solution.
  • Biggest ratios first.


slide-71
SLIDE 71

One more example

Huffman coding

  • everyday english sentence
  • 01100101 01110110 01100101 01110010 01111001 01100100 01100001

01111001 00100000 01100101 01101110 01100111 01101100 01101001 01110011 01101000 00100000 01110011 01100101 01101110 01110100 01100101 01101110 01100011 01100101

  • qwertyui_opasdfg+hjklzxcv
  • 01110001 01110111 01100101 01110010 01110100 01111001 01110101

01101001 01011111 01101111 01110000 01100001 01110011 01100100 01100110 01100111 00101011 01101000 01101010 01101011 01101100 01111010 01111000 01100011 01110110

ASCII is pretty wasteful. If e shows up so often, we should have a more parsimonious way of representing it!
slide-72
SLIDE 72

Suppose we have some distribution on characters

slide-73
SLIDE 73

Suppose we have some distribution on characters

Letter:     A   B   C   D   E   F
Percentage: 45  13  12  16  9   5

For simplicity, let’s go with this made-up example.

How to encode them as efficiently as possible?

slide-74
SLIDE 74

Try 1

Letter:     A   B   C   D   E   F
Percentage: 45  13  12  16  9   5
Codes used: 1, 00, 01, 10, 11

  • Every letter is assigned a binary string
  • f one or two bits.
  • The more frequent letters get the

shorter strings.

  • Problem:
  • Does 000 mean AAA or BA or AB?
slide-75
SLIDE 75

Try 2: prefix-free coding

Letter:     A   B   C   D   E   F
Percentage: 45  13  12  16  9   5
Code:       01  00  101 110 111 100

  • Every letter is assigned a binary string.
  • More frequent letters get shorter strings.
  • No encoded string is a prefix of any other.

10010101

Confusingly, “prefix-free codes” are also sometimes called “prefix codes” (including in CLRS).

slide-78
SLIDE 78

Try 2: prefix-free coding

Letter:     A   B   C   D   E   F
Percentage: 45  13  12  16  9   5
Code:       01  00  101 110 111 100

  • Every letter is assigned a binary string.
  • More frequent letters get shorter strings.
  • No encoded string is a prefix of any other.

10010101 FAB

Question: What is the most efficient way to do prefix-free coding? (This isn’t it). Confusingly, “prefix-free codes” are also sometimes called “prefix codes” (including in CLRS).
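Decoding a prefix-free code needs no lookahead: scan left to right and emit a letter the moment the buffer matches a codeword, since no other codeword can have it as a prefix. A sketch; the code table below is an illustrative prefix-free assignment chosen for the example, not necessarily the slide’s exact table:

```python
def decode_prefix_free(bits, code):
    """Decode a bit string with a prefix-free code (letter -> codeword)."""
    decode = {cw: letter for letter, cw in code.items()}
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in decode:            # unambiguous: prefix-freeness means no
            out.append(decode[buf])  # longer codeword starts with buf
            buf = ""
    if buf:
        raise ValueError("leftover bits: " + buf)
    return "".join(out)

# Illustrative prefix-free assignment (hypothetical):
code = {"A": "01", "B": "00", "C": "101", "D": "110", "E": "111", "F": "100"}
```

With this table, `decode_prefix_free("0001110", code)` walks the bits as 00, 01, 110 and returns "BAD".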

slide-79
SLIDE 79

A prefix-free code is a tree

[Figure: binary tree with leaves D:16, A:45, B:13, F:5, C:12, E:9 and codewords 00, 01, 100, 101, 110, 111]

As long as all the letters show up as leaves, this code is prefix-free. B:13 below means that ‘B’ makes up 13% of the characters that ever appear.

slide-80
SLIDE 80

Some trees are better than others

[Figure: the same code tree, with leaves D:16, A:45, B:13, F:5, C:12, E:9 and codewords 00, 01, 100, 101, 110, 111]

  • Imagine choosing a letter at random from the language.
  • Not uniform, but according to our histogram!
  • The cost of a tree is the expected length of the encoding of that letter.

Cost = Σ_x P(x) · depth(x), where P(x) is the probability of letter x, and the depth in the tree is the length of the encoding.

Expected cost of encoding a letter with this tree:
2 · (0.45 + 0.13) + 3 · (0.16 + 0.12 + 0.09 + 0.05) = 2.42

Question: What is the lowest-cost tree for this distribution? (This isn’t it.)
slide-81
SLIDE 81

Optimal sub-structure

  • Suppose this is an optimal tree:

[Figure: a code tree with a highlighted sub-tree]

Then this is an optimal tree on fewer letters. Otherwise, we could change this sub-tree and end up with a better overall tree.

slide-82
SLIDE 82

In order to design a greedy algorithm

  • Think about what letters belong in this sub-problem...

[Figure: a code tree with its lower sub-trees highlighted]

What’s a safe choice to make for these lower sub-trees? Infrequent elements! We want them as low down as possible.

slide-83
SLIDE 83

Solution

greedily build subtrees, starting with the infrequent letters

[Figure: F:5 and E:9 merged into a sub-tree with total 14; D:16, A:45, B:13, C:12 still separate]

slide-88
SLIDE 88

Solution

greedily build subtrees, starting with the infrequent letters

[Figure: the finished tree. F:5 and E:9 merge into 14; 14 and D:16 merge into 30; C:12 and B:13 merge into 25; 25 and 30 merge into 55; 55 and A:45 merge into 100. Codewords: 100, 101, 110, 1110, 1111, with A getting a one-bit code.]

Expected cost of encoding a letter:
1 · (0.45) + 3 · (0.13 + 0.12 + 0.16) + 4 · (0.09 + 0.05) = 2.24

slide-89
SLIDE 89

What exactly was the algorithm?

  • Create a node (like the one shown) for each letter/frequency
  • The key is the frequency (16 in this case)
  • Let CURRENT be the list of all these nodes.
  • while len(CURRENT) > 1:
  • X and Y ← the nodes in CURRENT with the smallest keys.
  • Create a new node Z with Z.key = X.key + Y.key
  • Set Z.left = X, Z.right = Y
  • Add Z to CURRENT and remove X and Y
  • return CURRENT[0]

[Figure: X = F:5 and Y = E:9 merged under a new node Z with key 14; D:16, A:45, B:13, C:12 remain in CURRENT]
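The loop above can be sketched in runnable Python, with a min-heap standing in for the CURRENT list (names are mine; the frequency table is the slide’s):

```python
import heapq
import itertools

def huffman(freqs):
    """freqs: dict letter -> frequency. Returns dict letter -> codeword."""
    tie = itertools.count()  # tie-breaker so the heap never compares tree nodes
    heap = [(w, next(tie), letter) for letter, w in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # X, Y: the two nodes in CURRENT with the smallest keys.
        wx, _, x = heapq.heappop(heap)
        wy, _, y = heapq.heappop(heap)
        # Z: new node with Z.key = X.key + Y.key, Z.left = X, Z.right = Y.
        heapq.heappush(heap, (wx + wy, next(tie), (x, y)))
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):   # internal node: (left, right)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                         # leaf: a letter
            codes[node] = prefix or "0"
    walk(heap[0][2], "")
    return codes

freqs = {"A": 45, "B": 13, "C": 12, "D": 16, "E": 9, "F": 5}
codes = huffman(freqs)
# Expected encoded length per letter: sum of P(x) * depth(x).
expected_cost = sum(freqs[x] / 100 * len(codes[x]) for x in freqs)
```

For this distribution the codeword lengths come out to 1 for A, 3 for B, C, D, and 4 for E, F, so the expected cost is 2.24 bits per letter.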

slide-90
SLIDE 90

Proof strategy

just like before

  • Show that at each step, the choices we are making

won’t rule out an optimal solution.

  • Lemma:
  • Suppose that x and y are the two least-frequent letters.

Then there is an optimal tree where x and y are siblings.

[Figure: F:5 and E:9, the two least-frequent letters, merged into a sub-tree with total 14]

slide-91
SLIDE 91

Lemma

proof idea

  • Say that an optimal tree looks like this:
  • What happens to the cost if we swap x for a?
  • the cost can’t increase; a was more frequent than x, and

we just made its encoding shorter.

  • Repeat this logic until we get an optimal tree with x

and y as siblings.

If x and y are the two least-frequent letters, there is an optimal tree where x and y are siblings. [Figure: x and a are the lowest-level sibling nodes; at least one of them is neither x nor y]


slide-94
SLIDE 94

Proof strategy

just like last time

  • Show that at each step, the choices we are making

won’t rule out an optimal solution.

  • Lemma:
  • Suppose that x and y are the two least-frequent letters.

Then there is an optimal tree where x and y are siblings.

Actually that’s not quite enough…

[Figure: after a few merges, CURRENT contains A:45 and the sub-trees with totals 25 and 30]

Our argument before just showed that we made the right choice at the first step, when everything was a leaf. What about once we start grouping stuff?
slide-95
SLIDE 95

Lemma 2

this distinction doesn’t really matter

[Figure: left, the full tree on {A,B,C,D,E,F}, with G = 25 the sub-tree on C:12 and B:13, and H = 30 the sub-tree on F:5, E:9, and D:16; right, the same tree with G and H treated as single letters G:25 and H:30]

The first thing is an optimal tree on {A,B,C,D,E,F} if and only if the second thing is an optimal tree on {A,G,H}.
slide-96
SLIDE 96
Lemma 2: this distinction doesn’t really matter

  • For a proof:
  • See CLRS, Lemma 16.3
  • Rigorous, although presented in a slightly different way
  • See Lecture Notes 14
  • A bit sketchier, but presented in the same way as here
  • Prove it yourself!
  • This is the best! (Ollie the over-achieving ostrich)

Getting all the details isn’t that important, but you should convince yourself that this is true.

slide-97
SLIDE 97

Together

  • Lemma 1:
  • Suppose that x and y are the two least-frequent letters.

Then there is an optimal tree where x and y are siblings.

  • Lemma 2:
  • We may as well imagine that CURRENT contains only

leaves.

  • These imply:
  • At each step, our choice doesn’t rule out an optimal

tree.

slide-98
SLIDE 98

Formally, we’d use induction

  • Inductive hypothesis:
  • after the t’th step,
  • there is an optimal tree containing the current subtrees as “leaves”
  • Base case:
  • after the 0’th step,
  • there is an optimal tree containing all the characters.
  • Inductive step:
  • TO DO
  • Conclusion:
  • after the last step,
  • there is an optimal tree containing this whole tree as a subtree.
  • aka,
  • after the last step the tree we’ve constructed is optimal.

After the t’th step, we’ve got a bunch of current sub-trees. The inductive hypothesis asserts that our sub-trees can be assembled into an optimal tree.
slide-99
SLIDE 99

Inductive step

  • Suppose that the inductive hypothesis holds for t-1
  • After t-1 steps, there is an optimal tree containing all the

current sub-trees as “leaves.”

  • Want to show:
  • After t steps, there is an optimal tree containing all the

current sub-trees as leaves.

  • Two ingredients:
  • Lemma 1: If x and y are the two least-frequent letters,

there is an optimal subtree where x and y are siblings.

  • Lemma 2: Suppose that there is an optimal tree containing a current sub-tree as a subtree. Then we may as well replace that sub-tree with a new letter whose frequency is the sub-tree’s total frequency.

We’ve got a bunch of current sub-trees; say that x and y are the two smallest.

slide-100
SLIDE 100

Inductive step

  • Suppose that the inductive hypothesis holds for t-1
  • After t-1 steps, there is an optimal tree containing all the

current sub-trees as “leaves”.

  • By Lemma 2, we may as well treat each current sub-tree as a single letter.

We’ve got a bunch of current sub-trees; say that x and y are the two smallest.
slide-101
SLIDE 101

Inductive step

  • Suppose that the inductive hypothesis holds for t-1
  • After t-1 steps, there is an optimal tree containing all the

current sub-trees as “leaves.”

  • By Lemma 2, we may as well treat each current sub-tree as a single letter.
  • In particular, optimal trees on this new alphabet

correspond to optimal trees on the original alphabet.

We’ve got a bunch of current sub-trees; say that x and y are the two smallest.

slide-102
SLIDE 102

Inductive step

  • Suppose that the inductive hypothesis holds for t-1
  • After t-1 steps, there is an optimal tree containing all the

current sub-trees as “leaves.”

  • Our algorithm would do this at level t:

We’ve got a bunch of current sub-trees; say that x and y are the two smallest. [Figure: the algorithm merges x and y into a new node with key x + y]

slide-103
SLIDE 103

Inductive step

  • Suppose that the inductive hypothesis holds for t-1
  • After t-1 steps, there is an optimal tree containing all the

current sub-trees as “leaves.”

  • Our algorithm would do this at level t:

We’ve got a bunch of current sub-trees; say that x and y are the two smallest. Lemma 1 implies that there’s an optimal sub-tree that looks like this; aka, what our algorithm did is okay. [Figure: x and y merged under a node with key x + y]

slide-105
SLIDE 105

Inductive step

  • Suppose that the inductive hypothesis holds for t-1
  • After t-1 steps, there is an optimal tree containing all the

current sub-trees as “leaves.”

  • Our algorithm would do this at level t:

We’ve got a bunch of current sub-trees; say that x and y are the two smallest. Lemma 2 again says that there’s an optimal tree that looks like this. This is what we wanted to show for the inductive step. [Figure: the merged (x, y) sub-tree treated as a single letter with frequency x + y]

slide-106
SLIDE 106

Inductive outline:

  • Inductive hypothesis:
  • after the t’th step,
  • there is an optimal tree containing the current subtrees as “leaves”
  • Base case:
  • after the 0’th step,
  • there is an optimal tree containing all the characters.
  • Inductive step:
  • TO DO
  • Conclusion:
  • after the last step,
  • there is an optimal tree containing this whole tree as a subtree.
  • aka,
  • after the last step the tree we’ve constructed is optimal.

After the t’th step, we’ve got a bunch of current sub-trees. The inductive hypothesis asserts that our sub-trees can be assembled into an optimal tree.
slide-107
SLIDE 107

What have we learned?

  • ASCII isn’t an optimal way to encode English, since

the distribution on letters isn’t uniform.

  • Huffman Coding is an optimal way!
  • To come up with an optimal scheme for any

language efficiently, we can use a greedy algorithm.

  • To come up with a greedy algorithm:
  • Identify optimal substructure
  • Find a way to make “safe” choices that won’t rule out an optimal solution.
  • Create subtrees out of the smallest two current subtrees.
slide-108
SLIDE 108

Recap I

  • Greedy algorithms!
  • Three examples:
  • Activity Selection
  • Scheduling Jobs
  • Huffman Coding
slide-109
SLIDE 109

Recap II

  • Greedy algorithms!
  • Often easy to write down
  • But may be hard to come up with and hard to justify
  • The natural greedy algorithm may not always be

correct

  • You’ll see this on HW7
  • A problem is a good candidate for a greedy

algorithm if:

  • it has optimal substructure
  • that optimal substructure is REALLY NICE
  • solutions depend on just one other sub-problem.
slide-110
SLIDE 110

Next time

  • Greedy algorithms for Minimum Spanning Tree!