Heapsort Chapter 6 1 CPTR 430 Algorithms Heapsort - - PowerPoint PPT Presentation

SLIDE 1

Heapsort

Chapter 6

CPTR 430 Algorithms Heapsort

SLIDE 2

Sorting

■ Input: a sequence of n numbers $\langle a_0, a_1, \ldots, a_{n-1} \rangle$

■ Output: a permutation (reordering) $\langle a'_0, a'_1, \ldots, a'_{n-1} \rangle$ such that $a'_0 \le a'_1 \le \cdots \le a'_{n-1}$

■ Instance of problem:

  Input: $\langle 31, 41, 59, 26, 41, 58 \rangle$

  Output: $\langle 26, 31, 41, 41, 58, 59 \rangle$

SLIDE 3

Sorting

■ Numbers all by themselves are rarely sorted

■ Usually, records are sorted by numeric keys

  ❚ The rest of the record is satellite data and goes along with the keys when sorting

  ❚ If records are large (lots of satellite data), then an array of pointers to the records can be sorted to minimize the data movement

■ An algorithm can concentrate on sorting plain numbers; sorting records would be a relatively minor implementation detail
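The point about sorting pointers rather than records can be illustrated with a minimal sketch. The `Employee` record type and the `sortByKey` helper are hypothetical names invented for this example; the library sort stands in for whichever sorting algorithm is being studied.

```java
import java.util.Arrays;
import java.util.Comparator;

final class RecordSortSketch {
    // Hypothetical record: a numeric key plus satellite data.
    static final class Employee {
        final int id;       // the sort key
        final String name;  // satellite data that travels with the key

        Employee(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Reorders the references in the array by key.
    // The Employee objects themselves are never copied.
    static void sortByKey(Employee[] records) {
        Arrays.sort(records, Comparator.comparingInt((Employee e) -> e.id));
    }
}
```

In Java an array of objects is already an array of references, so only the references move during the sort, however large each record is.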

SLIDE 4

What’s the Big Deal about Sorting?

■ Many applications need to sort data

■ Many algorithms sort data via a subroutine in the course of their actions

■ There are many different sorting algorithms, each with its own interesting characteristics

■ Bounds can be easily determined for sorting algorithms; this experience transfers to other algorithms

■ The implementations of sorting algorithms demonstrate many software engineering issues: data shaping, the memory hierarchy, etc.

SLIDE 5

Sorts We Will Examine

■ Insertion sort (we saw this in Chapter 2)

■ Merge sort (we saw this in Chapter 2)

■ Heapsort (this chapter)

■ Quicksort (Chapter 7)

SLIDE 6

Heapsort

■ Like merge sort, heapsort runs in $O(n \lg n)$ time (insertion sort takes $O(n^2)$ time)

■ Like insertion sort, heapsort sorts in place (merge sort requires extra space)

■ Heapsort uses a specialized data structure to manage the sort

  ❚ The heap

  ❚ Not to be confused with the “heap” used for dynamic memory allocation in Java and C++

  ❚ Also makes a nice priority queue

SLIDE 7

Heap

■ The heap is an array that can be viewed as a nearly complete binary tree

  ❚ The tree is completely filled at all levels except the lowest

  ❚ The lowest level is filled from the left

[Figure: a 10-element max-heap holding the values 16, 14, 10, 9, 8, 7, 4, 3, 2, 1, drawn as a binary tree and as an array with indices 0 to 9]

SLIDE 8

Array as Binary Tree?

[Figure: a 10-element max-heap drawn as a binary tree and as an array with indices 0 to 9]

$\mathrm{left}(i) = 2i + 1$

$\mathrm{right}(i) = 2i + 2$

$\mathrm{parent}(i) = \lfloor (i - 1)/2 \rfloor$

For every node $i$ except the root (root is index 0): $A[\mathrm{parent}(i)] \ge A[i]$
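The index arithmetic above translates directly into code. The following is a minimal sketch for a 0-based array; the class name `HeapIndexSketch` is an assumption made for this example.

```java
final class HeapIndexSketch {
    // Index of the left child of node i.
    static int left(int i) {
        return 2 * i + 1;
    }

    // Index of the right child of node i.
    static int right(int i) {
        return 2 * i + 2;
    }

    // Index of the parent of node i. Only meaningful for i >= 1,
    // since the root (index 0) has no parent. Integer division
    // supplies the floor for non-negative operands.
    static int parent(int i) {
        return (i - 1) / 2;
    }
}
```

For a 10-element heap, node 4 has its left child at index 9 and no right child, because `right(4)` is 10, which is past the end of the array.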

SLIDE 9

Max-heap vs. Min-heap

■ In a max-heap, $A[\mathrm{parent}(i)] \ge A[i]$

  ❚ The root holds the largest value

  ❚ Max-heaps are used for sorting

■ In a min-heap, $A[\mathrm{parent}(i)] \le A[i]$

  ❚ The root holds the smallest value

  ❚ Min-heaps are often used for priority queues
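Both properties can be checked by comparing every non-root node with its parent. This is a minimal sketch, assuming a 0-based array whose first `heapSize` elements form the heap; the class and method names are invented for this example.

```java
final class HeapCheckSketch {
    // True if a[0..heapSize-1] satisfies the max-heap property.
    static boolean isMaxHeap(int[] a, int heapSize) {
        for (int i = 1; i < heapSize; i++) {
            if (a[(i - 1) / 2] < a[i]) {
                return false; // parent is smaller than its child
            }
        }
        return true;
    }

    // True if a[0..heapSize-1] satisfies the min-heap property.
    static boolean isMinHeap(int[] a, int heapSize) {
        for (int i = 1; i < heapSize; i++) {
            if (a[(i - 1) / 2] > a[i]) {
                return false; // parent is larger than its child
            }
        }
        return true;
    }
}
```

A check like this takes linear time, which makes it useful as a test oracle for the heap operations rather than as part of them.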

SLIDE 10

Heap Height

■ The height of a node in a heap is the number of edges on the longest path from the node to a leaf

■ The height of a heap is the height of its root

■ Heap is based on a nearly complete binary tree ⇒ height of heap = $\Theta(\lg n)$

■ Basic operations on a heap run in time proportional to the height of the heap: $O(\lg n)$
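As a small illustration of the $\Theta(\lg n)$ height, the sketch below computes the height of an n-element heap as $\lfloor \lg n \rfloor$. The class name and the bit-counting approach are choices made for this example, not something the slides prescribe.

```java
final class HeapHeightSketch {
    // Height of an n-element heap, floor(lg n), for n >= 1.
    // 31 minus the number of leading zero bits is the position of the
    // highest set bit of n, which equals floor(lg n).
    static int height(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("heap must be non-empty");
        }
        return 31 - Integer.numberOfLeadingZeros(n);
    }
}
```

A 10-element heap therefore has height 3, and the height only grows to 4 once the heap reaches 16 elements.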

SLIDE 11

Heap Operations

■ max-heapify: maintains the max-heap property

■ build-max-heap: produces a max-heap from an unsorted array

■ heapsort: sorts an array in place (using a heap)

■ max-heap-insert, heap-extract-max, heap-increase-key, heap-maximum: used for priority queues

SLIDE 12

Maintaining the Heap Property

public static void maxHeapify(int[] a, int heapSize, int i) {
    (Code removed due to Assignment #4)
}

■ a is an array

■ heapSize is the size of the heap

■ i is an index into the array

SLIDE 13

Maintaining the Heap Property

public static void maxHeapify(int[] a, int heapSize, int i) {
    (Code removed due to Assignment #4)
}

■ left(i) and right(i) must be roots of max-heaps

■ a[i] may be smaller than either of its children (thus violating the max-heap property)

■ maxHeapify() moves a[i] down into the heap so that the subtree rooted at i becomes a max-heap
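The slide's own method body is withheld, so what follows is only a sketch of the standard textbook sift-down that matches the description above, not the course's solution. It is written iteratively (the running-time analysis in these slides is phrased for a recursive version), and the class name `MaxHeapifySketch` is an assumption.

```java
final class MaxHeapifySketch {
    // Moves a[i] down until the subtree rooted at i is a max-heap.
    // Precondition: the subtrees rooted at the children of i are max-heaps.
    static void maxHeapify(int[] a, int heapSize, int i) {
        while (true) {
            int left = 2 * i + 1;
            int right = 2 * i + 2;
            int largest = i;
            // Find the largest among a[i] and its children inside the heap.
            if (left < heapSize && a[left] > a[largest]) {
                largest = left;
            }
            if (right < heapSize && a[right] > a[largest]) {
                largest = right;
            }
            if (largest == i) {
                return; // a[i] is at least as large as both children
            }
            int tmp = a[i];
            a[i] = a[largest];
            a[largest] = tmp;
            i = largest; // continue in the subtree that received the old a[i]
        }
    }
}
```

Each pass of the loop does a constant amount of work and moves one level down the tree, which is where the height-proportional running time comes from.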

SLIDE 14

maxHeapify()

[Figure: three snapshots of a 10-element heap during maxHeapify(), with i marking the node being moved down]

maxHeapify(a, 1);

SLIDE 15

maxHeapify() Running Time

For a subtree of size $n$ rooted at node $i$, $T(n)$ is:

T(fix up relationships among a[i], a[left(i)], and a[right(i)])

+

T(run maxHeapify() on one of the children of i)

but

■ The time to fix up the relationships is constant

■ Each subtree of i has size at most $2n/3$

Worst case is when last row is half full

and so

$T(n) = T(2n/3) + \Theta(1)$

SLIDE 16

maxHeapify() Running Time

Based on our study of Chapter 4, the recurrence relation

$T(n) = T(2n/3) + \Theta(1)$

has the solution

$T(n) = O(\lg n)$

If the dimension of interest is the height of the tree, $h$, then

$T(n) = O(h)$
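The slides do not say which Chapter 4 method produces this solution. One way to get there, assuming the master theorem is the intended tool, is to match the recurrence to the form $T(n) = a\,T(n/b) + f(n)$:

```latex
\begin{align*}
T(n) &= a\,T(n/b) + f(n), \qquad a = 1,\quad b = 3/2,\quad f(n) = \Theta(1) \\
n^{\log_b a} &= n^{\log_{3/2} 1} = n^{0} = 1 \\
f(n) &= \Theta\!\left(n^{\log_b a}\right)
  \;\Longrightarrow\;
  T(n) = \Theta\!\left(n^{\log_b a} \lg n\right) = \Theta(\lg n)
\end{align*}
```

Because $2n/3$ is only an upper bound on the size of the subtree that maxHeapify() descends into, the running time of maxHeapify() itself is stated as $O(\lg n)$.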

SLIDE 17

Building a Heap

■ To build a max-heap from an arbitrary array, apply maxHeapify() in a bottom-up manner

■ All the elements in the range $\lfloor n/2 \rfloor, \ldots, n - 1$ are leaves of the tree

  ❚ Each of the leaves is a subtree of size 1

  ❚ $n$ = a.length

public static void buildMaxHeap(int[] a) {
    (Code removed due to Assignment #4)
}
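The slide's method body is withheld, so the following is only a sketch of the standard bottom-up construction described above, not the course's solution. It assumes n = a.length and bundles an iterative sift-down so that the block is self-contained; the class name is invented for this example.

```java
final class BuildMaxHeapSketch {
    // Standard sift-down: restores the max-heap property at index i,
    // assuming both child subtrees are already max-heaps.
    static void maxHeapify(int[] a, int heapSize, int i) {
        while (true) {
            int left = 2 * i + 1;
            int right = 2 * i + 2;
            int largest = i;
            if (left < heapSize && a[left] > a[largest]) {
                largest = left;
            }
            if (right < heapSize && a[right] > a[largest]) {
                largest = right;
            }
            if (largest == i) {
                return;
            }
            int tmp = a[i];
            a[i] = a[largest];
            a[largest] = tmp;
            i = largest;
        }
    }

    // Indices n/2 .. n-1 are leaves, so start at the last internal
    // node, n/2 - 1, and work back to the root.
    static void buildMaxHeap(int[] a) {
        int n = a.length;
        for (int i = n / 2 - 1; i >= 0; i--) {
            maxHeapify(a, n, i);
        }
    }
}
```

The loop never visits a leaf, which is why it can start at index n/2 - 1 rather than at the end of the array.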

SLIDE 18

buildMaxHeap() in Action

[Figure: six snapshots of a 10-element array as buildMaxHeap() runs, with i marking the node passed to maxHeapify() at each step]

SLIDE 19

Correctness of buildMaxHeap()

Loop invariant: At the start of each for iteration, each node $i + 1, i + 2, \ldots, n - 1$ is the root of a max-heap

Proof:

Initialization: Before the first iteration, $i = \lfloor n/2 \rfloor - 1$. Each node $\lfloor n/2 \rfloor, \lfloor n/2 \rfloor + 1, \lfloor n/2 \rfloor + 2, \ldots, n - 1$ is a leaf, and therefore each is a root of a max-heap of size 1.

SLIDE 20

Correctness of buildMaxHeap()

Maintenance: Observe that the children of node $i$ have indices greater than $i$. By the loop invariant, they are all roots of max-heaps.

maxHeapify() can thus be applied to make $i$ the root of a max-heap. maxHeapify() ensures that $i, i + 1, i + 2, \ldots, n - 1$ are all roots of max-heaps. The for loop decrements $i$ to reestablish the loop invariant for the next iteration.

Termination: When the for loop finishes, $i = -1$. According to the loop invariant, $0, 1, 2, \ldots, n - 1$ are all roots of max-heaps. Since 0 is the root, the array is a heap.

SLIDE 21

Running Time of buildMaxHeap()

Simple upper bound:

■ Each call to maxHeapify() takes $O(\lg n)$ time

■ maxHeapify() is called $O(n)$ times

■ $T(n) = O(n \lg n)$

Can we get a tighter bound?

■ maxHeapify()’s run time depends on the height of the node within the tree

■ Most nodes are not very high (more nodes are on lower levels than higher levels)

SLIDE 22

Tighter Running Time Bound for buildMaxHeap()

■ An n-element heap has height $\lfloor \lg n \rfloor$

■ An n-element heap has at most $\lceil n/2^{h+1} \rceil$ nodes of any given height $h$

■ The time for maxHeapify() to run on a heap of height $h$ is $O(h)$

■ The running time of buildMaxHeap() is thus

$\sum_{h=0}^{\lfloor \lg n \rfloor} \left\lceil \frac{n}{2^{h+1}} \right\rceil O(h)$

SLIDE 23

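Tighter Running Time Bound for buildMaxHeap()

We can simplify this a bit:

$\sum_{h=0}^{\lfloor \lg n \rfloor} \left\lceil \frac{n}{2^{h+1}} \right\rceil O(h) = O\left( n \sum_{h=0}^{\lfloor \lg n \rfloor} \frac{h}{2^h} \right)$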

SLIDE 24

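Tighter Running Time Bound for buildMaxHeap()

$O\left( n \sum_{h=0}^{\lfloor \lg n \rfloor} \frac{h}{2^h} \right)$

We can apply the mathematical identity

$\sum_{k=0}^{\infty} k x^k = \frac{x}{(1 - x)^2}$ (Page 1061)

where $x = 1/2$ and $k = h$:

$\sum_{h=0}^{\infty} \frac{h}{2^h} = \sum_{h=0}^{\infty} h \left( \frac{1}{2} \right)^h = \frac{1/2}{(1 - 1/2)^2} = 2$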

SLIDE 25

Tighter Running Time Bound for buildMaxHeap()

$O\left( n \sum_{h=0}^{\lfloor \lg n \rfloor} \frac{h}{2^h} \right) = O\left( n \sum_{h=0}^{\infty} \frac{h}{2^h} \right) = O(n)$

Thus we can build a max-heap in linear time

SLIDE 26

Heapsort

To sort an array:

1. Build a max-heap (using buildMaxHeap())

  ■ The largest value is now at the front (top of the heap)

2. Exchange the front and rear (of the heap)

  ■ The largest element is now in its proper place in the sorted array

  ■ If the last element is ignored, the children of the root remain max-heaps, but the new root element may violate the max-heap property

3. Reduce the size of the heap by one, and restore the max-heap property (using maxHeapify())

  ■ This effectively cuts the previous root off from the rest of the heap

4. If the heap’s size is greater than 1, go to Step 2
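The four steps above can be sketched as follows. This is an illustration built on the standard textbook helpers, not the course's withheld solution; the class name `HeapsortSketch` and the bundled helper bodies are assumptions made so that the block runs on its own.

```java
import java.util.Arrays;

final class HeapsortSketch {
    // Standard sift-down, assuming the child subtrees are max-heaps.
    static void maxHeapify(int[] a, int heapSize, int i) {
        while (true) {
            int left = 2 * i + 1;
            int right = 2 * i + 2;
            int largest = i;
            if (left < heapSize && a[left] > a[largest]) {
                largest = left;
            }
            if (right < heapSize && a[right] > a[largest]) {
                largest = right;
            }
            if (largest == i) {
                return;
            }
            int tmp = a[i];
            a[i] = a[largest];
            a[largest] = tmp;
            i = largest;
        }
    }

    static void buildMaxHeap(int[] a) {
        for (int i = a.length / 2 - 1; i >= 0; i--) {
            maxHeapify(a, a.length, i);
        }
    }

    static void heapsort(int[] a) {
        buildMaxHeap(a);                       // Step 1
        int heapSize = a.length;
        while (heapSize > 1) {                 // Step 4
            int tmp = a[0];                    // Step 2: exchange front and rear
            a[0] = a[heapSize - 1];
            a[heapSize - 1] = tmp;
            heapSize--;                        // Step 3: shrink the heap ...
            maxHeapify(a, heapSize, 0);        // ... and restore the property
        }
    }

    public static void main(String[] args) {
        int[] a = {31, 41, 59, 26, 41, 58};
        heapsort(a);
        System.out.println(Arrays.toString(a)); // prints [26, 31, 41, 41, 58, 59]
    }
}
```

The sorted region grows from the right end of the array while the heap shrinks from the same end, so no storage beyond the array itself is needed.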

SLIDE 27

heapSort() in Action

[Figure: snapshots of the heap and the sorted tail of the array as heapSort() runs, with i marking the current end of the heap]

SLIDE 28

heapSort() in Action

[Figure, continued: further snapshots of the heap and the sorted tail of the array as heapSort() runs, with i marking the current end of the heap]

SLIDE 29

heapSort() in Action

[Figure, continued: final snapshots of heapSort(), ending with the fully sorted array]

SLIDE 30

heapSort() Running Time

public static void heapsort(int[] a) {
    (Code removed due to Assignment #4)
}

■ buildMaxHeap() is called once: $O(n)$

■ maxHeapify() is called $n - 1$ times, and each call takes $O(\lg n)$ time

■ $T(n) = O(n) + O(n) \cdot O(\lg n) = O(n \lg n)$

SLIDE 31

Priority Queues

■ Quicksort (Chapter 7) is generally faster than heapsort

■ The heap data structure has other applications; for example, implementing efficient priority queues

■ Applications of priority queues:

  ❚ Job scheduling on a multiprocess system (max-heap)

  ❚ Event-driven simulations (min-heap keyed on time stamp)

■ For applications, a heap element normally consists of a reference (handle, pointer, etc.) to the actual application object

SLIDE 32

Priority Queue Operations

S is a set of elements, each with an associated key

■ maximum(S): returns the element in S with the largest key

■ extractMax(S): removes and returns the element in S with the largest key

■ insert(S, x): $S \leftarrow S \cup \{x\}$

■ increaseKey(S, x, k): increases element x’s key to k (k must be greater than or equal to the current value)
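The slides list increaseKey() but do not show code for it. A minimal sketch of how it is commonly done on an array-based max-heap follows, assuming the element x is identified by its array index `i`; the class name and the choice of exception are also assumptions made for this example.

```java
final class IncreaseKeySketch {
    // Raises the key at index i to newKey, then moves it up toward the
    // root until its parent is at least as large.
    static void increaseKey(int[] heap, int i, int newKey) {
        if (newKey < heap[i]) {
            throw new IllegalArgumentException("new key is smaller than current key");
        }
        heap[i] = newKey;
        while (i > 0 && heap[(i - 1) / 2] < heap[i]) {
            int parent = (i - 1) / 2;
            int tmp = heap[parent];
            heap[parent] = heap[i];
            heap[i] = tmp;
            i = parent;
        }
    }
}
```

Only the path from index i to the root is touched, so the cost is bounded by the height of the heap, the same bound as the other priority queue operations that restructure the heap.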

SLIDE 33

MaxHeap ADT

public class MaxHeap {
    private int size;
    private int[] heap;

    public MaxHeap(int max) {
        heap = new int[max];
        size = 0;
    }

    public MaxHeap(int max, int[] elements) {
        this(max);
        for ( int i = 0; i < elements.length; i++ ) {
            insert(elements[i]);
        }
    }

    public void insert(int newValue) { . . . }
    public int maximum() { . . . }
    public int extractMaximum() { . . . }
    private static int left(int i) { . . . }
    private static int right(int i) { . . . }
    private static int parent(int i) { . . . }
    public void maxHeapify(int heapSize, int i) { . . . }
    public void buildMaxHeap() { . . . }
}

SLIDE 34

maximum()

public class MaxHeap {
    private int size;
    private int[] heap;

    public MaxHeap(int max) { . . . }
    public MaxHeap(int max, int[] elements) { . . . }
    public void insert(int newValue) { . . . }

    public int maximum() {
        return heap[0];
    }

    public int extractMaximum() { . . . }
    private static int left(int i) { . . . }
    private static int right(int i) { . . . }
    private static int parent(int i) { . . . }
    public void maxHeapify(int heapSize, int i) { . . . }
    public void buildMaxHeap() { . . . }
}

$T(n) = \Theta(1)$

SLIDE 35

extractMaximum()

public class MaxHeap {
    private int size;
    private int[] heap;

    public MaxHeap(int max) { . . . }
    public MaxHeap(int max, int[] elements) { . . . }
    public void insert(int newValue) { . . . }
    public int maximum() { . . . }

    public int extractMaximum() {
        if ( size < 1 ) {
            // Fail fast: continuing would index heap[-1] below
            throw new IllegalStateException("Heap underflow");
        }
        int max = heap[0];
        heap[0] = heap[--size];   // move the last element to the root
        maxHeapify(size, 0);      // restore the max-heap property
        return max;
    }

    private static int left(int i) { . . . }
    private static int right(int i) { . . . }
    private static int parent(int i) { . . . }
    public void maxHeapify(int heapSize, int i) { . . . }
    public void buildMaxHeap() { . . . }
}

■ First part: constant time

■ maxHeapify(): $O(\lg n)$

$T(n) = \Theta(1) + O(\lg n) = O(\lg n)$

SLIDE 36

insert()

public class MaxHeap {
    private int size;
    private int[] heap;

    public MaxHeap(int max) { . . . }
    public MaxHeap(int max, int[] elements) { . . . }

    public void insert(int newValue) {
        if ( size < heap.length ) {
            heap[size++] = newValue;
            int i = size - 1;
            while ( i > 0 && heap[parent(i)] < heap[i] ) {
                int temp = heap[parent(i)];
                heap[parent(i)] = heap[i];
                heap[i] = temp;
                i = parent(i);
            }
        } else {
            System.out.println("Heap overflow");
        }
    }

    public int maximum() { . . . }
    public int extractMaximum() { . . . }
    private static int left(int i) { . . . }
    private static int right(int i) { . . . }
    private static int parent(int i) { . . . }
    public void maxHeapify(int heapSize, int i) { . . . }
    public void buildMaxHeap() { . . . }
}

■ The path from size - 1 to the root is $O(\lg n)$

$T(n) = O(\lg n)$
