SLIDE 1

Sorting Algorithms

October 18, 2017

CMPE 250 Sorting Algorithms October 18, 2017 1 / 74

slide-2
SLIDE 2

Sorting

Sorting is a process that organizes a collection of data into either ascending or descending order. An internal sort requires that the collection of data fit entirely in the computer’s main memory. We can use an external sort when the collection of data cannot fit in the computer’s main memory all at once but must reside in secondary storage such as on a disk (or tape). We will analyze only internal sorting algorithms.

SLIDE 3

Why Sorting?

Any significant amount of computer output is generally arranged in some sorted order so that it can be interpreted. Sorting also has indirect uses: an initial sort of the data can significantly enhance the performance of an algorithm. The majority of programming projects use a sort somewhere, and in many cases the sorting cost determines the running time. A comparison-based sorting algorithm makes ordering decisions only on the basis of comparisons.

SLIDE 4

Sorting Algorithms

There are many sorting algorithms, such as:

Selection Sort
Insertion Sort
Bubble Sort
Merge Sort
Quick Sort
Heap Sort
Shell Sort

The first three are the foundations for faster and more efficient algorithms.
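Of the first three, only insertion sort appears in code later in these slides. For reference, here is a minimal selection sort sketch in the same style; this code is ours, not from the slides.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Selection sort: on each pass, select the smallest remaining
// element and swap it to the front of the unsorted part.
// O(N^2) comparisons in every case.
template <typename Comparable>
void selectionSort( std::vector<Comparable> & a )
{
    for( std::size_t i = 0; i + 1 < a.size( ); ++i )
    {
        std::size_t minIndex = i;
        for( std::size_t j = i + 1; j < a.size( ); ++j )
            if( a[ j ] < a[ minIndex ] )
                minIndex = j;
        std::swap( a[ i ], a[ minIndex ] );
    }
}
```

Unlike insertion sort, selection sort performs the same number of comparisons regardless of the input order.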

SLIDE 5

Insertion Sort

Insertion sort is a simple sorting algorithm that is appropriate for small inputs.

The most common sorting technique used by card players.

The list is divided into two parts: sorted and unsorted. In each pass, the first element of the unsorted part is picked up, transferred to the sorted sublist, and inserted at the appropriate place. A list of n elements will take at most n − 1 passes to sort the data.

SLIDE 6

Insertion Sort Example

SLIDE 7

Insertion Sort Algorithm

// Simple insertion sort.
template <typename Comparable>
void insertionSort( vector<Comparable> & a )
{
    for( int p = 1; p < a.size( ); ++p )
    {
        Comparable tmp = std::move( a[ p ] );

        int j;
        for( j = p; j > 0 && tmp < a[ j - 1 ]; --j )
            a[ j ] = std::move( a[ j - 1 ] );
        a[ j ] = std::move( tmp );
    }
}

SLIDE 8

Insertion Sort – Analysis

Running time depends not only on the size of the array but also on its contents.

Best-case: O(n)
  Array is already sorted in ascending order.
  The inner loop will not be executed.
  The number of moves: 2 × (n − 1) → O(n)
  The number of key comparisons: n − 1 → O(n)

Worst-case: O(n²)
  Array is in reverse order.
  The inner loop is executed i − 1 times, for i = 2, 3, . . . , n.
  The number of moves: 2 × (n − 1) + (1 + 2 + · · · + (n − 1)) = 2 × (n − 1) + n × (n − 1)/2 → O(n²)
  The number of key comparisons: 1 + 2 + · · · + (n − 1) = n × (n − 1)/2 → O(n²)

Average-case: O(n²)
  We have to look at all possible initial data organizations.

So, insertion sort is O(n²).
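These counts can be checked empirically with an instrumented variant of insertion sort. insertionSortCount below is a hypothetical helper written for this check, not part of the lecture code.

```cpp
#include <cassert>
#include <vector>

// Insertion sort instrumented to count key comparisons.
// Returns the number of comparisons performed.
int insertionSortCount( std::vector<int> a )
{
    int comparisons = 0;
    for( std::size_t p = 1; p < a.size( ); ++p )
    {
        int tmp = a[ p ];
        std::size_t j = p;
        while( j > 0 )
        {
            ++comparisons;               // one key comparison per test
            if( !( tmp < a[ j - 1 ] ) )
                break;
            a[ j ] = a[ j - 1 ];         // shift larger element right
            --j;
        }
        a[ j ] = tmp;
    }
    return comparisons;
}
```

For a sorted array of 5 elements this returns 4 = n − 1; for a reversed array it returns 10 = n(n − 1)/2, matching the counts above.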

SLIDE 9

Analysis of insertion sort

Which running time will be used to characterize this algorithm?

Best, worst or average?

Worst case:
  Longest running time (this is the upper limit for the algorithm).
  It is guaranteed that the algorithm will not be worse than this.

Sometimes we are interested in the average case, but there are some problems with it:
  It is difficult to define the average case: what is an average input? Are we going to assume all possible inputs are equally likely?
  In fact, for most algorithms the average case is the same as the worst case.

SLIDE 10

A lower bound for simple sorting algorithms

An inversion: an ordered pair (Ai, Aj) such that i < j but Ai > Aj.
Example: 10, 6, 7, 15, 3, 1
Inversions: (10,6), (10,7), (10,3), (10,1), (6,3), (6,1), (7,3), (7,1), (15,3), (15,1), (3,1)
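A brute-force counter makes the definition concrete; countInversions is a hypothetical helper for illustration, not lecture code.

```cpp
#include <cassert>
#include <vector>

// Brute-force O(N^2) inversion counter: counts pairs (i, j) with
// i < j but a[i] > a[j].
int countInversions( const std::vector<int> & a )
{
    int inversions = 0;
    for( std::size_t i = 0; i < a.size( ); ++i )
        for( std::size_t j = i + 1; j < a.size( ); ++j )
            if( a[ i ] > a[ j ] )
                ++inversions;
    return inversions;
}
```

Applied to the example sequence 10, 6, 7, 15, 3, 1 it returns 11, the number of inversions listed above.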

SLIDE 11

Swapping

Swapping adjacent elements that are out of order removes one inversion. A sorted array has no inversions. Sorting an array that contains i inversions requires at least i swaps of adjacent elements.

SLIDE 12

Theorems

Theorem 1: The average number of inversions in an array of N distinct elements is N(N − 1)/4.
Theorem 2: Any algorithm that sorts by exchanging adjacent elements requires Ω(N²) time on average.
For a sorting algorithm to run in less than quadratic time, it must do something other than swap adjacent elements.

SLIDE 13

Mergesort

Mergesort algorithm is one of the two important divide-and-conquer sorting algorithms (the other one is quicksort). It is a recursive algorithm.

Divides the list into halves,
Sorts each half separately, and
Then merges the sorted halves into one sorted array.

SLIDE 14

Merge Sort Example

SLIDE 15

Mergesort

/**
 * Mergesort algorithm (driver).
 */
template <typename Comparable>
void mergeSort( vector<Comparable> & a )
{
    vector<Comparable> tmpArray( a.size( ) );
    mergeSort( a, tmpArray, 0, a.size( ) - 1 );
}

SLIDE 16

Mergesort (Cont.)

/**
 * Internal method that makes recursive calls.
 * a is an array of Comparable items.
 * tmpArray is an array to place the merged result.
 * left is the left-most index of the subarray.
 * right is the right-most index of the subarray.
 */
template <typename Comparable>
void mergeSort( vector<Comparable> & a, vector<Comparable> & tmpArray,
                int left, int right )
{
    if( left < right )
    {
        int center = ( left + right ) / 2;
        mergeSort( a, tmpArray, left, center );
        mergeSort( a, tmpArray, center + 1, right );
        merge( a, tmpArray, left, center + 1, right );
    }
}

SLIDE 17

Merge

/**
 * Internal method that merges two sorted halves of a subarray.
 * a is an array of Comparable items.
 * tmpArray is an array to place the merged result.
 * leftPos is the left-most index of the subarray.
 * rightPos is the index of the start of the second half.
 * rightEnd is the right-most index of the subarray.
 */
template <typename Comparable>
void merge( vector<Comparable> & a, vector<Comparable> & tmpArray,
            int leftPos, int rightPos, int rightEnd )
{
    int leftEnd = rightPos - 1;
    int tmpPos = leftPos;
    int numElements = rightEnd - leftPos + 1;

    // Main loop
    while( leftPos <= leftEnd && rightPos <= rightEnd )
        if( a[ leftPos ] <= a[ rightPos ] )
            tmpArray[ tmpPos++ ] = std::move( a[ leftPos++ ] );
        else
            tmpArray[ tmpPos++ ] = std::move( a[ rightPos++ ] );

    while( leftPos <= leftEnd )    // Copy rest of first half
        tmpArray[ tmpPos++ ] = std::move( a[ leftPos++ ] );

    while( rightPos <= rightEnd )  // Copy rest of right half
        tmpArray[ tmpPos++ ] = std::move( a[ rightPos++ ] );

    // Copy tmpArray back
    for( int i = 0; i < numElements; ++i, --rightEnd )
        a[ rightEnd ] = std::move( tmpArray[ rightEnd ] );
}

SLIDE 18

Merge Sort Example

SLIDE 19

Merge Sort Example

SLIDE 20

Mergesort – Analysis of Merge

A worst-case instance of the merge step in mergesort

SLIDE 21

Mergesort – Analysis of Merge (cont.)

Merging two sorted arrays of size k:

Best-case:
  All the elements in the first array are smaller (or larger) than all the elements in the second array.
  The number of moves: 2k + 2k
  The number of key comparisons: k

Worst-case:
  The number of moves: 2k + 2k
  The number of key comparisons: 2k − 1

SLIDE 22

Mergesort - Analysis

Levels of recursive calls to mergesort, given an array of eight items

SLIDE 23

Mergesort - Analysis

SLIDE 24

Mergesort - Analysis

Worst-case – the number of key comparisons:

= 2^0 × (2 × 2^{m−1} − 1) + 2^1 × (2 × 2^{m−2} − 1) + . . . + 2^{m−1} × (2 × 2^0 − 1)
= (2^m − 1) + (2^m − 2) + . . . + (2^m − 2^{m−1})    (m terms)
= m × 2^m − Σ_{i=0}^{m−1} 2^i
= m × 2^m − (2^m − 1)
= m × 2^m − 2^m + 1

Using m = log₂ n:

= n × log₂ n − n + 1 → O(n log₂ n)

SLIDE 25

Mergesort - Analysis

Mergesort is an extremely efficient algorithm with respect to time: both the worst and average cases are O(n log₂ n).

But mergesort requires an extra array whose size equals the size of the original array.

If we use a linked list, we do not need an extra array:
  But we need space for the links,
  And it will be difficult to divide the list in half ( O(n) ).

SLIDE 26

Mergesort for Linked Lists

Merge sort is often preferred for sorting a linked list. The slow random-access performance of a linked list makes some other algorithms (such as quicksort) perform poorly, and others (such as heapsort) completely impossible.

MergeSort:
1. If head is NULL or there is only one element in the linked list, then return.
2. Else divide the linked list into two halves.
3. Sort the two halves a and b: MergeSort(&first); MergeSort(&second);
4. Merge the two parts of the list into a sorted one: *head = Merge(first, second);

SLIDE 27

Mergesort for linked lists

#include <iostream>
using namespace std;

// Linked list node
typedef struct Node* listpointer;
struct Node
{
    int data;
    listpointer next;
};

// function prototypes
listpointer SortedMerge(listpointer a, listpointer b);
void FrontBackSplit(listpointer source, listpointer* frontRef, listpointer* backRef);

// sorts the linked list by changing next pointers (not data)
void MergeSort(listpointer* headRef)
{
    listpointer head = *headRef;
    listpointer a;
    listpointer b;

    // Base case -- length 0 or 1
    if ((head == NULL) || (head->next == NULL))
    {
        return;
    }

    // Split head into 'a' and 'b' sublists
    FrontBackSplit(head, &a, &b);

    // Recursively sort the sublists
    MergeSort(&a);
    MergeSort(&b);

    // answer = merge the two sorted lists together
    *headRef = SortedMerge(a, b);
}

SLIDE 28

Mergesort for linked lists (cont.)

listpointer SortedMerge(listpointer a, listpointer b)
{
    listpointer result = NULL;

    // Base cases
    if (a == NULL)
        return(b);
    else if (b == NULL)
        return(a);

    // Pick either a or b, and make recursive call
    if (a->data <= b->data)
    {
        result = a;
        result->next = SortedMerge(a->next, b);
    }
    else
    {
        result = b;
        result->next = SortedMerge(a, b->next);
    }
    return(result);
}

SLIDE 29

Mergesort for linked lists (cont.)

// Split the nodes of the given list into front and back halves,
// and return the two lists using the reference parameters.
// If the length is odd, the extra node should go in the front list.
// Uses the fast/slow pointer strategy.
void FrontBackSplit(listpointer source, listpointer* frontRef, listpointer* backRef)
{
    listpointer fast;
    listpointer slow;

    if (source == NULL || source->next == NULL)
    {
        // length < 2 cases
        *frontRef = source;
        *backRef = NULL;
    }
    else
    {
        slow = source;
        fast = source->next;

        // Advance 'fast' two nodes, and advance 'slow' one node
        while (fast != NULL)
        {
            fast = fast->next;
            if (fast != NULL)
            {
                slow = slow->next;
                fast = fast->next;
            }
        }

        // 'slow' is before the midpoint in the list, so split it
        // in two at that point.
        *frontRef = source;
        *backRef = slow->next;
        slow->next = NULL;
    }
}

SLIDE 30

Mergesort for linked lists (cont.)

// Function to print nodes in a given linked list
void printList(listpointer node)
{
    while (node != NULL)
    {
        cout << node->data << " ";
        node = node->next;
    }
}

// Function to insert a node at the beginning of the linked list
void push(listpointer* head_ref, int new_data)
{
    // allocate node
    listpointer new_node = new Node;

    // put in the data
    new_node->data = new_data;

    // link the old list off the new node
    new_node->next = (*head_ref);

    // move the head to point to the new node
    (*head_ref) = new_node;
}

SLIDE 31

Mergesort for linked lists (cont.)

// Driver program to test above functions
int main()
{
    // Start with the empty list
    listpointer a = NULL;
    int n, num;

    // Let us create an unsorted linked list to test the functions
    cout << endl << "Enter the number of data elements to be sorted: ";
    cin >> n;

    // Create linked list.
    for (int i = 0; i < n; i++)
    {
        cout << "Enter element " << i + 1 << ": ";
        cin >> num;
        push(&a, num);
    }

    // Sort the above created Linked List
    MergeSort(&a);

    cout << endl << "Sorted Linked List is: " << endl;
    printList(a);
    return 0;
}

SLIDE 32

Quicksort

Like mergesort, quicksort is also based on the divide-and-conquer paradigm. But it uses this technique in a somewhat opposite manner, as all the hard work is done before the recursive calls. It works as follows:

1. First, it partitions an array into two parts,
2. Then, it sorts the parts independently,
3. Finally, it combines the sorted subsequences by a simple concatenation.

SLIDE 33

Quicksort

Algorithm 1 Quicksort

1: Let S be the input set.
2: if |S| = 0 or |S| = 1 then return
3: Pick an element v in S. Call v the pivot.
4: Partition S − {v} into two disjoint groups:
     S1 = { x ∈ S − {v} | x ≤ v }
     S2 = { x ∈ S − {v} | x ≥ v }
5: return { quicksort(S1), v, quicksort(S2) }

SLIDE 34

Quicksort Illustrated

SLIDE 35

Issues To Consider

How to pick the pivot?

Many methods (discussed later)

How to partition?

Several methods exist. The one we consider is known to give good results and to be easy and efficient. We discuss the partition strategy first.

SLIDE 36

Partitioning Strategy

For now, assume that pivot = A[(left+right)/2]. We want to partition array A[left .. right]. First, get the pivot element out of the way by swapping it with the last element (swap pivot and A[right]). Let i start at the first element and j start at the next-to-last element (i = left, j = right – 1)

SLIDE 37

Partitioning Strategy (Cont.)

Want to have

A[k] ≤ pivot, for k < i
A[k] ≥ pivot, for k > j

When i < j

Move i right, skipping over elements smaller than the pivot.
Move j left, skipping over elements greater than the pivot.
When both i and j have stopped:
  A[i] ≥ pivot and A[j] ≤ pivot ⇒ A[i] and A[j] should now be swapped.

SLIDE 38

Partitioning Strategy (Cont.)

When i and j have stopped and i is to the left of j (thus legal)

Swap A[i] and A[j]

The large element is pushed to the right and the small element is pushed to the left

After swapping

A[i] ≤ pivot
A[j] ≥ pivot

Repeat the process until i and j cross

SLIDE 39

Partitioning Strategy (Cont.)

When i and j have crossed

swap A[i] and pivot

Result:

A[k] ≤ pivot, for k < i
A[k] ≥ pivot, for k > i

SLIDE 40

Pivot Strategies

First element:
  Bad choice if the input is sorted or in reverse sorted order.
  Bad choice if the input is nearly sorted.
Random: even a malicious agent cannot arrange a bad input.
Median-of-three: choose the median of the left, right, and center elements.

SLIDE 41

Median of Three

SLIDE 42

Median of Three

// Return median of left, center, and right.
// Order these and hide the pivot.
template <typename Comparable>
const Comparable & median3( vector<Comparable> & a, int left, int right )
{
    int center = ( left + right ) / 2;

    if( a[ center ] < a[ left ] )
        std::swap( a[ left ], a[ center ] );
    if( a[ right ] < a[ left ] )
        std::swap( a[ left ], a[ right ] );
    if( a[ right ] < a[ center ] )
        std::swap( a[ center ], a[ right ] );

    // Place pivot at position right - 1
    std::swap( a[ center ], a[ right - 1 ] );
    return a[ right - 1 ];
}

SLIDE 43

Discussion

Small arrays: quicksort is slower than insertion sort when N is small (say, N ≤ 20). Optimization: make |S| ≤ 20 the base case and call insertion sort.

SLIDE 44

Quicksort algorithm (driver)

template <typename Comparable>
void quicksort( vector<Comparable> & a )
{
    quicksort( a, 0, a.size( ) - 1 );
}

SLIDE 45

Quicksort algorithm (recursive)

// Uses median-of-three partitioning and a cutoff of 20.
// a is an array of Comparable items.
// left is the left-most index of the subarray.
// right is the right-most index of the subarray.
template <typename Comparable>
void quicksort( vector<Comparable> & a, int left, int right )
{
    if( left + 20 <= right )
    {
        const Comparable & pivot = median3( a, left, right );

        // Begin partitioning
        int i = left, j = right - 1;
        for( ; ; )
        {
            while( a[ ++i ] < pivot ) { }
            while( pivot < a[ --j ] ) { }
            if( i < j )
                std::swap( a[ i ], a[ j ] );
            else
                break;
        }

        std::swap( a[ i ], a[ right - 1 ] );  // Restore pivot

        quicksort( a, left, i - 1 );          // Sort small elements
        quicksort( a, i + 1, right );         // Sort large elements
    }
    else  // Do an insertion sort on the subarray
        insertionSort( a, left, right );
}

SLIDE 46

Analysis of Quicksort

Worst case: the pivot is the smallest (or largest) element all the time.

T(N) = T(N − 1) + cN
T(N − 1) = T(N − 2) + c(N − 1)
T(N − 2) = T(N − 3) + c(N − 2)
. . .
T(2) = T(1) + 2c

Summing: T(N) = T(1) + c Σ_{i=2}^{N} i → O(N²)

Best case: the pivot is always the median.

T(N) = 2T(N/2) + cN
T(N) = cN log N + N → O(N log N)

SLIDE 47

Quicksort: Average case

Assume each of the sizes for S1 is equally likely: 0 ≤ |S1| ≤ N − 1.

T(N) = (1/N) Σ_{i=0}^{N−1} [ T(i) + T(N − i − 1) ] + cN
T(N) = (2/N) Σ_{i=0}^{N−1} T(i) + cN
N T(N) = 2 Σ_{i=0}^{N−1} T(i) + cN²
(N − 1) T(N − 1) = 2 Σ_{i=0}^{N−2} T(i) + c(N − 1)²

Subtracting:

N T(N) − (N − 1) T(N − 1) = 2 T(N − 1) + 2cN − c
N T(N) = (N + 1) T(N − 1) + 2cN   (dropping the insignificant −c)

Divide the equation by N(N + 1):

T(N)/(N + 1) = T(N − 1)/N + 2c/(N + 1)

SLIDE 48

Quicksort: Average case (Cont.)

T(N − 1)/N = T(N − 2)/(N − 1) + 2c/N
T(N − 2)/(N − 1) = T(N − 3)/(N − 2) + 2c/(N − 1)
. . .
T(2)/3 = T(1)/2 + 2c/3

Summing the telescoping equations:

T(N)/(N + 1) = T(1)/2 + 2c Σ_{i=3}^{N+1} 1/i

Σ_{i=3}^{N+1} 1/i = H_{N+1} − 3/2, where H_N is the Nth harmonic number, so

T(N) = (N + 1) ( T(1)/2 + 2c (H_{N+1} − 3/2) )

H_N ≈ log_e(N) + γ + 1/(2N), with γ = 0.577215664901... (the Euler–Mascheroni constant), so

T(N) ≈ (N + 1) ( T(1)/2 + 2c ( log_e(N + 1) + γ + 1/(2(N + 1)) − 3/2 ) )

T(N) → O(N log N)

SLIDE 49

Heapsort

The priority queue can be used to sort N items by:
  inserting every item into a binary heap, and
  extracting every item by calling deleteMin N times, thus sorting the result.
An algorithm based on this idea is heapsort. It is an O(N log N) worst-case sorting algorithm.

SLIDE 50

Heapsort

The main problem with this algorithm is that it uses an extra array for the items exiting the heap. We can avoid this problem as follows:

After each deleteMin, the heap shrinks by 1. Thus the cell that was last in the heap can be used to store the element that was just deleted. Using this strategy, after the last deleteMin, the array will contain all elements in decreasing order.

If we want them in increasing order we must use a max heap.

SLIDE 51

Heapsort Example

Max heap after the buildHeap phase for the input sequence 59,36,58,21,41,97,31,16,26,53

SLIDE 52

Heapsort Example (Cont.)

Heap after the first deleteMax operation

SLIDE 53

Heapsort Example (Cont.)

Heap after the second deleteMax operation

SLIDE 54

Implementation

In the implementation of heapsort, the ADT BinaryHeap is not used.

Everything is done in an array.

The root is stored in position 0. Thus there are some minor changes in the code:

Since we use max heap, the logic of comparisons is changed from > to <. For a node in position i, the parent is in (i − 1)/2, the left child is in 2i + 1 and right child is next to left child. Percolating down needs the current heap size which is lowered by 1 at every deletion.

SLIDE 55

The Heapsort Sort Algorithm

// Standard heapsort.
template <typename Comparable>
void heapsort( vector<Comparable> & a )
{
    for( int i = a.size( ) / 2 - 1; i >= 0; --i )  // buildHeap
        percDown( a, i, a.size( ) );
    for( int j = a.size( ) - 1; j > 0; --j )
    {
        std::swap( a[ 0 ], a[ j ] );               // deleteMax
        percDown( a, 0, j );
    }
}

SLIDE 56

percDown Algorithm

// Internal method for heapsort.
// i is the index of an item in the heap.
// Returns the index of the left child.
inline int leftChild( int i )
{
    return 2 * i + 1;
}

// Internal method for heapsort that is used in deleteMax and buildHeap.
// i is the position from which to percolate down.
// n is the logical size of the binary heap.
template <typename Comparable>
void percDown( vector<Comparable> & a, int i, int n )
{
    int child;
    Comparable tmp;

    for( tmp = std::move( a[ i ] ); leftChild( i ) < n; i = child )
    {
        child = leftChild( i );
        if( child != n - 1 && a[ child ] < a[ child + 1 ] )
            ++child;
        if( tmp < a[ child ] )
            a[ i ] = std::move( a[ child ] );
        else
            break;
    }
    a[ i ] = std::move( tmp );
}

SLIDE 57

Analysis of Heapsort

It is an O(N log N) algorithm:
  First phase: build heap, O(N).
  Second phase: N deleteMax operations, O(N log N).

Detailed analysis shows that the average case for heapsort is poorer than that of quicksort; quicksort's worst case, however, is far worse.

An average-case analysis of heapsort is very complicated, but empirical studies show that there is little difference between the average and worst cases.

Heapsort usually takes about twice as long as quicksort. Heapsort therefore should be regarded as something of an insurance policy: on average it is more costly, but it avoids the possibility of O(N²).

SLIDE 58

How fast can we sort?

Heapsort, Mergesort, and Quicksort all run in O(N log N) best-case running time. Can we do any better?

SLIDE 59

The Answer is No! (if using comparisons only)

Our basic assumption: we can only compare two elements at a time – how does this limit the run time? Suppose you are given N elements

Assume no duplicates – any sorting algorithm must also work for this case

How many possible orderings can you get?

SLIDE 60

How many possible orderings?

Example: a, b, c (N = 3). Orderings:

1. a b c
2. b c a
3. c a b
4. a c b
5. b a c
6. c b a

6 orderings = 3 × 2 × 1 = 3!
For N elements: N! orderings

SLIDE 61

A Decision Tree

Leaves contain possible orderings of a, b, c

SLIDE 62

Decision Trees and Sorting

A decision tree is a binary tree such that:
  Each node = a set of orderings
  Each edge = 1 comparison
  Each leaf = 1 unique ordering
How many leaves for N distinct elements? N!, and only 1 leaf has the sorted ordering.
Each sorting algorithm corresponds to a decision tree: it finds the correct leaf by following edges (= comparisons).
Run time ≥ maximum number of comparisons, which depends on the depth of the decision tree.
What is the depth of a decision tree for N distinct elements?

SLIDE 63

Lower Bound on Comparison-Based Sorting

Suppose you have a binary tree of depth d. How many leaves can the tree have?

e.g. depth d = 1 → at most 2 leaves, d = 2 → at most 4 leaves, etc.

SLIDE 64

Lower Bound on Comparison-Based Sorting

A binary tree of depth d has at most 2^d leaves.
Number of leaves L ≤ 2^d → d ≥ log₂ L.
The decision tree has L = N! leaves → its depth d ≥ log₂(N!).
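The bound d ≥ log₂(N!) can be evaluated numerically without overflowing N!, by summing logarithms; log2Factorial below is a hypothetical helper for illustration.

```cpp
#include <cassert>
#include <cmath>

// log2(N!) computed as a sum of log2(i), i = 2..N, to avoid
// computing N! itself. This is the minimum depth of a decision
// tree with N! leaves, i.e. a lower bound on comparisons.
double log2Factorial( int n )
{
    double s = 0.0;
    for( int i = 2; i <= n; ++i )
        s += std::log2( static_cast<double>( i ) );
    return s;
}
```

For N = 3 this gives log₂ 6 ≈ 2.58, so at least 3 comparisons are needed in the worst case, matching the decision tree for a, b, c.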

SLIDE 65

Lower Bound on Comparison-Based Sorting

Stirling's approximation: N! ≈ √(2πN) · (N/e)^N

log(N!) ≈ log( √(2πN) · (N/e)^N )
        = log √(2πN) + log (N/e)^N
        = (1/2) log(2πN) + N (log N − 1) → Ω(N log N)

Conclusion: any sorting algorithm based on comparisons between elements requires Ω(N log N) comparisons.

SLIDE 66

Comparison of Sorting Algorithms

Algorithm        Worst case    Average case
Selection sort   O(N²)         O(N²)
Bubble sort      O(N²)         O(N²)
Insertion sort   O(N²)         O(N²)
Mergesort        O(N log N)    O(N log N)
Quicksort        O(N²)         O(N log N)
Radix sort       O(N)          O(N)
Treesort         O(N²)         O(N log N)
Heapsort         O(N log N)    O(N log N)

SLIDE 67

Sorting in linear time

Comparison sort:
  Lower bound: Ω(n log n).
Non-comparison sort:
  Bucket sort, radix sort. They can sort in linear time (under certain assumptions).

SLIDE 68

Bucket Sort

Assumption: uniform distribution — the input numbers are uniformly distributed in [0, 1). Suppose the input size is n.

Idea:
  Divide [0, 1) into n equal-sized subintervals (buckets).
  Distribute the n numbers into the buckets.
  Expect that each bucket contains few numbers.
  Sort the numbers in each bucket (insertion sort by default).
  Then go through the buckets in order, listing the elements.

SLIDE 69

Bucket Sort Algorithm

Algorithm 2 BucketSort(A)

1: n ← length[A]
2: for i ← 1 to n do
     insert A[i] into bucket B[⌊n · A[i]⌋]
3: for i ← 0 to n − 1 do
     sort bucket B[i] using insertion sort
4: concatenate buckets B[0], B[1], . . . , B[n − 1]
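The steps above can be sketched in C++, assuming keys in [0, 1); for brevity each bucket is sorted here with std::sort rather than the insertion sort named in the algorithm.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Sketch of BucketSort for keys uniformly distributed in [0, 1).
// Each bucket is expected to hold only a few elements, which is
// why a simple per-bucket sort suffices.
void bucketSort( std::vector<double> & a )
{
    const std::size_t n = a.size( );
    if( n == 0 )
        return;

    std::vector<std::vector<double>> buckets( n );
    for( double x : a )                   // insert x into B[floor(n * x)]
        buckets[ static_cast<std::size_t>( n * x ) ].push_back( x );

    a.clear( );
    for( auto & b : buckets )             // sort each bucket, then concatenate
    {
        std::sort( b.begin( ), b.end( ) );
        a.insert( a.end( ), b.begin( ), b.end( ) );
    }
}
```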

SLIDE 70

Bucket Sort

SLIDE 71

Analysis of Bucket Sort Algorithm

Algorithm 3 BucketSort(A), with the cost of each step:

1: n ← length[A]                                      Θ(1)
2: for i ← 1 to n do
     insert A[i] into bucket B[⌊n · A[i]⌋]            Θ(1) each, O(n) total
3: for i ← 0 to n − 1 do
     sort bucket B[i] using insertion sort            O(n_i²) each
4: concatenate buckets B[0], B[1], . . . , B[n − 1]   O(n)

where n_i is the size of bucket B[i]. Under the uniformity assumption, E[n_i²] = 2 − 1/n, so

T(n) = Θ(n) + Σ_{i=0}^{n−1} O(n_i²) = Θ(n) + n · O(2 − 1/n) = Θ(n)

Better than Ω(n log n).

SLIDE 72

Radix Sort

Origin: Herman Hollerith's card-sorting machine for the 1890 U.S. Census. It is a digit-by-digit sort. Hollerith's original (bad) idea: sort on the most-significant digit first. Good idea: sort on the least-significant digit first with an auxiliary stable sort. Stable sort property: the relative order of any two items with the same key is preserved after the execution of the algorithm.

SLIDE 73

Radix Sort Algorithm

Algorithm 4 RadixSort(A, d)

1: for i ← 1 to d do
     use stable BucketSort to sort array A on digit i

Lemma: Given n d-digit numbers in which each digit can take on up to k possible values, RadixSort correctly sorts these numbers in Θ(d(n + k)) time.

If d is constant and k = O(n), then time is Θ(n).
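A sketch of LSD radix sort on non-negative integers in base 10 (k = 10): each pass distributes the numbers into 10 buckets by one digit and concatenates them back. Scanning the input in order and appending keeps each pass stable, as the algorithm requires.

```cpp
#include <array>
#include <cassert>
#include <vector>

// LSD radix sort on non-negative integers, base 10, with d passes.
// Each pass is a stable bucket distribution on one digit,
// least-significant digit first.
void radixSort( std::vector<int> & a, int d )
{
    int divisor = 1;
    for( int pass = 0; pass < d; ++pass )
    {
        std::array<std::vector<int>, 10> buckets;
        for( int x : a )                            // stable: input order kept
            buckets[ ( x / divisor ) % 10 ].push_back( x );

        a.clear( );
        for( const auto & b : buckets )             // concatenate B[0]..B[9]
            a.insert( a.end( ), b.begin( ), b.end( ) );

        divisor *= 10;
    }
}
```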

SLIDE 74

Radix Sort Example
