Sorting
15-121 Fall 2020
Margaret Reid-Miller
Today
- Margaret will have office hours today 4-5pm
Today
- Quadratic sorts
- O(n log n) sorts
Next time
- Bucket and Radix sorts
- Sorting properties
Fall 2020 15-121 (Reid-Miller) 2
Quadratic Sorts Review
- Let A be an array of n elements, and we wish to sort these
elements in non-decreasing order.
- Which is selection sort and which is insertion sort?
- Selection sort "selects" the next minimum and swaps it into place
- Insertion sort "inserts" the next element into the sorted
part
- These sorting algorithms work in place, meaning they use the
array's own storage to perform the sort.
Selection Sort : Repeatedly select the minimum and add to sorted part
[Figure: array A divided at index i into a SORTED part holding the i smallest elements and an UNSORTED part; index j scans the unsorted part to find the minimum (min), which is then swapped into position i.]
Loop invariant: A[0..i-1] are the i smallest elements, sorted in non-decreasing order, and are in their final positions.
public static void selectionSort(int[] a) {
    for (int i = 0; i < a.length - 1; i++) {
        int minIndex = indexOfMin(a, i);
        // swap the minimum of a[i..] into position i
        int temp = a[minIndex];
        a[minIndex] = a[i];
        a[i] = temp;
    }
}

// returns index of minimum, from start to the end of the array
public static int indexOfMin(int[] a, int start) {
    int minIndex = start;
    for (int j = start + 1; j < a.length; j++) {
        if (a[j] < a[minIndex]) minIndex = j;
    }
    return minIndex;
}
Selection Sort Example
66 44 99 55 11 88 22 77 33
11 44 99 55 66 88 22 77 33
11 22 99 55 66 88 44 77 33
11 22 33 55 66 88 44 77 99
11 22 33 44 66 88 55 77 99
11 22 33 44 55 88 66 77 99
11 22 33 44 55 66 88 77 99
11 22 33 44 55 66 77 88 99
11 22 33 44 55 66 77 88 99
Selection Sort: Run time analysis
Worst Case:
  Search for 1st min: n-1 comparisons
  Search for 2nd min: n-2 comparisons
  ...
  Search for 2nd-to-last min: 1 comparison
  Total comparisons: (n-1) + (n-2) + ... + 2 + 1 = n(n-1)/2 = O(n^2)
Average Case: O(n^2)
Best Case: O(n^2)
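The comparison count above can be checked with a small experiment. The sketch below is an illustration only: the class name SelectionSortCount and the method sortAndCount are hypothetical and do not come from the slides. It runs selection sort and counts element comparisons, which should come to n(n-1)/2 regardless of the input order.

```java
// Hypothetical helper (not from the slides): selection sort that also
// counts how many element comparisons it performs.
class SelectionSortCount {
    // Sorts a in place and returns the number of comparisons made.
    static long sortAndCount(int[] a) {
        long comparisons = 0;
        for (int i = 0; i < a.length - 1; i++) {
            int minIndex = i;
            for (int j = i + 1; j < a.length; j++) {
                comparisons++;                       // one comparison per inner step
                if (a[j] < a[minIndex]) minIndex = j;
            }
            int temp = a[minIndex];                  // swap minimum into position i
            a[minIndex] = a[i];
            a[i] = temp;
        }
        return comparisons;
    }
}
```

For n = 9 the count is 9 * 8 / 2 = 36, whether the input is already sorted or not.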
Insertion Sort: repeatedly insert the next element into the sorted part
[Figure: array divided at index i into a SORTED part and an UNSORTED part; the element at index i is inserted at position k within the sorted part.]
Loop invariant: A[0..i-1] are sorted in non-decreasing order.
public static void insertionSort(int[] a) {
    // insert a[i] into sorted part
    for (int i = 0; i < a.length; i++) {
        int toInsert = a[i];  // creates hole
        int hole = i;
        // move values right into the hole until
        // we find the insertion point
        while (hole > 0 && toInsert < a[hole-1]) {
            a[hole] = a[hole-1];
            hole--;
        }
        a[hole] = toInsert;
    }
}
Insertion Sort Example
66 44 99 55 11 88 22 77 33
44 66 99 55 11 88 22 77 33
44 66 99 55 11 88 22 77 33
44 55 66 99 11 88 22 77 33
11 44 55 66 99 88 22 77 33
11 44 55 66 88 99 22 77 33
11 22 44 55 66 88 99 77 33
11 22 44 55 66 77 88 99 33
11 22 33 44 55 66 77 88 99
Insertion sort: Runtime analysis
Worst Case (when does this occur?):
  Insert 2nd element: 1 comparison
  Insert 3rd element: 2 comparisons
  ...
  Insert last element: n-1 comparisons
  Total comparisons: 1 + 2 + ... + (n-1) = n(n-1)/2 = O(n^2)
Average Case: O(n^2)
Best Case: O(n) (when?)
- Insertion sort is adaptive: its runtime depends on the input data.
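To see the adaptive behavior concretely, the sketch below counts comparisons made by insertion sort. It is an illustration under assumptions: the names InsertionSortCount and sortAndCount are hypothetical, and the loop starts at index 1 since a one-element prefix is already sorted.

```java
// Hypothetical helper (not from the slides): insertion sort that also
// counts how many element comparisons it performs.
class InsertionSortCount {
    // Sorts a in place and returns the number of comparisons made.
    static long sortAndCount(int[] a) {
        long comparisons = 0;
        for (int i = 1; i < a.length; i++) {
            int toInsert = a[i];
            int hole = i;
            while (hole > 0) {
                comparisons++;                        // compare toInsert with a[hole-1]
                if (toInsert >= a[hole - 1]) break;   // insertion point found
                a[hole] = a[hole - 1];                // shift the larger value right
                hole--;
            }
            a[hole] = toInsert;
        }
        return comparisons;
    }
}
```

On an already sorted array of 9 elements this makes 8 comparisons (n-1), while on a reverse-sorted array of 9 elements it makes 36 (n(n-1)/2).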
Quadratic Sorts
- Quadratic sorts have a worst-case order of
complexity of O(n^2)
- Selection sort always performs poorly, even
on a sequence of sorted elements!
- Insertion sort is (nearly) linear if the elements
are (nearly) sorted.
Tree Sort
- 1. Build a binary search
tree out of the elements.
- 2. Traverse the tree using an inorder traversal to get the
elements in increasing order.
[Figure: binary search tree containing 84, 41, 96, 24, 37, 50, 13, 98]
Runtime to traverse? O(n)
What is the runtime to build the search tree?
               build        total
Worst case     O(n^2)       O(n^2)
Average case   O(n log n)   O(n log n)
Best case      O(n log n)   O(n log n)
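The two steps of tree sort can be sketched as follows. This is a minimal illustration, not the course's implementation: the class TreeSortSketch, its Node type, and the choice to send duplicates to the right subtree are all assumptions. The tree is unbalanced, so sorted input gives the O(n^2) worst case.

```java
// Hypothetical sketch (not from the slides) of tree sort with an
// unbalanced binary search tree.
class TreeSortSketch {
    private static class Node {
        int value;
        Node left, right;
        Node(int value) { this.value = value; }
    }

    // Step 1 helper: insert value into the BST rooted at root.
    // Assumption: duplicates go into the right subtree.
    private static Node insert(Node root, int value) {
        if (root == null) return new Node(value);
        if (value < root.value) root.left = insert(root.left, value);
        else root.right = insert(root.right, value);
        return root;
    }

    // Step 2 helper: inorder traversal writing values into out,
    // starting at index next; returns the next free index.
    private static int inorder(Node root, int[] out, int next) {
        if (root == null) return next;
        next = inorder(root.left, out, next);
        out[next++] = root.value;
        return inorder(root.right, out, next);
    }

    // Returns a new array with the elements of a in non-decreasing order.
    static int[] treeSort(int[] a) {
        Node root = null;
        for (int x : a) root = insert(root, x);   // build: O(height) per insert
        int[] sorted = new int[a.length];
        inorder(root, sorted, 0);                 // traverse: O(n)
        return sorted;
    }
}
```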
Divide and Conquer
- In computation:
- 1. Divide the problem into simpler versions of itself.
- 2. Conquer each problem using the same process
(usually recursively).
- 3. Combine the results of the simpler versions to
form your final solution.
- Examples: Towers of Hanoi, fractals, Binary Search,
Merge Sort, Quicksort, and many, many more
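As a small instance of the pattern, here is a recursive binary search, one of the examples listed above. The class and method names are hypothetical. The divide step picks the middle element, the conquer step recurses into one half, and the combine step is trivial because the recursive result is returned as is.

```java
// Hypothetical sketch (not from the slides): recursive binary search
// as a simple divide-and-conquer example.
class DivideAndConquerSketch {
    // Returns an index of target within the sorted range a[lo..hi],
    // or -1 if target is not present.
    static int binarySearch(int[] a, int target, int lo, int hi) {
        if (lo > hi) return -1;                  // empty range: not found
        int mid = lo + (hi - lo) / 2;            // divide at the middle
        if (a[mid] == target) return mid;
        if (target < a[mid]) {
            return binarySearch(a, target, lo, mid - 1);   // conquer left half
        }
        return binarySearch(a, target, mid + 1, hi);       // conquer right half
    }
}
```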
Merge Sort
                             84 27 49 91 32 53 63 17
Divide:                      84 27 49 91 / 32 53 63 17
Conquer: (sort recursively)  27 49 84 91 / 17 32 53 63
Combine: (merge)             17 27 32 49 53 63 84 91
Merge Sort
- Split the array into two “halves”.
- Sort each of the halves recursively using merge sort.
- Merge the two sorted halves into a new sorted array.
- Merge sort does not sort in place.
- Example:
66 33 77 55 / 11 99 22 88 44
sort the halves recursively...
33 55 66 77 / 11 22 44 88 99
Merge Sort (cont’d)
Then merge the two sorted halves into a new array:
33 55 66 77 / 11 22 44 88 99
__ __ __ __ __ __ __ __ __
33 55 66 77 / 11 22 44 88 99
11 __ __ __ __ __ __ __ __
33 55 66 77 / 11 22 44 88 99
11 22 __ __ __ __ __ __ __
Merge Sort (cont’d)
33 55 66 77 / 11 22 44 88 99
11 22 33 __ __ __ __ __ __
33 55 66 77 / 11 22 44 88 99
11 22 33 44 __ __ __ __ __
33 55 66 77 / 11 22 44 88 99
11 22 33 44 55 __ __ __ __
Merge Sort (cont’d)
33 55 66 77 / 11 22 44 88 99
11 22 33 44 55 66 __ __ __
33 55 66 77 / 11 22 44 88 99
11 22 33 44 55 66 77 __ __
Once one of the halves has been merged into the new array, copy the remaining element(s) of the other half into the new array:
33 55 66 77 / 11 22 44 88 99
11 22 33 44 55 66 77 88 99
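The split, recursive sort, and merge steps can be sketched as below. This is one possible version under assumptions: the class name MergeSortSketch is hypothetical, and the code returns a new array rather than sorting in place, in line with the note that merge sort does not sort in place.

```java
import java.util.Arrays;

// Hypothetical sketch (not from the slides) of top-down merge sort.
class MergeSortSketch {
    // Returns a new sorted array; the input array is left unchanged.
    static int[] mergeSort(int[] a) {
        if (a.length <= 1) return a.clone();              // base case
        int mid = a.length / 2;                           // split into two "halves"
        int[] left = mergeSort(Arrays.copyOfRange(a, 0, mid));
        int[] right = mergeSort(Arrays.copyOfRange(a, mid, a.length));
        return merge(left, right);
    }

    // Merges two sorted arrays into a new sorted array.
    static int[] merge(int[] left, int[] right) {
        int[] result = new int[left.length + right.length];
        int i = 0, j = 0, k = 0;
        while (i < left.length && j < right.length) {
            // take the smaller front element; <= prefers the left half on ties
            if (left[i] <= right[j]) result[k++] = left[i++];
            else result[k++] = right[j++];
        }
        // one half is exhausted: copy the remaining elements of the other
        while (i < left.length) result[k++] = left[i++];
        while (j < right.length) result[k++] = right[j++];
        return result;
    }
}
```

With the 9-element example, mid is 4, so the halves have 4 and 5 elements, matching the split shown on the slide.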
Analysis of Merge Sort: Divide
n
n/2 n/2
n/4 n/4 n/4 n/4
n/8 n/8 n/8 n/8 n/8 n/8 n/8 n/8
…
1 1 1 … 1 1 1
(log n levels of halving)
Merge in Merge Sort
Always runs in O(n log n)
n                                   1 * n = n
n/2 n/2                             2 * n/2 = n
n/4 n/4 n/4 n/4                     4 * n/4 = n
n/8 n/8 n/8 n/8 n/8 n/8 n/8 n/8     8 * n/8 = n
…                                   …
1 1 1 … 1 1 1                       n * 1 = n
(log n levels, each costing n to merge)
Comparing Big O Functions
[Figure: plot of Number of Operations versus n (amount of data) for O(1), O(log n), O(n), O(n log n), O(n^2), and O(2^n).]
Quicksort
- Choose a pivot element of the array.
- Partition the array so that
- the pivot element is in its final correct position
- all the elements to the left of the pivot are
less than or equal to the pivot
- all the elements to the right of the pivot are
greater than the pivot
- Sort each partition recursively using quicksort
Partition: move l right until ≥ p, move g left until ≤ p
[Figure: the pivot p sits at the front of the array. Index l scans right over elements < p and index g scans left over elements > p, leaving a "≤ p" region on the left, a "≥ p" region on the right, and unexamined elements (?) in between. When l stops at an element ≥ p and g stops at an element ≤ p, the two elements are swapped.]
Partition: stop when l and g meet or cross and put pivot between partitions
[Figure: the pivot is swapped with the last element of the "≤ p" region, giving ≤ p | p | ≥ p.]
p is in its final position
Partitioning the array
Arbitrarily choose the first element as the pivot.
66 44 99 55 11 88 22 77 33
Search from the left end for the first element that is greater than (or equal to) the pivot.
66 44 99 55 11 88 22 77 33
Search from the right end for the first element that is less than (or equal to) the pivot.
66 44 99 55 11 88 22 77 33
Now swap these two elements.
66 44 33 55 11 88 22 77 99
Partitioning the array (cont’d)
66 44 33 55 11 88 22 77 99
From the two elements just swapped, search again from the left and right ends for the next elements that are greater than and less than the pivot, respectively.
66 44 33 55 11 88 22 77 99
Swap these as well.
66 44 33 55 11 22 88 77 99
Continue this process until our searches from each end meet or cross.
Partitioning the array (cont’d)
At this point, the array has been partitioned into two subarrays,
one with elements less than (or equal to) the pivot, and the other
with elements greater than (or equal to) the pivot.
66 44 33 55 11 22 88 77 99
Finally, swap the pivot with the last element in the first subarray section (the elements that are less than the pivot).
22 44 33 55 11 66 88 77 99
The pivot is now in its final position. Now sort the two subarrays on either side of the pivot using quick sort recursively.
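The partitioning walk-through can be sketched in code as follows. This is an illustration under assumptions, not necessarily the course's code: the class name QuicksortSketch and the index names l and g (borrowed from the partition diagrams) are choices made here, and the first element is used as the pivot as in the example.

```java
// Hypothetical sketch (not from the slides) of quicksort using the
// first element as the pivot and two indices scanning inward.
class QuicksortSketch {
    static void quickSort(int[] a, int lo, int hi) {
        if (lo >= hi) return;                  // 0 or 1 element: already sorted
        int p = partition(a, lo, hi);          // pivot lands at index p
        quickSort(a, lo, p - 1);
        quickSort(a, p + 1, hi);
    }

    // Partitions a[lo..hi] around the pivot a[lo] and returns the
    // pivot's final index.
    static int partition(int[] a, int lo, int hi) {
        int pivot = a[lo];
        int l = lo + 1;
        int g = hi;
        while (true) {
            while (l <= g && a[l] < pivot) l++;   // move l right until a[l] >= pivot
            while (l <= g && a[g] > pivot) g--;   // move g left until a[g] <= pivot
            if (l >= g) break;                    // searches met or crossed
            swap(a, l, g);
            l++;
            g--;
        }
        swap(a, lo, g);   // a[g] is the last element of the "<= pivot" part
        return g;
    }

    private static void swap(int[] a, int i, int j) {
        int temp = a[i];
        a[i] = a[j];
        a[j] = temp;
    }
}
```

Calling partition on 66 44 99 55 11 88 22 77 33 performs the same two swaps as the walk-through and leaves 22 44 33 55 11 66 88 77 99, with the pivot at index 5.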
Quicksort
- Invariant: After the ith partition, the ith pivot is in its
final position (i.e., all values to the left are less than or equal to the pivot and all values to the right are greater than or equal to the pivot).
- Thus, after completing the divide and conquer
phases, the data is completely sorted (every pivot is in its final position) and the combine phase is trivial.
- Compare with Merge Sort where the divide phase is
trivial and the conquer and combine phases do all the work.
Run-Time Analysis
- What is the run time for partition? O(n)
- Assume the pivot ends up in the center position of the
array every time (recursively too).
- Then, quicksort runs in O(n log n) time (best case), just like merge sort.
- BUT, quicksort in the worst case is O(n^2). When might
that be?
- In practice, though, quicksort is usually O(n log n) and
faster (better constants) than merge sort (and quicksort is in place).
- Merge sort is better when you need to stream data from disk.
Some Improvements to Quicksort
- Choose three values from the array, and use the median
of the three as the pivot.
66 44 99 55 11 88 22 77 33
Of 66, 11, and 33 (the first, middle, and last elements), use 33 as the pivot.
- Quicksort is called recursively, and many recursive calls
are "not cheap".
- Stop the recursion when the subarrays are of “small
size”. Now the array is almost sorted.
- Apply insertion sort on the whole array: O(n)
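These improvements might be combined as in the sketch below. Everything here is an assumption for illustration: the class name, the CUTOFF value of 8, and taking the three candidates from the first, middle, and last positions. The partition step uses the first element as the pivot, so the median is swapped to the front first.

```java
// Hypothetical sketch (not from the slides): quicksort with a
// median-of-three pivot, a small-subarray cutoff, and one final
// insertion sort pass.
class TunedQuicksortSketch {
    static final int CUTOFF = 8;   // assumed "small size"; tune as needed

    static void sort(int[] a) {
        quickSort(a, 0, a.length - 1);   // leaves the array almost sorted
        insertionSort(a);                // finishes the small unsorted runs
    }

    private static void quickSort(int[] a, int lo, int hi) {
        if (hi - lo + 1 <= CUTOFF) return;            // stop on small subarrays
        int mid = lo + (hi - lo) / 2;
        swap(a, lo, medianIndex(a, lo, mid, hi));     // median becomes the pivot
        int p = partition(a, lo, hi);
        quickSort(a, lo, p - 1);
        quickSort(a, p + 1, hi);
    }

    // Returns whichever of i, j, k indexes the median of a[i], a[j], a[k].
    static int medianIndex(int[] a, int i, int j, int k) {
        if ((a[i] <= a[j] && a[j] <= a[k]) || (a[k] <= a[j] && a[j] <= a[i])) return j;
        if ((a[j] <= a[i] && a[i] <= a[k]) || (a[k] <= a[i] && a[i] <= a[j])) return i;
        return k;
    }

    // Partitions a[lo..hi] around a[lo]; returns the pivot's final index.
    private static int partition(int[] a, int lo, int hi) {
        int pivot = a[lo];
        int l = lo + 1;
        int g = hi;
        while (true) {
            while (l <= g && a[l] < pivot) l++;
            while (l <= g && a[g] > pivot) g--;
            if (l >= g) break;
            swap(a, l, g);
            l++;
            g--;
        }
        swap(a, lo, g);
        return g;
    }

    private static void insertionSort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int toInsert = a[i];
            int hole = i;
            while (hole > 0 && toInsert < a[hole - 1]) {
                a[hole] = a[hole - 1];
                hole--;
            }
            a[hole] = toInsert;
        }
    }

    private static void swap(int[] a, int i, int j) {
        int temp = a[i];
        a[i] = a[j];
        a[j] = temp;
    }
}
```

The final pass stays cheap because every element is already within a block of at most CUTOFF elements bounded by pivots in their final positions, so no element moves more than a constant distance.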
Randomized quicksort is fast
- Fact: Quicksort has expected runtime of O(n log n)
averaged over all n! input orderings.
- Randomized quicksort: For every partition, pick a
pivot at random from the partition.
- Fact: Randomized quicksort has expected runtime of
O(n log n) for any input ordering.
- Although it is possible for randomized quicksort to have
O(n^2) runtime (bad random pivots), it is highly unlikely.
- If you run it again on the same data, the expected
runtime will be O(n log n).
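A minimal sketch of randomized quicksort follows, assuming the same first-element partition scheme as in the partitioning example. The class name and the use of java.util.Random are choices made here, not taken from the slides. The only change from plain quicksort is that a randomly chosen element of the partition is swapped to the front before partitioning.

```java
import java.util.Random;

// Hypothetical sketch (not from the slides) of randomized quicksort.
class RandomizedQuicksortSketch {
    private static final Random RNG = new Random();

    static void sort(int[] a) {
        quickSort(a, 0, a.length - 1);
    }

    private static void quickSort(int[] a, int lo, int hi) {
        if (lo >= hi) return;
        int r = lo + RNG.nextInt(hi - lo + 1);   // random index in [lo, hi]
        swap(a, lo, r);                          // random element becomes the pivot
        int p = partition(a, lo, hi);
        quickSort(a, lo, p - 1);
        quickSort(a, p + 1, hi);
    }

    // Partitions a[lo..hi] around a[lo]; returns the pivot's final index.
    private static int partition(int[] a, int lo, int hi) {
        int pivot = a[lo];
        int l = lo + 1;
        int g = hi;
        while (true) {
            while (l <= g && a[l] < pivot) l++;
            while (l <= g && a[g] > pivot) g--;
            if (l >= g) break;
            swap(a, l, g);
            l++;
            g--;
        }
        swap(a, lo, g);
        return g;
    }

    private static void swap(int[] a, int i, int j) {
        int temp = a[i];
        a[i] = a[j];
        a[j] = temp;
    }
}
```

Because the pivot no longer depends on the input order, an already sorted array is no longer a guaranteed bad case, as it is when the first element is always the pivot.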