Sorting, 15-121 Fall 2020, Margaret Reid-Miller



SLIDE 1

Sorting

15-121 Fall 2020 Margaret Reid-Miller

SLIDE 2

Today

  • Margaret will have office hours today 4-5pm

Today

  • Quadratic sorts
  • O(n log n) sorts

Next time

  • Bucket and Radix sorts
  • Sorting properties

Fall 2020 15-121 (Reid-Miller) 2

SLIDE 3

Quadratic Sorts Review

  • Let A be an array of n elements that we wish to sort in non-decreasing order.
  • Which is selection sort and which is insertion sort?
  • Selection sort "selects" the next minimum and swaps it into place.
  • Insertion sort "inserts" the next element into the sorted part.
  • These sorting algorithms work in place, meaning each uses the array's own storage to perform the sort.

SLIDE 4

Selection Sort : Repeatedly select the minimum and add to sorted part

[Figure: array A divided at index i into a SORTED part (the i smallest elements) and an UNSORTED part; labels j, min, and swap mark the scan for the minimum and its swap into position i.]

Loop invariant: A[0..i-1] are the i smallest elements sorted in non-decreasing order and are in their final position

SLIDE 5

public static void selectionSort(int[] a) {
    for (int i = 0; i < a.length-1; i++) {
        int minIndex = indexOfMin(a, i);
        int temp = a[minIndex];
        a[minIndex] = a[i];
        a[i] = temp;
    }
}

// returns index of minimum, from start to end
public static int indexOfMin(int[] a, int start) {
    int minIndex = start;
    for (int j = start+1; j < a.length; j++) {
        if (a[j] < a[minIndex]) minIndex = j;
    }
    return minIndex;
}

SLIDE 6

Selection Sort Example

66 44 99 55 11 88 22 77 33
11 44 99 55 66 88 22 77 33
11 22 99 55 66 88 44 77 33
11 22 33 55 66 88 44 77 99
11 22 33 44 66 88 55 77 99
11 22 33 44 55 88 66 77 99
11 22 33 44 55 66 88 77 99
11 22 33 44 55 66 77 88 99
11 22 33 44 55 66 77 88 99

SLIDE 7

Selection Sort: Run time analysis

Worst Case:
  Search for 1st min: n-1 comparisons
  Search for 2nd min: n-2 comparisons
  ...
  Search for 2nd-to-last min: 1 comparison
  Total comparisons: (n-1) + (n-2) + ... + 2 + 1 = n(n-1)/2 = O(n^2)
Average Case: O(n^2)
Best Case: O(n^2)
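The comparison count above can be checked empirically. The following is a minimal sketch, assuming a hypothetical class SelectionSortCount (not from the slides) that inlines the minimum search so the comparisons can be counted. For n = 9 it should report 9 * 8 / 2 = 36 comparisons whether the input is sorted or reversed.

```java
// Hypothetical helper class (not from the slides): selection sort that counts comparisons.
class SelectionSortCount {
    // Sorts a in place and returns the number of element comparisons performed.
    static long sortAndCount(int[] a) {
        long comparisons = 0;
        for (int i = 0; i < a.length - 1; i++) {
            int minIndex = i;
            for (int j = i + 1; j < a.length; j++) {
                comparisons++;                 // one comparison per scanned element
                if (a[j] < a[minIndex]) {
                    minIndex = j;
                }
            }
            int temp = a[minIndex];            // swap the minimum into position i
            a[minIndex] = a[i];
            a[i] = temp;
        }
        return comparisons;
    }

    public static void main(String[] args) {
        int[] sorted = {11, 22, 33, 44, 55, 66, 77, 88, 99};
        int[] reversed = {99, 88, 77, 66, 55, 44, 33, 22, 11};
        System.out.println(sortAndCount(sorted));   // prints 36
        System.out.println(sortAndCount(reversed)); // prints 36
    }
}
```

The count depends only on n, which is why the best, average, and worst cases are all quadratic.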

SLIDE 8

Insertion Sort: repeatedly insert the next element into the sorted part

[Figure: array divided at index i into a SORTED part and an UNSORTED part; element i is inserted at position k in the sorted part.]

Loop invariant: A[0..i-1] are sorted in non-decreasing order.

SLIDE 9

public static void insertionSort(int[] a) {
    // insert a[i] into sorted part
    for (int i = 0; i < a.length; i++) {
        int toInsert = a[i];   // creates hole
        int hole = i;
        // move values right into the hole until
        // we find the insertion point
        while (hole > 0 && toInsert < a[hole-1]) {
            a[hole] = a[hole-1];
            hole--;
        }
        a[hole] = toInsert;
    }
}

SLIDE 10

Insertion Sort Example

66 44 99 55 11 88 22 77 33
44 66 99 55 11 88 22 77 33
44 66 99 55 11 88 22 77 33
44 55 66 99 11 88 22 77 33
11 44 55 66 99 88 22 77 33
11 44 55 66 88 99 22 77 33
11 22 44 55 66 88 99 77 33
11 22 44 55 66 77 88 99 33
11 22 33 44 55 66 77 88 99

SLIDE 11

Insertion sort: Runtime analysis

Worst Case (when does this occur?):
  Insert 2nd element: 1 comparison
  Insert 3rd element: 2 comparisons
  ...
  Insert last element: n-1 comparisons
  Total comparisons: 1 + 2 + ... + (n-1) = O(n^2)
Average Case: O(n^2)
Best Case: O(n) (When?)

Insertion sort is adaptive: its runtime depends on the input data.
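To see the adaptivity concretely, the sketch below counts comparisons for two inputs of the same size. The class name InsertionSortCount and the counting logic are assumptions added for illustration; the sort itself follows the hole-shifting approach of insertionSort.

```java
// Hypothetical helper class (not from the slides): insertion sort that counts comparisons.
class InsertionSortCount {
    // Sorts a in place and returns the number of element comparisons performed.
    static long sortAndCount(int[] a) {
        long comparisons = 0;
        for (int i = 1; i < a.length; i++) {
            int toInsert = a[i];
            int hole = i;
            while (hole > 0) {
                comparisons++;                      // compare toInsert with a[hole-1]
                if (toInsert >= a[hole - 1]) {
                    break;                          // insertion point found
                }
                a[hole] = a[hole - 1];              // shift the larger value right
                hole--;
            }
            a[hole] = toInsert;
        }
        return comparisons;
    }

    public static void main(String[] args) {
        int[] sorted = {11, 22, 33, 44, 55, 66, 77, 88, 99};
        int[] reversed = {99, 88, 77, 66, 55, 44, 33, 22, 11};
        System.out.println(sortAndCount(sorted));   // prints 8
        System.out.println(sortAndCount(reversed)); // prints 36
    }
}
```

For n = 9 the two inputs need n - 1 = 8 and n(n-1)/2 = 36 comparisons respectively, in contrast with selection sort, whose count does not change with the input order.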

SLIDE 12

Quadratic Sorts

  • Quadratic sorts have a worst-case order of complexity of O(n^2)
  • Selection sort always performs poorly, even on a sequence of sorted elements!
  • Insertion sort is (nearly) linear if the elements are (nearly) sorted.

SLIDE 13

Tree Sort

  1. Build a binary search tree out of the elements.
  2. Traverse the tree using an inorder traversal to get the elements in increasing order.

Runtime to traverse? O(n)

What is the runtime to build the search tree?

                  build         total
  Worst case      O(n^2)        O(n^2)
  Average case    O(n log n)    O(n log n)
  Best case       O(n log n)    O(n log n)

[Figure: example binary search tree with keys 84, 41, 96, 24, 37, 50, 13, 98]
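The two steps can be sketched as follows. This is an illustrative sketch rather than course code: the class TreeSortSketch, its Node type, and the choice to send duplicates to the right subtree are all assumptions, and the tree is a plain unbalanced BST, so already-sorted input produces the O(n^2) worst case (and deep recursion).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch (names are not from the slides): tree sort with an unbalanced BST.
class TreeSortSketch {
    private static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // Step 1 helper: standard BST insert; duplicates go to the right subtree.
    private static Node insert(Node root, int key) {
        if (root == null) return new Node(key);
        if (key < root.key) root.left = insert(root.left, key);
        else root.right = insert(root.right, key);
        return root;
    }

    // Step 2 helper: inorder traversal visits keys in non-decreasing order.
    private static void inorder(Node root, List<Integer> out) {
        if (root == null) return;
        inorder(root.left, out);
        out.add(root.key);
        inorder(root.right, out);
    }

    // Returns a new sorted array; the input array is not modified.
    static int[] treeSort(int[] a) {
        Node root = null;
        for (int x : a) root = insert(root, x);
        List<Integer> out = new ArrayList<>(a.length);
        inorder(root, out);
        int[] result = new int[a.length];
        for (int i = 0; i < result.length; i++) result[i] = out.get(i);
        return result;
    }
}
```

Calling TreeSortSketch.treeSort on the keys 84, 41, 96, 24, 37, 50, 13, 98 returns them in increasing order.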

SLIDE 14

Divide and Conquer

  • In computation:
      1. Divide the problem into simpler versions of itself.
      2. Conquer each problem using the same process (usually recursively).
      3. Combine the results of the simpler versions to form your final solution.
  • Examples: Towers of Hanoi, fractals, Binary Search, Merge Sort, Quicksort, and many, many more
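As a small illustration of the three phases, here is a recursive binary search, one of the examples listed. The class and method names are hypothetical; the sketch assumes the array is already sorted in non-decreasing order.

```java
// Hypothetical sketch (not from the slides): recursive binary search as divide and conquer.
class BinarySearchSketch {
    // Returns an index of target within the sorted range a[lo..hi], or -1 if absent.
    static int search(int[] a, int target, int lo, int hi) {
        if (lo > hi) return -1;                  // empty range: base case
        int mid = lo + (hi - lo) / 2;            // divide: split the range at mid
        if (a[mid] == target) return mid;
        if (target < a[mid]) {
            return search(a, target, lo, mid - 1);   // conquer the left part
        }
        return search(a, target, mid + 1, hi);       // conquer the right part
    }

    static int search(int[] a, int target) {
        return search(a, target, 0, a.length - 1);
    }
}
```

Here only one of the two subproblems is ever solved and the combine step is trivial (the result is passed straight back up), which is why binary search takes O(log n) time rather than O(n log n).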

SLIDE 15

Merge Sort

              84 27 49 91 32 53 63 17
Divide:       84 27 49 91 | 32 53 63 17
Conquer:      27 49 84 91 | 17 32 53 63    (sort recursively)
Combine:      17 27 32 49 53 63 84 91      (merge)

SLIDE 16

Merge Sort

  • Split the array into two “halves”.
  • Sort each of the halves recursively using merge sort.
  • Merge the two sorted halves into a new sorted array.
  • Merge sort does not sort in place.
  • Example:

66 33 77 55 / 11 99 22 88 44
sort the halves recursively...
33 55 66 77 / 11 22 44 88 99

SLIDE 17

Merge Sort (cont’d)

Then merge the two sorted halves into a new array:

33 55 66 77 / 11 22 44 88 99
__ __ __ __ __ __ __ __ __

33 55 66 77 / 11 22 44 88 99
11 __ __ __ __ __ __ __ __

33 55 66 77 / 11 22 44 88 99
11 22 __ __ __ __ __ __ __

SLIDE 18

Merge Sort (cont’d)

33 55 66 77 / 11 22 44 88 99
11 22 33 __ __ __ __ __ __

33 55 66 77 / 11 22 44 88 99
11 22 33 44 __ __ __ __ __

33 55 66 77 / 11 22 44 88 99
11 22 33 44 55 __ __ __ __

SLIDE 19

Merge Sort (cont’d)

33 55 66 77 / 11 22 44 88 99
11 22 33 44 55 66 __ __ __

33 55 66 77 / 11 22 44 88 99
11 22 33 44 55 66 77 __ __

Once one of the halves has been merged into the new array, copy the remaining element(s) of the other half into the new array:

33 55 66 77 / 11 22 44 88 99
11 22 33 44 55 66 77 88 99
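The split, recursive sort, and merge walked through above can be sketched in Java as follows. The class MergeSortSketch and its method names are assumptions for illustration; the sketch returns a new array rather than sorting in place, and it splits at a.length / 2 so that a 9-element array divides 4 / 5 as in the example.

```java
import java.util.Arrays;

// Hypothetical sketch (names are not from the slides): top-down merge sort.
class MergeSortSketch {
    // Returns a new sorted array; the input array is not modified.
    static int[] mergeSort(int[] a) {
        if (a.length <= 1) return a.clone();                  // base case: already sorted
        int mid = a.length / 2;                               // split into two "halves"
        int[] left = mergeSort(Arrays.copyOfRange(a, 0, mid));
        int[] right = mergeSort(Arrays.copyOfRange(a, mid, a.length));
        return merge(left, right);
    }

    // Merges two sorted arrays into a new sorted array.
    static int[] merge(int[] left, int[] right) {
        int[] out = new int[left.length + right.length];
        int i = 0, j = 0, k = 0;
        while (i < left.length && j < right.length) {
            if (left[i] <= right[j]) out[k++] = left[i++];    // take the smaller front element
            else out[k++] = right[j++];
        }
        while (i < left.length) out[k++] = left[i++];         // copy what remains of left
        while (j < right.length) out[k++] = right[j++];       // copy what remains of right
        return out;
    }
}
```

Taking from the left half on ties (the <= comparison) keeps equal elements in their original relative order.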

SLIDE 20

Analysis of Merge Sort: Divide

[Figure: recursion tree of subproblem sizes]

n
n/2  n/2
n/4  n/4  n/4  n/4
n/8  n/8  n/8  n/8  n/8  n/8  n/8  n/8
...
1  1  1  ...  1

Number of levels: log n

SLIDE 21

Merge in Merge Sort

Work to merge at each level of the recursion tree (subproblem sizes n, n/2, n/4, n/8, ..., 1):

  1 * n   = n
  2 * n/2 = n
  4 * n/4 = n
  8 * n/8 = n
  …
  n * 1   = n

There are log n levels with n work each, so merge sort always runs in O(n log n).

SLIDE 22

Comparing Big O Functions

[Figure: plot of Number of Operations versus n (amount of data) for O(1), O(log n), O(n), O(n log n), O(n^2), and O(2^n)]

SLIDE 23

Quicksort

  • Choose a pivot element of the array.
  • Partition the array so that
      • the pivot element is in its final correct position
      • all the elements to the left of the pivot are less than or equal to the pivot
      • all the elements to the right of the pivot are greater than or equal to the pivot
  • Sort each partition recursively using quicksort

SLIDE 24

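Partition: move l right until ≥ p, move g left until ≤ p

[Figure: array with pivot p at the left end and an unexamined region "?" between indices l and g; l stops at an element ≥ p, g stops at an element ≤ p, and the two elements are swapped.]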

SLIDE 25

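Partition: move l right until ≥ p, move g left until ≤ p

[Figure: the same array after the first swap, with a region ≤ p on the left, a region ≥ p on the right, and the unexamined region "?" between l and g; l and g advance again and the next out-of-place pair is swapped.]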

SLIDE 26

Partition: stop when l and g meet or cross and put pivot between partitions

[Figure: once l and g have crossed, p is swapped with the last element of the ≤ p region, leaving the layout ≤ p | p | ≥ p.]

p is in its final position

SLIDE 27

Partitioning the array

Arbitrarily choose the first element as the pivot.

66 44 99 55 11 88 22 77 33

Search from the left end for the first element that is greater than (or equal to) the pivot.

66 44 99 55 11 88 22 77 33

Search from the right end for the first element that is less than (or equal to) the pivot.

66 44 99 55 11 88 22 77 33

Now swap these two elements.

66 44 33 55 11 88 22 77 99

SLIDE 28

Partitioning the array (cont’d)

66 44 33 55 11 88 22 77 99

From the two elements just swapped, search again from the left and right ends for the next elements that are greater than and less than the pivot, respectively.

66 44 33 55 11 88 22 77 99

Swap these as well.

66 44 33 55 11 22 88 77 99

Continue this process until our searches from each end meet or cross.

SLIDE 29

Partitioning the array (cont’d)

At this point, the array has been partitioned into two subarrays, one with elements less than (or equal to) the pivot, and the other with elements greater than (or equal to) the pivot.

66 44 33 55 11 22 88 77 99

Finally, swap the pivot with the last element in the first subarray section (the elements that are less than the pivot).

22 44 33 55 11 66 88 77 99

The pivot is now in its final position. Now sort the two subarrays on either side of the pivot using quick sort recursively.
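The partitioning walkthrough above can be sketched in Java as follows. This is one possible reading of the steps, not the course's reference code: the class QuicksortSketch, the method names, and the exact loop conditions are assumptions. It uses the first element as the pivot, stops both scans on elements equal to the pivot, and finishes by swapping the pivot with the last element of the left subarray.

```java
// Hypothetical sketch (names are not from the slides): quicksort with first-element pivot.
class QuicksortSketch {
    static void quicksort(int[] a) {
        quicksort(a, 0, a.length - 1);
    }

    private static void quicksort(int[] a, int lo, int hi) {
        if (lo >= hi) return;                 // zero or one element: nothing to do
        int p = partition(a, lo, hi);         // pivot lands at its final index p
        quicksort(a, lo, p - 1);
        quicksort(a, p + 1, hi);
    }

    // Partitions a[lo..hi] around the pivot a[lo] and returns the pivot's final index.
    static int partition(int[] a, int lo, int hi) {
        int pivot = a[lo];
        int l = lo + 1;
        int g = hi;
        while (true) {
            while (l <= g && a[l] < pivot) l++;   // move l right until a[l] >= pivot
            while (l <= g && a[g] > pivot) g--;   // move g left until a[g] <= pivot
            if (l >= g) break;                    // searches met or crossed
            swap(a, l, g);
            l++;
            g--;
        }
        swap(a, lo, g);                           // put the pivot between the partitions
        return g;
    }

    private static void swap(int[] a, int i, int j) {
        int temp = a[i];
        a[i] = a[j];
        a[j] = temp;
    }
}
```

On the example array 66 44 99 55 11 88 22 77 33, a single call to partition performs the same two swaps as the walkthrough and returns index 5, leaving 22 44 33 55 11 66 88 77 99.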

SLIDE 30

Quicksort

  • Invariant: After the ith partition, the ith pivot is in its final position (i.e., all values to the left are less than or equal to the pivot and all values to the right are greater than or equal to the pivot).
  • Thus, after completing the divide and conquer phases, the data is completely sorted (every pivot is in its final position) and the combine phase is trivial.
  • Compare with Merge Sort, where the divide phase is trivial and the conquer and combine phases do all the work.

SLIDE 31

Run-Time Analysis

  • What is the run time for partition? O(n)
  • Assume the pivot ends up in the center position of the array every time (recursively too).
  • Then, quicksort runs in O(n log n) time (best case), just like merge sort.
  • BUT, quicksort in the worst case is O(n^2). When might that be?
  • In practice, though, quicksort is usually O(n log n) and faster (better constants) than merge sort (and quicksort is in place).
  • Merge sort is better when you need to stream data from disk.

SLIDE 32

Some Improvements to Quicksort

  • Choose three values from the array, and use the median of the three as the pivot.

      66 44 99 55 11 88 22 77 33

      Of 11, 33, 66, use 33 as the pivot.
  • Quicksort is called recursively, and many recursive calls are "not cheap".
  • Stop the recursion when the subarrays are of "small size". Now the array is almost sorted.
  • Apply insertion sort on the whole array, which takes O(n) time because the array is almost sorted.
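Both improvements can be combined in one sketch. Everything named here is an assumption for illustration: the class TunedQuicksortSketch, the CUTOFF value of 8 for "small size", and the choice of the first, middle, and last elements as the three sampled values (which matches the example, where 66, 11, and 33 are sampled).

```java
// Hypothetical sketch (names and CUTOFF are not from the slides).
class TunedQuicksortSketch {
    static final int CUTOFF = 8;   // assumed threshold for a "small" subarray

    static void sort(int[] a) {
        quicksort(a, 0, a.length - 1);   // leaves small subarrays unsorted
        insertionSort(a);                // one pass over the almost-sorted array
    }

    private static void quicksort(int[] a, int lo, int hi) {
        if (hi - lo + 1 <= CUTOFF) return;       // stop the recursion early
        medianToFront(a, lo, hi);
        int p = partition(a, lo, hi);
        quicksort(a, lo, p - 1);
        quicksort(a, p + 1, hi);
    }

    // Moves the median of a[lo], a[mid], a[hi] to a[lo] so it becomes the pivot.
    static void medianToFront(int[] a, int lo, int hi) {
        int mid = lo + (hi - lo) / 2;
        int m;
        if ((a[lo] <= a[mid]) == (a[mid] <= a[hi])) m = mid;       // a[mid] lies between
        else if ((a[mid] <= a[lo]) == (a[lo] <= a[hi])) m = lo;    // a[lo] lies between
        else m = hi;
        swap(a, lo, m);
    }

    // Partitions a[lo..hi] around a[lo]; returns the pivot's final index.
    private static int partition(int[] a, int lo, int hi) {
        int pivot = a[lo];
        int l = lo + 1;
        int g = hi;
        while (true) {
            while (l <= g && a[l] < pivot) l++;
            while (l <= g && a[g] > pivot) g--;
            if (l >= g) break;
            swap(a, l, g);
            l++;
            g--;
        }
        swap(a, lo, g);
        return g;
    }

    private static void insertionSort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int toInsert = a[i];
            int hole = i;
            while (hole > 0 && toInsert < a[hole - 1]) {
                a[hole] = a[hole - 1];
                hole--;
            }
            a[hole] = toInsert;
        }
    }

    private static void swap(int[] a, int i, int j) {
        int temp = a[i];
        a[i] = a[j];
        a[j] = temp;
    }
}
```

After the truncated quicksort, every element is at most CUTOFF - 1 positions from its final place, so the closing insertion sort shifts each element only a bounded distance. A suitable cutoff is machine-dependent and would normally be chosen by measurement.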

SLIDE 33

Randomized quicksort is fast

  • Fact: Quicksort has expected runtime of O(n log n) averaged over all n! input orderings.
  • Randomized quicksort: For every partition, pick a pivot at random from the partition.
  • Fact: Randomized quicksort has expected runtime of O(n log n) for any input ordering.
  • Although it is possible for randomized quicksort to have O(n^2) runtime (bad random pivots), it is highly unlikely.
  • If you run it again on the same data, the expected runtime will be O(n log n).
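A minimal sketch of the random pivot choice follows, assuming a hypothetical class RandomizedQuicksortSketch. The only change from a first-element-pivot quicksort is that a uniformly random element of the current partition is swapped to the front before partitioning; java.util.Random is used for the random index.

```java
import java.util.Random;

// Hypothetical sketch (names are not from the slides): quicksort with random pivots.
class RandomizedQuicksortSketch {
    private static final Random RNG = new Random();

    static void sort(int[] a) {
        quicksort(a, 0, a.length - 1);
    }

    private static void quicksort(int[] a, int lo, int hi) {
        if (lo >= hi) return;
        int r = lo + RNG.nextInt(hi - lo + 1);   // random index in [lo, hi]
        swap(a, lo, r);                          // the random element becomes the pivot
        int p = partition(a, lo, hi);
        quicksort(a, lo, p - 1);
        quicksort(a, p + 1, hi);
    }

    // Partitions a[lo..hi] around a[lo]; returns the pivot's final index.
    private static int partition(int[] a, int lo, int hi) {
        int pivot = a[lo];
        int l = lo + 1;
        int g = hi;
        while (true) {
            while (l <= g && a[l] < pivot) l++;
            while (l <= g && a[g] > pivot) g--;
            if (l >= g) break;
            swap(a, l, g);
            l++;
            g--;
        }
        swap(a, lo, g);
        return g;
    }

    private static void swap(int[] a, int i, int j) {
        int temp = a[i];
        a[i] = a[j];
        a[j] = temp;
    }
}
```

With a fixed first-element pivot, an already-sorted array triggers the quadratic worst case; with random pivots no particular input ordering is bad, only an unlucky sequence of random choices.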
