Mergesort and Quicksort


slide-1
SLIDE 1

Mergesort and Quicksort

slide-2
SLIDE 2

LAST

  • Binary search

TODAY

  • Divide and conquer: mergesort and quicksort
  • Recursion
  • Randomness

NEXT

  • Part II of course: data structures
slide-3
SLIDE 3

Recall: Complexity of binary search

  • Worst case: O(log n)
  • Best case: O(1)

slide-4
SLIDE 4

Review

Algorithm        Complexity
Linear search    O(n)
Binary search    O(log n)

Why do we get a logarithmic speed up in moving from linear search to binary search?
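One way to explore the question is to count how many elements each search examines. The following is a minimal sketch in C (the slides use C0, so `int[] A` becomes `const int *A` here); the function names and the probe counters are illustrative assumptions, not part of the course code.

```c
#include <assert.h>

/* Linear search: examines elements one by one.
   Returns the number of elements examined before stopping. */
int linear_search_probes(const int *A, int n, int x) {
  assert(n >= 0);
  int probes = 0;
  for (int i = 0; i < n; i++) {
    probes++;
    if (A[i] == x) break;
  }
  return probes;
}

/* Binary search on a sorted array: every probe discards half of the
   remaining range [lo, hi). Returns the number of elements examined. */
int binary_search_probes(const int *A, int n, int x) {
  assert(n >= 0);
  int lo = 0, hi = n, probes = 0;
  while (lo < hi) {
    int mid = lo + (hi - lo) / 2;
    probes++;
    if (A[mid] == x) break;
    if (A[mid] < x) lo = mid + 1;   /* x can only be in the right half */
    else hi = mid;                  /* x can only be in the left half */
  }
  return probes;
}
```

Linear search shrinks the remaining range by one element per probe, while binary search halves it, so the range can be halved only about log n times before it is empty.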

slide-5
SLIDE 5

Review

Algorithm        Complexity
Linear search    O(n)
Selection sort   O(n^2)
Binary search    O(log n)

slide-6
SLIDE 6

In Practice …

Suppose that Google sorts 10^9 pages, and examining each page takes 10^-9 seconds. How long does it take to sort all the pages using selection sort?

Algorithm        Complexity
Linear search    O(n)
Selection sort   O(n^2)
Binary search    O(log n)

(10^9)^2 * 10^-9 = 10^9 seconds, which is more than 30 years

slide-7
SLIDE 7

Warm-up exercise

Suppose we sort each half separately, and then combine them with an O(n) algorithm.

[Diagram: an array of n elements is split into two halves of n/2 elements each; "sort each half" turns them into two sorted halves.]

If we used an O(n^2) algorithm for sorting, for an input of size n, how many steps would it take to sort the two halves?

  • sorting the first half: _________ steps
  • sorting the second half: _________ steps

slide-8
SLIDE 8

Doing less work for sorting

size of input    work (number of steps)
n                n^2
n/2              n^2/4

[Diagram: an array of size n is split into two halves of size n/2; "sort each half" turns them into two sorted halves.]
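The saving can be checked by counting comparisons. This is a rough sketch in C, assuming a textbook selection sort on the segment A[lo, hi); the function name and the counter are illustrative and not taken from the slides. Note that the exact count for m elements is m(m-1)/2, which the table rounds to m^2.

```c
#include <assert.h>

/* Sorts A[lo, hi) by selection sort and returns the number of
   element comparisons performed. */
long selection_sort_count(int *A, int lo, int hi) {
  assert(0 <= lo && lo <= hi);   /* mirrors the C0 requires clause, minus \length */
  long comparisons = 0;
  for (int i = lo; i < hi; i++) {
    int min = i;                 /* index of the smallest element seen in A[i, hi) */
    for (int j = i + 1; j < hi; j++) {
      comparisons++;
      if (A[j] < A[min]) min = j;
    }
    int tmp = A[i];
    A[i] = A[min];
    A[min] = tmp;
  }
  return comparisons;
}
```

Sorting two halves of 500 elements takes about half as many comparisons as sorting 1000 elements in one go, matching the n^2 versus 2 * n^2/4 estimate.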

slide-9
SLIDE 9

Divide and conquer for sorting

slide-10
SLIDE 10

Divide and conquer

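[Diagram: an array of size n is split into two halves of size n/2; "sort each half" turns them into two sorted halves; "merge two halves" combines them into one sorted array of size n.]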

slide-11
SLIDE 11

Toward implementation

[Diagram: an array segment of n elements, with the indices lo, mid, and hi marked.]

slide-12
SLIDE 12

void selection_sort(int[] A, int lo, int hi)
//@requires 0 <= lo && lo <= hi && hi <= \length(A);
//@ensures is_sorted(A, lo, hi);
;

void sort(int[] A, int lo, int hi)
//@requires 0 <= lo && lo <= hi && hi <= \length(A);
//@ensures is_sorted(A, lo, hi);
{
  int mid = lo + (hi - lo)/2;
  //@assert …
  selection_sort(A, lo, mid);  //@assert is_sorted(A, lo, mid);
  selection_sort(A, mid, hi);  //@assert is_sorted(A, mid, hi);
  …
}

[Diagram: an array segment of n elements, with the indices lo, mid, and hi marked.]

Are the function calls safe?

slide-13
SLIDE 13

void selection_sort(int[] A, int lo, int hi)
//@requires 0 <= lo && lo <= hi && hi <= \length(A);
//@ensures is_sorted(A, lo, hi);
;

void sort(int[] A, int lo, int hi)
//@requires 0 <= lo && lo <= hi && hi <= \length(A);
//@ensures is_sorted(A, lo, hi);
{
  int mid = lo + (hi - lo)/2;
  //@assert lo <= mid && mid <= hi;
  selection_sort(A, lo, mid);  //@assert is_sorted(A, lo, mid);
  selection_sort(A, mid, hi);  //@assert is_sorted(A, mid, hi);
  …
}

[Diagram: an array segment of n elements, with the indices lo, mid, and hi marked.]
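The assertion lo <= mid && mid <= hi is what makes both calls satisfy the precondition of selection_sort. A small sketch in C of the midpoint computation follows, with the contract turned into run-time asserts; the helper name `midpoint` is hypothetical. The form lo + (hi - lo)/2 is used rather than (lo + hi)/2 because lo + hi can exceed the range of int for large indices, which is undefined behavior in C.

```c
#include <assert.h>
#include <limits.h>

/* Midpoint of the range from lo to hi, computed without overflowing int. */
int midpoint(int lo, int hi) {
  assert(0 <= lo && lo <= hi);
  int mid = lo + (hi - lo) / 2;    /* hi - lo cannot overflow when 0 <= lo <= hi */
  assert(lo <= mid && mid <= hi);  /* the assertion from the slide */
  return mid;
}
```

For example, `midpoint(INT_MAX - 2, INT_MAX)` stays within range, whereas adding the two bounds first would not.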

slide-14
SLIDE 14

void selection_sort(int[] A, int lo, int hi)
//@requires 0 <= lo && lo <= hi && hi <= \length(A);
//@ensures is_sorted(A, lo, hi);
;

void sort(int[] A, int lo, int hi)
//@requires 0 <= lo && lo <= hi && hi <= \length(A);
//@ensures is_sorted(A, lo, hi);
{
  int mid = lo + (hi - lo)/2;
  //@assert lo <= mid && mid <= hi;
  selection_sort(A, lo, mid);  //@assert is_sorted(A, lo, mid);
  selection_sort(A, mid, hi);  //@assert is_sorted(A, mid, hi);
  …
}

void merge(int[] A, int lo, int mid, int hi)
//@requires 0 <= lo && lo <= mid && mid <= hi && hi <= \length(A);
//@requires is_sorted(A, lo, mid) && is_sorted(A, mid, hi);
//@ensures is_sorted(A, lo, hi);

[Diagram: an array segment of n elements, with the indices lo, mid, and hi marked.]

slide-15
SLIDE 15

void selection_sort(int[] A, int lo, int hi)
//@requires 0 <= lo && lo <= hi && hi <= \length(A);
//@ensures is_sorted(A, lo, hi);
;

void sort(int[] A, int lo, int hi)
//@requires 0 <= lo && lo <= hi && hi <= \length(A);
//@ensures is_sorted(A, lo, hi);
{
  int mid = lo + (hi - lo)/2;
  //@assert lo <= mid && mid <= hi;
  selection_sort(A, lo, mid);  //@assert is_sorted(A, lo, mid);
  selection_sort(A, mid, hi);  //@assert is_sorted(A, mid, hi);
  merge(A, lo, mid, hi);       //@assert is_sorted(A, lo, hi);
}

void merge(int[] A, int lo, int mid, int hi)
//@requires 0 <= lo && lo <= mid && mid <= hi && hi <= \length(A);
//@requires is_sorted(A, lo, mid) && is_sorted(A, mid, hi);
//@ensures is_sorted(A, lo, hi);
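The slides give only the contract of merge, not its body. Below is one possible implementation, sketched in C under the assumption that a temporary buffer of hi - lo elements may be allocated; the C0 contracts are approximated with `assert`, and `is_sorted` is written out here as an assumed helper. It is not necessarily the version used in the course.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* True if A[lo, hi) is in non-decreasing order. */
bool is_sorted(const int *A, int lo, int hi) {
  for (int i = lo; i + 1 < hi; i++)
    if (A[i] > A[i + 1]) return false;
  return true;
}

/* Merges the sorted segments A[lo, mid) and A[mid, hi) into sorted A[lo, hi). */
void merge(int *A, int lo, int mid, int hi) {
  assert(0 <= lo && lo <= mid && mid <= hi);
  assert(is_sorted(A, lo, mid) && is_sorted(A, mid, hi));

  int n = hi - lo;
  if (n == 0) return;
  int *B = malloc((size_t)n * sizeof(int));   /* temporary buffer */
  assert(B != NULL);

  int i = lo, j = mid, k = 0;
  while (i < mid && j < hi) {
    /* on ties, take from the left segment first */
    if (A[i] <= A[j]) B[k++] = A[i++];
    else              B[k++] = A[j++];
  }
  while (i < mid) B[k++] = A[i++];   /* leftovers from the left segment */
  while (j < hi)  B[k++] = A[j++];   /* leftovers from the right segment */

  for (k = 0; k < n; k++) A[lo + k] = B[k];   /* copy back */
  free(B);

  assert(is_sorted(A, lo, hi));
}
```

Each element is copied into the buffer once and copied back once, so this version does O(n) work for a segment of n elements, at the cost of O(n) extra memory.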

slide-16
SLIDE 16

void selection_sort(int[] A, int lo, int hi)
//@requires 0 <= lo && lo <= hi && hi <= \length(A);
//@ensures is_sorted(A, lo, hi);
;

void sort(int[] A, int lo, int hi)
//@requires 0 <= lo && lo <= hi && hi <= \length(A);
//@ensures is_sorted(A, lo, hi);
{
  int mid = lo + (hi - lo)/2;
  //@assert lo <= mid && mid <= hi;
  selection_sort(A, lo, mid);  //@assert is_sorted(A, lo, mid);
  selection_sort(A, mid, hi);  //@assert is_sorted(A, mid, hi);
  merge(A, lo, mid, hi);       //@assert is_sorted(A, lo, hi);
}

Suppose merge is O(n), what is the complexity of sort?

O(n^2) + O(n) = O(n^2)
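The sum can be written out term by term. This is a sketch that assumes selection sort takes about c·m^2 steps on m elements and merge takes about d·n steps; the constants c and d are illustrative and do not appear in the slides.

```latex
\begin{align*}
T_{\text{sort}}(n)
  &= \underbrace{c\left(\tfrac{n}{2}\right)^{2}}_{\text{first half}}
   + \underbrace{c\left(\tfrac{n}{2}\right)^{2}}_{\text{second half}}
   + \underbrace{d\,n}_{\text{merge}} \\
  &= \tfrac{c}{2}\,n^{2} + d\,n \\
  &\in O(n^{2})
\end{align*}
```

Splitting once halves the quadratic term but does not remove it, so the asymptotic class is unchanged.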

slide-17
SLIDE 17

Mergesort

slide-18
SLIDE 18

Some observations

void selection_sort(int[] A, int lo, int hi)
//@requires 0 <= lo && lo <= hi && hi <= \length(A);
//@ensures is_sorted(A, lo, hi);
;

void sort(int[] A, int lo, int hi)
//@requires 0 <= lo && lo <= hi && hi <= \length(A);
//@ensures is_sorted(A, lo, hi);
{
  int mid = lo + (hi - lo)/2;
  //@assert lo <= mid && mid <= hi;
  selection_sort(A, lo, mid);
  selection_sort(A, mid, hi);
  merge(A, lo, mid, hi);
}

selection_sort and sort have the same contracts.

We can use sort instead of selection_sort recursively!

slide-19
SLIDE 19

Recursive function

void sort(int[] A, int lo, int hi)
//@requires 0 <= lo && lo <= hi && hi <= \length(A);
//@ensures is_sorted(A, lo, hi);
{
  int mid = lo + (hi - lo)/2;
  //@assert lo <= mid && mid <= hi;
  sort(A, lo, mid);
  sort(A, mid, hi);
  merge(A, lo, mid, hi);  //@assert is_sorted(A, lo, hi);
}

slide-20
SLIDE 20

Recursive merge sort

void merge(int[] A, int lo, int mid, int hi)
//@requires 0 <= lo && lo <= mid && mid <= hi && hi <= \length(A);
//@requires is_sorted(A, lo, mid) && is_sorted(A, mid, hi);
//@ensures is_sorted(A, lo, hi);
;

How can we reason about correctness of recursive code?

void mergesort(int[] A, int lo, int hi)
//@requires 0 <= lo && lo <= hi && hi <= \length(A);
//@ensures is_sorted(A, lo, hi);
{
  int mid = lo + (hi - lo)/2;
  //@assert lo <= mid && mid <= hi;
  mergesort(A, lo, mid);
  mergesort(A, mid, hi);
  merge(A, lo, mid, hi);
}

slide-21
SLIDE 21

A problem?

void merge(int[] A, int lo, int mid, int hi)
//@requires 0 <= lo && lo <= mid && mid <= hi && hi <= \length(A);
//@requires is_sorted(A, lo, mid) && is_sorted(A, mid, hi);
//@ensures is_sorted(A, lo, hi);
;

void mergesort(int[] A, int lo, int hi)
//@requires 0 <= lo && lo <= hi && hi <= \length(A);
//@ensures is_sorted(A, lo, hi);
{
  int mid = lo + (hi - lo)/2;
  //@assert lo <= mid && mid <= hi;
  mergesort(A, lo, mid);
  mergesort(A, mid, hi);
  merge(A, lo, mid, hi);
}

slide-22
SLIDE 22

Adding a base case

void mergesort(int[] A, int lo, int hi)
//@requires 0 <= lo && lo <= hi && hi <= \length(A);
//@ensures is_sorted(A, lo, hi);
{
  if (hi - lo <= 1) return;
  int mid = lo + (hi - lo)/2;
  //@assert lo <= mid && mid <= hi;
  mergesort(A, lo, mid);  //@assert is_sorted(A, lo, mid);
  mergesort(A, mid, hi);  //@assert is_sorted(A, mid, hi);
  merge(A, lo, mid, hi);
}

slide-23
SLIDE 23

Adding a base case

void mergesort(int[] A, int lo, int hi)
//@requires 0 <= lo && lo <= hi && hi <= \length(A);
//@ensures is_sorted(A, lo, hi);
{
  if (hi - lo <= 1) return;
  int mid = lo + (hi - lo)/2;
  //@assert lo < mid && mid < hi;
  mergesort(A, lo, mid);  //@assert is_sorted(A, lo, mid);
  mergesort(A, mid, hi);  //@assert is_sorted(A, mid, hi);
  merge(A, lo, mid, hi);
}
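For experimentation outside C0, the function above can be transcribed to plain C roughly as follows. This is a sketch under stated assumptions: the contracts become `assert` calls (without the `\length` bound, which C cannot check), the merge body is one possible buffer-based implementation, and the function is named `merge_sort` here to avoid clashing with the `mergesort` that some C libraries declare in `<stdlib.h>`.

```c
#include <assert.h>
#include <stdlib.h>

/* Merges sorted A[lo, mid) and A[mid, hi) using a temporary buffer. */
void merge(int *A, int lo, int mid, int hi) {
  assert(0 <= lo && lo <= mid && mid <= hi);
  int n = hi - lo;
  if (n == 0) return;
  int *B = malloc((size_t)n * sizeof(int));
  assert(B != NULL);
  int i = lo, j = mid, k = 0;
  while (i < mid && j < hi) {
    if (A[i] <= A[j]) B[k++] = A[i++];
    else              B[k++] = A[j++];
  }
  while (i < mid) B[k++] = A[i++];
  while (j < hi)  B[k++] = A[j++];
  for (k = 0; k < n; k++) A[lo + k] = B[k];
  free(B);
}

/* Sorts the segment A[lo, hi). */
void merge_sort(int *A, int lo, int hi) {
  assert(0 <= lo && lo <= hi);
  if (hi - lo <= 1) return;        /* base case: 0 or 1 elements are sorted */
  int mid = lo + (hi - lo) / 2;
  assert(lo < mid && mid < hi);    /* both halves are strictly smaller */
  merge_sort(A, lo, mid);
  merge_sort(A, mid, hi);
  merge(A, lo, mid, hi);
}
```

The strict inequalities lo < mid and mid < hi hold only because the base case has ruled out segments shorter than two elements; they are what guarantees that each recursive call works on a smaller segment and the recursion terminates.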

slide-24
SLIDE 24

Complexity

[Diagram: recursion tree read from the bottom up. n arrays of size 1 are merged into n/2 arrays of size 2, and so on, until 2 arrays of size n/2 are merged into 1 array of size n.]

void mergesort(int[] A, int lo, int hi)
//@requires 0 <= lo && lo <= hi && hi <= \length(A);
//@ensures is_sorted(A, lo, hi);
{
  if (hi - lo <= 1) return;
  int mid = lo + (hi - lo)/2;
  //@assert lo < mid && mid < hi;
  mergesort(A, lo, mid);  //@assert is_sorted(A, lo, mid);
  mergesort(A, mid, hi);  //@assert is_sorted(A, mid, hi);
  merge(A, lo, mid, hi);
}

slide-25
SLIDE 25

How many levels are there?

[Diagram: recursion tree with levels of n arrays of size 1, n/2 arrays of size 2, 2 arrays of size n/2, and 1 array of size n. Each level is annotated with O(n), the total cost of the merges at that level.]

void mergesort(int[] A, int lo, int hi)
//@requires 0 <= lo && lo <= hi && hi <= \length(A);
//@ensures is_sorted(A, lo, hi);
{
  if (hi - lo <= 1) return;
  int mid = lo + (hi - lo)/2;
  //@assert lo < mid && mid < hi;
  mergesort(A, lo, mid);  //@assert is_sorted(A, lo, mid);
  mergesort(A, mid, hi);  //@assert is_sorted(A, mid, hi);
  merge(A, lo, mid, hi);
}

slide-26
SLIDE 26

How many levels are there?

[Diagram: recursion tree with levels of n arrays of size 1, n/2 arrays of size 2, 2 arrays of size n/2, and 1 array of size n.]

The tree has O(log n) levels and each level costs O(n), so the total cost is O(n log n).

void mergesort(int[] A, int lo, int hi)
//@requires 0 <= lo && lo <= hi && hi <= \length(A);
//@ensures is_sorted(A, lo, hi);
{
  if (hi - lo <= 1) return;
  int mid = lo + (hi - lo)/2;
  //@assert lo < mid && mid < hi;
  mergesort(A, lo, mid);  //@assert is_sorted(A, lo, mid);
  mergesort(A, mid, hi);  //@assert is_sorted(A, mid, hi);
  merge(A, lo, mid, hi);
}
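The level count and the total merge cost can be computed directly from the shape of the recursion. The sketch below, in C, assumes that merging a segment of n elements costs exactly n steps; `levels` and `merge_work` are hypothetical helper names that mirror how mergesort splits a segment into n/2 and n - n/2 elements.

```c
#include <assert.h>

/* Number of levels of splitting before a segment of n elements
   is reduced to segments of size at most 1. */
int levels(int n) {
  assert(n >= 0);
  if (n <= 1) return 0;
  return 1 + levels(n - n / 2);   /* the larger half determines the depth */
}

/* Total number of elements handled by all merge calls when
   mergesort runs on a segment of n elements. */
long merge_work(int n) {
  assert(n >= 0);
  if (n <= 1) return 0;           /* base case: nothing to merge */
  return merge_work(n / 2) + merge_work(n - n / 2) + n;
}
```

For n = 1024 this gives 10 levels and 10 * 1024 units of merge work, which is n log n for a power of two.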

slide-27
SLIDE 27

Recall

Suppose that Google sorts 10^9 pages, and examining each page takes 10^-9 seconds. How long does it take to sort all the pages using selection sort?

Algorithm        Complexity
Linear search    O(n)
Selection sort   O(n^2)
Binary search    O(log n)

(10^9)^2 * 10^-9 = 10^9 seconds, which is more than 30 years

slide-28
SLIDE 28

In Practice …

Suppose that Google sorts 10^9 pages, and examining each page takes 10^-9 seconds. How long does it take to sort all the pages using an O(n log n) sorting algorithm?

Algorithm        Complexity
Linear search    O(n)
Selection sort   O(n^2)
Binary search    O(log n)
Merge sort       O(n log n)

10^9 * log(10^9) * 10^-9 = log(10^9) seconds, which is about 30 seconds (taking the logarithm base 2)
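Side by side, the two estimates work out as follows. The calculation assumes a base-2 logarithm, a constant factor of 1 in both running times, and a year of 365.25 days; none of these are stated on the slides.

```latex
\begin{align*}
T_{\text{selection}} &\approx (10^{9})^{2} \times 10^{-9}\,\text{s}
   = 10^{9}\,\text{s} \approx 31.7\ \text{years} \\
T_{\text{mergesort}} &\approx 10^{9} \times \log_2(10^{9}) \times 10^{-9}\,\text{s}
   = \log_2(10^{9})\,\text{s} \approx 29.9\,\text{s}
\end{align*}
```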

slide-29
SLIDE 29

Quicksort

slide-30
SLIDE 30

Abstract view like merge sort

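[Diagram: three stages of the segment A[lo, hi).]

  • Start: an unsorted segment from lo to hi containing the pivot x.
  • partition: x is moved to index p, with the smaller elements to its left and the larger elements to its right, so that A[p] ≥ A[lo,p) and A[p] ≤ A[p+1,hi).
  • sort parts: the part left of p and the part right of p are each sorted, with x at index p between them.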

slide-31
SLIDE 31

Example

slide-32
SLIDE 32

int partition(int[] A, int lo, int hi)
//@requires 0 <= lo && lo < hi && hi <= \length(A);
//@ensures lo <= \result && \result < hi;
//@ensures ge_seg(A[\result], A, lo, \result);
//@ensures le_seg(A[\result], A, \result+1, hi);
;

void quicksort(int[] A, int lo, int hi)
//@requires 0 <= lo && lo <= hi && hi <= \length(A);
//@ensures is_sorted(A, lo, hi);

slide-33
SLIDE 33

int partition(int[] A, int lo, int hi)
//@requires 0 <= lo && lo < hi && hi <= \length(A);
//@ensures lo <= \result && \result < hi;
//@ensures ge_seg(A[\result], A, lo, \result);
//@ensures le_seg(A[\result], A, \result+1, hi);
;

void quicksort(int[] A, int lo, int hi)
//@requires 0 <= lo && lo <= hi && hi <= \length(A);
//@ensures is_sorted(A, lo, hi);
{
  if (hi - lo <= 1) return;
}

slide-34
SLIDE 34

int partition(int[] A, int lo, int hi)
//@requires 0 <= lo && lo < hi && hi <= \length(A);
//@ensures lo <= \result && \result < hi;
//@ensures ge_seg(A[\result], A, lo, \result);
//@ensures le_seg(A[\result], A, \result+1, hi);
;

void quicksort(int[] A, int lo, int hi)
//@requires 0 <= lo && lo <= hi && hi <= \length(A);
//@ensures is_sorted(A, lo, hi);
{
  if (hi - lo <= 1) return;
  int p = ________________;
  //@assert lo <= p && p < hi;
  //@assert ge_seg(A[p], A, lo, p) && le_seg(A[p], A, p+1, hi);
  _____________________;  //@assert is_sorted(A, lo, p);
  _____________________;  //@assert is_sorted(A, p+1, hi);
  //@assert is_sorted(A, lo, hi);
}

slide-35
SLIDE 35

int partition(int[] A, int lo, int hi)
//@requires 0 <= lo && lo < hi && hi <= \length(A);
//@ensures lo <= \result && \result < hi;
//@ensures ge_seg(A[\result], A, lo, \result);
//@ensures le_seg(A[\result], A, \result+1, hi);
;

void quicksort(int[] A, int lo, int hi)
//@requires 0 <= lo && lo <= hi && hi <= \length(A);
//@ensures is_sorted(A, lo, hi);
{
  if (hi - lo <= 1) return;
  int p = partition(A, lo, hi);
  //@assert lo <= p && p < hi;
  //@assert ge_seg(A[p], A, lo, p) && le_seg(A[p], A, p+1, hi);
  quicksort(A, lo, p);    //@assert is_sorted(A, lo, p);
  quicksort(A, p+1, hi);  //@assert is_sorted(A, p+1, hi);
  //@assert is_sorted(A, lo, hi);
}

slide-36
SLIDE 36

Correctness of quicksort

int partition(int[] A, int lo, int hi)
//@requires 0 <= lo && lo < hi && hi <= \length(A);
//@ensures lo <= \result && \result < hi;
//@ensures ge_seg(A[\result], A, lo, \result);
//@ensures le_seg(A[\result], A, \result+1, hi);
;

void quicksort(int[] A, int lo, int hi)
//@requires 0 <= lo && lo <= hi && hi <= \length(A);
//@ensures is_sorted(A, lo, hi);
{
  if (hi - lo <= 1) return;
  int p = partition(A, lo, hi);
  //@assert lo <= p && p < hi;
  //@assert ge_seg(A[p], A, lo, p) && le_seg(A[p], A, p+1, hi);
  quicksort(A, lo, p);    //@assert is_sorted(A, lo, p);
  quicksort(A, p+1, hi);  //@assert is_sorted(A, p+1, hi);
  //@assert is_sorted(A, lo, hi);
}

  • A[p] ≥ A[lo,p) (postcondition of partition)
  • A[p] ≤ A[p+1,hi) (postcondition of partition)
  • A[lo,p) is sorted (postcondition of the first recursive call)
  • A[p+1,hi) is sorted (postcondition of the second recursive call)
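The slides specify partition only by its contract. A runnable C sketch that meets that contract is given below; it assumes the middle element as pivot and a single left-to-right pass, both of which are illustrative choices rather than the course's implementation. The C0 contracts are approximated with `assert`, and `swap` is an assumed helper.

```c
#include <assert.h>

/* Exchanges A[i] and A[j]. */
void swap(int *A, int i, int j) {
  int tmp = A[i];
  A[i] = A[j];
  A[j] = tmp;
}

/* Rearranges A[lo, hi) around a pivot and returns the pivot's final
   index p, so that every element of A[lo, p) is <= A[p] and every
   element of A[p+1, hi) is >= A[p]. */
int partition(int *A, int lo, int hi) {
  assert(0 <= lo && lo < hi);
  int mid = lo + (hi - lo) / 2;   /* assumed pivot choice: middle element */
  swap(A, mid, hi - 1);           /* park the pivot at the end */
  int pivot = A[hi - 1];
  int p = lo;                     /* A[lo, p) <= pivot and A[p, i) > pivot */
  for (int i = lo; i < hi - 1; i++) {
    if (A[i] <= pivot) {
      swap(A, i, p);
      p++;
    }
  }
  swap(A, p, hi - 1);             /* move the pivot between the two parts */
  assert(lo <= p && p < hi);
  return p;
}

/* Sorts the segment A[lo, hi). */
void quicksort(int *A, int lo, int hi) {
  assert(0 <= lo && lo <= hi);
  if (hi - lo <= 1) return;
  int p = partition(A, lo, hi);
  quicksort(A, lo, p);            /* A[p] is already in its final position */
  quicksort(A, p + 1, hi);
}
```

Because A[p] is excluded from both recursive calls, each call works on a strictly smaller segment, and the four facts listed above combine to show that A[lo, hi) is sorted when the calls return.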
slide-37
SLIDE 37

Choice of midpoint

  • Best: partition always chooses the median as pivot
      • Cost: O(n log n)
      • Impractical
  • Worst: partition always returns the index of the minimal element
      • Degenerates into selection sort
      • O(n^2)
  • In practice
      • Always return a fixed index, or a random index, …
      • Small probability that we hit the worst case O(n^2)
      • Average case O(n log n)
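The worst case can be observed by counting comparisons. The sketch below, in C, assumes a partition that always uses the fixed index lo as pivot; on input that is already sorted, that index always holds the minimal element. The global counter and the names `partition_first` and `quicksort_first` are hypothetical and exist only for this experiment.

```c
#include <assert.h>

long comparisons = 0;   /* hypothetical counter, reset by the caller */

/* Exchanges A[i] and A[j]. */
void swap_elems(int *A, int i, int j) {
  int tmp = A[i];
  A[i] = A[j];
  A[j] = tmp;
}

/* Partitions A[lo, hi) using the fixed index lo as pivot. */
int partition_first(int *A, int lo, int hi) {
  assert(0 <= lo && lo < hi);
  int pivot = A[lo];
  int p = lo;   /* A[lo+1, p+1) < pivot and A[p+1, i) >= pivot */
  for (int i = lo + 1; i < hi; i++) {
    comparisons++;
    if (A[i] < pivot) {
      p++;
      swap_elems(A, p, i);
    }
  }
  swap_elems(A, lo, p);   /* put the pivot between the two parts */
  return p;
}

/* Quicksort on A[lo, hi) built on partition_first. */
void quicksort_first(int *A, int lo, int hi) {
  assert(0 <= lo && lo <= hi);
  if (hi - lo <= 1) return;
  int p = partition_first(A, lo, hi);
  quicksort_first(A, lo, p);
  quicksort_first(A, p + 1, hi);
}
```

On a sorted array of 100 elements every partition returns lo, the left part is empty, and the right part shrinks by only one element per call, so the total is 99 + 98 + … + 1 = 4950 comparisons, the same n(n-1)/2 that selection sort performs.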
slide-38
SLIDE 38

Comparing sorting algorithms

                 selection sort   merge sort    quicksort
worst-case       O(n^2)           O(n log n)    O(n^2)
average-case     O(n^2)           O(n log n)    O(n log n)
in-place?        Yes              No            Yes

slide-39
SLIDE 39

Bonus slides

slide-42
SLIDE 42

Comparing sorting algorithms

                 selection sort   merge sort    quicksort
worst-case       O(n^2)           O(n log n)    O(n^2)
average-case     O(n^2)           O(n log n)    O(n log n)
in-place?        Yes              No            Yes
stable?          No               Yes           No

slide-43
SLIDE 43

Practical consequences

Credit: Algorithm Design, Tardos, Kleinberg (table), UC Berkeley CS 61B, Hug (slide)

slide-44
SLIDE 44

Partitioning

In-place
