Quicksort

9/27/07 CS 3343 Analysis of Algorithms 1

CS 3343 -- Fall 2007

Quicksort

Carola Wenk

Slides courtesy of Charles Leiserson with small changes by Carola Wenk


Quicksort

  • Proposed by C.A.R. Hoare in 1962.
  • Divide-and-conquer algorithm.
  • Sorts “in place” (like insertion sort, but not like merge sort).

  • Very practical (with tuning).


Divide and conquer

Quicksort an n-element array:

  • 1. Divide: Partition the array into two subarrays around a pivot x such that elements in lower subarray ≤ x ≤ elements in upper subarray.
  • 2. Conquer: Recursively sort the two subarrays.
  • 3. Combine: Trivial.

[Figure: partitioned array, [ ≤ x | x | ≥ x ]]

Key: Linear-time partitioning subroutine.


Partitioning subroutine

PARTITION(A, p, q)    // A[p . . q]
    x ← A[p]          // pivot = A[p]
    i ← p
    for j ← p + 1 to q
        do if A[j] ≤ x
            then i ← i + 1
                 exchange A[i] ↔ A[j]
    exchange A[p] ↔ A[i]
    return i

Running time = O(n) for n elements.

Invariant: [ x | ≤ x | ≥ x | ? ], where p marks the pivot x, i the last element of the ≤ x region, j the first unexamined element, and q the end of the array.
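The PARTITION pseudocode can be sketched in Python as follows. This is a minimal sketch, not the course's reference code: it assumes 0-based indexing (the slides use A[p . . q] with 1-based positions), and the function name `partition` is chosen here for illustration.

```python
def partition(a, p, q):
    """Partition a[p..q] (inclusive) around the pivot a[p].

    Returns the final index of the pivot. Assumes 0-based indices.
    """
    x = a[p]                  # pivot
    i = p                     # a[p+1..i] holds the elements <= x seen so far
    for j in range(p + 1, q + 1):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[p], a[i] = a[i], a[p]   # place the pivot between the two regions
    return i


if __name__ == "__main__":
    data = [6, 10, 13, 5, 8, 3, 2, 11]   # array used in the partitioning example
    pivot_index = partition(data, 0, len(data) - 1)
    print(pivot_index, data)
```

The loop touches each element of a[p+1..q] exactly once, which is where the linear running time comes from.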

Example of partitioning

Pivot x = 6. Array contents after each exchange:

    6 10 13  5  8  3  2 11    initial array
    6  5 13 10  8  3  2 11    5 ≤ 6: i advances, exchange 10 ↔ 5
    6  5  3 10  8 13  2 11    3 ≤ 6: i advances, exchange 13 ↔ 3
    6  5  3  2  8 13 10 11    2 ≤ 6: i advances, exchange 10 ↔ 2
    2  5  3  6  8 13 10 11    final exchange A[p] ↔ A[i]; the pivot 6 is now at position i


Pseudocode for quicksort

QUICKSORT(A, p, r)
    if p < r
        then q ← PARTITION(A, p, r)
             QUICKSORT(A, p, q–1)
             QUICKSORT(A, q+1, r)

Initial call: QUICKSORT(A, 1, n)
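A possible Python rendering of QUICKSORT is sketched below. It assumes 0-based indexing, so the initial call covers a[0..n-1] rather than A[1..n]; the names `partition` and `quicksort` and the default arguments are illustrative choices, not part of the slides.

```python
def partition(a, p, r):
    """Partition a[p..r] around a[p]; return the pivot's final index."""
    x = a[p]
    i = p
    for j in range(p + 1, r + 1):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[p], a[i] = a[i], a[p]
    return i


def quicksort(a, p=0, r=None):
    """Sort a[p..r] in place (the whole list by default)."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = partition(a, p, r)
        quicksort(a, p, q - 1)    # elements <= pivot
        quicksort(a, q + 1, r)    # elements >= pivot


if __name__ == "__main__":
    data = [6, 10, 13, 5, 8, 3, 2, 11]
    quicksort(data)
    print(data)
```

Because the recursion depth can reach n on unlucky inputs, this recursive form is only suitable for modest input sizes in Python.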


Analysis of quicksort

  • Assume all input elements are distinct.
  • In practice, there are better partitioning algorithms for when duplicate input elements may exist.
  • Let T(n) = worst-case running time on an array of n elements.


Worst-case of quicksort

  • Input sorted or reverse sorted.
  • Partition around min or max element.
  • One side of partition always has no elements.

\[
\begin{aligned}
T(n) &= T(0) + T(n-1) + \Theta(n) \\
     &= \Theta(1) + T(n-1) + \Theta(n) \\
     &= T(n-1) + \Theta(n) \\
     &= \Theta(n^2) \qquad \text{(arithmetic series)}
\end{aligned}
\]

Worst-case recursion tree

T(n) = T(0) + T(n–1) + cn

[Figure: recursion tree. The root costs cn; every internal node has a left child T(0) = Θ(1) and a right child, with the right spine costing c(n–1), c(n–2), … down to Θ(1). Height = n.]

The costs along the spine sum to

\[
\Theta\left(\sum_{k=1}^{n} k\right) = \Theta(n^2),
\]

and the n leaves of cost Θ(1) contribute Θ(n) in total:

T(n) = Θ(n) + Θ(n²) = Θ(n²)
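The quadratic worst case can be observed by counting comparisons. The sketch below is illustrative only: `count_comparisons` is a hypothetical helper that runs the first-element-pivot quicksort on a copy of the input, with an explicit stack instead of recursion so that sorted inputs do not hit Python's recursion limit.

```python
import random


def count_comparisons(a):
    """Quicksort a copy of `a` (pivot = first element of each subarray) and
    return the number of element comparisons made while partitioning."""
    a = list(a)
    comparisons = 0
    stack = [(0, len(a) - 1)]          # subarrays still to be sorted
    while stack:
        p, r = stack.pop()
        if p >= r:
            continue
        x, i = a[p], p
        for j in range(p + 1, r + 1):
            comparisons += 1
            if a[j] <= x:
                i += 1
                a[i], a[j] = a[j], a[i]
        a[p], a[i] = a[i], a[p]
        stack.append((p, i - 1))
        stack.append((i + 1, r))
    return comparisons


if __name__ == "__main__":
    n = 200
    shuffled = list(range(n))
    random.shuffle(shuffled)
    print("sorted:  ", count_comparisons(range(n)))
    print("reversed:", count_comparisons(range(n, 0, -1)))
    print("shuffled:", count_comparisons(shuffled))
```

On sorted or reverse-sorted input every partition leaves one side empty, so the count is (n–1) + (n–2) + … + 1 = n(n–1)/2, matching the arithmetic series above.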


Best-case analysis

(For intuition only!)

If we’re lucky, PARTITION splits the array evenly:

T(n) = 2T(n/2) + Θ(n) = Θ(n log n)    (same as merge sort)

What if the split is always 1/10 : 9/10?

\[
T(n) = T\left(\tfrac{1}{10}n\right) + T\left(\tfrac{9}{10}n\right) + \Theta(n)
\]

What is the solution to this recurrence?

Analysis of “almost-best” case

[Figure: recursion tree for T(n) = T(n/10) + T(9n/10) + cn. The root costs cn; its children cost cn/10 and 9cn/10; the next level costs cn/100, 9cn/100, 9cn/100 and 81cn/100. Each level sums to at most cn. The shallowest leaf is at depth log_10 n, the deepest at depth log_{10/9} n, and there are O(n) leaves of cost Θ(1).]

\[
cn\log_{10} n \;\le\; T(n) \;\le\; cn\log_{10/9} n + O(n)
\]

Hence T(n) = Θ(n log n).
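The two bounds can be checked numerically by evaluating the 1/10 : 9/10 recurrence directly. This is a rough sketch under stated assumptions: the cost per level is taken as exactly n (c = 1), T(0) = T(1) = 0, and subproblem sizes are rounded to integers so that both parts are non-empty; the function name `t` is arbitrary.

```python
import math
from functools import lru_cache


@lru_cache(maxsize=None)
def t(n):
    """T(n) = T(n/10) + T(9n/10) + n with T(0) = T(1) = 0 (integer sizes)."""
    if n <= 1:
        return 0
    small = max(1, n // 10)       # assumption: keep both parts non-empty
    return t(small) + t(n - small) + n


if __name__ == "__main__":
    for n in (10**3, 10**4, 10**5):
        lower = n * math.log(n, 10)        # n log_10 n
        upper = n * math.log(n, 10 / 9)    # n log_{10/9} n
        print(n, round(lower), t(n), round(upper))
```

The printed values of T(n) sit between the two columns, which is the sandwich that gives Θ(n log n).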


Quicksort Runtimes

  • Best case runtime T_best(n) ∈ O(n log n)
  • Worst case runtime T_worst(n) ∈ O(n²)
  • Worse than mergesort? Why is it called quicksort then?
  • Its average runtime T_avg(n) ∈ O(n log n)
  • Better even, the expected runtime of randomized quicksort is O(n log n)


Average Runtime

The average runtime T_avg(n) for Quicksort is the average runtime over all possible inputs of length n.

  • What kind of inputs are there?
      • Do [1,2,…,n] and [5,6,…,n+4] cause different runtimes of Quicksort?
      • No. Therefore only consider all permutations of [1,2,…,n].
  • How many inputs are there?
      • There are n! different permutations of [1,2,…,n].


Average Runtime

  • Therefore, T_avg(n) has to average the runtimes over all n! different input permutations.
  • Disadvantage of considering average runtime:
      • There are still worst-case inputs that will have an O(n²) runtime.
      • Are all inputs really equally likely? That depends on the application.

⇒ Better: Use randomized quicksort
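For very small n the average over all n! permutations can be computed by brute force. The sketch below is only an illustration: it measures runtime as the number of comparisons made by the first-element-pivot quicksort, and the helper names `comparisons` and `average_and_worst` are made up for this example.

```python
from fractions import Fraction
from itertools import permutations


def comparisons(a):
    """Comparisons made by quicksort (pivot = first element) on a copy of `a`."""
    a = list(a)

    def sort(p, r):
        if p >= r:
            return 0
        x, i = a[p], p
        for j in range(p + 1, r + 1):
            if a[j] <= x:
                i += 1
                a[i], a[j] = a[j], a[i]
        a[p], a[i] = a[i], a[p]
        # partitioning a[p..r] costs r - p comparisons
        return (r - p) + sort(p, i - 1) + sort(i + 1, r)

    return sort(0, len(a) - 1)


def average_and_worst(n):
    """Exact average and maximum comparison count over all permutations of 1..n."""
    counts = [comparisons(perm) for perm in permutations(range(1, n + 1))]
    return Fraction(sum(counts), len(counts)), max(counts)


if __name__ == "__main__":
    for n in range(1, 8):
        avg, worst = average_and_worst(n)
        print(n, float(avg), worst)
```

The average grows much more slowly than the worst case, but the worst-case permutations are still among the inputs, which is the disadvantage noted above.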


Randomized quicksort

IDEA: Partition around a random element.

  • Running time is independent of the input order.
  • No assumptions need to be made about the input distribution.
  • No specific input elicits the worst-case behavior.
  • The worst case is determined only by the output of a random-number generator.
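One common way to realize this idea is to swap a uniformly chosen element into the pivot position and then partition as before. The following is a minimal sketch along those lines, assuming 0-based indexing and Python's standard `random.randint`; the function name `randomized_quicksort` is illustrative.

```python
import random


def randomized_quicksort(a, p=0, r=None):
    """Sort a[p..r] in place, choosing each pivot uniformly at random."""
    if r is None:
        r = len(a) - 1
    if p >= r:
        return
    k = random.randint(p, r)       # random pivot position, both ends inclusive
    a[p], a[k] = a[k], a[p]        # move it to the front, then partition as usual
    x, i = a[p], p
    for j in range(p + 1, r + 1):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[p], a[i] = a[i], a[p]
    randomized_quicksort(a, p, i - 1)
    randomized_quicksort(a, i + 1, r)


if __name__ == "__main__":
    data = list(range(300))        # already sorted: bad for a fixed first-element pivot
    randomized_quicksort(data)
    print(data[:10])
```

With a random pivot, a sorted input is no longer special: the split sizes depend only on the random choices, not on the input order.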


Randomized quicksort analysis

  • T(n) = random variable for the running time of randomized quicksort on an input of size n
  • E(T(n)) = expected value of T(n), the “average runtime” of randomized quicksort

\[
T(n) =
\begin{cases}
T(0) + T(n-1) + dn & \text{if } 0 : n-1 \text{ split},\\
T(1) + T(n-2) + dn & \text{if } 1 : n-2 \text{ split},\\
\quad\vdots \\
T(n-1) + T(0) + dn & \text{if } n-1 : 0 \text{ split}.
\end{cases}
\]

Randomized quicksort analysis

Assume that each split is equally likely, with 1/n probability.

⇒ The expected runtime (the “average runtime”) is

\[
E(T(n)) = \frac{1}{n}\sum_{k=0}^{n-1}\Bigl(E(T(k)) + E(T(n-k-1)) + dn\Bigr)
        = \frac{2}{n}\sum_{k=0}^{n-1} E(T(k)) + dn
\]

Hairy recurrence

\[
E[T(n)] = \frac{2}{n}\sum_{k=2}^{n-1} E[T(k)] + dn
\]

(Assume base cases E(T(0)) = E(T(1)) = 0.)

Prove: E[T(n)] ≤ cn log n for constant c > 0.

  • Base case: Choose c large enough so that cn log n dominates E[T(n)] for sufficiently small n ≥ n0 = 2.

Use fact:

\[
\sum_{k=2}^{n-1} k\log k \;\le\; \frac{1}{2}n^2\log n - \frac{1}{8}n^2 \qquad \text{(exercise)}.
\]
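The fact used on this slide, that the sum of k log k for k = 2, …, n–1 is at most ½n² log n – ⅛n², is left as an exercise; a quick numerical sanity check is sketched below. It is not a proof, and it assumes the logarithm is base 2; the helper names `lhs` and `rhs` are invented for this example.

```python
import math


def lhs(n):
    """Sum of k * log2(k) for k = 2 .. n-1."""
    return sum(k * math.log2(k) for k in range(2, n))


def rhs(n):
    """(1/2) n^2 log2(n) - (1/8) n^2."""
    return 0.5 * n * n * math.log2(n) - n * n / 8


if __name__ == "__main__":
    for n in (2, 4, 16, 256, 1024):
        print(n, round(lhs(n), 2), round(rhs(n), 2), lhs(n) <= rhs(n))
```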

Inductive step

\[
\begin{aligned}
E[T(n)] &\le \frac{2}{n}\sum_{k=2}^{n-1} ck\log k + dn
        && \text{(substitute inductive hypothesis)} \\
        &\le \frac{2c}{n}\left(\frac{1}{2}n^2\log n - \frac{1}{8}n^2\right) + dn
        && \text{(use fact)} \\
        &= cn\log n - \left(\frac{cn}{4} - dn\right)
        && \text{(express as desired minus residual)} \\
        &\le cn\log n,
\end{aligned}
\]

if c ≥ 4d.
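The bound can also be checked numerically by iterating the recurrence E[T(n)] = (2/n) Σ E[T(k)] + dn (sum over k = 2, …, n–1) with E[T(0)] = E[T(1)] = 0. The sketch below is illustrative: it assumes d = 1 and base-2 logarithms, picks c = 4d as in the condition above, and the function name `expected_runtime` is made up.

```python
import math


def expected_runtime(n_max, d=1.0):
    """Return a list e with e[0] = e[1] = 0 and, for n >= 2,
    e[n] = (2/n) * (e[2] + ... + e[n-1]) + d*n."""
    e = [0.0, 0.0]
    prefix = 0.0                      # running sum e[2] + ... + e[n-1]
    for n in range(2, n_max + 1):
        e.append(2.0 * prefix / n + d * n)
        prefix += e[n]
    return e


if __name__ == "__main__":
    d = 1.0
    c = 4 * d
    e = expected_runtime(10_000, d)
    for n in (2, 10, 100, 1000, 10_000):
        print(n, round(e[n], 1), round(c * n * math.log2(n), 1))
```

In this experiment the computed values stay well below cn log n, which suggests that the constant c = 4d from the proof is far from tight.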


Quicksort in practice

  • Quicksort is a great general-purpose sorting algorithm.
  • Quicksort is typically over twice as fast as merge sort.
  • Quicksort can benefit substantially from code tuning.
  • Quicksort behaves well even with caching and virtual memory.


Quicksort runtimes

  • Best case: n log n
  • Worst case: n²
  • Expected runtime for randomized quicksort: n log n