Symbol Comparisons in QuickSort & QuickSelect: I. Overview



SLIDE 1

The Number of Symbol Comparisons in QuickSort & QuickSelect

  • I. Overview ~~~ Philippe Flajolet
  • II. Average-Case Analysis ~~~ Brigitte Vallée
  • III. Distributions ~~~ Jim Fill
Wednesday, June 17, 2009
SLIDE 2
  • 1. Algorithms & analysis
  • 2. Cost measures
  • 3. Sources (data model)
  • 4. Results: average-case & distributional
SLIDE 3

1. QuickSort & QuickSelect

SLIDE 4

[Figure: QuickSort partitioning. The pivot P, of rank k, splits the n keys into k-1 keys < P and n-k keys > P.]

SLIDE 5

Analyses of QuickSort

  • Average-case: recurrences, then generating functions (GFs). Exchanges; median-of-3, etc.
  • Variance: multivariate GFs
  • Distribution: MGFs & moments, martingales, contraction

Hoare; Knuth; Sedgewick [1960-1975]
Hennequin, Régnier, Rösler [1989+]
Fill & Janson [2000], Martinez...

SLIDE 6

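[Figure: QuickSelect. After partitioning around the pivot P, of rank k, into keys < P and keys > P, the sought rank m is compared with k: m < k? m > k? m = k?]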
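The selection scheme pictured above can be sketched as follows. This is a minimal illustration, assuming distinct keys, the first key as pivot, and 1-based ranks; the function name `quickselect` and these conventions are choices made for the sketch, not taken from the slides.

```python
def quickselect(keys, m):
    """Return the key of rank m (1-based) among distinct keys.

    Assumption for this sketch: the pivot is simply the first key.
    """
    keys = list(keys)
    if not 1 <= m <= len(keys):
        raise ValueError("rank m must lie between 1 and len(keys)")
    while True:
        pivot, rest = keys[0], keys[1:]
        smaller = [x for x in rest if x < pivot]
        larger = [x for x in rest if x > pivot]
        k = len(smaller) + 1          # rank of the pivot in the current list
        if m == k:                    # m = k: the pivot is the answer
            return pivot
        if m < k:                     # m < k: continue among the keys < P
            keys = smaller
        else:                         # m > k: continue among the keys > P
            keys = larger
            m -= k                    # ranks shift by the k discarded keys
```

For instance, `quickselect([5, 2, 9, 7, 1], 4)` walks through the pivots 5, 9 and 7 before returning 7.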

SLIDE 7

Various brands of QuickSelect:

SLIDE 8

Average-case analyses: Knuth et al. [ca. 1970]

SLIDE 9

Distributional analyses

  • Quickselect: e.g., Dickman distribution (fixed rank; fixed quantile)

Mahmoud-Modarres-Smythe, Grübel, Rösler, Hwang-Tsai, et al.
Perpetuities: 1 + U1 + U1·U2 + U1·U2·U3 + ..., with U1, U2, ... i.i.d. uniform on [0,1]

  • Multiple Quickselect, ancestors, &c

Lent-Mahmoud, Prodinger, et al.
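The perpetuity 1 + U1 + U1·U2 + ... quoted for Quickselect can be explored numerically. The sketch below is only a Monte Carlo illustration: the truncation threshold `tol`, the sample size and the function names are assumptions of the sketch. Since each product U1···Uk has expectation 2^-k, the mean of the sum is 2, which the simulation should approximate.

```python
import random

def perpetuity_sample(rng, tol=1e-12):
    """One (truncated) draw of 1 + U1 + U1*U2 + U1*U2*U3 + ...

    The series is cut once the running product drops below `tol`
    (an assumption of this sketch; the neglected tail is tiny).
    """
    total, product = 1.0, 1.0
    while product > tol:
        product *= rng.random()   # multiply by a fresh uniform on [0, 1)
        total += product
    return total

def perpetuity_mean(samples=100_000, seed=0):
    """Empirical mean over `samples` independent draws."""
    rng = random.Random(seed)
    return sum(perpetuity_sample(rng) for _ in range(samples)) / samples
```

Calling `perpetuity_mean()` gives an empirical mean close to the exact value 2; a histogram of `perpetuity_sample` draws would show the shape of the limit law.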

SLIDE 10
  • 2. Cost measures

SLIDE 11

Sedgewick @ AofA-02(?): “actual complexity matters!”

  • So far: number of key-comparisons
  • But... keys are often “non-atomic” records!
  • And... need a common information-theoretic basis, to compare with radix methods, hashing, etc.

SLIDE 12
  • Count all symbol comparisons in algorithms:
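  • comparing u and v has cost 1 + coincidence(u,v), where coincidence(u,v) is the length of the longest common prefix of u and v.

Alphabet: Σ

u = a b a b b b...
v = a b a a b a...

coincidence = 3; #comparisons = 4.

[Figure labels: (γ), (β)]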
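The cost measure above can be written out directly. This is a minimal sketch on finite strings, assuming the two words differ somewhere within the given prefixes (the slides work with infinite words); the names `coincidence` and `symbol_cost` are chosen for the sketch.

```python
def coincidence(u, v):
    """Length of the longest common prefix of u and v."""
    length = 0
    for a, b in zip(u, v):
        if a != b:
            break
        length += 1
    return length

def symbol_cost(u, v):
    """Symbol comparisons needed to compare u and v: 1 + coincidence(u, v).

    Assumes u and v differ within the supplied prefixes.
    """
    return 1 + coincidence(u, v)
```

On the two words of the slide, `coincidence("ababbb", "abaaba")` is 3 and `symbol_cost("ababbb", "abaaba")` is 4.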

SLIDE 13

A Binary Search Tree: symbol comparisons

SLIDE 14

It takes O(n log n) symbol comparisons to “distinguish” n elements (in probability, on average).

With high probability, the common prefix of any two words has length at most O(log n).

Under a wide range of classical STRING (WORD) MODELS: many, many people in the audience...
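The claim about common prefixes can be checked empirically. The sketch below assumes a uniform binary source over the symbols a and b, truncated to 64 symbols; these choices and all function names are assumptions made for illustration. It relies on the fact that, once the words are sorted, the longest common prefix over all pairs is attained by two adjacent words.

```python
import random

def common_prefix_length(u, v):
    """Length of the longest common prefix of u and v."""
    length = 0
    for a, b in zip(u, v):
        if a != b:
            break
        length += 1
    return length

def longest_common_prefix(words):
    """Largest common-prefix length over all pairs of words.

    After sorting, the maximum is attained by two adjacent words,
    so only n - 1 pairs need to be examined.
    """
    ordered = sorted(words)
    return max(
        (common_prefix_length(u, v) for u, v in zip(ordered, ordered[1:])),
        default=0,
    )

def random_binary_words(n, length=64, seed=0):
    """n words over {a, b}, symbols drawn uniformly and independently."""
    rng = random.Random(seed)
    return ["".join(rng.choice("ab") for _ in range(length)) for _ in range(n)]
```

Printing `longest_common_prefix(random_binary_words(n))` for n = 100, 1000, 10000 shows the slow, logarithmic growth in n.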

SLIDE 15
  • Bernoulli, Markov, etc.
  • Devroye’s density model
  • Vallée’s dynamic sources...

TRIES

Upper bounds: Sn = O(Kn · log n)

  • Quicksort: O(n (log n)^2)
  • Quickselect: O(n log n)
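To make the two cost measures concrete, the sketch below instruments a simple QuickSort so that it counts both key comparisons and symbol comparisons. It is an illustration only: it assumes the first key as pivot, keys given as strings long enough to differ, and a plain dictionary `counter` with entries `"keys"` and `"symbols"`; none of these details come from the slides.

```python
def compare(u, v, counter):
    """Three-way lexicographic comparison that counts symbol comparisons."""
    for a, b in zip(u, v):
        counter["symbols"] += 1
        if a != b:
            return -1 if a < b else 1
    return 0  # no difference found on the common length

def quicksort(keys, counter):
    """Sort keys (first key as pivot), updating counter["keys"] and
    counter["symbols"] along the way."""
    if len(keys) <= 1:
        return list(keys)
    pivot, rest = keys[0], keys[1:]
    smaller, larger = [], []
    for key in rest:
        counter["keys"] += 1            # one key comparison ...
        if compare(key, pivot, counter) < 0:   # ... costing several symbols
            smaller.append(key)
        else:
            larger.append(key)
    return quicksort(smaller, counter) + [pivot] + quicksort(larger, counter)
```

With `counter = {"keys": 0, "symbols": 0}`, sorting `["ab", "aa", "b"]` uses 2 key comparisons and 3 symbol comparisons; on large random inputs the ratio of the two counters can be compared with the log n factor in the bound.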

SLIDE 16
Symbol comparisons

  • QuickSort: [Janson & Fill 2002] binary source + density model.
  • QuickSelect: [Fill-Nakama 2007-9] binary source for QuickMin/Max & QuickRand

(cf. also: Panholzer & Prodinger)

CONSTANTS? ~ C·n·(log n)^2; ~ C'·n

SLIDE 17
  • 3. Sources

“A source models the way data (symbols) are produced.”

“La Source” by Ingres @ Musée d’Orsay

SLIDE 18
Axioms for SOURCES

  • Totally ordered alphabet Σ (usually finite)
  • Fundamental probabilities (p_w): p_w := the probability of starting with w
  • p_w → 0 as |w| → ∞
  • Keys are invariably i.i.d.

[Later] + “regularity” conditions: tameness

SLIDE 19

Property: The source is parameterized by [0,1]: to an infinite word w, there corresponds α such that M(α) = w.

[Figure: the interval [0,1] subdivided according to prefixes: a, b; aa, ab, ba, bb; aba, abb; ...]
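For a memoryless (Bernoulli) source, the map M can be sketched by repeatedly subdividing [0,1] in proportion to the symbol probabilities. The code below is an illustration under that assumption; the function name `word_prefix`, the representation of the source as a list of (symbol, probability) pairs in alphabet order, and the use of exact fractions are all choices of the sketch.

```python
from fractions import Fraction

def word_prefix(alpha, probs, length):
    """First `length` symbols of M(alpha) for a memoryless source.

    probs: list of (symbol, probability) pairs, in alphabet order,
    with probabilities summing to 1 (assumed, not checked).
    """
    lo, width = Fraction(0), Fraction(1)   # current interval [lo, lo + width)
    symbols = []
    for _ in range(length):
        for i, (sym, p) in enumerate(probs):
            hi = lo + width * p
            # the last sub-interval is closed on the right, so alpha = 1 is covered
            if alpha < hi or i == len(probs) - 1:
                symbols.append(sym)
                width = hi - lo            # descend into the chosen sub-interval
                break
            lo = hi                        # otherwise move to the next sub-interval
    return "".join(symbols)
```

With the standard binary source, `word_prefix(Fraction(5, 8), [("a", Fraction(1, 2)), ("b", Fraction(1, 2))], 3)` gives `"bab"`, which is the binary expansion 0.101 of 5/8 read with a for 0 and b for 1.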

SLIDE 20

Notations:

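[Figure: the interval [0,1], with the fundamental interval [a_w, b_w] of the prefix w.]

  • p_w^- = Pr(prefix < w)
  • p_w = Pr(prefix = w)
  • p_w^+ = Pr(prefix > w)

Fundamental constants of QuickStuffs will all be expressed in terms of fundamental probabilities.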
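For a memoryless source these three probabilities can be computed by walking down the prefix w. The sketch below assumes such a source, given as (symbol, probability) pairs in alphabet order; the function name and the returned triple are conventions of the sketch. The three values always sum to 1.

```python
from fractions import Fraction

def fundamental_probabilities(w, probs):
    """Return (Pr(prefix < w), Pr(prefix = w), Pr(prefix > w)).

    probs: list of (symbol, probability) pairs in alphabet order,
    for a memoryless source (an assumption of this sketch).
    """
    order = [sym for sym, _ in probs]
    p = dict(probs)
    below, width = Fraction(0), Fraction(1)
    for sym in w:
        for smaller in order:
            if smaller == sym:
                break
            below += width * p[smaller]   # mass of words branching off below w
        width *= p[sym]                   # mass of words sharing the prefix so far
    return below, width, 1 - below - width
```

For the Bernoulli source (1/2, 1/6, 1/3) on symbols a < b < c and the prefix w = ba, the result is (1/2, 1/12, 5/12).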

SLIDE 21
  • Standard binary source (uniform: 1/2, 1/2); Bernoulli sources such as (1/2, 1/6, 1/3).
  • Density models: standard binary source with density f(x) or c.d.f. F(x). [Devroye 1986]
  • Markov
  • Dynamical sources [Vallée 2001; Clément-Flajolet-Vallée 2001]

SLIDE 22

Fundamental intervals & triangles

[Figure: fundamental intervals and triangles for the Bernoulli source (1/2, 1/6, 1/3).]

SLIDE 23
  • 4. Results

(Le Savant Cosinus)

SLIDE 24

Average-case

➜ QuickMin, QuickRand

☞ QuickVal

SLIDE 25

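QUICKVAL(α) is dual to QuickSelect

  • QuickVal(n,α) := rank of the element whose parameter [corresponding to value v] is α.
  • QuickVal(n,α) behaves “almost” as QuickSelect(nα).

[Figure: partitioning around the pivot P into keys < P and keys > P; the value v is compared with P (v < P, v = P, v > P).]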
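The duality can be sketched in code: instead of steering by a target rank, the search is steered by a target value v and returns its rank. This is an illustration only, assuming distinct keys, the first key as pivot and 1-based ranks; when v is not among the keys, this sketch returns the rank v would have if inserted, a convention chosen here and not stated in the slides.

```python
def quickval(keys, v):
    """Rank (1-based) of the value v among distinct keys.

    Same partitioning as QuickSelect, but each step compares v with the
    pivot P (v < P, v = P, v > P) instead of comparing ranks.
    """
    keys = list(keys)
    offset = 0                      # number of keys already known to be < v
    while keys:
        pivot, rest = keys[0], keys[1:]
        smaller = [x for x in rest if x < pivot]
        larger = [x for x in rest if x > pivot]
        if v == pivot:
            return offset + len(smaller) + 1
        if v < pivot:
            keys = smaller
        else:
            offset += len(smaller) + 1
            keys = larger
    return offset + 1               # v absent: rank it would have if inserted
```

For example, `quickval([0.5, 0.2, 0.9, 0.7, 0.1], 0.7)` returns 4, the same element that a rank-driven selection of rank 4 would find.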

SLIDE 26

Distribution

Theorem: Assuming a suitable tameness condition, there exists a limiting distribution of the cost Sn/n of QuickQuant(α), which can be described explicitly.
