CS 5633 Analysis of Algorithms 1 2/12/08
CS 5633 -- Spring 2008
Order Statistics
Carola Wenk
Slides courtesy of Charles Leiserson with small changes by Carola Wenk
Order statistics
Select the ith smallest of n elements (the element with rank i).
- i = 1: minimum;
- i = n: maximum;
- i = ⌊(n+1)/2⌋ or ⌈(n+1)/2⌉: median.
Naive algorithm: Sort and index ith element. Worst-case running time = Θ(n log n) + Θ(1) = Θ(n log n), using merge sort or heapsort (not quicksort).
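The naive algorithm can be sketched in a few lines of Python. The name `naive_select` and the 1-based rank convention are choices made for this illustration, not part of the slides; Python's built-in `sorted` has an O(n log n) worst case.

```python
def naive_select(a, i):
    """Return the i-th smallest element of a (1-based rank) by sorting."""
    if not 1 <= i <= len(a):
        raise ValueError("rank i must satisfy 1 <= i <= n")
    # O(n log n) sort followed by a constant-time index.
    return sorted(a)[i - 1]
```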
Randomized divide-and-conquer algorithm
RAND-SELECT(A, p, q, i)    ⊳ ith smallest of A[p . . q]
    if p = q then return A[p]
    r ← RAND-PARTITION(A, p, q)
    k ← r – p + 1    ⊳ k = rank(A[r])
    if i = k then return A[r]
    if i < k
        then return RAND-SELECT(A, p, r – 1, i)
        else return RAND-SELECT(A, r + 1, q, i – k)
[Figure: A[p . . q] after partitioning, with elements ≤ A[r] in A[p . . r – 1], elements ≥ A[r] in A[r + 1 . . q], and k = r – p + 1 positions in A[p . . r].]
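A minimal Python sketch of RAND-SELECT follows. It assumes 0-based array indices with a 1-based rank i, uses a Lomuto-style partition (the slides do not fix a partition scheme), and writes the tail recursion as a loop. The names `rand_partition` and `rand_select` are illustrative.

```python
import random

def rand_partition(a, p, q):
    """Partition a[p..q] in place around a random pivot; return its final index."""
    s = random.randint(p, q)          # random pivot position (bounds inclusive)
    a[p], a[s] = a[s], a[p]           # move the pivot to the front
    x = a[p]
    i = p
    for j in range(p + 1, q + 1):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[p], a[i] = a[i], a[p]           # pivot lands between the two sides
    return i

def rand_select(a, p, q, i):
    """Return the i-th smallest element (1-based) of a[p..q]; reorders a."""
    while True:
        if p == q:
            return a[p]
        r = rand_partition(a, p, q)
        k = r - p + 1                 # k = rank of a[r] within a[p..q]
        if i == k:
            return a[r]
        if i < k:
            q = r - 1                 # continue in the lower part
        else:
            p, i = r + 1, i - k       # continue in the upper part
```

The returned value does not depend on the random choices; only the running time does.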
Example
Select the i = 7th smallest:
6 10 13 5 8 3 2 11    (pivot = 6)
Partition:
2 5 3 6 8 13 10 11    (k = 4)
Select the 7 – 4 = 3rd smallest recursively.
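The partition step of this example can be reproduced with a short Python sketch. It assumes the pivot is the first element (the example fixes the pivot 6 rather than drawing it at random) and a Lomuto-style partition; the scheme is an assumption, although it yields the same arrangement as the example.

```python
def partition(a, p, q):
    """Partition a[p..q] in place around the pivot a[p]; return the pivot's final index."""
    x = a[p]
    i = p
    for j in range(p + 1, q + 1):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[p], a[i] = a[i], a[p]
    return i

a = [6, 10, 13, 5, 8, 3, 2, 11]
r = partition(a, 0, len(a) - 1)
k = r + 1                      # rank of the pivot, since p = 0 here
print(a, k)                    # [2, 5, 3, 6, 8, 13, 10, 11] 4
```

Since i = 7 > k = 4, the search continues in the upper part `a[r + 1:]` for the 3rd smallest element.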
Intuition for analysis
Lucky:
T(n) = T(9n/10) + Θ(n), where n^(log_{10/9} 1) = n^0 = 1, so CASE 3 of the master method gives T(n) = Θ(n).
Unlucky:
T(n) = T(n – 1) + Θ(n) = Θ(n²) (arithmetic series). Worse than sorting!
(All our analyses today assume that all elements are distinct.)
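Both cases can also be seen by unrolling the recurrences. The sketch below assumes, for illustration only, that the Θ(n) term is exactly an for some constant a > 0 and that T(0) = 0; the symbol a is not from the slides.

```latex
\begin{align*}
\text{Lucky: } T(n) &= an + a\left(\tfrac{9}{10}\right) n + a\left(\tfrac{9}{10}\right)^{2} n + \cdots
  \;\le\; an \sum_{j=0}^{\infty} \left(\tfrac{9}{10}\right)^{j} = 10\,an = \Theta(n), \\
\text{Unlucky: } T(n) &= an + a(n-1) + \cdots + a \cdot 1
  = a\,\frac{n(n+1)}{2} = \Theta(n^{2}).
\end{align*}
```

The lucky bound is tight up to constants because the first term alone is already an.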
Analysis of expected time
Let T(n) = the random variable for the running time of RAND-SELECT on an input of size n, assuming random numbers are independent.
For k = 0, 1, …, n–1, define the indicator random variable
    X_k = 1 if PARTITION generates a k : n–k–1 split,
    X_k = 0 otherwise.
The analysis follows that of randomized quicksort, but it’s a little different.
Analysis (continued)
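$$
T(n) =
\begin{cases}
T(\max\{0, n-1\}) + \Theta(n) & \text{if } 0 : n-1 \text{ split}, \\
T(\max\{1, n-2\}) + \Theta(n) & \text{if } 1 : n-2 \text{ split}, \\
\qquad \vdots \\
T(\max\{n-1, 0\}) + \Theta(n) & \text{if } n-1 : 0 \text{ split},
\end{cases}
$$
$$
T(n) = \sum_{k=0}^{n-1} X_k \big( T(\max\{k, n-k-1\}) + \Theta(n) \big).
$$
To obtain an upper bound, assume that the ith element always falls in the larger side of the partition:
$$
T(n) \le 2 \sum_{k=\lfloor n/2 \rfloor}^{n-1} X_k \big( T(k) + \Theta(n) \big).
$$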
Calculating expectation
$$
\begin{aligned}
E[T(n)] &\le E\left[ 2 \sum_{k=\lfloor n/2 \rfloor}^{n-1} X_k \big( T(k) + \Theta(n) \big) \right]
  && \text{Take expectations of both sides.} \\
&= 2 \sum_{k=\lfloor n/2 \rfloor}^{n-1} E\big[ X_k \big( T(k) + \Theta(n) \big) \big]
  && \text{Linearity of expectation.} \\
&= 2 \sum_{k=\lfloor n/2 \rfloor}^{n-1} E[X_k] \cdot E\big[ T(k) + \Theta(n) \big]
  && \text{Independence of } X_k \text{ from other random choices.} \\
&= \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} E[T(k)] + \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} \Theta(n)
  && \text{Linearity of expectation; } E[X_k] = 1/n. \\
&= \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} E[T(k)] + \Theta(n)
  && \text{The second sum has } \lceil n/2 \rceil \text{ terms.}
\end{aligned}
$$
Hairy recurrence
$$
E[T(n)] \le \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} E[T(k)] + \Theta(n)
$$
(But not quite as hairy as the quicksort one.)
Prove: E[T(n)] ≤ cn for constant c > 0.
- The constant c can be chosen large enough so that E[T(n)] ≤ cn for the base cases.
Use fact:
$$
\sum_{k=\lfloor n/2 \rfloor}^{n-1} k \le \frac{3}{8} n^2 \quad \text{(exercise).}
$$
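The fact left as an exercise can be checked by summing the arithmetic series directly. The sketch below splits on the parity of n; the symbol m is introduced here only for illustration.

```latex
\begin{align*}
n = 2m:\quad & \sum_{k=m}^{2m-1} k = \frac{m(3m-1)}{2} = \frac{3n^{2}}{8} - \frac{n}{4} \le \frac{3}{8} n^{2}, \\
n = 2m+1:\quad & \sum_{k=m}^{2m} k = \frac{3m(m+1)}{2} = \frac{3(n^{2}-1)}{8} \le \frac{3}{8} n^{2}.
\end{align*}
```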
Substitution method
$$
\begin{aligned}
E[T(n)] &\le \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} ck + \Theta(n)
  && \text{Substitute inductive hypothesis.} \\
&\le \frac{2c}{n} \left( \frac{3}{8} n^2 \right) + \Theta(n)
  && \text{Use fact.} \\
&= cn - \left( \frac{cn}{4} - \Theta(n) \right)
  && \text{Express as desired minus residual.} \\
&\le cn,
\end{aligned}
$$
if c is chosen large enough so that cn/4 dominates the Θ(n).
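To make "large enough" concrete, suppose (an assumption for illustration, not stated on the slides) that the Θ(n) term is at most an for some constant a > 0. The residual is then nonnegative exactly when c ≥ 4a:

```latex
\[
cn - \left( \frac{cn}{4} - an \right) \le cn
\;\iff\; \frac{cn}{4} \ge an
\;\iff\; c \ge 4a .
\]
```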
Summary of randomized order-statistic selection
- Works fast: linear expected time.
- Excellent algorithm in practice.
- But, the worst case is very bad: Θ(n²).
Q. Is there an algorithm that runs in linear time in the worst case?
A. Yes, due to Blum, Floyd, Pratt, Rivest, and Tarjan [1973].
IDEA: Generate a good pivot recursively.
Worst-case linear-time order statistics
SELECT(i, n)
1. Divide the n elements into groups of 5. Find the median of each 5-element group by rote.
2. Recursively SELECT the median x of the ⌊n/5⌋ group medians to be the pivot.
3. Partition around the pivot x. Let k = rank(x).
4. if i = k then return x
   elseif i < k
       then recursively SELECT the ith smallest element in the lower part
       else recursively SELECT the (i–k)th smallest element in the upper part
   (Step 4 is the same as in RAND-SELECT.)
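The four steps can be sketched in Python as follows. This is an illustration under stated assumptions rather than a tuned implementation: elements are distinct, the rank i is 1-based, a leftover group of fewer than 5 elements is ignored when choosing the pivot, and the partition builds new lists instead of working in place. The name `select` is illustrative.

```python
def select(a, i):
    """Return the i-th smallest (1-based) of the distinct values in list a."""
    if len(a) <= 5:
        return sorted(a)[i - 1]                      # small base case, "by rote"
    # Step 1: median of each full group of 5 (a short leftover group is ignored).
    groups = [a[j:j + 5] for j in range(0, len(a) - len(a) % 5, 5)]
    medians = [sorted(g)[2] for g in groups]
    # Step 2: recursively select the median of the floor(n/5) group medians.
    x = select(medians, (len(medians) + 1) // 2)
    # Step 3: partition around x; k = rank(x).
    lower = [v for v in a if v < x]
    upper = [v for v in a if v > x]
    k = len(lower) + 1
    # Step 4: same three-way case split as in RAND-SELECT.
    if i == k:
        return x
    if i < k:
        return select(lower, i)
    return select(upper, i - k)
```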
Choosing the pivot
1. Divide the n elements into groups of 5. Find the median of each 5-element group by rote.
2. Recursively SELECT the median x of the ⌊n/5⌋ group medians to be the pivot.
[Figure: the elements arranged in groups of 5, each ordered from lesser to greater, with the pivot x marked.]
Developing the recurrence
SELECT(i, n)    [cost: T(n)]
1. Divide the n elements into groups of 5. Find the median of each 5-element group by rote.    [cost: Θ(n)]
2. Recursively SELECT the median x of the ⌊n/5⌋ group medians to be the pivot.    [cost: T(n/5)]
3. Partition around the pivot x. Let k = rank(x).    [cost: Θ(n)]
4. if i = k then return x
   elseif i < k
       then recursively SELECT the ith smallest element in the lower part
       else recursively SELECT the (i–k)th smallest element in the upper part
   [cost: T( ? )]
Analysis
(Assume all elements are distinct.)
[Figure: the elements arranged in groups of 5, each ordered from lesser to greater, with the pivot x marked.]
At least half the group medians are ≤ x, which is at least ⌊⌊n/5⌋/2⌋ = ⌊n/10⌋ group medians.
- Therefore, at least 3⌊n/10⌋ elements are ≤ x.
- Similarly, at least 3⌊n/10⌋ elements are ≥ x.
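The 3⌊n/10⌋ bound can be checked empirically with a short Python sketch. It assumes distinct elements, n ≥ 5, full groups of 5 only, and the lower median of the group medians as the pivot; for brevity it finds the medians by sorting rather than by recursive SELECT. The helper names are illustrative.

```python
def median_of_medians(a):
    """Pivot from steps 1-2, using sorting for brevity instead of recursive SELECT."""
    medians = [sorted(a[j:j + 5])[2] for j in range(0, len(a) - len(a) % 5, 5)]
    return sorted(medians)[(len(medians) - 1) // 2]   # lower median of the medians

def check_bound(a):
    """True if at least 3*floor(n/10) elements are <= x and at least as many are >= x."""
    n = len(a)
    x = median_of_medians(a)
    at_most_x = sum(1 for v in a if v <= x)
    at_least_x = sum(1 for v in a if v >= x)
    return at_most_x >= 3 * (n // 10) and at_least_x >= 3 * (n // 10)
```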
Analysis (Assume all elements are distinct.)
Need “at most” for worst-case runtime.
- At least 3⌊n/10⌋ elements are ≤ x
  ⇒ at most n – 3⌊n/10⌋ elements are > x
- At least 3⌊n/10⌋ elements are ≥ x
  ⇒ at most n – 3⌊n/10⌋ elements are < x
- The recursive call to SELECT in Step 4 is executed on at most n – 3⌊n/10⌋ elements.
Analysis (Assume all elements are distinct.)
- Use fact that ⌊a/b⌋ ≥ (a – (b – 1))/b (page 51)
- n – 3⌊n/10⌋ ≤ n – 3·(n – 9)/10 = (10n – 3n + 27)/10 ≤ 7n/10 + 3
- The recursive call to SELECT in Step 4 is executed recursively on at most 7n/10 + 3 elements.
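The arithmetic behind the 7n/10 + 3 bound can be checked numerically. The sketch below works in integers to avoid rounding, comparing 10·(n – 3⌊n/10⌋) with 7n + 30; the function name and the tested range are illustrative choices.

```python
def max_recursive_size(n):
    """Upper bound n - 3*floor(n/10) on the size of the part recursed on in Step 4."""
    return n - 3 * (n // 10)

# 10*(n - 3*floor(n/10)) - 7n equals 3*(n mod 10), so it never exceeds 27.
worst_gap = max(10 * max_recursive_size(n) - 7 * n for n in range(1, 10_001))
print(worst_gap)   # 27
```

The value 27 matches the +27 in the numerator above, which is then rounded up to 7n/10 + 3.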
Developing the recurrence
SELECT(i, n)    [cost: T(n)]
1. Divide the n elements into groups of 5. Find the median of each 5-element group by rote.    [cost: Θ(n)]
2. Recursively SELECT the median x of the ⌊n/5⌋ group medians to be the pivot.    [cost: T(n/5)]
3. Partition around the pivot x. Let k = rank(x).    [cost: Θ(n)]
4. if i = k then return x
   elseif i < k
       then recursively SELECT the ith smallest element in the lower part
       else recursively SELECT the (i–k)th smallest element in the upper part
   [cost: T(7n/10 + 3)]
Solving the recurrence
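$$
T(n) = T\left(\tfrac{1}{5} n\right) + T\left(\tfrac{7}{10} n + 3\right) + dn \qquad (dn \text{ for } \Theta(n))
$$
Substitution: T(n) ≤ c(n – 4)    (Technical trick.)
$$
\begin{aligned}
T(n) &\le c\left(\tfrac{1}{5} n - 4\right) + c\left(\tfrac{7}{10} n + 3 - 4\right) + dn \\
&\le \tfrac{9}{10} cn - 4c + dn \\
&= c(n - 4) - \left(\tfrac{1}{10} cn - dn\right) \\
&\le c(n - 4),
\end{aligned}
$$
if c is chosen large enough, e.g., c = 10d.
This shows that T(n) ∈ O(n).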
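The bound T(n) ≤ c(n – 4) can be sanity-checked by evaluating the recurrence numerically. The sketch below makes several assumptions that are not on the slides: d = 1, floors on the recursive arguments, and a hypothetical base case that charges n units for every n below a cutoff of 50.

```python
from functools import lru_cache

D = 1          # assumed constant in the dn term
C = 10 * D     # the choice c = 10d from the substitution
BASE = 50      # assumed cutoff; below it the cost is taken to be n (hypothetical)

@lru_cache(maxsize=None)
def T(n):
    """T(n) = T(floor(n/5)) + T(floor(7n/10) + 3) + D*n for n >= BASE, else n."""
    if n < BASE:
        return n
    return T(n // 5) + T(7 * n // 10 + 3) + D * n

# Check the substitution bound T(n) <= c(n - 4) over a range of sizes.
holds = all(T(n) <= C * (n - 4) for n in range(10, 20_001))
print(holds)   # True
```

The check starts at n = 10 because the bound c(n – 4) is not positive for n ≤ 4 and the recursive calls never reach arguments below 10 with this cutoff.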
Conclusions
- Since the work at each level of recursion is
basically a constant fraction (9/10) smaller, the work per level is a geometric series dominated by the linear work at the root.
- In practice, this algorithm runs slowly,
because the constant in front of n is large.
- The randomized algorithm is far more