order statistics

Order Statistics, by Carola Wenk (slides courtesy of Charles Leiserson)

CS 3343 Analysis of Algorithms, Fall 2011 (10/6/11). Order Statistics, by Carola Wenk. Slides courtesy of Charles Leiserson, with small changes by Carola Wenk. Order statistics: select the i-th smallest of n elements.


  1. CS 3343 Analysis of Algorithms, Fall 2011 (10/6/11): Order Statistics. Carola Wenk. Slides courtesy of Charles Leiserson, with small changes by Carola Wenk.

  2. Order statistics
     Select the $i$th smallest of $n$ elements (the element with rank $i$).
     • $i = 1$: minimum;
     • $i = n$: maximum;
     • $i = \lfloor (n+1)/2 \rfloor$ or $\lceil (n+1)/2 \rceil$: median.
     Naive algorithm: Sort and index the $i$th element.
     Worst-case running time $= \Theta(n \log n + 1) = \Theta(n \log n)$, using merge sort or heapsort (not quicksort).
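The naive algorithm can be sketched in a few lines of Python. This is only an illustration: the name `naive_select` is hypothetical, and it relies on Python's built-in `sorted` (Timsort, a merge-sort variant with an O(n log n) worst case) rather than a hand-written merge sort or heapsort.

```python
def naive_select(items, i):
    """Return the i-th smallest element (1-based rank) by sorting first.

    Hypothetical helper for illustration; sorting dominates the cost.
    """
    if not 1 <= i <= len(items):
        raise ValueError("rank i must satisfy 1 <= i <= len(items)")
    return sorted(items)[i - 1]  # sort, then index the i-th element


data = [6, 10, 13, 5, 8, 3, 2, 11]
n = len(data)
minimum = naive_select(data, 1)                   # i = 1
maximum = naive_select(data, n)                   # i = n
lower_median = naive_select(data, (n + 1) // 2)   # i = floor((n+1)/2)
```

For this eight-element list the two median ranks differ (4 and 5); for odd n they coincide.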

  3. Randomized divide-and-conquer algorithm

     RAND-SELECT(A, p, q, i)    // i-th smallest of A[p . . q]
        if p = q then return A[p]
        r ← RAND-PARTITION(A, p, q)
        k ← r – p + 1            // k = rank(A[r])
        if i = k then return A[r]
        if i < k
           then return RAND-SELECT(A, p, r – 1, i)
           else return RAND-SELECT(A, r + 1, q, i – k)

     Picture after partitioning: [ ≤ A[r] | A[r] | ≥ A[r] ] over indices p … r … q; the first k positions are A[p . . r].
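A minimal Python sketch of RAND-SELECT follows. The slide only names RAND-PARTITION, so the partition body here is an assumption: a Lomuto-style scan around a uniformly random pivot that is first swapped to the front. Indices are 0-based and inclusive, while the rank `i` stays 1-based as on the slide.

```python
import random


def rand_partition(a, p, q):
    """Partition a[p..q] (inclusive) around a random pivot; return its final index.

    Assumed partition scheme (not given on the slide): swap a random
    element to the front, then scan left to right.
    """
    s = random.randint(p, q)      # random pivot position, both ends inclusive
    a[p], a[s] = a[s], a[p]
    x = a[p]
    i = p
    for j in range(p + 1, q + 1):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[p], a[i] = a[i], a[p]       # pivot lands between the two parts
    return i


def rand_select(a, p, q, i):
    """Return the i-th smallest (1-based) element of a[p..q]; reorders a in place."""
    if p == q:
        return a[p]
    r = rand_partition(a, p, q)
    k = r - p + 1                 # k = rank of a[r] within a[p..q]
    if i == k:
        return a[r]
    if i < k:
        return rand_select(a, p, r - 1, i)
    return rand_select(a, r + 1, q, i - k)
```

A call such as `rand_select(list(data), 0, len(data) - 1, 7)` works on a copy so the caller's list is left untouched.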

  4. Example
     Select the i = 7th smallest:
        6 10 13 5 8 3 2 11    (pivot: 6)
     Partition:
        2 5 3 6 8 13 10 11    k = 4
     Select the 7 – 4 = 3rd smallest recursively.
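The partition step of this example can be replayed in Python. The sketch assumes a Lomuto-style partition that uses the first element as the pivot; the slides do not spell out the partition procedure, but this particular scheme reproduces the array shown above.

```python
def partition(a, p, q):
    """Partition a[p..q] (inclusive) around the pivot a[p]; return the pivot's final index.

    Assumed scheme for illustration: first element as pivot, left-to-right scan.
    """
    x = a[p]
    i = p
    for j in range(p + 1, q + 1):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[p], a[i] = a[i], a[p]
    return i


a = [6, 10, 13, 5, 8, 3, 2, 11]
r = partition(a, 0, len(a) - 1)
k = r - 0 + 1                 # rank of the pivot within the whole array
print(a, k)                   # prints [2, 5, 3, 6, 8, 13, 10, 11] 4
```

Since i = 7 > k = 4, the search continues in the upper part `[8, 13, 10, 11]` for its 3rd smallest element, which is 11.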

  5. Intuition for analysis
     (All our analyses today assume that all elements are distinct.)
     Lucky: $T(n) = T(9n/10) + dn = \Theta(n)$ (CASE 3, since $n^{\log_{10/9} 1} = n^0 = 1$). The $dn$ term is for RAND-PARTITION.
     Unlucky: $T(n) = T(n-1) + dn = \Theta(n^2)$ (arithmetic series). Worse than sorting!
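Both recurrences can also be unrolled directly, without the master theorem. The sketch below ignores floors and treats $T(0)$ and $T(1)$ as constants; both are simplifying assumptions made here, not statements from the slides.

```latex
\begin{align*}
\text{Lucky:}\quad
  T(n) &= dn + d\left(\tfrac{9}{10}\right)n + d\left(\tfrac{9}{10}\right)^{2} n + \cdots + T(1) \\
       &\le dn \sum_{j \ge 0} \left(\tfrac{9}{10}\right)^{j} + T(1) = 10\,dn + T(1),
  \qquad T(n) \ge dn, \quad\text{so } T(n) = \Theta(n). \\[4pt]
\text{Unlucky:}\quad
  T(n) &= T(0) + d \sum_{j=1}^{n} j = T(0) + \frac{d\,n(n+1)}{2} = \Theta(n^{2}).
\end{align*}
```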

  6. Analysis of expected time
     The analysis follows that of randomized quicksort, but it's a little different.
     Let $T(n)$ = the random variable for the running time of RAND-SELECT on an input of size $n$, assuming random numbers are independent.
     For $k = 0, 1, \ldots, n-1$, define the indicator random variable
     $$X_k = \begin{cases} 1 & \text{if PARTITION generates a } k : n-k-1 \text{ split,} \\ 0 & \text{otherwise.} \end{cases}$$

  7. Analysis (continued)
     To obtain an upper bound, assume that the $i$th element always falls in the larger side of the partition:
     $$\begin{aligned}
     T(n) &= \begin{cases}
        T(\max\{0, n-1\}) + dn & \text{if } 0 : n-1 \text{ split,} \\
        T(\max\{1, n-2\}) + dn & \text{if } 1 : n-2 \text{ split,} \\
        \quad\vdots \\
        T(\max\{n-1, 0\}) + dn & \text{if } n-1 : 0 \text{ split}
     \end{cases} \\
     &= \sum_{k=0}^{n-1} X_k \bigl( T(\max\{k, n-k-1\}) + dn \bigr) \\
     &\le 2 \sum_{k=\lfloor n/2 \rfloor}^{n-1} X_k \bigl( T(k) + dn \bigr).
     \end{aligned}$$

  8. Calculating expectation
     $$E[T(n)] = E\left[ 2 \sum_{k=\lfloor n/2 \rfloor}^{n-1} X_k \bigl( T(k) + dn \bigr) \right]$$
     Take expectations of both sides.

  9. Calculating expectation
     $$\begin{aligned}
     E[T(n)] &= E\left[ 2 \sum_{k=\lfloor n/2 \rfloor}^{n-1} X_k \bigl( T(k) + dn \bigr) \right] \\
             &= 2 \sum_{k=\lfloor n/2 \rfloor}^{n-1} E\bigl[ X_k \bigl( T(k) + dn \bigr) \bigr]
     \end{aligned}$$
     Linearity of expectation.

  10. Calculating expectation
      $$\begin{aligned}
      E[T(n)] &= E\left[ 2 \sum_{k=\lfloor n/2 \rfloor}^{n-1} X_k \bigl( T(k) + dn \bigr) \right] \\
              &= 2 \sum_{k=\lfloor n/2 \rfloor}^{n-1} E\bigl[ X_k \bigl( T(k) + dn \bigr) \bigr] \\
              &= 2 \sum_{k=\lfloor n/2 \rfloor}^{n-1} E[X_k] \cdot E\bigl[ T(k) + dn \bigr]
      \end{aligned}$$
      Independence of $X_k$ from other random choices.

  11. Calculating expectation
      $$\begin{aligned}
      E[T(n)] &= E\left[ 2 \sum_{k=\lfloor n/2 \rfloor}^{n-1} X_k \bigl( T(k) + dn \bigr) \right] \\
              &= 2 \sum_{k=\lfloor n/2 \rfloor}^{n-1} E\bigl[ X_k \bigl( T(k) + dn \bigr) \bigr] \\
              &= 2 \sum_{k=\lfloor n/2 \rfloor}^{n-1} E[X_k] \cdot E\bigl[ T(k) + dn \bigr] \\
              &= \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} E[T(k)] + \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} dn
      \end{aligned}$$
      Linearity of expectation; $E[X_k] = 1/n$.

  12. Calculating expectation
      $$\begin{aligned}
      E[T(n)] &= E\left[ 2 \sum_{k=\lfloor n/2 \rfloor}^{n-1} X_k \bigl( T(k) + dn \bigr) \right] \\
              &= 2 \sum_{k=\lfloor n/2 \rfloor}^{n-1} E\bigl[ X_k \bigl( T(k) + dn \bigr) \bigr] \\
              &= 2 \sum_{k=\lfloor n/2 \rfloor}^{n-1} E[X_k] \cdot E\bigl[ T(k) + dn \bigr] \\
              &= \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} E[T(k)] + \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} dn \\
              &= \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} E[T(k)] + dn
      \end{aligned}$$

  13. Hairy recurrence
      (But not quite as hairy as the quicksort one.)
      $$E[T(n)] = \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} E[T(k)] + dn$$
      Prove: $E[T(n)] \le cn$ for constant $c > 0$.
      • The constant $c$ can be chosen large enough so that $E[T(n)] \le cn$ for the base cases.
      Use fact: $\sum_{k=\lfloor n/2 \rfloor}^{n-1} k \le \frac{3}{8} n^2$ (exercise).
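Before proving the fact, it can be sanity-checked by brute force. The short Python sketch below is not a proof and the function name `fact_holds` is made up for illustration; it compares both sides in integer arithmetic to avoid rounding.

```python
def fact_holds(n):
    """Check sum_{k=floor(n/2)}^{n-1} k <= (3/8) * n^2 using integers only."""
    lhs = sum(range(n // 2, n))     # k runs from floor(n/2) to n-1
    return 8 * lhs <= 3 * n * n     # multiply through by 8 to stay in integers


violations = [n for n in range(1, 2001) if not fact_holds(n)]
print(violations)  # prints []
```

The closed forms explain why: for even n = 2m the sum is (3m² – m)/2, and for odd n = 2m + 1 it is 3m(m + 1)/2, which falls short of 3n²/8 by exactly 3/8.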

  14. Substitution method
      $$E[T(n)] \le \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} ck + dn$$
      Substitute inductive hypothesis.

  15. Substitution method
      $$\begin{aligned}
      E[T(n)] &\le \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} ck + dn \\
              &\le \frac{2c}{n} \left( \frac{3}{8} n^2 \right) + dn
      \end{aligned}$$
      Use fact.

  16. Substitution method
      $$\begin{aligned}
      E[T(n)] &\le \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} ck + dn \\
              &\le \frac{2c}{n} \left( \frac{3}{8} n^2 \right) + dn \\
              &= cn - \left( \frac{cn}{4} - dn \right)
      \end{aligned}$$
      Express as desired – residual.

  17. Substitution method
      $$\begin{aligned}
      E[T(n)] &\le \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} ck + dn \\
              &\le \frac{2c}{n} \left( \frac{3}{8} n^2 \right) + dn \\
              &= cn - \left( \frac{cn}{4} - dn \right) \\
              &\le cn,
      \end{aligned}$$
      if $c \ge 4d$.
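The bound can also be checked numerically by iterating the recurrence itself. The Python sketch below makes two assumptions that are not on the slides: the base case is T(0) = 0, and the recurrence is applied for every n ≥ 1. The function name `expected_time_bound` is made up for illustration.

```python
def expected_time_bound(n_max, d=1.0):
    """Iterate T(n) = (2/n) * sum_{k=floor(n/2)}^{n-1} T(k) + d*n, with T(0) = 0."""
    t = [0.0] * (n_max + 1)
    prefix = [0.0] * (n_max + 2)        # prefix[m] = t[0] + ... + t[m-1]
    for n in range(1, n_max + 1):
        window = prefix[n] - prefix[n // 2]   # t[floor(n/2)] + ... + t[n-1]
        t[n] = 2.0 * window / n + d * n
        prefix[n + 1] = prefix[n] + t[n]
    return t


d = 1.0
t = expected_time_bound(2000, d)
c = 4 * d                               # smallest c allowed by the condition c >= 4d
within_bound = all(t[n] <= c * n + 1e-9 for n in range(1, len(t)))
print(within_bound)  # prints True
```

With these base cases the ratio T(n)/n stays below 4d for every n tried, matching the condition c ≥ 4d obtained by substitution.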

  18. Summary of randomized order-statistic selection
      • Works fast: linear expected time.
      • Excellent algorithm in practice.
      • But the worst case is very bad: $\Theta(n^2)$.
      Q. Is there an algorithm that runs in linear time in the worst case?
      A. Yes, due to Blum, Floyd, Pratt, Rivest, and Tarjan [1973].
      IDEA: Generate a good pivot recursively.

  19. Worst-case linear-time order statistics
      SELECT(i, n)
      1. Divide the n elements into groups of 5. Find the median of each 5-element group by rote.
      2. Recursively SELECT the median x of the ⌊n/5⌋ group medians to be the pivot.
      3. Partition around the pivot x. Let k = rank(x).
      4. if i = k then return x
         elseif i < k
            then recursively SELECT the ith smallest element in the lower part
            else recursively SELECT the (i–k)th smallest element in the upper part
         (Step 4 is the same as in RAND-SELECT.)
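The four steps can be sketched in Python as follows. This is an illustrative, list-building version rather than an in-place one, and it departs from the slide in two labeled ways: a leftover group of fewer than 5 elements also contributes a median (so there are ⌈n/5⌉ medians instead of ⌊n/5⌋), and elements equal to the pivot are counted separately so that duplicates do not break the rank arithmetic.

```python
def select(items, i):
    """Return the i-th smallest (1-based) element using a median-of-medians pivot."""
    a = list(items)
    if not 1 <= i <= len(a):
        raise ValueError("rank i must satisfy 1 <= i <= len(items)")
    if len(a) <= 5:
        return sorted(a)[i - 1]          # tiny input: solve directly

    # Step 1: split into groups of 5 and take each group's median.
    groups = [a[j:j + 5] for j in range(0, len(a), 5)]
    medians = [sorted(g)[(len(g) - 1) // 2] for g in groups]

    # Step 2: recursively select the median of the medians as the pivot x.
    x = select(medians, (len(medians) + 1) // 2)

    # Step 3: partition around x; k is the rank of (the first copy of) x.
    lower = [v for v in a if v < x]
    upper = [v for v in a if v > x]
    k = len(lower) + 1
    copies = len(a) - len(lower) - len(upper)   # 1 when all elements are distinct

    # Step 4: return x or recurse into the part that contains rank i.
    if i < k:
        return select(lower, i)
    if i < k + copies:
        return x
    return select(upper, i - (k + copies - 1))
```

For example, `select([6, 10, 13, 5, 8, 3, 2, 11], 7)` returns 11, the same answer as the randomized example earlier in the deck.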

  20. Choosing the pivot

  21. Choosing the pivot
      1. Divide the n elements into groups of 5.
