order statistics

Order Statistics Carola Wenk Slides courtesy of Charles Leiserson - PowerPoint PPT Presentation



  1. CS 5633 -- Spring 2008. Order Statistics. Carola Wenk. Slides courtesy of Charles Leiserson, with small changes by Carola Wenk. 2/12/08, CS 5633 Analysis of Algorithms.

  2. Order statistics

     Select the i-th smallest of n elements (the element with rank i).
     • i = 1: minimum;
     • i = n: maximum;
     • i = ⌊(n+1)/2⌋ or ⌈(n+1)/2⌉: median.

     Naive algorithm: Sort and index the i-th element. Worst-case running time = Θ(n log n) + Θ(1) = Θ(n log n), using merge sort or heapsort (not quicksort).
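The naive algorithm can be sketched in a few lines of Python. The function name `select_by_sorting` and the sample data are illustrative assumptions, not part of the slides; Python's built-in `sorted` stands in for merge sort or heapsort, since it also has an O(n log n) worst case.

```python
def select_by_sorting(items, i):
    """Return the i-th smallest element of items (rank i, 1-indexed)."""
    if not 1 <= i <= len(items):
        raise ValueError("rank i must satisfy 1 <= i <= n")
    # Sort, then index: Theta(n log n) for the sort plus Theta(1) for the lookup.
    return sorted(items)[i - 1]


data = [6, 10, 13, 5, 8, 3, 2, 11]
n = len(data)
minimum = select_by_sorting(data, 1)                  # i = 1
maximum = select_by_sorting(data, n)                  # i = n
lower_median = select_by_sorting(data, (n + 1) // 2)  # i = floor((n+1)/2)
upper_median = select_by_sorting(data, (n + 2) // 2)  # i = ceil((n+1)/2)
```

For an even n the two median ranks differ by one; for an odd n they coincide.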

  3. Randomized divide-and-conquer algorithm

     RAND-SELECT(A, p, q, i)    ⊳ i-th smallest of A[p . . q]
         if p = q then return A[p]
         r ← RAND-PARTITION(A, p, q)
         k ← r - p + 1    ⊳ k = rank(A[r])
         if i = k then return A[r]
         if i < k
             then return RAND-SELECT(A, p, r - 1, i)
             else return RAND-SELECT(A, r + 1, q, i - k)

     After partitioning, A[p . . r-1] holds the k - 1 elements ≤ A[r], and A[r+1 . . q] holds the elements ≥ A[r].
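A minimal Python sketch of RAND-SELECT follows. The slides do not spell out RAND-PARTITION, so the version here (swap a uniformly random element to the front, then partition around it) is an assumption, as are the names `rand_partition`, `rand_select` and the optional `rng` parameter. Indices are 0-based for the array, while the rank i stays 1-based as in the pseudocode.

```python
import random


def rand_partition(a, p, q, rng=random):
    """Partition a[p..q] around a uniformly random pivot; return its final index."""
    s = rng.randint(p, q)        # random pivot position, bounds inclusive
    a[p], a[s] = a[s], a[p]      # move the pivot to the front
    pivot = a[p]
    i = p
    for j in range(p + 1, q + 1):
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[p], a[i] = a[i], a[p]      # pivot lands at index i
    return i


def rand_select(a, p, q, i, rng=random):
    """Return the i-th smallest (1-indexed) element of a[p..q]; reorders a in place."""
    if p == q:
        return a[p]
    r = rand_partition(a, p, q, rng)
    k = r - p + 1                # k = rank of a[r] within a[p..q]
    if i == k:
        return a[r]
    if i < k:
        return rand_select(a, p, r - 1, i, rng)
    return rand_select(a, r + 1, q, i - k, rng)


data = [6, 10, 13, 5, 8, 3, 2, 11]
seventh = rand_select(list(data), 0, len(data) - 1, 7)
```

Only one side of the partition is ever recursed into, which is what separates selection from quicksort.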

  4. Example

     Select the i = 7th smallest of: 6 10 13 5 8 3 2 11 (pivot: 6).
     Partition: 2 5 3 6 8 13 10 11, so k = 4.
     Since i > k, select the 7 - 4 = 3rd smallest of the upper part recursively.
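The example can be replayed in Python. The partition routine below is an assumption: it pivots on the first element and scans left to right, which happens to reproduce the arrangement shown on the slide. The name `partition_first` is hypothetical.

```python
def partition_first(a, p, q):
    """Partition a[p..q] in place around the pivot a[p]; return the pivot's final index."""
    pivot = a[p]
    i = p
    for j in range(p + 1, q + 1):
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[p], a[i] = a[i], a[p]
    return i


a = [6, 10, 13, 5, 8, 3, 2, 11]
r = partition_first(a, 0, len(a) - 1)
k = r - 0 + 1                      # rank of the pivot within a[0..7]
upper = a[r + 1:]                  # elements greater than the pivot
answer = sorted(upper)[7 - k - 1]  # 3rd smallest of the upper part
```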

  5. Intuition for analysis

     (All our analyses today assume that all elements are distinct.)

     Lucky: T(n) = T(9n/10) + Θ(n). Here n^(log_{10/9} 1) = n^0 = 1, so this is CASE 3 of the master method and T(n) = Θ(n).

     Unlucky: T(n) = T(n - 1) + Θ(n) = Θ(n²) (arithmetic series). Worse than sorting!
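A rough numeric check of the two cases, assuming the Θ(n) term is exactly n and rounding 9n/10 down (both are simplifications made for this sketch; the function names are hypothetical):

```python
def lucky_cost(n):
    """Total work for T(n) = T(floor(9n/10)) + n: a geometric series."""
    total = 0
    while n >= 1:
        total += n
        n = (9 * n) // 10
    return total


def unlucky_cost(n):
    """Total work for T(n) = T(n - 1) + n: the arithmetic series 1 + 2 + ... + n."""
    return n * (n + 1) // 2


n = 1000
lucky = lucky_cost(n)      # bounded by n * (1 + 9/10 + (9/10)^2 + ...) = 10n
unlucky = unlucky_cost(n)  # grows quadratically
```

The lucky total stays below 10n because the geometric series with ratio 9/10 sums to 10, while the unlucky total is n(n+1)/2.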

  6. Analysis of expected time

     The analysis follows that of randomized quicksort, but it's a little different.

     Let T(n) = the random variable for the running time of RAND-SELECT on an input of size n, assuming random numbers are independent.

     For k = 0, 1, …, n-1, define the indicator random variable

     $$
     X_k = \begin{cases}
       1 & \text{if PARTITION generates a } k : n-k-1 \text{ split,} \\
       0 & \text{otherwise.}
     \end{cases}
     $$

  7. Analysis (continued)

     To obtain an upper bound, assume that the i-th element always falls in the larger side of the partition:

     $$
     T(n) = \begin{cases}
       T(\max\{0, n-1\}) + \Theta(n) & \text{if } 0 : n-1 \text{ split,} \\
       T(\max\{1, n-2\}) + \Theta(n) & \text{if } 1 : n-2 \text{ split,} \\
       \quad\vdots \\
       T(\max\{n-1, 0\}) + \Theta(n) & \text{if } n-1 : 0 \text{ split,}
     \end{cases}
     $$

     $$
     \begin{aligned}
     T(n) &= \sum_{k=0}^{n-1} X_k \bigl( T(\max\{k, n-k-1\}) + \Theta(n) \bigr) \\
          &\le 2 \sum_{k=\lfloor n/2 \rfloor}^{n-1} X_k \bigl( T(k) + \Theta(n) \bigr).
     \end{aligned}
     $$
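The factor 2 and the lower summation limit ⌊n/2⌋ in the last bound come from how often each subproblem size max{k, n-k-1} occurs as k runs over 0, …, n-1. The sketch below tabulates those sizes; the helper name `larger_side_sizes` is hypothetical.

```python
from collections import Counter


def larger_side_sizes(n):
    """Count how many splits k : n-k-1 have a larger side of each size."""
    return Counter(max(k, n - k - 1) for k in range(n))


sizes_for_5 = larger_side_sizes(5)  # splits 0:4, 1:3, 2:2, 3:1, 4:0
sizes_for_4 = larger_side_sizes(4)  # splits 0:3, 1:2, 2:1, 3:0
```

Every size lies between ⌊n/2⌋ and n-1, and no size arises from more than two values of k.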

  8. Calculating expectation

     $$
     E[T(n)] \le E\!\left[ 2 \sum_{k=\lfloor n/2 \rfloor}^{n-1} X_k \bigl( T(k) + \Theta(n) \bigr) \right]
     $$

     Take expectations of both sides.

  9. Calculating expectation

     $$
     \begin{aligned}
     E[T(n)] &\le E\!\left[ 2 \sum_{k=\lfloor n/2 \rfloor}^{n-1} X_k \bigl( T(k) + \Theta(n) \bigr) \right] \\
             &= 2 \sum_{k=\lfloor n/2 \rfloor}^{n-1} E\bigl[ X_k \bigl( T(k) + \Theta(n) \bigr) \bigr]
     \end{aligned}
     $$

     Linearity of expectation.

  10. Calculating expectation

     $$
     \begin{aligned}
     E[T(n)] &\le E\!\left[ 2 \sum_{k=\lfloor n/2 \rfloor}^{n-1} X_k \bigl( T(k) + \Theta(n) \bigr) \right] \\
             &= 2 \sum_{k=\lfloor n/2 \rfloor}^{n-1} E\bigl[ X_k \bigl( T(k) + \Theta(n) \bigr) \bigr] \\
             &= 2 \sum_{k=\lfloor n/2 \rfloor}^{n-1} E[X_k] \cdot E\bigl[ T(k) + \Theta(n) \bigr]
     \end{aligned}
     $$

     Independence of X_k from other random choices.

  11. Calculating expectation

     $$
     \begin{aligned}
     E[T(n)] &\le E\!\left[ 2 \sum_{k=\lfloor n/2 \rfloor}^{n-1} X_k \bigl( T(k) + \Theta(n) \bigr) \right] \\
             &= 2 \sum_{k=\lfloor n/2 \rfloor}^{n-1} E\bigl[ X_k \bigl( T(k) + \Theta(n) \bigr) \bigr] \\
             &= 2 \sum_{k=\lfloor n/2 \rfloor}^{n-1} E[X_k] \cdot E\bigl[ T(k) + \Theta(n) \bigr] \\
             &= \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} E[T(k)] + \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} \Theta(n)
     \end{aligned}
     $$

     Linearity of expectation; E[X_k] = 1/n.

  12. Calculating expectation

     $$
     \begin{aligned}
     E[T(n)] &\le E\!\left[ 2 \sum_{k=\lfloor n/2 \rfloor}^{n-1} X_k \bigl( T(k) + \Theta(n) \bigr) \right] \\
             &= 2 \sum_{k=\lfloor n/2 \rfloor}^{n-1} E\bigl[ X_k \bigl( T(k) + \Theta(n) \bigr) \bigr] \\
             &= 2 \sum_{k=\lfloor n/2 \rfloor}^{n-1} E[X_k] \cdot E\bigl[ T(k) + \Theta(n) \bigr] \\
             &= \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} E[T(k)] + \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} \Theta(n) \\
             &= \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} E[T(k)] + \Theta(n)
     \end{aligned}
     $$
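The value E[X_k] = 1/n used in this derivation can be checked by enumeration: with n distinct elements and a pivot chosen uniformly at random, the pivot of rank k+1 is the only one that produces a k : n-k-1 split. The sketch below, with the hypothetical helper `split_probabilities`, counts this exactly with rational arithmetic.

```python
from fractions import Fraction


def split_probabilities(n):
    """Return [P(k : n-k-1 split) for k = 0..n-1] under a uniformly random pivot."""
    counts = [0] * n
    for pivot_rank in range(1, n + 1):  # each rank is chosen with probability 1/n
        k = pivot_rank - 1              # number of elements smaller than the pivot
        counts[k] += 1
    return [Fraction(c, n) for c in counts]


probs = split_probabilities(8)
```

Since X_k is an indicator variable, its expectation equals the probability of the event it indicates.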

  13. Hairy recurrence

     (But not quite as hairy as the quicksort one.)

     $$
     E[T(n)] \le \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} E[T(k)] + \Theta(n)
     $$

     Prove: E[T(n)] ≤ cn for constant c > 0.
     • The constant c can be chosen large enough so that E[T(n)] ≤ cn for the base cases.

     Use fact: $\sum_{k=\lfloor n/2 \rfloor}^{n-1} k \le \frac{3}{8} n^2$ (exercise).
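The fact left as an exercise can be checked numerically for small n before proving it. This is a brute-force check rather than a proof, and the function names are hypothetical. Comparing 8·Σk against 3n² keeps everything in integer arithmetic.

```python
def upper_half_sum(n):
    """Sum of k for k = floor(n/2), ..., n-1."""
    return sum(range(n // 2, n))


def fact_holds(n):
    """Check sum_{k=floor(n/2)}^{n-1} k <= (3/8) n^2 without floating point."""
    return 8 * upper_half_sum(n) <= 3 * n * n


all_hold = all(fact_holds(n) for n in range(1, 2000))
```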

  14. Substitution method

     $$
     E[T(n)] \le \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} ck + \Theta(n)
     $$

     Substitute inductive hypothesis.

  15. Substitution method

     $$
     \begin{aligned}
     E[T(n)] &\le \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} ck + \Theta(n) \\
             &\le \frac{2c}{n} \left( \frac{3}{8} n^2 \right) + \Theta(n)
     \end{aligned}
     $$

     Use fact.

  16. Substitution method

     $$
     \begin{aligned}
     E[T(n)] &\le \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} ck + \Theta(n) \\
             &\le \frac{2c}{n} \left( \frac{3}{8} n^2 \right) + \Theta(n) \\
             &= cn - \left( \frac{cn}{4} - \Theta(n) \right)
     \end{aligned}
     $$

     Express as desired − residual.

  17. Substitution method

     $$
     \begin{aligned}
     E[T(n)] &\le \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} ck + \Theta(n) \\
             &\le \frac{2c}{n} \left( \frac{3}{8} n^2 \right) + \Theta(n) \\
             &= cn - \left( \frac{cn}{4} - \Theta(n) \right) \\
             &\le cn,
     \end{aligned}
     $$

     if c is chosen large enough so that cn/4 dominates the Θ(n).
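To see the bound concretely, the recurrence can be evaluated numerically under two simplifying assumptions that are not in the slides: the Θ(n) term is taken to be exactly a·n, and T(0) = 0. With those choices cn/4 ≥ a·n holds as soon as c = 4a, so the values should stay below 4a·n. The function name `expected_time_bound` is hypothetical.

```python
def expected_time_bound(n_max, a=1.0):
    """Evaluate T(n) = (2/n) * sum_{k=floor(n/2)}^{n-1} T(k) + a*n with T(0) = 0."""
    t = [0.0] * (n_max + 1)
    prefix = [0.0] * (n_max + 2)  # prefix[j] = t[0] + ... + t[j-1]
    for n in range(1, n_max + 1):
        prefix[n] = prefix[n - 1] + t[n - 1]
        upper_sum = prefix[n] - prefix[n // 2]  # t[floor(n/2)] + ... + t[n-1]
        t[n] = 2.0 / n * upper_sum + a * n
    return t


t = expected_time_bound(2000)
worst_ratio = max(t[n] / n for n in range(1, 2001))  # stays below c = 4
```

The prefix sums keep the evaluation linear in `n_max` instead of quadratic.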

  18. Summary of randomized order-statistic selection

     • Works fast: linear expected time.
     • Excellent algorithm in practice.
     • But, the worst case is very bad: Θ(n²).

     Q. Is there an algorithm that runs in linear time in the worst case?
     A. Yes, due to Blum, Floyd, Pratt, Rivest, and Tarjan [1973].

     IDEA: Generate a good pivot recursively.

  19. Worst-case linear-time order statistics

     SELECT(i, n)
     1. Divide the n elements into groups of 5. Find the median of each 5-element group by rote.
     2. Recursively SELECT the median x of the ⌊n/5⌋ group medians to be the pivot.
     3. Partition around the pivot x. Let k = rank(x).
     4. if i = k then return x
        elseif i < k
            then recursively SELECT the i-th smallest element in the lower part
            else recursively SELECT the (i-k)-th smallest element in the upper part
        (Step 4 is the same as in RAND-SELECT.)
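The four steps can be sketched in Python as follows. This is a minimal illustration, not the slides' implementation. It assumes distinct elements (as the analysis does), builds new lists instead of partitioning in place, sorts each group of 5 to find its median, ignores any leftover partial group when picking the pivot, and falls back to sorting for inputs of at most 5 elements. The function name `select` and its `(items, i)` signature are choices made here.

```python
def select(items, i):
    """Return the i-th smallest (1-indexed) of the distinct values in items."""
    items = list(items)
    if len(items) <= 5:
        return sorted(items)[i - 1]
    # Step 1: median of each full 5-element group.
    full = len(items) - len(items) % 5
    medians = [sorted(items[j:j + 5])[2] for j in range(0, full, 5)]
    # Step 2: recursively select the median x of the floor(n/5) group medians.
    x = select(medians, (len(medians) + 1) // 2)
    # Step 3: partition around the pivot x; k = rank(x).
    lower = [v for v in items if v < x]
    upper = [v for v in items if v > x]
    k = len(lower) + 1
    # Step 4: return x or recurse into the side that contains rank i.
    if i == k:
        return x
    if i < k:
        return select(lower, i)
    return select(upper, i - k)


data = [6, 10, 13, 5, 8, 3, 2, 11]
seventh = select(data, 7)
```

With duplicate values the `upper` list would drop copies of x and the ranks would shift, so a production version would need three-way partitioning.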

  20. Choosing the pivot

  21. Choosing the pivot

     1. Divide the n elements into groups of 5.

  22. Choosing the pivot

     1. Divide the n elements into groups of 5. Find the median of each 5-element group by rote.

     (Figure labels: lesser, greater.)

  23. Choosing the pivot

     1. Divide the n elements into groups of 5. Find the median of each 5-element group by rote.
     2. Recursively SELECT the median x of the ⌊n/5⌋ group medians to be the pivot.

     (Figure labels: x, lesser, greater.)

  24. Developing the recurrence

     SELECT(i, n), with running time T(n); the cost of each step is given in brackets.
     1. Divide the n elements into groups of 5. Find the median of each 5-element group by rote. [Θ(n)]
     2. Recursively SELECT the median x of the ⌊n/5⌋ group medians to be the pivot. [T(n/5)]
     3. Partition around the pivot x. Let k = rank(x). [Θ(n)]
     4. if i = k then return x
        elseif i < k
            then recursively SELECT the i-th smallest element in the lower part
            else recursively SELECT the (i-k)-th smallest element in the upper part
        [T(?)]

  25. Analysis

     (Assume all elements are distinct.)

     At least half the group medians are ≤ x, which is at least ⌊⌊n/5⌋/2⌋ = ⌊n/10⌋ group medians.

     (Figure labels: x, lesser, greater.)

  26. Analysis

     (Assume all elements are distinct.)

     At least half the group medians are ≤ x, which is at least ⌊⌊n/5⌋/2⌋ = ⌊n/10⌋ group medians.
     • Therefore, at least 3⌊n/10⌋ elements are ≤ x.

     (Figure labels: x, lesser, greater.)

  27. Analysis

     (Assume all elements are distinct.)

     At least half the group medians are ≤ x, which is at least ⌊⌊n/5⌋/2⌋ = ⌊n/10⌋ group medians.
     • Therefore, at least 3⌊n/10⌋ elements are ≤ x.
     • Similarly, at least 3⌊n/10⌋ elements are ≥ x.

     (Figure labels: x, lesser, greater.)
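The two counting claims can be spot-checked on random inputs. The sketch below assumes x is the lower median of the medians of the full 5-element groups and uses a fixed seed so the run is repeatable. The helper name `median_of_medians_counts` is hypothetical, and the check is an illustration rather than a proof.

```python
import random


def median_of_medians_counts(items):
    """Return (x, #elements <= x, #elements >= x) for the median-of-medians pivot x."""
    full = len(items) - len(items) % 5
    medians = sorted(sorted(items[j:j + 5])[2] for j in range(0, full, 5))
    x = medians[(len(medians) - 1) // 2]  # lower median of the group medians
    at_most = sum(1 for v in items if v <= x)
    at_least = sum(1 for v in items if v >= x)
    return x, at_most, at_least


rng = random.Random(2008)
claims_hold = True
for n in range(5, 200):
    items = rng.sample(range(10 * n), n)  # n distinct values
    _, at_most, at_least = median_of_medians_counts(items)
    if at_most < 3 * (n // 10) or at_least < 3 * (n // 10):
        claims_hold = False
```

Each group whose median is ≤ x contributes its median and the two smaller elements of the group, which is where the factor 3 comes from.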
