
Presented at the 8th Scandinavian Workshop on Algorithm Theory

Presented at the 8th Scandinavian Workshop on Algorithm Theory held on 3–5 July 2002 in Turku, Finland. Title: A randomized in-place algorithm for positioning the kth element in a multiset. Authors: Jyrki Katajainen and Tomi Pasanen.


  1. Presented at the 8th Scandinavian Workshop on Algorithm Theory held on 3–5 July 2002 in Turku, Finland.
     Title: A randomized in-place algorithm for positioning the kth element in a multiset
     Authors: Jyrki Katajainen and Tomi Pasanen
     Speaker: Jyrki Katajainen
     These slides are available at http://www.cphstl.dk . This bunch also contains slides that I did not have time to show.
     © Performance Engineering Laboratory

  2. Algorithm senility
     Strictly in-place algorithms: In addition to the input sequence, use only O(1) extra words of memory.
     Element moves: Elements in the input sequence must be moved by swapping elements wordwise.
     Practical relevance: ≈ 0
     Theoretical motivation: What can be done efficiently when only O(1) memory cells are available?

  3. Positioning
     Input: A sequence A of n elements, an integer k ∈ [1: ⌈n/2⌉], and an ordering function < returning true or false.
     Task: Rearrange the elements of A such that A[k] < A[j] is false for all j ∈ [1: k−1], and A[ℓ] < A[k] is false for all ℓ ∈ [k+1: n].
     [Figure: the sequence A with positions 1, k and n marked; no element to the left of position k is larger than A[k], and no element to the right of it is smaller.]
     Goodies:
     1. Do positioning, not only selection.
     2. Operate (strictly) in-place.
     3. Handle multiset data.
     4. Rely only on boolean ordering functions (binary element comparisons).
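The task statement above can be turned into a small postcondition checker. This is only a sketch: the name `is_positioned` and the use of `std::vector` are assumptions made for illustration, not part of the slides.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper: returns true if the element at 1-based position k
// satisfies the postcondition of the positioning task, i.e.
//   less(a[k], a[j]) is false for all j < k, and
//   less(a[l], a[k]) is false for all l > k.
template <typename T, typename Ordering = std::less<T>>
bool is_positioned(const std::vector<T>& a, std::size_t k,
                   Ordering less = Ordering()) {
    assert(k >= 1 && k <= a.size());
    const T& kth = a[k - 1];  // the slides index from 1, C++ from 0
    for (std::size_t j = 0; j + 1 < k; ++j)
        if (less(kth, a[j])) return false;  // something on the left is larger
    for (std::size_t l = k; l < a.size(); ++l)
        if (less(a[l], kth)) return false;  // something on the right is smaller
    return true;
}
```

Because only the boolean ordering function is used, equal elements may sit on either side of position k, which is what makes the definition work for multisets.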

  4. STL interface

         template <typename random_access_iterator>
         void nth_element(random_access_iterator first,
                          random_access_iterator nth,
                          random_access_iterator one_past_the_end);

         template <typename random_access_iterator, typename ordering>
         void nth_element(random_access_iterator first,
                          random_access_iterator nth,
                          random_access_iterator one_past_the_end,
                          ordering less);

         template <typename element>
         struct less : binary_function<element, element, bool> {
           bool operator()(const element& x, const element& y) const {
             return x < y;
           }
         };
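As a usage illustration of the interface above, the standard `std::nth_element` can be called with or without an ordering. The wrapper name `position_kth` and the 1-based `k` are assumptions chosen to match the slides' notation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical wrapper: positions the element of rank k (1-based, as on the
// slides) with std::nth_element and returns a copy of it.
template <typename T, typename Ordering = std::less<T>>
T position_kth(std::vector<T>& a, std::size_t k, Ordering less = Ordering()) {
    assert(k >= 1 && k <= a.size());
    std::nth_element(a.begin(), a.begin() + (k - 1), a.end(), less);
    return a[k - 1];
}
```

For the multiset {5, 3, 3, 1, 4, 3, 2}, `position_kth(v, 4)` returns 3, and passing `std::greater<int>()` positions with respect to the reversed order instead.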

  5. Known in-place results

     | Reference | Runtime | #Comps | #Swaps | Comments |
     |---|---|---|---|---|
     | [Hoare 1961, Kirschenhofer et al. 1997] | exp. O(n) | 2.75n + o(n) | 0.46n + o(n) | median of 3; bounds for k = ⌈n/2⌉ |
     | [Floyd & Rivest 1975] | exp. O(n) | n + k + o(n) | k + o(n) | for sets |
     | [Lai & Wood 1988] | O(n) | 6.9n + o(n) | > 9n | 3-way comps |
     | [Cunto & Munro 1989] | Ω(n) | n + k − O(1) | k | all Las-Vegas algorithms |
     | [Carlsson & Sundström 1995] | O(n) | (2.95 + ε)n | O(n) | median finding; 3-way comps; moves in registers gratis |
     | [Carlsson & Sundström 1995] | O(n) | 3.75n + o(n) | > 4.5n + o(n) | selection; 3-way comps; moves in registers gratis |
     | [Geffert 2000] | O(n) | O(n log₂(1/ε)) | εn | selection; 3-way comps |

  6. Our results
     A Las-Vegas algorithm:

     | Runtime | #Comps | #Swaps | Comments |
     |---|---|---|---|
     | O(n) | n + k + o(n) | k + o(n) | if both < and = are given |
     | O(n) | 2n + o(n) | k + o(n) | if only < is given |

     The probability that these resource bounds are exceeded is at most $e^{-n^{\Omega(1)}}$.
     A deterministic algorithm:

     | Runtime | #Comps | #Swaps | Comments |
     |---|---|---|---|
     | O(n) | 3.64n + 0.72k + o(n) | O(n) | based on the algorithm of Schönhage et al. [1976] |

     The last result is not presented in the proceedings.

  7. Randomized algorithm using o(n) extra space

         Position(A, k, <)
          1  n ← |A|; s ← n^β                              ⊲ 0 < β < 1
          2  if n < some constant or space available < s:
          3      Sort(A, <); return
          4  Pick a random sample S of size s from A; tag each element with its index
          5  Sort(S, lex-<)
          6  if k < n^γ:                                   ⊲ 1 − β < γ < 1
          7      µ ← n^γ s/n; y ← S[2µ]
          8      M, R ← 2-Partition(A, y, lex-<)
          9      if |M| < n^γ: Sort(A, <)
         10      else: Sort(M, <)                          ⊲ normal mode
         11  else:
         12      µ ← ks/n; ∆ ← n^α µ^(1/2)                 ⊲ 0 < α < β
         13      λ ← ⌊µ − ∆⌋; ν ← ⌈µ + ∆⌉
         14      x ← S[λ]; y ← S[ν]
         15      L, M, R ← 3-Partition(A, x, y, lex-<)
         16      if |L| ≥ k or |R| ≥ n − k: Sort(A, <)
         17      else: Sort(M, <)                          ⊲ normal mode
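The pseudocode above can be approximated in C++ as follows. This is a deliberately simplified sketch, not the authors' implementation. The assumptions are: the sample is stored in a separate `std::vector`; sampling is done with replacement; the 3-partition branch is used for every k; and elements are compared by value only, so the lex-< index tags are omitted (with many equal elements the middle part M can then be large, which affects the running time but not correctness). The names `position_sketch` and `seed`, and the cutoff 64, are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <random>
#include <vector>

template <typename T, typename Ordering = std::less<T>>
void position_sketch(std::vector<T>& a, std::size_t k,
                     Ordering less = Ordering(), unsigned seed = 12345) {
    const std::size_t n = a.size();
    assert(k >= 1 && k <= n);
    const double alpha = 1.0 / 6, beta = 2.0 / 3;  // values from the analysis slide
    const std::size_t s =
        static_cast<std::size_t>(std::pow(static_cast<double>(n), beta));
    if (n < 64 || s < 2) {  // "some constant": 64 is an arbitrary choice
        std::sort(a.begin(), a.end(), less);
        return;
    }
    // Lines 4-5: draw a random sample and sort it.
    std::mt19937 gen(seed);
    std::uniform_int_distribution<std::size_t> pick(0, n - 1);
    std::vector<T> sample;
    sample.reserve(s);
    for (std::size_t i = 0; i < s; ++i) sample.push_back(a[pick(gen)]);
    std::sort(sample.begin(), sample.end(), less);
    // Lines 12-14: pivots x and y around the expected sample rank mu = k s / n.
    const double mu = static_cast<double>(k) * s / n;
    const double delta = std::pow(static_cast<double>(n), alpha) * std::sqrt(mu);
    const double lo = std::floor(mu - delta), hi = std::ceil(mu + delta);
    const T x = sample[lo < 1 ? 0 : static_cast<std::size_t>(lo) - 1];
    const T y = sample[hi > s ? s - 1 : static_cast<std::size_t>(hi) - 1];
    // Line 15: 3-partition into L (< x), M (neither < x nor > y), R (> y).
    auto m_begin = std::partition(a.begin(), a.end(),
                                  [&](const T& e) { return less(e, x); });
    auto m_end = std::partition(m_begin, a.end(),
                                [&](const T& e) { return !less(y, e); });
    const std::size_t l_size = static_cast<std::size_t>(m_begin - a.begin());
    const std::size_t lm_size = static_cast<std::size_t>(m_end - a.begin());
    // Lines 16-17: sort only M if rank k falls inside it, else fall back.
    if (l_size < k && k <= lm_size) std::sort(m_begin, m_end, less);
    else std::sort(a.begin(), a.end(), less);
}
```

Whatever the random sample looks like, the result is correct: either rank k lies inside the sorted middle part, or the whole sequence is sorted. The sample only influences how often the cheap branch is taken.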

  8. Analysis: normal mode
     [Figure: for line 6 (if k < n^γ), the sequence 1..n is split into M and R with position k inside M; for line 11 (else), it is split into L, M and R with position k inside M.]
     – In this mode, the kth element falls in M and M is small.
     – Since s = n^β, 0 < β < 1, the manipulation of the sample takes o(n) time.
     – If |M| = o(n / log₂ n), the sorting of M takes o(n) time.
     – By carefully implementing 2-Partition and 3-Partition, the claimed bounds follow.

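  9. Failure modes
     The algorithm may fail in six ways. In the figure, each case shows the sequence 1..n with position k and an arrow (↓) marking a boundary of the named part:
     Case k < n^γ:
     1. M, with the arrow before position k.
     2. M, with the arrow after position k.
     Case k ≥ n^γ:
     3. L, with the arrow after position k.
     4. R, with the arrow before position k.
     5. M, with the arrow before position k.
     6. M, with the arrow after position k.
     The probabilities of these failures can be bounded above by Chernoff bounds.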

  10. Analysis: failure mode 3 (k ≥ n^γ)
      [Figure: the sorted sample 1..s with the pivots x and y at distance $\Delta$ on either side of rank $\mu$, where $\mu = ks/n$ and $\Delta = n^{\alpha}\mu^{1/2}$; below it, the sequence 1..n in which the boundary of L (↓) falls after position k.]
      Define $X_i = 1$ if the $i$th sample element is lexicographically smaller than the $k$th element, and $X_i = 0$ otherwise. For $X = \sum_{i=1}^{s} X_i$, $E[X] = \mu = ks/n$. In the case of failure, $X < \mu - \Delta$.
      We bound the lower tail probability of $X$ using the simplified Chernoff bound [Motwani & Raghavan 1995, Theorem 4.3]. With $\delta = \Delta/\mu$,
          $\Pr[X < \mu - \Delta] = \Pr[X < (1 - \delta)\mu] \le e^{-\mu\delta^{2}/2} = e^{-n^{2\alpha}/2}$,
      where the inequality is Theorem 4.3.
      For parameters $\alpha = 1/6$, $\beta = 2/3$, and $\gamma = 5/6$, we have that $\delta \le 2e - 1$, so we can use the simplified Chernoff bound.
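The claim about δ can be checked directly for the stated parameters. The short derivation below is added for illustration; it assumes only s = n^β, k ≥ n^γ, µ = ks/n and ∆ = n^α µ^{1/2}, as defined on the slides.

```latex
\begin{align*}
\mu &= \frac{ks}{n} \;\ge\; n^{\gamma + \beta - 1} = n^{5/6 + 2/3 - 1} = n^{1/2},\\
\delta &= \frac{\Delta}{\mu} = n^{\alpha}\mu^{-1/2} \;\le\; n^{1/6}\, n^{-1/4} = n^{-1/12} \;\le\; 1,\\
\mu\delta^{2} &= \mu \cdot \frac{n^{2\alpha}}{\mu} = n^{2\alpha} = n^{1/3},
\qquad\text{so}\qquad
\Pr[X < \mu - \Delta] \;\le\; e^{-n^{1/3}/2}.
\end{align*}
```

In particular δ stays well below 2e − 1, and the failure probability is of the form e^{−n^{Ω(1)}} quoted for the Las-Vegas algorithm.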

  11. Making it in-place
      [Figure: the sequence 1..n during in-place processing, with an area labelled "bits" that holds the encoded indices and regions labelled M, L, ? and R.]
      – Use the bit encoding technique of Munro [1986] to encode the indices of the elements in the sample. Two distinct elements x and y, x < y, can be used to represent a 0-bit (1-bit) by storing them in two consecutive locations in order xy (yx). By using ⌈log₂(n+1)⌉ such pairs an index can be represented.
      – If there are not enough distinct elements, the positioning problem is easy.
      – Store the elements used for encoding in the beginning of the sequence. To find the pairs fast, we need both < and =.
      – Rely on any efficient in-place sorting algorithm.
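The pair-based bit encoding described above can be illustrated with a short sketch. The function names `encode_index` and `decode_index`, the least-significant-bit-first layout, and the use of `std::vector` are assumptions for illustration; the slides only fix the convention that order xy means 0 and order yx means 1.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Stores `value` in `bits` consecutive pairs starting at a[first].
// Precondition: the two elements of every pair are distinct under `less`.
template <typename T, typename Ordering = std::less<T>>
void encode_index(std::vector<T>& a, std::size_t first, std::size_t bits,
                  std::size_t value, Ordering less = Ordering()) {
    assert(first + 2 * bits <= a.size());
    for (std::size_t i = 0; i < bits; ++i) {
        T& p = a[first + 2 * i];
        T& q = a[first + 2 * i + 1];
        assert(less(p, q) || less(q, p));      // the pair must be distinct
        const bool wanted = ((value >> i) & 1u) != 0;
        const bool stored = less(q, p);        // order yx encodes a 1-bit
        if (wanted != stored) std::swap(p, q); // only swaps, no extra memory
    }
}

// Reads back the value stored by encode_index.
template <typename T, typename Ordering = std::less<T>>
std::size_t decode_index(const std::vector<T>& a, std::size_t first,
                         std::size_t bits, Ordering less = Ordering()) {
    assert(first + 2 * bits <= a.size());
    std::size_t value = 0;
    for (std::size_t i = 0; i < bits; ++i)
        if (less(a[first + 2 * i + 1], a[first + 2 * i]))
            value |= std::size_t(1) << i;
    return value;
}
```

Encoding changes only the relative order inside each pair, so the multiset of elements is untouched and the encoding can be undone by writing the value 0.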

  12. Spiders and their use
      [Figure: Hasse diagram of an S_d^d spider, with d elements above and d elements below the centre; the centre of the spider is circled.]
      [Figure: a factory turns the n = t(2d+1) + r input elements into t S_d^d spiders, with r elements left over. Provided that r < t − 1, the spider at one extreme is labelled "more than ⌈n/2⌉ elements smaller" and the spider at the other extreme "more than ⌈n/2⌉ elements larger".]
      Keep the spiders in a priority deque, and repeatedly remove from this deque the spider with the smallest centre and the spider with the largest centre.

  13. Spider factory
      Let w be a bit string and let λ denote the empty string. A factory tree of type F_w is defined as follows:
      1. F_λ is a single node containing one element; this node is the centre of the tree.
      2. F_w0 consists of two disjoint factory trees of type F_w, T_0 and T_1, whose centres are connected. The element at the centre of T_0 should not be larger than that at the centre of T_1. The centre of T_0 is the centre of the whole tree.
      3. F_w1 is similar, but the centre of T_1 is the centre of the whole tree.
      [Figure: Hasse diagram of a factory tree of type F_0110.]
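The recursive definition above can be followed bottom-up: start with single-node trees and, for each bit of w from left to right, join the trees pairwise, keeping the smaller centre for a 0 and the larger centre for a 1. The sketch below tracks only the centres, not the edges of the Hasse diagram; the name `factory_centre` and the optional comparison counter are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Builds a factory tree of type F_w over 2^|w| elements and returns the
// element at its centre. One comparison is spent per join.
template <typename T, typename Ordering = std::less<T>>
T factory_centre(std::vector<T> centres, const std::string& w,
                 std::size_t* comparisons = nullptr,
                 Ordering less = Ordering()) {
    assert(centres.size() == (std::size_t(1) << w.size()));
    std::size_t count = 0;
    for (char bit : w) {
        assert(bit == '0' || bit == '1');
        std::vector<T> next;
        for (std::size_t i = 0; i + 1 < centres.size(); i += 2) {
            // Name the two trees so that centre(T_0) is not larger than centre(T_1).
            const bool swapped = less(centres[i + 1], centres[i]);
            ++count;
            const T& t0 = swapped ? centres[i + 1] : centres[i];
            const T& t1 = swapped ? centres[i] : centres[i + 1];
            next.push_back(bit == '0' ? t0 : t1);  // F_w0 keeps T_0, F_w1 keeps T_1
        }
        centres = std::move(next);
    }
    if (comparisons) *comparisons = count;
    return centres.front();
}
```

For the elements 1, 2, ..., 16 in this order and w = 0110, the successive centres are {1, 3, ..., 15}, {3, 7, 11, 15}, {7, 15} and finally 7, using 8 + 4 + 2 + 1 = 15 comparisons.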
