
Algorithm Engineering (a.k.a. How to Write Fast Code), CS260



  1. Algorithm Engineering (a.k.a. How to Write Fast Code), CS260, Lecture 6. Yan Gu. I/O Algorithms and Parallel Samplesort

  2. Outline: The I/O Model • Sampling in Algorithm Design • Parallel Samplesort (CS260: Algorithm Engineering, Lecture 6)

  4. Last week - The I/O model
     • The I/O model has two special memory-transfer instructions:
       • Read transfer: load a block from slow memory
       • Write transfer: write a block to slow memory
     • The complexity of an algorithm in the I/O model (its I/O complexity) is measured by: #(read transfers) + #(write transfers)
     • [Figure: a CPU attached to a fast memory of M/B blocks, each of size B, backed by an unbounded slow memory]
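
An added example of how costs are counted in this model (not on the slide; it assumes the n input elements are stored contiguously in slow memory): scanning an array of n elements costs #(read transfers) = O(⌈n/B⌉), since every read transfer brings in B consecutive elements.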

  5. Cache-Oblivious Algorithms
     • Algorithms not parameterized by B or M
     • These algorithms are unaware of the parameters of the memory hierarchy
     • Analyzed in the ideal-cache model, which is the same as the I/O model except that optimal replacement is assumed
     • [Figure: a CPU attached to a fast memory of M/B blocks, each of size B, backed by an unbounded slow memory]

  6. Outline: The I/O Model • Sampling in Algorithm Design • Parallel Samplesort (CS260: Algorithm Engineering, Lecture 6)

  7. Why Sampling?
     • Yan has an array {a_0, a_1, ..., a_{n-1}} such that each a_i = 0 or 1, and Yan wants to know how many 0s are in the array
     • Scan: linear work, and it can be parallelized
     • Sounds like a good idea?
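
A tiny sketch of the scan idea (added code, not from the slides; the function name and the use of an OpenMP reduction are assumptions):

    #include <cstddef>
    #include <vector>

    // Count how many a_i equal 0 with one linear scan; the OpenMP pragma lets
    // the loop run in parallel, combining the per-thread partial counts.
    std::size_t count_zeros(const std::vector<int>& a) {
      std::size_t zeros = 0;
      #pragma omp parallel for reduction(+ : zeros)
      for (std::ptrdiff_t i = 0; i < (std::ptrdiff_t)a.size(); ++i)
        if (a[i] == 0) ++zeros;
      return zeros;
    }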

  8. Why Sampling?
     • Yan has an array {a_0, a_1, ..., a_{n-1}} and a function f(·) such that f(a_i) = 0 or 1, and Yan wants to know how many f(a_i) = 0

  9. Why Sampling?
     • Yan has an array {a_0, a_1, ..., a_{n-1}} and n functions f_1(·), ..., f_n(·) such that f_j(a_i) = 0 or 1, and Yan wants to know, for each f_j, how many f_j(a_i) = 0
     • This takes quadratic work, which does not work for reasonable input sizes
     • Examples:
       • Find the median m of the a_i: with f_m(a_i) = "a_i < m", check whether #(f_m(a_i) = 0) is n/2
       • Find a good pivot p in quicksort (e.g., n/4 ≤ #(f_p(a_i) = 0) ≤ 3n/4)
       • Guarantee all sorts of properties in graph, geometry, and other algorithms

  10. Approximate Solution: Sampling
     • Yan has an array {a_0, a_1, ..., a_{n-1}} and a function f(·) such that f(a_i) = 0 or 1, and Yan wants to know how many f(a_i) = 0
     • Uniformly at random pick k elements, count the f(a_i) = 0 cases among them (denote this count by k_0), and estimate the answer by n·k_0/k
     • As long as k is sufficiently large, we are "confident" in our estimate
     • On the other hand, when k is small, the result can be essentially random
     • When is the estimation good?
     • What is "good"?
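
A minimal sketch of this estimator (added code; the function name, the seeding, and passing f as a std::function are assumptions, not the lecture's code):

    #include <cstddef>
    #include <functional>
    #include <random>
    #include <vector>

    // Sample k positions uniformly at random, count the hits (f(a_i) == 0),
    // and scale the hit count k0 by n/k to estimate the true number of hits.
    double estimate_zero_count(const std::vector<int>& a, std::size_t k,
                               const std::function<int(int)>& f) {
      std::mt19937_64 rng(std::random_device{}());
      std::uniform_int_distribution<std::size_t> pick(0, a.size() - 1);
      std::size_t k0 = 0;
      for (std::size_t s = 0; s < k; ++s)
        if (f(a[pick(rng)]) == 0) ++k0;
      return (double)a.size() * (double)k0 / (double)k;
    }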

  11. Approximate Solution: Sampling
     • What is "good"?
       • With high probability (informal): happens with probability 1 − n^{-c} for any constant c > 0
       • This is large when n is reasonably large, e.g., n > 10^6
     • When is the estimation good?
       • Claim: when k_0 is Ω(log n)
     • How far can reality be off from the estimate?

  12. Approximate Solution: Sampling
     • When is the estimation good?
       • Claim: when k_0 is Ω(log n)
     • How far can reality be off from the estimate?
       • Assume there are z elements with f(a_i) = 0, and we take k samples with k_0 hits. The expected number of hits is E[k_0] = kz/n.
       • The probability that this is off by 100% (i.e., k_0 > 2kz/n) is e^{-kz/(3n)}
     • Chernoff bound: for n independent random variables in {0, 1}, let X be their sum and μ = E[X]; then for any 0 ≤ δ ≤ 1, Pr[X ≥ (1 + δ)μ] ≤ e^{-δ²μ/3}
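
To spell out how the e^{-kz/(3n)} figure follows from the Chernoff bound (an added worked step, not on the slide): instantiate the bound with X = k_0, μ = E[k_0] = kz/n, and δ = 1, giving

    Pr[k_0 ≥ 2kz/n] = Pr[X ≥ (1 + δ)μ] ≤ e^{-δ²μ/3} = e^{-kz/(3n)}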

  13. Approximate Solution: Sampling
     • When is the estimation good?
       • Claim: when k_0 is Ω(log n)
     • How far can reality be off from the estimate?
       • Assume there are z elements with f(a_i) = 0, and we take k samples with k_0 hits. The expected number of hits is E[k_0] = kz/n.
       • The probability that this is off by 100% (i.e., k_0 > 2kz/n) is e^{-kz/(3n)}
       • Since k_0 ≈ kz/n, the bound e^{-kz/(3n)} is n^{-c} when k_0 = Ω(log n), because e^{-kz/(3n)} ≈ e^{-k_0/3} < e^{-c' log_2 n} = n^{-c}

  14. Approximate Solution: Sampling
     • When is the estimation good?
       • Claim: when k_0 is Ω(log n)
     • How far can reality be off from the estimate?
       • Assume there are z elements with f(a_i) = 0, and we take k samples with k_0 hits. The expected number of hits is E[k_0] = kz/n.
       • The probability that this is off by 1% (i.e., k_0 > 1.01·kz/n) is e^{-δ²kz/(3n)} with δ = 0.01
       • Since k_0 ≈ kz/n, the bound e^{-δ²kz/(3n)} is n^{-c} when k_0 = Ω(log n), because e^{-δ²kz/(3n)} ≈ e^{-k_0/(3·100²)} < e^{-c' log_2 n} = n^{-c}
     • Chernoff bound: for n independent random variables in {0, 1}, let X be their sum and μ = E[X]; then for any 0 < δ < 1, Pr[X ≥ (1 + δ)μ] ≤ e^{-δ²μ/3}

  15. Rules of Thumb for Sampling
     • Example applications:
       • Find the median m of the a_i: with f_m(a_i) = "a_i < m", check whether #(f_m(a_i) = 0) is n/2
       • Find a good pivot p in quicksort (e.g., n/4 ≤ #(f_p(a_i) = 0) ≤ 3n/4)
       • Guarantee all sorts of properties in graph, geometry, and other algorithms
     • Take some samples! Uniformly at random pick k elements, count the f(a_i) = 0 cases (denoted k_0), and estimate by n·k_0/k
       • 4 sample hits give you a reasonable result
       • 20 sample hits give you confidence
       • 100 sample hits are sufficient!
     • Remember: only hits count

  16. Outline: The I/O Model • Sampling in Algorithm Design • Parallel Samplesort (CS260: Algorithm Engineering, Lecture 6)

  17. Parallel and I/O-efficient Sorting Algorithms
     • Classic sorting algorithms are easy to parallelize
       • Quicksort: find a "good" pivot, apply a partition (filter) to separate the elements that are smaller from those that are larger, and recurse (see the sketch below)
       • Mergesort: apply parallel merge for log_2 n rounds
     • But they are not I/O-efficient, since they need log_2 n rounds of global data movement
     • We now introduce samplesort, which is both highly parallel and I/O-efficient
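
A small sketch of the quicksort bullet above (added code, written sequentially; in the parallel version the filter and the two recursive calls run in parallel, and the pivot would come from sampling rather than being taken from the middle):

    #include <vector>

    // Quicksort expressed with an explicit partition ("filter") step: split the
    // input into elements smaller than, equal to, and greater than the pivot,
    // recurse on the two outer parts, and concatenate the results.
    void quicksort_filter(std::vector<int>& a) {
      if (a.size() <= 1) return;
      int pivot = a[a.size() / 2];          // placeholder pivot choice
      std::vector<int> less, equal, greater;
      for (int x : a) {                     // the filter / partition step
        if (x < pivot) less.push_back(x);
        else if (x > pivot) greater.push_back(x);
        else equal.push_back(x);
      }
      quicksort_filter(less);               // these two calls can run in parallel
      quicksort_filter(greater);
      a.clear();
      a.insert(a.end(), less.begin(), less.end());
      a.insert(a.end(), equal.begin(), equal.end());
      a.insert(a.end(), greater.begin(), greater.end());
    }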

  18. Sample-sort outline
     • Analogous to multiway quicksort
     1. Split the input array into √n contiguous subarrays of size √n, and sort the subarrays recursively
     • [Figure: the input array split into √n subarrays, each of size √n and sorted]

  19. Sample-sort outline
     • Analogous to multiway quicksort
     1. Split the input array into √n contiguous subarrays of size √n, and sort the subarrays recursively (sequentially)
     • [Figure: the input array split into √n subarrays, each of size √n and sorted]

  20. Sample-sort outline
     2. Choose √n − 1 "good" pivots p_1 ≤ p_2 ≤ ... ≤ p_{√n−1}
     3. Distribute the subarrays into buckets, according to the pivots
     • [Figure: the √n sorted subarrays flow into √n buckets, each of size ≈ √n; Bucket 1 holds keys ≤ p_1, Bucket 2 holds keys in (p_1, p_2], ..., Bucket √n holds keys > p_{√n−1}]

  21. Sample-sort outline
     4. Recursively sort the buckets
     5. Copy the concatenated buckets back to the input array, which is now sorted
     • [Figure: Bucket 1 (keys ≤ p_1), Bucket 2 (keys in (p_1, p_2]), ..., Bucket √n (keys > p_{√n−1}), concatenated back into the sorted input array]
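
Putting steps 1-5 together, a minimal sequential sketch (added code, not the lecture's implementation; the base-case cutoff, the oversampling factor c, and the duplicate-key fallback are assumptions, and the per-element distribution here does not yet exploit the sorted subarrays the way the real distribution phase does):

    #include <algorithm>
    #include <cmath>
    #include <cstddef>
    #include <random>
    #include <vector>

    void samplesort(std::vector<int>& a) {
      std::size_t n = a.size();
      if (n < 1024) { std::sort(a.begin(), a.end()); return; }   // small input: plain sort
      std::size_t m = (std::size_t)std::sqrt((double)n);         // ≈ √n subarrays / buckets

      // 1. Sort each contiguous subarray of size ≈ √n (these sorts could run in parallel).
      for (std::size_t i = 0; i < n; i += m)
        std::sort(a.begin() + i, a.begin() + std::min(i + m, n));

      // 2. Choose m − 1 pivots from c·√n·log n random samples (see slide 22 below).
      std::mt19937_64 rng(12345);
      std::uniform_int_distribution<std::size_t> pick(0, n - 1);
      std::size_t c = 2, num_samples = c * m * (std::size_t)std::log2((double)n);
      std::vector<int> samples(num_samples);
      for (auto& s : samples) s = a[pick(rng)];
      std::sort(samples.begin(), samples.end());
      std::vector<int> pivots;
      for (std::size_t j = 1; j < m; ++j)                        // every (c·log n)-th sample
        pivots.push_back(samples[j * num_samples / m]);

      // 3. Distribute every element into one of the m buckets (naive per-element version).
      std::vector<std::vector<int>> buckets(m);
      for (int x : a) {
        std::size_t b = std::upper_bound(pivots.begin(), pivots.end(), x) - pivots.begin();
        buckets[b].push_back(x);
      }
      for (auto& bkt : buckets)                       // degenerate pivots (many duplicate
        if (bkt.size() == n) { std::sort(a.begin(), a.end()); return; }  // keys): fall back

      // 4.-5. Recursively sort each bucket and copy the concatenation back.
      std::size_t pos = 0;
      for (auto& bkt : buckets) {
        samplesort(bkt);
        std::copy(bkt.begin(), bkt.end(), a.begin() + pos);
        pos += bkt.size();
      }
    }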

  22. Choosing good pivots based on sampling
     2. Choose √n − 1 "good" pivots p_1 ≤ p_2 ≤ ... ≤ p_{√n−1}
     • This can be achieved by picking c·√n·log n random samples, sorting them, and taking every (c·log n)-th element
     • This step is fast

  23. Sequential local sorts (e.g., call std::sort)
     1. Split the input array into √n contiguous subarrays of size √n, and sort the subarrays recursively (sequentially)
     • [Figure: the input array split into √n sorted subarrays]
     4. Recursively sort the buckets (sequentially)
     • [Figure: Bucket 1 (keys ≤ p_1), Bucket 2 (keys in (p_1, p_2]), ..., Bucket √n (keys > p_{√n−1})]

  24. Key Part: the Distribution Phase
     3. Distribute the subarrays into buckets, according to the pivots
     • [Figure: the √n sorted subarrays (each of size √n) flow into √n buckets, each of size ≈ √n; Bucket 1 holds keys ≤ p_1, Bucket 2 holds keys in (p_1, p_2], ..., Bucket √n holds keys > p_{√n−1}]
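
A rough sketch of one standard way to carry out this phase (added code and my reconstruction, not necessarily the lecture's exact scheme): because each subarray is already sorted, the elements bound for bucket b form one contiguous run inside it, found by binary-searching the pivots; counting the run lengths and prefix-summing them gives every run's output offset, so whole runs can be copied as blocks, and in a parallel version all the copies run in parallel.

    #include <algorithm>
    #include <cstddef>
    #include <vector>

    // `a` consists of sorted subarrays of length `sub` (the last may be shorter);
    // `pivots` is sorted and defines pivots.size() + 1 buckets.  Returns the
    // elements rearranged so that each bucket's contents are contiguous.
    std::vector<int> distribute(const std::vector<int>& a, std::size_t sub,
                                const std::vector<int>& pivots) {
      std::size_t n = a.size();
      std::size_t num_sub = (n + sub - 1) / sub;
      std::size_t num_buckets = pivots.size() + 1;

      // splits[s][b] = index (within subarray s) of the first element of bucket b.
      std::vector<std::vector<std::size_t>> splits(
          num_sub, std::vector<std::size_t>(num_buckets + 1, 0));
      for (std::size_t s = 0; s < num_sub; ++s) {
        auto lo = a.begin() + s * sub;
        auto hi = a.begin() + std::min((s + 1) * sub, n);
        splits[s][num_buckets] = (std::size_t)(hi - lo);
        for (std::size_t b = 1; b < num_buckets; ++b)
          splits[s][b] = (std::size_t)(std::upper_bound(lo, hi, pivots[b - 1]) - lo);
      }

      // Walk the buckets in order; each (subarray, bucket) run is copied as a block.
      std::vector<int> out(n);
      std::size_t offset = 0;
      for (std::size_t b = 0; b < num_buckets; ++b)
        for (std::size_t s = 0; s < num_sub; ++s) {
          auto lo = a.begin() + s * sub;
          std::copy(lo + splits[s][b], lo + splits[s][b + 1], out.begin() + offset);
          offset += splits[s][b + 1] - splits[s][b];
        }
      return out;
    }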
