Sorting (Chapter 9)
Alexandre David, B2-206
21-04-2006 Alexandre David, MVP'06 2
Sorting
Problem: Arrange an unordered collection of elements into monotonically increasing (or decreasing) order. Let S = <a1,a2,…,an>. Sort S into S' = <a1',a2',…,an'> such that ai' ≤ aj' for 1 ≤ i ≤ j ≤ n and S' is a permutation of S.
Recall on Comparison Based Sorting Algorithms
Bubble sort: O(n²)
Selection sort: Θ(n²)
Insertion sort: Ω(n) best case, O(n²) worst case
Quick sort: Θ(n log n) on average
Merge sort: Θ(n log n)
Heap sort: Θ(n log n)
Characteristics of Sorting Algorithms
In-place sorting: No need for additional memory (or only constant size).
Stable sorting: Equal elements keep their original relative order.
Internal sorting: Elements fit in process memory.
External sorting: Elements are on auxiliary storage.
Fundamental Distinction
Comparison-based sorting:
Compare-exchange of pairs of elements. Lower bound is Ω(n log n) (proof based on decision trees).
Merge sort & heapsort are optimal.
Non-comparison-based sorting:
Uses information on the elements themselves (e.g. their integer values) to sort. Lower bound is Ω(n). Counting sort & radix sort are optimal.
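As a concrete instance of an optimal non-comparison sort, here is a minimal counting-sort sketch in Python (the function name and the assumption that keys are integers in range(k) are ours, not the slides'):

```python
def counting_sort(a, k):
    """Counting sort: non-comparison based, Theta(n + k) time.

    Assumes every element of a is an integer in range(k).
    """
    counts = [0] * k
    for x in a:                 # histogram of the values
        counts[x] += 1
    out = []
    for v in range(k):          # emit each value counts[v] times
        out.extend([v] * counts[v])
    return out
```

With k = O(n) this runs in Θ(n), matching the Ω(n) lower bound for non-comparison sorting.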
Issues in Parallel Sorting
Where to store input & output? One process or distributed?
An enumeration of the processes is used to distribute the output.
How to compare elements held by different processes?
How many elements per process? As many processes as elements ⇒ poor performance because of inter-process communication.
Parallel Compare-Exchange
Communication cost: ts + tw per exchange. The comparison itself is much cheaper ⇒ communication time dominates.
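A compare-exchange step can be sketched as follows (a sequential Python simulation; in the parallel setting the two elements live on different processes and the swap is a message exchange costing ts + tw):

```python
def compare_exchange(a, i, j):
    """One compare-exchange step: afterwards a[i] <= a[j].

    In the parallel setting, processes P_i and P_j exchange their
    elements and each keeps the min or the max respectively.
    """
    if a[i] > a[j]:
        a[i], a[j] = a[j], a[i]
```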
Blocks of Elements Per Process
P0, P1, …, Pp-1: n elements in total, n/p elements per process.
Sorted blocks satisfy A0 ≤ A1 ≤ … ≤ Ap-1.
Compare-Split
Exchange: Θ(ts + tw·n/p). Merge: Θ(n/p). Split: O(n/p).
For large blocks the total cost is Θ(n/p) per compare-split.
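The compare-split step on sorted blocks can be sketched like this (a Python simulation; sorted() stands in for the Θ(n/p) merge of two already-sorted blocks, and the pair of return values stands for what each process keeps):

```python
def compare_split(lo_block, hi_block):
    """Compare-split on two sorted blocks of equal size.

    The pair of processes exchanges whole blocks (Theta(ts + tw*n/p)),
    each merges them (Theta(n/p)), and then the lower-ranked process
    keeps the smaller half while the higher-ranked keeps the larger.
    """
    n = len(lo_block)
    merged = sorted(lo_block + hi_block)  # stands in for the merge
    return merged[:n], merged[n:]
```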
Sorting Networks
Mostly of theoretical interest. Key idea: perform many comparisons in parallel.
Key elements:
Comparators: 2 inputs, 2 outputs.
Network architecture: comparators arranged in columns, each column performing a permutation.
Speed proportional to the depth of the network.
Comparators
Sorting Networks
Bitonic Sequence
Definition: A bitonic sequence is a sequence of elements <a0,a1,…,an-1> such that
1. ∃i, 0 ≤ i ≤ n-1, such that <a0,…,ai> is monotonically increasing and <ai+1,…,an-1> is monotonically decreasing,
2. or there is a cyclic shift of indices so that 1) is satisfied.
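The definition can be checked mechanically; this brute-force Python predicate (our own helper, not from the slides) tries every cyclic shift and every peak position:

```python
def is_bitonic(seq):
    """Check the definition directly: some cyclic shift of seq is
    monotonically increasing up to some index, then decreasing."""
    n = len(seq)
    for shift in range(n):
        s = seq[shift:] + seq[:shift]          # cyclic shift
        for i in range(n):                      # candidate peak
            inc = all(s[k] <= s[k + 1] for k in range(i))
            dec = all(s[k] >= s[k + 1] for k in range(i, n - 1))
            if inc and dec:
                return True
    return False
```

This O(n³) check is only for illustrating the definition; the sorting algorithm never needs it.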
Bitonic Sort
Rearranges a bitonic sequence into sorted order.
Divide & conquer type of algorithm (similar to quicksort) using bitonic splits.
Sorting a bitonic sequence using bitonic splits = bitonic merge.
But first we need a bitonic sequence…
Bitonic Split
Given a bitonic sequence <a0,a1,…,an/2-1,an/2,an/2+1,…,an-1>:
s1 = <min{a0,an/2}, min{a1,an/2+1}, …, min{an/2-1,an-1}>
s2 = <max{a0,an/2}, max{a1,an/2+1}, …, max{an/2-1,an-1}>
Then every element of s1 ≤ every element of s2, and s1 & s2 are both bitonic!
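A minimal sketch of the bitonic split in Python (assuming an even-length bitonic input):

```python
def bitonic_split(a):
    """Split a bitonic sequence of even length into two bitonic
    halves s1, s2 with every element of s1 <= every element of s2."""
    h = len(a) // 2
    s1 = [min(a[i], a[i + h]) for i in range(h)]   # element-wise mins
    s2 = [max(a[i], a[i + h]) for i in range(h)]   # element-wise maxes
    return s1, s2
```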
Bitonic Merging Network
⊕BM[n]: the bitonic merging network for n elements has log n stages with n/2 comparators per stage.
Bitonic Sort
Use bitonic merging networks to merge bitonic sequences of increasing length, starting from length 2.
The bitonic merging network is a component of the full sorting network.
Bitonic Sort
log n merging stages; total depth O(log²n). Simulated on a serial computer: O(n log²n).
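Putting the pieces together, a sequential Python sketch of bitonic sort (input length assumed to be a power of two): build a bitonic sequence by sorting the two halves in opposite directions, then bitonic-merge:

```python
def bitonic_merge(a, ascending=True):
    """Sort a bitonic sequence via recursive bitonic splits."""
    if len(a) <= 1:
        return a
    h = len(a) // 2
    lo = [min(a[i], a[i + h]) for i in range(h)]   # bitonic split
    hi = [max(a[i], a[i + h]) for i in range(h)]
    if not ascending:
        lo, hi = hi, lo                            # reverse direction
    return bitonic_merge(lo, ascending) + bitonic_merge(hi, ascending)

def bitonic_sort(a, ascending=True):
    """Bitonic sort; len(a) must be a power of two."""
    if len(a) <= 1:
        return a
    h = len(a) // 2
    first = bitonic_sort(a[:h], True)     # increasing half
    second = bitonic_sort(a[h:], False)   # decreasing half
    return bitonic_merge(first + second, ascending)  # now bitonic
```

Run sequentially this costs O(n log²n), as the slide notes; in the network, each recursion level is one parallel step.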
Mapping to Hypercubes & Mesh – Idea
Communication intensive, so special care is needed for the mapping.
How are the input wires paired?
Paired wires have labels differing in only one bit ⇒ the mapping to a hypercube is straightforward.
For a mesh (lower connectivity), several solutions exist, all worse than the hypercube: TP = Θ(log²n) + Θ(√n) for 1 element/process.
Blocks of elements: sort locally in Θ((n/p) log(n/p)) & use bitonic merges ⇒ cost optimal w.r.t. bitonic sort. But not efficient & not scalable because the sequential algorithm (bitonic sort) is suboptimal.
Bubble Sort
Difficult to parallelize as-is because it is inherently sequential.

procedure BUBBLE_SORT(n)
begin
  for i := n-1 downto 1 do
    for j := 1 to i do
      compare_exchange(a[j], a[j+1]);
end

Θ(n²)
Odd-Even Transposition Sort
Odd phase: compare-exchange pairs (a1,a2), (a3,a4), …
Even phase: compare-exchange pairs (a2,a3), (a4,a5), …
Θ(n²) sequentially.
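A sequential Python sketch of odd-even transposition sort; within a phase all compare-exchanges are on disjoint pairs, which is exactly what makes each phase one parallel step:

```python
def odd_even_transposition_sort(a):
    """Odd-even transposition sort: n phases, each a set of
    independent compare-exchanges on disjoint pairs."""
    a = list(a)
    n = len(a)
    for phase in range(n):
        # alternate pairs (a1,a2),(a3,a4),... and (a2,a3),(a4,a5),...
        start = phase % 2
        for i in range(start, n - 1, 2):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
    return a
```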
Odd-Even Transposition Sort
Easy to parallelize!
Θ(n) with 1 process per element. Not cost optimal, but use fewer processes, an optimal local sort, and compare-splits:

TP = Θ((n/p) log(n/p)) + Θ(n) + Θ(n)
     local sort (optimal) + comparisons + communication
Cost optimal for p = O(logn) but not scalable (few processes).
Improvement: Shellsort
2 phases:
1. Move elements across long distances.
2. Odd-even transposition, but stop when no change occurs.
Idea: quickly bring elements near their final position to reduce the number of iterations of the odd-even transposition.
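A simulated Python sketch of the two-phase scheme on p sorted blocks (the long-distance pairing schedule below, ranks differing by p/2, p/4, …, is one possible choice, not necessarily the book's exact schedule):

```python
def parallel_shellsort_sim(blocks):
    """Two-phase parallel shellsort, simulated on p sorted blocks.

    Phase 1: compare-splits between 'processes' whose ranks differ
    by p/2, then p/4, ... (long-distance moves).
    Phase 2: odd-even transposition of blocks, stopping as soon as
    a full sweep makes no change.
    """
    p = len(blocks)

    def compare_split(i, j):
        # Merge the two sorted blocks; block i keeps the smaller half.
        merged = sorted(blocks[i] + blocks[j])
        changed = merged[:len(blocks[i])] != blocks[i]
        blocks[i], blocks[j] = merged[:len(blocks[i])], merged[len(blocks[i]):]
        return changed

    d = p // 2
    while d >= 1:                       # phase 1: long-distance moves
        for i in range(d):
            compare_split(i, i + d)
        d //= 2

    changed = True
    while changed:                      # phase 2: odd-even until stable
        changed = False
        for start in (0, 1):
            for i in range(start, p - 1, 2):
                if compare_split(i, i + 1):
                    changed = True
    return [x for block in blocks for x in block]
```

Phase 2 alone already guarantees a sorted result; phase 1 only reduces how many odd-even sweeps are needed.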
Quicksort
Average complexity: O(n logn).
But very efficient in practice: robust average behavior, low overhead, and very simple.
Divide & conquer algorithm:
Partition A[q..r] into A[q..s] ≤ A[s+1..r]. Recursively sort sub-arrays. Subtlety: How to partition?
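A minimal sequential sketch of the divide & conquer scheme in Python (Lomuto-style partitioning with the last element as pivot; this is one common answer to the partitioning subtlety, not the only one):

```python
def quicksort(a, q=0, r=None):
    """Sort a[q..r] in place: partition around pivot x = a[r] so that
    a[q..s] <= x and x < a[s+2..r] with the pivot at slot s+1, then
    recursively sort the two sub-arrays."""
    if r is None:
        r = len(a) - 1
    if q >= r:
        return
    x = a[r]                       # pivot: last element of the range
    s = q - 1
    for j in range(q, r):          # move elements <= pivot to the left
        if a[j] <= x:
            s += 1
            a[s], a[j] = a[j], a[s]
    a[s + 1], a[r] = a[r], a[s + 1]   # place the pivot
    quicksort(a, q, s)
    quicksort(a, s + 2, r)
```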
[Figure: example run of quicksort partitioning on an 8-element array A[q..r].]
BUG
Parallel Quicksort
Simple version:
Recursive decomposition with one process per
recursive call.
Not cost optimal: the lower bound on the parallel run-time is n (the initial partitioning alone).
The best we can do: use O(log n) processes. We need to parallelize the partitioning step.
Parallel Quicksort for CRCW PRAM
See execution of quicksort as constructing
a binary tree.
[Figure: the quicksort recursion as a binary tree of pivots; e.g. root pivot 3 with sub-sequences <3,2,1> and <7,4,5,8>, then pivots 3 and 7 with sub-sequences <1,2>, <5,4>, <8>, …]
BUG
Text & algorithm 9.5: A[p..s] ≤ x < A[s+1..q]. Figures & algorithm 9.6: A[p..s] < x ≤ A[s+1..q].
Only one (concurrent CRCW) write succeeds.
A[i] ≤ A[parent_i]
[Figure: trace of the CRCW PRAM tree construction on the 8-element array <1,3,2,5,8,4,3,7> (indices 1..8) with root = 1; at each step every element concurrently attempts to write itself as a child of its current parent.]
Each step: Θ(1). Average tree height: Θ(log n). This is cost-optimal, but it is only a model.
Parallel Quicksort – Shared Address (Realistic)
Same idea but remove contention:
Choose the pivot & broadcast it.
Each process rearranges its block of elements locally.
Global rearrangement of the blocks.
When the blocks reach a certain size, a local sort is used.
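One step of this scheme can be simulated sequentially; the blocks play the role of per-process data, and exclusive prefix sums over the "smaller than pivot" counts give each process its write offset in the rearranged array (the function name and details are illustrative, not the book's code):

```python
from itertools import accumulate

def global_rearrange(blocks, pivot):
    """One rearrangement step of shared-address parallel quicksort.

    Each 'process' partitions its block locally around the broadcast
    pivot; exclusive prefix sums over the small-element counts give
    every process its write offset in the rearranged array.
    """
    smalls = [[x for x in b if x <= pivot] for b in blocks]  # local partition
    larges = [[x for x in b if x > pivot] for b in blocks]
    small_counts = [len(s) for s in smalls]
    # exclusive prefix sum -> offset of each process's small part
    offsets = [0] + list(accumulate(small_counts))[:-1]
    total_small = sum(small_counts)
    out = [None] * sum(len(b) for b in blocks)
    for rank, s in enumerate(smalls):
        out[offsets[rank]:offsets[rank] + len(s)] = s
    # symmetric prefix sums would place the large parts in parallel;
    # done sequentially here for brevity
    pos = total_small
    for l in larges:
        out[pos:pos + len(l)] = l
        pos += len(l)
    return out, total_small
```

The returned split index is where the recursion divides the processes into two groups.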
Cost
Scalability is determined by the time to broadcast the pivot & compute the prefix sums.
Cost optimal.
MPI Formulation of Quicksort
Arrays must be explicitly distributed. Two phases:
Local partition into elements smaller/larger than the pivot.
Determine which processes will sort the sub-arrays, and send the sub-arrays to the right processes.
Final Word
Pivot selection is very important: it affects performance, and a bad pivot means idle processes.