Topic 3: Parallel and Scalable Data Processing
- Ch. 9.4, 12.2, 14.1.1, 14.6, 22.1-22.3, 22.4.1, 22.8 of Cow Book
- Ch. 5, 6.1, 6.3, 6.4 of MLSys Book
Arun Kumar
DSC 102 Systems for Scalable Analytics
Q: Why bother with large-scale data?
https://ai.googleblog.com/2017/07/revisiting-unreasonable-effectiveness.html
https://docs.dask.org/en/latest/graphviz.html
[Figure: task graph over tasks T1, T2, T3, T4, T5, T6]
[Figure: schedule of the task graph on workers W1, W2, W3; time axis ticks at 5, 10, 15, 20, 25, 30, 35]
W1: T1 T1 T4 T6 T6
W2: T2 T5 T5 T5 T5
W3: T3 T3 T3
[Figure: data partitions, in the sequence D1 D2 D3 D4 D3 D4 D1 D2 D4 D3 D2 D1]
                          Multi-core CPU   GPU         FPGA         ASICs (e.g., TPUs)
Peak FLOPS/s              Moderate         High        High         Very High
Power Consumption         High             Very High   Very Low     Low
Cost                      Low              High        Very High    Highest
Generality / Flexibility  Highest          Medium      Very High    Lowest
Fitness for DL Training   Poor Fit         Best Fit    Low Fit      Potential exists but unrealized
Fitness for DL Inference  Moderate         Moderate    Good Fit     Best Fit
Cloud Vendor Support      All              All         AWS, Azure   GCP
https://www.embedded.com/leveraging-fpgas-for-deep-learning/
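[Figure: memory hierarchy annotated with access cycles (100s; 10^5 to 10^6; 10^7 to 10^8), including the Cache level]
[Figure: processor with CU, ALU, Caches, and Registers, alongside the files tmp.csv and tmp.py]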
[Figure: two orderings of file pages]
F1P1 F1P2 F1P3 F2P1 F2P2 …
F1P1 F1P2 F2P1 F2P2 F1P3 F3P1 F5P7 F3P5
[Figure: pages P1, P2, P3, P4, P5, P6]
A   B   C   D
1a  1b  1c  1d
2a  2b  2c  2d
3a  3b  3c  3d
4a  4b  4c  4d
5a  5b  5c  5d
6a  6b  6c  6d

Row-wise pages: 1a,1b,1c,1d | 2a,2b,2c,2d | 3a,3b,3c,3d
Column-wise pages: 1a,2a,3a,4a | 5a,6a | 1b,2b,3b,4b
https://docs.dask.org/en/latest/dataframe-best-practices.html#repartition-to-reduce-overhead
Hybrid pages: 1a,1b,2a,2b | 1c,1d,2c,2d | 3a,3b,4a,4b
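The page contents listed above follow three patterns, which this sketch calls row-wise, column-wise, and hybrid. A minimal Python sketch, assuming a page holds four cell values and that the hybrid layout uses blocks of 2 rows by 2 columns (both inferred from the pages shown, not stated in the slides); all function names are hypothetical:

```python
# Hypothetical helpers; the 6x4 table mirrors the cell labels in the figure.
COLS = ["a", "b", "c", "d"]
table = [[f"{r}{c}" for c in COLS] for r in range(1, 7)]  # rows 1..6


def chunk(values, size):
    """Split a flat list of cell values into pages of at most `size` values."""
    return [values[i:i + size] for i in range(0, len(values), size)]


def row_wise(table, page_size=4):
    # Concatenate whole rows, then cut the stream into pages.
    flat = [v for row in table for v in row]
    return chunk(flat, page_size)


def column_wise(table, page_size=4):
    # Page each column separately, so a page never mixes columns.
    pages = []
    for j in range(len(table[0])):
        pages.extend(chunk([row[j] for row in table], page_size))
    return pages


def hybrid(table, rows_per_block=2, cols_per_block=2):
    # Cut the table into row groups; within a row group, each column
    # group becomes one page, stored row by row.
    pages = []
    for r in range(0, len(table), rows_per_block):
        block = table[r:r + rows_per_block]
        for c in range(0, len(table[0]), cols_per_block):
            pages.append([v for row in block for v in row[c:c + cols_per_block]])
    return pages


print(row_wise(table)[:3])
print(column_wise(table)[:3])
print(hybrid(table)[:3])
```

The first three pages printed for each layout match the pages shown in the figure. A scan that needs only column B touches two pages in the column-wise layout but all six in the row-wise one.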
[Figure: the table (columns A, B, C, D; rows 1 to 6) stored row-wise, one row per page]
1a,1b,1c,1d | 2a,2b,2c,2d | 3a,3b,3c,3d | 4a,4b,4c,4d | 5a,5b,5c,5d | 6a,6b,6c,6d
[Figure: the page 3a,3b,3c,3d shown separately from the stored pages]
[Figure: the same table stored column-wise; the pages 1b,2b,3b,4b and 5b,6b are shown separately from the stored pages]
1a,2a,3a,4a | 5a,6a | 1b,2b,3b,4b | 5b,6b | 1c,2c,3c,4c | 5c,6c
A   B   C   D
a1  1b  1c  4
a2  2b  2c  3
a1  3b  3c  5
a3  4b  4c  1
a2  5b  5c  10
a1  6b  6c  8

A   Running Info.
a1  17
a2  13
a3  1
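The running info above is consistent with summing column D per value of A (4 + 5 + 8 = 17 for a1, 3 + 10 = 13 for a2, 1 for a3). A minimal sketch of that aggregation in one pass, assuming SUM(D) grouped by A is the intended query; the function name `group_sum` is hypothetical:

```python
from collections import defaultdict

# Rows of the table above: (A, B, C, D).
rows = [
    ("a1", "1b", "1c", 4),
    ("a2", "2b", "2c", 3),
    ("a1", "3b", "3c", 5),
    ("a3", "4b", "4c", 1),
    ("a2", "5b", "5c", 10),
    ("a1", "6b", "6c", 8),
]


def group_sum(rows):
    """Scan the rows once, keeping one running sum per distinct value of A."""
    running = defaultdict(int)
    for a, _b, _c, d in rows:
        running[a] += d
    return dict(running)


print(group_sum(rows))  # {'a1': 17, 'a2': 13, 'a3': 1}
```

Only the running sums need to stay in memory, so the rows themselves can be streamed page by page.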
Pages: a1,1b,1c,4 | a2,2b,2c,3 | a1,3b,3c,5 | a3,4b,4c,1 | a2,5b,5c,10 | a1,6b,6c,8
2  1  0  0
2  1  0  0
0  1  0  2
0  0  1  2
3  0  1  0
3  0  1  0

Pages: 2,1,0,0 | 2,1,0,0 | 0,1,0,2 | 0,0,1,2 | 3,0,1,0 | 3,0,1,0
[Equations: summations over i = 1, …, n]
Y   X1  X2  X3
0   1b  1c  1d
1   2b  2c  2d
1   3b  3c  3d
0   4b  4c  4d
1   5b  5c  5d
0   6b  6c  6d

Pages: 0,1b,1c,1d | 1,2b,2c,2d | 1,3b,3c,3d | 0,4b,4c,4d | 1,5b,5c,5d | 0,6b,6c,6d
[Figure: three architectures, each with an Interconnect]
[Figure: data partitions D1, D2, D3, D4, D5, D6 placed on nodes joined by an Interconnect; one node holds D1, D3, D5 and another holds D1, D2, D3]
[Figure: the table (columns A, B, C, D; rows 1 to 6) partitioned across Node 1, Node 2, Node 3]
Column-wise pages: 1a,2a,3a,4a | 5a,6a | 1b,2b,3b,4b | 5b,6b | 1c,2c,3c,4c | 5c,6c
Hybrid pages: 1a,1b,2a,2b | 1c,1d,2c,2d | 3a,3b,4a,4b | 3c,3d,4c,4d | 5a,5b,6a,6b | 5c,5d,6c,6d
[Figure: a Master connected to Worker 1, Worker 2, …, Worker k-1, Worker k; data partitions D1, D2, D3, D4, D5, D6]
https://static.googleusercontent.com/media/research.google.com/en//archive/gfs-sosp2003.pdf
https://hadoop.apache.org/docs/r1.2.1/hdfs_design.html
[Figure: a Master and three Workers, each with Disk and DRAM; the row-wise pages of the table are spread across the workers]
Worker 1: 1a,1b,1c,1d | 2a,2b,2c,2d
Worker 2: 3a,3b,3c,3d | 4a,4b,4c,4d
Worker 3: 5a,5b,5c,5d | 6a,6b,6c,6d
[Figure: three variants of this setup, showing in turn the row 3a,3b,3c,3d; the column values 1b,2b | 3b,4b | 5b,6b; and the values 4a, 2a, 6a]
[Figure: the aggregation over columns A and D on a Master and three Workers, each with Disk and DRAM]
Worker 1: a1,1b,1c,4 | a2,2b,2c,3     partial: a1: 4, a2: 3
Worker 2: a1,3b,3c,5 | a3,4b,4c,1     partial: a1: 5, a3: 1
Worker 3: a2,5b,5c,10 | a1,6b,6c,8    partial: a1: 8, a2: 10
Master: a1: 17, a2: 13, a3: 1
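The partial and final values above suggest a two-phase aggregation: each worker sums D per value of A over its own pages, and the master adds up the partial results. A minimal sketch, simulated in a single process and assuming the assignment of pages to workers shown above; the names `local_sums` and `merge` are hypothetical:

```python
from collections import Counter

# (A, D) pairs held by each worker, following the pages shown above.
partitions = {
    "Worker 1": [("a1", 4), ("a2", 3)],
    "Worker 2": [("a1", 5), ("a3", 1)],
    "Worker 3": [("a2", 10), ("a1", 8)],
}


def local_sums(pairs):
    """Phase 1 (on each worker): running sum of D per value of A."""
    partial = Counter()
    for key, value in pairs:
        partial[key] += value
    return partial


def merge(partials):
    """Phase 2 (on the master): add the partial sums key by key."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds values for matching keys
    return dict(total)


partials = {worker: local_sums(pairs) for worker, pairs in partitions.items()}
result = merge(partials.values())
print(result)  # {'a1': 17, 'a2': 13, 'a3': 1}
```

Only the small partial results travel to the master, not the rows. This works because a sum can be computed from sums over disjoint subsets.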
[Figure: a matrix on a Master and three Workers, each with Disk and DRAM]
Worker 1: 2,1,0,0 | 2,1,0,0    partial: 10
Worker 2: 0,1,0,2 | 0,0,1,2    partial: 10
Worker 3: 3,0,1,0 | 3,0,1,0    partial: 20
Master: 40
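The values 10, 10, 20 and the total 40 are consistent with each worker summing the squares of its matrix entries and the master adding the partial sums, as one would for a squared Frobenius norm. That reading is inferred from the numbers, not stated in the slides. A minimal sketch under that assumption, with a hypothetical function name:

```python
import math

# Matrix rows held by each worker, following the pages shown above.
partitions = [
    [[2, 1, 0, 0], [2, 1, 0, 0]],  # Worker 1
    [[0, 1, 0, 2], [0, 0, 1, 2]],  # Worker 2
    [[3, 0, 1, 0], [3, 0, 1, 0]],  # Worker 3
]


def local_sum_of_squares(rows):
    """Computed on each worker over its own rows only."""
    return sum(v * v for row in rows for v in row)


partials = [local_sum_of_squares(rows) for rows in partitions]
total = sum(partials)  # computed on the master from the partial results

print(partials, total)   # [10, 10, 20] 40
print(math.sqrt(total))  # Frobenius norm under the assumption above
```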
[Figure: a Master and three Workers, each with Disk and DRAM, holding matrices A, B, C, D, E, F, … and the product A^T B]
https://sfu-db.github.io/dbsystems/Papers/systemML.pdf
http://www.vldb.org/pvldb/vol9/p1425-boehm.pdf
[Figure: Worker 1, Worker 2, …, Worker n and a Master; the equations involve sums over i ∈ B and over i = 1, …, n]
https://www.cs.cmu.edu/~muli/file/parameter_server_osdi14.pdf
[Figure: Worker 1, Worker 2, …, Worker n and parameter servers PS 1, PS 2, …]
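[Figure: task graph over tasks T1, T2, T3, T4]
[Figure: schedule on nodes N1, N2, N3, N4; time ticks at 6, 15, 21, 30]
N1: T1 T3 T4
N2: T2
N3:
N4:
[Figure: schedule on a master (Mstr) and workers W1, W2, W3; time ticks at 1, 6, 7, 8, 10, 12, 14, 16, 20]
Mstr: T1 T1 T2 T2 T3 T3 T4 T4
W1: T1 T2 T3 T4
W2: T1 T2 T3 T4
W3: T1 T2 T3 T4
[Figure: schedule on a master (Mstr) and workers W1, W2, W3; time ticks at 1, 6, 13, 14, 18]
Mstr: T1 T1 T4 T4
W1: T1 T2 T4
W2: T1 T3 T4
W3: T1 T4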