Fast, Distributed Computations in the Cloud
Omid Mashayekhi Advisor: Philip Levis
April 7, 2017
Cloud Frameworks
SQL, Streaming, Machine Learning, Graph
Cloud frameworks abstract away the complexities ...
1. Automatic distribution
2. Elastic scalability
3. Multitenant applications
4. Load balancing
5. Fault tolerance
[Figure: task length by framework generation, on a log scale from 10s down to 1ms. 2004: MapReduce, Hadoop (I/O-bound data analytics). 2012: Spark, Naiad (in-memory data analytics). 2016: Spark 2.0, Common IL, C++ (optimized data analytics).]
[Figure: bar chart over C++, Java, Spark RDD, and Spark DataFrame; bar labels: 178 (6x), 67 (16x), 21 (51x), 1071.]
Tasks are getting orders of magnitude shorter.
How about the job?
Centralized Controllers vs. Distributed Controllers
[Diagram, centralized: a single controller holds the loop and the task graph and drives four workers.]
Workers fall idle
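The idle-worker problem comes down to simple arithmetic: to keep every core busy, the control plane has to issue tasks at least as fast as the cluster finishes them. The sketch below works through that relation. The cluster size and the controller capacity are hypothetical numbers chosen for illustration and do not come from the talk.

```python
def required_task_rate(num_cores: int, task_seconds: float) -> float:
    """Tasks per second the control plane must issue to keep all cores busy."""
    return num_cores / task_seconds


def busy_fraction(num_cores: int, task_seconds: float, controller_rate: float) -> float:
    """Fraction of core time spent running tasks when the controller caps the task rate."""
    return min(1.0, controller_rate / required_task_rate(num_cores, task_seconds))


if __name__ == "__main__":
    cores = 800                # hypothetical cluster: 100 workers x 8 cores
    controller_rate = 8_000.0  # hypothetical capacity of one controller, in tasks/second
    for task_seconds in (10.0, 1.0, 0.1, 0.01, 0.001):
        need = required_task_rate(cores, task_seconds)
        busy = busy_fraction(cores, task_seconds, controller_rate)
        print(f"{task_seconds:>6}s tasks: need {need:>9.0f} tasks/s, cores busy {busy:.0%}")
```

As tasks shrink from seconds to milliseconds, the required rate grows by the same factor, so a controller that was comfortable with long tasks becomes the bottleneck.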
[Diagram, distributed: each worker has its own controller and runs the loop locally; the controllers synchronize with each other.]
Straggling
[Diagram, distributed with backup workers: every controller/loop pair has a worker and a backup worker; the controllers synchronize, and one worker is straggling.]
Control Plane Design   Example Frameworks         Task Throughput   Task Scheduling
Centralized            MapReduce, Hadoop, Spark   Low               Dynamic
Distributed            Naiad, TensorFlow          High              Static
while (error > threshold_e) {
  while (gradient > threshold_g) {
    // Optimization code block
    gradient = Gradient(tdata, coeff, param)
    coeff += gradient
  }
  // Estimation code block
  error = Estimate(edata, coeff, param)
  param = update_model(param, error)
}
[Diagram: data flow among Training Data, Estimation Data, Parameters, Error Estimation, Iterative Optimizer, and Coefficients.]
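To make the nested loop concrete, here is a minimal runnable sketch of the same control flow. The bodies of `gradient_step`, `estimate`, and `update_model` are toy stand-ins invented for this example (a one-coefficient ridge regression whose regularization strength is halved on every outer iteration); the slides do not define them. The sketch also compares the magnitude of the step against the threshold and resets it before each inner loop, which the slide pseudocode leaves implicit.

```python
LEARNING_RATE = 0.05  # assumed fixed step size, not from the slides


def gradient_step(tdata, coeff, param):
    """Descent step for the toy loss mean((coeff*x - y)^2) + param * coeff^2."""
    data_term = sum(2.0 * x * (coeff * x - y) for x, y in tdata) / len(tdata)
    return -LEARNING_RATE * (data_term + 2.0 * param * coeff)


def estimate(edata, coeff, param):
    """Mean squared error on the estimation data (param is unused in this toy)."""
    return sum((coeff * x - y) ** 2 for x, y in edata) / len(edata)


def update_model(param, error):
    """Toy model update: halve the regularization strength (error is unused here)."""
    return param * 0.5


def train(tdata, edata, threshold_e=0.01, threshold_g=1e-6):
    coeff, param = 0.0, 1.0
    error = float("inf")
    outer_iterations = 0
    while error > threshold_e:
        gradient = float("inf")  # reset so the inner loop runs on every outer pass
        while abs(gradient) > threshold_g:
            # Optimization code block
            gradient = gradient_step(tdata, coeff, param)
            coeff += gradient
        # Estimation code block
        error = estimate(edata, coeff, param)
        param = update_model(param, error)
        outer_iterations += 1
    return coeff, error, outer_iterations


if __name__ == "__main__":
    tdata = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]  # toy training data, y = 3x
    edata = [(4.0, 12.0)]                         # toy estimation data
    print(train(tdata, edata))
```

Both loop conditions depend on values computed inside the loops, so the number of iterations is only known at run time. That is the data-dependent control flow the rest of the talk is concerned with.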
[Diagram: every task carries a task id, a data list, a function, and a parameter.]
[Diagram: three tasks are loaded with new task ids and parameters: (T1, P1), (T2, P2), (T3, P3).]
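The task layout above suggests why repeated work can be cached: between iterations the function and the data list stay the same, and only the task id and the parameter change. A minimal sketch of that idea follows. The `Task` and `Template` classes and the `instantiate` method are hypothetical names for this example and are not the framework's actual API.

```python
from dataclasses import dataclass, replace
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Task:
    task_id: int
    data_list: Tuple[str, ...]
    function: str
    parameter: float


@dataclass
class Template:
    """Cached tasks whose function and data list stay fixed across iterations."""
    tasks: List[Task]

    def instantiate(self, new_ids: Sequence[int], new_params: Sequence[float]) -> List[Task]:
        """Return copies of the cached tasks with fresh task ids and parameters."""
        if not (len(new_ids) == len(new_params) == len(self.tasks)):
            raise ValueError("need one id and one parameter per cached task")
        return [replace(task, task_id=new_id, parameter=new_param)
                for task, new_id, new_param in zip(self.tasks, new_ids, new_params)]


if __name__ == "__main__":
    template = Template([
        Task(1, ("tdata_0", "coeff"), "gradient", 0.1),
        Task(2, ("tdata_1", "coeff"), "gradient", 0.1),
        Task(3, ("coeff",), "reduce", 0.1),
    ])
    for task in template.instantiate(new_ids=[11, 12, 13], new_params=[0.2, 0.2, 0.2]):
        print(task)
```

Only the two short lists of ids and parameters need to be sent for each new iteration; everything else is already cached.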
[Diagram: a controller with a driver program (data flow: Data, Map, Reduce) and a task graph, and two workers that hold data objects. The controller sends each worker a message (C) that carries a task: task id, data list, function, parameter. The workers exchange data with each other.]
[Diagram: the same controller and workers, with the nested-loop program shown earlier as the driver program. The task messages (C) and the data exchange between workers repeat as the loop iterates.]
[Diagram: the controller and both workers each store a template. The controller then sends each worker a single Instantiate<params> message (C).]
– Templates and dynamic scheduling?
– Templates and dynamic control flow?
– For example, migrating tasks among workers.
[Diagram: to migrate a task, the controller edits the two workers' templates (Edit<remove> on one, Edit<add> on the other) and then sends Instantiate<params> to both.]
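The edit messages in the diagram can be read as small in-place changes to the cached templates, so that moving one task does not require reinstalling a whole template. Below is a minimal sketch under that reading. The `WorkerTemplate` class and the `migrate` helper are hypothetical and only mirror the Edit<remove> and Edit<add> labels on the slide.

```python
from typing import Dict, Iterable, Set


class WorkerTemplate:
    """The set of task names cached in one worker's template (hypothetical model)."""

    def __init__(self, task_names: Iterable[str]):
        self.tasks: Set[str] = set(task_names)

    def edit(self, remove: Iterable[str] = (), add: Iterable[str] = ()) -> None:
        """Apply an edit in place: drop some cached tasks and add others."""
        self.tasks -= set(remove)
        self.tasks |= set(add)


def migrate(templates: Dict[str, WorkerTemplate], task: str, src: str, dst: str) -> None:
    """Move one cached task from worker src to worker dst using two edits."""
    if task not in templates[src].tasks:
        raise KeyError(f"{task} is not cached on {src}")
    templates[src].edit(remove=[task])  # corresponds to Edit<remove>
    templates[dst].edit(add=[task])     # corresponds to Edit<add>


if __name__ == "__main__":
    templates = {
        "worker_1": WorkerTemplate(["t1", "t2", "t3"]),
        "worker_2": WorkerTemplate(["t4"]),
    }
    migrate(templates, "t3", src="worker_1", dst="worker_2")
    print(sorted(templates["worker_1"].tasks), sorted(templates["worker_2"].tasks))
```

After the two edits, both workers can be instantiated as before, and the rest of each template is untouched.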
– The cost of template instantiation is amortized over a greater number of tasks.
– But loop unrolling only works for static control flow.
[Diagram: a template overlaid on the nested-loop data flow.]
– Basic block: a code block with a single entry and no branches except at the end.
– It is the biggest block that does not sacrifice dynamic control flow.
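One way to picture templates at basic-block granularity: the driver still evaluates every loop condition itself, while the body of each block is cached the first time it runs and instantiated afterwards. The sketch below simulates this with a hypothetical `Controller` class and made-up task names. The gradient and error values are scripted to shrink by half each time so that the example is deterministic.

```python
from typing import Dict, List


class Controller:
    """Caches one template per basic block (hypothetical model, not the real API)."""

    def __init__(self) -> None:
        self.templates: Dict[str, List[str]] = {}
        self.tasks_scheduled = 0  # tasks scheduled one by one (first run of a block)
        self.instantiations = 0   # whole-block instantiations (later runs)

    def run_block(self, name: str, tasks: List[str]) -> None:
        if name not in self.templates:
            self.templates[name] = list(tasks)  # install the template on first use
            self.tasks_scheduled += len(tasks)
        else:
            self.instantiations += 1            # one message replaces all the tasks


def drive(controller: Controller) -> None:
    optimize_block = ["grad_0", "grad_1", "grad_2", "reduce"]  # made-up task names
    estimate_block = ["estimate", "update_model"]
    error = 1.0
    while error > 0.3:            # branch decided by the driver at run time
        gradient = 1.0
        while gradient > 0.2:     # branch decided by the driver at run time
            controller.run_block("optimize", optimize_block)
            gradient *= 0.5       # scripted stand-in for a computed gradient
        controller.run_block("estimate", estimate_block)
        error *= 0.5              # scripted stand-in for a computed error


if __name__ == "__main__":
    controller = Controller()
    drive(controller)
    print(controller.tasks_scheduled, controller.instantiations)
```

Because each template ends where a branch is taken, the driver can leave either loop at any iteration without invalidating anything that was cached.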
[Diagram: Template 1 is instantiated repeatedly, then Template 2 is instantiated.]
[Diagram: the controller sends StartTemplate and EndTemplate messages (C) to the workers, after which each worker holds a template.]
[Diagram: Template 1 is instantiated repeatedly.]
Updated model parameters
– For example, the set of data objects in memory that are accessed by the tasks cached in the template.
[Diagram: each worker's template has preconditions.]
[Diagram: the controller sends Patch<load> to a worker before sending Instantiate<params> to both workers; each worker's template has preconditions.]
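Precondition checking and patching can be sketched as a set difference: compare the data objects a template expects in memory with what the worker currently holds, load whatever is missing, and only then instantiate. The `Worker` class and the function below are hypothetical and only mirror the Patch<load> and Instantiate<params> labels in the diagram.

```python
from typing import Iterable, List, Set, Tuple


class Worker:
    """Tracks which data objects are in memory and logs received messages (hypothetical)."""

    def __init__(self, objects_in_memory: Iterable[str]):
        self.in_memory: Set[str] = set(objects_in_memory)
        self.log: List[Tuple[str, object]] = []

    def patch_load(self, obj: str) -> None:
        self.in_memory.add(obj)
        self.log.append(("patch_load", obj))

    def instantiate(self, params: Tuple[float, ...]) -> None:
        self.log.append(("instantiate", params))


def instantiate_with_preconditions(worker: Worker,
                                   preconditions: Iterable[str],
                                   params: Tuple[float, ...]) -> List[str]:
    """Patch the worker until the preconditions hold, then instantiate the template."""
    missing = sorted(set(preconditions) - worker.in_memory)
    for obj in missing:
        worker.patch_load(obj)  # corresponds to Patch<load>
    worker.instantiate(params)  # corresponds to Instantiate<params>
    return missing


if __name__ == "__main__":
    worker = Worker(["tdata_0", "coeff"])
    patched = instantiate_with_preconditions(worker, ["tdata_0", "coeff", "param"], (0.5,))
    print(patched, worker.log)
```

When the preconditions already hold, the patch step sends nothing and the template is reused as is.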
– Implemented in C++ (the core library is ~35,000 semicolons).
– Mutable data model to allow in-place operations.
– The centralized controller allows dynamic scheduling and resource allocation.
– Execution templates help deliver high task throughput at scale.
graphical simulations; for the first time in a cloud framework.
– Supervised/unsupervised learning algorithms, graph library.
– PhysBAM library (water, smoke, etc.)
[Diagram: four controllers, each with two workers.]
Controller Templates
[Diagram: the controller instantiates its template for four workers.]
Worker Templates
[Diagram: the controller sends one instantiation message (Inst.) to each of four workers.]
Driver Program:
  Partition prt = {2, 1, 1};
  Create(velocity, prt);
  Op(exec: advect, data: velocity, read: core/ghost, write: ghost);
  ...
[Diagram: a launcher, a controller that keeps the physical data mappings, and computing nodes that each run a translator, a manager, and app.so. Application data is divided into logical partitions A, B, C, D and physical objects 1 to 6; copy tasks move data between nodes.]
Task translation:
  GeometricTask(advect, left_reg)    GeometricTask(advect, right_reg)
  LogicalTask(advect, {A,B,C})       LogicalTask(advect, {B,C,D})
  PhysicalTask(advect, {1,2,3})      PhysicalTask(advect, {4,5,6})
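The three task forms on this slide can be read as two lookups: a geometric region resolves to the logical partitions it touches, and those partitions resolve to the physical objects held on a particular node. The sketch below reproduces the slide's example values. The lookup tables, including which node holds which physical object, are assumptions made for this example, since the slide does not spell out the mappings.

```python
from typing import Dict, Tuple

# Assumed mapping from geometric regions to the logical partitions they touch.
REGION_TO_LOGICAL: Dict[str, Tuple[str, ...]] = {
    "left_reg": ("A", "B", "C"),
    "right_reg": ("B", "C", "D"),
}

# Assumed physical instances of the logical partitions on each computing node.
PHYSICAL_ON_NODE: Dict[int, Dict[str, int]] = {
    0: {"A": 1, "B": 2, "C": 3},
    1: {"B": 4, "C": 5, "D": 6},
}


def to_logical(function: str, region: str) -> Tuple[str, Tuple[str, ...]]:
    """GeometricTask(function, region) -> LogicalTask(function, partitions)."""
    return function, REGION_TO_LOGICAL[region]


def to_physical(function: str, partitions: Tuple[str, ...], node: int) -> Tuple[str, Tuple[int, ...]]:
    """LogicalTask(function, partitions) -> PhysicalTask(function, objects) on one node."""
    return function, tuple(PHYSICAL_ON_NODE[node][p] for p in partitions)


if __name__ == "__main__":
    for region, node in (("left_reg", 0), ("right_reg", 1)):
        function, partitions = to_logical("advect", region)
        print(region, partitions, to_physical(function, partitions, node))
```

In this reading, partitions B and C have an instance on both nodes, which would be what the copy tasks in the diagram keep consistent.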
– Execution templates match the strong scaling performance of frameworks with a distributed control plane design.
– Execution templates allow low-cost, reactive scheduling and dynamic resource allocation, similar to centralized frameworks.
– Execution templates can handle applications with nested loops and data-dependent branches with low overhead.
performance of Naiad with a distributed control plane.
Migrating 5% of the tasks
through edits at task granularity.
[Plot annotations: x100, x100, x50. Labels: validating preconditions for template reuse; installing new templates.]
[Diagram labels: Levelset, Positive Particles, Positive Removed Particles, Velocity, Negative Particles, Negative Removed Particles]
[Figure values: $180, $30]
task throughput peaks at 460,000 tasks per second.
[Plot: iteration number (ticks 200, 400) against time in minutes (ticks 20, 40, 60); series: Enabled, Disabled; annotations: checkpoint, checkpoint, rewind from checkpoint.]
Control Plane Design                  Example Frameworks         Task Throughput   Task Scheduling
Centralized                           MapReduce, Hadoop, Spark   Low               Dynamic
Distributed                           Naiad, TensorFlow          High              Static
Centralized w/ Execution Templates    Nimbus                     High              Dynamic