Execution Templates: Caching Control Plane Decisions for Strong - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Execution Templates: Caching Control Plane Decisions for Strong Scaling of Data Analytics

Omid Mashayekhi Hang Qu Chinmayee Shah Philip Levis

July 13, 2017

slide-2
SLIDE 2


slide-3
SLIDE 3

Cloud Frameworks

[Figure: SQL, Streaming, Machine Learning, and Graph applications over a Cloud Framework layer]

Cloud frameworks abstract away the complexities of the cloud infrastructure from application developers:

1. Automatic distribution
2. Elastic scalability
3. Multitenant applications
4. Load balancing
5. Fault tolerance

slide-4
SLIDE 4

Cloud Frameworks

[Figure labels: SQL, Job, Control Plane, Task]

  • A job is an instance of the application running in the framework.
  • A task is the unit of computation for the job.
  • The control plane partitions a job into tasks, schedules the tasks, and recovers from faults.

slide-5
SLIDE 5

Evolution of Cloud Frameworks

[Figure: task length (axis ticks 10s, 1s, 100ms, 10ms, 1ms) by year. 2004: MapReduce, Hadoop (I/O-bound data analytics).]

slide-6
SLIDE 6

Evolution of Cloud Frameworks

[Figure: task length (axis ticks 10s, 1s, 100ms, 10ms, 1ms) by year. 2004: MapReduce, Hadoop (I/O-bound data analytics). 2012: Spark, Naiad (in-memory data analytics).]

slide-7
SLIDE 7

Evolution of Cloud Frameworks

[Figure: task length (axis ticks 10s, 1s, 100ms, 10ms, 1ms) by year. 2004: MapReduce, Hadoop (I/O-bound data analytics). 2012: Spark, Naiad (in-memory data analytics). 2016: Spark 2.0, Common IL, C++ (optimized data analytics).]

slide-8
SLIDE 8

Individual tasks are getting faster. But does that necessarily mean that job completion time is getting shorter?

slide-9
SLIDE 9

Control Plane

The New Bottleneck

  • Logistic regression over a data set of size 100GB.
  • Classic Spark used to be CPU-bound.
slide-10
SLIDE 10

Control Plane

The New Bottleneck

  • Logistic regression over a data set of size 100GB.
  • Spark 2.0 with a Scala implementation is already control-bound.
slide-11
SLIDE 11

Control Plane

The New Bottleneck

  • Logistic regression over a data set of size 100GB.
  • Spark-opt: a hypothetical case where Spark runs tasks as fast as C++.
slide-12
SLIDE 12

The control plane is the emerging bottleneck for cloud computing frameworks.

slide-13
SLIDE 13

Control Plane Design Scope

Control Plane Design   Example Framework          Task Throughput (tasks per sec)   Scheduling Cost (per task)
Centralized            MapReduce, Hadoop, Spark   ≈ 1,000                           ≈ 100µs
Distributed            Naiad, TensorFlow          ≈ 100,000                         ≈ 100,000µs
Centralized w/

  • A centralized controller adapts to scheduling changes reactively at low cost, but it has limited task throughput and becomes a bottleneck at scale.
  • A distributed controller scales well, but any scheduling change requires stopping all nodes and installing a new data flow, with high latency.
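A rough way to see when a controller becomes the limit is to compare the task rate the workers can absorb with the rate the controller can issue. The sketch below is a back-of-the-envelope model, not something taken from the slides: it assumes each worker runs one task at a time, the function names are hypothetical, and the 1,000 tasks per second ceiling is the approximate centralized figure from the table.

```python
def required_task_rate(workers: int, task_seconds: float) -> float:
    """Tasks per second the controller must issue to keep every worker busy.

    Assumption (not from the slides): each worker runs one task at a time.
    """
    return workers / task_seconds


def is_control_bound(workers: int, task_seconds: float, controller_rate: float) -> bool:
    """True when workers could consume tasks faster than the controller issues them."""
    return required_task_rate(workers, task_seconds) > controller_rate


CENTRALIZED_RATE = 1_000  # approximate tasks/sec, centralized row of the table

# 100 workers running 1-second tasks need about 100 tasks/sec.
print(is_control_bound(100, 1.0, CENTRALIZED_RATE))
# 100 workers running 10 ms tasks need about 10,000 tasks/sec.
print(is_control_bound(100, 0.010, CENTRALIZED_RATE))
```

Under these assumptions, shrinking the task length from seconds to milliseconds is enough to push a centralized controller past its ceiling without adding a single worker.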

slide-14
SLIDE 14

Execution Templates is an abstraction for the control plane of cloud computing frameworks that enables orders of magnitude higher task throughput while keeping fine-grained, flexible scheduling at low cost.

slide-15
SLIDE 15

Control Plane

The New Bottleneck

  • Logistic regression over a data set of size 100GB.
  • Nimbus with execution templates scales almost linearly, with low-cost scheduling.
slide-16
SLIDE 16

Repetitive Patterns

  • Advanced data analytics are iterative in nature.

– Machine learning, graph processing, image recognition, etc.

  • This results in repetitive patterns in the control plane.

– Similar tasks execute with minor differences.

slide-17
SLIDE 17

Execution Model

[Figure: Driver Program (data flow: Data, Map, Reduce), Controller (Task Graph), two Workers (Data Objects)]

slide-18
SLIDE 18

Execution Model

[Figure, next build: same diagram, with the label C added]

slide-19
SLIDE 19

Execution Model

[Figure, next build: same diagram; C is annotated with the fields Task id, Data list, Dep. list, Function, Parameter]

slide-20
SLIDE 20

Execution Model

[Figure, next build: same diagram, with the label Data Exchange added]
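The five fields shown in the diagram can be pictured as a small record. The following Python sketch is only an illustration: the field meanings are an interpretation of the slide labels, and the `Task` class, the `is_ready` helper, and the example functions are hypothetical names rather than anything from Nimbus.

```python
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass
class Task:
    task_id: int                        # "Task id"
    data: List[str]                     # "Data list": data objects the task touches
    deps: List[int]                     # "Dep. list": ids of tasks that must finish first
    function: Callable[[float], float]  # "Function" (this signature is an assumption)
    parameter: float                    # "Parameter": argument passed to the function


def is_ready(task: Task, finished: Set[int]) -> bool:
    """A task can run once every task in its dependency list has finished."""
    return all(dep in finished for dep in task.deps)


# A two-task graph: a map-style task followed by a reduce-style task.
map_task = Task(1, ["partition-0"], [], lambda x: x * 2.0, 3.0)
reduce_task = Task(2, ["partial-sum"], [1], lambda x: x + 1.0, 0.0)

print(is_ready(map_task, set()), is_ready(reduce_task, set()))
```

Without caching, a controller has to fill in and ship every one of these fields for every task, which is the per-task cost the rest of the talk sets out to avoid.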

slide-21
SLIDE 21

Repetitive Patterns

[Figure: Controller (Task Graph), two Workers (Data Objects)]

slide-22
SLIDE 22

Repetitive Patterns

[Figure, next build: same diagram; C is annotated with the fields Task id, Data list, Dep. list, Function, Parameter]

slide-23
SLIDE 23

Repetitive Patterns

[Figure, next build: same diagram, with the label Data Exchange added]

slide-24
SLIDE 24

Repetitive Patterns

[Figure, next build: same content as slide 22]

slide-25
SLIDE 25

Repetitive Patterns

[Figure, next build: same content as slide 23]

slide-26
SLIDE 26

Execution Templates

  • Tasks are cached as parameterizable blocks on nodes.
  • Instead of assigning the tasks from scratch, templates are instantiated by filling in only the changing parameters.

[Figure: three cached task entries, each with Task id, Data list, Dep. list, Function, Parameter]

slide-27
SLIDE 27

Execution Templates

  • Tasks are cached as parameterizable blocks on nodes.
  • Instead of assigning the tasks from scratch, templates are instantiated by filling in only the changing parameters.

[Figure: the same three cached task entries, loaded with new task ids and parameters: (T1, P1), (T2, P2), (T3, P3)]
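The idea of filling in only the changing parameters can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Nimbus implementation: `TaskSlot`, `ConcreteTask`, and `instantiate` are hypothetical names, and the sketch assumes that dependencies inside a template are stored as positions of other slots so that they can be rewritten to fresh task ids.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class TaskSlot:
    """Cached, unchanging part of one task inside a template."""
    function: str
    data: Tuple[str, ...]
    deps: Tuple[int, ...]  # positions of other slots in the same template


@dataclass
class ConcreteTask:
    task_id: int
    function: str
    data: Tuple[str, ...]
    deps: Tuple[int, ...]  # concrete task ids
    parameter: float


def instantiate(slots: Sequence[TaskSlot], task_ids: Sequence[int],
                parameters: Sequence[float]) -> List[ConcreteTask]:
    """Stamp out a block of tasks from cached slots plus new ids and parameters."""
    if not (len(slots) == len(task_ids) == len(parameters)):
        raise ValueError("need one task id and one parameter per slot")
    tasks = []
    for slot, tid, param in zip(slots, task_ids, parameters):
        deps = tuple(task_ids[i] for i in slot.deps)  # remap slot positions to new ids
        tasks.append(ConcreteTask(tid, slot.function, slot.data, deps, param))
    return tasks


template = [
    TaskSlot("gradient", ("partition-0",), ()),
    TaskSlot("gradient", ("partition-1",), ()),
    TaskSlot("reduce", ("weights",), (0, 1)),
]

iteration_1 = instantiate(template, [10, 11, 12], [0.1, 0.1, 0.0])
iteration_2 = instantiate(template, [20, 21, 22], [0.05, 0.05, 0.0])
print(iteration_2[2].deps)  # (20, 21)
```

Each iteration sends only the id and parameter lists; the functions, data lists, and dependency structure stay cached.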

slide-28
SLIDE 28

Execution Templates

Mechanisms Summary

  • Instantiation: spawns a block of tasks without processing each task individually from scratch. It increases task throughput.
  • Edits: modify the content of a template at the granularity of tasks. They enable fine-grained, dynamic scheduling.
  • Patches: update the worker state when it does not match the preconditions of the template. They enable dynamic control flow.

slide-29
SLIDE 29

Execution Templates

Instantiation

[Figure: Controller (Task Graph), two Workers (Data Objects), label C]

slide-30
SLIDE 30

Execution Templates

Instantiation

[Figure, next build: a Template appears on each worker; two C labels]

slide-31
SLIDE 31

Execution Templates

Instantiation

[Figure, next build: a Template on each worker; one C label]

slide-32
SLIDE 32

Execution Templates

Instantiation

[Figure, next build: Instantiate<params> is sent to each worker's Template]

slide-33
SLIDE 33

Execution Templates

Instantiation

[Figure, next build: a Template on each worker; two C labels]

slide-34
SLIDE 34

Execution Templates

Caching tasks implies static behavior; how could templates allow dynamic scheduling?

  • Reactive scheduling changes for load balancing.
  • Scheduling changes at the task granularity.

slide-35
SLIDE 35

Execution Templates

Edits

  • If scheduling changes, even slightly, the templates become obsolete.

– For example, rescheduling a task from one worker to another.

  • Instead of paying the substantial cost of installing new templates for every change, templates allow edits that change their structure.
  • Edits enable adding tasks to or removing tasks from a template, and modifying the template content, in place.
  • The controller has the general view of the task graph, so it can properly update the dependencies affected by the edits.
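Rescheduling one task then reduces to a pair of small in-place edits, one per affected worker. The Python sketch below is a toy model under stated assumptions: templates are plain lists of task labels keyed by worker name, `edit_remove`, `edit_add`, and `reschedule` are hypothetical names, and the dependency updates the controller performs are left out.

```python
from typing import Dict, List

# worker name -> task labels cached in that worker's template (toy model)
Templates = Dict[str, List[str]]


def edit_remove(templates: Templates, worker: str, task: str) -> None:
    """Remove one task from a worker's cached template, in place."""
    templates[worker].remove(task)


def edit_add(templates: Templates, worker: str, task: str) -> None:
    """Add one task to a worker's cached template, in place."""
    templates[worker].append(task)


def reschedule(templates: Templates, task: str, src: str, dst: str) -> None:
    """Move a task between workers with two edits instead of reinstalling templates."""
    edit_remove(templates, src, task)
    edit_add(templates, dst, task)


templates = {"worker-1": ["t1", "t2"], "worker-2": ["t3"]}
reschedule(templates, "t2", "worker-1", "worker-2")
print(templates)
```

Only the two workers involved receive a message, and every other cached template stays valid.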

slide-36
SLIDE 36

Execution Templates

Edits

[Figure: Controller (Task Graph), two Workers (Data Objects), a Template on each worker, label C. Caption: reschedule one task.]

slide-37
SLIDE 37

Execution Templates

Edits

[Figure, next build: the controller sends Edit<remove> to one worker's Template and Edit<add> to the other's]

slide-38
SLIDE 38

Execution Templates

Edits

[Figure, next build: Controller (Task Graph), two Workers (Data Objects), a Template on each worker, label C]

slide-39
SLIDE 39

Execution Templates

Edits

[Figure, next build: Instantiate<params> is sent to each worker's Template]

slide-40
SLIDE 40

Execution Templates

Caching tasks implies static behavior; how could templates allow dynamic control flow?

  • Need to support nested loops.
  • Need to support data-dependent branches.

slide-41
SLIDE 41

Execution Templates

Patching

  • Execution templates operate at the granularity of basic blocks:

– A code block with a single entry and no branches except at the end.

  • Each template has a set of preconditions that need to be satisfied.

– For example, the set of data objects in memory that the tasks access.

  • The worker state might not match the preconditions of the template in all circumstances.
  • The controller patches the worker state before template instantiation to satisfy the preconditions.
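The patch-then-instantiate order can be sketched as a small planning function. This Python sketch is an assumption-laden illustration: preconditions are modeled as a set of data object names that must be in worker memory, and `plan_instantiation` and the `patch_load` and `instantiate` command names are hypothetical, loosely echoing the Patch<load> and Instantiate<params> labels in the diagrams.

```python
from typing import List, Set, Tuple

Command = Tuple[str, str]  # (command name, argument); a hypothetical encoding


def plan_instantiation(preconditions: Set[str], in_memory: Set[str],
                       template: str) -> List[Command]:
    """Return the commands a controller would send to one worker.

    Data objects required by the template but absent from worker memory are
    loaded first (the patch); only then is the template instantiated.
    """
    missing = sorted(preconditions - in_memory)
    commands: List[Command] = [("patch_load", obj) for obj in missing]
    commands.append(("instantiate", template))
    return commands


# Entering a loop body from a different branch: "weights" is not in memory yet.
print(plan_instantiation({"weights", "partition-0"}, {"partition-0"}, "inner_loop"))
# Worker state already matches: no patch is needed.
print(plan_instantiation({"weights"}, {"weights", "partition-0"}, "inner_loop"))
```

When the worker state already satisfies the preconditions, the plan collapses to a single instantiate command, so the common case stays as cheap as plain instantiation.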

slide-42
SLIDE 42

Execution Templates

Patching

[Figure: Controller (Task Graph), two Workers (Data Objects), a Template with Preconditions on each worker, label C]

slide-43
SLIDE 43

Execution Templates

Patching

[Figure, next build: the controller sends Patch<load>]

slide-44
SLIDE 44

Execution Templates

Patching

[Figure, next build: same content as slide 42]

slide-45
SLIDE 45

Execution Templates

Patching

[Figure, next build: Instantiate<params> is sent to each worker's Template]

slide-46
SLIDE 46

Execution Templates

Patching

[Figure, next build: a Template with Preconditions on each worker; two C labels]

slide-47
SLIDE 47

Execution Templates

Mechanisms Summary

  • Instantiation: spawns a block of tasks without processing each task individually from scratch. It increases task throughput.
  • Edits: modify the content of a template at the granularity of tasks. They enable fine-grained, dynamic scheduling.
  • Patches: update the worker state when it does not match the preconditions of the template. They enable dynamic control flow.

slide-48
SLIDE 48

Nimbus

  • Nimbus is designed for low-latency, fast computations in the cloud.
  • Nimbus embeds execution templates in its control plane.
  • Nimbus supports traditional data analytics as well as Eulerian and hybrid graphical simulations, for the first time in a cloud framework.

– Supervised/unsupervised learning algorithms.
– Graph processing.
– Physical simulation: water, smoke, etc. (PhysBAM library)

slide-49
SLIDE 49

nimbus.stanford.edu

https://github.com/omidm/nimbus

slide-50
SLIDE 50

Evaluation

Strong Scalability with Templates

  • Logistic regression over a data set of size 100GB.
  • Spark-opt and Naiad-opt run tasks as fast as a C++ implementation.
  • Nimbus, with a centralized controller and execution templates, matches the performance of Naiad with a distributed control plane.

slide-51
SLIDE 51

Evaluation

Reactive, Fine-Grained Scheduling with Templates

  • Logistic regression over a data set of size 100GB, on 100 workers.
  • The Naiad-opt curve is simulated (migrations every 5 iterations).
  • Execution templates allow low-cost, reactive scheduling changes.
  • A single edit has an overhead of only 41μs on average.

[Figure: rescheduling 5% of the tasks]

slide-52
SLIDE 52

Evaluation

High Task Throughput with Templates

  • Spark and Nimbus both have a centralized controller.
  • Nimbus task throughput scales superlinearly with more workers.
  • O(N²): more tasks and shorter tasks, simultaneously.
  • For a task graph with a single stage:

– Instantiation cost is <2μs per task (500,000 tasks per second).

[Figure: task throughput (thousands per second) versus number of workers (10 to 100), for Spark-opt (axis ticks 2, 4, 6) and Nimbus (axis ticks 50, 100, 150)]
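Both numbers on this slide follow from short arithmetic, sketched here in Python. The instantiation rate is simply the reciprocal of the per-task cost. The quadratic growth is derived under an idealized strong-scaling model that is an assumption of this sketch, not a measurement from the talk: the data set is fixed, each worker gets the same number of tasks per iteration, and iteration time shrinks in proportion to the worker count. The function names are hypothetical.

```python
def tasks_per_second(cost_us_per_task: float) -> float:
    """Task rate implied by a fixed per-task cost given in microseconds."""
    return 1_000_000 / cost_us_per_task


def strong_scaling_task_rate(workers: int, tasks_per_worker: int = 1,
                             one_worker_iteration_seconds: float = 1.0) -> float:
    """Task rate needed to keep N workers busy on a fixed-size problem.

    Assumed model: the task count grows with N while the iteration time
    shrinks with N, so the required rate grows with N squared.
    """
    tasks = tasks_per_worker * workers
    iteration_seconds = one_worker_iteration_seconds / workers
    return tasks / iteration_seconds


print(tasks_per_second(2.0))  # 500000.0
for n in (10, 20, 40):
    print(n, strong_scaling_task_rate(n))
```

Doubling the number of workers in this model quadruples the task rate the controller has to sustain, which is why per-task scheduling cost dominates at scale.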

slide-53
SLIDE 53

Evaluation

Graphical Simulations Distributed in Nimbus

  • To show the generality of execution templates, we considered graphical simulations in Nimbus:

– Complex and memory-intensive, from the PhysBAM library.
– High task throughput requirements (400,000 tasks per second).
– Nested loops and data-dependent branches.
– Require patching in very subtle cases.
– Traditionally in the HPC domain.

slide-54
SLIDE 54

Evaluation

Graphical Simulations Distributed in Nimbus

slide-55
SLIDE 55

Conclusion

Control Plane Design                 Example Framework          Task Throughput (tasks per sec)   Scheduling Cost (per task)
Centralized                          MapReduce, Hadoop, Spark   ≈ 1,000                           ≈ 100µs
Distributed                          Naiad, TensorFlow          ≈ 100,000                         ≈ 100,000µs
Centralized w/ Execution Templates   Nimbus                     ≈ 100,000                         ≈ 100µs

slide-56
SLIDE 56

Thank you!

nimbus.stanford.edu

https://github.com/omidm/nimbus