FLOWPROPHET: Generic and Accurate Traffic Prediction for Data-parallel Cluster Computing (PowerPoint PPT Presentation)


slide-1
SLIDE 1

FLOWPROPHET: Generic and Accurate Traffic Prediction for Data-parallel Cluster Computing

Hao Wang1,2, Li Chen2, Kai Chen2, Ziyang Li2,3, Yiming Zhang3, Haibing Guan1, Zhengwei Qi1, Dongsheng Li3, Yanhui Geng4

1 Shanghai Jiao Tong University, 2 Hong Kong University of Science and Technology, 3 National University of Defense Technology, 4 Huawei Technologies Co. Ltd.

ICDCS’15, Columbus, USA

slide-2
SLIDE 2

[Figure: data-parallel cluster computing frameworks, e.g. Dryad.]

slide-3
SLIDE 3

Flow-based optimization mechanisms:
  • PDQ [SIGCOMM'12], pFabric [SIGCOMM'13], PASE [SIGCOMM'14], Varys [SIGCOMM'14], Baraat [SIGCOMM'14]

Architectural bandwidth provisioning:
  • c-Through [SIGCOMM'10], Helios [SIGCOMM'10], Mordia [SIGCOMM'13], OSA [NSDI'12]

Traffic engineering:
  • Hedera [NSDI'10], MicroTE [CoNEXT'11], D3 [SIGCOMM'11]

slide-4
SLIDE 4

(Same list of mechanisms as Slide 3.)

Common requirement: Knowing the Flow Information Ahead of Time. But how?

slide-5
SLIDE 5


  • Generic for DCFs
  • Accurate and fine-grained
  • Ahead-of-time
  • Scalable and low-overhead

FLOWPROPHET

slide-6
SLIDE 6

[Figure: the intermediate (word,1) pairs, e.g. (B,1) (D,1) (A,1) (C,1) (D,1) (B,1) (E,1) (A,1) (D,1) (E,1) (B,1) (C,1), aggregated into the final counts (A,2) (B,3) (C,2) (D,3) (E,2).]

Toy Example: Word Count

slide-7
SLIDE 7

Logical View

[Figure: the input text (words A to E) is split among map() tasks, which emit (word,1) pairs; reduce() then aggregates the pairs into the final counts (A,2) (B,3) (C,2) (D,3) (E,2).]
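To make the logical view concrete, here is a minimal word-count sketch in plain Scala. This is an illustrative sketch, not the slides' code: the input records and all identifiers are assumptions, chosen so the counts match the toy example.

```scala
object WordCountToy {
  def main(args: Array[String]): Unit = {
    // Illustrative input records; the resulting counts match the toy example:
    // (A,2) (B,3) (C,2) (D,3) (E,2)
    val input = Seq("B D A C", "D B E A", "D E B C")

    // map(): emit a (word, 1) pair for every word in every record
    val mapped = input.flatMap(_.split("\\s+")).map(word => (word, 1))

    // shuffle + reduce(): group the pairs by word and sum the counts
    val counts = mapped.groupBy(_._1).map { case (word, ones) => (word, ones.map(_._2).sum) }

    counts.toSeq.sortBy(_._1).foreach { case (w, c) => println(s"($w,$c)") }
  }
}
```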

slide-8
SLIDE 8


Physical View

slide-9 … slide-15
SLIDES 9-15 (animation sequence)

Physical View

[Figure: map() tasks consume the input records and emit (word,1) pairs step by step; once map() completes, the Shuffle phase begins.]

slide-16
SLIDE 16

Physical View

[Figure: the Shuffle delivers the (word,1) pairs to the reduce() tasks, which output the final counts (B,3) (C,2) (A,2) and (D,3) (E,2).]

slide-17 … slide-21
SLIDES 17-21 (animation sequence)

[Figure: a User submits jobs to Distributed Computing Frameworks; each job's Logical View is mapped to a Physical View, and from this process the DAG is extracted and the flow info. is predicted.]
slide-22
SLIDE 22

Directed Acyclic Graph (DAG)

slide-23
SLIDE 23

[Figure: a job as a DAG of stages (stage #0, stage #1, stage #2, stage #3, …): each stage consists of tasks, reading the input data and producing the output data.]
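A hedged sketch of how such a job could be represented as a DAG of stages, each holding its tasks and child-stage dependencies. The Scala class and field names are illustrative assumptions, not FLOWPROPHET's actual data structures.

```scala
// Illustrative stage/task types; every parent -> child edge between stages
// implies cross-stage data transfers, i.e. flows that can be predicted.
case class Task(id: Int, host: String)
case class Stage(id: Int, tasks: Seq[Task], children: Seq[Int])

object DagSketch {
  def main(args: Array[String]): Unit = {
    // stage #0 and #1 read the input data, stage #2 consumes both, stage #3 writes the output
    val dag = Seq(
      Stage(0, Seq(Task(0, "worker-1"), Task(1, "worker-2")), children = Seq(2)),
      Stage(1, Seq(Task(2, "worker-3")), children = Seq(2)),
      Stage(2, Seq(Task(3, "worker-1")), children = Seq(3)),
      Stage(3, Seq(Task(4, "worker-2")), children = Nil)
    )
    // Walking the child links gives the time, data, and flow dependencies between stages.
    for (s <- dag; c <- s.children) println(s"stage #${s.id} -> stage #$c")
  }
}
```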

slide-24
SLIDE 24

Dryad

[Figure: a Dryad job: n computing vertices read the input data and write the output files.]

slide-25
SLIDE 25

[Figure: a MapReduce job: map tasks read the input data, reduce tasks produce the output data.]
slide-26
SLIDE 26

[Figure: the BSP Model: computing nodes execute superstep(i), then wait at a barrier synchronization before the next superstep.]

slide-27 … slide-30
SLIDES 27-30 (animation sequence)

LIFE CYCLE

[Figure: an Application submits jobs (job#1, job#2, job#3, …, job#n) to the Master; the Master assigns tasks to Worker#1, Worker#2, …, Worker#n.]

slide-31
SLIDE 31

The DAG contains the time, data, and flow dependencies necessary for accurate flow prediction.

OBSERVATION

slide-32
SLIDE 32

[Figure: FLOWPROPHET architecture: on the Master Node, a DAG Builder runs alongside the Spark/Hadoop/Ciel Master and tracks Stage IDs and Task Lists; on each Worker Node, a Data Tracker runs alongside the Spark/Hadoop/Ciel Worker and records the Data Status of writes and fetches to Local Memory, Local Disk, and the Network Interface; a Data Aggregator collects the Data Status List, and the Flow Calculator combines it with the DAG to produce the flow information.]

ARCHITECTURE

slide-33
SLIDE 33

API EXAMPLES

  • Required APIs for the DCF master:

    Event Definition                         Trigger Condition
    newStageEvent(stageID, childStageID)     a new stage is created
    stageStartEvent(List[task], stageID)     a stage is beginning
    stageFinishedEvent(stageID)              a stage is finished

  • The DAG Builder event handlers:

    newStageHandler(newStageEvent) ⇒ (currentStage, childStage)
    stageStartHandler(stageStartEvent) ⇒ Event(List[task], List[stageID])
    stageFinishedHandler(stageFinishedEvent) ⇒ Event(stageID)
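A minimal Scala sketch of how these events and handlers could fit together, based only on the signatures listed above. The types (StageId, Task), the handler bodies, and the simplified return values are assumptions for illustration, not FLOWPROPHET's actual implementation.

```scala
object DagBuilderApiSketch {
  type StageId = Int
  case class Task(id: Int, host: String)

  // Events the DCF master is required to emit (per the slide)
  case class NewStageEvent(stageId: StageId, childStageId: StageId)   // a new stage is created
  case class StageStartEvent(tasks: List[Task], stageId: StageId)     // a stage is beginning
  case class StageFinishedEvent(stageId: StageId)                     // a stage is finished

  // DAG Builder handlers consuming those events (return values simplified to tuples)
  def newStageHandler(e: NewStageEvent): (StageId, StageId) = (e.stageId, e.childStageId)
  def stageStartHandler(e: StageStartEvent): (List[Task], StageId) = (e.tasks, e.stageId)
  def stageFinishedHandler(e: StageFinishedEvent): StageId = e.stageId

  def main(args: Array[String]): Unit = {
    println(newStageHandler(NewStageEvent(stageId = 1, childStageId = 0)))    // record a DAG edge
    println(stageStartHandler(StageStartEvent(List(Task(7, "worker-1")), 1))) // stage 1 starts
    println(stageFinishedHandler(StageFinishedEvent(1)))                      // stage 1 is done
  }
}
```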
slide-34
SLIDE 34

[Figure: flow-prediction timeline: (1) the DAG Builder establishes the stage and task dependencies; (2) the requesting tasks become known: 192.168.1.21 requests blockID#2, 192.168.1.22 requests blockID#3, 192.168.1.23 requests blockID#1; (3) the Data Tracker reports the block info.: blockID#1 (120MB) on 192.168.1.11, blockID#2 (200MB) on 192.168.1.12, blockID#3 (200MB) on 192.168.1.13; the Flow Calculator then outputs the flow info.]
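A minimal Scala sketch of the flow-calculation step in this figure: join each task's block request with the tracked block locations and sizes to obtain (source, destination, size) flows before the transfers start. The case-class and function names are illustrative assumptions, not FLOWPROPHET's actual code; the sample data matches the figure.

```scala
object FlowCalculatorSketch {
  case class BlockInfo(blockId: Int, hostIp: String, sizeMB: Int)   // reported by the Data Tracker
  case class BlockRequest(requesterIp: String, blockId: Int)        // derived from the DAG / task list
  case class Flow(srcIp: String, dstIp: String, sizeMB: Int)

  // Join requests with block locations: one predicted flow per satisfiable request.
  def predictFlows(blocks: Seq[BlockInfo], requests: Seq[BlockRequest]): Seq[Flow] = {
    val byId = blocks.map(b => b.blockId -> b).toMap
    requests.flatMap(r => byId.get(r.blockId).map(b => Flow(b.hostIp, r.requesterIp, b.sizeMB)))
  }

  def main(args: Array[String]): Unit = {
    val blocks = Seq(
      BlockInfo(1, "192.168.1.11", 120),
      BlockInfo(2, "192.168.1.12", 200),
      BlockInfo(3, "192.168.1.13", 200))
    val requests = Seq(
      BlockRequest("192.168.1.21", 2),
      BlockRequest("192.168.1.22", 3),
      BlockRequest("192.168.1.23", 1))
    // e.g. Flow(192.168.1.12, 192.168.1.21, 200)
    predictFlows(blocks, requests).foreach(println)
  }
}
```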

slide-35
SLIDE 35


  • Generic
  • Accurate and fine-grained
  • Ahead-of-time
  • Scalable and low-overhead

FLOWPROPHET

slide-36
SLIDE 36


TESTBED

  • Dell PowerEdge R320 x 37
  • Intel Xeon E5-1410 2.8GHz CPU
  • 24GB 1600MHz DDR3
  • Broadcom Gigabit Ethernet NIC
  • Pronto-3295 Gigabit Ethernet Switch
slide-37
SLIDE 37


BENCHMARKS

  • WikiPageRank
  • SparkPageRank
  • Spark K-means
  • Hadoop TeraSort
  • π (Pi)
  • WordCount

METRICS

  • Time advance
  • Prediction accuracy

  • Overhead
  • Scalability
  • Benefits
slide-38
SLIDE 38

[Figure: number of flows in ShuffleID#6 and ShuffleID#7 over time (16:18:25 to 16:18:35); the predictions are made at 16:18:22.365 and 16:18:29.547, before the corresponding shuffle traffic appears.]

TIME ADVANCE

  • WikipediaPageRank-13G (Spark)
slide-39
SLIDE 39

CDF OF LEAD TIME

[Figure: CDFs of prediction lead time: Spark WikiPR-13G avg. 414.1ms; Spark WikiPR-26G avg. 478ms; Hadoop TeraSort-10G avg. 12.3123s; Hadoop WordCount-20G avg. 7.7348s.]

slide-40
SLIDE 40

PREDICTION ACCURACY

[Figure: actual vs. predicted traffic volume: Spark WikiPR-26G per shuffle (ShuffleID#3 to #6, MB), Hadoop TeraSort-10G (GB), and Hadoop WordCount-10G (MB).]

slide-41
SLIDE 41

[Figure: job completion time (s), Pure Spark vs. Spark with FlowProphet, for Wikipedia PageRank-13G, Wikipedia PageRank-26G, SparkPi-500M, SparkPi-1000M, WordCount-20G, WordCount-40G, and KMeans-20G.]

OVERHEAD

slide-42
SLIDE 42

[Figure: job completion time (s), Pure Hadoop vs. Hadoop with FlowProphet vs. Hadoop with HadoopWatch, for HadoopPi-100M, HadoopPi-500M, WordCount-20G, WordCount-40G, TeraSort-10G, and TeraSort-20G.]

OVERHEAD

slide-43
SLIDE 43
  • Overhead Ratio (OR) :

SCALABILITY

[Figure: Spark WikiPR-26G: job completion time (s) for Pure Spark vs. Spark with FlowProphet, and Overhead Ratio (%) measured on the testbed and by projection, as the number of worker nodes grows.]

OR = (t_enabled − t_disabled) / t_disabled
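A quick worked example of the Overhead Ratio (with assumed numbers, not taken from the slide): if a job completes in t_disabled = 100 s with FlowProphet disabled and t_enabled = 102 s with it enabled, then OR = (102 − 100) / 100 = 2%.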

slide-44
SLIDE 44

SCALABILITY

[Figure: Hadoop TeraSort-10G: job completion time (s) for Pure Hadoop vs. Hadoop with FlowProphet, and Overhead Ratio (%) measured on the testbed and by projection, as the number of worker nodes grows.]

slide-45
SLIDE 45
  • Hadoop TeraSort-25G
  • 12.52% JCT reduction by a simple network scheduler


BENEFITS

[Figure: Original vs. Optimized: average coflow completion time (s) and average job completion time (s).]

slide-46
SLIDE 46
  • Analyze past statistics
    • Traffic Engineering with Estimated Traffic Matrices
  • Monitor buffers or counters in switches
    • c-Through, Hedera, Helios
  • Tracing and profiling toolkits
    • X-Trace
  • File system monitoring
    • HadoopWatch


RELATED WORK

slide-47
SLIDE 47
  • DCF execution pattern
  • DAG for predicting flows
  • Design and implementation
  • Evaluation on testbed


SUMMARY

slide-48
SLIDE 48

ICDCS’15, Columbus, USA

Thank you Q&A