Exploiting the Spatial Dimension Akshay Jajoo Rohan Gandhi Y. - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Saath: Speeding up CoFlows by Exploiting the Spatial Dimension

Akshay Jajoo, Rohan Gandhi, Y. Charlie Hu, Cheng-Kok Koh

slide-2
SLIDE 2

Analytics Jobs in Big Data

  • Analytics jobs in data-centers

– Process huge amount of data
– Distributed in nature
– Have multiple stages that communicate with each other

slide-3
SLIDE 3

Example – Map Reduce Jobs

[Figure: Map Stage → Shuffle (Communication) → Reduce Stage]

  • Two compute stages: Map and Reduce
  • Map communicates with Reduce in the shuffle phase

slide-4
SLIDE 4

Impact of communication on job performance

[Figure: Map Stage → Shuffle (Communication) → Reduce Stage]

  • Facebook jobs spend 25% of their time in communication [1]

[1] Based on information from the full Facebook trace used in Aalo; source: Aalo slides.

slide-5
SLIDE 5

CoFlow abstraction

[Figure: Map Stage → Shuffle → Reduce Stage; the shuffle flows are marked as one CoFlow]

  • CoFlow: collection of all flows that share the same goal [2]
  • Implication: a CoFlow finishes when all its flows are over
  • CoFlow Completion Time (CCT): completion time of its last flow

[2] M. Chowdhury and I. Stoica. Coflow: A networking abstraction for cluster applications. In HotNets, 2012.
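The CCT definition above can be written down directly. This is a minimal sketch, not code from the talk; the function names and the example finish times are hypothetical.

```python
def coflow_completion_time(flow_finish_times):
    """CCT is the finish time of the CoFlow's last flow."""
    if not flow_finish_times:
        raise ValueError("a CoFlow needs at least one flow")
    return max(flow_finish_times)


def average_cct(coflows):
    """Average CCT over a mapping {coflow id: list of flow finish times}."""
    ccts = [coflow_completion_time(times) for times in coflows.values()]
    return sum(ccts) / len(ccts)


# Hypothetical finish times, only to exercise the definition.
example = {"C1": [1.0, 3.0], "C2": [2.0, 2.0, 5.0]}
example_average = average_cct(example)
```

A CoFlow whose flows all finish early except one is still as slow as that one flow, which is why the scheduler reasons about CoFlows rather than individual flows.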

slide-6
SLIDE 6

CoFlow Scheduling Problem

[Figure: Map Stage and Reduce Stage of two jobs, labeled CoFlow 1 and CoFlow 2]

  • Green job: 2 mappers and 2 reducers
  • Orange job: 4 mappers and 2 reducers
  • They share the datacenter network
  • Goal: minimize the average CoFlow completion time (CCT) of all CoFlows

slide-7
SLIDE 7

CoFlow Scheduling Problem

  • CoFlow scheduling problem
  • Minimize average CoFlow Completion Time (CCT)
  • CoFlows have 2 dimensions

– Time: length of individual flows
– Space: many flows or ports

  • CoFlow scheduling problem is NP-hard [3]

[3] M. Chowdhury, Y. Zhong, and I. Stoica. Efficient coflow scheduling with Varys. In SIGCOMM, 2014.

slide-8
SLIDE 8

Outline

  • Background of Aalo (State-of-the-art CoFlow scheduler)
  • Limitations of Aalo
  • Design of Saath
  • Evaluation


slide-9
SLIDE 9

Background of Aalo (State-of-the-art CoFlow scheduler)

– Shortest job first for sequential jobs
– Online approximation of SJF
– Aalo: Online SJF + Spatial dimension (many distributed tasks)

slide-10
SLIDE 10

Scheduling 101

  • Shortest-Job-First (SJF): optimal in minimizing average completion time

[Figure: process scheduling under SJF for processes with durations P1 < P2 < P3; P1 runs first and P3 last]
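A small check of the SJF claim, assuming a single resource that runs one job at a time. The durations are made up for illustration; the brute-force comparison over all orders is only feasible for tiny inputs.

```python
from itertools import accumulate, permutations


def average_completion_time(durations_in_run_order):
    """Jobs run back to back; each job completes at the running sum."""
    finish_times = list(accumulate(durations_in_run_order))
    return sum(finish_times) / len(finish_times)


def sjf_order(durations):
    """Shortest-Job-First: run jobs in increasing order of duration."""
    return sorted(durations)


durations = [3, 1, 2]  # hypothetical job lengths
sjf_average = average_completion_time(sjf_order(durations))
best_average = min(average_completion_time(p) for p in permutations(durations))
```

Here `sjf_average` equals `best_average`: no other order of these three jobs gives a lower average completion time.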

slide-11
SLIDE 11

Outline

  • Background of Aalo (State-of-the-art CoFlow scheduler)

– Shortest job first for sequential jobs
– Online approximation of SJF
– Aalo: Online SJF + Spatial dimension (many distributed tasks)

slide-12
SLIDE 12

Online Approximation to SJF using Priority queues

  • Process durations are unknown
  • Priority queues Q0 (high) to Q2 (low): higher priority = more CPU time
  • Shorter processes finish in high priority queues

[Figure: processes P1, P2, P3 moving through priority queues Q0, Q1, Q2, from high to low priority]
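The priority-queue idea can be sketched as a small multi-level feedback queue. This is an illustration under stated assumptions, not the scheduler from the talk: strict priority between queues, round robin inside a queue, and the demotion thresholds and quantum below are invented. Durations are used only to decide when a process ends, never to order processes.

```python
from collections import deque


def mlfq_finish_queues(durations, thresholds=(1, 4), quantum=1):
    """Return {process: index of the queue in which it finished}.

    thresholds[i] is the cumulative service after which a process
    is demoted from queue i to queue i + 1 (assumed values).
    """
    queues = [deque() for _ in range(len(thresholds) + 1)]
    served = {p: 0 for p in durations}
    for p in durations:
        queues[0].append(p)  # every process starts at the highest priority
    finished_in = {}
    while any(queues):
        level = next(i for i, q in enumerate(queues) if q)
        p = queues[level].popleft()
        served[p] += min(quantum, durations[p] - served[p])
        if served[p] >= durations[p]:
            finished_in[p] = level
        elif level < len(thresholds) and served[p] >= thresholds[level]:
            queues[level + 1].append(p)  # demote: it has proven to be long
        else:
            queues[level].append(p)
    return finished_in


result = mlfq_finish_queues({"P1": 1, "P2": 3, "P3": 10})
```

With these assumed numbers the shortest process finishes in Q0, the medium one in Q1 and the longest in Q2, which is the behaviour the slide describes.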

slide-13
SLIDE 13

Outline

  • Background of Aalo (State-of-the-art CoFlow scheduler)

– Shortest job first for sequential jobs
– Online approximation of SJF
– Aalo: Online SJF + Spatial dimension (many distributed tasks)

slide-14
SLIDE 14

Datacenter Network abstraction: Non-blocking switch

  • The entire datacenter fabric is one non-blocking switch

– Makes analysis simple
– Recent works like CONGA [4] (SIGCOMM '14) and VL2 [5] (SIGCOMM '09) make the abstraction practical

  • The only sources of contention are the end-hosts
  • Implication: the CoFlow scheduling problem boils down to ordering CoFlows at the sending hosts/ports

[Figure: Sender Node-1 and Sender Node-2 connected through the DC Network to Receiver Node-1 and Receiver Node-2]

[4] M. Alizadeh et al. CONGA. In SIGCOMM, 2014.
[5] A. Greenberg et al. VL2. In SIGCOMM, 2009.

slide-15
SLIDE 15

Aalo: Online CoFlow Scheduler

A CoFlow has many flows. How to approximate SJF?

  • 1. Replicates priority queues at each node
  • 2. A CoFlow is moved across priority queues based on total bytes sent at all its ports
  • 3. Different ports send independently
  • 4. Intra-queue: Use FIFO

[Figure: Global Co-ordinator above Sender Node-1 and Sender Node-2, each with priority queues Q0 (high) and Q1 (low) holding CoFlows C1, C2, C3, connected through the DC Network to Receiver Node-1 and Receiver Node-2]

[6] M. Chowdhury et al. Aalo. In SIGCOMM, 2015.
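Step 2 of the Aalo description, placing a CoFlow in a queue by the total bytes it has sent across all its ports, might look like the sketch below. The threshold values and the function name are assumptions for illustration and are not taken from the talk.

```python
from bisect import bisect_right


def aalo_queue_index(bytes_sent_per_port, thresholds):
    """Queue index (0 = highest priority) from total bytes sent at all ports.

    thresholds must be sorted in increasing order; crossing thresholds[i]
    moves the CoFlow from queue i to queue i + 1.
    """
    total = sum(bytes_sent_per_port.values())
    return bisect_right(thresholds, total)


# Hypothetical thresholds, in arbitrary byte units.
THRESHOLDS = (10, 100)
small = aalo_queue_index({"port1": 4, "port2": 5}, THRESHOLDS)
medium = aalo_queue_index({"port1": 6, "port2": 5}, THRESHOLDS)
large = aalo_queue_index({"port1": 90, "port2": 60}, THRESHOLDS)
```

Because the sum runs over all ports, the coordinator has to aggregate per-port byte counts before it can decide the queue of a CoFlow.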

slide-16
SLIDE 16

Outline

  • Background of Aalo (State-of-the-art CoFlow scheduler)
  • Limitations of Aalo
  • Design of Saath
  • Evaluation


slide-17
SLIDE 17

Aalo Drawback 1: Out-of-Sync

[Figure: Global Co-ordinator above Sender Node-1 and Sender Node-2; the nodes' priority queues Q0 and Q1 hold CoFlows C1, C2, C3, and both nodes send into the DC Network]

  • Ports schedule independently of each other
  • Flows of a CoFlow may get scheduled at different times at different ports
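A minimal sketch of how independent per-port decisions lead to out-of-sync flows. Each port serves the head of its highest-priority non-empty queue (FIFO within a queue). The queue contents below are a hypothetical arrangement, not read off the slide.

```python
def pick_per_port(port_queues):
    """Each port independently picks the FIFO head of its best non-empty queue."""
    choice = {}
    for port, queues in port_queues.items():
        for queue in queues:  # index 0 is the highest priority
            if queue:
                choice[port] = queue[0]
                break
    return choice


def out_of_sync(choice, coflow_ports):
    """CoFlows that are being served on some of their ports but not on all."""
    result = set()
    for coflow, ports in coflow_ports.items():
        served = {p for p in ports if choice.get(p) == coflow}
        if served and served != ports:
            result.add(coflow)
    return result


# Hypothetical state: C2 is at the head on node1 but behind C1 on node2.
port_queues = {"node1": [["C2", "C3"], []], "node2": [["C1", "C2"], []]}
coflow_ports = {"C1": {"node2"}, "C2": {"node1", "node2"}, "C3": {"node1"}}
choice = pick_per_port(port_queues)
lagging = out_of_sync(choice, coflow_ports)
```

In this state C2 sends from node1 while its flow on node2 waits, so the bandwidth it uses on node1 does not bring its completion time forward.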

slide-18
SLIDE 18

Aalo Drawback 2: Contention Oblivion

Contention of a CoFlow – Number of other CoFlows it blocks

  • C1 – 1 – C2
  • C2 – 2 – C1 & C3
  • C3 – 1 – C2

[Figure: two schedules of CoFlows C1, C2, C3 in queues Q0 and Q1 at Sender Node-1 and Sender Node-2, under the Global Co-ordinator and over the DC Network]

  • First schedule: Average CCT = (2+1+2)/3 = 5/3
  • Second schedule: Average CCT = (1+2+1)/3 = 4/3
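The contention numbers listed on this slide can be reproduced by counting, for each CoFlow, the other CoFlows that share at least one port with it. The port layout below (C2 and C3 on one sender node, C2 and C1 on the other) is an assumption that is consistent with the listed values.

```python
def contention(port_coflows):
    """Map each CoFlow to the number of other CoFlows it shares a port with."""
    blocked = {}
    for coflows in port_coflows.values():
        for coflow in coflows:
            others = {c for c in coflows if c != coflow}
            blocked.setdefault(coflow, set()).update(others)
    return {coflow: len(others) for coflow, others in blocked.items()}


# Assumed layout of the example.
port_coflows = {"node1": {"C2", "C3"}, "node2": {"C2", "C1"}}
counts = contention(port_coflows)
```

C2 touches both ports and therefore blocks two CoFlows, while C1 and C3 each block only C2.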

slide-19
SLIDE 19

Aalo does not take the arrangement of CoFlows across space into account

slide-20
SLIDE 20

Outline

  • State-of-the-art CoFlow scheduler - Aalo
  • Limitations of Aalo
  • Design of Saath
  • Evaluation


slide-21
SLIDE 21

Saath: Speeding up CoFlows by exploiting the Spatial Dimension


slide-22
SLIDE 22

Saath

  • Saath is an online scheduler.
  • Takes the spatial dimension into account while scheduling CoFlows.
  • Spatial dimension: arrangement of the flows of CoFlows across ports

slide-23
SLIDE 23

Saath Key Ideas

  • All-or-none
  • Least-Contention-First within a queue
  • Faster CoFlow-queue transition


slide-24
SLIDE 24

Key idea 1: All-or-none

  • Either schedule all flows of a CoFlow or schedule none.

– Holding back a CoFlow when only a subset of its flows could be scheduled has no effect on its CCT.
– By freeing up some ports we potentially improve CCT for others.
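All-or-none can be sketched as a single pass over CoFlows in priority order: a CoFlow is admitted only if every port it needs is still free. This is an illustrative sketch with hypothetical names and inputs, and it considers sender ports only.

```python
def all_or_none(ordered_coflows, coflow_ports, free_ports):
    """Admit a CoFlow only if all of its ports are free.

    Returns the admitted CoFlows and the ports left idle afterwards.
    """
    free = set(free_ports)
    scheduled = []
    for coflow in ordered_coflows:
        ports = coflow_ports[coflow]
        if ports <= free:  # every port of this CoFlow is available
            scheduled.append(coflow)
            free -= ports
    return scheduled, free


# Hypothetical example: C2 needs P1, which C1 already holds.
coflow_ports = {"C1": {"P1"}, "C2": {"P1", "P2"}, "C3": {"P3"}}
scheduled, idle = all_or_none(["C1", "C2", "C3"], coflow_ports, {"P1", "P2", "P3"})
```

C2 is skipped as a whole, so P2 stays idle after this pass. That idle port is the utilization problem discussed on the next slide.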

slide-25
SLIDE 25

Challenges in All-or-none

[Figure: two timelines of CoFlows C1, C2, C3 on ports P1, P2, P3 with time marks t, 2t, 3t; labels as extracted: CCT: C1 = t, C2 = 2t, C3 = C4 = t]

  • Saath handles low port utilization by carefully designed work conservation
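The talk does not spell out how the work conservation is designed. One plausible reading, shown here purely as an assumption, is a backfill pass: after the all-or-none pass, each idle port is handed to the highest-priority skipped CoFlow that has a flow on it.

```python
def backfill(idle_ports, skipped_coflows, coflow_ports):
    """Hypothetical work-conservation pass (not taken from the talk).

    skipped_coflows is in priority order; each idle port goes to the
    first skipped CoFlow that has a flow on that port.
    """
    assignment = {}
    for port in sorted(idle_ports):
        for coflow in skipped_coflows:
            if port in coflow_ports[coflow]:
                assignment[port] = coflow
                break
    return assignment


# Hypothetical state after an all-or-none pass that skipped C2 and left P2 idle.
assignment = backfill({"P2"}, ["C2"], {"C2": {"P1", "P2"}})
```

Under this reading no port sits idle while some CoFlow has data to send on it, at the cost of letting a skipped CoFlow make partial progress.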

slide-26
SLIDE 26

Key idea 2: Least-Contention-First within a queue

  • Contention of a CoFlow – Number of other CoFlows it blocks
  • Saath sorts CoFlows in each queue in increasing order of contention
  • Allows more CoFlows to be scheduled in parallel.

slide-27
SLIDE 27

Key idea 2: Least-Contention-First within a queue

Contention of a CoFlow – Number of other CoFlows it blocks

  • C1 – 1 – C2
  • C2 – 2 – C1 and C3
  • C3 – 1 – C2

[Figure: two schedules of CoFlows C1, C2, C3 in queues Q0 and Q1 at Sender Node-1 and Sender Node-2, under the Global Co-ordinator and over the DC Network]

  • First schedule: Average CCT = (1+2+2)/3 = 5/3
  • Second schedule: Average CCT = (1+2+1)/3 = 4/3
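The two averages on this slide can be reproduced with a toy model. Assumptions: every flow takes one time unit, each port serves its CoFlows one at a time in a common global order, and the port layout is C2 and C3 on one node and C2 and C1 on the other. The contention values are the ones listed on the slide.

```python
from fractions import Fraction


def average_cct(order, port_coflows):
    """Each port serves its CoFlows in `order`, one time unit per flow."""
    finish = {}
    for coflows in port_coflows.values():
        clock = 0
        for coflow in order:
            if coflow in coflows:
                clock += 1
                finish[coflow] = max(finish.get(coflow, 0), clock)
    return Fraction(sum(finish.values()), len(finish))


port_coflows = {"node1": {"C2", "C3"}, "node2": {"C2", "C1"}}  # assumed layout
contention = {"C1": 1, "C2": 2, "C3": 1}                       # from the slide

fifo_order = ["C1", "C2", "C3"]                       # assumed arrival order
lcof_order = sorted(contention, key=contention.get)   # least contention first

fifo_average = average_cct(fifo_order, port_coflows)
lcof_average = average_cct(lcof_order, port_coflows)
```

Serving C1 and C3 first lets both ports finish a CoFlow in the first time unit, which gives 4/3 instead of 5/3.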

slide-28
SLIDE 28

Key idea 3: Faster CoFlow-queue transition

  • Both Aalo and Saath use a priority-queue structure to move CoFlows across queues
  • Aalo uses total bytes sent by all flows
  • Saath uses bytes per flow
  • Saath has fast transition of longer CoFlows to lower priority queues

Assume the queue transition threshold for Aalo is portBandwidth × 4t.

[Figure: setup with CoFlows C1, C2, C3, C4 on sender ports P1 to P4, and timelines with marks t and 2t labeled "C2 - Fast transition - Saath" and "Aalo C2 transits"]
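The difference between the two transition rules can be sketched as two predicates. The Aalo rule follows the slide (total bytes against the queue threshold). For Saath the slide only says "bytes per flow"; the sketch assumes the queue threshold is split equally among the CoFlow's flows and that the CoFlow transitions as soon as any flow reaches its share. Names and numbers are hypothetical.

```python
def aalo_should_demote(bytes_sent_per_flow, threshold):
    """Aalo-style rule: compare total bytes sent by all flows to the threshold."""
    return sum(bytes_sent_per_flow) >= threshold


def saath_should_demote(bytes_sent_per_flow, threshold):
    """Assumed Saath-style rule: any flow reaching an equal share of the threshold."""
    per_flow_share = threshold / len(bytes_sent_per_flow)
    return max(bytes_sent_per_flow) >= per_flow_share


# Units: portBandwidth = 1 and t = 1, so the threshold portBandwidth x 4t is 4.
THRESHOLD = 4
# C2 has four flows; one is blocked, three have sent for one time unit.
sent_at_t = [0, 1, 1, 1]
aalo_at_t = aalo_should_demote(sent_at_t, THRESHOLD)
saath_at_t = saath_should_demote(sent_at_t, THRESHOLD)
```

With one flow blocked, the total grows more slowly than the per-flow count suggests, so under these assumptions the total-bytes rule has not fired at time t while the per-flow rule has.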

slide-29
SLIDE 29

Recap: Saath Scheduling Ideas

  • All-or-none
  • Least-Contention-First within a queue
  • Faster CoFlow-queue transition


slide-30
SLIDE 30

Outline

  • State-of-the-art CoFlow scheduler - Aalo
  • Limitations of Aalo
  • Design of Saath
  • Evaluation


slide-31
SLIDE 31

Evaluation Methodology

  • 1. Large-scale trace-driven simulations
  • 2. Large-scale testbed evaluation on 150 nodes
  • 3. Saath implemented in 5.2 KLoC of C++
slide-32
SLIDE 32

Trace

  • 1. FB trace [7]

– Collected from Facebook’s cluster
– 526 CoFlows, 150 ports

  • 2. OSP trace

– Collected from Microsoft’s cluster
– O(1000) CoFlows, O(100) ports

[7] https://github.com/coflow/coflow-benchmark

slide-33
SLIDE 33

Overall CCT improvement

  • Saath approaches offline SEBF
  • Median speedup over Aalo: 1.53x for the FB trace and 1.42x for the OSP trace
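How a median (P50) or P90 speedup could be computed from per-CoFlow CCTs, assuming speedup means the baseline CCT divided by the new CCT for the same CoFlow. The talk does not state the exact definition or percentile method, so both are assumptions here, and the numbers are toy values rather than results.

```python
import math
from statistics import median


def speedups(baseline_cct, new_cct):
    """Per-CoFlow speedup, assumed to be baseline CCT / new CCT."""
    return [baseline_cct[c] / new_cct[c] for c in baseline_cct]


def percentile(values, p):
    """Nearest-rank percentile for 0 < p <= 100."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


# Toy CCTs in arbitrary time units, for illustration only.
baseline = {"a": 2.0, "b": 3.0, "c": 8.0}
improved = {"a": 1.0, "b": 2.0, "c": 2.0}
ratios = speedups(baseline, improved)
p50 = median(ratios)
p90 = percentile(ratios, 90)
```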

slide-34
SLIDE 34

CCT improvement – Design Components

  • Each design component contributes considerably to the CCT improvement.
slide-35
SLIDE 35

Things are In-Sync now

  • Most CoFlows whose flows are of equal length now show very small deviation in flow completion times (FCTs)
slide-36
SLIDE 36

Testbed

  • CCT speedup: 1.88x on average and 1.43x at P50

slide-37
SLIDE 37

Scheduling Overhead

                              SAATH                     Aalo
                              Average      P90          Average  P90
Global Coordinator
  CPU %                       37.8         42.7         33.5     35.5
  Memory (MB)                 229          284          267      374
  Total time (msec)           0.57         2.85         0.1      0.2
  (LCoF/All-or-none) (msec)   (0.02/0.24)  (0.03/0.7)
Local Node
  CPU %                       5.6          5.7          5.5      5.7
  Memory (MB)                 1.68         1.7          1.75     1.78

slide-38
SLIDE 38

Conclusion

  • CoFlow scheduling holds promise to optimize communication in Big Data jobs
  • Limitations of prior-art Aalo:

– Ignores spatial arrangement
– Has no coordination across ports
– Flows can be out of sync
– CoFlow contention oblivious

  • Saath:

– Fuses spatial dimension in CoFlow scheduling
– Coordination across ports
– Evaluation: CCT improvement of 1.53x (P50) and 4.5x (P90) for the FB trace, and 1.42x (P50) and 37x (P90) for the OSP trace

slide-39
SLIDE 39

Thank you!
