Aemon: Information-agnostic Mix-flow Scheduling in Data Center Networks

Aemon: Information-agnostic Mix-flow Scheduling in Data Center Networks. Tao Wang¹, Hong Xu², Fangming Liu¹ (¹Huazhong University of Science and Technology, ²NetX Lab @ City University of Hong Kong). August 2017 @ APNet, Hong Kong.


SLIDE 1

Aemon: Information-agnostic Mix-flow Scheduling in Data Center Networks

Tao Wang¹, Hong Xu², Fangming Liu¹

¹Huazhong University of Science and Technology
²NetX Lab @ City University of Hong Kong

August, 2017 @ APNet, Hong Kong

SLIDE 2

Why information-agnostic mix-flow scheduling?

SLIDE 5

Mix-flow in DCN

Hundreds of thousands of servers: Web Services, ML, Analytics, HPC, ……

  • Non-deadline flows
      • minimize flow completion time (FCT)
  • Deadline flows
      • minimize deadline miss ratio
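
The two objectives above can be made concrete with a small sketch. It assumes a hypothetical `FlowRecord` with start time, finish time, and an optional absolute deadline; none of these names come from the slides, and averaging FCT over non-deadline flows only is a choice made for this illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class FlowRecord:
    """One finished flow; all field names are hypothetical."""
    start: float               # time the flow started (seconds)
    finish: float              # time the flow completed (seconds)
    deadline: Optional[float]  # absolute deadline, or None for a non-deadline flow


def average_fct(flows: Sequence[FlowRecord]) -> float:
    """Mean flow completion time over the non-deadline flows."""
    fcts = [f.finish - f.start for f in flows if f.deadline is None]
    return sum(fcts) / len(fcts) if fcts else 0.0


def deadline_miss_ratio(flows: Sequence[FlowRecord]) -> float:
    """Fraction of deadline flows that finished after their deadline."""
    with_deadline = [f for f in flows if f.deadline is not None]
    missed = sum(1 for f in with_deadline if f.finish > f.deadline)
    return missed / len(with_deadline) if with_deadline else 0.0
```

A scheduler for mixed flows is judged on both numbers at once, which is why lowering one at the expense of the other does not count as an improvement.
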

SLIDE 7

Flow size is hard to obtain

  • Multi-stage job processing techniques (e.g., pipelining)
  • Real-time characteristics (e.g., streaming applications)

Hard to know flow sizes beforehand!

SLIDE 11

Existing solutions fall short

  • Deadline-unaware transport
      • TCP, DCTCP, etc.
      • Fail to meet deadlines for deadline flows [1-2]
  • Deadline-aware transport
      • D3, D2TCP, PDQ, pFabric, Karuna, etc.
      • Either impossible to deploy in DCN (PDQ, pFabric)
      • Or assume flow size is known (D3, D2TCP, Karuna)

[1] pFabric: Minimal Near-Optimal Datacenter Transport, SIGCOMM’13
[2] Scheduling Mix-flows in Commodity Datacenters with Karuna, SIGCOMM’16

SLIDE 14

Aemon

Maester Aemon was the blind maester at Castle Black in Game of Thrones

SLIDE 20

Aemon’s Design

Flows (w. deadline, w/o deadline)
  → UCP: Urgency-based Congestion Control (End-host)
  → 2LPS: Two-level PS
      • Priority Tagging (End-host): Prio 1, Prio 2, …, Prio 2K-1, Prio 2K
      • Priority Scheduling (Switch)

SLIDE 25

UCP Overview

  • DCTCP expression of network congestion

    $\alpha \leftarrow (1 - g) \cdot \alpha + g \cdot F$

  • Deadline flow’s urgency (non-deadline flow’s urgency is 1)

    $s = \dfrac{T_e}{T_d - T_e}$, where $T_d$ is the deadline and $T_e$ is the elapsed time

  • Congestion window modulation

    $cwnd = \begin{cases} cwnd \cdot (1 - \alpha^{s}/2), & \alpha^{s} > 0, \\ cwnd + 1, & \alpha^{s} = 0. \end{cases}$
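
The three rules on this slide can be sketched in a few lines. The sketch reads the penalty term as α raised to the power s. Three things in it are assumptions that the slide does not state: the default `g = 1/16` (a value commonly used with DCTCP), treating a flow at or past its deadline as infinitely urgent, and the guard for α = 0. All function names are hypothetical.

```python
from typing import Optional


def update_alpha(alpha: float, marked_fraction: float, g: float = 1 / 16) -> float:
    """DCTCP moving average: alpha <- (1 - g) * alpha + g * F."""
    return (1 - g) * alpha + g * marked_fraction


def urgency(elapsed: float, deadline: Optional[float]) -> float:
    """s = Te / (Td - Te) for a deadline flow, 1 for a non-deadline flow."""
    if deadline is None:
        return 1.0
    remaining = deadline - elapsed
    if remaining <= 0:
        # Assumption: at or past the deadline the flow is maximally urgent.
        return float("inf")
    return elapsed / remaining


def next_cwnd(cwnd: float, alpha: float, s: float) -> float:
    """Shrink cwnd by alpha**s / 2 under congestion, otherwise grow by one."""
    if alpha <= 0.0:
        # Guard (assumption): no congestion seen, so avoid evaluating 0 ** 0.
        return cwnd + 1
    penalty = alpha ** s
    if penalty > 0:
        return cwnd * (1 - penalty / 2)
    return cwnd + 1
```

With s = 1 the update reduces to the usual DCTCP cut of α/2, so non-deadline flows behave as they would under DCTCP.
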

SLIDE 27

UCP Rationale

  • Penalize low-urgency deadline flow
      • leave more bandwidth for non-deadline flows
  • Protect high-urgency deadline flow
      • meet deadlines

[Figure: window penalty versus urgency (i.e. s), with curves for flows w/o deadline, flows w/ deadline, and their difference]
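
As a worked illustration of the two bullets above, take a made-up congestion estimate α = 0.25 and read the window penalty as α^s / 2, following the cwnd rule on the UCP Overview slide. The value of α is an assumption chosen only to keep the arithmetic simple.

```latex
\begin{align*}
\text{penalty}(s) &= \frac{\alpha^{s}}{2}, \qquad \alpha = 0.25 \\
s = 0.5 \ \text{(low-urgency deadline flow)}: \quad & \alpha^{s} = 0.5, \quad \text{penalty} = 0.25 \\
s = 1 \ \text{(non-deadline flow)}: \quad & \alpha^{s} = 0.25, \quad \text{penalty} = 0.125 \\
s = 2 \ \text{(high-urgency deadline flow)}: \quad & \alpha^{s} = 0.0625, \quad \text{penalty} = 0.03125
\end{align*}
```

Because α lies between 0 and 1, a deadline flow with s < 1 is cut harder than a non-deadline flow, while one with s > 1 is cut more gently.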

SLIDE 32

2LPS Overview

  • Within the same type (Level-1)
      • A non-deadline flow is demoted in priority as it sends more bytes
      • A deadline flow is promoted in priority as its urgency increases
  • Within the same prio (Level-2)
      • Non-deadline flows are strictly prioritized over deadline flows

[Figure: logical view (high priority to low priority) and physical view (Prio 1 to Prio 2K), showing deadline and non-deadline flows]
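
One way to read the logical and physical views is that each of K logical levels is split into two physical queues, with the non-deadline queue placed first. The exact mapping is not spelled out on the slide, so the function below is an assumption, and its name and parameters are hypothetical.

```python
def physical_priority(logical_level: int, has_deadline: bool, k: int) -> int:
    """Map a logical level (1 = highest) to one of 2K physical priorities.

    Assumed layout: level j uses physical queue 2j - 1 for non-deadline
    flows and 2j for deadline flows, so non-deadline traffic is served
    first within the same logical level.
    """
    if not 1 <= logical_level <= k:
        raise ValueError("logical_level must be between 1 and k")
    return 2 * logical_level - (0 if has_deadline else 1)
```

For K = 4 this yields physical priorities 1 through 8, matching the Prio 1 to Prio 2K range in the physical view.
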

SLIDE 35

2LPS: Level-1 rationale

  • Within the same type (Level-1)
      • For non-deadline flows
          • PIAS[1]-like priority demotion to approximate shortest job first (SJF)
          • prioritize short flows
      • For deadline flows
          • Priority promotion scheme based on urgency
          • prioritize flows whose deadlines are approaching
  • Why not Earliest-Deadline-First (EDF) as the tagging option?
      • EDF is optimal when scheduling deadline flows
      • but it is over-aggressive in the mix-flow context
      • and the number of priority queues is limited, etc.

[1] Information-Agnostic Flow Scheduling for Commodity Data Centers, NSDI’15
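
The Level-1 rules can be sketched as two threshold lookups. The threshold values used in the usage note are invented for illustration, and the function names are hypothetical. The slides do not say how Aemon chooses its thresholds.

```python
from bisect import bisect_right
from typing import Sequence


def non_deadline_level(bytes_sent: int, demotion_thresholds: Sequence[int]) -> int:
    """PIAS-like demotion: start at level 1 (highest) and drop one level
    for every byte threshold the flow has reached."""
    return 1 + bisect_right(demotion_thresholds, bytes_sent)


def deadline_level(s: float, promotion_thresholds: Sequence[float]) -> int:
    """Urgency-based promotion: start at the lowest level K and climb one
    level for every urgency threshold the flow has reached."""
    k = len(promotion_thresholds) + 1
    return k - bisect_right(promotion_thresholds, s)
```

With hypothetical thresholds of 100 KB and 1 MB, a non-deadline flow that has sent 50 KB stays at level 1 and one that has sent 2 MB sits at level 3. With urgency thresholds 0.5 and 1.5, a deadline flow moves from level 3 to level 1 as s grows past 1.5. Both threshold lists must be sorted in ascending order.
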

SLIDE 39

2LPS: Level-2 rationale

  • Within the same prio (Level-2)
      • Protect (short) non-deadline flows from over-aggressive (long) deadline flows

[Figure: deadline and non-deadline flows in Prio 1 (high priority) and Prio 2 (low priority), with priority promotion]

SLIDE 43

2LPS: Level-2 rationale

  • Within the same prio (Level-2)
      • Protect (short) non-deadline flows from over-aggressive (long) deadline flows

[Figure: deadline and non-deadline flows in Prio 1 (high priority) and Prio 2 (low priority)]

(short) non-deadline flow is delayed!

SLIDE 44

How does Aemon perform?

SLIDE 45

Packet-level NS2 simulation

  • Spine-leaf fabric with 144 hosts
  • RTT: ~85.2μs (80μs at hosts)
  • Buffer size: 360KB per port
  • ECN thresholds: 65/250 packets for 10/40Gbps links
  • Workloads
      • Web Search (DCTCP paper), Data Mining (VL2 paper)

[Figure: spine-leaf topology with 9 racks, 10Gbps links, and 40Gbps links]
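
A little arithmetic helps put these parameters in context. The sketch below uses only the numbers on the slide and assumes that hosts are spread evenly across the racks. The bandwidth-delay product is a derived quantity and is not a figure reported in the talk.

```python
HOSTS = 144
RACKS = 9
RTT_NS = 85_200  # ~85.2 microseconds, expressed in nanoseconds


def hosts_per_rack() -> int:
    """Hosts per rack, assuming an even split (an assumption)."""
    return HOSTS // RACKS


def bdp_bytes(link_gbps: int) -> int:
    """Bandwidth-delay product: Gbps times ns gives bits, then convert to bytes."""
    bits = link_gbps * RTT_NS
    return bits // 8
```

Under these assumptions there are 16 hosts per rack, and one round trip at 10Gbps holds about 106.5 KB in flight, well below the 360KB per-port buffer.
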

SLIDE 47

Overall Average FCT

Web Search workload

  • Compared with PIAS
      • Aemon reduces average FCT by ~45.1%
  • UCP lowers non-deadline flows’ FCT
  • 2LPS also lowers non-deadline flows’ FCT

[Figure: average FCT (ms) versus load (0.75 to 0.9) for Aemon, Karuna, PIAS+DCTCP, PIAS+UCP, and 2LPS+DCTCP]

SLIDE 59

Deadline Miss Ratio

Data Mining workload

  • Aemon performs best among all the information-agnostic mechanisms
  • UCP slightly increases the miss ratio
      • since UCP cuts down cwnd too aggressively

[Figure: deadline miss ratio (%) versus load (0.75 to 0.9) for Aemon, Karuna, PIAS+DCTCP, PIAS+UCP, and 2LPS+DCTCP]

SLIDE 66

Key Takeaways

  • Aemon: information-agnostic mix-flow scheduling
      • combines end-host rate control (UCP) with in-network flow scheduling (2LPS)
  • UCP controls deadline flow’s cwnd based on urgency
  • 2LPS lowers non-deadline flow’s FCT

SLIDE 67

Thanks for your attention! Q & A