
Aemon: Information-agnostic Mix-flow Scheduling in Data Center Networks

Tao Wang¹, Hong Xu², Fangming Liu¹
¹ Huazhong University of Science and Technology
² NetX Lab @ City University of Hong Kong
August 2017 @ APNet, Hong Kong


  1. Aemon: Information-agnostic Mix-flow Scheduling in Data Center Networks. Tao Wang¹, Hong Xu², Fangming Liu¹ (¹ Huazhong University of Science and Technology, ² NetX Lab @ City University of Hong Kong). August 2017 @ APNet, Hong Kong

  2. Why information-agnostic mix-flow scheduling?

  5. Mix-flow in DCN
     Hundreds of thousands of servers run Web Services, ML, Analytics, and HPC workloads.
     ‣ Non-deadline flows: minimize flow completion time (FCT)
     ‣ Deadline flows: minimize deadline miss ratio

  7. Flow size is hard to obtain
     ‣ Multi-stage job processing techniques (e.g. pipelining)
     ‣ Real-time characteristics (e.g. streaming applications)
     Hard to know flow sizes beforehand!

  11. Existing solutions fall short
      ‣ Deadline-unaware transport
        ‣ TCP, DCTCP, etc.
        ‣ Fail to meet deadlines for deadline flows [1-2]
      ‣ Deadline-aware transport
        ‣ D³, D²TCP, PDQ, pFabric, Karuna, etc.
        ‣ Either impossible to deploy in DCN (PDQ, pFabric)
        ‣ Or assume flow size is known (D³, D²TCP, Karuna)
      [1] pFabric: Minimal Near-Optimal Datacenter Transport, SIGCOMM’13
      [2] Scheduling Mix-flows in Commodity Datacenters with Karuna, SIGCOMM’16

  14. Aemon: Maester Aemon was the blind maester at Castle Black in Game of Thrones

  20. Aemon’s Design
      [Diagram: flows with (w.) and without (w/o) deadlines pass through two components]
      ‣ UCP: Urgency-based Congestion Control (end-host)
      ‣ 2LPS: Two-level PS
        ‣ Priority Tagging (end-host)
        ‣ Priority Scheduling (switch), with queues Prio 1, Prio 2, …, Prio 2K-1, Prio 2K

  25. UCP Overview
      ‣ DCTCP expression of network congestion: $\alpha \leftarrow (1 - g) \cdot \alpha + g \cdot F$
      ‣ Deadline flow’s urgency (non-deadline flow’s urgency is 1): $s = \frac{T_e}{T_d - T_e}$, where $T_d$ is the deadline and $T_e$ is the elapsed time
      ‣ Congestion window modulation:
        $cwnd = \begin{cases} cwnd \cdot (1 - \alpha^{s}/2), & \alpha^{s} > 0 \\ cwnd + 1, & \alpha^{s} = 0 \end{cases}$
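The three UCP rules on this slide (the DCTCP congestion estimate α, the urgency s, and the window update) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the default gain `g = 1/16` is a value commonly used with DCTCP rather than one given on the slide, treating α = 0 as the "no congestion" branch is a reading of the case α^s = 0, and returning infinite urgency once the deadline has passed is an assumption.

```python
import math

def update_alpha(alpha, marked_fraction, g=1 / 16):
    """DCTCP-style estimate: alpha <- (1 - g) * alpha + g * F.

    g = 1/16 is an assumed default, not a value from the slides.
    """
    return (1 - g) * alpha + g * marked_fraction

def urgency(elapsed, deadline=None):
    """Urgency s = T_e / (T_d - T_e); non-deadline flows use s = 1."""
    if deadline is None:
        return 1.0
    remaining = deadline - elapsed
    if remaining <= 0:
        return math.inf  # assumption: a missed deadline is maximally urgent
    return elapsed / remaining

def next_cwnd(cwnd, alpha, s):
    """Window modulation: cwnd * (1 - alpha^s / 2) if congested, else cwnd + 1."""
    if alpha == 0:  # assumption: alpha^s = 0 is read as "no congestion seen"
        return cwnd + 1
    penalty = alpha ** s
    if penalty > 0:
        return cwnd * (1 - penalty / 2)
    return cwnd + 1

# Same congestion level, different urgency:
print(next_cwnd(10, 0.25, 0.5))  # low urgency, larger cut
print(next_cwnd(10, 0.25, 1.0))  # non-deadline flow, DCTCP-like cut
print(next_cwnd(10, 0.25, 2.0))  # high urgency, smaller cut
```

With α between 0 and 1, a larger exponent s makes α^s smaller, so an urgent deadline flow backs off less than a non-deadline flow at the same congestion level.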

  27. UCP Rationale
      ‣ Penalize low-urgency deadline flows: leave more bandwidth for non-deadline flows
      ‣ Protect high-urgency deadline flows: meet deadlines
      [Figure: window penalty versus urgency s (0 to 2) for flows without a deadline (w/o ddl), flows with a deadline (w/ ddl), and their difference (diff)]
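A small numeric check illustrates the rationale. The sketch below compares the term α^s for a deadline flow against the s = 1 case used by non-deadline flows. The helper names and the sample value α = 0.25 are assumptions chosen for illustration, and the quantity printed here is not necessarily the exact one plotted on the slide.

```python
def penalty_term(alpha, s):
    """The alpha^s term that scales the window cut."""
    return alpha ** s

def penalty_gap(alpha, s):
    """Penalty of a deadline flow with urgency s minus that of a non-deadline flow (s = 1)."""
    return penalty_term(alpha, s) - penalty_term(alpha, 1.0)

ALPHA = 0.25  # assumed congestion level, for illustration only
for s in (0.5, 1.0, 2.0):
    print(f"s={s}: term={penalty_term(ALPHA, s)}, gap={penalty_gap(ALPHA, s)}")
```

A positive gap (s below 1) means the deadline flow yields more than a non-deadline flow would; a negative gap (s above 1) means it is protected.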

  32. 2LPS Overview
      ‣ Within the same type (Level-1)
        ‣ A non-deadline flow demotes its priority as it sends more bytes
        ‣ A deadline flow promotes its priority as its urgency increases
      ‣ Within the same priority (Level-2)
        ‣ Non-deadline flows are strictly prioritized
      [Figure: logical view (high to low priority) mapped to the physical view (Prio 1, …, Prio 2K), showing deadline and non-deadline flows]
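The two levels can be sketched as an end-host tagging function. Everything concrete below is an assumption made for illustration: the threshold values, the function names, and in particular the mapping of logical level k to physical queues 2k-1 (non-deadline) and 2k (deadline), which is one way to realize "non-deadline flows are strictly prioritized" within the same priority using 2K queues.

```python
from bisect import bisect_right

# Hypothetical thresholds for K = 3 logical levels (not from the slides).
DEMOTION_BYTES = [100_000, 1_000_000]  # non-deadline: bytes sent
PROMOTION_URGENCY = [0.5, 1.5]         # deadline: urgency s

def level_non_deadline(bytes_sent, thresholds=DEMOTION_BYTES):
    """Level-1 for non-deadline flows: start at level 1, demote as bytes accumulate."""
    return bisect_right(thresholds, bytes_sent) + 1

def level_deadline(s, thresholds=PROMOTION_URGENCY):
    """Level-1 for deadline flows: start at the lowest level, promote as urgency grows."""
    lowest = len(thresholds) + 1
    return lowest - bisect_right(thresholds, s)

def physical_priority(level, has_deadline):
    """Level-2: within a logical level, the non-deadline queue comes first (assumed mapping)."""
    return 2 * level if has_deadline else 2 * level - 1

print(physical_priority(level_non_deadline(0), has_deadline=False))         # new short flow
print(physical_priority(level_deadline(0.1), has_deadline=True))            # deadline far away
print(physical_priority(level_deadline(2.0), has_deadline=True))            # deadline close
```

Smaller numbers mean higher priority, matching the Prio 1 to Prio 2K labels on the slide.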

  35. 2LPS: Level-1 rationale
      ‣ Within the same type (Level-1)
        ‣ For non-deadline flows: PIAS [1]-like priority demotion to approximate shortest job first (SJF), which prioritizes short flows
        ‣ For deadline flows: a priority promotion scheme based on urgency, which prioritizes flows whose deadlines are approaching
      ‣ Why not Earliest-Deadline-First (EDF) as the tagging option?
        ‣ EDF is optimal when scheduling deadline flows
        ‣ but it is over-aggressive in the mix-flow context
        ‣ and the number of priority queues is limited, etc.
      [1] Information-Agnostic Flow Scheduling for Commodity Data Centers, NSDI’15

  39. 2LPS: Level-2 rationale
      ‣ Within the same priority (Level-2)
        ‣ Protect (short) non-deadline flows from over-aggressive (long) deadline flows
      [Figure: deadline and non-deadline flows in queues Prio 1 (high priority) and Prio 2 (low priority); annotation: Priority Promotion]

  43. 2LPS: Level-2 rationale
      ‣ Within the same priority (Level-2)
        ‣ Protect (short) non-deadline flows from over-aggressive (long) deadline flows
      [Figure: deadline and non-deadline flows in queues Prio 1 (high priority) and Prio 2 (low priority); annotation: the (short) non-deadline flow is delayed!]
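The switch-side effect of Level-2 can be illustrated with a toy strict-priority queue. This is a minimal sketch, not a model of any real switch: the class name is hypothetical, and it assumes that within one logical level the non-deadline flow is tagged with the smaller physical priority number (here 1) and the deadline flow with the next one (here 2).

```python
import heapq
from itertools import count

class StrictPrioritySwitch:
    """Toy output port: always serves the lowest-numbered non-empty priority first."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # preserves FIFO order inside one priority

    def enqueue(self, physical_prio, packet):
        heapq.heappush(self._heap, (physical_prio, next(self._seq), packet))

    def dequeue(self):
        return heapq.heappop(self._heap)[2]

port = StrictPrioritySwitch()
# A long deadline flow has already queued several packets (assumed tag: 2).
for i in range(3):
    port.enqueue(2, f"deadline-pkt-{i}")
# A short non-deadline flow arrives later at the same logical level (assumed tag: 1).
port.enqueue(1, "non-deadline-pkt-0")

order = [port.dequeue() for _ in range(4)]
print(order)
```

Because the non-deadline packet sits in a strictly higher physical queue, it leaves first even though it arrived last, so it is not stuck behind the backlog of the long deadline flow.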

  44. How does Aemon perform?

  45. Packet-level NS2 simulation
      [Topology figure: 9 racks; 10Gbps and 40Gbps links]
      ‣ Spine-leaf fabric with 144 hosts
      ‣ RTT: ~85.2 μs (80 μs at hosts)
      ‣ Buffer size: 360KB per port
      ‣ ECN thresholds: 65/250 packets for 10/40Gbps links
      ‣ Workloads: Web Search (DCTCP paper), Data Mining (VL2 paper)
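For a sense of scale, the setup numbers can be turned into a bandwidth-delay product (BDP). The sketch below uses integer arithmetic on the stated RTT and link speeds; the 1500-byte packet size is an assumption (the slide does not give one), and nothing here claims to be how the ECN thresholds were chosen.

```python
HOSTS, RACKS = 144, 9
RTT_NS = 85_200    # ~85.2 microseconds, from the slide
PKT_BYTES = 1500   # assumed packet size, not stated on the slide

def bdp_bytes(link_bps, rtt_ns=RTT_NS):
    """Bandwidth-delay product in bytes: rate * RTT / 8."""
    return link_bps * rtt_ns // (8 * 10**9)

hosts_per_rack = HOSTS // RACKS
bdp_10g = bdp_bytes(10 * 10**9)
bdp_40g = bdp_bytes(40 * 10**9)

print(hosts_per_rack)
print(bdp_10g, bdp_10g // PKT_BYTES)
print(bdp_40g, bdp_40g // PKT_BYTES)
```

Under the assumed packet size, these packet counts can be set side by side with the 65 and 250 packet ECN thresholds listed for the 10Gbps and 40Gbps links.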

  47. Overall Average FCT (Web Search workload)
      ‣ Compared with PIAS
        ‣ Aemon reduces average FCT by ~45.1%
        ‣ UCP lowers non-deadline flows’ FCT
        ‣ 2LPS also lowers non-deadline flows’ FCT
      [Figure: average FCT (ms) versus load (0.75 to 0.9) for Aemon, Karuna, PIAS+DCTCP, PIAS+UCP, and 2LPS+DCTCP]
