Congestion Management for Ethernet-based Lossless Data Center Networks



slide-1
SLIDE 1

Congestion Management for Ethernet-based Lossless Data Center Networks

Pedro Javier Garcia (1), Jesus Escudero-Sahuquillo (1), Francisco J. Quiles (1) and Jose Duato (2)

1: University of Castilla-La Mancha (UCLM) 2: Technical University València (UPV)

DCN: 1-19-0012-00-ICne NENDICA

slide-2
SLIDE 2

Abstract

This paper describes congestion phenomena in lossless data center networks and their negative consequences. It explores proposed solutions, analyzing their pros and cons to determine which are suited to the requirements of modern data centers. The conclusions identify important issues that should be addressed in the future.

slide-3
SLIDE 3

Agenda

  • Introduction
  • Congestion Dynamics in DCNs
  • Reducing In-Network and Incast Congestion
  • Combining Congestion Management Mechanisms
  • Conclusions


slide-5
SLIDE 5

Introduction

On-Line Data Intensive (OLDI) Services [Congdon18]

  • Require immediate answers to requests that are coming in at a high rate.
  • End-user experience is highly dependent upon the system responsiveness.
  • The network becomes a significant component of overall DC latency when congestion occurs in the network.

[Figure: a request fans out through a tree of aggregators to workers; the levels are labeled with deadlines of 10 ms, 50 ms and 250 ms.]

slide-6
SLIDE 6
Introduction

Data-Center Networks (DCNs)

  • Today's DCNs require a flexible fabric that carries, in a converged way, traffic from different types of applications, storage or control.
  • Latency is a concern: fabric design for DCNs must minimize or eliminate packet loss, provide high throughput and maintain low latency.
  • These goals are crucial for OLDI applications, Deep Learning, NVMe over Fabrics and Cloudified Central Offices.
  • However, congestion threatens these applications.

slide-7
SLIDE 7
Introduction

Why is congestion isolation needed?

  • HoL blocking dramatically degrades network performance (e.g. PFC lacks sufficient granularity and does not identify congested flows) [Garcia05].
  • Classical e2e congestion control for lossless networks is difficult to tune, reacts slowly, and may introduce oscillations and instability [Escudero11].

[Figure: normalized network throughput vs. time (nanoseconds) for 1Q, ITh and VOQnet in a 64-node CLOS network with 4 hot spots. HS = traffic injected to the hot-spot destination; markers show where HS starts and ends.]

slide-8
SLIDE 8

Introduction

Why is congestion isolation needed?

[Figure: a network of nine switches (Sw. 1 to Sw. 9) with sources A to E and destinations X, Y and Z; links are labeled with loads of 33%, 66% and 100%. Two panels, "Low-Order HoL-blocking" and "High-Order HoL-blocking", each show flows split as 33% sending, 33% stopped, 33% sending. Legend: congested flows (Dst. X), non-congested flows (Dst. Y), non-congested flows (Dst. Z).]
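
The HoL blocking shown on this slide can be illustrated with a minimal sketch. It rests on assumptions that are not in the slides: a single input port, at most one packet leaving per cycle, and a hypothetical `blocked` set naming the destinations whose path is currently stalled. It compares one shared FIFO against one queue per destination.

```python
from collections import deque

def delivered_single_fifo(packets, blocked, cycles):
    """One FIFO: only the head packet may leave, so a stalled head blocks all."""
    queue = deque(packets)
    delivered = []
    for _ in range(cycles):
        if queue and queue[0] not in blocked:
            delivered.append(queue.popleft())
    return delivered

def delivered_per_destination(packets, blocked, cycles):
    """One queue per destination: a stalled destination only blocks itself."""
    queues = {}
    for dst in packets:
        queues.setdefault(dst, deque()).append(dst)
    delivered = []
    for _ in range(cycles):
        for dst, q in queues.items():
            if q and dst not in blocked:
                delivered.append(q.popleft())
                break  # at most one packet leaves the port per cycle
    return delivered

packets = ["X", "Y", "Z", "Y", "Z"]  # the head packet goes to congested Dst. X
blocked = {"X"}                      # assumption: the path to X is stalled
print(delivered_single_fifo(packets, blocked, cycles=4))      # []
print(delivered_per_destination(packets, blocked, cycles=4))  # ['Y', 'Y', 'Z', 'Z']
```

With the shared FIFO, the packets for Y and Z never move although their paths are free. That is the effect a congestion isolation mechanism aims to remove without needing a queue for every destination.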

slide-9
SLIDE 9
Introduction

Why is congestion isolation needed?

  • We need a congestion isolation (CI) mechanism that reacts quickly when transient congestion situations appear, preventing the network performance degradation caused by HoL blocking.
  • We want a CI mechanism that complements other technologies available in DCNs, so that CI improves their performance while they reduce the complexity of CI.

slide-10
SLIDE 10

Agenda

  • Introduction
  • Congestion Dynamics in DCNs
  • Reducing In-Network and Incast Congestion
  • Combining Congestion Management Mechanisms
  • Conclusions

slide-11
SLIDE 11

Congestion Dynamics in DCNs

Appearance of Congestion

[Figure: four scenarios with switch speedup = 1, 2, 2 and 1.5. In each one the injection rate is 100% of the link bandwidth (full rate), and congestion points are labeled with the time at which they appear (t0 or t0+T).]
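
A rough fluid model can show why the speedup changes where congestion first appears. This is a sketch under assumptions that are not taken from the slides: `n_inputs` ports inject at full link rate toward the same output, the switch fabric can move at most `speedup` link rates of traffic toward that output, and the output link drains at rate 1. Rates are expressed in units of the link bandwidth.

```python
def queue_growth(n_inputs, speedup):
    """Return (input_side_growth, output_side_growth) in units of link rate."""
    offered = float(n_inputs)                # all sources inject at full rate
    crossed = min(offered, float(speedup))   # traffic the fabric can move to the output
    output_growth = max(crossed - 1.0, 0.0)  # the output link drains at rate 1
    input_growth = offered - crossed         # the remainder backs up at the inputs
    return input_growth, output_growth

for s in (1, 1.5, 2):
    print(s, queue_growth(n_inputs=2, speedup=s))
# 1   (1.0, 0.0)  queues build at the input side only
# 1.5 (0.5, 0.5)  queues build on both sides
# 2   (0.0, 1.0)  queues build at the output side first
```

In this simplified model, a speedup of 1 places congestion at the input side immediately, while a higher speedup lets the output buffer fill first. Input-side congestion would then follow only once that buffer is full, which is one plausible reading of the t0 and t0+T labels.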

slide-12
SLIDE 12

Congestion Dynamics in DCNs

Growth of Congestion Trees (from root to leaves)

[Figure: Switches 1 to 5, switch speedup = 1.5. Legend: packet flows, congestion point.]

slide-13
SLIDE 13

Congestion Dynamics in DCNs

Growth of Congestion Trees (from leaves to root)

[Figure: Switches 1 to 7, switch speedup = 1.5. Legend: packet flows, congestion point.]

slide-14
SLIDE 14

Congestion Dynamics in DCNs

Growth of Congestion Trees (roots movement)

[Figure: Switches 1 to 3 shown in two states, switch speedup = 1.5. Legend: packet flows (start), packet flows (after), congestion point.]

slide-15
SLIDE 15

Congestion Dynamics in DCNs

Growth of Congestion Trees (in-network roots)

[Figure: Switches 1 to 8 and destinations X and Y, switch speedup = 1.5. Legend: packet flows addressed to X, packet flows addressed to Y, congestion point.]

slide-16
SLIDE 16

Congestion Dynamics in DCNs

Growth of Congestion Trees (overlapping)

[Figure: Switches 1 to 9 and destinations X and Y, switch speedup = 1.5. Legend: packet flows addressed to X, packet flows addressed to Y, congestion point.]

slide-17
SLIDE 17

Congestion Dynamics in DCNs

Growth of Congestion Trees (vanishing)

[Figure: Switches 1 to 3 shown in two states, switch speedup = 1.5. Legend: permanent packet flows, packet flows disappearing first, congestion point first appeared in the switch.]

slide-18
SLIDE 18

Agenda

  • Introduction
  • Congestion Dynamics in DCNs
  • Reducing In-Network and Incast Congestion
  • Combining Congestion Management Mechanisms
  • Conclusions

slide-19
SLIDE 19

Reducing Congestion

Incast congestion reduction - ECMP
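
As a reminder of how ECMP spreads load, here is a minimal sketch of hash-based next-hop selection. The flow tuple, the uplink names and the use of CRC32 are assumptions made for illustration; real switches use their own hash functions and field selections.

```python
import zlib

def ecmp_next_hop(flow, next_hops):
    """Pick one of several equal-cost next hops from a hash of the flow 5-tuple."""
    key = "|".join(str(field) for field in flow).encode()
    return next_hops[zlib.crc32(key) % len(next_hops)]

# Hypothetical flow: source IP, destination IP, protocol, source port, destination port.
flow = ("10.0.0.1", "10.0.1.9", 6, 49152, 4791)
uplinks = ["spine-1", "spine-2", "spine-3", "spine-4"]  # hypothetical names
print(ecmp_next_hop(flow, uplinks))
```

Because the choice depends only on the header hash, every packet of a flow follows the same path, and two heavy flows can still hash onto the same uplink regardless of its current load.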

slide-20
SLIDE 20

Reducing Congestion

In-network congestion reduction - ECN

[Figure: Switches 1 to 9 and destinations X and Y, switch speedup = 1.5. Legend: packet flows addressed to X, packet flows addressed to Y, victim flow, congestion point.]
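
The closed-loop behavior of ECN can be sketched as follows. This is a simplified illustration and not the algorithm of any particular standard or product: the marking threshold, the decrease factor `cut` and the increase `step` are made-up values, and the receiver's echo of the mark is collapsed into a single boolean.

```python
def mark_ce(queue_depth, threshold):
    """Switch side: set Congestion Experienced when the egress queue is too deep."""
    return queue_depth > threshold

def adjust_rate(rate, ce_echoed, line_rate, cut=0.5, step=0.05):
    """Sender side: multiplicative decrease on an echoed mark, additive increase otherwise."""
    if ce_echoed:
        return rate * cut
    return min(rate + step * line_rate, line_rate)

rate = 1.0  # sending rate as a fraction of the line rate
for depth in (10, 40, 80, 30, 5):  # made-up egress queue depths, in packets
    rate = adjust_rate(rate, mark_ce(depth, threshold=32), line_rate=1.0)
    print(depth, round(rate, 3))
```

Each adjustment takes effect only after the mark has travelled to the receiver and back to the sender, so the reaction lags the congestion by at least one round trip.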

slide-21
SLIDE 21
Reducing Congestion

Limitations of current technologies [Escudero19]

  • These technologies may work together to eliminate loss in the cloud data center network.
  • Load balancing and destination scheduling are end-to-end solutions, so they incur RTT delays when congestion appears.
  • However, there is no time for loss in the network due to congestion, and congestion trees grow very quickly.
  • Transient congestion may still produce HoL blocking that leads to increased latency, lower throughput and buffer overflow, significantly degrading performance.
  • Even with these mechanisms, we still need something that deals with HoL blocking locally and quickly.
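
A back-of-the-envelope calculation shows why an end-to-end reaction can arrive too late. All figures below are assumptions chosen for illustration, not measurements from the slides: a 100 KB port buffer, two 100 Gb/s senders converging on one 100 Gb/s output, and a 20 microsecond round-trip time.

```python
def fill_time_us(buffer_bytes, offered_gbps, drain_gbps):
    """Microseconds until the buffer is full; None if it never fills."""
    excess_gbps = offered_gbps - drain_gbps
    if excess_gbps <= 0:
        return None
    return buffer_bytes * 8 / (excess_gbps * 1e3)  # 1 Gb/s = 1e3 bits per microsecond

t = fill_time_us(buffer_bytes=100_000, offered_gbps=200, drain_gbps=100)
rtt_us = 20  # assumed round-trip time
print(t, t < rtt_us)  # 8.0 True
```

Under these assumptions the buffer fills in 8 microseconds, well before any RTT-bound mechanism can slow the sources. This is the gap that a local, fast reaction is meant to cover.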

slide-22
SLIDE 22

Agenda

  • Introduction
  • Congestion Dynamics in DCNs
  • Reducing In-Network and Incast Congestion
  • Combining Congestion Management Mechanisms
  • Conclusions

slide-23
SLIDE 23

Combining Congestion Management Mechanisms

  • CI is needed to react locally and very quickly, eliminating HoL blocking immediately.
  • The previous technologies reduce the use of PFC and ECN, but their closed- and open-loop approaches still introduce delays.
  • Congestion trees appear suddenly, are difficult to predict (even more so when load balancing is applied) and grow quickly.
  • New techniques can be applied in combination with the previous technologies, improving their behavior.

slide-24
SLIDE 24

Combining Congestion Management Mechanisms

Dynamic Virtual Lanes (DVL)

[Figure: Switch A and Switch B with ports labeled P1 to P4, each port holding a CFQ and an nCFQ; a congestion root is marked and a CIP is sent between the switches. Legend: output port requested by the packet on top; congestion root; Congestion Isolation Packets (CIP); packets from congested flows; packets from non-congested flows.]
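
The queue arrangement in the DVL figure can be sketched as below. This is an illustration built on assumptions, not the mechanism as specified: CFQ and nCFQ are read here as a congested-flow queue and a non-congested-flow queue, the detection rule (blame the flow with the most packets in the nCFQ once it exceeds a threshold) is hypothetical, and sending a CIP upstream is reduced to recording the flow id.

```python
from collections import Counter, deque

class IsolatingPort:
    """Egress port with a non-congested flow queue (nCFQ) and a congested flow queue (CFQ)."""

    def __init__(self, threshold):
        self.threshold = threshold    # hypothetical detection threshold, in packets
        self.ncfq = deque()
        self.cfq = deque()
        self.occupancy = Counter()    # packets per flow currently in the nCFQ
        self.congested_flows = set()
        self.cips_sent = []           # flow ids that would be announced upstream

    def enqueue(self, flow_id, packet):
        if flow_id in self.congested_flows:
            self.cfq.append((flow_id, packet))   # isolate packets of congested flows
            return
        self.ncfq.append((flow_id, packet))
        self.occupancy[flow_id] += 1
        if len(self.ncfq) > self.threshold:
            # Assumed rule: the flow holding most of the nCFQ is the congested one.
            culprit = self.occupancy.most_common(1)[0][0]
            if culprit not in self.congested_flows:
                self.congested_flows.add(culprit)
                self.cips_sent.append(culprit)   # stands in for a CIP to the upstream switch

    def dequeue(self):
        """Serve the nCFQ first, which also keeps each flow's packets in order."""
        if self.ncfq:
            flow_id, packet = self.ncfq.popleft()
            self.occupancy[flow_id] -= 1
            return flow_id, packet
        if self.cfq:
            return self.cfq.popleft()
        return None

port = IsolatingPort(threshold=2)
for i in range(4):
    port.enqueue("to-X", i)   # a burst toward the congested destination
port.enqueue("to-Y", 0)       # a non-congested flow arrives afterwards
print(port.cips_sent, len(port.ncfq), len(port.cfq))  # ['to-X'] 4 1
```

Once "to-X" is flagged, its later packets wait in the CFQ and no longer queue ahead of traffic for Y. The recorded CIP is what would let the upstream switch apply the same isolation one hop earlier.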

slide-25
SLIDE 25

Agenda

  • Introduction
  • Congestion Dynamics in DCNs
  • Reducing In-Network and Incast Congestion
  • Combining Congestion Management Mechanisms
  • Conclusions

slide-26
SLIDE 26

References

[Duato03] J. Duato, S. Yalamanchili, and L. M. Ni, Interconnection Networks: An Engineering Approach. San Francisco, CA, USA: Morgan Kaufmann Publishers, 2003.

[Garcia05] P. J. Garcia, J. Flich, J. Duato, I. Johnson, F. J. Quiles, and F. Naven, “Dynamic Evolution of Congestion Trees: Analysis and Impact on Switch Architecture,” in High Performance Embedded Architectures and Compilers, ser. Lecture Notes in Computer Science. Springer, Berlin, Heidelberg, Nov. 2005, pp. 266–285.

[Congdon18] P. Congdon, “IEEE 802 Nendica Report: The Lossless Network for Data Centers,” IEEE-SA Industry Connections White Paper, Aug. 2018.

[Leiserson85] C. E. Leiserson, “Fat-trees: Universal networks for hardware-efficient supercomputing,” IEEE Transactions on Computers, vol. C-34, pp. 892–901, Oct. 1985.

[Escudero11] J. Escudero-Sahuquillo, E. G. Gran, P. J. García, J. Flich, T. Skeie, O. Lysne, F. J. Quiles, and J. Duato, “Combining Congested-Flow Isolation and Injection Throttling in HPC Interconnection Networks,” ICPP 2011, pp. 662–672.

[Escudero19] J. Escudero-Sahuquillo, P. J. García, F. J. Quiles, and J. Duato, “P802.1Qcz interworking with other data center technologies,” IEEE 802.1 Plenary Meeting, San Diego, CA, USA, July 8, 2018 (cz-escudero-sahuquillo-ci-internetworking-0718-v1.pdf).