Backpressure Flow Control, by Prateesh Goyal, Preey Shah, Naveen Sharma, Kevin Zhao, Mohammad Alizadeh, Tom Anderson (PowerPoint PPT Presentation)



SLIDE 1

Backpressure Flow Control

Prateesh Goyal, Preey Shah, Naveen Sharma, Kevin Zhao, Mohammad Alizadeh, Tom Anderson

SLIDE 2

Two Types of Congestion Control

  • End-to-end: action delayed by at least one RTT

– Sources send initial window
– Adjust rate based on feedback
– Complex control loop: topology and signalling

  • Hop-by-hop: short control loops

– Sources send at line rate, pushback at switch
– Per-flow state and head-of-line blocking
– Widely used at server and rack level
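The contrast between the two control loops can be sketched in a few lines. This is an illustrative model only; the AIMD-style window update and the pause threshold below are assumed, simplified stand-ins, not the behavior of any specific protocol:

```python
def e2e_adjust(cwnd, congested, min_cwnd=1):
    """End-to-end: AIMD-style window update, applied only after
    feedback arrives -- i.e., delayed by at least one RTT."""
    if congested:
        return max(min_cwnd, cwnd // 2)   # multiplicative decrease
    return cwnd + 1                        # additive increase

def hop_by_hop_pause(queue_len, pause_threshold=8):
    """Hop-by-hop: each switch pauses its upstream neighbor as soon as
    the local queue crosses a threshold -- no RTT-long delay in the loop."""
    return queue_len >= pause_threshold

print(e2e_adjust(16, congested=True))    # -> 8
print(hop_by_hop_pause(10))              # -> True
```

The point of the contrast: the end-to-end loop cannot react faster than the feedback path, while the hop-by-hop decision is purely local.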

SLIDE 3

Why Now?

  • Data center bandwidth increasing rapidly

– Soon, most traffic will fit in a single round trip

  • Latency and tail latency a dominant concern

– Increasing percentage of RDMA

  • Traffic patterns are highly bursty

– Hard to control what isn’t stable

  • Network operational costs important

– E2E: lower utilization for same tail latency
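The claim that most traffic will soon fit in a single round trip follows from bandwidth-delay-product arithmetic; the link speeds and the 10 µs RTT below are assumed round numbers for illustration:

```python
def bdp_bytes(link_gbps, rtt_us):
    """Bandwidth-delay product: bytes a sender can have in flight in one RTT."""
    return int(link_gbps * 1e9 / 8 * rtt_us * 1e-6)

# At 100 Gbps with a 10 us RTT, one RTT holds 125 KB, so any flow
# smaller than that can complete within a single round trip --
# before end-to-end feedback can influence it at all.
print(bdp_bytes(100, 10))    # -> 125000
print(bdp_bytes(1000, 10))   # -> 1250000
```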

SLIDE 4

Switch Capacity Increasing

SLIDE 5

Buffering Matters to Tail Latency

DCQCN, 99% tail latency, Google workload, 75% util + incast

SLIDE 6

Elephants Are Mice

[Chart: traffic weighted by flow size, at 100 Gb vs. 1 Tb link speeds]

SLIDE 7

Backpressure Flow Control

  • Assumptions (Tofino-like switch)

– Limited number of egress queues (e.g., 32)
– Queues can be paused/unpaused
– Deficit round-robin among unpaused queues

  • Dynamic assignment of flows to queues
  • Per-hop pause frames, Bloom filter for flows

– Aggressive: push queueing upstream unless needed to keep egress busy

  • Switch state ∝ number of queued flows
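The per-port mechanism above can be sketched as a minimal Python model. This is a sketch under stated assumptions, not the switch-pipeline implementation: the dict-based flow table stands in for the paper's Bloom filter, and the queue count and pause threshold are illustrative:

```python
class BFCSwitchPort:
    """Toy model of one egress port: dynamic flow-to-queue assignment
    plus per-queue pause/resume toward the upstream hop."""

    def __init__(self, num_queues=32, pause_threshold=4):
        self.num_queues = num_queues
        self.pause_threshold = pause_threshold   # illustrative value
        self.queue_len = [0] * num_queues        # packets per egress queue
        self.flow_to_queue = {}                  # active flow -> queue id
        self.paused = [False] * num_queues       # pause state per queue

    def enqueue(self, flow_id):
        """Assign the flow to a queue on arrival, then decide whether
        to pause the upstream hop for that queue."""
        q = self.flow_to_queue.get(flow_id)
        if q is None:
            # Prefer an empty queue so flows don't share one
            # (avoids head-of-line blocking); fall back to hashing
            # when every queue is occupied.
            empties = [i for i, n in enumerate(self.queue_len) if n == 0]
            q = empties[0] if empties else hash(flow_id) % self.num_queues
            self.flow_to_queue[flow_id] = q
        self.queue_len[q] += 1
        if self.queue_len[q] >= self.pause_threshold:
            self.paused[q] = True    # send a per-hop pause frame upstream
        return q

    def dequeue(self, flow_id):
        q = self.flow_to_queue[flow_id]
        self.queue_len[q] -= 1
        if self.queue_len[q] < self.pause_threshold:
            self.paused[q] = False   # resume the upstream hop
        if self.queue_len[q] == 0:
            # Drop the mapping: state grows with the number of
            # currently queued flows, not all flows ever seen.
            del self.flow_to_queue[flow_id]
        return q
```

Note how pausing is per queue, not per port: only the queue that filled up pushes its backlog upstream, so other flows on the same port keep draining.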
SLIDE 8

Buffer Occupancy

BFC buffers ∝ number of queued flows

SLIDE 9

Tail Latency

99%, Google workload, 60% util + incast

SLIDE 10

Cross-Data Center Traffic

99% tail latency, intra-DC traffic in presence of cross-DC traffic

SLIDE 11

Network Cut Theorem

For today’s bursty traffic patterns and flow sizes, end-to-end congestion control cannot provide all of (choose at most 2):

  • C: High link capacity
  • U: Efficient link utilization
  • T: Low tail latency

Hop-by-hop flow control can provide all three.

SLIDE 12

Backup

SLIDE 13

Faster Links Harder to Control

DCQCN, Google workload, 75% util + incast