RESTRUCTURING ENDPOINT CONGESTION CONTROL
Akshay Narayan, Frank Cangialosi, Deepti Raghavan, Prateesh Goyal, Srinivas Narayana, Radhika Mittal, Mohammad Alizadeh, Hari Balakrishnan
CONGESTION CONTROL
[Diagram labels: APPLICATION, OS TCP, NETWORK, with Data passing between the layers]
DATAPATHS
[Diagram labels: APPLICATION, NETWORK, with Data passing through one of the datapaths: OS TCP, mTCP, DPDK, QUIC, DCCP, UDP, SmartNIC, FPGA, RDMA]
NEW ALGORITHMS
[Timeline: 1987 to 2018, with 1998, 2001, and 2010 marked]
Tahoe, Reno, Vegas, Westwood, Compound, PRR, BIC, H-TCP, Cubic, FAST, EBCC, Binomial, LEDBAT, Veno, BBR, PCC, Remy, Sprout, DCTCP, Illinois, NV, Hybla, TIMELY, ABC, NewReno, Copa, RC3, XCP, RCP, DCQCN, Vivace, Indigo, Nimbus
ALGORITHM COMPLEXITY
Sprout (NSDI 2013): Bayesian forecasting
Remy (SIGCOMM 2013): Offline learning
PCC / PCC Vivace (NSDI 2015 / NSDI 2018): Online learning
Indigo (USENIX ATC 2018): Reinforcement learning
CROSS PRODUCT OF SADNESS
Algorithms: Vegas, Westwood, Compound, PRR, BIC, H-TCP, Cubic, FAST, EBCC, Binomial, LEDBAT, Veno, BBR, PCC, Remy, Sprout, DCTCP, Illinois, NV, Hybla, TIMELY, ABC, NewReno, Copa, RC3, XCP, RCP, DCQCN, Tahoe, Reno, Vivace, Indigo, Nimbus
Datapaths: mTCP, DPDK, QUIC, DCCP, Userspace, SmartNIC, FPGA, RDMA, OS TCP
NEW CAPABILITIES
[Diagram labels: INDEPENDENT CC (APPLICATION with several separate CC instances); AGGREGATE CC (APPLICATION with one CC)]
CURRENT DESIGN
[Diagram labels: Application, Datapath (TX, RX, Datapath State, Congestion Control), NIC]
CONGESTION CONTROL PLANE
[Diagram labels: Application, Datapath (TX, RX, Datapath State, CCP Datapath), CCP Agent, NIC]
Write-once, run-anywhere
Sophisticated algorithms
New capabilities
LATENCY TRADEOFF
Latency (< 30 µs)
[Diagram labels: Application, Datapath (TX, RX, Datapath State, CCP Datapath), CCP Agent, NIC]
Write-once, run-anywhere
Sophisticated algorithms
New capabilities
SPLIT IMPLEMENTATION
[Diagram labels: Application, Datapath (TX, RX, Datapath State, CCP Datapath), CCP Agent, NIC; CWND, RATE, Statistics]
Asynchronous CC logic
Synchronous measurement gathering
Split CC performs similarly to datapath-native
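The split between synchronous measurement gathering and asynchronous CC logic can be sketched roughly as follows. This is a minimal illustration, not the CCP implementation: the class names, the report format, and the agent's window policy are all hypothetical, and a direct function call stands in for whatever IPC a real datapath and agent would use.

```python
class Datapath:
    """Fast path (hypothetical): runs on every ACK and only updates counters."""

    def __init__(self, cwnd_bytes: int, report_interval_s: float) -> None:
        self.cwnd_bytes = cwnd_bytes
        self.report_interval_s = report_interval_s
        self.acked_bytes = 0
        self.last_report_s = 0.0

    def on_ack(self, now_s: float, bytes_acked: int):
        """Fold one ACK into the counters; return a report once per interval."""
        self.acked_bytes += bytes_acked
        if now_s - self.last_report_s < self.report_interval_s:
            return None
        report = {"acked_bytes": self.acked_bytes}
        self.acked_bytes = 0
        self.last_report_s = now_s
        return report


class Agent:
    """Slow path (hypothetical): runs only when a report arrives."""

    def on_report(self, datapath: Datapath, report: dict) -> None:
        # Placeholder policy, assumed for illustration only:
        # grow the window by the number of bytes acknowledged.
        datapath.cwnd_bytes += report["acked_bytes"]


if __name__ == "__main__":
    dp = Datapath(cwnd_bytes=10_000, report_interval_s=0.01)
    agent = Agent()
    for now_s in (0.002, 0.006, 0.012):
        report = dp.on_ack(now_s, 1500)
        if report is not None:
            agent.on_report(dp, report)
    print(dp.cwnd_bytes)
```

The point of the structure is that the per-ACK work stays constant and cheap, while the algorithm itself runs off the critical path at whatever reporting interval the agent chooses.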
SPLIT IMPLEMENTATION
[Diagram labels: Application, Datapath (TX, RX, Datapath State, CCP Datapath), CCP Agent, NIC; CWND, RATE, Statistics]
Standardized Datapath Interface
MEASUREMENT PRIMITIVES
▸ Measurement timestamp
▸ In-order acked bytes
▸ Out-of-order acked bytes
▸ ECN-marked bytes
▸ Lost bytes
▸ Timeout occurred
▸ RTT sample
▸ Bytes in flight
▸ Outgoing rate
▸ Incoming rate
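One way to picture these primitives is as a single per-ACK record handed to the congestion control logic. The sketch below is illustrative only: the field names and units are assumptions made here, not the names used by libccp or any particular datapath.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    """One per-ACK sample; field names and units are hypothetical."""

    timestamp_us: int               # measurement timestamp
    acked_in_order_bytes: int       # in-order acked bytes
    acked_out_of_order_bytes: int   # out-of-order acked bytes
    ecn_marked_bytes: int           # ECN-marked bytes
    lost_bytes: int                 # lost bytes
    timeout_occurred: bool          # timeout occurred
    rtt_sample_us: int              # RTT sample
    bytes_in_flight: int            # bytes in flight
    rate_outgoing_bps: float        # outgoing rate
    rate_incoming_bps: float        # incoming rate


def total_acked_bytes(m: Measurement) -> int:
    """Bytes newly acknowledged by this sample, in order or not."""
    return m.acked_in_order_bytes + m.acked_out_of_order_bytes


if __name__ == "__main__":
    sample = Measurement(
        timestamp_us=1_000, acked_in_order_bytes=2_896,
        acked_out_of_order_bytes=1_448, ecn_marked_bytes=0, lost_bytes=0,
        timeout_occurred=False, rtt_sample_us=20_000, bytes_in_flight=28_960,
        rate_outgoing_bps=96e6, rate_incoming_bps=95e6,
    )
    print(total_acked_bytes(sample))
```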
[Figure panels: Linux Kernel BBR; CCP BBR]
CUBIC WINDOW DYNAMICS
[Plot: Congestion Window (Pkts) vs. Time (s), CCP vs. Kernel; 96 Mbit/s link, 20 ms RTT]
WRITE-ONCE RUN-ANYWHERE
[Plots: Copa and Cubic on the Kernel, QUIC, and mTCP datapaths; links: 24 Mbit/s with 20 ms RTT, and 12 Mbit/s with 20 ms RTT]
STRESS TEST
[Plot: CPU Utilization (%) vs. Flows (1 to 64), panels for Cubic and Reno, CCP vs. Kernel; link: 10 Gbit/s, 100 µs RTT; annotation: 5%]
LOW-RTT SCENARIOS
[Plot: RCT/Baseline vs. Reporting Interval (20 µs / 2 RTT to 500 µs / 50 RTT), for 10 Pkts, 100 Pkts, and 1000 Pkts; link: 10 Gbit/s, 10 µs; annotation: 1.15]
DESIGN: FAST AND SLOW PATH
[Diagram labels: Application, Datapath (TX, RX, Datapath State, CCP Datapath Shim), CCP Agent, NIC; CWND, RATE, Statistics]
Asynchronous API
SPLIT IMPLEMENTATION
[Diagram labels: Application, Datapath (TX, RX, Datapath State, CCP Datapath Shim, libccp), CCP Agent, NIC]
Datapath shim: Expose datapath variables
libccp: Per-packet operations, shared across datapaths
BBR SPLIT IMPLEMENTATION
Datapath Program:
▸ Per ACK measurements
▸ Pulse:
    Rate = 1.25 x bottle rate
▸ After 1 RTT:
    Rate = 0.75 x bottle rate
▸ After 2 RTT:
    Rate = bottle rate
▸ After 8 RTT: repeat
Asynchronous:
▸ Every report
▸ Calculate new rate based
▸ Handle switching between modes
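The pulse schedule in the datapath program above can be read as a gain applied to the bottle rate over a cycle of 8 RTTs. The sketch below implements only that schedule; the function names are hypothetical, and it leaves out the per-ACK measurements and the asynchronous rate calculation and mode switching.

```python
def pulse_gain(elapsed_rtts: float) -> float:
    """Gain for the current point in the 8-RTT pulse cycle.

    First RTT: 1.25, second RTT: 0.75, remaining six RTTs: 1.0.
    """
    phase = elapsed_rtts % 8  # the cycle repeats after 8 RTTs
    if phase < 1:
        return 1.25
    if phase < 2:
        return 0.75
    return 1.0


def pulse_rate(bottle_rate: float, elapsed_s: float, rtt_s: float) -> float:
    """Sending rate for a given time since the pulse cycle started."""
    return pulse_gain(elapsed_s / rtt_s) * bottle_rate


if __name__ == "__main__":
    # Print the gain at the midpoint of each RTT over one full cycle.
    for i in range(8):
        print(i, pulse_gain(i + 0.5))
```

Because the schedule depends only on elapsed time and the last known bottle rate, it can run entirely in the datapath between reports.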
SLOW START
[Plot: Congestion Window (Pkts) vs. Time (s); series: CCP, 100ms Report; CCP, In-Fold; CCP, Rate-Based; In-Datapath; 48 Mbit/s link, 100 ms RTT]
NEW CAPABILITIES
Sophisticated algorithms
Rapid prototyping
CC for flow aggregates
Application-integrated CC
Dynamic, path-specific CC
NEXT STEPS
▸ More algorithms!
▸ Hardware datapaths
▸ Impact of new API on congestion control algorithms
▸ New capabilities using CCP platform
CURRENT STATUS
▸ Datapaths (libccp):
    ▸ Linux TCP
    ▸ QUIC
    ▸ mTCP/DPDK
▸ CCP Agent (portus)
Reproduce our results and build your own congestion control at
EBPF
Front-End (Language)
▸ Event-driven semantics
▸ Explicit reporting model
Back-End (Datapath)
▸ Congestion control enforcement
▸ Direct access to socket state
(def (Report (acked 0)))
(when true
    (:= Report.acked (+ Report.acked Ack.bytes_acked))
    (:= Cwnd (+ Cwnd Report.acked))
    (fallthrough))
(when (> Flow.lost_pkts_sample 0)
    (report))
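To make the event-driven semantics of the datapath program above concrete, here is a rough Python emulation of what it does on each ACK. It rests on assumptions the slide does not spell out: that `when` clauses are evaluated in order on every ACK, that `(fallthrough)` lets evaluation continue to the next clause, and that `(report)` sends the Report variables to the agent and resets them to their defaults. The Python names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FlowState:
    """State touched by the program; the names are illustrative."""

    cwnd: int
    report_acked: int = 0                        # (def (Report (acked 0)))
    reports: list = field(default_factory=list)  # stands in for the agent


def on_ack(state: FlowState, bytes_acked: int, lost_pkts_sample: int) -> None:
    # (when true ...): runs on every ACK.
    state.report_acked += bytes_acked
    state.cwnd += state.report_acked
    # (fallthrough): keep evaluating the following clause.

    # (when (> Flow.lost_pkts_sample 0) (report))
    if lost_pkts_sample > 0:
        state.reports.append({"acked": state.report_acked})
        state.report_acked = 0  # assumed: Report variables reset after a report


if __name__ == "__main__":
    flow = FlowState(cwnd=1000)
    on_ack(flow, bytes_acked=100, lost_pkts_sample=0)
    on_ack(flow, bytes_acked=100, lost_pkts_sample=0)
    on_ack(flow, bytes_acked=50, lost_pkts_sample=1)
    print(flow.cwnd, flow.reports)
```

The emulation follows the program literally, including adding the accumulated `Report.acked` to `Cwnd` on each ACK.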