Balancing on the edge
Transport affinity without network state
João Taveira Araújo, Lorenzo Saino, Raul Landa and Lennert Buytenhek, NSDI 2018
this is the last slide (sort of)
Faild decomposes load balancing as a division of labour
Read the paper if you have an interest in transport protocols or Internet architecture
the problem
technology
cost of entry
market
start over
Requirements
Problem statement
Given a fixed physical footprint, how do you design a load balancing architecture which is efficient, resilient and graceful?
the topology
Guidelines
maximize number of hosts
need switches for connecting to upstreams
avoid dedicated network hardware:
POP topology
Hosts  Racks  Bandwidth* (Gbps)  RPS (millions)  Storage (TB)
8      0.5    200                0.32            768
16     1      1200               0.64            1536
32     2      2400               1.28            3072
64     4      4800               2.56            6144
* notional host bandwidth on fabric accounting for loss of one switch
load balancing
anything that maintains state is easy to DDoS
software-only load balancers can’t make use of full bisection bandwidth
stateless solutions are not graceful
faild
Faild
[Diagram: hosts A, C and B]
FIB
Destination prefix   Next hop IP
192.168.0.0/24       10.0.1.A
192.168.0.0/24       10.0.2.A
192.168.0.0/24       10.0.1.B
192.168.0.0/24       10.0.2.B
192.168.0.0/24       10.0.1.C
192.168.0.0/24       10.0.2.C
Next hop IPs: made up IPs
Controller
ARP table
IP address   MAC address
10.0.1.A     xx:xx:xx:xx:xx:a
10.0.2.A     xx:xx:xx:xx:xx:a
10.0.1.B     xx:xx:xx:xx:xx:b
10.0.2.B     xx:xx:xx:xx:xx:b
10.0.1.C     xx:xx:xx:xx:xx:c
10.0.2.C     xx:xx:xx:xx:xx:c
MAC address: target host
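A minimal sketch of the lookup these two tables imply, assuming the switch picks one of the equal-cost next hops by hashing the flow and then resolves that next hop through the ARP table to reach the target host. The hash (SHA-256 here) and the names `flow_hash` and `forward` are illustrative assumptions; real switches use vendor-specific hashes.

```python
import hashlib

# Tables as shown on the slide; the letters stand in for elided address parts.
FIB = {
    "192.168.0.0/24": [
        "10.0.1.A", "10.0.2.A",
        "10.0.1.B", "10.0.2.B",
        "10.0.1.C", "10.0.2.C",
    ]
}
ARP = {
    "10.0.1.A": "xx:xx:xx:xx:xx:a",
    "10.0.2.A": "xx:xx:xx:xx:xx:a",
    "10.0.1.B": "xx:xx:xx:xx:xx:b",
    "10.0.2.B": "xx:xx:xx:xx:xx:b",
    "10.0.1.C": "xx:xx:xx:xx:xx:c",
    "10.0.2.C": "xx:xx:xx:xx:xx:c",
}


def flow_hash(src_ip, src_port, dst_ip, dst_port):
    """Stand-in for the switch's ECMP hash (assumption, not a vendor hash)."""
    key = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}".encode()
    return int.from_bytes(hashlib.sha256(key).digest()[:4], "big")


def forward(prefix, src_ip, src_port, dst_ip, dst_port):
    """Return (next hop IP, destination MAC) chosen for a flow."""
    next_hops = FIB[prefix]
    index = flow_hash(src_ip, src_port, dst_ip, dst_port) % len(next_hops)
    next_hop = next_hops[index]
    return next_hop, ARP[next_hop]


next_hop, mac = forward("192.168.0.0/24", "198.51.100.7", 40000, "192.168.0.10", 443)
```

The switch keeps no per-flow state in this sketch: the same flow always hashes to the same next hop, and the ARP table alone decides which host that next hop reaches.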
hosts send health status to controller
Controller
remap entries
ARP table
IP address   MAC address (before)   MAC address (after remap)
10.0.1.A     xx:xx:xx:xx:xx:a       xx:xx:xx:xx:xx:a
10.0.2.A     xx:xx:xx:xx:xx:a       xx:xx:xx:xx:xx:a
10.0.1.B     xx:xx:xx:xx:xx:b       xx:xx:xx:xx:xx:c
10.0.2.B     xx:xx:xx:xx:xx:b       xx:xx:xx:xx:xx:a
10.0.1.C     xx:xx:xx:xx:xx:c       xx:xx:xx:xx:xx:c
10.0.2.C     xx:xx:xx:xx:xx:c       xx:xx:xx:xx:xx:c
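The remap step can be sketched as follows, assuming the controller rewrites only the ARP entries that point at the unhealthy host and spreads them over the remaining hosts. The function name `remap` and the round-robin assignment are assumptions for illustration; the replacement order is chosen to reproduce the values on the slide.

```python
# ARP table before the remap, as on the slide.
ARP_BEFORE = {
    "10.0.1.A": "xx:xx:xx:xx:xx:a",
    "10.0.2.A": "xx:xx:xx:xx:xx:a",
    "10.0.1.B": "xx:xx:xx:xx:xx:b",
    "10.0.2.B": "xx:xx:xx:xx:xx:b",
    "10.0.1.C": "xx:xx:xx:xx:xx:c",
    "10.0.2.C": "xx:xx:xx:xx:xx:c",
}


def remap(arp, unhealthy_mac, healthy_macs):
    """Return a new ARP table with the unhealthy host's entries reassigned.

    Entries for other hosts are left untouched, so only flows that hashed
    to the unhealthy host's next hops change destination.
    """
    remapped = dict(arp)
    drained = sorted(ip for ip, mac in arp.items() if mac == unhealthy_mac)
    for i, ip in enumerate(drained):
        # Round-robin over the healthy hosts (assumed policy).
        remapped[ip] = healthy_macs[i % len(healthy_macs)]
    return remapped


ARP_AFTER = remap(
    ARP_BEFORE,
    unhealthy_mac="xx:xx:xx:xx:xx:b",
    healthy_macs=["xx:xx:xx:xx:xx:c", "xx:xx:xx:xx:xx:a"],
)
```

The FIB is never touched in this sketch, so the hash-to-next-hop mapping stays stable and only the two next hops owned by B move.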
isn’t this just consistent hashing?
yes, but we can extend the mechanism and avoid resets entirely
embed mapping history in MAC address
Faild
[Diagram: hosts A, B and C]
Controller
ARP table
IP address   MAC address        MAC address with mapping history
10.0.1.A     xx:xx:xx:xx:xx:a   xx:xx:xx:xx:a:a
10.0.2.A     xx:xx:xx:xx:xx:a   xx:xx:xx:xx:a:a
10.0.1.B     xx:xx:xx:xx:xx:c   xx:xx:xx:xx:b:c
10.0.2.B     xx:xx:xx:xx:xx:a   xx:xx:xx:xx:b:a
10.0.1.C     xx:xx:xx:xx:xx:c   xx:xx:xx:xx:c:c
10.0.2.C     xx:xx:xx:xx:xx:c   xx:xx:xx:xx:c:c
[Diagram labels: current host, previous host]
a a c c b b
a a c c c a
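A small sketch of the encoding, assuming the last two octets of the MAC address carry two host labels. The field order used here (current target first, previous target second) is an assumption that the slides do not settle, and `mac_with_history` and `entries_in_transition` are hypothetical helper names. The table values are the ones shown on the slide.

```python
PREFIX = "xx:xx:xx:xx"  # leading octets are elided on the slide


def mac_with_history(current, previous):
    """Build a MAC whose last two octets name two hosts (assumed field order)."""
    return f"{PREFIX}:{current}:{previous}"


# ARP table with mapping history, matching the values on the slide.
ARP = {
    "10.0.1.A": mac_with_history("a", "a"),
    "10.0.2.A": mac_with_history("a", "a"),
    "10.0.1.B": mac_with_history("b", "c"),
    "10.0.2.B": mac_with_history("b", "a"),
    "10.0.1.C": mac_with_history("c", "c"),
    "10.0.2.C": mac_with_history("c", "c"),
}


def entries_in_transition(arp):
    """Next hops whose two encoded hosts differ, i.e. a remap is in flight."""
    result = []
    for ip, mac in arp.items():
        octets = mac.split(":")
        if octets[-2] != octets[-1]:
            result.append(ip)
    return sorted(result)


moving = entries_in_transition(ARP)
```

Because the history travels in the frame itself, the receiving host can tell from the destination MAC alone whether another host may still own the connection.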
Host processing
[Diagram: hosts A, C and B, with a flowchart of checks on the receiving host]
Destination MAC address: xx:xx:xx:xx:c:b
Current target, previous target: c != b
Match previous? SYN packet? Local socket?
Redirect or Process
At C: destination MAC address xx:xx:xx:xx:c:b, c != b
At B: destination MAC address xx:xx:xx:xx:b:b, b == b
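The checks on the slide can be sketched as a single decision function. This is a reading of the flowchart under stated assumptions: the checks run in the order listed (match previous, SYN packet, local socket), a packet that fails all three is redirected to the previous target, and the redirected frame names the previous target in both fields. The field order in the MAC and the function names are also assumptions.

```python
def parse(mac):
    """Split a destination MAC into (current, previous) labels (assumed order)."""
    octets = mac.split(":")
    return octets[-2], octets[-1]


def handle(mac, is_syn, has_local_socket):
    """Decide what the receiving host does with a packet.

    Returns ("process", mac) to handle the packet locally, or
    ("redirect", new_mac) to forward it to the previous target.
    """
    current, previous = parse(mac)
    if current == previous:
        # No remap in flight for this next hop: nothing else can own the flow.
        return ("process", mac)
    if is_syn:
        # A new connection belongs to the current target.
        return ("process", mac)
    if has_local_socket:
        # The connection is already established on this host.
        return ("process", mac)
    # Otherwise the flow predates the remap: hand it to the previous target.
    prefix = ":".join(mac.split(":")[:-2])
    return ("redirect", f"{prefix}:{previous}:{previous}")


step_one = handle("xx:xx:xx:xx:c:b", is_syn=False, has_local_socket=False)
step_two = handle(step_one[1], is_syn=False, has_local_socket=False)
```

With the slide's example, the first call sees c != b and redirects with destination MAC xx:xx:xx:xx:b:b; the second call sees b == b and processes the packet.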
Host processing
Low latency
[Figure: cumulative probability of round trip time [µs], steady state vs. draining]
median difference: 14µs
Negligible impact on CPU utilization
[Figure: estimated PDF of CPU utilization [%] for steady state, drain and refill]
Timeline
2012 2014 2016 2018
deployed globally
3 x 10^14
requests per day
we suspect it works
Assumption #1: hash buckets are equally loaded
Hashing
[Figure: requests per second (2k to 4k) against time [min]]
Implications for capacity planning
Uneven hashing
Inject synthetic, equally distributed traffic
[Figure: normalized bucket load against rank of nexthop (two panels)]
Significant skew
than the least loaded
Behaviour can depend on number of nexthops
number of configured nexthops
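The measurement can be approximated in software as a rough sketch: generate synthetic flows, hash each into a bucket, and normalize every bucket's count by the mean. SHA-256 is only a stand-in here (an assumption), so this shows the method, not the skew of any real switch; `normalized_bucket_load` is a hypothetical helper name.

```python
import hashlib
from collections import Counter


def bucket_of(flow, num_buckets):
    """Map a flow identifier to a bucket with a stand-in hash."""
    digest = hashlib.sha256(flow.encode()).digest()
    return int.from_bytes(digest[:4], "big") % num_buckets


def normalized_bucket_load(num_flows, num_buckets):
    """Return per-bucket load divided by the mean load, ranked high to low."""
    counts = Counter(bucket_of(f"flow-{i}", num_buckets) for i in range(num_flows))
    mean = num_flows / num_buckets
    loads = [counts.get(bucket, 0) / mean for bucket in range(num_buckets)]
    return sorted(loads, reverse=True)


loads = normalized_bucket_load(num_flows=10000, num_buckets=256)
skew = loads[0] - loads[-1]  # gap between most and least loaded bucket
```

A value of 1.0 means a bucket carries exactly its fair share; running this for different values of `num_buckets` is one way to check whether behaviour depends on the number of nexthops.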
Assumption #2: switches hash identically
Hash polarization
Vendors were told hash polarization was bad
vendor additionally uses boot order of linecards to add entropy
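A toy illustration of the tension, under simplifying assumptions: two hashing stages with two choices each, and a per-switch seed standing in for whatever entropy a vendor adds. With identical hashes, flows that took branch 0 at the first stage all land in the same bucket at the second (polarization). With per-switch seeds the second stage spreads again, but two switches no longer agree on the bucket for the same flow. All names and the SHA-256 stand-in are hypothetical.

```python
import hashlib


def bucket(flow, num_buckets, seed=""):
    """Stand-in switch hash; the seed models vendor-added entropy."""
    digest = hashlib.sha256(f"{seed}|{flow}".encode()).digest()
    return int.from_bytes(digest[:4], "big") % num_buckets


def second_stage_buckets(num_flows, seed_per_switch):
    """Buckets used at stage two by flows that stage one sent to branch 0."""
    used = set()
    for i in range(num_flows):
        flow = f"flow-{i}"
        if bucket(flow, 2) != 0:
            continue  # this flow went to the other branch
        seed = "switch-0" if seed_per_switch else ""
        used.add(bucket(flow, 2, seed))
    return used


def disagreements(num_flows, num_buckets):
    """Count flows that two differently seeded switches map to different buckets."""
    return sum(
        bucket(f"flow-{i}", num_buckets, "switch-0")
        != bucket(f"flow-{i}", num_buckets, "switch-1")
        for i in range(num_flows)
    )


polarized = second_stage_buckets(1000, seed_per_switch=False)
spread = second_stage_buckets(1000, seed_per_switch=True)
mismatches = disagreements(1000, num_buckets=6)
```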
Assumption #3: packets in a flow use same network path
Nope, things break
Fragmentation
ECN
SYN proxies
paper has lots more stuff
the value is not in the implementation NSDI
the value is in the design NSDI
Faild decomposes load balancing as a division of labour
…the design is now part of the architecture
NSDI
Five years of dealing with the consequences of changing a fraction of the Internet:
Faild is part of a shift in the economics of edge delivery, and has since percolated through industry
If you propose protocol changes, please take this paper into account