An experimental study of the learnability of congestion control
Anirudh Sivaraman, Keith Winstein, Pratiksha Thaker, Hari Balakrishnan
MIT CSAIL http://web.mit.edu/remy/learnability
August 31, 2014
1 / 17
This talk
◮ How easy is it to learn a network protocol to
achieve a desired goal, despite a mismatched set
of assumptions?
◮ cf. Learning: “Knowledge acquisition without
explicit programming” (Valiant 1984)
2 / 17
Preview of key results
◮ Can tolerate mismatched link-rate assumptions
◮ Need precision about the number of senders
◮ TCP compatibility is a double-edged sword
◮ Can tolerate mismatch in the # of bottlenecks
3 / 17
Experimental method
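[Figure: training networks, each specified as <Mbps, ms>, and an objective function are given to Remy (SIGCOMM 13), which produces a RemyCC; the RemyCC is then tested within ns-2 on testing networks, also specified as <Mbps, ms>.]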
5 / 17
Remy compared with an ideal protocol
[Figure: throughput (Mbps) versus queueing delay (ms) for Ideal, RemyCC, Cubic, and Cubic/sfqCoDel.]
6 / 17
Learning network protocols despite mismatched assumptions
◮ Is there a tradeoff between operating range and
generality in link rates?
◮ Is there a tradeoff between performance and
operating range in link rates?
7 / 17
Performance and link-rate operating range
[Figure: normalized objective function versus link rate (Mbps; axis ticks at 1, 10, 100, and 1000) for Ideal, Cubic, Cubic-over-sfqCoDel, and the 2x, 10x, 100x, and 1000x range curves.]
8 / 17
Performance and link-rate operating range
◮ Very clear generality vs. operating range tradeoff
◮ Only weak evidence of a performance vs. operating range tradeoff
◮ Possible to design a forward-compatible
protocol handling a wide range in link rates
9 / 17
Learning network protocols despite mismatched assumptions
◮ Can we learn a protocol that performs well both when there are few senders and when there are many senders?
10 / 17
Imperfections in the number of senders
[Figure: normalized objective function (axis from −1.4 to 0.0) versus number of senders (axis from 20 to 100), with curves labeled Ideal, 1 - 2, 1 - 10, 1, 1 - 100, Cubic, and Cubic-over-sfqCoDel.]
◮ Tradeoff between performance with few senders and performance with many senders
11 / 17
Learning network protocols despite mismatched assumptions
◮ What are the costs and benefits of learning a new protocol that shares fairly with a legacy sender?
12 / 17
Imperfect assumptions about the nature of other senders
◮ TCP-Aware RemyCC: Contends with:
  ◮ TCP-Aware RemyCC half the time
  ◮ TCP NewReno half the time
◮ TCP-Naive RemyCC: Contends with:
  ◮ TCP-Naive RemyCC all the time
13 / 17
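The two training regimes on this slide can be pictured as a choice of competitor for each training run. The sketch below is illustrative only: the function name `sample_competitor`, the mode strings, and the use of an independent coin flip per run are assumptions, not details taken from Remy.

```python
import random

def sample_competitor(training_mode, rng):
    """Pick the protocol the learner contends with in one training run.

    Hypothetical helper: the mode names and the per-run coin flip are
    assumptions made for illustration.
    """
    if training_mode == "tcp-aware":
        # Half the time another copy of itself, half the time TCP NewReno.
        return "TCP-Aware RemyCC" if rng.random() < 0.5 else "TCP NewReno"
    if training_mode == "tcp-naive":
        # Always contends with another copy of itself.
        return "TCP-Naive RemyCC"
    raise ValueError(f"unknown training mode: {training_mode}")

rng = random.Random(0)
aware_runs = [sample_competitor("tcp-aware", rng) for _ in range(10000)]
naive_runs = [sample_competitor("tcp-naive", rng) for _ in range(100)]
newreno_fraction = aware_runs.count("TCP NewReno") / len(aware_runs)
```

With many runs, `newreno_fraction` should sit close to one half, while the TCP-naive learner never sees NewReno during training.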
RemyCC competing against itself
[Figure: throughput (Mbps) versus queueing delay (ms) for NewReno, RemyCC [TCP-naive], and RemyCC [TCP-aware]; annotations: "Better", "Cost of TCP-awareness".]
14 / 17
RemyCC competing against TCP NewReno
[Figure: throughput (Mbps) versus queueing delay (ms) for NewReno and RemyCC [TCP-aware], and for NewReno and RemyCC [TCP-naive]; annotations: "Better", "Benefit of TCP-awareness", "Effect of TCP-aware adversary".]
◮ TCP-awareness benefits you when you need it, and costs you when you don’t
15 / 17
Caveats
◮ Remy as a proxy for an optimal learner
◮ Results may change with better learners
◮ Negative results may no longer hold
16 / 17
The learnability of congestion control
◮ Can tolerate mismatched link-rate assumptions
◮ Need precision about the number of senders
◮ TCP compatibility is a double-edged sword
◮ Can tolerate mismatch in the # of bottlenecks
◮ Ongoing work in using findings:
  ◮ improve Google’s datacenter transport
  ◮ user-space implementation of RemyCC
◮ http://web.mit.edu/remy/learnability
17 / 17
Backup slides
17 / 17
The Remy protocol synthesis procedure
◮ Protocol: range-based rule table from state to action
◮ State: Congestion signals tracked by the sender
  ◮ s_ewma: EWMA over packet inter-transmit times
  ◮ r_ewma: EWMA over ACK inter-arrival times
  ◮ rtt_ratio: Ratio of RTT to minimum RTT
  ◮ slow_r_ewma: Slower version of r_ewma
◮ Action: modify window, transmission rate
  ◮ Multiplier m to current window
  ◮ Increment c to current window
  ◮ Minimum inter-transmit time
17 / 17
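To make the rule-table idea concrete, here is a minimal sketch of a range-based lookup from state to action. It is not RemyCC's implementation: the class names, the half-open ranges, the EWMA gain, and the window update `m * window + c` are assumptions chosen for illustration, and the action values are simply borrowed from the tuples shown on the backup slides, read here as <m, c, minimum inter-transmit time>.

```python
from dataclasses import dataclass

INF = float("inf")

@dataclass(frozen=True)
class Action:
    multiplier: float     # m: multiplier applied to the current window
    increment: float      # c: increment added to the current window
    min_intersend: float  # minimum inter-transmit time (unit assumed)

@dataclass(frozen=True)
class Rule:
    lo: tuple  # inclusive lower bound for each state dimension
    hi: tuple  # exclusive upper bound for each state dimension
    action: Action

    def matches(self, state):
        return all(l <= x < h for l, x, h in zip(self.lo, state, self.hi))

def ewma(old, sample, gain):
    """Exponentially weighted moving average; the gain is an assumption."""
    return (1.0 - gain) * old + gain * sample

def lookup(rules, state):
    """Return the action of the first rule whose ranges contain the state."""
    for rule in rules:
        if rule.matches(state):
            return rule.action
    raise KeyError(f"no rule covers state {state}")

def apply_action(window, action):
    """Assumed window update: scale by m, then add c, never below zero."""
    return max(0.0, action.multiplier * window + action.increment)

# State order assumed: (s_ewma, r_ewma, rtt_ratio, slow_r_ewma).
rules = [
    Rule((0, 0, 0, 0), (5.0, INF, INF, INF), Action(0.90, 5, 2.8)),
    Rule((5.0, 0, 0, 0), (INF, INF, INF, INF), Action(0.60, 19, 76.2)),
]

state = (ewma(1.0, 1.0, 0.125), 2.0, 1.1, 2.0)
chosen = lookup(rules, state)
new_window = apply_action(10.0, chosen)
```

The table here splits only on s_ewma; a real table would split along any of the state dimensions.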
The Remy protocol synthesis procedure
[Figure sequence: the rule table drawn over the s_ewma and r_ewma axes; each step below lists the action tuples shown at that step.]
◮ One action for all states. Find the best value.
◮ The best (single) action. Now split it on median.
◮ Simulate: <0.90,4,3.3> <0.90,4,3.3>
◮ Optimize each of the new actions: <0.90,4,3.3> <0.90,4,3.3>
◮ Now split the most-used rule: <0.90,5,2.8> <0.60,19,76.2>
◮ Simulate: <0.90,5,2.8> <0.60,19,76.2> <0.80,5,4.1> <0.80,5,4.1> <0.80,5,4.1> <0.80,5,4.1>
◮ Optimize: <0.90,5,2.8> <0.60,19,76.2> <0.80,5,4.1> <0.80,5,4.1> <0.80,5,4.1> <0.80,5,4.1>
◮ Split: <0.90,5,2.8> <0.30,29,49.7> <0.80,8,3.3> <0.80,8,62.7> <0.80,17,4.6> <0.80,7,16.9>
◮ Simulate: <0.30,29,49.7> <0.80,8,3.3> <0.80,8,62.7> <0.80,17,4.6> <0.80,7,16.9> <0.90,5,2.8> <0.90,5,2.8> <0.90,5,2.8> <0.90,5,2.8>
17 / 17
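The simulate / optimize / split cycle shown on these slides can be sketched as a small greedy loop. This is a toy under loud assumptions, not Remy itself: the state is one-dimensional, an action is a single number, and `simulate` is a stand-in scoring function (squared error against a hypothetical `target`) rather than a network simulator. Only the control flow mirrors the slides: optimize the actions, find the most-used rule, split it at the median of the states it saw, and repeat.

```python
import statistics

def simulate(rules, samples, target):
    """Toy stand-in for the simulator: returns (score, states seen per rule)."""
    seen = [[] for _ in rules]
    score = 0.0
    for x in samples:
        for i, (lo, hi, action) in enumerate(rules):
            if lo <= x < hi:
                seen[i].append(x)
                score -= (action - target(x)) ** 2
                break
    return score, seen

def optimize(rules, samples, target, candidates):
    """For each rule in turn, keep the candidate action with the best score."""
    rules = list(rules)
    for i, (lo, hi, _) in enumerate(rules):
        def score_with(action):
            trial = rules[:i] + [(lo, hi, action)] + rules[i + 1:]
            return simulate(trial, samples, target)[0]
        rules[i] = (lo, hi, max(candidates, key=score_with))
    return rules

def split_most_used(rules, seen):
    """Split the rule that matched the most states at the median of those states."""
    i = max(range(len(rules)), key=lambda k: len(seen[k]))
    lo, hi, action = rules[i]
    mid = statistics.median(seen[i])
    if not lo < mid < hi:
        return list(rules)  # nothing sensible to split
    return rules[:i] + [(lo, mid, action), (mid, hi, action)] + rules[i + 1:]

def synthesize(samples, target, candidates, rounds, lo=0.0, hi=1.0):
    """Start with one action for all states, then alternate split and optimize."""
    rules = optimize([(lo, hi, candidates[0])], samples, target, candidates)
    for _ in range(rounds):
        _, seen = simulate(rules, samples, target)
        rules = split_most_used(rules, seen)
        rules = optimize(rules, samples, target, candidates)
    return rules

# Hypothetical workload: the best action is 0 for small states, 1 for large ones.
samples = [i / 100 for i in range(100)]
target = lambda x: 0.0 if x < 0.5 else 1.0
candidates = [0.0, 0.25, 0.5, 0.75, 1.0]

single_rule = synthesize(samples, target, candidates, rounds=0)
two_rules = synthesize(samples, target, candidates, rounds=1)
```

In this toy, a single rule has to compromise across all states, and one split on the median lets each half choose its own action.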
Can applications with different objectives coexist?
◮ Tpt. Sender: A throughput-intensive sender
log(throughput) − 0.1 ∗ log(delay) (1)
◮ Lat. Sender: A latency-sensitive sender
log(throughput) − 10.0 ∗ log(delay) (2)
◮ Running over a FIFO queue
17 / 17
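The two objectives above differ only in the weight on log(delay), which a few lines of code make easy to compare. The sketch below assumes natural logarithms and uses two made-up operating points; the function name `objective` and the numbers are illustrative, not measurements from the talk.

```python
import math

TPT_DELTA = 0.1   # weight on log(delay) for the throughput-intensive sender, eq. (1)
LAT_DELTA = 10.0  # weight on log(delay) for the latency-sensitive sender, eq. (2)

def objective(throughput, delay, delta):
    """log(throughput) - delta * log(delay); natural log is an assumption."""
    return math.log(throughput) - delta * math.log(delay)

# Hypothetical operating points as (throughput in Mbps, delay in ms).
fast_but_queued = (10.0, 100.0)
slow_but_prompt = (5.0, 20.0)

tpt_prefers_fast = (objective(*fast_but_queued, TPT_DELTA)
                    > objective(*slow_but_prompt, TPT_DELTA))
lat_prefers_prompt = (objective(*slow_but_prompt, LAT_DELTA)
                      > objective(*fast_but_queued, LAT_DELTA))
```

For these two points, the throughput-intensive sender ranks the high-throughput point higher, while the latency-sensitive sender ranks the low-delay point higher.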
Training for diversity has a cost ...
[Figure: throughput (Mbps) versus queueing delay (ms), with points labeled [naive] and [coevolved]; annotation: "Cost of Coexistence".]
17 / 17
... but benefits the docile sender
[Figure: throughput (Mbps) versus queueing delay (ms), with points labeled [naive] and [coevolved]; annotations: "Benefit of coevolution", "Effect of playing nice".]
17 / 17