Responding to Spurious Timeouts in TCP
Andrei Gurtov, University of Helsinki
Reiner Ludwig, Ericsson Research
2/15
Outline
- Motivation
- Spurious Timeouts in TCP
- Robustness to Packet Losses
- Undoing Congestion Control
- Adapting the Retransmit Timer
- Performance Evaluation
- Conclusions
3/15
Motivation
- Delay variation in wireless networks
  – Cell reselections in GPRS last 3-15 sec
  – Bandwidth oscillation in CDMA2000
  – Link-layer persistent error recovery
- Deployed aggressive retransmission timers
  – 10 ms granularity and 200 ms minimum in Linux TCP
  – Solaris
[Figure: TCP trace of sequence number (B) versus time of day (s), with Snd_Data and Snd_Ack series]
A TCP trace shows spurious retransmissions caused by two cell reselections in a live GPRS network.
4/15
Spurious Timeouts in TCP
- Spurious timeouts hurt TCP performance
  – Unnecessary retransmissions during go-back-N
  – Disrupted congestion control
- In the short run, slow start causes congestion
- In the long run, underutilization due to reduced ssthresh
5/15
Spurious Timeouts in TCP
- Detecting spurious timeouts
  – Eifel, F-RTO, etc.
- Response after detecting a spurious timeout
  – Our focus
6/15
Main Issues w.r.t. Response
- 1. Robustness to packet losses
  – Danger of genuine timeouts
- 2. How to restore the congestion control state
  – Does a full restore of cwnd and ssthresh cause a burst?
  – Do partial restore options perform well?
- 3. How to adapt the retransmit timer
  – To avoid clogging the network with unnecessary retransmissions in the future
7/15
1. Robustness to Packet Losses
- State-of-the-art TCP is often sufficient
  – Fast Retransmit + SACK + Limited Transmit
- Heavy losses do trigger genuine timeouts
  – TCP gets low throughput
  – Cannot adapt RTO to a more conservative level
- Solutions
  – FACK works well but not with reordering
  – NewReno+SACK works almost as well and appears safe for the Internet
8/15
2. Undoing Congestion Control
- Full undo: too aggressive?
  – Appropriate, no bursts observed
- Partial undo sets the sender idle for a while
  – The flight size is higher than the reduced cwnd
  – The ACK clock can be lost
- A new proposal: use the ACK clock but in congestion avoidance
  – ssthresh = cwnd_old, cwnd = ssthresh
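The restore options above can be compared with a small sketch. It is an illustration under stated assumptions, not the authors' implementation: windows are counted in whole segments, the timeout reaction follows the standard rule (ssthresh set to half the flight size, cwnd to one segment), and `new_partial_undo` takes the slide's "ssthresh = cwnd_old, cwnd = ssthresh" literally as two sequential assignments. The names `CcState`, `on_timeout`, `full_undo`, `new_partial_undo`, `can_send`, and `in_congestion_avoidance` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CcState:
    cwnd: int      # congestion window, in segments (assumed unit)
    ssthresh: int  # slow-start threshold, in segments (assumed unit)


def on_timeout(flight_size: int) -> CcState:
    """Standard reaction to a retransmission timeout:
    ssthresh = max(flight/2, 2 segments), cwnd = 1 segment."""
    return CcState(cwnd=1, ssthresh=max(flight_size // 2, 2))


def full_undo(saved: CcState) -> CcState:
    """Full undo: restore both variables to their pre-timeout values."""
    return CcState(cwnd=saved.cwnd, ssthresh=saved.ssthresh)


def new_partial_undo(saved: CcState) -> CcState:
    """Assumed literal reading of 'ssthresh = cwnd_old, cwnd = ssthresh'."""
    ssthresh = saved.cwnd
    return CcState(cwnd=ssthresh, ssthresh=ssthresh)


def can_send(state: CcState, flight_size: int) -> bool:
    """A sender may transmit new data only while the flight is below cwnd."""
    return flight_size < state.cwnd


def in_congestion_avoidance(state: CcState) -> bool:
    """Slow start applies while cwnd < ssthresh; at or above ssthresh the
    sender may use congestion avoidance."""
    return state.cwnd >= state.ssthresh


# Hypothetical pre-timeout state: 40 segments in flight, still in slow start.
saved = CcState(cwnd=40, ssthresh=64)
flight = 40
reduced = on_timeout(flight)          # no undo at all
restored = full_undo(saved)           # full undo
proposal = new_partial_undo(saved)    # the new proposal
```

With these numbers, `reduced` leaves 40 segments in flight against a one-segment window, which is the idle period described above. `restored` resumes slow start from the old window, while `proposal` keeps the old window but continues in congestion avoidance.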
9/15
TCP RTO
- TCP does not take RTT samples from delayed segments that were retransmitted (Karn's algorithm)
- TCP with timestamps can do that
  – RTO is more conservative but decays quite fast
- TCP with Eifel uses timestamps
  – Already more conservative than the standard TCP
  – Also maintains a larger window, which results in a higher RTT and a higher RTO
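The effect of timestamps on the timer can be sketched with the standard RTO estimator (SRTT and RTTVAR with gains 1/8 and 1/4, RTO = SRTT + max(G, 4·RTTVAR)). This is a minimal sketch, not the code used in the talk: the class name `RtoEstimator` and its parameters are hypothetical, and the defaults reuse the 10 ms granularity and 200 ms minimum quoted earlier for Linux.

```python
class RtoEstimator:
    """Standard SRTT/RTTVAR retransmit timer with Karn's rule (sketch)."""

    ALPHA = 1 / 8  # gain for SRTT
    BETA = 1 / 4   # gain for RTTVAR
    K = 4          # variance multiplier

    def __init__(self, granularity=0.01, min_rto=0.2, max_rto=60.0):
        self.granularity = granularity
        self.min_rto = min_rto
        self.max_rto = max_rto
        self.srtt = None
        self.rttvar = None
        self.rto = 1.0  # initial RTO before any sample (assumed 1 s)

    def sample(self, rtt, retransmitted=False, has_timestamp=False):
        """Feed one RTT measurement (seconds) and return the new RTO."""
        # Karn's rule: a sample from a retransmitted segment is ambiguous
        # and is ignored, unless a timestamp tells the transmissions apart.
        if retransmitted and not has_timestamp:
            return self.rto
        if self.srtt is None:
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # RTTVAR is updated before SRTT, using the old SRTT.
            self.rttvar = ((1 - self.BETA) * self.rttvar
                           + self.BETA * abs(self.srtt - rtt))
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        rto = self.srtt + max(self.granularity, self.K * self.rttvar)
        self.rto = min(max(rto, self.min_rto), self.max_rto)
        return self.rto


est = RtoEstimator()
for _ in range(10):
    est.sample(0.4)                      # steady 400 ms path
steady = est.rto
ignored = est.sample(5.0, retransmitted=True)             # Karn: no update
peak = est.sample(5.0, retransmitted=True, has_timestamp=True)
for _ in range(20):
    est.sample(0.4)                      # RTO decays back toward steady state
decayed = est.rto
```

Without timestamps the 5 s spike never reaches the estimator, so the RTO stays at its steady value. With timestamps the spike inflates the RTO, which then shrinks again as ordinary samples arrive.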
10/15
3. Adapting RTO
- Upon a spurious timeout
  – Reseed: initialize SRTT and RTTVAR with a new sample (history discard)
  – Back-off: keep the exponential back-off count
  – Min++: increase the minimum RTO (by 1 sec)
- Reset to the standard timer upon a genuine timeout
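The three timer responses can be sketched as operations on a small timer object. This is an illustrative sketch under assumptions, not the authors' code: the names `Timer`, `reseed`, `keep_backoff`, `min_plus_plus`, and `on_genuine_timeout` are hypothetical; reseeding is assumed to use the standard first-sample initialization (RTTVAR = sample/2); the standard minimum is assumed to be the 200 ms Linux value; and "reset to the standard timer" is assumed to mean restoring that minimum.

```python
from dataclasses import dataclass

STANDARD_MIN_RTO = 0.2  # seconds; assumed standard minimum


@dataclass
class Timer:
    srtt: float
    rttvar: float
    min_rto: float = STANDARD_MIN_RTO
    backoff: int = 0  # number of exponential back-offs currently applied

    def rto(self) -> float:
        base = max(self.srtt + 4 * self.rttvar, self.min_rto)
        return base * 2 ** self.backoff


def reseed(t: Timer, sample: float) -> None:
    """Reseed: discard history and restart the estimator from one sample."""
    t.srtt = sample
    t.rttvar = sample / 2
    t.backoff = 0


def keep_backoff(t: Timer) -> None:
    """Back-off: leave the back-off count in place instead of clearing it."""
    # Intentionally a no-op: the usual step of resetting backoff is skipped.


def min_plus_plus(t: Timer) -> None:
    """Min++: raise the minimum RTO by one second and clear the back-off."""
    t.min_rto += 1.0
    t.backoff = 0


def on_genuine_timeout(t: Timer) -> None:
    """Assumed reset: return to the standard minimum RTO."""
    t.min_rto = STANDARD_MIN_RTO


# Hypothetical state right after a spurious timeout with two back-offs.
a = Timer(srtt=0.4, rttvar=0.05, backoff=2)
reseed(a, 5.0)          # RTO now driven by the 5 s sample

b = Timer(srtt=0.4, rttvar=0.05, backoff=2)
keep_backoff(b)         # RTO stays at four times the base value

c = Timer(srtt=0.4, rttvar=0.05, backoff=2)
min_plus_plus(c)        # RTO floor rises to 1.2 s
```

All three options leave the timer more conservative than clearing the back-off and carrying on with the old estimate, which here would give an RTO of about 0.6 s.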
11/15
Performance Evaluation
- ns2, dumbbell with TCP and CBR sources
  – 3G or satellite link: 2 Mbps, 400 ms RTT
  – Periodic delay spikes
- 250% gain in throughput when delay spikes occur on an uncongested path (without CBR)
- TCP fairness does not suffer because the response to packet losses is unchanged
12/15
Robustness to Packet Losses: TCPs with CBR
[Figure: normalized download time, segments sent, spurious RTOs, and genuine RTOs for the std and eifel variants of Reno-SACK, NewReno-SACK, and FACK]
13/15
Undo of Congestion Control: TCP FACK with CBR
[Figure: normalized download time, segments sent, spurious RTOs, and genuine RTOs for the Full, Partial, None, and New partial undo options]
14/15
Adapting RTO: TCP FACK with CBR
[Figure: normalized download time, segments sent, spurious RTOs, and genuine RTOs for the Std, Reseed, Back-off, and Min++ timers]
15/15
Summary
- An update of the TCP sender that improves performance over paths with variable delays
- Up to 250% throughput gain on links with a high bandwidth-delay product
- Adequate performance on congested paths
  – NewReno-SACK is robust to packet losses
- Full restore of congestion control after a spurious timeout is appropriate
- Using back-offs or increasing the minimum RTO can reduce the number of spurious RTOs by 40% with only slightly lower throughput
- We have some real measurements for 2.5G links