Responding to Spurious Timeouts in TCP Andrei Gurtov University of - - PowerPoint PPT Presentation



SLIDE 1

Responding to Spurious Timeouts in TCP

Andrei Gurtov

University of Helsinki

Reiner Ludwig

Ericsson Research

SLIDE 2

Outline

  • Motivation
  • Spurious Timeouts in TCP
  • Robustness to Packet Losses
  • Undoing Congestion Control
  • Adapting the Retransmit Timer
  • Performance Evaluation
  • Conclusions
SLIDE 3

Motivation

  • Delay variation in wireless networks
      – Cell reselections in GPRS last 3-15 sec
      – Bandwidth oscillation in CDMA2000
      – Link-layer persistent error recovery
  • Deployed aggressive retransmission timers
      – 10 ms granularity and 200 ms minimum in Linux TCP
      – Solaris

[Figure: TCP trace plotting sequence number (B) against time of day (s), with Snd_Data and Snd_Ack series.]

A TCP trace shows spurious retransmissions caused by two cell reselections in a live GPRS network.

SLIDE 4

Spurious Timeouts in TCP

  • Spurious timeouts hurt TCP performance
      – Unnecessary retransmissions during go-back-N
      – Disrupted congestion control
  • In the short run, slow start causes congestion
  • In the long run, underutilization due to reduced ssthresh

SLIDE 5

Spurious Timeouts in TCP

  • Detecting spurious timeouts
      – Eifel, F-RTO, etc…
  • Response after detecting a spurious timeout
      – Our focus

SLIDE 6

Main Issues w.r.t. Response

  • 1. Robustness to packet losses
      – Danger of genuine timeouts
  • 2. How to restore the congestion control state
      – Does a full restore of cwnd and ssthresh cause a burst?
      – Do partial restore options perform well?
  • 3. How to adapt the retransmit timer
      – To avoid clogging the network with unnecessary retransmissions in the future

SLIDE 7

1. Robustness to Packet Losses

  • State-of-the-art TCP is often sufficient
      – Fast Retransmit + SACK + Limited Transmit
  • Heavy losses do trigger genuine timeouts
      – TCP gets low throughput
      – Cannot adapt RTO to a more conservative level
  • Solutions
      – FACK works well but not with reordering
      – NewReno + SACK works almost as well and appears safe for the Internet

SLIDE 8

2. Undoing Congestion Control

  • Full undo – too aggressive?
      – Appropriate, no bursts observed
  • Partial undo sets the sender idle for a while
      – The flight size is higher than the reduced cwnd
      – The ACK clock can be lost
  • A new proposal: use the ACK clock but in congestion avoidance
      – ssthresh = cwnd_old, cwnd = ssthresh
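
The undo options can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the `CongestionState` class, the function names, and the segment-based units are assumptions made for the example. The timeout reaction follows the usual rule (ssthresh set to half the flight size, cwnd collapsed to one segment). The new proposal is read here as two sequential assignments, so cwnd ends up equal to the old cwnd with ssthresh at the same value, which is one possible reading of the slide.

```python
from dataclasses import dataclass


@dataclass
class CongestionState:
    cwnd: int      # congestion window, in segments (assumed unit)
    ssthresh: int  # slow-start threshold, in segments (assumed unit)


def on_timeout(state: CongestionState, flight_size: int) -> CongestionState:
    """Standard RTO reaction: save the old state, then collapse the window."""
    saved = CongestionState(state.cwnd, state.ssthresh)
    state.ssthresh = max(flight_size // 2, 2)
    state.cwnd = 1
    return saved


def full_undo(state: CongestionState, saved: CongestionState) -> None:
    """Full undo: restore both cwnd and ssthresh to their pre-timeout values."""
    state.cwnd = saved.cwnd
    state.ssthresh = saved.ssthresh


def new_partial_undo(state: CongestionState, saved: CongestionState) -> None:
    """New proposal as read from the slide: ssthresh = cwnd_old, cwnd = ssthresh."""
    state.ssthresh = saved.cwnd
    state.cwnd = state.ssthresh


def can_send(state: CongestionState, flight_size: int) -> bool:
    """The sender stays idle while the flight size is not below cwnd."""
    return flight_size < state.cwnd


def in_congestion_avoidance(state: CongestionState) -> bool:
    """cwnd >= ssthresh is treated as congestion avoidance in this sketch."""
    return state.cwnd >= state.ssthresh
```

With a reduced cwnd and 20 segments still in flight, `can_send` returns `False`, which is the idle period the slide attributes to a partial undo. After `new_partial_undo` the sender resumes at the old window but grows it linearly rather than exponentially.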

SLIDE 9

TCP RTO

  • TCP does not take samples from delayed segments (Karn's algorithm)
  • TCP with timestamps can do that
      – RTO is more conservative but decays quite fast
  • TCP with Eifel uses timestamps
      – Already more conservative than standard TCP
      – Also maintains a larger window, which results in a higher RTT and a higher RTO
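
To make the effect of timestamps on the RTO concrete, here is a minimal sketch of the standard smoothed estimator (the RFC 6298 formulas) with Karn's rule. The class name, parameter names, and default values are assumptions for illustration only. A sample from a retransmitted segment is skipped unless a timestamp makes it unambiguous.

```python
class RtoEstimator:
    """Sketch of the standard RTO estimator; names and defaults are assumed."""

    def __init__(self, min_rto: float = 1.0, granularity: float = 0.1):
        self.srtt = None            # smoothed RTT, seconds
        self.rttvar = None          # RTT variation, seconds
        self.min_rto = min_rto      # lower bound on the RTO, seconds
        self.granularity = granularity
        self.rto = 1.0              # initial RTO before any sample

    def sample(self, rtt: float, retransmitted: bool = False,
               has_timestamp: bool = False) -> float:
        # Karn's rule: ignore ambiguous samples from retransmitted segments.
        # With timestamps the sample is unambiguous and can be used.
        if retransmitted and not has_timestamp:
            return self.rto
        if self.srtt is None:
            # First measurement seeds both estimators.
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # RTTVAR is updated first, using the previous SRTT.
            self.rttvar = 0.75 * self.rttvar + 0.25 * abs(self.srtt - rtt)
            self.srtt = 0.875 * self.srtt + 0.125 * rtt
        self.rto = max(self.min_rto,
                       self.srtt + max(self.granularity, 4 * self.rttvar))
        return self.rto
```

Feeding a 5-second sample after a 0.4-second one leaves the RTO unchanged without timestamps, but raises it sharply when the timestamped sample is accepted. Later small samples shrink RTTVAR again, which matches the slide's remark that the RTO decays quite fast.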

SLIDE 10

3. Adapting RTO

  • Upon a spurious timeout
      – Reseed: initialize SRTT and RTTVAR with the new sample (history discard)
      – Back-off: keep the exponential back-off count
      – Min++: increase the minimum RTO (by 1 sec)
  • Reset to the standard timer upon a genuine timeout
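
The three options can be sketched as operations on a small timer state. The `Timer` class, the option strings, and the assumed 1-second standard minimum are illustrative choices, not taken from the slides. How a genuine timeout interacts with the back-off count is also an assumption here: the sketch restores the standard minimum and applies one ordinary exponential back-off.

```python
from dataclasses import dataclass

STANDARD_MIN_RTO = 1.0  # assumed standard minimum RTO, seconds


@dataclass
class Timer:
    srtt: float                         # smoothed RTT, seconds
    rttvar: float                       # RTT variation, seconds
    backoff: int = 0                    # exponential back-offs applied
    min_rto: float = STANDARD_MIN_RTO   # current lower bound on the RTO

    def rto(self) -> float:
        base = max(self.min_rto, self.srtt + 4 * self.rttvar)
        return base * (2 ** self.backoff)


def respond_to_spurious_timeout(t: Timer, option: str, sample: float) -> None:
    """Apply one of the slide's three options after a spurious timeout."""
    if option == "reseed":
        # Discard history: re-initialize from the new RTT sample.
        t.srtt, t.rttvar = sample, sample / 2
        t.backoff = 0
    elif option == "back-off":
        # Keep the exponential back-off count instead of clearing it.
        pass
    elif option == "min++":
        # Raise the minimum RTO by one second.
        t.min_rto += 1.0
        t.backoff = 0
    else:
        raise ValueError(f"unknown option: {option}")


def on_genuine_timeout(t: Timer) -> None:
    """Return to the standard timer settings (assumed interpretation)."""
    t.min_rto = STANDARD_MIN_RTO
    t.backoff += 1
```

For a timer with SRTT 0.4 s, RTTVAR 0.1 s, and one back-off, reseeding with a 3-second sample gives an RTO of 9 s, while Min++ gives 2 s from the raised floor alone.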

SLIDE 11

Performance Evaluation

  • ns2, dumbbell with TCP and CBR sources
      – 3G or satellite link: 2 Mbps, 400 ms RTT
      – Periodic delay spikes
  • 250% gain in throughput when delay spikes occur on an uncongested path (without CBR)
  • TCP fairness does not suffer because the response to packet losses is unchanged
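
For a sense of scale, the bandwidth-delay product of the simulated link follows directly from the two figures above. The short sketch below does the arithmetic; the 1460-byte segment size is an assumption for illustration and is not stated in the slides.

```python
import math


def bdp_bytes(bandwidth_bps: int, rtt_ms: int) -> int:
    """Bandwidth-delay product in bytes (bits/s * ms / 8000)."""
    return bandwidth_bps * rtt_ms // 8000


def bdp_segments(bandwidth_bps: int, rtt_ms: int, mss_bytes: int = 1460) -> int:
    """Segments needed to fill the pipe; mss_bytes is an assumed value."""
    return math.ceil(bdp_bytes(bandwidth_bps, rtt_ms) / mss_bytes)


link_bdp = bdp_bytes(2_000_000, 400)         # 100000 bytes
link_window = bdp_segments(2_000_000, 400)   # 69 segments at 1460 bytes each
```

A window of this size takes many round trips to rebuild after an unnecessary slow start, which is why a spurious timeout is costly on such a path.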

SLIDE 12

Robustness to Packet Losses: TCPs with CBR

[Figure: bar chart of normalized download time, segments sent, spurious RTOs, and genuine RTOs for Reno-SACK, NewReno-SACK, and FACK, each in standard (std) and Eifel variants.]

SLIDE 13

Undo of Congestion Control: TCP FACK with CBR

[Figure: bar chart of normalized download time, segments sent, spurious RTOs, and genuine RTOs for the Full, Partial, None, and New partial undo options.]

SLIDE 14

Adapting RTO: TCP FACK with CBR

[Figure: bar chart of normalized download time, segments sent, spurious RTOs, and genuine RTOs for the Std, Reseed, Back-off, and Min++ timer options.]

SLIDE 15

Summary

  • An update of the TCP sender that improves performance over paths with variable delays
  • Up to 250% throughput gain on links with a high bandwidth-delay product
  • Adequate performance on congested paths
      – NewReno-SACK is robust to packet losses
  • Full restore of congestion control after a spurious timeout is acceptable
  • Using back-offs or increasing the minimum RTO can reduce the number of spurious RTOs by 40% with only slightly lower throughput
  • We have some real measurements for 2.5G links