Sliding Window Protocol and TCP Congestion Control
Simon S. Lam, Department of Computer Science, The University of Texas at Austin
1 TCP Congestion Control (Simon S. Lam)
Reliable data transfer
important in application, transport, and link layers
characteristics of the unreliable channel determine the complexity of the reliable data transfer protocol (rdt)
delivers a subsequence in FIFO order
example: delivery service provided by a
example: delivery service provided by IP or by
[Figure: sliding window protocol. Source (P1 Sender): packets 1, 2, ..., a–1 are acknowledged; packets a, ..., s–1 are unacknowledged and lie in the send window. Sink (P2 Receiver): packets before r are delivered; r is next expected; the receive window ends at r + RW – 1.]
SW send window size (s - a ≤ SW)
RW receive window size
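The sender-side constraint s - a ≤ SW can be sketched in a few lines of Python. This is a minimal illustration, not the slides' own code: the class name `SlidingWindowSender` and its methods are hypothetical, and it assumes cumulative ACKs that carry the receiver's next expected sequence number.

```python
class SlidingWindowSender:
    """Tracks a (lowest unacknowledged) and s (next to send). Names are illustrative."""

    def __init__(self, sw):
        self.sw = sw  # SW: send window size
        self.a = 1    # lowest unacknowledged sequence number
        self.s = 1    # next sequence number to send

    def can_send(self):
        # Sending is allowed only while s - a < SW, so that
        # the invariant s - a <= SW still holds after the send.
        return self.s - self.a < self.sw

    def send(self):
        if not self.can_send():
            raise RuntimeError("send window is full")
        seq = self.s
        self.s += 1
        return seq

    def on_ack(self, next_expected):
        # Assumed cumulative ACK: everything below next_expected is acknowledged.
        if self.a < next_expected <= self.s:
            self.a = next_expected


sender = SlidingWindowSender(sw=3)
sent = []
while sender.can_send():
    sent.append(sender.send())
print(sent)                                   # [1, 2, 3]
sender.on_ack(3)                              # packets 1 and 2 acknowledged
print(sender.a, sender.s, sender.can_send())  # 3 4 True
```

Once the window is full the sender must wait; the ACK moves a forward and frees room for new packets.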
Receive window slides forward
[Figure: sender and receiver window diagram; labels include a, s, r, r+3, and r + RW – 1, with acknowledged, unacknowledged, delivered, and next expected regions.]
Send window slides forward
[Figure: sender and receiver window diagram; labels include a, s, r, and r + RW – 1, with acknowledged, delivered, and next expected regions.]
error control (reliable delivery)
in-order delivery
flow and congestion control (by varying send
selective nack
selective ack (TCP SACK)
bit-vector representing entire state of receive
SW send window size
RW receive window size
[Figure: sender and receiver window diagram. Source (P1 Sender) shows acknowledged and unacknowledged packets in the send window; Sink (P2 Receiver) shows delivered packets, next expected, and the receive window.]
SW = RW = 1
SW = 7, RW = 1
SW = RW
32-bit sequence numbers
counts bytes
assumes that datagrams will be discarded by IP
[Figure: timing diagram. The Source sends data packets 1, 2, ..., W in each round and the Destination returns ACKs.]
(assuming no loss, MSS denotes maximum segment size)
Average number in the send buffer is less than W unless packet arrival rate to send buffer is infinite
If a packet is lost in the network with probability p, then the average time in send buffer is (1
Since TO > RTT, actual throughput is smaller.
long delay before resending lost packet
Detect lost segments
Sender often sends many segments back-to-back
If segment is lost, there will likely be many duplicate ACKs.
fast retransmit: resending a segment after triple duplicate ACK without waiting for timeout
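The triple-duplicate-ACK rule can be sketched as a simple counter over the stream of ACK numbers. This is an illustrative sketch only: the function name `fast_retransmit_events` and the example ACK values are assumptions, and a real sender keeps this state per connection alongside its retransmission timer.

```python
DUP_ACK_THRESHOLD = 3  # "triple duplicate ACK"


def fast_retransmit_events(acks):
    """Return the ACK numbers at which a fast retransmit would be triggered."""
    retransmits = []
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            # Trigger exactly once, on the third duplicate of the same ACK.
            if dup_count == DUP_ACK_THRESHOLD:
                retransmits.append(ack)
        else:
            last_ack = ack
            dup_count = 0
    return retransmits


acks = [100, 200, 200, 200, 200, 600]
print(fast_retransmit_events(acks))  # [200]
```

Here the original ACK 200 is followed by three duplicates, so the segment starting at 200 is resent without waiting for the timeout.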
receiver: explicitly informs sender of (dynamically changing) amount of free buffer space
RcvWindow field in TCP segment header
sender: keeps amount of transmitted, unACKed data less than most recently received RcvWindow value
sender won’t overrun receiver’s buffers by transmitting too much, too fast
[Figure: buffer at receive side of a TCP connection]
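The sender-side flow control rule can be sketched as a small calculation. This is a hedged illustration: the function name `usable_window` and the byte counters `last_byte_sent` and `last_byte_acked` are assumed names, not taken from the slides.

```python
def usable_window(rcv_window, last_byte_sent, last_byte_acked):
    """Bytes the sender may still transmit without overrunning the receiver."""
    in_flight = last_byte_sent - last_byte_acked  # transmitted, unACKed data
    return max(0, rcv_window - in_flight)


# Receiver advertised 4000 bytes; 2000 bytes are currently unACKed.
print(usable_window(rcv_window=4000, last_byte_sent=3000, last_byte_acked=1000))  # 2000
```

When the result is 0 the sender stops sending new data until a later segment advertises more free buffer space.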
[Figure: Host A sends λin (original data) and λ'in (original data plus retransmitted data) through finite shared output link buffers; Host B receives λout.]
positive feedback instability
[Figure: goodput versus load (aggregate send rate)]
Avoid overloading receiver
rcvwindow: receiver’s advertised window (also rwnd)
Receiver sends rcvwindow to sender
Sender tries to avoid overloading network
It infers network congestion from “loss indications”
congwin: congestion window (also cwnd)
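The two limits combine into a single send window, W = min(rwnd, cwnd), as a later slide states. A minimal sketch follows; the function name `send_window` and the returned label are illustrative assumptions.

```python
def send_window(rwnd, cwnd):
    """Return the effective window and which limit is binding."""
    if rwnd <= cwnd:
        return rwnd, "receiver (flow control)"
    return cwnd, "network (congestion control)"


print(send_window(rwnd=65535, cwnd=8000))  # (8000, 'network (congestion control)')
print(send_window(rwnd=4000, cwnd=8000))   # (4000, 'receiver (flow control)')
```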
rate = CongWin / RTT bytes/sec
slow start
reduce to 1 segment after timeout event
AIMD (additive increase multiplicative decrease)
Note: For now consider RcvWindow to be very large such that the send window size is equal to CongWin.
Example: MSS = 500 bytes & RTT = 200 msec
initial rate = 2500 bytes/sec = 20 kbps
desirable to quickly ramp up to a higher rate
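The arithmetic in this example can be checked directly, assuming the initial window is one MSS sent per RTT. The variable names are illustrative.

```python
mss_bytes = 500
rtt_ms = 200

# One MSS per RTT: bytes per second, then bits per second expressed in kbps.
rate_bytes_per_sec = mss_bytes * 1000 / rtt_ms
rate_kbps = rate_bytes_per_sec * 8 / 1000

print(rate_bytes_per_sec)  # 2500.0
print(rate_kbps)           # 20.0
```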
double CongWin every RTT
done by incrementing CongWin by 1 MSS for every ACK received
Summary: initial rate
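The doubling follows from the per-ACK rule: every segment sent in one RTT produces one ACK, and every ACK adds one MSS. The sketch below assumes no loss, no delayed ACKs, and a window that is always a whole number of segments; the function name `slow_start` is hypothetical.

```python
def slow_start(mss, rounds):
    """Return CongWin (bytes) at the start of each RTT, beginning at 1 MSS."""
    cong_win = mss
    history = [cong_win]
    for _ in range(rounds):
        acks = cong_win // mss   # one ACK per segment sent in this RTT
        cong_win += acks * mss   # CongWin grows by 1 MSS per ACK
        history.append(cong_win)
    return history


print(slow_start(mss=500, rounds=4))  # [500, 1000, 2000, 4000, 8000]
```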
Q: If no loss, when should the exponential increase switch to linear?
A: When CongWin gets to current value of threshold
For initial slow start, threshold is set to a large value (e.g., 64 Kbytes)
Subsequently, threshold is variable
At a loss event, threshold is set to 1/2 of CongWin just before loss event
[Figure: congestion window size (segments), y-axis 2 to 14, plotted over x-axis values 1 to 15 for TCP Tahoe and TCP Reno, with the threshold and a 3 dup ACKs event marked.]
Note: For simplicity, CongWin is in number
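The threshold rules can be tried out in a round-by-round sketch that contrasts Tahoe and Reno after 3 dup ACKs. Everything specific here is an assumption made for illustration: the function name `simulate`, an initial threshold of 8 segments, a single loss detected in round 8, and a window counted in whole segments.

```python
def simulate(variant, rounds, loss_rounds, threshold=8):
    """Return CongWin (in segments) at each round for 'tahoe' or 'reno'."""
    cong_win = 1
    trace = []
    for rnd in range(1, rounds + 1):
        trace.append(cong_win)
        if rnd in loss_rounds:
            # Loss event (3 dup ACKs): threshold becomes half of CongWin.
            threshold = max(cong_win // 2, 1)
            # Reno continues from the threshold; Tahoe restarts from 1 segment.
            cong_win = threshold if variant == "reno" else 1
        elif cong_win < threshold:
            cong_win = min(cong_win * 2, threshold)  # exponential phase
        else:
            cong_win += 1                            # linear phase
    return trace


print(simulate("tahoe", 15, {8}))
# [1, 2, 4, 8, 9, 10, 11, 12, 1, 2, 4, 6, 7, 8, 9]
print(simulate("reno", 15, {8}))
# [1, 2, 4, 8, 9, 10, 11, 12, 6, 7, 8, 9, 10, 11, 12]
```

Both variants set the threshold to 6 after the loss at CongWin = 12; they differ only in where the window restarts.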
After 3 dup ACKs:
CongWin is cut in half
window then grows linearly (additive increase)
But after timeout event:
CongWin is set to 1 MSS instead;
window then grows exponentially to threshold, then grows linearly
3 dup ACKs indicate
timeout occurring
Additive Increase Multiplicative Decrease (AIMD)
[Figure: CongWin over time, marking initial slow start, threshold reached during slow start, 3 dupACKs, and Timeout.]
In this example, 3 dupACKs during slow start before reaching initial threshold
[Figure: timing diagram between sender S and receiver R. S sends packets 1 to 8 and packet 1 is marked as a loss; R returns repeated ACKs labeled 1; S retransmits packet 1 and sends packets 9, 10, and 11; Exit FR/FR is marked.]
Retransmit packet 1 upon 3 dupACKs
Inflate cwnd with #dupACKs such that new packets 9, 10, and 11 can be sent while repairing loss
Set ssthresh ← max(flightsize/2, 2)
Retransmit lost packet
Set cwnd ← ssthresh + #dupACKs (window inflation)
Wait till W=min(rwnd, cwnd) is large enough; transmit new packet(s)
On non-dup ACK (1 RTT later), set cwnd ← ssthresh (window deflation)
Enter Congestion Avoidance
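The window inflation step can be sketched numerically. This is an illustration under stated assumptions: windows are counted in whole packets, flightsize/2 uses integer division, the packets outstanding when loss was detected still count against the window, and the function name `fast_recovery_window` is hypothetical.

```python
def fast_recovery_window(flight_size, dup_acks, rwnd):
    """Return (ssthresh, inflated cwnd, new packets allowed) during fast recovery."""
    ssthresh = max(flight_size // 2, 2)
    cwnd = ssthresh + dup_acks                  # window inflation
    window = min(rwnd, cwnd)                    # W = min(rwnd, cwnd)
    new_packets = max(0, window - flight_size)  # room beyond what is outstanding
    return ssthresh, cwnd, new_packets


# Eight packets in flight, the first one lost, rwnd assumed large.
print(fast_recovery_window(flight_size=8, dup_acks=3, rwnd=64))  # (4, 7, 0)
print(fast_recovery_window(flight_size=8, dup_acks=7, rwnd=64))  # (4, 11, 3)
```

With seven dup ACKs the inflated window admits three new packets, which is consistent with packets 9, 10, and 11 being sent while the loss is repaired. On the first non-dup ACK, cwnd is set back to ssthresh (4 here).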
loss event or exceeding threshold).
When CongWin is above Threshold, sender is in
Exponential backoff up to max timeout value equal to 64 times initial timeout value (There are other variations.)
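A capped backoff schedule can be sketched as follows. The doubling factor is an assumption (the slide only says "exponential"), and the function name `backoff_timeouts` is hypothetical; the cap of 64 times the initial timeout is taken from the text.

```python
def backoff_timeouts(initial_timeout, retries, max_factor=64):
    """Return the timeout used for each successive retransmission attempt."""
    timeouts = []
    factor = 1
    for _ in range(retries):
        timeouts.append(initial_timeout * factor)
        factor = min(factor * 2, max_factor)  # assumed doubling, capped at 64x
    return timeouts


print(backoff_timeouts(1.0, 9))
# [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0, 64.0, 64.0]
```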
[Figure: congestion window (8, 16, and 24 Kbytes marked) versus time for a long-lived TCP connection]
Avoidance Algorithm,” ACM Computer Communication Review, 27(3), 1997.
[Figure: axis labels "Ave." (truncated) and "# of RTTs"]
[Equations not recoverable from extraction: sums over i from 1 to ∞]
Event: Arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed
Action: Delayed ACK. Wait up to 500ms for next segment. If no next segment, send ACK
Event: Arrival of in-order segment with expected seq #. One other segment has ACK pending
Action: Immediately send single cumulative ACK, ACKing both in-order segments
Event: Arrival of out-of-order segment with higher-than-expected seq. #. Gap detected
Action: Immediately send duplicate ACK, indicating seq. # of next expected byte
Event: Arrival of segment that partially or completely fills gap
Action: Immediately send ACK, provided that segment starts at lower end of gap
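The four receiver cases can be sketched as a decision function. This is a simplified illustration, not a TCP implementation: the function name `ack_action`, its parameters, and the returned strings are assumptions, and segments below the expected sequence number are outside the cases listed.

```python
def ack_action(seq, expected, ack_pending, gap_exists):
    """Pick the receiver's ACK action for an arriving segment.

    seq: sequence number of the arriving segment
    expected: next expected sequence number
    ack_pending: True if an earlier in-order segment still awaits its delayed ACK
    gap_exists: True if out-of-order data is buffered above `expected`
    """
    if seq > expected:
        return "send duplicate ACK immediately"
    if seq == expected and gap_exists:
        # Segment starts at the lower end of the gap.
        return "send ACK immediately"
    if seq == expected and ack_pending:
        return "send cumulative ACK immediately"
    if seq == expected:
        return "delay ACK up to 500 ms"
    return "not covered by the cases above"


print(ack_action(seq=100, expected=100, ack_pending=False, gap_exists=False))
# delay ACK up to 500 ms
print(ack_action(seq=300, expected=100, ack_pending=False, gap_exists=False))
# send duplicate ACK immediately
```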
E.g., data center networks (local, global)
Reference:
Throughput: A Simple Model and its Empirical Validation,” Proceedings ACM SIGCOMM, 1998.
No slow start
triple dup acks
b = 1 (no delayed ack)
no triple duplicate Acks
packet loss (timeout) with probability p
timeout interval fixed at T0 after each loss
First success in next cycle