Reducing Web Latency: The Virtue of Gentle Aggression
Tobias Flach, Nandita Dukkipati, Andreas Terzis, Barath Raghavan, Yuchung Cheng, Neal Cardwell, Ankur Jain, Shuai Hao, Ethan Katz-Bassett, and Ramesh Govindan
USC & Google
August 14, 2013

We can improve Google’s response time by 23%
Across billions of client requests, we improved the mean response time by 23%.
We achieved this by speeding up ONLY the 6% of transfers that experienced packet loss.
The improvement is in the tail: we halved latency at the 99th percentile.
For latency-sensitive services, faster transfers mean a better user experience.

Ways to Reduce Latency: The State of the Art
[Diagram label: High loss, high delay]

Ways to Reduce Latency: The State of the Art
[Diagram labels: "High loss, shorter delay" and "Low loss, multiplexed"]
Improve the proximity of services to the user.
Leverage multi-stage connections.

Evaluating TCP Performance
[Diagram labels: "High loss, shorter delay" and "Low loss, multiplexed"]
Analyzed billions of flows carrying Web traffic between Google and clients.

Transfers With Loss Are Too Slow
Loss makes Web latency 5 times slower.
Delays are caused by TCP loss detection and recovery.
6% of transfers between Google and clients are lossy.
[Delay graph]

Retransmission Timeouts Are Expensive
77% of losses are recovered by retransmission timeouts.
Retransmission timeouts can be 200 times larger than the RTT.
This is caused by high RTT variance or a lack of RTT samples.
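The gap between RTO and RTT is easier to picture with the standard timer computation in hand. The following is a minimal Python sketch of the smoothing rules from RFC 6298; it is not the measurement code behind this slide. The class name `RtoEstimator`, the 200 ms floor (a value commonly associated with Linux), and the omission of clock granularity are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RtoEstimator:
    """Illustrative RTO estimator using the RFC 6298 update rules."""

    min_rto: float = 0.2      # assumption: 200 ms floor; RFC 6298 itself suggests 1 s
    initial_rto: float = 1.0  # RFC 6298 value used before any RTT sample exists
    srtt: Optional[float] = None
    rttvar: Optional[float] = None

    def on_rtt_sample(self, rtt: float) -> None:
        """Fold one RTT measurement (in seconds) into the smoothed estimates."""
        if self.srtt is None:
            # First sample: SRTT = R, RTTVAR = R / 2
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # RTTVAR is updated first, using the old SRTT (beta = 1/4, alpha = 1/8).
            self.rttvar = 0.75 * self.rttvar + 0.25 * abs(self.srtt - rtt)
            self.srtt = 0.875 * self.srtt + 0.125 * rtt

    @property
    def rto(self) -> float:
        """Current retransmission timeout in seconds."""
        if self.srtt is None:
            return self.initial_rto  # lack of samples: fall back to the initial value
        return max(self.min_rto, self.srtt + 4 * self.rttvar)


def rto_after(samples, min_rto=0.0):
    """Convenience helper: RTO after feeding a sequence of RTT samples."""
    est = RtoEstimator(min_rto=min_rto)
    for s in samples:
        est.on_rtt_sample(s)
    return est.rto
```

In this sketch, a connection with no RTT sample uses the 1 s initial RTO, which is 200 times a 5 ms RTT. After a single 5 ms sample the assumed 200 ms floor still keeps the RTO at 40 times the RTT, and samples that alternate between low and high values inflate the variance term further.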

Tail Drops Are Expensive
A (single) tail packet drop is very common.
Tail packets are twice as likely to be dropped as packets early in a burst.
35% of lossy bursts observe only one packet loss.

Our Motivation and Goal
Loss significantly slows down transfers, due to frequent recovery via slow RTOs, which are caused by tail loss.
Our goal: approach the ideal of loss detection and recovery without delay, without making the protocol too aggressive.

Design Space
[Chart: connection phase (rows) against level of aggression (columns: decreasing slightly, increased slightly, increased greatly)]
Startup / short flows: IW 10.
Steady state: TCP Vegas, CUBIC, Relentless / Decongestion.
Loss recovery: Moderation.
Loss timeout: DDoS Defense by Offense.

Setting
[Diagram: Backend Server, Frontend Server, Public Network, Private Network]
Public network: controlling server only. Preference for solutions without client changes and with middlebox compatibility.
Private network: controlling client and server. Latency-sensitive traffic is a small portion of the traffic mix.

Setting
[Diagram: Backend Server, Frontend Server, Public Network, Private Network]
Reactive (public network): trigger fast retransmit by retransmitting the tail packet early.
Proactive (private network): avoid retransmissions through packet duplication.
Corrective: add redundancy to enable recovery without retransmission, or trigger fast retransmit.

Reactive
The receiver does not know about the loss and therefore cannot send signals back.
[Diagram: segments 1 - 3; wait time until RTO]

Reactive
Retransmit a new packet or the previous (tail) packet after two RTTs.
This can trigger a selective acknowledgement indicating loss.
Speeds up loss detection.
[Diagram: segments 1 - 3; wait for two RTTs; fast retransmit]
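A rough timing comparison shows why probing after two RTTs helps. This Python sketch is a simplification under stated assumptions, not the Linux implementation: the function name is hypothetical, the probe is assumed to fire exactly two RTTs after the tail transmission, and the acknowledgement it elicits is assumed to return one RTT later.

```python
def tail_loss_detection_delay(rtt: float, rto: float, use_probe: bool) -> float:
    """Seconds from the last transmission until the sender learns of a tail loss.

    Hypothetical helper for illustration only.
    """
    probe_timeout = 2 * rtt  # the slide's "wait for two RTTs"
    if use_probe and probe_timeout < rto:
        # The probe leaves at 2 RTT; the ACK (or SACK) it elicits arrives one RTT later.
        return probe_timeout + rtt
    # Without a probe the sender stays silent until the retransmission timer fires.
    return rto


if __name__ == "__main__":
    rtt, rto = 0.05, 1.0  # assumed example values: 50 ms RTT, 1 s RTO
    print("RTO only:  ", tail_loss_detection_delay(rtt, rto, use_probe=False))
    print("with probe:", tail_loss_detection_delay(rtt, rto, use_probe=True))
```

With the assumed 50 ms RTT and 1 s RTO, the sender learns of the loss after roughly three RTTs instead of a full second. The gain shrinks as the RTO approaches two RTTs, at which point the sketch simply falls back to the timer.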

Reactive: Detecting Masked Losses
We cannot ignore the case where a packet loss is recovered by the Reactive probe itself.
Count ACKs and reduce the congestion window if only one ACK for the tail packet is received.
[Diagram: segments 1 - 3; wait for two RTTs; segment 3 resent as probe]

Reactive: Detecting Masked Losses
[Diagram: two timelines, each sending segments 1 - 3 and resending segment 3 after waiting for two RTTs]
One ACK only: loss → reduce congestion window.
Two ACKs: no loss.
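The ACK-counting rule can be sketched in a few lines of Python. The function name is hypothetical, and halving the window is an assumption: the slide only says the congestion window is reduced, not by how much.

```python
def cwnd_after_probe(cwnd: int, acks_for_tail: int) -> int:
    """Return the congestion window (in segments) after a probe episode.

    The tail segment was sent twice: once originally and once as the probe.
    acks_for_tail counts the acknowledgements that cover the tail segment.
    """
    if acks_for_tail >= 2:
        # Both copies arrived, so nothing was lost: leave the window alone.
        return cwnd
    if acks_for_tail == 1:
        # Only one copy arrived: the probe masked a real loss.
        # Assumption: react as to an ordinary loss by halving the window.
        return max(1, cwnd // 2)
    # No ACK at all: neither copy arrived; the regular timeout path handles this.
    return cwnd


if __name__ == "__main__":
    print(cwnd_after_probe(10, acks_for_tail=2))  # no loss
    print(cwnd_after_probe(10, acks_for_tail=1))  # masked loss
```

Without this check, a loss repaired by the probe would go unnoticed by congestion control, and the sender would keep its window as if the path were loss-free.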

Proactive
Avoid almost all retransmissions through packet duplication.
[Diagram: segments 1 - 3; wait time until RTO]

Proactive
Avoid almost all retransmissions through packet duplication.
Duplicates are used if the original transmission was lost.
Avoids loss detection and recovery.
[Diagram: segments 1, 1 (DUP), 2, 2 (DUP), 3, 3 (DUP)]
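The duplication idea can be illustrated with a toy sender and receiver in Python. This is only a sketch: the function names, the `(sequence, copy, payload)` tuple layout, and the explicit set of dropped transmissions are all assumptions made for the example, and real TCP segments carry far more state.

```python
from typing import Dict, List, Set, Tuple

Transmission = Tuple[int, int, bytes]  # (sequence number, copy index, payload)


def proactive_send(segments: List[bytes]) -> List[Transmission]:
    """Emit every segment twice: the original (copy 0) and a duplicate (copy 1)."""
    out: List[Transmission] = []
    for seq, payload in enumerate(segments, start=1):
        out.append((seq, 0, payload))
        out.append((seq, 1, payload))
    return out


def deliver(sent: List[Transmission], dropped: Set[Tuple[int, int]]) -> List[Transmission]:
    """Simulate the network by removing the (sequence, copy) pairs listed in dropped."""
    return [t for t in sent if (t[0], t[1]) not in dropped]


def receive(arrived: List[Transmission]) -> Dict[int, bytes]:
    """Keep the first copy of each sequence number and ignore later duplicates."""
    received: Dict[int, bytes] = {}
    for seq, _copy, payload in arrived:
        received.setdefault(seq, payload)
    return received


if __name__ == "__main__":
    sent = proactive_send([b"one", b"two", b"three"])
    # The original tail segment is dropped, but its duplicate gets through.
    print(receive(deliver(sent, dropped={(3, 0)})))
```

A segment is missing at the receiver only when both of its copies are dropped, which is why no loss detection or recovery is needed in the common case. The price is that every byte is sent twice.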

A/B Experiment Setup
Frontend server: Default vs. Reactive.
Backend server: Default vs. Proactive.
Experimented in a production environment serving billions of queries (millions of queries are sampled).

Impact of Reactive and Proactive
15-day experiment, 2.6 million queries sampled:
Mean response time reduced by 23%.
99th percentile response time reduced by 47%.
Impact of Proactive: retransmission rates on the backend connection dropped from 0.99% to 0.09%.
Impact of Reactive: almost 50% of retransmission timeouts on the frontend connection are converted to fast retransmits.

Corrective: The Middle Way
Proactive avoids loss detection and recovery, but has 100% overhead.
Reactive speeds up loss detection, but still requires recovery.
Corrective sits between the two.

Corrective: Forward Error Correction in TCP
Add redundancy to enable recovery without retransmission.
[Diagram: segments 1 - 3; wait time until RTO]

Corrective: Forward Error Correction in TCP
Encodes previously transmitted segments in a few coded segments.
XOR coding can recover a single packet loss at the receiver.
Signaling of recovery status to the sender to enforce congestion control or fast retransmit.
Speeds up loss detection and recovery.
[Diagram: segments 1 - 3 followed by an ENCODED segment; no loss detection required]
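The XOR property behind single-loss recovery can be shown with a short Python sketch. It illustrates the coding idea only, not the Corrective wire format: the function names are hypothetical, segments are assumed to be of equal length, and segment indices start at 0.

```python
from typing import Dict, List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(segments: List[bytes]) -> bytes:
    """Build one coded segment as the XOR of all data segments."""
    if len({len(s) for s in segments}) != 1:
        raise ValueError("this sketch assumes at least one segment, all of equal length")
    coded = bytes(len(segments[0]))  # all-zero start value
    for s in segments:
        coded = xor_bytes(coded, s)
    return coded


def recover(received: Dict[int, bytes], coded: bytes, count: int) -> Optional[Dict[int, bytes]]:
    """Rebuild the full set of count segments, or return None if more than one is missing."""
    missing = [i for i in range(count) if i not in received]
    if not missing:
        return dict(received)
    if len(missing) > 1:
        return None  # XOR can repair one loss only; fall back to retransmission
    repaired = coded
    for s in received.values():
        repaired = xor_bytes(repaired, s)  # XOR-ing out the survivors leaves the lost one
    result = dict(received)
    result[missing[0]] = repaired
    return result


if __name__ == "__main__":
    segs = [b"aaaa", b"bbbb", b"cccc"]
    coded = encode(segs)
    # Segment 1 is lost in transit; the receiver rebuilds it from the rest.
    print(recover({0: segs[0], 2: segs[2]}, coded, count=3))
```

One coded segment protects a whole group of segments against a single loss. In the sketch, two or more losses in the same group return `None`, standing in for the fallback to ordinary loss recovery.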

Evaluation: Corrective
Synthetic workloads (fixed-size single queries).
Network emulator.
Web page downloads (complex multi-resource queries).

Loading nytimes.com with Corrective
Tail latency reduced by more than 20%.
But: performance is slightly worse on loss-free connections.

Dealing with Middleboxes
Protocol changes need to account for middlebox interference.
We designed our modules for middlebox compatibility or graceful fallback to standard TCP.

Dealing with Middleboxes
Problem: unknown option in data packet is stripped. Response: require option in all packets.
Problem: ACK number is rewritten for unseen sequences. Response: resend lost segment to update middlebox state.
Problem: modified retransmission payload is rejected. Response: detect tampering through checksum.

Conclusion
We conducted a measurement study analyzing billions of flows in Google’s production environment.
Analysis of loss patterns motivated three designs to improve latency: Reactive, Proactive, and Corrective.
Reactive and Proactive improved Google’s mean response time by 23%.
Reactive and Corrective are IETF Internet Drafts.
Reactive is implemented and enabled by default in Linux 3.10.

Additional Slides

Why aren’t you just using a more aggressive RTO value?
Delays can be the result of delayed ACKs.
A more aggressive RTO increases the risk of spurious retransmissions.
This severely impacts TCP performance due to a potentially larger number of unnecessary retransmissions and reductions of the congestion window.

Why are you doing Corrective on the Transport Layer?
Application layer:
Applications can selectively protect important data parts.
A reliable transport protocol would recover redundant data.
Does not know which packets are prone to loss.
Transport layer:
Has the necessary data to configure and tune Corrective (e.g. packets with higher loss probability, congestion window size, loss rate, RTT).
Additional protocol complexity.

Design Space
[Chart: connection phase (rows) against level of aggression (columns: decreasing slightly, increased slightly, increased greatly), with the three designs placed]
Startup / short flows: IW 10, Proactive.
Steady state: TCP Vegas, CUBIC, Relentless / Decongestion.
Recovery: Moderation, Corrective.
Timeout: DDoS Defense by Offense, Reactive.
