Tales of the Tail: Hardware, OS, and Application-level Sources of Tail Latency (PowerPoint PPT presentation)

SLIDE 1
Tales of the Tail: Hardware, OS, and Application-level Sources of Tail Latency

Jialin Li, Naveen Kr. Sharma, Dan R. K. Ports and Steven D. Gribble February 2, 2015

SLIDES 2–6

Introduction: What is Tail Latency?

[Figure: distribution of request latency; x-axis: request processing time, y-axis: fraction of requests.]

In Facebook’s Memcached deployment, median latency is 100 µs, but 95th-percentile latency is ≥ 1 ms.

In this talk, we explore why some requests take longer than expected and what causes them to get delayed.

SLIDES 7–8

Why is the Tail important?

Low latency is crucial for interactive services. A 500 ms delay can cause a 20% drop in user traffic [Google study], and latency is directly tied to traffic, hence revenue.

What makes this challenging is today’s datacenter workloads: interactive services are highly parallel, and a single client request spawns thousands of sub-tasks.

Overall latency depends on the slowest sub-task. With a bad tail, the probability that at least one sub-task gets delayed is high.
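The fan-out effect above can be made concrete with a little arithmetic (an illustrative sketch; the 1% and fan-out of 100 are assumed numbers, not measurements from the talk):

```python
# If each sub-task independently exceeds its per-task 99th-percentile
# latency with probability 0.01, a request that fans out to N sub-tasks
# and must wait for all of them is slow whenever ANY sub-task is slow.
def p_request_slow(p_task_slow: float, fanout: int) -> float:
    """Probability that at least one of `fanout` sub-tasks is slow."""
    return 1.0 - (1.0 - p_task_slow) ** fanout

print(p_request_slow(0.01, 1))    # single sub-task: ~1% of requests slow
print(p_request_slow(0.01, 100))  # 100 sub-tasks: ~63% of requests slow
```

So at high fan-out, per-task 99th-percentile behavior becomes the common case, which is why the tail matters.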

SLIDES 9–10

A real-life example

Nishtala et al., Scaling Memcache at Facebook, NSDI 2013.

All requests have to finish within the SLA latency.

SLIDES 11–12

What can we do?

People in industry have worked hard on solutions:

Hedged requests [Dean et al.]: sometimes effective, but adds application-specific complexity.

Intelligently avoid slow machines: keep track of server status and route requests around slow nodes.

These attempt to build a predictable response out of less predictable parts; we still don’t know what causes requests to get delayed.
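For reference, the hedged-request idea can be sketched in a few lines (my sketch of the published technique, not code from the talk; the two-replica setup and the `do_request` callable are assumptions):

```python
import concurrent.futures as cf

def hedged_request(replicas, do_request, hedge_delay_s):
    """Send to one replica; if no reply within hedge_delay_s (typically
    the observed 95th-percentile latency), duplicate the request to a
    second replica and take whichever answer arrives first."""
    with cf.ThreadPoolExecutor(max_workers=2) as pool:
        first = pool.submit(do_request, replicas[0])
        done, _ = cf.wait([first], timeout=hedge_delay_s)
        if done:
            return first.result()
        second = pool.submit(do_request, replicas[1])  # hedge
        done, _ = cf.wait([first, second], return_when=cf.FIRST_COMPLETED)
        return done.pop().result()
```

Note the application-specific complexity the slide warns about: requests must be idempotent, and the hedge delay must track the observed latency distribution.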

SLIDE 13

Our Approach

1. Pick some real-life applications: RPC server, Memcached, Nginx.
2. Generate the ideal latency distribution.
3. Measure the actual distribution on a standard Linux server.
4. Identify a factor causing deviation from the ideal distribution.
5. Explain and mitigate it.
6. Iterate until we reach the ideal distribution.

SLIDE 14

Rest of the Talk

1. Introduction
2. Predicted Latency from Queuing Models
3. Measurements: Sources of Tail Latencies
4. Summary

SLIDES 15–22

Predicted Latency from Queuing Models: Ideal latency distribution

What is the ideal latency for a network server? We need an ideal baseline for comparing measured performance, so we assume a simple model and apply queuing theory.

[Diagram: clients send requests into a single queue drained by the server.]

Given the arrival distribution and the request processing time, we can predict the time a request spends in the server.

SLIDES 23–27

Predicted Latency from Queuing Models: Tail latency characteristics

[Plot: CCDF P[X ≥ x] vs. latency in microseconds, log-log scale, for two example distributions. Reading the tail off Distribution 1: 99th percentile ⇒ 60 µs; 99.9th percentile ⇒ 200 µs.]
9

Predicted Latency from Queuing Models Tail latency characteristics

What is the ideal latency distribution?

Assume a server with single worker with 50 µs fixed processing time.

10-4 10-3 10-2 10-1 100 101 102 103 104 CCDF P[X >= x] Latency in micro-seconds

Dummy

slide-29
SLIDE 29

9

Predicted Latency from Queuing Models Tail latency characteristics

What is the ideal latency distribution?

Assume a server with single worker with 50 µs fixed processing time.

10-4 10-3 10-2 10-1 100 101 102 103 104 CCDF P[X >= x] Latency in micro-seconds Uniform Request Arrival

Dummy

slide-30
SLIDE 30

9

Predicted Latency from Queuing Models Tail latency characteristics

What is the ideal latency distribution?

Assume a server with single worker with 50 µs fixed processing time.

10-4 10-3 10-2 10-1 100 101 102 103 104 CCDF P[X >= x] Latency in micro-seconds Uniform Request Arrival Poisson at 70% Utilization

Inherent tail latency due to request burstiness.

slide-31
SLIDE 31

9

Predicted Latency from Queuing Models Tail latency characteristics

What is the ideal latency distribution?

Assume a server with single worker with 50 µs fixed processing time.

10-4 10-3 10-2 10-1 100 101 102 103 104 CCDF P[X >= x] Latency in micro-seconds Uniform Request Arrival Poisson at 70% Utilization Poisson at 90% Utilization

Tail latency depends on the average server utilization.

slide-32
SLIDE 32

9

Predicted Latency from Queuing Models Tail latency characteristics

What is the ideal latency distribution?

Assume a server with single worker with 50 µs fixed processing time.

10-4 10-3 10-2 10-1 100 101 102 103 104 CCDF P[X >= x] Latency in micro-seconds Uniform Request Arrival Poisson at 70% Utilization Poisson at 90% Utilization Poisson at 70% - 4 workers

Additional workers can reduce tail latency, even at constant utilization.
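These three observations are easy to reproduce with a small discrete-event simulation of the slide's model (a sketch under the stated assumptions of Poisson arrivals, fixed 50 µs service time, and FIFO service; not the authors' code):

```python
import random

def p99_latency(utilization, workers, service_us=50.0, n=100_000, seed=1):
    """99th-percentile time-in-system for an M/D/c FIFO queue."""
    rng = random.Random(seed)
    rate = utilization * workers / service_us  # arrivals per microsecond
    t = 0.0
    free_at = [0.0] * workers                  # when each worker frees up
    latencies = []
    for _ in range(n):
        t += rng.expovariate(rate)             # Poisson arrival process
        w = min(range(workers), key=free_at.__getitem__)
        start = max(t, free_at[w])             # FIFO: wait if all busy
        free_at[w] = start + service_us
        latencies.append(free_at[w] - t)       # queueing delay + service
    latencies.sort()
    return latencies[int(0.99 * n)]

print(p99_latency(0.70, 1))  # burstiness alone gives a tail well above 50 us
print(p99_latency(0.90, 1))  # higher utilization: longer tail
print(p99_latency(0.70, 4))  # more workers, same utilization: shorter tail
```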

SLIDE 33

Outline

1. Introduction
2. Predicted Latency from Queuing Models
3. Measurements: Sources of Tail Latencies
4. Summary

SLIDE 34

Testbed

A cluster of standard datacenter machines: 2 × Intel L5640 6-core CPUs, 24 GB of DRAM, a Mellanox 10 Gbps NIC, Ubuntu 12.04, Linux kernel 3.2.0.

All servers are connected to a single 10 Gbps ToR switch. One server runs Memcached; the others run workload-generating clients. Results for the other applications are in the paper.

SLIDE 35

Timestamping Methodology

Append a blank buffer of ≈32 bytes to each request, and overwrite the buffer with timestamps as the request moves through the server.

Timestamp points (incoming): server NIC; after TCP/UDP processing; Memcached thread scheduled on CPU; read() return.
Timestamp points (outgoing): Memcached write(); server NIC.

Very low overhead, and no server-side logging.
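The carried-timestamp idea can be sketched as follows (an illustrative sketch; the slot layout and helper names are my assumptions, not the paper's tool):

```python
import struct, time

SLOTS = 6                    # one 8-byte slot per instrumentation point
BLANK = b"\x00" * 8 * SLOTS

def make_request(payload: bytes) -> bytearray:
    """Append the blank timestamp buffer to the request."""
    return bytearray(payload + BLANK)

def stamp(req: bytearray, slot: int, payload_len: int) -> None:
    """Overwrite one slot in place as the request passes a point."""
    struct.pack_into("<Q", req, payload_len + 8 * slot, time.monotonic_ns())

def timeline(req: bytearray, payload_len: int) -> list:
    """Recover the per-point timestamps from the request itself."""
    return [struct.unpack_from("<Q", req, payload_len + 8 * i)[0]
            for i in range(SLOTS)]
```

Because each request carries its own timeline, no server-side logging is needed: the client reads the stamps back out of the response.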

SLIDES 36–39

How far are we from the ideal?

[Plot: CCDF P[X ≥ x] vs. latency in microseconds, ideal model vs. standard Linux; at the tail, standard Linux is roughly 30× worse than the ideal.]

Single CPU, single core, Memcached running at 80% utilization.

SLIDES 40–41

Rest of the talk

Source of Tail Latency   | Potential way to fix
Background Processes     | —
Multicore Concurrency    | —
Interrupt Processing     | —

SLIDES 42–43

How can background processes affect tail latency?
Memcached threads time-share a CPU core with other processes, so we must wait for those processes to relinquish the CPU, and scheduling time slices are usually a couple of milliseconds.

How can we mitigate it?
Raise priority (decrease niceness) ⇒ more CPU time.
Upgrade the scheduling class to real-time ⇒ pre-emptive power.
Run on a dedicated core ⇒ no interference whatsoever.
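On Linux, these three mitigations map onto standard process-control calls; a minimal sketch via Python's os module (raising priority and switching to a realtime class require root, so this falls back silently without privileges):

```python
import os

def apply_mitigations(core: int) -> None:
    # 1. Raise priority (decrease niceness) => more CPU time.
    try:
        os.nice(-20)
    except PermissionError:
        pass  # needs root / CAP_SYS_NICE
    # 2. Real-time scheduling class => pre-empts normal tasks.
    try:
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(50))
    except PermissionError:
        pass  # realtime classes are privileged too
    # 3. Pin to a dedicated core => no time-sharing with other work
    #    (pair with isolcpus/cpusets so nothing else runs there).
    os.sched_setaffinity(0, {core})
```

The equivalent shell tools are nice/renice, chrt, and taskset.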

SLIDES 44–48

Impact of Background Processes

[Plot: CCDF P[X ≥ x] vs. latency in microseconds, ideal model vs. standard Linux (≈30× ideal), maximum priority (≈10×), realtime scheduling (≈4×), and dedicated core (≈3×).]

Interference from background processes has a large effect on the tail.

Single CPU, single core, Memcached running at 80% utilization.

SLIDES 49–50

Source of Tail Latency   | Potential way to fix
Background Processes     | Isolate by running on a dedicated core.
Multicore Concurrency    | —
Interrupt Processing     | —

SLIDES 51–55

Does adding more CPU cores improve tail latency?

[Plot: CCDF P[X ≥ x] vs. latency in microseconds, for the 1-core ideal model, 4-core ideal model, 1-core Linux (≈3× its ideal), and 4-core Linux (≈15× its ideal).]

Single CPU, 4 cores, Memcached running at 80% utilization.
slide-56
SLIDE 56

20

Measurements: Sources of Tail Latencies

Does adding more CPU cores improve tail latency?

Yes it does! Provided we maintain a single queue abstraction.

slide-57
SLIDE 57

20

Measurements: Sources of Tail Latencies

Does adding more CPU cores improve tail latency?

Yes it does! Provided we maintain a single queue abstraction.

Server Clients Ideal Model

slide-58
SLIDE 58

20

Measurements: Sources of Tail Latencies

Does adding more CPU cores improve tail latency?

Yes it does! Provided we maintain a single queue abstraction. Memcached partitions requests statically among threads.

Server Clients Ideal Model Server Clients Memcached Architecture

slide-59
SLIDE 59

20

Measurements: Sources of Tail Latencies

Does adding more CPU cores improve tail latency?

Yes it does! Provided we maintain a single queue abstraction. Memcached partitions requests statically among threads.

Server Clients Ideal Model Server Clients Memcached Architecture

How can we mitigate it? Modify Memcached’s concurrency model to use a single queue.
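The gap between the two designs shows up directly in simulation (a sketch, not the authors' code; static partitioning is modeled as each request hashing to a fixed worker, drawn uniformly at random, with the same Poisson/50 µs workload as the queuing-model section):

```python
import random

def p99_latency(partitioned, workers=4, utilization=0.8, service_us=50.0,
                n=100_000, seed=7):
    """99th-percentile latency: shared FIFO vs. statically partitioned."""
    rng = random.Random(seed)
    rate = utilization * workers / service_us
    t = 0.0
    free_at = [0.0] * workers
    latencies = []
    for _ in range(n):
        t += rng.expovariate(rate)      # Poisson arrivals
        if partitioned:
            w = rng.randrange(workers)  # fixed per-request worker
        else:
            w = min(range(workers), key=free_at.__getitem__)  # shared queue
        start = max(t, free_at[w])
        free_at[w] = start + service_us
        latencies.append(free_at[w] - t)
    latencies.sort()
    return latencies[int(0.99 * n)]

print(p99_latency(partitioned=True))   # per-thread queues: longer tail
print(p99_latency(partitioned=False))  # single shared queue: shorter tail
```

With partitioning, a burst aimed at one worker queues up even while other workers sit idle; the shared queue lets any idle worker absorb the burst.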

SLIDES 60–63

Impact of Multicore Concurrency Model

[Plot: CCDF P[X ≥ x] vs. latency in microseconds, for the 4-core ideal model, 1-core Linux, 4-core Linux (≈15× ideal), and 4-core Linux with a single queue (≈4× ideal).]

For multi-threaded applications, a single-queue abstraction can reduce tail latency.

Single CPU, 4 cores, Memcached running at 80% utilization.

SLIDES 64–65

Source of Tail Latency   | Potential way to fix
Background Processes     | Isolate by running on a dedicated core.
Concurrency Model        | Ensure a single-queue abstraction.
Interrupt Processing     | —

SLIDES 66–67

How can interrupts affect tail latency?
By default, Linux’s irqbalance spreads interrupts across all cores, so the OS pre-empts Memcached threads frequently, introducing extra context-switching overhead and cache pollution.

How can we mitigate it? Use separate cores for interrupt processing and application threads: 3 cores run Memcached threads, and 1 core processes interrupts.
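This split is normally configured through procfs IRQ affinity masks plus CPU pinning. A minimal sketch (the procfs paths are the standard Linux ones; writing them requires root, and which IRQs belong to the NIC is an assumption left to fill in):

```python
import glob, os

def cpu_mask(cores) -> str:
    """Hex affinity mask in the format /proc/irq/*/smp_affinity expects."""
    return format(sum(1 << c for c in cores), "x")

def steer_irqs_to(core: int) -> None:
    """Point every IRQ at one dedicated interrupt-processing core."""
    for path in glob.glob("/proc/irq/*/smp_affinity"):
        try:
            with open(path, "w") as f:
                f.write(cpu_mask({core}))
        except OSError:
            pass  # needs root; some IRQs cannot be re-steered

def pin_application(cores) -> None:
    """Keep the application's threads on the remaining cores."""
    os.sched_setaffinity(0, set(cores))

# e.g. core 0 for interrupts, cores 1-3 for Memcached threads:
# steer_irqs_to(0); pin_application({1, 2, 3})
```

Running irqbalance must also be disabled, or it will re-spread the interrupts.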

SLIDES 68–70

Impact of Interrupt Processing

[Plot: CCDF P[X ≥ x] vs. latency in microseconds, for the 4-core ideal model, 4-core Linux with interrupts spread (≈4× ideal), and 4-core Linux with a separate interrupt core.]

Separate cores for interrupt and application processing improve tail latency.

Single CPU, 4 cores, Memcached running at 80% utilization.

SLIDE 71

Other sources of tail latency

Source of Tail Latency   | Underlying Cause
Thread Scheduling Policy | Non-FIFO ordering of requests.
NUMA Effects             | Increased latency across NUMA nodes.
Hyper-threading          | Contending hyper-threads can increase latency.
Power Saving Features    | Extra time required to wake the CPU from an idle state.

SLIDES 72–73

Summary and Future Work

We explored hardware, OS, and application-level sources of tail latency, pinpointing each source using fine-grained timestamping and an ideal model. We obtained substantial improvements, close to the ideal distributions: Memcached’s 99.9th-percentile latency fell from 5 ms to 32 µs.

Future work: sources of tail latency in multi-process environments; how virtualization affects tail latency (overhead of virtualization, interference from other VMs); and new effects when moving to a distributed setting, such as network effects.