
Tales of the Tail: Hardware, OS, and Application-level Sources of Tail Latency



  1. Tales of the Tail: Hardware, OS, and Application-level Sources of Tail Latency. Jialin Li, Naveen Kr. Sharma, Dan R. K. Ports, and Steven D. Gribble. February 2, 2015.

  6. Introduction: What is tail latency? [Figure: distribution of request processing times; axes: request processing time and fraction of requests.] In Facebook's Memcached deployment, the median latency is 100 µs, but the 95th-percentile latency is ≥ 1 ms. This talk explores why some requests take longer than expected and what causes them to be delayed.

  8. Why is the tail important? Low latency is crucial for interactive services: a 500 ms delay can cause a 20% drop in user traffic [Google study], and latency is directly tied to traffic, hence revenue. Today's datacenter workloads make this challenging: interactive services are highly parallel, a single client request spawns thousands of sub-tasks, and overall latency depends on the slowest sub-task. With a bad tail, the probability that at least one sub-task is delayed is high.
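The fan-out argument can be made concrete with a short calculation. This is a minimal sketch, assuming sub-task latencies are independent and that each sub-task falls in the slow tail with the same probability `p_slow`; both the independence assumption and the function name are illustrative, not taken from the talk.

```python
def prob_any_slow(num_subtasks: int, p_slow: float) -> float:
    """Probability that at least one of num_subtasks independent sub-tasks is slow."""
    # All sub-tasks are fast with probability (1 - p_slow) ** num_subtasks.
    return 1.0 - (1.0 - p_slow) ** num_subtasks


if __name__ == "__main__":
    # Each sub-task has a 1% chance of landing beyond the 99th percentile.
    for n in (1, 10, 100, 1000):
        print(f"{n:5d} sub-tasks -> P(any slow) = {prob_any_slow(n, 0.01):.4f}")
```

Under these assumptions, a request that fans out to 100 sub-tasks sees at least one 99th-percentile-slow sub-task roughly 63% of the time, which is why the tail rather than the median dominates end-to-end latency.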

  10. A real-life example: all requests have to finish within the SLA latency. Source: Nishtala et al., "Scaling Memcache at Facebook", NSDI 2013.

  12. What can we do? People in industry have worked hard on solutions. Hedged requests [Jeff Dean et al.]: effective sometimes, but they add application-specific complexity. Intelligently avoiding slow machines: keep track of server status and route requests around slow nodes. Both attempt to build a predictable response out of less predictable parts; we still don't know what is causing requests to be delayed.
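For readers unfamiliar with hedged requests, the idea can be sketched in a few lines. This is a hypothetical illustration and not the implementation from the cited work: `hedged_request`, `fake_send`, the replica names, and the delays are all made up for the example, and the backends are simulated with `asyncio.sleep`.

```python
import asyncio


async def hedged_request(send, replicas, hedge_after_s):
    """Return the first reply, hedging to a second replica if the first is slow.

    `send` is a hypothetical coroutine function that takes a replica name.
    """
    tasks = [asyncio.create_task(send(replicas[0]))]
    # Give the primary replica a head start of hedge_after_s seconds.
    done, _ = await asyncio.wait(tasks, timeout=hedge_after_s)
    if not done:
        # Primary is slow: issue the same request to a backup replica.
        tasks.append(asyncio.create_task(send(replicas[1])))
        done, _ = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in tasks:
        if task not in done:
            task.cancel()  # drop the request that lost the race
    return next(iter(done)).result()


# Simulated backends for the demo (names and delays are invented).
DELAYS_S = {"slow-replica": 0.20, "fast-replica": 0.01}


async def fake_send(replica):
    await asyncio.sleep(DELAYS_S[replica])
    return f"reply from {replica}"


if __name__ == "__main__":
    order = ["slow-replica", "fast-replica"]
    print(asyncio.run(hedged_request(fake_send, order, hedge_after_s=0.05)))
```

The sketch also shows where the application-specific complexity comes from: the caller has to pick a hedging delay, the request must be safe to execute twice, and the losing request has to be cancelled or ignored.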

  13. Our approach: (1) Pick some real-life applications: an RPC server, Memcached, and Nginx. (2) Generate the ideal latency distribution. (3) Measure the actual distribution on a standard Linux server. (4) Identify a factor causing deviation from the ideal distribution. (5) Explain and mitigate it. (6) Iterate until we reach the ideal distribution.

  14. Rest of the talk: (1) Introduction, (2) Predicted Latency from Queuing Models, (3) Measurements: Sources of Tail Latencies, (4) Summary.

  22. Predicted Latency from Queuing Models: ideal latency distribution. What is the ideal latency for a network server? We need an ideal baseline for comparing measured performance, so we assume a simple model and apply queuing theory. [Diagram: clients sending requests to a server.] Given the arrival distribution and the request processing time, we can predict the time a request spends in the server.
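As one concrete instance of such a prediction, the mean time in the server has a closed form in the simplest case. The sketch below assumes Poisson arrivals at rate $\lambda$, a fixed service time $s$, and a single FIFO worker (an M/D/1 queue). The symbols and numeric values are chosen for illustration; the formula is the standard Pollaczek-Khinchine mean, which gives only the average and not the full distribution.

```latex
\rho = \lambda s, \qquad
W_q = \frac{\rho\, s}{2\,(1 - \rho)}, \qquad
T = s + W_q

% Illustrative values: s = 50\,\mu\text{s},\ \rho = 0.7
W_q = \frac{0.7 \times 50\,\mu\text{s}}{2 \times 0.3} \approx 58.3\,\mu\text{s},
\qquad T \approx 108.3\,\mu\text{s}
```

Here $\rho$ is the utilization, $W_q$ the mean time spent waiting in the queue, and $T$ the mean time in the server. Because $W_q$ grows as $1/(1-\rho)$, queuing delay rises sharply as utilization approaches 1, even with perfectly regular service times.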

  27. Tail latency characteristics. [Figure: CCDF, P[X >= x], from 10^0 down to 10^-4, versus latency in microseconds, from 10^1 to 10^4, both on log scales; curves: Distribution 1 and Distribution 2.] Reading the plot for Distribution 1: the 99th percentile is 60 µs and the 99.9th percentile is 200 µs.
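Reading percentiles off a CCDF plot amounts to finding where the curve crosses 10^-2 (99th percentile) or 10^-3 (99.9th percentile). A minimal sketch of both quantities computed from raw latency samples, assuming the nearest-rank percentile definition; the helper names and the synthetic samples are illustrative only.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of samples <= it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100.0))
    return ordered[rank - 1]


def ccdf(samples, x):
    """Empirical complementary CDF, P[X >= x]."""
    return sum(1 for s in samples if s >= x) / len(samples)


if __name__ == "__main__":
    # Synthetic latencies of 1..1000 microseconds, one sample each.
    latencies_us = list(range(1, 1001))
    p99 = percentile(latencies_us, 99)
    print("p99 =", p99, "us; CCDF at p99 =", ccdf(latencies_us, p99))
```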

  32. Tail latency characteristics: what is the ideal latency distribution? Assume a server with a single worker and a fixed 50 µs processing time. [Figure: CCDF, P[X >= x], versus latency in microseconds, log-log axes; curves: uniform request arrival, Poisson at 70% utilization, Poisson at 90% utilization, and Poisson at 70% utilization with 4 workers.] Takeaways: there is inherent tail latency due to request burstiness; tail latency depends on the average server utilization; and additional workers can reduce tail latency, even at constant utilization.
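The three takeaways can be reproduced qualitatively with a small simulation. This is a sketch under stated assumptions, not the authors' model code: requests arrive as a Poisson process, every request takes a fixed 50 µs, workers share one FIFO queue, and "70% with 4 workers" is taken to mean that each of the 4 workers is 70% utilized. The function names, seed, and sample count are arbitrary.

```python
import heapq
import random


def simulate(utilization, workers=1, service_us=50.0, n=100_000, seed=1):
    """Latencies (in µs) of n requests in a FIFO queue with identical workers."""
    rng = random.Random(seed)
    rate = utilization * workers / service_us  # arrivals per microsecond
    free_at = [0.0] * workers  # min-heap: time at which each worker becomes free
    now = 0.0
    latencies = []
    for _ in range(n):
        now += rng.expovariate(rate)  # exponential gaps give Poisson arrivals
        start = max(now, heapq.heappop(free_at))  # wait for the earliest-free worker
        finish = start + service_us
        heapq.heappush(free_at, finish)
        latencies.append(finish - now)
    return latencies


def p99(latencies):
    """Approximate 99th percentile of a list of latencies."""
    return sorted(latencies)[int(0.99 * len(latencies))]


if __name__ == "__main__":
    for label, util, workers in [("Poisson at 70%", 0.7, 1),
                                 ("Poisson at 90%", 0.9, 1),
                                 ("Poisson at 70%, 4 workers", 0.7, 4)]:
        print(f"{label:28s} p99 = {p99(simulate(util, workers)):8.1f} us")
```

With perfectly uniform arrivals no request would ever queue, so every latency would be exactly 50 µs; any tail in the simulated output comes from arrival burstiness alone.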

  33. Outline (now at Measurements: Sources of Tail Latencies): (1) Introduction, (2) Predicted Latency from Queuing Models, (3) Measurements: Sources of Tail Latencies, (4) Summary.

  34. Testbed: a cluster of standard datacenter machines, each with 2 × Intel L5640 6-core CPUs, 24 GB of DRAM, a Mellanox 10 Gbps NIC, and Ubuntu 12.04 with Linux kernel 3.2.0. All servers are connected to a single 10 Gbps ToR switch. One server runs Memcached; the others run workload-generating clients. Results for the other applications are in the paper.

  35. Timestamping methodology: append a blank buffer of ≈ 32 bytes to each request, and overwrite it with timestamps as the request goes through the server. Timestamp points: incoming server NIC; after TCP/UDP processing; Memcached thread scheduled on CPU; Memcached read() return; Memcached write(); outgoing server NIC. This has very low overhead and requires no server-side logging.
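The in-band timestamping idea can be illustrated with a small user-space sketch. The byte layout here is an assumption made for the example (eight 32-bit little-endian microsecond slots, six of them used), as are the stage names and helper functions; the talk only states that a blank buffer of about 32 bytes is appended and overwritten along the way.

```python
import struct
import time

# Hypothetical stage names, one per timestamp point listed on the slide.
STAGES = ("nic_in", "after_tcp_udp", "thread_scheduled",
          "read_return", "write", "nic_out")
SLOT = struct.Struct("<I")  # assumed: one 32-bit microsecond counter per stage
TRAILER_BYTES = 32          # blank buffer appended to every request


def append_trailer(payload):
    """Return the request with a zeroed timestamp trailer appended."""
    return bytearray(payload) + bytearray(TRAILER_BYTES)


def stamp(packet, stage, now_us=None):
    """Overwrite the slot for `stage` in place with the current time."""
    if now_us is None:
        now_us = time.monotonic_ns() // 1000
    offset = len(packet) - TRAILER_BYTES + STAGES.index(stage) * SLOT.size
    SLOT.pack_into(packet, offset, now_us & 0xFFFFFFFF)  # wrap to 32 bits


def read_stamps(packet):
    """Decode the trailer into a {stage: microseconds} dictionary."""
    base = len(packet) - TRAILER_BYTES
    return {stage: SLOT.unpack_from(packet, base + i * SLOT.size)[0]
            for i, stage in enumerate(STAGES)}


if __name__ == "__main__":
    pkt = append_trailer(b"get key")
    stamp(pkt, "nic_in", now_us=100)
    stamp(pkt, "nic_out", now_us=175)
    stamps = read_stamps(pkt)
    print("time in server:", stamps["nic_out"] - stamps["nic_in"], "us")
```

Because the timestamps travel inside the reply, the client can compute per-stage delays by subtraction, which is what lets the server avoid writing any logs.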

  37. How far are we from the ideal? [Figure: CCDF, P[X >= x], versus latency in microseconds, log-log axes; curve: ideal model.] Setup: single CPU, single core, Memcached running at 80% utilization.
