Measuring and Optimizing Tail Latency


  1. Measuring and Optimizing Tail Latency. Kathryn S McKinley, Google. CRA-W Undergraduate Town Hall, April 5th, 2018

  2. Speaker & Moderator
      Dr. Lori Pollock is a Professor in Computer and Information Sciences at the University of Delaware. Her current research focuses on program analysis for building better software maintenance tools, software testing, energy-efficient software, and computer science education. Dr. Pollock is an ACM Distinguished Scientist and was awarded the University of Delaware’s Excellence in Teaching Award and the E.A. Trabant Award for Women’s Equity.
      Dr. Kathryn S. McKinley is a Senior Research Scientist at Google and previously was a Researcher at Microsoft and held an Endowed Professorship at The University of Texas at Austin. Her research spans programming languages, compilers, runtime systems, architecture, performance, and energy. She and her collaborators have produced several widely used tools: the DaCapo Java Benchmarks (30,000+ downloads), the TRIPS Compiler, the Hoard memory manager, the MMTk memory management toolkit, and the Immix garbage collector. She served as program chair for ASPLOS, PACT, PLDI, ISMM, and CGO. She is currently a CRA and CRA-W Board member. Dr. McKinley was honored to testify to the House Science Committee (Feb. 14, 2013). She is an IEEE and ACM Fellow. She has graduated 22 PhD students.

  3. Measuring and Optimizing Tail Latency. Kathryn S McKinley, Google, with Xi Yang, Stephen M Blackburn, Md Haque, Sameh Elnikety, Yuxiong He, Ricardo Bianchini

  4. Tail Latency Matters. TOP PRIORITY. A 400 millisecond delay decreased searches/user by 0.59% [Jake Brutlag, Google]. A two second slowdown reduced revenue/user by 4.3% [Eric Schurman, Bing].

  5. [Photo: Google/Connie Zhou]

  6. Datacenter economics quick facts*
      • ~$500,000: cost of a small datacenter
      • ~3,000,000: US datacenters in 2016
      • ~$1.5 trillion: US capital investment to date
      • ~$3,000,000,000: kW dollars / year
      • ~$30,000,000: savings from 1% less work
      • Lots more by not building a datacenter
      *Shehabi et al., United States Data Center Energy Usage Report, Lawrence Berkeley, 2016.

  7. Tail Latency (TOP PRIORITY) vs. Efficiency

  8. BOTH?! Tail Latency & Efficiency

  9. Server architecture: client, aggregator, workers

  10. Characteristics of interactive services
      • Bursty, diurnal load
      • The CDF changes slowly
      • The slowest server dictates the tail
      • Orders of magnitude difference between the average and the tail (99th percentile)
      [Figure: CDF of request latency; percentage of requests vs. latency (ms)]
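To make the average-versus-tail gap on this slide concrete, here is a minimal sketch (not from the talk) that compares the mean with the 99th and 99.9th percentile latencies of a synthetic, heavy-tailed sample; the lognormal parameters are invented purely for illustration.

```python
import numpy as np

# Hypothetical heavy-tailed latency sample (ms); parameters are illustrative only.
rng = np.random.default_rng(0)
latencies_ms = rng.lognormal(mean=1.0, sigma=1.5, size=100_000)

avg = latencies_ms.mean()
p50, p99, p999 = np.percentile(latencies_ms, [50, 99, 99.9])

print(f"average = {avg:7.1f} ms")
print(f"median  = {p50:7.1f} ms")
print(f"p99     = {p99:7.1f} ms  ({p99 / avg:.0f}x the average)")
print(f"p99.9   = {p999:7.1f} ms")
```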

  11. What is in the tail? [Figure: latency CDF with the tail region highlighted]

  12. Cycle-level on-line profiling tool (SHIM) [ISCA’15 (Top Picks HM), ATC’16]
      Insight: hardware & software generate signals without instrumentation.
      SHIM runs on the core’s other hyperthread (HT2), continuously sampling performance counters, tags, and memory locations while the application runs on HT1:
      HT1 IPC = Core IPC – HT2 SHIM IPC
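The subtraction in that formula can be sketched as follows. This is a simplified illustration of the idea, not SHIM's actual implementation or API; the counter-reading callables (core_cycles, core_insns, ht2_insns) are hypothetical stand-ins for per-core and per-thread hardware counter reads.

```python
# Simplified sketch of SHIM-style IPC attribution. The observer thread (HT2)
# periodically snapshots core-wide counters plus its own, then attributes the
# remaining instructions to the application hyperthread (HT1).
# The counter callables are hypothetical stand-ins, not a real API.

def snapshot(counters):
    """counters: dict name -> zero-argument callable that reads a hardware counter."""
    return {name: read() for name, read in counters.items()}

def ht1_ipc(before, after):
    cycles = after["core_cycles"] - before["core_cycles"]
    core_ipc = (after["core_insns"] - before["core_insns"]) / cycles
    shim_ipc = (after["ht2_insns"] - before["ht2_insns"]) / cycles
    # HT1 IPC = Core IPC - HT2 (SHIM) IPC
    return core_ipc - shim_ipc

# In SHIM this runs in a tight loop on the spare hyperthread, giving
# fine-grained samples without instrumenting the observed thread.
```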

  13. What is in the tail? [Figure: latency CDF with the tail region highlighted, revisited]

  14. The Tail: the longest 200 requests
      • Noise: network & other imperfections (network and networking queueing time), OS imperfections (idle time)
      • Not noise: CPU work on genuinely long requests (CPU time), and queuing at the worker under overload (dispatch queueing time)
      [Figure: latency (ms) of the top 200 requests]
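A continuous profiler can attribute each slow request to one of these buckets. The sketch below only illustrates the bookkeeping, not the talk's actual tooling; the per-request timing fields (network_queue_ms, idle_ms, dispatch_queue_ms, cpu_ms, total_ms) are hypothetical names.

```python
from collections import Counter

# Sketch: attribute each of the slowest requests to noise, queueing, or work,
# assuming hypothetical per-request timing fields recorded by a profiler.

def dominant_component(req):
    parts = {
        "noise": req["network_queue_ms"] + req["idle_ms"],  # network/OS imperfections
        "queue": req["dispatch_queue_ms"],                  # waiting at the worker
        "work":  req["cpu_ms"],                             # genuinely long CPU work
    }
    return max(parts, key=parts.get)

def tail_breakdown(requests, n=200):
    slowest = sorted(requests, key=lambda r: r["total_ms"], reverse=True)[:n]
    return Counter(dominant_component(r) for r in slowest)

# tail_breakdown(request_log) then shows how many of the top 200 fall in each bucket.
```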

  15. Optimizing the tail: diagnosing the tail with continuous profiling
      • Noise: systems are not perfect
      • Queuing: too much load is bad, but so is over-provisioning
      • Work: many requests are long
      Insights: use the CDF offline; long requests reveal themselves, so treat them specially

  16. Insight: long requests reveal themselves, regardless of the cause.

  17. Noise: replicate & reissue (The Tail at Scale, Dean & Barroso, CACM’13)
      • Reissue all requests? Use the CDF to weigh cost & potential benefit
      • Reissue after a fixed issue time, e.g., a 5% or 10% reissue budget, to attack the noise in the tail
      [Figure: latency CDF with 5% and 10% reissue cutoffs marked in the noise region]
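As a rough illustration of fixed-time reissue (not the talk's experiments), the sketch below draws synthetic latencies with occasional large "noise" delays, reissues a duplicate if a request is still running after d ms, and compares the 99th percentile with and without reissue. The distribution, the 2% noise rate, and the d=10 ms cutoff are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def one_latency():
    """Synthetic request latency in ms: a common case plus rare noise events."""
    base = rng.lognormal(mean=1.0, sigma=0.4)
    if rng.random() < 0.02:                       # rare noise stretches the tail
        base += rng.exponential(scale=50.0)
    return base

def with_reissue(d):
    """Reissue a duplicate at time d if the first copy has not finished; take the min."""
    first = one_latency()
    if first <= d:
        return first
    return min(first, d + one_latency())

baseline = np.array([one_latency() for _ in range(50_000)])
reissued = np.array([with_reissue(d=10.0) for _ in range(50_000)])
print("p99 without reissue:", round(np.percentile(baseline, 99), 1), "ms")
print("p99 with reissue at d=10 ms:", round(np.percentile(reissued, 99), 1), "ms")
```

The cutoff d controls the reissue budget: the smaller d is, the more duplicates are sent.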

  18. Probabilistic reissue (Optimal Reissue Policies for Reducing Tail Latencies, Kaler, He & Elnikety, SPAA’17)
      • Adding randomness to reissue makes a single, earlier reissue time d optimal (vs. n fixed reissue times)
      • Reissue 1-3% of requests with probability p; the probability is proportional to the reissue budget & the noise in the tail
      [Figure: latency CDF with probabilistic reissue applied to the noise region]
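The probabilistic variant can be sketched the same way: at a single, earlier cutoff d, a still-running request is reissued only with probability q, so a small budget is spent earlier in the distribution. This is a simplification of the SPAA'17 SingleR idea; the values of d and q below are arbitrary, not the paper's optimized settings, and the synthetic latency model is the same as in the previous sketch.

```python
import numpy as np

rng = np.random.default_rng(2)

def one_latency():
    """Same synthetic latency model as the previous sketch (ms)."""
    base = rng.lognormal(mean=1.0, sigma=0.4)
    if rng.random() < 0.02:
        base += rng.exponential(scale=50.0)
    return base

def probabilistic_reissue(d, q):
    """At time d, reissue a still-running request with probability q."""
    first = one_latency()
    if first > d and rng.random() < q:
        return min(first, d + one_latency()), True
    return first, False

results = [probabilistic_reissue(d=5.0, q=0.5) for _ in range(50_000)]
latency = np.array([t for t, _ in results])
frac = np.mean([reissued for _, reissued in results])
print(f"p99 = {np.percentile(latency, 99):.1f} ms "
      f"with {100 * frac:.1f}% of requests reissued")
```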

  19. SingleR probabilistic reissue (Optimal Reissue Policies for Reducing Tail Latencies, Kaler, He & Elnikety, SPAA’17)

  20. Work: speed up the tail efficiently
      • Judicious parallelism [ASPLOS’15]
      • DVFS faster on the tail [DISC’14, MICRO’17]
      • Asymmetric multicore [DISC’14, MICRO’17]
      [Figure: latency CDF with the work region highlighted]

  21. Work: parallelism. Parallelism has historically been for throughput. Idea: parallelism for tail latency.

  22. Queuing theory: optimizing average latency maximizes throughput, but not the tail! Shortening the tail reduces queuing latency.
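That last claim can be checked with a tiny single-server FIFO simulation (Lindley's recursion). The two service-time distributions below have the same mean, so throughput and utilization are identical, but the one with a heavier tail produces noticeably larger queueing delays. All numbers are illustrative, not from the talk.

```python
import numpy as np

rng = np.random.default_rng(3)
N, lam = 200_000, 0.8                       # requests and arrival rate (per ms); utilization ~0.8

def queueing_delay(service_ms, interarrival_ms):
    """FIFO waiting times via Lindley's recursion: W[i+1] = max(0, W[i] + S[i] - A[i])."""
    wait = np.empty(len(service_ms))
    w = 0.0
    for i in range(len(service_ms)):
        wait[i] = w
        w = max(0.0, w + service_ms[i] - interarrival_ms[i])
    return wait

interarrivals = rng.exponential(1.0 / lam, N)            # Poisson arrivals

# Same mean service time (1 ms), different tails.
light_tail = np.full(N, 1.0)                             # every request takes 1 ms
heavy_tail = np.where(rng.random(N) < 0.99, 0.9, 10.9)   # 1% of requests take ~12x longer

for name, s in [("light tail", light_tail), ("heavy tail", heavy_tail)]:
    w = queueing_delay(s, interarrivals)
    print(f"{name}: mean service {s.mean():.2f} ms, mean wait {w.mean():.2f} ms, "
          f"p99 wait {np.percentile(w, 99):.2f} ms")
```

Both workloads keep the server equally busy; only the tail of the service times differs, yet that alone roughly doubles the mean waiting time in this example.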

  23. Parallelism
      • Parallelism has historically been for throughput. Idea: parallelism for tail latency
      • Insight: long requests reveal themselves
      • Approach: incrementally add parallelism to long requests (the tail) based on request progress & load

  24. Few-to-Many parallelism
      • Start sequential. Fixed: add a thread every d ms (up to 4-way). Dynamic: use load
      • A short fixed delay is good at low load, a long fixed delay is good at high load; the dynamic policy is best at all loads
      [Figure: tail latency (ms) vs. Lucene requests per second for fixed intervals of 20 ms, 100 ms, and 500 ms, and for the dynamic policy]
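A minimal sketch of the Few-to-Many interval idea: start every request sequentially and raise its parallelism only as it runs long, with the add-thread interval chosen from the current load. The load thresholds and intervals below are hypothetical placeholders, not the policy the ASPLOS'15 paper derives from measured request demand and load.

```python
def target_parallelism_fixed(elapsed_ms, d_ms, max_threads=4):
    """Fixed policy: one more thread every d_ms, capped at max_threads."""
    return min(max_threads, 1 + int(elapsed_ms // d_ms))

def target_parallelism_dynamic(elapsed_ms, load_rps, max_threads=4):
    """Dynamic policy: parallelize early under light load; under heavy load wait
    longer so spare cores are spent only on the very longest requests.
    The thresholds and intervals here are hypothetical, for illustration only."""
    if load_rps < 35:
        d_ms = 20
    elif load_rps < 45:
        d_ms = 100
    else:
        d_ms = 500
    return target_parallelism_fixed(elapsed_ms, d_ms, max_threads)

# Example: a request that has been running for 250 ms.
print(target_parallelism_fixed(250, d_ms=100))        # -> 3 threads
print(target_parallelism_dynamic(250, load_rps=40))   # -> 3 threads
print(target_parallelism_dynamic(250, load_rps=47))   # -> 1 thread (still sequential)
```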

  25. Evaluation on a 2x8-core, 64-bit, 2.3 GHz Xeon with 64 GB RAM: dynamic Few-to-Many parallelism needs 21% fewer servers, or reduces the tail by 28%, relative to sequential execution. [Figure: tail latency (ms) vs. requests per second, Sequential vs. Few-to-Many]

  26. Work: speed up the tail efficiently. Judicious parallelism [ASPLOS’15] ✔ [Figure: latency CDF with the work region highlighted]

  27. BOTH! Tail Latency & Efficiency

  28. Efficiency at scale for interactive workloads
      • Diagnose the tail with continuous profiling
      • Noise: systems are not perfect; use replication + judicious choice of what to reissue
      • Queuing & Work: judicious use of resources on long requests
      • The request latency CDF is a powerful tool
      • Tail efficiency ≠ average latency or throughput
      • Hardware heterogeneity
      Questions?

  29. Professional and Research Relationships

  30. Your Academic Village • Peer students • Students senior & junior to you • Teaching assistants • PhD students • Faculty

  31. My Professional Village • Researchers in all career stages – Undergrads, PhD students, post docs – Faculty, industrial researchers, staff, administrators • Industrial village – Software engineers in all career stages – Managers, directors, admins, – in/out my management chain

  32. Faculty Mentors: Don Johnson (my professor), Ken Kennedy (PhD advisor), Dave Stemple (dept. chair)

  33. Building a Village

  34. Networking is… • Building and sustaining professional relationships • Participating in an academic / research community • Finding people you like and learn from, and building a relationship

  35. Networking is not… • Using people • A substitute for quality work

  36. But I am Horrible at Small Talk • You have CS in common • Networking is not genetic • It is a research skill – Practice – Meet people – Learn – Go places – Volunteer! – Sustain your relationships

  37. With whom do you network? • People you like • People senior to you, who can show you the way • People at different career stages, so you can anticipate • Your peers

  38. Peer Mentors Mary Hall Doug Burger Margaret Martonosi

  39. Your Village Will • Write letters for grad school, jobs, etc. • Help you solve problems • Point you in good directions • Encourage you • Choose you for important roles • You will do the same or more for them • Make your life and work more fun and meaningful
