
  1. IN-MEMORY CACHING: CURB TAIL LATENCY WITH PELIKAN

  2. ABOUT ME
     • 6 years at Twitter, on cache
     • maintainer of Twemcache (OSS), Twitter’s Redis fork
     • operations of thousands of machines
     • hundreds of (internal) customers
     • Now working on Pelikan, a next-gen cache framework to replace the above @twitter
     • Twitter: @thinkingfish

  3. THE PROBLEM: CACHE PERFORMANCE

  4. CACHE RULES EVERYTHING AROUND ME [diagram: SERVICE → CACHE → DB]

  5. 😤 CACHE RUINS EVERYTHING AROUND ME 😤 [diagram: SERVICE (SENSITIVE!) → CACHE → DB]

  6. GOOD CACHE PERFORMANCE = PREDICTABLE LATENCY

  7. GOOD CACHE PERFORMANCE = PREDICTABLE TAIL LATENCY

  8. KING OF PERFORMANCE “MILLIONS OF QPS PER MACHINE” “SUB-MILLISECOND LATENCIES” “NEAR LINE-RATE THROUGHPUT” …

  9. GHOSTS OF PERFORMANCE “USUALLY PRETTY FAST” “HICCUPS EVERY ONCE IN A WHILE” “TIMEOUT SPIKES AT THE TOP OF THE HOUR” “SLOW ONLY WHEN MEMORY IS LOW” …

  10. I SPENT FIRST 3 MONTHS AT TWITTER LEARNING CACHE BASICS… …AND THE NEXT 5 YEARS CHASING GHOSTS

  11. CONTAIN GHOSTS = MINIMIZE NONDETERMINISTIC BEHAVIOR

  12. HOW? IDENTIFY AVOID MITIGATE

  13. A PRIMER: CACHING IN DATACENTER

  14. CONTEXT
      • geographically centralized
      • highly homogeneous network
      • reliable, predictable infrastructure
      • long-lived connections
      • high data rate
      • simple data/operations

  15. CACHE IN PRODUCTION
      MAINLY: REQUEST → RESPONSE
      INITIALLY: CONNECT
      ALSO (BECAUSE WE ARE ADULTS): STATS, LOGGING, HEALTH CHECK…

  16. CACHE: BIRD’S-EYE VIEW [diagram: cache process (protocol, data storage, event-driven server) running on the OS, host, and network infrastructure]

  17. HOW DID WE UNCOVER THE UNCERTAINTIES ?

  18. “ BANDWIDTH UTILIZATION WENT WAY UP, BUT REQUEST RATE WAY DOWN. ”

  19. SYSCALLS

  20. CONNECTING IS SYSCALL-HEAVY: accept → config → register event → read (4+ syscalls)
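
As a rough illustration of where those syscalls come from, here is a minimal sketch of an accept path on a Linux epoll-based server; the function and the exact configuration call are illustrative assumptions, not Twemcache's actual code.

```c
/* Sketch: accepting one connection on a Linux epoll server. Each new
 * connection costs at least four syscalls before the first request is
 * even parsed: accept, socket configuration, event registration, read.
 * Names are illustrative, not Twemcache's actual code. */
#define _GNU_SOURCE                      /* for accept4() */
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

static int accept_one(int listen_fd, int epoll_fd)
{
    /* syscall 1: accept the connection, non-blocking */
    int fd = accept4(listen_fd, NULL, NULL, SOCK_NONBLOCK);
    if (fd < 0) {
        return -1;
    }

    /* syscall 2: configure the socket, e.g. disable Nagle */
    int one = 1;
    setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));

    /* syscall 3: register the fd with the event loop */
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
    epoll_ctl(epoll_fd, EPOLL_CTL_ADD, fd, &ev);

    /* syscall 4 (and up): the first read(s) on the new socket */
    char buf[1024];
    (void)read(fd, buf, sizeof(buf));

    return fd;
}
```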

  21. REQUEST IS SYSCALL-LIGHT: read event → read (I/O) → parse → process → compose → write (I/O) → write event (3 syscalls*)
      *: the event loop returns multiple read events at once; I/O syscalls can be further amortized by batching/pipelining
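
A companion sketch of the steady-state request path under the same epoll assumption; `handle_request` is a hypothetical placeholder for the userspace parse/process/compose step.

```c
/* Sketch: the steady-state request path on the worker thread. Per
 * wakeup there are three syscalls: epoll_wait, read, write; the
 * parse/process/compose step is pure userspace work. One epoll_wait
 * can return many ready sockets, and pipelined requests in a single
 * read amortize the cost further. handle_request is hypothetical. */
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

/* hypothetical stand-in for parse -> process -> compose (no syscalls) */
static ssize_t handle_request(const char *req, ssize_t len, char *rsp, size_t cap)
{
    size_t n = (size_t)len < cap ? (size_t)len : cap;
    memcpy(rsp, req, n);                 /* echo, for the sake of the sketch */
    return (ssize_t)n;
}

static void event_loop(int epoll_fd)
{
    struct epoll_event events[1024];

    for (;;) {
        /* syscall 1: wait for readable sockets (may return many at once) */
        int nev = epoll_wait(epoll_fd, events, 1024, -1);

        for (int i = 0; i < nev; i++) {
            int fd = events[i].data.fd;
            char req[4096], rsp[4096];

            /* syscall 2: read the request bytes */
            ssize_t in = read(fd, req, sizeof(req));
            if (in <= 0) {
                close(fd);
                continue;
            }

            /* userspace only: parse -> process -> compose */
            ssize_t out = handle_request(req, in, rsp, sizeof(rsp));

            /* syscall 3: write the response */
            write(fd, rsp, out);
        }
    }
}
```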

  22. TWEMCACHE IS MOSTLY SYSCALLS
      • 1-2 µs overhead per call
      • dominate CPU time in a simple cache
      • What if we have 100k conns / sec?
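
A rough back-of-the-envelope using the figures above: 100k new connections per second × ~4 syscalls each × ~1.5 µs per syscall is on the order of 0.6 CPU-seconds per second spent on connection handling alone, before a single request is served.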

  23. culprit: CONNECTION STORM

  24. “ …TWEMCACHE RANDOM HICCUPS, ALWAYS AT THE TOP OF THE HOUR. ”

  25. [diagram: cache worker thread blocked ⏱ on logging → disk I/O, contending with a cron job “x”]

  26. culprit: BLOCKING I/O

  27. “ WE ARE SEEING SEVERAL “BLIPS” AFTER EACH CACHE REBOOT… ”

  28. LOCKING FACTS
      • ~25 ns per operation
      • more expensive on NUMA
      • much more costly when contended

  29. A TIMELINE: MEMCACHE RESTART (lock!) → EVERYTHING IS FINE → REQUESTS SUDDENLY GET SLOW/TIMED-OUT (lock!) → CONNECTION STORM → CLIENTS TOPPLE → SLOWLY RECOVER → (REPEAT A FEW TIMES) → … → STABILIZE

  30. culprit: LOCKING

  31. “ HOSTS WITH LONG-RUNNING CACHE TRIGGER OOM WHEN LOAD SPIKES. ”

  32. “ REDIS INSTANCES WERE KILLED BY SCHEDULER. ”

  33. culprit: MEMORY

  34. SUMMARY: CONNECTION STORM, BLOCKING I/O, LOCKING, MEMORY

  35. HOW TO MITIGATE?

  36. DATA PLANE, CONTROL PLANE

  37. HIDE EXPENSIVE OPS: PUT OPERATIONS OF DIFFERENT NATURE / PURPOSE ON SEPARATE THREADS

  38. SLOW: CONTROL PLANE
      • LISTENING (ADMIN CONNECTIONS)
      • STATS AGGREGATION
      • STATS EXPORTING
      • LOG DUMP

  39. FAST: DATA PLANE / REQUEST [diagram: on t_worker: read event → read (I/O) → parse → process → compose → write (I/O) → write event]

  40. FAST: DATA PLANE / CONNECT [diagram: t_server: read event → accept → config → dispatch; t_worker: read event → register]

  41. LATENCY-ORIENTED THREADING [diagram: t_worker handles REQUESTS; t_server handles CONNECTS and hands new connections to t_worker; both t_worker and t_server send logging and stats updates to t_admin, which handles OTHER work]
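
A minimal sketch of this thread split using POSIX threads; the loop bodies are stubs and `conn_queue` is a hypothetical placeholder for the lockless hand-off structure discussed in the slides below, so the only point here is the separation of roles.

```c
/* Sketch: latency-oriented threading with POSIX threads. The split
 * mirrors the slide: t_server accepts and hands off connections,
 * t_worker serves requests and nothing else, and t_admin absorbs
 * all the slow work (stats aggregation/export, log dump). The loop
 * bodies are stubs; conn_queue stands in for the lockless hand-off. */
#include <pthread.h>
#include <unistd.h>

struct conn_queue;                       /* hypothetical hand-off structure */

static void *server_loop(void *arg)      /* t_server: accept + dispatch */
{
    struct conn_queue *q = arg;
    (void)q;
    for (;;) {
        /* accept4(), configure the socket, push the new fd onto q;
         * never touches request processing */
        pause();                         /* stub: real code blocks in epoll */
    }
    return NULL;
}

static void *worker_loop(void *arg)      /* t_worker: requests only */
{
    struct conn_queue *q = arg;
    (void)q;
    for (;;) {
        /* drain new connections from q, then run the
         * read -> parse -> process -> compose -> write loop */
        pause();                         /* stub: real code blocks in epoll */
    }
    return NULL;
}

static void *admin_loop(void *arg)       /* t_admin: everything slow */
{
    (void)arg;
    for (;;) {
        /* aggregate/export stats, flush logs, serve admin connections;
         * blocking here never stalls t_worker */
        sleep(1);
    }
    return NULL;
}

int main(void)
{
    struct conn_queue *q = NULL;         /* placeholder */
    pthread_t t_server, t_worker, t_admin;

    pthread_create(&t_worker, NULL, worker_loop, q);
    pthread_create(&t_server, NULL, server_loop, q);
    pthread_create(&t_admin, NULL, admin_loop, NULL);

    pthread_join(t_worker, NULL);        /* keep the process alive */
    return 0;
}
```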

  42. WHAT TO AVOID?

  43. LOCKING

  44. WHAT WE KNOW
      • inter-thread communication in cache: stats, logging, connection hand-off
      • locking propagates blocking/delay between threads
      [diagram: t_server hands new connections to t_worker; both send logging and stats updates to t_admin]

  45. LOCKLESS OPERATIONS: MAKE STATS UPDATE LOCKLESS w/ atomic instructions
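
A minimal sketch of a lockless counter update with C11 atomics, assuming relaxed ordering is acceptable because readers only need an eventually consistent snapshot; the metric names are illustrative.

```c
/* Sketch: lockless stats update with C11 atomics, assuming relaxed
 * ordering is enough because readers only need an eventually
 * consistent snapshot. Metric names are illustrative. */
#include <stdatomic.h>
#include <stdint.h>

struct metrics {
    atomic_uint_fast64_t request;        /* requests served */
    atomic_uint_fast64_t request_ex;     /* requests that errored */
};

static struct metrics stats;

static inline void metric_incr(atomic_uint_fast64_t *m)
{
    /* one atomic instruction on the hot path; no lock shared with t_admin */
    atomic_fetch_add_explicit(m, 1, memory_order_relaxed);
}

static inline uint64_t metric_read(atomic_uint_fast64_t *m)
{
    /* t_admin reads without ever blocking t_worker */
    return (uint64_t)atomic_load_explicit(m, memory_order_relaxed);
}
```

On the hot path the worker would call metric_incr(&stats.request); the admin thread calls metric_read when aggregating or exporting, without touching any lock shared with the data plane.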

  46. LOCKLESS OPERATIONS: MAKE LOGGING WAITLESS with a RING/CYCLIC BUFFER [diagram: writer advances the write position, reader advances the read position]
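
A sketch of the waitless-logging idea, assuming a single-producer/single-consumer ring buffer with one worker writing and one flush thread reading; when the buffer is full the worker drops the message rather than blocking. Sizes and names are illustrative.

```c
/* Sketch: waitless logging through a single-producer/single-consumer
 * ring buffer. The worker only advances the write position and the
 * flusher only advances the read position, so neither ever takes a
 * lock. On a full buffer the worker drops instead of blocking. */
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define LOG_BUF_SIZE (1u << 20)          /* must be a power of two */

struct log_ring {
    char            data[LOG_BUF_SIZE];
    atomic_size_t   wpos;                /* advanced only by the worker */
    atomic_size_t   rpos;                /* advanced only by the flusher */
};

/* worker side: copy a message in, or drop it if there is no room */
bool log_write(struct log_ring *r, const char *msg, size_t len)
{
    size_t w  = atomic_load_explicit(&r->wpos, memory_order_relaxed);
    size_t rd = atomic_load_explicit(&r->rpos, memory_order_acquire);

    if (LOG_BUF_SIZE - (w - rd) < len) {
        return false;                    /* full: drop, never block */
    }
    for (size_t i = 0; i < len; i++) {
        r->data[(w + i) & (LOG_BUF_SIZE - 1)] = msg[i];
    }
    atomic_store_explicit(&r->wpos, w + len, memory_order_release);
    return true;
}

/* flusher side: drain whatever is available, then write it to disk */
size_t log_read(struct log_ring *r, char *out, size_t cap)
{
    size_t rd = atomic_load_explicit(&r->rpos, memory_order_relaxed);
    size_t w  = atomic_load_explicit(&r->wpos, memory_order_acquire);
    size_t n  = w - rd;

    if (n > cap) {
        n = cap;
    }
    for (size_t i = 0; i < n; i++) {
        out[i] = r->data[(rd + i) & (LOG_BUF_SIZE - 1)];
    }
    atomic_store_explicit(&r->rpos, rd + n, memory_order_release);
    return n;
}
```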

  47. LOCKLESS OPERATIONS: MAKE CONNECTION HAND-OFF LOCKLESS with a RING ARRAY [diagram: writer advances the write position, reader advances the read position]
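
The connection hand-off can follow the same single-producer/single-consumer discipline, only with fixed-size slots holding connection pointers instead of raw bytes; this is a sketch under those assumptions, not Pelikan's actual implementation.

```c
/* Sketch: lockless connection hand-off between t_server (producer)
 * and t_worker (consumer), using fixed-size slots that each hold one
 * connection pointer. Names and sizes are illustrative. */
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SLOTS 1024u                 /* power of two */

struct conn;                             /* opaque connection object */

struct ring_array {
    struct conn    *slot[RING_SLOTS];
    atomic_size_t   wpos;                /* advanced only by t_server */
    atomic_size_t   rpos;                /* advanced only by t_worker */
};

bool ring_push(struct ring_array *r, struct conn *c)    /* t_server */
{
    size_t w  = atomic_load_explicit(&r->wpos, memory_order_relaxed);
    size_t rd = atomic_load_explicit(&r->rpos, memory_order_acquire);

    if (w - rd == RING_SLOTS) {
        return false;                    /* full: caller decides what to do */
    }
    r->slot[w & (RING_SLOTS - 1)] = c;
    atomic_store_explicit(&r->wpos, w + 1, memory_order_release);
    return true;
}

struct conn *ring_pop(struct ring_array *r)             /* t_worker */
{
    size_t rd = atomic_load_explicit(&r->rpos, memory_order_relaxed);
    size_t w  = atomic_load_explicit(&r->wpos, memory_order_acquire);

    if (w == rd) {
        return NULL;                     /* empty */
    }
    struct conn *c = r->slot[rd & (RING_SLOTS - 1)];
    atomic_store_explicit(&r->rpos, rd + 1, memory_order_release);
    return c;
}
```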

  48. MEMORY

  49. WHAT WE KNOW
      • alloc/free cycles cause fragmentation
      • internal vs external fragmentation
      • OOM/swapping is deadly
      • memory alloc/copy is relatively expensive

  50. PREDICTABLE FOOTPRINT: AVOID EXTERNAL FRAGMENTATION, CAP ALL MEMORY RESOURCES

  51. PREDICTABLE RUNTIME: REUSE BUFFERS, PREALLOCATE
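
A minimal sketch of the reuse/preallocate idea: a capped pool of fixed-size buffers allocated once at startup, assuming a single thread owns the pool; names and sizes are illustrative.

```c
/* Sketch: capped, preallocated buffer pool. All buffers are allocated
 * up front and recycled, so the steady state does no malloc/free and
 * the footprint is fixed and known. Single-threaded by assumption. */
#include <stdlib.h>

#define BUF_SIZE  (16 * 1024)            /* one I/O buffer */
#define BUF_COUNT 4096                   /* hard cap on total buffers */

struct buf {
    struct buf *next;                    /* freelist link */
    char        data[BUF_SIZE];
};

static struct buf *free_list;            /* all unused buffers */

/* allocate the whole pool at startup so the footprint never grows */
int buf_pool_create(void)
{
    for (int i = 0; i < BUF_COUNT; i++) {
        struct buf *b = malloc(sizeof(*b));
        if (b == NULL) {
            return -1;
        }
        b->next = free_list;
        free_list = b;
    }
    return 0;
}

/* steady state: reuse, never grow past the cap */
struct buf *buf_borrow(void)
{
    struct buf *b = free_list;
    if (b != NULL) {
        free_list = b->next;
    }
    return b;                            /* NULL means the cap is hit */
}

void buf_return(struct buf *b)
{
    b->next = free_list;
    free_list = b;
}
```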

  52. IMPLEMENTATION PELIKAN CACHE

  53. WHAT IS PELIKAN CACHE?
      • (Datacenter-) caching framework
      • A summary of Twitter’s cache ops
      • Perf goal: deterministically fast
      • Clean, modular design
      • Open-source: pelikan.io
      [architecture diagram: server process (orchestration), cache data model, parse/compose/trace, data store, request/response, streams, events, pooling, channels, buffers, timer/alarm; common core with waitless logging, lockless metrics, composed config, threading]

  54. PERFORMANCE DESIGN DECISIONS: A COMPARISON

                | latency-oriented threading | Memory/fragmentation | Memory/buffer caching | Memory/pre-allocation, cap | locking
      Memcached | partial                    | internal             | partial               | partial                    | yes
      Redis     | no->partial                | external             | no                    | partial                    | no->yes
      Pelikan   | yes                        | internal             | yes                   | yes                        | no

  55. TO BE FAIR…
      MEMCACHED: • multiple worker threads • binary protocol + SASL
      REDIS: • rich set of data structures • master-slave replication • redis-cluster • modules • tools

  56. THE BEST CACHE IS… ALWAYS FAST

  57. QUESTIONS?
