

SLIDE 1

AMAZON S3: ARCHITECTING FOR RESILIENCY IN THE FACE OF MASSIVE LOAD

Jason McHugh

SLIDE 2

SETTING THE STAGE

  • Architecting for Resiliency in the Face of Massive Load

– Resiliency -> High availability
– Massive load

  • 1. Many requests
  • 2. Suddenly and with little or no warning
  • 3. Request patterns differ from the norm
SLIDE 3

SETTING THE STAGE

[Timeline, June 17th 2010]
– 17:19:03.122 – zero requests for object “Foo”
– 17:19:10.100 – requests begin
– +151 ms – 1,097 requests
– +293 ms – 3,001 requests
– +~7,000 ms – 34,944 requests

Within a minute the request rate reached 30,000 rps, where it stayed for roughly an hour.

SLIDE 4

AVAILABILITY IS CRITICAL

  • Customers

– Don’t care if you are a victim of your own success
– Expect proper architecture

  • The more successful you are

– The harder this problem becomes
– The more important handling it properly becomes

  • Features

– Availability
– Durability
– Scalability
– Performance

SLIDE 5

KEY TAKEAWAYS

  • This is a hard problem
  • Many techniques exist
  • A successful service has to solve this problem
SLIDE 6

OUTLINE

  • Amazon Simple Storage Service (S3)
  • Presenting the problem
  • Three techniques

– Incorporating caching at scale
– Adaptive consistency to handle flash crowds
– Service protection

  • Conclusion
SLIDE 7

AMAZON S3

  • Simple storage service
  • Launched: March 14, 2006
  • Simple key/value storage system
  • Core tenets: simple, durable, secure, available
  • Financial guarantee of availability

– Amazon S3 has to be above 99.9% available

  • Eventually consistent
SLIDE 8

PRESENTING THE PROBLEM

  • None of this is unique to S3
  • Super simple architecture
  • Natural evolution to handle scale
  • The core problem in all distributed systems
SLIDE 9

A SIMPLE ARCHITECTURE

[Diagram: load balancing across webservers WS 1–WS 3, all backed by a single data store]

SLIDE 10

A SIMPLE ARCHITECTURE

[Diagram: load balancing across webservers WS 1–WS 5, backed by two data stores]

SLIDE 11

A SIMPLE ARCHITECTURE

[Diagram: load balancing across a growing fleet of webservers (WS 1–WS 5 and beyond), backed by multiple data stores]

SLIDE 12

CORE PROBLEMS

  • Weaknesses with simple architecture

– Not cost effective
– Correlation in customer requests to machine resources creates hotspots
– A single machine hotspot can take down the entire service

  • Even when a request need not use that machine!
SLIDE 13

ILLUSTRATING THE CORE PROBLEMS

[Diagram: a load balancer in front of webservers WS 1–WS 5, backed by multiple data stores]

SLIDE 14

MASSIVE LOAD

  • Massive load characteristics

– Large, unexpected, and with request patterns that differ from the norm

  • Capacity planning is a different problem
  • Massive load manifests itself as hotspots
  • Can’t you avoid hotspots with the right design?
SLIDE 15

HOTSPOT MANAGEMENT – FALLACIES

  • Fallacy: When a fleet is stateless then you don’t have to worry

– Consider webservers and load balancers

[Diagram: a single hardware load balancer with a 40 Gbps link in front of WS 1–WS 3, versus two hardware load balancers (HW LB 1, HW LB 2), each with a 40 Gbps link, in front of WS 1–WS 4]

SLIDE 16

HOTSPOT MANAGEMENT – FALLACIES

  • Fallacy: You only have to worry about the customer objects which grow the fastest

– S3 object growth is the fastest
– S3 buckets grow slowly
– But bucket information is accessed for all requests
– Buckets become hotspots

  • Don’t conflate orders of growth with hotspots
SLIDE 17

HOTSPOT MANAGEMENT – FALLACIES

  • Fallacy: Hash distribution of resources solves all hotspot problems

– Does a great job of distributing even the most granular unit accessed by the system
– The problem is that the most granular unit can itself become popular

SLIDE 18

SIMPLIFIED S3 ARCHITECTURE

[Diagram: a webserver issues Get “/foo” to storage and receives a byte stream]

SLIDE 19

SIMPLIFIED S3 ARCHITECTURE

[Diagram: webservers 1–W sit across a network boundary from storage nodes 1–S; keys are partitioned across the storage nodes (e.g. keys A, J, R, … on Storage 1; B, K, S, … on Storage 2; C, L, T, … on Storage 3)]
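To make the key partitioning concrete, here is a minimal sketch of how a webserver might map a key to the storage node that owns it. The node names and the hash-modulo placement are illustrative assumptions, not S3's actual scheme.

```python
import hashlib

# Hypothetical storage fleet; in the diagram each node owns a slice of the key space.
STORAGE_NODES = ["storage-1", "storage-2", "storage-3"]

def storage_node_for(key: str) -> str:
    """Map a key to the storage node that owns it (illustrative hash-modulo placement)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return STORAGE_NODES[int.from_bytes(digest[:8], "big") % len(STORAGE_NODES)]

# Every webserver computes the same mapping, so Get /foo always lands on the same node.
print(storage_node_for("/foo"))
```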

SLIDE 20

Resiliency Techniques

  • Caching at Scale
  • Adaptive Consistency
  • Service Protection
SLIDE 21

RESILIENCY TECHNIQUE – CACHING AT SCALE

  • Architecture on prior slide creates hotspots
  • Introduce a cache to avoid hitting the storage nodes

– Requests can be handled higher up in the stack
– Serviced out of memory

  • Cache increases availability

– Negative impact on consistency
– Standard CAP stuff
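A minimal read-through cache sketch of the idea above: repeated reads are answered out of memory and only a miss touches the storage layer. The in-process dict and the fetch callback are stand-ins for a real caching fleet.

```python
from typing import Callable, Dict

class ReadThroughCache:
    """Serve repeated GETs from memory; only a miss reaches the storage layer."""

    def __init__(self, fetch_from_storage: Callable[[str], bytes]):
        self._fetch = fetch_from_storage
        self._entries: Dict[str, bytes] = {}

    def get(self, key: str) -> bytes:
        if key in self._entries:          # hit: the storage node is shielded
            return self._entries[key]
        value = self._fetch(key)          # miss: one request reaches storage
        self._entries[key] = value        # later requests are served out of memory
        return value

# Usage sketch: the lambda stands in for a call to the storage fleet.
cache = ReadThroughCache(lambda key: f"bytes-for-{key}".encode())
cache.get("/foo")   # miss, fetched from storage
cache.get("/foo")   # hit, served out of memory
```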

SLIDE 22

RESILIENCY TECHNIQUE – CACHING AT SCALE

  • Caching is all about the cache hit rate
  • At scale a cache must contend with:

– Working set size and the long tail
– Cache invalidation techniques
– Memory overhead per cache entity
– Management overhead per cache entity

SLIDE 23

RESILIENCY TECHNIQUE – CACHING AT SCALE

  • Naïve techniques won’t work
  • Caching via distributed hash tables

– Primary advantage: the distribution of requests to cache nodes can use different dimensions of the incoming request for routing
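A sketch of consistent-hash routing, one common way to realize the distributed-hash-table caching described above. The node names, virtual-node count, and the choice of routing on the bucket name versus the full key are illustrative assumptions.

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Route a request dimension (e.g. bucket or key) to one of many cache nodes."""

    def __init__(self, nodes, vnodes=64):
        self._ring = []                        # sorted list of (hash, node)
        for node in nodes:
            for i in range(vnodes):            # virtual nodes smooth the distribution
                self._ring.append((self._hash(f"{node}#{i}"), node))
        self._ring.sort()

    @staticmethod
    def _hash(value: str) -> int:
        return int.from_bytes(hashlib.sha1(value.encode()).digest()[:8], "big")

    def node_for(self, routing_value: str) -> str:
        i = bisect.bisect(self._ring, (self._hash(routing_value), ""))
        return self._ring[i % len(self._ring)][1]

ring = ConsistentHashRing(["cache-1", "cache-2", "cache-3"])
# Different dimensions of the incoming request can drive routing, e.g. the bucket
# name for bucket metadata and the full key for object metadata.
print(ring.node_for("my-bucket"), ring.node_for("my-bucket/photos/foo.jpg"))
```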

SLIDE 24

RESILIENCY TECHNIQUE – CACHING AT SCALE

[Diagram: webservers 1–N sit across a network boundary from a caching fleet (Cache 1–C) and storage nodes (Storage 1–S); keys are distributed across the cache nodes (e.g. A, C, … on Cache 1; B, K, … on Cache 2; T, … on Cache C) independently of their placement on the storage nodes (A, J, R, … on Storage 1; B, K, S, … on Storage 2; C, L, T, … on Storage 3)]

SLIDE 25

RESILIENCY TECHNIQUE – CACHING AT SCALE

  • Mitigate the impact on consistency
  • Cache Spoilers

– Ruins the cached value on a node
– Caused by

  • Fleet membership inconsistencies
  • Network unreachability
  • Inability to communicate with the proper machine due to transient machine failures

SLIDE 26

CACHE SPOILER IN ACTION

[Sequence diagram across the network boundary (Webserver 1, Webserver 2, Storage 1, Cache 1, Cache 2): a Get for k populates one cache with <k,v>; a later Put of k,v2 flows through the other cache to Storage 1, leaving one cache with the stale <k,v> and the other with <k,v2>]

SLIDE 27

CACHE SPOILER SOLUTIONS

  • Segment keys into sets of keys

– Cache individual keys
– Requests are for individual keys
– Invalidation unit is for a set

SLIDE 28

CACHE SPOILER SOLUTIONS

  • Identifying spoiler agents

– Capture the last writer to a set – it will be the owner
– Create generations to capture the last writer
– New owner removes any prior generation for a set

  • Periodically

– Each cache node learns about all generations that are valid

SLIDE 29

CACHE SPOILER IN ACTION

[Sequence diagram across the network boundary (Webserver 1, Webserver 2, Storage 1, Cache 1, Cache 2): Cache 1 holds <k1,v,g1> and Set 1 = {k1, k2, k3, …} is owned by Cache 1 at generation g1; a Put of k1,v2 arrives via Cache 2, which takes ownership of Set 1 at generation g2 and stores <k1,v2,g2>; the valid generation for Set 1 moves from g1 to g2, so a later Get of k1 against Cache 1’s stale entry misses]

SLIDE 30

CACHE SPOILER SOLUTIONS

  • Validity

– All cache entities have a generation associated with them
– All cache nodes have a set of valid generations
– Lookup for K in the cache will fail when the generation associated with K is not in the valid set
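A minimal sketch of the generation scheme from the last few slides: each cached entry carries the generation under which it was written, each cache node keeps the set of generations it currently believes are valid, and a lookup whose generation has been invalidated behaves as a miss. The class and method names are illustrative.

```python
from typing import Dict, Optional, Set, Tuple

class GenerationCache:
    """Cache node: entries are (value, generation); stale generations read as misses."""

    def __init__(self):
        self._entries: Dict[str, Tuple[str, str]] = {}  # key -> (value, generation)
        self._valid_generations: Set[str] = set()

    def learn_valid_generations(self, generations: Set[str]) -> None:
        # Periodically refreshed: each cache node learns which generations are valid.
        self._valid_generations = set(generations)

    def put(self, key: str, value: str, generation: str) -> None:
        self._entries[key] = (value, generation)

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, generation = entry
        if generation not in self._valid_generations:
            return None          # generation was spoiled; treat as a cache miss
        return value

# Cache 1 wrote k1 under generation g1; after another node takes ownership of the
# set with generation g2, g1 is no longer valid and the stale entry stops being served.
cache1 = GenerationCache()
cache1.learn_valid_generations({"g1"})
cache1.put("k1", "v", "g1")
assert cache1.get("k1") == "v"
cache1.learn_valid_generations({"g2"})
assert cache1.get("k1") is None
```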

SLIDE 31

Resiliency Techniques

  • Caching at Scale
  • Adaptive Consistency
  • Service Protection
SLIDE 32

Resiliency Technique – Adaptive Consistency

  • Flash Crowds

– Surge in requests for a very small set of resources
– Worst case scenario is for a single entity within your system
– These are valid use cases

SLIDE 33

FLASH CROWDS IN ACTION

[Diagram: a flash crowd of Get K requests at 30,000 rps arrives at webservers 1–N, all targeting the same key K across the caching fleet (Cache 1–C) and storage fleet (Storage 1–S) behind the network boundary]

SLIDE 34

RESILIENCY TECHNIQUE – ADAPTIVE CONSISTENCY

  • Trade off consistency to maintain availability
  • Cache at the Webserver layer
  • If done incorrectly can result in a see-saw effect
  • Back channel communications to the caching fleet

– Knows about the shielding being done
– Knows the “effective” request rate
– Can incorporate this information to know whether or not it would be overloaded if shielding weren’t done
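A sketch of webserver-side shielding with a back channel to the cache, under the assumption that the cache's reply carries an overload hint and that each request reports how many calls the webserver absorbed locally; field names such as overload and shielded are made up for illustration.

```python
import time

class ShieldingWebserver:
    """Serve a hot key from a short-lived local copy while the cache is overloaded."""

    def __init__(self, cache_client, ttl_seconds=1.0):
        self._cache = cache_client
        self._ttl = ttl_seconds
        self._local = {}          # key -> (value, fetched_at)
        self._shielded = {}       # key -> requests absorbed since the last report

    def get(self, key):
        entry = self._local.get(key)
        if entry is not None and time.time() - entry[1] < self._ttl:
            # Shield the caching fleet: answer locally, trading consistency for availability.
            self._shielded[key] = self._shielded.get(key, 0) + 1
            return entry[0]
        # Back channel: report how many requests we absorbed so the cache can compute
        # its "effective" request rate and decide whether it is still overloaded.
        reply = self._cache.get(key, shielded=self._shielded.pop(key, 0))
        if reply["overload"]:
            self._local[key] = (reply["value"], time.time())
        else:
            self._local.pop(key, None)   # stop shielding once the surge subsides
        return reply["value"]
```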

SLIDE 35

RESILIENCY TECHNIQUE – ADAPTIVE CONSISTENCY

[Diagram: webservers 1–N shield Cache 2 across the network boundary by serving k from local copies of <k, v>; each Get k sent to Cache 2 reports heavy-hitter counts (e.g. “Heavy Hitters: k, 1000”) and the number of shielded requests (e.g. “Get K Shielded: 72”), and each reply carries the result <k, v>, an Overload flag (true/false), and a ShieldGoodness value (e.g. 100)]

SLIDE 36

Resiliency Techniques

  • Caching at Scale
  • Adaptive Consistency
  • Service Protection
SLIDE 37

RESILIENCY TECHNIQUE – SERVICE PROTECTION

  • When possible do something smart to absorb and handle incoming requests

  • As a last resort every single service must protect itself from an overwhelming load from an upstream service

  • Goal is to shed load

– Early
– Fairly

SLIDE 38

LOAD SHEDDING

  • Two standard techniques

– Strict resource allocation
– Adaptive

SLIDE 39

LOAD SHEDDING – RESOURCE ALLOCATION

  • Hand out resource credits
  • Ensure credits never exceed capacity of the

service

  • Replace credits over time
  • Number of credits for client can grow or shrink
  • ver time
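A minimal sketch of strict resource allocation as refilling credits: each client is handed a budget, the budgets together never exceed the service's capacity, and a request from a client with no credits left is shed. The even split and the numbers are illustrative assumptions.

```python
import time

class CreditAllocator:
    """Strict resource allocation: per-client credits that refill over time."""

    def __init__(self, capacity_per_second: float, clients):
        # The per-client allocations sum to the service capacity, never more.
        self._rate = capacity_per_second / len(clients)   # refill rate per client
        self._burst = self._rate                          # max credits a client can hold
        self._credits = {c: self._burst for c in clients}
        self._last = {c: time.monotonic() for c in clients}

    def try_acquire(self, client: str, cost: float = 1.0) -> bool:
        now = time.monotonic()
        elapsed = now - self._last[client]
        self._last[client] = now
        # Replace credits over time, capped at the client's allocation.
        self._credits[client] = min(self._burst, self._credits[client] + elapsed * self._rate)
        if self._credits[client] >= cost:
            self._credits[client] -= cost
            return True
        return False                                      # shed: client is out of credits

allocator = CreditAllocator(capacity_per_second=1000, clients=["client-a", "client-b"])
print(allocator.try_acquire("client-a"))   # True while client-a is within its budget
```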
SLIDE 40

LOAD SHEDDING – RESOURCE ALLOCATION

  • Positives

– Ensures that all work done by a machine is useful work
– Tight guarantees on response time

  • Negatives

– Tight coupling between client and server
– Work for all APIs must be comparable
– Capacity of the server must be a fixed limit and computed ahead of time

  • Independent of execution order of APIs
  • Specific costs of APIs
  • Must be constantly changed
SLIDE 41

LOAD SHEDDING – ADAPTIVE

  • Recognize when you cannot satisfy a caller’s request and shed

  • Callers can assign to each request

– Priority
– Time willing to wait

  • Shed load when

– Accepting the request would cause the process or machine to fail
– Reasonably certain that you wouldn’t be able to satisfy the caller’s requirements

SLIDE 42

LOAD SHEDDING – ADAPTIVE

  • Probabilistically shed load based on the priority of the request and how overloaded the server is (see the sketch after this list)

– If effective load is 2x what the server can handle, shed 50%
– If effective load is 1000x what the server can handle, shed 99.9%

  • Avoid feedback loops

– Clients react to shedding
– Create surges of over/under max capacity
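A sketch of the probabilistic rule above: when the effective load exceeds capacity, shed each request with probability 1 - capacity/effective_load, which gives the 50% and 99.9% figures on this slide; scaling the probability by request priority is left as a comment.

```python
import random

def should_shed(effective_load: float, capacity: float) -> bool:
    """Shed with probability 1 - capacity/effective_load when overloaded.

    2x the capacity    -> shed ~50% of requests
    1000x the capacity -> shed ~99.9% of requests
    """
    if effective_load <= capacity:
        return False                       # not overloaded: accept everything
    shed_probability = 1.0 - capacity / effective_load
    # A real server could scale this probability down for high-priority requests.
    return random.random() < shed_probability

# With effective load at 2x capacity, roughly half of the requests are shed.
accepted = sum(not should_shed(2000.0, 1000.0) for _ in range(10_000))
print(f"accepted ~{accepted / 100:.0f}% of requests")
```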

SLIDE 43

LOAD SHEDDING – ADAPTIVE

  • Positives

– Works in almost all situations
– Allows for explicit priority of requests

  • Negatives

– Work must still be done on the server to shed load
– Cannot stop oscillations

SLIDE 44

CONCLUSION

  • A colleague remarked, “Isn’t this just about making a cache?”

– A simple cache at scale is hard to do

  • Billions of objects
  • High cache hit rate

– Making intelligent and adaptive choices about when to cache
– Finally, the steps that you have to take to protect the cache

SLIDE 45

CONCLUSION

  • Reacting to massive load is a hard problem
  • Three techniques

– Incorporating caching at scale
– Adaptive consistency
– Service protection

  • Amazon AWS is hiring: http://aws.amazon.com/jobs
SLIDE 46

QUESTIONS?