Optimized for change: Architecture @ Etsy, by Kellan Elliott-McCrea

SLIDE 1

Optimized for change: Architecture @ Etsy

Kellan Elliott-McCrea @kellan CTO, Etsy

Monday, June 18, 12
SLIDE 2
SLIDE 3

Launched June 18, 2005
875,000 active sellers
33.5MM items for sale
$65.9MM in sales, in May
1.4B page views, in May
102 engineers
32 releases, last Friday

SLIDE 4

LAMP

any questions?

8BitLit, http://www.etsy.com/listing/90066890/
SLIDE 5

Why?

SLIDE 6

3 inevitabilities we design for:

1. Things break, unexpectedly
2. What we're building changes
3. We don't get to start over
SLIDE 7

2 years of change.

SLIDE 8

Architectural Principles

* Don't bet against the future.
* Our customers are humans.
* Simplicity always wins, in the end.
* Favor global vs local optimization.
* Ambiguity kills momentum.
* Make failure cheap.
* Technical debt is an inevitable by-product of shipping code.
* Optimize for change.

SLIDE 9 Ckrickett, http://www.etsy.com/listing/90611466

Cleverness

SLIDE 10 Ckrickett, http://www.etsy.com/listing/90611466

Complex systems and change

1. Distributed systems are inherently complex.
2. The outcome of change in complex systems is hard to predict.
3. The outcomes of small, frequent, measurable changes are easier to predict, easier to recover from, and promote learning.

SLIDE 11 Ckrickett, http://www.etsy.com/listing/90611466

Continuous deployment, Metrics Driven Development, Blameless Post-Mortems

SLIDE 12 Ckrickett, http://www.etsy.com/listing/90611466

Continuous deployment: Small, frequent changes to production

SLIDE 13

Continuous Deployment:

No branching.

“All existing revision control systems were built by people who build installed software”
- Paul Hammond, Always Ship Trunk, Velocity 2010

SLIDE 14

Continuous Deployment:

feature flags

if ($cfg['awesome_new_search']) {
    # new hotness
    $rsp = do_solr();
} else {
    # boring old stuff
    $rsp = do_grep();
}

SLIDE 15

Continuous Deployment:

Ramp-ups (on top of feature flags)

1. Launch to staff only
2. Launch to 1% of all users
3. Launch to members of a beta group
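
The three ramp-up stages could be layered on a flag check roughly as follows. This is a minimal sketch, not Etsy's implementation: the `RampUp` settings object, the `is_enabled` function, and the SHA-1 bucketing are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class RampUp:
    """Hypothetical ramp-up settings for one feature flag."""
    staff: bool = False                         # stage 1: staff only
    percent: float = 0.0                        # stage 2: share of all users, 0 to 100
    beta_groups: set = field(default_factory=set)  # stage 3: opted-in groups


def bucket(flag_name: str, user_id: int) -> float:
    """Map (flag, user) to a stable number in [0, 100)."""
    digest = hashlib.sha1(f"{flag_name}:{user_id}".encode()).hexdigest()
    return (int(digest[:8], 16) % 10000) / 100.0


def is_enabled(flag_name, cfg, user_id, is_staff=False, groups=()):
    """Decide whether this user sees the feature."""
    if cfg.staff and is_staff:
        return True
    if cfg.beta_groups & set(groups):
        return True
    # Hashing keeps each user in the same bucket on every request,
    # so raising `percent` only ever adds users.
    return bucket(flag_name, user_id) < cfg.percent
```

Including the flag name in the hash means a user who lands in the 1% for one experiment is not automatically in the 1% for every other experiment.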
SLIDE 16

Continuous Deployment:

any engineer can launch a feature to

1% of users

SLIDE 17

Continuous Deployment:

~200 experiments live right now

SLIDE 18

Metrics driven development:

introspection isn’t optional.

measure everything, log everything

SLIDE 19

Metrics driven development:

Metrics happen when you make it easy. And visible.

SLIDE 20

Metrics driven development:

holtWintersConfidence(Upper|Lower)

Teach the computer to read graphs

SLIDE 21

Metrics driven development:

More info: http://www.slideshare.net/mikebrittain/metricsdriven-engineering

SLIDE 22

Optimize for MTTR, not MTBF

SLIDE 23

How?

SLIDE 24

Etsy

SLIDE 25

Etsy

EMR/S3 PCI BCP, Cold

SLIDE 26

[Diagram: path of an inbound request through the stack. Recoverable labels follow.]

Entry points: etsy.com/, api.etsy.com, /atlas, etsystatic.com/photos, bcn.etsy.com
CDNs - diversified at the DNS level
Internet providers - diversified at borders
Zones: Etsy, AWS, PCI (via jsonp, no privileged access)
Components: network appliances, analytics, imstor, apache, php, application, MySQL, search, memcache, async http, StatsD, sqlite, gearman, logs, server/OS, hardware, Squid, NFS, logrotate, HDFS, EMR, JRuby/Cascading, S3, Thrift, Jetty, Solr slaves, datasets, Solr master, HBase, sharded MySQL, dbindex, dbshards, dbaux, dbdata, mail out, SMTP, X-Yarnblaster, etc

SLIDE 27

CDNs: Put a slider on it

Just works via weighted DNS

SLIDE 28

Apache

* Well known
* PHP is native
* apache_note
* fast start time
* cheap in place replacement
* .htaccess
* Challenge: memory usage

SLIDE 29

Apache: apache_note

apache_note('etsy_uaid', $id);

Additive! Insanely useful! Introspection through the lifecycle

SLIDE 30

Apache: log format

LogFormat "%{X-Forwarded-For}i %{True-Client-IP}i %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %{etsy_shop_id}n %{etsy_uaid}n %V %{etsy_ab_selections}n %{etsy_request_uuid}n %{etsy_api_consumer_key}n %{etsy_api_method_name}n %{php_memory_usage_bytes}n %{php_time_microsec}n %D" combined

SLIDE 31

Etsy: the App

* 487,000 lines of PHP
* 214,000 lines of JavaScript
* Monolithic codebase
* 3 front ends: Etsy.com, API, Atlas

SLIDE 32

Etsy: the App

* routing handled by Apache
* scripts fronting OO PHP5
* PHP, fast by default
* opcode caching
* Challenge: liveliness when calling services

SLIDE 33

Etsy: coding patterns

* lightweight, home-rolled “framework”
* ORM handles DAO across backends
* config and feature flags systems used everywhere
* small slow moving datasets stored as PHP arrays
* A/B tests
* Smarty
* StatsD
* Concurrency
* memcache

SLIDE 34

Etsy: A/B tests

* beaconed
* inserted into logs via apache_note
* conditionalized on feature flags
* nightly reports on conversion, bounce rate, etc
* nightly reports on page speed, memory usage, etc

SLIDE 35

Etsy: Smarty

* pre-compiled
* pre-compiled per language

SLIDE 36

Etsy: StatsD

StatsD::increment("logins.success");
StatsD::timing("gearman.time", $msec);

* 340,000 application metrics
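
Part of why instrumenting with StatsD is cheap is that each call becomes one small UDP datagram in a plain-text format. The sketch below shows that wire format for a counter and a timer; the helper names, host, and port default are assumptions for illustration and not the API of any particular client library.

```python
import socket


def format_counter(name: str, delta: int = 1) -> bytes:
    """StatsD counter line: <name>:<delta>|c"""
    return f"{name}:{delta}|c".encode()


def format_timing(name: str, msec: float) -> bytes:
    """StatsD timer line: <name>:<milliseconds>|ms"""
    return f"{name}:{msec}|ms".encode()


def send(payload: bytes, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Fire-and-forget UDP send; a lost packet costs one data point."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
```

Because the send is UDP with no reply, a slow or absent metrics server does not block the request that emitted the metric.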

SLIDE 37

Etsy: Concurrency

* no native concurrency in PHP
* asynchronous HTTP calls
* Gearman

SLIDE 38

Etsy: Async HTTP calls

* curl_multi_exec
* non-blocking, per request time outs
* used for optional aspects of a page
* curl against http://localhost to avoid network overhead
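
The pattern here (fan out, give each call its own deadline, and render without whatever did not arrive) can be sketched outside PHP as well. The version below uses threads rather than curl_multi_exec; the function name and the idea of passing fetchers as zero-argument callables are assumptions made to keep the example self-contained.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_optional_parts(fetchers, timeouts, default=None):
    """Run all fetchers concurrently; a slow or failing part yields `default`.

    fetchers: dict of part name -> zero-argument callable (hypothetical).
    timeouts: dict of part name -> seconds allowed, measured from the start.
    """
    pool = ThreadPoolExecutor(max_workers=max(1, len(fetchers)))
    started = time.monotonic()
    futures = {name: pool.submit(fn) for name, fn in fetchers.items()}
    results = {}
    for name, future in futures.items():
        remaining = timeouts[name] - (time.monotonic() - started)
        try:
            results[name] = future.result(timeout=max(0.0, remaining))
        except Exception:
            # Timed out or raised: the page renders without this part.
            results[name] = default
    # Do not wait for stragglers; their results are simply dropped.
    pool.shutdown(wait=False)
    return results
```

One difference worth noting: a timed-out thread keeps running in the background, whereas a timed-out curl handle is actually abandoned, so this sketch illustrates the control flow rather than the resource behavior.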

SLIDE 39

Etsy: Gearman

* language agnostic job server
* don’t use an MQ when you want a job server
* 150 job types
* persistent jobs flushed to MySQL, read from memory
* non-persistent jobs just stored in memory
* NP queue is wicked fast.

SLIDE 40

Etsy: Gearman

* scaling CPU of cron jobs
* denormalizing data
* pushing to 3rd party services

SLIDE 41

Etsy: Challenges

* Apache memory usage
* liveliness talking to services: no concurrency, blocking by default

SLIDE 42

Etsy: graph of distributed failure

SLIDE 43

Etsy: Challenges

* Apache memory usage
* liveliness talking to services: no concurrency, blocking by default

Enforce liveliness with a judicious application of force

SLIDE 44

Etsy: judicious application of force

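# /proc/self/statm lists sizes in pages: total, resident, shared, ...
list($v, $res, $shar) = explode(' ', @file_get_contents('/proc/self/statm'));
$mine = $res - $shar;    # pages private to this process
if ($mine > $cfg['sizelimit']) {
    $pid = getmypid();
    @exec("kill -USR1 $pid");    # USR1 is Apache's graceful signal
}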
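
The memory check above depends on the layout of /proc/self/statm, whose first three fields are total size, resident set, and shared pages, all counted in pages. A small sketch of the same arithmetic with the units made explicit follows; the function names and the 4096-byte page size default are assumptions.

```python
def private_pages(statm_text: str) -> int:
    """Resident minus shared pages, from the text of /proc/<pid>/statm."""
    fields = statm_text.split()
    resident, shared = int(fields[1]), int(fields[2])
    return resident - shared


def over_limit(statm_text: str, limit_bytes: int, page_size: int = 4096) -> bool:
    """True when memory private to the process exceeds limit_bytes.

    page_size is an assumed default; on a real host it would come from
    os.sysconf("SC_PAGE_SIZE").
    """
    return private_pages(statm_text) * page_size > limit_bytes
```

Subtracting the shared pages matters because forked Apache children share much of their memory with the parent, so resident size alone overstates what killing one child would free.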

SLIDE 45

Etsy: judicious application of force

Bowhunter

* Find long running PHP processes
* Try to avoid those mid-post

open(APACHE, "/usr/bin/curl -s http://localhost/server-status|") || die "$!";

SLIDE 46

Etsy: judicious application of force

Query_killer

* Same idea, long running queries
* MySQL “SHOW PROCESSLIST;”
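
The selection step of such a query killer is simple once the process list is in hand. The sketch below assumes the rows have already been fetched as dictionaries keyed by the column names that SHOW PROCESSLIST returns (Id, Command, Time, and so on); the 10 second threshold and both function names are hypothetical.

```python
def queries_to_kill(processlist, max_seconds=10):
    """Return the Ids of statements that have run longer than max_seconds."""
    return [
        row["Id"]
        for row in processlist
        # Command is "Query" while a statement executes; idle
        # connections show "Sleep" and are left alone.
        if row["Command"] == "Query" and row["Time"] > max_seconds
    ]


def kill_statements(ids):
    """Build one KILL QUERY statement per offending thread id."""
    return [f"KILL QUERY {int(thread_id)}" for thread_id in ids]
```

KILL QUERY ends the running statement but leaves the connection open, which is gentler than dropping the connection outright.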

SLIDE 47

Memcache

* Caching, obviously
* Cache invalidation is hard
* Write buffering
* multi_get
* rate limits

SLIDE 48

Memcache

* atomic INCR is awesome
* slice your time windows to reduce risk of cache eviction
* we’ve been unlucky, lots of segfaults :(
* multi_get slows down the more boxes in the pool
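
One reading of the INCR and time-slicing advice is a rate limiter that keeps one counter per short slice and sums the recent slices, so an evicted key loses only a slice of the count rather than the whole window. The sketch below follows that reading, which is an interpretation and not something the slides spell out. `FakeCache` is an in-memory stand-in with memcache-like add/incr/get_multi semantics, and the key layout and all names are hypothetical.

```python
class FakeCache:
    """In-memory stand-in for a memcache client (hypothetical interface)."""

    def __init__(self):
        self.data = {}

    def add(self, key, value):
        if key in self.data:
            return False
        self.data[key] = value
        return True

    def incr(self, key, delta=1):
        if key not in self.data:      # memcache INCR fails on a missing key
            return None
        self.data[key] += delta
        return self.data[key]

    def get_multi(self, keys):
        return {k: self.data[k] for k in keys if k in self.data}


def record_hit(cache, who, now, slice_seconds=60):
    """Count one request in the slice that contains `now`."""
    key = f"rl:{who}:{int(now) // slice_seconds}"
    if cache.incr(key) is None and not cache.add(key, 1):
        cache.incr(key)               # another writer created the key first


def recent_hits(cache, who, now, slices=5, slice_seconds=60):
    """Sum the counters of the last `slices` slices with one multi-get."""
    current = int(now) // slice_seconds
    keys = [f"rl:{who}:{current - i}" for i in range(slices)]
    return sum(cache.get_multi(keys).values())
```

A caller would compare `recent_hits(...)` against its own limit before serving the request.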

SLIDE 49

MySQL: By the numbers

* 25K+ queries/sec avg
* 3TB InnoDB buffer pool
* 15TB+ data stored
* 50 servers
* 99.99% queries under 1ms

SLIDE 50

MySQL: a NotMuchSQL server

* no joins
* no foreign keys
* no transactions or locks
* no sub-selects
* store data like you want to read it.
* also: no auto_increment

SLIDE 51

MySQL: a NotMuchSQL server

“Normalization is for sissies.”
- Cal Henderson, Flickr
SLIDE 52

MySQL: scale horizontally

* objects sharded by key
* lookups maintained in dbindex (MySQL is a FAST key-value store)
* avoid key hashing, range partitions, and partitioning functions

more: http://www.slideshare.net/jgoulah/the-etsy-shard-architecture-starts-with-s-and-ends-with-hard
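
Keeping the key-to-shard mapping in a lookup table, rather than deriving it from a hash or range, could look roughly like the sketch below. `ShardIndex` stands in for the dbindex lookup and holds the mapping in a dictionary; the placement policy (fill the emptiest shard) and all names are assumptions.

```python
class ShardIndex:
    """Stand-in for a dbindex-style lookup table: object key -> shard number."""

    def __init__(self, shard_count: int):
        self.shard_count = shard_count
        self.lookup = {}

    def shard_for(self, user_id: int) -> int:
        """Return the user's shard, assigning one on first sight."""
        if user_id not in self.lookup:
            # Placement policy is an assumption: pick the emptiest shard.
            loads = [0] * self.shard_count
            for shard in self.lookup.values():
                loads[shard] += 1
            self.lookup[user_id] = loads.index(min(loads))
        return self.lookup[user_id]

    def move(self, user_id: int, new_shard: int) -> None:
        """Rebalancing is one index update once the rows are copied."""
        self.lookup[user_id] = new_shard
```

The trade-off is an extra lookup per request in exchange for being able to move any single object, or add shards, without rehashing everything else.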

SLIDE 53

MySQL: Master-Master

* objects hashed to a side, avoid split brain
* allows in place schema upgrades without slave promotion
* simplified capacity planning

more: http://codeascraft.etsy.com/2012/04/20/two-sides-for-salvation/

SLIDE 54

MySQL: Introspection

SQL Comments are awesome!

web0038 : [Mon Jun 18 09:58:38 2012] [error] [client 10.101.1.12] [C6kds9y1MVptEDMoOe5KCYha9VWl] [error] [ORM_LONG_QUERY] [/var/etsy/current/phplib/EtsyORM/Query/RawSql.php:752] [15877310] Query exceeded 10 seconds: long_query_time=83.0927 long_query_string='/* [etsy_shard_005_A] [/remove_favorite_listing.php] */ DELETE FROM `users_favoritelistings` WHERE `user_id` = ? AND `listing_id` = ?' long_query_trace='#10 __construct() /EtsyModel/UserFavoriteListingMirror.php:310 #4 delete() /EtsyModel/UserFavoriteListing.php:39 #3 delete() /EtsyModel/User.php:1840 #2 unfavoriteListing() /Controller/Favorites.php:344 #1 removeFavoriteListingRecord() /Controller/Favorites.php:94 #0 performRemoveFavoriteListing() /var/etsy/current/htdocs/remove_favorite_listing.php:9', referer: http://www.etsy.com/people/kellanem/favorites?page=5
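
The logged query carries a leading comment naming the shard and the originating script, which is what lets a slow query in the process list or the slow log be traced back to code. A sketch of adding such a comment follows; the function name is hypothetical and the bracketed layout simply copies what appears in the log line.

```python
def tag_query(sql: str, shard: str, script: str) -> str:
    """Prefix a statement with a comment naming its shard and script."""

    def clean(value: str) -> str:
        # Keep a stray comment terminator from ending the comment early.
        return value.replace("*/", "")

    return f"/* [{clean(shard)}] [{clean(script)}] */ {sql}"
```

For example, `tag_query("DELETE FROM t WHERE id = ?", "etsy_shard_005_A", "/remove_favorite_listing.php")` yields a statement beginning with the same comment shape as the one in the log.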

SLIDE 55

MySQL: Deletes are expensive

* update objects to state=‘deleted’
* use partitions
* truncatenator - on ext3, hard link file, move, delete slowly.

SLIDE 56

Anatomy of a feature: Shop Stats

SLIDE 57

Anatomy of a feature: Shop Stats

“Never get into a land war in Asia, and never build an analytics tool on top of MySQL.”

SLIDE 58

Anatomy of a feature: Shop Stats

* buffer writes in Memcache using predictable keys
* flush to MySQL tables periodically via cron
* bake old data into all possible date ranges, and archive to S3
* truncate tables

SLIDE 59
SLIDE 60

bcn.etsy.com: beaconed event stream

* Server-side and javascript event stream
* At least one per page view
* Apache serving static assets
* Aggregated on HDFS via logrotate
* Archived on S3
* Analyzed via JRuby/Cascading on Hadoop
* Doesn’t use: Flume, Scribe, etc

SLIDE 61

bcn.etsy.com: beaconed event stream

{"event_guid":"c2ffb51808b.6d2be52959ef{".user_id": 8528531,"php_event_name":"s2","php_unique_id":"4fdf1cb5d5c078.37523961","php_event_date":"18\/Jun\/2012:08:19:01","locale_currency_code":"USD","pref_language":"en-US","region":"US","detected_region":"US","accept-languages":"en-US,en","isMobileDevice":"0","isMobileSupported":"0","isTabletSupported":"0","isTouch":"0","isEtsyApp":"0","listing_ids":[60274277,101504389,98682771,88585080],"cids":[14103953,14239293,14247717,14209614],"query":"blue","keywords":["blue","blue","blue","blue"],"position":1,"replay_number":1,"s2_cached":1,"php_ab_test_names":"orm_record_instance_caching;mobile_detector.all_blackberry;multilang_shops_listings.view;ga_replacement_cookie;disable_search_autosuggest;admin_toolbar;translations.live_translations;ab_analytics_test;search_type_experiment;search_ads.max_replays_less;search_diversity_experiment;search_cached_listing_cards;placefinder.cache_memcached_migration;search_stream_a;search_all_items_ignores_supplies;search_default_type;search.two_cluster_deploy;search_parameter_sample;thrift_category2_transform;search.similar_listing_browse_page;orm_replicant_safe_find_many;bottom_first;foreign_language_carousel;search.related_searches_all_items;weddings.srp_promos;search_log_page_position;newrelic;clientlog;google_analytics_async;personalized_endpoint;search_no_dropdown;community_nav_popout;security_settings;search_changes_tooltip;inline_listing_hearts;framelogger;log_normal;analytics_second_beacon;analytics_second_beacon_privileged;analytics_second_beacon_mobile","php_ab_var_names":"1;1;1;1;control;1;0;A;ponycorn_v3;1;threshold_off;1;1;1;0;all_sans_supplies;0;1;1;1;1;0;top;0;0;1;0;1;0;1;1;1;0;1;1;1;0;1;0;1","php_ab_selector_names":"
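
In the event above, `php_ab_test_names` and `php_ab_var_names` are both semicolon-separated lists with 40 entries each, which suggests they pair up by position. Assuming that pairing holds (the slides do not state it), an analysis job could recover which variant each test assigned like this; the function name is hypothetical.

```python
def ab_selections(event: dict) -> dict:
    """Pair each A/B test name with its variant, by list position."""
    names = event["php_ab_test_names"].split(";")
    variants = event["php_ab_var_names"].split(";")
    if len(names) != len(variants):
        raise ValueError("test names and variants are out of step")
    return dict(zip(names, variants))
```

Under that assumption, the event shown would map `search_type_experiment` to `ponycorn_v3` and `ab_analytics_test` to `A`.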

SLIDE 62

Search

[Diagram: search topology. Recoverable labels follow.]

* Search Master; Search Slave01, Search Slave02 ... Search SlaveNN; Web01, Web02 ... WebNN
* BitTorrent to distribute indexes
* 100% of all indexes on each slave
* Thrift, with server affinity to improve cache hit ratio, just returns ids
* databases and memcache hydrate IDs via multi-get, ignore a few failures
* denormalized listing store, transition from MySQL to HBase, not user facing
* pull via cron, push via gearman
* incremental index, every 7 minutes, avoid even numbered cron times

SLIDE 63

Search

* Solr trunk
* Custom ranking via crunched datasets
* BitSet fields for personalized search
* Scaling the JVM
* 32% of visits, 40% of sales
* Also powers categories, unshardable queries
* Next time, just use HTTP
* Up next: custom codecs
* Avoiding sharding

SLIDE 64

Search

* JVM slow start
* Search deployinator does rolling restart
* HotSpot and GC cause unpredictable throughput
* Overfetch - ask multiple servers, go with 1st response
* Index size is important. Don’t store too much.
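
Overfetching trades extra load for predictable latency: the same query goes to several servers and the first usable answer wins, which hides a server that happens to be paused in GC. A minimal sketch follows, with servers modeled as plain callables; that modeling, the function name, and the one second default deadline are assumptions.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def first_response(query, servers, timeout=1.0):
    """Send `query` to every server; return the first successful answer."""
    pool = ThreadPoolExecutor(max_workers=len(servers))
    pending = {pool.submit(server, query) for server in servers}
    deadline = time.monotonic() + timeout
    try:
        while pending:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            done, pending = wait(
                pending, timeout=remaining, return_when=FIRST_COMPLETED
            )
            for future in done:
                if future.exception() is None:
                    return future.result()
        raise TimeoutError("no search server answered")
    finally:
        # Slower replies are ignored rather than awaited.
        pool.shutdown(wait=False)
```

A server that fails fast does not end the race; the loop keeps waiting for the others until the deadline.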

SLIDE 65

Photos

* 400 million photos
* Uploaded locally, then streamed to S3
* GraphicsMagick FTW
* Working set is tiny, served out of Squid
* 2% read failure rate during full S3 outage.
* 0% write failure rate during full S3 outage.

JonathanOtis, http://www.etsy.com/listing/96361102/
SLIDE 66

Technology no longer part of the stack

* Python Twisted
* PostgreSQL and stored procedures
* Scala and MongoDB
* Clojure and Tokyo Tyrant
* Rails
* ActiveMQ
* RabbitMQ
* a "Routes" framework
* building RPMs
* Lighttpd

SLIDE 67

Takeaways

1. A few simple, boring, well known components
2. Extensive instrumentation
3. Rapid iteration and feedback loops
4. Human centric
5. A few tweaks on the classics for scale
6. Technology supports business goals
SLIDE 68

Questions?

More info:
http://codeascraft.etsy.com
http://slideshare.net/etsy
http://github.com/etsy
http://www.etsy.com/jobs
kellan@etsy.com
