 
timelines at scale @raffi qcon sf 2012
Targeted ⇢ pull: twitter.com, home_timeline API ⇢ push: User / Site Streams, Mobile Push (SMS, etc.)
Queried ⇢ pull: Search API ⇢ push: Track / Follow Streams
the challenge ⇢ > 150M worldwide active users ⇢ > 300K QPS for timelines ⇢ naïve timeline “materialization” can be slow
[architecture diagram: Write API, Ingester, Fanout, Batch Compute, Timeline Cache, Push Compute, Search Cache, HTTP Push, Mobile Push, Timeline Service, Blender; backing stores shown: Redis, Hadoop, Earlybird]
[architecture diagram: Write API, Ingester, Fanout, Social Graph Service, Batch Compute, Timeline Cache, Push Compute, Search Cache, HTTP Push, Mobile Push, Timeline Service, Blender; backing stores shown: Redis, Hadoop, Earlybird]
fanout insert into the Timeline Cache ⇢ keyed off “recipient” ⇢ pipelined 4k “destinations” at a time ⇢ replicated
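The "pipelined 4k destinations at a time" step can be pictured as chunking a follower list into batches and issuing one pipelined round of inserts per batch. The following is a minimal sketch under stated assumptions: the names `batches`, `fanout`, and `send_batch` are hypothetical, and `send_batch` stands in for whatever actually ships a batch to the replicated cache. Only the batch size of 4,000 and the keying by recipient come from the slide.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Tuple

BATCH_SIZE = 4000  # "pipelined 4k destinations at a time" (from the slide)
Insert = Tuple[int, int]  # (recipient_id, tweet_id); keyed off the recipient


def batches(items: Iterable[int], size: int = BATCH_SIZE) -> Iterator[List[int]]:
    """Yield consecutive chunks of at most `size` items."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def fanout(tweet_id: int, follower_ids: Iterable[int],
           send_batch: Callable[[List[Insert]], None]) -> int:
    """Send one batch of inserts per chunk of followers.

    `send_batch` is a hypothetical callback; in a real system each call
    would correspond to one pipelined round trip. Returns the batch count.
    """
    rounds = 0
    for chunk in batches(follower_ids):
        send_batch([(recipient, tweet_id) for recipient in chunk])
        rounds += 1
    return rounds
```

With 10,000 followers this issues three batches (4,000, 4,000, and 2,000 inserts), so the number of round trips grows with follower count divided by the batch size rather than with follower count itself.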
using redis ⇢ native list structure ⇢ each entry: Tweet ID (8 bytes), User ID (8 bytes), Bits (4 bytes)
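The 8 + 8 + 4 byte entry layout can be illustrated with a fixed-width binary record. This is only a sketch: the field order follows the slide, but the byte order, the unsigned integer types, and the helper names `encode_entry` and `decode_entry` are assumptions, since the slide does not specify the wire format.

```python
import struct

# Tweet ID (8 bytes), User ID (8 bytes), Bits (4 bytes), as listed on the slide.
# Big-endian unsigned fields are an assumption; the slide does not say.
ENTRY = struct.Struct(">QQI")


def encode_entry(tweet_id: int, user_id: int, bits: int = 0) -> bytes:
    """Pack one timeline entry into a 20-byte record."""
    return ENTRY.pack(tweet_id, user_id, bits)


def decode_entry(raw: bytes) -> tuple:
    """Unpack a 20-byte record into (tweet_id, user_id, bits)."""
    return ENTRY.unpack(raw)
```

Under this layout every entry occupies 20 bytes, so a timeline of N entries needs 20 x N bytes of payload before any per-list overhead in the store.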
using redis ⇢ native list structure ⇢ RPUSHX to only add to cached timelines [diagram: a list of Tweet ID / User ID / Bits entries]
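RPUSHX appends to a list only when the key already exists, which is what lets fanout skip recipients whose timelines are not currently cached. The sketch below mimics that behavior with an in-memory class instead of a real Redis connection; `TimelineCache` and the key names are hypothetical, and the return values follow the Redis convention (new list length, or 0 when nothing was pushed).

```python
from typing import Dict, List


class TimelineCache:
    """In-memory stand-in that mimics the Redis RPUSH / RPUSHX list commands."""

    def __init__(self) -> None:
        self._lists: Dict[str, List[bytes]] = {}

    def rpush(self, key: str, value: bytes) -> int:
        """Append, creating the list if needed (like RPUSH)."""
        self._lists.setdefault(key, []).append(value)
        return len(self._lists[key])

    def rpushx(self, key: str, value: bytes) -> int:
        """Append only if the list already exists (like RPUSHX); else return 0."""
        lst = self._lists.get(key)
        if lst is None:
            return 0
        lst.append(value)
        return len(lst)

    def exists(self, key: str) -> bool:
        """Report whether a timeline is currently cached."""
        return key in self._lists
```

A fanout that uses `rpushx` never materializes a timeline as a side effect of a write: an uncached recipient stays uncached, and the timeline has to be rebuilt by some other path when that user next reads.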
[diagram: Write API, Fanout, Timeline Cache (Redis), Timeline Service, Gizmoduck, TweetyPie]
blender ⇢ queries one replica of all indexes ⇢ merges & ranks results
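The blender's scatter-gather role can be sketched as: pick one replica for each index, send the query to each, then merge and rank the combined hits. Everything concrete here is an assumption made for illustration: the `blend` function, the `(score, tweet_id)` hit shape, random replica selection, and ranking by score are not described on the slide.

```python
import heapq
import random
from typing import Callable, List, Sequence, Tuple

Hit = Tuple[float, int]              # (score, tweet_id); the scoring is hypothetical
Replica = Callable[[str], List[Hit]]  # a replica answers a query with its hits


def blend(query: str,
          indexes: Sequence[Sequence[Replica]],
          limit: int = 10,
          choose: Callable[[Sequence[Replica]], Replica] = random.choice) -> List[Hit]:
    """Query one replica of every index, then merge and rank the hits."""
    hits: List[Hit] = []
    for replicas in indexes:
        replica = choose(replicas)   # one replica per index
        hits.extend(replica(query))
    # Rank: highest score first, keeping only the top `limit` hits.
    return heapq.nlargest(limit, hits)
```

Because every index is consulted on every query, read cost grows with the number of indexes, which matches the read-heavy side of the trade-off the deck goes on to describe.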
[architecture diagram: Write API, Ingester, Fanout, Batch Compute, Timeline Cache, Push Compute, Search Index, HTTP Push, Mobile Push, Timeline Service, Blender; backing stores shown: Redis, Hadoop, Earlybird]
Cache (Redis), between Write API and Read API ⇢ O(n) write ⇢ O(1) read
Earlybird, between Write API and Read API ⇢ O(1) write ⇢ O(n) read
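The two cost profiles can be contrasted with a toy model. This is a sketch only: `PushTimelines` and `PullTimelines` are hypothetical classes, and the slide does not define n. Here n is the author's follower count on the write side and the number of followed accounts on the read side.

```python
from collections import defaultdict
from typing import Dict, List, Set


class PushTimelines:
    """Fanout on write: O(n) work per tweet, one lookup per read."""

    def __init__(self, followers: Dict[str, Set[str]]) -> None:
        self.followers = followers                      # author -> followers
        self.home: Dict[str, List[int]] = defaultdict(list)

    def write(self, author: str, tweet_id: int) -> None:
        for follower in self.followers.get(author, ()):  # one insert per follower
            self.home[follower].append(tweet_id)

    def read(self, user: str) -> List[int]:
        return list(self.home.get(user, []))            # already materialized


class PullTimelines:
    """Merge on read: one insert per tweet, O(n) work per read."""

    def __init__(self, following: Dict[str, Set[str]]) -> None:
        self.following = following                      # user -> followed authors
        self.tweets: Dict[str, List[int]] = defaultdict(list)

    def write(self, author: str, tweet_id: int) -> None:
        self.tweets[author].append(tweet_id)            # single write

    def read(self, user: str) -> List[int]:
        merged: List[int] = []
        for author in self.following.get(user, ()):     # visit every followed author
            merged.extend(self.tweets.get(author, []))
        return sorted(merged)
```

Both models return the same timeline for the same inputs; they differ only in where the work is paid, which is why the choice depends on the ratio of reads to writes and on follower counts.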
the challenge (part #2) ⇢ fanout can be really slow! ⇢ ...especially for high follower counts
@ladygaga 31 million followers @katyperry 28 million followers @justinbieber 28 million followers @barackobama 23 million followers @raffi 0.019 million followers
there are over 400 million tweets a day
4600 tweets a second ≈ a tweet every 0.2 ms
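The per-second figure follows from the daily volume by straightforward division. The check below uses only the 400 million tweets per day stated in the deck; the variable names are illustrative.

```python
TWEETS_PER_DAY = 400_000_000      # "over 400 million tweets a day" (from the deck)
SECONDS_PER_DAY = 24 * 60 * 60    # 86,400

tweets_per_second = TWEETS_PER_DAY / SECONDS_PER_DAY
ms_between_tweets = 1000 / tweets_per_second

print(f"{tweets_per_second:.0f} tweets/s, one every {ms_between_tweets:.2f} ms")
```

This gives roughly 4,600 tweets per second, or one tweet about every 0.2 ms, as a daily average; peaks are higher.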
search index (Ingester, Earlybird) ⇢ [‘hello’,‘world’]
fanout index (Fanout, Redis Timeline Cache) ⇢ [@danadanger, ...]
User Intent | Query Expansion
“Hello, world” | “Hello” AND “world”
@raffi’s home timeline | home_timeline:raffi
@raffi’s home timeline | user_timeline:nelson OR user_timeline:danadanger
@raffi’s home timeline | home_timeline:raffi OR user_timeline:taylorswift13
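One reading of the last expansion is a hybrid: the precomputed home timeline is combined at read time with the user timelines of followed accounts that are too large to fan out. The sketch below builds such a query string. The function name `expand_home_timeline`, the follower-count threshold, and the rule for choosing which accounts to merge at read time are all assumptions; the deck shows only the resulting query.

```python
from typing import Dict, Iterable


def expand_home_timeline(user: str,
                         following: Iterable[str],
                         follower_counts: Dict[str, int],
                         threshold: int = 1_000_000) -> str:
    """Build a query for `user`'s home timeline.

    Accounts with at least `threshold` followers (a hypothetical cutoff) are
    assumed not to be fanned out, so their user timelines are OR-ed in.
    """
    terms = [f"home_timeline:{user}"]
    for account in sorted(following):
        if follower_counts.get(account, 0) >= threshold:
            terms.append(f"user_timeline:{account}")
    return " OR ".join(terms)
```

When none of the followed accounts crosses the threshold, the query degenerates to the plain `home_timeline:` lookup, so ordinary users keep the cheap read path.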
[architecture diagram with three paths marked: Synchronous Path, Asynchronous Path, Query Path; components: Write API, Ingester, Fanout, Batch Compute, Timeline Cache, Push Compute, Search Index, HTTP Push, Mobile Push, Timeline Service, Blender; backing stores shown: Redis, Hadoop, Earlybird]
timeline query statistics ⇢ >150m active users worldwide ⇢ >300k qps poll-based timelines @ 1ms p50 / 4ms p99 ⇢ >30k qps search-based timelines
tweet input ⇢ ~400m tweets per day ⇢ ~5K/sec daily average ⇢ ~7K/sec daily peak ⇢ >12K/sec during large events
timeline delivery statistics ⇢ 30b deliveries / day (~21m / min) ⇢ 3.5 seconds @ p50 to deliver to 1m ⇢ ~300k deliveries / sec
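The delivery figures can be cross-checked against each other and against the tweet volume. The inputs below are the deck's own numbers (30 billion deliveries and about 400 million tweets per day); the derived average fanout per tweet is a back-of-the-envelope estimate, not a figure from the talk.

```python
DELIVERIES_PER_DAY = 30_000_000_000   # "30b deliveries / day" (from the deck)
TWEETS_PER_DAY = 400_000_000          # "~400m tweets per day" (from the deck)

per_minute = DELIVERIES_PER_DAY / (24 * 60)
per_second = DELIVERIES_PER_DAY / (24 * 60 * 60)
deliveries_per_tweet = DELIVERIES_PER_DAY / TWEETS_PER_DAY  # derived estimate

print(f"{per_minute / 1e6:.1f}m/min, {per_second / 1e3:.0f}k/sec, "
      f"{deliveries_per_tweet:.0f} deliveries per tweet on average")
```

The per-minute value agrees with the stated ~21m / min, and the per-second average of about 347k is in line with the rounded ~300k deliveries / sec.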
thanks!