SLIDE 1
Building Real-Time Visualizations at Scale
Mike Barry @msb5014
Kevin Robinson @krob
SLIDE 2
SLIDE 3
Hello!
SLIDE 4
analytics.twitter.com
SLIDE 5
Hello!
SLIDE 6
Answers
SLIDE 7
Building Real-time Visualizations
Real-time
Actionable
User-focused
SLIDE 8
Analytics at Twitter
Architecture
Higher-level abstractions
Human flexibility
SLIDE 9
Typical Analytics Pipeline
[Diagram labels: HDFS, Batch Job, Historic Dataset, streaming events, queue, Realtime Job, Realtime Dataset, Query Mediator]
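The slide only names the components, so the following is a minimal sketch of what a query mediator might do, assuming the historic dataset is complete up to some cutoff and the realtime dataset covers everything after it. The function name `merge_counts`, the `batch_cutoff` parameter, and the dict-of-buckets layout are all hypothetical and not taken from the talk.

```python
def merge_counts(historic, realtime, batch_cutoff):
    """Combine per-bucket counts from a batch store and a realtime store.

    Assumption (not from the slides): the batch job has fully processed
    every bucket strictly below `batch_cutoff`, so those values are
    trusted; buckets at or after the cutoff come from the realtime job.
    """
    merged = {}
    for bucket, count in historic.items():
        if bucket < batch_cutoff:
            merged[bucket] = count
    for bucket, count in realtime.items():
        if bucket >= batch_cutoff:
            merged[bucket] = count
    return merged


# Hypothetical data: bucket 1 exists in both stores, the batch value wins.
historic = {0: 10, 1: 12}
realtime = {1: 11, 2: 5, 3: 7}
result = merge_counts(historic, realtime, batch_cutoff=2)
```

Preferring the batch value for overlapping buckets is one possible policy; a real mediator could just as well sum disjoint partial results.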
SLIDE 10
Do more work on write so that reads are fast
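One way to read this principle, sketched under assumptions: every incoming event updates several pre-aggregated counters at write time, so a read becomes a single lookup instead of a scan. The class `RollupStore`, its in-memory dict, and the bucket widths are illustrative choices, not the system described in the talk.

```python
from collections import defaultdict

# Illustrative bucket widths in seconds (assumed, not from the slides).
GRANULARITIES = {"minute": 60, "hour": 3600, "day": 86400}


class RollupStore:
    """In-memory stand-in for a store of pre-aggregated counters."""

    def __init__(self):
        self.counts = defaultdict(int)

    def record(self, epoch_seconds, n=1):
        # The extra work happens here: one write per granularity.
        for name, width in GRANULARITIES.items():
            bucket_start = epoch_seconds - epoch_seconds % width
            self.counts[(name, bucket_start)] += n

    def read(self, granularity, bucket_start):
        # Reads are a single key lookup, no scanning of raw events.
        return self.counts.get((granularity, bucket_start), 0)


store = RollupStore()
for ts in (0, 59, 61, 3600):
    store.record(ts)
```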
SLIDE 11
How many impressions from X to Y?
SLIDE 12
How many impressions from X to Y?
[Diagram: the range from X to Y, covered by minute buckets]
SLIDE 13
How many impressions from X to Y?
[Diagram: the range from X to Y, covered by minute, hour, and day buckets]
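The range query above can be sketched as a decomposition of the interval from X to Y into aligned buckets, preferring the widest bucket that fits at each step. This is an assumed reading of the diagram; the function `cover_range` and the half-open, minute-aligned interval convention are hypothetical.

```python
MINUTE, HOUR, DAY = 60, 3600, 86400


def cover_range(start, end):
    """Split [start, end) into aligned (width, bucket_start) pairs.

    Both bounds are epoch seconds and must be minute-aligned. At each
    position the widest bucket that is aligned and still fits is used.
    """
    if start % MINUTE or end % MINUTE:
        raise ValueError("start and end must be minute-aligned")
    buckets = []
    t = start
    while t < end:
        for width in (DAY, HOUR, MINUTE):
            if t % width == 0 and t + width <= end:
                buckets.append((width, t))
                t += width
                break
    return buckets


# 00:59 to 02:01 on day zero: one minute, one full hour, one minute.
buckets = cover_range(3540, 7260)
```

With minute buckets alone the same range would need 62 lookups; with the coarser buckets it needs 3. The answer to the query is then the sum of the pre-aggregated counts for the returned buckets.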
SLIDE 14
Abstractions
Scalding
Storm
Summingbird
Tsar
Heron
SLIDE 15
Scaling up
SLIDE 16
Communicate Fearlessly to Build Trust
SLIDE 17
Human Flexibility
Globally-available data + Flexible individuals + Hack weeks = innovation
SLIDE 18
Analytics at Twitter
Architecture
Higher-level abstractions
Human flexibility
SLIDE 19
SLIDE 20
SLIDE 21
SLIDE 22
SLIDE 23
Initial assumptions
Shortest path to usefulness
Real users and data change everything
SLIDE 24
Initial assumptions
Shortest path to usefulness
Real users and data change everything
SLIDE 25
No data!
Existing data sources? Nope.
Predictable usage or distributions? Nope.
Hmm...
SLIDE 26
SLIDE 27
Assumptions about data
SLIDE 28
Assumptions about data
SLIDE 29
How can we make them explicit?
SLIDE 30
How can we make them explicit?
SLIDE 31
Excel prototype
SLIDE 32
Excel prototype
SLIDE 33
Excel prototype
SLIDE 34
Initial assumptions about data
Shortest path to usefulness
Real users and data change everything
SLIDE 35
Initial assumptions about data
Shortest path to usefulness
Real users and data change everything
SLIDE 36
Let’s build!
SLIDE 37
Let’s build!
SLIDE 38
Let’s build!
SLIDE 39
Real-time computation
SLIDE 40
Real-time computation
SLIDE 41
Let’s build: Prototype feature
SLIDE 42
Prototype feature
SLIDE 43
Prototype feature
SLIDE 44
Prototype feature
SLIDE 45
Prototype feature
SLIDE 46
Prototype feature
SLIDE 47
Production feature
SLIDE 48
More fault-tolerance
SLIDE 49
More fault-tolerance
Local Cascading jobs
Subsets or samples of real data
In-memory tests
SLIDE 50
More fault-tolerance
Local Cascading jobs
Subsets or samples of real data
In-memory tests
More data only a command line away
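As an illustration of working with "subsets or samples of real data", here is a minimal sketch of deterministic, key-based sampling, so that a local run always sees the same slice and every event for a given key stays together. The helper names, the `user_id` field, and the choice of SHA-256 are assumptions for the example, not details from the talk.

```python
import hashlib


def in_sample(key, percent):
    """Return True if `key` falls into a stable `percent`% sample."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash onto 0..99.
    return int.from_bytes(digest[:8], "big") % 100 < percent


def sample_events(events, percent, key_field="user_id"):
    """Keep only events whose key lands in the sample (hypothetical schema)."""
    return [e for e in events if in_sample(str(e[key_field]), percent)]


# Hypothetical events: 50 users, two events each.
events = [{"user_id": i % 50, "impressions": 1} for i in range(100)]
subset = sample_events(events, percent=10)
```

Because the decision depends only on the key, rerunning the job locally reproduces the same subset, and raising `percent` only ever adds keys to it.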
SLIDE 51
Ready for real users!
SLIDE 52
Initial assumptions about data
Shortest path to usefulness
Real users and data change everything
SLIDE 53
Initial assumptions about data
Shortest path to usefulness
Real users and data change everything
SLIDE 54
High-touch feedback
SLIDE 55
SLIDE 56
Exploring the data
SLIDE 57
Real-time prototyping
SLIDE 58
Real-time prototyping
[GIF: Kafka logs]
SLIDE 59
Real-time prototyping
SLIDE 60
Real-time prototyping
SLIDE 61
Real-time prototyping
sublime samples
SLIDE 62
Real-time prototyping
sublime samples
SLIDE 63
Real-time prototyping
sublime samples
SLIDE 64
Real-time prototyping
sublime samples
SLIDE 65
Real-time prototyping
sublime samples
SLIDE 66
What’s the TL;DR?
SLIDE 67
Answers Events
SLIDE 68
Opinionated
SLIDE 69
Opinionated
SLIDE 70
TL;DR
SLIDE 71
TL;DR
SLIDE 72
TL;DR
SLIDE 73
SLIDE 74
Initial assumptions about data
Shortest path to usefulness
Real users and data change everything
SLIDE 75
Conclusion
Lambda architecture
Opening everything enables re-use
Higher-level abstractions
Full-stack iteration
SLIDE 76