Scaling Automated Database Monitoring at Uber with M3 and Prometheus
Richard Artoul

Agenda
01 Automated database monitoring
02 Why scaling automated monitoring is hard
03 M3 architecture and why it scales so well
04 How you can use M3
Uber’s “Architecture”
2015 2019
○ 4000+ microservices - most of which directly or indirectly depend on storage
○ 22+ storage technologies - ranging from C* to MySQL
○ 1000s of dedicated servers running databases - monitoring all of these is hard
Monitoring Databases
Application / Technology / Hardware
Hardware Level Metrics
Technology Level Metrics
Application Level Metrics
○ All queries against a given database
○ All queries issued by a specific service
○ A specific query
  ■ SELECT * FROM TABLE WHERE USER_ID = ?
Monitoring Applications at Scale
Microservices w/ dedicated storage
Instances per service
Instances per DB cluster
Queries per service
Metrics per query
480 million unique time series
100+ million dollars!
Writes per second (post-replication)
Datapoints emitted pre-aggregation
Unique Metric IDs
Datapoints read per second
Workload
A Brief History of M3
○ No replication, operations were ‘cumbersome’
○ Solved operational issues
○ 16x YoY growth
○ Expensive (> 1,500 Cassandra hosts)
○ Compactions => R.F=2
M3DB
An open source distributed time series database
○ Configurable retention
○ Stores time series data for real-time workloads
High-Level Architecture
Like a Log-Structured Merge tree (LSM), except that where a typical LSM uses levelled or size-based compaction, M3DB uses time-window compaction.
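The difference can be sketched roughly as follows (illustrative Python, not M3DB's actual code): writes are bucketed into fixed time-window blocks, so once a window has passed, its block can be sealed and flushed without ever being re-compacted against neighboring blocks.

```python
# Illustrative sketch of time-window bucketing (not M3DB's actual code).
# Each datapoint lands in the block that owns its time window; once the
# window passes, that block is sealed and never merged with neighbors.
BLOCK_SIZE = 2 * 60 * 60  # assume 2-hour blocks for the example

def block_start(timestamp):
    """Return the start of the time window that owns this timestamp."""
    return timestamp - (timestamp % BLOCK_SIZE)

blocks = {}  # block start time -> list of (series_id, timestamp, value)

def write(series_id, timestamp, value):
    blocks.setdefault(block_start(timestamp), []).append((series_id, timestamp, value))

write("cpu.load", 100, 0.42)
write("cpu.load", 7300, 0.57)  # lands in the next 2-hour block
```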
Topology and Consistency
○ No gossip ○ Replicated with zone/rack aware layout and configurable replication factor
○ Configurable consistency level for both read and write
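For example, with R.F=3 the number of replica acknowledgements a write waits for depends on the configured consistency level. A minimal sketch (level names follow M3DB's one/majority/all convention; the function itself is illustrative):

```python
# Sketch of how a configurable consistency level translates into the
# number of replica acks a write must collect (illustrative).
def required_acks(level, replication_factor=3):
    if level == "one":
        return 1
    if level == "majority":
        return replication_factor // 2 + 1
    if level == "all":
        return replication_factor
    raise ValueError(f"unknown consistency level: {level}")

# With R.F=3, a "majority" write succeeds once 2 of 3 replicas ack.
print(required_acks("majority"))  # 2
```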
Cost Savings and Performance
○ ~1.4PB for Cassandra at R.F=2
○ ~200TB for M3DB at R.F=3
○ Hundreds of thousands of writes/s on commodity hardware
Centralized Elasticsearch Index
Elasticsearch Index - Write Path
m3agg → E.S Indexer, with a Redis “don’t write” cache in front of Elasticsearch
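A minimal sketch of the “don’t write” cache idea (assumed behavior; names are illustrative): the indexer remembers recently indexed metric IDs in Redis, so repeat datapoints for the same series don't turn into repeat Elasticsearch writes.

```python
# Illustrative "don't write" cache (TTL/expiry handling omitted for brevity).
def index_metric(metric_id, dont_write_cache, es_index):
    """Index metric_id into Elasticsearch unless we recently did so."""
    if metric_id in dont_write_cache:
        return False               # seen recently: skip the E.S write
    es_index(metric_id)            # write the metric document to E.S
    dont_write_cache.add(metric_id)
    return True

es_writes = []
cache = set()  # stands in for Redis
index_metric("service.foo.latency", cache, es_writes.append)
index_metric("service.foo.latency", cache, es_writes.append)  # deduplicated
print(es_writes)  # ['service.foo.latency']
```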
Influx of new metrics:
Deployment
M3DB
Elasticsearch Index - Read Path
Query → 1) E.S → 2) M3DB
Elasticsearch Index - Read Path
Query → Redis query cache → E.S. The cache needs a high T.T.L to avoid overwhelming E.S, but a high T.T.L means a long delay before new time series become queryable.
Elasticsearch Index - Read Path
Query → Redis query cache (short TTL) → E.S Short
Query → Redis query cache (long TTL) → E.S Long
Merge on read
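The two-tier read path above can be sketched as follows (illustrative Python; the class and function names are assumptions, and the real system uses Redis rather than an in-process dict):

```python
import time

class TTLCache:
    """Toy stand-in for a Redis cache with per-entry TTL."""
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.entries = {}  # query -> (expires_at, ids)

    def get(self, query):
        hit = self.entries.get(query)
        if hit and hit[0] > time.time():
            return hit[1]
        return None

    def put(self, query, ids):
        self.entries[query] = (time.time() + self.ttl, ids)

def query_index(query, short_cache, long_cache, es_short, es_long):
    """Resolve a query against both index tiers and merge on read."""
    recent = short_cache.get(query)
    if recent is None:
        recent = es_short(query)   # short-retention E.S index: new series show up fast
        short_cache.put(query, recent)
    old = long_cache.get(query)
    if old is None:
        old = es_long(query)       # long-retention E.S index, shielded by a long TTL
        long_cache.put(query, old)
    return recent | old            # merge on read
```

New series become queryable within the short cache's TTL, while the long-TTL cache keeps repeat queries off the long-term Elasticsearch index.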
Elasticsearch Index - Final Breakdown
cache, and breaks everything if you deploy it too quickly
cluster indexes
documents from the long term index
M3 Inverted Index
M3 Inverted Index
○ service = “foo” AND ○ endpoint = “bar” AND ○ client_version regexp matches “3.*”
service="foo" AND endpoint="bar" AND client_version="3.*"
client_version="3.*" expands to: client_version="3.1" OR client_version="3.2"
AND → Intersection OR → Union
M3 Inverted Index - F.S.Ts
○ An efficient and compressed structure for fast regexp evaluation
○ Each matched label value is used to unpack another data structure that contains the set of metric IDs associated with that value (a postings list)
Encoded Relationships
are → 4
ate → 2
see → 3
Compressed + fast regexp!
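The mapping above (are→4, ate→2, see→3) can be imitated with a toy transducer: keys share a prefix trie, each edge carries an integer output, and summing edge outputs along a key's path yields its value - in M3's case, the location of that label value's postings list. A minimal sketch (illustrative only; real F.S.T libraries also share suffixes and support regexp traversal):

```python
# Toy FST-like map: edge outputs sum along a key's path to its value.
class Node:
    def __init__(self):
        self.edges = {}  # char -> (output: int, child: Node)

def insert(root, key, value):
    node, remaining = root, value
    for ch in key:
        if ch in node.edges:
            out, child = node.edges[ch]
            if out > remaining:
                # push the excess output down onto the child's edges
                excess = out - remaining
                for c2, (o2, n2) in child.edges.items():
                    child.edges[c2] = (o2 + excess, n2)
                node.edges[ch] = (remaining, child)
                remaining = 0
            else:
                remaining -= out
            node = child
        else:
            child = Node()
            node.edges[ch] = (remaining, child)  # leftover goes on first new edge
            remaining = 0
            node = child

def lookup(root, key):
    node, total = root, 0
    for ch in key:
        if ch not in node.edges:
            return None
        out, node = node.edges[ch]
        total += out
    return total

fst = Node()
for k, v in [("are", 4), ("ate", 2), ("see", 3)]:
    insert(fst, k, v)
print(lookup(fst, "are"))  # 4
```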
M3 Inverted Index - Postings List
Each term maps to the set of metric IDs (integers) that match; this is called a postings list.
Queries are resolved via intersection (AND - across terms) and union (OR - within a term).
12P.M → 2P.M Index Block:
service="foo" INTERSECT endpoint="bar" INTERSECT (client_version="3.1" UNION client_version="3.2")
Intersect → AND, Union → OR
Primary data structure for the postings list in M3DB is the roaring bitmap
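Putting the pieces together, the query from earlier can be sketched with plain Python sets standing in for roaring bitmaps (the metric IDs and postings below are made up for illustration):

```python
# Made-up postings lists: (label, value) -> set of matching metric IDs.
# Real M3DB stores these as compressed roaring bitmaps, not Python sets.
postings = {
    ("service", "foo"):        {1, 2, 3, 4, 5},
    ("endpoint", "bar"):       {2, 3, 5, 8},
    ("client_version", "3.1"): {3, 9},
    ("client_version", "3.2"): {5, 7},
}

# client_version="3.*" expands to a union (OR) within the term...
version_match = postings[("client_version", "3.1")] | postings[("client_version", "3.2")]

# ...then the terms are intersected (AND) across labels.
result = postings[("service", "foo")] & postings[("endpoint", "bar")] & version_match
print(sorted(result))  # [3, 5]
```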
M3 Inverted Index - File Structure
────── Time ──────▶

/var/lib/m3db/data/namespace-a/shard-0:
  [Fileset File / Block] [Fileset File / Block] [Fileset File / Block] [Fileset File / Block]

/var/lib/m3db/index/namespace-a:
  [Index Fileset File / Block           ] [Index Fileset File / Block           ]

(each index fileset block spans multiple data fileset blocks)
M3 Summary
M3 and Prometheus
Prometheus
My App → Prometheus (with Grafana and Alerting) → M3 Coordinator → M3DB x3
Directly query M3 using the coordinator as a single Grafana datasource
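Concretely, pointing Prometheus at the coordinator is a remote read/write entry in prometheus.yml along these lines (host name is illustrative; port and endpoint paths follow the M3 coordinator's defaults):

```yaml
remote_write:
  - url: "http://m3coordinator:7201/api/v1/prom/remote/write"

remote_read:
  - url: "http://m3coordinator:7201/api/v1/prom/remote/read"
```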
Roadmap
○ Efficient compression of events in the form of Protobuf messages
We’re Hiring!
○ Reach out to me at rartoul@uber.com