Building highly reliable data pipelines @ Datadog
Quentin FRANCOIS
Team Lead, Data Engineering
DataEng Barcelona ‘18
Source: E.J. McCluskey & S. Mitra (2004). "Fault Tolerance" in Computer Science Handbook, 2nd ed., ed. A.B. Tucker. CRC Press.
metric: system.load.1
timestamp: 1526382440
value: 0.92
tags: host:i-xyz,env:dev,...
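As a rough illustration of the data point above, the four fields can be modelled as a small immutable record. This is only a sketch: the class name `MetricPoint` and the helper `parse_tags` are hypothetical names chosen for this example, not Datadog's actual types, and the tag list is cut to the two tags visible on the slide.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MetricPoint:
    """One time series data point, mirroring the four fields on the slide."""
    metric: str
    timestamp: int           # Unix epoch seconds
    value: float
    tags: Tuple[str, ...]    # "key:value" strings


def parse_tags(raw: str) -> Tuple[str, ...]:
    """Split a comma-separated tag string into a tuple of tags (hypothetical helper)."""
    return tuple(t.strip() for t in raw.split(",") if t.strip())


point = MetricPoint(
    metric="system.load.1",
    timestamp=1526382440,
    value=0.92,
    tags=parse_tags("host:i-xyz,env:dev"),
)
```

Freezing the record keeps points hashable and safe to share between pipeline stages, which is one reasonable design choice among several.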
Rollups pipeline
High resolution data: 1 pt/sec
Low resolution data: 1 pt/min, 1 pt/hour, 1 pt/day
Storage: AWS S3
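To make the idea of a rollup concrete, here is a minimal sketch that reduces 1 pt/sec data to 1 pt/min by bucketing timestamps. The slides do not say which aggregation the pipeline applies, so the use of a plain average is an assumption, and `rollup` is a hypothetical function name.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def rollup(points: Iterable[Tuple[int, float]], interval: int) -> List[Tuple[int, float]]:
    """Average (timestamp, value) points into fixed-size time buckets.

    Averaging is an assumption made for this sketch; a real rollup may keep
    several aggregates (min, max, sum, count) per bucket.
    """
    buckets: Dict[int, List[float]] = defaultdict(list)
    for ts, value in points:
        bucket_start = ts - (ts % interval)   # align to the start of the bucket
        buckets[bucket_start].append(value)
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(buckets.items())]


# Five per-second points spanning two one-minute buckets.
per_second = [(0, 1.0), (1, 3.0), (59, 2.0), (60, 4.0), (61, 6.0)]
per_minute = rollup(per_second, interval=60)
```

The same function yields hourly or daily points by passing `interval=3600` or `interval=86400`.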
Architecture (diagram)
Layers: USERS, WORKERS, CLUSTERS, DATA
Components: Web, Scheduler, CLI, Luigi, Spark, EMR (several clusters), S3, Datadog monitoring
Pipeline A vs. Pipeline B (timeline diagram)
Axis: time (hours). Values shown: 9, 7, 10, 16. A job failure is marked on the timeline.
Pipeline data flow (diagram, built up over several slides)
Input data: raw time series data
Checkpoint data: aggregated time series data (Parquet format)
Output data: aggregated time series data (custom file format)
The pipeline is drawn as two numbered steps (1, 2), and the data as four pieces labeled A, B, C, D.
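The checkpoint idea in these diagrams can be sketched as follows: step 1 writes its aggregated result per piece to a checkpoint, and a rerun only recomputes the pieces whose checkpoint is missing. This is a simplified illustration under stated assumptions, not the talk's implementation: JSON files in a temporary directory stand in for Parquet on S3, and `aggregate`, `run_piece`, and `calls` are hypothetical names.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List

calls: List[str] = []   # records which pieces actually ran step 1


def aggregate(raw: List[float]) -> Dict[str, float]:
    """Step 1 stand-in: an expensive aggregation of raw values."""
    return {"count": len(raw), "sum": sum(raw)}


def run_piece(name: str, raw: List[float], checkpoint_dir: Path) -> str:
    """Run both steps for one piece, reusing the checkpoint if it exists."""
    checkpoint = checkpoint_dir / f"{name}.json"
    if not checkpoint.exists():
        calls.append(name)
        tmp = checkpoint.with_name(checkpoint.name + ".tmp")
        tmp.write_text(json.dumps(aggregate(raw)))
        tmp.replace(checkpoint)   # rename into place so no half-written checkpoint is visible
    agg = json.loads(checkpoint.read_text())
    # Step 2 stand-in: encode the aggregate in the final output format.
    return f"{name}:{agg['count']}:{agg['sum']}"


pieces = {"A": [1.0, 2.0], "B": [3.0], "C": [4.0, 5.0], "D": [6.0]}
with tempfile.TemporaryDirectory() as d:
    ckpt_dir = Path(d)
    first = [run_piece(n, raw, ckpt_dir) for n, raw in pieces.items()]
    (ckpt_dir / "C.json").unlink()   # simulate piece C losing its checkpoint
    second = [run_piece(n, raw, ckpt_dir) for n, raw in pieces.items()]
```

After the second run, `calls` shows that only piece C repeated step 1, while the final output is unchanged. Keeping pieces independent is what makes a failure cost one piece of work rather than the whole job.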
#rollups #anomaly
More details: datadoghq.com/blog/monitoring-spark/
Graphs: 1 point/sec data lag; 1 point/hour data lag
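One plausible way to compute a data lag metric like the ones graphed here is to subtract the timestamp of the newest processed point from the current time, per resolution, and compare it with an allowed maximum. The slides do not define the metric, so this definition, the function names, and all numbers below are assumptions made for illustration.

```python
from typing import Dict, List


def data_lag_seconds(now: int, latest_point_ts: int) -> int:
    """Lag = current time minus the newest processed timestamp (never negative)."""
    return max(0, now - latest_point_ts)


def stale_resolutions(now: int, latest: Dict[str, int], max_lag: Dict[str, int]) -> List[str]:
    """Return the resolutions whose lag exceeds their allowed maximum."""
    return [
        name
        for name, ts in latest.items()
        if data_lag_seconds(now, ts) > max_lag[name]
    ]


# Hypothetical values: timestamps and thresholds are in seconds.
now = 100_000
latest = {"1pt/sec": 99_400, "1pt/hour": 90_000}
max_lag = {"1pt/sec": 900, "1pt/hour": 7_200}
stale = stale_resolutions(now, latest, max_lag)
```

In this example the 1 pt/sec data is 600 seconds behind (within its limit) and the 1 pt/hour data is 10,000 seconds behind (over its limit), so only the hourly resolution is reported as stale.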