Creating A Scalable Monitoring System That Everyone Will ❤



SLIDE 1

Creating A Scalable Monitoring System That Everyone Will ❤

@ThePracticalDev | @molly_struve | dev.to/molly_struve

SLIDE 2

Monitoring Mistakes

Overhauling the System

The Payoff

SLIDE 26

Incredibly Inconsistent

SLIDE 28

Inconsistent Alerts

Required no action
Reported data
Immediate action required

SLIDE 29

Manual Monitoring

SLIDE 36

make on-call devs miserable

SLIDE 40

Coverage doesn’t matter if you have no idea what is going on!

SLIDE 43

Monitoring Must Haves

1. Consolidate Monitoring To a Single Place

SLIDE 52

Monitoring Must Haves

1. Consolidate Monitoring To a Single Place
2. Make ALL Alerts Actionable

SLIDE 57

Action Required: #ops_alerts, #dev_alerts
No Action Needed: #ops_reporting, #dev_reporting
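
The channel split on this slide can be sketched as a small routing function. This is only an illustration: the `Alert` shape, the `team` field and the `action_required` flag are assumptions, and only the four channel names come from the slide.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    # Hypothetical alert shape; the slides do not define one.
    title: str
    team: str              # assumed to be "ops" or "dev"
    action_required: bool


def channel_for(alert: Alert) -> str:
    """Actionable alerts go to *_alerts, everything else to *_reporting."""
    if alert.team not in ("ops", "dev"):
        raise ValueError(f"unknown team: {alert.team!r}")
    suffix = "alerts" if alert.action_required else "reporting"
    return f"#{alert.team}_{suffix}"
```

Keeping the decision in one function means a message that needs no action can never land in an alerts channel by accident.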

SLIDE 58

Monitoring Must Haves

1. Consolidate Monitoring To a Single Place
2. Make ALL Alerts Actionable
3. Make Sure Alerts Are Mutable

SLIDE 60

30, 60, 90 minutes

SLIDE 62

Miss new alerts
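
One possible reading of the 30/60/90-minute options is a set of fixed mute windows that expire on their own, so a muted alert cannot hide new occurrences forever. The sketch below assumes that reading; the `MuteList` class and its method names are hypothetical and not taken from the talk.

```python
from datetime import datetime, timedelta

ALLOWED_MUTE_MINUTES = (30, 60, 90)  # the durations shown on the slide


class MuteList:
    """Hypothetical in-memory store of muted alert names and their expiry times."""

    def __init__(self):
        self._muted_until = {}  # alert name -> datetime when the mute ends

    def mute(self, alert_name, minutes, now):
        if minutes not in ALLOWED_MUTE_MINUTES:
            raise ValueError(f"mute must be one of {ALLOWED_MUTE_MINUTES} minutes")
        self._muted_until[alert_name] = now + timedelta(minutes=minutes)

    def is_muted(self, alert_name, now):
        until = self._muted_until.get(alert_name)
        return until is not None and now < until
```

Because every mute carries an expiry time, the alert starts notifying again without anyone having to remember to unmute it.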

SLIDE 68

Monitoring Must Haves

1. Consolidate Monitoring To a Single Place
2. Make ALL Alerts Actionable
3. Make Sure Alerts Are Mutable
4. Track Alert History

SLIDE 69

Tracking Alert History

SLIDE 72

Monitoring Must Haves

1. Consolidate Monitoring To a Single Place
2. Make ALL Alerts Actionable
3. Make Sure Alerts Are Mutable
4. Track Alert History
5. Remove ALL Manual Monitoring

SLIDE 74

Manual Monitoring 🙆 Scale

SLIDE 76

Automatic Alerts
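
Replacing a manual look at a dashboard with an automatic alert can be as small as a check that runs on a schedule and returns a message when a limit is crossed. A minimal sketch, assuming a queue-depth metric; the function name and the threshold of 1000 are made up for illustration.

```python
def check_queue_depth(depth, threshold=1000):
    """Return an alert message when the queue is over the threshold, else None."""
    if depth > threshold:
        return f"High Priority Job Queue Backed Up: {depth} jobs (threshold {threshold})"
    return None
```

A scheduler would call this with the current depth and forward any non-None result to the alerting channel.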


SLIDE 80

Onboarding is a breeze

SLIDE 84

3 Onboarding steps:

1. Show them the monitoring setup
2. If an alert goes off you have to address it
3. How to mute a triggered alert

SLIDE 86

Happier on-call developers

SLIDE 87

All alerts must be actionable

SLIDE 89

No more noise

SLIDE 91

Alerts must be mutable

SLIDE 93

Diagnosing alerts is faster and easier

SLIDE 94

Tracking Alert History

SLIDE 99

High Priority Job Queue Backed Up

Today?
Going on longer?

Alert History
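
The two questions on this slide ("Today?" and "Going on longer?") are the kind of thing a stored alert history can answer. A rough sketch, assuming the history is kept as a list of `(alert_name, triggered_at)` tuples; that storage format and both function names are hypothetical.

```python
from datetime import datetime, timedelta


def triggers_since(history, alert_name, since):
    """Return the trigger times for one alert at or after `since`."""
    return [ts for name, ts in history if name == alert_name and ts >= since]


def first_seen(history, alert_name):
    """Earliest recorded trigger for the alert, or None if it never fired."""
    times = [ts for name, ts in history if name == alert_name]
    return min(times) if times else None
```

Calling `triggers_since` with midnight as `since` answers "Today?", while `first_seen` shows whether the problem started earlier.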

SLIDE 101

Developers began helping to improve our monitoring system

SLIDE 103

All alerts must be actionable

SLIDE 104

30-40 Alerts

SLIDE 105

>90 Alerts

SLIDE 107

Basic Alerts
More Granular Alerts

SLIDE 108

High MySQL load vs High MySQL load on db 1 for client 5
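
The difference between a basic and a more granular alert can be shown with a small title builder. This is a sketch only; the function and its `db` and `client` parameters are hypothetical, chosen to match the example on the slide.

```python
def alert_title(problem, db=None, client=None):
    """Build an alert title, adding the database and client when they are known."""
    title = problem
    if db is not None:
        title += f" on db {db}"
    if client is not None:
        title += f" for client {client}"
    return title
```

With no extra context the result is the basic "High MySQL load"; with `db=1, client=5` it becomes "High MySQL load on db 1 for client 5", which tells the on-call developer where to look.
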

SLIDE 109

Developers ❤ monitoring system

SLIDE 114

Monitoring Must Have Benefits

1. Onboarding is a breeze
2. On-call developers are a lot happier
3. Diagnosing alerts is faster and easier
4. Developers helping to improve your monitoring systems

SLIDE 115

Monitoring Must Haves

1. Consolidate Monitoring To a Single Place
2. Make ALL Alerts Actionable
3. Make Sure Alerts Are Mutable
4. Track Alert History
5. Remove ALL Manual Monitoring

SLIDE 116

❤❤❤

SLIDE 117

Questions?

@ThePracticalDev | @molly_struve | dev.to/molly_struve

SLIDE 118

Rate today’s session

Session page on conference website

O’Reilly Events App