Hawkular Metrics
Metric Storage & Alerting
Stefan Negrea

About Me
Co-Creator of Hawkular Metrics

Hawkular Metrics Hawkular Demo & Alerting Introduction to Hawkular Metrics

Pre-History
● 2006 JBoss Operations Network 1.0
● 2008 Project RHQ
  ○ JBoss Operations Network 2.0
  ○ Metrics stored in Postgres

Pre-History
● 2012 - 2013 RHQ Storage Nodes
  ○ Cassandra based
  ○ Store metrics
● 2014 RHQ Metrics

Hawkular
It's a hawk with a monocular. Hawks are known for their very sharp vision and are very good hunters; they catch prey by anticipating its movements at very high speed. The goal is to be able to monitor and catch anomalies in fast-paced environments.
All* projects are licensed under Apache License 2.0

History
● 2014 Hawkular organization formed
● 2014 Hawkular Alerting started
● 02/2015 RHQ Metrics joins Hawkular org
● 12/2015 Hawkular Metrics integrated in OpenShift Origin v3
● 10/2016 Hawkular Metrics includes Hawkular Alerting

Hawkular Metrics
Hawkular Metrics is a storage engine for metric data
● metric data = a measurement taken at a specific time
● storage engine = store metrics efficiently for their useful lifetime

Supported Metrics
● Gauge (example: memory usage)
  ○ number
  ○ varies (not monotonic)
  ○ rate of change
  ○ sample data points (metric id, value, timestamp):
    (metric1, 4.5, 1493301898245)
    (metric1, 5.6, 1493301898246)
    (metric1, 1.2, 1493301898247)
● Counter (example: number of visitors)
  ○ integer
  ○ monotonic (increasing or decreasing)
  ○ rate of change
  ○ support for reset
  ○ sample data points (metric id, value, timestamp):
    (metric2, 4, 1493301898248)
    (metric2, 5, 1493301898249)
    (metric2, 9, 1493301898250)
    (metric2, 0, 1493301898251)
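The "rate of change" and "support for reset" bullets can be illustrated with a small sketch. This is not Hawkular's implementation: the function name `counter_rates` is hypothetical, and the reset policy (an increasing counter whose value drops is treated as reset, and that interval is skipped) is an assumption made for the example.

```python
from typing import List, Tuple

# (timestamp in milliseconds, counter value), as in the slide's sample tuples
DataPoint = Tuple[int, int]


def counter_rates(points: List[DataPoint]) -> List[Tuple[int, float]]:
    """Return (timestamp, per-second rate) for each consecutive pair of points.

    Assumption for this sketch: the counter is increasing, so a drop in
    value is treated as a reset and that interval yields no rate.
    """
    ordered = sorted(points)
    rates = []
    for (t0, v0), (t1, v1) in zip(ordered, ordered[1:]):
        if t1 == t0 or v1 < v0:
            continue  # duplicate timestamp or counter reset: skip this interval
        seconds = (t1 - t0) / 1000.0
        rates.append((t1, (v1 - v0) / seconds))
    return rates


# metric2 samples from the slide; the last point (value 0) acts as a reset
samples = [
    (1493301898248, 4),
    (1493301898249, 5),
    (1493301898250, 9),
    (1493301898251, 0),
]
rates = counter_rates(samples)
```

Skipping the reset interval keeps a restart of the monitored process from showing up as a large negative rate.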

Supported Metrics
● Availability (example: server status)
  ○ Availability of a resource
  ○ up, down, or unknown
  ○ can compute interesting stats based on values
  ○ sample data points (metric id, value, timestamp):
    (metric3, UP, 1493301898253)
    (metric3, DOWN, 1493301898254)
    (metric3, UP, 1493301898255)
● String (example: value of configuration key 'k')
  ○ just that
  ○ possible uses: logs, events, config
  ○ sample data points (metric id, value, timestamp):
    (metric4, "k=v", 1493301898256)
    (metric4, "k=t", 1493301898257)
    (metric4, "k=1", 1493301898258)
    (metric4, "k=4", 1493301898259)
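As an illustration of the "interesting stats" that availability values allow, the sketch below derives an uptime ratio and total downtime from a list of availability data points. It is not Hawkular's algorithm: the function name `availability_stats`, the `end` parameter, and the rule that each status holds until the next data point are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# (timestamp in milliseconds, status), as in the slide's sample tuples
AvailPoint = Tuple[int, str]


def availability_stats(points: List[AvailPoint], end: int) -> Dict[str, float]:
    """Compute uptime ratio and downtime duration up to timestamp `end`.

    Assumption: a reported status stays in effect until the next data point
    (or until `end` for the last one).
    """
    ordered = sorted(points)
    durations = {"UP": 0, "DOWN": 0, "UNKNOWN": 0}
    # The end of each interval is the next point's timestamp, or `end`.
    stops = [ts for ts, _ in ordered[1:]] + [end]
    for (start, status), stop in zip(ordered, stops):
        durations[status.upper()] += stop - start
    total = sum(durations.values())
    return {
        "uptime_ratio": durations["UP"] / total if total else 0.0,
        "downtime_ms": durations["DOWN"],
    }


# metric3 samples from the slide, evaluated up to one millisecond after the last point
stats = availability_stats(
    [(1493301898253, "UP"), (1493301898254, "DOWN"), (1493301898255, "UP")],
    end=1493301898256,
)
```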

Cassandra - Storage
Management & Support
● Highly available, fault tolerant
● No specialized node roles
● Minimal configuration
Performance & Scalability
● Optimized for writes
● Data compression
● Indexing

Cassandra - Storage
● CQL based
● Partitioning & indexing of data based on usage
● Use built-in compression & TTL
● Use the Datastax driver fully async
● Support for latest C* 3.0.x release
● Keep updating to latest stable
● Use multiple tables for indexing

App Layer
● REST API with JSON
● JAX-RS 2.0 (async spec)
● Fully async = JAX-RS 2.0 async + RxJava + async C* driver
● Stateless** server (Metrics, mostly)
● Minimal clustering via Infinispan
● Schema Management
● Easy to use
  ○ packaged distribution with WildFly
  ○ download and run, only JDK required

Performance - Sample
C* - 4 CPU, 4GB; Hawkular - 4 CPU, 4GB
message sizes:
  10 datapoints: 2592 req/sec => 25920 datapoints/sec
  100 datapoints: 365 req/sec => 36500 datapoints/sec
  5000 datapoints: 7.6 req/sec => 38000 datapoints/sec
C* - 8 CPU, 8GB; Hawkular - 8 CPU, 4GB
message sizes:
  10 datapoints: 4655 req/sec => 46550 datapoints/sec
  100 datapoints: 604 req/sec => 60400 datapoints/sec
  5000 datapoints: 15 req/sec => 75000 datapoints/sec

Features
● Multi-tenant
  ○ tenant id required on each request (HAWKULAR-TENANT header)
  ○ no way to get data from multiple tenants at once
● Can insert data without pre-creating metrics
● Data is compressed using Gorilla compression
  ○ 2 hour time window
  ○ further reduces disk footprint
  ○ LZ4 enabled in Cassandra
  ○ Load testing:
    ■ 5000 data points/sec for 5 days = 26GB
    ■ 83M data points ~ 1GB of disk space
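The two load-testing figures are consistent with each other, which a few lines of arithmetic confirm. The variable names below are illustrative only, and the bytes-per-point estimate assumes 1 GB = 10^9 bytes, which the slide does not specify.

```python
# Figures taken from the load-testing bullets
points_per_second = 5000
days = 5
disk_gb = 26

# Total data points written during the test
total_points = points_per_second * days * 24 * 60 * 60

# Data points stored per GB of disk
points_per_gb = total_points / disk_gb

# Rough on-disk size per data point, assuming 1 GB = 10**9 bytes
bytes_per_point = disk_gb * 10**9 / total_points

print(f"{total_points:,} points, {points_per_gb / 1e6:.0f}M points per GB, "
      f"about {bytes_per_point:.0f} bytes per point")
```

Under that assumption the test comes out at roughly 12 bytes per stored data point.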

Features
● Bulk insertion endpoint for metrics and data
● Tagging support for metrics and single data points
  ○ key, value; multi-tag support
    ■ tag1 = d
  ○ metrics queryable via TQL (tag query language)
    ■ AND, OR, NOT
    ■ grouping
    ■ wildcard matching
    ■ a1 = 'd' OR ( a1 != 'ab' AND c1 )
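To make the sample query concrete, the sketch below evaluates `a1 = 'd' OR ( a1 != 'ab' AND c1 )` against tag dictionaries in plain Python. It does not parse TQL and is not Hawkular code. Two points of semantics are assumptions to verify against the Hawkular documentation: that a bare tag name such as `c1` means "the tag is present", and that `!=` only matches metrics that actually carry the tag. The metric ids and tags are made up for the example.

```python
from typing import Dict


def matches(tags: Dict[str, str]) -> bool:
    """Plain-Python equivalent of: a1 = 'd' OR ( a1 != 'ab' AND c1 )"""
    a1_is_d = tags.get("a1") == "d"
    # Assumption: '!=' requires the tag to exist on the metric
    a1_is_not_ab = "a1" in tags and tags["a1"] != "ab"
    # Assumption: a bare tag name tests for the tag's presence
    has_c1 = "c1" in tags
    return a1_is_d or (a1_is_not_ab and has_c1)


# Hypothetical metrics keyed by id, each with its tags
metrics = {
    "m1": {"a1": "d"},
    "m2": {"a1": "ab", "c1": "x"},
    "m3": {"a1": "zz", "c1": "x"},
    "m4": {"a1": "zz"},
}
selected = sorted(metric_id for metric_id, tags in metrics.items() if matches(tags))
```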

Features - Simple REST API
● Endpoint for each metric type
  ○ /gauges, /availability, /counters, /strings
  ○ Each metric type has almost identical endpoints**
● Raw data - /gauges/raw
● Raw data for single metric - /strings/{metric_id}/raw
● Query time aggregation
  ○ multiple metrics - /availability/stats
  ○ single metric - /counters/{metric_id}/stats
● Bulk operations - /metrics
** String metrics do not have stats (yet?)
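A request against the raw-data endpoint can be sketched with the Python standard library alone. The `/gauges/raw` path and the tenant header come from the slides. The base URL, the helper name `build_gauge_insert`, and the exact JSON body layout are assumptions that should be checked against the Hawkular Metrics REST documentation. The request is only built here, not sent, so the sketch runs without a server.

```python
import json
import urllib.request
from typing import List, Tuple

# Assumption: a local standalone install serving the API under this base path
BASE_URL = "http://localhost:8080/hawkular/metrics"


def build_gauge_insert(tenant: str, metric_id: str,
                       points: List[Tuple[int, float]]) -> urllib.request.Request:
    """Build (but do not send) a POST that inserts raw gauge data points."""
    # Assumed body layout: a list of metrics, each with its data points
    body = [{
        "id": metric_id,
        "data": [{"timestamp": ts, "value": value} for ts, value in points],
    }]
    return urllib.request.Request(
        BASE_URL + "/gauges/raw",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Hawkular-Tenant": tenant,  # tenant id is required on each request
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_gauge_insert("test", "metric1", [(1493301898245, 4.5)])
# urllib.request.urlopen(request) would send it to a running server.
```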

Features - Aggregation & Rate
● Query Time Aggregation
  ○ Combine multiple metrics and get statistical data
  ○ Gauge and counter: average, median, percentile, sum
  ○ Availability: ratios for uptime and downtime, downtime duration
  ○ Time Slicing: first group data, then compute stats
  ○ Single or multiple metrics
● Rate
  ○ available for gauges and counters
  ○ rate of change of the values for the timespan
  ○ ex: how fast is the number of total requests increasing
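"Time slicing: first group data, then compute stats" can be sketched as follows. This is an illustration of the idea in plain Python, not Hawkular's query engine. The function name `bucket_stats`, its parameters, and the sample values are hypothetical.

```python
import statistics
from collections import defaultdict
from typing import Dict, List, Tuple


def bucket_stats(points: List[Tuple[int, float]], start: int,
                 bucket_ms: int) -> List[Dict[str, float]]:
    """Group (timestamp_ms, value) points into fixed-width buckets, then aggregate."""
    buckets = defaultdict(list)
    for ts, value in points:
        if ts >= start:
            # Step 1: assign each point to its time slice
            buckets[(ts - start) // bucket_ms].append(value)
    result = []
    for index in sorted(buckets):
        values = buckets[index]
        # Step 2: compute statistics per slice
        result.append({
            "start": start + index * bucket_ms,
            "samples": len(values),
            "avg": statistics.mean(values),
            "median": statistics.median(values),
            "sum": sum(values),
        })
    return result


# Made-up gauge values, sliced into one-second buckets
points = [(0, 1.0), (500, 3.0), (1000, 5.0), (1500, 7.0), (1900, 9.0)]
slices = bucket_stats(points, start=0, bucket_ms=1000)
```

Empty slices are simply omitted in this sketch; a real query API might instead return them as empty buckets.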

Metrics + Alerting
● Natural fit: collect data and then alert on anomalies
● Two ways to alert on metric data
  ○ Dedicated API for setting up alerts; incoming data is filtered and processed by the alerting engine
  ○ Metrics Alerter that queries single or multiple metrics; no need to define alert triggers ahead of time

Alerting Features
● Single and group Triggers
● Template triggers
● Complex conditions
● Dampening
● Auto-resolve/auto-disable triggers
● Pluggable notifiers
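Dampening keeps a trigger from firing on every single noisy sample. The sketch below shows one common policy, firing only after a number of consecutive threshold violations. It illustrates the concept only: the class name `StrictDampening`, its parameters, and the sample values are hypothetical, and this is not the Hawkular Alerting engine or its API.

```python
class StrictDampening:
    """Fire only after `required` consecutive values above `threshold`."""

    def __init__(self, threshold: float, required: int) -> None:
        self.threshold = threshold
        self.required = required
        self.streak = 0  # number of consecutive violations seen so far

    def evaluate(self, value: float) -> bool:
        """Feed one data point; return True when an alert should fire."""
        if value > self.threshold:
            self.streak += 1
        else:
            self.streak = 0  # any good sample breaks the streak
        if self.streak >= self.required:
            self.streak = 0  # start counting again after firing
            return True
        return False


# Made-up CPU usage percentages: alert when above 90 three times in a row
trigger = StrictDampening(threshold=90.0, required=3)
fired = [trigger.evaluate(v) for v in [85, 95, 97, 40, 91, 92, 93]]
```

In this run the two violations at 95 and 97 are cancelled by the sample at 40, so the alert fires only on the last value.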

Roadmap - 2017
● Automatic & persisted aggregation
● Management capabilities for the Cassandra cluster
● Query language
● Performance improvements
  ○ already have a good baseline, but can do better
  ○ read/write

Demo
● Install ccm
  ○ https://github.com/pcmanus/ccm
● Start a single node C* cluster
  ○ ccm create -v 3.0.12 -n 1 -s hawkular
● Download, extract and start Hawkular Metrics
  ○ https://origin-repository.jboss.org/nexus/content/groups/public/org/hawkular/metrics/hawkular-metrics-wildfly-standalone/0.26.1.Final/
  ○ bin/standalone -b 0.0.0.0
● Download, extract and start Grafana
● Download, install, and configure the Hawkular plugin for Grafana
  ○ https://grafana.com/plugins/hawkular-datasource/installation
  ○ https://github.com/hawkular/hawkular-grafana-datasource
    i. pick a tenant id of your choice

Demo
● Install the Hawkular Metrics Python client via pip
  ○ pip install hawkular-client
● Install psutil to collect CPU stats
  ○ pip install psutil
● Create a custom agent (using the Python client)
  ○ make sure you use the same tenant id configured in Grafana
  ○ pre-create and tag a metric for each CPU
  ○ collect CPU usage every 10 seconds
  ○ send the data to Hawkular Metrics

Demo

    #!/usr/bin/env python3
    import time

    import psutil
    from hawkular.metrics import HawkularMetricsClient, MetricType

    # Use the same tenant id that is configured in Grafana
    client = HawkularMetricsClient(tenant_id='test')

    # Pre-create one tagged gauge per CPU
    cpu_percent = psutil.cpu_percent(interval=1, percpu=True)
    for index, cpu in enumerate(cpu_percent):
        client.create_metric_definition(MetricType.Gauge, 'cpu%s' % index,
                                        cpu='cpu%s' % index)

    # Sample per-CPU usage, push it, then wait 10 seconds
    while True:
        cpu_percent = psutil.cpu_percent(interval=1, percpu=True)
        for index, cpu in enumerate(cpu_percent):
            client.push(MetricType.Gauge, 'cpu%s' % index, float(cpu))
        time.sleep(10)

Resources
● Web - http://www.hawkular.org/
● Github - https://github.com/hawkular
● Metrics Documentation - http://www.hawkular.org/tags/metrics.html
● Alerting Documentation - http://www.hawkular.org/tags/alerts.html
● Twitter - https://twitter.com/hawkular_org

Thank you!
hawkular.org
#hawkular (on freenode)
snegrea@redhat.com
