Monitoring Kubernetes with OMD Labs Edition and Prometheus


slide-1
SLIDE 1

Monitoring Kubernetes with OMD Labs Edition and Prometheus

Michael Kraus - FOSDEM 2017

slide-2
SLIDE 2

About me

slide-3
SLIDE 3

About me

Michael Kraus

Doing monitoring for 12 years, mainly with plain old Nagios, open-source only.

Senior Monitoring Consultant @ ConSol.

slide-4
SLIDE 4

Background

slide-5
SLIDE 5

Why

Kubernetes in a classical enterprise

Implementation of Kubernetes PoC at $customer: We have …

  • already running some monitoring instances there.
  • but no idea about monitoring Kubernetes.

slide-6
SLIDE 6

With

Enter Prometheus

Natural choice for Kubernetes monitoring:

  • Integrated service discovery
  • Labels are retained between Kubernetes and Prometheus

slide-7
SLIDE 7

How

Where to start

There are excellent tutorials and blog posts available as a starting point, for example:

  • coreos.com/blog/ (Fabian Reinartz)
  • robustperception.io/blog/ (Brian Brazil)
  • … many examples on GitHub

slide-8
SLIDE 8

Implementation

slide-9
SLIDE 9

Implementation

Prometheus kubernetes_sd

  kubernetes_sd_configs:
    - role: endpoints

  kubernetes_sd_configs:
    - role: node

  kubernetes_sd_configs:
    - role: pod

prometheus-kubernetes.yml from prometheus/examples.
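
The three discovery roles listed on this slide could be wired into scrape jobs roughly as follows. This is a minimal sketch, not the speaker's actual configuration: the job names are illustrative (chosen to start with "kubernetes"), and the certificate and token paths assume Prometheus runs inside the cluster with a mounted service account.

```yaml
# Sketch of a prometheus.yml fragment; job names are assumptions.
scrape_configs:
  # One target per cluster node (the kubelet).
  - job_name: 'kubernetes-nodes'
    scheme: https
    tls_config:
      # Standard in-cluster service account mount (assumes in-cluster Prometheus).
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    kubernetes_sd_configs:
      - role: node

  # One target per endpoint address behind each service.
  - job_name: 'kubernetes-service-endpoints'
    kubernetes_sd_configs:
      - role: endpoints

  # One target per declared container port of each pod.
  - job_name: 'kubernetes-pods'
    kubernetes_sd_configs:
      - role: pod
```

Without further relabeling, the endpoints and pod jobs would try to scrape every discovered target, so in practice they are usually combined with a filter.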

slide-10
SLIDE 10
slide-11
SLIDE 11

Implementation

Prometheus kubernetes_sd

Metrics:

  • apiserver_*
  • container_cpu_*
  • container_fs_*
  • deployment_*
  • etcd_*
  • kubelet_*
  • ...
slide-12
SLIDE 12

Implementation

node_exporter

Prometheus exporter for hardware and OS metrics exposed by the kernel.

  • DaemonSet
  • prometheus.io/scrape: 'true'
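
A DaemonSet carrying that annotation might look like the sketch below. It is an illustration under assumptions: the object name, the `app` label and the placement of the annotation on the pod template are not taken from the talk, and `apps/v1` is the current API group (at the time of the talk DaemonSets were served from `extensions/v1beta1`).

```yaml
# Hypothetical manifest: runs one node_exporter pod on every node.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter          # illustrative name
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
      annotations:
        # Convention evaluated by Prometheus relabeling rules, not by Kubernetes.
        prometheus.io/scrape: 'true'
    spec:
      hostNetwork: true        # expose the exporter on the node's own address
      hostPID: true            # let the exporter see host processes
      containers:
        - name: node-exporter
          image: prom/node-exporter
          ports:
            - name: metrics
              containerPort: 9100   # node_exporter default port
```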

slide-13
SLIDE 13
slide-14
SLIDE 14

Implementation

node_exporter

Metrics:

  • node_cpu
  • node_disk_*
  • node_filesystem_*
  • node_netstat_*
  • node_vmstat_*
  • ...
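
As an example of what these metrics are used for, the following recording rule derives a per-node CPU utilisation ratio. It is a sketch only: the group and rule names are made up, the file uses the Prometheus 2.x rule format, and it relies on the metric name `node_cpu_seconds_total` used by newer node_exporter releases (exporters of the era shown on the slide expose `node_cpu` instead).

```yaml
# Hypothetical rule file, Prometheus 2.x format.
groups:
  - name: node-recording       # illustrative group name
    rules:
      # Fraction of CPU time not spent idle, averaged over all cores of a node.
      - record: instance:node_cpu_utilisation:ratio
        expr: '1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m]))'
```
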
slide-15
SLIDE 15

Implementation

kube-state-metrics

“... focused … on the health of the various objects inside, such as deployments, nodes and pods.”

  • prometheus.io/scrape: 'true'
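
The `prometheus.io/scrape` annotation is only a convention; Prometheus acts on it through relabeling. A minimal sketch of such a rule, assuming the annotation is set on the Service and the job uses the endpoints role (the job name and the `kubernetes_namespace` target label are illustrative):

```yaml
scrape_configs:
  - job_name: 'kubernetes-service-endpoints'   # assumed job name
    kubernetes_sd_configs:
      - role: endpoints
    relabel_configs:
      # Drop every target whose service is not annotated prometheus.io/scrape: 'true'.
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: 'true'
      # Copy all Kubernetes service labels onto the scraped series.
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      # Record the namespace of the target as an ordinary label.
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
```

The `labelmap` step is what keeps Kubernetes labels available on the Prometheus side.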

slide-16
SLIDE 16
slide-17
SLIDE 17

Implementation

kube-state-metrics

Metrics:

  • kube_deployment_*
  • kube_node_*
  • kube_pod_*
  • kube_resource_quota
  • ...
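
Object-state metrics of this kind lend themselves to alerting. The rule below is a hedged sketch in Prometheus 2.x rule format, not something shown in the talk; the alert name, duration and severity label are assumptions, while `kube_deployment_status_replicas_unavailable` is a metric exposed by kube-state-metrics.

```yaml
# Hypothetical alerting rule file, Prometheus 2.x format.
groups:
  - name: kubernetes-deployments   # illustrative group name
    rules:
      - alert: DeploymentReplicasUnavailable
        # Fires when a deployment has had unavailable replicas for 10 minutes.
        expr: kube_deployment_status_replicas_unavailable > 0
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Deployment {{ $labels.namespace }}/{{ $labels.deployment }} has unavailable replicas"
```
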
slide-18
SLIDE 18

Implementation

Demo environment

Based on minikube: github.com/kubernetes/minikube

Sample config: github.com/m-kraus/kubernetes-monitoring

slide-19
SLIDE 19

Demo

slide-20
SLIDE 20

Implementation

What else?

What we also need:

  • persistent storage
  • Alertmanager
  • Grafana
  • Pushgateway
  • ...
slide-21
SLIDE 21
slide-22
SLIDE 22

But we have that already

slide-23
SLIDE 23

Classical monitoring

Reference solution ("Musterlösung") at $customer for monitoring projects:

OMD Labs Edition

Monitoring in one package.

  • completely open-source
  • based on Nagios / Icinga
  • bundles “best practices” of many years of experience
  • no root required

slide-24
SLIDE 24

Classical monitoring

OMD Labs Edition

Nagios Icinga1 Icinga2 Shinken

Naemon Thruk Mod-Gearman LMD NagVis

PNP4Nagios

Apache MySQL InfluxDB Nagflux

Prometheus

Dokuwiki

Grafana

FreeTDS JMX4Perl check_webinject check_logfiles

Jolokia

check_mysql_health coshsh check_mssql_health rrdcache check_nsc_web check_curly check_nwc_health check_multi check_oracle_health Ansible

slide-25
SLIDE 25

Classical monitoring

OMD sites and commands

  omd create <MYSITE>
  omd cp <PROD> <STAGE>
  omd update <STAGE>
  omd version
slide-26
SLIDE 26

  omd create <MYSITE>
  omd cp <PROD> <STAGE>

slide-27
SLIDE 27

Classical monitoring

OMD Labs Edition https://labs.consol.de/omd/

slide-28
SLIDE 28

Implementation

Connecting OMD

Why not scrape Kubernetes directly from OMD:

  • hard to access pods inside Kubernetes
  • hard to access API from outside Kubernetes
  • API secured via TLS and token only (easily) available from a serviceaccount

slide-29
SLIDE 29

Implementation

Connecting OMD

Getting the metrics from Kubernetes to OMD:

  • federation

  - job_name: 'kube_federation'
    metrics_path: '/federate'
    honor_labels: true
    params:
      'match[]':
        - '{job=~"^kubernetes.+"}'
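
For the federation job to run, the Prometheus inside OMD also needs to know where the in-cluster Prometheus is reachable. The sketch below completes the job with a target; the host name and port are hypothetical placeholders (for example a NodePort or ingress address), and the scrape interval is an assumption.

```yaml
# Fragment for the prometheus.yml of the OMD-side Prometheus.
scrape_configs:
  - job_name: 'kube_federation'
    scrape_interval: 30s            # assumed value
    metrics_path: '/federate'
    # Keep the job and instance labels of the original series
    # instead of overwriting them with those of the federation job.
    honor_labels: true
    params:
      'match[]':
        # Pull every series whose job name starts with "kubernetes".
        - '{job=~"^kubernetes.+"}'
    static_configs:
      - targets:
          - 'prometheus.k8s.example.internal:30900'   # hypothetical address
```
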
slide-30
SLIDE 30

OMD

slide-31
SLIDE 31

Demo

slide-32
SLIDE 32

Issues

slide-33
SLIDE 33

Issues

Federation

“... Not quite the purpose of federation.” Brian Brazil

www.robustperception.io/federation-what-is-it-good-for/

  • Let’s try it anyway ...
slide-34
SLIDE 34

Issues

Securing

"Accessing metrics without authentication is ok for a PoC, but not allowed in production..." internal audit

  • How to secure (federated) Prometheus?

slide-35
SLIDE 35

Issues

Integration

"Should Nagios, Alertmanager or both notify?"

“Do we need to define our checks and alerts in both Nagios and Prometheus?”

  • How to route alerts
  • How to ease or centralize configuration

slide-36
SLIDE 36

Issues

Long-term storage

“How can we store (some) of our graphs for a longer period of time?”

  • InfluxDB ?
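
The slide leaves InfluxDB as an open question. One possible shape of an answer, sketched here under assumptions rather than taken from the talk, is Prometheus remote write into an InfluxDB 1.x instance, which accepts Prometheus samples on `/api/v1/prom/write`. The host name, database name and the metric filter are hypothetical.

```yaml
# Fragment for prometheus.yml; forwards a subset of series to InfluxDB.
remote_write:
  - url: 'http://influxdb.example.internal:8086/api/v1/prom/write?db=prometheus'  # hypothetical host and database
    write_relabel_configs:
      # Store only "some" series long-term: keep kube_* and node_* metrics, drop the rest.
      - source_labels: [__name__]
        regex: 'kube_.+|node_.+'
        action: keep
```
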
slide-37
SLIDE 37

Issues

Coverage

“Our Kubernetes cluster died. We had no monitoring until it was up again...” operations team

  • external monitoring of crucial components
      ○ machine health
      ○ important services
      ○ important API queries

slide-38
SLIDE 38

Thanks for watching