Percona Live, 2018-11-06
Monitoring Kubernetes with Prometheus
Henri Dubois-Ferriere @henridf
Hello.
Henri Dubois-Ferriere
Technical Director, Sysdig
Doing “observability” for many, many years, from network to web apps, via many startups.
PhD in CS from EPFL
Repatriate from San Francisco to Switzerland
https://commons.wikimedia.org/wiki/File:Kubernetes.png
https://prometheus.io/assets/architecture.png
Query:
http_requests_total{code="200", method="get"}
Metric name: http_requests_total
Selector (aka filter): {code="200", method="get"}
Response:
http_requests_total{code="200", method="get", route="/api/users"} 1528706829.115 1741
http_requests_total{code="200", method="get", route="/api/objects"} 1528706829.115 1920
Label/value pairs (aka dimensions): code="200", method="get", route="/api/users"
Timestamp: 1528706829.115
Value: 1741
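The data model above (metric name, label/value pairs, timestamp, value) can be sketched in a few lines of Python. This is only an illustration: `parse_sample` and `matches` are hypothetical helpers, not part of any Prometheus client library, and the parser assumes the "name{labels} timestamp value" layout shown on the slide.

```python
import re

# Assumed layout, as on the slide: name{label="value", ...} timestamp value
SAMPLE_RE = re.compile(r'^(\w+)\{(.*)\}\s+(\S+)\s+(\S+)$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')

def parse_sample(line):
    """Split one response line into name, labels, timestamp and value."""
    m = SAMPLE_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognized sample line: {line!r}")
    name, raw_labels, ts, value = m.groups()
    return {
        "name": name,
        "labels": dict(LABEL_RE.findall(raw_labels)),
        "timestamp": float(ts),
        "value": float(value),
    }

def matches(sample, name, selector):
    """True if the metric name matches and every selector pair is present on the sample."""
    return sample["name"] == name and all(
        sample["labels"].get(k) == v for k, v in selector.items()
    )

lines = [
    'http_requests_total{code="200", method="get", route="/api/users"} 1528706829.115 1741',
    'http_requests_total{code="200", method="get", route="/api/objects"} 1528706829.115 1920',
]
samples = [parse_sample(line) for line in lines]
selected = [
    s for s in samples
    if matches(s, "http_requests_total", {"code": "200", "method": "get"})
]
```

Both series are selected because the selector only constrains `code` and `method`; the `route` label is left free and is what distinguishes the two results.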
Fraction of total cluster memory used (multiply by 100 for a percentage):
sum(container_memory_rss) / sum(machine_memory_bytes)
Memory used, by Kubernetes namespace:
sum(container_memory_rss) by (namespace)
Top 5 pods by network I/O:
topk(5, sum by (pod_name) (rate(container_network_transmit_bytes_total[5m])))
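What these aggregations do can be mimicked on in-memory data. The sketch below is an assumption-laden toy, not how Prometheus evaluates queries: `sum_by` and `topk` are hypothetical helpers, the sample values are made up, and `rate()` is not modelled at all (the top-k is taken over memory rather than network rates).

```python
from collections import defaultdict

def sum_by(samples, label):
    """Group samples by one label and add up their values, like `sum(...) by (label)`."""
    totals = defaultdict(float)
    for labels, value in samples:
        totals[labels.get(label, "")] += value
    return dict(totals)

def topk(k, totals):
    """Keep the k largest entries, like `topk(k, ...)`."""
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:k]

# Made-up samples as (labels, value) pairs; not real cluster data.
container_memory_rss = [
    ({"namespace": "default", "pod_name": "web-1"}, 200.0),
    ({"namespace": "default", "pod_name": "web-2"}, 300.0),
    ({"namespace": "kube-system", "pod_name": "dns-1"}, 100.0),
]
machine_memory_bytes = [({"node": "n1"}, 1000.0), ({"node": "n2"}, 1000.0)]

# sum(container_memory_rss) by (namespace)
by_namespace = sum_by(container_memory_rss, "namespace")

# sum(container_memory_rss) / sum(machine_memory_bytes)
fraction_used = (
    sum(v for _, v in container_memory_rss) / sum(v for _, v in machine_memory_bytes)
)

# topk(2, sum by (pod_name) (...))
top_pods = topk(2, sum_by(container_memory_rss, "pod_name"))
```

The point of the sketch is that labels are the only grouping key: any label not named in `by (...)` is dropped from the result.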
$ kubectl get deploy my-app -o yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: my-app
  ...
spec:
  replicas: 4
  ...
status:
  replicas: 4
  ...
Metrics created by kube-state-metrics, with label set from this deployment:
kube_deployment_spec_replicas{deployment="my-app", ...}
kube_deployment_status_replicas{deployment="my-app", ...}
Deployments with issues
kube_deployment_spec_replicas != kube_deployment_status_replicas_available
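The `!=` comparison pairs up series from both sides that carry the same label set and keeps the left-hand series only where the two values differ. A minimal sketch of that matching, assuming made-up deployments and a hypothetical `not_equal` helper (real series carry more labels, such as the namespace):

```python
def key(**labels):
    """Hashable label set, so series can be matched across the two metrics."""
    return frozenset(labels.items())

def not_equal(left, right):
    """Keep left-hand series whose identically-labelled right-hand
    series exists and has a different value."""
    return {
        labels: value
        for labels, value in left.items()
        if labels in right and right[labels] != value
    }

# Made-up values: desired replicas vs. replicas currently available.
spec_replicas = {
    key(deployment="my-app"): 4.0,
    key(deployment="other-app"): 3.0,
}
available_replicas = {
    key(deployment="my-app"): 4.0,
    key(deployment="other-app"): 1.0,
}

unhealthy = not_equal(spec_replicas, available_replicas)
```

Here only `other-app` survives the filter, with the left-hand (spec) value of 3, which is why the query reads as a list of deployments that have not reached their desired replica count.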
Top 10 longest-running pods (“reverse uptime”)
topk(10, sort_desc(time() - kube_pod_created))
Component                                  Deployment mode  How many               Metrics about
node_exporter                              daemonset        1 per node             node resources
cAdvisor                                   inside kubelet   1 per node             container resources
kube-state-metrics                         deployment       singleton              k8s object state
etcd, API server, controller manager, ...  core service     singleton or HA group  itself
AlertingSpec, ...
more familiar with challenges of direct route
https://prometheus.io/docs/prometheus/latest/configuration/configuration/#kubernetes_sd_config
https://blog.freshtracks.io/a-deep-dive-into-kubernetes-metrics-66936addedae