Pushing Prometheus until it breaks. The bumpy road to a fully automated benchmarking.

slide-1
SLIDE 1

Prometheus

Pushing Prometheus until it breaks.

The bumpy road to a fully automated benchmarking. Krasi Georgiev, Harsh Agarwal

@krazygeorgiev @thesipian

slide-4
SLIDE 4

Prometheus

Krasi Georgiev (no problem if you pronounce it “crazy”)

  • Prometheus maintainer since March
  • Part of the Prometheus team
  • Climbing and carrots

github.com/krasi-georgiev

slide-5
SLIDE 5

Prometheus

Prometheus in Red Hat

  • OpenShift
  • Kiali
  • Operations monitoring

slide-6
SLIDE 6

Prometheus

Harsh Agarwal

  • Undergraduate student from
  • Google Summer of Code 2018 intern for Prometheus

○ Mentors
    ■ Krasi Georgiev
    ■ Goutham Veeramachaneni

@thesipian github.com/sipian

slide-7
SLIDE 7

Prometheus

Is Prometheus ready for a new release?

slide-8
SLIDE 8

Prometheus

A day of a Prometheus maintainer.

slide-9
SLIDE 9

Prometheus

Wake up and get ready for work.

slide-10
SLIDE 10

Prometheus

User: My Prometheus is broken, please help!

Me: Hey man, I am not a psychic, give me some details.

User: A 100-line config scraping 3500 targets.

Me: Oh sh*t... replicate this locally?

Sound familiar?

slide-11
SLIDE 11

Prometheus

Why benchmarking? Why not just unit and e2e tests?

  • Memory leaks appear at high load and happen over a long period.
  • Compaction can be tested only with long running tests.
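To make the first point concrete, here is a minimal sketch of how a long-running benchmark could flag a suspected leak from periodic memory samples. The quarter-window comparison and the 1.5 threshold are illustrative assumptions, not how prombench evaluates a run.

```go
package main

import "fmt"

// growthRatio compares the average of the last quarter of the samples with
// the average of the first quarter. A short unit test never collects enough
// samples for this comparison to be meaningful; a long benchmark run does.
func growthRatio(samples []float64) float64 {
	n := len(samples) / 4
	if n == 0 {
		return 1 // too few samples to say anything
	}
	var first, last float64
	for i := 0; i < n; i++ {
		first += samples[i]
		last += samples[len(samples)-n+i]
	}
	if first == 0 {
		return 1
	}
	return last / first
}

// leakSuspected reports whether memory grew by more than the given factor
// between the start and the end of the run. The threshold is an assumption.
func leakSuspected(samples []float64, threshold float64) bool {
	return growthRatio(samples) > threshold
}

func main() {
	// Hypothetical heap sizes in MB, sampled at a fixed interval.
	steady := []float64{100, 104, 98, 101, 103, 99, 102, 100}
	leaking := []float64{100, 110, 125, 140, 160, 185, 210, 240}

	fmt.Println("steady run suspected: ", leakSuspected(steady, 1.5))
	fmt.Println("leaking run suspected:", leakSuspected(leaking, 1.5))
}
```

A check like this only becomes useful once the run is long enough for the trend to stand out from normal garbage-collection noise, which is the case for benchmarking rather than unit tests.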

slide-12
SLIDE 12

Prometheus

Why Prow CI?

  • Built by the k8s testing team and runs on k8s

○ each job runs via pod deployments

  • Integrates nicely with GitHub

○ triggers via GitHub comments

  • Written in Go
  • Easy to extend via plugins
  • Used by k8s, OpenShift, Istio, Jetstack

Extended discussions: https://goo.gl/CuKsMB
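The idea of triggering a job from a GitHub comment can be sketched as a regular expression applied to the comment body. The `/benchmark` command name and its targets below are placeholders chosen for illustration; they are not taken from the slides and should not be read as the documented Prow or prombench syntax.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// triggerRe matches a comment line of the form "/benchmark <target>".
// Both the command name and the target values are hypothetical.
var triggerRe = regexp.MustCompile(`(?m)^/benchmark[ \t]+(\S+)[ \t]*\r?$`)

// parseTrigger returns the requested target if the comment contains a
// trigger line, and false otherwise.
func parseTrigger(comment string) (string, bool) {
	m := triggerRe.FindStringSubmatch(comment)
	if m == nil {
		return "", false
	}
	return strings.ToLower(m[1]), true
}

func main() {
	comments := []string{
		"Looks good to me.\n/benchmark pr\nThanks!",
		"Please do not run /benchmark pr yet.",
	}
	for _, c := range comments {
		target, ok := parseTrigger(c)
		fmt.Printf("triggered=%v target=%q\n", ok, target)
	}
}
```

Anchoring the pattern to the start of a line keeps a casual mention of the command in the middle of a sentence from starting an expensive benchmark run.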

slide-13
SLIDE 13

Prometheus

Didn’t make the finals.

  • Drone CI

○ No comment triggers (WIP)
○ A bit hacky to make it run jobs on k8s (WIP)

  • Packet - Cloud on Bare Metal

○ Kudos to their amazing team for helping.
○ Decided to use GKE to avoid the k8s bootstrapping.

  • terraform, kubectl

○ Prefer to troubleshoot a few lines of Go code
○ Life is too short to learn new tools :)
○ The prombench tool is easy to use and has no dependencies

slide-14
SLIDE 14

Prometheus

Recorded Demo?

https://github.com/prometheus/prombench

slide-15
SLIDE 15

Prometheus

https://github.com/prometheus/prombench

Life of a Bench test

slide-27
SLIDE 27

Prometheus

Results example: PR vs. master
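A PR-versus-master comparison boils down to reporting how far each measured value moved relative to the baseline. The sketch below assumes hypothetical metric names and made-up numbers purely to show the arithmetic; it does not reproduce the results from the talk.

```go
package main

import (
	"fmt"
	"sort"
)

// percentChange returns how much the PR value differs from the master
// value, in percent. A zero baseline yields 0 to avoid dividing by zero.
func percentChange(master, pr float64) float64 {
	if master == 0 {
		return 0
	}
	return (pr - master) / master * 100
}

func main() {
	// Hypothetical metrics and values, for illustration only.
	master := map[string]float64{"cpu_seconds": 400, "heap_mb": 2000, "query_latency_ms": 50}
	pr := map[string]float64{"cpu_seconds": 500, "heap_mb": 1800, "query_latency_ms": 40}

	// Sort the names so the report is stable between runs.
	names := make([]string, 0, len(master))
	for name := range master {
		names = append(names, name)
	}
	sort.Strings(names)

	for _, name := range names {
		fmt.Printf("%-18s %+.1f%%\n", name, percentChange(master[name], pr[name]))
	}
}
```

Reporting relative change instead of raw values makes regressions comparable across metrics with very different units.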

slide-28
SLIDE 28

Prometheus

This is only the beginning!

  • Document how to run manually without Prow.
  • Catch races with the Go race detector (`-race`).
  • Add service discovery tests.
  • Automated scalability tests.

○ post results with every release

  • New CLI commands

○ `promtool debug` - gather metrics/profile data
○ `tsdb scan` - scan and repair/delete corrupted blocks (WIP).
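For readers unfamiliar with the race detector mentioned in the list above: it is enabled by passing `-race` to `go test`, `go run`, or `go build`, and it reports unsynchronized concurrent access to shared memory at run time. The counter below is a generic illustration, not code from Prometheus. As written it is race-free; removing the mutex would make `c.n++` a data race that a `-race` run can report.

```go
package main

import (
	"fmt"
	"sync"
)

// counter is shared between goroutines, so access to n is guarded by mu.
type counter struct {
	mu sync.Mutex
	n  int
}

func (c *counter) inc() {
	c.mu.Lock()
	c.n++ // without the lock, this write races with other goroutines
	c.mu.Unlock()
}

// countConcurrently increments one shared counter from several goroutines
// and returns the final value.
func countConcurrently(workers, perWorker int) int {
	var c counter
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perWorker; j++ {
				c.inc()
			}
		}()
	}
	wg.Wait() // all increments are finished before n is read
	return c.n
}

func main() {
	fmt.Println(countConcurrently(8, 1000))
}
```

Races of this kind tend to surface only under concurrency and load, which is why running the detector as part of a benchmark is attractive.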

slide-29
SLIDE 29

Prometheus

Big thanks

Positively surprised by how much support we received from everyone.

  • Packet, Google, Red Hat (for letting me work on this).
  • Goutham, Frederic, Max for providing feedback along the way.
  • Julius for putting us in touch with the CNCF for the GKE credit.
  • Chris from the CNCF for setting up the Google account.
  • Fabian & Max for their first Prombench version.

○ Many pieces were ready to use.

  • The kubernetes-sig-testing team for introducing us to Prow.

○ https://github.com/cjwagner
○ https://github.com/krzyzacy

slide-30
SLIDE 30

Prometheus

Conclusion!

If you don’t break it, you don’t refactor and optimize.

slide-31
SLIDE 31

Prometheus

Open source - Mesh observability

https://github.com/kiali/kiali

Istio, Prometheus, Jaeger

slide-32
SLIDE 32

Prometheus

Slides: https://goo.gl/ky5ZZX
Code: https://github.com/prometheus/prombench
Proposal: https://goo.gl/CuKsMB

If you DON’T have any questions, please ask them now.

Questions about extreme sports are welcome :)