Querying Prometheus with Flux (#fluxlang), Paul Dix @pauldix

slide-1
SLIDE 1

Querying Prometheus with Flux (#fluxlang)

Paul Dix @pauldix paul@influxdata.com

slide-2
SLIDE 2
slide-3
SLIDE 3
  • Data-scripting language
  • Functional
  • MIT Licensed
  • Language & Runtime/Engine
slide-4
SLIDE 4

Prometheus users: so what?

slide-5
SLIDE 5

High availability?

slide-6
SLIDE 6

Sharded Data?

slide-7
SLIDE 7

Federation?

slide-8
SLIDE 8
slide-9
SLIDE 9
slide-10
SLIDE 10
slide-11
SLIDE 11
slide-12
SLIDE 12

subqueries

slide-13
SLIDE 13
slide-14
SLIDE 14

  • subqueries
  • recording rules

slide-15
SLIDE 15

Ad hoc exploration

slide-16
SLIDE 16
slide-17
SLIDE 17

Focus is Strength

slide-18
SLIDE 18

Saying No is an Asset

slide-19
SLIDE 19
slide-20
SLIDE 20

Liberate the silo!

slide-21
SLIDE 21
slide-22
SLIDE 22

Language Elements

slide-23
SLIDE 23

// get all data from the telegraf db
from(db:"telegraf")
  // filter that by the last hour
  |> range(start:-1h)
  // filter further by series with a specific measurement and field
  |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")

slide-24
SLIDE 24

// get all data from the telegraf db
from(db:"telegraf")
  // filter that by the last hour
  |> range(start:-1h)
  // filter further by series with a specific measurement and field
  |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")

Comments

slide-25
SLIDE 25

// get all data from the telegraf db
from(db:"telegraf")
  // filter that by the last hour
  |> range(start:-1h)
  // filter further by series with a specific measurement and field
  |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")

Functions

slide-26
SLIDE 26

// get all data from the telegraf db
from(db:"telegraf")
  // filter that by the last hour
  |> range(start:-1h)
  // filter further by series with a specific measurement and field
  |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")

Pipe forward operator
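
A rough way to picture the pipe forward operator, sketched in Python rather than Flux: each stage receives the output of the stage before it. The `pipe` helper, the stage functions, and the sample rows are all hypothetical stand-ins, not part of Flux or of the talk.

```python
from functools import reduce

def pipe(source, *stages):
    """Feed `source` through each stage in order, similar in spirit to |>."""
    return reduce(lambda data, stage: stage(data), stages, source)

# Hypothetical rows standing in for data read from the telegraf db.
rows = [
    {"_measurement": "cpu", "_field": "usage_system", "_value": 1.5},
    {"_measurement": "cpu", "_field": "usage_user", "_value": 7.0},
    {"_measurement": "mem", "_field": "used", "_value": 42.0},
]

def keep_cpu_system(data):
    # Same predicate as the filter() call on the slide.
    return [r for r in data
            if r["_measurement"] == "cpu" and r["_field"] == "usage_system"]

def values_only(data):
    return [r["_value"] for r in data]

result = pipe(rows, keep_cpu_system, values_only)
print(result)
```

Reading the stages left to right mirrors reading a Flux script top to bottom.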

slide-27
SLIDE 27

// get all data from the telegraf db
from(db:"telegraf")
  // filter that by the last hour
  |> range(start:-1h)
  // filter further by series with a specific measurement and field
  |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")

Named Arguments

slide-28
SLIDE 28

// get all data from the telegraf db
from(db:"telegraf")
  // filter that by the last hour
  |> range(start:-1h)
  // filter further by series with a specific measurement and field
  |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")

String Literal

slide-29
SLIDE 29

// get all data from the telegraf db
from(db:"telegraf")
  // filter that by the last hour
  |> range(start:-1h)
  // filter further by series with a specific measurement and field
  |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")

Duration Literal (relative time)

slide-30
SLIDE 30

// get all data from the telegraf db
from(db:"telegraf")
  // filter that by data since a fixed point in time
  |> range(start:"2018-08-09T14:00:00Z")
  // filter further by series with a specific measurement and field
  |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")

Time Literal

slide-31
SLIDE 31

// get all data from the telegraf db
from(db:"telegraf")
  // filter that by the last hour
  |> range(start:-1h)
  // filter further by series with a specific measurement and field
  |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")

Anonymous Function

slide-32
SLIDE 32

Operators

+    ==    !=    (    )
-    <     !~    [    ]
*    >     =~    {    }
/    <=    =     ,    :
%    >=    <-    .    |>

slide-33
SLIDE 33

Types

  • int
  • uint
  • float64
  • string
  • duration
  • time
  • regex
  • array
  • object
  • function
  • namespace
  • table
  • table stream
slide-34
SLIDE 34

Ways to run Flux - (interpreter, fluxd api server, InfluxDB 1.7 & 2.0)

slide-35
SLIDE 35

Flux builder in Chronograf

slide-36
SLIDE 36

Flux builder in Grafana

slide-37
SLIDE 37

Flux is about:

slide-38
SLIDE 38

Time series in Prometheus

slide-39
SLIDE 39
slide-40
SLIDE 40
slide-41
SLIDE 41

// get data from Prometheus on http://localhost:9090
fromProm(query: `node_cpu_seconds_total{cpu="0",mode="idle"}`)
  // filter that by the last minute
  |> range(start:-1m)

slide-42
SLIDE 42
slide-43
SLIDE 43
slide-44
SLIDE 44
slide-45
SLIDE 45
slide-46
SLIDE 46
slide-47
SLIDE 47
slide-48
SLIDE 48
slide-49
SLIDE 49

Multiple time series in Prometheus

slide-50
SLIDE 50

fromProm(query: `node_cpu_seconds_total{cpu="0",mode=~"idle|user"}`)
  |> range(start:-1m)
  |> keep(columns: ["name", "cpu", "host", "mode", "_value", "_time"])

slide-51
SLIDE 51
slide-52
SLIDE 52
slide-53
SLIDE 53
slide-54
SLIDE 54
slide-55
SLIDE 55
slide-56
SLIDE 56

Tables are the base unit

slide-57
SLIDE 57

Not tied to a specific data model/schema
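
One way to picture "tables as the base unit", sketched in Python: a table is a set of shared column values (a group key) plus rows, and both a Prometheus series and an InfluxDB series can be mapped onto it. The `Table` class and both converter functions are hypothetical illustrations, not Flux's actual internal representation.

```python
from dataclasses import dataclass, field

@dataclass
class Table:
    """A group key (column values shared by every row) plus the rows."""
    group_key: dict
    rows: list = field(default_factory=list)

def table_from_prom_series(labels, samples):
    """Map one Prometheus series (label set + samples) onto a Table."""
    return Table(
        group_key=dict(labels),
        rows=[{"_time": t, "_value": v} for t, v in samples],
    )

def table_from_influx_series(measurement, field_name, tags, points):
    """Map one InfluxDB series (measurement, field, tags) onto a Table."""
    key = {"_measurement": measurement, "_field": field_name, **tags}
    return Table(
        group_key=key,
        rows=[{"_time": t, "_value": v} for t, v in points],
    )

prom = table_from_prom_series(
    {"__name__": "node_cpu_seconds_total", "cpu": "0", "mode": "idle"},
    [(1533823200, 1000.5), (1533823215, 1015.2)],
)
influx = table_from_influx_series(
    "cpu", "usage_system", {"host": "a"}, [(1533823200, 2.5)]
)
print(prom.group_key, len(prom.rows))
print(influx.group_key, len(influx.rows))
```

Once both sources land in the same shape, the same downstream functions can operate on either.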

slide-58
SLIDE 58

Filter function

slide-59
SLIDE 59

fromProm()
  |> range(start:-1m)
  |> filter(fn: (r) => r.__name__ == "node_cpu_seconds_total" and
                       r.mode == "idle" and
                       r.cpu == "0")
  |> keep(columns: ["name", "cpu", "host", "mode", "_value", "_time"])

slide-60
SLIDE 60
slide-61
SLIDE 61
slide-62
SLIDE 62
slide-63
SLIDE 63
slide-64
SLIDE 64

fromProm()
  |> range(start:-1m)
  |> filter(fn: (r) => r.__name__ == "node_cpu_seconds_total" and
                       r.mode in ["idle", "user"] and
                       r.cpu == "0")
  |> keep(columns: ["name", "cpu", "host", "mode", "_value", "_time"])

slide-65
SLIDE 65
slide-66
SLIDE 66
slide-67
SLIDE 67
slide-68
SLIDE 68

Aggregate functions

slide-69
SLIDE 69

fromProm()
  |> range(start:-30s)
  |> filter(fn: (r) => r.__name__ == "node_cpu_seconds_total" and
                       r.mode == "idle" and
                       r.cpu =~ /0|1/)
  |> count()
  |> keep(columns: ["name", "cpu", "host", "mode", "_value", "_time"])
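
A small Python sketch of what the filter-then-count pipeline above computes, assuming the aggregate is applied once per series: rows are filtered with the same predicate, then counted per label set. The sample rows and the `keep_row` and `series_key` helpers are made up for illustration.

```python
import re
from collections import Counter

# Hypothetical samples; several rows per series, as a 30s range would return.
rows = [
    {"__name__": "node_cpu_seconds_total", "cpu": "0", "mode": "idle", "_value": 100.0},
    {"__name__": "node_cpu_seconds_total", "cpu": "0", "mode": "idle", "_value": 101.0},
    {"__name__": "node_cpu_seconds_total", "cpu": "1", "mode": "idle", "_value": 90.0},
    {"__name__": "node_cpu_seconds_total", "cpu": "2", "mode": "idle", "_value": 80.0},
    {"__name__": "node_cpu_seconds_total", "cpu": "0", "mode": "user", "_value": 5.0},
]

def keep_row(r):
    # Mirrors the slide's predicate. re.search is unanchored in Python,
    # so a label value such as "10" would also pass this test.
    return (r["__name__"] == "node_cpu_seconds_total"
            and r["mode"] == "idle"
            and re.search(r"0|1", r["cpu"]) is not None)

def series_key(r):
    # One table per distinct label set.
    return (r["__name__"], r["cpu"], r["mode"])

counts = Counter(series_key(r) for r in rows if keep_row(r))
for key, n in sorted(counts.items()):
    print(key, n)
```

The result has one count per series, not one count overall.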

slide-70
SLIDE 70
slide-71
SLIDE 71
slide-72
SLIDE 72
slide-73
SLIDE 73
slide-74
SLIDE 74
slide-75
SLIDE 75
slide-76
SLIDE 76

_start and _stop are about windows of data

slide-77
SLIDE 77

fromProm(query: `node_cpu_seconds_total{cpu="0",mode="idle"}`)
  |> range(start: -1m)

slide-78
SLIDE 78
slide-79
SLIDE 79

fromProm(query: `node_cpu_seconds_total{cpu="0",mode="idle"}`)
  |> range(start: -1m)
  |> window(every: 20s)

slide-80
SLIDE 80
slide-81
SLIDE 81

fromProm(query: `node_cpu_seconds_total{cpu="0",mode="idle"}`)
  |> range(start: -1m)
  |> window(every: 20s)
  |> min()

slide-82
SLIDE 82
slide-83
SLIDE 83

fromProm(query: `node_cpu_seconds_total{cpu="0",mode="idle"}`)
  |> range(start: -1m)
  |> window(every: 20s)
  |> min()
  |> window(every:inf)

slide-84
SLIDE 84
slide-85
SLIDE 85

Window converts N tables to M tables based on time boundaries
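
To make the N-to-M idea concrete, here is a minimal Python sketch of windowing, under the assumption that a table is just a list of rows and that windows are aligned to multiples of `every` seconds. The `window` function below is an illustration, not Flux's implementation; it also stamps each row with the `_start` and `_stop` of the window it fell into.

```python
def window(tables, every):
    """Split each table into tables aligned to `every`-second boundaries."""
    out = []
    for table in tables:
        buckets = {}
        for row in table:
            start = row["_time"] - row["_time"] % every
            stamped = {**row, "_start": start, "_stop": start + every}
            buckets.setdefault(start, []).append(stamped)
        # One output table per non-empty window, in time order.
        out.extend(buckets[start] for start in sorted(buckets))
    return out

# One input table covering a minute of hypothetical samples.
table = [
    {"_time": t, "_value": v}
    for t, v in [(0, 5.0), (10, 3.0), (20, 8.0), (30, 2.0), (40, 9.0), (50, 4.0)]
]

windowed = window([table], every=20)
mins = [min(t, key=lambda r: r["_value"])["_value"] for t in windowed]
print(len(windowed), mins)
```

Here one table becomes three, and an aggregate such as min then yields one value per window.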

slide-86
SLIDE 86

Group converts N tables to M tables based on values

slide-87
SLIDE 87

fromProm(query: `node_cpu_seconds_total{cpu=~"0|1",mode="idle"}`)
  |> range(start: -1m)

slide-88
SLIDE 88
slide-89
SLIDE 89

fromProm(query: `node_cpu_seconds_total{cpu=~"0|1",mode="idle"}`)
  |> range(start: -1m)
  |> group(columns: ["__name__", "mode"])
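
A Python sketch of grouping by column values, again treating a table as a list of rows. The `group` function and the sample data are hypothetical; the point is only that rows sharing the same values for the chosen columns end up in the same output table.

```python
def group(tables, columns):
    """Regroup rows so each output table shares values for `columns`."""
    regrouped = {}
    for table in tables:
        for row in table:
            key = tuple(row[c] for c in columns)
            regrouped.setdefault(key, []).append(row)
    return regrouped

# Two input tables, one per cpu label (made-up values).
cpu0 = [
    {"__name__": "node_cpu_seconds_total", "cpu": "0", "mode": "idle", "_time": 0, "_value": 10.0},
    {"__name__": "node_cpu_seconds_total", "cpu": "0", "mode": "idle", "_time": 15, "_value": 25.0},
]
cpu1 = [
    {"__name__": "node_cpu_seconds_total", "cpu": "1", "mode": "idle", "_time": 0, "_value": 12.0},
]

merged = group([cpu0, cpu1], ["__name__", "mode"])
by_cpu = group([cpu0, cpu1], ["cpu"])
print(len(merged), len(by_cpu))
```

Grouping on `__name__` and `mode` collapses the two tables into one, while grouping on `cpu` keeps them apart.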

slide-90
SLIDE 90
slide-91
SLIDE 91
slide-92
SLIDE 92
slide-93
SLIDE 93

Nested range vectors

fromProm(host:"http://localhost:9090")
  |> filter(fn: (r) => r.__name__ == "node_disk_written_bytes_total")
  |> range(start:-1h)
  // transform into non-negative derivative values
  |> derivative()
  // break those out into tables for each 10 minute block of time
  |> window(every:10m)
  // get the max rate of change in each 10 minute window
  |> max()
  // and put everything back into a single table
  |> window(every:inf)
  // and now let's convert to KB
  |> map(fn: (r) => r._value / 1024.0)
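
The pipeline above can be traced by hand with a small Python sketch. It assumes per-second rates between consecutive samples, negative rates dropped (one reading of "non-negative"), and windows aligned to multiples of `every` seconds; `derivative`, `max_per_window`, and the sample counter values are all hypothetical.

```python
def derivative(rows):
    """Per-second rate between consecutive samples; negative rates dropped."""
    out = []
    for prev, cur in zip(rows, rows[1:]):
        rate = (cur["_value"] - prev["_value"]) / (cur["_time"] - prev["_time"])
        if rate >= 0:
            out.append({"_time": cur["_time"], "_value": rate})
    return out

def max_per_window(rows, every):
    """Keep the row with the largest value in each `every`-second window."""
    best = {}
    for row in rows:
        start = row["_time"] - row["_time"] % every
        if start not in best or row["_value"] > best[start]["_value"]:
            best[start] = row
    # Returning one list plays the role of window(every: inf).
    return [best[start] for start in sorted(best)]

# Hypothetical bytes-written counter sampled every 5 minutes.
samples = [
    {"_time": t, "_value": v}
    for t, v in [(0, 0), (300, 307200), (600, 921600), (900, 1228800), (1200, 1228800)]
]

rates = derivative(samples)
peaks = max_per_window(rates, every=600)
peaks_kb = [{"_time": r["_time"], "_value": r["_value"] / 1024.0} for r in peaks]
print([r["_value"] for r in peaks_kb])
```

Each stage consumes the previous stage's output, which is the same job a nested range-vector query does in a single expression.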

slide-94
SLIDE 94

Multiple Servers

dc1 = fromProm(host:"http://prom.dc1.local:9090")
  |> filter(fn: (r) => r.__name__ == "node_network_receive_bytes_total")
  |> range(start:-1h)
  |> insertGroupKey(key: "dc", value: "1")

dc2 = fromProm(host:"http://prom.dc2.local:9090")
  |> filter(fn: (r) => r.__name__ == "node_network_receive_bytes_total")
  |> range(start:-1h)
  |> insertGroupKey(key: "dc", value: "2")

dc1
  |> union(streams: [dc2])
  |> limit(n: 2)
  |> derivative()
  |> group(columns: ["dc"])
  |> sum()
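
A simplified Python sketch of the tag, union, group, and sum steps above, leaving out the limit and derivative stages for brevity. The rows are invented, and `insert_group_key` is only a stand-in named after the function on the slide.

```python
from collections import defaultdict

def insert_group_key(rows, key, value):
    """Tag every row with a constant column, e.g. the source datacenter."""
    return [{**row, key: value} for row in rows]

# Hypothetical results from two separate Prometheus servers.
dc1 = insert_group_key(
    [{"_time": 0, "_value": 100.0}, {"_time": 15, "_value": 50.0}], "dc", "1"
)
dc2 = insert_group_key([{"_time": 0, "_value": 25.0}], "dc", "2")

# union: concatenate both result sets into one stream.
combined = dc1 + dc2

# group by "dc", then sum each group.
totals = defaultdict(float)
for row in combined:
    totals[row["dc"]] += row["_value"]

print(dict(totals))
```

Because the `dc` column is added before the union, the origin of each row survives the merge and can drive the grouping afterwards.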

slide-95
SLIDE 95

Work with data from many sources

  • from() // influx
  • fromProm()
  • fromMySQL()
  • fromCSV()
  • fromS3()
slide-96
SLIDE 96

Defining Functions

fromProm(query: `node_cpu_seconds_total{cpu="0",mode="idle"}`)
  |> range(start: -1m)
  |> window(every: 20s)
  |> min()
  |> window(every:inf)

slide-97
SLIDE 97

Defining Functions

windowAgg = (every, fn, <-stream) => {
  return stream
    |> window(every: every)
    |> fn()
    |> window(every:inf)
}

fromProm(query: `node_cpu_seconds_total{cpu="0",mode="idle"}`)
  |> range(start: -1m)
  |> windowAgg(every:20s, fn: min)
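
The same pattern, a helper that takes the aggregate function as an argument, can be sketched in Python. `window_agg`, `min_row`, and `max_row` are hypothetical names, and the windowing assumes boundaries at multiples of `every` seconds.

```python
def window_agg(rows, every, fn):
    """Bucket rows by time, apply `fn` to each bucket, return one merged list."""
    buckets = {}
    for row in rows:
        start = row["_time"] - row["_time"] % every
        buckets.setdefault(start, []).append(row)
    return [fn(buckets[start]) for start in sorted(buckets)]

def min_row(rows):
    return min(rows, key=lambda r: r["_value"])

def max_row(rows):
    return max(rows, key=lambda r: r["_value"])

# A minute of made-up samples at 10-second intervals.
samples = [
    {"_time": t, "_value": v}
    for t, v in [(0, 5.0), (10, 3.0), (20, 8.0), (30, 2.0), (40, 9.0), (50, 4.0)]
]

lows = window_agg(samples, every=20, fn=min_row)
highs = window_agg(samples, every=20, fn=max_row)
print([r["_value"] for r in lows], [r["_value"] for r in highs])
```

Swapping the aggregate is a one-argument change, which is the reuse the windowAgg definition is after.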

slide-98
SLIDE 98

Packages & Namespaces

package "flux-helpers"

windowAgg = (every, fn, <-stream) => {
  return stream
    |> window(every: every)
    |> fn()
    |> window(every:inf)
}

// in a new script
import helpers "github.com/pauldix/flux-helpers"

fromProm(query: `node_cpu_seconds_total{cpu="0",mode="idle"}`)
  |> range(start: -1m)
  |> helpers.windowAgg(every:20s, fn: min)

slide-99
SLIDE 99

Project Status

  • Everything in this talk is prototype (as of 2018-08-09)
  • Proposed Final Language Spec
  • Release flux, fluxd, InfluxDB 1.7, InfluxDB 2.0 alpha
  • Iterate with community to finalize spec
  • Optimizations!
  • https://github.com/influxdata/flux
slide-100
SLIDE 100

Future work

slide-101
SLIDE 101

More complex Flux compilations to PromQL?

slide-102
SLIDE 102

PromQL parser for Flux engine?

slide-103
SLIDE 103

Add Flux into Prometheus?

slide-104
SLIDE 104

Arrow API for Prometheus

slide-105
SLIDE 105

Apache Arrow

slide-106
SLIDE 106

Stream from Prometheus

slide-107
SLIDE 107

Pushdown matcher and range

slide-108
SLIDE 108

Later pushdown more?

slide-109
SLIDE 109

Standardized Remote Read API?

slide-110
SLIDE 110

Arrow is becoming the lingua franca in data science and big data

slide-111
SLIDE 111

fromProm(query: `{__name__=~/node_.*/}`)
  |> range(start:-1h)
  |> toCSV(file: "node-data.csv")
  |> toFeather(file: "node-data.feather")

slide-112
SLIDE 112

Much more work to be done…

slide-113
SLIDE 113

Prometheus + Flux = Possibilities

slide-114
SLIDE 114

Thank you

Paul Dix @pauldix paul@influxdata.com