Efficient Time Series
with PostgreSQL
Steve Simpson
steve@smpsn.net
FOSDEM PGDay 2018
Overview
Background
Complexity
Time Series
Schema
Indexing
Normalisation
Summarisation
Partitioning
Background
Developers, Developers, Developers
Expensive Toys
Monitoring
Visibility into the operation of hardware and software, e.g. web site, database, cluster, disk drive
! Look ! ! Graphs !
Monitoring – Time Series
Complexity
“Microservices Hell”
[Diagram: PostgreSQL, Redis, Frontend, Machine Learning Bit, Login / IAM, Backend API, Cache, Configuration Store, Elasticsearch, Cassandra, Search API, Service Discovery (consul, etcd), Container Orchestration Engine (Docker Swarm / Kubernetes), MySQL]
More Cowbell - OpenStack
[Diagram: Heat, Horizon, Nova, Keystone, Conductor, Scheduler, Compute, Glance, Cinder, KVM, MySQL, RabbitMQ, ..., Ceph]
Monitoring
[Diagram: Software, Network, Storage, Servers; Metrics, Logs]
Monitoring
[Diagram: Software, Network, Storage, Servers; Metrics, Logs; Middleware: Log API, Kafka, Logstash, Elastic, InfluxDB, Metric API, Alerting, Grafana, Kibana, MySQL, SQLite, Zookeeper, Storm]
Recap
[Diagram: the components of the three preceding diagrams shown together: the microservices stack, the OpenStack services, and the monitoring middleware]
Monitoring
Commendable “right tool for the job” attitude, but…
At least, could some of the persistence be unified?
– Fewer failure modes
– Fewer backup strategies
– Fewer redundancy/replication protocols
– One set of consistent data semantics
– Re-use existing operational knowledge
One simple question…
Is PostgreSQL a Time Series Database?
Time Series
Time Series Databases – The Choice!
ank (Cassandra)
ank [Raintank]
Time Series Databases
[Timeline: 2000, 2010, 2013 - 2017]
Time Series – Periodic
Collector
time   value
10:01  15%
10:02  100%
10:03  86%
10:04  60%
10:05  0%
Time Series – Periodic
Collector, with Metric Meta (name, dimensions)
time   value  name  dimensions
10:01  15%    cpu   { host:prod1 }
10:01  1%     cpu   { host:prod2 }
10:01  24°    temp  { sensor:rack }
10:02  100%   cpu   { host:prod1 }
10:02  87%    cpu   { host:prod2 }
10:02  26°    temp  { sensor:rack }
Time Series – Sporadic
Collector, with Metric Meta (name, dimensions) and Event Meta (meta)
Events
time   value  name   dimensions      meta
10:07  13     log    { host:prod1 }  { msg:… }
11:39  1      log    { host:prod2 }  { msg:… }
11:50  2      alarm  { host:prod1 }  { reason:… }
14:02  1      alarm  { host:prod1 }  { reason:… }
Time Series – Ingest Data
"metric": {
    "timestamp": 1232141412,
    "value": 42,
    "name": "cpu.percent",
    "dimensions": { "hostname": "dev-01" },
    "value_meta": { … }
}
ags”)
Time Series – Queries
[Chart: temp (°C, 20 to 30) over time (10:01 to 10:03) for { sensor:rack }]
time   value  name  dimensions
10:01  15%    cpu   { host:prod1 }
10:01  1%     cpu   { host:prod2 }
10:01  24°    temp  { sensor:rack }
10:02  100%   cpu   { host:prod1 }
10:02  87%    cpu   { host:prod2 }
10:02  26°    temp  { sensor:rack }
10:03  100%   cpu   { host:prod1 }
10:03  50%    cpu   { host:prod2 }
10:03  27°    temp  { sensor:rack }
Time Series – Queries
[Chart: cpu (%, 10 to 100) over time (10:01 to 10:03) for { host:prod1 } and { host:prod2 }; same table]
Time Series – Queries
[Same chart and table, with further rows arriving at 10:04 and 10:05 for each of the three series]
Time Series – Scaling
[Same table: time, value, name, dimensions]
Relational Model
Relational Model – Denormalised Schema
CREATE TABLE measurements (
    timestamp   TIMESTAMPTZ,
    value       FLOAT8,
    name        VARCHAR,
    dimensions  JSONB,
    value_meta  JSON
);
Typical for sensor use cases
– Strict accuracy: use NUMERIC (for the value column)
Relational Model – Series Listing Query
SELECT DISTINCT name, dimensions
FROM measurements
WHERE name = 'cpu.percent'
  AND dimensions @> '{"host": "dev-01"}'::JSONB;
Relational Model – Single Series Query
Take average of all data points in that interval
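The query itself did not survive on this slide. A minimal sketch of what a single series query of this kind might look like against the measurements table is shown below. The one-minute interval, the one-hour window and the filter values are illustrative assumptions, not taken from the talk.

```sql
-- Sketch only: interval size, time window and filter values are assumptions.
SELECT
    date_trunc('minute', m.timestamp) AS bucket,    -- start of each interval
    avg(m.value)                      AS avg_value  -- average of all points in it
FROM measurements AS m
WHERE m.name = 'cpu.percent'
  AND m.dimensions @> '{"host": "dev-01"}'::JSONB
  AND m.timestamp >= now() - INTERVAL '1 hour'
  AND m.timestamp <  now()
GROUP BY bucket
ORDER BY bucket;
```

The name and dimensions predicates select one series, the timestamp predicates select the time range, and the GROUP BY collapses each interval to a single averaged point.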
Relational Model – Performance Analysis
[Charts: Query Duration (seconds) vs Data Volume (M rows); Query Duration (seconds) vs Time Range (seconds)]
Target: <100ms (Query Duration)
Relational Model – Single Series Query (vs Time Range)
[Chart: Query Duration (seconds), 0.00 to 0.40, vs Query Time Range (seconds), 1000 to 9000; series: 3M Rows, 2M Rows, 1M Rows]
Relational Model – Single Series Query (vs Data Volume)
[Chart: Query Duration (seconds), 0.00 to 1.40, vs Data Volume (M rows), 1 to 10]
Relational Model - Analysis
✔ Query time fixed regardless of time range
✔ On target for < ~1M rows
✗ Query time scales linearly with data volume
✗ Every query reads every row
✗ Full table scan
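One way to check the "full table scan" observation is to ask the planner directly. The sketch below uses EXPLAIN on an illustrative query. The filter values are assumptions, and the exact plan and timings depend on the data and PostgreSQL version, so no output is shown.

```sql
-- Sketch only: the query and its filter values are illustrative assumptions.
EXPLAIN (ANALYZE, BUFFERS)
SELECT avg(m.value)
FROM measurements AS m
WHERE m.name = 'cpu.percent'
  AND m.timestamp >= now() - INTERVAL '1 hour';
-- With no indexes on measurements, the plan would be expected to contain a
-- sequential scan node over the whole table, with the WHERE clause applied
-- as a filter on every row read.
```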
Indexing
BTREE, HASH, BRIN, GIN, GIST
Indexing – BTREE
[Diagram: an Index holding the keys 1 to 8 in tree order, pointing into a Table whose rows are stored unordered: 3 three, 2 two, 4 four, 6 six, 1, 5 five, 8 eight, 7 seven]
Indexing – BTREE
[Same diagram, showing a lookup for =7]
Indexing – BTREE
[Same diagram, showing a range lookup for >=6 <=8]
Indexing – Single Series Query
index creation
Indexing - Timestamp
CREATE INDEX ON measurements USING BTREE (timestamp);
Indexing – Single Series Query (vs Data Volume)
[Chart: Query Duration (seconds), 0.00 to 1.40, vs Data Volume (millions of rows), 1 to 10; series: No Index, With Index]
[Chart: Query Duration (seconds), 0.000 to 0.025, vs Data Volume (millions of rows), 1 to 10; series: 9000, 8000, 7000]
Indexing – Single Series Query (vs Time Range, 10M Rows)
[Chart: Query Duration (seconds), 0.00 to 0.10, vs Query Time Range (seconds), 1000 to 10000; series: 1 Metric, 10 Metrics]
[Chart: Query Duration (seconds), 0.00 to 0.25, vs Query Time Range (seconds), 1000 to 10000; series: 1 Metric, 10 Metrics, 100 Metrics]
Indexing - Analysis
✔ Data Volume
✔ Time Range
✔ Query time stable as Data Volume increases
✗ Time Range
✗ Over 4000s (100 Metrics)
✗ Now apparent query duration increases as Time Range grows
✗ Increasing number of metrics drastically affects query duration
✗ Data for each uninteresting series must be filtered out
Indexing – Single Series Query
Indexing – Additional
CREATE INDEX ON measurements USING BTREE (name);
CREATE INDEX ON measurements USING GIN (dimensions);
measurements table
Indexing – Series Query (vs Time Range, 10M Rows, 100 Metrics)
[Chart: Query Duration (seconds), 0.00 to 0.25, vs Query Time Range (seconds), 1000 to 10000; series: Time & Metric, Time Index]
Indexing
✗ Oh dear
✗ Too much indexing can be harmful
Normalisation
CREATE TABLE measurements (
    timestamp   TIMESTAMPTZ,
    value       FLOAT8,
    name        VARCHAR,
    dimensions  JSONB,
    value_meta  JSON
);
Normalisation
-- "values" is quoted because VALUES is a reserved word in PostgreSQL
CREATE TABLE "values" (
    timestamp   TIMESTAMPTZ,
    value       FLOAT8,
    metric_id   INT,
    value_meta  JSON
);
CREATE TABLE metrics (
    id          SERIAL,
    name        VARCHAR,
    dimensions  JSONB,
    UNIQUE (name, dimensions)
);
metric are only stored once
– Eliminates repeated bulky data in measurements table
integers to allot id values
– UNIQUE constraint is useful during normalisation
Normalisation – View
CREATE VIEW measurements AS
    SELECT timestamp, value, name, dimensions, value_meta
    FROM "values" INNER JOIN metrics ON (metric_id = id);
the same way as tables
produces contents of view
queries as before
Normalisation – View Insert
views by default
to perform on INSERT
allocate metric_id
Transparent for user
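The code for this slide did not survive extraction. The sketch below shows one way an insert through the measurements view could be made to work, assuming an INSTEAD OF INSERT trigger is what the slide describes (a view over a join is not insertable by default in PostgreSQL). The function names get_metric_id and measurements_insert, and the trigger name, are hypothetical and not taken from the talk.

```sql
-- Hypothetical helper: take name/dimensions, return the metric id,
-- allocating a new metrics row if the pair has not been seen before.
CREATE FUNCTION get_metric_id (in_name VARCHAR, in_dimensions JSONB)
RETURNS INT LANGUAGE plpgsql AS $_$
DECLARE
    found_id INT;
BEGIN
    SELECT id INTO found_id
    FROM metrics
    WHERE name = in_name AND dimensions = in_dimensions;

    IF found_id IS NULL THEN
        -- The UNIQUE (name, dimensions) constraint makes a concurrent
        -- duplicate insert a no-op rather than an error.
        INSERT INTO metrics (name, dimensions)
        VALUES (in_name, in_dimensions)
        ON CONFLICT (name, dimensions) DO NOTHING;

        SELECT id INTO found_id
        FROM metrics
        WHERE name = in_name AND dimensions = in_dimensions;
    END IF;

    RETURN found_id;
END;
$_$;

-- Hypothetical trigger function: redirect a row inserted into the view
-- to the underlying "values" table.
CREATE FUNCTION measurements_insert () RETURNS TRIGGER LANGUAGE plpgsql AS $_$
BEGIN
    INSERT INTO "values" (timestamp, value, metric_id, value_meta)
    VALUES (NEW.timestamp, NEW.value,
            get_metric_id(NEW.name, NEW.dimensions), NEW.value_meta);
    RETURN NEW;
END;
$_$;

CREATE TRIGGER measurements_insert_t
    INSTEAD OF INSERT ON measurements
    FOR EACH ROW EXECUTE PROCEDURE measurements_insert ();
```

With something like this in place, a client could keep writing `INSERT INTO measurements (timestamp, value, name, dimensions) VALUES (now(), 42, 'cpu.percent', '{"hostname": "dev-01"}');` without knowing that the data is stored normalised.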
Normalisation – Metric Lookup
Take name/dimensions
– Returns metric_id
Normalisation - Indexing
CREATE INDEX ON "values" USING BTREE (timestamp);
CREATE INDEX ON "values" USING BTREE (metric_id);
metrics during JOIN
– Serves similar purpose to existing metric indexing
Normalisation – Series Query (vs Time, 10M Rows, 100 Metrics)
[Chart: Query Duration (seconds), 0.00 to 0.25, vs Query Time Range (seconds), 1000 to 9000; series: Normalised, Denormalised (Time Index), Denormalised (Extra Index)]
Normalisation
✔ Normalisation eliminated overhead of additional metric indexing
✗ The metric indexing still doesn’t have a positive effect
Normalisation – Bitmap Index Scan
row  time   value  metric
A    10:01  .      1
B    10:01  .      2
C    10:02  .      1
D    10:02  .      2
E    10:03  .      1
F    10:03  .      2
G    10:04  .      1
H    10:04  .      2
Query: metric 2, time :02 to :03
time index matches:    C D E F
metric index matches:  B D F H
rows in both:          D F
Normalisation – Multi-Column Indexing
[Same table and query; a single index on time and metric matches D F directly]
CREATE INDEX ON "values" USING BTREE (timestamp);
CREATE INDEX ON "values" USING BTREE (metric_id);
Normalisation – Multi-Column Indexing
CREATE INDEX ON "values" USING BTREE (timestamp, metric_id);
CREATE INDEX ON "values" USING BTREE (metric_id, timestamp);
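As an illustration of the kind of query these multi-column indexes target, the sketch below selects one metric over a time range, mirroring the "metric 2, :02 to :03" example from the diagram. The date and the metric id are illustrative assumptions. Which index the planner actually chooses depends on the data and statistics, so no plan output is shown.

```sql
-- Sketch only: the literal timestamps and metric id are assumptions.
EXPLAIN
SELECT v.timestamp, v.value
FROM "values" AS v
WHERE v.metric_id = 2
  AND v.timestamp >= TIMESTAMPTZ '2018-02-02 10:02:00+00'
  AND v.timestamp <= TIMESTAMPTZ '2018-02-02 10:03:00+00'
ORDER BY v.timestamp;
```

An index on (metric_id, timestamp) suits this shape well: the equality on metric_id picks out one series, and within it the entries are ordered by timestamp, so the range becomes one contiguous part of the index.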
Normalisation – Series Query (vs Range, 10M Rows, 100 Metrics)
[Chart: Query Duration (seconds), 0.00 to 0.25, vs Query Time Range (seconds), 1000 to 9000; series: Normalised (Single Index), Normalised, Denormalised (Time Index), Denormalised (Extra Index)]
Normalisation – Series Query (vs Volume, 10 Metrics)
[Chart: Query Duration (seconds), 0.02 to 0.12, vs Data Volume (M rows), 10 to 100; series: 10000, 20000, 30000]
Normalisation – Series Query (vs Range, 100M Rows)
[Chart: Query Duration (seconds), 0.00 to 0.16, vs Query Time Range (seconds), 10000 to 90000; series: 1 Metric, 10 Metrics]
[Chart: Query Duration (seconds), 0.00 to 0.90, vs Query Time Range (seconds), 10000 to 90000; series: 1 Metric, 10 Metrics, 100 Metrics]
Normalisation – Series Query (vs Range, 100M Rows) (+Config)
[Chart: Query Duration (seconds), 0.00 to 0.90, vs Query Time Range (seconds), 10000 to 90000; series: 1 Metric, 10 Metrics, 100 Metrics]
Normalisation - Analysis
✔ Data Volume
✔ Time Range
✗ Time Range
✗ Over 30,000s (100 Metrics)
✗ Over 90,000s
✗ Need a better strategy for servicing larger time ranges
Summarising
Summarising - Problem
Summarising - Example
values
time   value  metric
10:00  10     1
10:00  2      2
10:01  20     1
10:01  6      2
10:02  5      1
10:02  4      2
10:03  15     1
10:03  1      2
values_2
time   sum  metric
10:00  30   1
10:00  8    2
10:02  20   1
10:02  5    2
✔ Summary table just a fraction of the size
Summarising
CREATE TABLE values_10 (
    timestamp  TIMESTAMPTZ,
    metric_id  INT,
    sum        FLOAT8,
    count      FLOAT8,
    min        FLOAT8,
    max        FLOAT8,
    UNIQUE (metric_id, timestamp)
);
Summarising
CREATE VIEW summary_10 AS
    SELECT * FROM values_10 INNER JOIN metrics ON (metric_id = id);
Summarising – Trigger Definition
CREATE FUNCTION summarise_10 () RETURNS TRIGGER LANGUAGE plpgsql AS $_$
BEGIN
    :
END;
$_$;
-- "values" is quoted because VALUES is a reserved word in PostgreSQL
CREATE TRIGGER summarise_10_t AFTER INSERT ON "values"
    FOR EACH ROW EXECUTE PROCEDURE summarise_10 ();
Summarising – Trigger Action
NEW is inserted data
10 seconds
EXCLUDED is current row (the row proposed for insertion)
– Combine new value with existing aggregate value
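The trigger body is elided on the slide. The sketch below shows one way the action described above could be written, assuming an upsert into values_10 with INSERT ... ON CONFLICT is what the slide used (the mention of EXCLUDED suggests this). The rounding expression and the exact SET clauses are assumptions rather than the author's code.

```sql
-- Sketch only: fills in the elided body of summarise_10 under the
-- assumptions stated above.
CREATE OR REPLACE FUNCTION summarise_10 () RETURNS TRIGGER LANGUAGE plpgsql AS $_$
BEGIN
    INSERT INTO values_10 (timestamp, metric_id, sum, count, min, max)
    VALUES (
        -- Round the inserted timestamp down to the start of its 10 second bucket.
        to_timestamp((floor(extract(epoch FROM NEW.timestamp) / 10) * 10)::FLOAT8),
        NEW.metric_id,
        NEW.value,  -- sum of a single value
        1,          -- count of a single value
        NEW.value,  -- min of a single value
        NEW.value   -- max of a single value
    )
    -- A row for this metric and bucket already exists: combine the new
    -- value (EXCLUDED) with the existing aggregate (values_10).
    ON CONFLICT (metric_id, timestamp) DO UPDATE SET
        sum   = values_10.sum   + EXCLUDED.sum,
        count = values_10.count + EXCLUDED.count,
        min   = LEAST(values_10.min, EXCLUDED.min),
        max   = GREATEST(values_10.max, EXCLUDED.max);

    RETURN NULL;  -- the return value of an AFTER row trigger is ignored
END;
$_$;
```

The UNIQUE (metric_id, timestamp) constraint on values_10 is what lets ON CONFLICT detect that a bucket row already exists.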
Summarising – Single Series Query
not raw measurements
partial aggregations
– MIN: MIN(min)
– MAX: MAX(max)
– SUM: SUM(sum)
– COUNT: SUM(count)
– AVG: SUM(sum)/SUM(count)
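The combinations listed above can be sketched as a query against the summary_10 view. The one-minute output interval, the one-day window and the filter values are illustrative assumptions, and the output column names are hypothetical.

```sql
-- Sketch only: interval, window, filters and output names are assumptions.
SELECT
    date_trunc('minute', s.timestamp) AS bucket,
    min(s.min)                        AS min_value,    -- MIN: MIN(min)
    max(s.max)                        AS max_value,    -- MAX: MAX(max)
    sum(s.sum)                        AS sum_value,    -- SUM: SUM(sum)
    sum(s.count)                      AS count_value,  -- COUNT: SUM(count)
    sum(s.sum) / sum(s.count)         AS avg_value     -- AVG: SUM(sum)/SUM(count)
FROM summary_10 AS s
WHERE s.name = 'cpu.percent'
  AND s.dimensions @> '{"host": "dev-01"}'::JSONB
  AND s.timestamp >= now() - INTERVAL '1 day'
GROUP BY bucket
ORDER BY bucket;
```

The average has to be rebuilt from the sums and counts, because an average of per-bucket averages would weight sparse and dense buckets equally.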
Summarising – Series Query (vs Range, 100M Rows)
[Chart: Query Duration (seconds), 0.00 to 0.12, vs Query Time Range (seconds), 10000 to 90000; series: 1 Metric, 10 Metrics, 100 Metrics]
[Chart: same axes; two sets of series, each with 1 Metric, 10 Metrics, 100 Metrics]
Summarising – Series Query (vs Volume; 100M-1BN)
[Chart: Query Duration (seconds), 0.000 to 0.120, vs Data Volume (M rows), 100 to 1000; series: 100000, 200000, 300000]
Summarising – Series Query (vs Range, 1BN Rows)
[Chart: Query Duration (seconds), 0.02 to 0.12, vs Query Time Range (seconds), 100000 to 900000]
Summarising
✔ Data Volume
✔ Time Range
✔ To scale further? Try 100:1 summary
Closing Notes
What Next?
Is It Worth It?
Thanks
steve@smpsn.net