Databases Have Forgotten About Single Node Performance, A Wrongheaded Trade Off (PowerPoint presentation)


SLIDE 1

Databases Have Forgotten About Single Node Performance, A Wrongheaded Trade Off

Matvey Arye, PhD

Core Database Engineer

mat@timescale.com · github.com/timescale

SLIDE 2

SLIDE 3

How computing used to look

SLIDE 4

How computing looks today

SLIDE 5

@timescale

What happened?

SLIDE 6

Databases changed from scaling up to scaling out

SLIDE 7

We entered the MapReduce Era

SLIDE 8

But… MapReduce specializes in long jobs that touch lots of data

  • Grep: 150 seconds
  • Sort: 890 seconds
  • Average job completion time of Google production workloads: 634 seconds

(Source: Google MapReduce paper)

SLIDE 9

Why the high latency?

SLIDE 10

Network Latencies

  • Connection setup/teardown
  • TCP ramp-up
  • Network transfer
SLIDE 11

Consensus is Expensive

  • Dreaded two-phase commit
  • Snapshot isolation is even harder
  • Coordinators are often bottlenecks
SLIDE 12

The “Straggler” Problem

SLIDE 13

The high latency of distributed databases is nothing new

Graph connected-component analysis:

              Cores   twitter_rv   uk_2007_05
  Spark       128     1784s        8000s+
  Giraph      128     200s         8000s+
  GraphLab    128     242s         714s
  GraphX      128     251s         800s
  Laptop      1       153s         417s
  Laptop*     1       15s          30s

PageRank (20 iterations):

              Cores   twitter_rv   uk_2007_05
  Spark       128     857s         1759s
  Giraph      128     596s         1235s
  GraphLab    128     249s         833s
  GraphX      128     419s         462s
  Laptop      1       300s         651s

Source: "Scalability! But at what COST?", McSherry et al.

SLIDE 14

But that was 10 years ago

The world has changed

Then: 160 GB spinning rust, 2-core CPU, 4 GB RAM
Now: 60 TB SSDs, 64-core CPUs, TBs of RAM, edge computing, GPUs, TPUs

SLIDE 15

And data needs have become real-time, across domains:

  • Financial & Marketing
  • Industrial Machines
  • Datacenter & DevOps
  • Web / Mobile Events
  • Transportation & Logistics

SLIDE 16

SLIDE 17

Shorter latency requirements mean we need to focus again on scaling up

SLIDE 18

So how do we improve single-node performance?

(Lessons from TimescaleDB)

SLIDE 19

Postgres 9.6.2 on Azure standard DS4 v2 (8 cores), SSD (premium LRS storage). Each row has 12 columns (1 timestamp, 1 indexed host ID, 10 metrics).

SLIDE 20

Insight #1: Partition, even on a single machine.

SLIDES 21–24

(animation build-up of the partitioning diagram; frames labeled "Older")
SLIDE 25

Partitioning on a single node: a hypertable is divided into chunks (sub-tables) along two dimensions, time (newer to older) and space.
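In TimescaleDB this partitioning is set up with a single call on an ordinary table; a minimal sketch (the schema and column names are illustrative):

```sql
-- Illustrative schema; table and column names are hypothetical.
CREATE TABLE data (
  time       TIMESTAMPTZ      NOT NULL,
  device_id  TEXT             NOT NULL,
  temp       DOUBLE PRECISION
);

-- Turn the plain table into a hypertable partitioned on time.
-- (An optional "space" partitioning column such as device_id
-- can also be given; see the version-specific docs.)
SELECT create_hypertable('data', 'time');
```

From here on, inserts and queries go through the hypertable as if it were one table; chunk creation and routing happen automatically.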

SLIDE 26

Chunks are “right-sized”

Recent (hot) chunks fit in memory
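Chunk size is governed by the chunk time interval; a hedged sketch using TimescaleDB's `set_chunk_time_interval` (the exact signature varies by version, and the interval shown is illustrative):

```sql
-- Aim for recent chunks whose data plus indexes fit in memory:
-- shrink the interval when ingest rates are high, grow it when low.
SELECT set_chunk_time_interval('data', INTERVAL '1 day');
```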

SLIDE 27

TimescaleDB vs. PostgreSQL (single-row inserts)

144K metrics/s (14.4K inserts/s)

TimescaleDB 0.5, Postgres 9.6.2 on Azure standard DS4 v2 (8 cores), SSD (LRS storage). Each row has 12 columns (1 timestamp, 1 indexed host ID, 10 metrics).

SLIDE 28

TimescaleDB vs. PostgreSQL (batch inserts)

1.11M metrics/s, >20x PostgreSQL

TimescaleDB 0.5, Postgres 9.6.2 on Azure standard DS4 v2 (8 cores), SSD (LRS storage). Each row has 12 columns (1 timestamp, 1 indexed host ID, 10 metrics).

SLIDE 29

Other benefits to partitioning

SLIDE 30

Avoid querying chunks via constraint exclusion

SELECT time, device_id, temp FROM data
WHERE time > '2017-08-22 18:18:00+00';
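The same pruning can be seen in plain PostgreSQL with inheritance partitions and CHECK constraints; a minimal sketch (table names and date ranges are illustrative):

```sql
-- Parent table plus one chunk-like child with a CHECK constraint on time.
CREATE TABLE data (time TIMESTAMPTZ, device_id TEXT, temp DOUBLE PRECISION);
CREATE TABLE data_2017_08_22 (
  CHECK (time >= '2017-08-22' AND time < '2017-08-23')
) INHERITS (data);

-- With constraint exclusion enabled, the planner skips any child whose
-- CHECK constraint contradicts the constant WHERE clause.
SET constraint_exclusion = partition;
EXPLAIN SELECT time, device_id, temp FROM data
WHERE time > '2017-08-22 18:18:00+00';
```

The EXPLAIN output should list only the children whose time ranges can satisfy the predicate.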

SLIDE 31

Efficient retention policies

Drop chunks, don’t delete rows ⇒ avoids vacuuming
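In TimescaleDB this is the `drop_chunks` call; a sketch matching the 0.x/1.x argument order (the signature has changed across versions, and the retention window is illustrative):

```sql
-- Drop whole chunks older than 3 months instead of DELETEing rows:
-- largely a metadata operation plus file removal, so there is
-- no dead-tuple debt for VACUUM to clean up afterwards.
SELECT drop_chunks(INTERVAL '3 months', 'data');
```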

SLIDE 32

Insight #2: Partition, even on a single machine, across many disks.

SLIDE 33

Single node: scaling up by adding disks

How: chunks spread across many disks (elastically!), either RAIDed or via distinct tablespaces

Benefit:
  • Faster inserts
  • Parallelized queries
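The tablespace route can be sketched as follows (mount paths and names are hypothetical; `attach_tablespace` is TimescaleDB's call for spreading chunks):

```sql
-- One tablespace per physical disk (paths are illustrative).
CREATE TABLESPACE disk1 LOCATION '/mnt/disk1/pgdata';
CREATE TABLESPACE disk2 LOCATION '/mnt/disk2/pgdata';

-- Newly created chunks of the hypertable are then rotated
-- across all attached tablespaces.
SELECT attach_tablespace('disk1', 'data');
SELECT attach_tablespace('disk2', 'data');
```

Because tablespaces can be attached (and detached) at any time, disks can be added as the dataset grows, which is the elasticity the next slide refers to.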

SLIDE 34

Elasticity on a single node

SLIDE 35

Insight #3: Query optimizations

SLIDE 36

Avoid querying chunks via constraint exclusion (continued)

SELECT time, device_id, temp FROM data
WHERE time > now() - interval '24 hours';

Won't exclude chunks in plain PostgreSQL
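The reason: `now()` is a STABLE function, so its value is not known at plan time and vanilla PostgreSQL's plan-time constraint exclusion cannot compare it against chunk CHECK constraints. One common workaround outside TimescaleDB (which instead handles such predicates itself) is to have the application inline a literal bound; the timestamp below is illustrative:

```sql
-- Plan-time exclusion works once the bound is a literal constant
-- computed by the client, rather than a volatile/stable expression.
SELECT time, device_id, temp FROM data
WHERE time > '2018-04-19 00:00:00+00';
```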

SLIDE 37

Example Query Optimization

CREATE INDEX ON readings(time);

SELECT date_trunc('minute', time) AS bucket, avg(cpu)
FROM readings
GROUP BY bucket
ORDER BY bucket DESC
LIMIT 10;

Will this use the index?

SLIDE 38

Example Query Optimization

Timescale understands time

CREATE INDEX ON readings(time);

SELECT date_trunc('minute', time) AS bucket, avg(cpu)
FROM readings
GROUP BY bucket
ORDER BY bucket DESC
LIMIT 10;
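Plain PostgreSQL orders by the expression `date_trunc('minute', time)` rather than `time` itself, so it cannot simply walk the time index backwards and stop after 10 buckets; TimescaleDB's planner recognizes that `date_trunc` is monotonic in `time` and pushes the sort and limit down to the index. One hand-rewrite that helps in stock PostgreSQL is to bound the scan first (the 10-minute window is illustrative and assumes only recent buckets matter):

```sql
-- Bounding the scan lets the time index cut the work down to a
-- small recent window before grouping and sorting.
SELECT date_trunc('minute', time) AS bucket, avg(cpu)
FROM readings
WHERE time > now() - INTERVAL '10 minutes'
GROUP BY bucket
ORDER BY bucket DESC
LIMIT 10;
```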

SLIDE 39

How to find bottlenecks?

Flame Graph

(profile from 4/20/2018, 10 concurrent clients: ~91% of samples fall under postgres`standard_planner, dominated by plan-time relation size estimation: estimate_rel_size → RelationGetNumberOfBlocks → mdnblocks → FileSeek)
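Before reaching for a low-level profiler, PostgreSQL's own instrumentation catches many bottlenecks; a standard starting point (the query is the example from the earlier slide):

```sql
-- Per-plan-node timings plus shared-buffer hit/read counts
-- for a suspect query.
EXPLAIN (ANALYZE, BUFFERS)
SELECT date_trunc('minute', time) AS bucket, avg(cpu)
FROM readings
GROUP BY bucket
ORDER BY bucket DESC
LIMIT 10;
```

When EXPLAIN shows nothing unusual but wall-clock time is still high (as with the planner overhead in the flame graph above), a sampling profiler on the backend process is the next step.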

SLIDE 40

Insight #4: HA != horizontally distributed

SLIDE 41

Single Node + Replicas = High-availability

SLIDE 42

Replication is much cheaper than horizontal distribution

  • Avoids consensus penalty
  • Can more easily be asynchronous
  • Still provides HA
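TimescaleDB inherits PostgreSQL streaming replication, where the primary reports the state of each standby; a quick check, assuming a replica is attached (columns shown exist in PostgreSQL 9.6):

```sql
-- On the primary: one row per connected standby.
-- sync_state is 'async' unless synchronous_standby_names is set.
SELECT client_addr, state, sync_state
FROM pg_stat_replication;
```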
SLIDE 43

Results

SLIDE 44

Results vs Cassandra (5 nodes): Query

  Type                   Scanned (h)  Devices  Metrics  Timescale (ms)  Cassandra multiplier
  groupby-hour           24           all      1        27866.50        9.66
  groupby-hour           24           all      5        35827.60        37.52
  groupby-hour           24           all      10       44912.30        45.05
  high-cpu               24           all      1        49807           32.95
  cpu-max                12           8        10       126.90          11.84
  cpu-max                12           1        10       20.54           37.41
  high-cpu               24           1        1        56              18.65
  groupby-minute         12           1        1        33.80           0.91
  groupby-minute         1            8        1        23.50           1.97
  groupby-minute         12           1        5        27.60           10.25
  groupby-minute         1            8        5        20.40           17.90
  lastpoint              all          all      10       266.10          2285.30
  groupby-orderby-limit  all          all      1        75.20           3181.62

Higher multiplier indicates worse Cassandra performance.

SLIDE 45

TimescaleDB vs. Cassandra (5 nodes): Insert

  Cassandra:   150K metrics/s
  TimescaleDB: 745K metrics/s

SLIDE 46

Results vs Mongo: Query

  Type                   Scanned (h)  Devices  Metrics  Timescale (ms)  Mongo multiplier
  groupby-hour           24           all      1        29968           7.63
  groupby-hour           24           all      5        39157           5.51
  groupby-hour           24           all      10       49058           5.50
  high-cpu               24           all      1        51323           5.24
  cpu-max                12           8        10       260             1.14
  cpu-max                12           1        10       28              1.32
  high-cpu               24           1        1        95              1.17
  groupby-minute         12           1        1        62              0.77
  groupby-minute         1            8        1        28              1.95
  groupby-minute         12           1        5        69              0.94
  groupby-minute         1            8        5        26              2.17
  lastpoint              all          all      10       453             101.72
  groupby-orderby-limit  all          all      1        149             1667.25

Higher multiplier indicates worse Mongo performance.

SLIDE 47

TimescaleDB vs. Mongo: Insert

  Mongo:       807K metrics/s
  TimescaleDB: 994K metrics/s
SLIDE 48

Lessons for DB designers

  • Concentrate on scale-up
  • Consider (insights!):
      • Partitioning, even on a single node
      • Even across disks
  • Performance analysis:
      • High-level: query optimization
      • Low-level: profiling
  • High availability is possible on a single node
SLIDE 49

Lessons for DB users

  • Absolute performance is as important as scaling numbers.
  • Don't go horizontally distributed unless you have to.
  • HA is not the same as horizontal scalability.
  • Replication is cheaper than distribution.
  • SQL and ACID are extremely useful.
SLIDE 50

Open Source (Apache 2.0)

  • github.com/timescale/timescaledb

Join the Community

  • slack.timescale.com
SLIDE 51

Timescale is hiring!

  • Core Database Engineers
  • R&D Engineers
  • Solutions Engineers
  • Evangelists
  • Customer Success

careers.timescale.com