

SLIDE 1

The Future of Postgres Sharding

This presentation will cover the advantages of sharding and future Postgres sharding implementation requirements.

Creative Commons Attribution License http://momjian.us/presentations

Last updated: August, 2019

Bruce Momjian, Alexander Korotkov (2019 October 17)

SLIDE 2

Outline

  • 1. Scaling
  • 2. Vertical scaling options
  • 3. Non-sharding horizontal scaling
  • 4. Existing sharding options
  • 5. Built-in sharding accomplishments
  • 6. Future sharding requirements


SLIDE 3

1. Scaling

Database scaling is the ability to increase database throughput by utilizing additional resources such as I/O, memory, CPU, or additional computers. However, the high concurrency and write requirements of database servers make scaling a challenge. Sometimes scaling is only possible with multiple sessions, while other options require data model adjustments or server configuration changes.

Postgres Scaling Opportunities:
http://momjian.us/main/presentations/overview.html#scaling


SLIDE 4

2. Vertical Scaling

Vertical scaling can improve performance on a single server by:

▶ Increasing I/O with:
  ▶ faster storage
  ▶ tablespaces on storage devices
  ▶ striping (RAID 0) across storage devices
▶ Moving WAL to separate storage
▶ Adding memory to reduce read I/O requirements
▶ Adding more and faster CPUs
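Tablespaces, for example, let you place hot relations on faster devices. A minimal sketch (the path and index name are illustrative):

    -- Create a tablespace on a faster storage device (path is illustrative).
    CREATE TABLESPACE fast_ssd LOCATION '/mnt/ssd/pg_tblspc';

    -- Move a heavily used index onto the faster device.
    ALTER INDEX orders_pkey SET TABLESPACE fast_ssd;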


SLIDE 5

3. Non-Sharding Horizontal Scaling

Non-sharding horizontal scaling options include:

▶ Read scaling using Pgpool and streaming replication
▶ CPU/memory scaling with asynchronous multi-master

The entire data set is stored on each server.


SLIDE 6

Pgpool II With Streaming Replication

[Diagram: pgpool routes INSERT/UPDATE/DELETE to the master host and SELECTs to any host; streaming replication flows from the master to the slaves.]

  • Streaming replication avoids the problem of non-deterministic queries producing different results on different hosts.


SLIDE 7

Why Use Sharding?

▶ Only sharding can reduce I/O, by splitting data across servers
▶ Sharding benefits are only possible with a shardable workload
▶ The shard key should be one that evenly spreads the data
▶ Changing the sharding layout can cause downtime
▶ Additional hosts reduce reliability; additional standby servers might be required


SLIDE 8

Typical Sharding Criteria

▶ List
▶ Range
▶ Hash
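Postgres expresses all three criteria with declarative partitioning (list/range since 10, hash since 11). A minimal sketch, with illustrative table and column names:

    -- Hash sharding criterion expressed as declarative partitioning (PG 11+).
    CREATE TABLE events (
        id         bigint NOT NULL,
        region     text,
        created_at timestamptz
    ) PARTITION BY HASH (id);      -- or: PARTITION BY LIST (region)
                                   -- or: PARTITION BY RANGE (created_at)

    -- One of the hash partitions.
    CREATE TABLE events_p0 PARTITION OF events
        FOR VALUES WITH (MODULUS 4, REMAINDER 0);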


SLIDE 9

4. Existing Sharding Solutions

▶ Application-based sharding
▶ PL/Proxy
▶ Postgres-XC/XL
▶ Citus
▶ Hadoop

The data set is sharded (striped) across servers.


SLIDE 10

5. Built-in Sharding Accomplishments: Sharding Using Foreign Data Wrappers (FDW)

[Diagram: SQL queries arrive at a Postgres coordinator, which uses FDWs to send work to multiple foreign servers.]

https://wiki.postgresql.org/wiki/Built-in_Sharding
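A minimal sketch of this architecture using postgres_fdw with declarative partitioning (PG 11+); host, table, and column names are illustrative:

    CREATE EXTENSION postgres_fdw;

    -- One foreign server (plus a user mapping) per shard.
    CREATE SERVER shard1 FOREIGN DATA WRAPPER postgres_fdw
        OPTIONS (host 'shard1.example.com', dbname 'app');
    CREATE USER MAPPING FOR CURRENT_USER SERVER shard1
        OPTIONS (user 'app', password 'secret');

    -- Hash-partitioned parent table on the coordinator; each partition
    -- is a foreign table that lives on one of the shards.
    CREATE TABLE measurements (id bigint, payload jsonb)
        PARTITION BY HASH (id);
    CREATE FOREIGN TABLE measurements_p0
        PARTITION OF measurements FOR VALUES WITH (MODULUS 3, REMAINDER 0)
        SERVER shard1 OPTIONS (table_name 'measurements_p0');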


SLIDE 11

FDW Sort/Join/Aggregate Pushdown

[Diagram: the same FDW architecture; joins (9.6), sorts (9.6), and aggregates (11) are pushed down to the foreign servers.]


SLIDE 12

Advantages of FDW Sort/Join/Aggregate Pushdown

▶ Sort pushdown reduces CPU and memory overhead on the coordinator
▶ Join pushdown reduces coordinator join overhead, and reduces the number of rows transferred
▶ Aggregate pushdown causes summarized values to be passed back from the shards
▶ WHERE clause restrictions are also pushed down
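Pushdown can be observed with EXPLAIN (VERBOSE): postgres_fdw prints the query it will send to each shard in a Remote SQL line. A sketch against the illustrative schema above (plan output abridged and approximate):

    EXPLAIN (VERBOSE)
        SELECT sum(id) FROM measurements WHERE id < 1000;
    -- Each foreign partition appears as a Foreign Scan whose Remote SQL
    -- line carries the pushed-down WHERE clause, e.g.:
    --   Remote SQL: SELECT id FROM public.measurements_p0 WHERE ((id < 1000))
    -- With SET enable_partitionwise_aggregate = on (PG 11), the sum()
    -- itself can appear in the Remote SQL as well.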


SLIDE 13

Aggregate Pushdown in Postgres 11

[Diagram: aggregates, e.g., SUM(), are computed on the foreign servers and only summarized values return to the coordinator.]

Unfortunately, aggregates are currently evaluated one partition at a time, i.e., serially.


SLIDE 14

FDW DML Pushdown in Postgres 9.6 & 11

[Diagram: INSERT, UPDATE, DELETE, and COPY are pushed through the FDW to the foreign servers.]
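postgres_fdw can also execute qualifying modifications entirely on the remote server (direct modification, 9.6+), visible in EXPLAIN as Foreign Update/Delete nodes. A sketch against the illustrative schema above (plan output abridged and approximate):

    EXPLAIN (VERBOSE)
        UPDATE measurements SET payload = NULL WHERE id = 42;
    -- A pushed-down modification appears as, e.g. (one per affected partition):
    --   Foreign Update on public.measurements_p0
    --     Remote SQL: UPDATE public.measurements_p0
    --                 SET payload = NULL WHERE ((id = 42))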


SLIDE 15

6. Future Sharding Requirements: Parallel Shard Access

[Diagram: the FDW architecture with parallel access to all of the shards.]

Parallel shard access is waiting for an executor rewrite, which is necessary for these improvements.


SLIDE 16

Advantages of Parallel Shard Access

▶ Can use libpq's asynchronous API to issue multiple pending queries
▶ Ideal for queries that must run on every shard, e.g.:
  ▶ restrictions on static tables
  ▶ queries with no sharded-key reference
  ▶ queries with multiple sharded-key references
▶ Parallel aggregation across shards


SLIDE 17

Joins With Replicated Tables

[Diagram: SQL queries with joins to static data and static-data restrictions pass through the FDW; the static (lookup) tables are replicated ("repl.") to every foreign server.]

SLIDE 18

Implementing Joins With Replicated Tables

Joins with replicated tables allow join pushdown where the query restriction is on the replicated (lookup) table and not on the sharded column. Tables can be replicated to shards using logical replication. The optimizer must be able to adjust join pushdown based on which tables are replicated on the shards.
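A minimal sketch of replicating a lookup table to a shard with built-in logical replication (PG 10+); names and the connection string are illustrative:

    -- On the coordinator (publisher):
    CREATE PUBLICATION lookup_pub FOR TABLE country_codes;

    -- On each shard (subscriber), given a matching table definition:
    CREATE SUBSCRIPTION lookup_sub
        CONNECTION 'host=coordinator.example.com dbname=app'
        PUBLICATION lookup_pub;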


SLIDE 19

shardman

https://github.com/postgrespro/shardman

▶ Automation of sharding using partitioning + FDW.
▶ Redundancy and automatic failover using streaming replication.
▶ Distributed transactions using 2PC.
▶ Distributed visibility.
▶ Distributed query planning/execution.


SLIDE 20

Redundancy in shardman

Streaming replication.

[Diagram: four servers, each hosting one primary instance plus streaming replicas of two other instances, e.g., Server 1 holds Instance 1 and replicas of Instances 2 and 4.]


SLIDE 21

Move to logical replication

▶ Logical pg_rewind
▶ Parallel decoding & parallel apply
▶ Logical decoding of 2PC
▶ Online streaming of large transactions
▶ High availability (Raft?)
▶ DDL support


SLIDE 22

Distributed visibility in shardman

▶ Based on the Clock-SI scheme.¹
▶ Needs PostgreSQL core patching; ideally needs CSN.
▶ Good scalability: only the nodes involved in a transaction are involved in snapshot management.
▶ Local transactions run locally.
▶ No dedicated service is required.
▶ Short lock for some readers during distributed CSN coordination.

¹ Clock-SI: Snapshot isolation for partitioned data stores using loosely synchronized clocks


SLIDE 23

Visibility at work

[Diagram: a distributed transaction across Instances 1-3: BEGIN and some work on each instance, then PREPARE, PRECOMMIT, and COMMIT in lock-step.]

▶ The PRECOMMIT stage marks transactions in-doubt and calculates the distributed CSN.
▶ It blocks readers ONLY IF they access data modified by a currently in-doubt transaction and acquired their snapshot after it entered the in-doubt stage.


SLIDE 24

Distributed visibility in PostgreSQL Core: wrong way

▶ A C API (set of hooks) that allows overriding low-level visibility-related functions.
▶ A pg_tsdtm extension that implements CSN and Clock-SI on top of that.


SLIDE 25

Distributed visibility in PostgreSQL Core: right way

▶ CSN snapshots in PostgreSQL Core.
▶ A C API for managing transaction CSNs.
▶ A proper Clock-SI implementation (as an extension, or in Core?)


SLIDE 26

How does shardman plan/execute distributed (OLAP) queries?


SLIDE 27

Distributed planning step 1: local plan

[Diagram: the local plan: a Join over two Appends, one over Scan A1 and Scan A2, the other over Scan B1 and Scan B2.]

SLIDE 28

Distributed planning step 2: add distributed nodes

[Diagram: the same plan with Shuffle nodes inserted above the Appends and a Gather node at the top, marking the distributed-execution boundary.]

SLIDE 29

Distributed planning step 3: spread plans across the nodes

[Diagram: the distributed plan is duplicated on Node 1 and Node 2; the copies communicate through their Shuffle nodes and feed one Gather.]

SLIDE 30

Stages of distributed query execution

  • 1. Prepare the distributed query plan at the coordinator node
  • 2. Serialize the plan portably; collect the list of foreign servers
  • 3. At the start of query execution, pass the plan to each foreign server over its FDW connection
  • 4. Localize the plan: walk the scan nodes and remove the unneeded ones
  • 5. Execute the plan

▶ Steps 1-3 run on the coordinator node
▶ Steps 3-4 run on every involved node


SLIDE 31

How does distributed planning/execution integrate with PostgreSQL?

▶ Planner hooks: set_rel_pathlist_hook, set_join_pathlist_hook
▶ A custom plan node: ExchangePlanNode
▶ Portable plan serialization/deserialization


SLIDE 32

set_rel_pathlist_hook

[Diagram: the basic path (Gather over an Append of Scan A1 and Scan A2) next to the Shardman path (an Append of local scans and Foreign scans).]


SLIDE 33

set_join_pathlist_hook

[Diagram: each join input is an Append under a Gather, Shuffle, or Broadcast node; the inputs are combined by HashJoin or NestedLoopJoin.]


SLIDE 34

ExchangePlanNode

▶ Computes the destination instance for each incoming tuple
▶ Transfers the tuple to the corresponding EXCHANGE node at that instance
▶ If the destination is itself, transfers the tuple up the plan tree
▶ Any distributed plan has an EXCHANGE node in gather mode at the top of the plan: collect all results at the coordinator node

Modes:
▶ Shuffle: transfer each tuple according to the distribution function
▶ Gather: gather all tuples at one node
▶ Broadcast: transfer each tuple to every node (including itself)


SLIDE 35

Portable plan serialization/deserialization

▶ Patch the nodeToString(), stringToNode() code.
▶ Serialization replaces OIDs with object names.
▶ Deserialization maps object names back to OIDs.
▶ pg_exec_plan(plan text) deserializes, localizes, and launches execution of the plan.
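A hypothetical invocation of the pg_exec_plan() function named above; its argument is the portable nodeToString() text of the coordinator's plan (contents elided here):

    -- On a worker node: deserialize, localize, and run the shipped plan.
    SELECT pg_exec_plan('{PLANNEDSTMT ...}');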


SLIDE 36

PostgreSQL core modifications

▶ Patch the nodeToString(), stringToNode() code.
▶ Change the partitioning code in the planner: the partitioning of a joinrel can change according to the chosen path (maybe move the partitioning-related fields from RelOptInfo to the Path structure?)


SLIDE 37

Distributed planning/execution status

▶ WIP
▶ Needs PostgreSQL core patches.
▶ HashJoin, NestedLoopJoin, and HashAgg are implemented; MergeJoin and GroupAgg are on the TODO list.
▶ Observed up to a 5x improvement over plain FDW on a 4-node cluster (async execution!).
▶ https://github.com/postgrespro/shardman: go try it.


SLIDE 38

Conclusion

Features to push into PostgreSQL Core:

▶ CSN
▶ A distributed-visibility C API (CSN-based)
▶ Portable serialization/deserialization for nodeToString(), stringToNode()
▶ Planner improvements for partitioning
▶ Logical replication improvements (a lot of them)


SLIDE 39

Conclusion

http://momjian.us/presentations

Image credit: https://www.flickr.com/photos/anotherpintplease/