PostgresOpen 12 September 2019 1 Speaker Julien Riou DBA since - - PowerPoint PPT Presentation

postgresopen 12 september 2019
SMART_READER_LITE
LIVE PREVIEW

PostgresOpen 12 September 2019 1 Speaker Julien Riou DBA since - - PowerPoint PPT Presentation

https://knowyourmeme.com/memes/all-the-things PostgresOpen 12 September 2019 1 Speaker Julien Riou DBA since 2012 Tech lead in the databases team at OVH since 2015 pgterminate @ github Speaker PostgresOpen 12 September


slide-1
SLIDE 1

1

PostgresOpen – 12 September 2019

https://knowyourmeme.com/memes/all-the-things

slide-2
SLIDE 2

2

Speaker

  • Julien Riou
  • DBA since 2012
  • Tech lead in the databases team at OVH since 2015
  • pgterminate @ github

Speaker PostgresOpen – 12 September 2019

slide-3
SLIDE 3

3

Overview

  • Definitions
  • Context
  • Updates
  • Upgrades
  • Conclusion
  • What’s next?

Overview PostgresOpen – 12 September 2019

slide-4
SLIDE 4

4

Definitions

PostgresOpen – 12 September 2019

slide-5
SLIDE 5

5

Versioning policy

  • Starting from version 10

11.4

Major version Minor version

Definitions PostgresOpen – 12 September 2019

slide-6
SLIDE 6

6

Versioning policy

  • Before version 10

9.6.14

Major version Minor version

Definitions PostgresOpen – 12 September 2019

slide-7
SLIDE 7

7

Versioning policy

  • Major versions

– Released about once a year – Includes new features – Supported for 5 years

Definitions PostgresOpen – 12 September 2019

  • Minor versions

– Released at least every 3 months – Includes bug and security fixes – Critical fixes are released as soon as possible

slide-8
SLIDE 8

8

Definitions

Update

Installing a newer minor version of PostgreSQL

Definitions PostgresOpen – 12 September 2019

”Minor upgrade” accepted too

slide-9
SLIDE 9

9

Definitions

Upgrade

Installing a newer major version of PostgreSQL

Definitions PostgresOpen – 12 September 2019

”Major upgrade” accepted too

slide-10
SLIDE 10

10

Context

PostgresOpen – 12 September 2019

slide-11
SLIDE 11

11

Context PostgresOpen – 12 September 2019

slide-12
SLIDE 12

12

Products

Cloud

Baremetal VPS Public cloud Private cloud Storage

Web hosting

Domain names Website hosting E-mail solutions SSL / CDN Office & Microsoft solutions

Platform

Kubernetes Logs & Metrics Data Platforms Databases Big data AI & Machine Learning

Telecom

Internet offers Telephony SMS / Fax VDI OverTheBox Context PostgresOpen – 12 September 2019

slide-13
SLIDE 13

13

Perimeter

Internal databases

60 3000 700 400

Clusters Applications Users Databases

Context PostgresOpen – 12 September 2019

slide-14
SLIDE 14

14

Cluster example

  • MySQL
  • PostgreSQL

PostgresOpen – 12 September 2019 Context

slide-15
SLIDE 15

15

Updates

PostgresOpen – 12 September 2019

slide-16
SLIDE 16

16

Recommendations

  • “We always recommend that all users run the latest available minor release for whatever

major version is in use.”

  • “For minor releases, the community considers not upgrading to be riskier than upgrading.”

https://www.postgresql.org/support/versioning/

Updates

PostgresOpen – 12 September 2019

slide-17
SLIDE 17

17

Updates

PostgresOpen – 12 September 2019

https://knowyourmeme.com/memes/all-the-things

slide-18
SLIDE 18

18

Method

1. Stop the service 2. Install new binaries 3. Start the service

Updates

PostgresOpen – 12 September 2019

slide-19
SLIDE 19

19

Attention points

  • Always read the changelog
  • Downtime

– Can be minimized by using pgbouncer and PAUSE/RESUME commands

  • Write intensive clusters

– Run CHECKPOINT before stopping the service – Prepare for a switchover for extreme case

  • Patroni

– Put the cluster on maintenance mode to avoid failovers

Updates

PostgresOpen – 12 September 2019

slide-20
SLIDE 20

20

Initial state

Updates

PostgresOpen – 12 September 2019

slide-21
SLIDE 21

21

Clustershell

  • “Event-driven open source Python library, designed to run local or distant commands in

parallel on server farms or on large Linux clusters” http://cea-hpc.github.io/clustershell/

  • Binaries

– clush – nodeset

  • Python API

Updates

PostgresOpen – 12 September 2019

slide-22
SLIDE 22

22

Clustershell

  • nodeset

Updates

PostgresOpen – 12 September 2019

$ nodeset -ll @all node[1-6] @cluster1 node[1-3] @cluster2 node[4-6] @node node[1-2,4-5] @backup node[3,6]

slide-23
SLIDE 23

23

Clustershell

  • clush

Updates

PostgresOpen – 12 September 2019

$ clush -bw @all $ clush -bw @cluster1\&@backup $ clush -bw @cluster1,@cluster2

slide-24
SLIDE 24

24

Clustershell

  • clush

Updates

PostgresOpen – 12 September 2019

clush> apt-get update Clush> apt-get upgrade

slide-25
SLIDE 25

25

Clustershell

  • Backups first

Updates

PostgresOpen – 12 September 2019

$ clush -bw @backup

slide-26
SLIDE 26

26

Clustershell

  • Backups first

Updates

PostgresOpen – 12 September 2019

$ clush -bw @backup

slide-27
SLIDE 27

27

Clustershell

  • Backups first

Updates

PostgresOpen – 12 September 2019

$ clush -bw @backup

slide-28
SLIDE 28

28

Clustershell

  • Then nodes one node at a time (fanout)

Updates

PostgresOpen – 12 September 2019

$ clush -f 1 -bw @node

slide-29
SLIDE 29

29

Clustershell

  • Then nodes one node at a time (fanout)

Updates

PostgresOpen – 12 September 2019

$ clush -f 1 -bw @node

slide-30
SLIDE 30

30

Clustershell

  • Then nodes one node at a time (fanout)

Updates

PostgresOpen – 12 September 2019

$ clush -f 1 -bw @node

slide-31
SLIDE 31

31

Clustershell

  • Then nodes one node at a time (fanout)

Updates

PostgresOpen – 12 September 2019

$ clush -f 1 -bw @node

slide-32
SLIDE 32

32

Clustershell

  • Then nodes one node at a time (fanout)

Updates

PostgresOpen – 12 September 2019

$ clush -f 1 -bw @node

slide-33
SLIDE 33

33

Clustershell

  • Then nodes one node at a time (fanout)

Updates

PostgresOpen – 12 September 2019

$ clush -f 1 -bw @node

slide-34
SLIDE 34

34

Clustershell

  • Then nodes one node at a time (fanout)

Updates

PostgresOpen – 12 September 2019

$ clush -f 1 -bw @node

slide-35
SLIDE 35

35

Clustershell

  • Then nodes one node at a time (fanout)

Updates

PostgresOpen – 12 September 2019

$ clush -f 1 -bw @node

slide-36
SLIDE 36

36

Clustershell

  • Then nodes one node at a time (fanout)

Updates

PostgresOpen – 12 September 2019

$ clush -f 1 -bw @node

slide-37
SLIDE 37

37

Final state

Updates

PostgresOpen – 12 September 2019

slide-38
SLIDE 38

38

Limitations

  • clush is great for one-shot human simple operations
  • Requires development investment to implement complex automation
  • At our scale, we use our own automation system

– Mostly open: PostgreSQL, Flask, Ansible, Celery, … – And some internal systems

Updates

PostgresOpen – 12 September 2019

slide-39
SLIDE 39

39

Upgrades

PostgresOpen – 12 September 2019

slide-40
SLIDE 40

40

Why?

  • Support

– Limited to 5 years

  • Better performance

– Parallelism – Optimizations

Upgrades

PostgresOpen – 12 September 2019

  • New features

– Materialized views – JSON – Logical decoding – Upsert – SCRAM – And more…

slide-41
SLIDE 41

41

Upgrades

PostgresOpen – 12 September 2019

https://knowyourmeme.com/memes/all-the-things

slide-42
SLIDE 42

42

Method

Upgrades

PostgresOpen – 12 September 2019

slide-43
SLIDE 43

43

Method

Upgrades

PostgresOpen – 12 September 2019

slide-44
SLIDE 44

44

Method

Upgrades

PostgresOpen – 12 September 2019

slide-45
SLIDE 45

45

Method

Upgrades

PostgresOpen – 12 September 2019

slide-46
SLIDE 46

46

Method

Upgrades

PostgresOpen – 12 September 2019

slide-47
SLIDE 47

47

Method

Upgrades

PostgresOpen – 12 September 2019

slide-48
SLIDE 48

48

“Replication” methods

  • Application
  • pg_dump / pg_restore
  • pg_upgrade
  • Logical replication with pglogical

Upgrades

PostgresOpen – 12 September 2019

slide-49
SLIDE 49

49

“Replication” methods

Upgrades

PostgresOpen – 12 September 2019

Application

slide-50
SLIDE 50

50

Application

1. Write objects to both clusters 2. Copy old objects to new cluster 3. Switchover

Upgrades

PostgresOpen – 12 September 2019

slide-51
SLIDE 51

51

Application

Pros Cons

Developers are autonomous Different object management for too much teams No downtime Requires a single endpoint or inconsistencies RDBMS independent Not a developer priority

Upgrades

PostgresOpen – 12 September 2019

slide-52
SLIDE 52

52

  • Conclusion

https://github.com/googlefonts/noto-emoji/blob/master/svg/emoji_u1f44e.svg

Upgrades

PostgresOpen – 12 September 2019

Application

slide-53
SLIDE 53

53

“Replication” methods

Upgrades

PostgresOpen – 12 September 2019

pg_dump / pg_restore

slide-54
SLIDE 54

54

pg_dump / pg_restore

1. Set old cluster to read-only mode 2. Dump old cluster with pg_dump 3. Restore on new cluster with pg_restore 4. Switchover

Upgrades

PostgresOpen – 12 September 2019

slide-55
SLIDE 55

55

pg_dump / pg_restore

Pros Cons

DBA team is autonomous Extended period of downtime for large databases Easy to setup Wipe table and index bloat

PostgresOpen – 12 September 2019

Upgrades

slide-56
SLIDE 56

56

  • Conclusion

https://github.com/googlefonts/noto-emoji/blob/master/svg/emoji_u1f44d.svg

Upgrades

PostgresOpen – 12 September 2019

pg_dump / pg_restore

slide-57
SLIDE 57

57

“Replication” methods

Upgrades

PostgresOpen – 12 September 2019

pg_upgrade

slide-58
SLIDE 58

58

pg_upgrade

1. Install both versions on new cluster 2. Setup streaming replication from old cluster to new cluster 3. Set old cluster to read-only mode 4. Run pg_upgrade on new cluster with hardlinks 5. Update statistics in stage on new cluster 6. Switchover

Upgrades

PostgresOpen – 12 September 2019

slide-59
SLIDE 59

59

pg_upgrade

Upgrades

PostgresOpen – 12 September 2019

slide-60
SLIDE 60

60

pg_upgrade

Upgrades

PostgresOpen – 12 September 2019

slide-61
SLIDE 61

61

pg_upgrade

Upgrades

PostgresOpen – 12 September 2019

slide-62
SLIDE 62

62

pg_upgrade

Pros Cons

DBA team is autonomous Requires multiple versions of binaries on the same host Very short downtime Rebuild streaming replication to have up-to-date data Easy to setup (the first time)

PostgresOpen – 12 September 2019

Upgrades

slide-63
SLIDE 63

63

A word on statistics

Upgrades

PostgresOpen – 12 September 2019

slide-64
SLIDE 64

64

A word on statistics

Upgrades

PostgresOpen – 12 September 2019

https://knowyourmeme.com/memes/reality-hits-you-hard-bro

slide-65
SLIDE 65

65

A word on statistics

  • vacuumdb to the rescue

Upgrades

PostgresOpen – 12 September 2019

$ vacuumdb --all --analyze-in-stages -j 10

slide-66
SLIDE 66

66

  • Conclusion

https://github.com/googlefonts/noto-emoji/blob/master/svg/emoji_u1f44d.svg

Upgrades

PostgresOpen – 12 September 2019

pg_upgrade

slide-67
SLIDE 67

67

“Replication” methods

Upgrades

PostgresOpen – 12 September 2019

Logical replication with pglogical

slide-68
SLIDE 68

68

Logical replication with pglogical

  • Requires version 9.4+
  • Logical replication

– Doesn’t replicate DDL – Doesn’t replicate sequences

  • pglogical additional functions

– pglogical.replicate_ddl_command(command text, replication_sets text[]) – pglogical.synchronize_sequence(relation regclass)

Upgrades

PostgresOpen – 12 September 2019

slide-69
SLIDE 69

69

Logical replication with pglogical

1. Setup provider (install extensions, setup node, setup replication set) 2. Dump schema on provider 3. Restore schema on subscriber 4. Setup subscriber (install extensions, setup node, setup subscription) 5. Wait for subscriptions to be in sync 6. Set old cluster to read-only mode 7. Synchronize sequences 8. Switchover

Upgrades

PostgresOpen – 12 September 2019

slide-70
SLIDE 70

70

Logical replication with pglogical

Pros Cons

DBA team is autonomous Complex setup Very short downtime Hard to debug (some logs are too generic) Database precision Objects in the database (secrets included) High level of locks required Encoding must be the same Provider can fail and take down production

PostgresOpen – 12 September 2019

Upgrades

slide-71
SLIDE 71

71

Logical replication with pglogical

  • Deadlocks

Upgrades

PostgresOpen – 12 September 2019

ERROR: deadlock detected at character 237 DETAIL: Process 16477 waits for AccessShareLock on relation 17241 of database 17032; blocked by process 17333. Process 17333 waits for AccessExclusiveLock on relation 4920800 of database 17032; blocked by process 16477. Process 16477: <application query> Process 17333: SELECT pglogical.replication_set_add_all_tables('default', ARRAY['public']); HINT: See server log for query details. STATEMENT: <application query>

slide-72
SLIDE 72

72

Logical replication with pglogical

Upgrades

PostgresOpen – 12 September 2019

  • Sequences

ERROR: duplicate key value violates unique constraint "table_pkey"

slide-73
SLIDE 73

73

Logical replication with pglogical

Upgrades

PostgresOpen – 12 September 2019

ERROR: encoding conversion for binary datum not supported yet DETAIL: expected_encoding UTF8 must be unset or match server_encoding SQL_ASCII CONTEXT: slot "pgl_<slotname>", output plugin "pglogical_output", in the startup callback LOG: could not receive data from client: Connection reset by peer

4.13 Database encoding differences PGLogical does not support replication between databases with different encoding. We recommend using UTF-8 encoding in all replicated databases.

  • Encoding must be the same
  • Crystal clear in the documentation

https://www.2ndquadrant.com/fr/ressources/pglogical/documentation/

slide-74
SLIDE 74

74

Logical replication with pglogical

  • Avoid explicit locks
  • Use UTF-8 encoding
  • Use latest pglogical commercial version and support open source
  • Or fallback to another solution

– built-in logical replication – pg_upgrade – pg_dump and pg_restore

Upgrades

PostgresOpen – 12 September 2019

slide-75
SLIDE 75

75

Logical replication with pglogical

  • Conclusion

https://github.com/googlefonts/noto-emoji/blob/master/svg/emoji_u1f44d.svg

Upgrades

PostgresOpen – 12 September 2019

slide-76
SLIDE 76

76

2015

PostgreSQL 9.2 Debian Wheezy Application pg_dump pg_restore

2017

PostgreSQL 9.4 Debian Jessie Logical Decoding

Timeline

pg_upgrade

Upgrades

PostgresOpen – 12 September 2019

slide-77
SLIDE 77

77

2017

PostgreSQL 9.4 Debian Jessie Standalone pglogical pg_dump pg_restore pg_upgrade

2019

PostgreSQL 9.6 Debian Stretch Patroni

Timeline

Upgrades

PostgresOpen – 12 September 2019

slide-78
SLIDE 78

78

Conclusion

PostgresOpen – 12 September 2019

slide-79
SLIDE 79

79

Conclusion

Better stability New features Always secure Better performance

Conclusion

PostgresOpen – 12 September 2019

slide-80
SLIDE 80

80

What’s next?

PostgresOpen – 12 September 2019

slide-81
SLIDE 81

81

Next

  • Upgrade to PostgreSQL 12
  • Upgrade to Debian 10
  • Migrate from MySQL to PostgreSQL
  • Automate, automate, automate!

Next

PostgresOpen – 12 September 2019

slide-82
SLIDE 82

82

Extra

PostgresOpen – 12 September 2019

slide-83
SLIDE 83

83

Reversibility Highly available Elastic Performance Managed Eat your own food

Enterprise Cloud Databases

Extra

PostgresOpen – 12 September 2019

https://labs.ovh.com/ha-database

slide-84
SLIDE 84

84

We are hiring!

  • Opensource Database Engineers
  • Site Reliability Engineers (Private Cloud, Openstack, DNS, Deploy, Observability)
  • Software Engineers (containers, baremetal, web hosting)
  • Backend Developers (Python, Go)
  • And more

Extra

PostgresOpen – 12 September 2019

slide-85
SLIDE 85

85

Questions

PostgresOpen – 12 September 2019