split your database
play

Split your database Nicolai Plum Booking.com Database Engineering - PowerPoint PPT Presentation

Split your database Nicolai Plum Booking.com Database Engineering Before 2 After 3 Why? Size Query time, query latency conflicting workloads Business or architecture reasons Regulatory compliance Easier PCI DSS,


  1. Split your database Nicolai Plum Booking.com Database Engineering

  2. Before 2

  3. After 3

  4. Why? ● Size ● Query time, query latency ● … conflicting workloads ● Business or architecture reasons ● Regulatory compliance ● Easier PCI DSS, GDPR compliance, PII... 4

  5. Choosing tables to split out ● Minimise joins ● Maximise gains ● Split out biggest, fastest-growing, busiest, most- regulated, … ● Size and activity: ps-top ● https://github.com/sjmudd/ps-top ● Design requirements, current system structure ● Based on code analysis, application knowledge 5

  6. ps-top ps-top 0.7.5 - 18:44:30 db-101 / 5.7.19-log, up 31d 9h 39m 2 Table (table_io_waits_summary_by_table) 44 rows Ops %| Fetch Insert Update Delete|Table Name 13.49 k 50.6%| 99.8% 0.1% 0.1%|bdb.Inventory 6.97 k 26.1%| 2.6% 94.8% 2.5% 0.1%|bdb.Catalogue 1.36 k 5.1%| 50.9% 24.5% 24.5%|bdb.Transaction 963 3.5%| 66.7% 33.3% |bdb.Delivery 597 2.2%| 51.1% 25.3% 23.6% |bdb.Customer 396 1.4%| 4.5% 94.4% 1.0%|bdb.Reviews 6

  7. Application and code ● Application queries must go to different master for some tables after split ● Code change ● Proxy magic (untested!) ● (almost certainly) Need to change your code ● Try to separate read and write database handles in code ● Try to use application feature switch, A/B experiment framework, or other runtime new-old switch mechanism ● Implement in your test environment first 7

  8. Joins & code changes ● Writes: ● Scan (statement) binlog ● Preferably get rid of update clauses with joins altogether ● Code analysis ● Reads ● Code analysis ● Query sampling ● Performance_Schema, proxies ● Perfection is difficult, you may have fixes after split 8

  9. Foreign keys, triggers, stored procedures ● Foreign Keys between the schemas will not work ● Remove them and refactor code ● Move relational integrity checks and enforcement to application layers ● Reflect upon your life and give up using FKs ● Check stored procedures and triggers for use of both schemas ● Implement triggers and stored procedures on product schema as necessary ● No Percona Online Schema change during split 9

  10. Replication topology 10

  11. Translation Master ● Schema translation ● replicate-rewrite-db=’Sales->Product' ● Replication filter ● replicate-wild-do-table = Product.Catalogue ● replicate-wild-do-table = Product.Inventory ● replicate-wild-do-table = Product.Reviews 11

  12. Schema name translation 12

  13. Scheme name translation 13

  14. Preparing Translation master ● Make a slave, with binlog enabled ● Shut down MySQL ● Configuration: Add ● skip-slave-start ● replicate-rewrite-db=’sales->product’ ● replicate-wild-do-table… 14

  15. Preparing Translation master ● Create new schema ( product ) ● Rename all tables that are being split to the new schema ● Drop old schema ( sales ) from translation master ● Start replication slave ● Remove skip-slave-start ● If replication stops with missing table errors, find the bad query, fix up (skip) or make a new translation master again 15

  16. Preparing new replication chain ● Clone the translation master to new ( product ) master ● Remove the database translate and replication filter statements ● Clone more slave databases as needed 16

  17. Client connections ● Sales clients ● Product clients ● Write connection to ● Write connection to sales master sales master ● Read connection to ● Read connetion to sales slaves product slaves 17

  18. Clients 18

  19. So, how do clients see it? ● Sales clients ● Product clients ● Write sales schema in ● Write sales schema in sales master sales master ● Read sales schema in ● Read product schema sales slaves in product slaves 19

  20. Better client access ● Create a view on the sales master of the tables in a product schema ● Simple view is updatable, so UPDATE/ INSERT/DELETE succeed ● product client code can be changed to use new schema name before split ● Change product clients to use product on sales master before split 20

  21. Schemas with views 21

  22. Creating views ● Sys schema tool use sys; call create_synonym_db(’sales', ’product'); ● Cannot handle views in sales ● View of view is not possible ● product views must be built on the same real sales tables as sales views ● Drop views of tables not moving ● Or create database product then create view… 22

  23. Views on slaves also ● Updates are executed properly in SBR ● Slave can be promoted to master ● Using Orchestrator or similar 23

  24. So, how do clients see it now? ● Sales clients ● Product clients ● Write sales schema in ● Write product schema sales master in sales master ● Read sales schema in ● Read product schema sales slaves in product slaves 24

  25. Decreased redundancy ● Failover of the translation master is hard ● Keep the intermediate stage short-lived ● Probably best to prepare to repoint slaves in a hurry if translation master dies ● Keep sufficient capacity in parent chain for total load 25

  26. Grants ● MySQL users of the sales tables will need privileges on product tables on the sales master ● Separate grants for sales and product makes the split easier and safer ● Grants (mysql schema) do not replicate through translation master ● product master will need matching grants 26

  27. Client database server configuration ● Need a way to change server name quickly ● DNS CNAME or A record ● Preferably without application restart 27

  28. Client access while split is ongoing ? 28

  29. Choice ● Disconnect ALL ● Allow some product product clients from clients to connect to sales master product master while BEFORE connecting others still use sales to product master master ● More downtime ● Less downtime ● More data integrity ● Less data integrity 29

  30. Choice ● Disconnect first ● Disconnect later ● No chance of conflicting ● Some chance of transactions conflicting transactions ● Auto-increment counter ● Fixed or external counter columns and id sources ● Complex less- ● More-understood understood application application ● Low risk appetite ● Higher risk appetite ● Lower cost of downtime ● High cost of downtime 30

  31. Disconnect first ● Disable product client access to sales master – use separate user grants if possible, lock the user ● Kill all connections from product clients ● Wait for translation master replication pointer to catch up past the point where product clients were blocked ● Stop replication on the translation master ● Change product client configuration to use product master ● (going back is now hard) 31

  32. Disconnect first 32

  33. Disconnect first 33

  34. Disconnect first 34

  35. Disconnect later ● Change product client configuration to use product master ● Wait for product clients to make new connections to product master ● Disable product client access to sales master – use separate user grants if possible, lock the user ● Kill all connections from product clients ● Wait for translation master replication pointer to catch up ● Stop replication on the translation master ● (going back is now hard) 35

  36. Disconnect later 36

  37. Disconnect later 37

  38. Disconnect later 38

  39. After split 39

  40. Cleanup ● Remove translation master ● Drop views and grants for product on the sales master ● Rename, then drop tables from the sales master 40

  41. Final state 41

  42. ? nicolai.plum@booking.com 42

Download Presentation
Download Policy: The content available on the website is offered to you 'AS IS' for your personal information and use only. It cannot be commercialized, licensed, or distributed on other websites without prior consent from the author. To download a presentation, simply click this link. If you encounter any difficulties during the download process, it's possible that the publisher has removed the file from their server.

Recommend


More recommend