Managing Data and Operation Distribution in MongoDB
Antonios Giannopoulos and Jason Terpko
DBAs @ Rackspace/ObjectRocket
linkedin.com/in/antonis/ | linkedin.com/in/jterpko/
Introduction
Antonios Giannopoulos
Jason Terpko
www.objectrocket.com
Overview
• Sharded Cluster
• Shard Key Selection
• Shard Key Operations
• Chunk Management
• Data Distribution
• Orphaned Documents
• Q&A
Sharded Cluster
• Cluster Metadata
• Data Layer
• Query Routing
• Cluster Communication
Cluster Metadata
Data Layer (diagram: shards s1, s2, …, sN)
Replication
Data redundancy relies on an idempotent log of operations (the oplog).
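To illustrate why idempotency matters, here is a small sketch (an assumption for illustration, not MongoDB's actual implementation): MongoDB records the *result* of an update in the oplog (e.g. a `$set` of the new value) rather than the original operation (e.g. a `$inc`), so replaying an entry more than once converges to the same state.

```javascript
// A $inc is NOT idempotent: applying it twice changes the result.
function applyInc(doc, field, by) {
  return { ...doc, [field]: (doc[field] || 0) + by };
}

// The oplog instead records the outcome as a $set, which IS idempotent.
function applySet(doc, field, value) {
  return { ...doc, [field]: value };
}

let doc = { _id: 1, count: 5 };
const afterInc = applyInc(doc, "count", 1);                 // { count: 6 }
const oplogEntry = { op: "u", set: { count: afterInc.count } };

// Replaying the logged $set any number of times leaves the same state.
let replayed = applySet(doc, "count", oplogEntry.set.count);
replayed = applySet(replayed, "count", oplogEntry.set.count);
```

Replaying the `$inc` twice would yield 7; replaying the logged `$set` twice still yields 6, which is what makes recovery and initial sync safe.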
Query Routing (diagram: mongos routing queries to shards s1, s2, …, sN)
Sharded Cluster (diagram: shards s1, s2, …, sN)
Cluster Communication
How do independent components become a cluster and communicate?
● Replica Set
○ Replica Set Monitor
○ Replica Set Configuration
○ NetworkInterfaceASIO-Replication / NetworkInterfaceASIO-ShardRegistry
○ Misc: replSetName, keyFile, clusterRole
● Mongos Configuration
○ configDB parameter
○ NetworkInterfaceASIO-ShardRegistry
○ Replica Set Monitor
○ Task Executor
● Post Add Shard
○ Collection config.shards
○ Replica Set Monitor
○ Task Executor Pool
○ config.system.sessions
Primary Shard (diagram: database <foo> resides on one of shards s1, s2, …, sN)
Collection UUID
With featureCompatibilityVersion 3.6 all collections are assigned an immutable UUID, recorded in config.collections in the cluster metadata and on the data layer (mongod).
Important
• UUIDs for a namespace must match across the cluster
• Use 4.0+ tools for a sharded cluster restore
Shard Key - Selection
• Profiling
• Identify shard key candidates
• Pick a shard key
• Challenges
Sharding
Shards are physical partitions; chunks are logical partitions.
(diagram: database <foo>, collection <foo> divided into chunks across shards s1, s2, …, sN)
What is a Chunk?
Chunks are the logical partitions your collection is divided into; the mission of the shard key is to create chunks and determine how data is distributed across the cluster.
● Maximum size is defined in config.settings
○ Default 64MB
● Before 3.4.11: hardcoded maximum of 250,000 documents per chunk
● Version 3.4.11 and higher: 1.3 × the configured chunk size divided by the average document size
● Chunk map is stored in config.chunks
○ Continuous range from MinKey to MaxKey
● Chunk map is cached at both the mongos and mongod
○ Query Routing
○ Sharding Filter
● Chunks distributed by the Balancer
○ Using moveChunk
○ Up to maxSize
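The document-count limit above can be sketched as a small helper (a simplification for illustration, not the actual server code):

```javascript
// Maximum documents per chunk, per the rules on this slide:
// before 3.4.11 a hardcoded 250,000; from 3.4.11 onward,
// 1.3 x the configured chunk size divided by the average document size.
function maxDocsPerChunk(chunkSizeBytes, avgDocSizeBytes, preV3411) {
  if (preV3411) {
    return 250000;                 // hardcoded pre-3.4.11 limit
  }
  return Math.floor(1.3 * chunkSizeBytes / avgDocSizeBytes);
}

// Default 64MB chunks with 512-byte documents:
const limit = maxDocsPerChunk(64 * 1024 * 1024, 512, false);  // 170393
```

With small documents the 3.4.11+ rule allows far fewer documents than a naive size-only calculation would suggest, which is why average document size matters when picking a chunk size.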
Shard Key Selection – Profiling
Helps identify your workload
Requires level 2: db.setProfilingLevel(2)
May need to increase the profiler (system.profile) size
Shard Key Selection – Candidates
Export statement types with frequency
Export statement patterns with frequency
Produces a list of shard key candidates
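One way to extract statement patterns, sketched below (an assumed helper for illustration, not a built-in tool): collapse system.profile entries into query "shapes" by replacing literal values with a placeholder, then count how often each shape appears. The most frequent shapes point at shard key candidates.

```javascript
// Replace literals with 1 so {email: "a@x.com"} and {email: "b@y.org"}
// collapse into the same shape {email: 1}.
function queryShape(filter) {
  const shape = {};
  for (const key of Object.keys(filter).sort()) {
    const v = filter[key];
    shape[key] = (v !== null && typeof v === "object" && !Array.isArray(v))
      ? queryShape(v)   // recurse into operators like {$gt: ...}
      : 1;              // literal -> placeholder
  }
  return shape;
}

// Count how often each shape occurs across profiler documents.
function shapeFrequencies(profileDocs) {
  const counts = {};
  for (const doc of profileDocs) {
    const key = JSON.stringify(queryShape(doc.filter || {}));
    counts[key] = (counts[key] || 0) + 1;
  }
  return counts;
}

// Example with three fake profiler entries:
const freq = shapeFrequencies([
  { filter: { email: "jdoe@gmail.com" } },
  { filter: { email: "mary@foo.org" } },
  { filter: { created: { $gt: 1000 } } },
]);
// freq: { '{"email":1}': 2, '{"created":{"$gt":1}}': 1 }
```

In a real run you would feed it documents from db.system.profile.find() and sort the resulting shapes by count.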
Shard Key Selection – Built-in Constraints
Key and value are immutable
Must not contain NULLs
Update and findAndModify operations must contain the shard key
Unique constraints must be maintained by a prefix of the shard key
A shard key cannot contain special index types (i.e. text)
Potentially reduces the list of candidates
Shard Key Selection – Schema Constraints
Cardinality
Monotonically increasing keys
Data hotspots
Operational hotspots
Targeted vs scatter-gather operations
Shard Key Selection – Future Constraints
Poor cardinality
Growth and data hotspots
Data pruning & TTL indexes
Schema changes
Try to simulate the dataset in 3, 6, and 12 months
Shard Key - Operations
• Apply a shard key
• Revert a shard key
Apply a shard key
Create the associated index
Make sure the balancer is stopped:
sh.stopBalancer()
sh.getBalancerState()
Apply the shard key:
sh.shardCollection("foo.col", {field1:1, ..., fieldN:1})
Allow a burn period
Start the balancer
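The steps above as one mongo shell session sketch (the namespace foo.col and field1 are placeholders; this must run against a live mongos, so it is not runnable standalone):

```javascript
// Run on a mongos.
sh.stopBalancer();                                       // pause chunk migrations
sh.getBalancerState();                                   // verify: should be false
db.getSiblingDB("foo").col.createIndex({ field1: 1 });   // index backing the key
sh.enableSharding("foo");                                // enable sharding on the db
sh.shardCollection("foo.col", { field1: 1 });            // apply the shard key
// ...burn period: watch query targeting and write behavior...
sh.startBalancer();                                      // let chunks distribute
```

Keeping the balancer stopped through the burn period makes a revert far cheaper, since all chunks are still on the primary shard.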
Sharding
sh.shardCollection("foo.foo", <key>)
Burn period
sh.startBalancer()
(diagram: database <foo>, collection <foo> divided into chunks across shards s1, s2, …, sN)
Revert a shard key
Two categories:
o Affects functionality (exceptions, inconsistent data, …)
o Affects performance (operational hotspots, …)
Dump/Restore
o Requires downtime – writes and in some cases reads
o Time-consuming operation
o You may restore into a sharded or unsharded collection
o Better to pre-create indexes
o Same or new cluster can be used
o Streaming dump/restore is an option
o In special cases, like time-series data, it can be fast
Revert a shard key
Dual writes
o Mongo-to-Mongo connector or change streams
o No downtime
o Requires extra capacity
o May increase latency
o Same or new cluster can be used
o Adds complexity
Alter the config database
o Requires downtime – but minimal
o Easy during the burn period
o Time consuming if chunks are distributed
o Has overhead during chunk moves
Revert a shard key
Process:
1) Disable the balancer – sh.stopBalancer()
2) Move all chunks to the primary shard (skip during the burn period)
3) Stop one secondary from the config server replica set (for rollback)
4) Stop all mongos and all shards
5) On the config server replica set primary execute:
db.getSiblingDB('config').chunks.remove({ns: <collection name>})
db.getSiblingDB('config').collections.remove({_id: <collection name>})
6) Start all mongos and shards
7) Start the secondary from the config server replica set
Rollback:
• After step 6, stop all mongos and shards
• Stop the running members of the config server replica set and wipe their data directories
• Start all config server replica set members
• Start all mongos and shards
Revert a shard key
Online option requested in SERVER-4000 – may be supported in 4.2
Further reading – Morphus: Supporting Online Reconfigurations in Sharded NoSQL Systems
http://dprg.cs.uiuc.edu/docs/ICAC2015/Conference.pdf
Special use cases:
Extend a shard key by adding field(s) ({a:1} to {a:1, b:1})
o Possible (and easier) if b's max and min (per a) are predefined
o For example, {year:1, month:1} extended to {year:1, month:1, day:1}
Reduce the fields of a shard key ({a:1, b:1} to {a:1})
o Possible (and easier) if all distinct "a" values are in the same shard
o Chunks sharing the same "a" min value add complexity
Revert a shard key
Always perform a dry run
Balancer/autosplit must be disabled
You must take downtime during the change
*There might be a more optimal code path, but the above one worked like a charm
Chunk Splitting and Merging
• Pre-splitting
• Auto splits
• Manual intervention
Distribution Goal
Database size: 200G, primary shard: s1
(diagram: database <foo> spread evenly across shards s1*, s2, …, s4 – each holding 50G, 25% of the data)
Pre-Split – Hashed Keys
Shard keys using MongoDB's hashed index allow the use of numInitialChunks.
Hashing mechanism:
Value: jdoe@gmail.com
MD5: 694ea0904ceaf766c6738166ed89bafb
64 bits of the MD5 as a 64-bit integer: NumberLong("7588178963792066406")
Estimation:
Size = collection size (in MB) / 32 → 1,600 = 51,200 / 32
Count = number of documents / 125,000 → 800 = 100,000,000 / 125,000
Limit = number of shards * 8192 → 32,768 = 4 * 8192
numInitialChunks = Min(Max(Size, Count), Limit) → 1,600 = Min(Max(1600, 800), 32768)
Command:
db.runCommand({ shardCollection: "foo.users", key: { email: "hashed" }, numInitialChunks: 1600 });
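The estimation above can be written as a small helper (a sketch mirroring the slide's formula; the function name is ours, not a MongoDB API):

```javascript
// numInitialChunks = min(max(size, count), limit), per the slide.
function estimateNumInitialChunks(collSizeMB, docCount, numShards) {
  const size  = Math.ceil(collSizeMB / 32);     // target roughly 32MB per chunk
  const count = Math.ceil(docCount / 125000);   // stay under the doc-count limit
  const limit = numShards * 8192;               // cap on initial chunks
  return Math.min(Math.max(size, count), limit);
}

// The slide's numbers: 51,200MB, 100M documents, 4 shards -> 1600 chunks.
const n = estimateNumInitialChunks(51200, 100000000, 4);
```

The result is what you would pass as numInitialChunks to the shardCollection command shown above.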
Pre-Split – Deterministic
Use case: collection containing user profiles with email as the unique key.
Prerequisites:
1. Shard key analysis complete
2. Understanding of access patterns
3. Knowledge of the data
4. Unique key constraint
Pre-Split – Deterministic (diagrams): initial chunk splits on the primary shard, then balancing the chunks across shards, then further splits as the data grows.
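The deterministic approach above can be sketched as follows (an assumed approach built on the listed prerequisites, not an official tool): with knowledge of the data, pick split points from a sorted sample of the shard key so each initial range holds roughly the same number of documents.

```javascript
// Choose numChunks-1 evenly spaced split points from a sorted key sample.
function pickSplitPoints(sortedKeys, numChunks) {
  const points = [];
  const step = sortedKeys.length / numChunks;
  for (let i = 1; i < numChunks; i++) {
    points.push(sortedKeys[Math.floor(i * step)]);
  }
  return points;
}

// 8 sampled emails split into 4 ranges -> 3 split points.
const sample = ["a@x.com", "c@x.com", "f@x.com", "h@x.com",
                "k@x.com", "n@x.com", "r@x.com", "w@x.com"];
const splits = pickSplitPoints(sample, 4);

// Each point would then be applied on a mongos with:
//   sh.splitAt("foo.users", { email: <point> })
```

After the splits, distributing the empty chunks with moveChunk before loading data avoids migrations during the load.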
Automatic Splitting
Controlling auto-split:
• sh.enableAutoSplit()
• sh.disableAutoSplit()
Mongos:
• The component responsible for tracking split statistics
• Bytes-written statistics trigger split attempts
• Multiple mongos servers for HA
Sub-Optimal Distribution
Database size: 200G, primary shard: s1, chunks: balanced
(diagram: database <foo> – s1* holds 40% of the data while the other shards hold 20% each, despite a balanced chunk count)
Maintenance – Splitting
Five helpful resources:
• collStats
• config.chunks
• dataSize
• oplog.rs
• system.profile
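A mongo shell session sketch tying the five resources together (foo.users, the email key, and the range bound are placeholders; this requires a live cluster, so it is not runnable standalone):

```javascript
// collStats: average document size and totals for the collection.
db.getSiblingDB("foo").users.stats();

// config.chunks: the chunk map for the namespace.
db.getSiblingDB("config").chunks.find({ ns: "foo.users" });

// dataSize: measure how much data one chunk's range actually holds.
db.adminCommand({
  dataSize: "foo.users",
  keyPattern: { email: 1 },
  min: { email: MinKey },
  max: { email: "f@x.com" }
});
```

Comparing dataSize results across ranges shows which chunks are oversized and worth splitting manually; oplog.rs and system.profile reveal which ranges are taking the writes.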