What's New in Percona Server for MongoDB? 2019 Q3: Enterprise Enhancements and v4.2



slide-1
SLIDE 1

What's New in Percona Server for MongoDB?

2019 Q3: Enterprise Enhancements and v4.2 4:00 PM - 4:50 PM - Room B

slide-2
SLIDE 2

About Adamo and Akira

Two of the most experienced MongoDB field experts in the world.

MongoDB: 2009 ~
vs.
Adamo w/ MongoDB: 2013 ~
Akira w/ MongoDB: 2014 ~

AdamoMongoDB + AkiraMongoDB > MongoDB !

slide-3
SLIDE 3

Talk Overview

  • New in Percona Server for MongoDB (PSMDB) 4.0+

    ○ Encryption at rest with Hashicorp Vault
    ○ All about PSMDB

  • MongoDB Community / PSMDB 4.2

    ○ New Primary throttle (a.k.a. "flow control")
    ○ New index build process
    ○ Cursors and user info added to $currentOp
    ○ Wildcard indexes
    ○ Modularization-friendly config files

slide-4
SLIDE 4

Encryption at Rest with Hashicorp Vault

slide-5
SLIDE 5

What is Encryption and How Does it Work?

Encryption is the process of hiding data in such a way that only those who hold the decryption key can read it. Any data (files, emails, network traffic, individual fields) can be encrypted. For MongoDB data-at-rest encryption, the collection documents and index key entries are encrypted within the WiredTiger Btree file format saved in the *.wt files.

slide-6
SLIDE 6

Types of Encryption

PSMDB has two optional encryption features:

  • Encryption at Rest
  • Encryption in Transit (SSL/TLS)

We'll discuss both, but focus a bit more on encryption at rest, as only PSMDB offers it as a free feature.

slide-7
SLIDE 7

WiredTiger Keyfile-Encryption

This is the basic form of encryption: a key file acts as both the encryption and decryption key for the database. The key is stored on the filesystem and can be read by any root user or by mongod. The database is encrypted, but the secret is not that secret.

slide-8
SLIDE 8

WiredTiger Keyfile-Encryption

[Diagram label: Instance]

slide-9
SLIDE 9

WiredTiger Vault Encryption

A vault is an external process that keeps secrets and answers API-like calls to reveal a secret to a client. PSMDB is fully integrated with Hashicorp Vault.

slide-10
SLIDE 10

WiredTiger Vault Encryption

[Diagram labels: Instance, token (a8abc37456e), secret]

  • Token can be changed
  • Secret is not on the same disk as data

slide-11
SLIDE 11

What is Hashicorp Vault?

Vault is a tool for securely accessing secrets. A secret is anything that you want to tightly control access to, such as API keys, passwords, or certificates. Vault provides a unified interface to any secret, while providing tight access control and recording a detailed audit log.

slide-12
SLIDE 12

How Does Hashicorp Vault Work?

Vault only speaks TLS, so we need to configure the CA file; otherwise the client won't be able to understand the reply. This is also a security feature, as any request without SSL/TLS will fail.

Clients request the secret from the vault using a previously created token. The request is recorded in the audit log, and the server replies to the client with the secret. The secret is used to decrypt the key.db database, then to open the master key used to encrypt/decrypt the database.

slide-13
SLIDE 13

Parameters to Enable Vault in PSMDB

  • --enableEncryption
  • --encryptionCipherMode AES256-CBC
  • --vaultServerName <vault ip>
  • --vaultPort 8200
  • --vaultTokenFile <path to machine token file>
  • --vaultSecret secret/data/psmdb-node1

slide-14
SLIDE 14

Parameters to Enable Vault in PSMDB

security:
  enableEncryption: true
  vault:
    serverName: 127.0.0.1
    port: 8200
    tokenFile: /home/user/path/token
    secret: secret/data/hello

slide-15
SLIDE 15

Backups?

  • Logical backups (mongodump) work the same as before: all the data in the dump is decrypted.
  • Binary copies (Percona hot backup) need access to the vault to get the secret that opens key.db; otherwise the database will fail to start.

Always encrypt your logical backups. The easiest way is to use GPG.

slide-16
SLIDE 16

What About the Data in Transit?

  • Encryption at rest protects only stored data: everything still travels over the wire unencrypted, which makes it really easy to intercept.
  • Encryption at rest doesn't remove the need for TLS/SSL (we will talk about that shortly).

slide-17
SLIDE 17

Everything Else

slide-18
SLIDE 18

Everything Else

Percona Server for MongoDB comes with additional features such as:

  • LDAP Authentication
  • Auditing
  • Log Redaction
  • In-Memory Storage Engine
  • Hot Backup

All free and open source!

slide-19
SLIDE 19

LDAP authentication

LDAP stands for Lightweight Directory Access Protocol. It is a common protocol that companies use to centralise their users in a single piece of software. All other connected software can validate a user and password through the LDAP client. Microsoft's version of LDAP is Active Directory. PSMDB features LDAP authentication, not authorization.

slide-20
SLIDE 20

Auditing

For enhanced security and compliance, awareness of the operations the database is performing can be critical. With auditing enabled, it's possible to track operations such as user and index creation at the database level.

slide-21
SLIDE 21

Log Redaction

Logs can have sensitive data. Depending on regulations, certain information may not be allowed to be saved in a log file. Log redaction hides sensitive information, changing the values to a different character.

slide-22
SLIDE 22

In Memory Storage Engine

Low latency storage engine that doesn't interact with the disk subsystem. Completely ephemeral, once the database stops all the data is gone. Sub-millisecond latency, only for specific use cases.

slide-23
SLIDE 23

PSMDB only features

Hot Backup: This is a backup command that will generate an exact copy of the database (binary copy) in a different folder, in a very lightweight fashion.

> use admin
switched to db admin
> db.runCommand({createBackup: 1, backupDir: "/my/backup/data/path"})
{ "ok" : 1 }

slide-24
SLIDE 24

Migration to PSMDB

PSMDB is compatible with MongoDB Enterprise and Community. Just replace the MongoDB Community binaries with PSMDB and you'll be all set.

PSMDB can replace MongoDB Enterprise in place, except when the config file has enabled:

  • security.kmip.* options (PSMDB uses security.vault.* instead)
  • security.ldap.* options (PSMDB supports only saslauthd LDAP authentication as of v4.0.12-6)

slide-25
SLIDE 25

PMM

PMM is an open-source platform for managing and monitoring MySQL and MongoDB performance and metrics. It is self-hosted and available as a Docker image, a virtual appliance, or an AWS AMI.

https://www.percona.com/blog/2018/07/05/configuring-pmm-monitoring-mongodb-cluster/

slide-26
SLIDE 26

PMM 2 is GA!

https://pmmdemo.percona.com/graph/

slide-27
SLIDE 27

New in 4.2

slide-28
SLIDE 28

Primary Throttle

"Flow control"

slide-29
SLIDE 29

Primary Throttle

What happens when a Replica set has:

  • Uneven hardware?
  • 'Noisy neighbours' in VM servers?

Primary server's capacity > Secondary's capacity

No problem so long as Load < Secondary's capacity.

But replication lag will grow when load goes above the secondary's capacity.

slide-30
SLIDE 30

Long Replication Lag OK? Not OK?

Yes, if:

  • You want fastest-possible writes at any time. (Use w:1)

But:

  • Recognize that high-load, server capacity-saturating times are the most likely times to have failovers.
  • Accept that writes will be rolled back in those failovers. A lot of replication lag == a lot of rolled-back documents.
  • If you use secondary reads: very stale data.

slide-31
SLIDE 31

Long Replication Lag OK? Not OK?

"And if I say: 'No thanks'?"

  • A. Use w:majority Write Concern
  • Latency for client increased
  • More connections open simultaneously, awaiting client ⇄ P ⇄ S confirmations.
  • Capacity effectively throttled to weakest server in the w:majority subset of the replica set.

slide-32
SLIDE 32

MVCC, Transactions, Replication

WiredTiger is an MVCC architecture. It supports transactions.

MVCC update = make a new version of the document + pin the old version of the document until no client needs it.

Clean-up is performed asynchronously by the storage engine.

slide-33
SLIDE 33

MVCC, Transactions, Replication

MongoDB 4.0 added multi-document, user-level transactions.

  • Multiple clients can attempt to read or modify the same doc.
  • The old document version is pinned until the slowest/latest client request referencing it finishes.

MongoDB 4.2 enables cluster-wide transactions.

Long transactions == Long pin time == Larger active cache.

When app client requests have conflicts: double-work (or worse).

slide-34
SLIDE 34

MVCC, Transactions, Replication

A secondary reading from the primary == Another client. It pins old versions whilst its replication optime is older.

At checkpoint time the newest committed docs are saved to the Primary's disk. Unreplicated old document versions go into a separate "Cache overflow" data structure (a.k.a. Lookaside table).

Cost: Lift-and-move (+ disk flush) on a fraction of the per-minute write volume.

slide-35
SLIDE 35

Impact of Secondary Lag on the Primary

Costs:

The amount of active cache increases in proportion to pin time.

  • Transaction pinning:

    ○ Best-case: Linear increase.
    ○ High doc conflicts: Exponential increase.

  • Replication lag factor at checkpoint:

    ○ Old doc versions go into "Cache overflow."
    ○ Linear but high cost for proportion of writes each min.
    ○ Proportion ~= replication-lag / flush interval (default 60s).
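As a rough worked example of the "Proportion ~= replication-lag / flush interval" rule of thumb, the sketch below estimates what fraction of one checkpoint interval's writes would end up in cache overflow. The function name is hypothetical, and capping the result at 1 is an assumption that the slide does not state.

```javascript
// Rough estimate only: fraction of the writes made during one checkpoint
// interval whose old versions are still unreplicated at checkpoint time.
// Rule of thumb from the slide: proportion ~= replication lag / flush interval.
function cacheOverflowProportion(replLagSecs, flushIntervalSecs = 60) {
  if (replLagSecs < 0 || flushIntervalSecs <= 0) {
    throw new RangeError("lag must be >= 0 and flush interval must be > 0");
  }
  // Assumption: a lag longer than the interval cannot affect more than 100% of writes.
  return Math.min(1, replLagSecs / flushIntervalSecs);
}

console.log(cacheOverflowProportion(15)); // 0.25
console.log(cacheOverflowProportion(90)); // 1
```

With the default 60-second flush interval, 15 seconds of lag would put roughly a quarter of each minute's writes through the lift-and-move path.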

slide-36
SLIDE 36

Impact of Secondary Lag on the Primary

If a checkpoint takes too long, latency suffers.

  • No software 'locks'.
  • But hardware resources are heavily utilized.

E.g. a five-second checkpoint could easily cause 1.0+ sec latency for clients.

slide-37
SLIDE 37

Primary Throttle

Software throttle to prevent replication lag.

Capping "cache overflow" work == Capping worst-case conflicts for I/O resources during checkpoints.

slide-38
SLIDE 38

Primary Throttle

SERVER-37865: "Add mechanism to throttle writes on the primary"

Master branch commit: SERVER-39673
Tweaks: SERVER-40367, SERVER-39616, SERVER-39867

Code reading:

  • mongo/db/storage/flow_control.cpp
  • mongo/db/concurrency/flow_control_ticketholder.cpp
  • mongo/db/storage/flow_control_parameters.idl

slide-39
SLIDE 39

Primary Throttle

Default settings (new as of 4.2)

  • enableFlowControl: true (SERVER-41340)
  • flowControlTargetLagSeconds: 10
  • flowControlThresholdLagPercentage: 0.5 (50%, i.e. 5 secs)
  • flowControlMaxSamples: 1000000
  • flowControlSamplePeriod: 1000 (ops)
  • flowControlMinTicketsPerSecond: 100

(N.b. look for "Server Parameters" not "Configuration options" in docs).
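To make the "50%, i.e. 5 secs" arithmetic concrete, here is a minimal sketch that derives the engagement threshold from the two lag parameters. The function names are hypothetical, and the check is a deliberate simplification: the real server also samples the oplog and adjusts a ticket rate, which is not modelled here.

```javascript
// Defaults as listed on the slide (MongoDB 4.2).
const flowControlDefaults = {
  enableFlowControl: true,
  flowControlTargetLagSeconds: 10,
  flowControlThresholdLagPercentage: 0.5,
};

// Lag (in seconds) above which the throttle starts to engage.
function flowControlThresholdSecs(params) {
  return params.flowControlTargetLagSeconds * params.flowControlThresholdLagPercentage;
}

// Simplified check: only compares the observed lag against the threshold.
function wouldThrottle(params, currentLagSecs) {
  return params.enableFlowControl && currentLagSecs > flowControlThresholdSecs(params);
}

console.log(flowControlThresholdSecs(flowControlDefaults)); // 5
console.log(wouldThrottle(flowControlDefaults, 7));         // true
console.log(wouldThrottle(flowControlDefaults, 3));         // false
```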

slide-40
SLIDE 40

Primary Throttle

Observing: See serverStatus flowControl section (on primaries only)

"flowControl" : {
    "enabled" : true/false,
    "targetRateLimit" : xxx,
    "timeAcquiringMicros" : xxx,     = Total time client writes forced to wait for fC
    "locksPerOp" : xxx,
    "sustainerRate" : xxx,
    "isLagged" : true/false,         true = In use this moment
    "isLaggedCount" : xxx,           0 = Not used since last restart
    "isLaggedTimeMicros" : xxx,      Low = More or less unused
},
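One way to read those fields together is sketched below. The helper name and its wording are hypothetical; it only uses the field names shown on the slide and treats them as plain numbers and booleans (in a real mongo shell some counters may arrive as NumberLong values and would need converting first).

```javascript
// Summarise the flowControl section of serverStatus output in one line.
// `fc` is assumed to carry the field names shown on the slide.
function describeFlowControl(fc) {
  if (!fc.enabled) return "flow control disabled";
  if (fc.isLagged) return "throttling writes right now";
  if (fc.isLaggedCount === 0) return "enabled, not used since last restart";
  const waitedSecs = fc.timeAcquiringMicros / 1e6;
  return `enabled, lagged ${fc.isLaggedCount} time(s); writes waited ${waitedSecs}s in total`;
}

// In the mongo shell the input could come from: db.serverStatus().flowControl
const sample = { enabled: true, isLagged: false, isLaggedCount: 3, timeAcquiringMicros: 2500000 };
console.log(describeFlowControl(sample));
// enabled, lagged 3 time(s); writes waited 2.5s in total
```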

slide-41
SLIDE 41

'Middle-Ground' Index Builds

slide-42
SLIDE 42

Background Index Builds

In MongoDB <= 4.0 there was a "background" option for createIndex:

db.collection.createIndex({"x": 1}, {"background": true})

It took longer than the default mode ('foreground'), but didn't block other reads and writes. As of 4.2 the "background" option is deprecated and ignored.

slide-43
SLIDE 43

4.2+ Index Build Summary

  • Mechanism closer to <= 4.0 background index build

    ○ But without non-optimal index structure as side effect

  • Performance approx. as good as 'foreground' index build if you guarantee writes are paused to that collection.
  • Only brief blocking of other clients. Once at start and once at end.

So what is removed?

  • Forced blocking of other readers and writers (to allow index build to complete as fast as possible)

slide-44
SLIDE 44

CurrentOp Enhancements

slide-45
SLIDE 45

Where is 'listConnections' Command?

Simple feature request ticket:- 'Please give us a list-connections command'

slide-46
SLIDE 46

Where is 'listConnections' Command?

No ball! Denied! Closed 11 mins after opened.

Seems like a fair request ... but listConnections isn't the right idea for a distributed database like MongoDB.

slide-47
SLIDE 47

Egress User conns != Ingress DB conns

[Diagram: Applications → mongos nodes → Shards x RS nodes (P P P / S S S / S S S)]

Sum of egress TCP socket counts (application side) >= Ingress TCP sockets at any single mongos or mongod node

slide-48
SLIDE 48

Egress User conns != Ingress DB conns

A mongod or mongos node does not receive information on all potential client conns. That would not be scalable.

It only observes which TCP connections are currently open to it. A single TCP connection might be used for hundreds of different ops per second.

  • Client's perspective: Its connections to shard mongod always appear open.
  • Shard node's perspective: At any given moment only some fraction of clients are running ops on it.

slide-49
SLIDE 49

So What is the Best / Right Command?

List running operations, not connections:

db.currentOp()
db.aggregate([{$currentOp: {}}])    (run from the admin database)

But idle cursors awaiting the next "getMore" command were unlistable in <= 4.0. Fixed in 4.2:

slide-50
SLIDE 50

Idle Cursor?

Client code:

var cursor = db.collection.find({ <many doc query> });
while (cursor.hasNext()) {
    print(cursor.next()._id)
}

Server ops:

  • find: Returns first 101 documents + server-side cursorId
    (doing nothing while client iterates those 101 docs)
  • getMore: Returns next 16MB of docs
    (doing nothing while client iterates that next ~16MB of docs)
  • getMore: Returns next 16MB of docs
  • ...

slide-51
SLIDE 51

Idle Cursor?

Client: Has a query object that is still open and being read.

Server: Finished running the initial find command and sent the first batch, or one of the subsequent getMore commands' batches. The server-side cursor is inactive/unused until the next "getMore" command.

db.aggregate([{$currentOp: {idleCursors: true}}])
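A small sketch of how the results of that aggregation might be post-processed to find the longest-idle cursors. The helper name is hypothetical, and the document shape it relies on (a `type` of "idleCursor" plus a `cursor` sub-document with `cursorId` and `lastAccessDate`) is an assumption to verify against your own server's output.

```javascript
// Pick idle cursors out of $currentOp results, longest-idle first.
// Assumed document shape (verify against real output):
//   { type: "idleCursor", ns: "...", cursor: { cursorId: ..., lastAccessDate: Date } }
function longestIdleCursors(currentOpDocs, now = new Date()) {
  return currentOpDocs
    .filter((doc) => doc.type === "idleCursor" && doc.cursor)
    .map((doc) => ({
      ns: doc.ns,
      cursorId: doc.cursor.cursorId,
      idleSecs: (now - doc.cursor.lastAccessDate) / 1000,
    }))
    .sort((a, b) => b.idleSecs - a.idleSecs);
}

// Hand-made sample data standing in for real $currentOp output.
const now = new Date("2019-10-02T12:00:00Z");
const docs = [
  { type: "op", ns: "test.a" },
  { type: "idleCursor", ns: "test.b", cursor: { cursorId: 1, lastAccessDate: new Date("2019-10-02T11:59:30Z") } },
  { type: "idleCursor", ns: "test.c", cursor: { cursorId: 2, lastAccessDate: new Date("2019-10-02T11:50:00Z") } },
];
console.log(longestIdleCursors(docs, now));
```

In the shell, the input array could come from calling `.toArray()` on the aggregation shown on the slide, run from the admin database.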

slide-52
SLIDE 52

User-auth'ed conns →__system conns

[Diagram: Applications → mongos nodes → Shards x RS nodes (P P P / S S S / S S S). Connection labels: user-x, user-y, user-z, __system user]

Trust __system user

slide-53
SLIDE 53

User-auth'ed conns →__system conns

'Impersonated' user/role info is attached to op details for the auditing subsystem.

v4.2 (SERVER-5261): 'Impersonated user' now displays as "runBy" in currentOp.

slide-54
SLIDE 54

Summary For currentOp Enhancement

DBAs occasionally need to answer: "What client apps are putting the load on this mongod node?"

Don't analyze by connections: TCP sockets are not 1-to-1 with client sessions, and user info is not attached 1-to-1 there either.

List by using currentOp:

  • Operations (both currently running or yielding).
  • Idle cursors (queries inactive server-side between first find batch and successive getMore commands).
  • Includes user and role info, as well as client IP address.

slide-55
SLIDE 55

Wildcard Indexes

slide-56
SLIDE 56

Arbitrary Attributes? Prior Workaround

{
  "_id": ...,
  "sn": ...,
  "modelName": ...,
  "attributes": [
    { "k": "color", "v": "blue" },
    { "k": "capacity", "v": "3200mAh" }
  ]
}

// Make compound index on (key name, value):
db.collection.createIndex({"attributes.k": 1, "attributes.v": 1})

// $elemMatch keeps k and v on the same array element
db.collection.find({"attributes": {$elemMatch: {"k": "color", "v": "blue"}}})

// $unwind so that only the "color" element's value is grouped
db.collection.aggregate([
  {$match: {"attributes.k": "color"}},
  {$unwind: "$attributes"},
  {$match: {"attributes.k": "color"}},
  {$group: {"_id": "$attributes.v"}},
  {$project: {"color": "$_id"}}
])

slide-57
SLIDE 57

Arbitrary Attributes? Prior Workaround

Despite using a Document DB you need to add a mini-ORM in the app:

// On read: lift each k/v pair up to a top-level property
myObj = collection.find(...)
for array_item in myObj.attributes:
    myObj.setProperty(array_item.k, array_item.v)
delete myObj.attributes

// On write: push non-fixed properties back into the k/v array
for k in myObj.properties():
    if (k not in fixedSchemaFields):
        myObj.attributes.append({k: k, v: myObj[k]})
        delete myObj[k]
collection.insert(myObj)
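For a runnable version of that mini-ORM idea, here is a minimal sketch in plain JavaScript. The function names and the `FIXED_FIELDS` list are hypothetical, chosen to match the example document on the previous slide; a real application would also need to handle nested values and name clashes.

```javascript
// Fields with a fixed place in the schema; everything else is "arbitrary".
// Illustrative list, taken from the example document on the previous slide.
const FIXED_FIELDS = ["_id", "sn", "modelName"];

// Stored shape -> application shape: lift each {k, v} pair to a top-level property.
function unpackAttributes(doc) {
  const { attributes = [], ...obj } = doc;
  for (const { k, v } of attributes) obj[k] = v;
  return obj;
}

// Application shape -> stored shape: move non-fixed properties into the k/v array.
function packAttributes(obj, fixedFields = FIXED_FIELDS) {
  const doc = { attributes: [] };
  for (const [key, value] of Object.entries(obj)) {
    if (fixedFields.includes(key)) doc[key] = value;
    else doc.attributes.push({ k: key, v: value });
  }
  return doc;
}

const stored = packAttributes({ sn: "A1", modelName: "X", color: "blue", capacity: "3200mAh" });
console.log(JSON.stringify(stored));
console.log(JSON.stringify(unpackAttributes(stored)));
```

The application has to call these two helpers on every read and write, which is exactly the overhead that wildcard indexes aim to remove.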

slide-58
SLIDE 58

4.2 Wildcard Indexes

Query engine applies facade over the same mechanism. In reality only one index structure. I.e. another special type like:

  • Text indexes
  • Multikey indexes (i.e. nested array fields)
  • Geo indexes

slide-59
SLIDE 59

4.2 Wildcard Indexes

createIndex specification:

  • Match all fields: { "$**" : 1 }
  • Match only 1 nested object’s fields: { "attributes.$**" : 1 }
  • Match/Filter subset: { "$**" : 1 } plus the createIndex option { "wildcardProjection": {....} }
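To illustrate which field paths each key pattern reaches, the sketch below implements a simplified coverage check. Both function names are hypothetical; the check ignores wildcardProjection entirely and only encodes two facts: a `$**` pattern skips `_id` by default, and a `prefix.$**` pattern reaches the fields nested under that prefix.

```javascript
// Return the path prefix covered by a single-key wildcard pattern.
// "" means "all fields". Throws for anything that is not a wildcard pattern.
function wildcardRoot(keyPattern) {
  const keys = Object.keys(keyPattern);
  if (keys.length !== 1) {
    throw new Error("wildcard indexes cannot be compound: expected exactly one key");
  }
  const key = keys[0];
  if (key === "$**") return "";
  if (key.endsWith(".$**")) return key.slice(0, -".$**".length);
  throw new Error(`not a wildcard key: ${key}`);
}

// Simplified: ignores wildcardProjection.
function coversField(keyPattern, fieldPath) {
  const root = wildcardRoot(keyPattern);
  if (root === "") return fieldPath !== "_id"; // _id is omitted by default
  return fieldPath.startsWith(root + ".");
}

console.log(coversField({ "$**": 1 }, "modelName"));                   // true
console.log(coversField({ "attributes.$**": 1 }, "attributes.color")); // true
console.log(coversField({ "attributes.$**": 1 }, "modelName"));        // false
```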

slide-60
SLIDE 60

4.2 Wildcard Indexes

Not standard index structure, so limitations apply:

  • Cannot be used in a compound index
  • Cannot be combined with hashed, TTL, 2d options
  • Query or Sort by only one of the wildcard fields.
  • Can't query k: {$exists: false}
  • Can't query k: {$ne: null}
  • Can't test equality to non-scalar values.

If you need the above: continue using the mini-ORM workaround.

slide-61
SLIDE 61

Modularization-Friendly Config Files

slide-62
SLIDE 62

__rest, __exec External Config

Don't want to update config files? Container image lockdown etc?

Add --configExpand [exec,rest] as a command-line arg, then:

...
storage:
  dbPath:
    type: "string"
    __exec: "echo ${CURRENT_TESTDB_ROOT}"
    trim: "whitespace"
...

is read as:

...
storage:
  dbPath: /mnt/cvol_test20191002
...

What an exciting time to be alive! (especially for security team ^_^; )

slide-63
SLIDE 63

Any Questions?

slide-64
SLIDE 64

Rate My Session

slide-65
SLIDE 65

We’re Hiring!

Percona’s open source database experts are true superheroes, improving database performance for customers across the globe. Our staff live in nearly 30 different countries around the world, and most work remotely from home. Discover what it means to have a Percona career with the smartest people in the database performance industries, solving the most challenging problems our customers come across.

slide-66
SLIDE 66

Thank You