SLIDE 1

Secondary reads: the good and the bad

Bartłomiej Nogaś

SLIDE 2

Agenda

  • Read Preference configuration
  • Lagging secondaries and stale or missing/duplicated data
  • What queries can be safely run on secondaries?
  • Improving read throughput: sharding vs reading from secondaries

SLIDE 3

Read preference configurations

And impact of step downs

SLIDE 4

Client Configuration options

ELIGIBLE NODE: a node that satisfies all the conditions defined in the Read Preference. A client distributes reads among eligible nodes at random.

SLIDE 6

Client Configuration options

  • serverSelectionTimeout
    ○ How long to wait for an eligible node
    ○ Defaults to 30 seconds
  • localThresholdMS (default: 15 ms)
    ○ Size of the latency window for selecting among available replica set members
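In a driver connection string these map to standard options; a minimal sketch (host names are placeholders; confirm exact option support against your driver version):

```
mongodb://host1:27017,host2:27017/mydb?replicaSet=rs0&readPreference=secondaryPreferred&serverSelectionTimeoutMS=30000&localThresholdMS=15
```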

SLIDE 7

Latency Window

  • Every 10 seconds (as of 3.2) the driver sends a heartbeat to measure
    network response time (last_RTT)
  • Average RTT is a weighted moving average; the last observation has
    weight 0.2 (the last nine samples together carry weight 1 - 0.8^9 ≈ 0.87)
  • localThresholdMS is relative to the server with the lowest RTT
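The moving average above can be sketched as an exponentially weighted update; a toy model (the function name is ours, not a driver API):

```javascript
// Sketch of the exponentially weighted moving average used for RTT,
// with the newest heartbeat sample weighted 0.2.
function updateAvgRtt(avgRtt, lastRtt, alpha) {
  if (alpha === undefined) alpha = 0.2;
  if (avgRtt === null) return lastRtt; // first sample seeds the average
  return alpha * lastRtt + (1 - alpha) * avgRtt;
}

// With weight 0.2 per new sample, the last nine samples together carry
// weight 1 - 0.8^9, roughly 0.87.
let avg = null;
for (const sample of [10, 12, 11, 30, 12]) {
  avg = updateAvgRtt(avg, sample);
}
```

A single slow heartbeat (the 30 ms spike) shifts the average only partway, which is exactly why selection is stable against one-off network blips.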
SLIDE 8

Available Read Preference Modes

  • Primary
  • Primary preferred
  • Secondary
  • Secondary preferred
  • Nearest

SLIDE 9

Primary and Secondary

PRIMARY

  • Read only from the Primary member
  • Exception if no Primary is available

SECONDARY

  • Read only from Secondary members within the latency window
  • Exception if there is no Secondary

SLIDE 10

Primary and Secondary Preferred

PRIMARY PREFERRED

  • Read from the Primary member
  • If no Primary is available, follow the procedure for the Secondary read preference

SECONDARY PREFERRED

  • Read from Secondary members within the latency window
  • If no Secondary is available, read from the Primary

SLIDE 11

Nearest

NEAREST: read from any member within the latency window.
WHEN TO USE: if you need the shortest response time.

SLIDE 12

Read Preference Tags

Multiple DC configuration

SLIDE 13

Read preference tags

  • A tag is a single key/value pair, e.g. {"dc": "A"}
  • A tag set is a document containing zero or more such tags,
    e.g. {"dc": "A", "role": "backup"}
  • Tags can’t be used with the Primary read preference
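In the mongo shell, tag sets are passed to readPref() as an array of tag documents tried in order (an empty document matches any member):

```javascript
// Prefer members tagged dc:A; the trailing {} falls back to any member.
db.test.find().readPref("nearest", [ { "dc": "A" }, { } ])
```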

SLIDE 15

Multiple DC configuration

  • Nearest with tags {"dc": "A"} will choose between P and S1
  • secondaryPreferred with tags {"dc": "B"} will read from S2 or S3,
    or P if no secondary is available

DC A ({"dc": "A"}): Primary (P), Secondary (S1)
DC B ({"dc": "B"}): Secondary (S2), Secondary (S3)

SLIDE 16

Multiple DC configuration

DC A ({"dc": "A"}): Primary (P), Secondary (S1)
DC B ({"dc": "B"}): Secondary (S2), Secondary (S3)

  • Note: setting mode Secondary with tags {"dc": "A"} would allow only node S1; if that node fails there will be no eligible members
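The eligibility rule above can be sketched as a simple filter; a toy model (all names are illustrative, not driver internals):

```javascript
// Sketch of tag-set eligibility: a member matches when it has every
// key/value pair from the tag set.
function matchesTagSet(memberTags, tagSet) {
  return Object.entries(tagSet).every(([k, v]) => memberTags[k] === v);
}

const members = [
  { name: "P",  type: "primary",   tags: { dc: "A" } },
  { name: "S1", type: "secondary", tags: { dc: "A" } },
  { name: "S2", type: "secondary", tags: { dc: "B" } },
  { name: "S3", type: "secondary", tags: { dc: "B" } },
];

// Mode Secondary with tags {dc: "A"}: only S1 qualifies, so losing S1
// leaves no eligible member at all.
const eligible = members.filter(
  (m) => m.type === "secondary" && matchesTagSet(m.tags, { dc: "A" })
);
```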

SLIDE 17

Agenda

  • Read Preference configuration
  • Lagging secondaries and stale or missing/duplicated data
  • What queries can be safely run on secondaries?
  • Improving read throughput: sharding vs reading from secondaries

SLIDE 18

Lagging secondaries

And stale or missing data

SLIDE 19

Stale data

  • Replication lag
    ○ rs.printSlaveReplicationInfo() (or rs.status())
  • Typically replication lag should not be more than a couple of seconds
  • The lag can grow large, for example when secondaries run on worse hardware than the primary

SLIDE 24

Stale data

Setup: Primary (P); Secondary (S1), lag 2 s; Secondary (S2), lag 4 s; Client (C)

  • Take an example:
    ○ An update is made on P to a document
    ○ The write is replicated to S1
    ○ C reads the document from S1 (gets the updated version)
    ○ Then C reads the same document from S2 (gets the old version)
  • It is important to monitor replication lag
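The sequence above can be modeled as a toy timeline (illustrative names, not MongoDB internals): a secondary lagging by `lag` seconds has applied only the writes made up to `now - lag`.

```javascript
// Two versions of the same document, written at t = 0 and t = 5.
const writes = [
  { t: 0, doc: { _id: 1, v: "old" } },
  { t: 5, doc: { _id: 1, v: "new" } },
];

// A secondary `lag` seconds behind serves the last write it has applied.
function readFromSecondary(now, lag) {
  const applied = writes.filter((w) => w.t <= now - lag);
  return applied.length ? applied[applied.length - 1].doc : null;
}

// At t = 7: S1 (2 s lag) already serves the update, S2 (4 s lag) does not,
// so back-to-back reads appear to go backwards in time.
const fromS1 = readFromSecondary(7, 2);
const fromS2 = readFromSecondary(7, 4);
```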

SLIDE 25

Changes in MongoDB 3.4

Setup: Primary (P); Secondary (S1), lag 2 s; Secondary (S2), lag 4 s

  • A maxStalenessMS parameter is added to the read preference (it ultimately shipped as maxStalenessSeconds)
  • This parameter defines the maximum replication lag a secondary may have and still be read from
  • Example: with the maximum staleness set to 3 seconds:
    ○ S1 (lag 2 s) will be eligible
    ○ S2 (lag 4 s) will not be eligible
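The filtering this adds to server selection can be sketched as follows (a toy model with illustrative names; the real driver estimates lag from heartbeat data rather than reading it directly):

```javascript
// A secondary is eligible only if its estimated lag does not exceed the
// configured maximum staleness.
function eligibleSecondaries(secondaries, maxStalenessMs) {
  return secondaries.filter((s) => s.lagMs <= maxStalenessMs);
}

// With a 3000 ms limit, S1 (2 s lag) passes and S2 (4 s lag) is excluded.
const out = eligibleSecondaries(
  [{ name: "S1", lagMs: 2000 }, { name: "S2", lagMs: 4000 }],
  3000
);
```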

SLIDE 26

Missing/Duplicated data in Sharded Cluster

TWO PROBLEMS

  • Duplicated/outdated data because of orphaned documents
  • Missing data because of not-yet-replicated chunk migrations

SERVER-3645: Inaccurate count for primary
SERVER-5931: Inconsistent read from secondary in sharded environment

SLIDE 27

Orphaned records and duplicated data

  • Duplicated and outdated results with the secondary readPreference
  • Orphaned documents arise from:
    ○ Failed balancer rounds
    ○ In-progress chunk migrations

ORPHANED DOCUMENT: in a sharded cluster, a document that also exists on a shard it doesn’t belong to.

SLIDE 30

Orphaned records and duplicated data

db.test shard key: { "_id": 1 }
  { "_id": MinKey } -> { "_id": 10 } on: test-rs0
  { "_id": 10 } -> { "_id": MaxKey } on: test-rs1

test-rs0/test, db.test.find():
  { "_id": 1, "rs": 0 }
  { "_id": 2, "rs": 0 }

test-rs1/test, db.test.find():
  { "_id": 12, "rs": 1 }
  { "_id": 2, "rs": 1 }

SLIDE 31

Orphaned records and duplicated data

test-rs0/test: { "_id": 1, "rs": 0 }, { "_id": 2, "rs": 0 }
test-rs1/test: { "_id": 12, "rs": 1 }, { "_id": 2, "rs": 1 }

Query with readPreference=primary:
  db.test.find().readPref("primary")
  -> { "_id": 1, "rs": 0 }, { "_id": 2, "rs": 0 }, { "_id": 12, "rs": 1 }

SLIDE 32

Orphaned records and duplicated data

test-rs0/test: { "_id": 1, "rs": 0 }, { "_id": 2, "rs": 0 }
test-rs1/test: { "_id": 12, "rs": 1 }, { "_id": 2, "rs": 1 }

Query with readPreference=secondary:
  db.test.find().readPref("secondary")
  -> { "_id": 1, "rs": 0 }, { "_id": 2, "rs": 0 }, { "_id": 12, "rs": 1 }, { "_id": 2, "rs": 1 }

(the orphaned { "_id": 2 } is returned twice)
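The duplicate can be modeled as the mongos merge step; a toy sketch (illustrative names, not server internals), assuming primary-targeted reads of this era filter by owned chunk ranges while secondary reads do not:

```javascript
// Which shard owns which _id range, per the chunk table above.
const ownsKey = {
  "test-rs0": (id) => id < 10,   // chunk MinKey -> 10
  "test-rs1": (id) => id >= 10,  // chunk 10 -> MaxKey
};
const shardData = {
  "test-rs0": [{ _id: 1, rs: 0 }, { _id: 2, rs: 0 }],
  "test-rs1": [{ _id: 12, rs: 1 }, { _id: 2, rs: 1 }], // _id 2 is orphaned here
};

// mongos merges per-shard results; with orphan filtering each shard returns
// only documents in its owned ranges, without it orphans leak through.
function clusterFind(filterOrphans) {
  return Object.entries(shardData).flatMap(([shard, docs]) =>
    filterOrphans ? docs.filter((d) => ownsKey[shard](d._id)) : docs
  );
}

const primaryRead = clusterFind(true);    // 3 documents, no duplicate
const secondaryRead = clusterFind(false); // 4 documents, _id 2 twice
```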

SLIDE 34

Orphaned records and duplicated data

test-rs0/test: { "_id": 1, "rs": 0 }, { "_id": 2, "rs": 0 }
test-rs1/test: { "_id": 12, "rs": 1 }, { "_id": 2, "rs": 1 }

Queries with readPreference=secondary:
  find({ "_id": 2 }).readPref("secondary") -> { "_id": 2, "rs": 0 }
  find({ "rs": 1 }).readPref("secondary") -> { "_id": 2, "rs": 1 }

SLIDE 37

Missing data with active balancer

  • By default the balancer migrates chunks with "writeConcern": { "w": 2 }
  • The writeConcern for the balancer can be changed
  • At the end of a migration the config database is updated

Setup: Primary (P), Secondary (S1), Secondary (S2), Client (C)
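Changing the balancer's migration write concern is done via the _secondaryThrottle setting in config.settings; a mongo-shell sketch based on the balancer settings of that era (run on a mongos; verify the exact form against your server version):

```javascript
// Run on a mongos, against the config database. In MongoDB 3.0+,
// _secondaryThrottle may be set to a write concern document.
use config
db.settings.update(
  { "_id": "balancer" },
  { $set: { "_secondaryThrottle": { "w": "majority" } } },
  { upsert: true }
)
```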

SLIDE 38

Is there a workaround?

  • Clean orphaned documents
  • Don’t use the automatic balancer
  • Set the balancer window
  • Issue only non-critical reads
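Cleaning orphaned documents is done with the cleanupOrphaned admin command, run against the primary of each shard's replica set (not through mongos); the usual loop, per the command's documented usage in this era, looks like:

```javascript
// Run on each shard primary. Each call cleans one range and reports where
// it stopped; loop until stoppedAtKey comes back null.
var nextKey = {};
var result;
while (nextKey != null) {
  result = db.adminCommand({ cleanupOrphaned: "test.test", startingFromKey: nextKey });
  if (result.ok != 1) {
    print("cleanupOrphaned failed or timed out; retry later");
    break;
  }
  nextKey = result.stoppedAtKey;
}
```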
SLIDE 39

Agenda

  • Read Preference configuration
  • Lagging secondaries and stale or missing/duplicated data
  • What queries can be safely run on secondaries?
  • Improving read throughput: sharding vs reading from secondaries

SLIDE 40

What queries can be run on secondaries?

SLIDE 41

What queries can be run on secondaries?

+ Counts
+ Offline data analysis
+ Backup jobs
+ Online queries that don’t require strong consistency

- Queries that require strong consistency

SLIDE 42

Agenda

  • Read Preference configuration
  • Lagging secondaries and stale or missing/duplicated data
  • What queries can be safely run on secondaries?
  • Improving read throughput: sharding vs reading from secondaries

SLIDE 43

Improving read throughput

Read from secondaries, sharding

SLIDE 44

Improving read throughput - Secondaries

+ Reduced read time in multi-data-center deployments
+ Efficient use of secondary indexes
+ Reduced network load
+ Reduced CPU load

- No immediate consistency
- Does not reduce index or working set size
- Every node in a replica set carries roughly the same write load

SLIDE 45

Improving read throughput - Sharding

+ Reduces index and working set size
+ Strong consistency
+ Also improves write throughput

- Queries on secondary indexes are inefficient (scatter-gather), so data de-normalization may be required
- Requires additional nodes for config servers and mongos

SLIDE 46

Further Readings

  • SERVER-5931: Secondary reads in sharded clusters need stronger consistency
  • Server Selection Specification
  • How to clean orphaned documents
SLIDE 47

Thank you

Contact: bartlomiej.nogas@allegrogroup.com