how to visually spot and analyze slow mongodb operations
play

How to visually spot and analyze slow MongoDB operations Kay Agahd - PowerPoint PPT Presentation

How to visually spot and analyze slow MongoDB operations Kay Agahd idealo internet GmbH About me Current location: Berlin/Germany Education: Engineer's degree, Software Engineering Experience: Software developer from 1998 - 2009


  1. How to visually spot and analyze slow MongoDB operations Kay Agahd idealo internet GmbH

  2. About me ● Current location: Berlin/Germany ● Education: Engineer's degree, Software Engineering ● Experience: ○ Software developer from 1998 - 2009 in Paris/France ○ Database engineer since 2012 at idealo in Berlin/Germany ● Certifications: ○ MongoDB certified Java developer (final grade 100%) ○ MongoDB certified DBA (final grade 96%) linkedin.com/in/kayag/ 2

  3. About idealo ● founded in 2000 ● Europe’s leading price comparison website and app ● Germany, Austria, United Kingdom, France, Italy and Spain ● > 1 billion offers online (November 2018) ● fast growing 3

  4. idealo and MongoDB ● different types of databases: Oracle, MySQL, PostgreSQL, MongoDB ● MongoDB in production since v1.6 (ca. 2011) ● sharding in production since MongoDB v1.8 ● MongoDB stores mainly offers for back-end usage ● > 2 billions docs in offerStore, up to 1 bn both read/writes per day ● > 10 billions docs in offerHistory 4

  5. Motivation Why we need a better MongoDB profiler

  6. Review profiling ● MongoDB supports profiling of “slow” operations ● “slow” is a threshold to be set when turning profiling on (100 ms) ● profiler writes collected data to a capped collection of the profiled database ● profiling per-database or per-instance on a running mongod 6

  7. Inconveniences ● each mongod and database needs to be handled separately ● sharding: shards * repl. factor * databases = #profilers ● gives only a view on a limited time span due to capped collection ● profiling/analyzing may add stress to the profiled server ● different formats of “query” field makes querying more difficult ● bug: ops through mongos omit the user (JIRA: SERVER-7538) 7

  8. idealo requirements ● easily switch on/off profiling, even for many mongod’s involved ● quick overview of types of slow-ops and their quantity within a time period (“types” means op type, user, server, query shape, etc.) ● historical view to see how slow-ops evolve to extrapolate them ● discovering spikes in time or in slow-op types ● filtering by slow-op types and/or time range to drill down ● usable also by non-database-admins, e.g. software developers 8

  9. What we’ve built MongoDB slow operations profiler

  10. How it works mongod 1 slow ops app DB 1 profiler 1..n collector DB DB 2 DB n mongod n profiler n..m DB n1 DB n2 DB nm 10

  11. Example of slow op document { "shardVersion": [ "nreturned": 185, "op": "query", Timestamp(14944, 25276), "responseLength": 119954, "ns": "offerStore.offer", ObjectId("591c6...8fcde") "protocol": "op_command", "query": { ] "millis": 4189, "find": "offer", }, "execStats": { "filter": { "keysExamined": 2210852, "stage": "PROJECTION", "shopId": 292731, "docsExamined": 232, "nReturned": 185, "opIds": { "cursorExhausted": true, "executionTimeMillisEstimate": 3941, "$in": [ "keyUpdates": 0, "works": 2210853, 29337,5478 "writeConflicts": 0, "advanced": 185, ] ... 124 lines omitted ... "numYield": 17272, }, "locks": { "offerStatus": { }, "Global": { "$gt": 0 "ts": ISODate("2018-10-26T07:17:12.747Z"), "acquireCount": { } "client": "10.135.128.219", "r": }, "allUsers": [ NumberLong("34546")}}, "projection": { { "Database": { "traceId": 1, "user": "__system", "acquireCount": { "bokey": 1, "db": "local" "r": "version": 1, } NumberLong("17273")}}, "offerTitle": 1 ], "Collection": { }, "user": "__system@local" "acquireCount": { "batchSize":1000, } "r": 11 ... NumberLong("17273")}} },

  12. Condense slow op documents { "_id":ObjectId("5bd3090b68b5c4203f53ce7e"), "ts":ISODate("2018-10-26T12:31:07.752Z"), "adr":"host523.idealo.de", "lbl":"offerStoreDE", "rs":"offerStoreDE09", "db":"offerStore", "col":"offers" "op":"getmore", "fields":["shopId", "opIds.$in", "offerStatus.$gt"], "sort":["_id"], "nret":500, "reslen":94656, "millis":5322, "user":"__system@local" } 12

  13. Some numbers of the collector DB ● > 250 millions slow ops stored within the last > 5 years ● average doc size = 238 Bytes ● uncompressed data size ca. 55 GB ● index size < 9 GB ● total storage size (snappy compression) < 12 GB 13

  14. Configuration ● Collector { "collector":{ "hosts":["myCollectorHost_member1:27017", "myCollectorHost_member2:27017", "myCollectorHost_member3:27017"], "db":"profiling", "collection":"slowops", "adminUser":"", "adminPw":"" }, ... 14

  15. Configuration ● databases to be profiled "profiled":[ { "label":"dbs foo", "hosts":["someHost1:27017", "someHost2:27017", "someHost3:27017"], "ns":["someDB.someCollection", "anotherDB.*"], "enabled": false }, { "label":"dbs bar", "hosts":["someMongoRouter1:27017","someMongoRouter2:27017"], "ns":["*.*"], "adminUser":"kay", "adminPw":"never.tell.it!:-)", "enabled":false, "slowMS":500, "responseTimeoutInMs":2000 } ],... 15

  16. How it looks 16

  17. How it looks - part 2 17

  18. Slow ops diagram 2018/10/30 10:04 = Count:95 db=offerStore coll=offers op=query fields=[shopId,mCC.$gt] Duration: avg:322 max:990 sum:31.682 ms 18

  19. Slow ops data table 19

  20. Slow ops search form 20

  21. Further benefits ● global collector allows to see evolution of slow ops ● aggregate slow ops of last minute grouped by label {$match: { ts: {$gt:from, $lt:to}} }, {$group: {_id: {label:"$lbl"}, count:{$sum:1}, sumMs:{$sum:"$millis"}, maxMs:{$max:"$millis"}, sumNret:{$sum:"$nret"}, sumResplen:{$sum:"$reslen"} } } ● send metrics i.e. count, durations and nReturned to graphite ● build grafana dashboard 21

  22. Historical view per DBS 22

  23. Further documentation ● Open source project, you are welcome to contribute: https://github.com/idealo/mongodb-slow-operations-profiler ● Blog post 1 of 2: https://medium.com/idealo-tech-blog/how-to-visually-spot- and-analyze-slow-mongodb-operations-d91ac819e0de ● Blog post 2 of 2: https://medium.com/idealo-tech-blog/mongodb-slow- operations-analyzer-2-0-24da414fad13 We are hiring: jobs.idealo.de 23

  24. Rate My Session 24

Download Presentation
Download Policy: The content available on the website is offered to you 'AS IS' for your personal information and use only. It cannot be commercialized, licensed, or distributed on other websites without prior consent from the author. To download a presentation, simply click this link. If you encounter any difficulties during the download process, it's possible that the publisher has removed the file from their server.

Recommend


More recommend