MongoDB Indexing Dr Janusz R. Getta School of Computing and - - PowerPoint PPT Presentation

mongodb indexing
SMART_READER_LITE
LIVE PREVIEW

MongoDB Indexing Dr Janusz R. Getta School of Computing and - - PowerPoint PPT Presentation

MongoDB Indexing file:///Users/jrg/235-2020-SPRING/SLIDES/WEEK11/26indexing/26indexing.html#1 CSCI235 Database Systems MongoDB Indexing Dr Janusz R. Getta School of Computing and Information Technology - University of Wollongong 1 of 25


slide-1
SLIDE 1

CSCI235 Database Systems

MongoDB Indexing

Dr Janusz R. Getta

School of Computing and Information Technology - University of Wollongong

MongoDB Indexing file:///Users/jrg/235-2020-SPRING/SLIDES/WEEK11/26indexing/26indexing.html#1 1 of 25 21/10/20, 11:50 pm

slide-2
SLIDE 2

Indexing

Outline

Overview of indexing Single key index Compound key index Sparse index Multikey index Hashed index Geospacial index Index administration

TOP Created by Janusz R. Getta, CSCI235 Database Systems, Spring 2020 2/25

MongoDB Indexing file:///Users/jrg/235-2020-SPRING/SLIDES/WEEK11/26indexing/26indexing.html#1 2 of 25 21/10/20, 11:50 pm

slide-3
SLIDE 3

Overview of indexing

Indexes significantly reduce amount of time needed to access the documents Without indexes all the documents in a collection must be accessed Single-key index is the most appropriate for {"key":"value"} query conditions For query conditions over multiple keys, e.g. {$and: [{"key1":"value1"},{"key2":"value2"}]} compound index is the best option If we have a compound index on (key1, key2) then the second index on key1 is not really needed, however it may still speed up an access a bit If we have a compound index on (key1, key2) then the second index on key2 speeds up access a lot An order of keys in a compound index is very important

TOP Created by Janusz R. Getta, CSCI235 Database Systems, Spring 2020 3/25

MongoDB Indexing file:///Users/jrg/235-2020-SPRING/SLIDES/WEEK11/26indexing/26indexing.html#1 3 of 25 21/10/20, 11:50 pm

slide-4
SLIDE 4

Indexing

Outline

Overview of indexing Single key index Compound key index Sparse index Multikey index Hashed index Geospacial index Index administration

TOP Created by Janusz R. Getta, CSCI235 Database Systems, Spring 2020 4/25

MongoDB Indexing file:///Users/jrg/235-2020-SPRING/SLIDES/WEEK11/26indexing/26indexing.html#1 4 of 25 21/10/20, 11:50 pm

slide-5
SLIDE 5

Single key indexes

An index on "_id" is an automatically created single-key index Equality search over the values of "_id" is the fasted possible search The following command creates a single key unique index on a key code The index is unique because it enforces uniqueness of the values associated with key code, i.e. each document in a collection has a different value associated with a key code An attempt to insert two documents with the same value of key code fails enforcing a key constraint Unique index should be create before inserting any data A unique index cannot be created on a collection where duplicate keys exist

db.department.createIndex( {"code": 1}, {"unique":true} )

createIndex() TOP Created by Janusz R. Getta, CSCI235 Database Systems, Spring 2020 5/25

MongoDB Indexing file:///Users/jrg/235-2020-SPRING/SLIDES/WEEK11/26indexing/26indexing.html#1 5 of 25 21/10/20, 11:50 pm

slide-6
SLIDE 6

Single key indexes

The following command creates a single key nonunique index on a key budget

db.department.createIndex( {"budget": 1}, {"unique": false} )

createIndex() TOP Created by Janusz R. Getta, CSCI235 Database Systems, Spring 2020 6/25

MongoDB Indexing file:///Users/jrg/235-2020-SPRING/SLIDES/WEEK11/26indexing/26indexing.html#1 6 of 25 21/10/20, 11:50 pm

slide-7
SLIDE 7

Indexing

Outline

Overview of indexing Single key index Compound key index Sparse index Multikey index Hashed index Geospacial index Index administration

TOP Created by Janusz R. Getta, CSCI235 Database Systems, Spring 2020 7/25

MongoDB Indexing file:///Users/jrg/235-2020-SPRING/SLIDES/WEEK11/26indexing/26indexing.html#1 7 of 25 21/10/20, 11:50 pm

slide-8
SLIDE 8

Compound key index

The following commands create the single key nonunique indexes on the keys budget and total_staff_number A query like uses only one of the indexes created above A query optimizer picks the more efficient index (with higher selectivity) To use both indexes we can traverse each index separately and calculate intersection of disk locations found

db.department.createIndex( {"budget":1}, {"unique":false} )

createIndex()

db.department.createIndex( {"total_staff_number":1}, {"unique":false} )

createindex()

db.department.find({"budget":2000, :"total_staff_number":5})

createIndex() TOP Created by Janusz R. Getta, CSCI235 Database Systems, Spring 2020 8/25

MongoDB Indexing file:///Users/jrg/235-2020-SPRING/SLIDES/WEEK11/26indexing/26indexing.html#1 8 of 25 21/10/20, 11:50 pm

slide-9
SLIDE 9

Compound key index

The following commands create a compound key nonunique index on the keys budget and total_staff_number A compound index is a single index where each entry is composed of more than one key A compound index is used by a query

db.department.createIndex( {"budget":1, "total_staff_number":1}, {"unique":false} )

createIndex()

db.department.find({"budget":2000, :"total_staff_number":5})

find() TOP Created by Janusz R. Getta, CSCI235 Database Systems, Spring 2020 9/25

MongoDB Indexing file:///Users/jrg/235-2020-SPRING/SLIDES/WEEK11/26indexing/26indexing.html#1 9 of 25 21/10/20, 11:50 pm

slide-10
SLIDE 10

Indexing

Outline

Overview of indexing Single key index Compound key index Sparse index Multikey index Hashed index Geospacial index Index administration

TOP Created by Janusz R. Getta, CSCI235 Database Systems, Spring 2020 10/25

MongoDB Indexing file:///Users/jrg/235-2020-SPRING/SLIDES/WEEK11/26indexing/26indexing.html#1 10 of 25 21/10/20, 11:50 pm

slide-11
SLIDE 11

Sparse index

MongoDB indexes are dense by default In a dense index for every document there is an index key even the document lacks a key Then there exists a null entry in an index and it is possible to use an index for a query like Dense index is inconvenient when: In a sparse index, only documents that have a value for the indexed key are indexed

db.department.find( {"budget":null} )

find()

unique index on a field that doesn’t appear in every document in a collection is needed a large number of documents in a collection have no indexed key

  • db.department.createIndex( {"total_staff_number":1},

{"unique":false, "sparse":true} )

createIndex() TOP Created by Janusz R. Getta, CSCI235 Database Systems, Spring 2020 11/25

MongoDB Indexing file:///Users/jrg/235-2020-SPRING/SLIDES/WEEK11/26indexing/26indexing.html#1 11 of 25 21/10/20, 11:50 pm

slide-12
SLIDE 12

Indexing

Outline

Overview of indexing Single key index Compound key index Sparse index Multikey index Hashed index Geospacial index Index administration

TOP Created by Janusz R. Getta, CSCI235 Database Systems, Spring 2020 12/25

MongoDB Indexing file:///Users/jrg/235-2020-SPRING/SLIDES/WEEK11/26indexing/26indexing.html#1 12 of 25 21/10/20, 11:50 pm

slide-13
SLIDE 13

Multikey index

In multikey index multiple entries in the index reference the same document Multikey index is useful for indexing fields whose values are arrays Each value in this courses.code array will appear in the index A query on any array values can use the index to locate the document

db.department.createIndex( {"course.code":1} )

createIndex() TOP Created by Janusz R. Getta, CSCI235 Database Systems, Spring 2020 13/25

MongoDB Indexing file:///Users/jrg/235-2020-SPRING/SLIDES/WEEK11/26indexing/26indexing.html#1 13 of 25 21/10/20, 11:50 pm

slide-14
SLIDE 14

Indexing

Outline

Overview of indexing Single key index Compound key index Sparse index Multikey index Hashed index Geospacial index Index administration

TOP Created by Janusz R. Getta, CSCI235 Database Systems, Spring 2020 14/25

MongoDB Indexing file:///Users/jrg/235-2020-SPRING/SLIDES/WEEK11/26indexing/26indexing.html#1 14 of 25 21/10/20, 11:50 pm

slide-15
SLIDE 15

Hashed index

In hashed index the keys become the arguments of a hash function and a results of of hash function determine location of a document in a hash bucket The hashed values will determine the ordering of the documents Hashed indexes have the following restructions: Hashed indexes are used for sharding

db.department.createIndex( {"name":"hashed"} )

Indexing

equality queries can be processed with an index range queries cannot use hashed index multikey hashed indexes are not allowed floating-point values are cast to an integer before being hashed; 1.4 and 1.5 will have the same value in hashed index

  • TOP Created by Janusz R. Getta, CSCI235 Database Systems, Spring 2020

15/25

MongoDB Indexing file:///Users/jrg/235-2020-SPRING/SLIDES/WEEK11/26indexing/26indexing.html#1 15 of 25 21/10/20, 11:50 pm

slide-16
SLIDE 16

Indexing

Outline

Overview of indexing Single key index Compound key index Sparse index Multikey index Hashed index Geospacial index Index administration

TOP Created by Janusz R. Getta, CSCI235 Database Systems, Spring 2020 16/25

MongoDB Indexing file:///Users/jrg/235-2020-SPRING/SLIDES/WEEK11/26indexing/26indexing.html#1 16 of 25 21/10/20, 11:50 pm

slide-17
SLIDE 17

Geospacial index

Geospacial index allows to find the documents that are close to a given location, based on latitude and longitude values stored in each document Geospacial index can be used to efficiently calculate geographic distances, including the curvature of the earth MongoDB supports different kinds of indexes, however, only the first two types of indexes listed below can be combined to a compound index

1: Ascending B*-tree index

  • 1: Descending B*-tree index

"hashed": Hashtable index; very fast for lookup by exact value, especially in very large collections. But it is not usable for inexact queries ("$gt", "$regex" or similar) "text": Text index designed for searching for words in strings with natural language "2d": Geospatial index on a flat plane "2dsphere": Geospatial index on a sphere

  • TOP Created by Janusz R. Getta, CSCI235 Database Systems, Spring 2020

17/25

MongoDB Indexing file:///Users/jrg/235-2020-SPRING/SLIDES/WEEK11/26indexing/26indexing.html#1 17 of 25 21/10/20, 11:50 pm

slide-18
SLIDE 18

Indexing

Outline

Overview of indexing Single key index Compound key index Sparse index Multikey index Hashed index Geospacial index Index administration

TOP Created by Janusz R. Getta, CSCI235 Database Systems, Spring 2020 18/25

MongoDB Indexing file:///Users/jrg/235-2020-SPRING/SLIDES/WEEK11/26indexing/26indexing.html#1 18 of 25 21/10/20, 11:50 pm

slide-19
SLIDE 19

Index administration

Listing the indexes

db.department.getIndexes()

find()

[ { "v" : 2, "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "test.department" } ]

Results TOP Created by Janusz R. Getta, CSCI235 Database Systems, Spring 2020 19/25

MongoDB Indexing file:///Users/jrg/235-2020-SPRING/SLIDES/WEEK11/26indexing/26indexing.html#1 19 of 25 21/10/20, 11:50 pm

slide-20
SLIDE 20

Index administration

Creating an index

db.department.createIndex( {"name":"hashed"} )

createIndex()

{ "createdCollectionAutomatically" : false, "numIndexesBefore" : 1, "numIndexesAfter" : 2, "ok" : 1 }

Results TOP Created by Janusz R. Getta, CSCI235 Database Systems, Spring 2020 20/25

MongoDB Indexing file:///Users/jrg/235-2020-SPRING/SLIDES/WEEK11/26indexing/26indexing.html#1 20 of 25 21/10/20, 11:50 pm

slide-21
SLIDE 21

Index administration

Listing the indexes

db.department.getIndexes()

getindexes()

[ { "v" : 2, "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "test.department" }, { "v" : 2, "key" : { "name" : "hashed" }, "name" : "name_hashed", "ns" : "test.department" } ]

Results TOP Created by Janusz R. Getta, CSCI235 Database Systems, Spring 2020 21/25

MongoDB Indexing file:///Users/jrg/235-2020-SPRING/SLIDES/WEEK11/26indexing/26indexing.html#1 21 of 25 21/10/20, 11:50 pm

slide-22
SLIDE 22

Index administration

Delete an index name_hashed

db.department.dropIndex("name_hashed")

dropIndex()

db.department.getIndexes()

getIndexes()

[ { "v" : 2, "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "test.department" } ]

Results TOP Created by Janusz R. Getta, CSCI235 Database Systems, Spring 2020 22/25

MongoDB Indexing file:///Users/jrg/235-2020-SPRING/SLIDES/WEEK11/26indexing/26indexing.html#1 22 of 25 21/10/20, 11:50 pm

slide-23
SLIDE 23

Index administration

Creation of indexes before loading data allows indexes to be built incrementally as the data is inserted Creation of an index on an already loaded collection may take a long time It is possible to create an index in the background without closing a database system It is possible to create an index in offline by taking a replica node offline, building an index, and taking node online allowing the node to catch up with master replica node When ready we can promote a node to primary and take another secondary node offline, etc.

db.department.createIndex( {"total_staff_number":1},{"background":true} )

createIndex() TOP Created by Janusz R. Getta, CSCI235 Database Systems, Spring 2020 23/25

MongoDB Indexing file:///Users/jrg/235-2020-SPRING/SLIDES/WEEK11/26indexing/26indexing.html#1 23 of 25 21/10/20, 11:50 pm

slide-24
SLIDE 24

Index administration

It is possible to re-build indexes in order to defragment them after a lot

  • f updates

db.department.reIndex()

reIndex()

{ "nIndexesWas" : 1, "nIndexes" : 1, "indexes" : [ { "v" : 2, "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "test.department" } ], "ok" : 1 }

Results TOP Created by Janusz R. Getta, CSCI235 Database Systems, Spring 2020 24/25

MongoDB Indexing file:///Users/jrg/235-2020-SPRING/SLIDES/WEEK11/26indexing/26indexing.html#1 24 of 25 21/10/20, 11:50 pm

slide-25
SLIDE 25

References

Chodorow K. MongoDB The Definitive Guide, O'Reilly, 2013 Banker K., Bakkum P., Verch S., Garret D., Hawkins T., MongoDB in Action, 2nd ed., Manning Publishers, 2016 MongoDB Manual, Indexes https://docs.mongodb.com/manual /indexes/

TOP Created by Janusz R. Getta, CSCI235 Database Systems, Spring 2020 25/25

MongoDB Indexing file:///Users/jrg/235-2020-SPRING/SLIDES/WEEK11/26indexing/26indexing.html#1 25 of 25 21/10/20, 11:50 pm