MongoDB: Open-source, high-performance, document-oriented database



slide-1
SLIDE 1

MongoDB

Open-source, high-performance, document-oriented database

Jay Urbain, PhD https://docs.mongodb.com/

slide-2
SLIDE 2

Modern Databases

  • Non-relational data stores (“NoSQL”)
      – Key/value
      – Horizontal scalability, no table joins
      – Hive, Dynamo, Big Table, CouchDB, Redis, MongoDB,…
  • Next generation OLAP (OnLine Analytical Processing)
      – Dimensional data model
      – Column store
      – Vertica, Aster, Greenplum
  • RDBMS OLTP (OnLine Transaction Processing)
      – Relational model
      – Transaction oriented
      – Oracle, MySQL, PostgreSQL

slide-3
SLIDE 3

Database Market

NoSQL

slide-4
SLIDE 4

Non-relational Database Characteristics

  • No joins, no complex transactions => horizontally scalable architectures.
  • Transactions at the individual table row or document level
  • Flexible data models => schema on read
  • Many do not use SQL for queries; queries are often expressed as JSON objects instead
  • Many are moving to support some level of SQL
  • Improved ways to develop applications? Depends…
slide-5
SLIDE 5

Non-relational Data Models

  • Key/Value – Memcached, Amazon Dynamo
  • Tabular – Google Bigtable, Impala
  • Document oriented – MongoDB, CouchDB, other JSON stores
slide-6
SLIDE 6

Relational vs. non-Relational

  • BASE (Basically Available, Soft state, Eventual consistency) analysis of NoSQL.
  • ACID (Atomicity, Consistency, Isolation, and Durability) versus BASE.
  • CAP theorem – get 2 of 3: consistency, availability, partition tolerance

slide-7
SLIDE 7

Deciding Factors

  • Use case
  • Transactional support
  • Ad hoc query
  • Analytical processing
  • Reliability
  • Maintainability
  • Ease of Use
  • Scalability
  • Cost
slide-8
SLIDE 8

Document oriented database

  • Document-oriented databases (DOD) are designed for storing, retrieving and managing document-oriented information (semi-structured data).
  • Popular with web applications.
  • DODs are one of the main categories of NoSQL databases.
  • DODs are a subclass of the key-value store NoSQL database.
  • In a key-value store, the data is considered to be opaque to the database.
  • A DOD relies on the internal structure of the document in order to extract metadata that the database engine uses for further optimization.

slide-9
SLIDE 9

Document oriented database

  • Document databases contrast with traditional relational databases (RDBs).
  • RDBs store data in separate tables that are defined a priori, and a single object may be spread across several tables.
  • DODs store all information for a given object in a single instance in the database, and every stored object can be different from every other.
  • This eliminates the need for object-relational mapping while loading data into the database.
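To make the contrast concrete, here is a small sketch in plain Python with no database involved. The blog-post fields, the table names, and the helper `assemble_post` are hypothetical and chosen only for illustration.

```python
import json

# Relational style: one logical object spread across several "tables",
# linked by keys and reassembled with a join.
posts = [{"post_id": 1, "author_id": 10, "title": "Hello MongoDB"}]
authors = [{"author_id": 10, "name": "Ada"}]
tags = [{"post_id": 1, "tag": "nosql"}, {"post_id": 1, "tag": "json"}]


def assemble_post(post_id):
    """Rebuild one post object from the three tables (a manual join)."""
    post = next(p for p in posts if p["post_id"] == post_id)
    author = next(a for a in authors if a["author_id"] == post["author_id"])
    return {
        "_id": post["post_id"],
        "title": post["title"],
        "author": {"name": author["name"]},
        "tags": [t["tag"] for t in tags if t["post_id"] == post_id],
    }


# Document style: the same object stored as one self-contained document.
post_document = {
    "_id": 1,
    "title": "Hello MongoDB",
    "author": {"name": "Ada"},
    "tags": ["nosql", "json"],
}

# Another document in the same collection may have a different shape.
other_document = {"_id": 2, "title": "Untitled draft", "draft": True}

print(json.dumps(assemble_post(1), sort_keys=True))
```

The document form needs no join to read the whole object back, which is the point the slide makes about avoiding object-relational mapping.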

slide-10
SLIDE 10

MongoDB History (name derived from "humongous")

  • Most popular document-oriented database.
  • Designed and developed by founders of DoubleClick, ShopWiki, Gilt Groupe, etc.
  • GOAL: create a high-performance, fully consistent, horizontally scalable, general-purpose data store.
  • MongoDB uses JSON-like documents with optional schemas.
  • Coding started fall 2007
  • Open Source – AGPL, written in C++
  • First production site March 2008
  • Current version (as of 2018): ~3.6.9
slide-11
SLIDE 11

MongoDB

  • MongoDB is a distributed database at its core, so high availability, horizontal scaling, and geographic distribution are built in and easy to use.
      – MongoDB scales horizontally using sharding.
      – MongoDB can run over multiple servers, balancing the load or duplicating data to keep the system up and running in case of hardware failure.
  • Stores data in flexible, JSON-like documents, meaning fields can vary from document to document and the data structure can be changed over time.
  • The document model maps to the objects in your application code, making data easy to work with. No object-relational mapping.
  • Ad hoc queries, indexing, and real-time aggregation provide ways to access and analyze your data.
  • Field, range query, and regular expression searches.
  • Fields in a MongoDB document can be indexed with primary and secondary indices.
  • File storage
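The field, range, and regular-expression searches mentioned above are written as filter documents. The sketch below shows three such filters using the MongoDB operators `$gte`, `$lt`, and `$regex`. So that it runs without a server, it evaluates them with a tiny in-memory matcher; `matches`, `find`, and the sample `people` data are hypothetical stand-ins and cover only this small subset of the query language, not MongoDB's actual query engine.

```python
import re


def matches(doc, query):
    """Return True if doc satisfies a small subset of a MongoDB-style filter."""
    for field, cond in query.items():
        if field not in doc:
            return False
        value = doc[field]
        if isinstance(cond, dict):
            # Operator form, e.g. {"age": {"$gte": 20, "$lt": 30}}
            for op, arg in cond.items():
                if op == "$gte":
                    ok = value >= arg
                elif op == "$lt":
                    ok = value < arg
                elif op == "$regex":
                    ok = isinstance(value, str) and re.search(arg, value) is not None
                else:
                    raise ValueError(f"unsupported operator: {op}")
                if not ok:
                    return False
        elif value != cond:
            # Plain form, e.g. {"name": "Bob"}, means equality
            return False
    return True


def find(docs, query):
    """Return all documents that match the filter."""
    return [d for d in docs if matches(d, query)]


people = [
    {"_id": 1, "name": "Alice", "age": 34},
    {"_id": 2, "name": "Bob", "age": 19},
    {"_id": 3, "name": "Alan", "age": 27},
]

field_query = {"name": "Bob"}                    # field (equality) search
range_query = {"age": {"$gte": 20, "$lt": 30}}   # range search
regex_query = {"name": {"$regex": "^Al"}}        # regular expression search

for query in (field_query, range_query, regex_query):
    print(query, [d["_id"] for d in find(people, query)])
```

With PyMongo, the same filter documents would be passed to a collection's `find()` method instead of the stand-in `find` used here.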
slide-12
SLIDE 12

JSON-style Documents represented as BSON

BSON is a binary-encoded serialization of JSON-like documents.
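As a rough illustration of what "binary-encoded" means, the sketch below hand-encodes a document using only the standard library. It assumes the BSON layout of a little-endian int32 total length, a sequence of typed elements, and a trailing zero byte, and it handles only strings (type 0x02) and 32-bit integers (type 0x10). The function names are hypothetical; real applications would use the `bson` package that ships with PyMongo.

```python
import struct


def encode_cstring(text):
    """Element names are stored as zero-terminated UTF-8 strings."""
    return text.encode("utf-8") + b"\x00"


def encode_element(name, value):
    """Encode one key/value pair; only str and int32 are supported here."""
    if isinstance(value, str):
        data = value.encode("utf-8") + b"\x00"
        # type byte, name, int32 byte length (including terminator), bytes
        return b"\x02" + encode_cstring(name) + struct.pack("<i", len(data)) + data
    if isinstance(value, int) and not isinstance(value, bool):
        # struct.pack raises struct.error if the value does not fit in int32
        return b"\x10" + encode_cstring(name) + struct.pack("<i", value)
    raise TypeError(f"unsupported type: {type(value).__name__}")


def encode_document(doc):
    """Encode a flat dict as: int32 total size, elements, terminating zero."""
    body = b"".join(encode_element(k, v) for k, v in doc.items()) + b"\x00"
    return struct.pack("<i", len(body) + 4) + body


encoded = encode_document({"hello": "world"})
print(len(encoded), encoded)
```

The length prefixes are what let a reader skip over fields without parsing them, which plain JSON text cannot do.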

slide-13
SLIDE 13

Flexible Schemas

slide-14
SLIDE 14

Replication

slide-15
SLIDE 15

Auto-sharding
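A minimal sketch of the idea behind sharding, assuming a hash of the shard key decides where a document lives. The function `shard_for`, the sample documents, and the shard count are hypothetical. MongoDB itself assigns ranges of key values (chunks) to shards and rebalances them automatically, so the simple modulo used here is a simplification, not the actual algorithm.

```python
import hashlib


def shard_for(shard_key_value, num_shards):
    """Map a shard key value to a shard index with a stable hash."""
    digest = hashlib.md5(str(shard_key_value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


NUM_SHARDS = 3
documents = [{"_id": i, "user": f"user{i}"} for i in range(10)]

# Group the documents by the shard their key hashes to.
shards = {n: [] for n in range(NUM_SHARDS)}
for doc in documents:
    shards[shard_for(doc["user"], NUM_SHARDS)].append(doc["_id"])

for shard_index, ids in shards.items():
    print(f"shard {shard_index}: {ids}")
```

Because the mapping depends only on the key, any router can compute which shard to ask for a given document without consulting the others.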

slide-16
SLIDE 16

Use Cases

  • Good use cases
      – Scaling out
      – Caching
      – The Web
      – High volume
      – Simple data models
  • Bad use cases
      – Highly transactional
      – Ad hoc business intelligence
      – Problems that require SQL
      – Complex relational data models

slide-17
SLIDE 17

MongoDB Basics

  • A collection is like a relational table.
  • Collections contain documents.
  • A document within a collection is like a record (row) within a table.
  • Each document has an _id that is unique across all documents within a collection.

slide-18
SLIDE 18

JSON Documents

  • Rich data models
  • Seamlessly map to native programming language types
  • Flexible for dynamic data
  • Better data locality
slide-19
SLIDE 19

Javascript Post - API

slide-20
SLIDE 20

Find posts by author - API

slide-21
SLIDE 21

Last ten posts - API

slide-22
SLIDE 22

RESTful Queries in MongoDB

  • The Mongo model update function takes three arguments:
      – query – JSON object of matching properties to identify the document to update
      – data – JSON object specifying the properties to update
      – callback – function that is called with the number of modified documents
  • The data to update is retrieved from the request body, which is used to pass in larger chunks of data, often stored as a single JSON object.
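The three-argument shape described above can be sketched in Python with an in-memory list standing in for the collection. The `update` function and the sample `projects` data are hypothetical and only mirror the query/data/callback roles; this is not the model library's implementation.

```python
def update(documents, query, data, callback):
    """Apply `data` to every document matching `query`, then report the count."""
    modified = 0
    for doc in documents:
        # query: every listed property must be present and equal
        if all(key in doc and doc[key] == value for key, value in query.items()):
            # data: only the listed properties are changed
            if any(doc.get(key) != value for key, value in data.items()):
                doc.update(data)
                modified += 1
    # callback: called with the number of modified documents
    callback(modified)


projects = [
    {"_id": "p1", "name": "demo", "numberofsaves": "271"},
    {"_id": "p2", "name": "other", "numberofsaves": "5"},
]

results = []
update(projects, {"_id": "p1"}, {"numberofsaves": "272"}, results.append)
print(results, projects[0])
```

Passing only the properties to change, rather than the whole document, keeps the request body small and leaves untouched fields as they were.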

slide-23
SLIDE 23

RESTful Queries in MongoDB

  • The JSON object passed in corresponds to the Mongo database schema defining the project documents and includes only the model properties to modify.
  • Example: we can use curl to update a specific property, e.g., numberofsaves, in a specific project's data:

$ curl -i -X PUT -H 'Content-Type: application/json' -d '{"numberofsaves": "272"}' http://localhost:3001/api/v1/projects/5593c8792fee421039c0afe6

  • It sends a PUT request with JSON content to the project update endpoint.

slide-24
SLIDE 24

RESTful Queries in MongoDB

  • The -i option requests that the response headers be included in the output.
  • The -X option specifies the HTTP method.
  • The -d argument specifies the request body: the JSON object with the properties to modify.
  • The routing URL includes the API version number and ends with the Mongo database id of the project to update.
  • Curl prints the following response to this request:

HTTP/1.1 202 Accepted
Content-Type: text/plain; charset=utf-8
Content-Length: 8
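The same PUT request can be built from Python with the standard library. This is a sketch that reuses the URL, project id, and property from the curl example; the helper name `build_update_request` is hypothetical. The request is only constructed here, since sending it requires the API server from the slides to be running.

```python
import json
import urllib.request


def build_update_request(base_url, project_id, properties):
    """Build (but do not send) a PUT request carrying a JSON body."""
    body = json.dumps(properties).encode("utf-8")   # curl: -d '{...}'
    return urllib.request.Request(
        url=f"{base_url}/api/v1/projects/{project_id}",
        data=body,
        method="PUT",                                # curl: -X PUT
        headers={"Content-Type": "application/json"},  # curl: -H '...'
    )


request = build_update_request(
    "http://localhost:3001",
    "5593c8792fee421039c0afe6",
    {"numberofsaves": "272"},
)
print(request.get_method(), request.full_url)

# To actually send it (needs the server listening on localhost:3001):
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.read())
```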

slide-25
SLIDE 25

Add a new record

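Python:

    tasks_results = mongo.db.tasks
    # insert() is deprecated in PyMongo 3; insert_one() exposes the new _id as inserted_id
    _id = tasks_results.insert_one(task_).inserted_id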

Javascript:

slide-26
SLIDE 26

Delete a record

Python:

    tasks_results = mongo.db.tasks
    # delete_one() removes at most one document matching the filter
    result = tasks_results.delete_one({"id": str(task_id)})

Javascript:

Demo MongoDB curl commands if time

slide-27
SLIDE 27

JSON Serialization

slide-28
SLIDE 28

json_util provides two helper methods, dumps and loads, that wrap the native JSON methods and provide explicit BSON conversion to and from JSON.

slide-29
SLIDE 29

json_util – Tools for using Python’s json module with BSON documents

Example usage (serialization):

    >>> from bson import Binary, Code
    >>> from bson.json_util import dumps
    >>> dumps([{'foo': [1, 2]},
    ...        {'bar': {'hello': 'world'}},
    ...        {'code': Code("function x() { return 1; }", {})},
    ...        {'bin': Binary(b"\x01\x02\x03\x04")}])
    '[{"foo": [1, 2]}, {"bar": {"hello": "world"}}, {"code": {"$code": "function x() { return 1; }", "$scope": {}}}, {"bin": {"$binary": "AQIDBA==", "$type": "00"}}]'

slide-30
SLIDE 30

json_util – Tools for using Python’s json module with BSON documents

Example usage (deserialization):

    >>> from bson.json_util import loads
    >>> loads('[{"foo": [1, 2]}, {"bar": {"hello": "world"}}, {"code": {"$scope": {}, "$code": "function x() { return 1; }"}}, {"bin": {"$type": "80", "$binary": "AQIDBA=="}}]')
    [{u'foo': [1, 2]}, {u'bar': {u'hello': u'world'}}, {u'code': Code('function x() { return 1; }', {})}, {u'bin': Binary('...', 128)}]
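To see why a wrapper around the native JSON methods is needed, the stdlib-only sketch below shows that plain `json.dumps` rejects binary data, then adds a pair of hooks in the same spirit as the examples above. The hook names are hypothetical and the sketch handles only `bytes`; it is an illustration of the idea, not how `bson.json_util` is implemented.

```python
import base64
import json


def bson_like_default(value):
    """Called by json.dumps for values it cannot serialize on its own."""
    if isinstance(value, bytes):
        encoded = base64.b64encode(value).decode("ascii")
        return {"$binary": encoded, "$type": "00"}
    raise TypeError(f"not JSON serializable: {type(value).__name__}")


def bson_like_hook(obj):
    """Called by json.loads for every decoded JSON object."""
    if set(obj) == {"$binary", "$type"}:
        return base64.b64decode(obj["$binary"])
    return obj


data = [{"foo": [1, 2]}, {"bin": b"\x01\x02\x03\x04"}]

try:
    json.dumps(data)
    plain_json_failed = False
except TypeError:
    # The native encoder has no representation for bytes.
    plain_json_failed = True

text = json.dumps(data, default=bson_like_default)
restored = json.loads(text, object_hook=bson_like_hook)
print(text)
```

The round trip restores the original bytes, which is the "explicit conversion to and from JSON" that the helper methods provide for the full set of BSON types.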