Introduction to NoSQL Instructor: Ekpe Okorafor 1. Big Data Academy - Accenture 2. Computer Science - African University of Science & Technology

Agenda • Introduction • Technical Overview • Use Cases • Under The Hood: Compare & Contrast

What Is NoSQL? NoSQL is a bit like cloud computing: an umbrella term. NoSQL: • Data stores that avoid the relational model • Use other data models

NoSQL == Not Relational Typical NoSQL characteristics: • No schema • No joins • Usually distributed • Usually replicated • Usually not ACID • No SQL
Relational databases have been a successful technology for twenty years, providing persistence, concurrency control, and an integration mechanism.

Why NoSQL? Definitely consider NoSQL if you have: • A need to scale horizontally without investing in expensive large servers and storage area networks (SAN) • A requirement to control 99th-percentile latency • A requirement for rapid development in a coder-friendly environment
NoSQL is a better match for some companies than for others. For many industry needs, a traditional RDBMS will work adequately.

Other Reasons Problems that don't require an RDBMS: • Data access by primary key only • Data join not needed • Write-intensive and continuously • Data model is a single set of items
These problems don't necessarily require a relational database, and other data models and solutions can be considered.

Look At The Trends The enterprise data landscape is changing
Traditional RDBMS model → Emerging database model
• Fixed data structure → Weakly structured data
• Schema creation → Schemaless approach
• Simple access patterns → Applications are more social
• 1 write, many reads → Many writers, many readers
• Authorship constrained → Authorship is universal
• Few writers, many readers → Anyone can read and write
• Fixed data location → Data creation/access is global
• Central data model → Distributed data set model
Traditional "relational" databases are not designed to manage emerging data types.

What It All Means Enterprises have a cost-effective option to: • Undertake data problems previously thought too difficult or impossible to solve using traditional legacy relational databases • Tap into huge unstructured data sources from emerging platforms for data analysis and business intelligence • Derive connected intelligence using graph database methods as data becomes increasingly complex and highly connected

What Should Be Done
• NoSQL business enterprise data model analysis
• Key-value pair databases are frequently found in caching and fast-lookup apps
• Column-oriented databases power sensor networks, such as with SETI and NASA
• Document-based databases are often used in place of key-value pair databases when richer querying is required
• Graph databases can match social graphs and simplify relationship navigation
[Diagram: NoSQL data models (key-value pair, column-oriented, document-based, graph) with example workloads: Web Analytics; Online booking/itinerary management and search; Large Sensor Networks; Social Networks; Social Network Data Analysis; Web App User Data Analysis; Semantic Data Analysis; Document Archive Management]

Making The Right Choice Consider the key motivation & business need • Just as transactional & analytical processing needs lead to technologies optimized for OLTP and OLAP • Align the critical motivation and business needs to the desired NoSQL solution
Big Data • Large volume of data • Storage and processing requirements • Column-oriented and key-value stores are well suited to big data environments, providing big data intelligence
Convenience • Simple to set up, ease of use and schema-less data • Knowledge about the individual • Key-value and document stores help solve problems related to atomic intelligence
Connectedness • Complex and connected data • Knowledge about the networks and relationships • Graph databases can markedly improve one's ability to leverage connected intelligence

NoSQL Systems An alternative to traditional RDBMS, providing: • Flexible schema • Quicker/cheaper to set up • Massive scalability • Relaxed consistency → higher performance & availability
At a cost: • No declarative query language → more programming • Relaxed consistency → fewer guarantees

NoSQL Systems Data Models • "NoSQL" = "Not Only SQL": not every data management/analysis problem is best solved exclusively using a traditional RDBMS • Current NoSQL data model types include: o Key-value pair o Document-based o Column oriented o Graph database

[Figure: data models plotted by size and complexity: key-value pair, column oriented, document based, graph]

Key-Value Pair Frequently found in caching and fast-lookup apps • Extremely simple interface o Data model: (key, value) pairs o Operations: Insert(key, value), Fetch(key), Update(key), Delete(key) • Implementation: efficiency, scalability, fault-tolerance o Records distributed to nodes based on keys o Replication o Single-record transactions, "eventual consistency" • Example systems o Redis, Riak
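
The interface and the "records distributed to nodes based on keys" idea from this slide can be sketched in a few lines. This is a minimal in-memory illustration, not how Redis or Riak are implemented: the class name `KeyValueStore`, the `node_count` and `replicas` parameters, and the hash-modulo placement are all assumptions made for the example, and `update` takes a new value even though the slide lists only `Update(key)`.

```python
import hashlib


class KeyValueStore:
    """Toy key-value store that spreads records across nodes by key hash.

    Hypothetical sketch: each "node" is an in-memory dict, and replication
    simply copies the record to the next node(s) in line.
    """

    def __init__(self, node_count=3, replicas=2):
        self.nodes = [dict() for _ in range(node_count)]
        self.replicas = replicas

    def _owners(self, key):
        # Hash the key to pick a primary node, then add the following
        # nodes as replicas (wrapping around the node list).
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        primary = int(digest, 16) % len(self.nodes)
        return [(primary + i) % len(self.nodes) for i in range(self.replicas)]

    def insert(self, key, value):
        for index in self._owners(key):
            self.nodes[index][key] = value

    def fetch(self, key):
        # Read from the first owner that holds the key.
        for index in self._owners(key):
            if key in self.nodes[index]:
                return self.nodes[index][key]
        return None

    def update(self, key, value):
        # Only existing records can be updated.
        if not any(key in self.nodes[i] for i in self._owners(key)):
            raise KeyError(key)
        self.insert(key, value)

    def delete(self, key):
        for index in self._owners(key):
            self.nodes[index].pop(key, None)


store = KeyValueStore()
store.insert("session:42", {"user": "ada"})
print(store.fetch("session:42"))
```

Because every operation touches a single key, each call only needs to contact the nodes that own that key, which is what makes this model easy to scale horizontally.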

Document-Based Used when richer key-value querying is required • Like key-value store except value is a document o Data model: (key, document) pairs o Document: JSON, XML, other semi-structured formats o Basic operations: Insert(key, document), Fetch(key), Update(key), Delete(key) • Example systems o CouchDB, MongoDB, Riak, ...
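
To make "richer querying" concrete, here is a small sketch of a document store whose values are JSON-compatible documents that can be searched by field, not only fetched by key. The class `DocumentStore` and its `find` method are hypothetical names for illustration and do not correspond to the CouchDB or MongoDB APIs.

```python
import json


class DocumentStore:
    """Toy (key, document) store with a simple field-equality query."""

    def __init__(self):
        self._docs = {}

    def insert(self, key, document):
        # Round-trip through JSON so only JSON-compatible data is stored
        # and the stored copy is independent of the caller's object.
        self._docs[key] = json.loads(json.dumps(document))

    def fetch(self, key):
        return self._docs.get(key)

    def update(self, key, fields):
        # Merge new top-level fields into an existing document.
        self._docs[key].update(fields)

    def delete(self, key):
        self._docs.pop(key, None)

    def find(self, **criteria):
        """Return keys of documents whose top-level fields match all criteria."""
        return [
            key
            for key, doc in self._docs.items()
            if all(f in doc and doc[f] == v for f, v in criteria.items())
        ]


db = DocumentStore()
db.insert("u1", {"name": "Ada", "city": "Lagos"})
db.insert("u2", {"name": "Bob", "city": "Abuja", "tags": ["admin"]})
print(db.find(city="Lagos"))
```

Note that the two documents have different fields: there is no schema forcing `u1` to carry a `tags` field, which is the flexibility the slide refers to.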

Column Oriented Used to handle very large data sizes and massive write loads • Like key-value store except value is a set of columns o Data model: columnar stores o Structured data designed to scale to large size o Basic operations: • Example systems o HBase, Cassandra
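
The slide does not list operations for this model, so the following is only a sketch under stated assumptions: a simplified wide-column layout where each row key maps to column families, and each family holds whatever columns that row happens to have. The names `ColumnFamilyStore`, `put`, `get`, and `scan_column` are hypothetical and are not taken from the HBase or Cassandra APIs.

```python
from collections import defaultdict


class ColumnFamilyStore:
    """Toy wide-column store: row key -> column family -> column -> value."""

    def __init__(self, families):
        # Column families are fixed up front; columns inside them are not.
        self.families = set(families)
        self.rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, family, column, value):
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        self.rows[row_key][family][column] = value

    def get(self, row_key, family, column, default=None):
        # .get() avoids creating empty rows as a side effect of reads.
        return self.rows.get(row_key, {}).get(family, {}).get(column, default)

    def scan_column(self, family, column):
        """Yield (row_key, value) for rows that actually contain the column."""
        for row_key in sorted(self.rows):
            columns = self.rows[row_key].get(family, {})
            if column in columns:
                yield row_key, columns[column]


store = ColumnFamilyStore(["readings"])
store.put("sensor-1", "readings", "temp", 19.0)
store.put("sensor-1", "readings", "humidity", 0.4)
store.put("sensor-2", "readings", "temp", 21.5)
print(list(store.scan_column("readings", "temp")))
```

Rows are sparse here: `sensor-2` stores no `humidity` column at all, so a scan over that column skips it without reading a placeholder value.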

Graph Database Used to simplify relationship navigation • Graph database systems o Data model: nodes and edges o Nodes may have properties (including ID) o Edges may have labels or roles o Interfaces and query languages vary • Example systems o Neo4j, DSE Graph, GraphDB, ...
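
The node-and-edge model described on this slide can be illustrated with a small in-memory graph. This is a sketch only: the `Graph` class, the `FOLLOWS` label, and the `within_hops` helper are hypothetical, and real graph databases expose their own query languages for this kind of traversal.

```python
from collections import deque


class Graph:
    """Toy property graph: nodes carry properties, edges carry a label."""

    def __init__(self):
        self.nodes = {}  # node id -> properties dict
        self.edges = {}  # node id -> list of (label, target id)

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = properties
        self.edges.setdefault(node_id, [])

    def add_edge(self, source, label, target):
        self.edges.setdefault(source, []).append((label, target))

    def neighbors(self, node_id, label):
        return [t for edge_label, t in self.edges.get(node_id, []) if edge_label == label]

    def within_hops(self, start, label, max_hops):
        """Breadth-first search along edges with the given label."""
        seen = {start}
        queue = deque([(start, 0)])
        found = []
        while queue:
            node, depth = queue.popleft()
            if depth == max_hops:
                continue  # do not expand beyond the hop limit
            for nxt in self.neighbors(node, label):
                if nxt not in seen:
                    seen.add(nxt)
                    found.append(nxt)
                    queue.append((nxt, depth + 1))
        return found


g = Graph()
for name in ["ada", "bob", "cy", "dee"]:
    g.add_node(name, kind="user")
g.add_edge("ada", "FOLLOWS", "bob")
g.add_edge("bob", "FOLLOWS", "cy")
g.add_edge("cy", "FOLLOWS", "dee")
print(g.within_hops("ada", "FOLLOWS", 2))
```

A "friends of friends" question like this one is a short traversal in a graph model, whereas a relational schema would typically need one self-join per hop.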

Which One To Use? NoSQL Data Models
• Key-value: Processing a constant stream of small reads and writes.
• Column-based: Handles size well. Massive write loads. HA. MapReduce.
• Document: Natural data modeling. Programmer friendly. Rapid development. Web friendly.
• Graph: Complex and connected data. Graph algorithms and relations.

Beyond Data Models Choosing a solution by data model alone is not enough. What is needed is a classification that allows an observer to determine whether a solution category is appropriate for a given use case.

NoSQL Solutions Use case categories [Diagram: NoSQL, with Intelligence, Data Model, Application Requirements, and Use Case]

Use Case Categories Non-exhaustive list of use case categories
• Products / features: Redis, Riak, CouchDB, MongoDB, HBase, Cassandra, Neo4j, etc.
• Business use case: Storing Session Information; Event Logging; Recommendation Engines; Content Mgt Systems; Search Optimization; Web Analytics; User Profiles; Customer Analytics; Business Intelligence; Real-Time Analytics; Shopping Cart Data; Social Computing
• Application requirement: High Availability; Caching; Unstructured Data; Web-scale; Complex Data
• Data model: Document; Key-Value; Column; Graph
• Intelligence: Atomic; Big Data; Connected

1. Social Media Atomic + Key-Value + High Availability
Background • Yammer is an enterprise social network • Huge data to manage from its rapidly growing user base • Data is always updated • Needed to build a new notifications feature • Gives the user a sorted set of notifications • Call to action based on the nature of the notification
Challenge • Data size = 2+ terabytes • Duplicate data and stability concerns due to difficulty with replication and database crashes • Data is stored in a Postgres data store • Postgres provides data consistency guarantees at the expense of availability • Need for high availability (HA)
NoSQL Approach • Employ a reliable, scalable NoSQL solution • High availability is paramount • Amazon's Dynamo model fits the use case • Dynamo-inspired projects: Riak & Voldemort • Riak chosen because of stability and very low latency
Results • Yammer now has a robust Notifications module in its social collaboration tool • No increase in its data footprint on its single point of failure • Very low latency • Highly available data powering the notifications
