Tarantool - a NoSQL database with SQL

Tarantool - a NoSQL database with SQL. Pavel Lapaev, Product Manager @Mail.ru; Kirill Yukhin, Engineering Manager @Mail.ru. 1

Agenda: What is Mail.ru Group? What is Tarantool? Performance. Storage engines. Scaling. Why SQL? Roadmap. 2

Mail.ru Group: 20 years in business, leading IT company in Russia. Social networks VK (97m monthly) and Odnoklassniki (45m monthly). Email (top 5 in the world, 100m active accounts). Portal and IM (35m monthly). Online games (512m accounts). E-commerce, Search, Delivery, Marketplace, E-learning, Maps, etc. 3

Tarantool in a Nutshell: An in-memory database with an integrated application server. Team of 70+ people. 10 years of history. Open-source and enterprise versions. 4

Tarantool Facts: Here is a bunch of features. In-memory and disk storage engines. Core written in C, app server exposes Lua. Persistence (WAL and snapshots). Application server on board. ACID transactions. Horizontal scalability: sharding and replication. NoSQL... with SQL. 5

Tarantool Products: Tarantool itself. Cartridge (cluster management framework). Kubernetes Operator. Enterprise Edition. Data Grid. 6

Enterprise Products. Enterprise Edition: L2, L3 support; enterprise database connectivity; Oracle replication modules; security audit log. Data Grid: system to develop distributed apps; flexible connectivity to external sources; versioned data storage; pre- and post-processing of data; lots of tools already in the box. 7

Tarantool Customers 8

History: Created @ Mail.ru Group about 10 years ago. Used to store sessions/profiles of millions of users. [Diagram labels: 4 instances; load web-page; AJAX request; profiles; mobile API; 8 instances; web servers; > 1,000,000 requests per second] 9

Must-have and mustn't-have features: No secondary keys, constraints etc. Schema-less. Need a language; *QL is not a must-have. High speed in every sense! Simple. Extensible. Transactions. Persistence. Once again: it must be fast, no excuses. 10

Tarantool: Bird's Eye View. No need for a cache: it is in-memory. But still a DBMS: persistence and transactions; it respects ACID. Single-threaded: it is lock-free. Easy: an imperative language is on board, Lua; it JITs; it is easy to program business logic. It scales: replication and sharding. 11

DBMS + Application Server: C, Lua, SQL, Python, PHP, Go, Java, C# ... Persistent in-memory and disk storage engines. Stored procedures in C, Lua, SQL. Threads: query processing, WAL, network handling. 12

Coöperative multitasking. Multithreading: that is a stall; losses on cache coherency support; losses on locks; losses on long operations. Fibers and event loop: the thread is always busy; lock-free; single core, no coherency issues at all. 13
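The fiber and event-loop model above can be illustrated with a minimal sketch. It is written in plain Python with generators standing in for fibers; it is not Tarantool's fiber API, and the names `fiber` and `run_event_loop` are hypothetical. The point is only that every task runs on one thread and gives up control at explicit yield points, so no locks are needed.

```python
from collections import deque


def fiber(name, steps, log):
    """A toy 'fiber': does one slice of work, then yields control voluntarily."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # cooperative yield point; nothing preempts the fiber before this


def run_event_loop(fibers):
    """Round-robin scheduler: all fibers share one thread, so no locks are taken."""
    ready = deque(fibers)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # run the fiber until its next yield
            ready.append(current)  # it still has work, so it goes to the back
        except StopIteration:
            pass                   # the fiber has finished


log = []
run_event_loop([fiber("a", 2, log), fiber("b", 3, log)])
print(log)
```

The two fibers interleave deterministically because control only changes hands at `yield`, which is why shared state such as `log` needs no synchronization here.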

Vinyl: In-memory is OK, but not always enough. Write-oriented: LSM tree. Same API as memtx. Transactions, secondary keys. 14

Scaling: Why? Vertical. Horizontal. 15

Horizontal scaling. Replication (ABC, ABC, ABC): scaling computation and fault tolerance. Sharding (A, B, C): scaling computation and data. Replication and sharding (A A, B B, C C): scaling computation, data and fault tolerance. 16

Replication. Asynchronous (begin, commit, replicate): commit does not wait for replication to succeed. Faster. Replicas might lag and conflict. Synchronous (begin, prepare, replicate, commit): two-phase commit; to succeed, it needs to replicate to N nodes. More reliable. Slower, complicated protocols. 17
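The difference between the two commit paths can be sketched in a few lines. This is a toy in-process model in Python, not Tarantool's replication protocol; the classes `Replica` and `Master` and the `required_acks` parameter are hypothetical names introduced for illustration.

```python
class Replica:
    """A toy replica that can stage a record before making it durable."""

    def __init__(self, reachable=True):
        self.reachable = reachable
        self.staged = None
        self.log = []

    def prepare(self, record):
        if not self.reachable:
            return False
        self.staged = record
        return True

    def commit(self):
        if self.staged is not None:
            self.log.append(self.staged)
            self.staged = None

    def rollback(self):
        self.staged = None

    def apply(self, record):
        if self.reachable:  # an unreachable replica simply falls behind
            self.log.append(record)


class Master:
    def __init__(self, replicas):
        self.replicas = replicas
        self.log = []
        self.backlog = []

    def commit_async(self, record):
        """Commit locally and return at once; replication happens later."""
        self.log.append(record)
        self.backlog.append(record)
        return True

    def ship_backlog(self):
        for record in self.backlog:
            for replica in self.replicas:
                replica.apply(record)
        self.backlog.clear()

    def commit_sync(self, record, required_acks):
        """Two-phase commit: succeed only if enough replicas prepared."""
        prepared = [r for r in self.replicas if r.prepare(record)]
        if len(prepared) < required_acks:
            for replica in prepared:
                replica.rollback()
            return False
        for replica in prepared:
            replica.commit()
        self.log.append(record)
        return True


replicas = [Replica(), Replica(reachable=False)]
master = Master(replicas)
async_ok = master.commit_async("x=1")            # succeeds before any replica has it
lag_before_ship = len(replicas[0].log)           # replica is still behind
sync_failed = master.commit_sync("y=2", required_acks=2)
sync_ok = master.commit_sync("y=2", required_acks=1)
```

In this model the asynchronous commit returns while the replica log is still empty, which is the lag the slide warns about, whereas the synchronous commit refuses to succeed when too few replicas acknowledge the prepare step.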

Sharding: Decide where to store? Ranges (min to max): find the range where the key belongs -> find the node. Best? Complicated. Usually useless. Hash: calculate the hash of the key -> find the node. Good enough. Complex resharding. Complex queries are not fast. 18
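The two lookup strategies can be sketched as follows. This is an illustrative Python sketch, not Tarantool code; the range boundaries, the choice of MD5 as the hash, and the function names `node_by_range` and `node_by_hash` are assumptions made for the example.

```python
import bisect
import hashlib

# Hypothetical layout: node 0 holds keys below 100, node 1 keys 100..199,
# node 2 keys 200..299. Each entry is the exclusive upper bound of a range.
RANGE_UPPER_BOUNDS = [100, 200, 300]


def node_by_range(key):
    """Find the range the key belongs to, which identifies the node."""
    index = bisect.bisect_right(RANGE_UPPER_BOUNDS, key)
    if index >= len(RANGE_UPPER_BOUNDS):
        raise KeyError(f"no range covers key {key}")
    return index


def node_by_hash(key, node_count):
    """Calculate a stable hash of the key, which identifies the node."""
    digest = hashlib.md5(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % node_count


print(node_by_range(150), node_by_hash("user:150", 3))
```

Range lookup keeps neighbouring keys on the same node, which helps range scans but requires maintaining the boundary table; hash lookup needs no table but scatters neighbouring keys.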

Resharding problem: $\mathrm{shard\_id}(key) : key \to \{shard_1, shard_2, \ldots, shard_N\}$. Changing $N$ changes the shard function: $\mathrm{shard\_id}(key) \neq \mathrm{new\_shard\_id}(key)$. Need to re-calculate shard functions for all data. Useless data moves: some data might move to one of the old nodes. ... but not in Tarantool land. 19
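A small experiment makes the cost of changing N concrete. The sketch below assumes the shard function is a stable hash taken modulo the shard count (the slide does not say which function is used), and the key names are made up.

```python
import hashlib


def stable_hash(key):
    """A hash that does not change between runs (unlike Python's built-in hash)."""
    digest = hashlib.md5(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big")


def shard_id(key, shard_count):
    return stable_hash(key) % shard_count


keys = [f"user:{i}" for i in range(10_000)]

# Count keys whose shard changes when the cluster grows from 3 to 4 shards.
moved = sum(1 for key in keys if shard_id(key, 3) != shard_id(key, 4))
moved_fraction = moved / len(keys)
print(f"{moved_fraction:.0%} of keys change shard when N goes from 3 to 4")
```

With a uniform hash, a key stays put only when its hash gives the same remainder modulo 3 and modulo 4, which holds for 3 of every 12 hash values, so about three quarters of the keys are expected to move, many of them between old nodes.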

Virtual sharding. Columns: data ({tuple}, {tuple}, ...), virtual nodes, physical nodes. $\mathrm{shard\_id}(key) = \{bucket_1, bucket_2, \ldots, bucket_N\}$. $\#buckets = const \gg \#nodes$. The shard function is fixed. 20
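The idea of a fixed shard function over virtual buckets can be sketched like this. It is a Python illustration under stated assumptions, not Tarantool's implementation: the bucket count of 1024, the MD5-based hash, the round-robin initial layout and the names `bucket_id`, `initial_layout` and `add_node` are all hypothetical.

```python
import hashlib

BUCKET_COUNT = 1024  # fixed for the lifetime of the cluster, much larger than the node count


def bucket_id(key):
    """The shard function: depends only on the key, never on the number of nodes."""
    digest = hashlib.md5(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % BUCKET_COUNT


def initial_layout(nodes):
    """Spread buckets over physical nodes round-robin."""
    return {bucket: nodes[bucket % len(nodes)] for bucket in range(BUCKET_COUNT)}


def add_node(layout, new_node):
    """Hand the new node its fair share of buckets, taken from overloaded nodes."""
    target = BUCKET_COUNT // (len(set(layout.values())) + 1)
    load = {}
    for owner in layout.values():
        load[owner] = load.get(owner, 0) + 1
    new_layout = dict(layout)
    moved = 0
    for bucket in range(BUCKET_COUNT):
        if moved == target:
            break
        owner = new_layout[bucket]
        if load[owner] > target:
            new_layout[bucket] = new_node
            load[owner] -= 1
            moved += 1
    return new_layout


old_layout = initial_layout(["node1", "node2", "node3"])
new_layout = add_node(old_layout, "node4")
moved_buckets = [b for b in range(BUCKET_COUNT) if old_layout[b] != new_layout[b]]
print(len(moved_buckets), "of", BUCKET_COUNT, "buckets moved")
```

Only the buckets handed to the new node move, and every key keeps its bucket, so no data shuffles between the old nodes.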

Sharding: ranges, hashes, virtual buckets. Having a range or a bucket, how to find where it is stored physically? 1. Prohibit re-sharding. 2. Always visit all nodes. 3. Implement a proxy-router! 21
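The third option, a proxy-router, amounts to keeping a bucket-to-node table in one place and forwarding each request through it. The following Python sketch is a toy in-process model; the `Router` class, its methods and the node names are hypothetical and do not reflect any real Tarantool module.

```python
import hashlib

BUCKET_COUNT = 16


def bucket_id(key):
    digest = hashlib.md5(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % BUCKET_COUNT


class Router:
    """Knows which node owns each bucket and forwards requests there."""

    def __init__(self, node_names):
        # node name -> {bucket -> {key: value}}; stands in for remote storages
        self.storages = {name: {} for name in node_names}
        self.route = {b: node_names[b % len(node_names)] for b in range(BUCKET_COUNT)}

    def put(self, key, value):
        bucket = bucket_id(key)
        self.storages[self.route[bucket]].setdefault(bucket, {})[key] = value

    def get(self, key):
        bucket = bucket_id(key)
        return self.storages[self.route[bucket]].get(bucket, {}).get(key)

    def move_bucket(self, bucket, target):
        """Move a whole bucket to another node, then update the route table."""
        source = self.route[bucket]
        if source == target:
            return
        self.storages.setdefault(target, {})
        self.storages[target][bucket] = self.storages[source].pop(bucket, {})
        self.route[bucket] = target


router = Router(["s1", "s2"])
router.put("user:1", "Alice")
bucket = bucket_id("user:1")
router.move_bucket(bucket, "s3")
value_after_move = router.get("user:1")
print(value_after_move, "is now served by", router.route[bucket])
```

Clients only ever talk to the router, so re-sharding stays possible and a lookup touches one node instead of all of them.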

Why SQL?
CREATE TABLE t1 (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER, c INTEGER);
CREATE TABLE t2 (id INTEGER PRIMARY KEY, x INTEGER, y INTEGER, z INTEGER);
SQL> SELECT DISTINCT(a) FROM t1, t2 WHERE t1.id = t2.id AND t2.y > 1;
22

Why SQL?
CREATE TABLE t1 (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER, c INTEGER);
CREATE TABLE t2 (id INTEGER PRIMARY KEY, x INTEGER, y INTEGER, z INTEGER);
function query()
    -- Join t1 and t2 on id, keeping pairs where t2.y > 1
    local join = {}
    for _, v1 in box.space.t1:pairs({}, {iterator='ALL'}) do
        local v2 = box.space.t2:get(v1[1])
        -- get() returns nil when t2 has no row with this id
        if v2 ~= nil and v2[3] > 1 then
            table.insert(join, {t1=v1, t2=v2})
        end
    end
    -- Collect the distinct values of t1.a
    local dist = {}
    for _, v in pairs(join) do
        if dist[v['t1'][2]] == nil then
            dist[v['t1'][2]] = 1
        end
    end
    local result = {}
    for k, _ in pairs(dist) do
        table.insert(result, k)
    end
    return result
end
23
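To see that the one-line SQL query and the hand-written join compute the same thing, here is a small runnable comparison. It uses SQLite through Python's standard `sqlite3` module as a stand-in, not Tarantool itself; the sample rows and the helper name `query_by_hand` are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER, c INTEGER);
CREATE TABLE t2 (id INTEGER PRIMARY KEY, x INTEGER, y INTEGER, z INTEGER);
""")
# Made-up sample data; id 5 has no matching row in t2.
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?, ?, ?)",
    [(1, 10, 0, 0), (2, 20, 0, 0), (3, 10, 0, 0), (4, 30, 0, 0), (5, 40, 0, 0)],
)
conn.executemany(
    "INSERT INTO t2 VALUES (?, ?, ?, ?)",
    [(1, 0, 5, 0), (2, 0, 0, 0), (3, 0, 7, 0), (4, 0, 2, 0)],
)

# The declarative version: one statement, the planner does the rest.
sql_result = sorted(
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT(a) FROM t1, t2 WHERE t1.id = t2.id AND t2.y > 1"
    )
)


def query_by_hand(connection):
    """The imperative version: scan t1, look up t2 by id, filter, deduplicate."""
    t2_by_id = {row[0]: row for row in connection.execute("SELECT * FROM t2")}
    distinct = set()
    for v1 in connection.execute("SELECT * FROM t1"):
        v2 = t2_by_id.get(v1[0])
        if v2 is not None and v2[2] > 1:  # v2[2] is column y
            distinct.add(v1[1])           # v1[1] is column a
    return sorted(distinct)


manual_result = query_by_hand(conn)
print(sql_result, manual_result)
```

Both paths return the same values; the SQL form simply leaves the join order, lookups and deduplication to the engine.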

SQL Features: Trying to be a subset of ANSI SQL. Minimum overhead of the query planner. ACID transactions, SAVEPOINTs. LEFT/INNER/NATURAL JOIN, UNION/EXCEPT, subqueries. HAVING, GROUP BY, ORDER BY. WITH RECURSIVE. Triggers. Views. Constraints. Collations. 24
