hagiography (noun) CMU SCS ChristosTheGreekGodofDatabases.com - - PDF document

hag i og ra phy
SMART_READER_LITE
LIVE PREVIEW

hagiography (noun) CMU SCS ChristosTheGreekGodofDatabases.com - - PDF document

Faloutsos/Pavlo CMU - 15-415/615 CMU SCS Carnegie Mellon Univ. Dept. of Computer Science 15-415/615 - DB Applications C. Faloutsos A. Pavlo How to Scale a Database System CMU SCS hagiography (noun) CMU SCS


slide-1
SLIDE 1

Faloutsos/Pavlo CMU - 15-415/615 1

CMU SCS

Carnegie Mellon Univ.

  • Dept. of Computer Science

15-415/615 - DB Applications

  • C. Faloutsos – A. Pavlo

How to Scale a Database System

CMU SCS

hag·i·og·ra·phy

(noun)

CMU SCS

ChristosTheGreekGodofDatabases.com

  • Pinterest meets Causal Encounters

meets Kickstarter meets Twitter

– With Christos!

slide-2
SLIDE 2

Faloutsos/Pavlo CMU - 15-415/615 2

CMU SCS
  • More reads than writes.
  • All media stored outside of DBMS.
  • How do we choose the right

database architecture? ChristosTheGreekGodofDatabases.com

Faloutsos/Pavlo CMU SCS 15-415/615 4 CMU SCS

Outline

  • Single-Node Databases
  • NoSQL Systems
  • NewSQL Systems
Faloutsos/Pavlo CMU SCS 15-415/615 5 CMU SCS

Late-1990s / Early-2000s

  • All the big players were heavyweight and

expensive.

– Oracle, DB2, Sybase, SQL Server, Informix.

  • Open-source databases were missing

important features.

– Postgres, mSQL, MySQL.

Faloutsos/Pavlo CMU SCS 15-415/615 6
slide-3
SLIDE 3

Faloutsos/Pavlo CMU - 15-415/615 3

CMU SCS

Mid-2000s

  • MySQL + InnoDB is widely adopted by

new web companies:

– Supported transactions, replication, recovery. – Memcache for caching queries.

Faloutsos/Pavlo CMU SCS 15-415/615 7 CMU SCS
  • Let’s go with MySQL.
  • We’re getting a lot of traffic.
  • Our database server is saturated!

How do we increase the capacity of our database server?

ChristosTheGreekGodofDatabases.com

Faloutsos/Pavlo CMU SCS 15-415/615 8 CMU SCS

Buy a faster machine. Idea #1:

slide-4
SLIDE 4

Faloutsos/Pavlo CMU - 15-415/615 4

CMU SCS

Scaling Up

Application Server Database Server

(+) Requires no change to application. (+) Improvements are immediate. (-) Expensive! Diminishing Returns. (-) Single Point of Failure.

  • More disks.
  • More RAM.
  • Faster CPUs.
  • Use SSDs.
Faloutsos/Pavlo CMU SCS 15-415/615 10 CMU SCS

Replicate database on multiple servers. Idea #2:

CMU SCS

Replication

Application Server Database Server

(+) Requires no change to application. (+) Parallelize read operations. (+) Improved fault tolerance. (-) Expensive! Diminishing Returns. (-) Writes limited to slowest node.

Replicas

Faloutsos/Pavlo CMU SCS 15-415/615 12

Read Request

slide-5
SLIDE 5

Faloutsos/Pavlo CMU - 15-415/615 5

CMU SCS

Cache query results. Idea #3:

CMU SCS

Query Cache

Application Server Database Server

(+) Reduce load on DBMS. (+) Fast API. (-) Extra roundtrip per query. (-) Requires application changes. (-) Doesn’t help write-heavy apps.

Replicas

Faloutsos/Pavlo CMU SCS 15-415/615 14

Check Cache Query Request Update Cache memcache

CMU SCS

Push SQL into stored procedures. Idea #4:

slide-6
SLIDE 6

Faloutsos/Pavlo CMU - 15-415/615 6

CMU SCS

Stored Procedures

Application Code Database Server

(+) Reduces network roundtrips. (+) Less lock contention. (+) Modularization. (-) Application logic in two places. (-) PL/SQL is not standardized.

Replicas

def getPage(request): # Process request EXEC SQL EXEC SQL # Process results if x == True: EXEC SQL else: EXEC SQL # Render HTML page return (html)

BEGIN: EXEC SQL EXEC SQL if x == True: EXEC SQL else: EXEC SQL return (results) END;

def getPage(request): # Process request EXEC PROCEDURE # Render HTML page return (html)

Stored Procedure

Faloutsos/Pavlo CMU SCS 15-415/615 16 CMU SCS

Shard database across multiple servers. Idea #5:

CMU SCS

Sharding / Partitioning

Application Server Database Cluster

(+) Parallelize all operations. (+) Much easier to add more hardware. (-) Most DBMSs don’t support this. (-) Joins are expensive. (-) Non-trivial to split database.

Logical Partitions

Faloutsos/Pavlo CMU SCS 15-415/615 18
slide-7
SLIDE 7

Faloutsos/Pavlo CMU - 15-415/615 7

CMU SCS
  • We want to scale out but writing a sharding

layer is hard.

  • Some parts of our

application don’t need a full-featured DBMS.

ChristosTheGreekGodofDatabases.com

Faloutsos/Pavlo CMU SCS 15-415/615 CMU SCS

Give up ACID guarantees for scalability. Idea #6:

CMU SCS

Application Servers DBMS Servers

Eventual Consistency

Faloutsos/Pavlo CMU SCS 15-415/615 21

Master Replicas

Update Profile Get Profile

? ?

slide-8
SLIDE 8

Faloutsos/Pavlo CMU - 15-415/615 8

CMU SCS

Late-2000s (NoSQL)

  • NoSQL systems are able to scale

horizontally right out of the box by giving traditional database features.

Faloutsos/Pavlo CMU SCS 15-415/615 22 CMU SCS
  • We need to process payments.
  • We don’t want to lose orders.
  • We need joins and ACID transactions.

ChristosTheGreekGodofDatabases.com

CMU SCS

Strong Consistency

Send Money Nice Christos Pictures! Thanks!

  • $100

+$100

Faloutsos/Pavlo CMU SCS 15-415/615 24

Use Two- Phase Commit

slide-9
SLIDE 9

Faloutsos/Pavlo CMU - 15-415/615 9

CMU SCS

Keep guarantees,

  • ptimize for

workload type. Idea #7:

CMU SCS

Early-2010s (NewSQL)

  • New DBMSs that can scale across multiple

machines natively and provide ACID guarantees.

CMU SCS

Conclusion

  • RDBMS (Single-Node):

– MySQL, Postgres

  • NoSQL (Multi-Node):

– Key-Value, Documents, Graphs

  • NewSQL (Multi-Node):

– Transaction Processing, MySQL Sharding

Faloutsos/Pavlo CMU SCS 15-415/615 27
slide-10
SLIDE 10

Faloutsos/Pavlo CMU - 15-415/615 10

CMU SCS

What DBMS should my start-up use?

slide-11
SLIDE 11

Faloutsos/Pavlo CMU - 15-415/615 11

CMU SCS

Beyond the 15-415/615

  • Christos is teaching 15-826 this fall:

– Multimedia Databases and Data Mining

  • Send me an email if you’re interested in

working on a database research project.