Main Memory Storage Engines Justin A. DeBrabant - - PowerPoint PPT Presentation



SLIDE 1

Main Memory Storage Engines

Justin A. DeBrabant debrabant@cs.brown.edu

SLIDE 2

Roadmap

  • Paper 1: Data-Oriented Transaction Execution
  • Paper 2: OLTP Through the Looking Glass
  • Paper 3: Generic Database Cost Models for Hierarchical Memory Systems

Storage Engines 2

SLIDE 3

Storage Engine?

  • the part of the database that actually stores and retrieves data
    – responsible for db performance
      • concurrency, consistency
    – separate from the database “front end”
  • a single database can offer several storage engine options
    – e.g. MySQL supports InnoDB and MyISAM
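As a toy illustration of that separation, here is a minimal sketch of a front end delegating storage to a swappable engine. All names here are hypothetical; this is not a real MySQL API.

```python
# Hedged sketch: a storage engine as the pluggable get/put layer behind a
# database "front end". Illustrative only, not any real engine's interface.
class DictEngine:
    """A trivial in-memory storage engine."""
    def __init__(self):
        self._rows = {}

    def put(self, key, row):
        self._rows[key] = row

    def get(self, key):
        return self._rows.get(key)

class Database:
    """Front end: parsing/planning would live here; storage is delegated."""
    def __init__(self, engine):
        self.engine = engine      # swappable, like choosing InnoDB vs. MyISAM

db = Database(DictEngine())
db.engine.put(1, {"name": "alice"})
assert db.engine.get(1) == {"name": "alice"}
```

Swapping `DictEngine` for another class with the same `get`/`put` interface changes the storage behavior without touching the front end, which is the point of the engine abstraction.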

SLIDE 4

Paper 1

  • Data-Oriented Transaction Execution
    – I. Pandis et al. (CMU/EPFL/Northwestern) – VLDB ’10

SLIDE 5

Motivation

  • Hardware has changed
    – recently, we’ve run into the “thermal wall”
      • hard to run chips at higher clock speeds
      • …but transistor counts still abide by Moore’s Law!
    – so: add more cores per chip
    – rely on thread-level parallelism
  • most current architectures were designed in the ’80s
    – what assumptions were made about the hardware?

SLIDE 6

Thread-to-Transaction Model

  • in most database engines, each transaction is assigned to its own thread
    – more cores = more parallel threads
    – each thread is responsible for locking shared resources as needed
  • works fine with a few threads, but what about thousands executing concurrently on hundreds of hardware contexts?

SLIDE 7

Data Access Pattern

  • each thread only worries about its own transaction
    – no coordination among transactions
      • i.e. uncoordinated data access
    – leads to high lock contention, especially at data “hot spots”

SLIDE 8

Data Access Visualization

SLIDE 9

Lock Contention As a Bottleneck

SLIDE 10

The future looks bleak…

  • Not quite!
  • Idea: “coordinate” data access patterns
    – rather than having threads contend for locks, have transactions contend for threads
    – distribute the transactions to the data, not the data to the transactions

SLIDE 11

Thread-to-Data Model

  • each thread is coupled with a disjoint subset of the database
  • threads coordinate access to their own data using a private locking mechanism
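The thread-to-data idea can be sketched as follows. This is an illustrative simplification; the partition count, queue protocol, and names are assumptions, not DORA’s actual implementation:

```python
# Sketch of thread-to-data: the key space is split into disjoint subsets,
# each owned by exactly one worker thread. Requests are routed to the owning
# worker's queue, so a partition is only ever touched by its own thread and
# no global lock is needed.
import queue
import threading

NUM_WORKERS = 4
store = [dict() for _ in range(NUM_WORKERS)]          # disjoint data subsets
queues = [queue.Queue() for _ in range(NUM_WORKERS)]  # one inbox per worker

def owner(key):
    """Route a key to the worker that owns its partition."""
    return hash(key) % NUM_WORKERS

def worker(wid):
    while True:
        op, key, val, done = queues[wid].get()
        if op == "stop":
            break
        if op == "put":
            store[wid][key] = val     # only this thread touches store[wid]
        done.set()

threads = [threading.Thread(target=worker, args=(w,)) for w in range(NUM_WORKERS)]
for t in threads:
    t.start()

# Route a write to its owning worker and wait for completion.
done = threading.Event()
queues[owner("acct-42")].put(("put", "acct-42", 100, done))
done.wait()

for q in queues:
    q.put(("stop", None, None, None))
for t in threads:
    t.join()

assert store[owner("acct-42")]["acct-42"] == 100
```

Note the inversion the slide describes: work queues up at the data’s thread, rather than threads queuing up at the data’s lock.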

SLIDE 12

“Coordinated” Data Access

SLIDE 13

A Data Oriented Architecture (DORA)

  • a shared-everything architecture designed to scale to very high core counts
  • retains ACID properties
  • data (i.e. relations) are divided into disjoint datasets
    – 1 executor (thread) per dataset

SLIDE 14

Routing

  • How are rows mapped to datasets?
    – via a routing rule
  • routing rules use a subset of columns from a table, called routing fields, to map rows to datasets
    – in practice, columns from primary or candidate keys are used
    – routing rules can be dynamically updated to balance load
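A routing rule of this kind might be sketched as below; the hash partitioning, field names, and dataset count are illustrative assumptions, not the paper’s exact scheme:

```python
# Hedged sketch of a routing rule: project a row onto its routing fields
# (here, a subset of the primary key) and map that value to a dataset id.
def make_routing_rule(routing_fields, num_datasets):
    def route(row):
        key = tuple(row[f] for f in routing_fields)
        return hash(key) % num_datasets   # one executor per dataset
    return route

# e.g. route TPC-C-style customer rows by warehouse id, across 4 datasets
route = make_routing_rule(routing_fields=("w_id",), num_datasets=4)

d = route({"w_id": 7, "d_id": 3, "c_id": 1})
assert 0 <= d < 4
# rows sharing routing-field values always land in the same dataset
assert d == route({"w_id": 7, "d_id": 9, "c_id": 2})
```

Rebalancing load then means replacing `route` with a new rule, consistent with the slide’s note that rules can be updated dynamically.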

SLIDE 15

Transaction Flow Graphs

  • used to map incoming transactions to executors
  • actions are the data-access parts of the query
  • identifiers describe which columns an action uses
  • What about actions that don’t match the routing fields?
    – called secondary actions; these are harder to route

SLIDE 16

Secondary Actions

  • which executor is responsible?
    – for indexes that don’t index the routing fields, store the routing fields in the leaf nodes
  • added space overhead?
  • expensive to update indexes if the routing fields are changed?

SLIDE 17

Rendezvous Points

  • often, data dependencies exist between actions
    – insert rendezvous points between actions with data dependencies
      • logically separates execution into distinct phases
      • the system cannot concurrently execute actions from different phases
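One way to picture a rendezvous point is as a barrier between phases. The sketch below is an assumed simplification, not the paper’s mechanism:

```python
# Sketch: a rendezvous point modeled as a barrier. No phase-2 action may
# start until every phase-1 action has finished, mirroring the slide's
# "no concurrent execution across phases" rule.
import threading

N = 3
rendezvous = threading.Barrier(N)
log = []
log_lock = threading.Lock()

def run_action(i):
    with log_lock:
        log.append(("phase1", i))   # producers of the data dependency
    rendezvous.wait()               # the rendezvous point
    with log_lock:
        log.append(("phase2", i))   # consumers see all phase-1 results

threads = [threading.Thread(target=run_action, args=(i,)) for i in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# every phase-1 entry precedes every phase-2 entry
first_p2 = min(i for i, e in enumerate(log) if e[0] == "phase2")
assert first_p2 == N and all(e[0] == "phase1" for e in log[:first_p2])
```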

SLIDE 18

Executing an Action

  • 3 structures:
    – incoming action queue
      • processed in the order received
    – completed action queue
    – thread-local lock table
      • uses action identifiers to “lock” data and avoid conflicts
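The three structures can be roughly sketched as follows. Names are illustrative, and a real executor would interleave actions from many transactions and release locks at commit:

```python
# Sketch of an executor's three structures: an incoming FIFO queue, a
# completed queue, and a thread-local lock table keyed by action identifier.
# The executor is single-threaded by construction, so its "lock table" needs
# no latching -- that is the point of the design.
from collections import deque

class Executor:
    def __init__(self):
        self.incoming = deque()    # actions, processed in arrival order
        self.completed = deque()
        self.local_locks = {}      # action identifier -> owning transaction

    def submit(self, txn, ident, work):
        self.incoming.append((txn, ident, work))

    def run_one(self):
        txn, ident, work = self.incoming.popleft()
        holder = self.local_locks.get(ident)
        if holder is not None and holder != txn:
            self.incoming.append((txn, ident, work))   # conflict: defer
            return None
        self.local_locks[ident] = txn                  # "lock" via identifier
        result = work()
        self.completed.append((txn, ident, result))
        return result

ex = Executor()
ex.submit("T1", ("customer", 42), lambda: "read row 42")
assert ex.run_one() == "read row 42"
```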

SLIDE 19

Inserts and Deletes

  • still need to acquire row-level locks through the centralized lock manager
    – why? consider:
      • T1 deletes a record
      • T2 inserts a record into the slot vacated by the record deleted by T1
      • T1 aborts but can’t roll back: the slot is taken
    – fortunately, row-level locks are often not a source of contention

SLIDE 20

Experimental Setup

  • 3 benchmarks used, all OLTP:
    – TM-1
      • 7 transactions, 4 with updates
    – TPC-C
      • 150 warehouses (approx. 20 GB)
    – TPC-B
      • 100 branches (approx. 2 GB)

SLIDE 21

Lock Contention

SLIDE 22

Throughput

[Figure: throughput, DORA vs. baseline, one panel per benchmark]

SLIDE 23

Response Times

[Figure: response times, DORA vs. baseline]

SLIDE 24

Conclusions

  • traditional database engines were not made for the amount of thread-level parallelism seen in machines today
    – lock contention is a major part of that
  • a thread-to-data approach can significantly reduce lock contention

SLIDE 25

Paper 2

  • OLTP Through the Looking Glass, and What We Found There
    – S. Harizopoulos et al. – SIGMOD ’08

SLIDE 26

Motivation

  • Hardware has changed
    – db systems were designed when memory was scarce
    – many OLTP databases can now fit entirely in memory
  • even in memory, there are other bottlenecks
    – logging, latching, locking, buffer management

SLIDE 27

Alternative Architectures

  • logless
    – remove logging
  • single transaction
    – remove locking/latching
  • main memory resident
    – remove transaction bookkeeping

SLIDE 28

Goals

  • remove each of the “unnecessary” parts, one by one, and evaluate performance
    – determine the relative performance gain from removing each feature

SLIDE 29

Instruction Count Breakdown

[Figure: detailed instruction count breakdown for the Payment transaction; components: B-tree, logging, locking, latching, buffer manager, remaining overhead]

SLIDE 30

Instruction Count Breakdown

[Figure: detailed instruction count breakdown for the New Order transaction; components: B-tree, logging, locking, latching, buffer manager, remaining overhead]

SLIDE 31

Conclusions

  • antiquated disk-based features can cause significant overhead in a main memory system
  • each component of a system should be carefully evaluated

SLIDE 32

Paper 3

  • Generic Database Cost Models for Hierarchical Memory Systems
    – S. Manegold et al. – VLDB ’02

SLIDE 33

Motivation

  • cost models are a key part of query optimization
    – traditional cost models are based on disk accesses
  • What about in a main memory system?
    – must account for the memory hierarchy
      • L1, L2, L3, main memory, (solid-state?)

SLIDE 34

Goals

  • an accurate cost model should weight each level of the memory hierarchy differently
    – the overall “cost” of an operator should be the sum of its costs at all levels of the hierarchy
    – each level has a different access cost
      • weight each access by that level’s cost
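The weighting idea amounts to a simple sum; the latencies below are made-up example numbers, not measurements from the paper:

```python
# Sketch of the per-level weighting: total cost of an operator is the sum,
# over memory hierarchy levels, of (misses at that level) x (that level's
# access latency). Latencies here are illustrative round numbers in ns.
LATENCY_NS = {"L1": 1, "L2": 10, "mem": 100}

def operator_cost(misses):
    """misses: dict mapping level name -> number of misses/accesses."""
    return sum(misses[level] * LATENCY_NS[level] for level in misses)

# e.g. a scan with 1000 L1 misses, 100 L2 misses, 10 memory accesses:
# each level happens to contribute 1000 ns here
assert operator_cost({"L1": 1000, "L2": 100, "mem": 10}) == 3000
```

The interesting modeling work, covered on the next slides, is predicting the per-level miss counts from an operator’s access pattern.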

SLIDE 35

Data Access Patterns

  • different operators exhibit different data access patterns
    – the pattern dictates both the cost and the number of cache misses
  • How to accurately model access patterns?
    – basic access patterns
      • single/repetitive sequential traversal, single/repetitive random traversal, random access
    – compound access patterns
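For intuition, two of the basic patterns can be sketched as index sequences (illustrative only; the paper defines these patterns formally):

```python
# Sketch: a single sequential traversal visits items 0..n-1 in order, while
# a single random traversal visits each item exactly once in shuffled order.
# Same work, very different cache behavior.
import random

def sequential_traversal(n):
    return list(range(n))

def random_traversal(n, seed=0):
    order = list(range(n))
    random.Random(seed).shuffle(order)   # deterministic for the demo
    return order

seq = sequential_traversal(5)
rnd = random_traversal(5)
assert seq == [0, 1, 2, 3, 4]
assert sorted(rnd) == seq   # same items touched, different order
```

A “repetitive” variant repeats such a traversal, and compound patterns compose these building blocks, which is how the next slide’s cost models are assembled.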

SLIDE 36

Cost Models

  • for each basic access pattern, derive a custom cost model (not shown)
  • combine the basic access pattern cost models to derive compound access pattern cost models
  • map each database operator (e.g. sort) to a cost model

SLIDE 37

Experimental Analysis

[Figure: L1 misses, L2 misses, TLB misses, and time [ms] vs. relation size (128 kB to 128 MB); panels: a) Quick-Sort, b) Merge-Join, c) Hash-Join]

SLIDE 38

Conclusions

  • the basic cost models presented can model costs in main memory systems
  • these memory-based cost models could also be used to enhance current disk-based cost models

SLIDE 39

Questions?
