DATABASE SYSTEM IMPLEMENTATION GT 4420/6422 // SPRING 2019 // - - PowerPoint PPT Presentation

database system implementation
SMART_READER_LITE
LIVE PREVIEW

DATABASE SYSTEM IMPLEMENTATION GT 4420/6422 // SPRING 2019 // - - PowerPoint PPT Presentation

DATABASE SYSTEM IMPLEMENTATION GT 4420/6422 // SPRING 2019 // @JOY_ARULRAJ LECTURE #3: STORAGE MODELS 2 LAST CLASS Implications of availability of large DRAM chips for database systems Buffer Management Concurrency Control Logging


slide-1
SLIDE 1

DATABASE SYSTEM IMPLEMENTATION

GT 4420/6422 // SPRING 2019 // @JOY_ARULRAJ LECTURE #3: STORAGE MODELS

slide-2
SLIDE 2

LAST CLASS

Implications of availability of large DRAM chips for database systems

→ Buffer Management → Concurrency Control → Logging and Recovery → Query Processing How do these components fit together? How does a SQL query get executed within the system?

2

slide-3
SLIDE 3

ANATOMY OF A DATABASE SYSTEM

Connection Manager + Admission Control Query Parser Query Optimizer Query Executor Lock Manager (Concurrency Control) Access Methods (or Indexes) Buffer Pool Manager Log Manager Memory Manager + Disk Manager Networking Manager

3

Query Transactional Storage Manager Query Processor Shared Utilities Process Manager

Source: Anatomy of a Database System

slide-4
SLIDE 4

ANATOMY OF A DATABASE SYSTEM

Connection Manager + Admission Control Query Parser Query Optimizer Query Executor Lock Manager (Concurrency Control) Access Methods (or Indexes) Buffer Pool Manager Log Manager Memory Manager + Disk Manager Networking Manager

4

Query Transactional Storage Manager Query Processor Shared Utilities Process Manager

Source: Anatomy of a Database System

slide-5
SLIDE 5

ANATOMY OF A DATABASE SYSTEM

5

Process Manager

→ Manages client connections

Query Processor

→ Parse, plan and execute queries on top of storage manager

Transactional Storage Manager

→ Knits together buffer management, concurrency control, logging and recovery

Shared Utilities

→ Manage hardware resources across threads

slide-6
SLIDE 6

TODAY’S AGENDA

Field Storage Format (Type Representation) Tuple Storage Format Table Storage Format (Storage Models)

6

slide-7
SLIDE 7

DATA ORGANIZATION

7

Fixed-Length Data Blocks Index

Block Id + Offset

Variable-Length Data Blocks

slide-8
SLIDE 8

DATA ORGANIZATION

One can think of an in-memory database as just a large array of bytes.

→ The schema tells the DBMS how to convert the bytes into the appropriate type (e.g., INTEGER, DATE). → Each tuple is prefixed with a header that contains meta- data (e.g., last modified time-stamp).

Storing tuples with as fixed-length data makes it easy to compute the starting point of any tuple.

→ No tuple indirection array as in the case of slotted pages with variable-length tuples in disk-oriented systems

8

slide-9
SLIDE 9

MEMORY PAGES

OS maps physical pages to virtual memory pages. The CPU's MMU maintains a TLB that contains the physical address of a virtual memory page.

→ The TLB resides in the CPU caches. → It can't obviously store every possible every possible entry for a large memory machine.

When you allocate a block, the memory allocator keeps that it aligned to page boundaries to reduce memory fragmentation (e.g., glibc malloc).

9

slide-10
SLIDE 10

TRANSPARENT HUGE PAGES (THP)

Maintain larger pages automatically (2MB to 1GB)

→ Each page has to be a contiguous blocks of memory. → Greatly reduces the # of TLB entries (meta-data)

With THP, the OS will to reorganize pages in the background to keep things compact.

→ Split larger pages into smaller pages. → Combine smaller pages into larger pages. → Can cause the DBMS process to stall on memory access.

Almost every DBMS says to disable this feature:

→ Oracle, MemSQL, NuoDB, MongoDB, Sybase IQ

10

Source: Alexandr Nikitin

slide-11
SLIDE 11

FIELD STORAGE FORMAT (TYPES)

INTEGER/BIGINT/SMALLINT/TINYINT

→ C/C++ Representation (endianness depends on CPU)

FLOAT/REAL/DOUBLE vs. NUMERIC/DECIMAL

→ Floating-point Decimals / Fixed-point Decimals

VARCHAR/VARBINARY/TEXT/BLOB

→ Pointer to other location if type is ≥64-bits → Header with length and address to next location (if segmented), followed by data bytes.

TIME/DATE/TIMESTAMP

→ 32/64-bit integer of (micro)seconds since Unix epoch

11

slide-12
SLIDE 12

VARIABLE PRECISION NUMBERS

Inexact, variable-precision floating point type that uses the “native” C/C++ types

→ Example: FLOAT, REAL, DOUBLE → FLOAT(n): n is number of bits that are used to store the mantissa of the float number → REAL = FLOAT(24) → DOUBLE = FLOAT(53)

Store directly as specified by IEEE-754. Typically faster than fixed precision numbers.

12

slide-13
SLIDE 13

VARIABLE PRECISION NUMBERS

13

#include <stdio.h> int main(int argc, char* argv[]) { float x = 0.1; float y = 0.2; printf("x+y = %.20f\n", x+y); printf("0.3 = %.20f\n", 0.3); }

Rounding Example

slide-14
SLIDE 14

VARIABLE PRECISION NUMBERS

14

#include <stdio.h> int main(int argc, char* argv[]) { float x = 0.1; float y = 0.2; printf("x+y = %.20f\n", x+y); printf("0.3 = %.20f\n", 0.3); }

Rounding Example

x+y = 0.30000001192092895508 0.3 = 0.29999999999999998890

Output

slide-15
SLIDE 15

FIXED PRECISION NUMBERS

Numeric data types with arbitrary precision and

  • scale. Used when round errors are unacceptable.

→ Example: NUMERIC, DECIMAL → NUMERIC = DECIMAL

Typically stored in a exact, variable-length binary representation with additional meta-data.

→ Like a VARCHAR but not stored as a string → 2 times slower to sum one million values.

15

slide-16
SLIDE 16

POSTGRES: NUMERIC

16

typedef unsigned char NumericDigit; typedef struct { int ndigits; int weight; int scale; int sign; NumericDigit *digits; } numeric;

slide-17
SLIDE 17

POSTGRES: NUMERIC

17

typedef unsigned char NumericDigit; typedef struct { int ndigits; int weight; int scale; int sign; NumericDigit *digits; } numeric;

# of Digits

slide-18
SLIDE 18

POSTGRES: NUMERIC

18

typedef unsigned char NumericDigit; typedef struct { int ndigits; int weight; int scale; int sign; NumericDigit *digits; } numeric;

# of Digits Weight of 1st Digit

slide-19
SLIDE 19

POSTGRES: NUMERIC

19

typedef unsigned char NumericDigit; typedef struct { int ndigits; int weight; int scale; int sign; NumericDigit *digits; } numeric;

# of Digits Weight of 1st Digit Scale Factor

slide-20
SLIDE 20

POSTGRES: NUMERIC

20

typedef unsigned char NumericDigit; typedef struct { int ndigits; int weight; int scale; int sign; NumericDigit *digits; } numeric;

# of Digits Weight of 1st Digit Scale Factor Positive/Negative/NaN

slide-21
SLIDE 21

POSTGRES: NUMERIC

21

typedef unsigned char NumericDigit; typedef struct { int ndigits; int weight; int scale; int sign; NumericDigit *digits; } numeric;

# of Digits Weight of 1st Digit Scale Factor Positive/Negative/NaN Digit Storage

slide-22
SLIDE 22

POSTGRES: NUMERIC

22

typedef unsigned char NumericDigit; typedef struct { int ndigits; int weight; int scale; int sign; NumericDigit *digits; } numeric;

# of Digits Weight of 1st Digit Scale Factor Positive/Negative/NaN Digit Storage

slide-23
SLIDE 23

POSTGRES: NUMERIC

23

typedef unsigned char NumericDigit; typedef struct { int ndigits; int weight; int scale; int sign; NumericDigit *digits; } numeric;

# of Digits Weight of 1st Digit Scale Factor Positive/Negative/NaN Digit Storage

slide-24
SLIDE 24

TUPLE STORAGE FORMAT

24

CREATE TABLE AndySux ( id INT PRIMARY KEY, value BIGINT );

char[]

slide-25
SLIDE 25

TUPLE STORAGE FORMAT

25

CREATE TABLE AndySux ( id INT PRIMARY KEY, value BIGINT ); header id value

char[]

slide-26
SLIDE 26

TUPLE STORAGE FORMAT

26

CREATE TABLE AndySux ( id INT PRIMARY KEY, value BIGINT ); header id value

char[]

slide-27
SLIDE 27

TUPLE STORAGE FORMAT

27

CREATE TABLE AndySux ( id INT PRIMARY KEY, value BIGINT ); header id value

char[]

slide-28
SLIDE 28

TUPLE STORAGE FORMAT

28

CREATE TABLE AndySux ( id INT PRIMARY KEY, value BIGINT ); header id value

char[]

reinterpret_cast<int32_t*>(address)

slide-29
SLIDE 29

TUPLE STORAGE FORMAT

29

CREATE TABLE AndySux ( id INT PRIMARY KEY, value BIGINT ); header id value

char[]

reinterpret_cast<int32_t*>(address)

Reinterpret cast does not compile to any

  • instructions. It instructs the compiler to treat the

sequence of bits as if it is of <int32_t> type.

slide-30
SLIDE 30

VARIABLE-LENGTH FIELDS

30

CREATE TABLE AndySux ( value VARCHAR(1024) ); header 64-BIT POINTER

char[]

Variable-Length Data Blocks

INSERT INTO AndySux VALUES (”His jokes are the worst that I have ever heard. I hate him so much.");

slide-31
SLIDE 31

VARIABLE-LENGTH FIELDS

31

CREATE TABLE AndySux ( value VARCHAR(1024) ); header 64-BIT POINTER

char[]

Variable-Length Data Blocks

INSERT INTO AndySux VALUES (”His jokes are the worst that I have ever heard. I hate him so much.");

slide-32
SLIDE 32

VARIABLE-LENGTH FIELDS

32

CREATE TABLE AndySux ( value VARCHAR(1024) ); header 64-BIT POINTER

char[]

Variable-Length Data Blocks

His jokes are the worst that I have ever heard. I hate LENGTH NEXT him so much. LENGTH NEXT

INSERT INTO AndySux VALUES (”His jokes are the worst that I have ever heard. I hate him so much.");

slide-33
SLIDE 33

VARIABLE-LENGTH FIELDS

33

CREATE TABLE AndySux ( value VARCHAR(1024) ); header 64-BIT POINTER

char[]

Variable-Length Data Blocks

His|64-BIT POINTER His jokes are the worst that I have ever heard. I hate LENGTH NEXT him so much. LENGTH NEXT

INSERT INTO AndySux VALUES (”His jokes are the worst that I have ever heard. I hate him so much.");

slide-34
SLIDE 34

NULL DATA TYPES

Choice #1: Special Values

→ Designate a value to represent NULL for a particular data type (e.g., INT32_MIN).

Choice #2: Null Column Bitmap Header

→ Store a bitmap in the tuple header that specifies what attributes are null. Limits the number of columns.

Choice #3: Per Attribute Null Flag

→ Store a flag that marks that a value is null (not in header). → Have to use more space than just a single bit because this messes up with word alignment. Increases column size.

34

slide-35
SLIDE 35

NULL DATA TYPES

Choice #1: Special Values

→ Designate a value to represent NULL for a particular data type (e.g., INT32_MIN).

Choice #2: Null Column Bitmap Header

→ Store a bitmap in the tuple header that specifies what attributes are null. Limits the number of columns.

Choice #3: Per Attribute Null Flag

→ Store a flag that marks that a value is null (not in header). → Have to use more space than just a single bit because this messes up with word alignment. Increases column size.

35

slide-36
SLIDE 36

NULL DATA TYPES

Choice #1: Special Values

→ Designate a value to represent NULL for a particular data type (e.g., INT32_MIN).

Choice #2: Null Column Bitmap Header

→ Store a bitmap in the tuple header that specifies what attributes are null. Limits the number of columns.

Choice #3: Per Attribute Null Flag

→ Store a flag that marks that a value is null (not in header). → Have to use more space than just a single bit because this messes up with word alignment. Increases column size.

36

slide-37
SLIDE 37

NULL DATA TYPES

Choice #1: Special Values

→ Designate a value to represent NULL for a particular data type (e.g., INT32_MIN).

Choice #2: Null Column Bitmap Header

→ Store a bitmap in the tuple header that specifies what attributes are null. Limits the number of columns.

Choice #3: Per Attribute Null Flag

→ Store a flag that marks that a value is null (not in header). → Have to use more space than just a single bit because this messes up with word alignment. Increases column size.

37

slide-38
SLIDE 38

NULL DATA TYPES

Choice #1: Special Values

→ Designate a value to represent NULL for a particular data type (e.g., INT32_MIN).

Choice #2: Null Column Bitmap Header

→ Store a bitmap in the tuple header that specifies what attributes are null. Limits the number of columns.

Choice #3: Per Attribute Null Flag

→ Store a flag that marks that a value is null (not in header). → Have to use more space than just a single bit because this messes up with word alignment. Increases column size.

38

slide-39
SLIDE 39

DATA ALIGNMENT

A CPU accesses memory by a single memory word (64 bits) at a time. If it fetches a value that is not word-aligned, it may: →Execute two reads to load the appropriate parts

  • f the data word and reassemble them (x86).

→Read some unexpected combination of bytes assembled into a 64-bit word. →Throw an exception

39

slide-40
SLIDE 40

WORD-ALIGNED TUPLES

All attributes in a tuple must be word aligned to enable the CPU to access it without any unexpected behavior or additional work.

40

CREATE TABLE AndySux ( id INT PRIMARY KEY, cdate TIMESTAMP, color CHAR(2), zipcode INT );

char[]

slide-41
SLIDE 41

WORD-ALIGNED TUPLES

All attributes in a tuple must be word aligned to enable the CPU to access it without any unexpected behavior or additional work.

41

CREATE TABLE AndySux ( id INT PRIMARY KEY, cdate TIMESTAMP, color CHAR(2), zipcode INT ); 64-bit Word 64-bit Word 64-bit Word 64-bit Word

char[]

slide-42
SLIDE 42

WORD-ALIGNED TUPLES

All attributes in a tuple must be word aligned to enable the CPU to access it without any unexpected behavior or additional work.

42

CREATE TABLE AndySux ( id INT PRIMARY KEY, cdate TIMESTAMP, color CHAR(2), zipcode INT ); 32-bits 64-bit Word 64-bit Word 64-bit Word 64-bit Word

char[]

slide-43
SLIDE 43

WORD-ALIGNED TUPLES

All attributes in a tuple must be word aligned to enable the CPU to access it without any unexpected behavior or additional work.

43

CREATE TABLE AndySux ( id INT PRIMARY KEY, cdate TIMESTAMP, color CHAR(2), zipcode INT ); 32-bits 64-bit Word 64-bit Word 64-bit Word 64-bit Word id

char[]

slide-44
SLIDE 44

WORD-ALIGNED TUPLES

All attributes in a tuple must be word aligned to enable the CPU to access it without any unexpected behavior or additional work.

44

CREATE TABLE AndySux ( id INT PRIMARY KEY, cdate TIMESTAMP, color CHAR(2), zipcode INT ); 32-bits 64-bits 64-bit Word 64-bit Word 64-bit Word 64-bit Word id cdate

char[]

slide-45
SLIDE 45

WORD-ALIGNED TUPLES

All attributes in a tuple must be word aligned to enable the CPU to access it without any unexpected behavior or additional work.

45

CREATE TABLE AndySux ( id INT PRIMARY KEY, cdate TIMESTAMP, color CHAR(2), zipcode INT ); 32-bits 64-bits 16-bits 64-bit Word 64-bit Word 64-bit Word 64-bit Word id cdate c

char[]

slide-46
SLIDE 46

WORD-ALIGNED TUPLES

All attributes in a tuple must be word aligned to enable the CPU to access it without any unexpected behavior or additional work.

46

CREATE TABLE AndySux ( id INT PRIMARY KEY, cdate TIMESTAMP, color CHAR(2), zipcode INT ); 32-bits 64-bits 16-bits 32-bits 64-bit Word 64-bit Word 64-bit Word 64-bit Word id cdate c zipc

char[]

slide-47
SLIDE 47

WORD-ALIGNED TUPLES

All attributes in a tuple must be word aligned to enable the CPU to access it without any unexpected behavior or additional work.

47

CREATE TABLE AndySux ( id INT PRIMARY KEY, cdate TIMESTAMP, color CHAR(2), zipcode INT ); 32-bits 64-bits 16-bits 32-bits 64-bit Word 64-bit Word 64-bit Word 64-bit Word id cdate c zipc

char[]

slide-48
SLIDE 48

CREATE TABLE AndySux ( id INT PRIMARY KEY, cdate TIMESTAMP, color CHAR(2), zipcode INT );

WORD-ALIGNED TUPLES

All attributes in a tuple must be word aligned to enable the CPU to access it without any unexpected behavior or additional work.

48

64-bit Word 64-bit Word 64-bit Word 64-bit Word id cdate c zipc

00000000 00000000 00000000 00000000 0000 0000 0000 0000

char[]

32-bits 64-bits 16-bits 32-bits

slide-49
SLIDE 49

CREATE TABLE AndySux ( id INT PRIMARY KEY, cdate TIMESTAMP, color CHAR(2), zipcode INT );

WORD-ALIGNED TUPLES

All attributes in a tuple must be word aligned to enable the CPU to access it without any unexpected behavior or additional work.

49

64-bit Word 64-bit Word 64-bit Word 64-bit Word id cdate c zipc

00000000 00000000 00000000 00000000 0000 0000 0000 0000

char[]

32-bits 64-bits 16-bits 32-bits

slide-50
SLIDE 50

TABLE STORAGE FORMAT

We looked at how to store fields and tuples Storage Models

→ N-ary Storage Model (NSM) / Row-Store → Decomposition Storage Model (DSM) / Column-Store → Flexible or Hybrid Storage Model

50

slide-51
SLIDE 51

N-ARY STORAGE MODEL (NSM)

The DBMS stores all of the attributes for a single tuple contiguously. Ideal for OLTP workloads where txns tend to

  • perate only on an individual entity and insert-

heavy workloads. Use the tuple-at-a-time iterator model.

51

slide-52
SLIDE 52

N-ARY STORAGE MODEL (NSM)

52

1 Georgia Tech 15000 Atlanta 2 Wisconsin 30000 Madison 3 Carnegie Mellon 6000 Pittsburgh ID University Enrollment City 4 UC Berkeley 30000 Berkeley

slide-53
SLIDE 53

NSM PHYSICAL STORAGE

Choice #1: Heap-Organized Tables

→ Tuples are stored in blocks called a heap. → The heap does not necessarily define an order.

Choice #2: Index-Organized Tables

→ Tuples are stored in the primary key index itself. → Index does define an order based on the primary key.

53

slide-54
SLIDE 54

N-ARY STORAGE MODEL (NSM)

Advantages

→ Fast inserts, updates, and deletes. → Good for queries that need the entire tuple. → Can use index-oriented physical storage.

Disadvantages

→ Not good for scanning large portions of the table and/or a subset of the attributes. → OLAP workloads & wide tables with lots of attributes

54

slide-55
SLIDE 55

DECOMPOSITION STORAGE MODEL (DSM)

The DBMS stores a single attribute for all tuples contiguously in a block of data.

→ Sometimes also called vertical partitioning.

Ideal for OLAP workloads where read-only queries perform large scans over a subset of the table’s attributes. Use the vector-at-a-time iterator model.

55

slide-56
SLIDE 56

56

1 2 3 4

DECOMPOSITION STORAGE MODEL (DSM)

Georgia Tech Wisconsin Carnegie Mellon UC Berkeley 15000 30000 6000 30000 Atlanta Madison Pittsburgh Berkeley ID University Enrollment City

slide-57
SLIDE 57

DECOMPOSITION STORAGE MODEL (DSM)

1970s: Cantor DBMS (Swedish defense ministry) 1980s: DSM Proposal 1990s: SybaseIQ (in-memory query accelerator) 2000s: Vertica, Vectorwise, MonetDB 2010s: “The Big Three” Cloudera Impala, Amazon Redshift, SAP HANA, MemSQL

57

slide-58
SLIDE 58

TUPLE IDENTIFICATION IN DSM

Choice #1: Fixed-length Offsets

→ Each value is the same length for an attribute.

Choice #2: Embedded Tuple Ids

→ Each value is stored with its tuple id in a column.

58

Offsets

1 2 3

A B C D

Embedded Ids

A

1 2 3

B

1 2 3

C

1 2 3

D

1 2 3

slide-59
SLIDE 59

DECOMPOSITION STORAGE MODEL (DSM)

Advantages

→ Reduces the amount wasted work because the DBMS

  • nly reads the data that it needs.

→ Better compression.

Disadvantages

→ Slow for point queries, inserts, updates, and deletes because of tuple splitting/stitching (OLTP workloads).

59

slide-60
SLIDE 60

OBSERVATION

Can we build a single system that supports both OLTP and OLAP workloads? Data is “hot” when first entered into database

→ A newly inserted tuple is more likely to be updated again the near future.

As a tuple ages, it is updated less frequently.

→ At some point, a tuple is only accessed in read-only queries along with other tuples.

What if we want to use this data to make decisions that affect new txns?

60

slide-61
SLIDE 61

BIFURCATED ENVIRONMENT

61

Extract Transform Load OLAP Data Warehouse OLTP Data Silos

slide-62
SLIDE 62

BIFURCATED ENVIRONMENT

62

Extract Transform Load OLAP Data Warehouse OLTP Data Silos

slide-63
SLIDE 63

BIFURCATED ENVIRONMENT

63

Extract Transform Load OLAP Data Warehouse OLTP Data Silos

slide-64
SLIDE 64

HYBRID STORAGE MODEL

Single database instance that uses different storage models for hot and cold data. Store new data in NSM for fast OLTP Migrate data to DSM for more efficient OLAP

64

slide-65
SLIDE 65

HYBRID STORAGE MODEL

Choice #1: Separate Execution Engines

→ Use separate execution engines that are optimized for either NSM or DSM databases.

Choice #2: Single, Flexible Architecture

→ Use single execution engine that is able to efficiently

  • perate on both NSM and DSM databases.

65

slide-66
SLIDE 66

SEPARATE EXECUTION ENGINES

Run separate “internal” DBMSs that each only

  • perate on DSM or NSM data.

→ Need to combine query results from both engines to appear as a single logical database to the application. → Have to use a synchronization method (e.g., 2PC) if a txn spans execution engines.

Two approaches to do this:

→ Fractured Mirrors (Oracle, IBM) → Delta Store (SAP HANA)

66

slide-67
SLIDE 67

FRACTURED MIRRORS

Store a second copy of the database in a DSM layout that is automatically updated.

→ All updates are first entered in NSM then eventually copied into DSM mirror.

67

A CASE FOR FRACTURED MIRRORS VLDB 2002

slide-68
SLIDE 68

FRACTURED MIRRORS

Store a second copy of the database in a DSM layout that is automatically updated.

→ All updates are first entered in NSM then eventually copied into DSM mirror.

68

A CASE FOR FRACTURED MIRRORS VLDB 2002

NSM (Primary) DSM (Mirror)

slide-69
SLIDE 69

FRACTURED MIRRORS

Store a second copy of the database in a DSM layout that is automatically updated.

→ All updates are first entered in NSM then eventually copied into DSM mirror.

69

A CASE FOR FRACTURED MIRRORS VLDB 2002

OLTP Updates NSM (Primary) DSM (Mirror)

slide-70
SLIDE 70

FRACTURED MIRRORS

Store a second copy of the database in a DSM layout that is automatically updated.

→ All updates are first entered in NSM then eventually copied into DSM mirror.

70

A CASE FOR FRACTURED MIRRORS VLDB 2002

OLTP Updates NSM (Primary) DSM (Mirror)

slide-71
SLIDE 71

FRACTURED MIRRORS

Store a second copy of the database in a DSM layout that is automatically updated.

→ All updates are first entered in NSM then eventually copied into DSM mirror.

71

A CASE FOR FRACTURED MIRRORS VLDB 2002

OLTP Updates OLAP Queries NSM (Primary) DSM (Mirror)

slide-72
SLIDE 72

DELTA STORE

Stage updates to the database in an NSM table. A background thread migrates updates from delta store and applies them to DSM data.

72

Delta Store DSM Historical Data

slide-73
SLIDE 73

DELTA STORE

Stage updates to the database in an NSM table. A background thread migrates updates from delta store and applies them to DSM data.

73

Delta Store DSM Historical Data OLTP Updates

slide-74
SLIDE 74

DELTA STORE

Stage updates to the database in an NSM table. A background thread migrates updates from delta store and applies them to DSM data.

74

Delta Store DSM Historical Data OLTP Updates

slide-75
SLIDE 75

CATEGORIZING DATA

Choice #1: Manual Approach

→ DBA specifies what tables should be stored as DSM.

Choice #2: Off-line Approach

→ DBMS monitors access logs offline and then makes decision about what data to move to DSM.

Choice #3: On-line Approach

→ DBMS tracks access patterns at runtime and then makes decision about what data to move to DSM.

75

slide-76
SLIDE 76

PELOTON ADAPTIVE STORAGE

Employ a single execution engine architecture that is able to operate on both NSM and DSM data.

→ Don’t need to store two copies of the database. → Don’t need to sync multiple database segments.

Note that a DBMS can still use the delta-store approach with this single-engine architecture.

76

BRIDGING THE ARCHIPELAGO BETWEEN ROW-STORES AND COLUMN-STORES FOR HYBRID WORKLOADS SIGMOD 2016

slide-77
SLIDE 77

77

1 2 3 4

FLEXIBLE STORAGE MODEL

Georgia Tech 15000 Wisconsin 30000 Carnegie Mellon 6000 UC Berkeley 30000 Atlanta Madison Pittsburgh Berkeley ID University Enrollment City

slide-78
SLIDE 78

PELOTON ADAPTIVE STORAGE

78

Original Data

SELECT AVG(B) FROM AndySux WHERE C = “yyy” UPDATE AndySux SET A = 123, B = 456, C = 789 WHERE D = “xxx”

A B C D

slide-79
SLIDE 79

PELOTON ADAPTIVE STORAGE

79

Original Data

SELECT AVG(B) FROM AndySux WHERE C = “yyy” UPDATE AndySux SET A = 123, B = 456, C = 789 WHERE D = “xxx”

A B C D

slide-80
SLIDE 80

PELOTON ADAPTIVE STORAGE

80

Original Data

SELECT AVG(B) FROM AndySux WHERE C = “yyy” UPDATE AndySux SET A = 123, B = 456, C = 789 WHERE D = “xxx”

A B C D

Cold Hot

slide-81
SLIDE 81

PELOTON ADAPTIVE STORAGE

81

Original Data Adapted Data

SELECT AVG(B) FROM AndySux WHERE C = “yyy” UPDATE AndySux SET A = 123, B = 456, C = 789 WHERE D = “xxx”

A B C D A B C D A B C D

Cold Hot

slide-82
SLIDE 82

PELOTON ADAPTIVE STORAGE

82

Original Data Adapted Data

SELECT AVG(B) FROM AndySux WHERE C = “yyy” UPDATE AndySux SET A = 123, B = 456, C = 789 WHERE D = “xxx”

A B C D A B C D A B C D

Cold Hot

slide-83
SLIDE 83

TILE ARCHITECTURE

Introduce an indirection layer that abstracts the true layout of tuples from query operators.

83

A B C D

slide-84
SLIDE 84

TILE ARCHITECTURE

Introduce an indirection layer that abstracts the true layout of tuples from query operators.

84

Tile Group A Tile Group B

A B C D

slide-85
SLIDE 85

TILE ARCHITECTURE

Introduce an indirection layer that abstracts the true layout of tuples from query operators.

85

Tile Group A Tile Group B

A B C D

Tile #1 Tile #2 Tile #3 Tile #4

slide-86
SLIDE 86

TILE ARCHITECTURE

Introduce an indirection layer that abstracts the true layout of tuples from query operators.

86

A B C D

Tile #1 Tile #2 Tile #3 Tile #4

H

+ + + + +

Tile Group Header

slide-87
SLIDE 87

TILE ARCHITECTURE

Introduce an indirection layer that abstracts the true layout of tuples from query operators.

87

A B C D H

+ + + + +

slide-88
SLIDE 88

AS

γ

s

TILE ARCHITECTURE

Introduce an indirection layer that abstracts the true layout of tuples from query operators.

88

A B C D H

+ + + + +

SELECT AVG(B) FROM AndySux WHERE C = “yyy”

slide-89
SLIDE 89

AS

γ

s

TILE ARCHITECTURE

Introduce an indirection layer that abstracts the true layout of tuples from query operators.

89

A B C D H

+ + + + +

SELECT AVG(B) FROM AndySux WHERE C = “yyy”

1 2

B

1 2 3

slide-90
SLIDE 90

AS

γ

s

TILE ARCHITECTURE

Introduce an indirection layer that abstracts the true layout of tuples from query operators.

90

A B C D H

+ + + + +

SELECT AVG(B) FROM AndySux WHERE C = “yyy”

1 2

B

1 2 3

slide-91
SLIDE 91

91

FLEXIBLE STORAGE MODEL

1 2 Georgia Tech 15000 Wisconsin 30000 Atlanta Madison ID University Enrollment City 3 Carnegie Mellon 4 UC Berkeley 6000 30000 Pittsburgh Berkeley

slide-92
SLIDE 92

PELOTON ADAPTIVE STORAGE

92

Sep-15

Execution Time (ms)

Sep-16 Sep-17 Sep-18 Sep-19 Sep-20

slide-93
SLIDE 93

PELOTON ADAPTIVE STORAGE

93

Sep-15

Sc Scan In Insert rt Sc Scan In Insert rt Sc Scan In Insert rt Sc Scan In Insert rt Sc Scan In Insert rt Sc Scan In Insert rt

Execution Time (ms)

Sep-16 Sep-17 Sep-18 Sep-19 Sep-20

slide-94
SLIDE 94

PELOTON ADAPTIVE STORAGE

94

400 800 1200 1600

Row Layout Column Layout Adaptive Layout Sep-15

Sc Scan In Insert rt Sc Scan In Insert rt Sc Scan In Insert rt Sc Scan In Insert rt Sc Scan In Insert rt Sc Scan In Insert rt

Execution Time (ms)

Sep-16 Sep-17 Sep-18 Sep-19 Sep-20

slide-95
SLIDE 95

PELOTON ADAPTIVE STORAGE

95

400 800 1200 1600

Row Layout Column Layout Adaptive Layout Sep-15

Sc Scan In Insert rt Sc Scan In Insert rt Sc Scan In Insert rt Sc Scan In Insert rt Sc Scan In Insert rt Sc Scan In Insert rt

Execution Time (ms)

Sep-16 Sep-17 Sep-18 Sep-19 Sep-20

slide-96
SLIDE 96

PELOTON ADAPTIVE STORAGE

96

400 800 1200 1600

Row Layout Column Layout Adaptive Layout Sep-15

Sc Scan In Insert rt Sc Scan In Insert rt Sc Scan In Insert rt Sc Scan In Insert rt Sc Scan In Insert rt Sc Scan In Insert rt

Execution Time (ms)

Sep-16 Sep-17 Sep-18 Sep-19 Sep-20

slide-97
SLIDE 97

PARTING THOUGHTS

A flexible architecture that supports a hybrid storage model is the next major trend in DBMSs

→ This will enable relational DBMSs to support both OLTP and OLAP database workloads.

97

slide-98
SLIDE 98

NEXT CLASS

Database Compression Reminder: Homework 0 is due today. Reminder: Homework 1 has been released. It will be due on Jan 24.

98