
SLIDE 1

DATA ANALYTICS USING DEEP LEARNING

GT 8803 // Fall 2019 // Joy Arulraj

LECTURE #06: DISK-CENTRIC AND IN-MEMORY DATABASE SYSTEMS

SLIDE 2

ADMINISTRIVIA

  • Project ideas
    – List shared on Piazza
    – Start looking for teammates!
    – Sign up for discussion slots during office hours

SLIDE 3

LAST CLASS

  • History of DBMSs
    – In a way, it was really a history of data models
  • Data models
    – Hierarchical data model (tree): IMS
    – Network data model (graph): CODASYL
    – Relational data model (tables): System R, INGRES
  • Overarching theme across all these systems
    – They were all disk-based DBMSs

SLIDE 4

TODAY’S AGENDA

  • Disk-centric DBMSs
  • In-memory DBMSs

SLIDE 5

DISK-CENTRIC DBMSs

SLIDE 6

ANATOMY OF A DATABASE SYSTEM

(Component diagram)
  • Process Manager: Connection Manager + Admission Control
  • Query Processor: Query Parser, Query Optimizer, Query Executor
  • Transactional Storage Manager: Lock Manager (Concurrency Control), Access Methods (or Indexes), Buffer Pool Manager, Log Manager
  • Shared Utilities: Memory Manager + Disk Manager, Networking Manager

Source: Anatomy of a Database System

SLIDE 7

ANATOMY OF A DATABASE SYSTEM

  • Process Manager
    – Manages client connections
  • Query Processor
    – Parses, plans, and executes queries on top of the storage manager
  • Transactional Storage Manager
    – Knits together buffer management, concurrency control, logging, and recovery
  • Shared Utilities
    – Manage hardware resources across threads

SLIDE 8

TOPICS

  • Implications of the availability of large DRAM chips for database systems
    – Buffer Management
    – Query Processing
    – Concurrency Control
    – Logging and Recovery

SLIDE 9

BACKGROUND

  • Much of the history of DBMSs is about dealing with the limitations of hardware.
  • Hardware was much different when the original DBMSs were designed:
    – Uniprocessor (single-core CPU)
    – RAM was severely limited (a few MB)
    – The database had to be stored on disk
    – Disk is slow. No seriously, I mean really slow.

SLIDE 10

BACKGROUND

  • But now DRAM capacities are large enough that most databases can fit in memory.
    – Structured data sets are smaller (e.g., tables with numeric data).
    – Unstructured data sets are larger (e.g., videos).
  • So why not just use a "traditional" disk-oriented DBMS with a really large cache?

SLIDE 11–18

DISK-ORIENTED DBMS OVERHEAD

(Pie chart of measured CPU instructions, revealed one slice per slide:)
  – Buffer pool: 34%
  – Latching: 14%
  – Locking: 16%
  – Logging: 12%
  – B-tree keys: 16%
  – Real work: 7%

Source: OLTP Through the Looking Glass, and What We Found There. SIGMOD, pp. 981-992, 2008.
SLIDE 19

BUFFER MANAGEMENT

  • The primary storage location of the database is on non-volatile storage (e.g., SSD).
    – The database is stored in a file as a collection of fixed-length blocks called slotted pages on disk.
  • The system uses a volatile in-memory buffer pool to cache blocks fetched from disk.
    – Its job is to manage the movement of those blocks back and forth between disk and memory.

SLIDE 20

BUFFER MANAGEMENT

  • When a query accesses a page, the DBMS checks whether that page is already in memory in the buffer pool.
    – If it's not, then the DBMS has to retrieve it from disk and copy it into a free frame in the buffer pool.
    – If there are no free frames, then it finds a page to evict, guided by the page replacement policy.
    – If the page being evicted is dirty, then the DBMS has to write it back to disk to ensure the durability (ACID) of data.
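The miss/evict/write-back steps above can be sketched in a few lines of Python. This is a toy model, not any real DBMS's buffer pool: `DISK`, `BufferPool`, and the LRU policy (via `OrderedDict`) are all illustrative assumptions.

```python
from collections import OrderedDict

# Hypothetical on-disk store: page id -> page bytes.
DISK = {pid: f"data-{pid}".encode() for pid in range(10)}

class BufferPool:
    """Minimal buffer pool sketch: fixed frame budget, LRU eviction,
    and write-back of dirty pages on eviction."""

    def __init__(self, num_frames):
        self.num_frames = num_frames
        self.frames = OrderedDict()   # page id -> page bytes, in LRU order
        self.dirty = set()

    def fetch(self, pid):
        if pid in self.frames:                    # hit: refresh LRU position
            self.frames.move_to_end(pid)
            return self.frames[pid]
        if len(self.frames) >= self.num_frames:   # no free frame: evict one
            victim, data = self.frames.popitem(last=False)
            if victim in self.dirty:              # dirty page: write back first
                DISK[victim] = data
                self.dirty.discard(victim)
        self.frames[pid] = DISK[pid]              # copy the page in from disk
        return self.frames[pid]

    def write(self, pid, data):
        self.fetch(pid)
        self.frames[pid] = data
        self.dirty.add(pid)                       # flushed only on eviction

pool = BufferPool(num_frames=2)
pool.write(0, b"updated")
pool.fetch(1)
pool.fetch(2)    # evicts dirty page 0, writing it back to DISK first
```

After the last fetch, page 0 has been evicted and its updated contents written back, which is exactly the durability step the slide describes.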

SLIDE 21

BUFFER MANAGEMENT

  • The page replacement policy is a differentiating factor between open-source and commercial DBMSs.
    – What kind of data does the page contain?
    – Is the page dirty?
    – How likely is the page to be accessed in the near future?
    – Examples: LRU, LFU, CLOCK, ARC
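Of the policies listed, CLOCK is simple enough to sketch here: each frame gets a reference bit, and a sweeping "hand" gives recently used pages a second chance before evicting. The class below is an illustrative toy, not taken from any production system.

```python
class ClockReplacer:
    """CLOCK approximates LRU: each frame has a reference bit; the hand
    sweeps the frames, clearing set bits, and evicts the first frame
    whose bit is already 0."""

    def __init__(self, num_frames):
        self.pages = [None] * num_frames   # page id held by each frame
        self.ref = [0] * num_frames        # reference bits
        self.hand = 0

    def access(self, pid):
        if pid in self.pages:              # hit: set the reference bit
            self.ref[self.pages.index(pid)] = 1
            return None
        while True:                        # miss: sweep until a ref bit is 0
            if self.ref[self.hand] == 0:
                victim = self.pages[self.hand]
                self.pages[self.hand] = pid
                self.ref[self.hand] = 1
                self.hand = (self.hand + 1) % len(self.pages)
                return victim              # evicted page id (None if frame was free)
            self.ref[self.hand] = 0        # second chance: clear and move on
            self.hand = (self.hand + 1) % len(self.pages)

clock = ClockReplacer(num_frames=2)
clock.access(10)
clock.access(11)
clock.access(10)            # hit: reference bit stays set
evicted = clock.access(12)  # sweep clears both bits, then evicts at the hand
```

CLOCK is cheaper than true LRU because a hit only sets a bit instead of reordering a list, which is why variants of it show up in real buffer pools.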

SLIDE 22

BUFFER MANAGEMENT

  • Once the page is in memory, the DBMS translates any on-disk addresses to their in-memory addresses.
    – Page Identifier [#100] → Page Pointer [0x5050]

SLIDE 23–33

BUFFER MANAGEMENT

(Diagram, built up across several slides: an index lookup yields a Page Id + Slot #; the Page Table maps page ids to frames in the Buffer Pool, which holds pages such as page6 and page4 fetched from the on-disk database of Slotted Pages, while the Page Table tracks entries for page0, page1, and page2.)

SLIDE 34

BUFFER MANAGEMENT

  • Every tuple access has to go through the buffer pool manager, regardless of whether that data will always be in memory.
    – The DBMS always has to translate a tuple's record id to its memory location.
    – A worker thread has to pin the pages it needs to make sure they are not swapped to disk.

SLIDE 35

BUFFER MANAGEMENT

SLIDE 36

BUFFER MANAGEMENT

  • Q: What do we gain by managing an in-memory buffer?
    – A: We accelerate query processing by storing frequently accessed pages in fast memory.
  • Q: Can we "learn" an optimal page replacement policy?
    – A: A recent paper from Google learns memory access patterns with LSTM models.


SLIDE 39

QUERY PROCESSING

Tuple-at-a-time
→ Each operator calls next on its child to get the next tuple to process.

Operator-at-a-time
→ Each operator materializes its entire output for its parent operator.

Vector-at-a-time
→ Each operator calls next on its child to get the next chunk of data to process.

Example query:
SELECT A.id, B.value FROM A, B
WHERE A.id = B.id AND B.value > 100

(Plan diagram: π A.id, B.value over σ value>100 over a join on A.id = B.id of tables A and B)
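The three processing models can be sketched with Python generators. The toy tables `A` and `B` and the `hash_join` helper are illustrative assumptions, not the course's actual executor; the point is only how tuples flow between operators.

```python
# Toy tables for: SELECT A.id, B.value FROM A, B
#                 WHERE A.id = B.id AND B.value > 100
A = [{"id": i} for i in range(5)]
B = [{"id": i, "value": i * 60} for i in range(5)]   # values 0, 60, ..., 240

def scan(table):
    # Tuple-at-a-time: each next() yields exactly one tuple.
    yield from table

def select(child, pred):
    for row in child:
        if pred(row):
            yield row

def hash_join(build_iter, probe_iter, key):
    # Build side is materialized; probe side streams tuple-at-a-time.
    build = {row[key]: row for row in build_iter}
    for row in probe_iter:
        if row[key] in build:
            yield {**build[row[key]], **row}

def vector_scan(table, chunk=2):
    # Vector-at-a-time: each next() yields a whole chunk of tuples,
    # amortizing the per-call overhead.
    for i in range(0, len(table), chunk):
        yield table[i:i + chunk]

# Tuple-at-a-time pipeline: the root pulls one tuple per next() call.
plan = select(hash_join(scan(A), scan(B), "id"), lambda r: r["value"] > 100)
result = [(r["id"], r["value"]) for r in plan]

# Operator-at-a-time: each operator materializes its entire output.
joined = [{**a, **b} for a in A for b in B if a["id"] == b["id"]]
materialized = [(r["id"], r["value"]) for r in joined if r["value"] > 100]

# Vector-at-a-time: operators pass chunks instead of single tuples.
vectorized = [(r["id"], r["value"])
              for chunk in vector_scan(B)
              for r in chunk if r["value"] > 100]
```

All three variants compute the same answer; they differ only in how much work is done per operator call, which is exactly the trade-off the later slides discuss.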

SLIDE 40

QUERY PROCESSING

  • The best strategy for executing a query plan in a disk-centric DBMS:
    – Sequential scans over a table are much faster than random accesses.
  • The traditional tuple-at-a-time iterator model works well
    – because the output of an operator will not fit in limited memory.

SLIDE 41

CONCURRENCY CONTROL

  • In a disk-oriented DBMS, the system assumes that a txn could stall at any time when it tries to access data that is not in memory.

SLIDE 42

CONCURRENCY CONTROL

  • Execute other txns at the same time so that if one txn stalls then others can keep running.
    – This is not because the DBMS is trying to use all of the CPU's cores (we are still focusing on single-core CPUs).
    – We do this to let the system make forward progress by executing another txn while the current txn is waiting for data to be fetched from disk.

SLIDE 43

CONCURRENCY CONTROL

  • Concurrency control policy
    – Responsible for deciding how to interleave operations of concurrent transactions in such a way that it appears as if they are running serially.
    – This property is referred to as serializability of transactions.

SLIDE 44

CONCURRENCY CONTROL

  • Concurrency control policy
    – The DBMS has to set locks and latches to ensure the highest level of isolation (ACID) between transactions.
    – Locks are stored in a separate data structure (the lock table) to avoid being swapped to disk.
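A lock table of the kind described above can be sketched as a map from record ids to holders and modes. This is a deliberately simplified model (shared/exclusive modes only, no waiting or deadlock detection); the `LockTable` class and its method names are hypothetical.

```python
import threading
from collections import defaultdict

class LockTable:
    """Toy lock table kept separate from the data: maps each record id
    to its current holders and lock mode ("S" shared / "X" exclusive)."""

    def __init__(self):
        self.latch = threading.Lock()     # short-term latch on the table itself
        self.holders = defaultdict(set)   # record id -> set of txn ids
        self.mode = {}                    # record id -> "S" or "X"

    def try_lock(self, txn, rid, mode):
        with self.latch:
            if not self.holders[rid]:               # free: grant immediately
                self.holders[rid].add(txn)
                self.mode[rid] = mode
                return True
            if mode == "S" and self.mode[rid] == "S":
                self.holders[rid].add(txn)          # shared locks are compatible
                return True
            return False                            # conflict: caller must wait

    def release(self, txn, rid):
        with self.latch:
            self.holders[rid].discard(txn)
            if not self.holders[rid]:
                self.mode.pop(rid, None)

lt = LockTable()
assert lt.try_lock("t1", 42, "S")
assert lt.try_lock("t2", 42, "S")      # compatible: both shared
assert not lt.try_lock("t3", 42, "X")  # exclusive conflicts with shared
```

Note the two levels here: the `latch` protects the lock table's own data structure for a few instructions, while the locks it records protect database records for the length of a transaction.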

SLIDE 45

LOGGING & RECOVERY

  • This protocol helps ensure the atomicity and durability properties (ACID).
    – Durability: Changes made by committed transactions must be present in the database after recovering from a power failure.
    – Atomicity: Changes made by uncommitted (in-progress/aborted) transactions must not be present in the database after recovering from a power failure.

SLIDE 46

LOGGING & RECOVERY

  • DBMSs use STEAL and NO-FORCE buffer pool management policies.
    – STEAL: The DBMS can flush pages dirtied by uncommitted transactions to disk.
    – NO-FORCE: The DBMS is not required to flush all pages dirtied by committed transactions to disk.
    – So all page modifications have to be flushed to the write-ahead log (WAL) before a txn can commit.

SLIDE 47

LOGGING & RECOVERY

  • Each log entry contains the before and after images of modified tuples.
    – STEAL: Modifications made by uncommitted transactions that were flushed to disk have to be rolled back.
    – NO-FORCE: Modifications made by committed transactions might not have been flushed to disk.

SLIDE 48

LOGGING & RECOVERY

  • Each log entry contains the before and after images of modified tuples.
    – Recording the before and after images in the log is critical to ensuring atomicity and durability.
    – Lots of work goes into keeping track of log sequence numbers (LSNs) all throughout the DBMS.
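The before/after images drive the two recovery passes: redo re-applies committed changes that NO-FORCE may have left off disk, and undo rolls back uncommitted changes that STEAL may have flushed. The log and database below are made-up toy data; the pass structure is the point.

```python
# Hypothetical physical log: each entry carries the txn id and the
# before/after image of one tuple, plus commit records.
log = [
    {"txn": 1, "key": "a", "before": 10, "after": 11},
    {"txn": 2, "key": "b", "before": 20, "after": 21},
    {"txn": 1, "type": "commit"},
    # crash here: txn 2 never committed
]

# Database state found on disk after the crash: under STEAL, txn 2's
# uncommitted write to "b" made it out; under NO-FORCE, txn 1's
# committed write to "a" did not.
db = {"a": 10, "b": 21}

committed = {e["txn"] for e in log if e.get("type") == "commit"}

# Redo pass (forward): re-apply the after images of committed txns.
for entry in log:
    if "key" in entry and entry["txn"] in committed:
        db[entry["key"]] = entry["after"]

# Undo pass (backward): restore the before images of uncommitted txns.
for entry in reversed(log):
    if "key" in entry and entry["txn"] not in committed:
        db[entry["key"]] = entry["before"]

# db is now {"a": 11, "b": 20}: durable committed change, rolled-back
# uncommitted change.
```

Dropping either image breaks one property: without after images, committed work can be lost (durability); without before images, uncommitted work cannot be rolled back (atomicity).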

SLIDE 49

LOGGING & RECOVERY

SLIDE 50

LOGGING & RECOVERY

  • Q: What would happen if we used a NO-STEAL policy?
    – A: We could not support transactions that make changes larger than the buffer pool.
  • Q: What would happen if we used a FORCE policy?
    – A: Performance would drop by orders of magnitude, since we would need to randomly write to disk all the time.


SLIDE 53

TAKEAWAYS

  • Disk-oriented DBMSs do a lot of extra work because they are predicated on the assumption that data has to reside on disk.
  • In-memory DBMSs maximize performance by optimizing these protocols and algorithms.

SLIDE 54

IN-MEMORY DBMSs

SLIDE 55

IN-MEMORY DBMSs

  • Assume that the primary storage location of the database is permanently in memory.
  • Early ideas were proposed in the 1980s, but the approach is only now feasible because DRAM prices are low and capacities are high.

SLIDE 56

BOTTLENECKS

  • If I/O is no longer the slowest resource, much of the DBMS's architecture will have to change to account for other bottlenecks:
    – Locking/latching
    – Cache misses
    – Predicate evaluations
    – Data movement & copying
    – Networking (between application & DBMS)

SLIDE 57

STORAGE ACCESS LATENCIES

                  L3       DRAM     SSD           HDD
Read Latency      ~20 ns   60 ns    25,000 ns     10,000,000 ns
Write Latency     ~20 ns   60 ns    300,000 ns    10,000,000 ns

Source: Let's Talk About Storage & Recovery Methods for Non-Volatile Memory Database Systems. SIGMOD, pp. 707-722, 2015.

SLIDE 58

STORAGE ACCESS LATENCIES

Jim Gray's analogy:
→ Reading from the L3 cache: reading a book on a table.
→ Reading from an HDD: flying to Pluto to read that book.

Because everything fits in DRAM, we can do more sophisticated things in software.

SLIDE 59

BUFFER MANAGEMENT

  • An in-memory DBMS does not need to store the database in slotted pages, but it will still organize tuples in blocks:
    – Direct memory pointers vs. tuple identifiers
    – Separate pools for fixed-length data (e.g., numeric data) and variable-length data (e.g., images)
    – Use checksums to detect software errors that could trash the database.
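The checksum idea from the last bullet is easy to demonstrate: store a checksum of each block alongside its payload and re-verify it on read. Here is a minimal sketch using CRC32 from Python's standard library; the block layout (4-byte big-endian checksum prefix) is an arbitrary assumption for illustration.

```python
import zlib

def make_block(payload: bytes) -> bytes:
    """Prefix the block with a CRC32 checksum of its payload."""
    return zlib.crc32(payload).to_bytes(4, "big") + payload

def verify_block(block: bytes) -> bool:
    """Recompute the checksum and compare against the stored one."""
    stored = int.from_bytes(block[:4], "big")
    return stored == zlib.crc32(block[4:])

block = make_block(b"tuple data")
assert verify_block(block)

# A stray write from a buggy pointer trashes one byte of the payload:
corrupted = block[:4] + b"tuple datX"
assert not verify_block(corrupted)
```

This catches silent corruption from software bugs; without the disk's own error detection in the path, the in-memory DBMS has to do this checking itself.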

SLIDE 60–63

BUFFER MANAGEMENT

(Diagram, built up across several slides: an index entry holds a direct Memory Address pointing into the Fixed-Length Data Blocks, which in turn can reference the Variable-Length Data Blocks.)
slide-64
SLIDE 64

GT 8803 // Fall 2019

bUFFER MANAGEMENT

  • DRAM is fast, but data is not accessed with

the same frequency and in the same manner.

– Hot Data: OLTP Operations (Tweets posted yesterday) – Cold Data: OLAP Queries (Tweets posted last year)

  • We will study techniques for how to bring

back disk-resident data without slowing down the entire system.

64

SLIDE 65

QUERY PROCESSING

Example query:
SELECT A.id, B.value FROM A, B
WHERE A.id = B.id AND B.value > 100

(Plan diagram: π A.id, B.value over σ value>100 over a join on A.id = B.id of tables A and B)

SLIDE 66

QUERY PROCESSING

Tuple-at-a-time
→ Each operator calls next on its child to get the next tuple to process.

Operator-at-a-time
→ Each operator materializes its entire output for its parent operator.

Vector-at-a-time
→ Each operator calls next on its child to get the next chunk of data to process.

Example query:
SELECT A.id, B.value FROM A, B
WHERE A.id = B.id AND B.value > 100

(Plan diagram: π A.id, B.value over σ value>100 over a join on A.id = B.id of tables A and B)


SLIDE 68

QUERY PROCESSING

  • The best strategy for executing a query plan changes when all of the data is already in memory.
    – Sequential scans are no longer significantly faster than random accesses.
  • The traditional tuple-at-a-time iterator model is too slow because of function calls.
    – This problem is more significant in OLAP DBMSs.

SLIDE 69

QUERY PROCESSING

SLIDE 70

QUERY PROCESSING

  • Q: For query processing in in-memory systems, do sequential scans still beat random accesses?
    – A: Sequential scans are no longer significantly faster than random accesses.
  • Q: Will the traditional tuple-at-a-time iterator model work well now?
    – A: No; it is too slow because of function calls (virtual table lookups).


SLIDE 73

CONCURRENCY CONTROL

  • Observation: The cost of a txn acquiring a lock is the same as accessing data (since the lock data is also in memory).
  • An in-memory DBMS may want to detect conflicts at a different granularity.
    – Fine-grained locking allows for better concurrency but requires more locks.
    – Coarse-grained locking requires fewer locks but limits the amount of concurrency.

SLIDE 74

CONCURRENCY CONTROL

  • The DBMS can store locking information about each tuple together with its data.
    – This helps with CPU cache locality.
    – Mutexes are too slow; we need to use compare-and-swap (CAS) instructions.

SLIDE 75

CONCURRENCY CONTROL

  • Disk-oriented DBMSs
    – Stall during disk I/O.
  • Memory-oriented DBMSs
    – The new bottleneck is contention caused by txns executing on multiple cores trying to access data at the same time.

SLIDE 76

LOGGING & RECOVERY

  • The DBMS still needs a WAL on disk, since the system could halt at any time.
    – Use group commit to batch log entries and flush them together to amortize the fsync cost.
    – It may be possible to use more lightweight logging schemes (e.g., only store redo information with NO-STEAL).
    – But since there are no "dirty" pages, there is no need to maintain LSNs throughout the system.
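Group commit can be sketched as an append buffer plus one `fsync` per batch. The `GroupCommitLog` class below is a hypothetical toy (real systems flush on a timer or when the batch fills, and block each txn until its records are durable); it only shows the amortization idea.

```python
import os
import tempfile

class GroupCommitLog:
    """Group-commit sketch: txns append log records to an in-memory
    batch; a single write + fsync makes the whole batch durable."""

    def __init__(self, path):
        self.file = open(path, "ab")
        self.batch = []

    def append(self, record: bytes):
        self.batch.append(record)      # the txn waits until the next flush

    def flush(self):
        for record in self.batch:
            self.file.write(record + b"\n")
        self.file.flush()
        os.fsync(self.file.fileno())   # one fsync amortized over the batch
        self.batch.clear()

path = os.path.join(tempfile.mkdtemp(), "wal.log")
wal = GroupCommitLog(path)
wal.append(b"txn1: a=11")
wal.append(b"txn2: b=21")
wal.flush()    # both txns become durable with a single fsync
```

Since an `fsync` costs roughly the same whether it covers one record or fifty, batching turns the dominant per-commit cost into a shared one.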

SLIDE 77

LOGGING & RECOVERY

  • The system also still takes checkpoints to speed up recovery time.
  • Different methods for checkpointing:
    – Old idea: Maintain a second copy of the database in memory that is updated by replaying the WAL.
    – Switch to a special "copy-on-write" mode and then write a dump of the database to disk.
    – Fork the DBMS process and have the child process write its contents to disk (using virtual memory).
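The fork-based approach can be demonstrated directly: after `os.fork()`, the child's address space is a copy-on-write snapshot, so the parent can keep mutating while the child writes out a consistent image. A minimal sketch (Unix-only, since it relies on `os.fork`; the JSON dump format and function names are illustrative):

```python
import json
import os
import tempfile

def checkpoint(db: dict, path: str) -> None:
    """Fork-based checkpoint sketch: the child inherits a copy-on-write
    snapshot of memory and writes it out while the parent keeps going."""
    pid = os.fork()
    if pid == 0:                 # child: sees the state frozen at fork time
        with open(path, "w") as f:
            json.dump(db, f)
        os._exit(0)
    db["a"] = 999                # parent mutates immediately after forking...
    os.waitpid(pid, 0)           # ...and waits here only to keep the demo simple

tmp = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
database = {"a": 1, "b": 2}
checkpoint(database, tmp)

with open(tmp) as f:
    snapshot = json.load(f)
# snapshot still shows a=1: the child's pages were snapshotted at fork,
# so the parent's update to a=999 never reached it.
```

This is the same trick Redis uses for its background snapshots: the OS's copy-on-write pages give a consistent point-in-time view essentially for free.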

SLIDE 78

SUMMARY

  • Disk-oriented DBMSs are a relic of the past.
    – Most structured databases fit entirely in DRAM on a single machine.
  • The world has finally become comfortable with in-memory data storage and processing.

SLIDE 79

ANATOMY OF A DATABASE SYSTEM

(Recap of the component diagram)
  • Process Manager: Connection Manager + Admission Control
  • Query Processor: Query Parser, Query Optimizer, Query Executor
  • Transactional Storage Manager: Lock Manager (Concurrency Control), Access Methods (or Indexes), Buffer Pool Manager, Log Manager
  • Shared Utilities: Memory Manager + Disk Manager, Networking Manager

Source: Anatomy of a Database System

SLIDE 80

NEXT LECTURE

  • Data Storage
  • Assigned Reading
    – BlazeIt: Fast Exploratory Video Queries using Neural Networks