1 / 57
Storage Management
Lecture 4: Storage Management 1 / 57 Storage Management - - PowerPoint PPT Presentation
Storage Management Lecture 4: Storage Management 1 / 57 Storage Management Administrivia Assignment 1 is due on September 7th @ 11:59pm 2 / 57 Storage Management Layered Architecture Layered Architecture 3 / 57 Storage Management Layered
1 / 57
Storage Management
2 / 57
Storage Management
3 / 57
Storage Management Layered Architecture
4 / 57
Storage Management Layered Architecture
5 / 57
Storage Management Layered Architecture
▶ Manages client connections
▶ Parse, plan and execute queries on top of storage manager
▶ Knits together buffer management, concurrency control, logging and recovery
▶ Manage hardware resources across threads
6 / 57
Storage Management Layered Architecture
▶ Connection Manager + Admission Control
▶ Query Parser ▶ Query Optimizer (a.k.a., Query Planner) ▶ Query Executor
▶ Lock Manager ▶ Access Methods (a.k.a., Indexes) ▶ Buffer Pool Manager ▶ Log Manager
▶ Memory, Disk, and Networking Manager
7 / 57
Storage Management Layered Architecture
Application Data
Filesystem Logical Drive Physical Drive
8 / 57
Storage Management Layered Architecture
▶ application logic must be shielded from physical storage implementation details ▶ physical storage can be reorganized ▶ hardware can be changed
▶ must scale to (nearly) arbitrary data size ▶ efficiently access to individual tuples ▶ efficiently update an arbitrary subset of tuples
▶ data must never be lost ▶ must cope with hardware and software failures
9 / 57
Storage Management Layered Architecture
10 / 57
Storage Management Layered Architecture
DB access layer query layer storage layer query translation and optimization managing records and access paths DB buffer and hardware interface Purpose declarative queries sets of records records page Access Granularity
11 / 57
Storage Management Layered Architecture
12 / 57
Storage Management Layered Architecture
DB granularity: data structures: granularity: block, file free space inventory, extent table ... track, cylinder, ... granularity: data structures: granularity: page, segment page table, block map ... block, file granularity: data structures: granularity: physical record,... free space inventory, page indexes ... page, segment granularity: data structures: granularity: logical record, key,... access path, physical schema ... physical record, ... granularity: data structures: granularity: relation, view, ... logical schema, integrity constraints logical record, key, ... granularity: relation, view, ... Device Interface File Interface DB Buffer Record Access Record Interface Query Interface SQL,... FIND NEXT record, STORE record write record, insert in B-tree,... access page j, release page j read block k, write block k application logical data access paths physical data page structure storage allocation external storage
13 / 57
Storage Management Layered Architecture
14 / 57
Storage Management Hardware Properties
15 / 57
Storage Management Hardware Properties
▶ no longer true!
▶ start getting problematic, too. density is very high ▶ only capacity, not access time
16 / 57
Storage Management Hardware Properties
17 / 57
Storage Management Hardware Properties
18 / 57
Storage Management Hardware Properties
19 / 57
Storage Management Hardware Properties
20 / 57
Storage Management Hardware Properties
21 / 57
Storage Management Hardware Properties
22 / 57
Storage Management Hardware Properties
23 / 57
Storage Management Hardware Properties
24 / 57
Storage Management Disk-Oriented DBMS
25 / 57
Storage Management Disk-Oriented DBMS
26 / 57
Storage Management Disk-Oriented DBMS
27 / 57
Storage Management Disk-Oriented DBMS
▶ Interprets the contents of page 2 using the header
▶ page number −→ buffer pool slot ▶ page number −→ file block
28 / 57
Storage Management Disk-Oriented DBMS
29 / 57
Storage Management Disk-Oriented DBMS
▶ madvise: Tell the OS how you expect to read certain pages. ▶ mlock: Tell the OS that memory ranges cannot be paged out. ▶ msync: Tell the OS to flush memory ranges out to disk.
▶ Full Usage: MonetDB, LMDB, e.t.c. ▶ Partial Usage: mongoDB, MemSQL, e.t.c.
30 / 57
Storage Management Disk-Oriented DBMS
▶ Flushing dirty pages to disk in the correct order. ▶ Specialized prefetching. ▶ Buffer replacement policy. ▶ Thread/process scheduling.
31 / 57
Storage Management Disk-Oriented DBMS
32 / 57
Storage Management File Storage
33 / 57
Storage Management File Storage
▶ The OS doesn’t know anything about the contents of these files.
▶ Some "enterprise" DBMSs still support this. ▶ Most newer DBMSs do not roll their own filesystem
34 / 57
Storage Management File Storage
▶ Some do their own scheduling of I/O operations to improve spatial and temporal locality
▶ Tracks data being read from and written to pages. ▶ Tracks the available free space.
35 / 57
Storage Management File Storage
▶ It can contain tuples, meta-data, indexes, log records. . . ▶ Most systems do not mix page types. ▶ Some systems require a page to be self-contained. Why?
▶ The DBMS uses an indirection layer to map page ids to physical locations. ▶ This is implemented as a page directory table.
36 / 57
Storage Management File Storage
▶ Hardware Page (usually 4 KB) ▶ OS Page (usually 4 KB) ▶ Database Page (512 B – 16 KB)
37 / 57
Storage Management File Storage
▶ Heap File Organization ▶ Sequential / Sorted File Organization ▶ Hashing File Organization
38 / 57
Storage Management File Storage
▶ Create / Get / Write / Delete Page ▶ Must also support iterating over all pages.
▶ Linked List ▶ Page Directory
39 / 57
Storage Management File Storage
▶ HEAD of the free page list. ▶ HEAD of the data page list.
40 / 57
Storage Management File Storage
41 / 57
Storage Management Page Layout
42 / 57
Storage Management Page Layout
▶ Page Size ▶ Checksum ▶ DBMS Version ▶ Transaction Visibility ▶ Compression Information
43 / 57
Storage Management Page Layout
▶ We are still assuming that we are only storing tuples.
▶ Tuple-oriented ▶ Log-structured
44 / 57
Storage Management Page Layout
▶ What happens if we delete a tuple? ▶ What happens if we have a variable-length attribute?
45 / 57
Storage Management Page Layout
▶ The number of used slots ▶ The offset of the starting location of the last slot used.
46 / 57
Storage Management Page Layout
▶ Inserts store the entire tuple. ▶ Deletes mark the tuple as deleted. ▶ Updates contain the delta of just the attributes that were modified.
47 / 57
Storage Management Page Layout
48 / 57
Storage Management Page Layout
49 / 57
Storage Management Tuple Layout
50 / 57
Storage Management Tuple Layout
51 / 57
Storage Management Tuple Layout
▶ Visibility info (concurrency control) ▶ Bit map for keeping track of NULL values.
52 / 57
Storage Management Tuple Layout
CREATE TABLE foo ( a INT PRIMARY KEY, b INT NOT NULL, c INT, d DOUBLE, e FLOAT );
53 / 57
Storage Management Tuple Layout
▶ Potentially reduces the amount of I/O for common workload patterns. ▶ Can make updates more expensive.
▶ IBM System R did this in the 1970s. ▶ Several NoSQL DBMSs do this without calling it physical denormalization.
CREATE TABLE foo ( a INT PRIMARY KEY, b INT NOT NULL ); CREATE TABLE bar ( c INT PRIMARY KEY, a INT REFERENCES foo (a) );
54 / 57
Storage Management Tuple Layout
▶ Most common: page_id + offset/slot ▶ Can also contain file location info.
▶ PostgreSQL: CTID (6-bytes) ▶ SQLite: ROWID (10-bytes) ▶ Oracle: ROWID (8-bytes)
55 / 57
Storage Management Tuple Layout
56 / 57
Storage Management Tuple Layout
57 / 57
Storage Management Tuple Layout