ADM INISTRIVIA Project #1 is due Fri Sept 27 th @ 11:59pm Homework #2 - - PowerPoint PPT Presentation

adm inistrivia
SMART_READER_LITE
LIVE PREVIEW

ADM INISTRIVIA Project #1 is due Fri Sept 27 th @ 11:59pm Homework #2 - - PowerPoint PPT Presentation

07 Tree Indexes Part I Intro to Database Systems Andy Pavlo AP AP 15-445/15-645 Computer Science Fall 2019 Carnegie Mellon University 2 ADM INISTRIVIA Project #1 is due Fri Sept 27 th @ 11:59pm Homework #2 is due Mon Sept 30 th @ 11:59pm


slide-1
SLIDE 1

Intro to Database Systems 15-445/15-645 Fall 2019 Andy Pavlo Computer Science Carnegie Mellon University

AP AP

07 Tree Indexes

Part I

slide-2
SLIDE 2 CMU 15-445/645 (Fall 2019)

ADM INISTRIVIA

Project #1 is due Fri Sept 27th @ 11:59pm Homework #2 is due Mon Sept 30th @ 11:59pm

2

slide-3
SLIDE 3 CMU 15-445/645 (Fall 2019)

DATA STRUCTURES

Internal Meta-data Core Data Storage Temporary Data Structures Table Indexes

3

slide-4
SLIDE 4 CMU 15-445/645 (Fall 2019)

TABLE INDEXES

A table index is a replica of a subset of a table's attributes that are organized and/or sorted for efficient access using a subset of those attributes. The DBMS ensures that the contents of the table and the index are logically in sync.

4

slide-5
SLIDE 5 CMU 15-445/645 (Fall 2019)

TABLE INDEXES

It is the DBMS's job to figure out the best index(es) to use to execute each query. There is a trade-off on the number of indexes to create per database.

→ Storage Overhead → Maintenance Overhead

5

slide-6
SLIDE 6 CMU 15-445/645 (Fall 2019)

TODAY'S AGENDA

B+Tree Overview Design Decisions Optimizations

6

slide-7
SLIDE 7 CMU 15-445/645 (Fall 2019)

B- TREE FAM ILY

There is a specific data structure called a B-Tree. People also use the term to generally refer to a class of balanced tree data structures:

→ B-Tree (1971) → B+Tree (1973) → B*Tree (1977?) → Blink-Tree (1981)

7

slide-8
SLIDE 8 CMU 15-445/645 (Fall 2019)

B- TREE FAM ILY

There is a specific data structure called a B-Tree. People also use the term to generally refer to a class of balanced tree data structures:

→ B-Tree (1971) → B+Tree (1973) → B*Tree (1977?) → Blink-Tree (1981)

7

slide-9
SLIDE 9 CMU 15-445/645 (Fall 2019)

B+ TREE

A B+Tree is a self-balancing tree data structure that keeps data sorted and allows searches, sequential access, insertions, and deletions in O(log n).

→ Generalization of a binary search tree in that a node can have more than two children. → Optimized for systems that read and write large blocks of data.

8

slide-10
SLIDE 10 CMU 15-445/645 (Fall 2019)

B+ TREE PROPERTIES

A B+Tree is an M-way search tree with the following properties:

→ It is perfectly balanced (i.e., every leaf node is at the same depth). → Every node other than the root, is at least half-full M/2-1 ≤ #keys ≤ M-1 → Every inner node with k keys has k+1 non-null children

9

slide-11
SLIDE 11 CMU 15-445/645 (Fall 2019)

B+ TREE EXAM PLE

10

Leaf Nodes Inner Node Sibling Pointers

6 7 9 13 1 3 5 9

slide-12
SLIDE 12 CMU 15-445/645 (Fall 2019)

B+ TREE EXAM PLE

10

Leaf Nodes <5 <9 ≥9 Inner Node

<value>|<key>

Sibling Pointers

6 7 9 13 1 3 5 9

<node*>|<key>

slide-13
SLIDE 13 CMU 15-445/645 (Fall 2019)

NODES

Every B+Tree node is comprised of an array of key/value pairs.

→ The keys are derived from the attributes(s) that the index is based on. → The values will differ based on whether the node is classified as inner nodes or leaf nodes.

The arrays are (usually) kept in sorted key order.

11

slide-14
SLIDE 14 CMU 15-445/645 (Fall 2019)

B+Tree Leaf Node

B+ TREE LEAF NODES

12

K1 V1 • • • Kn Vn

¤ ¤

Prev Next

slide-15
SLIDE 15 CMU 15-445/645 (Fall 2019)

B+Tree Leaf Node

B+ TREE LEAF NODES

12

K1 V1 • • • Kn Vn

¤ ¤

Prev Next PageID PageID

slide-16
SLIDE 16 CMU 15-445/645 (Fall 2019)

B+Tree Leaf Node

B+ TREE LEAF NODES

12

Key+ Value K1 V1 • • • Kn Vn

¤ ¤

Prev Next

¤ ¤

PageID PageID

slide-17
SLIDE 17 CMU 15-445/645 (Fall 2019)

B+Tree Leaf Node

B+ TREE LEAF NODES

12

Sorted Keys K1 K2 K3 K4 K5 • • • Kn Values

¤ ¤ ¤ ¤ ¤

  • • • ¤

¤

Prev

¤

Next # Level # Slots

slide-18
SLIDE 18 CMU 15-445/645 (Fall 2019)

B+Tree Leaf Node

B+ TREE LEAF NODES

12

Sorted Keys K1 K2 K3 K4 K5 • • • Kn Values

¤ ¤ ¤ ¤ ¤

  • • • ¤

¤

Prev

¤

Next # Level # Slots

slide-19
SLIDE 19 CMU 15-445/645 (Fall 2019)

LEAF NODE VALUES

Approach #1: Record Ids

→ A pointer to the location of the tuple that the index entry corresponds to.

Approach #2: Tuple Data

→ The actual contents of the tuple is stored in the leaf node. → Secondary indexes have to store the record id as their values.

13

slide-20
SLIDE 20 CMU 15-445/645 (Fall 2019)

B- TREE VS. B+ TREE

The original B-Tree from 1972 stored keys + values in all nodes in the tree.

→ More space efficient since each key only appears once in the tree.

A B+Tree only stores values in leaf nodes. Inner nodes only guide the search process.

14

slide-21
SLIDE 21 CMU 15-445/645 (Fall 2019)

B+ TREE INSERT

Find correct leaf node L. Put data entry into L in sorted order. If L has enough space, done! Otherwise, split L keys into L and a new node L2

→ Redistribute entries evenly, copy up middle key. → Insert index entry pointing to L2 into parent of L.

To split inner node, redistribute entries evenly, but push up middle key.

15

Source: Chris Re
slide-22
SLIDE 22 CMU 15-445/645 (Fall 2019)

B+ TREE VISUALIZATIO N

https://cmudb.io/btree

Source: David Gales (Univ. of San Francisco)

16

slide-23
SLIDE 23 CMU 15-445/645 (Fall 2019)

B+ TREE DELETE

Start at root, find leaf L where entry belongs. Remove the entry. If L is at least half-full, done! If L has only M/2-1 entries,

→ Try to re-distribute, borrowing from sibling (adjacent node with same parent as L). → If re-distribution fails, merge L and sibling.

If merge occurred, must delete entry (pointing to L

  • r sibling) from parent of L.

17

Source: Chris Re
slide-24
SLIDE 24 CMU 15-445/645 (Fall 2019)

B+ TREES IN PRACTICE

Typical Fill-Factor: 67%. Typical Capacities:

→ Height 4: 1334 = 312,900,721 entries → Height 3: 1333 = 2,406,104 entries

Pages per level:

→ Level 1 = 1 page = 8 KB → Level 2 = 134 pages = 1 MB → Level 3 = 17,956 pages = 140 MB

18

slide-25
SLIDE 25 CMU 15-445/645 (Fall 2019)

CLUSTERED INDEXES

The table is stored in the sort order specified by the primary key.

→ Can be either heap- or index-organized storage.

Some DBMSs always use a clustered index.

→ If a table doesn’t contain a primary key, the DBMS will automatically make a hidden row id primary key.

Other DBMSs cannot use them at all.

19

slide-26
SLIDE 26 CMU 15-445/645 (Fall 2019)

SELECTIO N CONDITIO NS

The DBMS can use a B+Tree index if the query provides any of the attributes of the search key. Example: Index on <a,b,c>

→ Supported: (a=5 AND b=3) → Supported: (b=3).

Not all DBMSs support this. For hash index, we must have all attributes in search key.

20

slide-27
SLIDE 27 CMU 15-445/645 (Fall 2019)

SELECTIO N CONDITIO NS

21

Find Key=(A,B)

A,C B,B C,C A,C B,A A,A A,B B,B B,C C,C C,D

A ≤ A B ≤ C

slide-28
SLIDE 28 CMU 15-445/645 (Fall 2019)

SELECTIO N CONDITIO NS

21

Find Key=(A,B)

A,C B,B C,C A,C B,A A,A A,B B,B B,C C,C C,D

Find Key=(A,*)

A ≤ A

slide-29
SLIDE 29 CMU 15-445/645 (Fall 2019)

SELECTIO N CONDITIO NS

21

Find Key=(A,B)

A,C B,B C,C A,C B,A A,A A,B B,B B,C C,C C,D

Find Key=(A,*)

A ≤ A A ≤ B

slide-30
SLIDE 30 CMU 15-445/645 (Fall 2019)

SELECTIO N CONDITIO NS

21

Find Key=(A,B) Find Key=(*,B)

A,C B,B C,C A,C B,A A,A A,B B,B B,C C,C C,D

Find Key=(A,*)

slide-31
SLIDE 31 CMU 15-445/645 (Fall 2019)

SELECTIO N CONDITIO NS

21

Find Key=(A,B) Find Key=(*,B)

A,C B,B C,C A,C B,A A,A A,B B,B B,C C,C C,D

*,B < C,C

Find Key=(A,*)

slide-32
SLIDE 32 CMU 15-445/645 (Fall 2019)

SELECTIO N CONDITIO NS

21

Find Key=(A,B) Find Key=(*,B)

A,C B,B C,C A,C B,A A,A A,B B,B B,C C,C C,D

*,B < C,C

Find Key=(A,*)

* = A B = B * = B B = B

slide-33
SLIDE 33 CMU 15-445/645 (Fall 2019)

B+ TREE DESIGN CHOICES

Node Size Merge Threshold Variable Length Keys Non-Unique Indexes Intra-Node Search

22

slide-34
SLIDE 34 CMU 15-445/645 (Fall 2019)

NODE SIZE

The slower the storage device, the larger the

  • ptimal node size for a B+Tree.

→ HDD ~1MB → SSD: ~10KB → In-Memory: ~512B

Optimal sizes can vary depending on the workload

→ Leaf Node Scans vs. Root-to-Leaf Traversals

23

slide-35
SLIDE 35 CMU 15-445/645 (Fall 2019)

M ERGE THRESH OLD

Some DBMSs do not always merge nodes when it is half full. Delaying a merge operation may reduce the amount of reorganization. It may also be better to just let underflows to exist and then periodically rebuild entire tree.

24

slide-36
SLIDE 36 CMU 15-445/645 (Fall 2019)

VARIABLE LENGTH KEYS

Approach #1: Pointers

→ Store the keys as pointers to the tuple’s attribute.

Approach #2: Variable Length Nodes

→ The size of each node in the index can vary. → Requires careful memory management.

Approach #3: Padding

→ Always pad the key to be max length of the key type.

Approach #4: Key Map / Indirection

→ Embed an array of pointers that map to the key + value list within the node.

25

slide-37
SLIDE 37 CMU 15-445/645 (Fall 2019)

¤ ¤ ¤ ¤

Andy V1 Obama Prashanth V3 V4 Lin V2

B+Tree Leaf Node

KEY M AP / INDIRECTIO N

26

Key+Values

¤

Prev

¤

Next

#

Level

#

Slots Sorted Key Map

slide-38
SLIDE 38 CMU 15-445/645 (Fall 2019)

¤ ¤ ¤ ¤

Andy V1 Obama Prashanth V3 V4 Lin V2

B+Tree Leaf Node

KEY M AP / INDIRECTIO N

26

Key+Values

¤

Prev

¤

Next

#

Level

#

Slots Sorted Key Map

slide-39
SLIDE 39 CMU 15-445/645 (Fall 2019)

¤ ¤ ¤ ¤

Andy V1 Obama Prashanth V3 V4 Lin V2

B+Tree Leaf Node

KEY M AP / INDIRECTIO N

26

Key+Values

¤

Prev

¤

Next

#

Level

#

Slots Sorted Key Map

A·¤ L·¤ O·¤ P·¤

slide-40
SLIDE 40 CMU 15-445/645 (Fall 2019)

NON- UN IQ UE INDEXES

Approach #1: Duplicate Keys

→ Use the same leaf node layout but store duplicate keys multiple times.

Approach #2: Value Lists

→ Store each key only once and maintain a linked list of unique values.

27

slide-41
SLIDE 41 CMU 15-445/645 (Fall 2019)

B+Tree Leaf Node

NON- UN IQ UE: DUPLICATE KEYS

28

Sorted Keys K1 K1 K1 K2 K2 • • • Kn

¤

Prev

¤

Next # Level # Slots Values

¤ ¤ ¤ ¤ ¤

  • • • ¤
slide-42
SLIDE 42 CMU 15-445/645 (Fall 2019)

B+Tree Leaf Node

NON- UN IQ UE: DUPLICATE KEYS

28

Sorted Keys K1 K1 K1 K2 K2 • • • Kn

¤

Prev

¤

Next # Level # Slots Values

¤ ¤ ¤ ¤ ¤

  • • • ¤
slide-43
SLIDE 43 CMU 15-445/645 (Fall 2019)

B+Tree Leaf Node

NON- UN IQ UE: VALUE LISTS

29

Values

¤ ¤ ¤ ¤ ¤

  • • •

¤

Prev

¤

Next # Level # Slots Sorted Keys K1 K2 K3 K4 K5 • • • Kn

slide-44
SLIDE 44 CMU 15-445/645 (Fall 2019)

INTRA- N ODE SEARCH

Approach #1: Linear

→ Scan node keys from beginning to end.

Approach #2: Binary

→ Jump to middle key, pivot left/right depending on comparison.

Approach #3: Interpolation

→ Approximate location of desired key based

  • n known distribution of keys.

30

Find Key=8

5 6 7 8 9 10 4

slide-45
SLIDE 45 CMU 15-445/645 (Fall 2019)

INTRA- N ODE SEARCH

Approach #1: Linear

→ Scan node keys from beginning to end.

Approach #2: Binary

→ Jump to middle key, pivot left/right depending on comparison.

Approach #3: Interpolation

→ Approximate location of desired key based

  • n known distribution of keys.

30

Find Key=8

5 6 7 8 9 10 4 5 6 7 8 9 10 4

slide-46
SLIDE 46 CMU 15-445/645 (Fall 2019)

INTRA- N ODE SEARCH

Approach #1: Linear

→ Scan node keys from beginning to end.

Approach #2: Binary

→ Jump to middle key, pivot left/right depending on comparison.

Approach #3: Interpolation

→ Approximate location of desired key based

  • n known distribution of keys.

30

Find Key=8

5 6 7 8 9 10 4 5 6 7 8 9 10 4

slide-47
SLIDE 47 CMU 15-445/645 (Fall 2019)

INTRA- N ODE SEARCH

Approach #1: Linear

→ Scan node keys from beginning to end.

Approach #2: Binary

→ Jump to middle key, pivot left/right depending on comparison.

Approach #3: Interpolation

→ Approximate location of desired key based

  • n known distribution of keys.

30

Find Key=8

5 6 7 8 9 10 4 5 6 7 8 9 10 4

slide-48
SLIDE 48 CMU 15-445/645 (Fall 2019)

INTRA- N ODE SEARCH

Approach #1: Linear

→ Scan node keys from beginning to end.

Approach #2: Binary

→ Jump to middle key, pivot left/right depending on comparison.

Approach #3: Interpolation

→ Approximate location of desired key based

  • n known distribution of keys.

30

Find Key=8

5 6 7 8 9 10 4 5 6 7 8 9 10 4

slide-49
SLIDE 49 CMU 15-445/645 (Fall 2019)

INTRA- N ODE SEARCH

Approach #1: Linear

→ Scan node keys from beginning to end.

Approach #2: Binary

→ Jump to middle key, pivot left/right depending on comparison.

Approach #3: Interpolation

→ Approximate location of desired key based

  • n known distribution of keys.

30

Find Key=8

5 6 7 8 9 10 4 5 6 7 8 9 10 4 5 6 7 8 9 10 4

Offset: 7-(10-8)=5

slide-50
SLIDE 50 CMU 15-445/645 (Fall 2019)

OPTIM IZATIO NS

Prefix Compression Suffix Truncation Bulk Insert Pointer Swizzling

31

slide-51
SLIDE 51 CMU 15-445/645 (Fall 2019)

PREFIX COM PRESSIO N

Sorted keys in the same leaf node are likely to have the same prefix. Instead of storing the entire key each time, extract common prefix and store

  • nly unique suffix for each key.

→ Many variations.

32

robbed robbing robot bed bing

  • t

Prefix: rob

slide-52
SLIDE 52 CMU 15-445/645 (Fall 2019)

SUFFIX TRUNCATIO N

The keys in the inner nodes are only used to "direct traffic".

→ We don't need the entire key.

Store a minimum prefix that is needed to correctly route probes into the index.

33

abcdefghijk lmnopqrstuv … … … …

slide-53
SLIDE 53 CMU 15-445/645 (Fall 2019)

SUFFIX TRUNCATIO N

The keys in the inner nodes are only used to "direct traffic".

→ We don't need the entire key.

Store a minimum prefix that is needed to correctly route probes into the index.

33

… … … … abc lmn

slide-54
SLIDE 54 CMU 15-445/645 (Fall 2019)

BULK INSERT

The fastest/best way to build a B+Tree is to first sort the keys and then build the index from the bottom up.

34

6 7 9 13 1 3

Keys: 3, 7, 9, 13, 6, 1 Sorted Keys: 1, 3, 6, 7, 9, 13

slide-55
SLIDE 55 CMU 15-445/645 (Fall 2019)

BULK INSERT

The fastest/best way to build a B+Tree is to first sort the keys and then build the index from the bottom up.

34

6 9 6 7 9 13 1 3

Keys: 3, 7, 9, 13, 6, 1 Sorted Keys: 1, 3, 6, 7, 9, 13

slide-56
SLIDE 56 CMU 15-445/645 (Fall 2019)

POINTER SWIZZLING

Nodes use page ids to reference other nodes in the index. The DBMS must get the memory location from the page table during traversal. If a page is pinned in the buffer pool, then we can store raw pointers instead of page ids. This avoids address lookups from the page table.

35

6 9 6 7 1 3

Buffer Pool

1

Header

2

Header

3

Header

Find Key>3

slide-57
SLIDE 57 CMU 15-445/645 (Fall 2019)

POINTER SWIZZLING

Nodes use page ids to reference other nodes in the index. The DBMS must get the memory location from the page table during traversal. If a page is pinned in the buffer pool, then we can store raw pointers instead of page ids. This avoids address lookups from the page table.

35

6 9 6 7 1 3

Page # 2

Buffer Pool

1

Header

2

Header

3

Header

Page # 2 → <Page*>

Find Key>3

slide-58
SLIDE 58 CMU 15-445/645 (Fall 2019)

POINTER SWIZZLING

Nodes use page ids to reference other nodes in the index. The DBMS must get the memory location from the page table during traversal. If a page is pinned in the buffer pool, then we can store raw pointers instead of page ids. This avoids address lookups from the page table.

35

6 9 6 7 1 3

Page # 2 Page # 3

Buffer Pool

1

Header

2

Header

3

Header

Page # 2 → <Page*> Page # 3 → <Page*>

Find Key>3

slide-59
SLIDE 59 CMU 15-445/645 (Fall 2019)

POINTER SWIZZLING

Nodes use page ids to reference other nodes in the index. The DBMS must get the memory location from the page table during traversal. If a page is pinned in the buffer pool, then we can store raw pointers instead of page ids. This avoids address lookups from the page table.

35

6 9 6 7 1 3

Buffer Pool

1

Header

2

Header

3

Header

Find Key>3

<Page*> <Page*>

slide-60
SLIDE 60 CMU 15-445/645 (Fall 2019)

CONCLUSIO N

The venerable B+Tree is always a good choice for your DBMS.

36

slide-61
SLIDE 61 CMU 15-445/645 (Fall 2019)

NEXT CLASS

More B+Trees Tries / Radix Trees Inverted Indexes

37