
CSE 326: Data Structures, B-Trees. Hal Perkins. Weiss Sec. 4.7



  1. B-Trees
     CSE 326: Data Structures, B-Trees. Hal Perkins. Weiss Sec. 4.7. Spring 2007, Lecture 14-15.

     Time to access (conservative)
     • CPU (has registers): 1 ns per instruction
     • Cache (SRAM), 8KB - 4MB: 2-10 ns
     • Main memory (DRAM), up to 10GB: 40-100 ns
     • Disk, many GB: a few milliseconds (5-10 million ns)

     Trees so far
     • BST
     • AVL
     • Splay

  2. M-ary Search Tree
     • Maximum branching factor of M
     • Complete tree with n items has height = Θ(log_M n)
     • # disk accesses for find: Θ(log_M n), one per level
     • Runtime of find: O(log_2 M · log_M n), using binary search within each node

     Solution: B-Trees
     • Specialized M-ary search trees
     • Each node has (up to) M-1 keys:
       – the subtree between two keys x and y contains leaves with values v such that x ≤ v < y
     • Pick branching factor M such that each node takes one full {page, block} of memory
     [Figure: a node with keys 3, 7, 12, 21 and five subtrees holding x < 3, 3 ≤ x < 7, 7 ≤ x < 12, 12 ≤ x < 21, and 21 ≤ x]

     B-Trees: What makes them disk-friendly?
     1. Many keys stored in a node
        • All brought to memory/cache in one access!
     2. Internal nodes contain only keys; only leaf nodes contain keys and actual data
        • The tree structure can be loaded into memory irrespective of data object size
        • Data actually resides on disk

     B-Tree: Example
     B-Tree with M = 4 (# pointers in internal node) and L = 4 (# data items in leaf)
     [Figure: root keys 10, 40; internal nodes [3] | [15 20 30] | [50]; leaves [1 2] | [3 5 6 9] | [10 11 12] | [15 17] | [20 25 26] | [30 32 33 36] | [40 42] | [50 60 70]. Each leaf key has a data object attached, which the slides ignore from here on.]
     Note: All leaves at the same depth!
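     The rule "the subtree between keys x and y holds values v with x ≤ v < y" can be sketched in a few lines of Python. This is only an illustration, not code from the lecture: the class names `Leaf` and `Internal` and the function `find` are hypothetical, and the tree literal is a transcription of the M = 4, L = 4 example above. The second return value counts nodes visited, which stands in for disk accesses if each node occupies one block.

```python
from bisect import bisect_right

class Leaf:
    def __init__(self, items):
        self.items = items            # sorted keys (data objects omitted)

class Internal:
    def __init__(self, keys, children):
        self.keys = keys              # up to M-1 sorted keys
        self.children = children      # len(keys) + 1 subtrees

def find(node, v):
    """Return (found, nodes_visited) for value v."""
    visited = 1
    while isinstance(node, Internal):
        # Child i holds the values with keys[i-1] <= v < keys[i].
        node = node.children[bisect_right(node.keys, v)]
        visited += 1
    return v in node.items, visited

# The example tree with M = 4 and L = 4 (keys only).
example = Internal([10, 40], [
    Internal([3], [Leaf([1, 2]), Leaf([3, 5, 6, 9])]),
    Internal([15, 20, 30], [Leaf([10, 11, 12]), Leaf([15, 17]),
                            Leaf([20, 25, 26]), Leaf([30, 32, 33, 36])]),
    Internal([50], [Leaf([40, 42]), Leaf([50, 60, 70])]),
])

print(find(example, 17))   # (True, 3)
print(find(example, 4))    # (False, 3)
```

     Every lookup in this tree touches exactly three nodes, because all leaves sit at the same depth.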

  3. B-Tree Properties ‡
     – Data is stored at the leaves
     – All leaves are at the same depth and contain between ⌈L/2⌉ and L data items
     – Internal nodes store up to M-1 keys
     – Internal nodes have between ⌈M/2⌉ and M children
     – Root (special case) has between 2 and M children (or root could be a leaf)
     ‡ These are technically B+-Trees

     Example, Again
     B-Tree with M = 4 and L = 4
     [Figure: root keys 10, 40; internal nodes [3] | [15 20 30] | [50]; leaves [1 2] | [3 5 6 9] | [10 11 12] | [15 17] | [20 25 26] | [30 32 33 36] | [40 42] | [50 60 70]]
     (Only showing keys, but leaves also have data!)

     B-trees vs. AVL trees
     Suppose we have 100 million items (100,000,000):
     • Depth of AVL Tree
     • Depth of B+ Tree with M = 128, L = 64

     Building a B-Tree (M = 3, L = 2)
     • The empty B-Tree
     • Insert(3): leaf [3]
     • Insert(14): leaf [3 14]
     • Now, Insert(1)?
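     The two depth questions can be explored numerically. The sketch below is an assumption-laden illustration rather than the lecture's answer: the function names are hypothetical, height is counted in edges from the root to a leaf, the AVL bound uses the standard minimum-node recurrence N(h) = N(h-1) + N(h-2) + 1, and the B+ tree bounds follow from the occupancy limits in the properties list (full nodes for the shallowest tree, minimally filled nodes and a two-child root for the deepest).

```python
def avl_height_bounds(n):
    """(shallowest, deepest) height of an AVL tree holding n >= 1 nodes."""
    best = n.bit_length() - 1              # height of a complete binary tree
    prev, cur, h = 1, 2, 1                 # N(0) = 1, N(1) = 2
    while cur <= n:                        # cur is the fewest nodes at height h
        prev, cur = cur, cur + prev + 1
        h += 1
    return best, h - 1

def bplus_height_bounds(n, M, L):
    """(shallowest, deepest) height of a B+ tree holding n items."""
    # Shallowest: every leaf holds L items, every internal node has M children.
    leaves = -(-n // L)                    # ceil(n / L)
    best, capacity = 0, 1
    while capacity < leaves:
        capacity *= M
        best += 1
    # Deepest: leaves hold ceil(L/2) items, internal nodes have ceil(M/2)
    # children, and the root has only 2 children.
    max_leaves = n // -(-L // 2)
    worst, need = 0, 2                     # need = fewest leaves at height worst + 1
    while need <= max_leaves:
        worst += 1
        need *= -(-M // 2)
    return best, worst

print(avl_height_bounds(100_000_000))             # (26, 36)
print(bplus_height_bounds(100_000_000, 128, 64))  # (3, 4)
```

     Under these assumptions the AVL tree needs roughly 26 to 36 levels below the root, while the B+ tree with M = 128 and L = 64 needs 3 or 4.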

  4. Splitting the Root (M = 3, L = 2)
     • Insert(1) into the leaf [3 14] gives [1 3 14]: too many keys in a leaf!
     • So, split the leaf: [1 3] and [14]
     • And create a new root: [14] over leaves [1 3] | [14]

     Overflowing leaves (M = 3, L = 2)
     • Insert(59): root [14]; leaves [1 3] | [14 59]
     • Insert(26): leaf [14 26 59] has too many keys!
     • So, split the leaf: [14 26] and [59]
     • And add a new child: root [14 59]; leaves [1 3] | [14 26] | [59]

     Propagating Splits (M = 3, L = 2)
     • Insert(5): leaf [1 3 5] overflows
     • Split the leaf into [1 3] and [5], but no space in parent!
     • Add new child: the node becomes [5 14 59] over leaves [1 3] | [5] | [14 26] | [59]
     • So, split the node.
     • Create a new root: root [14]; internal nodes [5] | [59]; leaves [1 3] | [5] | [14 26] | [59]

     Insertion Algorithm
     1. Insert the key in its leaf
     2. If the leaf ends up with L+1 items, overflow!
        – Split the leaf into two nodes:
          • original with ⌈(L+1)/2⌉ items
          • new one with ⌊(L+1)/2⌋ items
        – Add the new child to the parent
        – If the parent ends up with M+1 items, overflow!
     3. If an internal node ends up with M+1 items, overflow!
        – Split the node into two nodes:
          • original with ⌈(M+1)/2⌉ items
          • new one with ⌊(M+1)/2⌋ items
        – Add the new child to the parent
        – If the parent ends up with M+1 items, overflow!
     4. Split an overflowed root in two and hang the new nodes under a new root
        This makes the tree deeper!
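     The four steps of the insertion algorithm can be sketched as a short recursive routine. This is a minimal illustration under stated assumptions, not the course's implementation: `Node`, `insert`, `_insert`, and `leaves` are hypothetical names, leaves store keys only, duplicates are not handled, and each separator key is taken to be the smallest key in the subtree to its right (as in the slide diagrams).

```python
from bisect import bisect_right, insort

M, L = 3, 2                                   # parameters used on the slides

class Node:
    def __init__(self, keys, children=None):
        self.keys = keys                      # leaf: items; internal: separators
        self.children = children              # None marks a leaf

def _insert(node, v):
    """Insert v below node; return (separator, new_sibling) if node split."""
    if node.children is None:
        insort(node.keys, v)                  # step 1: insert into the leaf
        if len(node.keys) <= L:
            return None
        mid = (L + 2) // 2                    # step 2: ceil((L+1)/2) items stay
        right = Node(node.keys[mid:])
        node.keys = node.keys[:mid]
        return right.keys[0], right
    i = bisect_right(node.keys, v)
    split = _insert(node.children[i], v)
    if split is None:
        return None
    sep, child = split                        # add the new child to the parent
    node.keys.insert(i, sep)
    node.children.insert(i + 1, child)
    if len(node.children) <= M:
        return None
    mid = (M + 2) // 2                        # step 3: ceil((M+1)/2) children stay
    up = node.keys[mid - 1]                   # this separator moves to the parent
    right = Node(node.keys[mid:], node.children[mid:])
    node.keys, node.children = node.keys[:mid - 1], node.children[:mid]
    return up, right

def insert(root, v):
    split = _insert(root, v)
    if split is None:
        return root
    sep, right = split
    return Node([sep], [root, right])         # step 4: hang both under a new root

def leaves(node):
    if node.children is None:
        return [node.keys]
    return [leaf for c in node.children for leaf in leaves(c)]

root = Node([])                               # the empty B-Tree
for v in (3, 14, 1, 59, 26, 5, 89, 79):       # the insert sequence from the slides
    root = insert(root, v)
print(root.keys, [c.keys for c in root.children], leaves(root))
```

     Running the slide sequence through this sketch should end with root [14], internal nodes [5] and [59 89], and leaves [1 3] | [5] | [14 26] | [59 79] | [89].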

  5. After More Routine Inserts (M = 3, L = 2)
     • Before: root [14]; internal nodes [5] | [59]; leaves [1 3] | [5] | [14 26] | [59]
     • After Insert(89) and Insert(79): root [14]; internal nodes [5] | [59 89]; leaves [1 3] | [5] | [14 26] | [59 79] | [89]

     Deletion (M = 3, L = 2)
     1. Delete item from leaf
     2. Update keys of ancestors if necessary
     • Delete(59): root [14]; internal nodes [5] | [79 89]; leaves [1 3] | [5] | [14 26] | [79] | [89]
     What could go wrong?

     Deletion and Adoption (M = 3, L = 2)
     • Delete(5): root [14]; internal nodes [?] | [79 89]; leaves [1 3] | [ ] | [14 26] | [79] | [89]
     • A leaf has too few keys!
     • So, borrow from a sibling: root [14]; internal nodes [3] | [79 89]; leaves [1] | [3] | [14 26] | [79] | [89]

     Does Adoption Always Work?
     • What if the sibling doesn't have enough for you to borrow from?
       e.g. you have ⌈L/2⌉-1 and sibling has ⌈L/2⌉?

  6. Deletion and Merging (M = 3, L = 2)
     • Delete(3): root [14]; internal nodes [3] | [79 89]; leaves [1] | [3] | [14 26] | [79] | [89]
     • A leaf has too few keys! And no sibling with surplus!
     • So, delete the leaf: internal nodes [?] | [79 89]; leaves [1] | [14 26] | [79] | [89]
     • But now an internal node has too few subtrees!

     Deletion with Propagation (More Adoption) (M = 3, L = 2)
     • Adopt a neighbor: the underfull internal node takes the leaf [14 26] from its sibling
     • Result: root [79]; internal nodes [14] | [89]; leaves [1] | [14 26] | [79] | [89]

     A Bit More Adoption (M = 3, L = 2)
     • Delete(1): a leaf has too few keys, so adopt from a sibling
     • Result: root [79]; internal nodes [26] | [89]; leaves [14] | [26] | [79] | [89]

     Pulling out the Root (M = 3, L = 2)
     • Delete(26): A leaf has too few keys! And no sibling with surplus!
     • So, delete the leaf; merge
     • A node has too few subtrees and no neighbor with surplus!
     • Delete the node: the remaining internal node is [79 89] over leaves [14] | [79] | [89]
     • But now the root has just one subtree!
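     The deletions traced on these slides (adopt from a sibling with surplus, otherwise merge, then repeat one level up) can be sketched as follows. This is an illustrative sketch under assumptions, not the lecture's code: all names (`Node`, `slots`, `minimum`, `smallest`, `refresh`, `fix`, `delete`) are hypothetical, separators are simply recomputed as the smallest key of each right-hand subtree, the left sibling is tried before the right one, and the input is assumed to be a valid tree. The starting tree is the one shown after the routine inserts of 89 and 79.

```python
from bisect import bisect_right

M, L = 3, 2                                   # parameters used on the slides

class Node:
    def __init__(self, keys, children=None):
        self.keys = keys                      # leaf: items; internal: separators
        self.children = children              # None marks a leaf

def slots(n):                                 # the list that can under- or overflow
    return n.keys if n.children is None else n.children

def minimum(n):                               # ceil(L/2) items or ceil(M/2) children
    return (L + 1) // 2 if n.children is None else (M + 1) // 2

def smallest(n):                              # smallest key stored below n
    while n.children is not None:
        n = n.children[0]
    return n.keys[0]

def refresh(n):                               # recompute an internal node's separators
    if n.children is not None:
        n.keys = [smallest(c) for c in n.children[1:]]

def fix(parent, i):
    kids = parent.children
    child = kids[i]
    if len(slots(child)) >= minimum(child):
        return                                # no underflow
    for j in (i - 1, i + 1):                  # adoption: find a sibling with surplus
        if 0 <= j < len(kids) and len(slots(kids[j])) > minimum(kids[j]):
            if j < i:
                slots(child).insert(0, slots(kids[j]).pop())
            else:
                slots(child).append(slots(kids[j]).pop(0))
            refresh(child)
            refresh(kids[j])
            return
    j = i - 1 if i > 0 else i + 1             # merging: no sibling has surplus
    left, right = sorted((i, j))
    slots(kids[left]).extend(slots(kids[right]))
    del kids[right]
    refresh(kids[left])

def _delete(node, v):
    if node.children is None:
        if v in node.keys:
            node.keys.remove(v)               # remove the key from its leaf
        return
    i = bisect_right(node.keys, v)
    _delete(node.children[i], v)
    fix(node, i)                              # handle underflow in the child
    refresh(node)                             # update keys of this ancestor

def delete(root, v):
    _delete(root, v)
    if root.children is not None and len(root.children) == 1:
        return root.children[0]               # root has one subtree: drop the root
    return root

root = Node([14], [
    Node([5], [Node([1, 3]), Node([5])]),
    Node([59, 89], [Node([14, 26]), Node([59, 79]), Node([89])]),
])
for v in (59, 5, 3, 1, 26):                   # the delete sequence from the slides
    root = delete(root, v)
print(root.keys, [c.keys for c in root.children])
```

     After the five deletions the sketch should be left with a single node [79 89] over leaves [14] | [79] | [89], one level shallower than the starting tree.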

  7. Pulling out the Root (continued) (M = 3, L = 2)
     • The root has just one subtree!
     • Simply make the one child the new root!
     • Result: root [79 89]; leaves [14] | [79] | [89]

     Deletion Algorithm
     1. Remove the key from its leaf
     2. If the leaf ends up with fewer than ⌈L/2⌉ items, underflow!
        – Adopt data from a sibling; update the parent
        – If adopting won't work, delete node and merge with neighbor
        – If the parent ends up with fewer than ⌈M/2⌉ items, underflow!

     Deletion Slide Two
     3. If an internal node ends up with fewer than ⌈M/2⌉ items, underflow!
        – Adopt from a neighbor; update the parent
        – If adoption won't work, merge with neighbor
        – If the parent ends up with fewer than ⌈M/2⌉ items, underflow!
     4. If the root ends up with only one child, make the child the new root of the tree
        This reduces the height of the tree!

     Thinking about B-Trees
     • B-Tree insertion can cause (expensive) splitting and propagation
     • B-Tree deletion can cause (cheap) adoption or (expensive) deletion, merging and propagation
     • Propagation is rare if M and L are large (Why?)
     • If M = L = 128, then a B-Tree of height 4 will store at least 30,000,000 items
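     The 30,000,000 figure can be checked with a little arithmetic. The sketch below assumes that "height 4" means the leaves sit four edges below the root, that the root has the minimum of 2 children, that every other internal node has ⌈M/2⌉ children, and that every leaf holds ⌈L/2⌉ items; the function name `min_items` is hypothetical.

```python
def min_items(height, M, L):
    """Fewest items in a B-Tree whose leaves are `height` edges below the root."""
    if height == 0:
        return 1                       # the root is a single non-empty leaf
    half_m = -(-M // 2)                # ceil(M/2) children per internal node
    half_l = -(-L // 2)                # ceil(L/2) items per leaf
    # 2 children at the root, half_m children on each of the height-1
    # internal levels below it, and half_l items in each leaf.
    return 2 * half_m ** (height - 1) * half_l

print(min_items(4, 128, 128))          # 33554432
```

     With M = L = 128 this gives 2 · 64³ · 64 = 33,554,432 items, consistent with the "at least 30,000,000" on the slide.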

  8. Tree Names You Might Encounter
     FYI:
     – B-Trees with M = 3, L = x are called 2-3 trees
       • Internal nodes can have 2 or 3 children
     – B-Trees with M = 4, L = x are called 2-3-4 trees
       • Internal nodes can have 2, 3, or 4 children
