15-721 DATABASE SYSTEMS Lecture #08 Latch-free OLTP Indexes - - PowerPoint PPT Presentation




slide-1
SLIDE 1

Andy Pavlo / / Carnegie Mellon University / / Spring 2016

DATABASE SYSTEMS

Lecture #08 – Latch-free OLTP Indexes (Part II)

15-721

@Andy_Pavlo // Carnegie Mellon University // Spring 2017

slide-2
SLIDE 2

CMU 15-721 (Spring 2017)

TODAY’S AGENDA

Bw-Tree Index
ART Index
Profiling in Peloton

slide-3
SLIDE 3

OBSERVATION

We cannot have reverse pointers in a latch-free concurrent skip list because a single compare-and-swap (CAS) can only update one address at a time.

slide-4
SLIDE 4

BW-TREE

Latch-free B+Tree index

→ Threads never need to set latches or block.

Key Idea #1: Deltas

→ No updates in place.
→ Reduces cache invalidation.

Key Idea #2: Mapping Table

→ Allows for CAS of physical locations of pages.

THE BW-TREE: A B-TREE FOR NEW HARDWARE (ICDE 2013)

slide-6
SLIDE 6

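BW-TREE: MAPPING TABLE

[Figure: the Mapping Table has one row per page ID (PID 101-104) holding the address (Addr) of that index page. Index pages refer to each other by PID (logical pointer); the table entry holds the physical pointer to the page.]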

slide-13
SLIDE 13

BW-TREE: DELTA UPDATES

Each update to a page produces a new delta.

→ Delta physically points to base page.
→ Install delta address in physical address slot of mapping table using CAS.

[Figure: Mapping Table (PID → Addr, PIDs 101-104). The entry for PID 102 points to the delta chain ▲Delete 48 → ▲Insert 50 → Page 102.]

Source: Justin Levandoski
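
The install step above can be sketched in Python. This is only an illustration under stated assumptions: Python has no hardware CAS, so the hypothetical `MappingTable.cas` helper emulates one with a lock, and the `BasePage` and `Delta` classes are invented for this example rather than taken from the paper.

```python
import threading
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class BasePage:
    keys: tuple  # sorted keys stored in the base page


@dataclass(frozen=True)
class Delta:
    op: str                          # "insert" or "delete"
    key: int
    next: Union["Delta", BasePage]   # physical pointer to the older state


class MappingTable:
    """Maps a page ID (PID) to the current head of that page's chain."""

    def __init__(self):
        self._slots = {}
        self._lock = threading.Lock()  # stands in for an atomic CAS instruction

    def get(self, pid):
        return self._slots[pid]

    def put(self, pid, node):
        self._slots[pid] = node

    def cas(self, pid, expected, new):
        """Swap the slot to `new` only if it still holds `expected`."""
        with self._lock:
            if self._slots[pid] is expected:
                self._slots[pid] = new
                return True
            return False


def install_delta(table, pid, op, key):
    head = table.get(pid)                # read the current physical address
    delta = Delta(op, key, head)         # the delta physically points to it
    return table.cas(pid, head, delta)   # publish with a single CAS


table = MappingTable()
table.put(102, BasePage(keys=(10, 48, 60)))
install_delta(table, 102, "insert", 50)
install_delta(table, 102, "delete", 48)
```

After the two installs, the slot for PID 102 heads the chain Delete 48 → Insert 50 → base page, and the base page itself was never modified in place.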

slide-16
SLIDE 16

BW-TREE: SEARCH

Traverse tree like a regular B+tree.

→ If mapping table points to delta chain, stop at first occurrence of search key.
→ Otherwise, perform binary search on base page.

[Figure: Mapping Table entry for PID 102 pointing to the delta chain ▲Delete 48 → ▲Insert 50 → Page 102.]

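
A rough sketch of the page-level search rule, assuming an illustrative `Delta`/`BasePage` layout (these class names and the `page_contains` helper are made up for this example): walk the chain from newest to oldest, let the first delta that mentions the key decide, and fall back to a binary search of the base page.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass(frozen=True)
class BasePage:
    keys: tuple  # sorted keys


@dataclass(frozen=True)
class Delta:
    op: str       # "insert" or "delete"
    key: int
    next: object  # older Delta or the BasePage


def page_contains(head, key):
    """Return True if `key` is present in the logical page starting at `head`."""
    node = head
    while isinstance(node, Delta):
        if node.key == key:
            # The newest delta for this key decides: stop at first occurrence.
            return node.op == "insert"
        node = node.next
    # No delta mentions the key: binary search the base page.
    i = bisect_left(node.keys, key)
    return i < len(node.keys) and node.keys[i] == key


base = BasePage(keys=(10, 48, 60))
chain = Delta("delete", 48, Delta("insert", 50, base))
```

With this chain, key 50 is found in a delta, key 48 is reported absent even though the base page still lists it, and key 60 is resolved by the binary search.
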
slide-21
SLIDE 21

BW-TREE: CONTENTION UPDATES

Threads may try to install updates to the same state of the page.

→ Winner succeeds; any losers must retry or abort.

[Figure: ▲Delete 48 and ▲Insert 16 both point to ▲Insert 50 → Page 102. Only one CAS on the Mapping Table entry succeeds; the other is rejected (X).]
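
The race can be replayed deterministically in a few lines. This is a single-threaded sketch, assuming a hypothetical `Slot` class whose `cas` method emulates an atomic compare-and-swap with a lock; deltas are plain dicts for brevity.

```python
import threading


class Slot:
    """One mapping-table entry; `cas` emulates an atomic compare-and-swap."""

    def __init__(self, value):
        self.value = value
        self._lock = threading.Lock()

    def cas(self, expected, new):
        with self._lock:
            if self.value is expected:
                self.value = new
                return True
            return False


def make_delta(op, key, older):
    return {"op": op, "key": key, "next": older}


base = make_delta("insert", 50, "Page 102")
slot = Slot(base)

# Both threads observe the same page state before either installs.
seen_by_a = slot.value
seen_by_b = slot.value

a_ok = slot.cas(seen_by_a, make_delta("delete", 48, seen_by_a))  # winner
b_ok = slot.cas(seen_by_b, make_delta("insert", 16, seen_by_b))  # stale state

# The loser retries against the new head (it could also choose to abort).
retries = 0
while not b_ok:
    retries += 1
    current = slot.value
    b_ok = slot.cas(current, make_delta("insert", 16, current))
```

The second CAS fails because the slot no longer holds the state that thread B observed; one retry against the new head then succeeds.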

slide-22
SLIDE 22

BW-TREE: DELTA TYPES

Record Update Deltas

→ Insert/Delete/Update of record on a page

Structure Modification Deltas

→ Split/Merge information


slide-28
SLIDE 28

BW-TREE: CONSOLIDATION

Consolidate updates by creating new page with deltas applied.

→ CAS-ing the mapping table address ensures no deltas are missed.
→ Old page + deltas are marked as garbage.

[Figure: the delta chain ▲Insert 55 → ▲Delete 48 → ▲Insert 50 → Page 102 is replaced in the Mapping Table by the consolidated page New 102.]
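
A minimal sketch of consolidation, under the same illustrative assumptions as before: `BasePage`, `Delta`, `consolidate`, and `try_swap` are hypothetical names, the mapping table is a plain dict, and `try_swap` stands in for the CAS (it is not atomic here).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BasePage:
    keys: tuple


@dataclass(frozen=True)
class Delta:
    op: str       # "insert" or "delete"
    key: int
    next: object  # older Delta or the BasePage


def consolidate(head):
    """Build a new base page with every delta in the chain applied."""
    deltas = []
    node = head
    while isinstance(node, Delta):
        deltas.append(node)
        node = node.next
    keys = set(node.keys)
    for d in reversed(deltas):  # apply the oldest delta first
        if d.op == "insert":
            keys.add(d.key)
        else:
            keys.discard(d.key)
    return BasePage(keys=tuple(sorted(keys)))


def try_swap(table, pid, old_head, new_page):
    """Emulated CAS: fails if another delta was installed in the meantime."""
    if table[pid] is old_head:
        table[pid] = new_page
        return True
    return False


old = Delta("insert", 55,
            Delta("delete", 48,
                  Delta("insert", 50, BasePage((10, 48, 60)))))
table = {102: old}
new_page = consolidate(old)
swapped = try_swap(table, 102, old, new_page)
garbage = [old] if swapped else [new_page]  # reclaimed later, not freed here
```

Because the swap compares against the exact head that was consolidated, a delta installed concurrently makes the swap fail instead of being silently dropped.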

slide-29
SLIDE 29

BW-TREE: GARBAGE COLLECTION

Operations are tagged with an epoch.

→ Each epoch tracks the threads that are part of it and the objects that can be reclaimed.
→ A thread joins an epoch prior to each operation and posts objects that can be reclaimed for the current epoch (not necessarily the one it joined).

Garbage for an epoch is reclaimed only when all threads have exited the epoch.
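
The epoch rules above can be sketched as follows. The `EpochManager` class and its method names are hypothetical, the sketch is single-threaded, and it adds one conservative assumption not stated on the slide: an epoch's garbage is freed only after all older epochs have drained as well.

```python
class EpochManager:
    def __init__(self):
        self.current = 0
        self.members = {0: set()}  # epoch -> thread ids still inside it
        self.garbage = {0: []}     # epoch -> objects posted for reclamation
        self.freed = []

    def advance(self):
        self.current += 1
        self.members[self.current] = set()
        self.garbage[self.current] = []

    def join(self, tid):
        self.members[self.current].add(tid)
        return self.current

    def post_garbage(self, obj):
        # Garbage goes to the current epoch, not necessarily the one joined.
        self.garbage[self.current].append(obj)

    def leave(self, tid, epoch):
        self.members[epoch].discard(tid)
        self._reclaim()

    def _reclaim(self):
        # Free epochs oldest-first; stop at the current epoch or at the
        # first epoch that still has a thread inside it.
        for e in sorted(self.members):
            if e == self.current or self.members[e]:
                break
            self.freed.extend(self.garbage.pop(e))
            del self.members[e]


mgr = EpochManager()
e1 = mgr.join("CPU1")
e2 = mgr.join("CPU2")
mgr.post_garbage("old page 102 + deltas")
mgr.advance()                      # new operations now join epoch 1
mgr.leave("CPU1", e1)
freed_after_cpu1 = list(mgr.freed)
mgr.leave("CPU2", e2)
freed_after_cpu2 = list(mgr.freed)
```

Nothing is freed while CPU2 is still inside epoch 0; the old page and its deltas are released only once both threads have left.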

slide-30
SLIDE 30

BW-TREE: GARBAGE COLLECTION

[Figure, animated over slides 30-35: CPU1 and then CPU2 are registered in the Epoch Table while the Mapping Table entry for PID 102 is switched from the old chain (▲Insert 55 → ▲Delete 48 → ▲Insert 50 → Page 102) to New 102. CPU2 and then CPU1 leave the epoch; once the Epoch Table is empty, the old page and its deltas are reclaimed and only New 102 remains.]

slide-36
SLIDE 36

BW-TREE: STRUCTURE MODIFICATIONS

Split Delta Record

→ Mark that a subset of the base page’s key range is now located at another page.
→ Use a logical pointer to the new page.

Separator Delta Record

→ Provide a shortcut in the modified page’s parent on what ranges to find the new page.
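
One way to picture the split delta is as a redirect that a lookup follows through the mapping table. The sketch below is illustrative only: `SplitDelta`, `lookup`, the PIDs 103 and 105, and the separator key 5 are assumptions chosen for the example, and the separator delta on the parent is left out.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass(frozen=True)
class BasePage:
    keys: tuple  # sorted keys


@dataclass(frozen=True)
class SplitDelta:
    separator: int    # keys >= separator now live on the sibling page
    sibling_pid: int  # logical pointer (PID), resolved via the mapping table
    next: object      # the page that was split


def lookup(table, pid, key):
    node = table[pid]
    while isinstance(node, SplitDelta):
        if key >= node.separator:
            # Follow the logical pointer and continue on the sibling page.
            node = table[node.sibling_pid]
        else:
            node = node.next
    i = bisect_left(node.keys, key)
    return i < len(node.keys) and node.keys[i] == key


table = {
    # The old base page still physically holds 5 and 6.
    103: SplitDelta(separator=5, sibling_pid=105, next=BasePage((3, 4, 5, 6))),
    # Key 6 was deleted on the new page after the split.
    105: BasePage((5,)),
}
```

A lookup for key 6 through PID 103 is redirected to page 105 and correctly reports the key as absent, even though the stale copy in the old base page still contains it.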

slide-37
SLIDE 37

BW-TREE: STRUCTURE MODIFICATIONS

[Figure, animated over slides 37-52: pages 101-104 are registered in the Mapping Table. The leaf pages hold keys 1 2, 3 4 5 6, and 7 8, and the parent's key ranges are [-∞,3), [3,7), [7,∞). Keys 5 6 are copied to a new page 105. A ▲Split delta is installed on the page that held 3 4 5 6, redirecting the upper part of its range to page 105 through a logical pointer. A ▲Separator delta is then installed on the parent, adding the range [5,7) for page 105.]

slide-53
SLIDE 53

BW-TREE: PERFORMANCE

Operations/sec (M)

Workload        Bw-Tree   B+Tree   Skip List
Xbox            10.4      0.56     4.23
Synthetic       3.83      0.66     1.02
Deduplication   2.84      0.33     0.72

Processor: 1 socket, 4 cores w/ 2×HT

Source: Justin Levandoski

slide-54
SLIDE 54

ADAPTIVE RADIX TREE (ART)

Uses digital representation of keys to examine prefixes one-by-one instead of comparing entire key.

Radix tree properties:

→ The height of the tree depends on the length of keys.
→ Does not require rebalancing.
→ The path to a leaf node represents the key of the leaf.
→ Keys are stored implicitly and can be reconstructed from paths.

THE ADAPTIVE RADIX TREE: ARTFUL INDEXING FOR MAIN-MEMORY DATABASES (ICDE 2013)

slide-59
SLIDE 59

TRIE VS. RADIX TREE

Keys: HELLO, HAT, HAVE

Trie

H
+-- E -- L -- L -- O -- ¤
+-- A
    +-- T -- ¤
    +-- V -- E -- ¤

Radix Tree

H
+-- ELLO -- ¤
+-- A
    +-- T -- ¤
    +-- VE -- ¤
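
The difference between the two structures can be reproduced with nested dicts. This is a toy sketch (the `build_trie`, `compress`, and `contains` helpers are invented for the example and ignore ART's adaptive node sizes): it builds a character trie and then merges every chain of single-child nodes into one multi-character edge.

```python
END = "¤"  # leaf marker, as on the slide


def build_trie(keys):
    root = {}
    for key in keys:
        node = root
        for ch in key:
            node = node.setdefault(ch, {})
        node[END] = {}
    return root


def compress(node):
    """Merge every chain of single-child nodes into one multi-character edge."""
    out = {}
    for label, child in node.items():
        while label != END and len(child) == 1 and END not in child:
            (next_label, next_child), = child.items()
            label += next_label
            child = next_child
        out[label] = compress(child)
    return out


def contains(radix, key):
    node = radix
    while key:
        for label, child in node.items():
            if label != END and key.startswith(label):
                node, key = child, key[len(label):]
                break
        else:
            return False  # no edge matches the remaining key
    return END in node


trie = build_trie(["HELLO", "HAT", "HAVE"])
radix = compress(trie)
```

For the three keys on the slide, `radix` has the same shape as the radix tree shown: `H` branches into `ELLO` and `A`, and `A` branches into `T` and `VE`.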

slide-60
SLIDE 60

ART INDEX: MODIFICATIONS

Initial tree (keys HELLO, HAT, HAVE):

H
+-- ELLO -- ¤
+-- A
    +-- T -- ¤
    +-- VE -- ¤

Operation: Insert HAIR

H
+-- ELLO -- ¤
+-- A
    +-- T -- ¤
    +-- VE -- ¤
    +-- IR -- ¤

Operation: Delete HAT, HAVE

H
+-- ELLO -- ¤
+-- AIR -- ¤

slide-68
SLIDE 68

ART INDEX: BINARY COMPARABLE KEYS

Not all attribute types can be decomposed into binary comparable digits for a radix tree.

→ Unsigned Integers: Byte order must be flipped for little endian machines.
→ Signed Integers: Flip two’s-complement so that negative numbers are smaller than positive.
→ Floats: Classify into group (neg vs. pos, normalized vs. denormalized), then store as unsigned integer.
→ Compound: Transform each attribute separately.
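
The transformations listed above can be approximated with Python's `struct` module. The function names are hypothetical, and the float encoding is a commonly used order-preserving bit trick swapped in for illustration; it is not necessarily the classification scheme the ART paper uses.

```python
import struct


def encode_uint32(value):
    # Big-endian puts the most significant byte first, so bytes compare like ints.
    return struct.pack(">I", value)


def encode_int32(value):
    # Adding 2^31 flips the sign bit, mapping [-2^31, 2^31) onto [0, 2^32) in order.
    return struct.pack(">I", (value + (1 << 31)) & 0xFFFFFFFF)


def encode_float64(value):
    # Assumed trick: flip all bits of negatives, flip only the sign bit otherwise.
    (bits,) = struct.unpack(">Q", struct.pack(">d", value))
    if bits >> 63:
        bits ^= 0xFFFFFFFFFFFFFFFF
    else:
        bits ^= 1 << 63
    return struct.pack(">Q", bits)


def encode_compound(*parts):
    # Each attribute is transformed separately, then concatenated.
    return b"".join(parts)


key = encode_uint32(168496141)  # the integer 0x0A0B0C0D
```

Sorting values by their encoded bytes gives the same order as sorting the values themselves, which is the property a radix tree needs to support range scans.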

slide-71
SLIDE 71

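ART INDEX: BINARY COMPARABLE KEYS

Int Key: 168496141
Hex Key: 0A 0B 0C 0D

Big Endian:    0A 0B 0C 0D
Little Endian: 0D 0C 0B 0A

[Figure: a radix tree over byte-wise key digits. The key is located by following the digits 0A → 0B → 0C → 0D; other branches carry prefixes such as 0F0F0F, 0B0F, and 0F0F.]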

slide-72
SLIDE 72

BINARY COMPARABLE KEYS

Execution Time (ms)

Key type                      Insert   Lookup   Delete
CompactIntsKey                6695     3277     7899
GenericKey + FastCompare      9430     6775     18518
GenericKey + GenericCompare   12093    13682    31052

Peloton w/ Bw-Tree Index
Data Set: 10m keys (three 64-bit ints)

slide-73
SLIDE 73

CONCURRENT ART INDEX

HyPer’s ART is not latch-free.

Optimistic crabbing scheme where writers are not blocked on readers.

→ Writers increment counter when they acquire latch.
→ Readers can proceed if a node’s latch is available.
→ The reader then checks whether the latch’s counter has changed from when it checked the latch.

THE ART OF PRACTICAL SYNCHRONIZATION (DaMoN 2016)
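
The reader/writer protocol above can be sketched as a version-validated read. This is a simplified illustration, not HyPer's implementation: the `OptimisticLatch` class and its method names are assumptions, and real systems typically pack the lock bit and version into a single atomic word.

```python
import threading


class OptimisticLatch:
    def __init__(self):
        self.version = 0                 # bumped by every writer
        self._writer = threading.Lock()  # held only by writers

    def write_lock(self):
        self._writer.acquire()
        self.version += 1  # writers increment the counter on acquire

    def write_unlock(self):
        self._writer.release()

    def read_begin(self):
        # Readers never take the latch; they wait until no writer holds it.
        while True:
            seen = self.version
            if not self._writer.locked():
                return seen

    def read_validate(self, seen_version):
        # Valid only if no writer acquired the latch since read_begin.
        return not self._writer.locked() and self.version == seen_version


node = {"latch": OptimisticLatch(), "value": "HAT"}

seen = node["latch"].read_begin()
value = node["value"]                           # optimistic read, no latch taken
clean_read = node["latch"].read_validate(seen)

seen = node["latch"].read_begin()
stale = node["value"]
node["latch"].write_lock()                      # a writer slips in
node["value"] = "HAIR"
node["latch"].write_unlock()
dirty_read = node["latch"].read_validate(seen)  # reader must restart
```

The first read validates because the counter is unchanged; the second does not, so the reader would discard `stale` and retry.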

slide-74
SLIDE 74

SINGLE-THREADED PERFORMANCE

[Figure: bar chart of Operations/sec (M) comparing B+Tree, Masstree, Skip List, Bw-Tree, and ART on four workloads: Read-only, Insert-only, Read/Write, and Scan/Insert.]

Data Set: 30m Random 64-bit Integers

Source: Huanchen Zhang

slide-75
SLIDE 75

PARTING THOUGHTS

Bw-Tree is probably the most dank latch-free index in recent years.

ART has amazing performance. Need to understand it better.

slide-76
SLIDE 76

ANDY’S TIPS FOR PROFILING

slide-77
SLIDE 77

MOTIVATION

Consider a program with functions foo and bar.

How can we speed it up with only a debugger?

→ Randomly pause it during execution.
→ Collect the function call stack.

slide-78
SLIDE 78

RANDOM PAUSE METHOD

Consider this scenario:

→ Collected 10 call stack samples.
→ Say 6 out of the 10 samples were in foo.

What percentage of time was spent in foo?

→ Roughly 60% of the time was spent in foo.
→ Accuracy increases with # of samples.
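
The estimate above is just a ratio of sample counts. The sketch below computes it for the slide's scenario and then simulates a larger run; the `estimate_time_share` and `simulate` helpers, the random seed, and the sample sizes are arbitrary choices for illustration.

```python
import random
from collections import Counter


def estimate_time_share(samples):
    """Fraction of call-stack samples that landed in each function."""
    counts = Counter(samples)
    total = len(samples)
    return {fn: n / total for fn, n in counts.items()}


# The slide's scenario: 10 pauses, 6 of them landed in foo.
shares = estimate_time_share(["foo"] * 6 + ["bar"] * 4)

# Simulate a program that truly spends 60% of its time in foo.
rng = random.Random(0)


def simulate(n):
    picks = rng.choices(["foo", "bar"], weights=[0.6, 0.4], k=n)
    return estimate_time_share(picks)


small = simulate(10)
large = simulate(100_000)
```

With only 10 samples the estimate can be far from 60%, while the 100,000-sample run lands very close to it, which is the sense in which accuracy increases with the number of samples.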

slide-79
SLIDE 79

AMDAHL’S LAW

Say we optimized foo to run 2 times faster.

What’s the expected overall speedup?

→ 60% of time spent in foo drops in half.
→ 40% of time spent in bar unaffected.
→ p = percentage of time spent in optimized task
→ s = speed up for the optimized task
→ Overall speedup = $\frac{1}{\frac{p}{s} + (1 - p)} = \frac{1}{\frac{0.6}{2} + 0.4} \approx 1.4$ times faster
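
The same calculation as a small function, using p and s as defined on the slide; the `amdahl_speedup` name and the second call with a very large s are additions for illustration.

```python
def amdahl_speedup(p, s):
    """Overall speedup when a fraction p of runtime is made s times faster."""
    return 1.0 / (p / s + (1.0 - p))


overall = amdahl_speedup(p=0.6, s=2)   # the slide's foo/bar scenario
limit = amdahl_speedup(p=0.6, s=1e12)  # foo made almost infinitely fast
```

Here `overall` is 1 / 0.7, which rounds to the slide's 1.4, and `limit` approaches 1 / 0.4 = 2.5: the 40% spent in bar caps the benefit no matter how fast foo becomes.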

slide-81
SLIDE 81

PROFILING TOOLS FOR REAL

Choice #1: Valgrind

→ Heavyweight instrumentation framework with a lot of tools
→ Sophisticated visualization tools

Choice #2: Perf

→ Lightweight tool that can record different kinds of events
→ Console-oriented visualization tools

slide-82
SLIDE 82

CHOICE #1: VALGRIND

Instrumentation framework for building dynamic analysis tools

→ memcheck: a memory error detector
→ callgrind: a call-graph generating profiler

Using callgrind to profile the index test and Peloton in general:

$ valgrind --tool=callgrind --trace-children=yes ./tests/skiplist_index_test
$ valgrind --tool=callgrind --trace-children=yes ./bin/peloton &> /dev/null &

slide-84
SLIDE 84

KCACHEGRIND

Profile data visualization tool

$ kcachegrind callgrind.out.12345

[Screenshot: KCachegrind showing the cumulative time distribution and the callgraph view.]

slide-85
SLIDE 85

CHOICE #2: PERF

Tool for using the performance counters subsystem in Linux.

$ perf record -e cycles:u -c 2000 ./tests/skiplist_index_test

→ -e cycles:u = sample the event cycles at the user level only
→ -c 2000 = collect a sample every 2000 occurrences of event

Uses counters for tracking events

→ On counter overflow, the kernel records a sample.
→ Sample contains info about program execution.

slide-87
SLIDE 87

PERF VISUALIZATION

We can also use perf to visualize the generated profile for our application.

$ perf report

[Screenshot: perf report output showing the cumulative time distribution.]

slide-88
SLIDE 88

PERF EVENTS

Supports several other events like:

→ L1-dcache-load-misses
→ branch-misses

To see a list of events:

$ perf list

Another usage example:

$ perf record -e cycles,LLC-load-misses -c 2000 ./tests/skiplist_index_test

slide-90
SLIDE 90

NEXT CLASS

Indexing for OLAP workloads.

→ More from Microsoft Research…
