DHTM: Durable Hardware Transactional Memory. Arpit Joshi, Vijay Nagarajan, Marcelo Cintra, Stratis Viglas. ISCA 2018.


slide-1
SLIDE 1

Arpit Joshi, Vijay Nagarajan, Marcelo Cintra, Stratis Viglas

DHTM: Durable Hardware Transactional Memory

ISCA 2018

slide-2
SLIDE 2

Persistent Memory is here…


slide-9
SLIDE 9

Persistent Memory Systems

[Figure: two L1 caches, LLC, Persistent Memory; callout: System Crashes]

  • Persistent Memory
      • Non-volatility over the memory bus
      • Load/Store interface to persistent data
  • Crash Consistency
      • Is the persistent state consistent?
      • Programming Model: ACID Transactions

“Ensuring failure atomicity for all this computation without failure-atomic transactions is practically infeasible, if not impossible.” Marathe et al. [HotStorage’17]

How fast can we support ACID?

slide-14
SLIDE 14

ACID Transactions

[Figure: two L1 caches, LLC, Persistent Memory]

  • Atomic Visibility: Locks, HTM, STM
  • Atomic Durability: Checkpointing, H/W Logging, S/W Logging

slide-19
SLIDE 19

Atomic Visibility: HTM

  • Commercial HTMs [Intel, IBM]
  • Version Management: read/write sets in L1 cache
  • Conflict Detection: piggyback on the coherence protocol

[Figure: L1 cache with columns Cache Line, R, W; lines A = 15 and B = 20; two bits set]

slide-20
SLIDE 20

Atomic Visibility: HTM

  • Commercial HTMs [Intel, IBM]
  • Version Management: read/write sets in L1 cache
  • Conflict Detection: piggyback on the coherence protocol
  • Commit: make updates non-speculative

[Figure: L1 cache after commit; lines A = 15 and B = 20 remain, R/W bits cleared]

slide-22
SLIDE 22

Atomic Visibility: HTM

  • Commercial HTMs [Intel, IBM]
  • Version Management: read/write sets in L1 cache
  • Conflict Detection: piggyback on the coherence protocol
  • Commit: make updates non-speculative
  • Abort: invalidate write set

[Figure: L1 cache after abort; only line B = 20 remains]

Write sets in commercial HTMs are limited by the size of the L1 cache.
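The four bullets above can be illustrated with a small software model. This is only a sketch under stated assumptions, not how any commercial HTM is built: the `L1Cache` class, the `TxAbort` exception, the `remote_access` method standing in for a coherence request, and the tiny capacity are all hypothetical names and simplifications introduced here. The `memory` dictionary stands for the globally visible state.

```python
class TxAbort(Exception):
    """Raised when the modelled transaction cannot continue."""


class L1Cache:
    """Toy L1 cache that tracks a transaction's read and write sets."""

    def __init__(self, memory, capacity=4):
        self.memory = memory      # globally visible state: address -> value
        self.capacity = capacity  # assumed number of lines the L1 can hold
        self.lines = {}           # address -> {"value", "R", "W"}

    def _line(self, addr):
        if addr not in self.lines:
            if len(self.lines) >= self.capacity:
                clean = [a for a, l in self.lines.items()
                         if not (l["R"] or l["W"])]
                if not clean:
                    # Read/write sets no longer fit in the L1: abort.
                    self.abort()
                    raise TxAbort("read/write set exceeds L1 capacity")
                del self.lines[clean[0]]  # evict a non-transactional line
            self.lines[addr] = {"value": self.memory[addr], "R": 0, "W": 0}
        return self.lines[addr]

    def read(self, addr):
        line = self._line(addr)
        line["R"] = 1  # add the line to the read set
        return line["value"]

    def write(self, addr, value):
        line = self._line(addr)
        line["W"] = 1          # add the line to the write set
        line["value"] = value  # speculative: not yet globally visible

    def remote_access(self, addr, is_write):
        """Model a coherence request; return True if it aborted us."""
        line = self.lines.get(addr)
        if line and (line["W"] or (is_write and line["R"])):
            self.abort()
            return True
        return False

    def commit(self):
        # Make updates non-speculative and clear the R/W bits.
        for addr, line in self.lines.items():
            if line["W"]:
                self.memory[addr] = line["value"]
            line["R"] = line["W"] = 0

    def abort(self):
        # Invalidate the write set; other lines stay, with R cleared.
        self.lines = {a: dict(l, R=0)
                      for a, l in self.lines.items() if not l["W"]}
```

With `mem = {"A": 10, "B": 20}`, calling `write("A", 15)` and `read("B")` followed by `commit()` makes `A = 15` visible in `mem`. If `remote_access("A", is_write=False)` arrives before the commit, the line holding `A` is dropped and `mem` is left unchanged.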

slide-25
SLIDE 25

Atomic Durability: Logging

  • Logging for durability [Doshi’16, Joshi’17, Shin’17, Ogleari’18]
  • Write a log entry for every update

Persistent Memory
  In-place Values: A = 10, B = 20, C = 30
  Transaction Log: A = 15, B = 25

slide-26
SLIDE 26

Atomic Durability: Logging

  • Logging for durability [Doshi’16, Joshi’17, Shin’17, Ogleari’18]
  • Write a log entry for every update
  • Commit: Update the values in-place

Persistent Memory
  In-place Values: A = 15, B = 25, C = 30
  Transaction Log: (empty)

slide-28
SLIDE 28

Atomic Durability: Logging

  • Logging for durability [Doshi’16, Joshi’17, Shin’17, Ogleari’18]
  • Write a log entry for every update
  • Commit: Update the values in-place
  • Abort: Undo any in-place updates

Persistent Memory
  In-place Values: A = 15, B = 25, C = 30
  Transaction Log: A = 10, B = 20

  • In-place updates in the critical path of commit
  • High memory write bandwidth requirement
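The figures in this group show a log holding new values that are applied at commit, and a log holding old values that are restored on abort. Reading them as redo and undo logging respectively is an interpretation. The sketch below shows both under that assumption, with hypothetical function names and plain Python dictionaries and lists standing in for persistent memory and the log.

```python
def redo_write(redo_log, addr, value):
    """Redo logging: record the NEW value; in-place data stays untouched."""
    redo_log.append((addr, value))


def redo_commit(in_place, redo_log):
    """Commit copies every logged value in place, then discards the log."""
    for addr, new_value in redo_log:
        in_place[addr] = new_value
    redo_log.clear()


def undo_write(in_place, undo_log, addr, value):
    """Undo logging: record the OLD value first, then update in place."""
    undo_log.append((addr, in_place[addr]))
    in_place[addr] = value


def undo_abort(in_place, undo_log):
    """Abort restores old values in reverse order, then discards the log."""
    for addr, old_value in reversed(undo_log):
        in_place[addr] = old_value
    undo_log.clear()
```

In both variants every update produces a log write in addition to the eventual in-place write, which is where the extra memory write bandwidth comes from. In the redo variant the in-place copy happens inside `redo_commit`, that is, on the commit path.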

slide-29
SLIDE 29

ACID = HTM + Logging

Goals:

  • Support fast commits
  • Minimise memory bandwidth consumption
  • Extend the supported transaction size
  • Maintain the simplicity of commercial HTMs


slide-33
SLIDE 33

DHTM: Durable Hardware Transactional Memory

Commercial HTM + Hardware Redo Log

  • H/W Redo Log + Log Buffer
      • Reduced memory bandwidth
      • Fast commits
  • H/W Log + Sticky State
      • Extended transaction size to the LLC
      • Simplicity of commercial HTM

[Figure: two L1 caches, LLC, Persistent Memory; callout: Log Writes]

slide-41
SLIDE 41

DHTM: Log Buffer

  • Redo Log Bandwidth Problem
      • write a log entry for every store
      • multiple stores create multiple log entries
  • Solution: Log Buffer
      • track cache lines being modified
      • multiple writes coalesced in a log entry
      • log entry written to persistent memory on eviction from log buffer

[Figure: two L1 caches, LLC, Persistent Memory; callout: Log Writes]
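The coalescing idea can be sketched in a few lines. This is an illustrative model only: the `LogBuffer` class, its FIFO eviction order, its capacity, and the use of a Python list as the persistent log are assumptions made for the example, not details taken from the slides.

```python
from collections import OrderedDict


class LogBuffer:
    """Toy log buffer: one entry per modified cache line, FIFO eviction."""

    def __init__(self, capacity, persistent_log):
        self.capacity = capacity              # assumed number of tracked lines
        self.entries = OrderedDict()          # line address -> latest data
        self.persistent_log = persistent_log  # stand-in for the log in memory

    def store(self, line_addr, data):
        if line_addr in self.entries:
            # Line already tracked: coalesce, no new log entry is created.
            self.entries[line_addr] = data
            return
        if len(self.entries) >= self.capacity:
            self._evict_oldest()
        self.entries[line_addr] = data

    def _evict_oldest(self):
        # A log entry reaches persistent memory only on eviction.
        addr, data = self.entries.popitem(last=False)
        self.persistent_log.append((addr, data))

    def flush(self):
        """Write out all remaining entries, for example at commit."""
        while self.entries:
            self._evict_oldest()
```

With a capacity of two lines, the stores `A=1, A=2, A=3, B=1, C=1` followed by `flush()` produce three log entries instead of five, because the three stores to `A` are coalesced into one.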

slide-46
SLIDE 46

DHTM: Transaction States

  • Active: entered on Begin Transaction
  • Commit: entered on End Transaction & Log Records Persisted
  • Commit Complete: entered on In-place Data Persisted
  • Abort: entered on Conflict
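The four states and their triggering events can be written down as a small transition table. The sketch below is a minimal illustration: the `TxState` enum, the event strings, and the `step` function are hypothetical names. It also assumes that a conflict is only handled while the transaction is Active, which the slide does not state.

```python
from enum import Enum


class TxState(Enum):
    ACTIVE = "Active"
    COMMIT = "Commit"
    COMMIT_COMPLETE = "Commit Complete"
    ABORT = "Abort"


# (current state, event) -> next state; None means "no transaction yet".
TRANSITIONS = {
    (None, "begin_transaction"): TxState.ACTIVE,
    (TxState.ACTIVE, "end_transaction_and_log_persisted"): TxState.COMMIT,
    (TxState.COMMIT, "in_place_data_persisted"): TxState.COMMIT_COMPLETE,
    # Assumption: conflicts are only modelled for Active transactions.
    (TxState.ACTIVE, "conflict"): TxState.ABORT,
}


def step(state, event):
    """Return the next state, or raise if the event is not allowed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} not allowed in state {state}")
```

A committing transaction walks `None` to `ACTIVE` to `COMMIT` to `COMMIT_COMPLETE`. The transaction counts as committed once its log records are persisted, before the in-place data is written.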

slide-53
SLIDE 53

DHTM: Commit Example

Transaction: Begin_Transaction, Write (A=15), Read (B), Write (B=25), End_Transaction

[Animated figure: L1 Cache (Cache Line, R, W), Log Buffer, transaction State, and Persistent Memory (In-place Values, Transaction Log)]

  • State: Active, then Commit, then Commit Complete
  • L1 Cache: A = 15; B = 20 when read, then B = 25 when written
  • Log Buffer: A, then B
  • Transaction Log: A = 15, B = 25, Commit
  • In-place Values: A = 10, B = 20, C = 30 at the start; A = 15, B = 25, C = 30 at Commit Complete
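The point of this example is that the transaction is durable once its redo records and a commit marker are in the log, even if the in-place values have not been updated yet. The sketch below replays the slide's transaction under simplifying assumptions: the `COMMIT` marker tuple, the `run_transaction` and `recover` functions, and the `crash_before_commit` flag are hypothetical, and log records are appended directly rather than going through a log buffer.

```python
COMMIT = ("COMMIT", None)  # hypothetical commit marker in the redo log


def run_transaction(in_place, redo_log, crash_before_commit=False):
    """Replay the slide's transaction; return the state that was reached."""
    state = "Active"               # Begin_Transaction
    l1 = {"A": 15}                 # Write (A=15): speculative, held in L1
    redo_log.append(("A", 15))
    l1["B"] = in_place["B"]        # Read (B)
    l1["B"] = 25                   # Write (B=25)
    redo_log.append(("B", 25))
    if crash_before_commit:
        return state               # L1 contents are lost in a crash
    redo_log.append(COMMIT)        # End_Transaction: log records persisted
    state = "Commit"
    return state


def recover(in_place, redo_log):
    """After a crash, replay logged values only if a commit marker exists."""
    if COMMIT in redo_log:
        for addr, value in redo_log:
            if (addr, value) != COMMIT:
                in_place[addr] = value
    return in_place
```

Starting from `A = 10, B = 20, C = 30`, a crash before the commit marker leaves the in-place values unchanged after `recover`. A crash after it yields `A = 15, B = 25, C = 30`, which matches the Commit Complete state on the slide.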

slide-57
SLIDE 57

DHTM: Supporting Overflow

  • Problems with Overflow:
      • Version Management:
          • global operation on write-set on a commit/abort
          • overhead infeasible in larger caches (beyond L1)
      • Conflict Detection:
          • additional metadata to detect conflicts
          • increased complexity due to NACK-based protocols

slide-64
SLIDE 64

DHTM: Supporting Overflow

  • Solution
      • Version Management:
          • Overflow List
      • Conflict Detection:
          • maintain sticky state on overflow (similar to LogTM)
          • avoid NACK by restricting overflow to LLC

[Figure: LLC, Persistent Memory, and an Overflow List holding C, A, B]

Further details on supporting overflows are in the paper.
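The slides defer the details to the paper, so the following is only an illustration of why a list helps with version management, not the paper's design. It assumes that write-set lines evicted from the L1 have their addresses appended to a list, so that commit or abort can visit exactly those lines instead of scanning the whole LLC. The `OverflowTracker` class, its oldest-first eviction, and its capacity are hypothetical.

```python
class OverflowTracker:
    """Toy model: remember which write-set lines spilled from L1 to the LLC."""

    def __init__(self, l1_capacity):
        self.l1_capacity = l1_capacity  # assumed write-set lines the L1 holds
        self.l1_write_set = []          # addresses in L1, oldest first
        self.overflow_list = []         # addresses evicted to the LLC

    def write(self, addr):
        if addr in self.l1_write_set or addr in self.overflow_list:
            return  # already tracked (simplification: no refetch into L1)
        if len(self.l1_write_set) >= self.l1_capacity:
            victim = self.l1_write_set.pop(0)
            self.overflow_list.append(victim)  # record the overflowed line
        self.l1_write_set.append(addr)

    def lines_to_process(self):
        """Lines that commit or abort must visit: no global LLC scan."""
        return self.l1_write_set + self.overflow_list
```

With an L1 capacity of one line, writes to `C`, `A`, `B`, and `D` leave `D` in the L1 and `C`, `A`, `B` on the overflow list, which is the list content shown in the figure.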

slide-65
SLIDE 65

Evaluation

  • System Configuration
      • We evaluate an 8-core machine with a 2-level cache hierarchy
      • HTMs implement a (first-)writer-wins conflict resolution policy

System        Atomic Visibility   Atomic Durability
ATOM          Locks               Hardware Undo Log
LogTM+ATOM    HTM (LogTM)         Hardware Undo Log
DHTM          HTM                 Hardware Redo Log (Log Buffer)

slide-70
SLIDE 70

Evaluation

[Bar chart: ATOM, LogTM+ATOM, and DHTM across queue, hash, sdg, sps, btree, rbtree, and gmean; y-axis ticks from 1 to 2; callout: 26%]

slide-71
SLIDE 71

Evaluation

[Bar chart: ATOM, LogTM+ATOM, and DHTM across queue, hash, sdg, sps, btree, rbtree, and gmean; y-axis ticks from 1 to 2; callout: 17%]

slide-72
SLIDE 72

Conclusion

  • Persistent memory systems require crash consistency
  • ACID Transactions: widely understood crash consistency mechanism
  • DHTM: ACID transactions in hardware
      • Atomic Visibility: commercial HTM
      • Atomic Durability: bandwidth-optimized hardware redo log
      • Leverage hardware logging to extend transaction size up to the LLC
