Efficient Persist Barriers for Multicores



slide-1
SLIDE 1

Efficient Persist Barriers for Multicores

Arpit Joshi, Vijay Nagarajan, Marcelo Cintra, Stratis Viglas

MICRO-48

slide-2
SLIDE 2

December 9, 2015 MICRO-48

Summary

  • Efficient persist barrier
  • Used to implement persistency models
      • Persistency = when stores become durable
      • (Consistency = when stores become visible)
  • Evaluated
      • Buffered Epoch Persistency
      • Bulk Strict Persistency

slide-3
SLIDE 3

Persistent Memory

[Figure, built up over several slides: Cache, DRAM, NVRAM and Secondary Storage positioned by access latency and access granularity. NVRAM examples: 3D XPoint, STT-MRAM, PCM.]

Fast, fine-grained persistence.

slide-7
SLIDE 7

Persistent Memory

Advantages:

  • Unify memory and storage
  • Access to persistent data through processor load/store interface

Challenge:

  • Maintaining consistency of data structures in memory

slide-8
SLIDE 8

Consistency Challenge

[Figure: two hierarchies side by side. Core, Cache, DRAM, Secondary Storage is labelled "Software Controlled"; Core, Cache, NVRAM, Secondary Storage is labelled "Hardware Controlled".]

slide-10
SLIDE 10

Linked List - Naïve

Pseudo-code

  1. Create Node
  2. Update Node Pointer
  3. Update Head Pointer

[Figure, built up over several slides: the list (HEAD, Node 1) is shown in both the cache and NVRAM; Node 2 is added in the cache and later appears in NVRAM; in the last build the cache is empty.]

System Crash!
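The hazard in the naïve version can be sketched with a toy model. The sketch below is not from the talk: it assumes each field of the list lives in its own cache line (the names `node2.value`, `node2.next` and `head` are hypothetical) and that the hardware may write dirty lines back to NVRAM in any order. It enumerates every NVRAM image a crash could leave behind and reports those in which HEAD points at a node that is not fully durable.

```python
from itertools import permutations

def naive_insert(cache):
    """Steps 1-3 of the naive pseudo-code, performed only in the cache."""
    cache["node2.value"] = 2        # 1. Create Node
    cache["node2.next"] = "node1"   # 2. Update Node Pointer
    cache["head"] = "node2"         # 3. Update Head Pointer

def list_is_consistent(nvram):
    """HEAD must point to a node whose fields are all durable."""
    target = nvram["head"]
    return f"{target}.value" in nvram and f"{target}.next" in nvram

def crash_states():
    """Yield every NVRAM image a crash could leave behind.

    Assumption: dirty lines reach NVRAM in an arbitrary order, and the
    crash may happen after any number of write-backs.
    """
    durable = {"node1.value": 1, "node1.next": None, "head": "node1"}
    cache = {}
    naive_insert(cache)
    lines = list(cache)
    for order in permutations(lines):
        for written in range(len(lines) + 1):
            nvram = dict(durable)
            for line in order[:written]:
                nvram[line] = cache[line]
            yield nvram

broken = [s for s in crash_states() if not list_is_consistent(s)]
print(f"{len(broken)} inconsistent crash states, e.g. {broken[0]}")
```

Every inconsistent image in this model has the new head pointer durable before the node it points to, which is the ordering problem a persist barrier is meant to rule out.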

slide-15
SLIDE 15

Linked List - Failsafe

Pseudo-code

  1. Create Node
  2. Update Node Pointer
  3. Persist Barrier
  4. Update Head Pointer

[Figure, built up over several slides: the insertion of Node 2 into the list (HEAD, Node 1), shown in the cache and in NVRAM.]
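A minimal sketch of why the barrier helps, under stated assumptions: stores before a persist barrier become durable before any store after it, while stores on the same side of the barrier may still persist in any order. The cache-line names and the `PERSIST_BARRIER` marker are hypothetical, chosen only for this illustration.

```python
from itertools import permutations

PERSIST_BARRIER = object()  # hypothetical marker separating two epochs

PROGRAM = [
    ("node2.value", 2),        # 1. Create Node
    ("node2.next", "node1"),   # 2. Update Node Pointer
    PERSIST_BARRIER,           # 3. Persist Barrier
    ("head", "node2"),         # 4. Update Head Pointer
]

def epochs(program):
    """Split the store stream into epochs at each persist barrier."""
    result = [[]]
    for op in program:
        if op is PERSIST_BARRIER:
            result.append([])
        else:
            result[-1].append(op)
    return result

def crash_states(program, durable):
    """Yield every NVRAM image reachable when epochs persist in order
    but stores inside one epoch may persist in any order."""
    def expand(nvram, remaining):
        yield dict(nvram)                       # crash before this epoch persists
        if not remaining:
            return
        first, rest = remaining[0], remaining[1:]
        for order in permutations(first):
            image = dict(nvram)
            for n, (line, value) in enumerate(order, start=1):
                image[line] = value
                if n < len(order):
                    yield dict(image)           # crash in the middle of the epoch
            yield from expand(image, rest)      # epoch fully durable
    yield from expand(durable, epochs(program))

def list_is_consistent(nvram):
    """HEAD must point to a node whose fields are all durable."""
    target = nvram["head"]
    return f"{target}.value" in nvram and f"{target}.next" in nvram

durable = {"node1.value": 1, "node1.next": None, "head": "node1"}
states = list(crash_states(PROGRAM, durable))
print(all(list_is_consistent(s) for s in states))  # prints True
```

Removing `PERSIST_BARRIER` from `PROGRAM` puts all three stores in one epoch, and the same check then finds inconsistent images.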

slide-20
SLIDE 20

Linked List - Failsafe

Pseudo-code

  1. Create Node
  2. Update Node Pointer
  3. Persist Barrier (programmer introduced)
  4. Update Head Pointer

The persist barrier splits the pseudo-code into Epoch A (before the barrier) and Epoch B (after it).

Epoch Persistence = dividing program execution into epochs through programmer-inserted persist barriers.

slide-23
SLIDE 23

Epoch Persistence*

Store stream:

  Epoch 1: St a, St b, St c, St a
  Persist Barrier
  Epoch 2: St d, St e, St d
  Persist Barrier
  Epoch 3: St p, St q, St d, …

[Figure, built up over several slides: visibility and persistence timelines for Epochs 1 to 3.]

* Pelley et al., “Memory Persistency”, in ISCA-2014.

Persist operations happen in the critical path of execution.
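The ordering rule behind epoch persistence can be sketched as a small checker. This is an illustration rather than the talk's mechanism: it assumes that repeated stores to one line inside an epoch coalesce into a single persist, that persists inside an epoch may reach NVRAM in any order, and that no persist from a later epoch may overtake an earlier epoch. The `"|"` barrier marker and the function names are hypothetical.

```python
STREAM = ["a", "b", "c", "a", "|", "d", "e", "d", "|", "p", "q", "d"]

def lines_per_epoch(stream, barrier="|"):
    """Distinct lines written in each epoch (same-epoch stores coalesce)."""
    epochs = [[]]
    for item in stream:
        if item == barrier:
            epochs.append([])
        elif item not in epochs[-1]:
            epochs[-1].append(item)
    return epochs

def respects_epoch_order(persists, epochs):
    """Check a persist order given as (epoch_number, line) pairs.

    Epoch numbers start at 1. A persist is legal only if it belongs to
    the oldest epoch that still has lines waiting to become durable.
    """
    pending = [set(lines) for lines in epochs]
    current = 0  # index of the oldest epoch that is not yet fully durable
    for epoch, line in persists:
        while current < len(pending) and not pending[current]:
            current += 1
        if (current == len(pending)
                or epoch - 1 != current
                or line not in pending[current]):
            return False
        pending[current].discard(line)
    return True

epochs = lines_per_epoch(STREAM)
print(epochs)  # [['a', 'b', 'c'], ['d', 'e'], ['p', 'q', 'd']]
print(respects_epoch_order([(1, "b"), (1, "a"), (1, "c"), (2, "e")], epochs))  # True
print(respects_epoch_order([(1, "a"), (2, "d")], epochs))                      # False
```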

slide-30
SLIDE 30

Buffered Epoch Persistence* through Lazy Barrier (LB)

  • Implementation of Epoch Persistence
  • Durability lags visibility
      • To allow performing persist operations out of critical path

* Pelley et al., “Memory Persistency”, in ISCA-2014.
* Condit et al., “Better I/O through byte-addressable, persistent memory”, in SOSP-2009.

slide-31
SLIDE 31

Buffered Epoch Persistence* through Lazy Barrier (LB)

[Figure, built up over several slides: visibility and persistence timelines for Epochs 1 to 3 (stores to a, b, c, d, e, p, q). One build is annotated "Cache Line Eviction"; d is marked as the conflicting request.]

* Pelley et al., “Memory Persistency”, in ISCA-2014.
* Condit et al., “Better I/O through byte-addressable, persistent memory”, in SOSP-2009.

Conflicts bring persist operations back into the critical path.
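A conflicting request can be sketched in a few lines. The model below is a simplification, not the hardware design: it assumes nothing drains to NVRAM in the background, so a line stays dirty in the cache, tagged with the epoch that wrote it, until a conflict forces a flush. A store conflicts when it targets a line still owned by an older epoch. The stream is the one shown on the epoch persistence slide; `"|"` is a hypothetical barrier marker.

```python
STREAM = ["a", "b", "c", "a", "|", "d", "e", "d", "|", "p", "q", "d"]

def intra_thread_conflicts(stream, barrier="|"):
    """Return (line, older_epoch, newer_epoch) for each conflicting store."""
    dirty = {}      # line -> epoch that wrote it and is not yet durable
    epoch = 1
    conflicts = []
    for item in stream:
        if item == barrier:
            epoch += 1          # lazy barrier: no flush, just a new epoch
            continue
        owner = dirty.get(item)
        if owner is not None and owner < epoch:
            conflicts.append((item, owner, epoch))
            # Flush on conflict: every epoch up to `owner` must become
            # durable before this store can proceed.
            dirty = {l: e for l, e in dirty.items() if e > owner}
        dirty[item] = epoch
    return conflicts

print(intra_thread_conflicts(STREAM))  # [('d', 2, 3)]
```

The second store to `a` in Epoch 1 and the second store to `d` in Epoch 2 are not conflicts because the line is owned by the current epoch; only the store to `d` in Epoch 3 has to wait for older epochs.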

slide-37
SLIDE 37

Intra-thread Conflict

[Figure: visibility and persistence timelines for Epochs 1 to 3, with d marked as the conflicting request.]

slide-39
SLIDE 39

Proactive Flush (PF)

  • Persist Triggers in Lazy Barrier
      • Passive Trigger: cache line eviction
      • Reactive Trigger: flush on (intra/inter)-thread conflicts
  • Persist Trigger with Proactive Flush
      • Proactive Trigger: proactively flush on epoch completion

slide-41
SLIDE 41

Proactive Flush (PF)

[Figure, built up over several slides: visibility and persistence timelines for Epochs 1 to 3 (stores to a, b, c, d, e, p, q), with the later builds annotated "Proactive Flush".]

Proactive Flush reduces the probability of encountering conflicts.
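The effect of the proactive trigger can be illustrated with a toy timing model. Everything in it is an assumption made for the sketch: each store takes one time unit, a completed epoch starts draining in the background as soon as its barrier is reached, one line drains per `persist_latency` time units, and stall time caused by a conflict is ignored. The function only counts conflicts; it does not model the hardware described in the talk.

```python
STREAM = ["a", "b", "c", "a", "|", "d", "e", "d", "|", "p", "q", "d"]

def count_conflicts(stream, proactive_flush, persist_latency=1, barrier="|"):
    """Count stores that hit a line still owned by an older, non-durable epoch."""
    time = 0
    epoch = 1
    dirty = {}          # line -> owning epoch (not yet durable)
    durable_at = {}     # epoch -> time at which its background flush finishes
    flush_free_at = 0   # time at which the background flush engine is idle
    conflicts = 0
    for item in stream:
        if item == barrier:
            if proactive_flush:
                # Proactive trigger: start draining the completed epoch now.
                lines = sum(1 for e in dirty.values() if e == epoch)
                start = max(time, flush_free_at)
                flush_free_at = start + lines * persist_latency
                durable_at[epoch] = flush_free_at
            epoch += 1
            continue
        time += 1
        # Drop lines whose epoch has finished draining in the background.
        done = {e for e, t in durable_at.items() if t <= time}
        dirty = {l: e for l, e in dirty.items() if e not in done}
        owner = dirty.get(item)
        if owner is not None and owner < epoch:
            conflicts += 1
            # Reactive trigger: older epochs are flushed on the conflict.
            dirty = {l: e for l, e in dirty.items() if e > owner}
        dirty[item] = epoch
    return conflicts

print(count_conflicts(STREAM, proactive_flush=False))                     # 1
print(count_conflicts(STREAM, proactive_flush=True, persist_latency=1))   # 0
print(count_conflicts(STREAM, proactive_flush=True, persist_latency=3))   # 1
```

In this model the conflict on `d` disappears when the background flush keeps up and reappears when persists are slow, which matches the slide's wording that the probability of a conflict is reduced rather than eliminated.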

slide-48
SLIDE 48

Inter-thread Conflict

[Figure, built up over several slides: visibility and persistence timelines for threads T0 and T1, with reads (R) and writes (W) grouped into epochs E00, E10 and E11.]

slide-53
SLIDE 53

Inter-thread Dependency Tracking (IDT)

  • Lazy barrier
      • No tracking of inter-thread dependencies
      • Need to enforce dependencies online
  • Inter-thread Dependency Tracking
      • Add inter-thread dependence tracking registers
      • Track dependencies to enforce offline
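The idea of recording a dependency and enforcing it later can be sketched in software. The class below is a hypothetical model, not the register design from the paper: epochs are written as `(thread, index)` pairs, so `(0, 0)` stands for E00 and `(1, 1)` for E11, and the example dependency (source E00, dependent E11) is one reading of the IDT table shown in the talk. It uses `graphlib` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

class IDTTable:
    """Toy model of inter-thread dependence tracking."""

    def __init__(self):
        self.sources = {}  # dependent epoch -> set of source epochs

    def record(self, source, dependent):
        """Note the dependency instead of flushing the source epoch now."""
        self.sources.setdefault(dependent, set()).add(source)

    def persist_order(self, epochs):
        """Return one order in which the epochs may become durable."""
        graph = {e: set(self.sources.get(e, ())) for e in epochs}
        # Epochs of one thread persist in program order.
        for thread, index in epochs:
            if (thread, index - 1) in graph:
                graph[(thread, index)].add((thread, index - 1))
        return list(TopologicalSorter(graph).static_order())

E00, E10, E11 = (0, 0), (1, 0), (1, 1)
idt = IDTTable()
idt.record(source=E00, dependent=E11)
order = idt.persist_order([E00, E10, E11])
print(order)
```

The printed order always places E00 and E10 before E11; the relative order of E00 and E10 is left to the sorter because nothing constrains it.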

slide-55
SLIDE 55

Inter-thread Dependency Tracking (IDT)

[Figure, built up over several slides: visibility and persistence timelines for threads T0 and T1 (epochs E00, E10, E11), with IDT tables that have Source and Dependent columns. The entries shown are Epoch E00 and Epoch E11.]

Reduces the latency of conflicting requests.

slide-61
SLIDE 61

Evaluation

  • Persistency Models
      • Buffered Epoch Persistency (BEP)
          • maintaining in-memory persistent data structures
      • Bulk Strict Persistency (BSP) = BEP + atomicity
          • provide stronger persistency model (strict persistency), similar to doing sequential consistency in bulk mode*

Persist Barrier Designs

  LB       Lazy barrier
  LB+IDT   Lazy barrier with inter-thread dependence tracking (IDT)
  LB+PF    Lazy barrier with proactive flush (PF)
  LB++     Lazy barrier with both IDT and PF

* Ceze et al., “BulkSC: Bulk enforcement of sequential consistency”, in ISCA-2007.

slide-62
SLIDE 62

System Configuration

  • We evaluate the proposed design using gem5 in full-system simulation mode
  • 32-core CMP with 32 x 1 MB LLC banks and 4 memory controllers
      • More details on the implementation of the persist barrier for such a system are in the paper.

slide-63
SLIDE 63

BEP Transaction Throughput

[Figure: BEP transaction throughput chart; higher is better. Successive builds highlight 3%, 15% and 22%.]

slide-67
SLIDE 67

BSP Execution Time

[Figure: BSP execution time chart; lower is better. Successive builds highlight 15% and 20%.]

slide-70
SLIDE 70

Conclusion

  • We propose an efficient implementation of a persist barrier primitive
      • Buffered implementation, to move persists out of the critical path
      • We highlight how conflicts bring them back into the critical path
  • We propose and implement two optimizations
      • Proactive Flush: reduces the percentage of conflicting epochs
      • Inter-thread Dependence Tracking: reduces the penalty of inter-thread conflicts
  • We demonstrate the efficacy by efficiently implementing two persistency models, namely BEP and BSP
