Interval-Based Memory Reclamation, by Haosen Wen, Joseph Izraelevitz, Wentao Cai, H. Alan Beadle and Michael L. Scott (presentation slides)



SLIDE 1

Interval-Based Memory Reclamation

Haosen Wen, Joseph Izraelevitz, Wentao Cai, H. Alan Beadle and Michael L. Scott

University of Rochester, PPoPP’18

SLIDE 2

Background

  • Unlike lock-based concurrent data structures, non-blocking ones allow updates to happen concurrently with other accesses.

  • Specifically, a thread might try to reclaim a block while other threads still have access to it.

[Figure: Thread 1 and Thread 2 accessing blocks A and B]

  • (Thread-safe) garbage-collected languages tend to incur high overhead.

SLIDE 3

The Problem

  • Manual approaches are mostly based on "reservations": global metadata that requires expensive store-load fences to update:

  • Hazard Pointers (HP) [Michael, PODC’02] reserve a minimum number of blocks per thread, but update the reservation every time a thread follows a shared pointer.

  • Epoch-Based Reclamation (EBR) [Fraser, thesis’04]; [Hart et al., 2007] issues memory fences only at the beginning and end of each operation, but a stalled thread may make an unbounded number of blocks unreclaimable.

  • Our approach improves on EBR by making it robust to thread stalling.

SLIDE 4

Hazard Pointers (HP)

  • Thread 1 is traversing a linked list and Thread 2 is retiring block A.

  • Blocks listed in the global array of HPs are protected from reclamation.

  • A store-load fence is issued on every HP update.

  • The number of HPs per thread is usually small, but can be unbounded in some cases.

[Figure: blocks A, B and C with the HP reservations of Thread 1 and Thread 2, shown before and after a store-load fence by T1; a block is marked "Not reclaimable" while it is reserved and "Reclaimable" once no HP refers to it]
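
The hazard-pointer bookkeeping on this slide can be sketched as a small single-threaded Python model. This is only an illustration, not a concurrent implementation: Python offers no store-load fence, and the class name `HazardPointerModel`, its methods, and the two-slot layout are all assumptions made for the example.

```python
class HazardPointerModel:
    """Single-threaded model of hazard-pointer bookkeeping (illustrative names)."""

    def __init__(self, num_threads, slots_per_thread=2):
        # Global array of hazard pointers: a small row of slots per thread.
        self.hazards = [[None] * slots_per_thread for _ in range(num_threads)]
        self.retired = []      # blocks retired but not yet reclaimed
        self.reclaimed = []    # blocks that have been freed

    def protect(self, tid, slot, block):
        # A real implementation follows this store with a store-load fence
        # and then re-checks the shared pointer; the model skips both.
        self.hazards[tid][slot] = block

    def retire(self, block):
        self.retired.append(block)

    def scan(self):
        # A retired block may be freed only if no hazard pointer names it.
        reserved = {b for row in self.hazards for b in row if b is not None}
        still_retired = []
        for block in self.retired:
            if block in reserved:
                still_retired.append(block)
            else:
                self.reclaimed.append(block)
        self.retired = still_retired


hp = HazardPointerModel(num_threads=2)
hp.protect(0, 0, "A")             # Thread 1 (tid 0) is reading block A
hp.protect(0, 1, "B")             # ...and its successor B
hp.retire("A")                    # Thread 2 (tid 1) unlinks and retires A
hp.scan()
first_scan = list(hp.reclaimed)   # A is still named by a hazard pointer

hp.protect(0, 0, "C")             # Thread 1 moves on; slot 0 now names C
hp.scan()
second_scan = list(hp.reclaimed)  # no hazard pointer names A any more
```

In the model, A survives the first scan and is freed by the second. Every call to `protect` stands for one fenced store in a real implementation, which is the per-pointer cost the slide refers to.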

SLIDE 5

Epoch-Based Reclamation (EBR)

  • The epoch counter is a slow-ticking "clock".

  • Each thread puts the current epoch E in its reservation at the beginning of each operation, reserving all objects retired in or after epoch E.

  • As a result, only blocks retired before the lowest reservation can be reclaimed.

[Figure: epoch timeline (epochs 1 to 4) with Block A and Block B; current epoch: 2; Thread 1 and Thread 2 hold reservations, lowest reservation: 1; reserved blocks are marked "Not reclaimable"]

SLIDE 6

Epoch-Based Reclamation (EBR)

  • The epoch counter is a slow-ticking "clock".

  • Each thread puts the current epoch E in its reservation at the beginning of each operation, reserving all objects retired in or after epoch E.

  • As a result, only blocks retired before the lowest reservation can be reclaimed.

  • An unbounded number of blocks may be tied up if some thread is stalled: EBR is not robust to thread stalling.

[Figure: epoch timeline (epochs 1 to 4) with Blocks A, B, C and D; current epoch: 5; one thread is stalled ("Zzzzzz...") so the lowest reservation stays at 1 and the retired blocks remain "Not reclaimable"]
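
The EBR rule and the stalled-thread problem can be reproduced with a minimal single-threaded Python model. It is a sketch under stated assumptions: the class `EBRModel` and its method names are hypothetical, fences are not modelled, and the epoch is advanced by hand after every operation.

```python
class EBRModel:
    """Single-threaded model of epoch-based reclamation (illustrative names)."""

    def __init__(self, num_threads):
        self.epoch = 1
        self.reservations = [None] * num_threads  # None = not inside an operation
        self.retired = []                          # (block, retire_epoch) pairs

    def start_op(self, tid):
        self.reservations[tid] = self.epoch        # reserve the current epoch

    def end_op(self, tid):
        self.reservations[tid] = None

    def retire(self, block):
        self.retired.append((block, self.epoch))

    def advance(self):
        self.epoch += 1

    def reclaim(self):
        # Only blocks retired before the lowest reservation can be freed.
        active = [r for r in self.reservations if r is not None]
        lowest = min(active) if active else float("inf")
        freed = [b for b, e in self.retired if e < lowest]
        self.retired = [(b, e) for b, e in self.retired if e >= lowest]
        return freed


ebr = EBRModel(num_threads=2)
ebr.start_op(0)                       # thread 0 reserves epoch 1, then stalls
for name in ["A", "B", "C", "D"]:
    ebr.start_op(1)
    ebr.retire(name)                  # retired in the current epoch
    ebr.end_op(1)
    ebr.advance()
stuck = ebr.reclaim()                 # lowest reservation is still 1
pending_while_stalled = len(ebr.retired)

ebr.end_op(0)                         # thread 0 finally finishes
freed = ebr.reclaim()
```

While thread 0 is stalled nothing is freed, and the pending list grows by one block for every further retirement, with no upper limit.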

SLIDE 7

Thoughts about EBR

  • EBR is not robust [Dice et al., 2016]: a stalled thread can end up reserving an unbounded number of blocks, including blocks created after it stalled.

  • If the reservation of one thread can only hold a bounded range of epochs, then a stalled thread can only reserve a finite number of blocks.

  • To ensure correctness, a block should be reserved if its "life interval" (its "lifetime" between its birth epoch and its retire epoch) intersects any reservation.
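
The intersection test in the last bullet can be written down directly. The following Python sketch assumes closed intervals of integer epochs; the function names `intersects` and `is_reserved` are made up for the illustration.

```python
def intersects(birth, retire, lower, upper):
    """True if life interval [birth, retire] overlaps reservation [lower, upper]."""
    return birth <= upper and lower <= retire


def is_reserved(birth, retire, reservations):
    """A block is reserved if its life interval meets any thread's reservation."""
    return any(intersects(birth, retire, lo, hi) for lo, hi in reservations)


reservations = [(1, 2)]                           # one thread holding epochs 1..2
old_block = is_reserved(1, 3, reservations)       # born in 1, retired in 3
young_block = is_reserved(3, 4, reservations)     # born after the upper bound
early_block = is_reserved(1, 1, [(2, 3)])         # retired before the lower bound
```

Because a block born after `upper` never passes the test, a stalled thread can hold on to at most the blocks born up to its upper bound, which is a finite set.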

SLIDE 8

Introducing Interval-Based Reclamation (IBR)

SLIDE 9

Interval-Based Reclamation (IBR)

  • IBR tracks the life interval (hence the name) of every block.

  • A block is reclaimable if its life interval does not intersect the reservation of any thread.

  • The reservation of each thread contains a finite range of epochs; a stalled thread won’t reserve any block born after the upper bound of its reservation.

  • A thread updates the upper bound of its reservation as it progresses.

[Figure: epoch timeline (epochs 1 to 4) with Blocks A, B and C; current epoch: 5; Thread 1 is stalled ("Zzzzzz...") with reserved epochs [1, 2]; blocks whose life interval intersects the reservation are "Not reclaimable", the others are "reclaimable"]
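
A stalled thread under IBR can be simulated with a small single-threaded Python model. This is a sketch, not the authors' implementation: `IBRModel` and its methods are hypothetical, the block names and epochs are invented, and `extend` simply raises the upper bound to the current epoch, whereas the concrete IBR variants choose the new bound in different ways.

```python
class IBRModel:
    """Single-threaded model of interval-based reservations (illustrative names)."""

    def __init__(self, num_threads):
        self.epoch = 1
        self.reservations = [None] * num_threads  # None, or [lower, upper]
        self.birth = {}                            # live block -> birth epoch
        self.retired = []                          # (block, birth, retire) triples

    def advance(self):
        self.epoch += 1

    def alloc(self, block):
        self.birth[block] = self.epoch

    def retire(self, block):
        self.retired.append((block, self.birth.pop(block), self.epoch))

    def start_op(self, tid):
        self.reservations[tid] = [self.epoch, self.epoch]

    def extend(self, tid):
        # Raise the upper bound as the thread progresses (model: to the current epoch).
        self.reservations[tid][1] = self.epoch

    def reclaim(self):
        active = [r for r in self.reservations if r is not None]

        def reserved(birth, retire):
            return any(birth <= hi and lo <= retire for lo, hi in active)

        freed = [blk for blk, b, r in self.retired if not reserved(b, r)]
        self.retired = [x for x in self.retired if reserved(x[1], x[2])]
        return freed


ibr = IBRModel(num_threads=2)
ibr.alloc("A")       # born in epoch 1
ibr.start_op(0)      # thread 0 reserves [1, 1]
ibr.advance()        # epoch 2
ibr.alloc("B")       # born in epoch 2
ibr.extend(0)        # thread 0 now reserves [1, 2], then stalls
ibr.advance()        # epoch 3
ibr.alloc("C")
ibr.retire("A")      # life interval [1, 3]
ibr.advance()        # epoch 4
ibr.alloc("D")
ibr.retire("B")      # life interval [2, 4]
ibr.advance()        # epoch 5
ibr.retire("C")      # life interval [3, 5]
ibr.retire("D")      # life interval [4, 5]
freed = ibr.reclaim()
still_reserved = [blk for blk, _, _ in ibr.retired]
```

Only A and B, which were born within the stalled thread's reservation [1, 2], stay unreclaimed; C and D are freed even though the thread never wakes up.
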
SLIDE 10

Tagged Pointer IBR (TagIBR)

  • Update reservations when following shared pointers. Goal: reserve the target block before pointer dereference.

  • A tag in the pointer is guaranteed to be greater than or equal to the birth epoch of its target.

[Figure: epoch timeline (epochs 1 to 4) with Block A, showing the reservation of Thread 1 before and after Read(A); each block stores its birth epoch (Birth: 1, Birth: 2) next to its data, and each pointer carries a tag (Tag: 2)]
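
The tag rule can be illustrated with a single-threaded Python sketch of a tagged read. All names here (`Block`, `TaggedPtr`, `Reservation`, `make_ptr`, `read`) are assumptions for the example, and the model leaves out the fence and the re-read that a concurrent version would need.

```python
from dataclasses import dataclass


@dataclass
class Block:
    name: str
    birth: int          # birth epoch of the block


@dataclass
class TaggedPtr:
    target: Block
    tag: int            # invariant from the slide: tag >= target.birth


@dataclass
class Reservation:
    lower: int
    upper: int


def make_ptr(block, tag):
    # Hypothetical helper: refuse to build a pointer that breaks the invariant.
    if tag < block.birth:
        raise ValueError("tag must be >= birth epoch of the target")
    return TaggedPtr(block, tag)


def read(cell, res):
    """Follow the shared pointer stored in cell[0], reserving its target first."""
    ptr = cell[0]                         # load pointer and tag together
    res.upper = max(res.upper, ptr.tag)   # now target.birth <= tag <= res.upper
    # A concurrent version would publish res.upper with a fence and re-read
    # the cell to check that the pointer did not change in the meantime.
    return ptr.target


res = Reservation(lower=1, upper=1)       # operation started in epoch 1
block_a = Block("A", birth=2)             # A was born later, in epoch 2
cell = [make_ptr(block_a, tag=2)]
target = read(cell, res)
```

After the read the reservation is [1, 2], so the birth epoch of A lies inside it and the life interval of A intersects the reservation for as long as the thread holds it.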

SLIDE 11

2 Global Epoch IBR (2GEIBR)

  • Always update the upper reservation to the current global epoch: faster (or simpler*).

  • There is a potential trade-off between space bound and throughput (or simplicity*) in long-running operations.

[Figure: epoch timeline (epochs 1 to 4) with Block A, showing the reservation of Thread 1 before and after Read(A) while the global epoch is 4; each block stores its birth epoch (Birth: 1, Birth: 2) next to its data]

* With different TagIBR variants.
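
The space side of this trade-off can be made concrete with a small Python sketch. It is a single-threaded illustration under assumptions: `read_2ge` and `reachable` are hypothetical names, the birth epochs are invented, and the tag-based update uses an assumed tag of 2 for comparison.

```python
from dataclasses import dataclass


@dataclass
class Reservation:
    lower: int
    upper: int


def read_2ge(cell, res, epoch):
    """Read cell[0] after raising res.upper to the global epoch in epoch[0]."""
    while True:
        target = cell[0]
        res.upper = max(res.upper, epoch[0])
        if res.upper == epoch[0]:
            # The epoch did not move past the reservation, so the read is covered.
            # In this single-threaded model the check passes on the first try.
            return target


def reachable(births, res):
    # Blocks born at or before the upper bound can be held by a stalled thread
    # (they are reserved once their retire epoch is >= res.lower).
    return [name for name, birth in births.items() if birth <= res.upper]


births = {"A": 1, "B": 2, "C": 3, "D": 4}   # hypothetical birth epochs
epoch = [4]                                  # current global epoch
cell = ["A"]

res_2ge = Reservation(lower=1, upper=1)
read_2ge(cell, res_2ge, epoch)               # upper jumps to the global epoch, 4

res_tag = Reservation(lower=1, upper=1)
res_tag.upper = max(res_tag.upper, 2)        # tag-based update with an assumed tag of 2
```

In this example the 2GEIBR-style reservation reaches all four blocks, while the tag-based one reaches only A and B. The read itself needs no per-pointer tag, which is where the simplicity comes from.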

SLIDE 12

Persistent Object IBR (POIBR)

  • The most straightforward implementation of IBR: every thread can reserve only one epoch.

  • Suitable only for data structures that persist their histories, for example, ones whose internal pointers are immutable.

SLIDE 13

Performance Results

SLIDE 14

Experimental Setup

  • Platform: Intel(R) Xeon(R) CPU E5-2699 v3.
  • Processor: 2 sockets, 18 cores per socket, 2 hyperthreads per core: 72 hyperthreads in total. (With more than 72 threads, some get stalled.)
  • Thread pinning strategy: 1 thread per core on one socket -> hyperthreads on the same socket -> next socket.
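
The pinning order can be expressed as a mapping from thread index to hardware context. The Python sketch below only computes a logical placement; the function name `placement`, the zero-based numbering, the assumption that the second socket is filled in the same order as the first, and the wrap-around for threads beyond 72 are all choices made for the illustration.

```python
SOCKETS = 2
CORES_PER_SOCKET = 18
HT_PER_CORE = 2


def placement(thread_index):
    """Return (socket, core, hyperthread) for the i-th thread, all zero-based."""
    per_socket = CORES_PER_SOCKET * HT_PER_CORE        # 36 contexts per socket
    # Assumption: threads beyond the 72 contexts wrap around and share one.
    slot = thread_index % (SOCKETS * per_socket)
    socket, within = divmod(slot, per_socket)
    # Fill one hyperthread on every core first, then the second hyperthread.
    hyperthread, core = divmod(within, CORES_PER_SOCKET)
    return socket, core, hyperthread


first = placement(0)            # first core of the first socket
last_core = placement(17)       # last core, still the first hyperthread
second_ht = placement(18)       # back to core 0, second hyperthread
next_socket = placement(36)     # first socket is full, move to the next one
oversubscribed = placement(72)  # shares a context with thread 0
```

With this order, runs of up to 18 threads stay on distinct cores of one socket, and only runs of more than 36 threads cross sockets.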

SLIDE 15

Schemes in the test

  • HP: Hazard Pointers
  • EBR: Epoch-Based Reclamation
  • TagIBR (sub-variants in paper: TagIBR-FAA, TagIBR-WCAS)
  • 2GEIBR: 2 Global Epoch IBR
  • No MM
  • (POIBR in paper)
SLIDE 16

Average retired-but-not-reclaimed objects per operation

Natarajan & Mittal’s Tree

[Plot: average number of unreclaimed retired blocks (y-axis, ticks up to 6000) versus number of threads (x-axis, 1 to 100) for EBR, TagIBR, 2GEIBR and HP; a marker shows the number of hardware contexts, and threads exceeding 72 get stalled]

  • Michael’s Hash Map has similar performance

SLIDE 17

Throughput (M ops/s)

Natarajan & Mittal’s Tree

[Plot: throughput in M ops/sec (y-axis, ticks up to 50) versus number of threads (x-axis, 1 to 100) for No MM, EBR, TagIBR, 2GEIBR and HP; a marker shows the number of hardware contexts]

  • Michael’s Hash Map has similar performance

SLIDE 18

Throughput (M ops/s)

Michael’s Linked List

[Plot: throughput in M ops/sec (y-axis, 0.000 to 0.100) versus number of threads (x-axis, 1 to 100) for No MM, EBR, TagIBR, 2GEIBR and HP]

SLIDE 19

Summary

  • We presented Interval-Based Memory Reclamation, a family of memory management schemes for non-blocking concurrent data structures.

  • These showed throughput comparable to the fastest existing approach(es), and are robust to thread stalling.

  • In theory, TagIBR is more suitable for data structures with long operations working on old data; 2GEIBR for (almost) the rest.
  • The artifact is available at: https://zenodo.org/record/1168572