Thread-Modular Reasoning for Lock-Free Data Structures Roland Meyer - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Thread-Modular Reasoning
 for Lock-Free Data Structures

Roland Meyer based on joint work with Lukáš Holík, Tomáš Vojnar, and Sebastian Wolff.

slide-2
SLIDE 2

Lock-Free Data Structures

Key Takeaways:

  • efficient but complex
  • correctness = linearizability
  • checking linearizability reduces to reachability
http://www.braunschweig-fotograf.de/mein-braunschweig/
slide-3
SLIDE 3

Concept

  • avoid locks

➡ critical section cannot exist

  • single commands are atomic

➡ compare-and-swap (CAS)

CAS(src, cmp, dst) := atomic {
    if (src != cmp) return false;
    src = dst;
    return true;
}
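The CAS definition above can be modeled in a few lines. This is only a sketch: the class name AtomicRef and its methods are hypothetical, and a lock stands in for the atomicity that hardware CAS provides (so the model itself is not lock-free). Comparison is by identity, mirroring pointer comparison.

```python
import threading


class AtomicRef:
    """A shared cell with compare-and-swap; the lock only models atomicity."""

    def __init__(self, value=None):
        self._value = value
        self._lock = threading.Lock()

    def get(self):
        with self._lock:
            return self._value

    def cas(self, expected, new):
        """Atomically: if the cell holds `expected`, store `new` and report success."""
        with self._lock:
            if self._value is not expected:   # src != cmp
                return False
            self._value = new                 # src = dst
            return True


cell = AtomicRef(None)
first, second = object(), object()
print(cell.cas(None, first))    # succeeds: the cell still held None
print(cell.cas(None, second))   # fails: the cell now holds `first`
```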

slide-4
SLIDE 4

Example: Treiber’s Stack

push(val):
    node = new Node(val);
    while (true) {
        top = ToS;
        node.next = top;
        if (CAS(ToS, top, node)) return;
    }

pop():
    while (true) {
        top = ToS;
        if (top == NULL) return EMPTY;
        next = top.next;
        if (CAS(ToS, top, next)) return top.data;
    }

[Figure: thread 1 with pointers ToS, node, and top]
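A runnable model of the stack above, as a sketch under stated assumptions: the names TreiberStack, _cas_tos, and the EMPTY sentinel are hypothetical, a lock models the atomicity of a single CAS, and Python's garbage collector plays the role of the GC setting (nodes are never reused while referenced).

```python
import threading


class Node:
    def __init__(self, data):
        self.data = data
        self.next = None


EMPTY = object()  # sentinel returned by pop on an empty stack


class TreiberStack:
    def __init__(self):
        self._tos = None                  # shared top-of-stack pointer (ToS)
        self._atomic = threading.Lock()   # models atomicity of one CAS only

    def _cas_tos(self, expected, new):
        with self._atomic:
            if self._tos is not expected:
                return False
            self._tos = new
            return True

    def push(self, val):
        node = Node(val)
        while True:
            top = self._tos               # copy the shared pointer
            node.next = top               # prepare the change locally
            if self._cas_tos(top, node):  # publish, or retry if ToS moved
                return

    def pop(self):
        while True:
            top = self._tos
            if top is None:
                return EMPTY
            nxt = top.next
            if self._cas_tos(top, nxt):
                return top.data


stack = TreiberStack()
stack.push(1)
stack.push(2)
print(stack.pop(), stack.pop(), stack.pop() is EMPTY)
```

The retry loops are the only synchronization in push and pop; a failed CAS means another thread changed ToS in between, so the local copy is refreshed.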

slide-5
SLIDE 5

Example: Treiber’s Stack

(push and pop as on slide 4)

[Figure: thread 1 with pointers ToS, node, top, and next]

slide-6
SLIDE 6

Example: Treiber’s Stack

(push and pop as on slide 4)

[Figure: threads 1 and 2 with pointers ToS, top, next, top2, and next2]

slide-7
SLIDE 7

Correctness and Concurrency

  • pre/post conditions meaningless

➡ other correctness criteria required

  • linearizability

➡ every concurrent run must coincide with a sequential run
➡ most common for lock-free data structures
➡ illusion of sequentiality [Filipović et al. ESOP’09]:

linearizable ⟺ sequential and concurrent implementation are observationally equivalent

slide-8
SLIDE 8

Checking Linearizability

  • check sequentiality illusion

➡ sufficient: sequence of linearization points is valid [Abdulla et al. TACAS’13]


(intuitively: linearization point = change of data structure takes effect)

➡ checking linearizability is a reachability problem

concurrent(DS) ⊨ sequential(DS)
  ⟺ linp(DS) ⊆ sequential(DS)
  ⟺ linp(DS) ∩ \overline{sequential(DS)} = ∅
  ⟺ linp(DS) ∩ observer(DS) = ∅
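To make the inclusion linp(DS) ⊆ sequential(DS) concrete, the following sketch checks a single sequence of linearization points against a sequential stack. It is not the observer automaton from the cited work: the event encoding (operation name, value) and the function name in_sequential_stack_spec are assumptions made for illustration, and a real analysis checks all reachable sequences rather than one.

```python
EMPTY = "EMPTY"


def in_sequential_stack_spec(lin_points):
    """Replay linearization-point events against a sequential stack.

    Each event is ("push", value) or ("pop", value_or_EMPTY).
    Returns False as soon as an event contradicts the sequential stack,
    which corresponds to an observer reaching its error state.
    """
    stack = []
    for op, value in lin_points:
        if op == "push":
            stack.append(value)
        elif op == "pop":
            expected = stack.pop() if stack else EMPTY
            if value != expected:
                return False
        else:
            raise ValueError(f"unknown operation: {op}")
    return True


good = [("push", 1), ("push", 2), ("pop", 2), ("pop", 1), ("pop", EMPTY)]
bad = [("push", 1), ("push", 2), ("pop", 1)]   # pops in FIFO order
print(in_sequential_stack_spec(good), in_sequential_stack_spec(bad))
```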

slide-9
SLIDE 9

Overview

  1. thread-modular reasoning
  2. ownership
  3. summaries
slide-10
SLIDE 10

Thread-Modular Reasoning

[Qadeer, Flanagan SPIN’03]

Key Takeaways:

  • compute reachability
  • interference is key to scalability
slide-11
SLIDE 11

Concept

  • view abstraction

➡ split states into set of views
➡ views capture perception of 1 thread (abstract from correlation)

  • state exploration

➡ fixed-point computation:

X = X ∪ sequential(X) ∪ interference(X)
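The fixed-point equation can be sketched as a small worklist-free loop. Everything about the toy system below is an assumption chosen for illustration (it is not the heap abstraction of the talk): a view is a pair (shared counter, program counter), and a thread at pc 0 increments the counter modulo 3 and moves to pc 1.

```python
def thread_modular_fixed_point(initial_views, sequential, interference):
    """Iterate X = X ∪ sequential(X) ∪ interference(X) until nothing new appears."""
    views = set(initial_views)
    while True:
        new_views = (sequential(views) | interference(views)) - views
        if not new_views:
            return views
        views |= new_views


def step(view):
    """One step of the thread that owns the view, or None if it has finished."""
    shared, pc = view
    return ((shared + 1) % 3, 1) if pc == 0 else None


def sequential(views):
    return {step(v) for v in views if step(v) is not None}


def interference(views):
    result = set()
    for shared, pc in views:                  # thread being interfered with
        for other in views:                   # interfering thread
            if other[0] == shared and step(other) is not None:
                result.add((step(other)[0], pc))   # keep own pc, take new shared
    return result


reachable = thread_modular_fixed_point({(0, 0)}, sequential, interference)
print(sorted(reachable))
```

Because views abstract from the number of threads, the result covers arbitrarily many threads of this toy program; the nested loop in interference is the quadratic cost discussed later in the talk.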

slide-12
SLIDE 12

Example: View Abstraction

[Figure: two views, one per thread. Thread 1 is at CAS(ToS, top, next) with pointers ToS, top1, next1; thread 2 is at CAS(ToS, top, next) with pointers ToS, top2, next2.]

Note: both views are equal.

X = X ∪ sequential(X) ∪ interference(X)

slide-13
SLIDE 13

Example: Sequential Step

[Figure: the view of thread 1 at CAS(ToS, top, next), with pointers ToS, top1, next1]

No concurrent behavior.

X = X ∪ sequential(X) ∪ interference(X)

slide-14
SLIDE 14

Example: Interference Step

[Figure: the views of thread 1 (ToS, top1, next1) and thread 2 (ToS, top2, next2), both at CAS(ToS, top, next)]

X = X ∪ sequential(X) ∪ interference(X)

  1. combine
slide-15
SLIDE 15

Example: Interference Step

[Figure: combined state of threads 1 and 2, both at CAS(ToS, top, next), with pointers ToS, top1, next1, top2, next2]

X = X ∪ sequential(X) ∪ interference(X)

  1. combine
  2. step
  3. project
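The three stages of an interference step can be written out explicitly. The model below is a deliberately small assumption, not the talk's heap abstraction: the stack is reduced to its height (an integer standing in for ToS), a view is (tos, top) where top is the thread's local copy of ToS, and a successful CAS pops one element. All function names are hypothetical.

```python
def combine(view, other):
    """Merge two views into one two-thread state, or None if they do not match."""
    (tos, top1), (other_tos, top2) = view, other
    if tos != other_tos:          # matching fails: shared parts disagree
        return None
    return (tos, top1, top2)


def step_other(state):
    """Let the second thread execute CAS(ToS, top, next); None if the CAS fails."""
    tos, top1, top2 = state
    if top2 == tos and tos > 0:
        return (tos - 1, top1, None)
    return None


def project(state):
    """Forget the second thread again, keeping the first thread's view."""
    tos, top1, _ = state
    return (tos, top1)


def interference(views):
    result = set()
    for view in views:
        for other in views:
            state = combine(view, other)
            if state is None:
                continue
            after = step_other(state)
            if after is not None:
                result.add(project(after))
    return result


# Two threads with equal views, as on the slide: after thread 2's CAS,
# thread 1 holds a stale copy (top = 2 while ToS = 1).
print(interference({(2, 2)}))
```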
slide-16
SLIDE 16

Challenges with Interference

  • number of possible combinations is enormous

➡ not all combinations are reasonable

  • need pruning to make the approach practical

➡ precision
➡ performance

  • pruning must be sound
slide-17
SLIDE 17

Pruning Interferences

two types

  • matching

➡ Is it possible to combine at all? Skip if not.

  • correlation

➡ Which nodes should coincide?

slide-18
SLIDE 18

Matching: Complication

  • matching gets harder due to finite abstraction
  • we use reachability predicates (shape analysis):
  • 0-step: =
  • 1-step:
  • n-step: ⤏
  • unreach: ⋈

[Figure: example heap over ToS and node, annotated with predicates such as ToS ⤏ NULL]

slide-19
SLIDE 19

Matching: Example

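[Figure: a view with pointers ToS and node is matched against a view with pointers ToS, top, and next; the shared part is the logical stack content]

Subgraph isomorphism: NP-complete!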

slide-20
SLIDE 20

Correlation: Example

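[Figure: a view with pointers ToS and node and a view with pointers ToS, top, and next are to be combined; the candidate combinations over ToS, top2, next2, and node1 are marked "??"]

Exponentially many!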

slide-21
SLIDE 21

Practicality is about Interference

  • interference

➡ quadratic in size of state space

  • matching

➡ subgraph isomorphism (NP)

  • correlation

➡ exponential

Consequences: poor scalability; need to fight imprecision (false positives)

slide-22
SLIDE 22

Ownership

Key Takeaways:

  • ownership saves the day
  • even under explicit memory management
slide-23
SLIDE 23

Concept

partition allocated heap into

  • owned

➡ exclusive access for a single thread
➡ granted upon allocation

  • shared

➡ accessible by every thread
➡ by publishing (e.g. making accessible via shared variables)

slide-24
SLIDE 24

Ownership in Thread-Modular Reasoning [Gotsman et al. PLDI’07]

  • track ownership

➡ small overhead

  • matching

➡ owned cells not contained

  • correlation

➡ owned cells not merged with other nodes

slide-25
SLIDE 25

Ownership and Correlation

[Figure: the correlation example from slide 20 (pointers ToS, top, next, top2, next2, node1), with node marked as owned; the candidate combinations are marked "??"]

slide-26
SLIDE 26

Ownership in Thread-Modular Reasoning

  • helps a lot with

➡ matching
➡ correlation

  • makes thread-modular reasoning practical

➡ prunes false positives

Only for garbage collection (GC)!

What about explicit memory management (MM)?

slide-27
SLIDE 27

Problem with MM

“Ownership does not exist under explicit memory management.” (folklore)

  • almost true
  • indeed no exclusivity ➡ dangling pointers
  • we introduced weak ownership in VMCAI’16
slide-28
SLIDE 28

Weak Ownership

[VMCAI’16]

  • write exclusivity

➡ only owners may write

  • no read exclusivity

➡ dangling readers allowed
➡ dangling reads unsafe
➡ only owner may rely on memory contents

[Figure: list segment with one cell labeled owned and one pointer labeled dangling]

slide-29
SLIDE 29

Weak Ownership in Thread-Modular Reasoning

  • track dangling pointers

➡ small overhead

  • matching: like normal ownership
  • correlation

➡ cells owned by thread 1 are referenced by thread 2 only via dangling pointers

  • dangling write accesses may be unsafe

➡ report as bug

[VMCAI’16]

slide-30
SLIDE 30

Performance Impact [VMCAI’16]

                          MM without ownership   MM with ownership
Treiber’s stack           944s                   25.5s (:37)
                          #116776                #3175 (:36)
Michael&Scott’s queue     false positive         11700s (impractical)
                          > #69000               #19742

slide-31
SLIDE 31

Accomplishments

  • ownership helps with matching and correlation
  • low overhead for tracking the additional information
  • treating unsafe accesses as bugs reflects programming practice
  • performance improvements for analysis
  • but: not practical yet

➡ interference still computationally complex

slide-32
SLIDE 32

Summaries

Key Takeaways:

  • copy-and-check blocks
  • statelessness
  • efficient interference
slide-33
SLIDE 33

Observation

  • lock-freedom relies on copy-and-check blocks
      1. create local copy of shared data
      2. make changes locally
      3. publish changes if copy up-to-date, or retry otherwise

➡ updates appear atomically

push(val):
    node = new Node(val);
    while (true) {
        top = ToS;                          // 1
        node.next = top;                    // 2
        if (CAS(ToS, top, node)) return;    // 3
    }

slide-34
SLIDE 34

Insight

Threads cannot observe the local behavior of other threads. (SAS’17)

So why do interference for all intermediate steps?

➡ instead: apply updates in one shot
➡ potentially unsound: stay tuned

slide-35
SLIDE 35

Example: Summary for pop

atomic {
    while (true) {
        top = ToS;
        if (top == NULL) return EMPTY;
        next = top.next;
        if (CAS(ToS, top, next)) return top.data;
    }
}

  1. make atomic
  2. remove noise (the return values EMPTY and top.data)

slide-36
SLIDE 36

Example: Summary for pop

atomic {
    while (true) {
        top = ToS;
        if (top == NULL) return;
        next = top.next;
        if (CAS(ToS, top, next)) return;
    }
}

  1. make atomic
  2. remove noise
  3. copy propagation

slide-37
SLIDE 37

Example: Summary for pop

atomic {
    while (true) {
        top = ToS;
        if (ToS == NULL) return;
        next = ToS.next;
        if (CAS(ToS, ToS, next)) return;
    }
}

  1. make atomic
  2. remove noise
  3. copy propagation (each use of top is replaced by ToS)

slide-38
SLIDE 38

Example: Summary for pop

atomic {
    while (true) {
        if (ToS == NULL) return;
        if (CAS(ToS, ToS, ToS.next)) return;
    }
}

  1. make atomic
  2. remove noise
  3. copy propagation
  4. remove noise
  5. rewrite CAS

slide-39
SLIDE 39

Example: Summary for pop

atomic {
    if (ToS == NULL) return;
    ToS = ToS.next;
    return;
}

  1. make atomic
  2. remove noise
  3. copy propagation
  4. remove noise
  5. rewrite CAS (if (CAS(ToS, ToS, ToS.next)) return; becomes ToS = ToS.next; return;)

slide-40
SLIDE 40

Example: Summary for pop

atomic {
    assume(ToS != NULL);
    ToS = ToS.next;
    return;
}

  1. make atomic
  2. remove noise
  3. copy propagation
  4. remove noise
  5. rewrite CAS
  6. rewrite guard (if (ToS == NULL) return; becomes assume(ToS != NULL);)

slide-41
SLIDE 41

Example: Summary for pop

atomic {
    assume(ToS != NULL);
    ToS = ToS.next;
}

  1. make atomic
  2. remove noise
  3. copy propagation
  4. remove noise
  5. rewrite CAS
  6. rewrite guard

slide-42
SLIDE 42
Example: Summary for pop

atomic {
    assume(ToS != NULL);
    ToS = ToS.next;
}

  1. make atomic
  2. remove noise
  3. copy propagation
  4. remove noise
  5. rewrite CAS
  6. rewrite guard

  • easy to compute

➡ similar for push

  • compact form beneficial for analysis (and understandability)
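The resulting summary can be read as a function on the shared state alone. A minimal sketch, assuming a hypothetical encoding in which the shared state is just ToS and nodes are immutable tuples; a failing assume is modeled by returning no successor state.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    data: int
    next: Optional["Node"]


def pop_summary(tos):
    """atomic { assume(ToS != NULL); ToS = ToS.next; }

    Returns the list of possible new values of ToS: empty if the assume
    fails (summary not enabled), a single successor otherwise.
    """
    if tos is None:
        return []
    return [tos.next]


stack = Node(2, Node(1, None))
print(pop_summary(stack))   # one successor: the stack without its top node
print(pop_summary(None))    # no successor: the assume blocks on an empty stack
```

Returning a list rather than a single value keeps "summary not enabled" distinct from "ToS becomes NULL", which both would otherwise be represented by None.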

slide-43
SLIDE 43

Insight

Summaries are stateless. (SAS’17)

learn about an object’s state from shared variables (assume), and execute atomically

➡ no concurrency: ∏_i summary_i = ∑_i summary_i
➡ no interference for summaries needed

Finally an efficient interference algorithm!
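Since summaries are stateless, interference no longer needs to pair views with each other; each view is simply updated by each summary. The sketch below assumes a toy model in which the stack is reduced to its height (an integer standing in for ToS) and a view is (tos, local state); the function names are hypothetical.

```python
def pop_summary(tos):
    """atomic { assume(ToS != NULL); ToS = ToS.next; } on the height model."""
    return [tos - 1] if tos > 0 else []


def push_summary(tos):
    """Assumed analogue for push: the stack grows by one node."""
    return [tos + 1]


def interference_with_summaries(views, summaries):
    """Apply every summary to the shared part of every view.

    One pass over the views per summary: linear in the number of views,
    with no matching and no correlation between two views.
    """
    result = set()
    for shared, local in views:
        for summary in summaries:
            for new_shared in summary(shared):
                result.add((new_shared, local))   # local state is untouched
    return result


# A thread holding the local copy top = 2 sees ToS change to 1.
print(interference_with_summaries({(2, 2)}, [pop_summary]))
```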

slide-44
SLIDE 44

Example: New Interference

[Figure: the view of thread 1 at CAS(ToS, top, next), with pointers ToS, top, and next; thread 2 is represented by the summary below]

atomic { assume(ToS != NULL); ToS = ToS.next; }

slide-45
SLIDE 45

Soundness

  • soundness requires summaries to
      1. capture all possible effects of the implementation
      2. be stateless
  • both can be checked on the fixed point
      1. for each effect, check whether some summary can do it
      2. summaries must not rely on uninitialized local variables
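The first check can be sketched as a pass over the computed views. This is an illustration under assumptions, not the check from the paper: the stack is reduced to its height, a view is (tos, top), and the names pop_cas_step and summaries_cover_effects are hypothetical. The statelessness check (item 2) is not covered here.

```python
def pop_summary(tos):
    return [tos - 1] if tos > 0 else []


def push_summary(tos):
    return [tos + 1]


def pop_cas_step(tos, top):
    """Implementation step of a thread at CAS(ToS, top, next) in the toy model."""
    if top is not None and top == tos and tos > 0:
        return [(tos - 1, None)]     # CAS succeeds: shared state changes
    return [(tos, None)]             # CAS fails: shared state unchanged, retry


def summaries_cover_effects(views, impl_step, summaries):
    """For each effect on the shared state, check that some summary can do it."""
    for shared, local in views:
        for new_shared, _new_local in impl_step(shared, local):
            if new_shared == shared:
                continue             # no change to shared state: nothing to mimic
            if not any(new_shared in summary(shared) for summary in summaries):
                return False
    return True


views = {(2, 2), (2, 1), (1, 1)}
print(summaries_cover_effects(views, pop_cas_step, [pop_summary]))
```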
slide-46
SLIDE 46

Accomplishments

  • improved interference

➡ matching: NP ⟹ not needed
➡ correlation: exponential ⟹ constant (one)
➡ interference: quadratic (in fixed-point approximant) ⟹ linear

  • sound approach despite unsound abstraction
  • works for explicit memory management (requires ownership transfer; skipped here)
slide-47
SLIDE 47

Performance Impact: GC

                          classical   summaries   speedup
Coarse Stack              0.29s       0.03s       :10
Coarse Queue              0.49s       0.05s       :10
Treiber’s stack           1.99s       0.06s       :33
Michael&Scott’s queue     11.0s       0.39s       :28
DGLM queue                9.56s       0.37s       :25

slide-48
SLIDE 48

Performance Impact: MM

                          classical        summaries   speedup
Coarse Stack              1.89s            0.19s       :10
Coarse Queue              2.34s            0.98s       :2
Treiber’s stack           25.5s            1.64s       :15
Michael&Scott’s queue     11700s           102s        :114
DGLM queue                false-positive   violation

slide-49
SLIDE 49

Related Work

Key Takeaways:

  • Abdulla et al.
  • Vafeiadis et al.
slide-50
SLIDE 50

Abdulla et al.

  • improve precision of interference

➡ first to make it work for explicit memory management
➡ without weak ownership

  • increase threads per view to 2

➡ could restore precision for matching and correlation

  • poor scalability due to increased state space
slide-51
SLIDE 51

Vafeiadis et al.

  • relies on RGSep (separation logic + rely guarantee)
  • fixed point:

➡ interference recorded per thread in every step
➡ applied to others in next iteration

  • corresponds to learning summaries

➡ no freedom: sound in every step
➡ linear in fixed point (here: linear in program size)

  • only considered garbage collection
slide-52
SLIDE 52

Future Work

  • stateful summaries
  • go beyond singly-linked objects
  • more benchmarks
slide-53
SLIDE 53

FIN