Sequential consistency considered harmful
Viktor Vafeiadis
Max Planck Institute for Software Systems (MPI-SWS)
7 November 2017
The standard WMC talk
Sequential consistency (SC)
◮ Interleaving semantics
◮ Intuitive, isn’t it?
Weak memory consistency
◮ All the SC behaviours
◮ + Some weird behaviours
◮ Complicated...
[Figure: CPUs reading from and writing to a shared memory (SC); CPUs whose writes reach memory via write-back (weak memory consistency).]
Release-acquire (RA) is not complicated
Message passing (MP)
Initially, X = Y = 0
Thread 1:           Thread 2:
X := 1;             a := Y;  // 1
Y := 1              b := X   // ≠ 0

Store buffering (SB)
Thread 1:           Thread 2:
X := 1;             Y := 1;
a := Y;  // 0       b := X;  // 0

◮ Messages delivered in order.
◮ But they take time to deliver!
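The two bullets can be made concrete with a small exhaustive simulation. The sketch below is a simplified view-based model written for illustration, not the formal RA definition. Each write becomes a message carrying the writer's view, and a read may pick any message that is not older than what the reading thread has already seen. The function name `explore_ra`, the instruction encoding, and the restriction that writes are always appended last in a location's order are all assumptions of this sketch.

```python
def explore_ra(threads, locations):
    """Return all final register valuations under a simplified view-based
    release-acquire model (illustrative assumption, not the formal model)."""
    idx = {loc: i for i, loc in enumerate(locations)}
    zero = tuple(0 for _ in locations)
    # Per location: a tuple of messages (value, view). A message's timestamp
    # is its position in the tuple; position 0 holds the initial value 0.
    init_mem = tuple(((0, zero),) for _ in locations)
    outcomes, seen = set(), set()

    def step(pcs, views, mem, regs):
        state = (pcs, views, mem, regs)
        if state in seen:
            return
        seen.add(state)
        if all(pc == len(prog) for pc, prog in zip(pcs, threads)):
            outcomes.add(tuple(sorted(regs)))
            return
        for t, prog in enumerate(threads):
            if pcs[t] == len(prog):
                continue
            op, dst, src = prog[pcs[t]]
            pcs2 = pcs[:t] + (pcs[t] + 1,) + pcs[t + 1:]
            if op == "write":  # dst is a location, src a constant
                l = idx[dst]
                ts = len(mem[l])  # simplification: always append
                view = views[t][:l] + (ts,) + views[t][l + 1:]
                mem2 = mem[:l] + (mem[l] + ((src, view),),) + mem[l + 1:]
                step(pcs2, views[:t] + (view,) + views[t + 1:], mem2, regs)
            else:  # read: dst is a register, src a location
                l = idx[src]
                # Any message not older than the thread's view is readable.
                for ts in range(views[t][l], len(mem[l])):
                    val, msg_view = mem[l][ts]
                    # Reading delivers everything the writer had seen.
                    view = tuple(max(a, b) for a, b in zip(views[t], msg_view))
                    step(pcs2, views[:t] + (view,) + views[t + 1:], mem,
                         regs | {(dst, val)})

    step(tuple(0 for _ in threads), tuple(zero for _ in threads),
         init_mem, frozenset())
    return outcomes


mp = explore_ra([[("write", "X", 1), ("write", "Y", 1)],
                 [("read", "a", "Y"), ("read", "b", "X")]], ["X", "Y"])
sb = explore_ra([[("write", "X", 1), ("read", "a", "Y")],
                 [("write", "Y", 1), ("read", "b", "X")]], ["X", "Y"])
print((("a", 1), ("b", 0)) in mp)  # False: MP's weak outcome is unreachable
print((("a", 0), ("b", 0)) in sb)  # True: SB's weak outcome is reachable
```

In MP, reading Y = 1 pulls the writer's view of X into the reader, which is the "delivered in order" part. In SB, neither thread has received the other's message yet, so both may still read the initial value.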
What about the formal definition?
But isn’t the definition of RA more complex than that of SC?
◮ It largely depends on presentation. . .
But who cares about the MM definition?
◮ Theoreticians certainly care.
◮ Programmers do not understand programs by looking at the MM definition, but at its properties.
Key question
◮ Determine whether our program is correct.
◮ We want automation: a tool to answer this query.
◮ We also want manual proof techniques.
Automated verification
So, which model is best for automated verification?
◮ It depends on the exact question.
Two cases where verification is easier under WMC than under SC:
1. Checking consistency of an execution
2. Bounded model checking
NB: There are other verification problems that are easy under SC and difficult under WMC.
Checking consistency of an execution
Execution consistency problem
Given a concurrent program P with instructions of the form:
◮ x := v – write constant v to shared variable x
◮ r := x – read value of x into register r
such that no two instructions have the same v or r, and a register assignment R = [r1→v1, ..., rk→vk], determine whether R is a possible outcome of P.
This problem is:
◮ NP-complete for SC;
◮ Polynomial for several weak memory models (e.g., RA).
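To make the SC side of the problem concrete, here is a brute-force sketch that searches the interleavings of P for one producing R. It only illustrates the problem statement and is not an efficient algorithm; its worst-case running time is exponential. The name `sc_consistent`, the tuple encoding of instructions, and the assumption that every shared variable initially holds 0 are choices made for this example.

```python
def sc_consistent(threads, outcome):
    """Return True if the register assignment `outcome` (dict: register ->
    value) is produced by some SC interleaving of `threads`.

    Instructions are ("write", x, v) for x := v and ("read", r, x) for r := x.
    Assumption of this sketch: shared variables are initially 0.
    """
    start = tuple(0 for _ in threads)

    def search(pcs, mem):
        if all(pc == len(prog) for pc, prog in zip(pcs, threads)):
            return True  # every read matched the requested outcome
        for t, prog in enumerate(threads):
            if pcs[t] == len(prog):
                continue
            op, dst, src = prog[pcs[t]]
            nxt = pcs[:t] + (pcs[t] + 1,) + pcs[t + 1:]
            if op == "write":
                if search(nxt, {**mem, dst: src}):
                    return True
            # A read may only be scheduled when memory holds the value that
            # the outcome demands for its register; otherwise prune.
            elif mem.get(src, 0) == outcome[dst] and search(nxt, mem):
                return True
        return False

    return search(start, {})


# Store buffering with distinct constants, as the problem statement requires.
program = [[("write", "x", 1), ("read", "r1", "y")],
           [("write", "y", 2), ("read", "r2", "x")]]
print(sc_consistent(program, {"r1": 0, "r2": 0}))  # False
print(sc_consistent(program, {"r1": 2, "r2": 1}))  # True
```

Because all written constants are distinct, each register value in R identifies the write it was read from. The search is therefore only about finding a total order that is compatible with those choices.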
Stateless model checking
◮ For SC, 10+ years of research on optimisations.
State of the art: Nidhugg “optimal” DPOR.
◮ For RC11, our first attempt . . . to appear at POPL’18.
Benchmark         Nidhugg/SC   RCMC/RC11
linuxrwlocks(2)       0.22 s      0.08 s
linuxrwlocks(3)      37.65 s      7.58 s
ms-queue(2)           0.45 s      0.13 s
ms-queue(3)          21.21 s      4.34 s
qspinlock(2)          0.11 s      0.06 s
qspinlock(3)         15.78 s      3.34 s
big0                 18.26 s      2.65 s
Manual reasoning—program logics
Weak memory enforces local reasoning.
◮ Ownership-based reasoning.
◮ Proof of a thread mentions only variables accessed by it.
◮ Key underlying principle of separation logic.
RA allows causal reasoning.
◮ I have seen an update, so I have seen all previous updates.
◮ “Ownership transfer” in separation logic.
SC, in addition, allows global reasoning.
◮ Proof of a thread can mention local variables of other threads.
◮ Global reasoning is complicated.
Message passing with the Owicki-Gries method (1976)
Prove that the MP program cannot have its weak behaviour:
{Y = 0}
Thread 1:           Thread 2:
{⊤}                 {Y = 0 ∨ X = 1}
X := 1;             a := Y;
{X = 1}             {a = 0 ∨ X = 1}
Y := 1              b := X
{⊤}                 {a = 0 ∨ b = 1}
{a = 0 ∨ b = 1}

◮ A straightforward local proof.
◮ Sound also under RA.
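As a sanity check on the annotations, the sketch below walks every SC interleaving of the MP program and tests that each thread's current assertion holds in every state it reaches. This is a testing aid and not the Owicki-Gries proof rule itself. It also starts from one concrete initial state (X = Y = a = b = 0), which is an assumption of the example. The helper names `write`, `read` and `check_outline` are hypothetical.

```python
def write(var, val):
    return lambda s: {**s, var: val}          # var := val

def read(reg, var):
    return lambda s: {**s, reg: s[var]}       # reg := var

def check_outline(threads, posts, init, final):
    """threads[t] is a list of (assertion, action) pairs; posts[t] is the
    assertion after thread t's last action. Returns True if every assertion
    holds along every interleaving and `final` holds in every end state."""
    def holds(pcs, s):
        return all((threads[t][pc][0] if pc < len(threads[t]) else posts[t])(s)
                   for t, pc in enumerate(pcs))

    def visit(pcs, s):
        if not holds(pcs, s):
            return False
        if all(pc == len(th) for pc, th in zip(pcs, threads)):
            return final(s)
        return all(visit(pcs[:t] + (pc + 1,) + pcs[t + 1:], threads[t][pc][1](s))
                   for t, pc in enumerate(pcs) if pc < len(threads[t]))

    return visit(tuple(0 for _ in threads), init)


TRUE = lambda s: True
MP = [
    [(TRUE, write("X", 1)),
     (lambda s: s["X"] == 1, write("Y", 1))],
    [(lambda s: s["Y"] == 0 or s["X"] == 1, read("a", "Y")),
     (lambda s: s["a"] == 0 or s["X"] == 1, read("b", "X"))],
]
POSTS = [TRUE, lambda s: s["a"] == 0 or s["b"] == 1]
INIT = {"X": 0, "Y": 0, "a": 0, "b": 0}

print(check_outline(MP, POSTS, INIT, lambda s: s["a"] == 0 or s["b"] == 1))  # True
```

Each assertion of a thread mentions only variables that the thread itself accesses, which is what makes the outline local.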
Store buffering with the Owicki-Gries method (1976)
Prove that the SB program cannot have its weak behaviour:
{a ≠ 0}
Thread 1:           Thread 2:
{a ≠ 0}             {⊤}
X := 1;             Y := 1;
{X ≠ 0}             {Y ≠ 0}
a := Y              b := X
{X ≠ 0}             {Y ≠ 0 ∧ (a ≠ 0 ∨ b = X)}
{a ≠ 0 ∨ b ≠ 0}

◮ Requires a non-trivial global proof!
◮ This non-local reasoning is unsound under RA.
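The proof obligations of the Owicki-Gries method (local correctness of each step plus interference freedom between threads) can be checked mechanically over a small finite domain. The sketch below does this for the SB outline, reading its assertions as a ≠ 0, X ≠ 0, Y ≠ 0 and Y ≠ 0 ∧ (a ≠ 0 ∨ b = X), with postcondition a ≠ 0 ∨ b ≠ 0. The restriction of all variables to {0, 1} and the names `assign`, `triple` and `owicki_gries` are assumptions of this example. The check is with respect to SC-style atomic steps only.

```python
from itertools import product

VARS = ("X", "Y", "a", "b")
STATES = [dict(zip(VARS, vals)) for vals in product((0, 1), repeat=len(VARS))]

def assign(dst, src):
    """Command dst := src, where src is a constant or a variable name."""
    return lambda s: {**s, dst: s[src] if isinstance(src, str) else src}

def triple(pre, cmd, post):
    """Validity of the Hoare triple {pre} cmd {post} over STATES."""
    return all(post(cmd(s)) for s in STATES if pre(s))

def owicki_gries(outlines, pre, post):
    """Each outline is [P0, c1, P1, c2, P2, ...] for one thread."""
    ok = all(o[0](s) for s in STATES if pre(s) for o in outlines)
    for o in outlines:                      # local correctness
        for i in range(1, len(o), 2):
            ok &= triple(o[i - 1], o[i], o[i + 1])
    for t, o in enumerate(outlines):        # interference freedom
        for u, other in enumerate(outlines):
            if t == u:
                continue
            for r in o[0::2]:
                for i in range(1, len(other), 2):
                    p, c = other[i - 1], other[i]
                    ok &= triple(lambda s: r(s) and p(s), c, r)
    ok &= all(post(s) for s in STATES if all(o[-1](s) for o in outlines))
    return ok

TRUE = lambda s: True
t1 = [lambda s: s["a"] != 0, assign("X", 1),
      lambda s: s["X"] != 0, assign("a", "Y"),
      lambda s: s["X"] != 0]
t2_global = [TRUE, assign("Y", 1),
             lambda s: s["Y"] != 0, assign("b", "X"),
             lambda s: s["Y"] != 0 and (s["a"] != 0 or s["b"] == s["X"])]
# Same outline, but the last assertion no longer mentions thread 1's register a.
t2_local = t2_global[:4] + [lambda s: s["Y"] != 0]

pre = lambda s: s["a"] != 0
post = lambda s: s["a"] != 0 or s["b"] != 0
print(owicki_gries([t1, t2_global], pre, post))  # True
print(owicki_gries([t1, t2_local], pre, post))   # False
```

Dropping the clause that mentions `a` from thread 2's final assertion makes the final implication fail. This illustrates why this particular outline needs thread 2 to talk about thread 1's register.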
Program logics summary
RA supports local and causal reasoning.
◮ Local Owicki-Gries is sound.
◮ Separation logic and extensions (RSL, GPS) are sound.
Global reasoning is unsound under RA.
◮ It requires strong fences.
◮ Global reasoning is complicated to do anyway.
◮ Fences document when global reasoning is needed.
Scalability barrier: multi-copy atomicity?
Independent reads of independent writes (IRIW)
Initially, X = Y = 0
Thread 1:    Thread 2:          Thread 3:          Thread 4:
X := 1       a := X;  // 1      c := Y;  // 1      Y := 1
             b := Y   // 0      d := X   // 0

◮ Threads 2 and 3 observe the writes X := 1 and Y := 1 happening in different orders.
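For comparison, the following sketch enumerates every SC interleaving of IRIW and confirms that the annotated outcome (a = 1, b = 0, c = 1, d = 0) never occurs when all threads share a single order of writes. The function name `sc_outcomes` and the instruction encoding are assumptions of this example.

```python
from itertools import permutations

IRIW = [
    [("write", "X", 1)],
    [("read", "a", "X"), ("read", "b", "Y")],
    [("read", "c", "Y"), ("read", "d", "X")],
    [("write", "Y", 1)],
]

def sc_outcomes(threads):
    """Collect the final register values of all SC interleavings.
    Shared variables are assumed to start at 0."""
    # A schedule is a sequence of thread ids; thread t occurs once per
    # instruction, so program order within each thread is preserved.
    ids = [t for t, prog in enumerate(threads) for _ in prog]
    outcomes = set()
    for schedule in set(permutations(ids)):
        mem, regs, pcs = {}, {}, [0] * len(threads)
        for t in schedule:
            op, dst, src = threads[t][pcs[t]]
            pcs[t] += 1
            if op == "write":
                mem[dst] = src
            else:
                regs[dst] = mem.get(src, 0)
        outcomes.add(tuple(sorted(regs.items())))
    return outcomes

weak = (("a", 1), ("b", 0), ("c", 1), ("d", 0))
print(weak in sc_outcomes(IRIW))  # False: SC forbids the IRIW outcome
```

Forbidding this outcome is exactly what multi-copy atomicity requires: every thread must agree on the order in which the two independent writes became visible.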
Summary
Sequential consistency (SC) is bad.
◮ Thinking about interleavings is considered harmful.
◮ Multi-copy atomicity is fundamentally not scalable.
◮ And it also seems useless in practice.
Release-acquire (RA) is good.
◮ Manual reasoning under RA is clearer.
  – Fully supports local and causal reasoning.
  – Fences document global reasoning.
◮ Automated reasoning under RA is easier.
  – Checking consistency of an execution
  – Bounded model checking