SLIDE 1
Parametric Verification of Concurrent Programs under the TSO Weak Memory Model
Ahmed Bouajjani
Paris Diderot University
SynCoP+PV’17, Uppsala, April 22, 2017
Based on joint work with
Parosh A. Abdulla, Mohamed Faouzi Atig, T. Phong Ngo (Uppsala University)
Sebastian Burckhardt, Madan Musuvathi (Microsoft Research)
SLIDE 2 Sequential Consistency
- Concurrent processes with Shared Memory
- Operations: Writes and Reads
- Computations of different processes are shuffled (interleaved)
- Program order is preserved for each process
- Operations are immediately visible to all processes
- => Strong consistency:
- Simple and intuitive model
- Disallows many hardware/compiler optimisations
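The "shuffle" view of sequential consistency can be sketched in a few lines of Python. This is an illustration only: the function name `interleavings` and the string encoding of operations are assumptions made for this example, not notation from the talk.

```python
def interleavings(p, q):
    """Yield every shuffle of the operation sequences p and q that
    preserves the program order inside each sequence."""
    if not p:
        yield list(q)
        return
    if not q:
        yield list(p)
        return
    # Either the next operation of p runs first ...
    for rest in interleavings(p[1:], q):
        yield [p[0]] + rest
    # ... or the next operation of q runs first.
    for rest in interleavings(p, q[1:]):
        yield [q[0]] + rest


if __name__ == "__main__":
    for run in interleavings(["w(x,1)", "r(y)"], ["w(y,1)", "r(x)"]):
        print(run)
```

Under SC, each such run is executed against a single shared memory, so every write is visible to all processes as soon as it is performed.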
SLIDE 5
Weak Memory Models
[Figure: initially x=y=0; operations read(x,0), write(x,1), read(y,0), related by program-order (po) and happens-before (hb) edges]
SC: read(x,0) write(x,1) read(y,0)
SLIDE 6
Weak Memory Models
SC: read(x,0) write(x,1) read(y,0)
TSO: read(x,0) read(y,0) write(x,1)
Relax the Program Order Constraints: Swap operations
SLIDE 7
Weak Memory Models
Relax the Program Order Constraints: Swap operations, Execute in parallel
SLIDE 8 Total Store Ordering
- writes are sent to store buffers (one per process)
- writes are committed to memory at any time
- reads are from the own store buffer if a value exists (last write to the variable), otherwise from the memory
- fences executed when own buffer is empty
[Figure: processes P1 … Pn, each with its own store buffer of pending writes (e.g. w(x,2) w(y,1) w(x,1) and w(y,2)), placed between the processes and the Memory]
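The four rules above can be sketched as a small Python class. This is a minimal illustration, not code from the talk: the class name `TSOMachine`, its method names, and the choice of 0 as the initial value are assumptions made for this example.

```python
from collections import deque


class TSOMachine:
    """Shared memory plus one FIFO store buffer per process."""

    def __init__(self, num_procs, variables):
        self.memory = {v: 0 for v in variables}   # assumed initial value 0
        self.buffers = [deque() for _ in range(num_procs)]

    def write(self, p, var, val):
        # A write is not applied to memory; it is queued in p's buffer.
        self.buffers[p].append((var, val))

    def commit(self, p):
        # At any time, the oldest pending write of p may reach memory.
        var, val = self.buffers[p].popleft()
        self.memory[var] = val

    def read(self, p, var):
        # Own buffer first (latest pending write to var), else memory.
        for v, val in reversed(self.buffers[p]):
            if v == var:
                return val
        return self.memory[var]

    def can_fence(self, p):
        # A fence is enabled only when p's buffer is empty.
        return not self.buffers[p]


if __name__ == "__main__":
    m = TSOMachine(2, ["x", "y"])
    m.write(0, "x", 1)
    print(m.read(0, "x"), m.read(1, "x"))  # process 0 sees its own write
    m.commit(0)
    print(m.read(1, "x"))
```

Until `commit` is called, only the writing process can observe its pending write, which is what makes non-SC behaviours possible.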
SLIDE 9
Non SC Behaviours
Initially x=y=0
P1: write(x,1); read(y,0); CS1
P2: write(y,1); read(x,0); CS2
Can both CS1 and CS2 be reached?
SLIDE 10 Non SC Behaviours
[Figure: the same program, with program-order (po) edges inside each process and happens-before (hb) edges between the two processes]
SLIDE 11 Non SC Behaviours
- writes are delayed: pending in store buffers
- reads get old values in the memory (0’s)
- Impossible under SC
- Possible under TSO!
SLIDE 12 Non SC Behaviours
- => po constraints are relaxed
- => reads can overtake writes
SLIDE 13
TSO: Semantics
[Figure: memory x=0 y=0; processes P1 and P2 with operations w(x,1), r(x,0), r(y,0), w(y,1); a marker (>) shows the current position in each process]
SLIDE 14
TSO: Semantics
[Figure: same program, memory x=0 y=0; the store buffers now hold the pending writes w(x,1) and w(y,1)]
SLIDE 16 Avoiding Reordering: Fences
Initially x=y=0
P1: write(x,1); fence; read(y,0); CS1
P2: write(y,1); fence; read(x,0); CS2
Can both CS1 and CS2 be reached?
- A fence forces flushing the store buffer
- => Reaching both CS1 and CS2 becomes impossible
SLIDE 17 Avoiding Reordering: Fences
SC can be enforced: fence after each write
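The effect of fences on the two-process example can be checked by brute force. The sketch below explores all TSO executions of a small program and collects the values each process read. The function name `explore_tso` and the tuple encoding of operations (`("w", var, val)`, `("r", var)`, `("fence",)`) are assumptions for this illustration. Reads record the value seen rather than being guarded, so "CS1 and CS2 both reached" corresponds to the outcome `((0,), (0,))`.

```python
def explore_tso(programs, variables):
    """Return the set of read outcomes over all TSO executions.

    An outcome is a tuple holding, per process, the tuple of values it read.
    """
    idx = {v: i for i, v in enumerate(variables)}
    n = len(programs)
    start = ((0,) * n, ((),) * n, (0,) * len(variables), ((),) * n)
    seen, todo, outcomes = {start}, [start], set()
    while todo:
        pcs, bufs, mem, reads = todo.pop()
        if all(pcs[p] == len(programs[p]) for p in range(n)):
            outcomes.add(reads)
        succs = []
        for p in range(n):
            if bufs[p]:
                # Commit the oldest pending write of p to memory.
                var, val = bufs[p][0]
                new_mem = mem[:idx[var]] + (val,) + mem[idx[var] + 1:]
                new_bufs = bufs[:p] + (bufs[p][1:],) + bufs[p + 1:]
                succs.append((pcs, new_bufs, new_mem, reads))
            if pcs[p] == len(programs[p]):
                continue
            op = programs[p][pcs[p]]
            new_pcs = pcs[:p] + (pcs[p] + 1,) + pcs[p + 1:]
            if op[0] == "w":
                # Writes go to the tail of p's store buffer.
                new_bufs = bufs[:p] + (bufs[p] + ((op[1], op[2]),),) + bufs[p + 1:]
                succs.append((new_pcs, new_bufs, mem, reads))
            elif op[0] == "r":
                # Own buffer first (latest write to the variable), else memory.
                pending = [val for var, val in bufs[p] if var == op[1]]
                value = pending[-1] if pending else mem[idx[op[1]]]
                new_reads = reads[:p] + (reads[p] + (value,),) + reads[p + 1:]
                succs.append((new_pcs, bufs, mem, new_reads))
            elif op[0] == "fence" and not bufs[p]:
                # A fence can only be passed with an empty buffer.
                succs.append((new_pcs, bufs, mem, reads))
        for s in succs:
            if s not in seen:
                seen.add(s)
                todo.append(s)
    return outcomes


BOTH_IN_CS = ((0,), (0,))
plain = [[("w", "x", 1), ("r", "y")], [("w", "y", 1), ("r", "x")]]
fenced = [[("w", "x", 1), ("fence",), ("r", "y")],
          [("w", "y", 1), ("fence",), ("r", "x")]]

if __name__ == "__main__":
    print(BOTH_IN_CS in explore_tso(plain, ["x", "y"]))
    print(BOTH_IN_CS in explore_tso(fenced, ["x", "y"]))
```

Since every write in `fenced` is followed by a fence, that variant also serves as a stand-in for the SC behaviour of the program, in line with the remark that a fence after each write enforces SC.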
SLIDE 18
Safety/Reachability Verification Problems
[Figure: processes P1 … Pn, each with a store buffer of m slots (1 … m)]
- Safety: for every n, for every m, [ P1 || … || Pn ]TSO(m) satisfies Always (Safe)
- Reachability: there is n, there is m, [ P1 || … || Pn ]TSO(m) satisfies Reachable (Not Safe)
SLIDE 19
First step: Let us fix the number of processes
[Figure: processes P1 … Pn, each with a store buffer of m slots (1 … m)]
- Safety: for every m, [ P1 || … || Pn ]TSO(m) satisfies Always (Safe)
- Reachability: there is m, [ P1 || … || Pn ]TSO(m) satisfies Reachable (Not Safe)
SLIDE 20
First step: Let us fix the number of processes
Consider Unbounded Store Buffers
[Figure: processes P1 … Pn, each with a store buffer of m slots (1 … m)]
there is m, [ P1 || … || Pn ]TSO(m) satisfies Reachable (Not Safe)
<=>
[ P1 || … || Pn ]TSO(∞) satisfies Reachable (Not Safe)
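The problem statements of the last three slides can be written out in LaTeX as follows. This is a transcription sketch: it assumes that TSO(m) denotes the semantics with store buffers bounded by m, and the block needs the `amsmath` package.

```latex
\begin{align*}
\text{Safety:}\quad
  & \forall n\ \forall m.\ [P_1 \parallel \dots \parallel P_n]_{\mathrm{TSO}(m)}
    \models \mathit{Always}(\mathit{Safe}) \\
\text{Reachability:}\quad
  & \exists n\ \exists m.\ [P_1 \parallel \dots \parallel P_n]_{\mathrm{TSO}(m)}
    \models \mathit{Reachable}(\neg\mathit{Safe}) \\
\text{Fixed } n\text{:}\quad
  & \exists m.\ [P_1 \parallel \dots \parallel P_n]_{\mathrm{TSO}(m)}
    \models \mathit{Reachable}(\neg\mathit{Safe}) \\
  & \iff\ [P_1 \parallel \dots \parallel P_n]_{\mathrm{TSO}(\infty)}
    \models \mathit{Reachable}(\neg\mathit{Safe})
\end{align*}
```

Under that reading the equivalence is straightforward. A run that reaches a Not Safe state is finite, so it uses only finitely many buffer slots. Conversely, every run with bounded buffers is also a run with unbounded ones.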
SLIDE 21 Reachability Problem for a given number of processes: Decidability, Complexity
Assume that processes are finite state
Under SC, the control state reachability problem is
- PSPACE-complete, for a fixed number of processes
- EXPSPACE-complete, for the parametric case
SLIDE 22 Reachability Problem for a given number of processes: Decidability, Complexity
What about the TSO(∞) reachability?
Store buffers are unbounded perfect FIFO queues!!
SLIDE 23 Reachability Problem for a given number of processes: Decidability, Complexity
What about the parametric TSO(∞) reachability?
SLIDE 24 Reachability Problem for TSO programs: Results
- The TSO reachability problem is decidable
SLIDE 25 Reachability Problem for TSO programs: Results
- The TSO reachability problem is decidable
- … but it is highly complex (non-primitive recursive)
Reduction to/from reachability in lossy channel systems
[Atig, B., Burckhardt, Musuvathi, POPL’10]
SLIDE 26 Reachability Problem for TSO programs: Results
- The parametric TSO reachability problem is decidable
- A dual semantics for TSO
- Monotonic system w.r.t. WQO
- Simpler and more efficient reduction
[Abdulla, Atig, B., Ngo, CONCUR’16]
SLIDE 28
An example of TSO program
Initially x=y=0
P1: w(x,1); w(y,1); w(x,2)
P2: r(x,2); r(y,0)
Memory: x=0 y=0. TSO store buffer of P1: empty
SLIDE 29
An example of TSO program
Memory: x=0 y=0. TSO store buffer of P1: w(x,1) w(y,1) w(x,2)
SLIDE 30
An example of TSO program
Memory: x=1 y=0
SLIDE 31
An example of TSO program
Memory: x=1 y=1
SLIDE 32
An example of TSO program
Memory: x=2 y=1
SLIDE 34
An example of TSO program
Memory: x=2 y=1. P2 cannot execute r(y,0) (X)
Deadlock under the TSO semantics
SLIDE 35
TSO Store Buffers —> Lossy Channels ?
Same program: P1: w(x,1); w(y,1); w(x,2) and P2: r(x,2); r(y,0), initially x=y=0
Memory: x=0 y=0. Lossy FIFO channel of P1: w(x,1) w(y,1) w(x,2)
SLIDE 36
TSO Store Buffers —> Lossy Channels ?
Memory: x=1 y=0
SLIDE 38
TSO Store Buffers —> Lossy Channels ?
Memory: x=2 y=0 (w(y,1) has been lost)
SLIDE 40
TSO Store Buffers —> Lossy Channels ?
Memory: x=2 y=0, so P2 can execute r(x,2) and then r(y,0)
Unsound simulation of TSO!
SLIDE 41
Store Memory Snapshots
Same program: P1: w(x,1); w(y,1); w(x,2) and P2: r(x,2); r(y,0), initially x=y=0
Memory: x=0 y=0. Channel of P1 (Future Snapshots of the Memory): empty
SLIDE 42
Store Memory Snapshots
Memory: x=0 y=0. Future Snapshots of the Memory: (x=1 y=0)
SLIDE 43
Store Memory Snapshots
Memory: x=0 y=0. Future Snapshots of the Memory: (x=1 y=0) (x=1 y=1)
SLIDE 44
Store Memory Snapshots
Memory: x=0 y=0. Future Snapshots of the Memory: (x=1 y=0) (x=1 y=1) (x=2 y=1)
SLIDE 45
Store Memory Snapshots
Memory: x=1 y=0
SLIDE 46
Store Memory Snapshots with Losses
+ Lossyness
Memory: x=1 y=0
SLIDE 47
Store Memory Snapshots with Losses
+ Lossyness
Memory: x=2 y=1
SLIDE 49
Store Memory Snapshots with Losses
+ Lossyness
Memory: x=2 y=1. P2 cannot execute r(y,0) (X)
Valid Simulation of TSO
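The difference between losing writes and losing snapshots can be checked on the running example with a short Python sketch. It is a deliberate simplification, and all function names are assumptions for this illustration: only the single writer P1 is modelled, and each function returns the set of memory states that can ever become visible, not full runs.

```python
from itertools import combinations


def subsequences(items):
    """All sub-sequences of items, i.e. every way messages can be lost."""
    for k in range(len(items) + 1):
        for idxs in combinations(range(len(items)), k):
            yield [items[i] for i in idxs]


def apply_write(mem, write):
    var, val = write
    new = dict(mem)
    new[var] = val
    return new


def freeze(mem):
    return tuple(sorted(mem.items()))


def tso_states(init, writes):
    """Perfect FIFO buffer: writes reach memory in order, none is lost."""
    states, mem = {freeze(init)}, dict(init)
    for w in writes:
        mem = apply_write(mem, w)
        states.add(freeze(mem))
    return states


def lossy_write_states(init, writes):
    """Lossy channel of writes: any subset of the writes may vanish."""
    states = set()
    for kept in subsequences(writes):
        mem = dict(init)
        states.add(freeze(mem))
        for w in kept:
            mem = apply_write(mem, w)
            states.add(freeze(mem))
    return states


def lossy_snapshot_states(init, writes):
    """Lossy channel of snapshots: each message is a whole memory state."""
    snapshots, mem = [], dict(init)
    for w in writes:
        mem = apply_write(mem, w)
        snapshots.append(mem)
    states = {freeze(init)}
    for kept in subsequences(snapshots):
        states.update(freeze(s) for s in kept)
    return states


INIT = {"x": 0, "y": 0}
WRITES = [("x", 1), ("y", 1), ("x", 2)]
BAD = freeze({"x": 2, "y": 0})  # lets P2 execute r(x,2) and then r(y,0)

if __name__ == "__main__":
    print(BAD in lossy_write_states(INIT, WRITES))
    print(lossy_snapshot_states(INIT, WRITES) == tso_states(INIT, WRITES))
```

Losing a snapshot only skips an intermediate memory state, whereas losing a write produces a state such as x=2 y=0 that no TSO run can reach.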
SLIDE 50 From TSO to Lossy Channel Systems
- 1-channel machine per process + composition
SLIDE 51 From TSO to Lossy Channel Systems
- Each process:
- write: puts a new memory state at the tail of the channel
- read: checks the channel, then the memory
- memory update: moves the head of the channel to the memory
SLIDE 52 From TSO to Lossy Channel Systems
Problem: Interferences between processes?
Processes must agree on the same order of memory updates
SLIDE 53 From TSO to Lossy Channel Systems
Problem: Interferences between processes?
Processes must agree on the same order of memory updates
- Each process guesses the writes of the other processes and puts them in its channel
- Validation of the guesses by composition:
- transitions are labelled by write operations + process id
- machines are synchronized on these actions
SLIDE 54 From Lossy Channel Systems to TSO programs
[Figure: P1 and P2 communicate through the shared variables x and y; edge labels write(x,-), read(y,-), write(x,d), read(y,d)]
- P1 simulates an LCS with one channel using x and y:
- send(m) —> write(x, m)
- receive(m) —> read(y, m)
- P2 forwards values from x to y
SLIDE 55 From Lossy Channel Systems to TSO programs
P2 can miss some values
SLIDE 56
Reachability for TSO programs
Thm: The control state reachability problem under TSO is reducible to the reachability problem in lossy channel systems, and vice-versa.
[Atig, B., Burckhardt, Musuvathi, 2010]
SLIDE 57
Reachability for TSO programs
Coro: The control state reachability problem under TSO is decidable, and it is non-primitive recursive.
using [Abdulla & Jonsson 1993, Abdulla et al. 1996, Finkel & Schnoebelen 2001, Schnoebelen 2001]
SLIDE 58
Well …
The complexity is high!
SLIDE 59
… but this is not the main problem
SLIDE 60 Well …
The proposed encoding of TSO programs as LCS’s
- Cannot be extended to the parametric case
- Is not practical:
- it requires handling memory snapshots
- it manipulates process id’s
SLIDE 61 Well …
=> We need to change our angle of view…
SLIDE 62 Dual TSO
[Figure: processes P1 … Pn, each with a load buffer of pending reads (e.g. r(x,0) r(y,1) r(x,1) r(y,3) and r(y,1) r(y,3) r(x,2)), fed by the Memory (x=2 y=3)]
- Store Buffers —> Load Buffers
- Writes immediately update the Memory
- Reads are sent by the memory to processes
- Reads can be skipped by processes (Load Buffers are lossy)
- => One sequence of memory updates (order of writes)
- => Buffers contain expected reads by processes
- => Buffers represent a “(sub)history” of the memory updates
SLIDE 64
Dual TSO: Semantics
[Figure: memory x=0 y=0; processes P1 and P2 with operations w(x,1), r(x,0), r(y,0), w(y,1); a marker (>) shows the current position in each process]
SLIDE 65
Dual TSO: Semantics
[Figure: same program, memory x=0 y=0; the load buffers now hold the reads r(x,0) and r(y,0) sent by the memory]
SLIDE 66
Dual TSO: Semantics
[Figure: same program; the writes have updated the memory to x=1 y=1, while r(x,0) and r(y,0) are still in the load buffers]
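The scenario of these slides can be replayed with a deliberately simplified Python sketch of the dual semantics. Everything here is an assumption made for illustration: the class and method names, the assignment of operations to processes (P1: w(x,1); r(y,0) and P2: w(y,1); r(x,0)), and the rule that a read is enabled only when a matching message sits at the head of the load buffer. How a process reads its own earlier writes is not modelled.

```python
from collections import deque


class DualTSOSketch:
    """Shared memory plus one lossy load buffer per process (simplified)."""

    def __init__(self, num_procs, variables):
        self.memory = {v: 0 for v in variables}   # assumed initial value 0
        self.load_buffers = [deque() for _ in range(num_procs)]

    def propagate(self, p, var):
        # The memory sends the current value of var to p's load buffer.
        self.load_buffers[p].append((var, self.memory[var]))

    def write(self, p, var, val):
        # Writes update the memory immediately.
        self.memory[var] = val

    def skip(self, p):
        # Lossiness: p may drop the oldest message of its load buffer.
        self.load_buffers[p].popleft()

    def read(self, p, var, expected):
        # Enabled only if the oldest buffered message is (var, expected).
        buf = self.load_buffers[p]
        return bool(buf) and buf[0] == (var, expected)


if __name__ == "__main__":
    m = DualTSOSketch(2, ["x", "y"])
    m.propagate(0, "y")   # P1 is sent r(y,0)
    m.propagate(1, "x")   # P2 is sent r(x,0)
    m.write(0, "x", 1)    # both writes hit the memory at once
    m.write(1, "y", 1)
    print(m.memory, m.read(0, "y", 0), m.read(1, "x", 0))
```

Both processes still read the old value 0 although the memory already holds x=1 y=1, which reproduces the non-SC behaviour of TSO without any store buffer.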
SLIDE 68
Dual TSO ~ TSO
Thm: The Dual TSO semantics is equivalent to the TSO semantics with respect to the reachability problem.
SLIDE 69 Parameterized Verification
Given n≥1, a configuration of size n is:
- q1, …, qn, control states of P_1, …, P_n
- B1, …, Bn, contents of the load buffers of P_1, …, P_n
- Mem, the memory state
WQO ≤ between configurations of sizes n and m:
- same memory state
- there exists an injective function h: [n] —> [m] s.t.
- same control state, for each P_i and P’_h(i)
- sub-word relation on load buffers, for each P_i and P’_h(i)
Thm: Parameterized Dual TSO systems are monotonic w.r.t. ≤
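The ordering ≤ can be checked directly from its definition. The sketch below assumes a hypothetical encoding chosen for this example: a configuration is a pair `(memory, processes)`, where `processes` is a list of `(control_state, load_buffer)` pairs. The injective function h is found by backtracking.

```python
def is_subword(small, big):
    """True if small can be obtained from big by deleting elements."""
    it = iter(big)
    # Each element of small must be found, in order, in the rest of big.
    return all(any(a == b for b in it) for a in small)


def config_leq(c1, c2):
    """Check c1 <= c2: same memory and an injective, state-preserving
    mapping of c1's processes into c2's with sub-word load buffers."""
    mem1, procs1 = c1
    mem2, procs2 = c2
    if mem1 != mem2:
        return False
    used = [False] * len(procs2)

    def assign(i):
        if i == len(procs1):
            return True
        state, buf = procs1[i]
        for j, (state2, buf2) in enumerate(procs2):
            if not used[j] and state == state2 and is_subword(buf, buf2):
                used[j] = True        # h(i) = j
                if assign(i + 1):
                    return True
                used[j] = False       # backtrack
        return False

    return assign(0)


MEM = (("x", 2), ("y", 3))
SMALL = (MEM, [("q0", ["r(x,0)"])])
BIG = (MEM, [("q1", ["r(y,1)"]), ("q0", ["r(y,0)", "r(x,0)"])])

if __name__ == "__main__":
    print(config_leq(SMALL, BIG), config_leq(BIG, SMALL))
```

Monotonicity with respect to this ordering means that a larger configuration (more processes, longer load buffers) can simulate every step of a smaller one.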
SLIDE 70 Comparing the two encodings
Dual TSO:
- No memory snapshot
- No reference to Process Id’s
- Applicable to Parametric Verification
- More efficient verification algorithm
SLIDE 71
Experimental Results: Dual-TSO vs Memorax
SLIDE 72
Experimental Results: Parameterised Case
SLIDE 73
Scalability: Dual-TSO vs Memorax
SLIDE 74 Conclusion
- Verification under WMM’s is hard
- Decidability for (relatively strong) models such as TSO
- High complexity, but practical approaches are possible
- Duality —> simple, general, and efficient decision procedure
SLIDE 75 Conclusion
- Verification under WMM’s is hard
- Decidability for (relatively strong) models such as TSO
- High complexity, but practical approaches are possible
- Duality —> simple, general, and efficient decision procedure
- Undecidability for more complex models (RMO, Power)
- Under/over-approximate analyses are needed
E.g., context-bounded analysis for TSO [Atig, B., Parlato, CAV 2011], context-bounded analysis for Power [Abdulla, Atig, B., Ngo, TACAS 2017]
- Extension to other models ?
- Hardware/Programming Languages models ?
- Related to Consistency Criteria
SLIDE 76 Conclusion
- Related to Consistency Criteria in concurrent/distributed systems