Repairing Sequential Consistency in C/C++11
Ori Lahav1 Viktor Vafeiadis1 Jeehoon Kang2 Chung-Kil Hur2 Derek Dreyer1
1Max Planck Institute for Software Systems (MPI-SWS) 2Seoul National University
PLDI 2017
C11's spectrum of consistency
Example due to Yatin Manerkar et al. [CoRR abs/1611.01507]
◮ Leading sync compilation (implemented in GCC and LLVM)
◮ Placing sync both before and after SC-accesses
[Figure: execution graph for the example, built up over several slides. Event labels: x (0), y (0), x (1), x (1), y (0), y (1), x (0), y (1); edges labeled po, rf, and hb. In some steps the x events are numbered: ① x (0), ② x (0), ③ x (1), ④ x (1).]
[Figure: the execution graph annotated with hb, po, and sc-order edges.]
◮ There are hb-paths between SC-accesses without a sync fence in between.
◮ Both compilation schemes ensure a sync fence on hb-paths between
◮ correctness of existing compilation schemes
◮ Power/ARMv7 (Alglave et al. ’14): leading/trailing sync
◮ x86-TSO: mfence after SC-writes / before SC-reads
◮ soundness of compiler optimizations
◮ DRF theorem (without relaxed accesses)
◮ coincides with C11 when SC-accesses are to distinguished locations
◮ SC-fences, even when placed between every two accesses,
◮ Algorithm designers may have to unnecessarily strengthen
◮ Chase-Lev concurrent deque [Lê et al. ’13]: “unrecoverable
[Figure: execution graph with hb, sc-order, and sc-per-loc edges.]
◮ We prove the correctness of existing compilation schemes
◮ SC-fences between every two accesses suffice to restore
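As a small illustration of SC-fences restoring SC behaviour, the sketch below runs the classic store-buffering test with relaxed accesses and an SC-fence between the two accesses of each thread. The function names `store_buffering_with_sc_fences` and `weak_outcome_observed` are hypothetical and introduced only for this example; the program is not taken from the slides. With the fences in place, C++11 rules out the non-SC outcome in which both loads return 0.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <utility>

// One round of store buffering. All accesses are relaxed; the only
// ordering comes from the SC-fence placed between the two accesses
// of each thread.
std::pair<int, int> store_buffering_with_sc_fences() {
    std::atomic<int> x{0}, y{0};
    int r1 = -1, r2 = -1;
    std::thread t1([&] {
        x.store(1, std::memory_order_relaxed);
        std::atomic_thread_fence(std::memory_order_seq_cst);
        r1 = y.load(std::memory_order_relaxed);
    });
    std::thread t2([&] {
        y.store(1, std::memory_order_relaxed);
        std::atomic_thread_fence(std::memory_order_seq_cst);
        r2 = x.load(std::memory_order_relaxed);
    });
    t1.join();
    t2.join();
    return {r1, r2};
}

// Repeats the test and reports whether the non-SC outcome
// r1 == 0 && r2 == 0 was ever observed.
bool weak_outcome_observed(int rounds) {
    assert(rounds > 0);
    for (int i = 0; i < rounds; ++i) {
        std::pair<int, int> r = store_buffering_with_sc_fences();
        if (r.first == 0 && r.second == 0) return true;
    }
    return false;
}
```

Calling `weak_outcome_observed(200)` should return `false` on a conforming implementation. Removing either fence makes the weak outcome permitted, although a given run may still not exhibit it.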
◮ Values appear out-of-thin-air
◮ DRF is broken

Load-buffering + data dependency
  a := x_rlx; // 1          ||   b := y_rlx; // 1
  y :=_rlx a;               ||   x :=_rlx b;

Load-buffering + control dependency
  a := x_rlx; // 1          ||   b := y_rlx; // 1
  if (a = 1) y :=_rlx 1;    ||   if (b = 1) x :=_rlx 1;
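For readers who want to try this in real code, here is a rough C++11 rendering of the control-dependency variant of load buffering. The function name `load_buffering_ctrl` is hypothetical. The outcome a = b = 1 marked in the slide is the out-of-thin-air one; this sketch makes no claim about whether a particular compiler or machine ever produces it, it only fixes the shape of the program.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <utility>

// Load buffering with a control dependency, using relaxed accesses only.
// Thread 1: a := x; if (a == 1) y := 1
// Thread 2: b := y; if (b == 1) x := 1
std::pair<int, int> load_buffering_ctrl() {
    std::atomic<int> x{0}, y{0};
    int a = -1, b = -1;
    std::thread t1([&] {
        a = x.load(std::memory_order_relaxed);
        if (a == 1) y.store(1, std::memory_order_relaxed);
    });
    std::thread t2([&] {
        b = y.load(std::memory_order_relaxed);
        if (b == 1) x.store(1, std::memory_order_relaxed);
    });
    t1.join();
    t2.join();
    return {a, b};
}
```

The only values ever written are the initial 0 and the constant 1, so each of `a` and `b` is 0 or 1; the question the slide raises is whether a semantics should admit both being 1.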
◮ Require acyclicity of (program order ∪ reads-from)
◮ More expensive compilation:
◮ Hardware models allow (program order ∪ reads-from) cycles
◮ We have to show that such cycles can be untangled to produce a
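To make the (program order ∪ reads-from) cycle concrete, the following sketch encodes the weak outcome of the load-buffering example with a data dependency as a four-event graph and checks it for cycles. The event numbering and the helper names `has_cycle` and `lb_weak_outcome_has_porf_cycle` are assumptions made for this illustration, not part of the original material.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

using Edge = std::pair<int, int>;

// Depth-first search for a cycle in a directed graph on nodes 0..n-1.
bool has_cycle(int n, const std::vector<Edge>& edges) {
    assert(n >= 0);
    std::vector<std::vector<int>> succ(n);
    for (const Edge& e : edges) succ[e.first].push_back(e.second);
    std::vector<int> state(n, 0);  // 0 = unvisited, 1 = on stack, 2 = done
    std::function<bool(int)> on_cycle = [&](int u) {
        state[u] = 1;
        for (int v : succ[u]) {
            if (state[v] == 1) return true;
            if (state[v] == 0 && on_cycle(v)) return true;
        }
        state[u] = 2;
        return false;
    };
    for (int u = 0; u < n; ++u)
        if (state[u] == 0 && on_cycle(u)) return true;
    return false;
}

// Events of the outcome a = b = 1 (numbering is an assumption):
//   0: a := x reads 1    1: y := a writes 1   (thread 1)
//   2: b := y reads 1    3: x := b writes 1   (thread 2)
bool lb_weak_outcome_has_porf_cycle() {
    std::vector<Edge> po = {{0, 1}, {2, 3}};  // program order
    std::vector<Edge> rf = {{1, 2}, {3, 0}};  // reads-from: write -> read
    std::vector<Edge> porf = po;
    porf.insert(porf.end(), rf.begin(), rf.end());
    return has_cycle(4, porf);
}
```

Here `lb_weak_outcome_has_porf_cycle()` returns `true`: the path 0 → 1 → 2 → 3 → 0 alternates po and rf edges, so a model that requires acyclicity of (program order ∪ reads-from) rejects this outcome.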
◮ weaker semantics for SC-accesses
◮ stronger semantics for SC-fences
◮ disallow (program order ∪ reads-from) cycles
◮ correctness of compilation schemes
◮ soundness of compiler optimizations
◮ programming guarantees (DRF, SC-fences can restore SC)
◮ Mechanize our proofs
◮ ARMv8
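\[
\mathit{eco} \;\stackrel{\mathrm{def}}{=}\; (\mathit{rf} \cup \mathit{mo} \cup \mathit{rb})^{+}
\]
\[
\mathit{pohbpo} \;\stackrel{\mathrm{def}}{=}\; \mathit{po}|_{\neq \mathit{loc}} ; \mathit{hb} ; \mathit{po}|_{\neq \mathit{loc}}
\]
\[
\text{RC11} \;\stackrel{\mathrm{def}}{=}\; \mathrm{acyclic}\Big(
  \big([E^{\mathrm{sc}}] \cup [F^{\mathrm{sc}}]; \mathit{hb}^{?}\big) ;
  \big(\mathit{po} \cup \mathit{pohbpo} \cup \mathit{rf} \cup \mathit{mo} \cup \mathit{rb}\big) ;
  \big([E^{\mathrm{sc}}] \cup \mathit{hb}^{?}; [F^{\mathrm{sc}}]\big)
  \;\cup\; [F^{\mathrm{sc}}]; \mathit{hb}^{?}; (\mathit{hb} \cup \mathit{eco}); \mathit{hb}^{?}; [F^{\mathrm{sc}}]
\Big)
\]
\[
\text{Strong-RC11} \;\stackrel{\mathrm{def}}{=}\; \mathrm{acyclic}\Big(
  \big([E^{\mathrm{sc}}] \cup [F^{\mathrm{sc}}]; \mathit{hb}^{?}\big) ;
  \big(\mathit{po} \cup \mathit{pohbpo} \cup \mathit{eco}\big) ;
  \big([E^{\mathrm{sc}}] \cup \mathit{hb}^{?}; [F^{\mathrm{sc}}]\big)
  \;\cup\; [F^{\mathrm{sc}}]; \mathit{hb}^{?}; (\mathit{hb} \cup \mathit{eco}); \mathit{hb}^{?}; [F^{\mathrm{sc}}]
\Big)
\]
\[
\text{Strongest} \;\stackrel{\mathrm{def}}{=}\; \mathrm{acyclic}\Big(
  \big([E^{\mathrm{sc}}] \cup [F^{\mathrm{sc}}]; \mathit{hb}^{?}\big) ;
  (\mathit{hb} \cup \mathit{eco}) ;
  \big([E^{\mathrm{sc}}] \cup \mathit{hb}^{?}; [F^{\mathrm{sc}}]\big)
\Big)
\]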
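The definitions above are built from a handful of relational operators, which can be sketched over finite sets of event pairs. Everything named here (`Rel`, `unite`, `compose`, `transitive_closure`, `acyclic`, `eco`, `example_eco`) is a hypothetical helper for illustration. The sketch also assumes the usual definition rb = rf⁻¹ ; mo, which the slides do not spell out.

```cpp
#include <cassert>
#include <set>
#include <utility>

using Rel = std::set<std::pair<int, int>>;  // a relation on event ids

Rel unite(const Rel& a, const Rel& b) {
    Rel r = a;
    r.insert(b.begin(), b.end());
    return r;
}

Rel inverse(const Rel& a) {
    Rel r;
    for (const auto& p : a) r.emplace(p.second, p.first);
    return r;
}

// Sequential composition a ; b.
Rel compose(const Rel& a, const Rel& b) {
    Rel r;
    for (const auto& p : a)
        for (const auto& q : b)
            if (p.second == q.first) r.emplace(p.first, q.second);
    return r;
}

// r+ : keep adding r ; r until nothing new appears.
Rel transitive_closure(Rel r) {
    for (;;) {
        Rel next = unite(r, compose(r, r));
        if (next.size() == r.size()) return r;
        r = next;
    }
}

// A relation is acyclic iff its transitive closure is irreflexive.
bool acyclic(const Rel& r) {
    for (const auto& p : transitive_closure(r))
        if (p.first == p.second) return false;
    return true;
}

// eco = (rf ∪ mo ∪ rb)+, with rb assumed to be rf^-1 ; mo.
Rel eco(const Rel& rf, const Rel& mo) {
    Rel rb = compose(inverse(rf), mo);
    return transitive_closure(unite(unite(rf, mo), rb));
}

// Tiny execution on one location x (event ids are an assumption):
//   0: initial write x = 0,  1: write x = 1,  2: read of x returning 0
Rel example_eco() {
    Rel rf = {{0, 2}};  // the read takes its value from the initial write
    Rel mo = {{0, 1}};  // the initial write is mo-before the write of 1
    return eco(rf, mo);
}
```

In this execution rb relates the read (2) to the later write (1), so `example_eco()` contains the pairs (0, 1), (0, 2), and (2, 1) and is acyclic. The RC11 conditions apply the same `acyclic` check to larger compositions that also involve po, hb, and the SC-event and SC-fence filters.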
[Figure: execution graphs over x and y, built up over several slides, with hb, po, sc-order, and sc-per-loc edges. Event labels: x (0), y (0), x (1), x (1), y (0), y (1), x (0), y (1).]