
Dynamic Synthesis for Relaxed Memory Models
Feng Liu*, Nayden Nedev*, Nedyalko Prisadnikov†, Martin Vechev§, Eran Yahav‡
*Princeton University, †Sofia University, §ETH Zurich, ‡Technion
06/13/2012, PLDI 2012, Beijing

Relaxed memory models
• No sequential consistency (SC) in chips today
• Chip designers implement “relaxed memory” on different architectures:
  – total store order (TSO): Intel’s and AMD’s x86; SPARC
  – partial store order (PSO): SPARC
  – PPC model: IBM’s PowerPC; ARM
  – …

Modeling TSO & PSO Programs
• Store Buffering
  – FIFO queues (buffers) associated with threads
  – A store goes to a local buffer, not memory
  – Stores in buffers are flushed at non-deterministic times
• Store Forwarding
  – Satisfy loads from local buffer if possible
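The buffering and forwarding rules above can be sketched as a tiny single-process simulation. This is a minimal sketch, not the tool's implementation: the names `Machine`, `store`, `load`, and `flush_one` are hypothetical, and the layout of one FIFO buffer per (thread, variable) pair is an assumption modeled on the PSO diagrams in the following slides.

```c
#include <assert.h>
#include <string.h>

#define NTHREADS 2
#define NVARS    2
#define BUFCAP   8

enum { VAR_H = 0, VAR_DONE = 1 };

/* Assumed PSO flavour: one FIFO store buffer per (thread, variable) pair. */
typedef struct {
    int vals[BUFCAP];
    int head, len;
} Fifo;

typedef struct {
    int  memory[NVARS];          /* shared main memory */
    Fifo buf[NTHREADS][NVARS];   /* per-thread store buffers */
} Machine;

/* All variables start at 0 and all buffers start empty. */
void machine_init(Machine *m) { memset(m, 0, sizeof *m); }

/* Store buffering: the value goes to the local buffer, not to memory. */
void store(Machine *m, int t, int var, int val) {
    Fifo *f = &m->buf[t][var];
    assert(f->len < BUFCAP);
    f->vals[(f->head + f->len) % BUFCAP] = val;
    f->len++;
}

/* Flush the oldest buffered store of thread t to var.
   Returns 0 if that buffer was empty, 1 otherwise. */
int flush_one(Machine *m, int t, int var) {
    Fifo *f = &m->buf[t][var];
    if (f->len == 0) return 0;
    m->memory[var] = f->vals[f->head];
    f->head = (f->head + 1) % BUFCAP;
    f->len--;
    return 1;
}

/* Store forwarding: a load sees the thread's own newest buffered
   store to var if there is one, and main memory otherwise. */
int load(const Machine *m, int t, int var) {
    const Fifo *f = &m->buf[t][var];
    if (f->len > 0)
        return f->vals[(f->head + f->len - 1) % BUFCAP];
    return m->memory[var];
}
```

A caller decides when to invoke `flush_one`, which stands in for the non-deterministic flush times. Flushing thread 0's buffer for `VAR_DONE` before its buffer for `VAR_H` reproduces the reordering that PSO permits.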

PSO Example
Initially: H=0, Done=0

  thread1             thread2
  H=1;                while (!Done) { }
  Done=1;             assert(H==1);

Fails on PSO.
[Diagram: t1 buffers the stores H=1 and Done=1. Done=1 is flushed to main memory while main memory still holds H=0, so thread2's load of H can return 0.]

Memory Fences
Initially: H=0, Done=0

  thread1             thread2
  H=1;                while (!Done) { }
  Fence;              assert(H==1);
  Done=1;

A memory fence is expensive (10-100 cycles), so use one only where necessary.
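In portable C11, the fenced program might look like the sketch below. Mapping the slide's `Fence` to `atomic_thread_fence(memory_order_seq_cst)` is an assumption, as are the function names `thread1` and `thread2`. The acquire fence in `thread2` does not appear on the slide; the C11 language model requires it even though the hardware-level PSO argument only needs the fence between the two stores.

```c
#include <stdatomic.h>

/* Shared flags from the slide, both initially 0. */
static atomic_int H = 0;
static atomic_int Done = 0;

/* thread1: publish H, then signal completion through Done. */
void thread1(void) {
    atomic_store_explicit(&H, 1, memory_order_relaxed);
    /* Full fence: the store to H may not be reordered after the store to Done. */
    atomic_thread_fence(memory_order_seq_cst);
    atomic_store_explicit(&Done, 1, memory_order_relaxed);
}

/* thread2: spin until Done is set, then return the value of H.
   With the fences in place the result is 1. */
int thread2(void) {
    while (!atomic_load_explicit(&Done, memory_order_relaxed)) {
        /* busy-wait */
    }
    /* Pairs with the fence in thread1 (an addition required by the C11 model). */
    atomic_thread_fence(memory_order_acquire);
    return atomic_load_explicit(&H, memory_order_relaxed);
}
```

Running `thread1` and `thread2` on separate threads, `thread2` should always return 1. Removing the fence in `thread1` reintroduces the reordering shown in the PSO example.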

Our Approach
• Inputs: C/C++ program P, specification S, memory model M
• FENDER: dynamic analysis & static fixing
• Output: program P’ with fences, such that P’ satisfies S under M

Challenge: Handling real-world concurrent programs
A lock-free memory allocator [1]
• 771 lines of C code
• 2699 lines of IR code
[1] M. Michael, “Scalable lock-free dynamic memory allocation,” PLDI’04.

Real-World Programs?
• Exposing violations under relaxed memory models
  – Violations occur rarely
• Many possible fence placements
  – Large programs
• Written in C/C++ language
  – Rather than program models

Contributions
• Demonic scheduler to expose violations
  – Delay flushes of values from store buffer to main memory
• Avoiding bad executions by adding fences
  – Extracting ordering constraints from bad executions
  – Enforcing ordering constraints using fences
• Parametric synthesis framework
  – Different memory models
• Evaluating fences required under different memory models and correctness criteria
  – Found redundant and missing fences
  – Linearizability on relaxed memory models
  – Handled real C/C++ programs

Fender Framework – Support for concurrency and RMM
[Diagram: concurrent client C/C++ code -> LLVM-GCC -> .bc -> LLVM Interpreter. Components attached to the interpreter: Threading, Demonic Scheduler, Memory Model. Legend: our extension, existing work.]

Our work – Dynamic analysis
[Diagram: concurrent client C/C++ code -> LLVM-GCC -> .bc -> LLVM Interpreter (with Threading, Demonic Scheduler, Memory Model, Specification) -> trace -> Trace Analysis -> order formula -> SAT Solver -> SAT assignment. Legend: our extension, existing work.]

Our work – Implement memory fences
[Diagram: concurrent client C/C++ code -> LLVM-GCC -> .bc -> LLVM Interpreter (with Threading, Demonic Scheduler, Memory Model, Specification) -> trace -> Trace Analysis -> order formula -> SAT Solver -> SAT assignment -> Fence Enforcement -> modified .bc. Outputs: fixed bytecode and a fence location report. Legend: our extension, existing work.]

Example
Initially: H=0, Done=0

  thread1             thread2
  L1: H=1;            L3: while (!Done) { }
  L2: Done=1;         L4: assert(H==1);

[Diagram: store buffers for H and Done in t1 and t2, all empty, and main memory.]

Interpretation on PSO
Initially: H=0, Done=0

  thread1             thread2
  L1: H=1;            L3: while (!Done) { }
  L2: Done=1;         L4: assert(H==1);

Instructions shown: L3: Load Done; L1: Store H=1; L2: Store Done=1; L4: Load H
[Diagram: t1's buffers hold H=1 (from L1) and Done=1 (from L2); t2's buffers are empty. Legend: load, store, flush.]

Interpretation on PSO (continued)
Initially: H=0, Done=0

  thread1             thread2
  L1: H=1;            L3: while (!Done) { }
  L2: Done=1;         L4: assert(H==1);

[Diagram: animation step. The trace contains L3; all store buffers are empty. Legend: load, store, flush.]

Flush with a probability
Initially: H=0, Done=0

  thread1             thread2
  L1: H=1;            L3: while (!Done) { }
  L2: Done=1;         L4: assert(H==1);

[Diagram: animation step. H=1 (from L1) sits in t1's buffer; the other buffers are empty. Legend: load, store, flush.]
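A probabilistic flush decision of this kind could be sketched as follows. This is an illustration only: the function names, the `flush_prob` parameter, and the use of `rand()` as the random source are assumptions, not the scheduler's actual interface. Lower probabilities keep stores buffered longer, which is how a demonic scheduler delays flushes to expose violations.

```c
#include <assert.h>
#include <stdlib.h>

/* Decide whether a pending buffered store is flushed at this scheduling step.
   flush_prob is a probability in [0, 1] (hypothetical parameter). */
int should_flush(double flush_prob) {
    assert(flush_prob >= 0.0 && flush_prob <= 1.0);
    /* r is uniformly distributed in [0, 1). */
    double r = (double)rand() / ((double)RAND_MAX + 1.0);
    return r < flush_prob;
}

/* Count how many scheduling steps a store stays buffered before it is
   flushed, giving up after max_steps steps. */
int steps_until_flush(double flush_prob, int max_steps) {
    int steps = 0;
    while (steps < max_steps && !should_flush(flush_prob))
        steps++;
    return steps;
}
```

With `flush_prob` equal to 1 every store is flushed at the first opportunity, which approximates sequentially consistent behavior for this example. With `flush_prob` equal to 0 stores are never flushed by the scheduler.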

Execution trace
Initially: H=0, Done=0

  thread1             thread2
  L1: H=1;            L3: while (!Done) { }
  L2: Done=1;         L4: assert(H==1);

[Diagram: animation step. The recorded trace now includes L4; H=1 (from L1) is still in t1's buffer. Legend: load, store, flush.]

Checking Specification
[Diagram: traces of different executions, each a sequence of load, store, and flush events.]

Repair one trace
[Diagram: the trace of one execution, with t1 buffering H=1 (from L1) and Done=1 (from L2).]
• Order predicate: [L1, L2]
• Order formula for a single execution: [L1, L2] ∨ [C, D] ∨ …, written with Boolean variables as x1 ∨ x2 ∨ …
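One way to read the order predicates off a trace is sketched below. It is a simplification under stated assumptions, not the tool's algorithm: the `Event` and `OrderPred` types and the function `order_predicates` are hypothetical, each label is assumed to execute at most once in the trace, and a predicate [A, B] is emitted whenever the store at A is still buffered while a later instruction B of the same thread executes.

```c
#include <assert.h>

typedef enum { EV_STORE, EV_LOAD, EV_FLUSH } Kind;

typedef struct {
    int  thread;  /* thread that executed (or owns) the event */
    int  label;   /* program label, e.g. 1 for L1; for EV_FLUSH, the
                     label of the store being flushed */
    Kind kind;
} Event;

typedef struct { int first, second; } OrderPred;  /* the predicate [first, second] */

/* Returns 1 if [a, b] is already among the first n predicates. */
static int contains(const OrderPred *ps, int n, int a, int b) {
    for (int i = 0; i < n; i++)
        if (ps[i].first == a && ps[i].second == b) return 1;
    return 0;
}

/* For every store that is still buffered when a later instruction of the
   same thread executes, record [store label, later label].
   Assumes each label occurs at most once in the trace.
   Returns the number of distinct predicates written to out. */
int order_predicates(const Event *trace, int n, OrderPred *out, int cap) {
    int count = 0;
    for (int i = 0; i < n; i++) {
        if (trace[i].kind != EV_STORE) continue;
        for (int j = i + 1; j < n; j++) {
            const Event *e = &trace[j];
            if (e->thread != trace[i].thread) continue;
            if (e->kind == EV_FLUSH) {
                if (e->label == trace[i].label) break;  /* store i reached memory */
                continue;                               /* flush of another store */
            }
            if (!contains(out, count, trace[i].label, e->label)) {
                assert(count < cap);
                out[count].first = trace[i].label;
                out[count].second = e->label;
                count++;
            }
        }
    }
    return count;
}
```

For the failing PSO run (store L1, store L2, flush of L2, load L3, load L4) this yields the single predicate [L1, L2], matching the slide. The disjunction of all predicates found for one trace is that trace's order formula.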

Repair all incorrect traces
[Diagram: traces of different executions (trace1, trace2, trace3); one memory fence should be placed at a point common to the incorrect traces.]
Global formula to SAT solver: (x1 ∨ x2 ∨ …) ∧ (x1 ∨ x3 ∨ …) ∧ …, with one clause per incorrect trace (here trace1 and trace3)
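The global formula contains only positive literals, so a small instance can be solved by plain enumeration. The sketch below swaps brute-force search in for the SAT solver and assumes the goal is the assignment that enforces the fewest predicates; the function name `min_assignment` and the bitmask encoding are hypothetical.

```c
#include <assert.h>

/* Number of set bits in v. */
static int count_bits(unsigned v) {
    int c = 0;
    while (v) { c += (int)(v & 1u); v >>= 1; }
    return c;
}

/* Each clause is a bitmask: bit i set means predicate x_(i+1) occurs in
   that trace's disjunction. Enumerates all assignments over nvars
   predicates (nvars <= 16) and stores in *out the satisfying assignment
   with the fewest predicates set to true.
   Returns the number of predicates in *out, or -1 if unsatisfiable. */
int min_assignment(const unsigned *clauses, int nclauses, int nvars, unsigned *out) {
    assert(nvars >= 0 && nvars <= 16);
    int best_size = -1;
    for (unsigned a = 0; a < (1u << nvars); a++) {
        int ok = 1;
        for (int i = 0; i < nclauses; i++) {
            if ((clauses[i] & a) == 0) { ok = 0; break; }  /* clause i is false */
        }
        if (!ok) continue;
        int size = count_bits(a);
        if (best_size < 0 || size < best_size) {
            best_size = size;
            *out = a;
        }
    }
    return best_size;
}
```

For the clauses (x1 ∨ x2) and (x1 ∨ x3), encoded as the bitmasks 0x3 and 0x5, the result sets only x1. That corresponds to the single fence shared by the incorrect traces on the slide.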

Fix the program
Initially: H=0, Done=0

  thread1             thread2
  L1: H=1;            L3: while (!Done) { }
  Fence;              L4: assert(H==1);
  L2: Done=1;

Evaluation - Benchmarks

                                     Memory        Operational seq.   Lineariz-
                                     safety        consistency        ability
Program                              TSO   PSO     TSO   PSO          TSO   PSO
Work stealing queues
  Chase-Lev WSQ                      0     0       1     2            2     3
  Cilk’s THE WSQ                     0     0       1     3            -     -
  FIFO WSQ                           0     0       0     2            1     2
  LIFO WSQ                           0     0       0     1            0     1
  Anchor WSQ                         0     0       0     1            0     1
Idempotent work stealing queues
  FIFO iWSQ                          0     3       -     -            -     -
  LIFO iWSQ                          0     2       -     -            -     -
  Anchor iWSQ                        0     2       -     -            -     -
Concurrent data structures
  MS2 Queue                          0     0       0     0            0     0
  MSN Queue                          0     0       0     1            0     1
  LazyList Set                       0     0       0     0            0     0
  Harris’s Set                       0     0       0     1            0     1
Lock-free memory allocator
  Memory allocator                   0     3       0     4            0     4

Evaluation - Specifications (same table as on the Benchmarks slide)

Evaluation - Memory models (same table as on the Benchmarks slide)

Evaluation - number of memory fences (same table as on the Benchmarks slide)

Conclusion
• Demonic scheduler to expose violations
• Avoiding bad executions by adding fences
• Parametric synthesis framework
• Evaluating fences required under different memory models and correctness criteria

Thanks! Q & A
