
Relaxed memory models: no sequential consistency (SC) in chips today

Dynamic Synthesis for Relaxed Memory Models. Feng Liu*, Nayden Nedev*, Nedyalko Prisadnikov†, Martin Vechev§, Eran Yahav‡ (*Princeton University, †Sofia University, §ETH Zurich, ‡Technion). PLDI 2012, Beijing, 06/13/2012.


  1. Dynamic Synthesis for Relaxed Memory Models
     Feng Liu*, Nayden Nedev*, Nedyalko Prisadnikov†, Martin Vechev§, Eran Yahav‡
     *Princeton University, †Sofia University, §ETH Zurich, ‡Technion
     06/13/2012, PLDI 2012, Beijing

  2. Relaxed memory models
     • No sequential consistency (SC) in chips today
     • Chip designers implement “relaxed memory” on different architectures:
       - total store order (TSO): Intel’s and AMD’s x86; SPARC
       - partial store order (PSO): SPARC
       - PPC model: IBM’s PowerPC; ARM
       - …

  3. Modeling TSO & PSO Programs
     • Store Buffering
       – FIFO queues (buffers) associated with threads
       – A store goes to a local buffer, not memory
       – Stores in buffers are flushed at non-deterministic times
     • Store Forwarding
       – Satisfy loads from local buffer if possible
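The store-buffering and store-forwarding rules on this slide can be sketched as a small model. This is only an illustration under stated assumptions: `Memory`, `StoreBuffer`, `store`, `load`, and `flush_one` are hypothetical names, not Fender's API, and variables that were never written are assumed to read as 0. The buffer is a single per-thread FIFO, which corresponds to the TSO flavor.

```cpp
#include <cassert>
#include <deque>
#include <map>
#include <string>
#include <utility>

// Hypothetical main memory: variable name -> value (unwritten variables read as 0).
struct Memory {
    std::map<std::string, int> cells;
};

// Hypothetical per-thread FIFO store buffer (TSO-style: one queue per thread).
class StoreBuffer {
public:
    // A store goes to the local buffer, not to main memory.
    void store(const std::string& var, int value) {
        pending_.emplace_back(var, value);
    }

    // Store forwarding: the newest buffered value for `var` satisfies the load;
    // only if the buffer has no entry for `var` do we read main memory.
    int load(const std::string& var, const Memory& mem) const {
        for (auto it = pending_.rbegin(); it != pending_.rend(); ++it)
            if (it->first == var) return it->second;
        auto found = mem.cells.find(var);
        return found == mem.cells.end() ? 0 : found->second;
    }

    // Flush the oldest buffered store to main memory.
    // Returns false when the buffer is empty. When this is called is up to
    // the scheduler, which models the non-deterministic flush times.
    bool flush_one(Memory& mem) {
        if (pending_.empty()) return false;
        mem.cells[pending_.front().first] = pending_.front().second;
        pending_.pop_front();
        return true;
    }

private:
    std::deque<std::pair<std::string, int>> pending_;
};
```

With one `StoreBuffer` per thread, a thread sees its own store immediately through forwarding, while other threads see it only after a flush.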

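  4. PSO Example (fails on PSO)
     Initially: H=0, Done=0
     thread1: H=1; Done=1;
     thread2: while (!Done) { } assert(H==1);
     [Diagram: thread1's store buffer holds H=1 and Done=1. The store Done=1 is flushed to main memory while main memory still has H=0, so thread2 leaves its loop and loads H=0. Legend: store, flush, load.]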
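The failing execution can be replayed in a few lines. The sketch below assumes the usual reading of the two models: under TSO a thread has one FIFO buffer, so older stores must reach memory first, while under PSO stores to different variables may be flushed independently. `Model`, `flush_var`, and `replay` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <deque>
#include <map>
#include <string>
#include <utility>

enum class Model { TSO, PSO };

using Store = std::pair<std::string, int>;

// Flush the oldest buffered store to `var`, plus whatever the model forces out first.
// TSO: one FIFO per thread, so every older store reaches memory before it.
// PSO: stores to other variables may stay in the buffer.
void flush_var(Model model, std::deque<Store>& buffer,
               std::map<std::string, int>& memory, const std::string& var) {
    for (auto it = buffer.begin(); it != buffer.end();) {
        if (it->first == var) {
            memory[it->first] = it->second;
            buffer.erase(it);
            return;
        }
        if (model == Model::TSO) {
            memory[it->first] = it->second;  // forced out by FIFO order
            it = buffer.erase(it);
        } else {
            ++it;  // PSO: skip stores to other variables
        }
    }
}

// Replays the slide's execution and returns the value of H that thread2
// loads for assert(H==1).
int replay(Model model) {
    std::map<std::string, int> memory{{"H", 0}, {"Done", 0}};
    std::deque<Store> t1_buffer;
    t1_buffer.emplace_back("H", 1);               // thread1: H=1
    t1_buffer.emplace_back("Done", 1);            // thread1: Done=1
    flush_var(model, t1_buffer, memory, "Done");  // scheduler flushes Done first
    assert(memory["Done"] == 1);                  // thread2 leaves its while loop
    return memory["H"];                           // thread2 has no buffered H, reads memory
}
```

In this sketch `replay(Model::PSO)` returns 0, which is the assertion failure from the slide, and `replay(Model::TSO)` returns 1 because the FIFO order forces H out before Done.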

  5. Memory Fences
     Initially: H=0, Done=0
     thread1: H=1; Fence; Done=1;
     thread2: while (!Done) { } assert(H==1);
     • A memory fence is very expensive (10-100 cycles)
     • Use fences only where necessary
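For readers who want to try the fenced example in portable code, here is one possible C++11 rendering. It is an assumption on our part, not the tool's output: the slide's `Fence` is a hardware fence placed in thread1 only, whereas the C++ memory model also needs an acquire fence on the reading side for the guarantee to hold. `run_with_fences` is a hypothetical helper name.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Runs the two-thread example once with fences and returns the value of H
// that thread2 observes after leaving its loop.
int run_with_fences() {
    std::atomic<int> H{0}, Done{0};
    int observed = -1;

    std::thread thread1([&] {
        H.store(1, std::memory_order_relaxed);
        // Plays the role of the slide's "Fence": the store to H is ordered
        // before the store to Done.
        std::atomic_thread_fence(std::memory_order_release);
        Done.store(1, std::memory_order_relaxed);
    });

    std::thread thread2([&] {
        while (!Done.load(std::memory_order_relaxed)) { }
        // Reader-side fence required by the C++ model (not shown on the slide).
        std::atomic_thread_fence(std::memory_order_acquire);
        observed = H.load(std::memory_order_relaxed);
    });

    thread1.join();
    thread2.join();
    return observed;  // safe to read: both threads have been joined
}
```

With both fences in place the C++ standard guarantees that `run_with_fences()` returns 1. Removing them makes a result of 0 permitted, although a given machine may never exhibit it.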

  6. Our Approach
     [Diagram: FENDER takes a C/C++ program P, a specification S, and a memory model M. Using dynamic analysis and static fixing, it produces a program P’ with fences, such that P’ satisfies S under M.]

  7. Challenge: Handling real-world concurrent programs
     A lock-free memory allocator [1]: 771 lines of C code, 2699 lines of IR code
     [1] M. Michael, “Scalable lock-free dynamic memory allocation,” PLDI’04.

  8. Real-World Programs?
     • Exposing violations under relaxed memory models
       – Violations occur rarely
     • Many possible fence placements
       – Large programs
     • Written in C/C++ language
       – Rather than program models

  9. Contributions
     • Demonic scheduler to expose violations
       – Delay flushes of values from store buffer to main memory
     • Avoiding bad executions by adding fences
       – Extracting ordering constraints from bad executions
       – Enforcing ordering constraints using fences
     • Parametric synthesis framework
       – Different memory models
     • Evaluating fences required under different memory models and correctness criteria
       – Found redundant and missing fences
       – Linearizability on relaxed memory models
       – Handled real C/C++ programs

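  10. Fender Framework – Support for concurrency and RMM
      [Diagram: concurrent client C/C++ code is compiled by LLVM-GCC to bitcode (.bc) and run by the LLVM interpreter. The interpreter is extended with threading, a demonic scheduler, and a memory model. Legend: our extension, existing work.]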

  11. Our work – Dynamic analysis
      [Diagram: the pipeline of slide 10 with dynamic analysis added. The LLVM interpreter emits a trace; trace analysis checks it against the specification and produces an order formula; a SAT solver returns a SAT assignment. Legend: our extension, existing work.]

  12. Our work – Implement memory fences
      [Diagram: the pipeline of slide 11 with fence enforcement added. Fence enforcement uses the SAT assignment to modify the .bc file and outputs the fixed bytecode and a fence location report. Legend: our extension, existing work.]

  13. Example
      Initially: H=0, Done=0
      thread1: L1: H=1; L2: Done=1;
      thread2: L3: while (!Done) { } L4: assert(H==1);
      [Diagram: store buffers of t1 and t2, each with entries for H and Done, and main memory.]

  14. Interpretation on PSO
      Program as on slide 13 (initially H=0, Done=0).
      Trace events: L3: Load Done; L1: Store H=1; L2: Store Done=1; L4: Load H
      [Diagram: t1's store buffer holds H=1 (from L1) and Done=1 (from L2); t2's buffer and main memory are shown alongside. Legend: load, store, flush.]

  15. Interpretation on PSO
      Program as on slide 13 (initially H=0, Done=0).
      Trace so far: L3, L1, L2, L3
      [Diagram: store buffers of t1 and t2 (entries for H and Done) and main memory. Legend: load, store, flush.]

  16. Flush with a probability
      Program as on slide 13 (initially H=0, Done=0).
      Trace so far: L3, L1, L2, L3
      [Diagram: t1's store buffer entry H=1 (from L1); store buffers of t1 and t2 and main memory. Legend: load, store, flush.]
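Flushing with a probability can be illustrated with a seeded simulation of the H/Done example under PSO. This is a minimal sketch under our own assumptions, not Fender's scheduler: each pending store is flushed independently with probability `flush_prob` at every scheduler step, a fence is modeled as draining the buffer, and `count_violations` is a hypothetical name.

```cpp
#include <cassert>
#include <random>

// Counts how many of `runs` simulated executions violate assert(H==1).
// Assumption: each pending store is flushed with probability `flush_prob`
// at every scheduler step, independently of the other (PSO-style).
int count_violations(double flush_prob, bool with_fence, int runs, unsigned seed) {
    std::mt19937 rng(seed);
    std::bernoulli_distribution flush_now(flush_prob);
    const int max_steps = 10000;  // give up on runs where Done never becomes visible
    int violations = 0;

    for (int run = 0; run < runs; ++run) {
        int mem_H = 0, mem_Done = 0;

        // thread1, L1: H=1 goes to the store buffer.
        bool pending_H = true;
        // Optional fence between L1 and L2: drain the buffer before continuing.
        if (with_fence) { mem_H = 1; pending_H = false; }
        // thread1, L2: Done=1 goes to the store buffer.
        bool pending_Done = true;

        // thread2 spins at L3 while the scheduler decides, per step, what to flush.
        for (int step = 0; step < max_steps && mem_Done == 0; ++step) {
            if (pending_H && flush_now(rng)) { mem_H = 1; pending_H = false; }
            if (pending_Done && flush_now(rng)) { mem_Done = 1; pending_Done = false; }
        }

        // thread2, L4: assert(H==1) fails if Done is visible but H is not.
        if (mem_Done == 1 && mem_H == 0) ++violations;
    }
    return violations;
}
```

In this sketch, `count_violations(1.0, false, 1000, 42)` reports no violations because every store is flushed at once, and lowering `flush_prob` makes it more likely that Done reaches memory before H. With `with_fence` set, H is always in memory before Done is even buffered, so no run can fail.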

  17. Execution trace
      Program as on slide 13 (initially H=0, Done=0).
      Trace: L3, L1, L2, L3, L4
      [Diagram: t1's store buffer entry H=1 (from L1); store buffers of t1 and t2 and main memory. Legend: load, store, flush.]

  18. Checking Specification
      [Diagram: the traces of different executions are checked against the specification. Legend: load, store, flush.]

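  19. Repair one trace
      [Diagram: in the trace, t1's store buffer holds H=1 (from L1) and Done=1 (from L2); t2's buffer and main memory are shown alongside.]
      • Order predicate: [L1, L2]
      • Order formula for a single execution: [L1, L2] ∨ [C, D] ∨ …
      • Each order predicate is a boolean variable: x1 for [L1, L2], x2 for [C, D]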

  20. Repair all incorrect traces
      [Diagram: traces of different executions (trace1, trace2, trace3), with the annotation "One memory fence should be placed here".]
      Global formula to SAT solver: (x1 ∨ x2 ∨ …) ∧ (x1 ∨ x3 ∨ …) ∧ …, with one clause per incorrect trace (here trace1 and trace3)
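To see how the global formula leads to a fence placement, the sketch below searches for a smallest set of order predicates that satisfies every clause. It swaps the SAT solver for brute-force enumeration, and preferring the smallest set is our assumption. The names `minimal_repair` and `count_bits` are hypothetical; clauses are bitmasks in which bit i stands for predicate x(i+1). The sketch assumes fewer than 32 predicates and no empty clause.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Number of set bits in `mask`.
int count_bits(std::uint32_t mask) {
    int bits = 0;
    for (; mask != 0; mask &= mask - 1) ++bits;
    return bits;
}

// Each clause is a bitmask of order predicates (bit i set: x_{i+1} occurs in it).
// Returns, as a bitmask, a smallest set of predicates that makes every clause true.
// Assumes num_predicates < 32 and that no clause is empty.
std::uint32_t minimal_repair(const std::vector<std::uint32_t>& clauses,
                             int num_predicates) {
    std::uint32_t best = 0;
    int best_size = num_predicates + 1;
    for (std::uint32_t candidate = 0; candidate < (1u << num_predicates); ++candidate) {
        bool satisfies_all = true;
        for (std::uint32_t clause : clauses) {
            if ((clause & candidate) == 0) {  // no predicate of this clause is enforced
                satisfies_all = false;
                break;
            }
        }
        int size = count_bits(candidate);
        if (satisfies_all && size < best_size) {
            best = candidate;
            best_size = size;
        }
    }
    return best;
}
```

For the slide's formula, the clauses (x1 ∨ x2) and (x1 ∨ x3) are the masks 3 and 5, and `minimal_repair({3u, 5u}, 3)` returns 1: enforcing x1 alone repairs both traces, which matches the single fence on the slide.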

  21. Fix the program
      Initially: H=0, Done=0
      thread1: L1: H=1; Fence; L2: Done=1;
      thread2: L3: while (!Done) { } L4: assert(H==1);

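  22. Evaluation - Benchmarks

      | Category | Program | Memory safety (TSO / PSO) | Operational sequential consistency (TSO / PSO) | Linearizability (TSO / PSO) |
      |---|---|---|---|---|
      | Work stealing queues | Chase-Lev WSQ | 0 / 0 | 1 / 2 | 2 / 3 |
      | Work stealing queues | Cilk’s THE WSQ | 0 / 0 | 1 / 3 | - / - |
      | Work stealing queues | FIFO WSQ | 0 / 0 | 0 / 2 | 1 / 2 |
      | Work stealing queues | LIFO WSQ | 0 / 0 | 0 / 1 | 0 / 1 |
      | Work stealing queues | Anchor WSQ | 0 / 0 | 0 / 1 | 0 / 1 |
      | Idempotent work stealing queues | FIFO iWSQ | 0 / 3 | - / - | - / - |
      | Idempotent work stealing queues | LIFO iWSQ | 0 / 2 | - / - | - / - |
      | Idempotent work stealing queues | Anchor iWSQ | 0 / 2 | - / - | - / - |
      | Concurrent data structures | MS2 Queue | 0 / 0 | 0 / 0 | 0 / 0 |
      | Concurrent data structures | MSN Queue | 0 / 0 | 0 / 1 | 0 / 1 |
      | Concurrent data structures | LazyList Set | 0 / 0 | 0 / 0 | 0 / 0 |
      | Concurrent data structures | Harris’s Set | 0 / 0 | 0 / 1 | 0 / 1 |
      | Lock-free memory allocator | Memory allocator | 0 / 3 | 0 / 4 | 0 / 4 |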

  23. Evaluation - Specifications (same results table as slide 22, without the category labels)

  24. Evaluation - Memory models (same results table as slide 22, without the category labels)

  25. Evaluation - number of memory fences (same results table as slide 22, without the category labels)

  26. Conclusion
      • Demonic scheduler to expose violations
      • Avoiding bad executions by adding fences
      • Parametric synthesis framework
      • Evaluating fences required under different memory models and correctness criteria

  27. Thanks! Q & A
