
Programming language memory models: Problems, Solutions, and Directions
Anton Podkopaev, anton@podkopaev.net
Researcher @ JetBrains Research, Postdoc @ MPI-SWS, Docent @ HSE
Programming languages, weak memory concurrency


  1. Requirements to (Weak) Memory Models
     Hardware MMs [x86, Power, ARM, RISC-V] should:
       1. describe real CPUs
       2. save room for future optimizations
       3. provide reasonable guarantees for PLs
     Programming languages’ MMs [C/C++, Java, JS, Wasm, OCaml] should:
       1. support compiler optimizations
       2. provide efficient compilation to hardware
       3. have an easy non-expert mode

  7. 1. Compiler optimizations
     Source:
       [x] := 1;  ||  [y] := 1;
       a := [y];  ||  b := [x];
     Optimized:
       a := [y];  ||  [y] := 1;
       [x] := 1;  ||  b := [x];
     Optimized ⊆ Source: every behavior of the optimized program must be a behavior of the source program.
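
The inclusion can be checked by brute force for this example. The sketch below is an illustration, not part of the talk: it assumes the slide is read as a two-thread program (left column is thread 1, right column is thread 2), and the helper names `interleavings` and `sc_outcomes` are hypothetical. It enumerates all SC interleavings and compares the reachable final values of (a, b).

```python
def interleavings(t1, t2):
    """Yield every merge of two instruction lists that keeps each thread's own order."""
    if not t1 or not t2:
        yield list(t1) + list(t2)
        return
    for rest in interleavings(t1[1:], t2):
        yield [t1[0]] + rest
    for rest in interleavings(t1, t2[1:]):
        yield [t2[0]] + rest


def sc_outcomes(t1, t2):
    """Run every interleaving against one shared memory and collect the final (a, b)."""
    results = set()
    for schedule in interleavings(t1, t2):
        mem = {"x": 0, "y": 0}
        regs = {"a": 0, "b": 0}
        for op, target, arg in schedule:
            if op == "store":        # [target] := arg
                mem[target] = arg
            else:                    # "load": target := [arg]
                regs[target] = mem[arg]
        results.add((regs["a"], regs["b"]))
    return results


# Thread 1 before and after the compiler swaps its store and its load.
source_t1 = [("store", "x", 1), ("load", "a", "y")]
optimized_t1 = [("load", "a", "y"), ("store", "x", 1)]
# Thread 2 is the same in both programs.
t2 = [("store", "y", 1), ("load", "b", "x")]

source = sc_outcomes(source_t1, t2)
optimized = sc_outcomes(optimized_t1, t2)

print(sorted(source))       # [(0, 1), (1, 0), (1, 1)]
print(optimized - source)   # {(0, 0)}
```

Under SC the optimized program gains the outcome a = b = 0, so the inclusion fails. A memory model that wants to permit this reordering therefore has to allow a = b = 0 for the source program as well.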

  9. 2. Efficient compilation to hardware
     Source MM (SC):
       [x] := 1;  ||  [y] := 1;
       a := [y];  ||  b := [x];
     Target MM (x86):
       [x] := 1;  ||  [y] := 1;
       mfence;    ||  mfence;
       a := [y];  ||  b := [x];
     No compilation scheme w/o fences
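
Why the fences are needed can be seen on a toy machine. The following sketch is an assumption-laden illustration rather than the talk's material: it models x86 in the usual simplified TSO style (one FIFO store buffer per thread, loads read the own buffer first, `mfence` waits for an empty buffer), and the function name `tso_outcomes` is hypothetical. It explores every run and reports the final values of (a, b).

```python
def tso_outcomes(threads, locations, registers):
    """Explore all runs of a simplified TSO machine; return the set of final register tuples."""
    init_mem = tuple((loc, 0) for loc in sorted(locations))
    init_regs = tuple((reg, 0) for reg in sorted(registers))
    results, seen = set(), set()

    def explore(pcs, buffers, mem, regs):
        state = (pcs, buffers, mem, regs)
        if state in seen:
            return
        seen.add(state)
        if all(pc == len(t) for pc, t in zip(pcs, threads)) and not any(buffers):
            results.add(regs)
            return
        for i, thread in enumerate(threads):
            # Step kind 1: flush the oldest buffered store of thread i to memory.
            if buffers[i]:
                loc, val = buffers[i][0]
                new_mem = tuple(sorted({**dict(mem), loc: val}.items()))
                new_buffers = buffers[:i] + (buffers[i][1:],) + buffers[i + 1:]
                explore(pcs, new_buffers, new_mem, regs)
            # Step kind 2: execute the next instruction of thread i.
            if pcs[i] < len(thread):
                op = thread[pcs[i]]
                next_pcs = pcs[:i] + (pcs[i] + 1,) + pcs[i + 1:]
                if op[0] == "store":      # stores go to the buffer, not to memory
                    new_buffers = buffers[:i] + (buffers[i] + ((op[1], op[2]),),) + buffers[i + 1:]
                    explore(next_pcs, new_buffers, mem, regs)
                elif op[0] == "load":     # newest own buffered value wins, else memory
                    pending = [v for l, v in buffers[i] if l == op[2]]
                    val = pending[-1] if pending else dict(mem)[op[2]]
                    new_regs = tuple(sorted({**dict(regs), op[1]: val}.items()))
                    explore(next_pcs, buffers, mem, new_regs)
                elif op[0] == "mfence" and not buffers[i]:   # enabled only on empty buffer
                    explore(next_pcs, buffers, mem, regs)

    explore(tuple(0 for _ in threads), tuple(() for _ in threads), init_mem, init_regs)
    return {tuple(dict(r)[reg] for reg in sorted(registers)) for r in results}


plain = (
    (("store", "x", 1), ("load", "a", "y")),
    (("store", "y", 1), ("load", "b", "x")),
)
fenced = (
    (("store", "x", 1), ("mfence",), ("load", "a", "y")),
    (("store", "y", 1), ("mfence",), ("load", "b", "x")),
)

without_fences = tso_outcomes(plain, {"x", "y"}, {"a", "b"})
with_fences = tso_outcomes(fenced, {"x", "y"}, {"a", "b"})
print((0, 0) in without_fences)   # True
print(sorted(with_fences))        # [(0, 1), (1, 0), (1, 1)]
```

Without fences the machine reaches a = b = 0, which SC forbids for this program; with an `mfence` between the store and the load in each thread only the SC outcomes remain.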

  16. 3. Easy non-expert mode
      Nice program ⇒ nice behaviors
      Data-Race-Freedom (DRF) guarantee: no data races in SC executions ⇒ only SC behaviors
        a := [x];            ||  b := [y];
        if a then [y] := 1   ||  if b then [x] := 1
      The C/C++ MM allows the outcome a = b = 1.
      a = b = 1 is an Out-Of-Thin-Air (OOTA) outcome.
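
The premise of the DRF guarantee can be checked mechanically for this program. The sketch below is illustrative only: it assumes the two-thread reading of the slide, encodes `if r then [l] := v` as a hypothetical `store_if` instruction, and treats two adjacent conflicting accesses from different threads in an SC interleaving as a data race.

```python
def interleavings(t1, t2):
    """Yield every merge of two instruction lists that keeps each thread's own order."""
    if not t1 or not t2:
        yield list(t1) + list(t2)
        return
    for rest in interleavings(t1[1:], t2):
        yield [t1[0]] + rest
    for rest in interleavings(t1, t2[1:]):
        yield [t2[0]] + rest


def run_sc(schedule):
    """Execute one interleaving; return the registers and the trace of memory accesses."""
    mem = {"x": 0, "y": 0}
    regs = {}
    trace = []
    for tid, instr in schedule:
        if instr[0] == "load":               # reg := [loc]
            _, reg, loc = instr
            regs[reg] = mem[loc]
            trace.append((tid, "R", loc))
        else:                                # "store_if": if reg then [loc] := val
            _, reg, loc, val = instr
            if regs[reg]:
                mem[loc] = val
                trace.append((tid, "W", loc))
    return regs, trace


def has_race(trace):
    """Adjacent accesses, different threads, same location, at least one write."""
    for (t1, k1, l1), (t2, k2, l2) in zip(trace, trace[1:]):
        if t1 != t2 and l1 == l2 and "W" in (k1, k2):
            return True
    return False


thread1 = [(1, ("load", "a", "x")), (1, ("store_if", "a", "y", 1))]
thread2 = [(2, ("load", "b", "y")), (2, ("store_if", "b", "x", 1))]

outcomes, racy = set(), False
for schedule in interleavings(thread1, thread2):
    regs, trace = run_sc(schedule)
    outcomes.add((regs["a"], regs["b"]))
    racy = racy or has_race(trace)

print(outcomes)   # {(0, 0)}
print(racy)       # False
```

No SC execution performs a write, so the program is race-free and its only SC outcome is a = b = 0. A model with the DRF guarantee must therefore exclude a = b = 1.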

  17. Overview of programming languages’ MMs (figure)
      Criteria: Comp. Opt. (No OOTA), Eff. Comp. to Hardware, DRF, Simplicity, No UB
      Models: SC [Lamport, 1979], Java MM [Manson et al., 2005], C/C++ MM [Batty et al., 2011], RC11 [Lahav et al., 2017], Promising [Kang et al., 2017, Lee et al., 2020], Weakestmo [Chakraborty and Vafeiadis, 2019], Modular Relaxed Dep. [Paviotti et al., 2020], OCaml MM [Dolan et al., 2018]
      http://podkopaev.net

  19. Validity of transformations [Ševčík and Aspinall, 2008]
      Transformation                              SC
      Trace-preserving transformations            ✓
      Reordering normal memory accesses           ✗
      Redundant read after read elimination       ✓
      Redundant read after write elimination      ✓
      Irrelevant read elimination                 ✓
      Irrelevant read introduction                ✓
      Redundant write before write elimination    ✓
      Redundant write after read elimination      ✓
      External action reordering                  ✗

  21. SC-preserving optimizations in LLVM [Marino et al., 2011]
      Average slowdown:
      ▶ 34% w/ only SC-preserving optimizations
      ▶ 5.5% w/ optimizations modified to preserve SC
      Drawbacks:
      ▶ Hardware still allows weak behaviors, i.e., no end-to-end SC
      ▶ Requires modifying existing compilers

  23. Validity of transformations [Ševčík and Aspinall, 2008]
      Transformation                              SC     JMM ∗
      Trace-preserving transformations            ✓      ✓
      Reordering normal memory accesses           ✓ ∗    ✗
      Redundant read after read elimination       ✓      ✗
      Redundant read after write elimination      ✓      ✓
      Irrelevant read elimination                 ✓      ✓
      Irrelevant read introduction                ✓      ✗
      Redundant write before write elimination    ✓      ✓
      Redundant write after read elimination      ✓      ✗
      External action reordering                  ✗      ✗

  28. End-to-end SC via Volatile JVM [Liu et al., 2017, Liu et al., 2019]
      Java MM guarantees Data-Race-Freedom:
      shared locations are volatile (no data races) ⇒ SC semantics

  32. End-to-end SC via Volatile JVM [Liu et al., 2017, Liu et al., 2019]
      Slowdown, in %
      Benchmarks          DaCapo   spark-perf
      x86      Average    28       79
               Max        81       164
      ARM (1)  Average    57       85
               Max        157      ∞
      ARM (2)  Average    73       125
               Max        103      ∞

  38. The C/C++ MM allows the outcome a = b = 1 (OOTA)
        a := [x];            ||  b := [y];
        if a then [y] := 1   ||  if b then [x] := 1

  44. Executions in C/C++ MM
        a := [x];   ||  b := [y];
        [y] := 1    ||  if b then [x] := 1
      Executions, one per outcome:
        // a = 0; b = 0:   R x 0 -po-> W y 1   ||  R y 0
        // a = 0; b = 1:   R x 0 -po-> W y 1   ||  R y 1 -po-> W x 1,   W y 1 -rf-> R y 1
        // a = 1; b = 1:   R x 1 -po-> W y 1   ||  R y 1 -po-> W x 1,   W y 1 -rf-> R y 1,   W x 1 -rf-> R x 1
      Axioms:
        1. po ∪ rf_preserved is acyclic (rf_preserved ⊆ rf)
        2. …
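
Axiom 1 can be tried out on the a = 1; b = 1 execution. The sketch below is an illustration under stated assumptions: the event names (`Rx1`, `Wy1`, `Ry1`, `Wx1`) and the helper `is_acyclic` are hypothetical, and the two choices of rf_preserved (empty, or all of rf) are the two extremes rather than the exact definition used by any particular model.

```python
def is_acyclic(edges):
    """Depth-first search for a cycle in a directed graph given as a set of (src, dst) pairs."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return True
        if node in visiting:      # reached a node that is still on the DFS stack
            return False
        visiting.add(node)
        ok = all(visit(succ) for succ in graph[node])
        visiting.discard(node)
        done.add(node)
        return ok

    return all(visit(node) for node in graph)


# Execution with outcome a = 1; b = 1.
po = {("Rx1", "Wy1"), ("Ry1", "Wx1")}   # program order inside each thread
rf = {("Wy1", "Ry1"), ("Wx1", "Rx1")}   # each read takes its value from the other thread's write

print(is_acyclic(po | set()))   # True: with no rf edge preserved, the execution passes axiom 1
print(is_acyclic(po | rf))      # False: with every rf edge preserved, it is rejected
```

With rf_preserved empty the execution is accepted. With rf_preserved = rf, po ∪ rf contains the cycle Rx1, Wy1, Ry1, Wx1 and the execution is rejected.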

  49. Out-Of-Thin-Air in C/C++ MM
      Program 1:
        a := [x];            ||  b := [y];
        [y] := 1             ||  if b then [x] := 1
      Program 2:
        a := [x];            ||  b := [y];
        if a then [y] := 1   ||  if b then [x] := 1
      Program 3:
        a := [x];            ||  b := [y];
        if a then [y] := 1   ||  if b then [x] := 1
        else [y] := 1        ||
      Each program has an execution with the events R x 1, W y 1 || R y 1, W x 1; the three executions differ only in their ctrl edges.
      In Program 3 the ctrl edge from R x 1 to W y 1 is a fake ctrl: both branches perform [y] := 1.

  51. RC11 [Lahav et al., 2017]: forbids all po ∪ rf cycles

  59. Forbidding po ∪ rf cycles
      Every po ∪ rf cycle has the shape
        [R] ; po ; [W] ; rf \ po ; (po ∪ rf \ po)∗ ; rf \ po
      Enough to respect [R] ; po ; [W], since hardware respects rf \ po
      How?
        1. Restrict compiler optimizations
        2. Put a fence between R and W
      Cheaper for C/C++ than for Java!

  60. C/C++ has undefined behavior

  67. Undefined Behavior and Memory Models
      With f as a normal location:
        int data = 0; int f = 0;
        [data] := 42;   ||  while ([f] == 0) {};
        [f] := 1;       ||  print([data]);
      Java: Fine, but may print 0
      C/C++: Undefined Behavior! Race on normal location!
      With f as an atomic location:
        int data = 0; atomic<int> f = 0;
        [data] := 42;   ||  while ([f]_acq == 0) {};
        [f]_rel := 1;   ||  print([data]);
                            Java MM           C/C++ MM
        special locations   volatile int      atomic<int>
        data race on int    weak guarantees   undefined behavior
        subject to OOTA     access to int     relaxed (rlx) access to atomic<int>
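
The difference between the racy and the race-free version of this message-passing program can be sketched as a check on one execution. This is a simplified illustration, not the C/C++ standard's full definition: the event names, the mode labels `na`/`rel`/`acq`, and the helpers below are assumptions. Happens-before is taken as the transitive closure of po together with rf edges from a release write to an acquire read, and the initializing writes are ignored.

```python
from itertools import combinations


def transitive_closure(edges):
    """Smallest transitive relation containing the given set of pairs."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def data_races(events, po, rf):
    """events maps a name to (thread, kind, location, mode); returns the unordered racy pairs."""
    sw = {(w, r) for w, r in rf if events[w][3] == "rel" and events[r][3] == "acq"}
    hb = transitive_closure(po | sw)
    races = []
    for e1, e2 in combinations(sorted(events), 2):
        t1, k1, l1, m1 = events[e1]
        t2, k2, l2, m2 = events[e2]
        conflicting = t1 != t2 and l1 == l2 and "W" in (k1, k2)
        some_plain = "na" in (m1, m2)          # at least one non-atomic access
        if conflicting and some_plain and (e1, e2) not in hb and (e2, e1) not in hb:
            races.append((e1, e2))
    return races


def message_passing(write_mode, read_mode):
    """The execution in which the loop exits and the reader sees data = 42."""
    events = {
        "w_data": (1, "W", "data", "na"),
        "w_f": (1, "W", "f", write_mode),
        "r_f": (2, "R", "f", read_mode),
        "r_data": (2, "R", "data", "na"),
    }
    po = {("w_data", "w_f"), ("r_f", "r_data")}
    rf = {("w_f", "r_f"), ("w_data", "r_data")}
    return events, po, rf


plain_races = data_races(*message_passing("na", "na"))
atomic_races = data_races(*message_passing("rel", "acq"))
print(plain_races)    # [('r_data', 'w_data'), ('r_f', 'w_f')]
print(atomic_races)   # []
```

With a plain `int f` both the flag and the payload are accessed without ordering, which is the race that C/C++ treats as undefined behavior. With the release write and acquire read on f, the write to data happens before the read of data and no race remains.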

  70. To forbid po ∪ rf cycles in C/C++, it is enough to respect [R] ; po ; [W] on atomics

  76. Preserving [R] ; po ; [W] for atomics in LLVM [Ou and Demsky, 2018]
      1. Restrict compiler optimizations: no changes for LLVM
      2. Put a fence between R and W
         ▶ x86: no fences
         ▶ ARMv8: bogus conditional branch for relaxed atomic reads
      Slowdown on ARMv8 is 0% on average and 6.3% max
      Benchmarks: CDS from the CDS C++, Folly, Junction, and Rigtorp libs and 6 benchmarks from CDSSpec
