
Solving Operating-Systems Problems with Probabilistic Model Checking



  1. Solving Operating-Systems Problems with Probabilistic Model Checking. Hendrik Tews, Institute for Theoretical Computer Science, office at the Operating Systems Group. Resilience talk, May 3, 2013. Sections: Introduction; Spinlock Case Study; PWCS (probabilistic write-copy-select); Romain Case Study; Conclusion. Footer on every slide: Hendrik Tews, Probabilistic Model Checking, Resilience 5/2013 (44 slides).

  2. Outline: Introduction; Spinlock Case Study; PWCS (probabilistic write-copy-select); Romain Case Study; Conclusion.

  3. Model checking
     - Functional system requirements are expressed as a temporal formula $\Phi$.
     - The abstract specification is given as, e.g., a model $M$.
     - The model checker decides: does $M \models \Phi$ hold? The answer is either yes, or no together with a counterexample.

  4. Probabilistic model checking
     - Quantitative system requirements are expressed as a temporal formula $\Phi$.
     - The probabilistic specification is given as, e.g., a model $M$.
     - The probabilistic model checker performs a quantitative analysis of $M$ against $\Phi$, with results such as:
       - the probability for "bad behaviors" is $< 10^{-6}$
       - the probability for "good behaviors" is $1$
       - expected costs for ....
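The quantitative question on this slide can be illustrated on a very small discrete-time Markov chain. The following is a minimal sketch, not taken from the talk: the three-state chain, its transition probabilities and the function name `reach_probability` are all made up for illustration, and the probability of eventually reaching a state is approximated by plain value iteration rather than by a model checker.

```c
#include <assert.h>

/* Toy chain, not taken from the talk:
   state 0 = trying, state 1 = good (absorbing), state 2 = bad (absorbing). */
enum { N_STATES = 3 };

/* Hypothetical transition probabilities; every row sums to 1. */
static const double P[N_STATES][N_STATES] = {
    {0.09, 0.90, 0.01},
    {0.00, 1.00, 0.00},
    {0.00, 0.00, 1.00},
};

/* Approximates the probability of eventually reaching `target` from `start`
   by value iteration: x(target) = 1 and x(s) = sum_t P[s][t] * x(t) otherwise,
   starting from x = 0 everywhere except at the target. */
double reach_probability(int start, int target, int iterations)
{
    assert(start >= 0 && start < N_STATES);
    assert(target >= 0 && target < N_STATES);

    double x[N_STATES] = {0.0};
    x[target] = 1.0;

    for (int it = 0; it < iterations; ++it) {
        double next[N_STATES];
        for (int s = 0; s < N_STATES; ++s) {
            next[s] = 0.0;
            for (int t = 0; t < N_STATES; ++t)
                next[s] += P[s][t] * x[t];
        }
        next[target] = 1.0;          /* the target itself counts as reached */
        for (int s = 0; s < N_STATES; ++s)
            x[s] = next[s];
    }
    return x[start];
}
```

For this toy chain, `reach_probability(0, 2, 200)` approaches 0.01 / 0.91, roughly 0.011, so the "bad" state would not meet a bound such as $< 10^{-6}$.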

  5. Outline: Introduction; Spinlock Case Study; PWCS (probabilistic write-copy-select); Romain Case Study; Conclusion.

  6. Spinlocks
     Problem
     - n processes on n CPU cores
     - cooperate to protect a shared resource (OS-kernel ready-queue)
     - contention is rare, the lock is almost always free
     - inter-processor interrupts (IPIs) are far too slow in this case
     Solution
     - synchronise over a shared lock variable
     - change the lock variable with atomic operations (CAS: compare and swap)
     - expensive in the contention case
     Questions
     - Does it scale to 100 cores?
     - For which workloads?

  7. Spinlocks: joint work with Christel Baier, Marcus Daum, Benjamin Engel, Hermann Härtig, Joachim Klein, Sascha Klüppelholz, Steffen Märcker and Marcus Völp.
     - FMICS 2012: Waiting for locks: How long does it usually take?, in: M. Stoelinga, R. Pinger (Eds.), 17th International Workshop on Formal Methods for Industrial Critical Systems, Vol. 7437 of Lecture Notes in Computer Science, Springer, 2012, pp. 47–62.
     - SSV 2012: Chiefly symmetric: Results on the scalability of probabilistic model checking for operating-system code, in: F. Cassez, R. Huuck, G. Klein, B. Schlich (Eds.), Proceedings Seventh Conference on Systems Software Verification, Vol. 102 of EPTCS, 2012, pp. 156–166.

  8. Test-And-Test-And-Set Lock

         volatile bool occupied = false;   /* shared lock variable */

         void lock() {
           /* atomic_swap returns the old value: true means the lock was taken */
           while (atomic_swap(occupied, true)) {
             while (occupied) { /* spin loop */ }
           }
         }

         void unlock() {
           occupied = false;
         }

     - model n processes that compete for the lock
     - model the lock as a separate process
     - compare the results with measurements
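The `atomic_swap` on this slide is not a standard C function. A minimal sketch of the same test-and-test-and-set idea in C11 follows, assuming that `<stdatomic.h>` is available and that `atomic_exchange` can stand in for the slide's swap. The names `ttas_try_lock`, `ttas_lock` and `ttas_unlock` are illustrative and do not come from the talk.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Shared lock variable; false means the lock is free. */
atomic_bool occupied = false;

/* Illustrative helper (not on the slide): a single atomic swap.
   Returns true if the caller acquired the lock. */
bool ttas_try_lock(void)
{
    /* atomic_exchange returns the previous value; false means it was free. */
    return !atomic_exchange(&occupied, true);
}

void ttas_lock(void)
{
    while (!ttas_try_lock()) {
        /* Wait with plain loads and retry the swap only once the lock looks free. */
        while (atomic_load(&occupied)) { /* spin loop */ }
    }
}

void ttas_unlock(void)
{
    assert(atomic_load(&occupied)); /* only a held lock can be released */
    atomic_store(&occupied, false);
}
```

The inner loop is what makes this "test-and-test-and-set": waiting processes only read the variable and attempt the more expensive atomic swap again once the lock appears to be free.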

  9. Interesting properties, in the long run:
     - probability of finding the lock free
     - probability of getting the lock twice in a row without waiting
     - average waiting time for the lock (under the condition that the lock is busy)
     - the 95% quantile of the waiting time
     (Quantile picture by Rene Schwarz from Wikimedia Commons.)
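To make the last two properties concrete, here is a minimal sketch of how an average and a quantile are read off a waiting-time distribution. It assumes the distribution is already given as a probability mass function over whole time units, conditioned on the lock being busy. The function names and any numbers fed into them are hypothetical, not results from the talk.

```c
#include <assert.h>
#include <stddef.h>

/* pmf[t] is the probability that the waiting time equals t time units.
   Returns the smallest t with P(waiting time <= t) >= q. */
size_t waiting_time_quantile(const double *pmf, size_t len, double q)
{
    assert(len > 0 && q > 0.0 && q <= 1.0);
    double cumulative = 0.0;
    for (size_t t = 0; t < len; ++t) {
        cumulative += pmf[t];
        if (cumulative >= q)
            return t;
    }
    return len - 1; /* rounding left the total slightly below q */
}

/* Average waiting time: sum over t of t * pmf[t]. */
double waiting_time_mean(const double *pmf, size_t len)
{
    double mean = 0.0;
    for (size_t t = 0; t < len; ++t)
        mean += (double)t * pmf[t];
    return mean;
}
```

With the made-up distribution `{0.5, 0.25, 0.125, 0.125}`, the mean is 0.875 time units, while the 95% quantile is 3: a waiting process is done within 3 time units in at least 95% of the cases.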

  10. Process i: DTMC model
      Locations: $start_i$, $ncrit_i$, $wait_i$, $crit_i$. Transitions:
      - $start_i \to ncrit_i$: $t_i := \mathrm{random}(\nu)$
      - $ncrit_i \to ncrit_i$: if $t_i > 0$: $t_i := t_i - 1$
      - $ncrit_i \to wait_i$: if $t_i = 0$
      - $wait_i \to wait_i$: if $\neg lock_i$: $t_i := \min(t_i + 1, 2)$
      - $wait_i \to crit_i$: if $lock_i \wedge t_i = 1$: $t_i := \mathrm{random}(\gamma_0)$; if $lock_i \wedge t_i = 2$: $t_i := \mathrm{random}(\gamma_1)$
      - $crit_i \to crit_i$: if $t_i > 0$: $t_i := t_i - 1$
      - $crit_i \to ncrit_i$: if $t_i = 0$: $t_i := \mathrm{random}(\nu)$
      Distributions: $\nu$ for the non-critical region, $\gamma_0$ for the critical region (without spinning), $\gamma_1$ for the critical region (with spinning).
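One possible reading of the transition labels on this slide can be written as a step function. This is a sketch under assumptions: the assignment of guards to edges is reconstructed from the slide text, the random draws from $\nu$, $\gamma_0$ and $\gamma_1$ are passed in as already sampled values, and the names `Process`, `Samples` and `process_step` are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { START, NCRIT, WAIT, CRIT } Location;

typedef struct {
    Location loc;
    int t;          /* the variable t_i from the slide */
} Process;

/* Values assumed to be drawn beforehand from nu, gamma_0 and gamma_1. */
typedef struct {
    int nu;
    int gamma0;
    int gamma1;
} Samples;

/* Performs one step of process i; has_lock stands for lock_i. */
void process_step(Process *p, bool has_lock, Samples s)
{
    assert(p != 0 && p->t >= 0);
    switch (p->loc) {
    case START:
        p->loc = NCRIT; p->t = s.nu;
        break;
    case NCRIT:
        if (p->t > 0) p->t -= 1;
        else p->loc = WAIT;
        break;
    case WAIT:
        if (!has_lock) {
            p->t = (p->t + 1 < 2) ? p->t + 1 : 2;   /* min(t + 1, 2) */
        } else if (p->t == 1) {
            p->loc = CRIT; p->t = s.gamma0;         /* entered without spinning */
        } else if (p->t == 2) {
            p->loc = CRIT; p->t = s.gamma1;         /* entered after spinning */
        }
        /* lock_i with t = 0: no rule is recoverable from the slide; stay put */
        break;
    case CRIT:
        if (p->t > 0) p->t -= 1;
        else { p->loc = NCRIT; p->t = s.nu; }
        break;
    }
}
```

In the wait location, $t_i$ only distinguishes whether the process had to spin, which is why it saturates at 2 and selects between $\gamma_0$ and $\gamma_1$ on entering the critical region.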

  11. The lock: DTMC model
      Locations: unlock, $lock_1$, ..., $lock_n$ (shown for two processes i and k). Transitions:
      - unlock $\to lock_i$: if $wait_i$
      - unlock $\to lock_k$: if $wait_k$
      - $lock_i \to$ unlock: if $release_i \wedge \neg wait_1 \wedge \ldots \wedge \neg wait_n$
      - $lock_k \to$ unlock: if $release_k \wedge \neg wait_1 \wedge \ldots \wedge \neg wait_n$
      - $lock_i \to lock_k$: if $release_i \wedge wait_k$
      - $lock_k \to lock_i$: if $release_k \wedge wait_i$
      The lock performs a uniform probabilistic choice for selecting the next lock owner.
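The uniform choice of the next lock owner translates into transition probabilities as follows. This is a minimal sketch with a hypothetical function name, `next_owner_distribution`; it only computes the branching probabilities and does not model the release condition itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Fills prob[i] with the probability that process i becomes the next lock
   owner: 1/k for each of the k waiting processes, 0 for the others.
   Returns k; k == 0 means nobody waits and the lock moves to unlock. */
int next_owner_distribution(const bool *waiting, int n, double *prob)
{
    assert(n > 0);
    int k = 0;
    for (int i = 0; i < n; ++i)
        if (waiting[i]) ++k;
    for (int i = 0; i < n; ++i)
        prob[i] = (k > 0 && waiting[i]) ? 1.0 / k : 0.0;
    return k;
}
```

For example, with four processes of which the second and fourth are waiting, each of those two becomes the next owner with probability 0.5.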

  12. Results: Probability to find the lock free. [Bar chart: probability (axis 0.7 to 1) for the configurations 2[5][6][...], 3[5][6][...] and 4[5][6][...], each with the distributions [50,60], [40,50], [40,50,60], [40,50,60,70] and [40,60]; series: cache aware model, measured.]

  13. Results: Average waiting time for spinning processes. [Bar chart: average waiting time (axis 0 to 0.8) for the configurations 2[5][6][...], 3[5][6][...] and 4[5][6][...], each with the distributions [50,60], [40,50], [40,50,60], [40,50,60,70] and [40,60]; series: cache aware model, measured.]

  14. Results: 95% quantile of the waiting time. [Bar chart: 95% quantile (axis 0 to 1) for the configurations 2[5][6][...], 3[5][6][...] and 4[5][6][...], each with the distributions [50,60], [40,50], [40,50,60], [40,50,60,70] and [40,60]; series: cache aware model, measured.]

  15. Scalability for PRISM, distribution [40,50]. [Plot: time in hours for model generation and steady state, and RAM in GB, each on an axis from 0 to 20, against the number of processes (3, 4, 5).] Number of states: 3 processes: 4,082,808; 4 processes: 198,808,720.

  16. Symmetry reduction: Using a generic representative. [Diagram: the lock with states unlock, $lock_1$ and $lock_x$; process $P_1$ with states ncrit, crit and wait; five processes $P_x$ with states ncrit (also labelled $ncrit_1$ and $ncrit_2$), crit and wait.]

  17. Symmetry reduction: Using a generic representative (same diagram of the lock, $P_1$ and five processes $P_x$, now with state counters). State counters: crit: 1, wait: 2, $ncrit_1$: 1, $ncrit_2$: 1.
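The effect of replacing individual process states by state counters can be sketched as follows. This is an illustration under simplifying assumptions, not the construction from the talk: it treats all processes as interchangeable, ignores the separately shown process $P_1$ and the lock, and uses only the four local states named in the counters above. All identifiers are hypothetical.

```c
#include <assert.h>

enum { LS_CRIT, LS_WAIT, LS_NCRIT1, LS_NCRIT2, N_LOCAL };

/* Replaces the per-process state vector by one counter per local state. */
void count_local_states(const int *local, int n, int counters[N_LOCAL])
{
    for (int s = 0; s < N_LOCAL; ++s)
        counters[s] = 0;
    for (int i = 0; i < n; ++i) {
        assert(local[i] >= 0 && local[i] < N_LOCAL);
        counters[local[i]] += 1;
    }
}

/* Number of explicit state vectors: k^n (overflows for large n). */
unsigned long long explicit_vectors(unsigned n, unsigned k)
{
    unsigned long long r = 1;
    for (unsigned i = 0; i < n; ++i)
        r *= k;
    return r;
}

/* Number of counter vectors: binomial(n + k - 1, k - 1). */
unsigned long long counter_vectors(unsigned n, unsigned k)
{
    assert(k > 0);
    unsigned long long r = 1;
    for (unsigned i = 1; i < k; ++i)
        r = r * (n + i) / i;   /* r equals binomial(n + i, i) after each step */
    return r;
}
```

Any permutation of the five processes yields the same counters (crit: 1, wait: 2, $ncrit_1$: 1, $ncrit_2$: 1). Under these assumptions, 5 processes with 4 local states give $4^5 = 1024$ explicit vectors but only $\binom{8}{3} = 56$ counter vectors, and for 100 processes the counter representation still has just $\binom{103}{3} = 176851$ vectors.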

  18. Results for symmetry-reduced model, non-critical distribution [40, 50]. [Plot: lock free probability (axis 0 to 1) and average waiting time (in time units, axis 0 to 600) against the number of processes, from 0 to 100, then 500, 1000 and 5000.]

  19. Scalability for symmetry-reduced model, non-critical distribution [40, 50]. [Plot: number of states of the bisimulation quotient (logarithmic axis, $10^0$ to $10^9$) and run time (in minutes, axis 0 to 240) against the number of processes, from 0 to 100, then 500, 1000 and 5000.]
