Scalable Multi-Core Model Checking, by Alfons Laarman


  1. Scalable Multi-Core Model Checking. Alfons Laarman (alfons@laarman.com), Theory. Joint work with Jaco van de Pol and Tom van Dijk. 1 / 19 Scalable Multi-Core Model Checking

  4. Multi-Core Model Checking
     Research questions
     • Can model checking scale (linearly, ideally) on modern multi-cores?
     • Are our parallel solutions compatible with other techniques?
       • Compression techniques
       • Symbolic exploration
       • Partial-order reduction
     Speedup: $S_P = T_{seq} / T_P$. Ideal: $S_P = P$. Linear: $S_P = P / c$.
     [Figure: speedup versus number of threads (0 to 50); plot labels: dfsfifo, garp, giop2.nomig, i-protocol2, leader5]
     Related work
     • “compiler optimizations diminish the benefits of multi-core processing” [Holzmann 07]
     • “no silver bullet, that would solve all the scalability issues” [Barnat et al. 08]
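As a worked instance of the speedup definition, the timings reported for the counting example on the Challenges slide ($T_1 = 27$ sec, then $T_{16} = 32$ sec for the first parallel version and $T_{16} = 1.8$ sec for the last) can be plugged in. Taking $T_{seq} = T_1$ is an assumption here, not something the slides state:

```latex
\begin{align*}
S_{16} &= \frac{T_{seq}}{T_{16}} = \frac{27}{32} \approx 0.84
  && \text{(slower than the sequential run)} \\
S_{16} &= \frac{T_{seq}}{T_{16}} = \frac{27}{1.8} = 15
  && \text{(close to the ideal } S_{16} = 16\text{)}
\end{align*}
```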

  11. Challenges
      Difficulties of parallelism
      • Steep memory hierarchies
      • Cache coherence protocol (false sharing)
      • Fine-grained operations in model checking (e.g. no subsumption)

      Sequential counter:

          #define B (1024 * 1024 * 1024)

          int main(void) {
              int result = 0;
              for (int i = 0; i < B; i++)
                  result++;
              return result;
          }

      Parallel counter with P threads (uses B from above):

          #include <pthread.h>
          #define P 16

          /* Thread body: increments the counter passed in arg. */
          static void *count(void *arg) {
              int *counter = (int *) arg;
              for (int i = 0; i < B / P; i++)
                  (*counter)++;
              return NULL;
          }

          int main(void) {
              pthread_t thread[P];
              int __attribute__((aligned(64))) counters[P] = {0};
              for (int i = 0; i < P; i++)
                  pthread_create(&thread[i], NULL, count, &counters[i]);
              int result = 0;
              for (int i = 0; i < P; i++) {
                  pthread_join(thread[i], NULL);
                  result += counters[i];
              }
              return result;
          }

      Timings: $T_1 = 27$ sec and $T_{16} = 32$ sec; with the aligned(64) attribute added to avoid false sharing, $T_{16} = 1.8$ sec.
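False sharing arises when counters written by different threads sit on the same cache line. The following is a minimal sketch of one way to rule that out, assuming 64-byte cache lines and a C11 compiler: each counter is wrapped in a struct that is aligned, and therefore padded, to a full line. The names `padded_counter`, `limit`, and `count_in_parallel` are hypothetical, and the iteration count is a parameter rather than the slide's `B / P`.

```c
#include <assert.h>
#include <pthread.h>

#define CACHE_LINE 64  /* assumed cache line size in bytes */
#define P 16           /* number of worker threads, as on the slide */

/* One counter per cache line: the alignment pads the struct to CACHE_LINE. */
struct padded_counter {
    _Alignas(CACHE_LINE) volatile long value;
    long limit;  /* how many increments this worker performs */
};

static_assert(sizeof(struct padded_counter) == CACHE_LINE,
              "each counter should occupy exactly one cache line");

static struct padded_counter counters[P];

static void *count(void *arg) {
    struct padded_counter *c = arg;
    for (long i = 0; i < c->limit; i++)
        c->value++;  /* touches only this worker's cache line */
    return NULL;
}

/* Hypothetical helper: runs P workers and returns the summed count,
 * or -1 if a thread could not be created. */
long count_in_parallel(long per_thread) {
    pthread_t thread[P];
    int started = 0;
    long result = 0;

    for (; started < P; started++) {
        counters[started].value = 0;
        counters[started].limit = per_thread;
        if (pthread_create(&thread[started], NULL, count,
                           &counters[started]) != 0)
            break;
    }
    for (int i = 0; i < started; i++) {
        pthread_join(thread[i], NULL);
        result += counters[i].value;
    }
    return started == P ? result : -1;
}
```

Calling `count_in_parallel(1000)` should return the sum of all sixteen counters. Whether the padding yields a speedup like the one on the slide depends on the machine and compiler flags, so it is best measured rather than assumed.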

  15. (Explicit-State) Model Checking

          global x = 7; global y = 3;

          1  for (int a = 1 .. 10)        1  int b = y + x;
          2      x += y;                  2  y = 2 * b;

      $S : \langle x, y, a, b, pc_1, pc_2 \rangle$
      $s_0 = \langle 7, 3, 0, 0, 1, 1 \rangle$
      $\mathit{next\_state}(\langle 7, 3, 0, 0, 1, 1 \rangle) = \{ \langle 7, 3, 1, 0, 2, 1 \rangle, \langle 7, 3, 0, 10, 1, 2 \rangle \}$
      Problem: check all reachable states from $s_0 \in S$ using $\mathit{next\_state} : S \to 2^S$ with $S = ◆^k$ ((implicit-)graph search)
      Basis for checking LTL/CTL and timed/probabilistic systems!
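The successor function for the two-process example can be sketched in C as follows. The slide only shows the successors of $s_0$; the loop-exit rule, the jump back to the loop head, and the program counter value `DONE` for a finished process are assumptions made for illustration, as are the names `state_t` and the `out` parameter.

```c
#include <stddef.h>

/* State vector <x, y, a, b, pc1, pc2> from the slide. */
typedef struct { int x, y, a, b, pc1, pc2; } state_t;

#define DONE 3  /* assumed program counter value for a finished process */

/* Writes the successors of s into out (room for 2) and returns how many. */
size_t next_state(state_t s, state_t out[2]) {
    size_t n = 0;

    /* Left process: "1: for (int a = 1 .. 10)   2: x += y;" */
    if (s.pc1 == 1) {
        state_t t = s;
        if (s.a < 10) { t.a = s.a + 1; t.pc1 = 2; }  /* enter the loop body */
        else          { t.pc1 = DONE; }              /* assumed loop exit */
        out[n++] = t;
    } else if (s.pc1 == 2) {
        state_t t = s;
        t.x = s.x + s.y;
        t.pc1 = 1;                                   /* assumed: back to loop head */
        out[n++] = t;
    }

    /* Right process: "1: int b = y + x;   2: y = 2 * b;" */
    if (s.pc2 == 1) {
        state_t t = s;
        t.b = s.y + s.x;
        t.pc2 = 2;
        out[n++] = t;
    } else if (s.pc2 == 2) {
        state_t t = s;
        t.y = 2 * s.b;
        t.pc2 = DONE;
        out[n++] = t;
    }
    return n;
}
```

Applied to $s_0 = \langle 7, 3, 0, 0, 1, 1 \rangle$, this yields the two successors listed on the slide, one per process. Each process contributes at most one successor, so a buffer of two states suffices.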

  16. Overview
      1. Reachability with Shared Hash Table
      2. Tree Compression
      3. Symbolic Reachability with Decision Diagrams

  18. Static partitioning or shared hash table
      [Figure: on the left, Workers 1 to 4, each with its own store, connected by queues; on the right, Workers 1 to 4, each with its own queue, around a single shared store, with a load balancer]
      Static partitioning
      X On-the-fly (BFS)
      ± Scalability (queue contention)
      Shared hash table
      ✓ (Pseudo) DFS & BFS
      ? Scalability
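In a statically partitioned design, a hash of the state typically decides which worker owns it, and a newly generated successor is sent to the owner's queue. The following is a minimal sketch of that ownership rule, assuming states fit in 64 bits. The mixing function `mix` and the name `owner_of` are hypothetical and do not come from the slides.

```c
#include <stdint.h>

/* Hypothetical mixing function; any well-distributed hash would do. */
static uint64_t mix(uint64_t x) {
    x ^= x >> 33;
    x *= 0xff51afd7ed558ccdULL;
    x ^= x >> 33;
    return x;
}

/* Static partitioning: the hash alone decides which worker owns a state,
 * so every worker computes the same owner without any communication. */
unsigned owner_of(uint64_t state, unsigned workers) {
    return (unsigned)(mix(state) % workers);
}
```

Because ownership is fixed by the hash, a worker that generates a state it does not own has to hand it over through a queue, which is where the queue contention noted on the slide comes from.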

  20. Shared hash table

          procedure search(p)
              while balance(Q) do                  ⊲ with termination detection
                  s := s ∈ Q_p;  Q_p := Q_p \ {s}
                  for all s′ ∈ next_state(s) do
                      if s′ ∉ V then               ⊲ atomic V.find-or-put(s′)
                          V := V ∪ {s′}
                          Q_p := Q_p ∪ {s′}

          procedure reach(s_0, P)
              V := {s_0}
              Q_1 := {s_0}
              search(1) ∥ ... ∥ search(P)
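The atomic find-or-put step can be sketched with C11 atomics. This is a simplified illustration, not the design from the cited work: it assumes states are nonzero 64-bit values stored directly in the buckets, uses plain linear probing, and never resizes. The table size, the hash function `hash64`, and the return codes are all hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

#define TABLE_SIZE 1024u  /* assumed capacity; must be a power of two */
#define EMPTY 0u          /* assumption: 0 is never a valid state */

static _Atomic uint64_t table[TABLE_SIZE];

/* Hypothetical mixing function; any well-distributed hash would do. */
static uint64_t hash64(uint64_t x) {
    x ^= x >> 33;
    x *= 0xff51afd7ed558ccdULL;
    x ^= x >> 33;
    return x;
}

/* Returns 1 if state was already present, 0 if this call inserted it,
 * and -1 if the table is full. Safe to call from several threads. */
int find_or_put(uint64_t state) {
    uint64_t h = hash64(state);
    for (uint64_t i = 0; i < TABLE_SIZE; i++) {
        _Atomic uint64_t *bucket = &table[(h + i) & (TABLE_SIZE - 1)];
        uint64_t seen = atomic_load(bucket);
        if (seen == EMPTY) {
            uint64_t expected = EMPTY;
            /* Claim the empty bucket; only one thread can win the CAS. */
            if (atomic_compare_exchange_strong(bucket, &expected, state))
                return 0;
            seen = expected;  /* another thread filled it first */
        }
        if (seen == state)
            return 1;
        /* Bucket holds a different state: probe the next one. */
    }
    return -1;
}
```

In the search procedure, a worker would enqueue a successor only when `find_or_put` returns 0. Since the compare-and-swap lets exactly one thread win an empty bucket, each state is enqueued by exactly one worker.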

  23. Lockless Hash Table: Design
      Laarman, van de Pol, Weber [fmcad10]
      Main bottlenecks
      • State store: concurrent access
      • Graph traversal: random memory access (bandwidth / latency)
      Design
      • Open addressing
      • Hash memoization
      • Walking the Line
      • In-situ locking
      [Figure: bucket array and data array, annotated with |state| and |cache line|]
