Hunting Deadlocks Efficiently in Micro-Architectural Models of Communication Fabrics


  1. Hunting Deadlocks Efficiently in Micro-Architectural Models of Communication Fabrics Freek Verbeek and Julien Schmaltz

  2. Growing number of cores (W. Tichy, Keynote ICST 2011) • AMD Opteron, 12 cores: ~1.8 billion transistors on 2 x 3.46 cm² • Sun Niagara3, 16 cores: ~1 billion transistors on 3.7 cm² • Intel, 8 cores: ~2.3 billion transistors on 6.8 cm² • Intel SCC, 48 cores: ~1.3 billion transistors on 5.6 cm² • Intel, 4 cores: ~582 million transistors on 2.86 cm² • Intel Research, 80 cores: ~100 million transistors on 2.75 cm² • Tilera TILEPro64, 64 cores • Intel, 2 cores: ~167 million transistors on 1.1 cm² • Usual: verify cores, verify interconnect

  3. Networks-on-Chips: Example 1, HERMES The topology: • Two-dimensional mesh

  4. Networks-on-Chips: Example 1 The routing function: • XY: simple deterministic routing algorithm • First route to the destination column and then to the correct row • No cyclic dependencies and thus deadlock-free
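
As an illustration of the XY routing rule on this slide, here is a minimal Python sketch. It assumes nodes are addressed as (column, row) pairs on the mesh; the function names `xy_next_hop` and `xy_route` are hypothetical and not taken from HERMES.

```python
def xy_next_hop(current, dest):
    """Return the next (column, row) node on an XY route, or None on arrival."""
    (cx, cy), (dx, dy) = current, dest
    if cx != dx:
        # First move along the row until the destination column is reached.
        return (cx + (1 if dx > cx else -1), cy)
    if cy != dy:
        # Then move along the column until the destination row is reached.
        return (cx, cy + (1 if dy > cy else -1))
    return None


def xy_route(src, dest):
    """List every node visited from src to dest, both included."""
    path = [src]
    while (nxt := xy_next_hop(path[-1], dest)) is not None:
        path.append(nxt)
    return path
```

Because the route is fully determined by source and destination, `xy_route((0, 0), (2, 1))` always visits the same nodes, which is what makes the routing function deterministic.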

  5. Networks-on-Chips: Example 1 The high-level protocol: active req! Master Slave • Masters send requests and wait for responses • Slaves produce responses when receiving requests • Deadlock-free protocol

  6. Networks-on-Chips: Example 1 The high-level protocol: active req! Master Slave • No message dependencies rsp � req ⇥ req � rsp

  7. Networks-on-Chips: Example 1 Network component: deadlock-free? • Topology • Routing function • High-level protocol • Message dependencies • ? = Deadlock-free system

  8. Networks-on-Chips: Example 1 Core distribution: Slave Slave Slave Master Master Master • Masters on the odd/slaves on the even columns

  9. Networks-on-Chips: Example 1 • Is the system deadlock-free? • No if there are at least four columns, yes otherwise • Green requests wait for blue responses (figure: requests sent by masters, responses sent by slaves)

  10. Networks-on-Chips: Example 1 Network component: cause of deadlock? • Topology • Routing function • High-level protocol • Message dependencies • = Deadlock-free system

  11. Networks-on-Chips: Example 2, Spidergon from STMicroelectronics • Topology (figure: nodes 0 to 7) • High-level protocol: req! • Design by STMicroelectronics • Simple shortest path routing algorithm • Regular for an even number of nodes • Packet, circuit, or wormhole switching • Routing logic: RelAd = (dest - current) mod 4*N; if RelAd = 0 then stop; elseif 0 < RelAd <= N then go clockwise; elseif 3*N <= RelAd <= 4*N then go counter clockwise; else go across; endif
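
The routing logic on this slide can be sketched in Python as follows. The direction test is transcribed from the slide's pseudocode for a network of 4*N nodes. The next-hop arithmetic is an assumption made for illustration: clockwise is taken to mean node + 1, counter clockwise node - 1, and across node + 2*N, all modulo 4*N. The function names are hypothetical.

```python
def spidergon_direction(current, dest, n):
    """Direction chosen at `current` for a packet to `dest` (4*n nodes)."""
    rel_ad = (dest - current) % (4 * n)
    if rel_ad == 0:
        return "stop"
    if 0 < rel_ad <= n:
        return "clockwise"
    if 3 * n <= rel_ad <= 4 * n:
        return "counterclockwise"
    return "across"


def spidergon_route(src, dest, n):
    """Nodes visited from src to dest under the assumed link layout."""
    size = 4 * n
    # Assumption: offsets of the three outgoing links of every node.
    step = {"clockwise": 1, "counterclockwise": -1, "across": 2 * n}
    path = [src]
    while (direction := spidergon_direction(path[-1], dest, n)) != "stop":
        path.append((path[-1] + step[direction]) % size)
    return path
```

With N = 2 (the eight-node network of the figure), a packet from node 0 to node 3 first goes across to node 4 and then one hop counter clockwise.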

  12. Networks-on-Chips: Example 2 Network component Cause of deadlock Routing Function 7 0 1 2 6 5 4 3

  13. Networks-on-Chips: Example 2 • Is the system deadlock-free ? Send 7 0 1 packets Idle cores 2 6 5 4 3

  14. Networks-on-Chips: Example 2 • Is the system deadlock-free ? • Yes ! None of the dependencies in the right upper quarter occur. Send 7 0 1 packets Idle cores 2 6 5 4 3

  15. Networks-on-Chips: Example 2 • Is the system deadlock-free ? Send 0 1 2 14 15 packets Idle cores 3 13 4 12 5 11 9 8 7 6 10

  16. Networks-on-Chips: Example 2 Network component: deadlock-free? • Topology • Routing function • High-level protocol • Message dependencies • Core distribution • Network size • = Deadlock-free system

  17. Networks-on-Chips: Example 3 Network component: deadlock-free? • Topology • Routing function • High-level protocol • Message dependencies • Core distribution • Network size • Queue sizes • Counter information • Virtual channel allocation • ? = Deadlock-free system

  18. Confusing ... • We need tools to (quickly) check for deadlocks – in large systems – with message dependencies – with the topology, routing and core behavior in one model – able to handle parameters such as queue size

  19. Outline • Intel's micro-architectural description language – xMAS language – Capturing high-level structure and message dependencies • Deadlock verification for xMAS – Definition of deadlocks – Labelled waiting graph – Feasible logically closed subgraph • Conclusion and future work

  20. Intel's abstraction for communication fabrics

  21. xMAS - Executable MicroArchitectural Specifications • Fair sinks and sometimes sources • Diagram is formal model • Friendly to microarchitects

  22. xMAS example (figure, animated over slides 22 to 27: queues q0, q1 and q2 carrying req and rsp packets, with a marker P moving through the network)

  28. Outline • Intel's micro-architectural description language – xMAS language – Capturing high-level structure and message dependencies • Deadlock verification for xMAS – Definition of deadlocks – Labelled dependency graph – Feasible logically closed subgraph • Conclusion and future work

  29. Formal definition of "deadlock" in xMAS • Intuition is a "dead" channel • Formal definition based on Linear Temporal Logic – Predicate logic – Temporal operators "eventually" ($\Diamond$) and "globally" ($\Box$) • Channel c is dead iff $\Diamond(c.\mathit{irdy} \land \Box \lnot c.\mathit{trdy})$
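
To make the definition concrete, the sketch below evaluates "eventually (irdy and globally not trdy)" on an ultimately periodic trace. The trace encoding is an assumption for illustration: a finite `prefix` followed by a non-empty `loop` repeated forever, each state being an `(irdy, trdy)` pair of booleans. The function name `is_dead` is hypothetical.

```python
def is_dead(prefix, loop):
    """Check eventually(irdy and globally(not trdy)) on prefix + loop^omega."""
    if not loop:
        raise ValueError("loop must contain at least one state")
    if any(trdy for _, trdy in loop):
        # trdy recurs forever, so "globally not trdy" holds at no position.
        return False
    if any(irdy for irdy, _ in loop):
        # Some loop state offers data and trdy is never raised afterwards.
        return True
    # The loop is silent: look for a prefix state that offers data and is
    # followed only by states with trdy low.
    for i, (irdy, _) in enumerate(prefix):
        if irdy and not any(trdy for _, trdy in prefix[i:]):
            return True
    return False
```

For example, a channel that transfers once and then keeps offering data without ever being accepted is dead: `is_dead([(True, True)], [(True, False)])` evaluates to `True`.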

  30. xMAS example: dead channel • Inject two requests in q0 • Fork creates two copies • One pair is sunk (figure: queues q0, q1 and q2 with req and rsp packets)

  31. General approach for deadlock detection in xMAS networks • Define Blocking Equations for all components – Equations capture the reason why a component is idle or blocking • Build a labelled waiting graph for each queue – Labels correspond to the equations – Graph captures the topology, i.e., the dependencies between the xMAS components • Search for a feasible logically closed subgraph – Corresponds to a deadlock situation – Feasibility checked using Linear Programming

  33. Blocking Equations for a join • 2 cases – output is blocked – the other input is idle • Block(u) = Idle(v) + Block(w), for a join with inputs u and v and output w

  34. Blocking Equations for a join • 2 cases – output is blocked – the other input is idle • Block(u) = Idle(v) + Block(w) • We need to know when a channel is idle!

  35. Idle equations for a fork • A fork output is idle if the input is idle or the other output is blocked • Idle(w) = Idle(u) + Block(v), for a fork with input u and outputs v and w
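
The join and fork equations can be written as plain Boolean functions. This is a minimal sketch that reads "+" as logical OR, which matches the "2 cases" wording of the slides; the function names are hypothetical.

```python
def join_block_u(idle_v, block_w):
    """Block(u) = Idle(v) + Block(w): input u of a join is blocked when the
    other input v is idle or the output w is blocked."""
    return idle_v or block_w


def fork_idle_w(idle_u, block_v):
    """Idle(w) = Idle(u) + Block(v): output w of a fork is idle when the
    input u is idle or the other output v is blocked."""
    return idle_u or block_v
```

Both equations refer to the state of neighbouring channels, which is why they have to be unfolded across the whole network rather than evaluated one component at a time.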

  37. Step 2 / labelled dependency graph (1) • Start with a message in q1 (label q1.req ≥ 1) and visit the join • Graph elements shown: start, q1, join

  38. Step 2 / labelled dependency graph (2) • Block(u) = Idle(v) + Block(w) • Analyse the join according to its Blocking Equation: we go forward to the merge and backward to the switch • Graph elements shown: start, q1 (q1.req ≥ 1), join, +, mrg2, sw

  39. Step 2 / labelled dependency graph (2) • Block(u) = Block(w) • Forwards to the switch, then the sink can never be blocked: we assume fair sinks • Graph elements shown: start, q1 (q1.req ≥ 1), join, +, mrg2, sink, false, sw

  40. Step 2 / labelled dependency graph (2) • Idle(u) = Idle(w) • Backwards to the switch • Graph elements shown: start, q1 (q1.req ≥ 1), join, +, mrg2, sink, false, sw

  41. Step 2 / labelled dependency graph (2) • Idle(u) = Idle(w) . Empty(q2) • Backwards to the queue • Note that we forgot the Block(w') case • Graph elements shown: start, q1 (q1.req ≥ 1), join, +, mrg2, sink, false, sw, q2 (q2.rsp = 0)

  42. Step 2 / labelled dependency graph (2) • Idle(w) = Idle(u) . Idle(v) • Backwards to the merge and branch • Note that branching is bad for us • Graph elements shown: start, q1 (q1.req ≥ 1), join, +, mrg2, sink, false, sw, q2 (q2.rsp = 0), mrg1

  43. Step 2 / labelled dependency graph (2) • Idle(u) = Block(v) + Idle(w) • Backwards to the merge and branch: to the source (idle if no type produced) and to the fork • Graph elements shown: start, q1 (q1.req ≥ 1), join, +, mrg2, sink, false, sw, q2 (q2.rsp = 0), mrg1, ., frk, src2, true

  44. Step 2 / labelled dependency graph (2) • Idle(u) = Idle(w) . Empty(q0) • Backwards to q0 and the source • Graph elements shown: start, q1 (q1.req ≥ 1), join, +, mrg2, sink, false, sw, q2 (q2.rsp = 0), mrg1, ., frk, src1, src2, q0 (q0.rsp = 0), false, true
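
One way to picture how such a labelled graph turns into candidate deadlock conditions is to expand it into alternative sets of queue constraints. The sketch below is an illustration under stated assumptions, not the authors' algorithm: it treats the graph as a tree, reads "+" as OR and "." as AND, and uses a hypothetical tuple encoding (`"or"`, `"and"`, `"atom"`, `"const"`). The `example` tree is a simplified stand-in loosely modelled on the slides, not the exact graph they show.

```python
from itertools import product


def cases(node):
    """Expand an OR/AND tree into a list of constraint sets (one per case)."""
    kind, payload = node
    if kind == "const":
        # true adds no constraint; false rules the branch out entirely.
        return [frozenset()] if payload else []
    if kind == "atom":
        return [frozenset([payload])]
    child_cases = [cases(child) for child in payload]
    if kind == "or":
        # Any child case is enough.
        return [case for group in child_cases for case in group]
    if kind == "and":
        # One case per child must hold together.
        return [frozenset().union(*combo) for combo in product(*child_cases)]
    raise ValueError(f"unknown node kind: {kind}")


# Hypothetical, simplified tree: q1 holds a request AND (the sink path,
# which is false under fair sinks, OR q2 holds no response).
example = ("and", [
    ("atom", "q1.req >= 1"),
    ("or", [
        ("const", False),
        ("and", [("atom", "q2.rsp = 0"), ("const", True)]),
    ]),
])
```

Each set returned by `cases(example)` is a conjunction of linear constraints on queue contents. Checking whether such a set can hold in a reachable state is the part the slides assign to Linear Programming, and it is deliberately left out of this sketch.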
