Challenges for Worst-case Execution Time Analysis of Multi-core Architectures

  1. Challenges for Worst-case Execution Time Analysis of Multi-core Architectures. Jan Reineke, Saarland University, Computer Science. Intel, Braunschweig, April 29, 2013.

  2. The Context: Hard Real-Time Systems
     Safety-critical applications:
     - Avionics, automotive, train industries, manufacturing
       - Side airbag in car: reaction in < 10 msec
       - Crankshaft-synchronous tasks: reaction in < 45 microsec
     - Embedded controllers must finish their tasks within given time bounds.
     - Developers would like to know the Worst-Case Execution Time (WCET) to give a guarantee.
     Jan Reineke, Saarland

  3. The Timing Analysis Problem
     [Diagram: Embedded Software + Microarchitecture → Timing Requirements?]

  4. What does the execution time depend on?
     - The input, determining which path is taken through the program.
     - The state of the hardware platform:
       - Due to caches, pipelining, speculation, etc.
     - Interference from the environment:
       - External interference, as seen from the analyzed task, on shared buses, caches, and memory.
     [Diagram: simple CPU connected to memory]

  5. What does the execution time depend on?
     [Diagram, extended: complex CPU (out-of-order execution, branch prediction, etc.) with L1 cache and main memory, next to the simple CPU with memory]

  6. What does the execution time depend on?
     [Diagram, extended: multi-core system with several complex CPUs, each with its own L1 cache, sharing an L2 cache and main memory]

  7. Example of Influence of Microarchitectural State
     Source statement: x = a + b;
     Instruction sequence (PowerPC 755):
         LOAD r2, _a
         LOAD r1, _b
         ADD  r3, r2, r1

  8. Example of Influence of Co-running Tasks in Multicores
     Radojkovic et al. (ACM TACO, 2012), on Intel Atom and Intel Core 2 Quad: up to 14x slowdown due to interference on the shared L2 cache and memory controller.

  9. Challenges
     1. Modeling: How to construct sound timing models?
     2. Analysis: How to precisely and efficiently bound the WCET?
     3. Design: How to design microarchitectures that enable precise and efficient WCET analysis?

  10. The Modeling Challenge
      [Diagram: Microarchitecture → ? → Timing Model]
      Timing model = formal specification of the microarchitecture's timing.
      Incorrect timing model → possibly incorrect WCET bound.

  15. Current Process of Deriving Timing Model
      [Diagram: Microarchitecture → ? → Timing Model]
      → Time-consuming, and
      → error-prone.

  18. Future Process of Deriving Timing Model (1)
      [Diagram: Microarchitecture → VHDL Model → Timing Model]
      Derive the timing model automatically from a formal specification of the microarchitecture.
      → Less manual effort, thus less time-consuming, and
      → provably correct.

  22. Future Process of Deriving Timing Model (2)
      [Diagram: Microarchitecture → Perform measurements on hardware → Infer model → Timing Model]
      Derive the timing model automatically from measurements on the hardware, using ideas from automata learning.
      → No manual effort, and
      → (under certain assumptions) provably correct.
      → Also useful to validate assumptions about the microarchitecture.

  26. Proof-of-concept: Automatic Modeling of the Cache Hierarchy
      - The cache model is an important part of the timing model.
      - It can be characterized by a few parameters:
        - ABC: associativity, block size, capacity
        - Replacement policy
      [Diagram: N cache sets, each holding A (tag, data) entries with data blocks of size B. A = associativity, B = block size, N = number of cache sets.]
      chi [Abel and Reineke, RTAS 2013] derives all of these parameters fully automatically.
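
The relation between the parameters on this slide can be written down directly: with capacity C, associativity A, and block size B, the number of sets is N = C / (A × B). The sketch below is only an illustration of that arithmetic. The class name `CacheGeometry`, the modulo placement of blocks into sets, and the example values are assumptions made for illustration; they are not taken from the chi tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheGeometry:
    associativity: int   # A: number of ways per set
    block_size: int      # B: bytes per cache block (line)
    capacity: int        # C: total data bytes in the cache

    @property
    def num_sets(self) -> int:
        # N = C / (A * B); the division must be exact for a well-formed geometry.
        sets, remainder = divmod(self.capacity, self.associativity * self.block_size)
        if remainder:
            raise ValueError("capacity is not a multiple of A * B")
        return sets

    @property
    def way_size(self) -> int:
        # Bytes covered by one way: N * B = C / A.
        return self.capacity // self.associativity

    def set_index(self, address: int) -> int:
        # Assumption: common modulo placement (block number modulo N).
        return (address // self.block_size) % self.num_sets


# Hypothetical example values: 32 KB capacity, 8 ways, 64-byte blocks.
l1 = CacheGeometry(associativity=8, block_size=64, capacity=32 * 1024)
print(l1.num_sets, l1.way_size, l1.set_index(0x1040))
```

Keeping N and the way size as derived properties means the three ABC parameters stay the single source of truth, which matches how the slide characterizes a cache.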

  29. Example: Intel Core 2 Duo E6750, L1 Data Cache
      [Plot: number of L1 misses (y-axis, 0 to 90,000) against size (x-axis, 1 to 50). Annotations: Way Size = 4 KB, Capacity = 32 KB.]
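
The idea behind a misses-versus-size plot like this one can be sketched on a simulated cache: cycle over working sets of growing size and look for the size at which steady-state misses first appear. This is a minimal sketch and not the algorithm used by chi. It runs against a toy LRU simulator rather than real hardware counters, and the names `LRUCache`, `steady_state_misses`, and `infer_capacity`, as well as the 64-byte block size and 8-way associativity, are assumptions for illustration.

```python
from collections import OrderedDict

KB = 1024


class LRUCache:
    """Toy set-associative cache with LRU replacement (simulation only)."""

    def __init__(self, capacity, associativity, block_size):
        self.block_size = block_size
        self.associativity = associativity
        self.num_sets = capacity // (associativity * block_size)
        self.sets = [OrderedDict() for _ in range(self.num_sets)]

    def access(self, address):
        """Return True on a hit, False on a miss; update the LRU order."""
        block = address // self.block_size
        lines = self.sets[block % self.num_sets]
        if block in lines:
            lines.move_to_end(block)      # mark as most recently used
            return True
        if len(lines) == self.associativity:
            lines.popitem(last=False)     # evict the least recently used block
        lines[block] = None
        return False


def steady_state_misses(make_cache, size, stride, rounds=4):
    """Misses observed after one warm-up pass when cycling over `size` bytes."""
    cache = make_cache()
    addresses = range(0, size, stride)
    for a in addresses:                   # warm-up pass, not counted
        cache.access(a)
    return sum(not cache.access(a) for _ in range(rounds) for a in addresses)


def infer_capacity(make_cache, max_size, step, stride):
    """Largest working-set size (multiple of `step`) with no steady-state misses."""
    capacity = 0
    for size in range(step, max_size + 1, step):
        if steady_state_misses(make_cache, size, stride) != 0:
            break
        capacity = size
    return capacity


def make_cache():
    # Hypothetical geometry for the simulated cache.
    return LRUCache(capacity=32 * KB, associativity=8, block_size=64)


inferred = infer_capacity(make_cache, max_size=50 * KB, step=KB, stride=64)
print(inferred // KB, "KB")
```

On real hardware the miss counts would come from performance counters and would be noisy, so a threshold rather than an exact zero test would be needed.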

  31. Replacement Policy
      Approach inspired by methods to learn finite automata, heavily specialized to the problem domain.
      Discovered a (to our knowledge) undocumented policy of the Intel Atom D525.
      [Diagram of the policy omitted]
      More information: http://embedded.cs.uni-saarland.de/chi.php
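
To give a feel for how a replacement policy can be identified from hit/miss observations alone, the sketch below distinguishes two candidate policies, LRU and FIFO, with a single access sequence. It is far simpler than the automata-learning-inspired approach named on the slide and is not chi's method. The `CacheSet` simulator and the `classify` function are hypothetical names, and the experiment assumes the associativity is already known and is at least 2.

```python
class CacheSet:
    """One simulated cache set with a pluggable replacement policy."""

    def __init__(self, associativity, policy):
        self.associativity = associativity
        self.policy = policy          # "LRU" or "FIFO"
        self.lines = []               # index 0 is the next eviction victim

    def access(self, block):
        """Return True on a hit, False on a miss."""
        if block in self.lines:
            if self.policy == "LRU":  # only LRU reorders on a hit
                self.lines.remove(block)
                self.lines.append(block)
            return True
        if len(self.lines) == self.associativity:
            self.lines.pop(0)
        self.lines.append(block)
        return False


def classify(make_set, associativity):
    """Black-box experiment: the decision uses only hit/miss observations."""
    if associativity < 2:
        raise ValueError("LRU and FIFO are indistinguishable for associativity 1")
    s = make_set()
    blocks = list(range(associativity))
    for b in blocks:          # fill the set with distinct blocks
        s.access(b)
    s.access(blocks[0])       # re-access the oldest block (a hit)
    s.access(associativity)   # a new block forces exactly one eviction
    # LRU evicted blocks[1]; FIFO evicted blocks[0].
    return "LRU" if s.access(blocks[0]) else "FIFO"


print(classify(lambda: CacheSet(4, "LRU"), 4))
print(classify(lambda: CacheSet(4, "FIFO"), 4))
```

A fixed experiment like this only separates policies that were anticipated in advance, so it could not reveal a previously undocumented policy the way a learning-based approach can.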

  32. Modeling Challenge: Future Work
      Extend automation to other parts of the microarchitecture:
      - Translation lookaside buffers, branch predictors
      - Shared caches in multicores, including their coherency protocols
      - Out-of-order pipelines?

  33. The Analysis Challenge
      [Diagram: Microarchitecture → Timing Model → Precise & Efficient Timing Analysis]
      $$\mathrm{WCET}_H(P) := \max_{i \in \mathit{Inputs}} \; \max_{h \in \mathit{States}(H)} \mathrm{ET}_H(P, i, h)$$
      - Maximum over i: consider all possible program inputs.
      - Maximum over h: consider all possible initial states of the hardware.

  34. The Analysis Challenge
      $$\mathrm{WCET}_H(P) := \max_{i \in \mathit{Inputs}} \; \max_{h \in \mathit{States}(H)} \mathrm{ET}_H(P, i, h)$$
      - Maximum over i: consider all possible program inputs.
      - Maximum over h: consider all possible initial states of the hardware.
      Explicitly evaluating ET for all inputs and all hardware states is not feasible in practice:
      - There are simply too many.
      ⇒ Need for abstraction and thus approximation!
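
The definition above can be evaluated literally only for toy cases. The sketch below does so for a made-up program and contrasts the result with a coarse abstraction that ignores the hardware state. Everything in it is an assumption for illustration: the latencies `HIT`, `MISS`, and `ALU`, the two-path program in `trace`, and the model in which the hardware state is just the set of initially cached variables.

```python
from itertools import chain, combinations

HIT, MISS, ALU = 1, 10, 1   # hypothetical latencies in cycles


def trace(i):
    """Toy program P: the input i selects the path (a list of operations)."""
    if i > 0:
        return [("load", "a"), ("load", "b"), ("load", "a"), ("alu",), ("alu",)]
    return [("load", "a"), ("alu",)]


def execution_time(i, h):
    """ET_H(P, i, h), where h is the set of variables cached initially."""
    cached, cycles = set(h), 0
    for op in trace(i):
        if op[0] == "load":
            cycles += HIT if op[1] in cached else MISS
            cached.add(op[1])         # a loaded variable stays cached
        else:
            cycles += ALU
    return cycles


def all_states(variables):
    """All subsets of `variables`: every possible initial cache content."""
    vs = list(variables)
    return chain.from_iterable(combinations(vs, r) for r in range(len(vs) + 1))


def wcet_exhaustive(inputs, variables):
    """Direct evaluation of the definition: max over inputs and states."""
    return max(execution_time(i, h) for i in inputs for h in all_states(variables))


def wcet_bound(inputs):
    """Coarse abstraction: forget the cache state, assume every load misses."""
    return max(sum(MISS if op[0] == "load" else ALU for op in trace(i)) for i in inputs)


exact = wcet_exhaustive([0, 1], ["a", "b"])
bound = wcet_bound([0, 1])
print(exact, bound)
```

In this toy the abstraction is sound (the bound is never below the exact value) but imprecise, because it cannot see that the second load of `a` always hits. The number of initial states doubles with every additional variable, which is the blow-up that makes explicit enumeration infeasible for real programs and hardware.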
