hardware modeling 3 timing anomalies

Hardware Modeling 3: Timing Anomalies, by Peter Puschner



  1. Hardware Modeling 3: Timing Anomalies. Peter Puschner. Slide credits: P. Puschner, R. Kirner, B. Huber. VU 2.0 182.101, SS 2015.

  2. Timing Anomalies: obstacles to building models that allow for a safe and tight WCET analysis.

  3. The State Explosion Problem. Modeling processor timing ⇒ state explosion: instruction timing depends on context (execution history); caches, pipelines, etc.; this holds even when using only the timing-relevant dynamic processor state (TRDPS). What is needed: reduction of the modeled state space.

  4. The Non-Locality of Instruction Timing. Def.: The TRDPS (timing-relevant dynamic processor state) contains all memory elements in the target hardware whose content influences the timing and may be modified during program execution. Complex hardware: the TRDPS space of a program may be huge ⇒ use simplified models to compute the WCET.

  5. Complexity Reduction. Safe abstraction: over-approximation by reducing the granularity of the model → the behavior of the abstract model subsumes multiple concrete behaviors, including the real one → safe by construction, but pessimistic. Simplification: approximation by eliminating execution scenarios that are considered irrelevant or non-existent → needs a proof of soundness, otherwise dangerous! Example: eliminate from the further analysis short execution paths that lead to the same HW state as a long execution path.
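The "safe abstraction" idea on slide 5 can be illustrated with a minimal Python sketch. All state names, timings, and the grouping of concrete states into abstract ones are hypothetical and chosen only for illustration; the point is that each abstract state carries the maximum over the concrete states it subsumes, so the bound is safe but possibly pessimistic.

```python
# Hypothetical concrete execution times (cycles) per concrete hardware state.
concrete_time = {"s0": 10, "s1": 12, "s2": 19, "s3": 11}

# Hypothetical abstraction: several concrete states map to one abstract state.
abstraction = {"s0": "cold", "s1": "cold", "s2": "warm", "s3": "warm"}


def abstract_bounds(concrete_time, abstraction):
    """Upper bound per abstract state: max over all subsumed concrete states."""
    bounds = {}
    for state, cycles in concrete_time.items():
        abstract_state = abstraction[state]
        bounds[abstract_state] = max(bounds.get(abstract_state, 0), cycles)
    return bounds


bounds = abstract_bounds(concrete_time, abstraction)

# Safe by construction: no concrete state exceeds the bound of its abstract state.
is_safe = all(concrete_time[s] <= bounds[abstraction[s]] for s in concrete_time)

# Pessimism: how much each concrete state is overestimated by its abstract bound.
overestimation = {s: bounds[abstraction[s]] - t for s, t in concrete_time.items()}
```

In this sketch state `s3` takes 11 cycles but is bounded by 19, which is the price paid for tracking two abstract states instead of four concrete ones.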

  6. Complexity Reduction (2). Decomposition: decompose the state space into two partitions, A and B. First, solve the local problem for partition A. Second, solve the global problem for A and B by building on the solution to the local problem (instead of modeling the state space of partition A) → needs “continuity properties”, otherwise dangerous! Example: for a processor with pipeline and cache, first analyze the cache behavior, then use the cache results to analyze the overall processor including the pipeline. If the prerequisites of neither Simplification nor Decomposition are fulfilled, we must fall back on pessimistic strategies ⇐ anomalies.

  7. The State Explosion Problem. Abstraction: “the” solution. [Figure: concrete TRDPS and abstract TRDPS]

  8. The Price of Abstraction. Concrete domain (deterministic computation): only join operations along the traces. [Figure: traces starting from the initial TRDPS]

  9. The Price of Abstraction (2). Abstract domain (non-deterministic state transfer): both join and split operations along the traces. [Figure: traces starting from the initial abstract TRDPS]

  10. The Price of Abstraction (3). Abstraction: a limited solution … [Figure: concrete TRDPS and abstract TRDPS]

  11. Reducing Complexity. Alternative to abstraction: reduce the complexity by decomposition (“divide and conquer”).

  12. Series Decomposition. Instruction sequence MN ⇒ decompose into two sequences: M and N.

  13. Series Decomposition. Analysis on control-flow graphs instead of on the set of execution traces.

  14. Parallel Decomposition. Timing depends on the TRDPS s = ⟨a, b⟩, where a is the state of one hardware component (cache) and b the state of another (pipeline) ⇒ decompose s along the hardware components of the machine used.

  15. Parallel Composition. The execution time T(I, s) of an instruction sequence I depends on the TRDPS s. [Equation not preserved in the transcript]

  16. Parallel Composition (TRDPS Partitioning). Partitioning the TRDPS between HW component A and HW component B: TRDPS = A × B. (Dublin, ECRTS'09)

  17. Parallel Composition (TRDPS Partitioning). Variant 1, Max Composition: choose a ∈ A such that the absolute delay of HW A is maximal. Variant 2, Delta Composition: choose a ∈ A such that the absolute delay of HW A is minimal and compensate by the maximal variation |a′ − a″|; a′, a″ ∈ A.
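The two composition variants of slide 17 can be sketched in Python as follows. This is one possible reading of the slide under stated assumptions, not the authors' formalization: `T` is a hypothetical table of execution times over the partitioned TRDPS A × B, `delay_A` is a hypothetical local delay of hardware component A per state, and all numbers are invented. The example table is additive (no anomalies), so both variants agree with the full-state result.

```python
from itertools import product


def full_state_bound(T, A, B):
    """Reference bound: explore every combined state <a, b>."""
    return max(T[(a, b)] for a, b in product(A, B))


def max_composition_bound(T, A, B, delay_A):
    """Max Composition: fix the state of A whose local delay is maximal."""
    a_max = max(A, key=lambda a: delay_A[a])
    return max(T[(a_max, b)] for b in B)


def delta_composition_bound(T, A, B, delay_A):
    """Delta Composition: fix the state of A whose local delay is minimal,
    then add the maximal local variation |a' - a''| of A's delay."""
    a_min = min(A, key=lambda a: delay_A[a])
    variation = max(delay_A[a] for a in A) - min(delay_A[a] for a in A)
    return max(T[(a_min, b)] for b in B) + variation


# Hypothetical example: A = cache states, B = pipeline states.
A = ["hit", "miss"]
B = ["b0", "b1"]
delay_A = {"hit": 1, "miss": 8}   # assumed local delays of component A
delay_B = {"b0": 5, "b1": 6}      # assumed local delays of component B

# Additive timing: the total is the sum of both local delays (anomaly-free case).
T = {(a, b): delay_A[a] + delay_B[b] for a, b in product(A, B)}

full = full_state_bound(T, A, B)
mc = max_composition_bound(T, A, B, delay_A)
dc = delta_composition_bound(T, A, B, delay_A)
```

Both variants only enumerate the states of B for a single chosen state of A, which is where the reduction in analysis effort comes from. When `T` is not additive, the two bounds can diverge from the full-state result.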

  18. Series Timing Anomalies

  19. Series Timing Anomalies. There are two types of series timing anomalies. Amplification (TA-S-A): ∃ s, s′ ∈ IN_M . 0 < Δ(M, s, s′) < Δ(M ∘ N, s, s′). Inversion (TA-S-I): ∃ s, s′ ∈ IN_M . Δ(M, s, s′) > 0 ∧ Δ(M ∘ N, s, s′) < 0. Auxiliary definitions: Δ(I, s, s′) is the change of execution time of instruction sequence I; IN_M is the set of reachable states at the start of instruction sequence M.
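The two conditions of slide 19 can be checked mechanically once execution times are known. The sketch below assumes that Δ(I, s, s′) means T(I, s′) − T(I, s), which is one plausible reading of "change of execution time"; the timing tables and state names are hypothetical.

```python
def delta(T, seq, s, s_prime):
    """Assumed definition: change of execution time of `seq` between start states."""
    return T[(seq, s_prime)] - T[(seq, s)]


def is_amplification(T, s, s_prime):
    """TA-S-A: 0 < delta(M) < delta(M followed by N)."""
    return 0 < delta(T, "M", s, s_prime) < delta(T, "MN", s, s_prime)


def is_inversion(T, s, s_prime):
    """TA-S-I: delta(M) > 0 and delta(M followed by N) < 0."""
    return delta(T, "M", s, s_prime) > 0 and delta(T, "MN", s, s_prime) < 0


# Hypothetical timings (cycles) for sequence M alone and for M followed by N.
T_amplified = {("M", "s"): 2, ("M", "s'"): 9, ("MN", "s"): 10, ("MN", "s'"): 20}
T_inverted = {("M", "s"): 2, ("M", "s'"): 9, ("MN", "s"): 12, ("MN", "s'"): 11}

amplified = is_amplification(T_amplified, "s", "s'")   # local +7, global +10
inverted = is_inversion(T_inverted, "s", "s'")         # local +7, global -1
```

In both tables the local change for M is +7 cycles; what differs is how that change propagates to the combined sequence.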

  20. Parallel Timing Anomalies

  21. Parallel Timing Anomalies

  22. Parallel Timing Anomalies

  23. Parallel Timing Anomalies

  24. Example of Amplification Anomaly. Out-of-order pipeline + cache + data dependencies. [Figure not preserved]

  25. Example of Inversion Anomaly. Out-of-order pipeline + cache + data dependencies. [Figure not preserved]

  26. Concrete Example for Inversion. Latency of instruction A varies by Δt = −7 cycles. Instructions: A: LD r4, 0(r3); B: ADD r5, r4, r4; C: ADD r11, r10, r10; D: MUL r12, r11, r11; E: MUL r13, r12, r12. [Figure: schedules of A to E on the resources LSU, IU and MCIU over cycles 1 to 14, once with instruction A as a cache miss and once as a cache hit; the legend distinguishes in-order and out-of-order resources.]
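The inversion on slide 26 can be reproduced qualitatively with a small scheduling simulation. The sketch below is not a reconstruction of the slide's diagram: the latencies (load hit 2, load miss 9, ADD 1, MUL 4), the one-instruction-per-cycle fetch, and the "oldest ready instruction first" policy are all assumptions made for illustration. Only the instruction sequence, its data dependencies, and the unit names are taken from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str
    unit: str          # functional unit that executes the instruction
    latency: int       # cycles the unit stays occupied (assumed values)
    deps: tuple = ()   # names of instructions whose results are needed


def simulate(program):
    """Instruction i is fetched in cycle i; each free unit starts the oldest
    fetched instruction whose operands are ready. Returns the finish times
    and the total execution time in cycles."""
    finish, unit_free_at = {}, {}
    pending = list(enumerate(program))      # (fetch cycle, instruction)
    cycle = 0
    while pending:
        for entry in list(pending):         # program order = oldest first
            fetch_cycle, ins = entry
            if fetch_cycle > cycle:
                continue                    # not fetched yet
            if unit_free_at.get(ins.unit, 0) > cycle:
                continue                    # functional unit still busy
            if any(finish.get(d, float("inf")) > cycle for d in ins.deps):
                continue                    # operand not available yet
            finish[ins.name] = cycle + ins.latency
            unit_free_at[ins.unit] = cycle + ins.latency
            pending.remove(entry)
        cycle += 1
    return finish, max(finish.values())


def program(load_latency):
    return [
        Instr("A", "LSU", load_latency),    # LD  r4, 0(r3)
        Instr("B", "IU", 1, ("A",)),        # ADD r5, r4, r4
        Instr("C", "IU", 1),                # ADD r11, r10, r10
        Instr("D", "MCIU", 4, ("C",)),      # MUL r12, r11, r11
        Instr("E", "MCIU", 4, ("D",)),      # MUL r13, r12, r12
    ]


_, hit_total = simulate(program(load_latency=2))    # A hits the cache
_, miss_total = simulate(program(load_latency=9))   # A misses the cache
```

Under these assumptions the cache hit lets B occupy the IU just before C, which delays the dependent chain C, D, E by one cycle. With the cache miss, C starts first and the chain overlaps with the load, so the locally slower case finishes earlier overall (11 cycles versus 12).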

  27. Domino Effect. Instructions: A: ADD r4, r3, r3; B: SW r4, 0x0; C: MUL r10, r4, r4; D: LW r3, 0x8; E: ADD r11, r10, r10. [Figure: schedules of repeated iterations of A to E on the resources IU, LSU and MCIU over cycles 1 to 38, once with an initially empty pipeline and once with the first instruction delayed by one cycle; the legend distinguishes in-order and out-of-order resources.] Result: an extra delay of 1 cycle in each iteration!

  28. Precondition for Timing Anomalies. Common to the shown patterns is a changed resource allocation sequence caused by a latency variation. Resource Allocation Criterion: a possible resource allocation decision for a hardware model is a necessary, but not sufficient, condition for the occurrence of timing anomalies. Consequence: hardware without resource allocation decisions does not allow timing anomalies to occur. Note: the occurrence of timing anomalies depends on hardware features as well as on code structure.

  29. Soundness of WCET Analysis with Parallel Timing Anomalies.

      State analysis technique | no TA-P-x | TA-P-A  | TA-P-I  | TA-P-I & TA-P-A (same b ∈ B)
      Delta Composition        | OK        | OK      | unsound | unsound
      Max Composition          | OK        | unsound | OK      | unsound
      max(DC, MC)              | OK        | OK      | OK      | unsound
      Full State               | OK        | OK      | OK      | OK

  30. Consequences of Timing Anomalies. Knowledge of the execution history is required to tightly bound the execution time. Without knowledge of the execution history (e.g., because it is too complex to analyze): pessimistic overestimations (abstractions) or potentially unsafe approximations (simplifications).

  31. How can we avoid Timing Anomalies? Eliminate the need for considering long execution histories: deactivate caches; use synchronization points; choose a more predictable HW platform. Change the code structure to ensure that timing anomalies cannot take place, e.g., by code reordering or instruction insertion.

  32. Summary. So far, no feasible check for timing anomalies is known. Extend code generators to produce SW patterns that avoid timing anomalies. Develop more predictable systems: hardware components (e.g., scratchpads instead of caches, decisions by the compiler instead of the processor) and adequate software design patterns (e.g., time-triggered (static) actions).

  33. Further Reading. Henrik Theiling, Christian Ferdinand, and Reinhard Wilhelm. Fast and Precise WCET Prediction by Separate Cache and Path Analyses. Real-Time Systems 18(2/3), Kluwer, 2000. Raimund Kirner and Martin Schöberl. Modeling the Function Cache for Worst-Case Execution Time Analysis. In Proc. 44th ACM Design Automation Conference, 2007. Raimund Kirner, Albrecht Kadlec, and Peter Puschner. Precise Worst-Case Execution Time Analysis for Processors with Timing Anomalies. In Proc. 21st Euromicro Conf. on Real-Time Systems, 2009.
