last time

Last Time: Response time analysis, blocking terms, priority inversion



  1. Last Time
     • Response time analysis
     • Blocking terms
     • Priority inversion
        • And solutions
     • Release jitter
     • Other extensions

  2. Today
     • Timing analysis
     • Answers a question we commonly ask:
        • At most how long can this code take to run?
     • Response time over CAN
     • Worst-case message times
     • Holistic scheduling

  3. Timing Analysis Definitions
     • Worst-case execution time (WCET): Longest execution time of a program on a given platform, considering all possible inputs
     • Precise timing analysis problem: Compute the WCET
     • Undecidable in general (the halting problem reduces to it)
        • Though not in practice
        • But still too hard
     • Timing analysis problem: Compute a conservative estimate of the WCET
     • I.e., the estimate of the WCET can be > the true WCET
     • This is decidable
        • A correct analyzer could always return ∞

  4. Timing Analysis by Testing
     • WCET is often estimated by looking for the maximum execution time over many executions
     • This is easy
     • However, it does not solve the problem!
     [Figure: number of executions vs. execution time, marking longest observed ET #1, longest observed ET #2, the true WCET, and the WCET estimate]

  5. Timing Analysis by Testing
     • Always true:
        • Longest observed ET ≤ true WCET ≤ WCET estimate
     • Question: What is the requirement for correctly estimating WCET using testing?
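The inequality above means a test campaign can only ever report a lower bound on the WCET. A minimal sketch of that idea follows; the function names `longest_observed` and `bound_refuted` are hypothetical, and the sample values used with them are made up.

```c
#include <stddef.h>

/* Longest execution time seen in a set of measurements (hypothetical helper).
   The result is only a lower bound on the true WCET. */
unsigned long longest_observed(const unsigned long *et, size_t n)
{
    unsigned long max = 0;
    for (size_t i = 0; i < n; i++) {
        if (et[i] > max)
            max = et[i];
    }
    return max;
}

/* Testing can refute a claimed bound but never confirm it:
   returns 1 if some observed run already exceeds the claimed WCET. */
int bound_refuted(const unsigned long *et, size_t n, unsigned long claimed_wcet)
{
    return longest_observed(et, n) > claimed_wcet;
}
```

A return value of 0 from `bound_refuted` says nothing about whether the claimed bound is safe; it only means no counterexample was observed.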

  6. Static Timing Analysis
     • Static timing analysis: Estimate WCET without running a program
     • Problem 1: Can't do this from source code
        • Which variables go into registers?
        • Which functions are inlined?
        • Which switches become jump tables vs. cascaded tests?
     • Solution: Analyze compiler output
     • Problem 2: Understanding what's going on in HW
        • Where are the branch mispredicts?
        • Where are the icache / dcache misses?
     • Solution: Build a model of the hardware

  7. Static Timing Analysis

       int foo1 (int a, int b) {
         int c = b + 31*a;
         int e = 120 - c - a;
         return e;
       }

       link    a6,#0
       move.l  8(a6),d0
       moveq   #-32,d1
       muls.l  d1,d0
       sub.l   12(a6),d0
       addi.l  #120,d0
       unlk    a6
       rts

     • What does it take to estimate WCET of this code?
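For straight-line code like foo1 there is only one path, so on a simple processor a bound is the sum of the per-instruction worst-case costs. The sketch below assumes such a cost table exists; the cycle counts are invented placeholders, not values from any datasheet, and both function names are hypothetical.

```c
#include <stddef.h>

/* Worst-case cycles of a straight-line instruction sequence: with no
   branches there is exactly one path, so the bound is the sum of the
   per-instruction worst-case costs. */
unsigned straight_line_wcet(const unsigned *cycles, size_t n)
{
    unsigned total = 0;
    for (size_t i = 0; i < n; i++)
        total += cycles[i];
    return total;
}

/* Placeholder costs for link, move.l, moveq, muls.l, sub.l, addi.l,
   unlk, rts. These numbers are invented for illustration only. */
unsigned foo1_wcet_example(void)
{
    static const unsigned cycles[8] = { 2, 3, 1, 18, 3, 1, 2, 5 };
    return straight_line_wcet(cycles, 8);
}
```

On hardware with caches or a deep pipeline the cost of an instruction depends on the machine state, so a flat table like this would no longer be enough.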

  8. Analyzing Branches

       void foo2 (int a) {
         if (a) {
           x += 3*a;
         } else {
           y -= x-a;
         }
       }

       link    a6,#0
       move.l  8(a6),d2
       tst.l   d2
       beq.s   *+16
       moveq   #3,d0
       muls.l  d0,d2
       add.l   d2,_x
       bra.s   *+24
       move.l  _x,d1
       sub.l   d2,d1
       move.l  _y,d0
       sub.l   d1,d0
       move.l  d0,_y
       unlk    a6
       rts

  9. Analyzing Loops

       void foo3 (int a) {
         do {
           y++;
         } while (a--);
       }

       link    a6,#0
       move.l  8(a6),d2
       move.l  _y,d1
       move.l  d2,d0
       addq.l  #1,d1
       subq.l  #1,d2
       tst.l   d0
       bne.s   *-8
       move.l  d1,_y
       unlk    a6
       rts

  10. Loop Analysis Strategies
     1. Programmer annotates loops with bounds
        • Not very fun
        • Doesn't work well for library code
        • However, could argue that in critical software the programmer should always know the loop bounds
     2. Analyzer tries to figure out loop bounds
        • Doesn't always work
        • Derived bound might be too high
     • Reasonable answer: Analyzer figures out simple loops, programmer annotates difficult ones
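Once a loop bound is known, from an annotation or from the analyzer, the loop's contribution is the bound times the worst-case body cost. A minimal sketch, with hypothetical function names and made-up cycle counts, using the foo3 loop from slide 9 (whose do/while body runs a+1 times when 0 <= a <= a_max):

```c
/* WCET of a loop, given costs for the code before it, one iteration of
   its body (including the loop test), the code after it, and an upper
   bound on the number of iterations. */
unsigned long loop_wcet(unsigned long before, unsigned long body,
                        unsigned long max_iter, unsigned long after)
{
    return before + body * max_iter + after;
}

/* Iteration bound for foo3 under the assumed annotation 0 <= a <= a_max:
   the body runs once, then once more for each nonzero value of a. */
unsigned long foo3_iterations(unsigned long a_max)
{
    return a_max + 1;
}
```

For example, with an assumed annotation a <= 9 and placeholder costs of 5, 4, and 3 cycles, `loop_wcet(5, 4, foo3_iterations(9), 3)` gives 5 + 4*10 + 3 = 48 cycles. Without a bound on a, no finite estimate is possible.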

  11. Bottom-Up WCET Analysis

       void foo4 (void) {
         if (y==0) {
           if (x>5) {
             x++;
           } else {
             x = 1;
           }
         } else {
           x *= 3;
         }
       }

     Cost tree, computed from the leaves up (each Return costs 3):

       foo4:                15 + 3 = 18
         if y == 0:         max(7, 13) + 2 = 15
           if x > 5:        max(5, 4) + 2 = 7
             x++:           3 + 2 = 5
             x = 1:         3 + 1 = 4
           x *= 3:          3 + 10 = 13
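The bottom-up computation can be sketched as a recursive walk over a small program tree: a basic block costs its own cycles, a sequence costs the sum of its parts, and a conditional costs its test plus the more expensive branch. The node type and function names below are hypothetical; the cycle counts are the ones from the slide.

```c
#include <stddef.h>

typedef enum { NODE_BLOCK, NODE_SEQ, NODE_IF } node_kind;

typedef struct node {
    node_kind kind;
    unsigned cost;             /* BLOCK: cycles of the block; IF: cycles of the test */
    const struct node *a, *b;  /* SEQ: first, second; IF: then-branch, else-branch */
} node;

/* Bottom-up WCET: children are evaluated first, then combined. */
unsigned wcet(const node *n)
{
    if (n == NULL)
        return 0;
    switch (n->kind) {
    case NODE_BLOCK:
        return n->cost;
    case NODE_SEQ:
        return wcet(n->a) + wcet(n->b);
    case NODE_IF: {
        unsigned t = wcet(n->a), e = wcet(n->b);
        return n->cost + (t > e ? t : e);
    }
    }
    return 0;
}

/* foo4 from the slide: entry 3, each test 2, x++ 2, x = 1 costs 1,
   x *= 3 costs 10, each return 3. */
unsigned foo4_wcet(void)
{
    static const node ret   = { NODE_BLOCK, 3, NULL, NULL };
    static const node inc   = { NODE_BLOCK, 2, NULL, NULL };
    static const node set   = { NODE_BLOCK, 1, NULL, NULL };
    static const node mul   = { NODE_BLOCK, 10, NULL, NULL };
    static const node inc_r = { NODE_SEQ, 0, &inc, &ret };
    static const node set_r = { NODE_SEQ, 0, &set, &ret };
    static const node mul_r = { NODE_SEQ, 0, &mul, &ret };
    static const node if_x  = { NODE_IF, 2, &inc_r, &set_r };
    static const node if_y  = { NODE_IF, 2, &if_x, &mul_r };
    static const node entry = { NODE_BLOCK, 3, NULL, NULL };
    static const node body  = { NODE_SEQ, 0, &entry, &if_y };
    return wcet(&body);
}
```

`foo4_wcet()` reproduces the slide's result of 18. Taking the maximum at every conditional is safe but can be pessimistic when the two worst branches can never occur on the same path.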

  12. Real-World Problems
     • Timing models for complex processors are difficult to create
        • Probably impossible for processors like the Pentium 4
           • Not even Intel knows!
        • However, easy for ColdFire, ARM, and lots of others
     • Caches, TLBs, and branch predictors enormously complicate WCET analysis
        • Need to estimate cache, TLB, and predictor state at every program point
        • Difficult, imprecise, and computationally expensive
     • Pointers and heap allocation are very difficult to analyze
        • But critical software typically doesn't do much of these

  13. Hardware Horror Story
     • Start with some simple code:
     • Measure time per loop iteration for k = 1..32

  14. Result on NEC V850E

  15. Result on Pentium III

  16. Result on Athlon

  17. Commercial Timing Analysis
     • aiT from AbsInt
        • Supports ARM7TDMI, ColdFire 5307, PowerPC 755, MPC5xx
     • Analysis steps:
        • Reconstruct control flow from object code
        • Value analysis: Computation of address ranges for instructions accessing memory
        • Cache analysis: Classification of memory references as cache misses or hits
        • Pipeline analysis: Predicting the behavior of the program on the processor pipeline
        • Path analysis: Determination of the worst-case execution path of the program
        • Analysis of loops and recursive procedures

  18. Making Predictable Systems
     • Avoid recursion
     • Avoid deeply nested loops
     • Avoid if/else where one branch is a lot faster than the other
     • Avoid data-dependent loops
        • Use a fixed iteration count whenever possible
     • Avoid variable-time data structures
        • E.g. hash tables are usually very unpredictable
     • Avoid unpredictable thread blocking
        • E.g. on disk, network, etc.
     • Avoid unpredictable processors
        • This is any processor that is much faster than its memory
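As one illustration of the "fixed iteration count" advice, the sketch below contrasts a table search that exits early with one that always scans the whole table. The table size and function names are hypothetical, and whether the second version compiles to branch-free code depends on the compiler, so the object code still has to be checked.

```c
#include <stddef.h>

#define TABLE_SIZE 16  /* hypothetical fixed table size */

/* Data-dependent: stops at the first match, so the number of
   iterations varies with the input. */
int find_early_exit(const int table[TABLE_SIZE], int key)
{
    for (size_t i = 0; i < TABLE_SIZE; i++)
        if (table[i] == key)
            return (int)i;
    return -1;
}

/* Fixed iteration count: always scans all TABLE_SIZE entries and
   keeps the index of the first match, or -1 if there is none. */
int find_fixed_count(const int table[TABLE_SIZE], int key)
{
    int found = -1;
    for (size_t i = 0; i < TABLE_SIZE; i++) {
        int hit = (table[i] == key) & (found < 0);
        found = hit ? (int)i : found;
    }
    return found;
}
```

Both functions return the same result; the second trades average-case speed for an iteration count that does not depend on the data, which keeps the WCET estimate close to the typical run time.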

  19. Timing Analysis Summary
     • WCET estimation for simple hardware + simple software
        • Largely a solved problem
        • Technology far less mature than e.g. compiler technology
     • WCET estimation for complex hardware + complex software
        • Open problem
        • May not be possible
        • May not even be a good idea
           • I.e. nobody cares about the WCET of spell check in MS Word
        • Products exist
     • What kind of chip should one use for a high-performance, time-critical embedded system?

  20. Time Guarantees over CAN
     • Basic idea:
        • Processors are scheduled using priorities
        • We know the WCET of tasks
        • CAN is scheduled using priorities
        • We know the WCTT (worst-case transmission time) of messages
        • Can put it all together using "holistic scheduling"
     • Why do we care?
        • The accelerometer on your pickup is on the CAN bus
        • The airbags are also on the CAN bus
        • Want to guarantee airbag deployment within 150 ms of when you start to roll the truck
           • Even if lots of other stuff is going over the bus

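  21. Modeling the CAN Bus
     • Recall:
        • A CAN message stores 0-64 bits of data
        • 47 bits of message overhead
        • Bit stuffing occurs: in the worst case, every 5 bits has a 6th added
        • 34 of the overhead bits are subject to stuffing
     • For an n-byte message:
        • Worst-case number of stuff bits = $\lfloor (34 + 8n - 1)/4 \rfloor$ (a stuff bit can itself start the next run of equal bits, so after the first one a stuff bit can follow every 4 original bits)
     • Worst-case transmission time of message $i$ carrying $s_i$ data bytes:
        $C_i = \left( 8 s_i + 47 + \left\lfloor \frac{34 + 8 s_i - 1}{4} \right\rfloor \right) \tau_{bit}$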
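The frame-length formula is easy to check numerically. The sketch below computes the worst-case frame length as data bits plus 47 overhead bits plus floor((34 + 8n - 1)/4) stuff bits; the function names are hypothetical, and the bit time is passed in nanoseconds as an assumed unit.

```c
#include <assert.h>

/* Worst-case length in bits of a CAN frame carrying n_bytes of data:
   data bits + 47 overhead bits + worst-case stuff bits. */
unsigned can_worst_case_bits(unsigned n_bytes)
{
    assert(n_bytes <= 8);                            /* a CAN frame holds 0..8 data bytes */
    unsigned stuff = (34 + 8 * n_bytes - 1) / 4;     /* integer division acts as floor */
    return 8 * n_bytes + 47 + stuff;
}

/* Worst-case transmission time, given the bit time in nanoseconds. */
unsigned long can_worst_case_time_ns(unsigned n_bytes, unsigned long bit_time_ns)
{
    return (unsigned long)can_worst_case_bits(n_bytes) * bit_time_ns;
}
```

For an 8-byte frame this gives 64 + 47 + 24 = 135 bits, which at an assumed 500 kbit/s (2000 ns per bit) is 270 microseconds.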

  22. Modeling CAN
     • The priority of a message is determined by the message type
     • Each message must have a minimum period $T_i$
     • Blocking term $B_i = 135\,\tau_{bit}$ (the worst-case length of an 8-byte frame)
     • Release jitter $J_i$ is equal to the queuing delay
        • Usually equal to the worst-case response time of the task that queues the message
     • Now we just reuse the processor scheduling equation!
        $R_i = C_i + B_i + \sum_{\forall j \in hp(i)} \left\lceil \frac{R_i + J_j}{T_j} \right\rceil C_j$
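Because R_i appears on both sides, the recurrence R_i = C_i + B_i + sum over higher-priority j of ceil((R_i + J_j)/T_j) * C_j is solved by iterating from R_i = C_i + B_i until the value stops changing. A minimal sketch follows; the struct, the function names, and the `limit` parameter (a deadline used to stop a diverging iteration) are assumptions for illustration, and all times are in one common unit.

```c
#include <stddef.h>

typedef struct {
    unsigned long C;  /* worst-case transmission time */
    unsigned long T;  /* minimum period */
    unsigned long J;  /* release jitter */
} can_msg;

static unsigned long ceil_div(unsigned long a, unsigned long b)
{
    return (a + b - 1) / b;
}

/* Fixed-point iteration for the response time of a message with cost C
   and blocking B, given the n_hp higher-priority messages in hp.
   Returns the converged response time, or 0 if it exceeds limit. */
unsigned long response_time(unsigned long C, unsigned long B,
                            const can_msg *hp, size_t n_hp,
                            unsigned long limit)
{
    unsigned long r = C + B;
    for (;;) {
        unsigned long next = C + B;
        for (size_t j = 0; j < n_hp; j++)
            next += ceil_div(r + hp[j].J, hp[j].T) * hp[j].C;
        if (next > limit)
            return 0;       /* not schedulable within the limit */
        if (next == r)
            return r;       /* converged */
        r = next;
    }
}
```

With made-up values C = 3, B = 2 and two higher-priority messages (C=1, T=4, J=0) and (C=2, T=10, J=1), the iterates are 5, 9, 10, 12, 12, so the response time is 12. The iteration is monotonically non-decreasing, so it either converges or crosses the limit.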

  23. Modeling the Whole CAN Network
     • How to compute the minimum queuing time of a message?
        • Could be clever, but 0 is safe
     • How to compute the maximum queuing time of a message?
        • Equal to the worst-case response time of the task that queues the message
     • How to compute the release jitter of a task that awaits a message $i$?
        • $J_{dest(i)} = R_i - 47\,\tau_{bit}$
     • This is nice but there's a problem…
        • Circular dependency between task and message jitters

  24. Holistic Analysis
     • Solution to circular dependencies:
        • Start out with all jitters set to zero
        • Iterate between processor and network scheduling until convergence
        • This is the same trick we used to solve the dependency of task response time on itself last lecture
     • Finally: We have the worst-case response time for every task and message
        • …and we can figure out if the airbag deploys on time
     • What if response times are too long?
        • Can fiddle with message priorities
        • Finding an optimal priority ordering (that minimizes global response times) is NP-hard
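The iteration described above can be sketched as two nested fixed points: an inner one that solves each response time for the current jitters, and an outer one that recomputes jitters from those response times until nothing changes. Everything below is a hypothetical illustration: tasks and messages share one `entity` type, all times are in one common unit, `best_case` stands in for the shortest delay subtracted from the source's response time (47 bit times for a task awaiting a message), and the example numbers are made up. Following the recurrence on slide 22, R does not add the entity's own jitter.

```c
#include <stddef.h>

#define NO_SOURCE ((size_t)-1)

typedef struct {
    int resource;            /* which CPU or bus the entity is scheduled on */
    int priority;            /* smaller value = higher priority */
    unsigned long C, T, B;   /* cost, minimum period, blocking term */
    size_t jitter_src;       /* entity whose R sets this jitter, or NO_SOURCE */
    unsigned long best_case; /* shortest delay, subtracted from the source's R */
    unsigned long J, R;      /* outputs: release jitter, response time */
} entity;

static unsigned long ceil_div(unsigned long a, unsigned long b)
{
    return (a + b - 1) / b;
}

/* Inner fixed point for entity i with the current jitters.
   Returns 0 if the response time exceeds limit. */
static int solve_one(entity *e, size_t n, size_t i, unsigned long limit)
{
    unsigned long r = e[i].C + e[i].B;
    for (;;) {
        unsigned long next = e[i].C + e[i].B;
        for (size_t j = 0; j < n; j++) {
            if (j != i && e[j].resource == e[i].resource &&
                e[j].priority < e[i].priority)
                next += ceil_div(r + e[j].J, e[j].T) * e[j].C;
        }
        if (next > limit)
            return 0;
        if (next == r) {
            e[i].R = r;
            return 1;
        }
        r = next;
    }
}

/* Outer fixed point: start with all jitters at zero, then alternate
   between solving response times and updating jitters. */
int holistic(entity *e, size_t n, unsigned long limit)
{
    for (size_t i = 0; i < n; i++)
        e[i].J = 0;
    for (;;) {
        int changed = 0;
        for (size_t i = 0; i < n; i++)
            if (!solve_one(e, n, i, limit))
                return 0;
        for (size_t i = 0; i < n; i++) {
            size_t s = e[i].jitter_src;
            if (s == NO_SOURCE)
                continue;
            unsigned long j =
                e[s].R > e[i].best_case ? e[s].R - e[i].best_case : 0;
            if (j != e[i].J) {
                e[i].J = j;
                changed = 1;
            }
        }
        if (!changed)
            return 1;
    }
}

/* Made-up system with a circular dependency: m1 -> task B -> task C
   -> m2 -> m1. Returns R of the chosen entity, or 0 on failure. */
unsigned long holistic_example(size_t which)
{
    entity e[5] = {
        /* res prio C   T  B  jitter_src best J  R */
        { 0, 1,  8, 20, 0, NO_SOURCE, 0, 0, 0 },  /* 0: task A on CPU 0 */
        { 2, 2,  3, 20, 3, 0,         0, 0, 0 },  /* 1: message m1, queued by A */
        { 1, 1,  3, 20, 0, 1,         1, 0, 0 },  /* 2: task B on CPU 1, awaits m1 */
        { 1, 2, 10, 20, 0, NO_SOURCE, 0, 0, 0 },  /* 3: task C on CPU 1 */
        { 2, 1,  3, 20, 3, 3,         0, 0, 0 },  /* 4: message m2, queued by C */
    };
    if (!holistic(e, 5, 20))
        return 0;
    return e[which].R;
}
```

In this example the first pass gives m1 a response time of 9 and task C one of 13; feeding those back as jitters raises them to 12 and 16, and a third pass confirms convergence. The `limit` check is what guarantees termination when the system is overloaded.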

  25. CAN Bus Scheduling Summary
     • Can reason about an entire network of processors plus their network using holistic scheduling
        • Pretty cool result
        • This is used in practice
