Last Time Response time analysis Blocking terms Priority inversion - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Last Time

Response time analysis
Blocking terms
Priority inversion, and solutions
Release jitter
Other extensions

slide-2
SLIDE 2

Today

Timing analysis

Answers a question we commonly ask:

  • At most how long can this code take to run?

Response time over CAN
Worst-case message times
Holistic scheduling

slide-3
SLIDE 3

Timing Analysis Definitions

Worst case execution time (WCET): Longest execution time of a program on a given platform, considering all possible inputs

Precise timing analysis problem: Compute WCET
Undecidable in general: the halting problem reduces to it
  • Though not in practice
  • But still too hard

Timing analysis problem: Compute a conservative estimate of the WCET
I.e., estimate of WCET can be > true WCET
This is decidable

  • Correct analyzer could always return
slide-4
SLIDE 4

Timing Analysis by Testing

WCET is often estimated by looking for the maximum execution time over many executions
This is easy
However, it does not solve the problem!

[Figure: histogram of number of executions vs. execution time, marking longest observed ET #1, longest observed ET #2, the true WCET, and the WCET estimate]

slide-5
SLIDE 5

Timing Analysis by Testing

Always true:

Longest observed ET ≤ true WCET ≤ WCET estimate

Question: What is the requirement for correctly estimating WCET using testing?
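A minimal sketch of why testing only gives a lower bound. The cost model `steps_for_input` is hypothetical (an assumed routine whose step count grows with its 8-bit input); it is not taken from the lecture. The longest observed value matches the true WCET only if the test inputs happen to include the worst-case input.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cost model: 10 setup steps plus 3 steps per unit of input.
 * Stands in for "execution time as a function of the input". */
static unsigned steps_for_input(unsigned char a)
{
    return 10u + 3u * (unsigned)a;
}

/* Longest execution "time" seen over the inputs a test campaign tried. */
unsigned longest_observed(const unsigned char *inputs, size_t n)
{
    assert(inputs != NULL && n > 0);
    unsigned worst = 0;
    for (size_t i = 0; i < n; i++) {
        unsigned t = steps_for_input(inputs[i]);
        if (t > worst)
            worst = t;
    }
    return worst;
}

/* True WCET of the model, found by trying every possible input.
 * Exhaustive search is only feasible because the input is 8 bits. */
unsigned true_wcet(void)
{
    unsigned worst = 0;
    for (unsigned a = 0; a <= 255u; a++) {
        unsigned t = steps_for_input((unsigned char)a);
        if (t > worst)
            worst = t;
    }
    return worst;
}
```

With test inputs {0, 7, 100} the longest observed value is 310 steps, while the true WCET of this model is 775 steps; the two agree only when input 255 is among the tests.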

slide-6
SLIDE 6

Static Timing Analysis

Static timing analysis: Estimate WCET without running a program

Problem 1: Can’t do this from source code
Which variables go into registers?
Which functions are inlined?
Which switches become jump tables vs. cascaded tests?
Solution: Analyze compiler output

Problem 2: Understanding what’s going on in HW
Where are the branch mispredicts?
Where are the icache / dcache misses?
Solution: Build model of the hardware

slide-7
SLIDE 7

Static Timing Analysis

    int foo1 (int a, int b)
    {
      int c = b + 31*a;
      int e = 120 - c - a;
      return e;
    }

Compiler output:

    link    a6,#0
    move.l  8(a6),d0
    moveq   #-32,d1
    muls.l  d1,d0
    sub.l   12(a6),d0
    addi.l  #120,d0
    unlk    a6
    rts

What does it take to estimate WCET of this code?

slide-8
SLIDE 8

Analyzing Branches

    void foo2 (int a)
    {
      if (a) {
        x += 3*a;
      } else {
        y -= x-a;
      }
    }

Compiler output:

    link    a6,#0
    move.l  8(a6),d2
    tst.l   d2
    beq.s   *+16
    moveq   #3,d0
    muls.l  d0,d2
    add.l   d2,_x
    bra.s   *+24
    move.l  _x,d1
    sub.l   d2,d1
    move.l  _y,d0
    sub.l   d1,d0
    move.l  d0,_y
    unlk    a6
    rts

slide-9
SLIDE 9

Analyzing Loops

    void foo3 (int a)
    {
      do {
        y++;
      } while (a--);
    }

Compiler output:

    link    a6,#0
    move.l  8(a6),d2
    move.l  _y,d1
    move.l  d2,d0
    addq.l  #1,d1
    subq.l  #1,d2
    tst.l   d0
    bne.s   *-8
    move.l  d1,_y
    unlk    a6
    rts

slide-10
SLIDE 10

Loop Analysis Strategies

1. Programmer annotates loops with bounds
Not very fun
Doesn’t work well for library code
However, could argue that in critical software the programmer should always know the loop bounds

2. Analyzer tries to figure out loop bounds
Doesn’t always work
Derived bound might be too high

Reasonable answer: Analyzer figures out simple loops, programmer annotates difficult ones
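One way to make a loop bound explicit is to put it in the code rather than in a separate annotation. The sketch below is a variant of the `foo3` loop with an assumed bound; `FOO3_MAX_ITERATIONS`, `foo3_bounded`, and the returned iteration count are illustrative names, not part of the lecture or of any analyzer's annotation syntax.

```c
#include <assert.h>

/* Assumed worst-case bound, chosen only for illustration. */
#define FOO3_MAX_ITERATIONS 16u

static unsigned y;  /* stands in for the global y used by foo3 */

/* Like foo3, but the loop body runs at most FOO3_MAX_ITERATIONS times
 * regardless of the value of a. Returns the number of iterations run. */
unsigned foo3_bounded(unsigned a)
{
    unsigned iterations = 0;
    do {
        y++;
        iterations++;
    } while (a-- != 0 && iterations < FOO3_MAX_ITERATIONS);

    /* The bound a timing analyzer (or a reviewer) can rely on. */
    assert(iterations <= FOO3_MAX_ITERATIONS);
    return iterations;
}
```

Without the bound the loop runs a+1 times, so its WCET depends on the largest value of `a` the caller can pass; with the bound the worst case is 16 iterations by construction.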

slide-11
SLIDE 11

Bottom-Up WCET Analysis

    void foo4 (void)
    {
      if (y==0) {
        if (x>5) {
          x++;
        } else {
          x = 1;
        }
      } else {
        x *= 3;
      }
    }

Costs, computed bottom-up over the control-flow tree:
Return: 3 (on each of the three paths)
x++ then Return: 3+2 = 5
x = 1 then Return: 3+1 = 4
x *= 3 then Return: 3+10 = 13
if x>5: max(5,4) + 2 = 7
if y == 0: max(7,13) + 2 = 15
foo4: 15+3 = 18
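The bottom-up rule on this slide (cost of an if = cost of its test + the more expensive branch) can be sketched as a recursive function over a small tree. The node layout and names below are assumptions for illustration. The costs are the ones on the slide, with 13 assigned to the `x *= 3` path, which is an inference because the extracted slide labels that node ambiguously.

```c
#include <assert.h>
#include <stddef.h>

enum node_kind { NODE_BLOCK, NODE_IF };

struct node {
    enum node_kind kind;
    unsigned cost;                   /* block: statement + return; if: the test */
    const struct node *then_branch;  /* NULL for blocks */
    const struct node *else_branch;  /* NULL for blocks */
};

/* WCET of a subtree: a block costs what it costs; an if costs its test
 * plus the worse of its two branches. */
unsigned wcet(const struct node *n)
{
    assert(n != NULL);
    if (n->kind == NODE_BLOCK)
        return n->cost;
    assert(n->then_branch != NULL && n->else_branch != NULL);
    unsigned t = wcet(n->then_branch);
    unsigned e = wcet(n->else_branch);
    return (t > e ? t : e) + n->cost;
}

/* The tree for foo4, using the costs from the slide. */
static const struct node inc_x = { NODE_BLOCK, 3 + 2,  NULL, NULL };  /* x++    */
static const struct node set_x = { NODE_BLOCK, 3 + 1,  NULL, NULL };  /* x = 1  */
static const struct node mul_x = { NODE_BLOCK, 3 + 10, NULL, NULL };  /* x *= 3 */
static const struct node if_x  = { NODE_IF, 2, &inc_x, &set_x };      /* x > 5  */
static const struct node if_y  = { NODE_IF, 2, &if_x,  &mul_x };      /* y == 0 */

/* Function entry adds 3 on top of the outer if. */
unsigned foo4_wcet(void)
{
    return wcet(&if_y) + 3u;
}
```

`foo4_wcet()` reproduces the slide's numbers: 7 for the inner if, 15 for the outer if, 18 for the function.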

slide-12
SLIDE 12

Real-World Problems

Timing models for complex processors are difficult to create
Probably impossible for processors like Pentium 4
  • Not even Intel knows!
However, easy for ColdFire, ARM, and lots of others

Caches, TLBs, branch predictors enormously complicate WCET analysis
Need to estimate cache, TLB, predictor state at every program point
Difficult, imprecise, and computationally expensive

Pointers and heap allocation are very difficult to analyze
But critical software typically doesn’t do much of these

slide-13
SLIDE 13

Hardware Horror Story

Start with some simple code:
Measure time per loop iteration for k = 1..32

slide-14
SLIDE 14

Result on NEC V850E

slide-15
SLIDE 15

Result on Pentium III

slide-16
SLIDE 16

Result on Athlon

slide-17
SLIDE 17

Commercial Timing Analysis

aiT from AbsInt

  • Supports ARM7TDMI, ColdFire 5307, PowerPC 755, MPC5xx

Analysis steps:

  • Reconstruct control flow from object code
  • Value analysis: Computation of address ranges for instructions accessing memory
  • Cache analysis: Classification of memory references as cache misses or hits
  • Pipeline analysis: Predicting the behavior of the program on the processor pipeline
  • Path analysis: Determination of the worst-case execution path of the program
  • Analysis of loops and recursive procedures
slide-18
SLIDE 18

Making Predictable Systems

Avoid recursion
Avoid deeply nested loops
Avoid if/else where one branch is a lot faster than the other

Avoid data-dependent loops
Use fixed iteration count whenever possible

Avoid variable-time data structures
E.g. hash tables are usually very unpredictable

Avoid unpredictable thread blocking
E.g. on disk, network, etc.

Avoid unpredictable processors
This is any processor that is much faster than its memory
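As an illustration of "use fixed iteration count whenever possible", here is a lookup over a small fixed-size table that always scans every slot instead of returning at the first match. `TABLE_SIZE` and `find_fixed_time` are hypothetical names; this is a sketch of the idea, not a recommendation from the lecture for any particular data structure.

```c
#include <assert.h>
#include <stddef.h>

#define TABLE_SIZE 8u  /* assumed fixed capacity */

/* Returns the index of the first slot equal to key, or -1 if absent.
 * The loop always runs exactly TABLE_SIZE times: there is no early exit,
 * so the iteration count does not depend on the data. */
int find_fixed_time(const int table[TABLE_SIZE], int key)
{
    assert(table != NULL);
    int found = -1;
    for (size_t i = 0; i < TABLE_SIZE; i++) {
        if (table[i] == key && found < 0)
            found = (int)i;
    }
    return found;
}
```

The trade-off is that the average case gets slower so that the worst case and the typical case take nearly the same path, which makes the WCET estimate tight.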

slide-19
SLIDE 19

Timing Analysis Summary

WCET estimation for simple hardware + simple software
Largely a solved problem
Technology far less mature than e.g. compiler technology

WCET estimation for complex hardware + complex software
Open problem
May not be possible
May not even be a good idea
  • E.g., nobody cares about WCET of spell check in MS Word

Products exist

What kind of chip should one use for a high-performance, time-critical embedded system?

slide-20
SLIDE 20

Time Guarantees over CAN

Basic idea:

Processors scheduled using priorities
We know WCET of tasks
CAN scheduled using priorities
We know WCTT (worst-case transmission time) of messages
Can put it all together using “holistic scheduling”

Why do we care?

Accelerometer on your pickup is on CAN bus
Airbags are also on CAN bus
Want to guarantee airbag deployment within 150 ms of when you start to roll the truck
  • Even if lots of other stuff is going over the bus
slide-21
SLIDE 21

Modeling the CAN Bus

Recall:

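CAN message stores 0-64 bits of data
47 bits of message overhead
Bit stuffing occurs: in worst case every 5 bits has a 6th added
34 of the 47 overhead bits are subject to stuffing

For an n-byte message:

WC number of stuff bits = floor ((34 + 8n - 1)/4)

The divisor is 4 rather than 5 because a stuff bit can itself begin the next run of identical bits, so after the first stuff bit one more can be needed every 4 original bits.

Worst-case transmission time of message i, with s_i data bytes and bit time \tau_{bit}:

\[ C_i = \left( 8 s_i + 47 + \left\lfloor \frac{34 + 8 s_i - 1}{4} \right\rfloor \right) \tau_{bit} \]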
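The worst-case frame length can be computed directly from the stuff-bit formula above. This is a minimal sketch; the function names and the `bit_rate_bps` parameter are assumptions introduced for illustration.

```c
#include <assert.h>

/* Worst-case frame length in bit times for a frame with n_bytes of data:
 * data bits + 47 overhead bits + worst-case stuff bits. */
unsigned can_worst_case_bits(unsigned n_bytes)
{
    assert(n_bytes <= 8u);  /* a CAN frame carries 0 to 8 data bytes */
    unsigned data_bits  = 8u * n_bytes;
    unsigned stuff_bits = (34u + data_bits - 1u) / 4u;  /* integer division = floor */
    return data_bits + 47u + stuff_bits;
}

/* Worst-case transmission time in microseconds at a given bit rate. */
unsigned long can_worst_case_us(unsigned n_bytes, unsigned long bit_rate_bps)
{
    assert(bit_rate_bps > 0);
    return (unsigned long)can_worst_case_bits(n_bytes) * 1000000ul / bit_rate_bps;
}
```

For an 8-byte frame this gives 64 + 47 + 24 = 135 bit times, which takes 270 µs at an assumed 500 kbit/s.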

slide-22
SLIDE 22

Modeling CAN

Priority of a message determined by message type
Message must have minimum period Ti
Blocking term Bi = 135 τ_bit (the worst-case length of an 8-byte frame, which cannot be preempted once it has started)
Release jitter Ji equal to queuing delay
Usually equal to worst-case response time of the task that queues the message

Now we just reuse the processor scheduling equation!

\[ R_i = C_i + B_i + \sum_{j \in hp(i)} \left\lceil \frac{R_i + J_j}{T_j} \right\rceil C_j \]
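The response-time equation has R_i on both sides, so it is usually solved by fixed-point iteration: start from C_i + B_i and re-evaluate the right-hand side until the value stops changing. The sketch below assumes messages are stored highest priority first and that all times share one unit (bit times here); `struct msg`, `response_time`, and the `limit` cut-off are illustrative names.

```c
#include <assert.h>
#include <stddef.h>

struct msg {
    unsigned long C;  /* worst-case transmission time */
    unsigned long T;  /* minimum period */
    unsigned long J;  /* release jitter */
    unsigned long B;  /* blocking term */
};

/* Worst-case response time of message i, where m[0..i-1] are exactly the
 * higher-priority messages hp(i). Returns 0 if the iteration exceeds limit
 * (for example a deadline), meaning no usable bound was found. */
unsigned long response_time(const struct msg *m, size_t i, unsigned long limit)
{
    assert(m != NULL);
    unsigned long r = m[i].C + m[i].B;  /* initial guess: no interference */
    for (;;) {
        unsigned long next = m[i].C + m[i].B;
        for (size_t j = 0; j < i; j++) {
            assert(m[j].T > 0);
            /* ceil((r + J_j) / T_j) using integer arithmetic */
            unsigned long releases = (r + m[j].J + m[j].T - 1) / m[j].T;
            next += releases * m[j].C;
        }
        if (next > limit)
            return 0;
        if (next == r)
            return r;  /* fixed point reached */
        r = next;
    }
}
```

With three 8-byte messages (C = B = 135), periods 500, 2000, 4000, and a jitter of 100 on the highest-priority one, the iteration yields response times of 270, 540, and 675 bit times.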

slide-23
SLIDE 23

Modeling the Whole CAN Network

How to compute minimum queuing time of a message?
Could be clever, but 0 is safe

How to compute maximum queuing time of a message?
Equal to worst-case response time of task that queues the message

How to compute release jitter of a task that awaits a message i?
Jdest(i) = Ri - 47 τ_bit

This is nice but there’s a problem…
Circular dependency between task and message jitters

slide-24
SLIDE 24

Holistic Analysis

Solution to circular dependencies:

Start out with all jitters set to zero
Iterate between processor and network scheduling until convergence
This is the same trick we used to solve the dependency of task response time on itself last lecture

Finally: We have worst-case response time for every task and message
…and we can figure out if the airbag deploys on time

What if response times are too long?
Can fiddle with message priorities
Finding an optimal priority ordering (that minimizes global response times) is NP-hard
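A small sketch of the iteration, under several assumptions that are not from the lecture: two nodes with two tasks each, two messages, all times in bit times, and a fixed wiring in which the low-priority task on each node queues a message awaited by the high-priority task on the other node. All names (`struct item`, `rta`, `holistic`) are hypothetical. It uses the response-time equation and the rules Ji = response time of the queuing task and Jdest(i) = Ri - 47 bit.

```c
#include <assert.h>
#include <stddef.h>

#define MIN_FRAME_BITS 47ul  /* the 47-bit term in Jdest(i) = Ri - 47 bit */

struct item {                    /* a task or a message */
    unsigned long C, T, J, B;    /* cost, min period, release jitter, blocking */
    unsigned long R;             /* worst-case response time (output) */
};

/* Fixed-point response time of s[i]; s[0..i-1] have higher priority.
 * The result is capped at limit. */
static unsigned long rta(const struct item *s, size_t i, unsigned long limit)
{
    unsigned long r = s[i].C + s[i].B;
    for (;;) {
        unsigned long next = s[i].C + s[i].B;
        for (size_t j = 0; j < i; j++) {
            assert(s[j].T > 0);
            next += ((r + s[j].J + s[j].T - 1) / s[j].T) * s[j].C;
        }
        if (next > limit) return limit;
        if (next == r) return r;
        r = next;
    }
}

static void analyze(struct item *s, size_t n, unsigned long limit)
{
    for (size_t i = 0; i < n; i++)
        s[i].R = rta(s, i, limit);
}

/* Stores value in *jitter and reports whether it changed. */
static int update(unsigned long *jitter, unsigned long value)
{
    int changed = (*jitter != value);
    *jitter = value;
    return changed;
}

/* Assumed wiring: node1[1] queues bus[0], awaited by node2[0];
 * node2[1] queues bus[1], awaited by node1[0].
 * Returns 1 once all jitters stop changing, 0 if 100 rounds are not enough. */
int holistic(struct item node1[2], struct item node2[2],
             struct item bus[2], unsigned long limit)
{
    node1[0].J = node2[0].J = bus[0].J = bus[1].J = 0;  /* all jitters zero */
    for (int round = 0; round < 100; round++) {
        int changed = 0;
        analyze(node1, 2, limit);                   /* processor scheduling */
        analyze(node2, 2, limit);
        changed |= update(&bus[0].J, node1[1].R);   /* message jitter */
        changed |= update(&bus[1].J, node2[1].R);
        analyze(bus, 2, limit);                     /* network scheduling */
        assert(bus[0].R >= MIN_FRAME_BITS && bus[1].R >= MIN_FRAME_BITS);
        changed |= update(&node2[0].J, bus[0].R - MIN_FRAME_BITS);
        changed |= update(&node1[0].J, bus[1].R - MIN_FRAME_BITS);
        if (!changed)
            return 1;
    }
    return 0;
}
```

In the test configuration used to check this sketch, the jitter inherited by the high-priority task on node 1 raises the response time of the low-priority task there from 150 to 200, which in turn raises the jitter of the message it queues; the third round changes nothing and the iteration stops.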

slide-25
SLIDE 25

CAN Bus Scheduling Summary

Can reason about an entire set of processors plus the network connecting them using holistic scheduling
Pretty cool result
This is used in practice