Hardware-Software Codesign 10. Performance Analysis of Distributed Embedded Systems



SLIDE 1

Hardware-Software Codesign

10. Performance Analysis of Distributed Embedded Systems

Lothar Thiele

SLIDE 2

Swiss Federal Institute of Technology Computer Engineering and Networks Laboratory 10 - 2

System Design

[Figure: system design flow with the blocks Specification, System Synthesis, Estimation, SW-Compilation, HW-Synthesis, Instruction Set, Intellectual Prop. Code, Intellectual Prop. Block, Machine Code, and Net lists]


SLIDE 3

Contents

  • Overview
  • Real-Time Calculus
  • Modular Performance Analysis
  • Examples

SLIDE 4

Formal Analysis vs. Simulation

[Figure: ranges of a metric (e.g. delay) for the real system, simulation, and formal analysis, marking best case, worst case, lower bound, and upper bound]

SLIDE 5

Analysis and Design

Embedded System = Computation + Communication + Resource Interaction

  • Analysis: Infer system properties from subsystem properties.
  • Design: Build a system from subsystems while meeting requirements.

SLIDE 6

Modular Performance Analysis

[Figure: the system model (application, architecture, mapping, scheduling) is turned into a performance model made of a load model (environment), a service model (resources), and a processing model (tasks & scheduling); analysis of the performance model yields the analysis results. Information sources shown: input traces, formal specifications, task graphs, architecture diagrams, WCET analysis, measurements, and data sheets]

SLIDE 7

Abstract Models for Performance Analysis

[Figure: the concrete instance (input stream, task, processor) and its abstract representation (load model, processing model, service model)]

SLIDE 8

Modular System Composition

[Figure: a system with CPU, BUS, and DSP composed from GPC and GSC components, with RM and TDMA scheduling]

SLIDE 9

Overview

SLIDE 10

Contents

  • Overview
  • Real-Time Calculus
  • Modular Performance Analysis
  • Examples

SLIDE 11

Foundation

Real-Time Calculus can be regarded as a worst-case/best-case variant of classical queuing theory. It is a formal method for the analysis of distributed real-time embedded systems.

Related Work:

  • Min-Plus Algebra: F. Baccelli, G. Cohen, G. J. Olsder, and J.-P. Quadrat, Synchronization and Linearity: An Algebra for Discrete Event Systems, Wiley, New York, 1992.
  • Network Calculus: J.-Y. Le Boudec and P. Thiran, Network Calculus: A Theory of Deterministic Queuing Systems for the Internet, Lecture Notes in Computer Science, vol. 2050, Springer Verlag, 2001.

SLIDE 12

Comparison of Algebraic Structures

Algebraic structure

  • set of elements
  • one or more operators defined on elements of this set

Algebraic structures with two operators

  • plus-times:
  • min-plus:

Infimum:

  • The infimum of a subset of some set is the greatest element, not necessarily in the subset, that is less than or equal to all elements of the subset.
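
The operator definitions of the two structures are not legible in this transcript. In the usual textbook formulation (an assumption here, not taken from the slide), the two structures and a small infimum example read:

```latex
\begin{align*}
\text{plus-times:}\quad & (\mathbb{R},\; +,\; \times) \\
\text{min-plus:}\quad   & (\mathbb{R} \cup \{+\infty\},\; \inf,\; +) \\[4pt]
\inf\,[3,4] = 3, \qquad & \inf\,(3,4] = 3, \qquad \min\,(3,4] \text{ does not exist.}
\end{align*}
```

The half-open interval shows why the infimum is used instead of the minimum: the greatest lower bound 3 exists even though it is not an element of the set.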

SLIDE 13

Comparison of Algebraic Structures

Joint properties : Example:

  • plus-times:
  • min-plus:

SLIDE 14

Comparison of Algebraic Structures

Joint properties:

Differences:

  • plus-times: Existence of a negative element for +: a + (−a) = 0
  • min-plus: Idempotency of inf: inf{a, a} = a
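
The shared and differing properties can be checked numerically. The following is a minimal sketch; the helper names `oplus` and `otimes` are illustrative, and distributivity is used as one example of a joint property (an assumption, since the slide's own list is not legible).

```python
INF = float("inf")  # identity element of min (the "zero" of min-plus)

def oplus(a, b):
    """Min-plus 'addition': the minimum of the two operands."""
    return min(a, b)

def otimes(a, b):
    """Min-plus 'multiplication': ordinary addition."""
    return a + b

a, b, c = 3, 5, 2

# Joint property: 'multiplication' distributes over 'addition',
# i.e. a + min(b, c) == min(a + b, a + c).
distributive = otimes(a, oplus(b, c)) == oplus(otimes(a, b), otimes(a, c))

# Difference: min is idempotent, ordinary addition is not.
idempotent = oplus(a, a) == a
plus_idempotent = (a + a == a)

# Identity elements: +inf for min, 0 for +.
identities = oplus(a, INF) == a and otimes(a, 0) == a

print(distributive, idempotent, plus_idempotent, identities)
```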

SLIDE 15

Comparison of System Theories

Plus-times system theory

  • signals, impulse response, convolution, time-domain

Min-plus system theory

  • streams, variability curves, time-interval domain, convolution
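
For orientation, the two notions of convolution can be written side by side. These are the standard definitions for functions that vanish for negative arguments; they are stated here as background and are not recovered from the slide.

```latex
\begin{align*}
\text{plus-times:}\quad (f * g)(t) &= \int_{0}^{t} f(t-s)\, g(s)\, \mathrm{d}s \\
\text{min-plus:}\quad (f \otimes g)(t) &= \inf_{0 \le s \le t} \bigl\{ f(t-s) + g(s) \bigr\}
\end{align*}
```

The integral (a sum) is replaced by an infimum, and the product by a sum.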

SLIDE 16

Abstract Models for Performance Analysis

[Figure: the concrete instance (input stream, task, processor) with the cumulative functions R(t), R’(t), C(t), and its abstract representation (load model, processing model, service model) with the variability curves α(∆), β(∆)]

SLIDE 17

From Streams to Cumulative Functions

  • Data streams: R(t) = number of events in [0, t)
  • Resource stream: C(t) = available resource in [0, t)
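
A minimal sketch of how the two cumulative functions could be evaluated from recorded data. The trace format (a sorted list of event timestamps, and a list of availability intervals with a constant rate) and the function names are assumptions made for illustration.

```python
from bisect import bisect_left

def cumulative_events(event_times, t):
    """R(t): number of events with a timestamp in [0, t).

    event_times must be sorted in ascending order. bisect_left counts the
    entries strictly smaller than t, which matches the half-open interval.
    """
    return bisect_left(event_times, t)

def cumulative_service(available, rate, t):
    """C(t): resource units offered in [0, t).

    available is a list of (start, end) intervals (start >= 0) during which
    the resource delivers `rate` units per time unit.
    """
    return rate * sum(max(0.0, min(end, t) - start) for start, end in available)

events = [0.0, 1.0, 1.5, 4.0]          # hypothetical event trace
slots = [(0.0, 1.0), (2.0, 3.0)]       # hypothetical availability intervals

print(cumulative_events(events, 1.6))      # events at 0.0, 1.0, 1.5
print(cumulative_service(slots, 2.0, 2.5)) # 1.0 + 0.5 time units at rate 2
```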

[Figure: example plots of R(t) and C(t)]

SLIDE 18

From Event Streams to Arrival Curves

Arrival curves α = [αl, αu]: αl(∆) and αu(∆) are the minimum and maximum number of events arriving in any time interval of length ∆.

[Figure: an event stream over time t [ms] and the corresponding arrival curves over ∆ [ms]; the number of events in a window of length 2.5 ms, e.g. t = [0 .. 2.5] ms, lies between αl(2.5) and αu(2.5)]
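
The sliding-window idea can be sketched in discrete time: slide a window of length ∆ over a sampled cumulative function and record the smallest and largest number of events seen. The trace below is hypothetical, and curves derived this way only describe that one finite trace, not every behavior of the source.

```python
from itertools import accumulate

def arrival_curves(R, max_delta):
    """Return (alpha_l, alpha_u) for window lengths 0..max_delta.

    R is the cumulative event count sampled on a uniform grid, R[0] = 0.
    For each window length d, all windows [u, u + d) inside the trace are
    inspected and the minimum / maximum event count is recorded.
    """
    last = len(R) - 1
    alpha_l, alpha_u = [], []
    for d in range(max_delta + 1):
        counts = [R[u + d] - R[u] for u in range(last - d + 1)]
        alpha_l.append(min(counts))
        alpha_u.append(max(counts))
    return alpha_l, alpha_u

# Hypothetical trace: number of events arriving in each time slot.
arrivals = [1, 0, 0, 1, 1, 0, 0, 1]
R = [0] + list(accumulate(arrivals))

alpha_l, alpha_u = arrival_curves(R, 3)
print(alpha_l)
print(alpha_u)
```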

SLIDE 19

From Resources to Service Curves

Service curves β = [βl, βu]: βl(∆) and βu(∆) are the minimum and maximum service available in any time interval of length ∆.

[Figure: resource availability over time t [ms] and the corresponding service curves over ∆ [ms]; the service available in a window of length 2.5 ms, e.g. t = [0 .. 2.5] ms, lies between βl(2.5) and βu(2.5)]

SLIDE 20

Example 1: Periodic with Jitter

A common event pattern that is used in the literature can be specified by the parameter triple (p, j, d), where p denotes the period, j the jitter, and d the minimum inter-arrival distance of events in the modeled stream.

[Figure: a periodic stream with period p, and a periodic stream with jitter j in which consecutive events are at least d apart]

SLIDE 21

Example 1: Periodic with Jitter

[Figure: periodic vs. periodic with jitter]

SLIDE 22

Example 1: Periodic with Jitter

Arrival curves:
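
The formulas for this slide are not legible in the transcript. The sketch below uses the closed forms commonly given for the (p, j, d) model in the real-time calculus literature, which is an assumption about what the slide showed: αu(∆) = min{⌈(∆ + j)/p⌉, ⌈∆/d⌉} and αl(∆) = max{0, ⌊(∆ − j)/p⌋}.

```python
import math

def pjd_upper(delta, p, j, d=0.0):
    """Upper arrival curve of a periodic-with-jitter stream (assumed form)."""
    if delta <= 0:
        return 0
    bound = math.ceil((delta + j) / p)           # period and jitter
    if d > 0:
        bound = min(bound, math.ceil(delta / d)) # minimum inter-arrival distance
    return bound

def pjd_lower(delta, p, j):
    """Lower arrival curve of a periodic-with-jitter stream (assumed form)."""
    return max(0, math.floor((delta - j) / p))

# Hypothetical parameters: period 10, jitter 25, minimum distance 2.
print([pjd_upper(x, 10, 25, 2) for x in (1, 5, 15)])
print([pjd_lower(x, 10, 5) for x in (3, 14, 15)])
```

With jitter larger than the period, the distance term ⌈∆/d⌉ is what limits short bursts.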

SLIDE 23

Example 2: TDMA Resource

Consider a real-time system consisting of n applications that are executed on a resource with bandwidth B that controls resource access using a TDMA policy. Analogously, we could consider a distributed system with n communicating nodes that communicate via a shared bus with bandwidth B, with a bus arbiter that implements a TDMA policy.

TDMA policy: In every TDMA cycle of length c, one single resource slot of length si is assigned to application i.

[Figure: consecutive TDMA cycles of length c, each divided into slots for appl. 1, appl. 2, ..., appl. n; the slot of appl. n has length sn]

SLIDE 24

Example 2: TDMA Resource

Service curves available to the applications / node i:

[Figure: upper and lower service curves of slot i, with axis marks at si, c − si, c, and 2c on the interval axis and B si on the service axis]
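
The curve formulas are not legible in the transcript. A sketch using the closed forms commonly given for a TDMA slot (assumed here): βl(∆) = B · max{⌊∆/c⌋ · si, ∆ − ⌈∆/c⌉ · (c − si)} and βu(∆) = B · min{⌈∆/c⌉ · si, ∆ − ⌊∆/c⌋ · (c − si)}. They match the axis marks of the figure: the lower curve stays at zero for c − si, the upper curve rises immediately to B · si.

```python
import math

def tdma_lower(delta, B, s, c):
    """Guaranteed service in any window of length delta (assumed form).

    Worst case: the window opens just after the slot has ended.
    """
    if delta <= 0:
        return 0
    return B * max(math.floor(delta / c) * s,
                   delta - math.ceil(delta / c) * (c - s))

def tdma_upper(delta, B, s, c):
    """Maximum service in any window of length delta (assumed form).

    Best case: the window opens exactly when the slot starts.
    """
    if delta <= 0:
        return 0
    return B * min(math.ceil(delta / c) * s,
                   delta - math.floor(delta / c) * (c - s))

# Hypothetical parameters: bandwidth 1, slot length 2, cycle length 10.
print([tdma_lower(x, 1, 2, 10) for x in (8, 9, 10, 15, 19)])
print([tdma_upper(x, 1, 2, 10) for x in (1, 2, 5, 10, 11)])
```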

SLIDE 25

Greedy Processing Component (GPC)

Examples:

  • computation (event – task instance, resource – computing resource [tasks/second])
  • communication (event – data packet, resource – bandwidth [packets/second])

[Figure: GPC with a FIFO buffer; inputs are the input event stream and the available resources, outputs are the output event stream and the remaining resources]

SLIDE 26

Greedy Processing Component

Behavioral Description

  • Component is triggered by incoming events.
  • A fully preemptable task is instantiated at every event arrival to process the incoming event.
  • Active tasks are processed in a greedy fashion in FIFO order.
  • Processing is restricted by the availability of resources.

SLIDE 27

Greedy Processing Component (GPC)

Conservation Laws

[Figure: GPC with inputs R(t), C(t) and outputs R’(t), C’(t); plot of C(t), R(t), and R’(t) over t]

SLIDE 28

Greedy Processing

For all times u ≤ t we have R’(u) ≤ R(u) (conservation law). We also have R’(t) ≤ R’(u) + C(t) – C(u), as the output cannot be larger than the available resources. Combining both statements yields R’(t) ≤ R(u) + C(t) – C(u).

Let us suppose that u* is the last time before t with an empty buffer. We have R(u*) = R’(u*) at u* and also R’(t) = R’(u*) + C(t) – C(u*), as all available resources are used to produce output. Therefore, R’(t) = R(u*) + C(t) – C(u*).

As a result, we obtain

R’(t) = inf_{0 ≤ u ≤ t} { R(u) + C(t) – C(u) }

[Figure: backlog B(t) between R(t) and R’(t) at time t, with u* marked on the time axis]
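
The result of this derivation can be evaluated directly on sampled cumulative functions. A minimal discrete-time sketch: it assumes R and C are sampled on the same grid and expressed in the same unit, and it uses B(t) = R(t) − R’(t) and C’(t) = C(t) − R’(t) for backlog and remaining resources, which is an assumed reading of the conservation laws.

```python
def gpc_output(R, C):
    """R'(t) = min over 0 <= u <= t of R(u) + C(t) - C(u)."""
    return [min(R[u] + C[t] - C[u] for u in range(t + 1))
            for t in range(len(R))]

# Hypothetical traces: a burst of 3 units arrives first, one more later;
# the resource offers 2 units per time step.
R = [0, 3, 3, 3, 4]
C = [0, 2, 4, 6, 8]

R_out = gpc_output(R, C)
backlog = [r - ro for r, ro in zip(R, R_out)]     # B(t), assumed definition
C_remaining = [c - ro for c, ro in zip(C, R_out)] # C'(t), assumed definition

print(R_out)
print(backlog)
print(C_remaining)
```

The backlog is non-zero only while the burst is being worked off; afterwards the unused resources accumulate in C’(t).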

SLIDE 29

Abstract Models for Performance Analysis

[Figure: the concrete instance (input stream, task, processor) with the cumulative functions R(t), R’(t), C(t), and its abstract representation (load model, processing model, service model) with the variability curves α(∆), β(∆)]

SLIDE 30

Abstraction

[Figure: a GPC in the time domain, operating on cumulative functions, and its abstraction in the time-interval domain, operating on variability curves]

SLIDE 31

Some Definitions and Relations

  • Min-plus convolution
  • Min-plus de-convolution
  • Max-plus convolution and de-convolution
  • Relation between convolution and de-convolution
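
The formulas for these operators are not legible in the transcript. The standard definitions from network calculus, given here as an assumption about the slide content, are:

```latex
\begin{align*}
(f \otimes g)(t) &= \inf_{0 \le u \le t} \bigl\{ f(t-u) + g(u) \bigr\}
  && \text{min-plus convolution} \\
(f \oslash g)(t) &= \sup_{u \ge 0} \bigl\{ f(t+u) - g(u) \bigr\}
  && \text{min-plus de-convolution} \\
(f \,\overline{\otimes}\, g)(t) &= \sup_{0 \le u \le t} \bigl\{ f(t-u) + g(u) \bigr\}
  && \text{max-plus convolution} \\
(f \,\overline{\oslash}\, g)(t) &= \inf_{u \ge 0} \bigl\{ f(t+u) - g(u) \bigr\}
  && \text{max-plus de-convolution} \\[4pt]
f \le g \otimes h \;&\Longleftrightarrow\; f \oslash h \le g
  && \text{duality}
\end{align*}
```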

SLIDE 32

Arrival and Service Curve

We can determine valid variability curves from cumulative functions as follows:

One proof:
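
The construction and its proof are not legible in the transcript. A reconstruction under the usual definitions (an assumption), where a valid pair of curves must satisfy αl(t − s) ≤ R(t) − R(s) ≤ αu(t − s) for all s ≤ t, and analogously for β and C:

```latex
\begin{align*}
\alpha^u(\Delta) &= \sup_{u \ge 0} \bigl\{ R(\Delta + u) - R(u) \bigr\}, &
\alpha^l(\Delta) &= \inf_{u \ge 0} \bigl\{ R(\Delta + u) - R(u) \bigr\}, \\
\beta^u(\Delta)  &= \sup_{u \ge 0} \bigl\{ C(\Delta + u) - C(u) \bigr\}, &
\beta^l(\Delta)  &= \inf_{u \ge 0} \bigl\{ C(\Delta + u) - C(u) \bigr\}.
\end{align*}
% Proof sketch for the upper arrival curve: for any s <= t,
\begin{align*}
R(t) - R(s) = R\bigl((t-s) + s\bigr) - R(s)
  \;\le\; \sup_{u \ge 0} \bigl\{ R\bigl((t-s) + u\bigr) - R(u) \bigr\}
  = \alpha^u(t-s).
\end{align*}
```

The other three bounds follow in the same way, with the infimum in place of the supremum for the lower curves.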

SLIDE 33

Abstraction

[Figure: a GPC in the time domain, operating on cumulative functions, and its abstraction in the time-interval domain, operating on variability curves]

SLIDE 34

The Most Simple Relations

  • The output stream of a component satisfies:
  • The output upper arrival curve of a component satisfies:
  • The remaining lower service curve of a component satisfies:
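
The three relations are not legible in the transcript. In the real-time calculus literature they are commonly stated as R’ ≥ R ⊗ βl, αu’ = αu ⊘ βl, and βl’(∆) = sup over 0 ≤ λ ≤ ∆ of {βl(λ) − αu(λ)}; the sketch below assumes these forms and evaluates the last two on sampled curves. The de-convolution takes a supremum over an unbounded range, so on a finite grid the result is only trustworthy if the supremum is attained inside the sampled horizon; the function therefore returns only the first `horizon` points.

```python
def minplus_deconv(f, g, horizon):
    """(f deconv g)(d) = max over u >= 0 of f[d + u] - g[u], truncated to the grid."""
    n = min(len(f), len(g))
    return [max(f[d + u] - g[u] for u in range(n - d)) for d in range(horizon)]

def remaining_lower_service(beta_l, alpha_u):
    """beta_l'(d) = max over 0 <= lam <= d of beta_l[lam] - alpha_u[lam]."""
    result, best = [], float("-inf")
    for b, a in zip(beta_l, alpha_u):
        best = max(best, b - a)
        result.append(best)
    return result

# Hypothetical curves on an integer grid:
# alpha_u: burst of 1 plus 1 event per step; beta_l: latency 2, then rate 2.
alpha_u = [0, 2, 3, 4, 5, 6, 7, 8]
beta_l = [0, 0, 0, 2, 4, 6, 8, 10]

alpha_u_out = minplus_deconv(alpha_u, beta_l, horizon=4)
beta_l_rem = remaining_lower_service(beta_l, alpha_u)

print(alpha_u_out)
print(beta_l_rem)
```

In this example the output burst grows by what can accumulate during the service latency, and service is left over only once the resource has caught up with the input.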

SLIDE 35

Two Sample Proofs

SLIDE 36

Tighter Bounds

Without proof ...

[Figure: GPC with input arrival curves [αl, αu] and service curves [βl, βu]; outputs are the arrival curves [αl’, αu’] and the remaining service curves [βl’, βu’]]
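
The bounds themselves are not legible in the transcript. The GPC relations usually quoted in the real-time calculus literature, given here as an assumption about the slide content, are:

```latex
\begin{align*}
\alpha^{u\prime} &= \min\bigl\{ (\alpha^u \otimes \beta^u) \oslash \beta^l,\; \beta^u \bigr\} \\
\alpha^{l\prime} &= \min\bigl\{ (\alpha^l \oslash \beta^u) \otimes \beta^l,\; \beta^l \bigr\} \\
\beta^{u\prime}  &= (\beta^u - \alpha^l) \,\overline{\oslash}\, 0 \\
\beta^{l\prime}  &= (\beta^l - \alpha^u) \,\overline{\otimes}\, 0
\end{align*}
```

Here ⊗ and ⊘ are min-plus convolution and de-convolution, and the overlined operators are their max-plus counterparts.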

SLIDE 37

Delay and Backlog

[Figure: plot of αu and βl; the maximum delay D is the maximum horizontal distance and the maximum backlog B is the maximum vertical distance between the two curves. The GPC is shown with [αl, αu], [βl, βu], [αl’, αu’], and [βl’, βu’]]
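
The two distances can be computed on sampled curves. A minimal sketch, assuming discrete time (both curves sampled on the same integer grid) and the usual bounds B ≤ sup{αu(∆) − βl(∆)} and D ≤ sup over ∆ of inf{τ ≥ 0 : αu(∆) ≤ βl(∆ + τ)}; the curves are hypothetical.

```python
def backlog_bound(alpha_u, beta_l):
    """Maximum vertical distance between alpha_u and beta_l."""
    return max(a - b for a, b in zip(alpha_u, beta_l))

def delay_bound(alpha_u, beta_l):
    """Maximum horizontal distance between alpha_u and beta_l, in grid steps."""
    n = min(len(alpha_u), len(beta_l))
    worst = 0
    for d in range(n):
        tau = 0
        # Advance until the service curve has caught up with alpha_u[d].
        while d + tau < n and beta_l[d + tau] < alpha_u[d]:
            tau += 1
        if d + tau >= n:
            raise ValueError("service curve does not catch up within the horizon")
        worst = max(worst, tau)
    return worst

alpha_u = [0, 2, 3, 4, 5, 6, 7, 8]    # burst of 1 plus 1 event per step
beta_l = [0, 0, 0, 2, 4, 6, 8, 10]    # latency 2, then rate 2

print(backlog_bound(alpha_u, beta_l))
print(delay_bound(alpha_u, beta_l))
```

On a finer grid the delay value can change, because the horizontal distance is only resolved to whole grid steps here.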

SLIDE 38

Proof of Backlog Bound
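
The proof is not legible in the transcript. One way to obtain the bound, sketched under the assumptions B(t) = R(t) − R’(t) and R’(t) = inf over 0 ≤ u ≤ t of {R(u) + C(t) − C(u)} from the greedy processing slide:

```latex
\begin{align*}
B(t) &= R(t) - R'(t) \\
     &= R(t) - \inf_{0 \le u \le t} \bigl\{ R(u) + C(t) - C(u) \bigr\} \\
     &= \sup_{0 \le u \le t} \bigl\{ \bigl[ R(t) - R(u) \bigr] - \bigl[ C(t) - C(u) \bigr] \bigr\} \\
     &\le \sup_{0 \le u \le t} \bigl\{ \alpha^u(t-u) - \beta^l(t-u) \bigr\} \\
     &\le \sup_{\lambda \ge 0} \bigl\{ \alpha^u(\lambda) - \beta^l(\lambda) \bigr\}.
\end{align*}
```

The fourth line uses R(t) − R(u) ≤ αu(t − u) and C(t) − C(u) ≥ βl(t − u).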

SLIDE 39

Contents

  • Overview
  • Real-Time Calculus
  • Modular Performance Analysis
  • Examples

SLIDE 40

System Composition

[Figure: CPU, BUS, and DSP modeled by GPC components, with RM and TDMA]

How to interconnect service? Scheduling!

SLIDE 41

Scheduling and Arbitration

[Figure: interconnection patterns of GPC components for FP/RM, EDF, RR, TDMA, and GPS, including sum and share elements]

SLIDE 42

Complete System Composition

[Figure: CPU, BUS, and DSP modeled by interconnected GPC components, with RM and TDMA scheduling]

SLIDE 43

Extending the Framework

  • New HW behavior
  • New SW behavior
  • New scheduling scheme
  • ...

[Figure: RTC component with inputs α, β and outputs α’, β’]

  • Find new relations: This is the hard part…!

SLIDE 44

Contents

  • Overview
  • Real-Time Calculus
  • Modular Performance Analysis
  • Examples

SLIDE 45

Case Study

[Figure: ECU1, ECU2, and ECU3 with communication controllers CC1, CC2, and CC3 connected to a BUS; input streams S1 to S6]

6 Real-Time Input Streams

  • with jitter
  • with bursts
  • deadline > period

3 ECUs with own CCs

13 Tasks & 7 Messages

  • with different WCET

2 Scheduling Policies

  • Earliest Deadline First (ECUs)
  • Fixed Priority (ECUs & CCs)

Hierarchical Scheduling

  • Static & Dynamic Polling Servers

Bus with TDMA

  • 4 time slots with different lengths (#1,#3 for CC1, #2 for CC3, #4 for CC3)

Total Utilization:

  • ECU1: 59 %
  • ECU2: 87 %
  • ECU3: 67 %
  • BUS: 56 %

SLIDE 46

Specification Data

SLIDE 47

The Distributed Embedded System...

[Figure: ECU1, ECU2, and ECU3, each with a communication controller (CC1, CC2, CC3), connected by a TDMA bus; tasks T1.1 to T6.1 under FP, EDF, and polling server (PS) scheduling; messages C1.1 to C5.1; input streams S1 to S6]

SLIDE 48

... and its MPA Model

[Figure: MPA model of the system: streams S1 to S6 pass through the task components (T1.1 to T6.1) on ECU1, ECU2, and ECU3 and the message components (C1.1 to C5.1) on CC1, CC2, and CC3; resources are the CPUs, polling servers (PS), EDF, and the TDMA bus]

SLIDE 49

Buffer & Delay Guarantees

[Figure: the MPA model annotated with the computed delay guarantees (d) and buffer guarantees (b)]

SLIDE 50

Available & Remaining Service of ECU1

[Figure: the MPA model with the available and remaining service of ECU1]

SLIDE 51

Input of Stream 3

[Figure: the MPA model with the input of stream S3]

SLIDE 52

Output of Stream 3

[Figure: the MPA model with the output of stream S3]

SLIDE 53

Automated Design Space Exploration

[Figure: exploration loop with Application, Architecture, Mapping, Estimation, and Multi-Objective Optimization]

We use evolutionary algorithms for multi-objective optimization!

SLIDE 54

Network Processor Task Model

SLIDE 55

EXPO

SLIDE 56

Results

[Figure: results plotted over three objectives: performance for encryption/decryption, performance for RT voice processing, and cost]

SLIDE 57

Analysis vs. Simulation

[Figure: analysis and simulation results plotted over load]

SLIDE 58

Design Space Exploration

  • Determine mapping
  • Determine performance network
  • Solve system of equations
  • Determine important parameters (end-to-end delay, throughput, buffer space, output jitter, ...)
  • Give feedback to optimization

[Figure: Application, Architecture, Mapping, Estimation]

SLIDE 59

RTC Toolbox

www.mpa.ethz.ch/rtctoolbox