


SLIDE 1

Time- and Energy-Optimal Scheduling of Synchronous Data Flow Graphs on Heterogeneous Multiprocessor Platforms

Xue-Yang Zhu (朱雪阳)

Institute of Software, Chinese Academy of Sciences
SAVE 2015, 2015-12-05
http://lcs.ios.ac.cn/~zxy/

Xue-Yang Zhu, Rongjie Yan, Yu-Lei Gu, Jian Zhang, Wenhui Zhang and Guangquan Zhang. Static Optimal Scheduling for Synchronous Data Flow Graphs with Model Checking. In Proc. of the 20th International Symposium on Formal Methods (FM 2015). LNCS, vol. 9109, pp. 551–569.

SLIDE 2

Xue-Yang Zhu, Institute of Software, Chinese Academy of Sciences
2015-12-05, SAVE 2015

Background

[Figure: design flow for real-time embedded systems. Labels: Models (Synchronous Data Flow Graphs), Scheduling, Synthesis, Code.]

SLIDE 3

Background

  • Synchronous dataflow graphs (SDFGs) are widely used for modeling data-driven applications:
− digital signal processing (DSP) algorithms
− streaming media programs

[Figures: three example SDFGs. A sample rate converter from compact disc (CD) to digital audio tape (DAT) [Murthy et al. 1997]; a satellite receiver [Ritz et al. 1995]; an MP3 playback application [Wiggers et al. 2007].]

SLIDE 4

Background

[Figure: design flow for real-time embedded systems. Labels: Models (Synchronous Data Flow Graphs), Synthesis, Scheduling, Code.]

  • Time and energy optimization
  • Model checking as a tool
  • Multi-processor platforms, real-time requirements, resource limitations, ……

SLIDE 5

Outline

  • Model Description and Problem Formulation
  • Basic Idea of Our Methods
  • A Timed Automata Semantics of System Models
  • Static Optimal Scheduling and Mapping
  • Dealing with More Constraints
  • Case Studies
  • Conclusions and Future Work

SLIDE 6

Model Description

A system model includes an SDFG G and its execution platform P.

− An execution platform P is a set of heterogeneous processors with energy consumption information for each state (in use, idle).
− An SDFG is a finite directed graph G = <V, E>.

Actor v
  • a computation
  • t(v,p): computation time of v when running on processor p, e.g. t(A,p1)=1, t(A,p2)=3

Edge e
  • a FIFO channel
  • a data dependency between actors
  • delay d(e): the number of initial tokens on e, e.g. d(<C,A>)=3
  • production rate prd(e), e.g. prd(<C,A>)=1: the number of tokens produced onto <C,A> by each execution of C
  • consumption rate cns(e), e.g. cns(<C,A>)=2: the number of tokens consumed from <C,A> by each execution of A

Multi-rate: e.g. prd(<C,A>)=1, cns(<C,A>)=2. As a result, actors fire a different number of times in one iteration of execution, e.g. 1 A, 2 B and 2 C in an iteration.

[Figure: example SDFG with actors A, B and C and rate labels on its edges.]
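The firing counts per iteration (1 A, 2 B, 2 C) follow from the balance equations q(src)·prd(e) = q(dst)·cns(e) on every edge. The following is a minimal Python sketch, not the authors' tool. It assumes the example graph is the cycle A→B→C→A; only the rates and delay of <C,A> are stated on the slide, so the rates of <A,B> and <B,C>, and the names `Edge` and `repetition_vector`, are illustrative assumptions.

```python
from fractions import Fraction
from math import lcm
from typing import NamedTuple


class Edge(NamedTuple):
    src: str
    dst: str
    prd: int    # tokens produced per firing of src
    cns: int    # tokens consumed per firing of dst
    delay: int  # initial tokens d(e)


def repetition_vector(actors, edges):
    """Smallest positive integers q with q[src]*prd == q[dst]*cns on every edge.

    Assumes the graph is connected; raises ValueError if it is inconsistent.
    """
    q = {actors[0]: Fraction(1)}
    changed = True
    while changed:  # propagate firing ratios along edges until nothing new is reached
        changed = False
        for e in edges:
            if e.src in q and e.dst not in q:
                q[e.dst] = q[e.src] * e.prd / e.cns
                changed = True
            elif e.dst in q and e.src not in q:
                q[e.src] = q[e.dst] * e.cns / e.prd
                changed = True
    for e in edges:  # every balance equation must hold
        if q[e.src] * e.prd != q[e.dst] * e.cns:
            raise ValueError("inconsistent SDFG")
    scale = lcm(*(x.denominator for x in q.values()))
    return {a: int(x * scale) for a, x in q.items()}


# Assumed topology: A -> B -> C -> A; only <C,A> is given on the slide.
G1 = [Edge("A", "B", 2, 1, 0), Edge("B", "C", 1, 1, 0), Edge("C", "A", 1, 2, 3)]
q = repetition_vector(["A", "B", "C"], G1)
```

Under these assumed rates, `q` matches the iteration quoted on the slide.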

SLIDE 7

Schedules

M1=(G1,P1): [Figure: SDFG G1 and platform P1.]

One iteration of G1: one firing of A and two firings each of B and C.

A static schedule arranges the actors of an SDFG to be executed repeatedly. It fixes both the time arrangement and the processor allocation.

  • f-schedule: a schedule with f iterations as a schedule cycle.
  • Iteration period (IP): the average computation time of an iteration. IP=1/thr, where thr is the throughput.
  • Iteration energy consumption (IEC): the average energy consumption per iteration. For each processor, IEC adds occupied time * uEC + idle time * iEC.
  • E.g. IEC=(6*90+0*10)+(3*45+3*15)=720

Schedules of M1: [Figure: three schedules of M1: a 1-schedule with IP=8, a 1-schedule with IP=6 and a 2-schedule with IP=11/2.]

A ‘good’ schedule: its IP and IEC are as small as possible.
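The IEC arithmetic can be checked with a few lines of Python. This is only a sketch: the function name `iteration_energy` and the tuple layout are assumptions, and dividing by the number of iterations in the cycle is my reading of "average energy consumption per iteration".

```python
def iteration_energy(processors, iterations=1):
    """Sum busy*uEC + idle*iEC over all processors, averaged over the iterations in a cycle."""
    total = sum(busy * u_ec + idle * i_ec for busy, idle, u_ec, i_ec in processors)
    return total / iterations


# (occupied time, idle time, uEC, iEC) per processor, numbers taken from the slide's example
example = [(6, 0, 90, 10), (3, 3, 45, 15)]
iec = iteration_energy(example)
```

With the slide's numbers this reproduces IEC=720.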

SLIDE 8

Problem Formulation

Scheduling (our proposal)

Input:
  • a system model M=(G,P): an application and a heterogeneous multi-processor platform
  • f: the number of iterations

Output: Pareto-optimal f-schedules
  • a throughput-optimal schedule with the best energy consumption
  • an energy-consumption-optimal schedule with the best throughput

SLIDE 9

Outline

  • Model Description and Problem Formulation
  • Basic Idea of Our Methods
  • A Timed Automata Semantics of System Models
  • Static Optimal Scheduling and Mapping
  • Dealing with More Constraints
  • Case Studies
  • Conclusions and Future Work

SLIDE 10

Basic Idea of Our Methods

[Figure: overview of the method. The scheduling problem takes a system model M=(G,P) and f, the number of iterations, and yields f-schedules that are throughput-optimal with the best energy consumption, or energy-consumption-optimal with the best throughput. The model-checking route runs through N(P)TA, CTL, UPPAAL/CORA and traces. Three steps are marked "how".]

SLIDE 11

Basic Idea of Our Methods

[Figure: the same method overview as on slide 10 (system model M=(G,P) and f; N(P)TA, CTL, UPPAAL/CORA, traces; f-schedules), with one step marked "how".]

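SLIDE 12

Timed Automata

  • Example: an energy-saving lamp that switches itself off 2 minutes after it is switched on.

[Figure: timed automaton with locations Off and On. Edge Off → On: action press, update X:=0. Location On: invariant X<=2. Edge On → Off: guard X==2.]

Legend: Location, Guard, Update, Invariant. X is a clock variable.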
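To make the roles of guard, update and invariant concrete, here is a small Python simulation of the lamp. It is a sketch under simplifying assumptions: time is passed in explicit `delay` steps, and the forced On → Off edge is taken as soon as the invariant would otherwise be violated. The class and method names are illustrative, not part of any tool.

```python
class Lamp:
    """Timed automaton of the lamp: locations Off and On, one clock x."""

    def __init__(self):
        self.location = "Off"
        self.x = 0.0  # clock variable X

    def press(self):
        # Edge Off -> On: action press, update X := 0
        if self.location == "Off":
            self.location = "On"
            self.x = 0.0

    def delay(self, d):
        """Let d minutes pass.

        The invariant X <= 2 on On forbids staying longer, so the edge with
        guard X == 2 is taken once the clock reaches 2.
        """
        if self.location == "On":
            if self.x + d >= 2.0:
                self.location = "Off"
            else:
                self.x += d
        # In Off the clock value is irrelevant, so it is not tracked.


lamp = Lamp()
lamp.press()
lamp.delay(1.5)
state_after_90s = lamp.location
lamp.delay(0.5)
state_after_2min = lamp.location
```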

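SLIDE 13

Network of Timed Automata

  • Example: an energy-saving lamp that switches itself off 2 minutes after it is switched on, now modeled as two automata running in parallel (||).

[Figure: an automaton with locations Off and On, composed in parallel (||) with a second automaton with locations i and timer. Labels: press, y:=true, y==true, X:=0, X<=2, X==2, y:=false, y==false.]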

SLIDE 14

A Timed Automata Semantics of System Models

M1: [Figure: SDFG G1 and platform P1 with two processors.]

  • Actors execute in parallel, but only if there are sufficient tokens on their incoming edges and the resources they need are available.
  • Processors run in parallel.

A TA for each actor and a TA for each processor: good? NO! Once an actor is firing, it must be running on some processor.

SLIDE 15

A Timed Automata Semantics of System Models

  • A TA ta_p(α) models the behavior of actor α running on processor p.
  • The behavior of processor p is modeled by a TA ta_p that non-deterministically selects an actor α from V.
  • The behavior of system model M=(G,P) is modeled by an NTA, the parallel composition of the processor automata, with a global clock glbClk.

Ingredients of ta_p(α):
  • readyS(α): there are sufficient tokens on the incoming edges of α, and α has not yet fired f iterations.
  • sFiring(α): consumes tokens on the incoming edges of α according to their consumption rates. The same edge resets the local clock x.
  • Invariant x≤t(α,p) on location running and guard x==t(α,p) on edge ri: together they model α running on processor p.
  • eFiring(α): produces tokens onto the outgoing edges of α according to their production rates.
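The three operations can be sketched as plain Python functions over a token count per edge. This is an illustration under stated assumptions, not the authors' encoding: edges are tuples (src, dst, prd, cns), "has not fired f iterations" is read as "fewer than f·q(α) firings started" with q the firing count per iteration, and the topology A→B→C→A with its rates is assumed as before.

```python
def ready_s(actor, tokens, fired, edges, q, f):
    """readyS: enough tokens on every incoming edge, and fewer than f*q[actor] firings started."""
    enough = all(tokens[(s, d)] >= cns for (s, d, prd, cns) in edges if d == actor)
    return enough and fired[actor] < f * q[actor]


def s_firing(actor, tokens, fired, edges):
    """sFiring: consume tokens from the incoming edges according to their consumption rates."""
    for (s, d, prd, cns) in edges:
        if d == actor:
            tokens[(s, d)] -= cns
    fired[actor] += 1


def e_firing(actor, tokens, edges):
    """eFiring: produce tokens onto the outgoing edges according to their production rates."""
    for (s, d, prd, cns) in edges:
        if s == actor:
            tokens[(s, d)] += prd


# Assumed example graph; d(<C,A>)=3 is the only delay given in the slides.
edges = [("A", "B", 2, 1), ("B", "C", 1, 1), ("C", "A", 1, 2)]
tokens = {("A", "B"): 0, ("B", "C"): 0, ("C", "A"): 3}
q = {"A": 1, "B": 2, "C": 2}
fired = {"A": 0, "B": 0, "C": 0}

s_firing("A", tokens, fired, edges)  # start of a firing of A
e_firing("A", tokens, edges)         # end of that firing, t(A,p) time units later
```

After these two steps, two tokens sit on <A,B>, so B is ready while A and C are not (for f=1).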

SLIDE 16

A Timed Automata Semantics of System Models

M1: [Figure: SDFG G1 and platform P1 with two processors], e.g. t(A,p1)=1, t(A,p2)=3.

nta_M1 = ta_p1 || ta_p2 (two processors, two processes). For i = 1, 2, the automaton ta_pi has:
  • locations idle and running, with invariant x≤t(α,pi) on running
  • edge idle → running: select α; guard: readyS(α); update: sFiring(α), x:=0
  • edge running → idle: guard: x==t(α,pi); update: eFiring(α)

SLIDE 17

A Timed Automata Semantics of System Models

[Figure: step-by-step execution of ta_p1 || ta_p2, showing after each step the tokens on the SDFG (actors A, B, C; rate labels 2, 2) and the location (idle or running) of each automaton.]

  • Initially, actor A is ready for a firing. Assume p1 selects A to run.
    glbClk=0: p1.sFiring(A)
  • No actor is ready. Time elapses. After 1 time step (t(A,p1)=1), the firing of A is ready to end.
    glbClk=1: p1.eFiring(A)
  • Actor B is ready for two firings. They can run on p1 and p2 concurrently, or on one processor sequentially.
    glbClk=1: p1.sFiring(B), p2.sFiring(B)
  • ……

The behavior of a system model is the behavior of its NTA, whose semantics is given by the traces of its labeled transition system (LTS).

SLIDE 18

Basic Idea of Our Methods

[Figure: the same method overview as on slide 10 (system model M=(G,P) and f; N(P)TA, CTL, UPPAAL/CORA, traces; f-schedules), now with the label "semantics: LTS" added and one step marked "how".]

SLIDE 19

Traces and Schedules

Trace:
glbClk=0: p1.sFiring(A)
glbClk=1: p1.sFiring(B), p2.sFiring(B)
glbClk=3: p1.sFiring(C)
glbClk=7: p1.sFiring(C)
……

Schedule: [Figure: Gantt chart of actors A, B, C on processors p1 and p2, time axis 0 to 10.]

Trace:
glbClk=0: p1.sFiring(A)
glbClk=1: p1.sFiring(B)
glbClk=3: p1.sFiring(B), p2.sFiring(C)
glbClk=5: p1.sFiring(C)
……

Schedule: [Figure: Gantt chart of actors A, B, C on processors p1 and p2, time axis 0 to 10.]

......

How to get a ‘good’ schedule according to the optimization criteria?

SLIDE 20

Basic Idea of Our Methods

[Figure: the same method overview as on slide 10 (system model M=(G,P) and f; N(P)TA with its LTS semantics, CTL, UPPAAL/CORA, traces; f-schedules), with one step marked "how".]

SLIDE 21

Optimization criteria as CTL formulas

  • Guard readyS(α) blocks α once it has fired f iterations; therefore nta_M deadlocks when all actors have fired f iterations.
  • EF (deadlock)
− nta_M |= EF φ holds when φ eventually becomes true in some state of some trace of nta_M.
  • Solution 1:
− CTL formula for checking whether a given cycle period t can be met: EF (deadlock and glbClk≤t)
− A binary search can be used to find the minimal t, which is the reciprocal of the optimal throughput.
  • Solution 2:
− Ask UPPAAL to check EF deadlock and to return a fastest trace, which is a trace with the smallest IP (optimal throughput).
− This returns the same result as the binary search but checks the property only once.
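Solution 1 can be sketched as an ordinary binary search over t. In the Python sketch below, `satisfiable(t)` is a hypothetical stand-in for one model-checker run on EF (deadlock and glbClk≤t); how that run is launched is not shown. The sketch also assumes integer time bounds and a known upper bound on the cycle period (for example, the length of a fully sequential schedule).

```python
def minimal_cycle_period(satisfiable, upper):
    """Smallest integer t in [0, upper] for which satisfiable(t) is true.

    satisfiable must be monotone: if a schedule finishes within t,
    it also finishes within any larger bound.
    """
    if not satisfiable(upper):
        raise ValueError("no f-schedule completes within the upper bound")
    lo, hi = 0, upper
    while lo < hi:
        mid = (lo + hi) // 2
        if satisfiable(mid):
            hi = mid        # some trace deadlocks by time mid; try a tighter bound
        else:
            lo = mid + 1    # no trace finishes by mid; the optimum is larger
    return lo


# Stand-in oracle for illustration only: pretend the fastest schedule takes 6 time units.
t_min = minimal_cycle_period(lambda t: t >= 6, upper=100)
```

Each loop iteration costs one model-checking query, which is why Solution 2, a single query that returns the fastest trace, is attractive.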

SLIDE 22

Throughput-Optimal Solution

[Flowchart, read top to bottom:]

  • Begin.
  • Ask UPPAAL to find a throughput-optimal schedule S (by the fastest trace that satisfies EF deadlock).
  • Let optIP=S.IP, ec=S.IEC and Sopt=S.
  • Ask UPPAAL to find an energy-consumption-constrained schedule S (by the fastest trace that satisfies EF (deadlock and con(ec-1))), where con(ec-1) is the energy consumption constraint with bound ec-1.
  • S.IP==optIP?
− Yes: let ec=S.IEC and Sopt=S, then repeat the previous step.
− No: return Sopt (a throughput-optimal schedule with the best energy consumption). End.
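The flowchart amounts to a short loop. The Python sketch below takes a hypothetical callable `fastest_trace(ec_bound)` in place of the UPPAAL query; it is assumed to return a schedule with fields `IP` and `IEC`, or `None` when the property does not hold (a case the flowchart does not show). The candidate schedules and the `fake_uppaal` stand-in are made-up numbers for illustration only.

```python
from collections import namedtuple

Schedule = namedtuple("Schedule", "IP IEC")


def throughput_optimal(fastest_trace):
    """Among throughput-optimal schedules, find one with the lowest energy consumption.

    fastest_trace(None) stands for the query EF deadlock;
    fastest_trace(b) stands for EF (deadlock and con(b)).
    """
    s = fastest_trace(None)
    opt_ip, ec, s_opt = s.IP, s.IEC, s
    while True:
        s = fastest_trace(ec - 1)          # demand strictly less energy than the best so far
        if s is None or s.IP != opt_ip:    # cheaper schedules are slower, or none exist
            return s_opt
        ec, s_opt = s.IEC, s


# Illustrative stand-in for the model checker.
candidates = [Schedule(6, 720), Schedule(6, 700), Schedule(8, 650)]


def fake_uppaal(ec_bound):
    ok = [c for c in candidates if ec_bound is None or c.IEC <= ec_bound]
    return min(ok, key=lambda c: c.IP) if ok else None


best = throughput_optimal(fake_uppaal)
```

The loop stops as soon as tightening the energy bound costs throughput, so the last schedule kept is throughput-optimal with the best energy consumption among the candidates.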

SLIDE 23

Energy-Optimal Solution

A priced timed automaton pta_p(α) models the behavior of actor α running on processor p.

Energy consumption is modeled as cost rates of locations (iEC on idle and uEC on running).

The behavior of processor p:

$pta_p = \exists \alpha \in V : pta_p(\alpha)$

The behavior of M=(G,P) is modeled by an NPTA with a global clock glbClk.

SLIDE 24

Energy-Optimal Solution

[Flowchart, read top to bottom:]

  • Begin.
  • Ask UPPAAL CORA to find an energy-optimal schedule S (by the best trace that satisfies EF deadlock).
  • Let optIEC=S.IEC.
  • Ask UPPAAL to find an energy-optimal schedule S (by the fastest trace that satisfies EF (deadlock and con(optIEC))).
  • Return S (an energy-optimal schedule with the best throughput). End.
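The two-step procedure can be sketched in a few lines of Python. Both callables are hypothetical stand-ins: `cheapest_trace()` for the UPPAAL CORA minimum-cost query and `fastest_trace(bound)` for the UPPAAL fastest-trace query under the energy constraint. The candidate schedules are made-up numbers.

```python
from collections import namedtuple

Schedule = namedtuple("Schedule", "IP IEC")


def energy_optimal(cheapest_trace, fastest_trace):
    """Step 1: find the minimal energy. Step 2: find the fastest schedule within that energy."""
    opt_iec = cheapest_trace().IEC
    return fastest_trace(opt_iec)


# Illustrative stand-ins for the two model-checker queries.
candidates = [Schedule(6, 720), Schedule(9, 650), Schedule(8, 650)]


def cheapest():
    return min(candidates, key=lambda c: c.IEC)


def fastest(ec_bound):
    return min((c for c in candidates if c.IEC <= ec_bound), key=lambda c: c.IP)


best = energy_optimal(cheapest, fastest)
```

The second query matters because the cheapest trace found first need not be the fastest one at that energy level.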

SLIDE 25

Dealing with More Constraints

  • Various kinds of constraints can be integrated into our method:
− Auto-concurrency constraints: an actor can have only a limited number of concurrent firings.
− Buffer size constraints: the buffer size of each edge is limited.
− Constraints on processors: an actor is not allowed to be allocated to some processors.
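One way to picture these constraints is as extra conditions on starting a firing. The Python sketch below is an assumption-laden illustration, not the authors' encoding: in particular, it assumes that buffer space for produced tokens must already be free when a firing starts, and all names (`may_start`, `max_auto`, `buf_bound`, `allowed`) are hypothetical.

```python
def may_start(actor, proc, tokens, running, edges, max_auto, buf_bound, allowed):
    """Extra start-of-firing conditions for the three kinds of constraints."""
    # Constraints on processors: the actor must be allowed on this processor.
    if proc not in allowed[actor]:
        return False
    # Auto-concurrency: bound on simultaneously running firings of the same actor.
    if running[actor] >= max_auto[actor]:
        return False
    # Buffer sizes (assumed semantics): room for the produced tokens on every outgoing edge.
    return all(tokens[(s, d)] + prd <= buf_bound[(s, d)]
               for (s, d, prd, cns) in edges if s == actor)


# Assumed example graph and bounds, for illustration only.
edges = [("A", "B", 2, 1), ("B", "C", 1, 1), ("C", "A", 1, 2)]
tokens = {("A", "B"): 2, ("B", "C"): 0, ("C", "A"): 1}
buf_bound = {("A", "B"): 2, ("B", "C"): 1, ("C", "A"): 3}
max_auto = {"A": 1, "B": 1, "C": 1}
allowed = {"A": {"p1", "p2"}, "B": {"p1", "p2"}, "C": {"p1"}}

first_b = may_start("B", "p1", tokens, {"A": 0, "B": 0, "C": 0},
                    edges, max_auto, buf_bound, allowed)
second_b = may_start("B", "p2", tokens, {"A": 0, "B": 1, "C": 0},
                     edges, max_auto, buf_bound, allowed)
c_on_p2 = may_start("C", "p2", tokens, {"A": 0, "B": 0, "C": 0},
                    edges, max_auto, buf_bound, allowed)
```

In a timed automata encoding, conditions like these would be conjoined with readyS(α) in the guard of the edge that starts a firing.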

SLIDE 26

Outline

  • Model Description and Problem Formulation
  • Basic Idea of Our Methods
  • A Timed Automata Semantics of System Models
  • Static Optimal Scheduling and Mapping
  • Dealing with More Constraints
  • Case Studies
  • Conclusions and Future Work

SLIDE 27

MPEG-4 Decoder [Theelen et al. 2012] with different parameters

System model: [Figure: SDFG of the MPEG-4 decoder, annotated with the number of firings in an iteration.]

Experimental results: [Table: results for different numbers of iterations, numbers of processors and buffer bounds.]

SLIDE 28

MPEG-4 Decoder [Theelen et al. 2012] with different parameters

Experimental results

− When a low buffer size bound is used, increasing the number of iterations and the number of processors brings little improvement.
− A high bound provides more room for improvement, at the cost of longer execution time and larger memory consumption.
  • When four processors, 2-schedules and IEC are considered, state explosion occurs and hence our methods perform poorly.

SLIDE 29

Computation Example [Bouyer2011] with different parameters

  • Used to measure the impact of the number of iterations (f) in a schedule cycle.
  • Set f from 1 to 10, considering combinations of values of three parameters:
− with and without a buffer bound
− with and without auto-concurrency
− 2 processors and 4 processors
  • Observations from the experimental results:
− The throughput and energy consumption of schedules improve as f increases, but the degree of improvement decreases as f grows.
− The buffer size bound and auto-concurrency constraints have a larger impact on the cases with 4 processors than on those with 2 processors.

SLIDE 30

Outline

  • Model Description and Problem Formulation
  • Basic Idea of Our Methods
  • A Timed Automata Semantics of System Models
  • Static Optimal Scheduling and Mapping
  • Dealing with More Constraints
  • Case Studies
  • Conclusions and Future Work

SLIDE 31

Conclusions

  • A concise (priced) timed automata semantics of system models.
− Its conciseness and flexibility make our methods easy to extend to additional constraints.
  • Exact methods for optimally scheduling SDFGs on heterogeneous multiprocessor platforms:
− throughput and energy consumption optimization
− various parameters, including unfolding factors and constraints on auto-concurrency, buffer sizes and processors
  • The methods deal with moderate-scale models within reasonable execution time, and the experiments reveal how different parameters affect the results for different optimization goals.

SLIDE 32

Future Work

  • Deal with larger system models
  • Try to provide more domain insight when encoding the considered problems as model checking problems
  • Try to tailor model checking techniques to deal with specialized tasks, instead of using a general model checker directly
  • Deal with communications between processors
  • Consider heuristics

Yu-Lei Gu, Xue-Yang Zhu and Guangquan Zhang. Pareto Optimal Scheduling of Synchronous Data Flow Graphs via Parallel Methods. In Proc. of the 1st International Symposium on Dependable Software Engineering: Theories, Tools and Applications (SETTA 2015). Nanjing, China, November 4-6, 2015. LNCS, vol. 9409, pp. 217-223.

SLIDE 33

Reference (SDFG & Model Checking)

  • Geilen, M., Basten, T., Stuijk, S.: Minimising buffer requirements of synchronous dataflow graphs with model checking. In: Proc. of the 42nd Annu. Design Automation Conf. (DAC) (2005)
  • Gu, Z., Yuan, M., Guan, N., Lv, M., He, X., Deng, Q., Yu, G.: Static scheduling and software synthesis for dataflow graphs with symbolic model-checking. In: Proc. of 28th International Real-Time Systems Symposium (RTSS), pp. 353–364 (2007)
  • Hartel, P.H., Ruys, T.C., Geilen, M.C.: Scheduling optimisations for SPIN to minimise buffer requirements in synchronous data flow. In: Proc. of the International Conference on Formal Methods in Computer-Aided Design (FMCAD), p. 21 (2008)
  • Theelen, B., Katoen, J.P., Wu, H.: Model checking of scenario-aware dataflow with CADP. In: Proc. of the Conference on Design, Automation and Test in Europe (DATE), pp. 653–658 (2012)
  • Malik, A., Gregg, D.: Orchestrating stream graphs using model checking. ACM Trans. Archit. Code Optim. 10(3), 19:1–19:25 (2013)
  • ......
SLIDE 34

Thanks!