SLIDE 1

Message scheduling to reduce AFDX jitter in a mixed NoC/AFDX architecture

Jérôme Ermont, Sandrine Mouysset,

Jean-Luc Scharbarg and Christian Fraboul

Université de Toulouse - IRIT - INPT/ENSEEIHT

October 12, 2018

J. Ermont, S. Mouysset, J.-L. Scharbarg, C. Fraboul

RTNS 2018 - 10-12 october 2018 1/15

SLIDE 2

Context: Avionics architecture of modern planes

Avionics computers:

Mono-core processors: execute avionics functions following IMA (Integrated Modular Avionics)

End Systems (ES): interface between CPU and AFDX

AFDX network:

Interconnection of several avionics computers

VL (Virtual Link): unidirectional flow from one source ES to one or more destination ES

BAG (Bandwidth Allocation Gap): minimum time interval between 2 consecutive frames of a VL
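The BAG definition above can be checked mechanically. The following is a minimal sketch, assuming emission instants are given in milliseconds; the function name and the sample values are hypothetical and not taken from the slides.

```python
def respects_bag(tx_times_ms, bag_ms):
    """Return True if every pair of consecutive frames is at least bag_ms apart."""
    return all(later - earlier >= bag_ms
               for earlier, later in zip(tx_times_ms, tx_times_ms[1:]))


# Hypothetical emission instants (ms) for a VL with a 4 ms BAG.
print(respects_bag([0.0, 4.0, 8.5, 12.5], bag_ms=4))  # frames spaced by >= 4 ms
print(respects_bag([0.0, 3.0, 8.0], bag_ms=4))        # second frame is too early
```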

SLIDE 3

Transmission of VLs by an ES

VLs from different partitions share the same ES

A scheduler arbitrates between the VLs in the ES

It introduces a jitter: delay between the beginning of the BAG and the effective transmission of the frame

AFDX constraint: jitter < 500 µs
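As a rough illustration of the jitter definition and the 500 µs bound, here is a small sketch. It assumes all instants are expressed in microseconds; the function and constant names are illustrative only.

```python
MAX_JITTER_US = 500  # AFDX jitter bound quoted on the slide


def jitter_us(bag_start_us, tx_start_us):
    """Delay between the beginning of the BAG and the effective transmission."""
    if tx_start_us < bag_start_us:
        raise ValueError("a frame cannot be sent before its BAG starts")
    return tx_start_us - bag_start_us


def meets_jitter_constraint(bag_start_us, tx_start_us):
    """True if the jitter is strictly below the AFDX bound."""
    return jitter_us(bag_start_us, tx_start_us) < MAX_JITTER_US
```

For example, a frame whose BAG starts at 2000 µs and that is actually sent at 2120 µs has a jitter of 120 µs and satisfies the constraint.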

SLIDE 4

Envisioned avionics architecture

Replace mono-core processing units by many-core processors

Different applications can be executed in parallel

2 different kinds of communication:

Intra-NoC communication

Inter-NoC communication

SLIDE 12

Problem Statement

[Figure: timeline showing the DMA commands for VL1 and VL2, the WCTT of each DDR-to-ETH transfer, the instants t1 and t2, the BAG of VL1, and the resulting jitter on the frames of VL1]

The jitter depends on the WCTT (Worst-Case Traversal Time) of flows from other applications

The WCTT depends on the blocking mechanism of the NoC

Problem: how to reduce the jitter induced by the transmission on the NoC?
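The dependence of the jitter on other flows can be sketched with a deliberately simplified model. It assumes that DDR-to-ETH transfers are served one at a time in a fixed order and that each one lasts exactly its WCTT; the function name, the tuple layout and the numbers are hypothetical, and the flow's own transfer time is ignored.

```python
def fifo_delays(transfers):
    """Compute how long each transfer waits before it can start.

    transfers: list of (name, ready_us, wctt_us) in service order.
    Returns {name: delay between the ready time and the start of the transfer}.
    """
    delays = {}
    port_free_at = 0
    for name, ready_us, wctt_us in transfers:
        start = max(ready_us, port_free_at)  # wait for earlier transfers to finish
        delays[name] = start - ready_us
        port_free_at = start + wctt_us
    return delays


# VL2 is served first and occupies the path for 300 us, so VL1 (ready at 100 us)
# only starts at 300 us and is delayed by 200 us.
print(fifo_delays([("VL2", 0, 300), ("VL1", 100, 200)]))
```

In this model the delay seen by VL1 grows with the WCTT of every transfer queued ahead of it, which is the effect the slide describes.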

SLIDE 13

A first solution: minimizing the contention for other applications

A new application mapping: Extended MapIO [1]

[Figure: Extended MapIO mapping of the applications FADEC7, FADEC11, FADEC13, HM7, HM9, HM10, HM11, HM12 and HM16 on the NoC, showing their tasks (tf*, th*), the tDDR task and the I/O ports]

Minimizing contentions reduces the maximum jitter

But not for all configurations

[1] Towards a mixed NoC/AFDX architecture for avionics applications, Laure Abdallah et al., WFCS 2017

SLIDE 14

Our proposition

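One node is dedicated to scheduling the VLs on the ES

Use of a TDMA table

TDMA table (columns: slots 1 to 31; rows: lines 0 to 127; only the applications present on each line are recoverable):

Line 0: APP4 APP2

Line 1: APP1 APP3

Line 2: (empty)

Line 3: APP1

Line 4: APP2

Line 5: APP1 APP3

Line 6: (empty)

Line 7: APP1

Line 8: APP2

Line 9: APP1 APP3

Line 10: (empty)

Line 11: APP1

Line 12: APP2

Line 13: APP1 APP3

Line 14: (empty)

Line 15: APP1

...

Line 127: APP1 APP3

Annotation on the figure: WCTT of VL1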

SLIDE 19

Our proposition

One node is dedicated to scheduling the VLs on the ES

Use of a TDMA table

[Figure: TDMA table (lines 0 to 127, slots 1 to 31) in which APP4 appears on line 0; annotation: WCTT of VL1]

BAG of VL4 (from APP4): 128 ms

VL4 is ready when line 6 of the table is executed

VL4 will wait for line 0

How to reduce this waiting delay?

Our solution: give more slots to the VLs → oversampling of slots

SLIDE 20

Our proposition

One node is dedicated to scheduling the VLs on the ES

Use of a TDMA table

TDMA table with oversampling of slots (columns: slots 1 to 31; rows: lines 0 to 127; only the applications present on each line are recoverable):

Line 0: APP4 APP2

Line 1: APP1 APP3

Line 2: (empty)

Line 3: APP1

Line 4: APP4 APP2

Line 5: APP1 APP3

Line 6: (empty)

Line 7: APP1

Line 8: APP4 APP2

Line 9: APP1 APP3

Line 10: (empty)

Line 11: APP1

Line 12: APP4 APP2

Line 13: APP1 APP3

Line 14: (empty)

Line 15: APP1

...

Line 127: APP1 APP3

Annotation on the figure: WCTT of VL1

BAG of VL4 (from APP4): 128 ms

VL4 is ready when line 6 of the table is executed: it now waits for line 8 instead of line 0
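The benefit of oversampling can be quantified with a short sketch. It assumes a cyclic table of 128 lines and measures the wait in table lines; the assumption that the oversampled APP4 slots repeat every 4 lines over the whole table extends the pattern visible on lines 0, 4, 8 and 12 and is not confirmed by the slides.

```python
def waiting_lines(ready_line, slot_lines, table_len=128):
    """Number of lines a VL waits from ready_line until its next slot.

    The table is treated as cyclic: after the last line, execution wraps to line 0.
    """
    return min((slot - ready_line) % table_len for slot in slot_lines)


# VL4 becomes ready while line 6 is executed.
single_slot = waiting_lines(6, [0])                  # only line 0 holds a slot
oversampled = waiting_lines(6, range(0, 128, 4))     # assumed slot every 4 lines

print(single_slot)   # 122
print(oversampled)   # 2
```

With a single slot, VL4 waits for the table to wrap around; with the extra slots it is served at line 8.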

SLIDE 21

How to map the slots to the VLs in the table ?

Constraint: the VLs should respect their BAGs

VLs with BAG = 1 ms are allocated to all lines

Allocation by considering the minimum BAG value (> 1 ms)

[Figure: empty TDMA table (lines 0 to 127, slots 1 to 31) with the applications APP1, APP2, APP3 and APP4 still to be allocated]

SLIDE 25

How to map the slots to the VLs in the table ?

Constraint: the VLs should respect their BAGs

VLs with BAG = 1 ms are allocated to all lines

Allocation by considering the minimum BAG value (> 1 ms)

[Figure: final TDMA table (lines 0 to 127, slots 1 to 31) after allocating APP1, APP2, APP3 and APP4 in turn; lines holding APP1 and APP3 alternate with lines holding APP2 and APP4]
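One possible reading of this allocation rule is that the VLs are packed into a short pattern of lines, whose length is the minimum BAG greater than 1 ms, and that this pattern is repeated over the whole table. The sketch below follows that reading only; the function name, the bin assignment and the `every_line` parameter (for applications with BAG = 1 ms) are illustrative assumptions.

```python
def expand_pattern(bin_of_app, n_bins, every_line=(), n_lines=128):
    """Repeat a pattern of n_bins lines over a table of n_lines lines.

    bin_of_app: {application: index of the pattern line it was packed into}.
    every_line: applications with BAG = 1 ms, present on all lines.
    Returns a list where entry i holds the applications served on line i.
    """
    table = []
    for line in range(n_lines):
        periodic = sorted(app for app, b in bin_of_app.items() if b == line % n_bins)
        table.append(list(every_line) + periodic)
    return table


# Hypothetical assignment with a 2-line pattern.
table = expand_pattern({"APP1": 0, "APP3": 0, "APP2": 1, "APP4": 1}, n_bins=2)
print(table[0], table[1], table[127])
```

Under this reading every application gets a slot once per pattern, so applications with a large BAG are oversampled and their waiting delay is bounded by the pattern length.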

SLIDE 26

Formulation by a Bin Packing Problem

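Objective: allocate the VL transmissions into a minimum number of lines

Number of lines in which VLs are allocated:

\[ N = \min_{j = 1, \dots, m,\ BAG_j \neq 1} BAG_j \]

Objective function:

\[ \min \sum_{i=1}^{N} y_i \]

subject to

\[ \sum_{j=1}^{m} \omega_j x_{ij} \leq C\, y_i, \quad \forall i = 1, \dots, N \]

\[ \sum_{i=1}^{N} x_{ij} = 1, \quad \forall j = 1, \dots, m \]

\[ y_i \in \{0, 1\}, \quad \forall i = 1, \dots, N \]

\[ x_{ij} \in \{0, 1\}, \quad \forall i = 1, \dots, N,\ \forall j = 1, \dots, m \]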
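The slide states the problem as an integer program; it does not say how it is solved. As an illustration only, the sketch below uses the classical first-fit decreasing heuristic, which is not guaranteed to be optimal. It assumes that the weight ω_j is the number of slots a VL needs and that the capacity C is the number of slots per line (31 in the tables shown); the VL names and weights are hypothetical.

```python
def first_fit_decreasing(weights, capacity, max_bins):
    """Pack VLs into lines with the first-fit decreasing heuristic.

    weights: {vl_name: slots needed (omega_j)}; capacity: slots per line (C);
    max_bins: number of available lines (N).
    Returns a list of lines, each a list of VL names.
    """
    bins, loads = [], []
    # Largest items first; ties broken by name for a deterministic result.
    for name, w in sorted(weights.items(), key=lambda kv: (-kv[1], kv[0])):
        if w > capacity:
            raise ValueError(f"{name} needs {w} slots, more than one line holds")
        for i, load in enumerate(loads):
            if load + w <= capacity:      # first line with enough room
                bins[i].append(name)
                loads[i] += w
                break
        else:                             # no existing line fits: open a new one
            bins.append([name])
            loads.append(w)
    if len(bins) > max_bins:
        raise ValueError(f"{len(bins)} lines needed but only {max_bins} available")
    return bins


demo = {"VL1": 12, "VL2": 10, "VL3": 9, "VL4": 8, "VL5": 5}
print(first_fit_decreasing(demo, capacity=31, max_bins=4))
# [['VL1', 'VL2', 'VL3'], ['VL4', 'VL5']]
```

In terms of the formulation, each returned line corresponds to a bin with y_i = 1, and the constraint that every VL belongs to exactly one line holds by construction.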

SLIDE 27

Evaluation case study

A 10x10 Tilera-like NoC

2 types of applications:

FADEC: engine control

critical application

small amount of exchanged data: 1500 bytes

full transmission between tasks

HM: HouseKeeping

non-critical application

large amount of exchanged data: > 130 KB

data are stored in the memory

SLIDE 28

Evaluation case study

9 applications (critical or non-critical) are considered: FADEC7 (4), FADEC11 (8), FADEC13 (16), HM7 (4), HM9 (2), HM10 (16), HM11 (32), HM12 (16), HM16 (32)

2 system configurations:

8 applications: HM7 is removed

9 applications

3 mapping strategies:

SHiC [1]: mapping by considering the core-to-core communications

MapIO [2]: mapping by considering core-to-IO communications

exMapIO [3]: mapping by considering both core-to-IO and IO-to-core communications

[1] Smart hill climbing for agile dynamic mapping in many-core systems, Mohammad Fattah et al.

[2] Reducing the contention experienced by real-time core-to-I/O flows over a Tilera-like Network on Chip, Laure Abdallah et al., ECRTS 2016

[3] Towards a mixed NoC/AFDX architecture for avionics applications, Laure Abdallah et al., WFCS 2017

SLIDE 29

VLs transmissions packed into the table

SHiC (slots 1 to 31)

Line 1: FADEC7, HM9, HM13, HM12, HM11, FADEC13, FADEC11, HM10

MapIO with 8 applications (slots 1 to 31)

Line 1: HM13, HM12, HM10, FADEC11, FADEC7, HM11, FADEC13, HM9

Ex MapIO with 8 applications (slots 1 to 31)

Line 1: HM9, HM11, FADEC7, FADEC13, HM13, HM12, HM10, FADEC11

MapIO with 9 applications (slots 1 to 31)

Line 1: HM10, HM9, HM7, HM13, FADEC13, FADEC11, HM12, HM11, FADEC7

Ex MapIO with 9 applications (slots 1 to 31)

Line 1: FADEC7, HM11, HM9, HM10, HM13, HM12, HM7, FADEC13, FADEC11

SLIDE 30

Results

The TDMA table guarantees a transmission every BAG

The jitter constraint is respected when using a dedicated node

SLIDE 31

Conclusion

Replacing mono-core processors by a NoC-based many-core architecture

Sharing the same output port could lead to an execution in which the jitter constraint is exceeded

The Extended MapIO mapping strategy minimizes the jitter by reducing the contention

But the jitter constraint can still be exceeded

Our proposition: one dedicated node schedules the outgoing flows using a TDMA table

The jitter only depends on the contentions for the outgoing flow

The jitter is then significantly reduced

Construction of a scheduling table

Guarantee of the BAG constraint

Over-allocation of slots in order to reduce waiting delays

SLIDE 32

Further works

SHiC mapping example (slots 1 to 31)

Line 1: FADEC7, HM9, HM13, HM12, HM11, FADEC13, FADEC11, HM10

What happens if HM13 needs 10 slots?

Different possible solutions:

Further reduce the contentions on outgoing flows

Relax the constraint of the minimum number of lines for larger BAG values → variable-capacity bin packing or cutting stock problem

Global transmission delay from one many-core to another via AFDX

Implementation of the solution in a real many-core system such as Tilera or Kalray
