Shared-clock methodology for time-triggered multi-cores Keith F. Athaide

Project supervisor: Michael J. Pont Technical supervisor: Devaraj Ayavoo

Communicating Process Architectures (CPA) 2008 8th-10th September 2008


Overview

 Aims  Execution policies

– Co-operative – Pre-emptive

 Execution architectures  Shared-clock architecture

– Algorithm for non-broadcast topologies

 Multi-processor microcontroller architecture  Case study description  Results  Conclusions


Aims

• Maintain the predictability and robustness of co-operative single-processor systems
  – Custom system-on-chip (SoC)
  – Time-triggered applications
• Heterogeneous processors
• How to synchronise the different processors?


Execution policy

• System has many functions
• Functions often decomposed into discretely executing blocks called tasks
  – Periodic or aperiodic tasks
• Periodic tasks may have static or dynamic periods
• Tasks have deadlines
• Tasks are executed according to a policy
  – Co-operative execution policy
  – Pre-emptive execution policy


Co-operative execution policy

Time Time  Tasks must yield control when required  Resource sharing needs no complex locking

mechanisms

– Same processor, one execution thread

 System responsiveness inversely related to longest task

execution time


Pre-emptive execution policy

Time  Tasks can interrupt each other  Interruption controlled by priorities  Predictability dependent on uniformity in pre-empting

instructions

 Problems such as priority inversion


Scheduler architectures

• Event-triggered
  – Multiple events
  – Feasibility depends on:
    • the number of events expected
    • the number of events serviceable by hardware
  – "Construct by correction"
• Time-triggered
  – Single event
  – Other events sensed by polling
  – "Correct by construction"
  – Can be power hungry


Shared-clock architecture

[Figure: shared-clock state machines. Master: on timer overflow, send Ticks, run tasks, receive ACKs. Slaves: on receiving a Tick, send an ACK, then run tasks.]


Shared-clock non-broadcast topology

• Existing implementations need communication topologies supporting broadcasts
  – Buses like CAN
• Can be simulated by point-to-point transmissions
  – Hardware or software
• Tree broadcast
  – MPI collective communication algorithm
• Lag due to point-to-point transmissions

[Figure: two tree-broadcast orderings over nodes a–i]


Multiprocessor architecture

• Network Interface Module (NIM)
  – Messaging component as peripheral or co-processor
• Debug cluster
  – Write to memories
  – Set breakpoints, stepping, etc.

[Figure: clusters (processor, debug, messaging peripheral, timer, GPIO, memory) joined via NIMs to a network-on-chip (NoC)]


Network interface modules (NIMs)

• Asynchronous communication
• Error detection
  – 12-bit checksums (CRCs)
• No automatic error correction
  – Errors cause no extra communication
  – Software notes and corrects errors
• Static routing
• Serial-parallel communication
• Variable number of channels
• Lack of predictability in communication latency might affect overall predictability of the shared-clock system

[Figure: NIM protocol stack – transport, network, data link, channels]


PH Processor

• Single interrupt
  – Built for time-triggered applications
  – Multiplexed from any number of sources
• Soft-core processor (VHDL source available)
• 32-bit reduced instruction set computer (RISC)
• MIPS I ISA (excluding patented instructions)
• Harvard architecture
• 32 registers
• 5-stage pipeline


Hardware implementation


Hardware usage of NIMs

[Chart: hardware slices used (axis 600–800) versus bits per channel (1–4), for 6, 8, and 16 channels]


Case study description

• Nine nodes
  – Mesh topology
• Three scheduler types
  – SCH1: P1 as master; P1 sends each Tick only after the previous one is acknowledged
  – SCH2: P1 as master; P1 sends Ticks in turn
  – SCH3: tree broadcast
• Relative times measured

[Figure: 3×3 mesh layout of nodes P0–P7 plus the debug node]


Timer sense times (microseconds)

[Chart: timer sense times across nodes P0, P2–P7 for SCH1, SCH2, and SCH3; axis 50–300 µs]


Timer sense times for SCH3 (microseconds)

[Chart: timer sense times for SCH3 across nodes P0, P2–P7; axis 10–80 µs]


Timer sense time jitter (microseconds)

[Chart: timer sense time jitter across nodes P0, P2–P7 for SCH1, SCH2, SCH3, and SCH3 (local); axis 0.5–1.5 µs]


Timer sense time jitter in SCH3 (microseconds)

[Chart: timer sense time jitter in SCH3 across nodes P0, P2–P7 and P1 (local); axis 0.5–1.5 µs]


Conclusions

• A custom multiprocessor microcontroller was developed for time-triggered applications
• The shared-clock protocol was employed on a nine-node mesh version of this microcontroller, using a broadcast-simulation algorithm
• Moving the broadcast simulation into software means the node sending the Ticks need only deal with its directly connected neighbours – a scalable arrangement
• The delay and jitter in SCH3 could be improved