Efficient Execution of Dependent Tasks on Many-Core Processors



SLIDE 1

Efficient Execution of Dependent Tasks on Many-Core Processors

Hamza Rihani, Claire Maiza, Matthieu Moy

Univ. Grenoble Alpes, Verimag

RTSOPS 2016, July 5, 2016

SLIDE 4

Context

[Figure: a high-level-language program with tasks τ1–τ6 and labels i1, i2, alongside a grid of 16 processing elements (PE).]

  • Hard real-time systems
  • Dependent tasks statically scheduled on a many-core processor

! Unpredictable delays due to shared resource interference

  • Use tightly estimated upper bounds on delays
  • Connect existing approaches for an optimally efficient execution

SLIDE 5

Outline

1 Solved Problems

  • Code Generation
  • Task Mapping
  • WCRT Analysis

2 Toward a Solution

3 The Open Problem

SLIDE 6

Solved Problems

[Figure: tool-flow diagram. Labels: High-level Program, Code Generation, Binary Generation, Executable Binary, Tasks, Dependencies, Local WCRT Analysis, Timing models (static analysis), Probabilistic Models, Tasks WCRT, WC Access, Static Mapping/Scheduling, Mapping, Execution Order, Release Dates, WCRT with Interferences.]

SLIDE 8

Solved Problems: Code Generation

[Figure: task dependency graph with nodes τ1–τ6 and labels i1, i2.]

  • Outputs
      • Task binaries
      • Task dependency graph
  • Execution models (Pellizzoni et al. [6])
      • Single-phase execution
      • Acquisition, execution, and replication phases

SLIDE 9

Solved Problems: Task Mapping/Scheduling

[Figure: schedule of tasks τ0–τ5 on PE0–PE2; each task τx is drawn with its bound wcrt_x.]

(*) wcrt_x: safe WCRT

  • Respect the dependency constraints
  • Optimize the overall response time

Puffitsch et al. 2013 [7], Giannopoulou et al. 2013 [4], Walter et al. 2015 [8]
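
As a rough illustration of the two constraints above, the following Python sketch derives static release dates from safe WCRT bounds once a mapping and a per-PE execution order have been chosen. The function name `release_dates`, the dictionary layout, and the example numbers are assumptions made for this sketch; they are not taken from the cited mapping tools.

```python
def release_dates(wcrt, deps, order_on_pe):
    """Earliest static release date of each task (hypothetical helper).

    wcrt:        dict task -> safe WCRT bound
    deps:        dict task -> list of predecessor tasks
    order_on_pe: dict PE -> list of its tasks in execution order
    """
    # The task that runs just before each task on the same PE.
    prev_on_pe = {}
    for tasks in order_on_pe.values():
        for before, after in zip(tasks, tasks[1:]):
            prev_on_pe[after] = before

    release = {}

    def visit(task, stack=()):
        if task in release:
            return release[task]
        if task in stack:
            raise ValueError("dependencies and execution order form a cycle")
        preds = list(deps.get(task, []))
        if task in prev_on_pe:
            preds.append(prev_on_pe[task])
        # A task starts once every predecessor has surely finished.
        release[task] = max(
            (visit(p, stack + (task,)) + wcrt[p] for p in preds), default=0
        )
        return release[task]

    for task in wcrt:
        visit(task)
    return release


# Illustrative numbers only.
wcrt = {"t0": 3, "t1": 2, "t2": 4, "t3": 1}
deps = {"t2": ["t0", "t1"], "t3": ["t2"]}
order = {"PE0": ["t0", "t2"], "PE1": ["t1", "t3"]}

release = release_dates(wcrt, deps, order)
makespan = max(release[t] + wcrt[t] for t in wcrt)
print(release, makespan)  # {'t0': 0, 't1': 0, 't2': 3, 't3': 7} 8
```

Optimizing the overall response time then amounts to searching over mappings and execution orders for the smallest `makespan`; that search is not shown here.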

SLIDE 11

Solved Problems: WCRT Analysis

[Figure: schedule of tasks τ0–τ5 on PE0–PE2; each task τx is drawn with its refined bound wcrt+_x.]

(*) wcrt+_x: refined WCRT

  • Take the interference into account
  • Update the release times

The overall response time may not be optimal
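
The two steps of this analysis (account for interference, then update the release times) feed each other, so they can be sketched as a fixed-point iteration. The Python below is a minimal sketch under loudly stated assumptions: the interference model (each task overlapping in time on another PE delays at most `min(accesses)` accesses by `delay` time units), the function name `refine_wcrt`, and all numbers are hypothetical and are not the model of the cited analyses.

```python
def refine_wcrt(tasks, pe_of, deps, wcet, accesses, delay, max_iter=100):
    """Iterate release dates and interference until nothing changes.

    tasks must be listed in an order compatible with both the
    dependencies and the per-PE execution order (assumption).
    """
    wcrt = dict(wcet)  # start from interference-free bounds
    for _ in range(max_iter):
        # Step 1: release dates implied by the current bounds.
        release, last_end = {}, {}
        for t in tasks:
            ready = max((release[p] + wcrt[p] for p in deps.get(t, [])), default=0)
            release[t] = max(ready, last_end.get(pe_of[t], 0))
            last_end[pe_of[t]] = release[t] + wcrt[t]

        # Step 2: interference from tasks on other PEs whose
        # execution windows overlap with the task's own window.
        new = {}
        for t in tasks:
            start, end = release[t], release[t] + wcrt[t]
            hits = sum(
                min(accesses[t], accesses[u])
                for u in tasks
                if pe_of[u] != pe_of[t]
                and release[u] < end
                and start < release[u] + wcrt[u]
            )
            # Never lower a bound: keeps the iteration monotone.
            new[t] = max(wcrt[t], wcet[t] + delay * hits)

        if new == wcrt:
            return release, wcrt
        wcrt = new
    raise RuntimeError("no fixed point within max_iter iterations")


# Illustrative numbers only.
tasks = ["t0", "t1", "t2"]
pe_of = {"t0": 0, "t1": 1, "t2": 0}
deps = {"t2": ["t1"]}
wcet = {"t0": 4, "t1": 3, "t2": 2}
accesses = {"t0": 2, "t1": 1, "t2": 1}

release, wcrt = refine_wcrt(tasks, pe_of, deps, wcet, accesses, delay=1)
print(release, wcrt)
```

In this toy run, t0 and t1 overlap and each gains one unit of interference, which pushes the release of t2 from 4 to 5. The mapping itself is left untouched, which is why the resulting overall response time may not be optimal.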

SLIDE 12

Toward a Solution

Provide new timing information to the mapping/scheduling analysis

SLIDE 15

Toward a Solution

  • Mapping/Scheduling:
      • Taking into account new timing information
      • Co-schedule communications and computations (Melani et al. 2015 [5])
      • Clustering non-interfering tasks (Choi et al. 2016 [2])
  • WCRT Analysis:
      • Trade-off: run-time vs. pessimism (Altmeyer et al. 2015 [1], Dasari et al. 2015 [3])

Fixed-point search algorithms

SLIDE 18

The Open Problem

Iterate until an optimal solution is found

What about convergence?

Suboptimal:

  • Compute several solutions, choose the best one
  • How many iterations?

Multi/Many-core processors are a game changer in the interaction between WCRT analysis and task mapping/scheduling
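
The suboptimal strategy mentioned on this slide (compute several solutions, keep the best one) can be sketched as a bounded loop between the two analyses. Everything in the Python below is an assumption made for illustration: `explore`, the callables `map_and_schedule` and `analyse`, the iteration `budget`, and the toy functions are hypothetical stand-ins, not an existing tool interface.

```python
def explore(map_and_schedule, analyse, timing, budget):
    """Alternate mapping and WCRT analysis for at most `budget` rounds.

    map_and_schedule(timing) -> schedule (hashable)
    analyse(schedule)        -> (makespan, refined timing)
    Returns the best (makespan, schedule) pair seen.
    """
    best_makespan, best_schedule = None, None
    seen = set()
    for _ in range(budget):
        schedule = map_and_schedule(timing)
        if schedule in seen:
            # With deterministic callables the loop would only repeat
            # itself from here on (fixed point or oscillation).
            break
        seen.add(schedule)
        makespan, timing = analyse(schedule)
        if best_makespan is None or makespan < best_makespan:
            best_makespan, best_schedule = makespan, schedule
    return best_makespan, best_schedule


# Toy stand-ins that oscillate between two schedules, "A" and "B".
def toy_mapper(timing):
    return "A" if timing <= 10 else "B"


def toy_analysis(schedule):
    return {"A": (12, 12), "B": (11, 9)}[schedule]


result = explore(toy_mapper, toy_analysis, timing=0, budget=10)
print(result)  # (11, 'B')
```

The toy pair never settles: the timing produced for "A" leads the mapper to "B" and vice versa. The loop therefore stops on the first repeated schedule and returns the better of the two, which leaves the slide's question of how many iterations are enough open in general.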

SLIDE 20

References I

  • [1] S. Altmeyer, R. I. Davis, L. Indrusiak, C. Maiza, V. Nelis, and J. Reineke. A generic and compositional framework for multicore response time analysis. In Proceedings of the 23rd International Conference on Real Time and Networks Systems, RTNS ’15, pages 129–138. ACM, 2015.

  • [2] J. Choi, D. Kang, and S. Ha. Conservative modeling of shared resource contention for dependent tasks in partitioned multi-core systems. In 2016 Design, Automation & Test in Europe Conference & Exhibition (DATE), pages 181–186.

  • [3] D. Dasari, V. Nelis, and B. Akesson. A framework for memory contention analysis in multi-core platforms. Real-Time Systems, pages 1–51, 2015.

SLIDE 21

References II

  • [4] G. Giannopoulou, N. Stoimenov, P. Huang, and L. Thiele. Scheduling of mixed-criticality applications on resource-sharing multicore systems. In Embedded Software (EMSOFT), 2013 Proceedings of the International Conference on, pages 1–15, Sept 2013.

  • [5] A. Melani, M. Bertogna, V. Bonifaci, A. Marchetti-Spaccamela, and G. Buttazzo. Memory-processor co-scheduling in fixed priority systems. In 23rd ACM International Conference on Real-Time Networks and Systems (RTNS), Lille, France, November 2015.

  • [6] R. Pellizzoni, A. Schranzhofer, J.-J. Chen, M. Caccamo, and L. Thiele. Worst case delay analysis for memory interference in multicore systems. In Proceedings of the Conference on Design, Automation and Test in Europe, DATE ’10, pages 741–746.

SLIDE 22

References III

  • [7] W. Puffitsch, E. Noulard, and C. Pagetti. Mapping a multi-rate synchronous language to a many-core processor. In Real-Time and Embedded Technology and Applications Symposium (RTAS), 2013 IEEE 19th, pages 293–302.

  • [8] J. Walter and W. Nebel. Energy-aware mapping and scheduling of large-scale macro data-flow applications. In 1st International Workshop on Investigating Dataflow in Embedded Computing Architecture, 2015.
