WCET Analysis for Multi-Core Processors with Shared Buses and Event-Driven Bus Arbitration
Michael Jacobs, Sebastian Hahn, Sebastian Hack
Department of Computer Science, Saarland University
November 16, 2015

Considered HW Platform
Multi-core processor with n cores
Shared bus
◮ Connecting the cores to the memory
◮ Event-driven bus arbitration
◮ Running example: round-robin
Figure: cores C_1, C_2, ..., C_n, connected by the shared bus to the shared memory

Considered Execution Model
Set of programs: $\mathit{Progs} = \{p_1, \dots, p_{|\mathit{Progs}|}\}$
Per program $p_i \in \mathit{Progs}$: minimum inter-start time ($\mathit{mist}_{p_i}$)
◮ Optional
◮ Zero if not specified
Scheduling:
◮ Partitioned
◮ Non-preemptive

WCET Analysis for Multi-Core Processors
Calculate WCET bound for a program executed on a core
◮ Must consider shared-resource interference!
◮ E.g. cycles blocked at shared bus
Two kinds of WCET bounds:
Co-runner-insensitive
◮ Independent of co-running programs
◮ Only depend on the HW platform
◮ Implicitly assume worst co-runners
Co-runner-sensitive
◮ Take into account co-running programs
◮ Consider (limited) scheduling knowledge
◮ Potentially more precise
We propose approaches for both!

Existing Approaches
Compositionality [Schranzhofer et al., 2011]
◮ WCET analysis ignores bus blocking
◮ Bound on blocked cycles is added
◮ Ignores indirect effects
⇒ Unsound for many HW platforms, e.g.
◮ In-order pipelines with unblocked stores
◮ Out-of-order pipelines
Enumerate possible interleavings of accesses by the cores [Kelter and Marwedel, 2014]
◮ High computational complexity
◮ Strong synchronicity assumptions

Co-Runner-Insensitive Analysis

Modeling Shared-Bus Interference
By non-determinism
◮ A pending access request can be:
⋆ granted immediately or
⋆ blocked for another cycle
◮ Splits in micro-architectural analysis
Bounding the non-determinism
◮ Worst-case per access request
◮ E.g. for round-robin arbitration
⋆ Each concurrent core is granted a complete access first
Path analysis
◮ Find longest path through graph
◮ Modeled as integer linear program (ILP)
◮ Classical implicit path enumeration [Li and Malik, 1995]
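
The round-robin worst case named on this slide can be sketched as follows. This is an illustration only, not the authors' analysis: it assumes that every bus access occupies the bus for a fixed number of cycles (`access_latency`, a hypothetical parameter) and that the worst case is exactly one complete access by each of the other cores. The function names are made up for this sketch.

```python
def worst_case_blocked_cycles(n_cores: int, access_latency: int) -> int:
    """Upper bound on the cycles one access request waits under round-robin.

    Assumption (not stated on the slide): every access occupies the bus for
    exactly `access_latency` cycles, and in the worst case each of the other
    n_cores - 1 cores is granted one complete access first.
    """
    if n_cores < 1 or access_latency < 0:
        raise ValueError("need n_cores >= 1 and access_latency >= 0")
    return (n_cores - 1) * access_latency


def possible_blocked_cycles(n_cores: int, access_latency: int) -> range:
    """All blocking delays the analysis has to consider for one request.

    Each value corresponds to one branch of the non-deterministic split:
    granted immediately (0 cycles) up to the round-robin worst case.
    """
    return range(worst_case_blocked_cycles(n_cores, access_latency) + 1)


if __name__ == "__main__":
    for n in (2, 4):
        print(n, "cores:", worst_case_blocked_cycles(n, access_latency=10))
```

Under these assumptions the number of branches per access request grows linearly with the number of cores, which is one way to read the scalability problem reported on the following slides.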

Experimental Evaluation
Hardware configuration
◮ In-order execution
◮ Local instruction scratchpad (fitting whole program)
◮ Local data cache (misses served via bus)
◮ Round-robin bus arbitration
31 benchmarks
◮ Mälardalen
◮ Generated from SCADE models
Results normalized to analysis ignoring bus interference
Geometric mean over normalized results
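
The normalization and aggregation described on this slide can be sketched in a few lines. The function name and the input numbers in the usage part are hypothetical and are not taken from the evaluation; only `statistics.geometric_mean` is a real standard-library call (Python 3.8 or later).

```python
from statistics import geometric_mean
from typing import Sequence


def normalized_geomean(with_bus: Sequence[float],
                       without_bus: Sequence[float]) -> float:
    """Geometric mean of per-benchmark ratios result / baseline.

    `with_bus` holds one result per benchmark (e.g. a WCET bound from the
    analysis that models bus interference); `without_bus` holds the matching
    result of the analysis that ignores bus interference.
    """
    if len(with_bus) != len(without_bus):
        raise ValueError("need one baseline value per benchmark")
    ratios = [w / b for w, b in zip(with_bus, without_bus)]
    return geometric_mean(ratios)


if __name__ == "__main__":
    # Hypothetical numbers for two benchmarks, for illustration only.
    print(normalized_geomean([200.0, 450.0], [100.0, 150.0]))
```

A value of 1.0 would mean that modeling bus interference changes nothing; the tables on the later slides report such normalized values.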

Poor Scalability
Non-determinism increases with number of cores

                      2-Core    4-Core
analysis runtime       8.878    38.840
peak memory cons.      1.581     3.616

Exploiting Pipeline Convergence
Pipeline states often converge
◮ After a few cycles blocked at the bus
◮ State unchanged until access finished
◮ Converged chain, e.g. for s_5
UB time dominated by last state in chain
◮ Safely replace chain by last state in it
Fast-forwarding of converged chains
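
The convergence idea can be illustrated with a toy model. The sketch below is not the authors' implementation: it assumes a hypothetical `step_blocked` function that advances an abstract pipeline state by one cycle while the bus access is still blocked, and it treats a state that maps to itself as converged. Once that happens, the remaining blocked cycles are skipped instead of being simulated one by one.

```python
from typing import Callable, Tuple, TypeVar

S = TypeVar("S")


def fast_forward(state: S, blocked_cycles: int,
                 step_blocked: Callable[[S], S]) -> Tuple[S, int]:
    """Return the state after `blocked_cycles` blocked cycles and the number
    of cycles that actually had to be simulated."""
    simulated = 0
    remaining = blocked_cycles
    while remaining > 0:
        nxt = step_blocked(state)
        simulated += 1
        remaining -= 1
        if nxt == state:
            # Converged: further blocked cycles leave the state unchanged,
            # so the rest of the chain can be skipped.
            break
        state = nxt
    return state, simulated


def drain(pending: int) -> int:
    """Toy pipeline state: instructions that can still make progress while
    the access is blocked. One of them completes per cycle."""
    return max(pending - 1, 0)


if __name__ == "__main__":
    # 30 blocked cycles, but the toy pipeline converges after a few steps.
    print(fast_forward(3, 30, drain))
```

In this toy model the simulated work depends on how quickly the pipeline converges, not on how long the access is blocked, which matches the claim that fast-forwarding makes the cost largely independent of the number of cores.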

Improved Scalability
Fast-forwarding improves scalability
In-order execution

                      instr. scratchpad,     instr. cache,
                      data cache             data cache
                      2-Core    4-Core       2-Core    4-Core
WCET bound             1.604     2.803        1.678     3.028
analysis runtime       1.685     1.670        5.905     5.903
peak memory cons.      1.056     1.056        1.430     1.423

Runtime and memory consumption independent of n

Improved Scalability
Fast-forwarding improves scalability
Out-of-order execution

                      instr. scratchpad,     instr. cache,
                      data cache             data cache
                      2-Core    4-Core       2-Core    4-Core
WCET bound             1.657     2.965        1.726     3.175
analysis runtime       3.339     3.473       39.170    47.271
peak memory cons.      1.165     1.187        6.303     7.591

Moderate growth of runtime and memory consumption w.r.t. n

Co-Runner-Sensitive Analysis

Iterative Co-Runner-Sensitive Analysis
Analysis loop (from the slide's diagram):
◮ Co-runner-insensitive analysis yields $W$
◮ Blocked cycle bound: $BC = \sum_{C_j \in \mathit{Conc}_i} \alpha_{C_j}(W)$
◮ ILP path analysis with the additional constraint $\sum_{e \in \mathit{Edges}} \mathit{timesTaken}_e \cdot \mathit{LB}^{\mathit{blocked}}_e \le BC$ yields a new $W$
◮ Repeat until $W$ reaches fixed point
Where:
◮ $C_i$ = core under analysis
◮ $\mathit{Conc}_i = \mathit{Cores} \setminus \{C_i\}$
◮ $\alpha_{C_j}(W)$ = upper bound on number of access cycles of core $C_j$ in $W$ cycles
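
The loop on this slide can be sketched as a plain fixed-point iteration. Everything below is a toy under stated assumptions: `path_analysis` stands in for the ILP path analysis with the additional blocked-cycle constraint, each entry of `alphas` stands in for one $\alpha_{C_j}$, and the numbers in the usage part are hypothetical. It is not the authors' implementation.

```python
from typing import Callable, Sequence


def iterate_wcet_bound(w_insensitive: int,
                       path_analysis: Callable[[int], int],
                       alphas: Sequence[Callable[[int], int]],
                       max_rounds: int = 100) -> int:
    """Refine a co-runner-insensitive bound W until it stops changing.

    path_analysis(bc): WCET bound when at most `bc` cycles may be blocked.
    alphas: one function per concurrent core, mapping a window length W to
            an upper bound on that core's access cycles within W cycles.
    """
    w = w_insensitive
    for _ in range(max_rounds):
        bc = sum(alpha(w) for alpha in alphas)   # blocked cycle bound BC
        # Guard: never let the bound grow above the previous iterate.
        w_new = min(w, path_analysis(bc))
        if w_new == w:                           # fixed point reached
            return w
        w = w_new
    return w


if __name__ == "__main__":
    # Toy path analysis: 1000 cycles without blocking, at most 600 blocked.
    toy_path_analysis = lambda bc: 1000 + min(bc, 600)
    # Toy co-runner: at most 200 access cycles per run, runs start at least
    # 2000 cycles apart (integer ceiling via negated floor division).
    toy_alpha = lambda w: min(w, (-(-w // 2000) + 1) * 200)
    print(iterate_wcet_bound(1600, toy_path_analysis, [toy_alpha]))
```

In the toy run the bound drops from the co-runner-insensitive value in the first round and is confirmed as a fixed point in the second.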

Upper-Bounding Concurrent Access Cycles
Meaning of $\alpha_{C_j}(W)$
◮ How many access cycles can core $C_j$ perform at most in any interval of $W$ time units?
Our approach
◮ Micro-architectural analysis of program(s) executed on $C_j$
◮ Generalized implicit path enumeration
◮ Exploit minimum inter-start time for precision
Why generalize?
◮ Implicitly enumerate all paths ≤ $W$
◮ Path may start / end at any program point
◮ Path may span across multiple program runs
◮ Path may span across different programs
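
To make the role of the minimum inter-start time concrete, here is a much coarser bound than the generalized path enumeration named on the slide. It is a swapped-in simplification, not the authors' technique: it assumes a single program on the co-running core, a known upper bound on access cycles per run (`accesses_per_run`, a hypothetical parameter), and counts how many runs can overlap a window.

```python
def alpha_coarse(window: int, accesses_per_run: int, mist: int) -> int:
    """Coarse upper bound on access cycles of one core in `window` cycles.

    Assumptions (simplifications made for this sketch):
    - one program runs on the core, non-preemptively;
    - each run performs at most `accesses_per_run` access cycles;
    - consecutive runs start at least `mist` cycles apart (0 = unspecified).
    """
    if window <= 0:
        return 0
    if mist <= 0:
        # No inter-start information: the core may access in every cycle.
        return window
    # Runs that start inside the window, plus one run already in progress.
    runs = -(-window // mist) + 1        # ceil(window / mist) + 1
    return min(window, runs * accesses_per_run)


if __name__ == "__main__":
    print(alpha_coarse(window=1600, accesses_per_run=200, mist=2000))
```

Unlike this sketch, the generalized implicit path enumeration also reasons about where in the program a window may start and end, which is where the additional precision mentioned on the slide would come from.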

Experimental Evaluation
Hardware configuration:
◮ Dual-core processor
◮ Out-of-order execution
◮ Instruction cache
◮ Data cache
◮ Round-robin bus arbitration
Setup for experiments:
19 programs of our benchmark suite
◮ Those for which the co-runner-insensitive analysis needed ≤ 5 minutes
Co-runner-sensitive analysis for all $19^2$ possible pairs
◮ 361 experiments
In each experiment
◮ One program per core
◮ Minimum inter-start time of co-runner identical to its WCET bound
