Contention in Multicore Hardware Shared Resources: Understanding of the State of the Art
Gabriel Fernandez 1, Jaume Abella 2, Eduardo Quiñones 2, Christine Rochange 3, Tullio Vardanega 4 and Francisco J. Cazorla 2,4
14th International Workshop on Worst-Case Execution Time Analysis (WCET 2014)

Multicores: benefits and challenges
• Multicores
  – Allow higher "guaranteed performance"
    • Guaranteed as opposed to average-case
  – Interference on execution time and WCET due to contention in the access to HW shared resources
    • Challenges timing analysis
    • Higher impact than in single-core
• Contention in multicores has been deeply studied by the research community
  – Different approaches taken to contention
    • At different levels of abstraction
  – The solution space is difficult to fully understand

Motivation of this work
• Provide a sensible taxonomy of the SoA techniques
  – Identifying 'families' of techniques
  – Singling out representative works for each class
    • Without seeking absolutely exhaustive coverage
• Review each family
  – Seeking overlaps and gaps with others
  – Understanding assumptions and challenges of use
  – Gauging confidence in WCET bounds and assurance guarantees for industrial use
• Capture cross-cutting techniques

Taxonomy
• Handling Contention
  – System Centric
    • Time Analysis Frameworks
    • Task Assignment and Scheduling
      – Contention oblivious
      – Contention aware
  – WCET Centric
    • Independent Analysis
    • Joint Analysis
  – Architecture Centric
  – COTS Centric
• Axes shown on the diagram: bottom-up / top-down; idealistic-innovative / practical-pragmatic

System-centric
• Handling Contention
  – System Centric
    • Time Analysis Frameworks
    • Task Assignment and Scheduling
      – Contention aware
      – Contention oblivious

Timing analysis frameworks
• Assume replicated on-chip resources
  – SW on core suffers no parallel contention
• Model off-chip shared resources in isolation
  – Provide worst-case access timing bounds
  – Contention captured compositionally: off-chip contention in the presence of co-runners
• TDMA arbiter
  – Co-running tasks do not affect one another's execution time
  – Worst-case alignment of the requests in the TDMA
• Dynamic arbiter
  – Co-running tasks do affect one another's execution time
  – Focus on deriving bounds for the number of accesses per task in a given period of time
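The "worst-case alignment" point for a TDMA arbiter can be illustrated with a minimal sketch. It assumes a hypothetical arbiter with equal-length slots, one slot per core per period, and an access that must start and finish inside the issuing core's own slot. The function name and parameters are illustrative and do not come from any of the surveyed frameworks.

```python
def tdma_worst_case_latency(n_cores: int, slot_len: int, access_len: int) -> int:
    """Worst-case cycles from request arrival to completion under TDMA.

    Assumptions (illustrative only): equal slots of `slot_len` cycles, one
    slot per core per period, and an access of `access_len` cycles must fit
    entirely within the requesting core's slot.
    """
    if access_len > slot_len:
        raise ValueError("access does not fit in a slot")
    period = n_cores * slot_len
    latest_start = slot_len - access_len  # last offset at which the access still fits
    worst = 0
    # Try every arrival offset within one period; the core's slot is [0, slot_len).
    for arrival in range(period):
        wait = 0 if arrival <= latest_start else period - arrival
        worst = max(worst, wait + access_len)
    return worst


# Worst alignment: the request arrives one cycle too late to fit in its slot.
print(tdma_worst_case_latency(n_cores=4, slot_len=10, access_len=4))
```

Under these assumptions the result matches the closed form (n_cores - 1) * slot_len + 2 * access_len - 1, and it does not depend on what the co-runners do, which is the property the slide attributes to TDMA.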

Task allocation and scheduling
• Contention oblivious
  – The WCET of all tasks is given in input
    • WCET bounds may be determined before decisions are made on task mapping and on scheduling
  – Escape circularity in the mutual dependence between WCET analysis and schedulability analysis
• Contention aware
  – Focus on the shared last-level cache
  – Benefit from HW techniques for cache partitioning or allocate program data to different pages
  – Assume partitioned scheduling and augment assignment with colouring
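A minimal sketch of page colouring, the software mechanism behind "allocate program data to different pages". It assumes a physically indexed, set-associative last-level cache whose set-index bits extend above the page offset. All names are hypothetical, and the round-robin split of colours across cores is just one possible policy.

```python
def num_colours(cache_bytes: int, ways: int, page_bytes: int) -> int:
    """Number of page colours: pages that fit in one cache way."""
    way_bytes = cache_bytes // ways
    return max(1, way_bytes // page_bytes)


def page_colour(phys_addr: int, page_bytes: int, colours: int) -> int:
    """Colour of the physical page containing `phys_addr`."""
    return (phys_addr // page_bytes) % colours


def split_colours(colours: int, n_cores: int) -> dict[int, list[int]]:
    """Give each core a disjoint set of colours (round-robin split, an assumption)."""
    return {core: list(range(core, colours, n_cores)) for core in range(n_cores)}


colours = num_colours(cache_bytes=2 * 1024 * 1024, ways=16, page_bytes=4096)
print(colours, page_colour(0x21000, 4096, colours), split_colours(8, 2))
```

If tasks on a core only receive pages whose colour belongs to that core's set, tasks on different cores map to disjoint cache sets and cannot evict one another's lines.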

WCET-centric
• Handling Contention
  – WCET Centric
    • Independent Analysis
    • Joint Analysis

Including contention costs in WCETs
• Stall times integrated in the ILP formulation used to derive WCETs (IPET method)
  – Worst-case memory instruction latencies
  – Worst-case number of L2 cache misses
• Two philosophies to capture worst cases
  – Contextual
    • The set of concurrent threads/tasks is known at analysis time ➙ joint analysis
  – Universal
    • Concurrent tasks are unknown ➙ independent analysis
    • Needs hardware/software support
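One way to picture how stall times enter the IPET formulation is the following sketch. The symbols are assumptions introduced here for illustration: $x_i$ is the execution count of basic block $i$, $c_i$ its contention-free cost, $m_i$ a bound on its accesses to the shared resource, $\ell^{\max}$ a worst-case per-access stall, $f_e$ the flow on edge $e$, and $n_h$ the iteration bound of the loop headed by $h$.

```latex
\begin{align*}
\text{WCET} \;=\; \max \; & \sum_{i \in B} \bigl(c_i + m_i \,\ell^{\max}\bigr)\, x_i \\
\text{subject to}\quad
  & x_i \;=\; \sum_{e \in \mathrm{in}(i)} f_e \;=\; \sum_{e \in \mathrm{out}(i)} f_e
    && \forall i \in B \\
  & x_{\mathrm{entry}} = 1 \\
  & x_h \;\le\; n_h \sum_{e \in \mathrm{entry}(h)} f_e
    && \forall \text{ loop headers } h
\end{align*}
```

In the universal (independent) view $\ell^{\max}$ must hold for any co-runner, while in the contextual (joint) view it can be tightened using knowledge of the concurrent tasks.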

Joint analysis of concurrent tasks
• Approach A
  – Iterative computation
• Approach B
  – Timed automata
[Figure labels: low-level analysis of Tasks A, B and C; private resources; shared resources; analysis of possible interferences; tasks schedule; WCET analysis of Task A; analysis of possible interferences + model checking to show WCET(A) < x]
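The iterative flavour of joint analysis can be sketched as a fixed-point computation. This is a toy model under loud assumptions, not the method of any specific surveyed work: start times are fixed by a static schedule, two tasks interfere only if they run on different cores and their execution windows overlap, and each overlap costs at most min(accesses) delayed accesses. The `Task` class and `joint_wcets` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    core: int
    start: int       # fixed start time from a static schedule (assumption)
    base_wcet: int   # WCET without contention
    accesses: int    # bound on shared-resource accesses


def joint_wcets(tasks: list[Task], access_delay: int, max_iters: int = 100) -> dict[str, int]:
    """Iterate WCET estimates until the set of overlapping tasks stabilises."""
    wcet = {t.name: t.base_wcet for t in tasks}
    for _ in range(max_iters):
        new = {}
        for t in tasks:
            delayed = 0
            for o in tasks:
                if o.core == t.core:
                    continue  # same-core tasks never run in parallel
                overlap = (t.start < o.start + wcet[o.name]
                           and o.start < t.start + wcet[t.name])
                if overlap:
                    delayed += min(t.accesses, o.accesses)
            new[t.name] = t.base_wcet + delayed * access_delay
        if new == wcet:  # fixed point reached
            return wcet
        wcet = new
    raise RuntimeError("no fixed point within max_iters")


tasks = [Task("A", 0, 0, 100, 10), Task("B", 1, 105, 50, 5), Task("C", 1, 0, 40, 20)]
print(joint_wcets(tasks, access_delay=2))
```

In this example Task A initially does not overlap with Task B, but once contention from Task C stretches A's window, B starts to interfere as well. That feedback between WCET and schedule is why the computation has to iterate.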

Independent analysis
• No assumption on the concurrent workload
  – Independent of task assignment and scheduling
• Requires hardware/software support
  – To derive worst-case latencies and worst-case behaviours
  – Examples include
    • Partitioned caches: eliminate impact from concurrent tasks
    • Static bus arbiters: make it possible to derive worst-case latencies

Architecture-centric
• Handling Contention
  – Architecture Centric

Hardware support for handling contention
• Bound contention impact on access time to hardware shared resources
  – TTA (<'00), PRET ('06), CompSOC ('09), MERASA ('07), …
• Time composability
  – WCET estimates
    • The execution time of a task varies under different workloads, but its WCET estimate does not
  – Execution time
    • Same execution time under any workload
• Time composability is achieved by 'resource reservation' ➙ performance degradation
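A small worked sketch of a time-composable WCET estimate. It assumes a round-robin bus arbiter with at most one outstanding request per core and non-preemptive accesses, so each request waits for at most one access from every other core. These assumptions and both function names are illustrative and are not taken from the designs listed above.

```python
def round_robin_ubd(n_cores: int, max_access_cycles: int) -> int:
    """Upper-bound delay per request: every other core is served once first."""
    return (n_cores - 1) * max_access_cycles


def composable_wcet(isolation_cycles: int, n_accesses: int,
                    n_cores: int, max_access_cycles: int) -> int:
    """WCET estimate that charges the upper-bound delay to every access."""
    return isolation_cycles + n_accesses * round_robin_ubd(n_cores, max_access_cycles)


print(composable_wcet(isolation_cycles=1000, n_accesses=50, n_cores=4, max_access_cycles=8))
```

The estimate never changes with the co-runners, which is what makes it time composable. The price is that every access is charged the full bound even when the bus is idle, which is the kind of pessimism the slide associates with performance degradation.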

Hardware support for handling contention
• Bound contention impact on access time to hardware shared resources
  – Indirectly: bandwidth guarantees
  – Directly: access time guarantees
• Type of resources
  – Stateless (e.g. bus): access policy
  – Stateful (e.g. cache): partition to prevent task interaction
• NoC

COTS-centric
• Handling Contention
  – COTS Centric

Challenge
• Time analyzability properties of real COTS multicores
  – No assumptions can be made
  – Analyze hardware shared resources
  – Analyze their impact on execution time
  – Bounds derived by ad-hoc experiments
• Understanding timing behavior of hardware shared resources
  – The way they challenge timing analyzability
• Software cache partitioning on ARM A9

Critique

System-centric
• Time analysis frameworks: assumptions
  – One shared resource, blocking and no split
  – Program broken down into superblocks with resource usage bounds per block
  – Dynamic arbiters
    • WCET estimate dependent on co-runners: this can be tightened but it is no longer time composable
• Task assignment and scheduling
  – Static task-to-CPU assignment determines opponents
    • This is good but not enough unless you have a viable technique to avoid exploring the space of all possible contentions
    • Static over-provisioning is never good news and may defeat the purpose

WCET-centric techniques
• Assumptions
  – Independent analysis: static (boundable) arbitration of shared resources
  – Joint analysis: one task per core, schedule known
• Limits
  – Independent analysis: pessimism (blind estimation of contention)
  – Joint analysis: not time composable; complexity (state explosion)

Architecture-centric
• Will the proposed designs ever see the silicon?
  – Applies to all hardware designs ;-)
  – Cache partitioning mechanisms: won battle
  – Proposed changes are 'simple'
• Timing anomalies (TA)
  – Design hardware that prevents appearance of TA

COTS-centric
• Architectural support for isolation or controlled contention
  – Not fully adopted!
• This generates uncertainty
  – Build confidence arguments in accordance with requirements and practices of the application domain
  – How safety assurance relates to stipulating bounds on execution time

Concluding remarks
• More understanding of existing techniques is needed
  – Do they form a consistent picture from which a user can choose sensibly?
• What is the top priority for the industrial user?
  – Question for the audience
• Seeking time composability vs. guaranteed performance
  – The first negatively affects the second
  – Not possible in the single-core sense ➙ compositional

Work mainly funded by …
