
Real-Time Multi/Many-Core Architecture, Heechul Yun (PowerPoint PPT Presentation)


  1. Real-Time Multi/Many-Core Architecture, Heechul Yun

  2. Real-Time Multi/Many-Core Architecture • Projects on Real-Time CPU Architectures • Assigned Papers – Shedding the Shackles of Time-Division Multiplexing, RTSS, 2018 – Deterministic Memory Abstraction and Supporting Multicore System Architecture, ECRTS, 2018

  3. Trends in Automotive E/E Systems • Centralization & High-Performance HW • Source: Bosch. A. Hamann (Bosch). “Industrial Challenge: Moving from Classical to High-Performance Real-Time Systems.” WATER, 2018.

  4. Modern System-on-a-Chip (SoC) [Figure: Core1, Core2, …, GPU, and NPU connected to a shared cache, memory controller (MC), and DRAM] • Integrate multiple cores, GPU, accelerators • Good performance, size, weight, power • Challenges: time predictability

  5. Worst-Case Execution Time (WCET) • Real-time scheduling theory is based on the assumption of known WCETs of real-time tasks • Image source: [Wilhelm et al., 2008]

  6. Computing WCET • Static analysis – Input: program code, architecture model – Output: WCET – Problem: an accurate architecture model is hard to obtain, and the resulting bounds are pessimistic • Measurement – No guarantee on true worst-case – But widely used in practice
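
The measurement approach above can be sketched as a high-water-mark loop. This is a minimal illustration, not a method taken from the slides: `measure_high_water_mark` and `sample_task` are hypothetical names, and the workload is a stand-in. The largest observed time is only a lower bound on the true WCET, which is the "no guarantee" caveat the slide raises.

```python
import time


def measure_high_water_mark(task, runs=1000):
    """Return the largest observed execution time of task() in seconds.

    This is an observed maximum only; the true worst case may be larger.
    """
    worst = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        task()
        elapsed = time.perf_counter() - start
        worst = max(worst, elapsed)
    return worst


def sample_task():
    # Hypothetical workload standing in for a real-time task.
    return sum(i * i for i in range(1000))


if __name__ == "__main__":
    hwm = measure_high_water_mark(sample_task, runs=200)
    print(f"observed high-water mark: {hwm * 1e6:.1f} us")
```

In practice a safety margin is often added on top of the observed maximum, but no margin turns a measurement into a proven bound.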

  7. Memory Hierarchies, Pipelines, and Buses for Future Architectures in Time-Critical Embedded Systems, IEEE TCAD, 2009

  8. “Problematic” CPU Features • Architectures are optimized to improve average-case performance • WCET estimation is hard because of – Pipelining – TLBs/Caches – Super-scalar – Out-of-order scheduling – Branch predictors – Hardware prefetchers – Basically anything that affects processor state

  9. Static Timing Analysis [Figure: text not recoverable]

  10. Control Flow Graph (CFG) • Analyze code • Split basic blocks • Compute per-block WCET – use abstract CPU model
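
As a rough sketch of how per-block WCETs combine into a program-level bound, the example below takes the longest path through a small acyclic CFG. The graph, the block names, and the cycle counts are hypothetical, and loops are assumed to be already bounded and unrolled. Adding per-block bounds this way is only sound when block timings can be composed, which the later slides on timing anomalies call into question.

```python
from functools import lru_cache

# Hypothetical CFG: basic block -> successor blocks (acyclic).
CFG = {
    "entry": ["cond"],
    "cond": ["then", "else"],
    "then": ["exit"],
    "else": ["exit"],
    "exit": [],
}

# Hypothetical per-block WCETs in cycles, as an abstract CPU model might report.
BLOCK_WCET = {"entry": 5, "cond": 2, "then": 20, "else": 8, "exit": 3}


def program_wcet(cfg, block_wcet, start="entry"):
    """Return the cost of the most expensive path from `start` to any sink."""

    @lru_cache(maxsize=None)
    def longest_from(block):
        # Worst case of this block plus the worst case over all successors.
        tail = max((longest_from(s) for s in cfg[block]), default=0)
        return block_wcet[block] + tail

    return longest_from(start)


print(program_wcet(CFG, BLOCK_WCET))  # 5 + 2 + 20 + 3 = 30
```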

  11. Timing Anomalies • Locally faster != globally faster • Image source: [Wilhelm et al., 2008]

  12. Timing Anomalies • Locally faster != globally faster • Image source: [Wilhelm et al., 2008]

  13. Challenge: Shared Memory Hierarchy [Figure: Task 1 to Task 4 running on Core1 to Core4, each core with private I and D caches, sharing a cache, memory controller (MC), and DRAM] • Memory performance varies widely due to interference • Task WCET can be extremely pessimistic

  14. Effect of Memory Interference [Figure: normalized execution time, Solo vs. Corun, for DNN (Core 0,1) and BwWrite (Core 2,3); the four cores share the LLC and DRAM] • DNN control task suffers >10X slowdown – When different tasks are co-scheduled on idle cores • Waqar Ali and Heechul Yun. “RT-Gang: Real-Time Gang Scheduling Framework for Safety-Critical Systems.” RTAS, 2019 (to appear)

  15. Cache Denial-of-Service Attacks [Figure: victim and attacker tasks on Core1 to Core4 sharing the LLC] • Observed worst-case: >300X slowdown – On simple in-order multicores (Raspberry Pi 3, Odroid C2) • Difficult to guarantee predictable timing • Michael G. Bechtel and Heechul Yun. “Denial-of-Service Attacks on Shared Cache in Multicore: Analysis and Prevention.” In RTAS, 2019 (to appear, Outstanding Paper Award)

  16. Real-Time CPU Architectures • PRET – UC Berkeley • MERASA/parMERASA project – EU • ACROSS – EU • ARAMIS – Germany • EMC2 – EU

  17. FlexPRET: A Processor Platform for Mixed-Criticality Systems, RTAS, 2014

  18. [Image-only slide]

  19. PRET Pipeline [Figure: thread-interleaved pipeline; Thread #1 to Thread #6 each pass one instruction at a time through the stages FETCH, DECODE, REGACC, EXECUTE, MEM, EXCEPT, with each thread offset from the previous one by 1 clock]
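
The interleaving in the figure can be sketched as a small occupancy model. This is an illustration under assumptions read off the diagram (six stages, six hardware threads, one thread fetching per clock in fixed round-robin order); it is not the PRET implementation, and `stage_occupancy` is a hypothetical helper.

```python
# Stage names and thread count as read from the slide's figure.
STAGES = ["FETCH", "DECODE", "REGACC", "EXECUTE", "MEM", "EXCEPT"]
NUM_THREADS = 6


def stage_occupancy(cycle):
    """Map each pipeline stage to the thread occupying it at `cycle`.

    Threads are numbered from 1. A stage that has not been reached yet
    (pipeline still filling) maps to None.
    """
    occupancy = {}
    for depth, stage in enumerate(STAGES):
        issue_cycle = cycle - depth  # clock at which this instruction was fetched
        if issue_cycle < 0:
            occupancy[stage] = None
        else:
            occupancy[stage] = issue_cycle % NUM_THREADS + 1
    return occupancy


for cycle in range(8):
    print(cycle, stage_occupancy(cycle))
```

Because the number of threads matches the pipeline depth in this model, a thread never has two instructions in flight at once, so instructions of the same thread cannot stall each other through pipeline hazards.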

  20. FlexPRET Pipeline

  21. Hardware Support for WCET Analysis of Hard Real-Time Multicore Systems, ISCA, 2009

  22. Analyzable Multicore Architecture • Idea 1: Bound interference on shared resources – On-chip shared bus – (shared) L2 cache • Idea 2: WCET computation mode

  23. Architecture

  24. Round-Robin Bus Arbitration • $UBD = (N_{HRT} - 1) \times L_{bus}$
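
A worked instance of the bound, under an assumed reading of the symbols (the slide does not define them): UBD as the upper bound on the delay of one bus request, N_HRT as the number of hard real-time requesters sharing the bus, and L_bus as the latency of one bus transaction. The function name and the numbers are hypothetical.

```python
def upper_bound_delay(num_hrt, bus_latency):
    """Worst-case wait for one request under round-robin arbitration.

    Assumes every other requester is served once, each for `bus_latency`
    cycles, before the request under analysis is granted the bus.
    """
    if num_hrt < 1:
        raise ValueError("need at least one requester")
    return (num_hrt - 1) * bus_latency


# Hypothetical numbers: 4 requesters, 9-cycle bus transactions.
print(upper_bound_delay(4, 9))  # (4 - 1) * 9 = 27 cycles
```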

  25. Request vs. Job-level WCET Analysis • Request-level analysis – Assume worst-case interference for each access of the task under analysis – Pessimistic, as not all accesses will suffer interference • Job-level analysis – Assume the total number of competing memory accesses is known – Can reduce pessimism
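
The gap between the two analyses can be illustrated with a simplified model. The assumptions here are mine, not the slides': round-robin arbitration, so each access of the task under analysis is delayed at most once by each other core, and a fixed per-transaction latency. Function names and numbers are hypothetical.

```python
def request_level_delay(own_accesses, num_cores, latency):
    """Every own access is assumed to wait for all other cores."""
    return own_accesses * (num_cores - 1) * latency


def job_level_delay(own_accesses, competing_accesses, latency):
    """Interference is capped by the known access counts of the other cores.

    `competing_accesses` holds one access count per other core. A core
    cannot delay the job more often than it issues requests, nor more
    often than the job itself issues requests.
    """
    interfering = sum(min(own_accesses, c) for c in competing_accesses)
    return interfering * latency


# Hypothetical job: 1000 accesses on a 4-core system, 10-cycle transactions.
print(request_level_delay(1000, 4, 10))           # 1000 * 3 * 10 = 30000
print(job_level_delay(1000, [200, 50, 0], 10))    # (200 + 50 + 0) * 10 = 2500
```

With these numbers the job-level bound is 12 times smaller, which is the kind of pessimism reduction the slide refers to.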

  26. Summary • Timing anomalies – Locally fast != globally fast on non-timing-compositional architectures (i.e., most architectures) • Timing-compositional architecture – Free of timing anomalies

  27. Discussion • Why is this interesting? • Are assumptions realistic? – Task model – Cache model – Memory model – CPU (pipeline) model

  28. Discussion • Why is this interesting? • Are assumptions realistic? – Task model – Cache model – Memory model – CPU (pipeline) model

  29. Atomic vs. Split-Transaction Bus • … J. P. Shen and M. H. Lipasti. Modern Processor Design: Fundamentals of Superscalar Processors. Waveland Press, 2013.

  30. Announcement • Mini Project #1 • DeepPicar Competition – Build a self-driving car – Based on DeepPicar – Competition format

  31. Acknowledgement • Some slides are from: – Prof. Rodolfo Pellizzoni, University of Waterloo – Prof. Edward A. Lee, University of California, Berkeley
