Models of Architecture
Nokia Bell Labs 2018 Maxime Pelcat INSA Rennes, IETR, Institut Pascal
This work has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 732105: CERBERO.
computing, edge computing, many-cores, etc.
Platform classes: Multi-ARM, Multi-ARM + FPGA, Multi-ARM + GPGPU, Multi-ARM + DSP
Multi-ARM
4 × A15 (SCU) and 4 × A7 (SCU), ACE
High-performance cores and low-energy cores; clocks: 2GHz and 1.4GHz; caches: 2MB and 0.5MB
2GB DDR (PoP)
Easy to program: Linux SMP, thread migration
12 Gflops, <10W
Multi-ARM + GPGPU
Control path: 4 × A57 (SCU), 1.6GHz, 4GB external DDR module
Data path: 256-core Maxwell GPGPU, 32 cores/warp; H.264 4K 60Hz
Less easy to program: Linux SMP + CUDA/OpenCL
Multi-ARM + DSP
Control path: 4 × A15 (SCU), 4MB, 1.4GHz
Data path: 8 × C66 (1MB each), 1.2GHz, FFTC, 6MB MSMC, Teranet
Difficult to program (well): Linux SMP + Open Event Machine
160 Gflops, <15W
Multi-ARM + FPGA
Control path: 4 × A53 (SCU), 1MB, 1.5GHz; 2 × R5; GPU (not GPGPU)
Data path: FPGA, up to 4MB, 1MFF, 0.5MLUT, 600MHz
Switch fabric
More difficult to program (well): Linux SMP + HLS or HDL
[Figure: fine-grain parallelism inside a PE. Labels: ALU, clk, SIMD (+ + + +), VLIW (Ld, ×, +, Str)]
–At coarse grain, PEs communicate asynchronously
–There is no (or less) centralized processing decision
–There is no performance portability (nothing equivalent to C-to-VLIW compilers)
–Can we predict performance at design time? How?
[Figure: prototype-based design flow. Labels: Algorithm, Application, Architecture Design, System Prototype, Redesign (twice)]
[Figure: model-based design flow. The Algorithm Model conforms to a Model of Computation (MoC); the Architecture Model conforms to a Model of Architecture (MoA); KPI Evaluation of the two models yields a KPI; Redesign (twice)]
PREESM
Feature: SDF, ADF, IBSDF, DSSF, PSDF, PiSDF, SADF, SPDF, DPN, KPN
Expressivity: Low, Med., Turing complete
Hierarchical: X X X X
Compositional: X X X
Reconfigurable: X X X X X X
Statically schedulable: X X X X
Decidable: X X X X (X) (X) X (X)
Variable rates: X X X X X X X
Non-determinism: X X X
SDF: Synchronous Dataflow
ADF: Affine Dataflow
IBSDF: Interface-Based Dataflow
DSSF: Deterministic SDF with Shared FIFOs
PSDF: Parameterized SDF
PiSDF: Parameterized and Interfaced SDF
SADF: Scenario-Aware Dataflow
SPDF: Schedulable Parametric Dataflow
DPN: Dataflow Process Network
KPN: Kahn Process Network
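As an illustration of the "Statically schedulable" row, the fixed rates of an SDF graph allow the number of executions of each actor per graph iteration (the repetition vector) to be computed before run time, by solving the balance equations q[src] × production = q[dst] × consumption. The sketch below is only a minimal example under stated assumptions: the function name, the edge format and the three-actor graph are hypothetical and do not come from the slides, and deadlock analysis is not covered.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def repetition_vector(actors, edges):
    """Solve the SDF balance equations for a connected graph.

    actors: list of actor names (hypothetical format).
    edges:  list of (src, production_rate, dst, consumption_rate).
    Returns the smallest positive integer number of firings per actor.
    """
    rate = {actors[0]: Fraction(1)}  # arbitrary reference actor
    changed = True
    while changed:
        changed = False
        for src, prod, dst, cons in edges:
            if src in rate and dst not in rate:
                rate[dst] = rate[src] * prod / cons
                changed = True
            elif dst in rate and src not in rate:
                rate[src] = rate[dst] * cons / prod
                changed = True
            elif src in rate and dst in rate:
                if rate[src] * prod != rate[dst] * cons:
                    raise ValueError("inconsistent rates: no repetition vector")
    if len(rate) != len(actors):
        raise ValueError("graph is not connected")
    # Scale the fractional solution to the smallest integer one.
    scale = reduce(lambda a, b: a * b // gcd(a, b),
                   (r.denominator for r in rate.values()), 1)
    counts = {a: int(r * scale) for a, r in rate.items()}
    common = reduce(gcd, counts.values())
    return {a: n // common for a, n in counts.items()}


# Hypothetical graph: A produces 2 tokens, B consumes 3; B produces 1, C consumes 2.
q = repetition_vector(["A", "B", "C"], [("A", 2, "B", 3), ("B", 1, "C", 2)])
print(q)
```

With these assumed rates, A fires 3 times, B twice and C once per iteration, so each FIFO returns to its initial state.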
[Figure: the Algorithm Model (T1, T2, T3) conforms to a Model of Computation (MoC); Energy Evaluation of the model yields Energy]
Pelcat, M; Mercat, A; Desnos, K; Maggiani, L; Liu, Y; Heulot, J; Nezan, J-F; Hamidouche, W; Ménard, D; Bhattacharyya, S (2017) "Reproducible Evaluation of System Efficiency with a Model of Architecture: From Theory to Practice", IEEE TCAD.
Architecture models compared on three properties (Reproducible, Application-independent, Abstract): AADL, MCA SHIM, UML MARTE, AAA, CHARMED, S-LAM, MAPS, LSLA
NFP = MoA(activity(MoC(application)))
MoA depends on MoC
Model G conforms to MoC → Activity → Model H conforms to MoA → one and always the same quality evaluation
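The chain on this slide (a model G conforming to a MoC, its activity, a model H conforming to a MoA, and a single resulting evaluation) can be sketched as plain function composition. Everything named below is a hypothetical illustration, not the author's formalism: the Token fields, the rule "one processing token per task" and the cost function 10x+1 are assumptions made only to keep the example small.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Token:
    kind: str    # hypothetical label, e.g. "processing" or "communication"
    quanta: int  # token size, in quanta


ModelG = Dict[str, int]           # hypothetical model G: task name -> quanta of work
Activity = List[Token]
ModelH = Callable[[Token], int]   # hypothetical model H: cost of one token


def activity(model_g: ModelG) -> Activity:
    """Derive the activity of G (assumption: one processing token per task)."""
    return [Token("processing", q) for _, q in sorted(model_g.items())]


def evaluate(model_h: ModelH, act: Activity) -> int:
    """Apply H to the activity; H and the activity are the only inputs."""
    return sum(model_h(token) for token in act)


g = {"Task1": 2, "Task2": 3}            # hypothetical application
h = lambda token: 10 * token.quanta + 1  # hypothetical linear cost
cost = evaluate(h, activity(g))
print(cost)
```

Because the evaluation depends only on the two models, running it again on the same G and H returns the same value, which is the reproducibility property the slide stresses.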
Performance Power Energy Memory T°C Reliability Security Cost
KPI MoA MoC Act
[Figure: power scale. Marks: 2W, 7W, 20W, 20kW, 20MW. Annotations: need a dissipator, need a fan, embedded system, dedicated system, HPeC, HPC, influence]
[Figure: LSLA example]
SDF (Model of Computation): Task1, Task2, Task3, Task4, Task5; labels: signal, signal; all rates equal to 1
Activity: token, quantum
LSLA (Linear System-Level Architecture, Model of Architecture): PE1, PE2, CN; cost functions 10x+1, 2x+0, 3x+0
Cost: 16+12+22=50
Compositional
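The additive cost on this slide can be reproduced with a short sketch: each token of x quanta mapped to a component with a linear cost function ax+b contributes ax+b, and all contributions are summed. The slide text does not preserve which of PE1, PE2 and CN carries which function, nor the token sizes, so the token lists below are assumptions chosen only so that the three terms match 16, 12 and 22.

```python
from typing import Callable, Dict, List


def linear(a: int, b: int) -> Callable[[int], int]:
    """Linear cost function x -> a*x + b, applied once per token."""
    return lambda x: a * x + b


# Cost functions from the slide, keyed by their expression.
cost_functions: Dict[str, Callable[[int], int]] = {
    "2x+0": linear(2, 0),
    "3x+0": linear(3, 0),
    "10x+1": linear(10, 1),
}

# ASSUMPTION: token sizes (in quanta) per component, picked to match the slide's terms.
tokens: Dict[str, List[int]] = {
    "2x+0": [4, 4],   # 8 + 8   = 16
    "3x+0": [2, 2],   # 6 + 6   = 12
    "10x+1": [1, 1],  # 11 + 11 = 22
}

per_component = {
    name: sum(cost_functions[name](x) for x in sizes)
    for name, sizes in tokens.items()
}
total = sum(per_component.values())
print(per_component, total)
```

Since the total is a plain sum, the cost of a larger system is the sum of the costs of its parts, which is one way to read the "Compositional" label.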
[Figure: architecture model with eight PEs and three CNs. Labels: α, β, γ; power annotations: 1.5W (four times) and 0.3W (four times)]
[Figure: two graphs of Task1 and Task2, all rates equal to 1. First graph: Latency = sum. Second graph: Latency = max]
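The two cases can be seen as special cases of a longest-path computation on the task graph: dependent tasks add their durations, independent tasks overlap. The sketch below assumes that each task runs on its own PE, that the graph is acyclic and that communication is free. The durations and the helper name are hypothetical.

```python
from functools import lru_cache
from typing import Dict, List


def latency(durations: Dict[str, int], deps: Dict[str, List[str]]) -> int:
    """Longest-path latency; deps maps a task to the tasks it waits for."""

    @lru_cache(maxsize=None)
    def finish(task: str) -> int:
        # A task starts when all of its predecessors have finished.
        start = max((finish(p) for p in deps.get(task, ())), default=0)
        return start + durations[task]

    return max(finish(task) for task in durations)


durations = {"Task1": 3, "Task2": 5}  # hypothetical durations

chained = latency(durations, {"Task2": ["Task1"]})  # Task2 waits for Task1
independent = latency(durations, {})                # no dependency
print(chained, independent)
```

With these assumed durations the chained case gives 3 + 5 and the independent case gives max(3, 5).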
[Figure: MaxPlus example, panels a) to c)]
SDF: Task1, Task2, Task3, Task4, Task5; labels: signal, signal; all rates equal to 1
Architecture: PE1, PE2, CN; cost functions 10x+1, 2x+0, 3x+0
MaxPlus: Σ 12+12+11=35, Σ 8+6+11=25, max(35,25)=35
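The MaxPlus arithmetic on this slide combines two operators: costs inside a group are summed, and the groups are then combined with max. A minimal sketch using the slide's numbers follows. What each group represents is not recoverable from the slide text, so the grouping is taken as given and the function name is hypothetical.

```python
from typing import List, Tuple


def max_plus_cost(groups: List[List[int]]) -> Tuple[List[int], int]:
    """Sum the costs inside each group, then keep the largest group sum."""
    sums = [sum(group) for group in groups]
    return sums, max(sums)


# The two sums shown on the slide.
sums, cost = max_plus_cost([[12, 12, 11], [8, 6, 11]])
print(sums, cost)
```

Unlike a purely additive model, only the largest group determines the result, which matches the "Latency = max" behaviour of parts that proceed concurrently.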
[Figure: same architecture (PE1, PE2, CN; cost functions 10x+1, 2x+0, 3x+0) with Σ 24]
[Figure: tool layers. Layers: End-user interaction; System Model; Application / Architecture; Intermediate Representation; Runtime Support; Low-Level Implementation (Hardware Abstraction). Tools and labels: VT, AOW, DynAA, PAPI, SPIDER, JADE, ARTICO3, MDC, PREESM, MECA, C++]
–Especially for HPeC systems
–They model HW + protocols + OS + …
–They can and should be simple
–Due to Fog/Edge Computing complexity
KPI MoA MoC Act
Pelcat, M; Mercat, A; Desnos, K; Maggiani, L; Liu, Y; Heulot, J; Nezan, J-F; Hamidouche, W; Ménard, D; Bhattacharyya, S (2017) "Reproducible Evaluation of System Efficiency with a Model of Architecture: From Theory to Practice", IEEE TCAD.
www.cerbero-h2020.eu http://preesm.org