Domain-specific front-end for virtual system modeling
Workshop on Graphical Modeling Language Development at ECMFA 2012
Janne Vatjus-Anttila, Jari Kreku, Kari Tiensyrj
VTT Technical Research Centre of
2 03/07/2012
Outline
- Motivation
- System-level performance evaluation techniques
- ABSOLUT performance modeling and evaluation
  - Workload and platform modeling
  - Performance simulation
- Domain-specific front-end for ABSOLUT
  - Workload, platform and allocation modeling DSLs
- Case Example
- Conclusions
Motivation
- Increasing complexity of embedded system design
  - New methods and tools needed for increasing productivity
- ABSOLUT
  - Early-phase performance modeling and evaluation
  - Set of prototype tools
  - Command-line based tools: steep learning curve
- Domain-specific modeling (DSM)
  - Raises the level of abstraction above programming
  - Used in many different domains
  - DSM was used in this work to shorten the learning curve, improve usability and raise the modeling abstraction of ABSOLUT
Techniques for system-level performance evaluation
- Virtual system
  - Abstract application and platform models (no ISS)
  - Instruction and cycle approximate
  - System level exploration
- Virtual platform
  - Real application software and a virtual platform model (ISS)
  - Instruction accurate
  - Software development
- Virtual prototype
  - Real application software and a virtual prototype (HDL)
  - Instruction accurate, clock accurate
  - Co-verification
ABSOLUT virtual system modeling
[Figure: overview of the ABSOLUT approach. Application modelling yields platform-independent workload models (application control behavior plus data processing and memory access load) with the ABSINTH model generator. Platform modelling yields abstract, transaction-level SystemC models built from the ALE library of component models and the COGNAC text-based configuration system. Allocation maps the workload models onto platform resources, the performance simulation runs on the OSCI SystemC kernel 2.2, and the simulation results (utilisation, latencies, etc.) are visualised with VODKA. The parts of the figure are detailed on the following slides.]
Workload modeling
Application control behavior
[Figure: state machine with the states IDLE, ESTABLISHCONNECTION, UPDATE, PROCESS_SCREEN_TAP and WAITING_EVENTS, driven by signals such as OpenApplicationReq(), ShutDownApplicationReq() and TouchScreenTapActionReq() and by the timer T1()]
+ Data processing and memory access load
[Figure: ARM disassembly listing of the .init and .text sections]
= Workload model
[Figure: composite structure diagram of the active class VNCInternetBrowser : Application_Workload, containing ConWL : Connection_Workload, NavWL : Navigation_Workload, DisWL : Display_Workload, MemWL : MemoryHandling_Workload and VNCCrl : VNCProcessControl]
+ Enables early simulation
+ Improves simulation speed
+ Reduces modeling effort
+ Presents resource requirements of applications
Platform independent
Support tools (ABSINTH)
Platform capacity modeling
- Model of both hardware and platform software components and interconnections
- Abstract, transaction-level SystemC models
- Platform, subsystem and component layers
- Computing, communication and storage resources
- Role in ABSOLUT
  - Processes load primitives
  - Provides higher-level services
  - Consumes time (cycle-approximate)
- Support tools and libraries (COGNAC, ALE)
Performance simulation
[Figure: ApplicationWL, ProcessWL 1 ... ProcessWL N, OS model and Processing unit 1 ... Processing unit N inside a subsystem; the connections are labelled "Load primitives, Service calls"]
System is simulated using OSCI SystemC kernel 2.2
- 1. Workload models utilise resources with read, write, execute and/or service requests
- 2. OS models propagate load from WL models to the CPUs
- 3. CPU models model execution time for data processing instructions and propagate reads and writes to other components
Workload / platform interfaces
SystemC threads
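The three propagation steps above can be sketched in plain Python. This is only an illustration of the idea, not ABSOLUT or SystemC code: the class names, the round-robin scheduling and the fixed cycle costs are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical cycle costs per load primitive (assumption for this sketch).
CYCLES_PER_EXECUTE = 1
CYCLES_PER_MEMORY_ACCESS = 10


@dataclass
class Memory:
    """Stands in for the 'other components' that receive reads and writes."""
    reads: int = 0
    writes: int = 0


@dataclass
class Cpu:
    """CPU model: accounts time for each primitive and forwards reads/writes."""
    memory: Memory
    busy_cycles: int = 0

    def process(self, primitive: str, count: int) -> None:
        if primitive == "execute":
            self.busy_cycles += count * CYCLES_PER_EXECUTE
        elif primitive == "read":
            self.memory.reads += count
            self.busy_cycles += count * CYCLES_PER_MEMORY_ACCESS
        elif primitive == "write":
            self.memory.writes += count
            self.busy_cycles += count * CYCLES_PER_MEMORY_ACCESS
        else:
            raise ValueError(f"unknown load primitive: {primitive}")


@dataclass
class OsModel:
    """OS model: propagates load from workload models to the CPUs (round robin)."""
    cpus: list
    next_cpu: int = 0

    def submit(self, workload: list) -> None:
        cpu = self.cpus[self.next_cpu]
        self.next_cpu = (self.next_cpu + 1) % len(self.cpus)
        for primitive, count in workload:
            cpu.process(primitive, count)


memory = Memory()
cpus = [Cpu(memory), Cpu(memory)]
os_model = OsModel(cpus)

# Two process workloads expressed as (primitive, count) pairs.
os_model.submit([("read", 4), ("execute", 100), ("write", 2)])
os_model.submit([("execute", 50)])

busy = [cpu.busy_cycles for cpu in cpus]
```

Each workload only states what it needs (reads, writes, executes); where and how long it runs is decided by the OS and CPU models, which is what keeps the workload side platform independent.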
Simulation results
Instrumentation of workload and / or platform models: performance and power probes
- Status probes, e.g. resource utilisation
- Timers, e.g. service processing time
- Counters, e.g. number of reads and writes
Visualisation tool (VODKA)

Subsystem  Service       Calls  Average  Min     Max
Display    DMA transfer  25     1.4 ms   1 ms    1.6 ms
Image      Decoding      25     4.8 ms   4.8 ms  4.8 ms
Storage    DMA transfer  26     450 µs   1 µs    480 µs

[Figure: bar chart "Display subsystem load". Horizontal axis: component (display.gpp0, display.sram, display.dma, display.display_if, display.bus, display.net_if). Vertical axis: utilisation, % busy, from 0 % to 40 %.]
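The timer and status probe outputs shown above are simple aggregates over recorded events. A minimal sketch of how such figures could be derived is given below; the function names and the sample values are made up for illustration and are not taken from VODKA or from the slide.

```python
from statistics import mean


def summarise_timer(samples_us):
    """Return calls, average, min and max for one service timer (microseconds)."""
    return {
        "calls": len(samples_us),
        "average": mean(samples_us),
        "min": min(samples_us),
        "max": max(samples_us),
    }


def utilisation_percent(busy_intervals, total_time):
    """Status probe: share of simulated time during which a component was busy."""
    busy = sum(end - start for start, end in busy_intervals)
    return 100.0 * busy / total_time


# Made-up samples for illustration only; they are not the values on the slide.
dma_timer = summarise_timer([1000, 1400, 1600, 1600])
bus_load = utilisation_percent([(0, 10), (40, 65)], total_time=100)
```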
Domain-specific modeling (DSM)
- DSM commonly used as a productivity tool for
  - Developing a DSL to model a specific application/product
  - Generating documentation, validity checking, code, etc.
- Domain-specific front-end for ABSOLUT
  - Develop DSLs for early-phase embedded system design exploration
  - DSLs for the three modeling phases in ABSOLUT
    - Workload modeling
    - Platform modeling
    - Allocation
Workload modeling DSL
- Define and generate workload models
  - Workload primitives of basic blocks (XML)
  - Control (gzipped trace)
- Output: "pointers" to generated workload models
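One way to read the two inputs is that the basic blocks describe the load and the control trace describes the order in which the blocks run. The sketch below follows that reading. The slide does not give the file formats, so the block table, the whitespace-separated trace and the function name are all assumptions.

```python
import gzip
from collections import Counter

# Hypothetical basic blocks: each maps to counts of load primitives.
BASIC_BLOCKS = {
    "bb0": {"read": 2, "execute": 10},
    "bb1": {"read": 1, "write": 1, "execute": 4},
}


def total_load(compressed_trace, basic_blocks):
    """Expand a gzipped trace of basic-block ids into summed load primitives."""
    trace = gzip.decompress(compressed_trace).decode("ascii").split()
    load = Counter()
    for block_id in trace:
        load.update(basic_blocks[block_id])
    return dict(load)


# Control trace: the order in which the basic blocks were executed (assumed format).
compressed = gzip.compress(b"bb0 bb1 bb1 bb0 bb1\n")
load = total_load(compressed, BASIC_BLOCKS)
```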
Platform modeling DSL
- Design a platform model from existing components and generate it as XML
- Components instantiated from model library
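As a rough illustration of generating a platform description from library components, the sketch below builds an XML tree with Python's standard library. The element and attribute names (`platform`, `component`, `type`, `resource`) and the library contents are hypothetical; the actual COGNAC input format is not shown in the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical model library: component type -> kind of resource it provides.
MODEL_LIBRARY = {
    "arm_cortex_a9": "computing",
    "crossbar": "communication",
    "mobile_ddr2": "storage",
}


def build_platform(name, instances):
    """Build a platform XML tree from (instance name, library type) pairs."""
    platform = ET.Element("platform", name=name)
    for instance_name, component_type in instances:
        if component_type not in MODEL_LIBRARY:
            raise KeyError(f"{component_type} is not in the model library")
        ET.SubElement(
            platform,
            "component",
            name=instance_name,
            type=component_type,
            resource=MODEL_LIBRARY[component_type],
        )
    return platform


platform = build_platform(
    "omap4_like",
    [("cpu0", "arm_cortex_a9"), ("cpu1", "arm_cortex_a9"),
     ("xbar", "crossbar"), ("ddr", "mobile_ddr2")],
)
xml_text = ET.tostring(platform, encoding="unicode")
```

Rejecting unknown component types at build time mirrors the validity checking that a DSL front-end can provide before any simulation is run.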
Allocation modeling DSL
- Allocate workload models on the processing elements of the platform model
- Output: allocation specification (INI-file)
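An allocation specification in INI form could look like the sketch below, which also checks that every workload is mapped to a processing element that exists in the platform. The section name, the keys and the workload names are assumptions for illustration; the slides state only that the output is an INI file.

```python
import configparser

# Hypothetical allocation specification: one key per workload model,
# valued with the processing element it is mapped to.
ALLOCATION_INI = """
[allocation]
decoder_thread_0 = cpu0
decoder_thread_1 = cpu1
ui_thread = cpu0
"""

PROCESSING_ELEMENTS = {"cpu0", "cpu1"}


def load_allocation(text, processing_elements):
    """Parse the INI text and check that every target exists in the platform."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    allocation = dict(parser["allocation"])
    unknown = set(allocation.values()) - processing_elements
    if unknown:
        raise ValueError(f"unknown processing elements: {sorted(unknown)}")
    return allocation


allocation = load_allocation(ALLOCATION_INI, PROCESSING_ELEMENTS)
```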
ABSOLUT Y-chart approach with DSLs
[Figure: the ABSOLUT Y-chart combined with the DSLs (DSL + ABSOLUT). Application modelling, platform modelling and allocation feed the performance simulation, which runs on the OSCI SystemC kernel 2.2. The simulation results (utilisation, latencies, etc.) are visualised with VODKA and returned to the models through back-annotation.]
Case example: H.264 video player / recorder
- Graphical video player / recorder application
  - x264 codec
  - Single- and multi-threaded
- OMAP4-like execution platform
  - Two ARM Cortex A9 cores, crossbar, mobile DDR2 memory
  - Software decoding (no accelerators used)
- Modeling and simulation using DSLs with ABSOLUT
- Back-annotation of filtered simulation results
- Verification of simulation results by comparing them with those measured on PandaBoard (work in progress)
Environment
- MetaEdit+ 4.5 Workbench
  - Definition of DSLs
  - Easy to use tool
  - Great support
- ABSOLUT
  - ABSINTH workload model generator
  - COGNAC platform model generator, ALE model library
  - BEER simulator
- Red Hat Enterprise Linux 6 operating system, Intel Core i7-based workstation
Results [1/2]
- Case study successfully modelled and simulated
  - Platform model with MetaEdit+, COGNAC and ALE
  - Workload models with MetaEdit+, ABSINTH
  - Simulation with BEER
- Experiences
  - Modelling/simulation UI improves usability and lowers the learning curve
  - Raises modelling abstraction
  - Platform and allocation DSLs work well
  - Workload DSL could be improved
  - Result back-annotation could be done only in a limited way
Results [2/2]
Example: CPU utilisation caused by the multi-threaded x264 workload
Conclusions
- New methods and tools are needed for embedded system design due to the increasing complexity of both applications and platforms
- ABSOLUT is an early-phase performance modeling and evaluation technique with a set of supporting tools
- DSLs for the workload, platform and allocation modeling phases of ABSOLUT were implemented with MetaEdit+ and serve as front-ends
- The MetaEdit+ and ABSOLUT combination was tried out with a video player / recorder case example on an OMAP4 platform model
- The DSLs improved the usability of the modelling/simulation flow