Gem5 in a nutshell
Christophe Huriaux, Post-doc
Inria, IRISA — CAIRN Project-Team
CAIRN project-team — SAV 2016 June 30th-July 1st 2016 - 1
Spoiler alert
§ Not a research report…
§ … but a (quick) overview of how Gem5 works and what it can offer
§ Introduction
§ What is Gem5 useful for?
  (or rather: what you should not use it for)
§ Overview of the system simulator
  § Simulation modes
  § Behind the scenes of a simulation
  § What's under the hood?
  § Memory system
§ Running example
§ Conclusion
§ Gem5 is the fusion of two projects
  § GEMS: simulation of multi-processor systems
  § M5: simulation of networked systems
§ System simulator
  § Accurate simulation of complex component interactions (OS / CPU / caches / devices / …)
  § Accuracy depends on the completeness of the models
§ All-in-one simulation framework
  § Does not rely on other software
  § But other tools can be plugged in easily…
§ Lots of components available out-of-the-box (CPUs, memories, I/Os, …)
Architectural exploration? Yes! 👍
Gem5 provides a fast and easy framework to interconnect hardware components and evaluate them!
Hardware/software performance evaluation? Yes! 👍
Gem5 has good support for various ISAs and allows for realistic HW/SW performance evaluation.
Hardware/software verification? No… 👎
RTL functional verification is much more mature and accurate!
Software development and verification? Ugh… Please stop! 👎 👎 👎
Faster technologies are available through binary translation (e.g. QEMU, OVP).
§ Full-system (FS)
  § Models bare-metal hardware
  § Includes the various specified devices, caches, …
  § Boots an entire OS from scratch
  § Gem5 can boot Linux (several variants) or Android out-of-the-box
§ Syscall Emulation (SE)
  § Runs a single static application
  § System calls are emulated or forwarded to the host OS
  § Many simplifications (address translation, scheduling, no pthread, …)
Compilation of the simulator
[Figure: the collection of components (C++ / Python) and the simulator internals (C++) are compiled into the Gem5 binary]
Simulation!
[Figure: a Python script, instantiating the component hierarchy and defining the simulation parameters, is fed to the Gem5 binary (Python interpreter, collection of component interfaces, assembled C++), which runs the simulation and produces the output (statistics, traces…)]
Collection of components (C++ / Python): simulation objects
§ 1 component = 1 simulation object
  § C++ functional model (.cc), for simulation
  § Python interface (.py), for instantiation
§ SimObjects follow a strict C++ class hierarchy for easier extension with code reuse
[Figure: class hierarchy with SimObject, ClockedObject, BaseCPU, BaseTimingCPU, BaseO3CPU, …]
Events
§ Gem5 is event-driven
  § Discrete-event timing model
  § Not related to real time whatsoever
  § The duration represented by 1 tick can be user-defined
§ Simulation objects schedule events for the next cycle or after a specific amount of time has elapsed
§ The Gem5 simulation scheduler takes care of the rest!
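The scheduling idea on this slide can be illustrated with a toy discrete-event loop. This is a minimal sketch in plain Python, not Gem5 code: `EventQueue`, `ClockedCounter` and the tick resolution of 1 ps are assumptions made for the example.

```python
import heapq
import itertools

# Assumption for this sketch: 1 tick represents 1 picosecond.
TICKS_PER_SECOND = 10**12


class EventQueue:
    """Toy discrete-event scheduler (hypothetical, not the Gem5 class)."""

    def __init__(self):
        self.now = 0                      # current simulated time, in ticks
        self._heap = []                   # pending (time, order, callback) entries
        self._order = itertools.count()   # tie-breaker for events at the same tick

    def schedule(self, delay, callback):
        """Register a callback to fire `delay` ticks from now."""
        heapq.heappush(self._heap, (self.now + delay, next(self._order), callback))

    def run(self):
        """Pop events in time order; simulated time jumps from event to event."""
        while self._heap:
            when, _, callback = heapq.heappop(self._heap)
            self.now = when
            callback()


class ClockedCounter:
    """Toy component that reschedules itself once per clock cycle."""

    def __init__(self, queue, clock_hz, cycles):
        self.queue = queue
        self.period = TICKS_PER_SECOND // clock_hz   # clock period in ticks
        self.remaining = cycles
        self.fired_at = []
        queue.schedule(self.period, self.tick)

    def tick(self):
        self.fired_at.append(self.queue.now)
        self.remaining -= 1
        if self.remaining > 0:
            self.queue.schedule(self.period, self.tick)


queue = EventQueue()
counter = ClockedCounter(queue, clock_hz=10**9, cycles=3)
queue.run()
```

With a 1 GHz clock and 1 ps ticks, one cycle lasts 1000 ticks, so the three events fire at ticks 1000, 2000 and 3000. No wall-clock time is involved: the loop simply advances `now` to the timestamp of the next pending event.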
Memory ports
§ Memory ports are present on every MemObject
  § They model physical memory connections
  § You interconnect them when instantiating the hierarchy
  § E.g. a CPU data bus to an L1 cache
§ They work in pairs: 1 master port always connects to 1 slave port
§ Data is exchanged atomically as packets
[Figure: a CPU, instruction and data L1 caches, a crossbar (X bar), DDR3 and Flash, linked through master (M) and slave (S) port pairs]
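The one-to-one pairing of master and slave ports can be sketched as follows. This is a toy model in plain Python under stated assumptions: `Packet`, `MasterPort`, `SlavePort`, `ToyCPU` and `SimpleMemory` are hypothetical names for the example and do not reproduce the Gem5 classes.

```python
class Packet:
    """Unit of data exchanged between two ports (toy version)."""

    def __init__(self, addr, data=None, is_write=False):
        self.addr = addr
        self.data = data
        self.is_write = is_write


class SlavePort:
    """Receives packets and hands them to the component that owns the port."""

    def __init__(self, owner):
        self.owner = owner
        self.peer = None

    def recv(self, pkt):
        return self.owner.handle(pkt)


class MasterPort:
    """Sends packets to exactly one bound slave port."""

    def __init__(self):
        self.peer = None

    def bind(self, slave):
        # Enforce the 1 master <-> 1 slave rule.
        if self.peer is not None or slave.peer is not None:
            raise RuntimeError("port already bound")
        self.peer = slave
        slave.peer = self

    def send(self, pkt):
        if self.peer is None:
            raise RuntimeError("port is not connected")
        return self.peer.recv(pkt)


class SimpleMemory:
    """Toy memory exposing one slave port."""

    def __init__(self):
        self.port = SlavePort(self)
        self.store = {}

    def handle(self, pkt):
        if pkt.is_write:
            self.store[pkt.addr] = pkt.data
        else:
            pkt.data = self.store.get(pkt.addr, 0)
        return pkt


class ToyCPU:
    """Toy CPU exposing one master port for data accesses."""

    def __init__(self):
        self.dcache_port = MasterPort()

    def store(self, addr, value):
        self.dcache_port.send(Packet(addr, value, is_write=True))

    def load(self, addr):
        return self.dcache_port.send(Packet(addr)).data


# Interconnection happens once, while the hierarchy is being built.
cpu = ToyCPU()
mem = SimpleMemory()
cpu.dcache_port.bind(mem.port)

cpu.store(0x40, 7)
value = cpu.load(0x40)
```

Binding is a separate step from sending, which mirrors the slide: ports are wired together while the hierarchy is built, and packets only flow afterwards.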
Memory ports
§ 3 types of transport interfaces for packets
  § Functional
    § Instantaneous, in a single function call
    § Caches and memories are updated automagically at
  § Atomic
    § Instantaneous
    § Memory model updated (caches, coherence, …)
    § Approximate latency, but no contention nor delay
  § Timing
    § Transaction split into multiple phases
    § Models all timing in the memory system
§ The transport interface depends on the SimObject implementation
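The difference between the three transport interfaces can be sketched with a toy memory. This is a simplified illustration in plain Python, not the Gem5 API: `ToyMemory`, its 30-tick latency and the `run` helper are assumptions, and the timing path is reduced to one request and one delayed response.

```python
import heapq
import itertools


class ToyMemory:
    """Toy memory offering functional, atomic and timing accesses."""

    LATENCY = 30  # hypothetical access latency, in ticks

    def __init__(self):
        self.data = {}
        self.busy_until = 0            # used only by the timing path
        self._seq = itertools.count()  # tie-breaker for the event heap

    def _access(self, addr, value):
        if value is not None:
            self.data[addr] = value
        return self.data.get(addr, 0)

    def functional(self, addr, value=None):
        """Instantaneous: state is updated, no notion of time at all."""
        return self._access(addr, value)

    def atomic(self, addr, value=None):
        """Instantaneous, but returns an approximate latency (no contention)."""
        return self._access(addr, value), self.LATENCY

    def timing(self, now, addr, events, on_response, value=None):
        """Request now, response later: the access waits while the memory is busy."""
        start = max(now, self.busy_until)
        done = start + self.LATENCY
        self.busy_until = done
        respond = lambda t: on_response(t, self._access(addr, value))
        heapq.heappush(events, (done, next(self._seq), respond))


def run(events):
    """Drain the event heap in time order and return the final tick."""
    now = 0
    while events:
        now, _, action = heapq.heappop(events)
        action(now)
    return now


mem = ToyMemory()
mem.functional(0x10, 5)                 # e.g. preloading a value
value, latency = mem.atomic(0x10)       # value plus a latency estimate

events, responses = [], []
record = lambda t, v: responses.append((t, v))
mem.timing(0, 0x10, events, record)     # two requests issued at tick 0
mem.timing(0, 0x10, events, record)
end = run(events)
```

Both timing requests are issued at tick 0, yet the second response arrives at tick 60 rather than 30 because the toy memory serves one access at a time. The atomic call reports the same 30 ticks for every access, which is the kind of contention-free estimate the slide describes.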
One memory system to rule them all…
§ Models a system running heterogeneous applications…
§ … running on heterogeneous processing tiles
§ … using heterogeneous memories and interconnect
[Figure: CPUs, a GPU and accelerators connected through an interconnect to DDR3, SRAM and Flash memories]
…and in the simulation, interconnect them
§ Two memory systems in Gem5
  § Classic
    § All components are instantiated in a hierarchy along with the CPUs, etc.
    § MOESI coherence protocol only
  § Ruby
    § Detailed simulation model of various cache hierarchies
    § Various cache coherence protocols (MESI, MOESI, …)
    § Interconnection networks
§ Classic is faster but less detailed
Is my FFT faster with more processors?
Hardware/software performance evaluation! 👍
§ FFT kernel from the SPLASH-2 benchmark suite
§ ARM Instruction Set Architecture
§ 1 or 4 detailed out-of-order CPUs
§ Caches
  § L1: 64 kB data, 32 kB instruction
  § L2: 2 MB, shared
Let's evaluate!
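One way to answer the question is to compare the simulated execution time reported by the two runs. The sketch below assumes the statistics file contains a line starting with `sim_seconds` followed by a value, as in Gem5 releases of that period (newer releases may name it differently). The helper names are hypothetical, and the two sample texts are placeholders invented for the example, not measured results.

```python
def read_stat(stats_text, name):
    """Return the value of the first statistic called `name` in a stats dump."""
    for line in stats_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == name:
            return float(fields[1])
    raise KeyError(name)


def speedup(stats_baseline, stats_candidate, name="sim_seconds"):
    """Ratio of simulated times: above 1 means the candidate run is faster."""
    return read_stat(stats_baseline, name) / read_stat(stats_candidate, name)


# Placeholder stats excerpts, NOT real measurements from the FFT example.
ONE_CPU = "sim_seconds    0.008    # Number of seconds simulated\n"
FOUR_CPUS = "sim_seconds    0.004    # Number of seconds simulated\n"

result = speedup(ONE_CPU, FOUR_CPUS)
```

In practice the two strings would be read from the statistics files written by the 1-CPU and 4-CPU simulations. The comparison uses simulated seconds, not the time the simulator itself took to run.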
§ Quick introduction to Gem5
§ Many more things to explore in the framework!
  § Integration of power models (in development)
  § Memory / CPU trace generation
  § Statistics output for performance evaluation
  § SystemC co-simulation
  § Automatic benchmark runs
  § Checkpointing, fast-forwarding