Flexible Timing Simulation of RISC-V Processors with Sniper
Neethu Bal Mallya1, Cecilia Gonzalez-Alvarez2, Trevor E. Carlson1
1National University of Singapore, Singapore 2Ghent University, Belgium
2/6/2018 2
Verilog/RTL High-level
[Figure: Load Slice Core pipeline. Stages and structures: Fetch, Decode, Instr. Queue, Bypass Queue, ALUs and Caches, Writeback, Commit, Learn Slice]
Source: T. E. Carlson et al., “Load Slice Core” [ISCA 2015]
1 Currently not supported for RISC-V
Almost 10 MIPS
1000 cores
Average error with HW
$SNIPER_HOME/include/sim_api.h
Run McPAT
Update the statistics
cycles instructions
Spike rv8
Columns: Sniper + RISC-V | gem5 (RISC5) | FireSim / Chisel / Verilog
Development Methodology: C++ based (SW) | C++ based (HW) | RTL based (HW)
Dev-time: +++ | ++ | +
Sim-time: +++ | ++ | ++++/+/+
Simulation model: Cycle-level + Cycle-approximate | Cycle-level | Cycle-exact + Cycle-approximate
Flexibility: Ease-of-use / modification | Requires RTL / abstract models
Fidelity: Sophisticated models require hardware validation | Cycle-exact models derived from synthesizable RTL
[Figure: design timelines for "RTL Validation Only" versus "Sniper" followed by "RTL"; phases: Development, Short Testing, Final Check]
[Figure: Sniper architecture]
Sniper Frontend: Emulation / Binary Instrumentation produces events for threads 1 to M
SIFT pipes: SIFT 1 to SIFT M, one per thread
Sniper Backend: Decoder Library, Thread Scheduler, Core Performance Models (Core Model 1 to Core Model M), Cache Performance Models (L1, L2), NoC
1. Frontend: RISC-V functional simulators (rv8 / Spike) were updated to support SIFT generation
2. Decoder Library: Architecture-agnostic methods were added to implement the decoding phase of the processor
3. Core Model: Parameters like description of ports / functional units, latencies
4. Configuration files to resemble a BOOM processor
[Figure: Frontend, SIFT pipes, and Backend, annotated with changes 1 to 4]
Dynamic information:
Instruction execution order
Memory addresses for loads and stores
Branch directions (taken / not taken)
Executed / masked info for predicated instructions
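The kinds of dynamic information listed above could be pictured as one record per executed instruction. The sketch below is only an illustration: `TraceRecord`, its field names, and `summarize` are hypothetical and do not reflect the actual SIFT format or the `Sift::Writer` interface.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TraceRecord:
    """Hypothetical per-instruction record; NOT the actual SIFT format."""
    pc: int                                   # position in execution order is the list index
    mem_addresses: List[int] = field(default_factory=list)  # loads and stores
    branch_taken: Optional[bool] = None       # None when the instruction is not a branch
    executed: bool = True                     # False when a predicated instruction is masked


# A tiny made-up trace: a load, a taken branch, then the branch target.
trace = [
    TraceRecord(pc=0x1000, mem_addresses=[0x80000010]),
    TraceRecord(pc=0x1004, branch_taken=True),
    TraceRecord(pc=0x1020),
]


def summarize(records):
    """Count the dynamic events a timing model would consume."""
    return {
        "instructions": len(records),
        "memory_accesses": sum(len(r.mem_addresses) for r in records),
        "taken_branches": sum(1 for r in records if r.branch_taken),
    }


summary = summarize(trace)
print(summary)
```

Keeping the records in a list preserves execution order, which is the first item the frontend has to convey to the backend.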
Sift::Writer::InstructionCount()
Sift::Writer::CacheOnly()
Sift::Writer::Instruction() // addresses, branch direction, etc.
Sift::Writer::Magic()
Categories: Instruction, Instrumentation, Control
rv8 / Spike
[Figure: Frontend (rv8 / Spike with Sift::Writer) sends SIFT traces over SIFT pipes to the Backend]
$SNIPER_HOME/decoder_lib
$SNIPER_HOME/config
$SNIPER_HOME/common/performance_model
[SNIPER] Start
[SNIPER] --------------------------------------------------------------------------------
[SNIPER] Sniper using SIFT/trace-driven frontend
[SNIPER] Running full application in DETAILED mode
[SNIPER] --------------------------------------------------------------------------------
[SNIPER] Enabling performance models
[SNIPER] Setting instrumentation mode to DETAILED
Trace Monitor Started
[TRACE:0] -- DONE --
[SNIPER] Disabling performance models
[SNIPER] Leaving ROI after 18.26 seconds
OUT: RUN: TraceThread
[SNIPER] Simulated 5.0M instructions, 11.2M cycles, 0.45 IPC
[SNIPER] Simulation speed 273.4 KIPS (273.4 KIPS / target core - 3657.1ns/instr)
[SNIPER] Setting instrumentation mode to FAST_FORWARD
[SNIPER] End
[SNIPER] Elapsed time: 18.41 seconds
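As a sanity check on the summary lines in the log above, the reported IPC and simulation speed can be recomputed from the instruction count, cycle count, and ROI time. This is a minimal sketch: the helper names are hypothetical, and it assumes the 18.26 s ROI time is the wall-clock interval behind the reported KIPS. Because the log rounds the counts to one decimal place, the recomputed values only approximately match the printed ones.

```python
import re

# Lines copied from the log above.
SUMMARY = "[SNIPER] Simulated 5.0M instructions, 11.2M cycles, 0.45 IPC"
ROI_SECONDS = 18.26  # from "Leaving ROI after 18.26 seconds"

SCALE = {"": 1.0, "K": 1e3, "M": 1e6, "G": 1e9}


def parse_count(text):
    """Turn a rounded count such as '5.0M' into a float (hypothetical helper)."""
    match = re.fullmatch(r"([0-9.]+)([KMG]?)", text)
    if match is None:
        raise ValueError(f"unrecognized count: {text!r}")
    value, suffix = match.groups()
    return float(value) * SCALE[suffix]


def parse_summary(line):
    """Extract (instructions, cycles, reported IPC) from the summary line."""
    match = re.search(r"Simulated (\S+) instructions, (\S+) cycles, (\S+) IPC", line)
    if match is None:
        raise ValueError("not a summary line")
    instructions, cycles, ipc = match.groups()
    return parse_count(instructions), parse_count(cycles), float(ipc)


instructions, cycles, reported_ipc = parse_summary(SUMMARY)
ipc = instructions / cycles                   # instructions per simulated cycle
kips = instructions / ROI_SECONDS / 1e3       # thousand instructions per host second
ns_per_instr = ROI_SECONDS * 1e9 / instructions

print(f"IPC {ipc:.2f}, {kips:.1f} KIPS, {ns_per_instr:.1f} ns/instr")
```

IPC describes the simulated processor, while KIPS describes how fast the simulator itself runs on the host; the two are independent figures.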
Testcase          IPC     KIPS
470.lbm           0.15    97.899
444.namd          1       304.719
450.soplex        1.52    343.668
456.hmmer         2.71    523.41
462.libquantum    2.65    611.968
130.87 130.96
Source: Tuan Ta et al., “Simulating Multi-Core RISC-V Systems in gem5” [CARRV 2018]
cycle-level processor implementations