Flexible Timing Simulation of RISC-V Processors with Sniper


slide-1
SLIDE 1

Flexible Timing Simulation of RISC-V Processors with Sniper

Neethu Bal Mallya¹, Cecilia Gonzalez-Alvarez², Trevor E. Carlson¹

¹ National University of Singapore, Singapore
² Ghent University, Belgium

slide-2
SLIDE 2

Outline

  • Need for Simulation
  • Sniper Simulator Overview
  • Our enhancements to Sniper
  • Initial Processor Performance Analysis
  • Conclusion

2/6/2018 2

slide-3
SLIDE 3

Why do we need Simulation?

  • Performance analysis of next-generation systems
  • Pre-silicon software optimizations
  • Architecture design space exploration

slide-4
SLIDE 4

Trade-offs in Simulation

[Figure: simulation approaches ranging from Verilog/RTL to high-level]

slide-5
SLIDE 5

Trade-offs in Simulation

[Figure: simulation approaches ranging from Verilog/RTL to high-level. Example: an efficient MLP processor with the pipeline blocks Fetch, Decode, Instr. Queue, Bypass Queue, Learn Slice, ALUs and Caches, Writeback, Commit]

Source: T. E. Carlson, et al., “Load Slice Core” [ISCA 2015]

slide-6
SLIDE 6

Sniper Simulator – An Overview

  • Parallel simulator based on Interval Simulation
  • Models multi-/many-cores running multithreaded¹ and multi-program workloads
  • Hardware validated for x86
  • Flexible simulation options

¹ Currently not supported for RISC-V
slide-7
SLIDE 7

Sniper – Beyond Traditional Simulation

  • Strong adoption in industry and academia
  • 550+ citations
  • 800+ researcher downloads
  • 64+ countries

  • Actively used since 2011
  • Belgium-based team
  • Supports next generation Xeon Phi (KNL++)
  • HiPEAC TechTransfer Award
slide-8
SLIDE 8

Sniper – Key Differentiators

  • Fast development time
  • Enables Limit Studies
  • Branch Prediction
  • Memory Dependence Prediction
  • Shared Multi-level Cache Hierarchy

  • High Performance and Scalability
  • Almost 10 MIPS
  • 1000 cores
  • Average error of just 11% with HW
slide-9
SLIDE 9

Sniper - Interacting with the Simulator

  • Python interfaces
  • SimAPI
  • Magic Instructions
  • SimROIStart() - SimROIEnd()

$SNIPER_HOME/include/sim_api.h
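
The idea behind region-of-interest (ROI) markers can be illustrated with a small Python toy. This is not Sniper's SimAPI or its Python interface: the event names `ROI_START` and `ROI_END` and the function `count_roi_instructions` are hypothetical, chosen only to show how a pair of markers restricts accounting to the part of a run between them.

```python
def count_roi_instructions(events):
    """Count instruction events that occur between ROI_START and ROI_END.

    `events` is an iterable of strings: "ROI_START", "ROI_END", or any other
    string, which this toy model treats as one executed instruction.
    """
    in_roi = False
    counted = 0
    for event in events:
        if event == "ROI_START":
            in_roi = True
        elif event == "ROI_END":
            in_roi = False
        elif in_roi:
            counted += 1
    return counted


# Hypothetical trace: setup code, a marked kernel, then teardown code.
trace = ["init", "init", "ROI_START", "load", "add", "store", "ROI_END", "exit"]
roi_count = count_roi_instructions(trace)
print(roi_count)
```

Only the three events between the markers are counted; the setup and teardown events are ignored, which is the effect the markers are meant to have on the measured region.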

slide-10
SLIDE 10

Sniper - Interacting with the Simulator

  • Energy Stats

  • Run McPAT
  • Update the statistics

slide-11
SLIDE 11

Sniper - Interacting with the Simulator

  • Loop Tracer

[Figure labels: cycles, instructions]

slide-12
SLIDE 12

Sniper + RISC-V ecosystem

  • RISC-V
  • Open, Extensible ISA
  • Collection of related software tools

  • Existing Architecture-level Software implementations
  • Functional simulators: Spike, rv8
  • Many additional things

slide-13
SLIDE 13

Comparison with existing solutions

                        | Sniper + RISC-V                   | gem5 (RISC5)   | FireSim / Chisel / Verilog
Development Methodology | C++ based (SW)                    | C++ based (HW) | RTL based (HW)
Dev-time                | +++                               | ++             | +
Sim-time                | +++                               | ++             | ++++ / + / +
Simulation model        | Cycle-level + Cycle-approximate   | Cycle-level    | Cycle-exact + Cycle-approximate

Flexibility
  • Sniper + RISC-V, gem5: Ease-of-use / modification
  • FireSim / Chisel / Verilog: Requires RTL / abstract models

Fidelity
  • Sniper + RISC-V, gem5: Sophisticated models require hardware validation
  • FireSim / Chisel / Verilog: Cycle-exact models derived from synthesizable RTL

slide-14
SLIDE 14

Simulation Flow

[Figure: simulation flow comparing RTL and Sniper. Labels: Development, Short Testing, Final Check, RTL Validation Only]

slide-15
SLIDE 15

Sniper Architecture

[Figure: Sniper architecture block diagram]

Sniper Frontend
  • Emulation / Binary Instrumentation
  • events thread 1, events thread 2, ..., events thread M-1, events thread M

SIFT pipes
  • SIFT 1, SIFT 2, ..., SIFT M-1, SIFT M

Sniper Backend
  • Decoder Library
  • Thread Scheduler
  • Core Performance Models: Core Model 1, Core Model 2, ..., Core Model M-1, Core Model M
  • Cache Performance Models: L1 and L2 caches
  • NoC

slide-16
SLIDE 16

How did we enhance Sniper?

[Figure: Sniper architecture block diagram (Sniper Frontend, SIFT pipes, Sniper Backend), as on the Sniper Architecture slide]

slide-17
SLIDE 17

How did we enhance Sniper?

  1. RISC-V functional simulators (rv8 / Spike) were updated to support SIFT generation
  2. Decoder Library: architecture-agnostic methods were added to implement the decoding phase of the processor
  3. Core Model: parameters such as the description of ports/functional units, latencies, etc. were updated
  4. Configuration files to resemble a BOOM processor

[Figure: Frontend → SIFT pipes → Backend, annotated with the numbers 1 to 4]

slide-18
SLIDE 18

Sniper Instruction Trace File Format (SIFT)

  • Dynamic Instruction stream generated by the Frontend
  • Instruction Execution Order
  • Memory Addresses for Loads and Stores
  • Branch Directions (taken/not taken)
  • Executed/masked info for Predicated instructions
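
To make the contents of such a dynamic instruction stream concrete, here is a minimal Python sketch of one stream entry. It is purely illustrative and does not reflect the actual SIFT binary layout: the class `DynamicInstruction` and all of its field names are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DynamicInstruction:
    """One entry of a dynamic instruction stream (illustrative, not the SIFT layout)."""
    pc: int                                                  # address of the instruction
    mem_addresses: List[int] = field(default_factory=list)   # load/store addresses, if any
    branch_taken: Optional[bool] = None                      # None for non-branch instructions
    executed: bool = True                                    # False if a predicated instruction was masked


# Entries appear in execution order, so the list position encodes the ordering.
stream = [
    DynamicInstruction(pc=0x1000, mem_addresses=[0x8000]),   # load
    DynamicInstruction(pc=0x1004),                           # ALU operation
    DynamicInstruction(pc=0x1008, branch_taken=True),        # taken branch back to 0x1000
    DynamicInstruction(pc=0x1000, mem_addresses=[0x8008]),   # same static load, new address
]

memory_ops = sum(1 for ins in stream if ins.mem_addresses)
taken_branches = sum(1 for ins in stream if ins.branch_taken)
print(memory_ops, taken_branches)
```

The same static instruction (PC `0x1000`) appears twice with different memory addresses, which is the difference between a dynamic stream and a static listing of the program.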

slide-19
SLIDE 19

How to add new Frontend?

  • Instruction: Sift::Writer::Instruction() // addresses, branch direction, etc.
  • Instrumentation: Sift::Writer::InstructionCount(), Sift::Writer::CacheOnly()
  • Control: Sift::Writer::Magic()
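
As a rough sketch of how a frontend might drive a trace writer, the Python toy below logs calls in order. `ToyTraceWriter` is a hypothetical stand-in and not the `Sift::Writer` C++ class: its method names only loosely mirror the ones listed above, and the signatures, the `run_frontend` helper, and the `detailed_from` split point are all assumptions made for illustration.

```python
class ToyTraceWriter:
    """Hypothetical stand-in for a trace writer; it only logs calls in order."""

    def __init__(self):
        self.records = []

    def instruction(self, pc, addresses=(), taken=None):
        # Detailed record: one per executed instruction.
        self.records.append(("instruction", pc, tuple(addresses), taken))

    def instruction_count(self, count):
        # Compact record: only the number of instructions that were skipped.
        self.records.append(("instruction_count", count))

    def magic(self, command):
        # Control record, for example a marker emitted by the application.
        self.records.append(("magic", command))


def run_frontend(executed, writer, detailed_from):
    """Feed (pc, addresses, taken) tuples to the writer.

    Instructions before index `detailed_from` are only counted; later ones
    are written in full.
    """
    skipped = min(detailed_from, len(executed))
    if skipped:
        writer.instruction_count(skipped)
    for pc, addresses, taken in executed[detailed_from:]:
        writer.instruction(pc, addresses, taken)


writer = ToyTraceWriter()
program = [(0x1000, [], None), (0x1004, [0x8000], None), (0x1008, [], True)]
run_frontend(program, writer, detailed_from=1)
writer.magic("marker")
print(writer.records)
```

The point of the sketch is the division of labour: the functional simulator executes the program, and the writer only serializes what happened, either in full or as a count.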

slide-20
SLIDE 20

How to add new Frontend?

rv8 / Spike

slide-21
SLIDE 21

How to add new Frontend?

[Figure: Frontend (rv8 / Spike with Sift::Writer) → SIFT pipes → Backend]

slide-22
SLIDE 22

How to update Backend?

  • Decoder Library: $SNIPER_HOME/decoder_lib
  • 2 classes
  • Decoder
  • InstructionDecoded
  • Config Files: $SNIPER_HOME/config
  • Core Model: $SNIPER_HOME/common/performance_model
slide-23
SLIDE 23

How to run Sniper?

./run-sniper --frontend=[pin|dr|spike|rv8|legacy] --config

[SNIPER] Start
[SNIPER] --------------------------------------------------------------------------------
[SNIPER] Sniper using SIFT/trace-driven frontend
[SNIPER] Running full application in DETAILED mode
[SNIPER] --------------------------------------------------------------------------------
[SNIPER] Enabling performance models
[SNIPER] Setting instrumentation mode to DETAILED
Trace Monitor Started
[TRACE:0] -- DONE --
[SNIPER] Disabling performance models
[SNIPER] Leaving ROI after 18.26 seconds
OUT: RUN: TraceThread
[SNIPER] Simulated 5.0M instructions, 11.2M cycles, 0.45 IPC
[SNIPER] Simulation speed 273.4 KIPS (273.4 KIPS / target core - 3657.1ns/instr)
[SNIPER] Setting instrumentation mode to FAST_FORWARD
[SNIPER] End
[SNIPER] Elapsed time: 18.41 seconds
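
The summary figures in this output can be cross-checked with a few lines of Python. The sketch below parses two of the log lines shown above and recomputes IPC and simulation speed. The regular expressions are assumptions that match only this exact wording, and the script is not part of Sniper.

```python
import re

# Two summary lines copied from the run output above.
log = """\
[SNIPER] Leaving ROI after 18.26 seconds
[SNIPER] Simulated 5.0M instructions, 11.2M cycles, 0.45 IPC
"""

roi_seconds = float(re.search(r"Leaving ROI after ([\d.]+) seconds", log).group(1))
sim = re.search(r"Simulated ([\d.]+)M instructions, ([\d.]+)M cycles", log)
instructions = float(sim.group(1)) * 1e6
cycles = float(sim.group(2)) * 1e6

ipc = instructions / cycles                      # instructions per simulated cycle
kips = instructions / roi_seconds / 1e3          # thousand instructions per host second
ns_per_instr = roi_seconds * 1e9 / instructions  # host time spent per simulated instruction

print(f"IPC = {ipc:.2f}, speed = {kips:.1f} KIPS, {ns_per_instr:.0f} ns/instr")
```

The recomputed IPC rounds to the reported 0.45. The speed figures land close to, but not exactly on, the reported 273.4 KIPS and 3657.1 ns/instr, presumably because the log prints the instruction count rounded to 5.0M.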

slide-24
SLIDE 24

Experimental Setup

  • Sniper multi-core simulator
  • Similar to BOOM v1 DefaultConfig
  • Dispatch width:2, Issue Width:3, ROB:80
  • 32KB L1s, 1MB L2
  • 2.0GHz
  • SPEC CPU2006 benchmarks
  • First 5M instructions
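
The setup above could be captured in an INI-style file and read back as follows. This is a sketch only: the section and key names (`core`, `frequency_ghz`, `rob_entries`, and so on) are hypothetical and are not Sniper's actual configuration keys, which are defined by the files under $SNIPER_HOME/config.

```python
import configparser

# Hypothetical INI-style fragment; section and key names are illustrative only.
CONFIG_TEXT = """
[core]
frequency_ghz = 2.0
dispatch_width = 2
issue_width = 3
rob_entries = 80

[l1_cache]
size_kb = 32

[l2_cache]
size_kb = 1024
"""

config = configparser.ConfigParser()
config.read_string(CONFIG_TEXT)

frequency_ghz = config.getfloat("core", "frequency_ghz")
cycle_time_ns = 1.0 / frequency_ghz   # duration of one target cycle in nanoseconds
l2_to_l1_ratio = config.getint("l2_cache", "size_kb") // config.getint("l1_cache", "size_kb")

print(cycle_time_ns, l2_to_l1_ratio)
```

Keeping the machine description in a text file rather than in code is what lets one simulator binary model different processors by swapping the configuration.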

slide-25
SLIDE 25

Initial Processor Performance Analysis

Testcase          IPC     KIPS
470.lbm           0.15     97.899
444.namd          1       304.719
450.soplex        1.52    343.668
456.hmmer         2.71    523.41
462.libquantum    2.65    611.968
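
The table values translate directly into host run time and simulated cycle counts. The sketch below assumes that every run covers exactly the first 5M instructions stated in the experimental setup and that KIPS is measured over that same span. Both are assumptions, so the derived numbers are estimates.

```python
# (IPC, KIPS) pairs copied from the table above.
results = {
    "470.lbm": (0.15, 97.899),
    "444.namd": (1.0, 304.719),
    "450.soplex": (1.52, 343.668),
    "456.hmmer": (2.71, 523.41),
    "462.libquantum": (2.65, 611.968),
}

INSTRUCTIONS = 5_000_000  # "First 5M instructions" from the experimental setup

for name, (ipc, kips) in results.items():
    host_seconds = INSTRUCTIONS / (kips * 1e3)   # wall-clock time on the host
    target_cycles = INSTRUCTIONS / ipc           # simulated cycles on the target
    print(f"{name:16s} {host_seconds:6.1f} s  {target_cycles / 1e6:6.1f} M cycles")

slowest = min(results, key=lambda n: results[n][1])
fastest = max(results, key=lambda n: results[n][1])
print("slowest to simulate:", slowest, "| fastest to simulate:", fastest)
```

In this data the benchmark with the lowest IPC (470.lbm) is also the slowest to simulate, while the highest simulation speed belongs to 462.libquantum rather than to the benchmark with the highest IPC.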

130.87 130.96

Source: Tuan Ta, et al., “Simulating Multi-Core RISC-V Systems in gem5” [CARRV 2018]

slide-26
SLIDE 26

Conclusion

  • An infrastructure extension of Sniper
  • Sniper + RISC-V is now available
  • Next steps
  • Improve the simulator features to allow for a detailed comparison with cycle-level processor implementations

slide-27
SLIDE 27
  • Thank you
  • Download Today!
  • http://snipersim.org/w/Download
  • Questions?
  • http://groups.google.com/group/snipersim

slide-28
SLIDE 28

Flexible Timing Simulation of RISC-V Processors with Sniper

Neethu Bal Mallya¹, Cecilia Gonzalez-Alvarez², Trevor E. Carlson¹

¹ National University of Singapore, Singapore
² Ghent University, Belgium