RVTensor: A light-weight neural network inference framework based on the RISC-V architecture


SLIDE 1

Institute of Software, Chinese Academy of Sciences

RVTensor: A light-weight neural network inference framework based on the RISC-V architecture

Pengpeng Hou, Jiageng Yu, Yuxia Miao, Yang Tai, Yanjun Wu, Chen Zhao

*Corresponding author: Jiageng Yu jiageng08@iscas.ac.cn

SLIDE 2

Introduction

• RISC-V ISA is developing rapidly
  - Open-source ISA

• RISC-V is suitable for IoT scenarios
  - Base instruction set + extension instruction sets
  - IoT scenarios are fragmented

[Figure: the Basic instruction set with four extensions, extended1 to extended4]

SLIDE 3

Introduction

• Popular inference frameworks
  - For servers: TensorFlow, MXNet, Caffe
  - For smartphones: TensorFlow Lite, NCNN, MNN

SLIDE 4

Introduction

• Few inference systems exist for RISC-V + IoT
  - Architectural limitations
      - SIMD feature
  - IoT hardware resource limitations
      - Chip performance is weak
      - Memory capacity is small

Security surveillance camera price statistics

Price       90~150   150~775   >775
User rate   34%      37%       29%

SLIDE 5

Introduction

• RVTensor: RISC-V Tensor
  - An inference system for RISC-V + IoT scenarios
  - Few third-party library dependencies
      - Only libhd5.so
  - Low hardware resource requirements
  - Based on the SERVE.r platform

SLIDE 6

Overview of RVTensor architecture

• RVTensor Platform Overview
  - Four modules
      - Model analysis
      - Op operators
      - Construction calculation graph
      - Execution calculation graph

SLIDE 7

Overview of RVTensor architecture

• RVTensor Platform Overview
  - Model analysis
      - Parses model files such as .pb and extracts information such as the operators and their weight data.

SLIDE 8

Overview of RVTensor architecture

• RVTensor Platform Overview
  - Op operators
      - Implements each operator, including conv, add, activation, pooling, fc, and other operations.
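
To make the operator list concrete, here is a minimal pure-Python sketch of what conv, add, activation (ReLU), pooling, and fc compute. The function names and the simplifications (single channel, stride 1, no padding, 2x2 pooling) are assumptions made for illustration; this is not RVTensor's actual API or implementation.

```python
# Hypothetical reference operators on plain Python lists (not RVTensor's API).

def conv2d(image, kernel):
    """Single-channel 2D convolution (cross-correlation), stride 1, no padding."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(ow)]
            for i in range(oh)]

def add(x, y):
    """Element-wise addition of two equally shaped 2D maps (e.g. a residual add)."""
    return [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]

def relu(x):
    """Activation: clamp negative values to zero."""
    return [[max(0, v) for v in row] for row in x]

def max_pool2x2(x):
    """2x2 max pooling with stride 2; an odd trailing row or column is dropped."""
    return [[max(x[i][j], x[i][j + 1], x[i + 1][j], x[i + 1][j + 1])
             for j in range(0, len(x[0]) - 1, 2)]
            for i in range(0, len(x) - 1, 2)]

def fc(vector, weights, bias):
    """Fully connected layer: one output value per weight row."""
    return [sum(v * w for v, w in zip(vector, row)) + b
            for row, b in zip(weights, bias)]
```

A real implementation would work on multi-channel tensors in contiguous memory; nested lists are used here only to keep the arithmetic visible.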

SLIDE 9

Overview of RVTensor architecture

• RVTensor Platform Overview
  - Construction calculation graph
      - Builds a calculation graph based on the model analysis and the op operator modules.

SLIDE 10

Overview of RVTensor architecture

• RVTensor Platform Overview
  - Execution calculation graph
      - Obtains the inference results from the input data (such as image data) and the calculation graph.
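
The construction and execution steps can be sketched together as follows. The layer-description format, the `OPS` registry, and the names `build_graph` and `execute_graph` are all hypothetical and chosen for illustration; the slides do not describe RVTensor's internal data structures.

```python
# Hypothetical sketch: parsed layer descriptions -> calculation graph -> result.

# Op registry: operator name -> function over flat lists of floats.
OPS = {
    "scale": lambda inputs, w: [v * w for v in inputs[0]],
    "relu": lambda inputs, w: [max(0.0, v) for v in inputs[0]],
    "add": lambda inputs, w: [a + b for a, b in zip(inputs[0], inputs[1])],
}

def build_graph(layers):
    """Turn layer descriptions into executable nodes, validating ops and inputs."""
    graph, known = [], {"input"}
    for layer in layers:
        if layer["op"] not in OPS:
            raise ValueError("unsupported op: " + layer["op"])
        missing = [n for n in layer["inputs"] if n not in known]
        if missing:
            raise ValueError("undefined inputs: " + ", ".join(missing))
        graph.append((layer["name"], OPS[layer["op"]],
                      layer["inputs"], layer.get("weight")))
        known.add(layer["name"])
    return graph

def execute_graph(graph, input_data):
    """Run the nodes in order; each node reads earlier outputs by name."""
    values = {"input": input_data}
    name = "input"
    for name, fn, input_names, weight in graph:
        values[name] = fn([values[n] for n in input_names], weight)
    return values[name]          # output of the last node

layers = [
    {"name": "s1", "op": "scale", "inputs": ["input"], "weight": 2.0},
    {"name": "r1", "op": "relu", "inputs": ["s1"]},
    {"name": "out", "op": "add", "inputs": ["r1", "input"]},  # residual-style branch
]
result = execute_graph(build_graph(layers), [1.0, -3.0])
print(result)
```

Separating construction from execution means the graph is validated once and can then be reused for every input image.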

SLIDE 11

Optimization

• Reducing dependencies on third-party libraries
  - Multi-thread library: Pthread
      - Provides many APIs
      - RVTensor uses only a few of them

SLIDE 12

Optimization

• Improving memory utilization
  - Memory reuse: ops share one global memory block while they run
      - Global memory block = MAX{op's memory requirement}
      - A branch phase is treated as an atomic operation
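
The sizing rule above can be illustrated with a small sketch. The byte counts are made-up numbers, and modelling a branch phase as a single entry with its own combined requirement is an assumption about what "atomic" means here; the slides do not give RVTensor's actual allocator.

```python
# Hypothetical per-unit workspace requirements in bytes (made-up numbers).
# A "unit" is either one op or a whole branch phase treated as atomic.
requirements = {
    "conv1": 64 * 1024,
    "branch1 (conv+conv+add)": 96 * 1024,
    "pool": 16 * 1024,
    "fc": 4 * 1024,
}

def global_block_size(reqs):
    """Global memory block = MAX{unit's memory requirement}."""
    return max(reqs.values())

def run_with_shared_block(reqs):
    """Allocate once, then hand each unit a view into the same block."""
    block = bytearray(global_block_size(reqs))
    view = memoryview(block)
    used = []
    for name, size in reqs.items():
        workspace = view[:size]   # a slice of the shared block, no new allocation
        workspace[0] = 1          # stand-in for the unit's real computation
        used.append((name, len(workspace)))
    return len(block), used

shared_bytes, used = run_with_shared_block(requirements)
separate_bytes = sum(requirements.values())  # cost if every unit had its own buffer
print(shared_bytes, separate_bytes)
```

With these example numbers the shared block needs 96 KiB, whereas one buffer per unit would need 180 KiB in total.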

SLIDE 13

Evaluation

• Platform: SERVE.r
• Neural network: ResNet20
• Data set: CIFAR-10

SLIDE 14

Evaluation

• Accuracy
  - RVTensor and Keras produce the same results
• Performance
  - The average time to process each image is 13.51 seconds
• Executable file size
  - The executable file size of RVTensor is 193 KB

Note: Keras runs on an x86 platform.

SLIDE 15

Future work

• Memory optimization
  - Because memory is limited, swapping data in and out of memory becomes an issue
• Sparse convolution
  - The ReLU op produces many zeros in the data, which makes the convolution inefficient
• Model pruning
  - Compress the model parameters through pruning techniques to make the model more suitable for IoT scenarios
• V instruction set adaptation
  - Re-implement the op operators based on the V instruction set to improve efficiency
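
As an illustration of the sparse convolution item, the sketch below compares a dense 1D convolution with a variant that skips zero activations and counts the multiplications each one performs. The 1D setting, the function names, and the input values are assumptions for illustration only; the slides do not describe how RVTensor plans to implement sparse convolution.

```python
def relu(values):
    """Activation: clamp negative values to zero."""
    return [max(0.0, v) for v in values]

def conv1d_dense(signal, kernel):
    """Dense 1D convolution (cross-correlation); returns (output, multiply count)."""
    n = len(signal) - len(kernel) + 1
    out, mults = [], 0
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            acc += signal[i + k] * w
            mults += 1
        out.append(acc)
    return out, mults

def conv1d_skip_zeros(signal, kernel):
    """Same result, but only non-zero inputs are scattered into the outputs."""
    n = len(signal) - len(kernel) + 1
    out, mults = [0.0] * n, 0
    for j, v in enumerate(signal):
        if v == 0.0:
            continue                  # zero activations contribute nothing
        for k, w in enumerate(kernel):
            i = j - k                 # output position fed by signal[j] * kernel[k]
            if 0 <= i < n:
                out[i] += v * w
                mults += 1
    return out, mults

activations = relu([0.5, -1.0, -2.0, 3.0, -0.5, -4.0])  # four of six become zero
dense_out, dense_mults = conv1d_dense(activations, [1.0, 2.0, 1.0])
sparse_out, sparse_mults = conv1d_skip_zeros(activations, [1.0, 2.0, 1.0])
print(dense_mults, sparse_mults)
```

Both variants produce the same output, but the zero-skipping one does fewer multiplications when most activations are zero. Whether that saving outweighs the extra branching on real hardware depends on the sparsity level and the target core.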

SLIDE 16

Thanks!

*Corresponding author: Jiageng Yu jiageng08@iscas.ac.cn