Towards Benchmarking AIOT Device based on MCU
Dong Li, Seaway Technology Inc. and ICT, CAS
2019-11-15
Bench19 Seaway tech.
Outline
- 01 MCU-based AIOT Device and Benchmarking
- 02 SeawayRTOS Intro. & Auditing Kernel
- 03 Early Experiments for Benchmarking
- 04 Benchmarking Goal and Method
01 MCU-based AIOT Device and Benchmarking
MCU-based AIOT Device
- 1. Tiny smart devices with computing ability are already cheap and everywhere.
- 2. The future of machine learning will be tiny.
MCUs and Sensors Are Already in the Milliwatt Range
Typical power draw of common components:
- 6-inch display: 400 mW
- 4G cell radio: 800 mW
- LP BLE4.0 & WIFI: 100 mW
- Gyroscope sensor: 130 mW
- GPS: 180 mW
- 1/4 CMOS camera: 300 mW
Source: ARM & Princeton [arXiv:1905.12107]
Deep Learning Works Well and Energy-Efficiently on MCUs
- 1. ARM CMSIS-5 for Cortex M
- CMSIS-NN
- uTensor
- 2. TensorFlow Lite For MCU
- Person detection
- Speech Keyword spotting
- Classify physical gestures
- 3. Microsoft Embedded Learning Library (ELL)
Example boards:
- ESP32 SoC (WIFI and BLE)
- SparkFun Edge with Apollo3
- Nordic nRF52840 (BLE)
- STM32F746 Discovery kit
Existing MCUs and New AIOT Low-Power Processors
Existing MCU/DSP
- 1. MCU clock: 40~200 MHz
- 2. RAM(SDRAM): 32KB ~ 512KB
- 3. ROM(Flash): 512KB ~ 1MB
- 4. Energy: ~100 uA/MHz (1.2V - 5V)
New AIOT Processor (MCU/DSP+NPU)
- 1. MCU+NPU by ARM or RISC-V
- 2. MCU+DSP+ Spec. NN Accelerator by ARM/RISC-V/FPGA
- 3. MCU+PIM (Process in Memory) chip
Examples: ESP32 with TFLite for face recognition; ICT RISC-V MCU+NPU FPGA board
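The "~100 uA/MHz (1.2V - 5V)" figure for existing MCUs can be turned into a rough active-power range. The sketch below is only a back-of-the-envelope estimate: it assumes power is simply current times supply voltage and that current scales linearly with clock frequency. The function name is hypothetical.

```python
def active_power_mw(ua_per_mhz: float, clock_mhz: float, supply_v: float) -> float:
    """Estimate active power in milliwatts from a uA/MHz figure.

    Assumption: current scales linearly with clock speed and
    power = current * supply voltage (no leakage or peripherals).
    """
    current_ma = ua_per_mhz * clock_mhz / 1000.0  # uA -> mA
    return current_ma * supply_v                  # mA * V = mW


# Corners of the ranges quoted on the slide.
low = active_power_mw(100, 40, 1.2)    # slowest clock, lowest voltage
high = active_power_mw(100, 200, 5.0)  # fastest clock, highest voltage
print(f"{low:.1f} mW to {high:.1f} mW")
```

Under these assumptions the MCU core stays within the same milliwatt range as the sensors and radios listed earlier.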
Benchmarking Goal: The Best Shape
Radar chart axes: Accuracy, Energy Consumption (picojoules per op), Max RAM, Cost, Max ROM, Computing Performance
- A spindle-shaped profile is the best shape.
02 SeawayRTOS Intro. & Auditing Kernel
SeawayRTOS for AIOT Devices
KB-Level Runtime
- Online AIOT App Store
- Support JavaScript and Python
- ROM<100K, RAM<2K
- Function Migration
KB-Level EdgeStack
- Support for MQTT, CoAP and HTTP
- WIFI, BLE, LoRa, NB-IoT and Zigbee
- ROM<32K, RAM<2K
- Resp to Req <200 ms
KB-Level Seaway RTOS Kernel
- Auditing Kernel
- Active Sleep Mode
- ROM<10K & RAM<1K & TCB<10B
- Task Fail Rate <0.1%
[Figure: SeawayRTOS architecture. Hardware labels: Little Core, Big Core OS, AI core (Inference), Sensor Hub, Sensors/Actuators, Memory Controller, Comm. Controller, Data/Ins. Bus, I/O Bus. Software labels: HAL & BSP, Seaway Kernel, EdgeStack, Seaway Runtime, AIOT Framework, Energy Opt., Files, Apps.]
Seaway Runtime
Technical Features
- 1. AIOT App Store
- Execution method for AIOT Apps that does not write the app to persistent storage
- Quasi-single-machine programming for the edge domain
- 2. AIOT Runtime Development
- on Kernel: Native C/C++
- on Runtime: JavaScript/Python
- Dynamic Task Allocation and Execution
- 3. Less code than traditional embedded programs
Experiment results (by ECMA-262 benchmark):
| Evaluation index | WebletScript | JerryScript | Duktape | Espruino |
|---|---|---|---|---|
| Compatibility (%) | 58.6 | 99.7 | 99.4 | 66.5 |
| Footprint (KB) | 80 | 168 | 184 | 231 |
Seaway EdgeSuite
- End AIOT Device: Seaway RTOS
- Edge AIOT Device: Seaway Edge
- Cloud: Seaway Cloud
Developers now need only one application for the whole end-edge-cloud system.
Auditing Kernel Design
Design Goals
- Enable kernel information monitoring for an event-driven RTOS; the monitoring should live inside the kernel.
- A lightweight resource auditing tool: less than 1KB ROM and 1KB RAM.
- Early security warning when an abnormal resource usage pattern is captured.
Auditing Kernel Design
- Process
  - Confirm the execution entity of a task
  - Locate the executable code segment
- Event
  - The event statistics of a task in the kernel
  - Identify abnormal event usage
- Hardware resource usage
  - Quantity and pattern of hardware resource consumption, including processor, memory, radio and sensors
Seaway Resource Auditing Overview
- 1. The Resource Auditor Module collects the running information and generates the log data of an AIoT device.
- 2. Seaway analyzes the log data on edge devices according to the corresponding resource usage model.
- 3. The AIoT devices receive the performance status.
Kernel Auditing Architecture
- Data Hook
  - Process-Event Model
  - Hardware Time-Base Model
- Data Processing Module
- Warning Handling Module
- Kernel inner loop function
  - The entity of a task
  - The executable code segment
  - Set up hooks in basic kernel functions such as do_poll / do_event
  - Save the data in the local file system
  - Or send it out to the gateway for analysis
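The idea of placing hooks inside basic kernel functions can be illustrated with a small host-side model. This is a sketch under stated assumptions, not SeawayRTOS code: `ToyKernel` and `AuditLog` are hypothetical names, and only the function names do_poll and do_event come from the slide. Each hook records which task was entered and which event triggered it before the task handler runs.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class AuditLog:
    """Hypothetical in-memory audit log: (task, hook, event) -> count."""
    counts: Counter = field(default_factory=Counter)

    def record(self, task: str, hook: str, event: str) -> None:
        self.counts[(task, hook, event)] += 1


class ToyKernel:
    """Toy stand-in for an event-driven kernel (assumption, not the real kernel)."""

    def __init__(self, log: AuditLog) -> None:
        self.log = log
        self.handlers = {}  # task name -> handler(event)

    def register(self, task, handler):
        self.handlers[task] = handler

    def do_event(self, task, event):
        # Hook: log the task entity and the event before dispatching.
        self.log.record(task, "do_event", event)
        self.handlers[task](event)

    def do_poll(self, task):
        # Hook: log every poll request the same way.
        self.log.record(task, "do_poll", "poll")
        self.handlers[task]("poll")


log = AuditLog()
kernel = ToyKernel(log)
seen = []  # events actually delivered to the task
kernel.register("sensor_task", seen.append)
kernel.do_event("sensor_task", "timer")
kernel.do_event("sensor_task", "timer")
kernel.do_poll("sensor_task")
```

The counts collected this way are the kind of per-task event statistics that could then be saved locally or sent to a gateway for analysis.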
Capture the Kernel Data for Hardware Resources
- Hardware resource scheduling
  - Quantity and pattern of the consumption of events and tasks
| Category | Component | Parameter | Kernel Events |
|---|---|---|---|
| Network Data Package | Network | wifi_init_result | WiFi init |
| | | wifi_mode | WiFi set_mode |
| | | wifi_state | WiFi On/Off |
| | | source, destination | source IP, destination IP, package_transfer |
| System Scheduling Data | Task Information | taskID | xTaskCreate |
| | | task_running_frequency | portYIELD, xPortSysTickHandler |
| Hardware Module Usage | CPU | CPU_Frequency | CPU frequency switch |
| | Sensors | nviroment_data | sensor_get_data |
| | | Sensors_Frequency | sensor frequency switch |
03 Experiments for Getting Bench Scores
Experiment Setup
- SeawayRTOS
  - An event-driven scheduling system
  - Multi-threaded
  - Lightweight threading technology: Protothreads
  - File system (Coffee)
  - Network support: lwIP
  - OTA
- CC2538 + ESP32
  - An ARM Cortex-M3 with up to 32MHz clock speed
  - 32KB of RAM
  - 256KB flash
  - Zigbee in CC2538
  - WIFI/BLE in ESP32
Evaluation
We capture the kernel event data and process information of a benchmark task running on SeawayRTOS.
Analysis Result of the TCP/IP Experiment with the Process-Event Model
- The Process-Event analysis result
  - Operations in periods 1056 and 1057 differ from the base behavior of this benchmarking task.
  - The system is using the radio to send data.
  - A warning is generated.
[Figure: process-event analysis result; x-axis: period]
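One simple way to express this kind of check is to compare the (process, event) pairs seen in each period against the set observed during base behavior. The sketch below assumes that representation. The event names and per-period data are invented for illustration, and the period numbers only mirror the slide. The slides do not describe the actual model used by Seaway.

```python
def flag_periods(baseline, periods):
    """Return the ids of periods containing a (process, event) pair
    that never appears in the baseline behavior."""
    allowed = set(baseline)
    return [pid for pid, pairs in sorted(periods.items())
            if set(pairs) - allowed]


# Hypothetical base behavior of the benchmark task.
baseline = [("tcpip", "timer"), ("tcpip", "packet_in")]

# Hypothetical per-period observations (illustrative data only).
periods = {
    1055: [("tcpip", "timer")],
    1056: [("tcpip", "timer"), ("tcpip", "radio_send")],
    1057: [("tcpip", "radio_send")],
    1058: [("tcpip", "packet_in")],
}

warnings = flag_periods(baseline, periods)
print(warnings)
```

A set difference flags only event types that never occur in the baseline. Detecting unusual frequencies of known events would need per-event counts instead.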
Analysis Result of the Time-Base Model
- The Time-Base analysis result
  - We obtained the working-state information of the CPU, memory, radio and sensors.
  - There are suspicious operations in periods 5 and 6 compared with the normal behavior of this application.
  - The system is using the radio to listen for other data.
  - A warning is generated, and the task should be suspended until the administrator decides.
[Figure: time-base analysis result; x-axis: period]
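A time-base check can be sketched as a comparison of per-period hardware activity against an expected profile. Everything below is an assumption made for illustration: the activity fractions, the expected profile, the tolerance, and the function names are hypothetical, and the slides do not specify the real model.

```python
# Hypothetical expected fraction of each period that a component is active.
EXPECTED = {"cpu": 0.30, "memory": 0.20, "radio": 0.05, "sensors": 0.10}
TOLERANCE = 0.10  # assumed allowed absolute deviation


def suspicious_components(usage):
    """Components whose active fraction deviates beyond the tolerance."""
    return sorted(name for name, frac in usage.items()
                  if abs(frac - EXPECTED[name]) > TOLERANCE)


def audit(periods):
    """Map period id -> deviating components, for deviating periods only."""
    result = {}
    for pid, usage in periods.items():
        bad = suspicious_components(usage)
        if bad:
            result[pid] = bad
    return result


# Illustrative data: the radio is unusually busy in periods 5 and 6.
periods = {
    4: {"cpu": 0.32, "memory": 0.20, "radio": 0.05, "sensors": 0.10},
    5: {"cpu": 0.30, "memory": 0.22, "radio": 0.40, "sensors": 0.10},
    6: {"cpu": 0.31, "memory": 0.20, "radio": 0.45, "sensors": 0.12},
    7: {"cpu": 0.30, "memory": 0.20, "radio": 0.06, "sensors": 0.10},
}

report = audit(periods)
print(report)
```

In a real deployment the flagged periods would feed the warning handling step, where the task is suspended pending an administrator's decision.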
04 Benchmarking Goal and Method
- 1. An open-source testbed board with sensors and radios
- 2. The main processor
Board components:
- A: Low Power BLE/WIFI Module
- B: MIC
- C: Accelerometers
- D: Temperature & Humidity
- E: multi-threaded Protothreads
- F: CMOS Image Sensor
- G: PIR (motion) sensor
- H: GPS
Run the Benchmark Tasks on Datasets
- MNIST database: handwritten digits
- CIFAR-10: objects classification
- Wechat Audio 100: keyword spotting (by Seaway Tech.)
- Chars74K dataset: character recognition
- Band Accelerator Data, 100 hours: pattern recognition
- Band Heart Rate, 100 hours: for DL and SVM algorithms (by Seaway Tech.)
We can provide baseline results on these datasets with our own implementations on STM32 and ESP32.
Benchmark Design
First Satisfy:
- 1. Benchmark Alg. Accuracy > baseline
- 2. Max ROM < baseline
- 3. Max RAM < baseline
- 4. Processor Cost
Compare: how much energy a single benchmark task costs, expressed in picojoules per op.
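The two-stage design (first satisfy the baseline constraints, then compare energy) can be sketched as a filter followed by a sort. This is a minimal illustration, not the benchmark's actual scoring code. The class and field names are hypothetical, the numbers are made up, and the "Processor Cost" criterion is left out because the slide gives no condition for it.

```python
from dataclasses import dataclass


@dataclass
class Baseline:
    accuracy: float
    max_rom_kb: float
    max_ram_kb: float


@dataclass
class Result:
    name: str
    accuracy: float    # fraction of correct predictions
    max_rom_kb: float
    max_ram_kb: float
    energy_uj: float   # energy of one benchmark task, in microjoules
    ops: int           # operations executed by that task

    def pj_per_op(self) -> float:
        return self.energy_uj * 1e6 / self.ops  # 1 uJ = 1e6 pJ


def qualifies(r: Result, base: Baseline) -> bool:
    """Stage 1: accuracy above baseline, ROM and RAM below baseline."""
    return (r.accuracy > base.accuracy
            and r.max_rom_kb < base.max_rom_kb
            and r.max_ram_kb < base.max_ram_kb)


def rank(results, base):
    """Stage 2: order qualifying results by energy per operation."""
    return sorted((r for r in results if qualifies(r, base)),
                  key=Result.pj_per_op)


base = Baseline(accuracy=0.90, max_rom_kb=512, max_ram_kb=128)
results = [
    Result("B", 0.95, 400, 100, energy_uj=120, ops=2_000_000),
    Result("A", 0.92, 300, 64, energy_uj=50, ops=1_000_000),
    Result("C", 0.85, 200, 32, energy_uj=10, ops=1_000_000),  # fails accuracy
]
ranked = rank(results, base)
print([(r.name, r.pj_per_op()) for r in ranked])
```

Candidate C uses the least energy but is excluded because it misses the accuracy baseline, which is the point of gating before comparing.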
Thanks
Dong Li Seaway Technology Inc. lidong@haiwei.tech
Comparison
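Systems compared, in column order: AliOS Things, Amazon FreeRTOS, Microsoft ThreadX, Seaway
- License: community edition open source; small part open source; closed source; community edition open source
- Base kernel footprint: 8KB; 8KB; 10KB; 8KB
- Device-side application-layer protocol stack: separate stack per protocol, 80K; MQTT protocol stack, 20K; proprietary protocol, 80K; MCH integrated stack, 32KB
- ML inference model support: -; supported; supported; supported
- Low-power control: -; supported; supported (<0.1 W)
- Edge computing support: -; supported; supported; supported
- Native security mechanism: -; supported
- Third-party application support: device and cloud separate; device-cloud integrated; device-cloud integrated; end-edge-cloud integrated
- IoT cloud service: bound to Alibaba Cloud; bound to AWS; bound to Azure; free choice
- AI math library support: -; up to Cortex-A class; -; up to Cortex-M class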