INSTITUTE OF COMPUTING TECHNOLOGY
BenchCouncil AIBench: A Datacenter AI Benchmark Suite
Wanling Gao, Fei Tang, Jianfeng Zhan
http://www.benchcouncil.org/AIBench/index.html
BenchCouncil
Bench’19, Denver, Colorado, USA
AIBench Bench’19
• processing images, video, speech, and audio
• There is an urgent need for datacenter AI benchmarks
• Isolation!
• Various modules and complex execution paths
• Massive scale and complex hierarchy of infrastructure
An end-to-end benchmark that models the critical paths and primary modules is needed
[Figure: a request is split into suboperations across Component 1 to Component n, and their results are merged into the response]
Bianco, S., Cadene, R., Celona, L., and Napoletano, P. Benchmark analysis of representative deep neural network architectures. IEEE Access, 6:64270–64277, 2018.
• State-of-the-art accuracy
• SPEC CPU 2017 (43), PARSEC 3.0 (30), TPC-DS (99)
• TOP500
Earlier stage of architecture research: micro or component benchmark?
Later stage of architecture research: application benchmark?
• Critical paths and primary modules of a business AI scenario
• Collectively, as a whole end-to-end application
• Individually, as a micro or component benchmark
• Diverse AI problem domains and datasets are needed
• Tasks, Models, Datasets, Metrics
• Contributors: many companies and top
http://www.benchcouncil.org/AIBench/index.html
Wanling Gao, Fei Tang, Lei Wang, Jianfeng Zhan, Chunxin Lan, Chunjie Luo, et al. AIBench: An Industry Standard Internet Service AI Benchmark Suite. Technical Report 2019. arXiv preprint arXiv:1908.08998.
• The first end-to-end industry-standard AI benchmark suite
• Industry-scale Internet services
• A highly extensible, configurable, and flexible benchmark framework
• 16 prominent AI problem domains
• Multiple loosely coupled modules
– Micro/Component benchmarks
– Application benchmarks
• Text Processing (4)
– Text-to-Text translation, Text summarization, Learning to rank, Recommendation
• Image Processing (8)
– Image classification, Image generation, Image-to-text, Image-to-Image, Face embedding, Object detection, Image compression, Spatial transformer
• Audio Processing (1)
– Speech recognition
• Video Processing (1)
– Video prediction
• 3D Data Processing (2)
– 3D face recognition, 3D object reconstruction
• Query generator: simulates concurrent users and sends query requests
• Online module: personalized searching and recommendations
• Offline module: a training stage to generate a learning model
• Data storage module: data storage, e.g., user database, product database
• a supervised learning problem to define a set of target
• ResNet neural network
• Dataset: ImageNet2012, 100GB+
• Dataset: LSUN, about a million labelled images
• Model: WGAN algorithm
• Model: Transformer
• Dataset: WMT English-German (4.5MB training text data)
• Model: CycleGAN algorithm
• Dataset: Cityscapes, from 50+ cities (300MB)
• Model: Deep Speech 2
• Dataset: LibriSpeech, 1000+ hours of speech data
• Model: Faster R-CNN algorithm
• Dataset: MSCOCO2014
• Model: Neural Image Caption model
• Dataset: MSCOCO2014
• Model: FaceNet algorithm
• Dataset: VGGFace2
• Model: 3D face models
• Dataset: Intellifusion data set, 77,715 samples from 253
• Model: motion-focused predictive models
• Dataset: Robot pushing dataset
• Model: recurrent neural networks
• Dataset: ImageNet2012, 100GB+
• Model: collaborative filtering algorithm
• Dataset: MovieLens
• Model: a convolutional encoder-decoder network
• Dataset: ShapeNet
• Model: sequence-to-sequence model
• Dataset: Gigaword
• Model: spatial transformer networks
• Dataset: MNIST
• Model: ranking distillation
• Dataset: Gowalla
Query Generator
• Concurrency
• Arrival rate
• Distribution
• Think time
System under Test
Monitoring Tools
Result Outputs
• Accuracy
• Latency
• Tail latency
• Throughput
Datasets
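The query generator parameters listed above (concurrency, arrival distribution, think time) can be illustrated with a minimal sketch. This is not AIBench's query generator: the function names, the exponentially distributed think time, and `fake_send_query` (a stand-in that only sleeps instead of contacting a system under test) are all assumptions made for illustration.

```python
import random
import threading
import time


def fake_send_query(rng):
    """Hypothetical stand-in for a real request to the system under test."""
    time.sleep(rng.uniform(0.001, 0.003))


def run_user(user_id, num_queries, mean_think_time, latencies, lock, seed):
    """One simulated user: think, send a query, record the observed latency."""
    rng = random.Random(seed + user_id)
    for _ in range(num_queries):
        # Exponentially distributed think time is one possible choice of distribution.
        time.sleep(rng.expovariate(1.0 / mean_think_time))
        start = time.perf_counter()
        fake_send_query(rng)
        elapsed = time.perf_counter() - start
        with lock:
            latencies.append(elapsed)


def generate_load(concurrency=4, queries_per_user=5, mean_think_time=0.002, seed=0):
    """Run `concurrency` simulated users in parallel.

    Returns (list of per-query latencies in seconds, total wall time in seconds).
    """
    latencies = []
    lock = threading.Lock()
    threads = [
        threading.Thread(
            target=run_user,
            args=(u, queries_per_user, mean_think_time, latencies, lock, seed),
        )
        for u in range(concurrency)
    ]
    wall_start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return latencies, time.perf_counter() - wall_start


if __name__ == "__main__":
    lat, wall = generate_load()
    print(f"{len(lat)} queries in {wall:.3f} s ({len(lat) / wall:.1f} queries/s)")
```

Raising `concurrency` or lowering `mean_think_time` increases the offered load; the recorded latencies can then be fed into whatever latency and throughput summaries the experiment needs.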
• Latency, tail latency
• Latency-bounded throughput
• Throughput, energy consumption
• Accuracy deviation with target accuracy is within
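As a rough sketch of how the online metrics above could be computed from raw measurements: the nearest-rank percentile, the choice of the 99th percentile as the tail, and the reading of latency-bounded throughput as "the highest throughput among runs whose tail latency stays within a bound" are assumptions for illustration, not definitions taken from AIBench.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies_ms, duration_s):
    """Average latency, 99th percentile (tail) latency, and throughput of one run."""
    return {
        "avg_ms": sum(latencies_ms) / len(latencies_ms),
        "p99_ms": percentile(latencies_ms, 99),
        "throughput_qps": len(latencies_ms) / duration_s,
    }


def latency_bounded_throughput(runs, p99_bound_ms):
    """Highest throughput among runs whose 99th percentile latency meets the bound."""
    ok = [r["throughput_qps"] for r in runs if r["p99_ms"] <= p99_bound_ms]
    return max(ok) if ok else 0.0


if __name__ == "__main__":
    # Made-up latencies of 1..100 ms collected over a 10-second run.
    print(summarize(list(range(1, 101)), duration_s=10))
```

Each run summarized this way corresponds to one load level; sweeping the load and passing the summaries to `latency_bounded_throughput` gives a single number per latency bound.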
System under Test
Monitoring Tools
Result Outputs
• Accuracy
• Latency
• Tail latency
• Throughput
Datasets
• Time-to-accuracy
• Energy-to-accuracy
• Throughput
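The training metrics above can be sketched as follows, assuming a training log of (elapsed seconds, validation accuracy) pairs and a constant average power draw. The log format, the function names, and the constant-power simplification are assumptions for illustration; a real measurement would integrate power readings over time.

```python
def time_to_accuracy(log, target):
    """Elapsed seconds at the first log entry whose accuracy reaches `target`, else None."""
    for elapsed_s, accuracy in log:
        if accuracy >= target:
            return elapsed_s
    return None


def energy_to_accuracy(log, target, avg_power_watts):
    """Energy in joules spent until the target accuracy, assuming constant average power."""
    elapsed_s = time_to_accuracy(log, target)
    return None if elapsed_s is None else elapsed_s * avg_power_watts


if __name__ == "__main__":
    # Made-up log: (elapsed seconds, validation accuracy) after each epoch.
    log = [(60, 0.50), (120, 0.68), (180, 0.74), (240, 0.76)]
    print(time_to_accuracy(log, 0.74))
    print(energy_to_accuracy(log, 0.74, avg_power_watts=250.0))
```

Returning `None` when the target is never reached keeps a non-converging run from being mistaken for a fast one.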
• Each component can be deployed in a distributed manner on a
• Single GPU, multiple GPUs, and distributed versions
• Example distributed training setting
• Tasks, Models, Datasets, Metrics
• Prepare the AIBench package
• Prepare the environment for the selected software stack
• Prepare the corresponding dataset
• Run the scripts or commands (see the User Manual!)
– run-tensorflow.sh (TensorFlow), run-pthread.sh (Pthreads)
– run_train_time.sh (training stage), run_val_time.sh (inference stage)
– Start the online and offline modules: neo4j, Elasticsearch, Recommender, Search-planer
• http://www.benchcouncil.org/AIBench/download.html
• User Manual
– http://www.benchcouncil.org/AIBench/files/AIBench-User-Manual.pdf
AIBench
• Micro Benchmark
– Pthreads: 12 benchmarks
– TensorFlow: 12 benchmarks
• Component Benchmark
– TensorFlow: 16 benchmarks
– PyTorch: 16 benchmarks
• Application Benchmark
– Offline Module: 10 benchmarks
– Online Module: online benchmarks
• http://125.39.136.212:8090/AIBench/aibench_framework
• http://125.39.136.212:8090/AIBench/aibench_application_benchmark
• http://125.39.136.212:8090/AIBench/DC_AIBench_Component
• http://125.39.136.212:8090/AIBench/DC_AIBench_Micro
• Log in and apply for nodes!
• Model: Neural Image Caption model
• Dataset: MSCOCO2014
• Apply for nodes on the testbed
• Training or inference
cd DC_AIBench_Component/TensorFlow/Image_to_Text/tf-models/research/im2txt
./run_train_time.sh
./run_val_time.sh
• ResNet neural network
• Dataset: ImageNet2012
• Apply for nodes on the testbed
• Training or inference
cd DC_AIBench_Component/PyTorch/Image_classification
./run_train_time.sh
./run_val_time.sh
• Model: Transformer
• Dataset: WMT English-German
• Apply for nodes on the testbed
• Training or inference
cd DC_AIBench_Component/PyTorch/Text_to_Text
./run_train_time.sh
./run_val_time.sh
• Tasks, Models, Datasets, Metrics
• AI components change the critical path significantly
– 34.29 vs. 49.07 milliseconds average latency
• Model depth and size limit QoS
– 99th percentile latency increases from 149.12 to 5335.12 milliseconds when the model size increases from 184 MB to 253 MB
• AI-related components suffer from more cache misses
– 61 vs. 37 L2 cache misses per kilo instructions
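To put these figures in relative terms, the short sketch below computes misses per kilo instructions (MPKI) from raw counters and the ratios between the reported numbers. The counter values passed to `mpki` are hypothetical, chosen only so that the result matches the reported 61; the latency and model-size figures are the ones quoted above.

```python
def mpki(cache_misses, instructions):
    """Cache misses per kilo (1000) instructions."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return cache_misses * 1000.0 / instructions


def ratio(after, before):
    """How many times larger `after` is than `before`."""
    return after / before


if __name__ == "__main__":
    # Hypothetical counters: 61,000 L2 misses over 1,000,000 instructions.
    print(mpki(61_000, 1_000_000))            # 61.0
    # Reported figures: model size 184 MB -> 253 MB, p99 latency 149.12 ms -> 5335.12 ms.
    print(ratio(253, 184))                    # 1.375
    print(round(ratio(5335.12, 149.12), 1))   # 35.8
```

In other words, a model about 1.4 times larger comes with a 99th percentile latency roughly 36 times higher in the reported measurement.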
• Different models have different execution efficiency
• Learning_to_rank has the lowest efficiency
• Six categories
• Using nvprof to trace the running-time breakdown and find the hotspot functions that together occupy more than 80% of the running time
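The hotspot selection described above can be sketched as a cumulative cutoff over per-function running times. The kernel names and timings below are made up, and parsing of actual nvprof output is deliberately left out; only the "top functions that together exceed 80% of the time" step is shown.

```python
def hotspots(function_times, threshold=0.80):
    """Smallest set of top functions whose combined share of time exceeds `threshold`.

    Returns (list of function names, combined share of total running time).
    """
    total = sum(function_times.values())
    if total <= 0:
        raise ValueError("total running time must be positive")
    selected, covered = [], 0.0
    ranked = sorted(function_times.items(), key=lambda kv: kv[1], reverse=True)
    for name, elapsed in ranked:
        if covered / total > threshold:
            break
        selected.append(name)
        covered += elapsed
    return selected, covered / total


if __name__ == "__main__":
    # Hypothetical per-function times in milliseconds.
    times = {"conv": 50, "gemm": 25, "relu": 10, "batch_norm": 8, "memcpy": 7}
    print(hotspots(times))
```

Sorting by time first keeps the selected set as small as possible, which is what makes the result usable as a list of functions to study or optimize.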
• Memory dependency stalls, execution dependency stalls
• Tasks, Models, Datasets, Metrics
• Benchmarking
• AIBench: An Industry Standard Internet Service AI Benchmark Suite. Technical Report, 2019.
• AIBench: Towards Scalable and Comprehensive Datacenter AI Benchmarking. Bench’18.
• HPC AI500: A Benchmark Suite for HPC AI Systems. Bench’18.
• Edge AIBench: Towards Comprehensive End-to-end Edge Computing Benchmarking. Bench’18.
• AIoT Bench: Towards Comprehensive Benchmarking Mobile and Embedded device
• Data Motifs: A Lens Towards Fully Understanding Big Data and AI Workloads. PACT’18.
• BigDataBench: a Big Data Benchmark Suite from Internet Services. HPCA’14.
• Data Motif-based Proxy Benchmarks for Big Data and AI Workloads. IISWC’18.
• Auto-tuning Spark Big Data Workloads on POWER8: Prediction-Based Dynamic SMT. PACT’16.
• CVR: Efficient Vectorization of SpMV on X86 Processors. CGO’18.
• Characterizing data analysis workloads in data centers. IISWC’13, best paper award.