SLIDE 1

AutoML for Object Detection

Xiangyu Zhang MEGVII Research

SLIDE 2

AutoML for Object Detection

  • Advances in AutoML
  • Search for Detection Systems


SLIDE 4

Introduction

• AutoML

  • A meta-approach to generate machine learning systems
  • Automatically search vs. manually design

• AutoML for Deep Learning

  • Neural architecture search (NAS)
  • Hyper-parameter tuning
  • Loss function
  • Data augmentation
  • Activation function
  • Backpropagation

SLIDE 5

Revolution of AutoML

• ImageNet 2012

  • Hand-crafted features vs. deep learning

• Era of Deep Learning begins!

[Chart] ImageNet Classification Top-5 Error (%): OXFORD 27, ISI 26.2, AlexNet 16.4, SPPnet 8.1, VGG 7.3, GoogleNet 6.6, PReLU 4.9, ResNet-152 3.57

SLIDE 6

Revolution of AutoML (cont’d)

• ImageNet 2017

  • Manual architectures vs. AutoML models

[Chart] ImageNet Classification Top-1 Error (%): ResNeXt-101 19.1, SENet 17.3, NASNet-A 17.3, PNASNet-5 17.1, AmoebaNet-A 16.1, EfficientNet 15.6

Era of AutoML?

SLIDE 7

Revolution of AutoML (cont’d)

• Literature

  • 200+ papers since 2017
SLIDE 8

Revolution of AutoML (cont’d)

• Literature

  • 200+ papers since 2017

• Google Trends

SLIDE 9

Recent Advances in AutoML (1)

• Surpassing handcrafted models

  • NASNet

• Keynotes

  • RNN controller + policy gradient
  • Flexible search space
  • Proxy task needed

Zoph et al. Neural Architecture Search with Reinforcement Learning
Zoph et al. Learning Transferable Architectures for Scalable Image Recognition
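The controller-plus-policy-gradient loop above can be sketched in a few lines. This is a toy REINFORCE-style illustration, not the actual NASNet controller: the op list, layer count, proxy reward, baseline, and learning rate are all invented for the demo.

```python
# Toy sketch of an RL architecture controller (REINFORCE-style policy
# gradient). The RNN controller of Zoph et al. is replaced here by
# independent per-layer logits; the "evaluator" is a fake proxy reward.
import math
import random

OPS = ["conv3x3", "conv5x5", "maxpool", "identity"]  # candidate ops (invented)
N_LAYERS = 4
LR = 0.5

# Policy parameters: one logit vector per layer.
logits = [[0.0] * len(OPS) for _ in range(N_LAYERS)]

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def sample_architecture():
    """Sample one op index per layer from the current policy."""
    arch = []
    for layer in logits:
        probs = softmax(layer)
        arch.append(random.choices(range(len(OPS)), weights=probs)[0])
    return arch

def proxy_reward(arch):
    """Fake evaluator: pretend conv3x3 everywhere is best. This stands in
    for training and validating the child network on a proxy task."""
    return sum(1.0 for op in arch if op == 0) / len(arch)

def reinforce_step(baseline=0.5):
    """Policy-gradient update: grad of log softmax is (indicator - prob)."""
    arch = sample_architecture()
    advantage = proxy_reward(arch) - baseline
    for layer, choice in zip(logits, arch):
        probs = softmax(layer)
        for k in range(len(OPS)):
            grad = (1.0 - probs[k]) if k == choice else -probs[k]
            layer[k] += LR * advantage * grad

random.seed(0)
for _ in range(300):
    reinforce_step()

# After training, the policy should favor the op the proxy reward prefers.
best = [max(range(len(OPS)), key=lambda k: layer[k]) for layer in logits]
```

The same loop structure (sample, evaluate, update the controller) underlies the RL-based methods on the following slides; only the evaluator cost changes.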

SLIDE 10

Recent Advances in AutoML (2)

• Search on the target task

  • MnasNet

• Keynotes

  • Search directly on ImageNet
  • Platform-aware search
  • Very costly (thousands of TPU-days)

Tan et al. MnasNet: Platform-Aware Neural Architecture Search for Mobile

SLIDE 11

Recent Advances in AutoML (3)

• Weight Sharing for Efficient Search & Evaluation

  • ENAS
  • One-shot methods

• Keynotes

  • Super network
  • Finetuning & inference only instead of retraining
  • Inconsistency in super-net evaluation

Pham et al. Efficient Neural Architecture Search via Parameter Sharing
Bender et al. Understanding and Simplifying One-Shot Architecture Search
Guo et al. Single Path One-Shot Neural Architecture Search with Uniform Sampling

SLIDE 12

Recent Advances in AutoML (4)

• Gradient-based methods

  • DARTS
  • SNAS, FBNet, ProxylessNAS, …

• Keynotes

  • Joint optimization of architectures and weights
  • Weight sharing implied
  • Sometimes less flexible

Liu et al. DARTS: Differentiable Architecture Search
Xie et al. SNAS: Stochastic Neural Architecture Search
Cai et al. ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware
Wu et al. FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search
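The "joint optimization of architectures and weights" works by continuous relaxation: every candidate op stays in the graph, weighted by a softmax over architecture parameters α, so α itself can be trained by gradient descent. A minimal single-edge sketch, with toy scalar ops and a toy target invented for the demo:

```python
# DARTS-style continuous relaxation on a single edge: the output is a
# softmax(alpha)-weighted sum over ALL candidate ops, and alpha is updated
# by gradient descent on the loss. Ops here are scalar functions, not layers.
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

OPS = [lambda x: x, lambda x: 2 * x, lambda x: 0.0]  # toy candidate ops
alpha = [0.0, 0.0, 0.0]  # architecture parameters

def mixed_op(x):
    """Continuous relaxation: softmax(alpha)-weighted sum of op outputs."""
    w = softmax(alpha)
    return sum(wi * op(x) for wi, op in zip(w, OPS))

def alpha_grad(x, target):
    """d(loss)/d(alpha) for loss = (mixed_op(x) - target)^2, by hand,
    using d w_i / d alpha_j = w_i * (delta_ij - w_j)."""
    w = softmax(alpha)
    out = mixed_op(x)
    douts = [op(x) for op in OPS]
    grads = []
    for j in range(len(alpha)):
        dout = sum(w[i] * ((1 if i == j else 0) - w[j]) * douts[i]
                   for i in range(len(OPS)))
        grads.append(2 * (out - target) * dout)
    return grads

# Target behaviour is "double the input", so the search should put its
# weight on the 2*x op.
for _ in range(200):
    g = alpha_grad(1.0, 2.0)
    alpha = [a - 0.5 * gi for a, gi in zip(alpha, g)]

chosen = max(range(len(OPS)), key=lambda i: alpha[i])  # discretization step
```

The final `argmax` over α is the discretization step; the "sometimes less flexible" keynote refers to this relaxation forcing a fixed menu of ops per edge.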

SLIDE 13

Recent Advances in AutoML (5)

• Performance Predictor

  • Neural Architecture Optimization
  • ChamNet

• Keynotes

  • Architecture encoding
  • Performance prediction models
  • Cold-start problem

Luo et al. Neural Architecture Optimization
Dai et al. ChamNet: Towards Efficient Network Design through Platform-Aware Model Adaptation
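The predictor idea reduces to three steps: encode architectures as vectors, fit a cheap model on already-evaluated (architecture, accuracy) pairs, then rank unseen candidates by prediction. A linear-regression sketch with an invented one-hot encoding and synthetic ground truth (NAO and ChamNet use learned encoders and real measurements):

```python
# Performance-predictor sketch: one-hot architecture encoding + a linear
# model fit by SGD. The "true_accuracy" oracle is synthetic, standing in
# for the expensive evaluations that seed the predictor (the cold start).
import random

N_LAYERS = 5  # each architecture: one op id (0..3) per layer

def encode(arch):
    """One-hot encoding of the op choice at every layer."""
    vec = []
    for op in arch:
        vec.extend(1.0 if k == op else 0.0 for k in range(4))
    return vec

def true_accuracy(arch):
    """Hidden ground truth for the demo: op 2 is good, op 3 is bad."""
    return 0.5 + 0.1 * arch.count(2) - 0.1 * arch.count(3)

def fit_linear(xs, ys, epochs=500, lr=0.05):
    """Least-squares fit by per-sample gradient descent."""
    w = [0.0] * len(xs[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            err = b + sum(wi * xi for wi, xi in zip(w, x)) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return lambda x: b + sum(wi * xi for wi, xi in zip(w, x))

random.seed(1)
train = [[random.randrange(4) for _ in range(N_LAYERS)] for _ in range(80)]
predictor = fit_linear([encode(a) for a in train],
                       [true_accuracy(a) for a in train])

good, bad = [2] * N_LAYERS, [3] * N_LAYERS  # the predictor should rank these
```

The cold-start keynote is visible even in the sketch: the 80 "training" architectures must be evaluated the expensive way before the predictor is of any use.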

SLIDE 14

Recent Advances in AutoML (6)

• Hardware-aware Search

  • Search with complexity budget
  • Quantization friendly
  • Energy-aware search

• Keynotes

  • Complexity-aware loss & reward
  • Multi-target search
  • Device in the loop

Wu et al. Mixed Precision Quantization of ConvNets via Differentiable Neural Architecture Search
Véniat et al. Learning Time/Memory-Efficient Deep Architectures with Budgeted Super Networks
Wang et al. HAQ: Hardware-Aware Automated Quantization with Mixed Precision
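One concrete form of a "complexity-aware reward" is the MnasNet-style multi-objective reward, accuracy scaled by (latency / target)^w, which softly penalizes models over budget. The exponent and the numbers below are illustrative:

```python
# MnasNet-style multi-objective reward: acc * (latency / target)^w with a
# small negative exponent w, so slower-than-target models lose reward and
# faster ones gain a little. The w value here is only an example.

def complexity_aware_reward(acc, latency_ms, target_ms, w=-0.07):
    """Scalar reward trading accuracy off against measured latency."""
    return acc * (latency_ms / target_ms) ** w

fast = complexity_aware_reward(0.74, 60.0, 80.0)   # under budget: small bonus
slow = complexity_aware_reward(0.76, 160.0, 80.0)  # 2x over budget: penalized
```

With these numbers the slightly less accurate but much faster model wins, which is exactly the behaviour a "device in the loop" search wants: measured latency, not a FLOPs proxy, feeds the reward.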

SLIDE 15

Recent Advances in AutoML (7)

• AutoML in Model Pruning

  • NetAdapt
  • AMC
  • MetaPruning

• Keynotes

  • Search for the pruned architecture
  • Hyper-parameters like channels, spatial size, …

Yang et al. NetAdapt: Platform-Aware Neural Network Adaptation for Mobile Applications
He et al. AMC: AutoML for Model Compression and Acceleration on Mobile Devices
Liu et al. MetaPruning: Meta Learning for Automatic Neural Network Channel Pruning

SLIDE 16

Recent Advances in AutoML (8)

• Handcraft + NAS

  • Human-expert-guided search (IRLAS)
  • Boosting existing handcrafted models (EfficientNet, MobileNet v3)

• Keynotes

  • Very competitive performance
  • Efficient
  • Search space may be restricted

Howard et al. Searching for MobileNetV3
Tan et al. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
Guo et al. IRLAS: Inverse Reinforcement Learning for Architecture Search

SLIDE 17

Recent Advances in AutoML (9)

• Various Tasks

  • Object Detection
  • Semantic Segmentation
  • Super-resolution
  • Face Recognition
  • …

• Not only NAS: search for everything!

  • Activation function
  • Loss function
  • Data augmentation
  • Backpropagation

Liu et al. Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation
Chu et al. Fast, Accurate and Lightweight Super-Resolution with Neural Architecture Search
Ramachandran et al. Searching for Activation Functions
Alber et al. Backprop Evolution

SLIDE 18

Recent Advances in AutoML (10)

• Rethinking the Effectiveness of NAS

  • Random search
  • Randomly wired networks

• Keynotes

  • Reproducibility
  • Search algorithm or search space?
  • Baselines

Li et al. Random Search and Reproducibility for Neural Architecture Search
Xie et al. Exploring Randomly Wired Neural Networks for Image Recognition

SLIDE 19

Summary: Trends and Challenges

• Trends

  • Efficient & high-performance algorithms
  • Flexible search space
  • Device-aware optimization
  • Multi-task / multi-target search

• Challenges

  • Trade-offs between efficiency, performance, and flexibility
  • Search space matters!
  • Fair benchmarks
  • Pipeline search

[Diagram] Trade-off triangle: Efficiency / Flexibility / Performance

SLIDE 20

AutoML for Object Detection

  • Advances in AutoML
  • Search for Detection Systems


SLIDE 21

AutoML for Object Detection

• Components to search

  • Image preprocessing
  • Backbone
  • Feature fusion
  • Detection head & loss function

SLIDE 26

Search for Detection Systems

Backbone Feature Fusion Augmentation

DetNAS

Chen et al. DetNAS: Backbone Search for Object Detection

SLIDE 27

Challenges of Backbone Search

• Similar to general NAS, but …

  • Controller & evaluator loop
  • Performance evaluation is very slow

• Detection backbone evaluation involves a costly pipeline

  • ImageNet pretraining
  • Finetuning on the detection dataset (e.g. COCO)
  • Evaluation on the validation set
SLIDE 28

Related Work: Single Path One-shot NAS

• Decoupled weight training and architecture optimization
• Super-net training

Guo et al. Single Path One-Shot Neural Architecture Search with Uniform Sampling
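The decoupling can be sketched directly: the super net owns all candidate weights, each training step samples one path uniformly and updates only that path's weights, and afterwards any path can be scored with inherited weights. The "blocks" below are toy scalar multipliers, not real layers, and the regression target is invented:

```python
# Single-path one-shot sketch: uniform path sampling over a super net of
# choice blocks; each SGD step touches only the sampled path's weights.
import random

N_BLOCKS = 3
N_CHOICES = 4
# weights[b][c]: the parameter owned by candidate c of choice block b
weights = [[1.0] * N_CHOICES for _ in range(N_BLOCKS)]

def sample_path():
    """Uniform sampling: one candidate per choice block."""
    return [random.randrange(N_CHOICES) for _ in range(N_BLOCKS)]

def forward(path, x):
    """A 'network' is just the product of the chosen blocks' weights."""
    for b, c in enumerate(path):
        x = x * weights[b][c]
    return x

def train_step(path, x, target, lr=0.01):
    """SGD on the sampled path only; other candidates stay untouched."""
    out = forward(path, x)
    err = out - target
    for b, c in enumerate(path):
        grad = 2 * err * out / weights[b][c]  # d out / d w = out / w
        weights[b][c] -= lr * grad

random.seed(0)
for _ in range(2000):
    train_step(sample_path(), 1.0, 8.0)  # drive every path's output to 8

# Any path can now be evaluated with inherited weights, no retraining --
# which is what makes the later evolutionary search cheap.
some_path_out = forward([0, 1, 2], 1.0)
```

The "inconsistency in super-net evaluation" keynote from the earlier slide shows up here too: inherited weights only rank paths approximately, since no single path was trained to convergence on its own.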

SLIDE 29

Pipeline

• Single-pass approach

  • Pretrain and finetune super net only once
SLIDE 30

Search Space

• Single-path super net

  • 20 (small) or 40 (large) choice blocks
  • 4 candidates for each choice block
  • Search space size: 4^20 or 4^40
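The quoted sizes follow from independent choices, 4 candidates per block multiplied across 20 or 40 blocks (variable names below are mine):

```python
# Search-space size check: 4 independent candidates per choice block.
small_space = 4 ** 20   # 20 choice blocks (small super net)
large_space = 4 ** 40   # 40 choice blocks (large super net)
```

At roughly 10^12 and 10^24 configurations, exhaustive evaluation is hopeless, which is why the next slide turns to evolutionary search over super-net weights.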
SLIDE 31

Search Algorithm

• Evolutionary search

  • Sample & reuse the weights from super net
  • Very efficient
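The sample-and-reuse loop is ordinary evolutionary search over path encodings; only the fitness call changes (in DetNAS it would be detection mAP measured with weights inherited from the super net). A toy version with an invented fitness function and arbitrary sizes and rates:

```python
# Evolutionary search over path encodings: score candidates, keep the top
# parents, produce children by crossover + mutation. The fitness function
# is a fake stand-in for super-net evaluation.
import random

N_BLOCKS, N_CHOICES = 8, 4

def fitness(path):
    """Fake evaluator: candidate 1 is secretly best in every block."""
    return path.count(1) / len(path)

def mutate(path, p=0.2):
    return [random.randrange(N_CHOICES) if random.random() < p else c
            for c in path]

def crossover(a, b):
    return [random.choice(pair) for pair in zip(a, b)]

def evolve(pop_size=20, n_parents=5, generations=15):
    random.seed(0)
    pop = [[random.randrange(N_CHOICES) for _ in range(N_BLOCKS)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:n_parents]          # elitism: keep the best as-is
        children = []
        while len(children) < pop_size - n_parents:
            a, b = random.sample(parents, 2)
            children.append(mutate(crossover(a, b)))
        pop = parents + children
    return max(pop, key=fitness)

best = evolve()
```

Because evaluating `fitness` needs only inference through the shared weights, thousands of candidates can be scored at a tiny fraction of the cost of training each one, which is the "very efficient" claim above.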
SLIDE 32

Results

• High performance

  • Significant improvements over commonly used backbones (e.g. ResNet-50) with fewer FLOPs
  • Best classification backbones may be suboptimal for object detection
SLIDE 33

Results

• Search cost

  • Super nets greatly speed up the search!
SLIDE 34

Search for Detection Systems

Backbone Feature Fusion Augmentation

NAS-FPN

Ghaisi et al. NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection

SLIDE 35

Feature Fusion Modules

• Multi-scale feature fusion

  • Used in state-of-the-art detectors (e.g. SSD, FPN, SNIP, FCOS, …)

• Automatic search vs. manual design

SLIDE 36

First Glance

• Searched architecture

  • Very different from handcrafted structures
SLIDE 37

Search Space

• Stacking repeated FPN blocks
• For each FPN block, N different merging cells
• For each merging cell, a 4-step generation process
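The 4-step generation for one merging cell can be sketched directly: pick a first input feature, pick a second, pick an output resolution, pick the binary op that merges them. Features below are 1-D lists standing in for pyramid feature maps, and the op set is simplified to sum/max (NAS-FPN's actual binary ops are sum and global-pooling attention):

```python
# NAS-FPN merging-cell sketch: each cell makes 4 choices and appends a new
# feature, which later cells may reuse. Toy 1-D "feature maps" throughout.
import random

def resize(feat, size):
    """Nearest-neighbour resize of a 1-D 'feature map'."""
    n = len(feat)
    return [feat[min(int(i * n / size), n - 1)] for i in range(size)]

def merging_cell(features, rng):
    """One 4-step generation: returns the new merged feature."""
    f1 = rng.choice(features)              # step 1: first input feature
    f2 = rng.choice(features)              # step 2: second input feature
    out_size = len(rng.choice(features))   # step 3: output resolution
    a, b = resize(f1, out_size), resize(f2, out_size)
    op = rng.choice(["sum", "max"])        # step 4: binary merge op
    if op == "sum":
        return [x + y for x, y in zip(a, b)]
    return [max(x, y) for x, y in zip(a, b)]

rng = random.Random(0)
# A tiny 3-level "pyramid": resolutions 8, 4, 2.
pyramid = [[1.0] * 8, [2.0] * 4, [3.0] * 2]
for _ in range(3):  # N merging cells per FPN block
    pyramid.append(merging_cell(pyramid, rng))
```

Letting each new feature re-enter the candidate pool is what produces the unusual cross-scale connections of the searched architecture on the previous slide.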

SLIDE 38

Search Algorithm

• Controller

  • RNN-based controller
  • Search with Proximal Policy Optimization (PPO)

• Candidate evaluation

  • Training a lightweight proxy task
SLIDE 39

Architectures During Search

• Many downsamples and upsamples

SLIDE 40

Results

• State-of-the-art speed/AP trade-off

SLIDE 41

Search for Detection Systems

Backbone Feature Fusion Augmentation

Auto-Augment for Detection

Zoph et al. Learning Data Augmentation Strategies for Object Detection

SLIDE 42

Data Augmentation for Object Detection

• Augmentation pool

  • Color distortions
  • Geometric transforms
  • Random noise (e.g. cutout, DropBlock, …)
  • Mix-up

• Search for the best augmentation configurations

SLIDE 43

Search Space Design

• Mainly follows AutoAugment
• Random sampling from K sub-policies
• For each sub-policy, N image transforms
• Each image transform selected from 22 operations:

  • Color operations
  • Geometric operations
  • Bounding box operations

Cubuk et al. AutoAugment: Learning Augmentation Strategies from Data
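The K-sub-policies / N-transforms structure above can be sketched as data plus two functions. The operations below are toy stand-ins for a few of the 22, the "image" is a flat list of pixel values, and all probabilities and magnitudes are random for the demo:

```python
# AutoAugment-style policy structure: K sub-policies, each a sequence of
# (operation, probability, magnitude) triples; one sub-policy is sampled
# per image and its ops are applied stochastically.
import random

OPERATIONS = {
    # name -> toy transform on a list of pixel values (magnitude-scaled)
    "brightness": lambda img, m: [min(255, p + 10 * m) for p in img],
    "contrast":   lambda img, m: [int(p * (1 + 0.1 * m)) for p in img],
    "cutout":     lambda img, m: [0] * m + img[m:],
}

def random_sub_policy(rng, n_transforms=2):
    """One sub-policy: n_transforms (op, prob, magnitude) triples."""
    return [(rng.choice(list(OPERATIONS)), rng.random(), rng.randrange(1, 10))
            for _ in range(n_transforms)]

def apply_policy(policy, img, rng):
    """Sample one of the K sub-policies, apply its ops stochastically."""
    sub = rng.choice(policy)
    for op_name, prob, mag in sub:
        if rng.random() < prob:
            img = OPERATIONS[op_name](img, mag)
    return img

rng = random.Random(0)
K = 5
policy = [random_sub_policy(rng) for _ in range(K)]
image = [100] * 16  # toy "image"
augmented = apply_policy(policy, image, rng)
```

The search described on the next slide optimizes exactly the discrete contents of `policy` (which ops, with what probability and magnitude); for detection, the bounding-box operations must additionally update the box coordinates alongside the pixels.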

SLIDE 44

Search Space Design (cont’d)

SLIDE 45

Search Algorithm

• Very similar to NAS-FPN
• Controller

  • RNN-based controller
  • Search with Proximal Policy Optimization (PPO)

• Evaluation

  • A small proxy dataset
  • Short-time training
SLIDE 46

Results

• Significantly outperforms the previous state of the art

SLIDE 47

Analysis

• Better regularization

SLIDE 48

Future Work

• More search dimensions

  • E.g. loss, anchor boxes, assignment rules, post-processing, …

• Reducing search cost
• Joint optimization

SLIDE 49

Q & A