Chenxi Liu, Liang-Chieh Chen, Florian Schroff, Hartwig Adam, Wei - - PowerPoint PPT Presentation





slide-1
SLIDE 1

Chenxi Liu, Liang-Chieh Chen, Florian Schroff, Hartwig Adam, Wei Hua, Alan Yuille, Li Fei-Fei 06/18/2019 @CVPR

slide-2
SLIDE 2

Neural Architecture Search for Image Classification

Zoph, Barret, et al. "Learning transferable architectures for scalable image recognition." In CVPR. 2018.
Liu, Chenxi, et al. "Progressive neural architecture search." In ECCV. 2018.
Real, Esteban, et al. "Regularized evolution for image classifier architecture search." In AAAI. 2019.
Liu, Hanxiao, Karen Simonyan, and Yiming Yang. "DARTS: Differentiable architecture search." In ICLR. 2019.

slide-3
SLIDE 3

Neural Architecture Search for Dense Image Prediction

  • Image classification is a good starting point for NAS, but should not be the end point.
  • Our paper is one of the first efforts to extend NAS to dense image prediction (semantic segmentation to be exact).

slide-4
SLIDE 4

Challenge 1: Network Level Search Space

Inner Cell Level; Outer Network Level

slide-5
SLIDE 5

Challenge 1: Network Level Search Space

Inner Cell Level (automatically searched); Outer Network Level (hand-designed)

slide-6
SLIDE 6

Challenge 2: Need for High Resolution & Efficient NAS

slide-7
SLIDE 7

Challenge 2: Need for High Resolution & Efficient NAS

airplane 32x32

slide-8
SLIDE 8

Challenge 2: Need for High Resolution & Efficient NAS

[Figure labels: airplane, 32x32, > 321x321]
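To put the two resolutions on the slide side by side, the short calculation below compares their pixel counts. It is only an illustration: the assumption that per-layer activation cost grows roughly with pixel count is made here and is not stated on the slides.

```python
# Pixel counts for the two resolutions named on the slide.
small = 32 * 32      # the 32x32 "airplane" image
large = 321 * 321    # a 321x321 crop (the slide says "> 321x321")

# Assumption for illustration: activation memory per layer scales
# roughly with the number of pixels, so this ratio hints at the gap.
ratio = large / small
print(f"{small} vs {large} pixels, ratio of about {ratio:.1f}x")
```

Even at the lower bound of 321x321, each image carries about a hundred times more pixels than a 32x32 input.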

slide-9
SLIDE 9

Idea of Differentiable NAS

[Table: rows = networks #1, #2, #3, #4; columns = layers 1, 2, ……, L-1, L]

slide-10
SLIDE 10

Idea of Differentiable NAS

[Table: rows = networks #1, #2, ……, #4^L; columns = layers 1, 2, ……, L-1, L]

slide-11
SLIDE 11

Idea of Differentiable NAS

[Table: row = network #1; columns = layers 1, 2, ……, L-1, L]

slide-12
SLIDE 12

Idea of Differentiable NAS

ɑ1 ɑ2 ɑ3 ɑ4

Liu, Hanxiao, Karen Simonyan, and Yiming Yang. "DARTS: Differentiable architecture search." In ICLR. 2019.

[Table: row = network #1; columns = layers 1, 2, ……, L-1, L]

slide-13
SLIDE 13

Idea of Differentiable NAS

ɑ1 ɑ2 ɑ3 ɑ4

Liu, Hanxiao, Karen Simonyan, and Yiming Yang. "DARTS: Differentiable architecture search." In ICLR. 2019.

[Table: row = network #1; columns = layers 1, 2, ……, L-1, L]

ɑ3 is the largest among the four; the other three are crossed out (❌ ❌ ❌)
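The idea on these slides can be sketched in a few lines: during search, one layer outputs a softmax-weighted mix of all candidates, and afterwards only the candidate with the largest ɑ is kept. This is a minimal sketch in the spirit of DARTS, not the authors' code. The four candidate operations and the ɑ values are made-up stand-ins, and the names `mixed_op` and `discretize` are hypothetical.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Four hypothetical candidate operations for one layer (scalar stand-ins,
# not real convolutions).
CANDIDATES = [
    lambda x: x,         # candidate 1
    lambda x: 2.0 * x,   # candidate 2
    lambda x: x * x,     # candidate 3
    lambda x: 0.0,       # candidate 4
]

def mixed_op(x, alphas):
    """Continuous relaxation: softmax(alpha)-weighted sum of all candidates."""
    weights = softmax(alphas)
    return sum(w * op(x) for w, op in zip(weights, CANDIDATES))

def discretize(alphas):
    """After search, keep only the index of the largest alpha."""
    return max(range(len(alphas)), key=lambda i: alphas[i])

alphas = [0.1, -0.3, 1.2, 0.4]       # made-up values; the third is largest
chosen = discretize(alphas)          # index of the surviving candidate
y_search = mixed_op(3.0, alphas)     # layer output while alphas are trained
y_final = CANDIDATES[chosen](3.0)    # layer output after discretization
```

Because the mix is differentiable in the ɑ values, they can be trained by gradient descent instead of evaluating each of the 4^L discrete networks separately.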

slide-14
SLIDE 14

Idea of Differentiable NAS

[Table: row = network #1; columns = layers 1, 2, ……, L-1, L]

slide-15
SLIDE 15

Network Level Search Space

[Grid: rows = downsample rate (1, 2, 4, 8, 16, ……); columns = layer (1, 2, 3, 4, 5, ……, L-1, L)]

slide-16
SLIDE 16

Network Level Search Space

[Grid: rows = downsample rate (1, 2, 4, 8, 16, ……); columns = layer (1, 2, 3, 4, 5, ……, L-1, L)]

slide-17
SLIDE 17

Network Level Search Space

[Grid: rows = downsample rate (1, 2, 4, 8, 16, ……); columns = layer (1, 2, 3, 4, 5, ……, L-1, L)]

slide-18
SLIDE 18

Network Level Search Space

[Grid: rows = downsample rate (1, 2, 4, 8, 16, ……); columns = layer (1, 2, 3, 4, 5, ……, L-1, L)]

slide-19
SLIDE 19

Network Level Search Space

[Grid: rows = downsample rate (1, 2, 4, 8, 16, 32); columns = layer (1, 2, 3, 4, 5, ……, L-1, L)]

slide-20
SLIDE 20

Network Level Search Space

[Grid: rows = downsample rate (1, 2, 4, 8, 16, 32); columns = layer (1, 2, 3, 4, 5, ……, L-1, L); four ASPP modules]

slide-21
SLIDE 21

DeepLabv3

[Grid: rows = downsample rate (1, 2, 4, 8, 16, 32); columns = layer (1, 2, 3, 4, 5, ……, L-1, L); four ASPP modules]

Chen, Liang-Chieh, George Papandreou, Florian Schroff, and Hartwig Adam. "Rethinking atrous convolution for semantic image segmentation." arXiv preprint arXiv:1706.05587 (2017).

slide-22
SLIDE 22

Conv-Deconv

[Grid: rows = downsample rate (1, 2, 4, 8, 16, 32); columns = layer (1, 2, 3, 4, 5, ……, L-1, L)]

Noh, Hyeonwoo, Seunghoon Hong, and Bohyung Han. "Learning deconvolution network for semantic segmentation." In ICCV. 2015.

slide-23
SLIDE 23

Stacked Hourglass

Newell, Alejandro, Kaiyu Yang, and Jia Deng. "Stacked hourglass networks for human pose estimation." In ECCV. 2016.

[Grid: rows = downsample rate (1, 2, 4, 8, 16, 32); columns = layer (1, 2, 3, 4, 5, ……, L-1, L)]
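One way to read the three preceding slides is that each hand-designed network corresponds to a path through the same downsample-by-layer grid. The sketch below encodes that reading. The step rule (keep the rate, halve it, or double it) and the three example paths are illustrative assumptions about the rough shape of each design, not exact layer counts from the cited papers, and `is_valid_path` is a hypothetical helper.

```python
# Downsample rates that appear as rows of the grid on the slides.
RATES = [1, 2, 4, 8, 16, 32]

def is_valid_path(path):
    """True if every rate is on the grid and each step keeps, doubles,
    or halves the downsample rate (assumed step rule)."""
    if not path or any(r not in RATES for r in path):
        return False
    return all(b == a or b == 2 * a or 2 * b == a
               for a, b in zip(path, path[1:]))

# Rough, illustrative shapes only (layer counts are made up):
deeplabv3_like = [1, 2, 4, 8, 16, 16, 16, 16]          # downsample, then hold
conv_deconv_like = [1, 2, 4, 8, 16, 32, 16, 8, 4, 2]   # down, then back up
hourglass_like = [4, 8, 16, 32, 16, 8, 4, 8, 16, 32, 16, 8, 4]  # down-up twice

for name, path in [("DeepLabv3-like", deeplabv3_like),
                   ("Conv-Deconv-like", conv_deconv_like),
                   ("Stacked-Hourglass-like", hourglass_like)]:
    print(name, is_valid_path(path))
```

Under this reading, a search space made of all such paths contains each of these hand-designed layouts as a special case.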

slide-24
SLIDE 24

Network Level Search Space

[Grid: rows = downsample rate (1, 2, 4, 8, 16, 32); columns = layer (1, 2, 3, 4, 5, ……, L-1, L); four ASPP modules]

slide-25
SLIDE 25

Network Level Search Space

[Grid: rows = downsample rate (1, 2, 4, 8, 16, 32); columns = layer (1, 2, 3, 4, 5, ……, L-1, L); four ASPP modules]

slide-26
SLIDE 26

Network Level Search Space

[Grid: rows = downsample rate (1, 2, 4, 8, 16, 32); columns = layer (1, 2, 3, 4, 5, ……, L-1, L); four ASPP modules]
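To get a feel for the size of a network-level search space like the one pictured, the sketch below counts paths through the grid with a small dynamic program. Two assumptions are not stated on the slides. First, searchable layers sit at downsample rates 4, 8, 16 or 32, with rates 1 and 2 treated as a fixed stem. Second, each layer keeps, halves, or doubles the rate of the previous one. The function name `count_paths` is hypothetical.

```python
# Assumed searchable downsample rates (rates 1 and 2 treated as a fixed stem).
LEVELS = [4, 8, 16, 32]

def count_paths(num_layers, start=4):
    """Count distinct resolution paths of `num_layers` steps from `start`,
    where each step moves to the same or a neighbouring level."""
    # counts[i] = number of paths that currently end at LEVELS[i]
    counts = [1 if rate == start else 0 for rate in LEVELS]
    for _ in range(num_layers):
        nxt = [0] * len(LEVELS)
        for i, c in enumerate(counts):
            for j in (i - 1, i, i + 1):      # upsample, keep, downsample
                if 0 <= j < len(LEVELS):
                    nxt[j] += c
        counts = nxt
    return sum(counts)

for n in (1, 2, 3, 12):
    print(n, count_paths(n))
```

The count grows quickly with depth, which is why the network level is searched with the same differentiable relaxation rather than by enumeration.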

slide-27
SLIDE 27

Experiments

  • 321x321 image crops from Cityscapes
  • Number of layers L = 12
  • 40 epochs; less than 3 days on one P100 GPU
slide-28
SLIDE 28

Auto-DeepLab Cell Architecture

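[Figure: searched cell. Labels: H^{l-1}, H^{l-2}, ..., H^l; the output is a concat of five summed (+) pairs of operations]

  • atr 5x5 + sep 3x3
  • atr 3x3 + sep 3x3
  • sep 3x3 + sep 3x3
  • sep 5x5 + sep 5x5
  • atr 5x5 + sep 5x5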

slide-29
SLIDE 29

Auto-DeepLab Cell Architecture

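[Figure: searched cell. Labels: H^{l-1}, H^{l-2}, ..., H^l; the output is a concat of five summed (+) pairs of operations]

  • atr 5x5 + sep 3x3
  • atr 3x3 + sep 3x3
  • sep 3x3 + sep 3x3
  • sep 5x5 + sep 5x5
  • atr 5x5 + sep 5x5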

Atrous convolution is often used
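The observation above can be checked by writing the cell down as data. The sketch below lists the five summed pairs of operations in the order they appear in the slide text. The pairing is read from the flattened figure and is therefore an assumption, and which inputs feed each branch is not recoverable here, so it is left out.

```python
from collections import Counter

# (first op, second op) for each of the five summed branches, in slide order.
# "atr" = atrous convolution, "sep" = separable convolution.
CELL_BRANCHES = [
    ("atr 5x5", "sep 3x3"),
    ("atr 3x3", "sep 3x3"),
    ("sep 3x3", "sep 3x3"),
    ("sep 5x5", "sep 5x5"),
    ("atr 5x5", "sep 5x5"),
]

# How often each operation type was selected across the whole cell.
op_counts = Counter(op for branch in CELL_BRANCHES for op in branch)

# Number of selected operations that are atrous, and branches containing one.
num_atrous = sum(n for op, n in op_counts.items() if op.startswith("atr"))
branches_with_atrous = sum(
    any(op.startswith("atr") for op in branch) for branch in CELL_BRANCHES
)

print(op_counts)
print(num_atrous, "atrous ops in", branches_with_atrous, "of",
      len(CELL_BRANCHES), "branches")
```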

slide-30
SLIDE 30

Auto-DeepLab Network Architecture

[Grid: rows = downsample rate (1, 2, 4, 8, 16, 32); columns = layer (1, 2, 3, 4, 5, ……, L-1, L); four ASPP modules]

slide-31
SLIDE 31

Auto-DeepLab Network Architecture

[Grid: rows = downsample rate (1, 2, 4, 8, 16, 32); columns = layer (1, 2, 3, 4, 5, ……, L-1, L); four ASPP modules]

General tendency to downsample

slide-32
SLIDE 32

Auto-DeepLab Network Architecture

[Grid: rows = downsample rate (1, 2, 4, 8, 16, 32); columns = layer (1, 2, 3, 4, 5, ……, L-1, L); four ASPP modules]

General tendency to upsample

slide-33
SLIDE 33

Performance on Cityscapes (Test Set)

Method            ImageNet?   Coarse?   mIOU (%)
GridNet                                 69.5
FRRN-B                                  71.8
Auto-DeepLab-S                          79.9
Auto-DeepLab-L                          80.4
Auto-DeepLab-S                Yes       80.9
Auto-DeepLab-L                Yes       82.1
DeepLabv3+        Yes         Yes       82.1
DPC               Yes         Yes       82.7

Fourure, Damien, et al. "Residual conv-deconv grid network for semantic segmentation." In BMVC. 2017.
Pohlen, Tobias, et al. "Full-resolution residual networks for semantic segmentation in street scenes." In CVPR. 2017.
Chen, Liang-Chieh, et al. "Encoder-decoder with atrous separable convolution for semantic image segmentation." In ECCV. 2018.
Chen, Liang-Chieh, et al. "Searching for efficient multi-scale architectures for dense image prediction." In NeurIPS. 2018.

slide-34
SLIDE 34

Performance on Cityscapes (Test Set)

Method            ImageNet?   Coarse?   mIOU (%)
GridNet                                 69.5
FRRN-B                                  71.8
Auto-DeepLab-S                          79.9
Auto-DeepLab-L                          80.4
Auto-DeepLab-S                Yes       80.9
Auto-DeepLab-L                Yes       82.1
DeepLabv3+        Yes         Yes       82.1
DPC               Yes         Yes       82.7

Fourure, Damien, et al. "Residual conv-deconv grid network for semantic segmentation." In BMVC. 2017.
Pohlen, Tobias, et al. "Full-resolution residual networks for semantic segmentation in street scenes." In CVPR. 2017.
Chen, Liang-Chieh, et al. "Encoder-decoder with atrous separable convolution for semantic image segmentation." In ECCV. 2018.
Chen, Liang-Chieh, et al. "Searching for efficient multi-scale architectures for dense image prediction." In NeurIPS. 2018.

slide-35
SLIDE 35

Thank You

@chenxi116 https://cs.jhu.edu/~cxliu/