SLIDE 1

RECURRENT CONVOLUTIONAL NEURAL NETWORK FOR OBJECT RECOGNITION

Presenter: Ceren Guzel Turhan
Ming Liang and Xiaolin Hu
May 3, 2016

SLIDE 2

CONTENT

  • Overview
  • Problem statement
  • Motivation
  • Overview of approach
  • Related studies
  • RCNN model
  • Implementations
  • Experimental setups
  • Experimental results
  • Conclusion

SLIDE 3

OVERVIEW

  • Inspired by the fact that recurrent synapses outnumber feed-forward and top-down synapses in the brain
  • Idea: recurrent connections within convolutional layers
    ◦ The activity of each unit can be modulated by the activities of its neighboring units
    ◦ This enhances the model's ability to integrate context information
    ◦ Recurrent connections provide multiple paths, which facilitates learning

SLIDE 4

PROBLEM STATEMENT

  • Task: object recognition

Figure from "Fast R-CNN: Object detection with Caffe" by Ross Girshick

SLIDE 5

MOTIVATION

  • State-of-the-art results using CNNs in object recognition:
    ◦ on ImageNet [26]
    ◦ on ILSVRC-2012, Pascal VOC-2007, Pascal VOC-2012, Caltech-101, Caltech-256 [5]
    ◦ on Pascal VOC-2007 [43]
    ◦ on ILSVRC-2014 [50]
    ◦ on CIFAR-10, CIFAR-100, MNIST [33]

SLIDE 6

MOTIVATION

  • Brain-CNN and Brain-RNN relationship
    ◦ CNN
      ▪ originates from neuroscience (the first artificial neuron)
      ▪ is related to cells in the primary visual cortex

Figure from Daniel L. K. Yamins and James J. DiCarlo

SLIDE 7

MOTIVATION

  • Brain-CNN and Brain-RNN relationship
    ◦ RNN: recurrent synapses in the neocortex
      ▪ outnumber feed-forward and top-down synapses
      ▪ play a role in context modulation

SLIDE 8

MOTIVATION

  • Object recognition – RNN relationship:
    ◦ Object recognition acts as a dynamic process thanks to recurrent and top-down synapses
    ◦ The processing of visual signals is related to context information
    ◦ The response properties of neurons are related to the context around their receptive fields (RFs)

SLIDE 9

MOTIVATION

  • Context information:
    ◦ is important for object recognition
    ◦ can be captured in the higher layers of feed-forward models, whose units have larger RFs
    ◦ cannot modulate the units in lower layers, which respond to smaller objects
  • Strategies for context information
    ◦ top-down connections
    ◦ recurrent connections (in this study)
      ▪ recurrent connections within the same layer

SLIDE 10

OVERVIEW OF APPROACH

  • Similar to the recurrent multilayer perceptron (RMLP):
    ◦ shared local connections replace the full connections of RMLP
  • RCNN: a feed-forward CNN with recurrent connections inside its convolutional layers

SLIDE 11

RELATED STUDIES

  • Similarly named studies:
    ◦ Recurrent Convolutional Neural Networks for Scene Labeling (2014)
    ◦ Convolutional Neural Networks with Intra-Layer Recurrent Connections for Scene Labeling (2015)
    ◦ Long-term Recurrent Convolutional Networks for Visual Recognition and Description (2015)
    ◦ Recurrent Convolutional Neural Networks for Object-Class Segmentation of RGB-D Video (2015)

SLIDE 12

RELATED STUDIES

  • MDRNN [20]:
    ◦ treats images as 2D sequential data
    ◦ has only one hidden layer
    ◦ could not generate features as a CNN does
  • Hierarchical RNN (NAP) [2]:
    ◦ Recurrent and feedback connections
      ▪ Vertical and lateral recurrent connections
    ◦ Abstract image representation
    ◦ Network with excitatory and inhibitory units
    ◦ Only the feed-forward version is used in the test phase
    ◦ The recurrent version is used for image reconstruction

SLIDE 13

RELATED STUDIES

  • CDBN [31]:
    ◦ top-down connections
    ◦ unsupervised feature learning by propagating information from the top layer to the bottom layer
  • rCNN for scene labeling [36]:
    ◦ Recurrent connections in different layers
    ◦ $\mathrm{rCNN}_n$: $n$ network instances of $\mathrm{CNN}_n$
    ◦ Each network instance takes the RGB image and the output of the previous instance as input

Figure from Pedro O. Pinheiro and Ronan Collobert [36]

SLIDE 14

RELATED STUDIES

  • Sparse coding models [15]
    ◦ iterative optimization procedures implicitly define recurrent neural networks
  • Recursive CNN [9]
    ◦ time-unfolded version of RCNN

SLIDE 15

RCNN MODEL: RCL LAYER

  • $u^{(i,j)}(t)$: feed-forward input
  • $x^{(i,j)}(t-1)$: recurrent input
  • $(i, j)$: location of the unit
  • $k$: feature map
  • $w_k^f$: feed-forward weight
  • $w_k^r$: recurrent weight
  • $b_k$: bias
  • $f$: rectified linear function
  • $g$: local response normalization
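A reconstruction of how the symbols on this slide fit together, following the usual formulation of a recurrent convolutional layer (RCL). The equation itself did not survive in the slide text, so this is a sketch under stated assumptions: $z_{ijk}(t)$ is a name introduced here for the net input of the unit at location $(i, j)$ on feature map $k$, and the feed-forward and recurrent inputs are assumed to be vectorized patches centered at $(i, j)$, taken from the previous layer and from the current layer at the previous time step respectively.

```latex
\begin{aligned}
z_{ijk}(t) &= (w_k^f)^\top u^{(i,j)}(t) + (w_k^r)^\top x^{(i,j)}(t-1) + b_k \\
x_{ijk}(t) &= g\bigl(f(z_{ijk}(t))\bigr), \qquad f(z) = \max(z, 0)
\end{aligned}
```

Here $f$ is the rectified linear function and $g$ is local response normalization, as listed above.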

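[Figure: RCL unit at location $(i, j)$ on feature map $k$, with feed-forward input $u$ (weights $w_k^f$) and recurrent input $x$ (weights $w_k^r$)]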
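To make the recurrence concrete, here is a minimal sketch of one RCL unfolded in time. It is a deliberately simplified toy, not the presented implementation: it uses a single feature map, 1D signals instead of 2D images, zero padding, and omits local response normalization. All function names (`conv1d_same`, `relu`, `rcl_forward`) are hypothetical.

```python
def conv1d_same(signal, kernel):
    """Zero-padded 1D cross-correlation that keeps the input length."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            idx = i + j - half
            if 0 <= idx < len(signal):
                acc += weight * signal[idx]
        out.append(acc)
    return out


def relu(values):
    """Rectified linear function applied element-wise."""
    return [max(v, 0.0) for v in values]


def rcl_forward(u, w_f, w_r, bias, num_iterations):
    """Unfold a single-channel recurrent convolutional layer in time.

    The feed-forward input u is static, so its contribution is computed once.
    The recurrent term is recomputed from the previous state at every step.
    Returns the list of states for t = 0 .. num_iterations.
    """
    feed_forward = conv1d_same(u, w_f)
    # t = 0: only the feed-forward input is present.
    state = relu([ff + bias for ff in feed_forward])
    states = [state]
    for _ in range(num_iterations):
        recurrent = conv1d_same(state, w_r)
        state = relu([ff + rec + bias
                      for ff, rec in zip(feed_forward, recurrent)])
        states.append(state)
    return states


# A single active input unit: with each iteration its neighbors pick up
# context through the recurrent weights.
states = rcl_forward(u=[0.0, 1.0, 0.0], w_f=[0.0, 1.0, 0.0],
                     w_r=[0.5, 0.0, 0.5], bias=0.0, num_iterations=2)
for t, state in enumerate(states):
    print(t, state)
```

The same weights are reused at every step, so unfolding for more iterations deepens the computation without adding parameters.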

SLIDE 16

RCNN MODEL

SLIDE 17

RCNN MODEL ARCHITECTURE

  • Standard convolutional layer, 2 RCLs, pooling, 2 RCLs, pooling, FC layer
  • Dropout after each pooling layer except layer 5
  • Cross-entropy loss, minimized using backpropagation through time (BPTT)
  • (T+1): the depth of each RCL
  • 4(T+1)+2: the length of the longest path
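The depth arithmetic on this slide can be checked with a few lines. This is a small sketch; the function names are hypothetical, and the reading of the constant 2 as the first convolutional layer plus the final layer is an assumption based on the architecture listed above.

```python
def rcl_depth(num_iterations):
    """Depth of one RCL unfolded for T iterations: T + 1."""
    return num_iterations + 1


def longest_path_length(num_iterations, num_rcls=4):
    """Longest input-to-output path: 4 (T + 1) + 2 for four RCLs.

    The trailing + 2 is assumed to count the first convolutional layer
    and the final layer.
    """
    return num_rcls * rcl_depth(num_iterations) + 2


print(longest_path_length(3))  # 4 * (3 + 1) + 2 = 18
```

With T = 0 the recurrent connections are unused and the expression reduces to 6, the depth of the plain feed-forward path.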

SLIDE 18

IMPLEMENTATIONS

  • Cuda-convnet2
  • 2 Titan GPUs
  • Hyper-parameters:
    ◦ Number of feature maps $k$: 96
    ◦ Feed-forward filter size in layer 1: 5 × 5
    ◦ Feed-forward and recurrent filter size in layers 2 to 4: 3 × 3
    ◦ For LRN:
      ▪ $\alpha$: 0.001
      ▪ $\beta$: 0.75
      ▪ $N = k/8 + 1$
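For illustration, a minimal sketch of local response normalization (LRN) across feature maps at one spatial location, using the constants from this slide (0.001, 0.75, and a window of one eighth of the number of feature maps plus one). The normalization form used here, dividing each activation by (1 + (alpha / N) * sum of squared neighbors) ** beta, is the commonly used one and is an assumption, as are the function names and the use of integer division for the window size.

```python
def lrn_window(num_feature_maps):
    """Window size N = K / 8 + 1 (integer division is an assumption)."""
    return num_feature_maps // 8 + 1


def local_response_norm(activations, alpha=0.001, beta=0.75):
    """Normalize the activations of all feature maps at one location.

    activations[k] is the rectified activation of feature map k. Each value
    is divided by (1 + alpha / N * sum of squares over neighboring maps) ** beta,
    where the neighbors span k - N/2 .. k + N/2, clipped to valid indices.
    """
    num_maps = len(activations)
    window = lrn_window(num_maps)
    half = window // 2
    normalized = []
    for k, value in enumerate(activations):
        lo = max(0, k - half)
        hi = min(num_maps - 1, k + half)
        sum_sq = sum(a * a for a in activations[lo:hi + 1])
        normalized.append(value / (1.0 + alpha / window * sum_sq) ** beta)
    return normalized


print(lrn_window(96))  # 96 // 8 + 1 = 13
```

Because the denominator is at least 1, LRN can only shrink activations, which keeps the states from growing without bound as the recurrence is iterated.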

SLIDE 19

EXPERIMENTAL SETUPS

  • Datasets:
    ◦ CIFAR-10
    ◦ CIFAR-100
    ◦ MNIST
    ◦ SVHN
  • Trained using BPTT in combination with stochastic gradient descent
    ◦ Initial learning rate: 0.01
    ◦ Whenever accuracy stops improving, the learning rate is reduced to one tenth of its value
    ◦ Final learning rate: 0.0001
    ◦ Momentum: 0.9
    ◦ Number of recurrent iterations: 3
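The training schedule above can be sketched as follows. The slide does not say how "stopped improving" was detected, so the plateau test here (the latest accuracy is no better than the best earlier one) is an assumption, and `next_learning_rate` and `momentum_step` are hypothetical helper names rather than part of any library.

```python
def next_learning_rate(current_rate, accuracy_history,
                       final_rate=0.0001, factor=0.1):
    """Reduce the rate to one tenth when accuracy stops improving.

    accuracy_history holds validation accuracies, oldest first. The rate
    never drops below final_rate.
    """
    if len(accuracy_history) < 2:
        return current_rate
    stalled = accuracy_history[-1] <= max(accuracy_history[:-1])
    if stalled and current_rate > final_rate:
        return max(current_rate * factor, final_rate)
    return current_rate


def momentum_step(weight, velocity, gradient, learning_rate, momentum=0.9):
    """One SGD-with-momentum update for a single scalar weight."""
    velocity = momentum * velocity - learning_rate * gradient
    return weight + velocity, velocity


rate = 0.01
history = []
for accuracy in [0.50, 0.60, 0.60, 0.60, 0.60]:
    history.append(accuracy)
    rate = next_learning_rate(rate, history)
    print(accuracy, rate)
```

In this toy run the rate stays at 0.01 while accuracy improves, then steps down through 0.001 to the final value and stays there.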

SLIDE 20

EXPERIMENTAL RESULTS: CIFAR-10

  • Dataset:
    ◦ 60000 images (50000 training, 10000 testing; 10000 of the training images held out for validation)
    ◦ 32 × 32 pixel resolution
    ◦ 10 classes
  • Baseline models:
    ◦ WCNN-128: RCNN with the recurrent connections removed, using 3 × 3 filters
    ◦ rCNN-96: RCNN with the recurrent connections of the RCLs removed and a cascade of duplicated convolutional layers added

SLIDE 21

EXPERIMENTAL RESULTS: CIFAR-10

  • Comparison with baseline models:

Model               # of parameters   Training error (%)   Testing error (%)
rCNN-96 (1 iter)    0.67 M            4.61                 12.65
rCNN-96 (2 iter)    0.67 M            2.26                 12.99
rCNN-96 (3 iter)    0.67 M            1.24                 14.92
WCNN-128 (1 iter)   0.60 M            3.45                 9.98
RCNN-96 (1 iter)    0.67 M            4.99                 9.95
RCNN-96 (2 iter)    0.67 M            3.58                 9.63
RCNN-96 (3 iter)    0.67 M            3.06                 9.31

SLIDE 22

EXPERIMENTAL RESULTS: CIFAR-10

  • Comparison with state-of-the-art models without data augmentation:

Model                  # of parameters   Testing error (%)
Maxout [17]            > 5 M             11.68
Prob maxout [47]       > 5 M             11.35
NIN [33]               0.97 M            10.41
DSN [30]               0.97 M            9.69
RCNN-96                0.67 M            9.31
RCNN-128               1.19 M            8.98
RCNN-160               1.86 M            8.69
RCNN-96 (no dropout)   0.67 M            13.56

SLIDE 23

EXPERIMENTAL RESULTS: CIFAR-10

  • Comparison with state-of-the-art models with data augmentation:

Model                        # of parameters   Testing error (%)
Prob maxout [47]             > 5 M             9.39
Maxout [17]                  > 5 M             9.38
DropConnect (12 nets) [51]   -                 9.32
NIN [33]                     0.97 M            8.81
DSN [30]                     0.97 M            7.97
RCNN-96                      0.67 M            7.37
RCNN-128                     1.19 M            7.24
RCNN-160                     1.86 M            7.09

SLIDE 24

EXPERIMENTAL RESULTS: CIFAR-100

  • Dataset:
    ◦ 60000 images (50000 training, 10000 testing; 10000 of the training images held out for validation)
    ◦ 32 × 32 pixel resolution
    ◦ 100 classes
  • Same settings as for CIFAR-10, without further tuning of hyper-parameters

SLIDE 25

EXPERIMENTAL RESULTS: CIFAR-100

Model                    # of parameters   Testing error (%)
Maxout [17]              > 5 M             38.57
Prob maxout [47]         > 5 M             38.14
Tree based priors [49]   -                 36.85
NIN [33]                 0.98 M            35.68
DSN [30]                 0.98 M            34.57
RCNN-96                  0.68 M            34.18
RCNN-128                 1.20 M            32.59
RCNN-160                 1.87 M            31.75

SLIDE 27

EXPERIMENTAL RESULTS: MNIST

  • Dataset:
    ◦ 10 classes
    ◦ 70000 images (60000 training, 10000 testing)
    ◦ 28 × 28 pixels

Model         # of parameters   Testing error (%)
NIN [33]      0.35 M            0.47
Maxout [17]   0.42 M            0.45
DSN [30]      0.35 M            0.39
RCNN-32       0.08 M            0.42
RCNN-64       0.30 M            0.32
RCNN-96       0.67 M            0.32

SLIDE 28

EXPERIMENTAL RESULTS: SVHN

  • Dataset:
    ◦ 10 classes
    ◦ 630420 images (73257 training, 26032 testing, 531131 extra)
    ◦ 32 × 32 pixels
  • Without data augmentation:

Model              # of parameters   Testing error (%)
Maxout [17]        > 5 M             2.47
Prob maxout [47]   > 5 M             2.39
NIN [33]           1.98 M            2.35
DSN [30]           1.98 M            1.92
RCNN-128           1.19 M            1.87
RCNN-160           1.86 M            1.80
RCNN-192           2.67 M            1.77

SLIDE 29

EXPERIMENTAL RESULTS: SVHN

  • With data augmentation:

Model                                # of parameters   Testing error (%)
Multi-digit number recognition [16]  > 5 M             2.16
DropConnect (5 nets) [51]            -                 1.94

  • Without data augmentation:

Model      # of parameters   Testing error (%)
RCNN-128   1.19 M            1.87
RCNN-160   1.86 M            1.80
RCNN-192   2.67 M            1.77

SLIDE 30

CONCLUSION

  • Inspired by recurrent synapses in the brain
  • Idea: adding recurrent connections within convolutional layers
  • Enhanced capability to capture context information about objects
  • Learning is facilitated by the multiple paths of the time-unfolded RCNN
  • Network depth increases while the number of adjustable parameters stays constant
  • Going deeper with a relatively small number of parameters
  • With fewer parameters, RCNN outperforms state-of-the-art models on four benchmarks
  • Increasing the number of parameters leads to even better performance

SLIDE 31

THANK YOU

  • Any questions?
