Deep Neural Networks II
Sen Wang, UDRC Co-I WP3.1 and WP3.2, Assistant Professor in Robotics and Autonomous Systems, Institute of Signals, Sensors and Systems, Heriot-Watt University
UDRC-EURASIP Summer School, 26th June 2019, Edinburgh

  1. Deep Neural Networks II. Sen Wang, UDRC Co-I (WP3.1 and WP3.2), Assistant Professor in Robotics and Autonomous Systems, Institute of Signals, Sensors and Systems, Heriot-Watt University. UDRC-EURASIP Summer School, 26th June 2019, Edinburgh. Slides adapted from Andrej Karpathy and Kaiming He.

  2. Outline: learning features for machines to solve problems
  • Deep Learning Architectures (focus on CNNs) - learning features
    o Convolutional Neural Networks (CNNs)
  • Some Deep Learning Applications - problems
    o Object detection (image, radar, sonar)
    o Semantic segmentation
    o Visual odometry
    o 3D reconstruction
    o Semantic mapping
    o Robot navigation
    o Manipulation and grasping
    o …

  3. Deep Learning. Deep Learning: a learning technique combining layers of neural networks to automatically identify features that are relevant to the problem to solve. Training (supervised learning): big labelled data is passed forward through the network to produce a prediction, and the error against the label is propagated backward to update the weights. Testing: test data is passed forward through the trained DNN to produce a prediction. [Figure: raw data is transformed through low-level, middle-level and high-level features.]
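To make the training/testing description above concrete, here is a minimal sketch of a supervised learning loop, assuming PyTorch; `model`, `train_loader` and `test_loader` are hypothetical placeholders for a network and labelled data loaders, not part of the original slides.

```python
# Minimal supervised training/testing loop (sketch, assuming PyTorch).
# `model`, `train_loader`, and `test_loader` are hypothetical placeholders.
import torch
import torch.nn as nn

def train(model, train_loader, epochs=10, lr=1e-3):
    criterion = nn.CrossEntropyLoss()               # prediction error vs. label
    optimiser = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        for data, label in train_loader:            # big labelled data
            prediction = model(data)                # forward pass
            loss = criterion(prediction, label)     # error
            optimiser.zero_grad()
            loss.backward()                         # backward pass
            optimiser.step()                        # update weights

def test(model, test_loader):
    model.eval()
    with torch.no_grad():
        return [model(data) for data, _ in test_loader]  # forward only
```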

  4. Deep Learning in Robotics. At ICRA 2018 (~2500 submissions), deep learning was the most popular keyword. [Figures: IJRR 2016, IJCV 2018.]

  5. Deep Learning in Robotics

  6. Convolutional Neural Networks (CNNs)

  7. From MLPs to CNNs
  • Feed-forward Neural Networks or Multi-Layer Perceptrons (MLPs)
    o many multiplications
  • CNNs are similar to Feed-forward Neural Networks
    o convolution instead of general matrix multiplication
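A small sketch of the difference, assuming PyTorch: a fully-connected layer applies a general matrix multiplication over the whole input, while a convolutional layer reuses one small filter everywhere, which shows up directly in the parameter counts (the layer sizes below are illustrative).

```python
import torch.nn as nn

# Fully-connected (MLP) layer: a general matrix multiplication over all inputs.
fc = nn.Linear(in_features=32 * 32 * 3, out_features=10)

# Convolutional layer: one small 5x5 filter per output channel, slid over the image.
conv = nn.Conv2d(in_channels=3, out_channels=10, kernel_size=5)

print(sum(p.numel() for p in fc.parameters()))    # 30730 weights + biases
print(sum(p.numel() for p in conv.parameters()))  # 760 weights + biases
```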

  8. CNNs
  • 3 Main Types of Layers:
    o convolutional layer
    o activation layer
    o pooling layer
  • repeat many times
  [Diagram: input layer → convolutional layer → activation layer → pooling layer → … → fully-connected layer]

  9. CNNs: Convolution Layer. A 32x32x3 image (width 32, height 32, depth 3) and a 5x5x3 filter. Convolve the filter with the image, i.e. “slide over the image spatially, computing dot products”. Slides courtesy of Andrej Karpathy.

  10. CNNs: Convolution Layer. Filters always extend the full depth of the input volume: for a 32x32x3 image, the 5x5 filter has depth 3 (5x5x3). Convolve the filter with the image, i.e. “slide over the image spatially, computing dot products”.

  11. CNNs: Convolution Layer. Each output is 1 number: the result of taking a dot product between the 5x5x3 filter and a small 5x5x3 chunk of the 32x32x3 image (i.e. a 5*5*3 = 75-dimensional dot product + bias). 2 important ideas: local connectivity and parameter sharing.

  12. CNNs: Convolution Layer. Convolving (sliding) the 5x5x3 filter over all spatial locations of the 32x32x3 image produces a 28x28x1 activation map.
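A minimal NumPy sketch of this sliding dot product, with random data standing in for a real image and filter; it reproduces the 28x28 activation-map size from the slide.

```python
import numpy as np

image = np.random.rand(32, 32, 3)    # 32x32x3 input volume
filt = np.random.rand(5, 5, 3)       # 5x5x3 filter (full input depth)
bias = 0.1

activation_map = np.zeros((28, 28))  # 32 - 5 + 1 = 28 in each spatial dimension
for y in range(28):
    for x in range(28):
        chunk = image[y:y + 5, x:x + 5, :]                   # 5x5x3 chunk of the image
        activation_map[y, x] = np.sum(chunk * filt) + bias   # 75-dim dot product + bias
```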

  13. CNNs: Convolution Layer. Consider a second (green) 5x5x3 filter: convolving it over all spatial locations gives a second 28x28x1 activation map.

  14. CNNs: Convolution Layer. For example, if we had 6 of these 5x5 filters, we would get 6 separate 28x28 activation maps; we stack these up to get a “new image” of size 28x28x6.

  15. CNNs: Convolution Layer. We processed a [32x32x3] volume into a [28x28x6] volume. Q: how many parameters would this be if we used a fully-connected layer instead? Courtesy of Andrej Karpathy.

  16. CNNs: Convolution Layer. We processed a [32x32x3] volume into a [28x28x6] volume. Q: how many parameters would this be if we used a fully-connected layer instead? A: (32*32*3)*(28*28*6) = 14.5M parameters, ~14.5M multiplies.

  17. CNNs: Convolution Layer. We processed a [32x32x3] volume into a [28x28x6] volume. Q: how many parameters are used instead?

  18. CNNs: Convolution Layer. We processed a [32x32x3] volume into a [28x28x6] volume. Q: how many parameters are used instead, and how many multiplies? A: (5*5*3)*6 = 450 parameters.

  19. CNNs: Convolution Layer. We processed a [32x32x3] volume into a [28x28x6] volume. A: (5*5*3)*6 = 450 parameters, (5*5*3)*(28*28*6) ≈ 350K multiplies. 2 merits: vastly reduced number of parameters, and more efficient computation.
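The arithmetic from the last few slides, written out as a short check (parameter counts exclude biases, as on the slides):

```python
# Parameter and multiply counts for the [32x32x3] -> [28x28x6] example above.
fc_params = (32 * 32 * 3) * (28 * 28 * 6)   # 14,450,688 ~= 14.5M parameters
fc_mults = fc_params                        # one multiply per weight, ~14.5M

conv_params = (5 * 5 * 3) * 6               # 450 shared filter weights (plus 6 biases)
conv_mults = (5 * 5 * 3) * (28 * 28 * 6)    # 352,800 ~= 350K multiplies

print(fc_params, conv_params, conv_mults)
```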

  20. CNNs: Activation Layer
  • 3 Main Types of Layers:
    o convolutional layer
    o activation layer
    o pooling layer
  [Diagram: input layer → convolutional layer → activation layer → pooling layer → … → fully-connected layer]

  21. CNNs: Pooling Layer
  • 3 Main Types of Layers:
    o convolutional layer
    o activation layer
    o pooling layer
  • repeat many times
  • pooling makes the representations smaller and more manageable
  [Diagram: input layer → convolutional layer → activation layer → pooling layer → … → fully-connected layer]

  22. CNNs: A Sequence of Convolutional Layers. [Diagram: a 32x32x3 input passes through CONV + ReLU with e.g. 6 filters of 5x5x3 to give 28x28x6, then CONV + ReLU with e.g. 10 filters of 5x5x6 to give 24x24x10, and so on.]
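A sketch of this CONV/ReLU sequence, assuming PyTorch (which uses channels-first NxCxHxW tensors); the filter counts follow the example sizes above.

```python
import torch
import torch.nn as nn

# 32x32x3 -> six 5x5x3 filters -> 28x28x6 -> ten 5x5x6 filters -> 24x24x10
cnn = nn.Sequential(
    nn.Conv2d(3, 6, kernel_size=5),   # 32x32x3 -> 28x28x6
    nn.ReLU(),
    nn.Conv2d(6, 10, kernel_size=5),  # 28x28x6 -> 24x24x10
    nn.ReLU(),
)

x = torch.randn(1, 3, 32, 32)         # one 32x32 RGB image, channels-first
print(cnn(x).shape)                   # torch.Size([1, 10, 24, 24])
```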

  23. Deep Learning Architectures

  24. Hand-Crafted Features by Humans. [Diagram: pervasive data (time-series data, vision, point clouds) → feature extraction (hand-crafted) → inference of activities, context, locations, scene types, semantics, objects, structure, …]

  25. Feature Engineering and Representation. [Diagram: pervasive data (time-series data, vision, point clouds); raw data, e.g. 256^3 x 800x600 for an image, ?x?x? for a point cloud, ≈ bad representation.]

  26. Deep Learning: Representation Learning. [Diagram: pervasive data (time-series data, vision, point clouds) → end-to-end learning → inference of activities, context, locations, scene types, structure, semantics, …] Deep learning automatically learns an effective feature representation to solve the problem.

  27. LeNet - 1998
  • Convolution:
    o locally-connected
    o spatially weight-sharing (weight-sharing is a key idea in deep learning)
  • Subsampling
  • Fully-connected outputs
  The foundation of modern ConvNets! “Gradient-based learning applied to document recognition”, LeCun et al., 1998.

  28. AlexNet - 2012. 8 layers: 5 convolutional (with max-pooling) + 3 fully-connected. A LeNet-style backbone, plus:
  • ReLU
    o accelerates training
    o better gradient propagation (vs. tanh)
  • Dropout
    o reduces overfitting
  • Data augmentation
    o image transformations
    o reduces overfitting
  “ImageNet Classification with Deep Convolutional Neural Networks”, Krizhevsky, Sutskever, Hinton. NIPS 2012.
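A sketch of the three additions listed above in isolation, assuming PyTorch and torchvision; the layer sizes and transforms are illustrative, not the actual AlexNet configuration.

```python
import torch.nn as nn
from torchvision import transforms

# Data augmentation: random image transformations to reduce overfitting.
augment = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])

# Dropout and ReLU in a fully-connected head (illustrative sizes).
classifier_head = nn.Sequential(
    nn.Dropout(p=0.5),   # dropout: randomly zero activations to reduce overfitting
    nn.Linear(4096, 4096),
    nn.ReLU(),           # ReLU: faster training / better gradients than tanh
    nn.Linear(4096, 1000),
)
```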

  29. VGG16/19 - 2014. A very deep ConvNet.
  • Modularized design
    o 3x3 conv as the module
    o stack the same module
    o same computation for each module
  • Stage-wise training
    o VGG-11 => VGG-13 => VGG-16
  “Very Deep Convolutional Networks for Large-Scale Image Recognition”, Simonyan & Zisserman. arXiv 2014 (ICLR 2015).
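A sketch of the modularized design, assuming PyTorch: a hypothetical `vgg_block` helper stacks 3x3 convolutions and ends with max-pooling, and the same module is stacked stage by stage. The channel widths shown are illustrative.

```python
import torch.nn as nn

def vgg_block(in_channels, out_channels, num_convs):
    """One VGG-style module: a stack of 3x3 convs followed by 2x2 max-pooling."""
    layers = []
    for _ in range(num_convs):
        layers += [nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1),
                   nn.ReLU()]
        in_channels = out_channels
    layers.append(nn.MaxPool2d(kernel_size=2, stride=2))
    return nn.Sequential(*layers)

# e.g. the first two stages of a VGG-16-like network (64 then 128 channels assumed)
features = nn.Sequential(vgg_block(3, 64, 2), vgg_block(64, 128, 2))
```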

  30. GoogLeNet/Inception - 2014. 22 layers.
  • Multiple branches, e.g. 1x1, 3x3, 5x5 convolutions and pooling
  • Branches merged by concatenation
  • Dimensionality reduced by 1x1 conv before the expensive 3x3/5x5 convs
  Szegedy et al., “Going deeper with convolutions”. arXiv 2014 (CVPR 2015).
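A sketch of an Inception-style module, assuming PyTorch: parallel 1x1, 3x3, 5x5 and pooling branches, with 1x1 convolutions reducing dimensionality before the expensive 3x3/5x5 convolutions, merged by concatenation along the channel axis. The branch widths are constructor arguments, since the paper uses several configurations.

```python
import torch
import torch.nn as nn

class InceptionBlock(nn.Module):
    """Inception-style module: parallel branches merged by channel concatenation."""
    def __init__(self, in_ch, c1, c3_red, c3, c5_red, c5, pool_proj):
        super().__init__()
        self.b1 = nn.Conv2d(in_ch, c1, kernel_size=1)                     # 1x1 branch
        self.b3 = nn.Sequential(nn.Conv2d(in_ch, c3_red, 1), nn.ReLU(),   # 1x1 reduce
                                nn.Conv2d(c3_red, c3, 3, padding=1))      # then 3x3
        self.b5 = nn.Sequential(nn.Conv2d(in_ch, c5_red, 1), nn.ReLU(),   # 1x1 reduce
                                nn.Conv2d(c5_red, c5, 5, padding=2))      # then 5x5
        self.bp = nn.Sequential(nn.MaxPool2d(3, stride=1, padding=1),     # pooling
                                nn.Conv2d(in_ch, pool_proj, 1))           # 1x1 projection

    def forward(self, x):
        # Concatenate the four branches along the channel dimension.
        return torch.cat([self.b1(x), self.b3(x), self.b5(x), self.bp(x)], dim=1)
```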

  31. Going Deeper. Can we go deeper by simply stacking layers?
  • Plain nets: stacking 3x3 conv layers
  • A 56-layer net has higher training error and test error than a 20-layer net
  • Yet a deeper model should not have higher training error

  32. Going Deeper. Simply going deeper does not work for plain deep neural networks. Problem: deeper plain nets have higher training error on various datasets. Optimization difficulties:
    o vanishing gradients
    o solvers struggle to find the solution when going deeper
  Kaiming He, Xiangyu Zhang, Shaoqing Ren, & Jian Sun. “Deep Residual Learning for Image Recognition”. CVPR 2016.

  33. ResNets - 2016. Plain net vs. residual net: in a residual net, gradients can flow directly backwards through the skip connections. Kaiming He, Xiangyu Zhang, Shaoqing Ren, & Jian Sun. “Deep Residual Learning for Image Recognition”. CVPR 2016.
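A sketch of a basic residual block, assuming PyTorch (batch normalisation omitted for brevity): the output is F(x) + x, so gradients can flow straight back through the identity skip connection.

```python
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Basic residual block: output = F(x) + x (identity skip connection)."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        residual = self.relu(self.conv1(x))
        residual = self.conv2(residual)
        return self.relu(residual + x)   # skip connection: add the input back
```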

  34. ResNets - 2016.
  • Deep ResNets can be trained more easily
  • Deeper ResNets have lower training error, and also lower test error
  Kaiming He, Xiangyu Zhang, Shaoqing Ren, & Jian Sun. “Deep Residual Learning for Image Recognition”. CVPR 2016.

  35. ImageNet Experiments. [Figure: top-5 error (%) on ImageNet for the architectures above.]
