Lecture 8: Convolutional Neural Nets
CS447 Natural Language Processing (J. Hockenmaier)
https://courses.grainger.illinois.edu/cs447/


SLIDE 1

Lecture 8: Convolutional Neural Nets

SLIDE 2

Convolutional Neural Nets (ConvNets, CNNs)

Dense (fully-connected) networks [last lecture]

Sparse networks (with shared parameters: CNNs)
[3 parameters, applied 4 times, overlapping inputs]
[4 parameters, applied 3 times, non-overlapping inputs]

SLIDE 3

Convolutional Neural Nets

2D CNNs are a standard architecture for image data.

Neocognitron (Fukushima, 1980): a CNN with convolutional and downsampling (pooling) layers.

CNNs are inspired by receptive fields in the visual cortex: individual neurons respond to small regions (patches) of the visual field.
— Neurons in deeper layers respond to larger regions.
— Neurons in the same layer share the same weights. This parameter tying allows CNNs to handle variable-size inputs with a fixed number of parameters.
— CNN outputs can be used as input to fully connected nets.
— In NLP, CNNs are mainly used for classification.

SLIDE 4

A toy example

A 3x4 black-and-white image is a 3x4 matrix of pixels:

a b c d
e f g h
i j k l

SLIDE 5

Applying a 2x2 filter

A filter is an N×N matrix that can be applied to N×N patches of the input image. This operation is called convolution, but it works just like a dot product of vectors.

Filter:
[ w x
  y z ]

Input:
a b c d
e f g h
i j k l

Result:
[ aw + bx + ey + fz   bw + cx + fy + gz   cw + dx + gy + hz
  ew + fx + iy + jz   fw + gx + jy + kz   gw + hx + ky + lz ]
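The dot-product view of this convolution can be sketched in plain Python; the symbolic pixels a..l and filter weights w, x, y, z below are replaced by hypothetical numbers:

```python
# Minimal sketch of the 2x2 convolution above, with the symbolic pixels
# a..l and filter weights w, x, y, z replaced by hypothetical numbers.
img = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12]]
w, x, y, z = 1, 0, 0, 1

def conv2x2(img, w, x, y, z):
    rows, cols = len(img), len(img[0])
    # Each output element is the dot product of the filter
    # with one 2x2 patch of the input.
    return [[w * img[i][j] + x * img[i][j + 1]
             + y * img[i + 1][j] + z * img[i + 1][j + 1]
             for j in range(cols - 1)]
            for i in range(rows - 1)]

out = conv2x2(img, w, x, y, z)  # the 3x4 input shrinks to 2x3
```

Note how the output has one row and one column fewer than the input, exactly as on the slide.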

SLIDE 6

Applying a 2x2 filter

We can apply the same filter to all N×N patches of the input image. We obtain another matrix (the next layer in our network). The elements of the filter are the parameters of this layer.

Filter:
[ w x
  y z ]

Input:
a b c d
e f g h
i j k l

Result:
[ aw + bx + ey + fz   bw + cx + fy + gz   cw + dx + gy + hz
  ew + fx + iy + jz   fw + gx + jy + kz   gw + hx + ky + lz ]


SLIDE 11

Applying a 2x2 filter

We’ve turned a 3x4 matrix into a 2x3 matrix, so our image has shrunk. Can we preserve the size of the input?

Filter:
[ w x
  y z ]

Input:
a b c d
e f g h
i j k l

Result:
[ aw + bx + ey + fz   bw + cx + fy + gz   cw + dx + gy + hz
  ew + fx + iy + jz   fw + gx + jy + kz   gw + hx + ky + lz ]

SLIDE 12

Zero padding

If we pad each matrix with 0s, we can maintain the same size throughout the network.

Padded input:
0 0 0 0 0
0 a b c d
0 e f g h
0 i j k l

Filter:
[ w x
  y z ]

Result (same 3x4 size as the input):
[ 0w + 0x + 0y + az   0w + 0x + ay + bz   0w + 0x + by + cz   0w + 0x + cy + dz
  0w + ax + 0y + ez   aw + bx + ey + fz   bw + cx + fy + gz   cw + dx + gy + hz
  0w + ex + 0y + iz   ew + fx + iy + jz   fw + gx + jy + kz   gw + hx + ky + lz ]
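A small sketch of this padding trick, again with hypothetical numbers in place of a..l. Padding one row of zeros on top and one column on the left, as on the slide, keeps a 2x2 convolution from shrinking the 3x4 input:

```python
# Sketch of top/left zero padding: with a 2x2 filter, one padded row and
# one padded column keep the output the same 3x4 size as the input.
img = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12]]
w, x, y, z = 0, 0, 0, 1  # filter that just picks the bottom-right pixel

def pad_top_left(img):
    cols = len(img[0])
    return [[0] * (cols + 1)] + [[0] + row for row in img]

def conv2x2(img):
    rows, cols = len(img), len(img[0])
    return [[w * img[i][j] + x * img[i][j + 1]
             + y * img[i + 1][j] + z * img[i + 1][j + 1]
             for j in range(cols - 1)]
            for i in range(rows - 1)]

out = conv2x2(pad_top_left(img))
# With this particular filter the padded convolution reproduces the
# input exactly; in general the output is 3x4, same size as the input.
```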

SLIDE 13

After the nonlinear activation function

NB: Convolutional layers are typically followed by ReLUs.

Padded input:
0 0 0 0 0
0 a b c d
0 e f g h
0 i j k l

Filter:
[ w x
  y z ]

Result:
[ g(az)        g(ay + bz)             g(by + cz)             g(cy + dz)
  g(ax + ez)   g(aw + bx + ey + fz)   g(bw + cx + fy + gz)   g(cw + dx + gy + hz)
  g(ex + iz)   g(ew + fx + iy + jz)   g(fw + gx + jy + kz)   g(gw + hx + ky + lz) ]

SLIDE 14

Going from layer to layer…

Input Data → First Convolution → First Hidden Layer → Second Convolution → Second Hidden Layer

Input data (zero-padded):
a b c d
e f g h
i j k l

First convolution, filter [ w x ; y z ] → first hidden layer (zero-padded):
a1 b1 c1 d1
e1 f1 g1 h1
i1 j1 k1 l1

Second convolution, filter [ w1 x1 ; y1 z1 ] → second hidden layer:
a2 b2 c2 d2
e2 f2 g2 h2
i2 j2 k2 l2

One element in the 2nd layer corresponds to a 3x3 patch in the input: the “receptive field” gets larger in each layer.
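The growth of the receptive field can be sketched by tracing, for one unit, which input positions feed into it; this is a simplified sketch (hypothetical helper, stride 1, padding offsets ignored):

```python
# Hedged sketch: trace which input positions influence one unit after
# stacking convolutions. Two stacked 2x2 convolutions (stride 1) make
# the receptive field grow from 2x2 to 3x3, as the slide notes.
def receptive_field(i, j, n_layers, filter_size=2):
    # Positions in the input that feed unit (i, j) after n_layers convs.
    cells = {(i, j)}
    for _ in range(n_layers):
        cells = {(a + da, b + db)
                 for (a, b) in cells
                 for da in range(filter_size)
                 for db in range(filter_size)}
    return cells

rf = receptive_field(0, 0, n_layers=2)  # a 3x3 patch of the input
```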

SLIDE 15

Changing the stride

Stride = the step size for sliding across the image.

— Stride = 1: consider all patches [see previous example]
— Stride = 2: skip one element between patches
— Stride = 3: skip two elements between patches, …

A larger stride size yields a smaller output image.

Input (zero-padded):
0 0 0 0
a b c d
e f g h
i j k l

Filter:
[ w x
  y z ]

Stride = 2:
[ 0w + 0x + ay + bz   0w + 0x + cy + dz
  ew + fx + iy + jz   gw + hx + ky + lz ]

[Note that different zero-padding may be required with a different stride]
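A general 2D convolution with a configurable stride can be sketched as below, applied to the zero-padded input from the slide (numbers are hypothetical stand-ins for a..l, and the all-ones filter is an illustrative choice):

```python
# Hedged sketch of a 2D convolution with a configurable stride.
def conv2d(img, filt, stride):
    rows, cols = len(img), len(img[0])
    fh, fw = len(filt), len(filt[0])
    # Slide the filter in steps of `stride`; each output element is the
    # dot product of the filter with one fh x fw patch.
    return [[sum(filt[a][b] * img[i + a][j + b]
                 for a in range(fh) for b in range(fw))
             for j in range(0, cols - fw + 1, stride)]
            for i in range(0, rows - fh + 1, stride)]

padded = [[0, 0, 0, 0],      # one row of zero padding on top
          [1, 2, 3, 4],
          [5, 6, 7, 8],
          [9, 10, 11, 12]]
filt = [[1, 1],
        [1, 1]]              # hypothetical all-ones filter

out = conv2d(padded, filt, stride=2)  # 2x2 output: larger stride, smaller image
```

With stride = 1 the same call would produce a 3x3 output instead, illustrating how the stride controls the output size.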

SLIDE 16

Handling color images: channels

Color images have a number of color channels. Each pixel in an RGB image is a (red, green, blue) triplet: ■ = (255, 0, 0) or ■ = (120, 5, 155). An RGB image is a height × width × 3 tensor (#channels = depth of the image).

Convolutional filters are applied to all channels of the input.

We still specify filter size in terms of the image patch, because the #channels is a function of the data (not a parameter we control): we still talk about 2×2 or 3×3 etc. filters, although with C channels, an N×N filter applies to an N×N×C region (and has N×N×C weights).
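The channel dimension can be illustrated with a small sketch (all values and the all-ones filter are hypothetical): one 2x2 filter over a 3-channel image has 2·2·3 weights, and channels are summed, so a single filter still yields one number per patch.

```python
# Hedged sketch: one 2x2 filter over a 2 x 3 x 3-channel image.
# The filter has 2*2*C weights (one 2x2 slice per channel); summing
# over channels means one filter produces a single number per patch.
C = 3
img = [[[i + j + c for c in range(C)] for j in range(3)]
       for i in range(2)]                                  # 2 x 3 x C
filt = [[[1] * C for _ in range(2)] for _ in range(2)]     # 2 x 2 x C, all ones

def conv_patch(i, j):
    return sum(filt[a][b][c] * img[i + a][j + b][c]
               for a in range(2) for b in range(2) for c in range(C))

out = [[conv_patch(0, j) for j in range(2)]]  # a single 1x2 output "image"
```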

SLIDE 17

Channels in internal layers

So far, we have just applied a single filter to get to the next layer. But we could run K different filters (with different weights) to define a layer with K channels. (If we initialize their weights randomly, they will learn different properties of the input.)

The hidden layers of CNNs often have a large number of channels. (Useful trick: 1x1 convolutions increase or decrease the number of channels without affecting the size of the visual field.)
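The 1x1-convolution trick can be sketched as follows (shapes and weights are hypothetical): each output channel is a weighted sum of the input channels at the same pixel, so height and width are untouched while the channel count changes.

```python
# Hedged sketch of a 1x1 convolution: it mixes channels at each pixel
# independently, so height/width stay the same while the channel count
# changes from C_in to C_out (here 4 -> 2). Weights are hypothetical.
H, W, C_in, C_out = 2, 3, 4, 2
img = [[[float(c) for c in range(C_in)] for _ in range(W)] for _ in range(H)]
weights = [[0.25, 0.25, 0.25, 0.25],   # output channel 0: average of inputs
           [1.0, 0.0, 0.0, 0.0]]      # output channel 1: copy input channel 0

out = [[[sum(weights[o][c] * img[i][j][c] for c in range(C_in))
         for o in range(C_out)]
        for j in range(W)]
       for i in range(H)]
# out is H x W x C_out: same spatial size, different #channels
```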

SLIDE 18

Pooling Layers

Pooling layers reduce the size of the representation, and are often used following a pair of conv+ReLU layers.

Each pooling layer returns a 3D tensor of the same depth as its input (but with smaller height & width) and is defined by
— a filter size (what region gets reduced to a single value)
— a stride (step size for sliding the window across the input)
— a pooling function (max pooling, avg pooling, min pooling, …)

Pooling units don’t have weights, but simply return the maximum/minimum/average value of their inputs. Typically, pooling layers only receive input from a single channel, so they don’t reduce the depth (#channels).

SLIDE 19

Max-pooling

Max-pooling in our example with a 2x2 filter and stride = 2:

Input (zero-padded):
0 0 0 0
a b c d
e f g h
i j k l

2x2 MaxPooling, stride = 2:
[ max(0, 0, a, b)   max(0, 0, c, d)
  max(e, f, i, j)   max(g, h, k, l) ]
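Max-pooling can be sketched in a few lines; the numbers below are hypothetical stand-ins for the padded a..l input on the slide:

```python
# Sketch of 2x2 max-pooling with stride 2 over a zero-padded input.
def max_pool(img, size=2, stride=2):
    rows, cols = len(img), len(img[0])
    # No weights: each output element is just the max of one
    # size x size window of the input.
    return [[max(img[i + a][j + b] for a in range(size) for b in range(size))
             for j in range(0, cols - size + 1, stride)]
            for i in range(0, rows - size + 1, stride)]

padded = [[0, 0, 0, 0],
          [1, 2, 3, 4],
          [5, 6, 7, 8],
          [9, 10, 11, 12]]

out = max_pool(padded)  # [[max(0,0,1,2), max(0,0,3,4)], [max(5,6,9,10), max(7,8,11,12)]]
```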

SLIDE 20

(2D) CNNs

An image is a 2D (width × height) matrix of pixels (e.g. RGB values) => it is a 3D tensor: color channels (“depth”) × width × height.

Each convolutional layer returns a 3D tensor, and is defined by:
— the depth (#filters) of its output
— a filter size (the square size of the input regions for each filter)
— a stride (the step size for how to slide filters across the input)
— zero padding (how many 0s are added around the edges of the input)
=> Filter size, stride, and zero padding define the width/height of the output.

Each unit in a convolutional layer
— receives input from a square region/patch (across w×h) in the preceding layer (across all depth channels)
— returns the dot product of the input activations and its weights.

Within a layer, all units at the same depth use the same weights. Convolutional layers are often followed by ReLU activations.

http://cs231n.github.io/convolutional-networks/

SLIDE 21

1D CNNs for text

Text is a (variable-length) sequence of words (word vectors).
[#channels = dimensionality of word vectors]

We can use a 1D CNN to slide a window of n tokens across the text:

— Filter size n = 3, stride = 1, no padding:
[The quick brown] [quick brown fox] [brown fox jumps] [fox jumps over] [jumps over the] [over the lazy] [the lazy dog]

— Filter size n = 2, stride = 2, no padding:
[The quick] [brown fox] [jumps over] [the lazy]
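The two window configurations above can be sketched directly:

```python
# Sketch of 1D sliding windows over the 9-token example sentence.
tokens = "The quick brown fox jumps over the lazy dog".split()

def windows(tokens, n, stride):
    # All n-token windows, moving `stride` tokens at a time.
    return [tokens[i:i + n] for i in range(0, len(tokens) - n + 1, stride)]

trigrams = windows(tokens, n=3, stride=1)  # 7 overlapping windows
bigrams = windows(tokens, n=2, stride=2)   # 4 non-overlapping windows
```

In a real 1D CNN, each window of word vectors would be dot-producted with a filter, just like the 2D image patches earlier.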

SLIDE 22

1D CNNs for text classification

Input: a variable-length sequence of word vectors
(#channels/depth = dimensionality of word vectors)

Zero padding: add zero vectors (or BOS/EOS vectors) to the beginning and/or end of the sentence (and/or hidden layers).

Filters: N-dimensional vectors (sliding windows of N-grams).

Filter size N in the first layer: the size of the N-grams we consider.
— Conv. layers typically have a ReLU (or tanh) activation.

Maxpooling layers reduce the dimensionality.

CNN depth: how many layers do we use?

The last CNN layer (an H×W×D tensor) needs to be reshaped (flattened) into an (H·W·D)-dimensional vector to be fed into a dense feedforward net for classification.
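The final flattening step can be sketched as follows (H, W, D and the tensor values are hypothetical):

```python
# Sketch of the final reshape: an H x W x D tensor flattened into an
# (H*W*D)-dimensional vector that a dense classifier layer can consume.
H, W, D = 2, 3, 4
tensor = [[[i * W * D + j * D + d for d in range(D)]
           for j in range(W)]
          for i in range(H)]

flat = [v for row in tensor for cell in row for v in cell]
# len(flat) == H * W * D == 24
```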

SLIDE 23

Understanding CNNs for text classification

Jacovi et al. ’18: https://www.aclweb.org/anthology/W18-5408/
— Different filters detect (or suppress) different types of ngrams
— Max-pooling removes irrelevant n-grams
— In a single-layer CNN with max-pooling, each filter output can be traced back to a single input ngram
— Each filter can also be associated with a class it predicts
— The positions in a filter check whether specific types of words are present or absent in the input
— Filters can produce erroneous output (abnormally high activations) on artificial input

SLIDE 24

Readings and nice illustrations

https://www.deeplearningbook.org/contents/convnets.html
https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53
https://github.com/vdumoulin/conv_arithmetic/blob/master/README.md