
Advanced Section #8: Neural Networks for Image Analysis
Camilo Fosco
CS109A Introduction to Data Science
Pavlos Protopapas and Kevin Rader

Outline
• Image analysis: why neural networks?
• Multilayer Perceptron refresher
• Convolutional Neural Networks
  • How they work
  • How to build them
• Building your own image classifier
• Evolution of CNNs
CS109A, Protopapas, Rader

Image analysis – why neural networks?
Imagine that we want to recognize swans in an image:
• Round, elongated oval with orange protuberance
• Oval-shaped white blob (body)
• Long white rectangular shape (neck)

Cases can be a bit more complex…
• Round, elongated head with orange or black beak
• Oval-shaped white body with or without large white symmetric blobs (wings)
• Long white neck, square shape

Now what?
• Small black circles, can be facing the camera, sometimes can see both
• Round, elongated head with orange or black beak, can be turned backwards
• Long white neck, can bend around, not necessarily straight
• Black triangular shaped form, on the head, can have different sizes
• White tail, generally far from the head, looks feathery
• White elongated piece, can be squared or more triangular, can be obstructed sometimes
• Black feet, under body, can have different shapes
• White, oval shaped body, with or without wings visible
Luckily, the color is consistent…


We need to be able to deal with these cases.

Image features
• We've basically been talking about detecting features in images, in a very naïve way.
• Researchers built multiple computer vision techniques to deal with these issues: SIFT, FAST, SURF, BRIEF, etc.
• However, similar problems arose: the detectors were either too general or too over-engineered. Humans were designing these feature detectors, and that made them either too simple or hard to generalize.
Figures: FAST corner detection algorithm; SIFT feature descriptor.

What if we learned the features to detect?
• We need a system that can do Representation Learning (or Feature Learning).
Representation Learning: technique that allows a system to automatically find relevant features for a given task. Replaces manual feature engineering.
Multiple techniques for this:
• Unsupervised (K-means, PCA, …).
• Supervised (Sup. Dictionary learning, Neural Networks!)

MULTILAYER PERCEPTRON
Or Fully Connected Network (FCN)

Perceptron to MLP
The Perceptron: inputs $x_1, x_2, x_3, x_4$ and output
$$Y = f(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4)$$
Multilayer Perceptron: input layer, hidden layer, output layer.
They can be more complex…
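The perceptron and the one-hidden-layer MLP described on this slide can be sketched in a few lines of NumPy. This is only an illustration: the sigmoid activation, the weight values, the layer sizes and the function names (`perceptron`, `mlp_forward`) are assumptions, not taken from the slides.

```python
import numpy as np

def sigmoid(z):
    """Logistic activation, used here as one possible choice for f."""
    return 1.0 / (1.0 + np.exp(-z))

def perceptron(x, beta, beta0, f=sigmoid):
    """Single perceptron: Y = f(beta0 + sum_i beta_i * x_i)."""
    return f(beta0 + np.dot(beta, x))

def mlp_forward(x, W1, b1, W2, b2, f=sigmoid):
    """One-hidden-layer MLP: each hidden node is a perceptron on x."""
    hidden = f(W1 @ x + b1)       # hidden layer activations
    return f(W2 @ hidden + b2)    # output layer

# Illustrative values only (not from the slides).
x = np.array([1.0, 0.5, -1.0, 2.0])
beta = np.array([0.2, -0.4, 0.1, 0.3])
y = perceptron(x, beta, beta0=0.1)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 4)), np.zeros(3)   # 4 inputs -> 3 hidden nodes
W2, b2 = rng.normal(size=(1, 3)), np.zeros(1)   # 3 hidden nodes -> 1 output
y_mlp = mlp_forward(x, W1, b1, W2, b2)
```

Stacking layers this way is what turns a single linear decision rule into a model that can represent more complex functions.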

Main advantages of MLP
• Ability to find patterns in complex and messy data.
• A network with one hidden layer and sufficient hidden nodes has been proven to be a universal approximator.
• Can take the raw data as input, and learn its own features internally to better classify.
• Amount of human involvement is low: we only prepare and feed the data. No feature engineering needed.
• MLP makes no assumption on the distribution of input data.

Combatting overfitting: Dropout
• Method of regularization consisting of randomly dropping nodes during training.
• Similar to bagging.
• We re-randomize our network at each training iteration.
• During test time, we use the full network where nodes are scaled by their probability of appearing.
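A minimal sketch of the dropout scheme described above, assuming a keep probability `p_keep` of 0.8 and a vector of made-up activations. The function names are hypothetical, and the random seed is fixed only to make the example reproducible.

```python
import numpy as np

def dropout_train(activations, p_keep, rng):
    """Training: drop each node independently with probability 1 - p_keep."""
    mask = rng.random(activations.shape) < p_keep   # True = node is kept
    return activations * mask, mask

def dropout_test(activations, p_keep):
    """Test time: keep every node, scaled by its probability of appearing."""
    return activations * p_keep

rng = np.random.default_rng(0)
h = np.ones(10_000)      # hypothetical hidden-layer activations
p_keep = 0.8             # assumed keep probability

h_train, mask = dropout_train(h, p_keep, rng)   # draw a fresh mask each iteration
h_test = dropout_test(h, p_keep)
```

Many libraries implement the equivalent "inverted dropout" instead, dividing by `p_keep` during training so that nothing needs to be scaled at test time.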

Multilayer perceptron - visualization
Let's have a look at a cool tool to play with MLPs: https://playground.tensorflow.org/

Drawbacks
• MLPs use one input (and hence one weight per neuron) for each pixel in an image, multiplied by 3 in the RGB case. The number of weights rapidly becomes unmanageable for large images.
• Training difficulties arise, and overfitting can appear.
• MLPs react differently to an image and its shifted version: they are not translation invariant.

Drawbacks
Imagine we want to build a cat detector with an MLP.
• When the cat appears in one position, the red weights will be modified to better recognize cats.
• When the cat appears in a different position, the green weights will be modified.
We are learning redundant features. The approach is not robust, as cats could appear in yet another position.

Drawbacks
Example: CIFAR10
Simple 32x32 color images (3 channels).
Each pixel is a feature: an MLP would have 32x32x3+1 = 3073 weights per neuron!

Drawbacks
Example: ImageNet
Images are usually 224x224x3: an MLP would have 224x224x3+1 = 150529 weights per neuron. If the first layer of the MLP is around 128 nodes, which is small, this already becomes very heavy to calculate. Model complexity is extremely high, which leads to overfitting.
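The weight counts in the CIFAR10 and ImageNet examples follow from one weight per input value plus one bias term. A short sketch of that arithmetic follows; the helper names are hypothetical.

```python
def weights_per_neuron(height, width, channels):
    """One weight per input value plus one bias term."""
    return height * width * channels + 1

def first_layer_weights(height, width, channels, n_nodes):
    """Total parameters in a fully connected first layer."""
    return weights_per_neuron(height, width, channels) * n_nodes

cifar = weights_per_neuron(32, 32, 3)          # 3073
imagenet = weights_per_neuron(224, 224, 3)     # 150529
imagenet_layer = first_layer_weights(224, 224, 3, n_nodes=128)   # 19267712
print(cifar, imagenet, imagenet_layer)
```

With only 128 nodes, the first layer alone already holds over 19 million parameters.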

CONVOLUTIONAL NEURAL NETWORKS
The smart way of looking at images

Basics of CNNs
We know that MLPs:
• Do not scale well for images
• Ignore the information brought by pixel position and correlation with neighbors
• Cannot handle translations
The general idea of CNNs is to intelligently adapt to properties of images:
• Pixel position and neighborhood have semantic meaning.
• Elements of interest can appear anywhere in the image.

Basics of CNNs
(Figure: MLP compared with CNN.)
CNNs are also composed of layers, but those layers are not fully connected: they have filters, sets of cube-shaped weights that are applied throughout the image. Each 2D slice of a filter is called a kernel. These filters introduce translation invariance and parameter sharing.
How are they applied? Convolutions!

Convolution and cross-correlation
• Convolution of f and g, written $f * g$, is defined as the integral of the product, having one of the functions inverted and shifted:
$$(f * g)(t) = \int_{-\infty}^{\infty} f(a)\, g(t - a)\, da$$
Here g is the function that is inverted and shifted by t.
• Discrete convolution:
$$(f * g)(t) = \sum_{a=-\infty}^{\infty} f(a)\, g(t - a)$$
• Discrete cross-correlation:
$$(f \star g)(t) = \sum_{a=-\infty}^{\infty} f(a)\, g(t + a)$$
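For finite sequences, the two discrete sums above can be written directly as loops. The sketch below assumes both sequences are zero outside their stored range, and the input values are made up. It also compares the results with NumPy's `np.convolve`, using the fact that cross-correlation equals convolution with the first sequence reversed.

```python
import numpy as np

def conv1d(f, g):
    """Full discrete convolution: out[t] = sum_a f[a] * g[t - a]."""
    out = np.zeros(len(f) + len(g) - 1)
    for t in range(len(out)):
        for a in range(len(f)):
            if 0 <= t - a < len(g):
                out[t] += f[a] * g[t - a]
    return out

def xcorr1d(f, g):
    """Cross-correlation: sum_a f[a] * g[t + a] for lags t = -(len(f)-1) .. len(g)-1."""
    out = []
    for t in range(-(len(f) - 1), len(g)):
        s = 0.0
        for a in range(len(f)):
            if 0 <= t + a < len(g):
                s += f[a] * g[t + a]
        out.append(s)
    return np.array(out)

# Illustrative sequences (not from the slides).
f = np.array([1.0, 2.0, 3.0])
g = np.array([0.0, 1.0, 0.5])

conv_result = conv1d(f, g)
xcorr_result = xcorr1d(f, g)

# Convolution matches NumPy; cross-correlation is convolution with f reversed.
same_as_numpy = np.allclose(conv_result, np.convolve(f, g))
flip_identity = np.allclose(xcorr_result, np.convolve(f[::-1], g))
```

Deep learning libraries typically compute the cross-correlation form in their "convolution" layers. Since the filter is learned, the flip makes no practical difference.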

Convolutions – step by step

Convolutions – another example

Convolutions – 3D input

Convolutions – what happens at the edges?
If we apply convolutions on a normal image, the result will be downsampled by an amount depending on the size of the filter.
We can avoid this by padding the edges in different ways.

Padding
• Full padding. Introduces zeros such that all pixels are visited the same number of times by the filter. Increases size of output.
• Same padding. Ensures that the output has the same size as the input.
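The effect of padding on output size can be checked with a small 2D sketch. It assumes stride 1, zero padding, an odd-sized square kernel and a made-up 5x5 image. The function names and the `"valid"` label for the unpadded case are conventions borrowed from common libraries, not from the slides.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding (stride 1)."""
    h, w = image.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def conv2d(image, kernel, padding="valid"):
    """Zero-pad the image according to the padding mode, then slide the kernel."""
    kh, kw = kernel.shape
    if padding == "same":        # output size equals input size (odd kernels)
        ph, pw = (kh - 1) // 2, (kw - 1) // 2
    elif padding == "full":      # every pixel is visited kh * kw times
        ph, pw = kh - 1, kw - 1
    else:                        # "valid": no padding, output shrinks
        ph, pw = 0, 0
    padded = np.pad(image, ((ph, ph), (pw, pw)), mode="constant")
    return conv2d_valid(padded, kernel)

image = np.arange(25, dtype=float).reshape(5, 5)   # illustrative 5x5 image
kernel = np.ones((3, 3)) / 9.0                     # illustrative 3x3 averaging filter

valid_shape = conv2d(image, kernel, "valid").shape   # (3, 3)
same_shape = conv2d(image, kernel, "same").shape     # (5, 5)
full_shape = conv2d(image, kernel, "full").shape     # (7, 7)
```

For an n x n input and a k x k filter, the three modes give outputs of side n - k + 1, n and n + k - 1 respectively.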

Convolutional layers
• Convolutional layer with four 3x3 filters on a black and white image (just one channel).
• Convolutional layer with four 3x3 filters on an RGB image. As you can see, the filters are now cubes, and they are applied on the full depth of the image.

• To be clear: each filter is convolved with the entirety of the 3D input cube, but generates a 2D feature map.
• Because we have multiple filters, we end up with a 3D output: one 2D feature map per filter.
• The feature map dimension can change drastically from one conv layer to the next: we can enter a layer with a 32x32x16 input and exit with a 32x32x128 output if that layer has 128 filters.
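The 32x32x16 to 32x32x128 example can be reproduced with a short NumPy sketch. The 3x3 filter size, the "same" zero padding, the random values and the function name `conv_layer_same` are assumptions made for illustration. The slide only fixes the input shape and the number of filters.

```python
import numpy as np

def conv_layer_same(x, filters):
    """x: (H, W, C_in); filters: (n_filters, k, k, C_in), k odd.
    Returns (H, W, n_filters): one 2D feature map per filter."""
    h, w, c_in = x.shape
    n_filters, k, _, c_f = filters.shape
    assert c_f == c_in, "each filter must span the full input depth"
    p = (k - 1) // 2
    xp = np.pad(x, ((p, p), (p, p), (0, 0)))     # zero-pad height and width only
    out = np.zeros((h, w, n_filters))
    for i in range(h):
        for j in range(w):
            patch = xp[i:i + k, j:j + k, :]      # k x k x C_in cube of the input
            # Each filter multiplies the whole cube and sums to a single number.
            out[i, j, :] = np.tensordot(filters, patch, axes=([1, 2, 3], [0, 1, 2]))
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=(32, 32, 16))            # input with 16 feature maps
filters = rng.normal(size=(128, 3, 3, 16))   # 128 filters, each 3x3x16 (assumed size)
out = conv_layer_same(x, filters)            # shape (32, 32, 128)
```

The output depth depends only on the number of filters, not on the depth of the input.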

Why does this make sense?
An image is just a matrix of pixels. Convolving the image with a filter produces a feature map that highlights the presence of a given feature in the image.
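As a small illustration of a feature map highlighting a feature, the sketch below applies a hand-picked vertical-edge kernel to a toy image whose left half is dark and right half is bright. Both the image and the kernel are assumptions chosen for the example. In a CNN the kernel values would be learned.

```python
import numpy as np

def feature_map(image, kernel):
    """Slide the kernel over the image (no padding, stride 1)."""
    h, w = image.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Toy 6x6 image: dark left half, bright right half.
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hand-picked kernel that responds to dark-to-bright vertical edges.
vertical_edge = np.array([[-1.0, 0.0, 1.0],
                          [-1.0, 0.0, 1.0],
                          [-1.0, 0.0, 1.0]])

fmap = feature_map(image, vertical_edge)
print(fmap)   # each row is [0. 3. 3. 0.]: large only where the window straddles the edge
```

The response is zero over flat regions and large where the window covers the edge.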


In a convolutional layer, we are basically applying multiple filters over the image to extract different features. But most importantly, we are learning those filters!
One thing we're missing: non-linearity.

Introducing ReLU
The most successful non-linearity for CNNs is the Rectified Linear Unit (ReLU). It combats the vanishing gradient problem occurring in sigmoids, is easier to compute, and generates sparsity (not always beneficial).
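ReLU is defined as max(0, x). The sketch below compares its gradient with the sigmoid's on a few made-up inputs, to illustrate the vanishing-gradient and sparsity points above.

```python
import numpy as np

def relu(x):
    """ReLU(x) = max(0, x), applied element-wise."""
    return np.maximum(0.0, x)

def relu_grad(x):
    """Gradient is 1 for positive inputs and 0 otherwise."""
    return (x > 0).astype(float)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)

z = np.array([-10.0, -1.0, 0.5, 10.0])   # illustrative pre-activations

a = relu(z)             # [0, 0, 0.5, 10]: negative inputs become exactly 0 (sparsity)
g_relu = relu_grad(z)   # [0, 0, 1, 1]: gradient stays 1 for any positive input
g_sig = sigmoid_grad(z) # below 1e-4 at |z| = 10: the sigmoid saturates
```

The sigmoid gradient never exceeds 0.25 and shrinks towards zero for large inputs, whereas the ReLU gradient stays at 1 on the positive side.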

Convolutional layer so far
• A convolutional layer convolves each of its filters with the input.
• Input: a 3D tensor, where the dimensions are Width, Height and Channels (or Feature Maps).
• Output: a 3D tensor, with dimensions Width, Height and Feature Maps (one for each filter).
• Applies non-linear activation function (usually ReLU) over each value of the output.
• Multiple parameters to define: number of filters, size of filters, stride, padding, activation function to use, regularization.
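How the number of filters, filter size, stride and padding determine a layer's output shape and parameter count can be sketched as follows. The helper names are hypothetical, and the example values (3x3 filters, padding of 1) are assumptions. The size formula is the standard one for square filters with symmetric zero padding.

```python
def conv_output_size(n, k, stride=1, padding=0):
    """Output length along one dimension: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - k) // stride + 1

def conv_layer_summary(height, width, c_in, n_filters, k, stride=1, padding=0):
    """Return the output shape and the number of learnable parameters."""
    out_h = conv_output_size(height, k, stride, padding)
    out_w = conv_output_size(width, k, stride, padding)
    n_params = n_filters * (k * k * c_in + 1)   # k*k*c_in weights + 1 bias per filter
    return (out_h, out_w, n_filters), n_params

# 32x32x16 input, 128 filters of size 3x3, stride 1, padding 1 (assumed values).
shape, n_params = conv_layer_summary(32, 32, 16, n_filters=128, k=3, padding=1)
print(shape, n_params)   # (32, 32, 128) 18560

# The same layer with stride 2 halves the spatial size.
shape_s2, _ = conv_layer_summary(32, 32, 16, n_filters=128, k=3, stride=2, padding=1)
print(shape_s2)          # (16, 16, 128)
```

The parameter count does not depend on the image size, only on the filter size, input depth and number of filters. This is the parameter sharing that lets CNNs scale to large images.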
