IDA Machine Learning Seminars - Fall 2015
Deep Convolutional Networks and their impact
on solving large scale visual recognition problems
Hossein Azizpour, Computer Vision Group, KTH. Thanks to: J. Sullivan, A. S. Razavian, A. Maki and S. Carlsson
Deep Learning, and ConvNets in particular, have resulted in:
Output (five guesses): scale, T-shirt, steel drum, drumstick, mud turtle. Ground truth: steel drum (correct).
Output (five guesses): scale, T-shirt, giant panda, drumstick, mud turtle. Ground truth: steel drum (incorrect).
Error = (1/100,000) Σ_i 1(incorrect on image i)
Source: Detecting avocados to zucchinis: what have we done, and where are we going? O. Russakovsky et al., ICCV 2013
Figure: performance of the winning entry in the ILSVRC competitions (2010-14). Classification error (%): 2010: 28.2, 2011: 25.8, 2012: 16.4, 2013: 11.7, 2014: 6.7. Red indicates when deep ConvNets were introduced.
(After some training period, a human classifier competing against a ConvNet on ImageNet is capable of an error rate in the range of 2%-3%.)
Figure: recent progress made by Baidu, MSR and Google. Classification error (%) continues to fall after the 2010-14 results (28.2, 25.8, 16.4, 11.7, 6.7): Jan 2015: 5.33, Feb 2015: 4.94, Mar 2015: 4.82.
Figure: progress of object detection for the Pascal VOC 2007 challenge, 2007-2015. Accuracy (10-80) is shown per class (plant, person, chair, cat, car, aeroplane) and over all classes; the recent jump coincides with the introduction of deep learning.
ConvNets → a much better image representation
Image retrieval. We have a database of images. Task: given a query image, rank the database images by how close they are to the query.
Figure: a query image and the database images ranked closest to it; each result is marked correct or incorrect.
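As a minimal sketch of the ranking step (not from the slides; the feature vectors here are made-up stand-ins for whatever image representation is used):

```python
import numpy as np

def rank_database(query, db):
    """Rank database feature vectors by Euclidean distance to the query.

    query: (d,) feature vector of the query image.
    db:    (n, d) matrix, one feature vector per database image.
    Returns indices of database images, closest first.
    """
    dists = np.linalg.norm(db - query, axis=1)
    return np.argsort(dists)

# Toy example: database image 2 is identical to the query, so it ranks first.
db = np.array([[1.0, 0.0], [0.0, 1.0], [0.3, 0.4]])
query = np.array([0.3, 0.4])
order = rank_database(query, db)
```

With a good representation, "close in feature space" should mean "same object or scene", which is exactly what the later slides claim ConvNet features deliver.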
Example pipeline (part-based recognition):
Image → Part Annotations (Strong DPM) → Learn Normalized Pose → Extract Features (RGB, gradient, LBP) → SVM.
The same pipeline with the hand-crafted features replaced by a CNN representation, trained on a labelled dataset.
Figure (AlexNet): input image 224×224×3 → convolutional layers with response maps 55×55×48, 27×27×128, 13×13×192, 13×13×192, 13×13×128 → fully connected layers of 4096 and 4096 units → output layer of 1000 units.
Figure: best state-of-the-art vs. ConvNet off-the-shelf + linear SVM, per task (performance, 40-100):

Task                          Best state-of-the-art   ConvNet off-the-shelf + SVM
Object Classification         71.1                    77.2
Scene Classification          64                      69
Bird Subcategorization        56.8                    61.8
Flowers Recognition           80.7                    86.8
Human Attribute Detection     69.9                    73
Object Attribute Detection    89.5                    91.4
Paris Buildings Retrieval     74.9                    79.5
Oxford Buildings Retrieval    81.7                    68
Sculptures Retrieval          45.4                    42.3
Scene Image Retrieval         81.9                    84.3
Object Instance Retrieval     89.3                    91.1
Source: CNN Features off-the-shelf: an Astounding Baseline for Recognition, A. Sharif Razavian et al., arXiv, March 2014.
Reason for the jump in performance: it's just supervised learning.
Figure: with features far from ideal, the decision boundary between classes is complicated; with ideal features it is simple.
Supervised deep learning allows you to learn more ideal features.
Traditional Pattern Recognition (fixed/handcrafted feature extraction):
Feature Extractor → Trainable Classifier
Modern Pattern Recognition (unsupervised mid-level features):
Feature Extractor → Mid-level Features → Trainable Classifier
Deep Learning (trained hierarchical representations):
Low-level Features → Mid-level Features → High-level Features → Trainable Classifier
Source: Talk Computer Perception with Deep Learning by Yann LeCun
Deep learning provides a mechanism to learn hierarchical representations, efficiently encoded in a deep structure.
Applications: image recognition, speech recognition, Google's and Baidu's photo taggers.
Benchmarks: ImageNet, Kaggle Facial Expression, Kaggle Multimodal Learning, German Traffic Signs, Connectomics, Handwriting...
Data types: images, sound, time-frequency representations, video, volumetric images, RGB-Depth images...
ConvNet timeline (1988 → 1998 → 2012): early versions did not train by backpropagation; the 1998 networks, trained end-to-end by gradient descent, popularized and deployed ConvNets for OCR applications etc.; other variants trained layer by layer up to the top layer (k-means at the second stage), mixing unsupervised and supervised learning.
Reasons for breakthrough now:
AlexNet 2012. Figure: input image 224×224×3 → convolutional layers (55×55×48, 27×27×128, 13×13×192, 13×13×192, 13×13×128) → fully connected layers (4096, 4096) → output of 1000 units.
Convolutional Networks for RGB Images: The Basic Operations
Let x_{1:m} = {x_1, ..., x_m} be a set of 2D feature maps, each x_i of size W × W, and k_{1:m} a set of filters, each k_i of size (2w+1) × (2w+1).

Convolutional operator: define conv(·, ·, ·), the convolution of x_{1:m} with k_{1:m}, as

  conv(x_{1:m}, k_{1:m}, b) = Σ_{i=1}^{m} (x_i ∗ k_i) + b

where b is a scalar bias.
Figure: input image 224×224×3 → convolution response maps 224×224×48.
Spatial filtering of an image f at position (x, y): the filter coefficients w(-1,-1), ..., w(1,1) sit over the image pixels f(x-1,y-1), ..., f(x+1,y+1) under the filter, and for a (2a+1) × (2b+1) filter

  g(x, y) = Σ_{s=-a}^{a} Σ_{t=-b}^{b} w(s, t) f(x + s, y + t)
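A minimal NumPy sketch of these two formulas (not from the slides; loop-based for clarity, not efficiency, and restricted to the 'valid' region where the filter fits inside the image):

```python
import numpy as np

def filter2d(f, w):
    """g(x, y) = sum_{s,t} w(s, t) * f(x+s, y+t), 'valid' region only."""
    a = w.shape[0] // 2          # filter half-height
    b = w.shape[1] // 2          # filter half-width
    H, W = f.shape
    g = np.zeros((H - 2 * a, W - 2 * b))
    for x in range(g.shape[0]):
        for y in range(g.shape[1]):
            g[x, y] = np.sum(w * f[x:x + 2 * a + 1, y:y + 2 * b + 1])
    return g

def conv(x, k, bias):
    """conv(x_{1:m}, k_{1:m}, b) = sum_i (x_i * k_i) + b over the m channels."""
    return sum(filter2d(xi, ki) for xi, ki in zip(x, k)) + bias

# 3x3 averaging filter applied to one constant 5x5 channel: output is all ones.
f = np.ones((5, 5))
w = np.ones((3, 3)) / 9.0
out = conv([f], [w], 0.0)
```

Note the sum over input channels: each output response map combines all m input maps, exactly as in the conv(x_{1:m}, k_{1:m}, b) definition above.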
Create a new 2D feature map by applying two more operators:

  x̃ = pool(σ(conv(x_{1:m}, k_{1:m}, b)))

where σ is a pointwise non-linearity and pool is spatial pooling.
Figure: input image 224×224×3 → activation response maps 224×224×48 → max-pooled response maps 55×55×48.
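The two extra operators can be sketched as follows (not from the slides; σ is taken to be the ReLU, and the pooling is non-overlapping 2×2 for simplicity, whereas AlexNet actually uses overlapping 3×3 pooling with stride 2):

```python
import numpy as np

def relu(x):
    """Pointwise non-linearity sigma(x) = max(x, 0)."""
    return np.maximum(x, 0.0)

def max_pool(x, size=2):
    """Non-overlapping size x size max pooling of a 2D map."""
    H, W = x.shape
    H2, W2 = H // size, W // size
    x = x[:H2 * size, :W2 * size]                 # drop any remainder rows/cols
    return x.reshape(H2, size, W2, size).max(axis=(1, 3))

a = np.array([[1.0, -2.0], [-3.0, 4.0]])
pooled = max_pool(relu(a))                        # one 2x2 window -> one value
```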
From layer l to layer l+1: given the layer-l feature maps

  x^(l)_{1:m_l} = {x^(l)_1, ..., x^(l)_{m_l}}

and filter banks k^(l+1)_{j,1:m_l}, j = 1, ..., m_{l+1}, each bank creates a new 2D feature map:

  x^(l+1)_j = pool(σ(conv(x^(l)_{1:m_l}, k^(l+1)_{j,1:m_l}, b^(l+1)_j)))

Figure: layer 1 output 55×55×48 → activation response maps 55×55×128 → max-pooled response maps 27×27×128.
In steps: for j = 1, ..., m_{l+1}, with filter bank k^(l+1)_{j,1:m_l}:

  x̂^(l+1)_j = conv(x^(l)_{1:m_l}, k^(l+1)_{j,1:m_l}, b^(l+1)_j)
  z^(l+1)_j = σ(x̂^(l+1)_j)
  x^(l+1)_j = pool(z^(l+1)_j)
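Putting the three steps together, one whole convolutional layer can be sketched in NumPy (not from the slides; a naive 'valid' correlation loop with ReLU and 2×2 max pooling, and made-up small shapes):

```python
import numpy as np

def conv_layer(x, k, b):
    """One ConvNet layer: x^{l+1}_j = pool(sigma(conv(x, k_j, b_j))).

    x: (m, H, W) input feature maps.
    k: (m_next, m, s, s) filter banks, b: (m_next,) biases.
    """
    m_next, m, s, _ = k.shape
    H, W = x.shape[1] - s + 1, x.shape[2] - s + 1   # 'valid' output size
    out = []
    for j in range(m_next):
        conv_j = np.zeros((H, W))
        for u in range(H):                          # sum over all m input maps
            for v in range(W):
                conv_j[u, v] = np.sum(k[j] * x[:, u:u + s, v:v + s])
        z = np.maximum(conv_j + b[j], 0.0)          # sigma = ReLU
        H2, W2 = H // 2, W // 2                     # 2x2 non-overlapping max pool
        z = z[:H2 * 2, :W2 * 2].reshape(H2, 2, W2, 2).max(axis=(1, 3))
        out.append(z)
    return np.stack(out)

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 8, 8))                  # 3 input maps of 8x8
k = rng.standard_normal((4, 3, 3, 3))               # 4 banks of 3x3 filters
y = conv_layer(x, k, np.zeros(4))                   # shape (4, 3, 3)
```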
Figure: AlexNet architecture (convolutional layers highlighted).
First fully connected layer (l_c is the last convolutional layer): for j = 1, ..., m_{l_c+1}

  x^(l_c+1)_j = max( Σ_{i=1}^{m_{l_c}} w^(l_c+1)_{j,i} · x^(l_c)_i + b^(l_c+1)_j , 0 )
Figure: AlexNet architecture (first fully connected layer highlighted).
Subsequent fully connected layer: for j = 1, ..., m_{l_c+2}

  x^(l_c+2)_j = max( w^(l_c+2)_j · x^(l_c+1) + b^(l_c+2)_j , 0 )
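A fully connected layer with this max(·, 0) non-linearity is one matrix-vector product (a minimal sketch, not from the slides):

```python
import numpy as np

def fc_relu(x, W, b):
    """Fully connected layer: x_j = max(w_j . x + b_j, 0), row w_j of W."""
    return np.maximum(W @ x + b, 0.0)

x = np.array([1.0, -1.0])
W = np.array([[2.0, 0.0],
              [0.0, 2.0]])
y = fc_relu(x, W, np.zeros(2))      # [2, -2] rectified to [2, 0]
```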
Figure: AlexNet architecture (second fully connected layer highlighted).
Output layer: compute the activations

  o′_j = w^(lc+3)_j · x^(lc+2) + b^(lc+3)_j ,

then the SoftMax outputs

  ŷ_r = exp(o′_r) / Σ_{j=1}^{M} exp(o′_j)
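The SoftMax turns the M activations into a probability distribution over classes; a standard numerically-stabilized sketch (the max subtraction is a common trick, not something the slide states):

```python
import numpy as np

def softmax(o):
    """y_r = exp(o_r) / sum_j exp(o_j); subtracting max(o) avoids overflow."""
    e = np.exp(o - np.max(o))
    return e / e.sum()

p = softmax(np.array([1.0, 1.0, 1.0]))   # equal activations -> uniform
```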
Figure: AlexNet architecture (output layer highlighted).
Parameters of the network:
* Convolutional layers: for l = 1, ..., l_c the filters k^(l)_{j,1:m_{l-1}}, 1 ≤ j ≤ m_l; each k^(l)_{j,i} has size w_l × w_l.
* First fully connected layer: w^(lc+1)_{j,i}, 1 ≤ j ≤ m_{lc+1}, 1 ≤ i ≤ m_{lc}; each w^(lc+1)_{j,i} and x^(lc)_i have equal size.
* Subsequent fully connected layers: for l = l_c+2, ..., l_c+L the vectors w^(l)_j, 1 ≤ j ≤ m_l; each w^(l)_j has length m_{l-1}.
* Input: x^(0)_{1:3} = {x_red channel, x_green channel, x_blue channel}.
Fit the parameters W = {W_convolutional, W_fully connected} to the network's prediction performance on a labelled training set D.

The network defines a function f_ConvNet : [0,1]^{W×W×3} × R^p → [0,1]^M, so for input x the function f_ConvNet predicts its label: f_ConvNet(x; W) = ŷ, the predicted label (class distribution) for input x in D.

Use a loss L(y, ŷ) that increases as the discrepancy between y and ŷ increases; here the cross-entropy

  L(y, f_ConvNet(x; W)) = − Σ_{j=1}^{M} y_j log(ŷ_j)

and the empirical loss over D

  E(D, W) = (1/|D|) Σ_{(x,y)∈D} L(y, f_ConvNet(x; W))

Learning is then the optimization problem min_W E(D, W).
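The two loss formulas, sketched directly in NumPy (not from the slides; the toy predictor and the small epsilon inside the log are illustration choices):

```python
import numpy as np

def cross_entropy(y, y_hat):
    """L(y, y_hat) = -sum_j y_j log(y_hat_j); y one-hot, y_hat a distribution."""
    return -np.sum(y * np.log(y_hat + 1e-12))    # epsilon guards log(0)

def empirical_loss(data, predict):
    """E(D, W) = (1/|D|) sum_{(x, y) in D} L(y, f(x; W))."""
    return np.mean([cross_entropy(y, predict(x)) for x, y in data])

# Toy predictor that ignores x and always outputs the same distribution.
predict = lambda x: np.array([0.7, 0.3])
data = [(None, np.array([1.0, 0.0])),            # true class 0: loss -log 0.7
        (None, np.array([0.0, 1.0]))]            # true class 1: loss -log 0.3
E = empirical_loss(data, predict)
```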
Our optimization problem: min_W E(D, W) = min_W (1/|D|) Σ_{(x,y)∈D} L(y, f_ConvNet(x; W)).
Solved with stochastic gradient descent (SGD):

  W^(t+1) = W^(t) − α^(t) ∇_W E(D^(t), W^(t))

where D^(t) is a mini-batch drawn from D. Is this good enough?
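The SGD update, sketched on a toy scalar problem (not from the slides; the quadratic objective, learning rate and batch size are made-up illustration choices):

```python
import numpy as np

def sgd(w, data, grad, lr=0.1, epochs=100, batch_size=2, seed=0):
    """Minimize E(D, w) by mini-batch SGD: w <- w - lr * grad(batch, w)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    for _ in range(epochs):
        rng.shuffle(data)                         # fresh mini-batches each epoch
        for i in range(0, len(data), batch_size):
            w = w - lr * grad(data[i:i + batch_size], w)
    return w

# Toy problem: E(D, w) = mean over x of (w - x)^2, so the minimizer is the
# data mean; the batch gradient is 2 * mean(w - x_batch).
grad = lambda batch, w: 2.0 * np.mean(w - batch)
w_star = sgd(0.0, [1.0, 2.0, 3.0, 4.0], grad)     # hovers near the mean 2.5
```

With a constant step size the iterate keeps bouncing around the minimizer, which is one reason the slides ask "is this good enough?" and move on to tricks of the trade.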
Next slides: Deep Learning for Vision: Tricks of the Trade, M. Ranzato, BAVM, Oct '13
Common wisdom: training does not work because we “get stuck in local minima”
Local minima are all similar, there are long plateaus, and it can take a long time to break symmetries. Optimization is not the real problem when:
– the dataset is large
– units do not saturate too much
– a normalization layer is used
Avoid overfitting by dropout: only classify with, and update, a random subset of the network at each training iteration.
When training, constantly monitor performance on a validation set.
Modular algorithms for backprop: a layer computes y = f(x; w) from its input x, and the rest of the network computes the scalar output (loss) z = g(y) ∈ R. Forward mode propagates activations through f; backward mode propagates the gradient ∂z/∂y back through the layer, yielding ∂z/∂x for the layer below and ∂z/∂w for the weight update.
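A minimal sketch of this modular view (not from the slides; scalar weights and a squared loss g(y) = y², chosen so the gradients can be checked by hand):

```python
class Linear:
    """Module y = f(x; w) = w * x with a matching backward pass."""
    def __init__(self, w):
        self.w = w
    def forward(self, x):
        self.x = x                       # cache input for the backward pass
        return self.w * x
    def backward(self, dz_dy):
        self.dw = dz_dy * self.x         # dz/dw = dz/dy * dy/dw
        return dz_dy * self.w            # dz/dx = dz/dy * dy/dx

# Chain two modules, apply the loss g(y) = y^2, then backpropagate.
f1, f2 = Linear(2.0), Linear(3.0)
y = f2.forward(f1.forward(1.5))          # forward: 3 * (2 * 1.5) = 9
dz_dy = 2.0 * y                          # d(y^2)/dy at y = 9
dz_dx = f1.backward(f2.backward(dz_dy))  # backward through both modules
```

Each module only needs its own forward rule and its own local derivatives; the chain rule composes them, which is what makes deep nets trainable layer by layer.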
Source: http://image-net.org/challenges/LSVRC/2013
Backpack, flute, strawberry, traffic light, bathing cap, matchstick, racket, sea lion.
Source: Detecting avocados to zucchinis: what have we done, and where are we going? O. Russakovsky et al., ICCV 2013
ILSVRC tasks: DET (detection), CLS+LOC (classification + localization).
ImageNet Classification with Deep Convolutional Neural Networks (NIPS ’12)
Figure: AlexNet architecture.
Trained on a large set of labeled training samples.
Figure: ILSVRC winners' classification error (%), 2010-14: 28.2, 25.8, 16.4, 11.7, 6.7.
ILSVRC 2014 top entries (all used ConvNets):

Name                    Institution
GoogLeNet               Google
VGG                     Oxford University
MSRA Visual Computing   Microsoft Research Asia
Andrew Howard           consultant
DeeperVision            company

For more details check out http://www.image-net.org/challenges/LSVRC/2014/results.php
Source: Detecting avocados to zucchinis: what have we done, and where are we going? O. Russakovsky et al., ICCV 2013
Figure: AlexNet architecture.
Figure (repeated): best state-of-the-art vs. ConvNet off-the-shelf + linear SVM across the eleven recognition and retrieval tasks.
Source: CNN Features off-the-shelf: an Astounding Baseline for Recognition, A. Sharif Razavian et al., arXiv, March 2014.
How to optimize ConvNet representations for transfer learning?
ImageNet provides abundant labelled training data for computer vision, but many target tasks are not large-scale classification and have little labelled data. But I still want to use a deep ConvNet representation: how should the source network be used to learn how to solve the target task?
Transfer pipeline: Target image → Source ConvNet → Target ConvNet Representation → SVM → Target label.
Design factors: which layer? spatial pooling? fine-tuning? network architecture? source task? early stopping?
The Source ConvNet comes from backprop with source images & labels: Random ConvNet ⇓ (training of Source ConvNet from scratch) → exploit Source ConvNet for the target task.
Task categories, ordered so that the task's distance from ImageNet increases → : Attribute Detection, Fine-grained Recognition, Compositional, Instance Retrieval.
Tasks: PASCAL VOC Object, H3D human attrib., Cat&Dog breeds, VOC Human Act., Holiday scenes, MIT 67 Indoor Scenes, Object attrib., Bird subordinate, Stanford 40 Act., Paris buildings, SUN 397 Scene, SUN scene attrib., 102 Flowers, Visual Phrases, Sculptures.
Recommended factor settings as the target task moves away from the source task (ImageNet):

Factor           Fine-grained recognition · · ·     · · · Instance retrieval
Source task      ImageNet                            ImageNet
Early stopping   Don't do it                         Don't do it
Fine-tuning      Yes, more improvement with more labelled data
Network depth    As deep as possible¹
Network width    Wider                               Moderately wide
Dimensionality   Original dim                        Reduced dim
Layer            Later layers                        Earlier layers

¹ In general the network should be as deep as possible, but in the final experiments a couple of the instance retrieval tasks defied this advice!
Figure: best non-ConvNet vs. Deep Standard vs. Deep Optimal (performance 30-100) on: VOC, MIT Scenes, SUN Scene Att., Obj. Att., Human Att., Pet breeds, Bird Sub., Flowers, VOC Action, Stanford Action, Vis. Phrases, Holidays, UKB, Oxford, Paris, Sculpture.
Reduce the amount of training data in two ways: remove all examples from a percentage of the classes, or remove a percentage of the examples from each class.
Figure: performance (50-95) vs. % of classes kept (full set: 1000) and vs. % of data kept (full set: ∼1.3M), for VOC07, MIT, H3D, UIUC, Pet, CUB, Flowers, VOC action, Holidays, UKB.
Do we learn a better representation with more layers or with more filters? Networks of different sizes (labelled A-J, with AlexNet as reference) were trained, altered either by increasing depth (depth = 8 vs. depth > 8) or by increasing width.
Figure: performance (50-90) vs. number of parameters (millions, 20-140) for each network, on VOC07, MIT, Pet and CUB.
Image captioning: automatically generate a sentence description for an image (image → sentence description). This is analogous to automatically solving a statistical translation problem: sentence in Swedish → same sentence in English. Encode the image with a ConvNet representation; decode it into a sentence using a Recurrent Neural Network.
- Deep Captioning with Multimodal Recurrent Neural Networks; Mao, Xu, Yang, Wang, Yuille.
- Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models; Kiros, Salakhutdinov, Zemel.
- Long-term Recurrent Convolutional Networks for Visual Recognition and Description; Donahue et al.
- Show and Tell: A Neural Image Caption Generator; Vinyals, Toshev, Bengio, Erhan.
- Deep Visual-Semantic Alignments for Generating Image Descriptions; Karpathy, Fei-Fei.
Model the sentence probability:

  log p(S | I) = Σ_{t=0}^{N} log p(S_t | f_ConvNet(I), S_0, S_1, ..., S_{t−1})

Each term p(S_t | f_ConvNet(I), S_0, ..., S_{t−1}) is modelled with a Recurrent Neural Network, h_{t+1} = f(h_t, x_t), where x_t is the embedding of word S_t.
Initialize the RNN with input from the image: x_{−1} = f_ConvNet(I). Then for t ∈ 0, ..., N:

  x_t = W_e S_t,  h_t = σ(W_h h_{t−1} + W_x x_t),  p_t = SoftMax(V h_t)
Figure: the unrolled network. The image feeds the first LSTM step; at step t the word embedding W_e S_{t−1} is the input, the LSTM emits p_t, and training maximizes log p_1(S_1) + log p_2(S_2) + ... + log p_N(S_N).
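The scoring recurrence above can be sketched in a few lines (not from the slides: tanh stands in for the LSTM cell, and all dimensions, weights and the word sequence are made-up toy values):

```python
import numpy as np

def softmax(o):
    e = np.exp(o - o.max())
    return e / e.sum()

def caption_log_prob(img_feat, words, We, Wh, Wx, V):
    """Score a sentence: x_{-1} = ConvNet(image), x_t = We[S_t],
    h_t = tanh(Wh h_{t-1} + Wx x_t), p_{t+1} = softmax(V h_t).
    Returns log p(S | I) = sum_t log p_t(S_t)."""
    h = np.tanh(Wx @ img_feat)           # image initializes the hidden state
    log_p = 0.0
    for word in words:
        p = softmax(V @ h)               # distribution over the vocabulary
        log_p += np.log(p[word])         # credit the observed word
        h = np.tanh(Wh @ h + Wx @ We[word])
    return log_p

# Toy dimensions: vocabulary of 4 words, embedding/state size 3.
rng = np.random.default_rng(0)
We = rng.standard_normal((4, 3))
Wh, Wx = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))
V = rng.standard_normal((4, 3))
score = caption_log_prob(rng.standard_normal(3), [0, 2, 1], We, Wh, Wx, V)
```

Generation would run the same loop but sample (or greedily pick) each word from p instead of reading it from a given sentence.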