Object Recognition with and without Objects Zhuotun Zhu, Lingxi Xie, Alan Yuille Johns Hopkins University

Object Recognition • A fundamental vision problem ✦ This task traditionally means each image has exactly one label that can take a single value among a finite number of choices. The assumption is that each image contains exactly one recognisable object (or perhaps none, in which case it takes the "background" label).

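Object Recognition • Before deep learning: hand-crafted pipelines for questions such as "Cat?" ✦ Local descriptors: SIFT, HOG, SURF ✦ Feature encodings: BoW, LLC, VLAD ✦ Classifiers: SVM, KNN, etc.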

Object Recognition • Deep learning ✦ Computational resources, e.g., GPU ✦ Large dataset, e.g., ImageNet

Object Recognition • Multiple layers of learned feature detectors :) • Local feature detectors are replicated across space :) • Detectors cover larger spatial regions in higher layers :) • Foreground and background are learnt together implicitly :( The first three claims are borrowed from G. E. Hinton’s recent talk, “What is wrong with convolutional neural nets”.

Intuitions • Two examples

Intuitions • Two examples Bird? Snake? Squirrel? Snail? Monkey? Lizard? Bat? Scorpion? … …

Key Questions • How well can deep neural networks learn from the pure foreground (object) and from the pure background (context)? • Do humans and networks differ in how they understand images, especially the foreground and the background? • What can the networks achieve when the foreground and background models are learnt separately?

Datasets • ILSVRC2012 [2]: 1K classes, 1.28M training images, 50K testing images ✦ Figure labels: images w/ bounding box, images w/o bounding box, annotated bounding box(es) ✦ Derived sets: OrigSet, FGSet, BGSet, HybridSet [2] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. Berg, and L. Fei-Fei. ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision, pages 1–42, 2015.
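The slide names FGSet and BGSet but does not spell out how they are built from the annotated bounding boxes. A minimal sketch of one plausible construction follows, assuming the foreground image is the smallest rectangle enclosing all boxes and the background image is the original with every box blacked out. The function names, the (x0, y0, x1, y1) box convention, and the toy image are hypothetical, not taken from the slides.

```python
import numpy as np

def enclosing_box(boxes):
    """Smallest rectangle (x0, y0, x1, y1) that covers every box."""
    xs0, ys0, xs1, ys1 = zip(*boxes)
    return min(xs0), min(ys0), max(xs1), max(ys1)

def make_foreground(image, boxes):
    """Assumed FGSet rule: crop to the rectangle enclosing all boxes."""
    x0, y0, x1, y1 = enclosing_box(boxes)
    return image[y0:y1, x0:x1].copy()

def make_background(image, boxes):
    """Assumed BGSet rule: black out the pixels inside every box."""
    out = image.copy()
    for x0, y0, x1, y1 in boxes:
        out[y0:y1, x0:x1] = 0
    return out

# Toy 6x8 white RGB image with two annotated boxes (hypothetical data).
image = np.full((6, 8, 3), 255, dtype=np.uint8)
boxes = [(1, 1, 3, 3), (4, 2, 6, 5)]

fg = make_foreground(image, boxes)
bg = make_background(image, boxes)
```

Under these assumptions, images without a bounding box annotation contribute to OrigSet only, since neither rule can be applied to them.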

Datasets • Summary of the datasets

Experiments • AlexNet [3] vs. human [3] A. Krizhevsky, I. Sutskever, and G. Hinton. ImageNet Classification with Deep Convolutional Neural Networks. NIPS, 2012.

Experiments • Cross Validation

Experiments • Ratio of bounding box [Figure: two panels, top-1 accuracy and top-5 accuracy. Each plots the accuracy averaged by class against the ratio of the bounding box w.r.t. the whole image, for OrigNet, FGNet, BGNet, and HybridNet.]
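The two quantities on the axes of this figure can be sketched as below: the ratio of the bounding box area to the whole image, and the top-k accuracy averaged by class. This is an illustrative sketch only. It assumes a single box per image in (x0, y0, x1, y1) form, and the helper names and toy scores are hypothetical.

```python
import numpy as np

def box_ratio(box, image_width, image_height):
    """Area of one bounding box divided by the area of the whole image."""
    x0, y0, x1, y1 = box
    return ((x1 - x0) * (y1 - y0)) / (image_width * image_height)

def topk_correct(scores, labels, k):
    """True where the label is among the k highest-scoring classes."""
    topk = np.argsort(-scores, axis=1)[:, :k]
    return (topk == labels[:, None]).any(axis=1)

def class_averaged_accuracy(scores, labels, k):
    """Top-k accuracy computed per class, then averaged over classes."""
    correct = topk_correct(scores, labels, k)
    per_class = [correct[labels == c].mean() for c in np.unique(labels)]
    return float(np.mean(per_class))

# Toy scores for 4 images over 3 classes (hypothetical data).
scores = np.array([
    [0.70, 0.20, 0.10],
    [0.10, 0.50, 0.40],
    [0.40, 0.35, 0.25],
    [0.20, 0.30, 0.50],
])
labels = np.array([0, 0, 1, 2])

ratio = box_ratio((0, 0, 50, 40), image_width=100, image_height=80)
top1 = class_averaged_accuracy(scores, labels, k=1)
top2 = class_averaged_accuracy(scores, labels, k=2)
```

Averaging by class rather than by image keeps classes with many test images from dominating the curve. To reproduce a plot of this kind, one would bin the test images by `ratio` and compute the accuracy within each bin.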

Experiments • Patch visualization [4] [4] J. Wang, Z. Zhang, V. Premachandran, and A. Yuille. Discovering Internal Representations from Object-CNNs Using Population Encoding. arXiv preprint, arXiv:1511.06855, 2015.

Experiments • Recognition with and without objects

Conclusions • AlexNet can learn reasonable models that exploit the correlation between the foreground object and the background context • AlexNet tends to outperform humans on backgrounds without objects, but is beaten by humans on foregrounds with objects • Combining the learnt networks can be beneficial for object recognition
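The slides do not say how the learnt networks are combined. One simple option, shown here purely as an assumption, is late fusion: convert each network's class scores to probabilities and take a weighted average. The function names, the weight `fg_weight`, and the toy logits are hypothetical.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def fuse(fg_logits, bg_logits, fg_weight=0.5):
    """Weighted average of foreground and background class probabilities."""
    return fg_weight * softmax(fg_logits) + (1.0 - fg_weight) * softmax(bg_logits)

# Toy logits for one image over 3 classes (hypothetical data):
# the foreground model is nearly undecided between classes 0 and 1,
# while the background model clearly favours class 1.
fg_logits = np.array([[2.0, 1.9, 0.0]])
bg_logits = np.array([[0.0, 3.0, 0.5]])

fused = fuse(fg_logits, bg_logits)
prediction = fused.argmax(axis=1)
```

In this toy case the foreground model alone picks class 0, and the background context tips the fused prediction to class 1, which is the kind of complementary evidence the conclusion refers to.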

Future Work • An end-to-end training framework that explicitly separates and then combines the foreground and background information
