Object Recognition
16-385 Computer Vision (Kris Kitani)
Carnegie Mellon University
Henderson and Davis. Shape recognition using hierarchical constraint analysis. 1979.
What do we mean by object recognition?
Is this a street light? (Verification / classification)
Where are the people? (Detection)
Is that Potala palace? (Identification)
What's in the scene? (Semantic segmentation)
(Labels: building, mountain, trees, vendors, people, ground, sky)
What type of scene is it? (Scene categorization)
(Labels: outdoor, city, marketplace)
Challenges in object recognition:
Viewpoint variation
Illumination variation
Scale variation
Background clutter
Deformation
Occlusion
Intra-class variation
Three approaches: spatial reasoning, window classification, and feature matching. First: feature matching.
An object as a collection of local features (bag-of-features).
What object do these parts belong to?
Some local features are very informative. Are the positions of the parts important?
Pros: simple; some local features are very informative on their own; robust to part deformation and occlusion.
Cons: ignores the spatial layout of the parts entirely.
Three approaches, continued. Next: spatial reasoning.
The position of every part depends on the positions of all the other parts (full spatial dependence). Many parts, many dependencies!
An old idea…
Fu and Booth. Grammatical inference. 1975. (Structural/grammatical description of a scene.)
1972: a structural description for the left edge of a face.
A more modern probabilistic approach: think of part locations as random variables (RVs).
Vector of RVs: the set of part locations L = {L1, L2, …, LM}, where each part location Lm = [x, y] is a pixel position in an image with N pixels.
What are the dimensions of the RV L? Each Lm is 2-D, so L has 2M dimensions.
How many possible combinations of part locations? N^M (e.g., a 96 × 96 image gives N = 9216, so M = 5 parts already allow about 6.6 × 10^19 configurations).
The most likely set of locations L is found by maximizing the posterior over part locations given the image:

L* = argmax_L p(L | I) ∝ p(I | L) p(L)

Likelihood p(I | L): how likely it is to observe image I given that the M parts are at locations L (the scaled output of a classifier).
Prior p(L): a spatial prior that controls the geometric configuration of the parts.

What kind of prior can we formulate? Given any collection of selfie images, where would you expect the nose to be? What would be an appropriate prior?
Option 1: break up the joint probability into smaller (independent) terms:

p(L) = ∏_m p(Lm)

Each part is allowed to move independently; this does not model the relative location of parts at all.
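Because both the likelihood and this prior factor over parts, MAP inference reduces to an independent argmax per part. A minimal numpy sketch, assuming each part m comes with a classifier score map scores[m] (interpreted as log p(I | Lm)) and a log-prior map log_priors[m] over the same H × W grid; all names here are illustrative, not from the lecture.

```python
import numpy as np

def map_locations_independent(scores, log_priors):
    """MAP part locations under the fully independent prior
    p(L) = prod_m p(L_m): each part is placed at the pixel that
    maximizes its own (log-likelihood + log-prior), ignoring all
    other parts."""
    locations = []
    for score, log_prior in zip(scores, log_priors):
        total = score + log_prior              # log p(I|L_m) + log p(L_m)
        idx = np.unravel_index(np.argmax(total), total.shape)
        locations.append(idx)                  # (row, col) of the best pixel
    return locations

# Toy usage: 3 parts on a 64x64 grid with random score maps and a flat prior.
rng = np.random.default_rng(0)
H = W = 64
scores = [rng.standard_normal((H, W)) for _ in range(3)]
log_priors = [np.zeros((H, W)) for _ in range(3)]
print(map_locations_independent(scores, log_priors))
```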
Option 2: represent the location of all the parts relative to a single reference part, the root (reference) node. This assumes that one reference part is defined (who will decide this?):

p(L) = p(Lroot) ∏_{m=1}^{M−1} p(Lm | Lroot)
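With this reference-part prior, the posterior still decomposes: for each candidate root location, every other part's best location can be chosen independently given the root. A brute-force sketch under the same assumptions as above, with a hypothetical Gaussian log-prior on the offset Lm − Lroot; this naive search is O(N² M), and practical systems (e.g., Felzenszwalb and Huttenlocher's pictorial structures) speed up the inner maximization with distance transforms.

```python
import numpy as np

def map_locations_star(root_score, part_scores, log_offset_priors):
    """Brute-force MAP inference for the reference-part prior
    p(L) = p(L_root) * prod_m p(L_m | L_root).

    root_score:        (H, W) log-likelihood map for the root part.
    part_scores:       list of (H, W) log-likelihood maps, one per non-root part.
    log_offset_priors: list of functions g_m(dy, dx) = log p(L_m - L_root = (dy, dx)).
    """
    H, W = root_score.shape
    ys, xs = np.mgrid[0:H, 0:W]                   # coordinates of every pixel
    best_total, best_config = -np.inf, None
    for ry in range(H):                           # try every root location
        for rx in range(W):
            total = root_score[ry, rx]
            config = [(ry, rx)]
            for score, g in zip(part_scores, log_offset_priors):
                q = score + g(ys - ry, xs - rx)   # likelihood + offset prior
                m = np.unravel_index(np.argmax(q), q.shape)
                total += q[m]
                config.append(m)
            if total > best_total:
                best_total, best_config = total, config
    return best_config

# Toy usage on a small grid: one root plus one part expected ~5 pixels below it.
rng = np.random.default_rng(1)
H = W = 16
gauss = lambda dy, dx: -((dy - 5) ** 2 + dx ** 2) / (2 * 2.0 ** 2)
print(map_locations_star(rng.standard_normal((H, W)),
                         [rng.standard_normal((H, W))],
                         [gauss]))
```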
Option 3: explicitly represent the joint distribution of locations (full spatial dependence: many parts, many dependencies). A good model: it captures the relative locations of all parts, BUT it is intractable for even a moderate number of parts.
Pros: models where parts belong relative to one another, so matches must be geometrically consistent.
Cons: a hand-designed spatial prior fits some categories poorly (e.g., modeling chairs), and richer dependence structures quickly become intractable.
Three approaches, continued. Last: window classification.
Template matching: find the 'nearest' exemplar and inherit its label.
When does this work and when does it fail? How many templates do you need?
(Figure: an exemplar template and its top hits from test data.)
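A minimal sketch of the nearest-exemplar idea: compare a query window against a stack of stored templates and inherit the label of the closest one. The distance metric (sum of squared differences here) and all names are illustrative choices, not prescribed by the slides.

```python
import numpy as np

def nearest_exemplar_label(window, templates, labels):
    """Classify a window by finding the stored exemplar template with
    the smallest sum-of-squared-differences and inheriting its label.
    window:    (H, W) grayscale patch.
    templates: (K, H, W) stack of exemplar patches.
    labels:    length-K list of class labels."""
    diffs = templates - window[None, :, :]   # broadcast over the K exemplars
    ssd = (diffs ** 2).sum(axis=(1, 2))      # one distance per template
    return labels[int(np.argmin(ssd))]

# Toy usage: two 8x8 exemplars; the query is a noisy copy of the second.
rng = np.random.default_rng(2)
templates = rng.standard_normal((2, 8, 8))
query = templates[1] + 0.1 * rng.standard_normal((8, 8))
print(nearest_exemplar_label(query, templates, ["street light", "person"]))
```

This makes the slide's questions concrete: raw pixel distances are brittle to illumination, scale, and viewpoint changes, so the number of templates needed grows quickly with appearance variation.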
1. Get an image window (or region proposals). Then do the rest, feature extraction and classification, with one big classifier: 'end-to-end learning'.
Convolution and pooling.
Convolution: slide a filter over the image patch (raw pixel values); the response of one filter at every position forms a response map.
Pooling: take the max (or min) response over a region.
A 96 × 96 image convolved with 400 filters (features) of size 8 × 8 generates about 3.2 million values (89² × 400). Pooling aggregates statistics and lowers the dimension of the convolution output.
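The dimension arithmetic above can be checked directly. A sketch, assuming 'valid' convolution (no padding, stride 1) and non-overlapping max pooling; the filter and image sizes match the slide's example, the pool size is an illustrative choice.

```python
import numpy as np

def conv_valid(image, filt):
    """Valid convolution (cross-correlation, as in CNNs): slide the
    filter over every position where it fully fits, giving an
    (H-K+1) x (W-K+1) response map."""
    H, W = image.shape
    K = filt.shape[0]
    out = np.empty((H - K + 1, W - K + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (image[i:i+K, j:j+K] * filt).sum()
    return out

def max_pool(response, p):
    """Non-overlapping p x p max pooling: aggregates statistics and
    lowers each spatial dimension by a factor of p."""
    H, W = response.shape
    trimmed = response[:H - H % p, :W - W % p]
    return trimmed.reshape(H // p, p, W // p, p).max(axis=(1, 3))

image = np.random.default_rng(3).standard_normal((96, 96))
resp = conv_valid(image, np.ones((8, 8)))   # one of the 400 filters
print(resp.shape)    # (89, 89): 89 * 89 * 400 = 3,168,400 values over all filters
print(max_pool(resp, 8).shape)   # (11, 11): far fewer values after pooling
```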
Krizhevsky, A., Sutskever, I., and Hinton, G. E. ImageNet Classification with Deep Convolutional Neural Networks. NIPS 2012.
630 million connections; 60 million parameters to learn.
First layer: 96 filters applied at stride 4 to a 224 × 224 input (224/4 ≈ 56 outputs per side).
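The '224/4 = 56' figure is shorthand; the exact output size follows from the standard convolution formula. A quick check, using the commonly cited AlexNet conv1 settings (11 × 11 filters, stride 4, padding 2) as assumptions rather than numbers from the slide:

```python
def conv_out_size(w, k, s, p=0):
    """Spatial output size of a convolution: floor((W - K + 2P) / S) + 1."""
    return (w - k + 2 * p) // s + 1

# AlexNet-style conv1: 224x224 input, 11x11 filters, stride 4, padding 2.
print(conv_out_size(224, 11, 4, 2))   # 55 -- the slide's 224/4 = 56 is the rough version
# Parameters in that layer: 96 filters x (11 * 11 * 3 weights) + 96 biases.
print(96 * 11 * 11 * 3 + 96)          # 34,944 of the ~60M total parameters
```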
Pros: features and classifier are learned together ('end to end'), giving state-of-the-art accuracy.
Cons: tens of millions of parameters to learn, so it needs large labeled datasets and heavy computation.
Deep Learning
(Joke slide: a parody résumé in which every entry, from contact details to experience, education, publications, and patents, simply reads 'Deep Learning'.)