Introduction to Recognition - Computer Vision CS 543 / ECE 549 - PowerPoint PPT Presentation



SLIDE 1

Introduction to Recognition

Computer Vision CS 543 / ECE 549 University of Illinois

Many Slides from D. Hoiem, L. Lazebnik.

SLIDE 2

Outline

  • Overview of image and region categorization
    – Task description
    – What is a category?
  • Example of a spatial pyramid bag-of-words scene categorizer
  • Key concepts: features and classification
  • Deep convolutional neural networks (CNNs)
SLIDE 3

Recognition as 3D Matching

Recognizing solid objects by alignment with an image. Huttenlocher and Ullman IJCV 1990.

“Instance” recognition vs. “category-level” recognition

SLIDE 4

Detection, semantic segmentation, instance segmentation

  • image classification
  • object detection
  • semantic segmentation
  • instance segmentation

SLIDE 5

“Classic” recognition pipeline

Image Pixels → Feature representation → Trainable classifier → Class label

SLIDE 6

Overview

Training: Training Images → Image Features → Classifier Training (with Training Labels) → Trained Classifier

Testing: Test Image → Image Features → Trained Classifier → Prediction (“Outdoor”)

SLIDE 7

Classifiers: Nearest neighbor

f(x) = label of the training example nearest to x

  • All we need is a distance or similarity function for our inputs
  • No training required!

[Figure: a test example plotted among training examples from class 1 and class 2.]
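As a quick sketch of the rule above, the following assumes toy 2D points and Euclidean distance (both illustrative choices, not from the slides):

```python
# Minimal 1-nearest-neighbor classifier sketch (toy data assumed).
import numpy as np

def nn_classify(x, train_X, train_y):
    """Return the label of the training example nearest to x (Euclidean distance)."""
    dists = np.linalg.norm(train_X - x, axis=1)
    return train_y[np.argmin(dists)]

train_X = np.array([[0.0, 0.0], [1.0, 1.0],   # class 1 examples
                    [5.0, 5.0], [6.0, 6.0]])  # class 2 examples
train_y = np.array([1, 1, 2, 2])

print(nn_classify(np.array([0.5, 0.5]), train_X, train_y))  # nearest example is class 1
```

Note there really is no training step: the "model" is just the stored training set plus the distance function.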

SLIDE 8

K-nearest neighbor classifier

  • Which classifier is more robust to outliers?

Credit: Andrej Karpathy, http://cs231n.github.io/classification/
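A small sketch of the outlier question, with an assumed toy dataset: a stray class-2 point sits inside the class-1 cluster, so 1-NN is fooled by it while 3-NN votes it down:

```python
# k-nearest-neighbor sketch illustrating robustness to outliers (toy data assumed).
import numpy as np
from collections import Counter

def knn_classify(x, train_X, train_y, k=1):
    """Majority vote among the k nearest training examples (Euclidean distance)."""
    nearest = np.argsort(np.linalg.norm(train_X - x, axis=1))[:k]
    return Counter(train_y[nearest]).most_common(1)[0][0]

train_X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],   # class 1 cluster
                    [5.0, 5.0], [6.0, 5.0], [5.0, 6.0],   # class 2 cluster
                    [0.4, 0.4]])                          # class 2 outlier inside cluster 1
train_y = np.array([1, 1, 1, 2, 2, 2, 2])

query = np.array([0.3, 0.3])
print(knn_classify(query, train_X, train_y, k=1))  # 2: fooled by the outlier
print(knn_classify(query, train_X, train_y, k=3))  # 1: the majority of the cluster wins
```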

SLIDE 9

Linear classifiers

  • Find a linear function to separate the classes:

f(x) = sgn(w · x + b)
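A minimal sketch of this decision rule, with assumed toy weights (in practice w and b are learned, e.g. by an SVM or logistic regression):

```python
# Linear classifier sketch: f(x) = sgn(w . x + b). Weights are toy assumptions.
import numpy as np

def linear_classify(x, w, b):
    """Return +1 or -1 depending on which side of the hyperplane x falls."""
    return int(np.sign(np.dot(w, x) + b))

w = np.array([1.0, 1.0])
b = -1.0  # decision boundary: x1 + x2 = 1

print(linear_classify(np.array([2.0, 2.0]), w, b))  # 1
print(linear_classify(np.array([0.0, 0.0]), w, b))  # -1
```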

SLIDE 10
  • Linearly separable dataset in 1D:
  • Non-separable dataset in 1D:
  • We can map the data to a higher-dimensional space:

[Figure: the non-separable 1D data x mapped to the separable 2D space (x, x²).]

Nonlinear SVMs

Slide credit: Andrew Moore
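The 1D example above can be sketched as follows; the dataset and the threshold x² = 1.5 are illustrative assumptions:

```python
# Lifting trick sketch: class 1 sits in the middle of the line, class 2 on both
# sides, so no single threshold on x separates them. After mapping x -> (x, x^2)
# a horizontal line in the lifted space does. Toy data assumed.
import numpy as np

X = np.array([-3.0, -2.0, -0.5, 0.0, 0.5, 2.0, 3.0])
y = np.array([2, 2, 1, 1, 1, 2, 2])

Phi = np.stack([X, X**2], axis=1)          # lift each point to (x, x^2)
pred = np.where(Phi[:, 1] < 1.5, 1, 2)     # separate with the line x2 = 1.5

print((pred == y).all())  # True
```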

SLIDE 11

Bag of features

1. Extract local features
2. Learn a “visual vocabulary”
3. Quantize local features using the visual vocabulary
4. Represent images by frequencies of “visual words”
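A compact sketch of these steps; the descriptors are random stand-ins for real local features (e.g. SIFT), and the vocabulary here is random rather than learned with k-means, to keep the sketch self-contained:

```python
# Bag-of-features sketch: quantize local descriptors against a visual vocabulary
# and describe the image by its histogram of visual words. All data is a toy
# stand-in; the vocabulary would normally be learned with k-means (step 2).
import numpy as np

rng = np.random.default_rng(0)
vocab = rng.normal(size=(10, 8))         # step 2: 10 visual words, 8-dim (assumed random here)
descriptors = rng.normal(size=(50, 8))   # step 1: 50 local features from one image

# Step 3: assign each descriptor to its nearest visual word.
dists = np.linalg.norm(descriptors[:, None, :] - vocab[None, :, :], axis=2)
words = np.argmin(dists, axis=1)

# Step 4: represent the image by normalized visual-word frequencies.
hist = np.bincount(words, minlength=len(vocab)) / len(descriptors)
print(hist.sum())  # 1.0
```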

SLIDE 12

Digit Classification Case Study

SLIDE 13

The MNIST DATABASE of handwritten digits

Yann LeCun & Corinna Cortes

  • Has a training set of 60K examples (6K per digit) and a test set of 10K examples.
  • Each digit is a 28 × 28 pixel grey-level image. The digit itself occupies the central 20 × 20 pixels, and its center of mass lies at the center of the box.
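The normalization described above (center of mass placed at the center of the box) can be sketched like this; the blob patch is an assumed stand-in for a real digit:

```python
# Sketch of center-of-mass normalization as described for MNIST: shift a digit
# patch inside a 28x28 box so its center of mass lands (up to rounding) at the
# center. The patch is a toy stand-in, not a real digit.
import numpy as np

def center_in_box(patch, box=28):
    """Place `patch` in a box x box image with its center of mass near the center."""
    ys, xs = np.nonzero(patch)
    w = patch[ys, xs].astype(float)
    cy = (ys * w).sum() / w.sum()
    cx = (xs * w).sum() / w.sum()
    oy = int(round(box / 2 - cy))  # top-left offset that moves the COM to box/2
    ox = int(round(box / 2 - cx))
    out = np.zeros((box, box))
    out[oy:oy + patch.shape[0], ox:ox + patch.shape[1]] = patch
    return out

digit = np.zeros((20, 20))
digit[5:9, 12:16] = 1.0  # an off-center blob standing in for a digit
img = center_in_box(digit)
```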

SLIDE 14

Bias-Variance Trade-off

[Figure: “Performance on MNIST Dataset”, showing error rate vs. number of training examples (log scale) for four feature/classifier pairs: Gradient, Int; Gradient, Linear; Raw, Poly; Raw, Rbf.]

SLIDE 15

Bias and Variance

SLIDE 16

Bias-Variance Trade-off

Performance as a function of model complexity (SVM)

SLIDE 17

Model Selection

SLIDE 18

Bias-Variance Trade-off

As a function of dataset size

SLIDE 19

Generalization Error

[Figure: training and testing error vs. number of training examples for a fixed classifier; the gap between the curves is the generalization error.]

SLIDE 20

Features vs Classifiers

[Figure: “Performance on MNIST Dataset”, showing error rate vs. number of training examples (log scale) for four feature/classifier pairs: Gradient, Int; Gradient, Linear; Raw, Poly; Raw, Rbf.]

SLIDE 21

What are the right features?

Depend on what you want to know!

  • Object: shape
    – Local shape info, shading, shadows, texture
  • Scene: geometric layout
    – Linear perspective, gradients, line segments
  • Material properties: albedo, feel, hardness
    – Color, texture
  • Action: motion
    – Optical flow, tracked points

SLIDE 22

Stuff vs Objects

  • Recognizing cloth/fabric (stuff) vs. recognizing cups (objects)
SLIDE 23

Feature Design Process

  • 1. Start with a model.
  • 2. Look at errors on a development set.
  • 3. Think of features that could improve performance.
  • 4. Develop the new model; test whether the new features help.
  • 5. If not happy, go to step 1.
  • 6. “Ablations”: simplify the system, pruning out features that no longer help in the presence of other features.
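Step 4 can be sketched as a held-out comparison with and without a candidate feature; the data and the nearest-centroid classifier are assumed stand-ins, not the slide's method:

```python
# Feature-evaluation sketch: train the same classifier with and without a
# candidate feature and compare held-out error. Toy data: the existing feature
# is uninformative noise, the candidate correlates with the label.
import numpy as np

rng = np.random.default_rng(0)
n = 400
y = rng.integers(0, 2, size=n)
old_feature = rng.normal(size=n)                 # uninformative existing feature
new_feature = y + rng.normal(scale=0.3, size=n)  # candidate feature under test

def holdout_error(X, y, split=200):
    """Nearest-centroid classifier: fit on the first half, report error on the second."""
    Xtr, ytr, Xte, yte = X[:split], y[:split], X[split:], y[split:]
    c0 = Xtr[ytr == 0].mean(axis=0)
    c1 = Xtr[ytr == 1].mean(axis=0)
    pred = (np.linalg.norm(Xte - c1, axis=1) < np.linalg.norm(Xte - c0, axis=1)).astype(int)
    return float(np.mean(pred != yte))

err_without = holdout_error(old_feature[:, None], y)
err_with = holdout_error(np.stack([old_feature, new_feature], axis=1), y)
# The candidate feature lowers held-out error, so it stays; an ablation would
# likewise try removing the old feature now that the new one is present.
```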
SLIDE 24

Features vs Classifiers

[Figure: “Performance on MNIST Dataset”, showing error rate vs. number of training examples (log scale) for four feature/classifier pairs: Gradient, Int; Gradient, Linear; Raw, Poly; Raw, Rbf.]

SLIDE 25

“Classic” recognition pipeline

Image Pixels → Feature representation → Trainable classifier → Class label

SLIDE 26

Categorization involves features and a classifier

Training: Training Images → Image Features → Classifier Training (with Training Labels) → Trained Classifier

Testing: Test Image → Image Features → Trained Classifier → Prediction (“Outdoor”)

SLIDE 27

New training setup with moderate sized datasets

Pretraining: dataset similar to the task, with millions of labeled examples → Initialize CNN Features

Training: Training Images + Training Labels → Tune CNN features and neural network classifier → Trained Classifier
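A numpy sketch of this setup, with loud assumptions: the “pretrained CNN features” are replaced by a frozen random ReLU projection, and only a small logistic-regression head is trained on toy target data:

```python
# Transfer-learning sketch: keep a feature extractor fixed ("initialized from
# pretraining"; here just a frozen random ReLU projection as a stand-in for CNN
# features) and train only the classifier head on the target dataset.
import numpy as np

rng = np.random.default_rng(0)

# "Pretrained" feature extractor: frozen weights, never updated below.
W_feat = rng.normal(size=(64, 2))
def features(X):
    return np.maximum(X @ W_feat.T, 0.0)

# Moderate-sized labeled target dataset (toy two-class data).
y = rng.integers(0, 2, size=200)
X = rng.normal(size=(200, 2))
X[:, 0] += 2.0 * y  # class 1 shifted along the first axis

# Train only the logistic-regression head by gradient descent.
Phi = features(X)
w = np.zeros(Phi.shape[1])
b = 0.0
for _ in range(1000):
    p = 1.0 / (1.0 + np.exp(-(Phi @ w + b)))
    g = p - y
    w -= 0.1 * Phi.T @ g / len(y)
    b -= 0.1 * g.mean()

train_acc = float((((Phi @ w + b) > 0).astype(int) == y).mean())
```

Fine-tuning proper would also update the feature weights with a small learning rate; the sketch freezes them to keep the two stages distinct.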
