9.54 Class 16
Features for recognition
supervised, unsupervised and innate
Shimon Ullman + Tomaso Poggio Danny Harari + Daniel Zysman + Darren Seibert
Visual recognition
The initial input is just image intensities
Individual Recognition
Object parts
Headlight, window, door knob, back wheel, mirror, front wheel, headlight, window, bumper
Categorization: dealing with class variability
Class Non-class
Natural for the brain, difficult computationally
Features and Classifiers
Generic Features
Simple (wavelets); complex (geons)
Image features Classifier
Visual Class: Similar Configurations of Shared Image Components
What are the optimal image building blocks for the class?
Optimal Class Components?
Find features that carry the highest amount of information
Mutual Information I(C;F)
I(C;F) = H(C) - H(C|F)
where H(C) and H(C|F) are the class entropy and the conditional entropy of the class given a feature:
$H(C) = -\sum_{c \in C} P(c) \log P(c)$
$H(C|F) = -\sum_{f \in F} p(f) \sum_{c \in C} P(c|f) \log P(c|f)$
Mutual Information I(C;F)
[Figure: rows of binary class labels and binary feature responses across examples]
I(C;F) = H(C) - H(C|F)
Computing MI from Examples
Examples: 100 faces, 100 non-faces
Feature detected: 44 times in faces, 6 times in non-faces
Mutual information: 0.1525
H(C) = 1, H(C|F) = 0.8475
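The numbers on this slide can be checked with a few lines of code. This is a minimal sketch, assuming 100 face and 100 non-face examples (consistent with H(C) = 1) and a feature that fires in 44 faces and 6 non-faces; the function names are illustrative, not from the lecture.

```python
from math import log2

def entropy(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * log2(p) for p in probs if p > 0)

def mutual_information(n_class, n_nonclass, hits_class, hits_nonclass):
    """Return H(C), H(C|F) and I(C;F) = H(C) - H(C|F) for binary C and F."""
    total = n_class + n_nonclass
    h_c = entropy([n_class / total, n_nonclass / total])
    h_c_given_f = 0.0
    # Sum over the two feature values: detected (f=1) and not detected (f=0).
    for in_class, in_nonclass in (
        (hits_class, hits_nonclass),
        (n_class - hits_class, n_nonclass - hits_nonclass),
    ):
        n_f = in_class + in_nonclass
        if n_f == 0:
            continue  # this feature value never occurs
        h_c_given_f += (n_f / total) * entropy([in_class / n_f, in_nonclass / n_f])
    return h_c, h_c_given_f, h_c - h_c_given_f

# Assumed counts: 100 faces, 100 non-faces, feature fires 44 and 6 times.
h_c, h_c_f, mi = mutual_information(100, 100, 44, 6)
print(f"H(C)={h_c:.4f}  H(C|F)={h_c_f:.4f}  I(C;F)={mi:.4f}")
```

Under these assumed counts the result is I(C;F) of about 0.153 bits and H(C|F) of about 0.847 bits, close to the values quoted on the slide.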
Simple neural-network approximations
Optimal classification features
error
Error = H – I(C;F)
Mutual Info vs. Threshold
[Figure: mutual information (y-axis) as a function of detection threshold (x-axis), one curve per fragment: forehead, hairline, mouth, eye, nose, nosebridge, long_hairline, chin, twoeyes]
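One way to read the threshold plot: each fragment has a detection threshold at which its mutual information with the class peaks. The sketch below scans candidate thresholds for a single fragment. The similarity scores are made-up toy values, and `best_threshold` is a hypothetical helper, not the lecture's procedure.

```python
from math import log2

def h(*counts):
    """Entropy in bits of a distribution given by raw counts."""
    n = sum(counts)
    return -sum(c / n * log2(c / n) for c in counts if c > 0) if n else 0.0

def binary_mi(n_pos, n_neg, hit_pos, hit_neg):
    """I(C;F) for a binary feature that fired hit_pos / hit_neg times."""
    n = n_pos + n_neg
    fired = hit_pos + hit_neg
    silent = n - fired
    h_cond = (fired / n) * h(hit_pos, hit_neg) \
        + (silent / n) * h(n_pos - hit_pos, n_neg - hit_neg)
    return h(n_pos, n_neg) - h_cond

def best_threshold(class_scores, nonclass_scores):
    """Scan thresholds; the feature counts as detected when score >= threshold."""
    best = (0.0, None)
    for t in sorted(set(class_scores) | set(nonclass_scores)):
        hit_pos = sum(s >= t for s in class_scores)
        hit_neg = sum(s >= t for s in nonclass_scores)
        mi = binary_mi(len(class_scores), len(nonclass_scores), hit_pos, hit_neg)
        if mi > best[0]:
            best = (mi, t)
    return best

# Toy similarity scores (illustrative only).
mi_best, t_best = best_threshold([0.9, 0.8, 0.7, 0.5], [0.6, 0.55, 0.2, 0.1])
print(t_best, round(mi_best, 4))
```

A threshold that is too low lets the fragment fire on non-class images; one that is too high misses class images. Both extremes reduce the information delivered.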
‘Imprinting’ many receptive fields and selecting a subset
(Avoiding redundancy by max-min selection)
$\Delta MI = \max_i \min_k \Delta MI(F_i, F_k)$
Compare new fragments F_i to all the previously selected ones F_k. Select the F_i which maximizes the additional information.
Competition between units with similar responses
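The max-min rule can be sketched as a greedy loop. This sketch assumes binary features and takes the additional information of F_i over F_k to be I(C; F_i, F_k) - I(C; F_k); that is one reading of the slide's formula. The feature names and toy data are invented for illustration.

```python
from collections import Counter
from math import log2

def entropy(symbols):
    """Empirical entropy in bits of a list of hashable symbols."""
    n = len(symbols)
    return -sum(c / n * log2(c / n) for c in Counter(symbols).values())

def mi(labels, *features):
    """I(C; F1..Fn) = H(C) + H(F) - H(C, F), with F the joint feature tuple."""
    joint = list(zip(*features))
    return entropy(labels) + entropy(joint) - entropy(list(zip(labels, joint)))

def max_min_select(labels, candidates, k):
    """Greedy max-min selection of k feature names from a dict of candidates."""
    # Seed with the single most informative feature.
    selected = [max(candidates, key=lambda name: mi(labels, candidates[name]))]
    while len(selected) < k:
        def gain(name):
            # Worst-case information added beyond any already selected feature.
            return min(
                mi(labels, candidates[name], candidates[s]) - mi(labels, candidates[s])
                for s in selected
            )
        remaining = [name for name in candidates if name not in selected]
        selected.append(max(remaining, key=gain))
    return selected

labels = [1, 1, 1, 1, 0, 0, 0, 0]
candidates = {
    "eye":         [1, 1, 1, 0, 0, 0, 0, 0],
    "eye_shifted": [1, 1, 1, 0, 1, 0, 0, 0],  # nearly redundant with "eye"
    "mouth":       [0, 0, 0, 1, 0, 0, 0, 0],  # weak alone, complementary to "eye"
}
chosen = max_min_select(labels, candidates, 2)
print(chosen)
```

In this toy data "mouth" has lower mutual information on its own than "eye_shifted", yet it is picked second because it covers the example that "eye" misses.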
Optimal receptive fields for Faces
Ullman et al Nature Neuroscience 2002
Horse-class features Car-class features
$\sum_k w_k F_k > \theta$
The sum runs over all detected fragments within their regions.
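The decision rule is a weighted vote over detected fragments. A minimal sketch, assuming binary fragment responses; the weights and threshold below are placeholder values, since the slide does not state how they are set.

```python
def classify(fragment_responses, weights, theta):
    """Return (decision, score), where decision is sum_k w_k * F_k > theta."""
    # Fragments that were not detected contribute a response of 0.
    score = sum(w * fragment_responses.get(name, 0) for name, w in weights.items())
    return score > theta, score

# Placeholder weights and threshold (assumptions for illustration).
weights = {"eye": 2.0, "mouth": 1.5, "hairline": 0.5}
theta = 2.5

is_face, score = classify({"eye": 1, "mouth": 1}, weights, theta)
not_face, low_score = classify({"hairline": 1}, weights, theta)
print(is_face, score, not_face, low_score)
```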
Star model
Detected fragments 'vote' for the center location
Find the location with the maximal vote
In variations, a popular state-of-the-art scheme
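The voting step can be sketched with a coarse accumulator grid. The fragment names, offsets, and cell size below are assumptions for illustration; real systems typically smooth the votes rather than use hard cells.

```python
from collections import Counter

def vote_for_center(detections, offsets, cell=10):
    """Accumulate center votes and return (center_xy, vote_count).

    detections: list of (fragment_name, x, y) image positions.
    offsets: fragment_name -> (dx, dy), learned displacement to the object center.
    """
    votes = Counter()
    for name, x, y in detections:
        dx, dy = offsets[name]
        # Quantize the predicted center so that nearby votes fall in one cell.
        votes[((x + dx) // cell, (y + dy) // cell)] += 1
    (cx, cy), count = votes.most_common(1)[0]
    # Report the middle of the winning cell.
    return (cx * cell + cell // 2, cy * cell + cell // 2), count

offsets = {"eye": (0, 20), "mouth": (0, -20), "nose": (0, 0)}
detections = [
    ("eye", 100, 80),
    ("mouth", 102, 121),
    ("nose", 101, 103),
    ("eye", 30, 200),  # isolated false detection
]
center, count = vote_for_center(detections, offsets)
print(center, count)
```

The isolated false detection casts a single vote elsewhere and is outvoted by the three consistent fragments.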
Image parts informative for classification
Fergus, Perona, Zisserman 2003 Agarwal, Roth 2002
Ullman, Sali 1999
Variability of Airplanes Detected
Image representation for recognition: HoG descriptor
Dalal, N. & Triggs, B. Histograms of Oriented Gradients for Human Detection
Object model using HoG
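The core of a HoG descriptor is a histogram of gradient orientations over a small image cell. The sketch below is a simplified single-cell version: it uses centered differences, hard binning into 9 unsigned orientation bins, and L2 normalization. It omits the block normalization and vote interpolation of the full descriptor, and the function name is illustrative.

```python
from math import atan2, hypot, pi

def cell_orientation_histogram(cell, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations in a cell."""
    hist = [0.0] * bins
    rows, cols = len(cell), len(cell[0])
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # horizontal centered difference
            gy = cell[y + 1][x] - cell[y - 1][x]  # vertical centered difference
            magnitude = hypot(gx, gy)
            angle = atan2(gy, gx) % pi            # unsigned orientation in [0, pi)
            # Clamp guards against rounding that lands exactly on pi.
            hist[min(int(angle / pi * bins), bins - 1)] += magnitude
    norm = sum(v * v for v in hist) ** 0.5
    return [v / norm for v in hist] if norm else hist

# A 4x4 cell containing a vertical edge: all gradient energy is horizontal.
edge_cell = [[0, 0, 1, 1]] * 4
hog_hist = cell_orientation_histogram(edge_cell)
print(hog_hist)
```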
fMRI
Functional Magnetic Resonance Imaging
Looking for Class Features in the Brain: fMRI
Lerner, Epshtein, Ullman, Malach, JOCN 2008
Malach et al 2008
EEG
Harel, Ullman, Epshtein, Bentin
[Figure: responses to face features over time (x-axis: milliseconds, ticks at 200, 400, 600) at posterior-temporal sites, left hemisphere and right hemisphere panels; curves labeled MI 1 to MI 5]
Harel, Ullman, Epshtein, Bentin, Vis Res 2007
Innate mechanisms for unsupervised learning
Object 1 Object 2 Background
[Kellman & Spelke 1983; Spelke 1990; Kestenbaum et al., 1987]
5 months
Even basic Gestalt cues are initially missing
[Schmidt et al. 1986]
Adults
Grouping by common motion precedes figural goodness
[Spelke 1990 - review]
Motion discontinuities provide an early cue for occlusion boundaries
[Granrud et al. 1984]
Static segregation cues: local occlusion boundaries (boundary), object form (global)
Motion cues: motion discontinuities (boundary), common motion (global)
Boundary cues: general, accurate, but noisy and incomplete
Global cues: object-specific, complete, but inaccurate
Motion-based segregation
Dorfman, Harari & Ullman, CogSci 2013
Boundary
Extremal edges, convexity, T-junctions
[Ghose & Palmer 2010]
Boundary
Global
Motion
Motion Boundary Global
Figure Ground Unknown
Need many examples for good results (1000+)
Figure or ground? Novel object, novel background
78% success using 100,000 training examples
Boundary
Figure Background
Standard object recognition algorithm
Learns local features and their relative locations
Global
Combined
Boundary: accurate, but noisy and incomplete
Global: complete, but inaccurate
Figure Background
Default GrabCut vs. GrabCut with segregation cue
[Rother et al. 2004]
More complex algorithms
Boundary: motion discontinuities, occlusion boundaries
(Need a rich library, including extremal edges)
Global: common motion, object form
Adult segregation is much more complex
such as object recognition and segregation.
manner given labeled examples.
unsupervised manner using statistical regularities or domain-specific cues.