Shifting from Naming to Describing: Semantic Attribute Models
Rogerio Feris, June 2014
Learning Visual Semantics: Models, Massive Computation, and Innovative Applications CVPR 2014
Recap: Training Data → Low-Level Feature Extraction → Feature Coding and Pooling → Large-Scale Semantic Modeling
Slide credit: Rogerio Feris
ImageNet has 30 mushroom synsets, each with ≈1000 images.
Slide credit: Christoph Lampert
In nature, there are ≈14,000 mushroom species.
Slide adapted from Christoph Lampert Image: http://www.evogeneao.com/
have classes with few or no training examples at all.
Suspect Search in Surveillance Videos
[Feris et al, IBM]
available, only textual descriptions.
Prediction of concrete nouns from neural imaging data (mind reading) [Mark Palatucci et al, NIPS 2009]
examples (costly label acquisition)
Similar problems in other fields:
each word (need sub-word modeling like phonemes)
Large Vocabulary Speech Recognition
ratings (also known as the “cold-start problem”) [Schein et al, SIGIR 2002]
Recommendation Systems
[Lampert et al, CVPR 2009] [Farhadi et al, CVPR 2009] [Palatucci et al, NIPS 2009]
Slide adapted from Christoph Lampert
Attributes:
properties that are shared across classes
representation
[Figure: standard multi-class classification vs. attribute-based classification; figure labels: Unseen categories, Semantic Attribute Classifiers] [Lampert et al, CVPR 2009]
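A minimal sketch of attribute-based classification, in the spirit of [Lampert et al, CVPR 2009] but not their exact model. It assumes per-attribute probabilities have already been produced by attribute classifiers. The class names, attribute names, and signature table are hypothetical, and the scoring rule is a simplified product of per-attribute likelihoods with no class or attribute priors.

```python
import math

# Hypothetical class-attribute signatures (1 = class has the attribute).
# Attribute order: (striped, hooved, aquatic). These classes may have
# no training images at all; only their descriptions are needed.
SIGNATURES = {
    "zebra": (1, 1, 0),
    "tiger": (1, 0, 0),
    "dolphin": (0, 0, 1),
}


def class_log_score(attr_probs, signature, eps=1e-6):
    """Log-likelihood of a class signature given attribute probabilities."""
    score = 0.0
    for p, a in zip(attr_probs, signature):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        score += math.log(p if a == 1 else 1.0 - p)
    return score


def predict_unseen(attr_probs, signatures):
    """Pick the class whose signature best explains the attribute outputs."""
    return max(signatures, key=lambda c: class_log_score(attr_probs, signatures[c]))


# Attribute classifier outputs for one image (made-up numbers).
print(predict_unseen((0.9, 0.8, 0.1), SIGNATURES))
```

Adding a new category only requires a new row in the signature table, which is the property that makes prediction for unseen categories possible.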
Semantic Output Code Classifier (SOCC)
[Palatucci et al, NIPS 2009]
Similar to Error-Correcting Output codes (ECOC [Dietterich & Bakiri, 1995]), but semantic codes are used instead
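A minimal sketch of the two-stage idea behind a semantic output code classifier: map the input to a semantic code, then look up the nearest class code. The knowledge base, the feature names, and the identity weight matrix are hypothetical placeholders. In practice the mapping would be learned by regression on the classes that do have training data.

```python
import math

# Hypothetical semantic knowledge base: class -> semantic code.
# Code dimensions: (is_animal, is_man_made, can_fly)
KNOWLEDGE_BASE = {
    "dog": (1.0, 0.0, 0.0),
    "bird": (1.0, 0.0, 1.0),
    "airplane": (0.0, 1.0, 1.0),
    "hammer": (0.0, 1.0, 0.0),
}

# Placeholder for a learned linear map (assumption: identity for the demo).
WEIGHTS = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]


def project(x, weights):
    """Stage 1: linear map from input features to the semantic space."""
    return tuple(sum(w * xi for w, xi in zip(row, x)) for row in weights)


def nearest_class(code, knowledge_base):
    """Stage 2: 1-nearest-neighbor lookup among the class codes."""
    return min(knowledge_base, key=lambda c: math.dist(code, knowledge_base[c]))


def socc_predict(x, weights, knowledge_base):
    return nearest_class(project(x, weights), knowledge_base)


print(socc_predict((0.1, 0.9, 0.8), WEIGHTS, KNOWLEDGE_BASE))
```

Because the lookup only needs a class code, a class can be predicted even if it contributed no examples to the regression stage.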
Binary Attribute Model
negative samples and train a classifier (e.g., using SVM or neural networks)
Example: “Stripe” attribute, with Positive (Stripe) and Negative (Non-Stripe) samples
Attributes transcend class boundaries
Learning “stripe” attribute with images of zebras, clothing, …
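A minimal sketch of a binary “stripe” attribute model, assuming each image is already summarized by a small feature vector. The two features and all sample values are made up for illustration, and the trainer is a plain hinge-loss (linear SVM style) stochastic subgradient loop, not a production solver.

```python
def train_linear_svm(samples, labels, epochs=500, lr=0.1, reg=0.01):
    """Train a linear classifier with hinge loss and L2 regularization."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # L2 shrinkage step on the weights.
            w = [wi * (1.0 - lr * reg) for wi in w]
            if margin < 1.0:
                # Hinge-loss subgradient step for a margin violation.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def has_attribute(x, w, b):
    """Return True when the classifier predicts the attribute is present."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b > 0.0


# Hypothetical features: (periodic edge energy, mean brightness).
# Positives and negatives come from different object classes.
SAMPLES = [
    (0.90, 0.8),  # zebra
    (0.10, 0.7),  # horse
    (0.80, 0.6),  # striped shirt
    (0.20, 0.3),  # plain shirt
    (0.85, 0.7),  # tiger
    (0.15, 0.5),  # plain wall
]
LABELS = [+1, -1, +1, -1, +1, -1]

w, b = train_linear_svm(SAMPLES, LABELS)
print(has_attribute((0.95, 0.5), w, b), has_attribute((0.05, 0.5), w, b))
```

Mixing zebras, clothing, and other classes in the positive set is what lets the same classifier apply across class boundaries.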
[Parikh and Grauman, ICCV 2011]
[Figure: example images labeled Smiling / Not smiling / ??? and Natural / Not natural / ???]
Max-margin learning to rank formulation of Joachims 2002
[Parikh and Grauman, ICCV 2011]
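A minimal sketch of max-margin learning to rank for a relative attribute such as “smiling”. It handles only ordered pairs (image A shows the attribute more strongly than image B). The formulation in [Parikh and Grauman, ICCV 2011] also uses pairs of similar strength, which this sketch omits. The features, image names, and pair list are hypothetical.

```python
def train_ranker(features, ordered_pairs, epochs=200, lr=0.1, reg=0.01):
    """Learn w so that w.x_stronger exceeds w.x_weaker by a margin."""
    dim = len(next(iter(features.values())))
    w = [0.0] * dim
    for _ in range(epochs):
        for stronger, weaker in ordered_pairs:
            diff = [a - c for a, c in zip(features[stronger], features[weaker])]
            # L2 shrinkage step on the weights.
            w = [wi * (1.0 - lr * reg) for wi in w]
            if sum(wi * di for wi, di in zip(w, diff)) < 1.0:
                # Hinge-loss subgradient step on the pair difference.
                w = [wi + lr * di for wi, di in zip(w, diff)]
    return w


def attribute_strength(x, w):
    """Real-valued ranking score: higher means more of the attribute."""
    return sum(wi * xi for wi, xi in zip(w, x))


# Hypothetical 2-D features per image: (mouth curvature, brightness).
FEATURES = {
    "a": (0.9, 0.2),
    "b": (0.6, 0.8),
    "c": (0.3, 0.5),
    "d": (0.1, 0.9),
}
# Each pair means: the first image smiles more than the second.
ORDERED_PAIRS = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "c")]

w_rank = train_ranker(FEATURES, ORDERED_PAIRS)
print(attribute_strength(FEATURES["a"], w_rank) > attribute_strength(FEATURES["d"], w_rank))
```

The output is a score rather than a yes/no label, so ambiguous images simply land in the middle of the ranking.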
Manual Specification of Class-Attribute Associations
Associations may be extracted automatically from other sources
[Rohrbach et al, CVPR 2010]
[Rohrbach et al, CVPR 2010] [Felix Yu et al, CVPR 2013] [Mensink et al, CVPR 2014]
Attribute-based vs. direct similarity
“giant pandas are similar to grizzly and polar bears”
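A minimal sketch of direct similarity, assuming classifiers exist for the seen classes and that class-to-class similarities are available from some external source. The similarity values and classifier scores below are made up, and the rule shown (a similarity-weighted average over the k most similar seen classes) is one simple choice rather than the exact method of any cited paper.

```python
def unseen_class_score(seen_scores, similarities, k=2):
    """Score an unseen class from the scores of its k most similar seen classes.

    Assumes the selected similarities are positive, so the total is nonzero.
    """
    top = sorted(similarities, key=similarities.get, reverse=True)[:k]
    total = sum(similarities[c] for c in top)
    return sum(similarities[c] * seen_scores[c] for c in top) / total


# Hypothetical similarities of the unseen class "giant panda" to seen classes.
PANDA_SIMILARITY = {"grizzly bear": 0.8, "polar bear": 0.7, "zebra": 0.2}

# Hypothetical seen-class classifier outputs for one test image.
IMAGE_SCORES = {"grizzly bear": 0.6, "polar bear": 0.5, "zebra": 0.1}

panda_score = unseen_class_score(IMAGE_SCORES, PANDA_SIMILARITY, k=2)
print(round(panda_score, 4))
```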
[Akata et al, CVPR 2013] Check talk by Florent Perronnin on “Output embedding for large-scale visual recognition” (LSVR CVPR 2014 tutorial)
Frome et al. "DeViSE: A Deep Visual-Semantic Embedding Model", NIPS 2013
Label Embedding Framework
Automatic Discovery of word associations
[Figure labels: Label, Image] Real-valued word vector representation
Skip-gram model: semantically related words are mapped to similar vector representations
Deep Learning
Language Model Source Code: https://code.google.com/p/word2vec/
Zero-Shot Learning / Semantically close mistakes
Label Embedding Framework
Automatic Discovery of word associations [Frome et al, NIPS 2013]
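A minimal sketch of prediction in a label embedding space, assuming the image has already been projected into the same space as the word vectors. The 3-D vectors are toy values, not real word2vec output, and cosine similarity is used here as an assumption about the scoring function.

```python
import math


def cosine(u, v):
    """Cosine similarity between two nonzero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_labels(image_embedding, label_vectors):
    """Sort labels from most to least similar to the image embedding."""
    return sorted(
        label_vectors,
        key=lambda label: cosine(image_embedding, label_vectors[label]),
        reverse=True,
    )


# Toy word vectors (hypothetical). Related words get nearby vectors.
LABEL_VECTORS = {
    "tiger shark": (0.9, 0.1, 0.0),
    "hammerhead": (0.8, 0.2, 0.1),
    "car": (0.0, 0.1, 0.9),
}

# Hypothetical output of the visual model for one image.
ranking = rank_labels((0.85, 0.2, 0.05), LABEL_VECTORS)
print(ranking)
```

Labels never seen with images can still be ranked, since only their word vectors are needed. When the top label is wrong, the runner-up tends to be a semantically related word rather than an arbitrary one.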
Check the CVPR 2013 tutorial on Attributes: https://filebox.ece.vt.edu/~parikh/attributes/
[Feris et al, IBM - WACV 2009, CVPR 2011, ICMR 2014]
http://www.today.com/video/today/51630165/#51630165
Traditional Approaches: Face Recognition (“Naming”)
resolution imagery (typical conditions in surveillance scenarios).
Attribute-based People Search (“Describing”)
search framework based on fine-grained semantic attributes.
Query Example: “Show me all people with a beard and sunglasses, wearing a white hat and a patterned blue shirt, from all metro cameras in the downtown area, from 2pm to 4pm last Saturday.”
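A minimal sketch of how such a query could be evaluated once attribute detectors have tagged each person detection. The `Detection` record, camera names, attribute strings, and timestamps are all hypothetical. A deployed system would rank by detector confidence rather than filter on hard labels.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Detection:
    camera: str
    time: datetime
    attributes: frozenset  # attribute labels assigned by the detectors


def search(detections, required, cameras, start, end):
    """Return detections having all required attributes in the given scope."""
    return [
        d
        for d in detections
        if required <= d.attributes  # subset test: every attribute present
        and d.camera in cameras
        and start <= d.time <= end
    ]


DETECTIONS = [
    Detection("metro-1", datetime(2014, 6, 21, 14, 30),
              frozenset({"beard", "sunglasses", "white hat"})),
    Detection("metro-2", datetime(2014, 6, 21, 15, 10),
              frozenset({"beard", "eyeglasses"})),
    Detection("metro-1", datetime(2014, 6, 21, 18, 0),
              frozenset({"beard", "sunglasses", "white hat"})),
]

hits = search(
    DETECTIONS,
    required={"beard", "sunglasses"},
    cameras={"metro-1", "metro-2"},
    start=datetime(2014, 6, 21, 14, 0),
    end=datetime(2014, 6, 21, 16, 0),
)
print(len(hits))
```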
[Siddiquie et al, CVPR 2011]
sunglasses, eyeglasses, absence of glasses, beard, mustache, absence of facial hair, skin tone (dark, medium, light), gender, …
“Learning to rank”: confidence of individual attributes as features
Pairwise attribute modeling
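A minimal sketch of a ranking score that combines individual attribute confidences with pairwise terms linking a query attribute to other attributes. All weights are hand-set for illustration, whereas a learning-to-rank method would fit them from data. The attribute names and confidence values are hypothetical.

```python
def score(confidences, query, unary_w, pair_w):
    """Score one image for a multi-attribute query."""
    total = 0.0
    for q in query:
        # Unary term: confidence of the queried attribute itself.
        total += unary_w[q] * confidences[q]
        # Pairwise terms: other attributes that support or contradict q.
        for other, conf in confidences.items():
            if other != q:
                total += pair_w.get((q, other), 0.0) * conf
    return total


def rank_images(images, query, unary_w, pair_w):
    """Sort image names by descending query score."""
    return sorted(
        images,
        key=lambda name: score(images[name], query, unary_w, pair_w),
        reverse=True,
    )


# Hypothetical detector confidences per image.
IMAGES = {
    "img1": {"beard": 0.6, "male": 0.9, "female": 0.1},
    "img2": {"beard": 0.7, "male": 0.1, "female": 0.9},
}
UNARY_W = {"beard": 1.0}
PAIR_W = {("beard", "male"): 0.5, ("beard", "female"): -0.5}

order = rank_images(IMAGES, ["beard"], UNARY_W, PAIR_W)
print(order)
```

Here `img2` has the higher raw beard confidence, but the pairwise terms move `img1` to the top because its other attributes are consistent with the query.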
Improved performance over other ranking methods (RankSVM, RankBoost, DORM, TagProp) on three standard datasets (LFW, FaceTracer, PASCAL)
See [Siddiquie, Feris and Davis, CVPR 2011]
[Feris et al, ICMR 2014]
“Show me all images of people matching the suspect description from time X to time Y from all cameras in area Z.”
Ability to spot a person with, e.g., a white hat in a crowded scene
Suspect #1 found in 4 images in top 8 results
Suspect #2 found in 3 images in top page
1071 detected faces from 50 high-res Boston images (all from Flickr)
“Show me all blue trucks larger than 7ft length traveling at high speed northbound last Saturday, from 2pm to 5pm.” [Feris et al, IEEE Trans on Multimedia, 2012]
[Kovashka et al, CVPR 2012, ICCV 2013] [Yu & Grauman, CVPR 2014]
Slide credit: Kristen Grauman
Check Whittle Search demo at: http://godel.ece.vt.edu/whittle/
http://rogerioferis.com/VisualRecognitionAndSearch2014/Resources.html
Galaxy Morphological Attributes
Data available at: http://data.galaxyzoo.org/
304,122 Galaxy Images
58,719,719 Annotations
83,943 volunteers
11 tasks / 38 answers (fine morphological attributes)
http://www.snapshotserengeti.org/
5 Terabytes of annotated data
Data will be made publicly available soon!
http://rogerioferis.com/PartsAndAttributes/ (ECCV 2010)
http://pub.ist.ac.at/~chl/PnA2012/ (ECCV 2012)
https://filebox.ece.vt.edu/~parikh/PnA2014/ (ECCV 2014)
Check the Call for Extended Abstracts (Posters). Submission deadline: June 30th, 2014