Learning Better Object Models using Video Data
Patrick Li, Inmar Givoni, Brendan Frey
Motivation
- Training on a collection of static monocular images is unnatural.
- Labelled training images are hard to get, and the lack of them is becoming a problem.
- There is a wealth of video data available.
First Attempt: Learning Bags of Features Models for Image Classification
Goal:
- Represent objects as bags of SIFT features
- Use unsupervised learning to learn models of objects
- Use the learned models for image classification
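A bag of features keeps only how often each kind of local feature occurs and discards where it occurs. The sketch below illustrates the idea under assumptions that are not stated in the slides: descriptors are quantised to the nearest entry of a fixed codebook, and toy 2-D points stand in for 128-dimensional SIFT descriptors. The names `nearest_codeword` and `bag_of_features` are hypothetical.

```python
import math
from collections import Counter


def nearest_codeword(descriptor, codebook):
    """Return the index of the codeword closest to the descriptor (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda k: math.dist(descriptor, codebook[k]))


def bag_of_features(descriptors, codebook):
    """Count how often each codeword is the nearest match; positions are discarded."""
    counts = Counter(nearest_codeword(d, codebook) for d in descriptors)
    return [counts.get(k, 0) for k in range(len(codebook))]


# Toy 2-D stand-ins for SIFT descriptors (made-up values, for illustration only).
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, -0.1), (0.9, 1.2), (4.8, 5.1), (5.2, 4.9), (0.0, 0.2)]

histogram = bag_of_features(descriptors, codebook)
print(histogram)  # one count per codeword
```

The resulting histogram is the "document" that a topic model can then treat as a collection of visual words.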
Image Classification
[Figure: INPUT image, OUTPUT label, and TRAINING images labelled “Cow”, “Boat”, “Car”, “Sofa”, ...]
Overview of the Technique
- Unsupervised training from video
- Supervised training on labelled images
- Testing
[Figure: learned parts (Part 1, Part 2, ..., Part 60) used to describe images labelled “Cow”, “Sofa”, ...]
Bags of Features Models
[Figure: Part 1, Part 2, ..., Part 60]
Latent Dirichlet Allocation for Topic Modelling
Topics: SPORT, POLITICS, BANKING, ANIMALS
Words: BASEBALL, HIT, KICK, SOCCER, LEADER, DEMOCRACY, CAPITALISM, SHOUTERS, MONEY, TRANSACTIONS, CAT, DOG, FROG
Single document: 20% ANIMALS, 40% POLITICS, 39% BANKING, 1% SPORTS
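The single-document example can be read as a generative recipe: each word is produced by first picking a topic from the document's mixture and then picking a word from that topic. The sketch below follows that recipe with the mixture from the slide. The assignment of words to topics and the uniform word probabilities are assumptions made for illustration, and the mixture is held fixed here, whereas full LDA also draws it from a Dirichlet prior. `sample_document` is a hypothetical helper name.

```python
import random


def sample_document(topic_mix, topic_words, n_words, rng):
    """For each word: draw a topic from the document mixture, then a word from that topic."""
    topics = list(topic_mix)
    topic_weights = [topic_mix[t] for t in topics]
    document = []
    for _ in range(n_words):
        topic = rng.choices(topics, weights=topic_weights, k=1)[0]
        words = list(topic_words[topic])
        word_weights = [topic_words[topic][w] for w in words]
        document.append(rng.choices(words, weights=word_weights, k=1)[0])
    return document


# Document mixture taken from the slide.
topic_mix = {"ANIMALS": 0.20, "POLITICS": 0.40, "BANKING": 0.39, "SPORTS": 0.01}

# Word-to-topic grouping and probabilities are assumed, not given in the slides.
topic_words = {
    "ANIMALS": {"CAT": 1 / 3, "DOG": 1 / 3, "FROG": 1 / 3},
    "POLITICS": {"LEADER": 0.5, "DEMOCRACY": 0.5},
    "BANKING": {"MONEY": 0.5, "TRANSACTIONS": 0.5},
    "SPORTS": {"BASEBALL": 0.5, "SOCCER": 0.5},
}

document = sample_document(topic_mix, topic_words, n_words=20, rng=random.Random(0))
print(document)
```

Learning runs this recipe in reverse: given only the words of many documents, it estimates the topics and each document's mixture.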
Corpus of Documents
[Figure: topics 1, 2, 3, ..., 60 are unknown (?) and are learned from the corpus; one learned topic is shown with the words Money and Transactions.]
Latent Dirichlet Allocation for Object Modelling
Topics shown: COW, CAR, BOAT, SOFT DRINKS
Single image: 90% SOFT DRINKS, 10% CORPORATE LOGOS
Image Collection
[Figure: topics 1, 2, 3, ..., 60 are unknown (?) and are learned from the image collection.]
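The slides do not say how the unknown topics are inferred. One common choice for LDA is collapsed Gibbs sampling, sketched below as an assumption rather than as the authors' method. Each image is a list of visual-word ids, and the function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, and the toy data are all hypothetical.

```python
import random


def lda_gibbs(docs, n_topics, vocab_size, n_iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; returns smoothed per-document topic proportions."""
    rng = random.Random(seed)
    doc_topic = [[0] * n_topics for _ in docs]                # words in doc d assigned to topic k
    topic_word = [[0] * vocab_size for _ in range(n_topics)]  # times word w is assigned to topic k
    topic_total = [0] * n_topics                              # total words assigned to topic k

    # Start from a random topic assignment for every word.
    assignments = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample the topic given all other assignments.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights, k=1)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    return [
        [(count + alpha) / (len(doc) + n_topics * alpha) for count in counts]
        for counts, doc in zip(doc_topic, docs)
    ]


# Toy "images": lists of visual-word ids from a vocabulary of 6 words (made-up data).
docs = [[0, 1, 0, 1, 0, 2], [1, 0, 1, 0, 1], [3, 4, 5, 4, 3], [5, 4, 3, 5, 4, 3]]
theta = lda_gibbs(docs, n_topics=2, vocab_size=6)
for proportions in theta:
    print([round(p, 2) for p in proportions])
```

Each row of `theta` plays the role of the per-image mixture on the slide, for example 90% of one topic and 10% of another.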
Flow-LDA for Motion Modelling
Topics shown: COW, CAR, BOAT, VILLAIN
Single pair of consecutive frames: 50% SWORD, 50% VILLAIN
Frame Pair Collection
[Figure: topics 1, 2, 3, ..., 60 are unknown (?) and are learned from the collection of frame pairs.]
Flow-LDA for Motion Modelling
- Unsupervised training from video using FLDA
- Training and testing images
Image Recognition
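[Figure: Part 1, Part 2, ..., Part 60]
Each image is described by its mixture of parts:
- Image 1: 0.8 Part 1, 0.2 Part 2
- Image 2: 0.7 Part 1, 0.2 Part 3, 0.1 Part 4
- Image 3: 0.6 Part 2, 0.2 Part 13, 0.2 Part 24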
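To feed these part mixtures to a classifier, each image can be expanded into a fixed-length vector with one entry per part. The sketch below assumes 60 parts numbered from 1, as in the figures, and uses the three mixtures shown above; `to_feature_vector` is a hypothetical helper, and the classifier itself is left out.

```python
def to_feature_vector(part_weights, n_parts=60):
    """Expand {part number: weight} into a dense vector; parts are numbered from 1."""
    vector = [0.0] * n_parts
    for part, weight in part_weights.items():
        vector[part - 1] = weight
    return vector


# Part mixtures of the three example images from the slide.
images = [
    {1: 0.8, 2: 0.2},
    {1: 0.7, 3: 0.2, 4: 0.1},
    {2: 0.6, 13: 0.2, 24: 0.2},
]

features = [to_feature_vector(mixture) for mixture in images]
print(len(features), len(features[0]))  # number of images, vector length
```

Vectors of this form would be the input to a classifier such as the SVM named in the results.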
Initial Results
- Naive guesser: 8.6% error
- SVM trained on SIFT histograms directly: 8.6% error
- SVM trained using LDA model (no motion): 5.6% error
- SVM trained using FLDA model (motion): 3.7% error
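The error rates are easier to compare as relative reductions against the SVM trained directly on SIFT histograms. The short calculation below uses only the four numbers reported above; the choice of baseline and the helper name `relative_reduction` are assumptions made for illustration.

```python
# Error rates (percent) as reported on the slide.
errors = {
    "naive guesser": 8.6,
    "SVM on SIFT histograms": 8.6,
    "SVM on LDA model (no motion)": 5.6,
    "SVM on FLDA model (motion)": 3.7,
}


def relative_reduction(baseline, new):
    """Fraction of the baseline error that has been removed."""
    return (baseline - new) / baseline


baseline = errors["SVM on SIFT histograms"]
reductions = {name: relative_reduction(baseline, err) for name, err in errors.items()}

for name, reduction in reductions.items():
    print(f"{name}: {reduction:.0%} relative reduction")
```

On these numbers the LDA features remove roughly 35% of the baseline error and the FLDA features roughly 57%.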
To continue:
- Experiment on a real dataset
- Go beyond bags of features models:
  - Hierarchical models
  - Account for spatial relations
  - Account for temporal relations between more than 2 frames