Learning Better Object Models using Video Data

Sep 18, 2023

SLIDE 1

Learning Better Object Models using Video Data

Patrick Li, Inmar Givoni, Brendan Frey

SLIDE 2

Motivation

Training on a collection of static monocular images is unnatural.
Labelled training images are hard to get, and the lack of them is becoming a problem.
There is a wealth of video data available.

SLIDE 3

First Attempt: Learning Bags of Features Models for Image Classification

Goal:
Represent objects as bags of SIFT features
Use unsupervised learning to learn models of objects
Use the learned models for image classification
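A bag-of-features representation can be sketched as follows. This is only an illustration under stated assumptions: the slides do not describe how descriptors are quantised, so the nearest-codeword rule, the tiny codebook, and the 2-D vectors standing in for 128-D SIFT descriptors are all hypothetical.

```python
from collections import Counter
from math import dist


def nearest_codeword(descriptor, codebook):
    """Return the index of the codebook entry closest to the descriptor."""
    return min(range(len(codebook)), key=lambda i: dist(descriptor, codebook[i]))


def bag_of_features(descriptors, codebook):
    """Quantise each descriptor and count how often each visual word occurs."""
    counts = Counter(nearest_codeword(d, codebook) for d in descriptors)
    return [counts[i] for i in range(len(codebook))]


# Hypothetical 2-D stand-ins for SIFT descriptors and a 3-word codebook.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
descriptors = [(0.1, 0.0), (0.9, 0.1), (1.1, -0.1), (0.0, 0.8)]

histogram = bag_of_features(descriptors, codebook)
print(histogram)
```

The histogram discards where each feature was found in the image, which is what makes it a "bag".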

SLIDE 4

Image Classification

[Figure: INPUT image, OUTPUT label, and labelled TRAINING images: “Cow”, “Boat”, “Car”, “Sofa”, ...]

SLIDE 5

Overview of the Technique

Unsupervised Training from Video
Supervised Training on Labelled Images
Testing

[Figure: pipeline diagram showing parts PART 1, PART 2, PART 3, ..., PART 60 and example labels “Cow” and “Sofa”]

SLIDE 6

Bags of Features Models

PART 1 PART 2 PART 60 ...

SLIDE 7

Latent Dirichlet Allocation for Topic Modelling

[Figure: topics SPORT, POLITICS, BANKING and ANIMALS, with example words BASEBALL, HIT, KICK, SOCCER, LEADER, DEMOCRACY, CAPITALISM, SHOUTERS, MONEY, TRANSACTIONS, CAT, DOG, FROG]

Single Document: 20% ANIMALS, 40% POLITICS, 39% BANKING, 1% SPORTS
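The idea that a single document is a mixture of topics can be sketched in a few lines. The topic proportions below are the ones on the slide; the assignment of words to topics and the uniform word probabilities within each topic are assumptions made only for illustration.

```python
import random

# Topic proportions for the single document, as given on the slide.
topic_mixture = {"ANIMALS": 0.20, "POLITICS": 0.40, "BANKING": 0.39, "SPORTS": 0.01}

# Hypothetical topic-to-word assignment; each topic picks its words uniformly.
topic_words = {
    "ANIMALS": ["CAT", "DOG", "FROG"],
    "POLITICS": ["LEADER", "DEMOCRACY"],
    "BANKING": ["MONEY", "TRANSACTIONS"],
    "SPORTS": ["BASEBALL", "SOCCER"],
}


def generate_document(n_words, rng):
    """For each word: draw a topic from the mixture, then a word from that topic."""
    topics = list(topic_mixture)
    weights = [topic_mixture[t] for t in topics]
    words = []
    for _ in range(n_words):
        topic = rng.choices(topics, weights=weights, k=1)[0]
        words.append(rng.choice(topic_words[topic]))
    return words


def word_probability(word):
    """Marginal probability of a word: sum over topics of P(topic) * P(word | topic)."""
    return sum(
        weight / len(topic_words[topic])
        for topic, weight in topic_mixture.items()
        if word in topic_words[topic]
    )


document = generate_document(10, random.Random(0))
print(document)
print(word_probability("MONEY"))
```

Under these assumptions a word such as MONEY is likely in this document only because BANKING carries 39% of the mixture.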

SLIDE 8

Latent Dirichlet Allocation for Topic Modelling

Corpus of Documents

[Figure: topics 1, 2, 3, ..., 60, each still unknown (?)]
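Learning the unknown topics from a corpus without labels can be sketched with collapsed Gibbs sampling. The slides do not say which inference method the authors used, so this sampler is a swapped-in, commonly used choice; the toy corpus, the two topics (instead of 60), and the values of `alpha` and `beta` are all assumptions.

```python
import random


def lda_gibbs(docs, n_topics, n_iters=50, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on a list of tokenised documents."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_vocab = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]            # tokens in doc d with topic k
    topic_word = [[0] * n_vocab for _ in range(n_topics)]  # times word v has topic k
    topic_total = [0] * n_topics                           # tokens with topic k
    assignments = []

    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Resample its topic given all other assignments.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_vocab * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights, k=1)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    return assignments, doc_topic, topic_word, vocab


# Toy corpus built from words that appear on the slides.
docs = [
    ["CAT", "DOG", "FROG", "CAT"],
    ["MONEY", "TRANSACTIONS", "MONEY"],
    ["DOG", "FROG", "CAT", "DOG"],
    ["TRANSACTIONS", "MONEY", "TRANSACTIONS"],
]
assignments, doc_topic, topic_word, vocab = lda_gibbs(docs, n_topics=2)
for k, row in enumerate(topic_word):
    print(k, dict(zip(vocab, row)))
```

Each row of `doc_topic`, once normalised, plays the role of the per-document topic percentages shown for the single-document example.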

SLIDE 9

Latent Dirichlet Allocation for Topic Modelling

Corpus of Documents

[Figure: topics 1, 2, 3, ..., 60; one topic is shown with the words Money and Transactions, the others are still unknown (?)]

SLIDE 10

Latent Dirichlet Allocation for Object Modelling

[Figure: example topics COW, CAR, BOAT, SOFT DRINKS]

Single Image: 90% SOFT DRINKS, 10% CORPORATE LOGOS

SLIDE 11

Latent Dirichlet Allocation for Object Modelling

Image Collection

[Figure: topics 1, 2, 3, ..., 60, each still unknown (?)]

SLIDE 12

Flow-LDA for Motion Modelling

[Figure: example topics COW, CAR, BOAT, VILLAIN]

Pair of Consecutive Frames: 50% SWORD, 50% VILLAIN

SLIDE 13

Flow-LDA for Motion Modelling

Frame Pair Collection

[Figure: topics 1, 2, 3, ..., 60, each still unknown (?)]

SLIDE 14

Flow-LDA for Motion Modelling

SLIDE 15

Image Recognition

[Figure: unsupervised training from video using FLDA yields parts PART 1, PART 2, ..., PART 60; training and testing images are described as mixtures of these parts]

Example part mixtures:
0.8 Part 1, 0.2 Part 2
0.7 Part 1, 0.2 Part 3, 0.1 Part 4
0.6 Part 2, 0.2 Part 13, 0.2 Part 24
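The part weights on this slide appear to fall into three groups that each sum to 1, which suggests one mixture per image. Assuming that grouping, a minimal sketch of turning such mixtures into fixed-length feature vectors for a classifier looks like this; the helper name `mixture_to_vector` is hypothetical.

```python
N_PARTS = 60  # number of parts shown on the slides


def mixture_to_vector(mixture, n_parts=N_PARTS):
    """Turn {part number: weight} into a dense list indexed by part number - 1."""
    vector = [0.0] * n_parts
    for part, weight in mixture.items():
        vector[part - 1] = weight
    return vector


# Assumed grouping of the weights on the slide into three per-image mixtures.
mixtures = [
    {1: 0.8, 2: 0.2},
    {1: 0.7, 3: 0.2, 4: 0.1},
    {2: 0.6, 13: 0.2, 24: 0.2},
]

features = [mixture_to_vector(m) for m in mixtures]
print([round(sum(f), 6) for f in features])
```

Vectors of this kind, one per labelled image, are the sort of input a classifier such as the SVM named on the results slide could be trained on.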

SLIDE 16

Initial Results

Naive Guesser: 8.6% Error
SVM trained on SIFT histograms directly: 8.6% Error
SVM trained using LDA model (no motion): 5.6% Error
SVM trained using FLDA model (motion): 3.7% Error
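To put these error rates in proportion, the relative reduction in error can be computed directly from the four reported numbers. The dictionary keys and the helper `relative_reduction` are hypothetical names; only the percentages come from the slide.

```python
# Error rates (in percent) as reported on the slide.
errors = {
    "naive": 8.6,
    "svm_sift": 8.6,
    "svm_lda": 5.6,
    "svm_flda": 3.7,
}


def relative_reduction(baseline, improved):
    """Fraction of the baseline error that the improved method removes."""
    return (baseline - improved) / baseline


lda_vs_sift = relative_reduction(errors["svm_sift"], errors["svm_lda"])
flda_vs_sift = relative_reduction(errors["svm_sift"], errors["svm_flda"])
flda_vs_lda = relative_reduction(errors["svm_lda"], errors["svm_flda"])

print(f"LDA vs SIFT histograms:  {lda_vs_sift:.1%}")
print(f"FLDA vs SIFT histograms: {flda_vs_sift:.1%}")
print(f"FLDA vs LDA:             {flda_vs_lda:.1%}")
```

On these figures the SVM on raw SIFT histograms does no better than the naive guesser, while adding motion (FLDA) removes roughly a third of the error that remains with LDA alone.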

SLIDE 17

... to continue

Experiment on Real Dataset
Go beyond Bags of Features models

Hierarchical Models
Account for Spatial Relations
Account for temporal relations between more than 2 frames

SLIDE 18

Thank you!