Large-Scale Live Active Learning: Training Object Detectors with Crawled Data and Crowds
Sudheendra Vijayanarasimhan, Kristen Grauman
Department of Computer Science, University of Texas at Austin
(PowerPoint presentation transcript)
Outline: Introduction · Problem · Our Approach · Results · Conclusions
Object Detection
Challenge: the best results require large amounts of cleanly labeled training examples.
Ways to Reduce Effort

Active learning: minimize effort by focusing label requests on the most informative examples.
[Kapoor et al. ICCV 2007; Qi et al. CVPR 2008; Vijayanarasimhan et al. CVPR 2009; Joshi et al. CVPR 2009; Siddiquie et al. CVPR 2010]
[Diagram: annotator(s) receive an "annotation request" for actively selected examples from the unlabeled pool; the resulting labeled data updates the current category model, which drives the active selection function.]

Crowd-sourced annotations: package annotation tasks to obtain labels from online human workers.
[von Ahn et al. CHI 2004; Russell et al. IJCV 2007; Sorokin et al. 2008; Welinder et al. ACVHL 2010; Deng et al. CVPR 2009]
Problem

Thus far, these techniques have been tested only in artificially controlled settings:
- "sandbox" datasets: the dataset's source and scope are fixed
- the computational cost of active selection and of retraining the model is generally ignored (linear or quadratic time)
- crowd-sourced collection requires iterative fine-tuning
Goal

Take active learning and crowd-sourced annotation collection out of the "sandbox":
- break free from dataset-based learning
- collect information on the fly (no manual intervention)
- large-scale data

Our Approach: Live Learning

A live active learning system that autonomously builds models for object detection (e.g., a "bicycle" category model).
Our Approach: Live Learning

[Pipeline for a query such as "bicycle": crawl unlabeled images → generate object windows → select images to annotate → online annotation collection → actively selected examples become labeled data → object representation and classifier → category model.]
Main Contributions

- Linear classification: a part-based linear detector built on non-linear feature coding
- Large-scale active selection: a sub-linear-time hashing scheme for efficiently selecting uncertain examples [Jain, Vijayanarasimhan & Grauman, NIPS 2010]
- Live learning results: active detection at unprecedented scale and autonomy, for the first time
Outline

[Same pipeline diagram as above, with the object representation and classifier stage highlighted.]

Linear classification:
- fast, incremental training using a linear SVM
- efficient active selection using our hyperplane hash functions
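The "fast, incremental training" bullet can be illustrated with a Pegasos-style subgradient step for a linear SVM. This is a generic sketch of incremental linear-SVM training, not the solver the authors used; the function name and constants are illustrative:

```python
import numpy as np

def pegasos_step(w, x, y, lam, t):
    """One stochastic subgradient step on the regularized hinge loss.

    w: current weight vector; (x, y) a training example with y in {-1, +1};
    lam: regularization strength; t: 1-based step counter.
    """
    eta = 1.0 / (lam * t)                     # standard decaying step size
    if y * (w @ x) < 1.0:                     # hinge active: shrink and move toward example
        w = (1.0 - eta * lam) * w + eta * y * x
    else:                                     # hinge inactive: only the regularizer acts
        w = (1.0 - eta * lam) * w
    return w
```

Because each step touches a single example, newly labeled windows can be folded into the model without retraining from scratch, which is what makes per-iteration retraining cheap in a live loop.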
Object Representation and Classifier

Part-based object representation: a window is described by sparse-max-pooled features for its root, parts, and context regions,
[φ(r), φ(p_1) … φ(p_P), φ(c_1) … φ(c_C)].

Sparse Max Pooling [Yang et al. '10], similar to bag of words:
- sparse coding: a fuller representation of the original features
- max pooling: better discriminability in clutter [Boureau '10]

Windows are scored as a linear sum,
f(O) = wᵀφ(O) = w_r φ(r) + Σ_{i=1}^{P} w_{p_i} φ(p_i) + Σ_{i=1}^{C} w_{c_i} φ(c_i),
where w are the linear SVM weights.
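The pooling and scoring above can be sketched in a few lines of NumPy. This is a minimal illustration of the two operations, with function names and array shapes chosen for clarity rather than taken from the authors' code:

```python
import numpy as np

def sparse_max_pool(codes):
    """Max-pool the sparse codes of the local features falling in one region.

    codes: (n_features, vocab_size) array of sparse-coded descriptors.
    Returns a single vocab_size-dim region descriptor phi.
    """
    return codes.max(axis=0)

def window_score(w_root, w_parts, w_ctx, phi_root, phi_parts, phi_ctx):
    """Linear detection score f(O) = w_r.phi(r) + sum_i w_pi.phi(p_i) + sum_i w_ci.phi(c_i)."""
    score = w_root @ phi_root
    score += sum(wp @ fp for wp, fp in zip(w_parts, phi_parts))
    score += sum(wc @ fc for wc, fc in zip(w_ctx, phi_ctx))
    return float(score)
```

Because the score is linear in the concatenated descriptor, the whole detector reduces to one dot product per window, which is what enables both fast training and the hyperplane hashing used later.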
Relationship to existing detection models:

- Spatial pyramid (SP) [Lazebnik et al. '06, Vedaldi et al. '09]: hard vector quantization + average pooling; local features with locations discarded per window
- Latent SVM (LSVM) [Felzenszwalb et al. '09]: root + parts + deformations; dense gradients at fixed locations within the window
- Ours: sparse coding + max pooling; root + parts; local features with locations discarded per window

Advantages: faster training (linear SVM), with results comparable to non-linear detectors.
Selecting Images to Annotate

[Pipeline as above: the part-based linear detector plus jumping-window prediction produce unlabeled windows from the crawled images; from these, images are selected for annotation.]

Efficient active selection: select the most useful windows from among millions.
Active Selection of Object Windows

SVM margin criterion for active selection: select the point nearest to the hyperplane decision boundary for labeling,
x* = argmin_{x_i ∈ U} |wᵀ x_i|
[Tong & Koller 2000; Schohn & Cohn 2000; Campbell et al. 2000]

Problem: with a massive unlabeled pool, we cannot afford an exhaustive linear scan to make the selection.
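The margin criterion itself is one line of NumPy. The sketch below shows the exhaustive version, precisely the O(n) scan that the hashing scheme on the next slide avoids (names are illustrative):

```python
import numpy as np

def select_most_uncertain(w, X, k=1):
    """Margin criterion: return indices of the k unlabeled examples
    nearest the hyperplane, i.e. minimizing |w^T x_i| over the pool X.

    w: (d,) hyperplane normal; X: (n, d) unlabeled pool.
    """
    margins = np.abs(X @ w)          # distance to the boundary, up to ||w||
    return np.argsort(margins)[:k]   # exhaustive scan: the scalability bottleneck
```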
Active Selection of Object Windows

Sub-linear-time selection through hyperplane hashing [Jain, Vijayanarasimhan and Grauman, NIPS 2010]:
- hash function h(·) with a high probability of collision when φ(O) is close to the hyperplane w
- preprocessing: hash the unlabeled windows into a table via h(φ(O_i))
- active learning loop: hash the classifier via h(w) and retrieve colliding examples in sub-linear time
- evaluate ∼10³ windows instead of ∼10⁶ for an exhaustive scan
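A toy sketch of a two-bit hyperplane hash in the spirit of the NIPS 2010 scheme (the construction and all names here are illustrative, not the paper's implementation): a database point is coded with bit pairs [sign(u·x), sign(v·x)], while the hyperplane query flips the second bank of bits, so a pair of bits matches most often when w·x is near zero, i.e. when x lies close to the hyperplane:

```python
import numpy as np

def make_hash(d, n_bits, rng):
    """Draw random directions for n_bits two-bit hyperplane hash functions."""
    return rng.standard_normal((n_bits, d)), rng.standard_normal((n_bits, d))

def hash_point(x, U, V):
    """Code for a database point x: bits [sign(u.x), sign(v.x)] per function."""
    return np.concatenate([(U @ x) >= 0, (V @ x) >= 0])

def hash_query(w, U, V):
    """Code for a hyperplane with normal w: the second bank is sign-flipped,
    so points with w.x near 0 collide with elevated probability."""
    return np.concatenate([(U @ w) >= 0, (V @ w) < 0])
```

In use, a window collides with the query under function j when both bit j and bit n_bits+j match; the table buckets indexed by the query code are scanned first, so only a small fraction of the pool is ever scored.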
Online Annotation Collection

[Pipeline as above, now including the hash table of h(φ(O_i)) codes queried with h(w).]

- annotations obtained on the fly
- reliable annotations without pruning
Mechanical Turk interface:
- post the same image to multiple (5-10) annotators
- cluster all of the returned bounding boxes to obtain a consensus
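The summary slide mentions mean-shift consensus; below is a simpler illustrative stand-in (greedy IoU clustering plus a coordinate-wise median) that shows the idea of fusing several workers' boxes while discarding outliers. It is not the paper's exact procedure, and all names are hypothetical:

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def consensus_box(boxes, thresh=0.5):
    """Greedy consensus: take the box most agreed with (IoU >= thresh),
    then return the coordinate-wise median of its cluster of supporters."""
    best = max(boxes, key=lambda b: sum(iou(b, o) >= thresh for o in boxes))
    cluster = [o for o in boxes if iou(best, o) >= thresh]
    return np.median(np.array(cluster), axis=0)
```

Requiring agreement among several workers is what lets the system accept annotations "without pruning": a single sloppy or adversarial box simply fails to join the consensus cluster.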
Summary: Live Learning

[Full pipeline for "bicycle": unlabeled images → jumping-window prediction → unlabeled windows hashed via h(φ(O_i)) into the hash table → h(w) retrieves actively selected examples → Mechanical Turk annotations fused by consensus (mean shift) become labeled data → part-based linear detector → updated category model.]
Results

PASCAL VOC 2007 challenge:
- 20 object classes under changes in viewpoint, scale, and background clutter
- ∼5000 training and test examples
- given an image, detect all objects

Live learning on Flickr:
- 6 of the most challenging PASCAL objects
- a new Flickr test set

Features:
- 30,000 SIFT features densely extracted
- 60,000 visual words built with hierarchical k-means
- sparse coding using LLC [Yang et al. '10]

Implementation:
- 12 parts from the LSVM detector
- 100 images per active iteration
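The LLC sparse-coding step can be sketched as below (the common approximate form of LLC that encodes each descriptor over its k nearest codewords; the function name and the regularizer eps are illustrative, not taken from the authors' implementation):

```python
import numpy as np

def llc_code(x, codebook, k=5, eps=1e-4):
    """Approximate LLC: encode descriptor x over its k nearest codewords.

    x: (d,) descriptor; codebook: (M, d) visual words.
    Returns an M-dim code with at most k nonzero entries summing to one.
    """
    d2 = ((codebook - x) ** 2).sum(axis=1)
    idx = np.argsort(d2)[:k]
    B = codebook[idx]                        # (k, d) nearest bases
    z = B - x                                # shift bases to the descriptor
    C = z @ z.T                              # local covariance
    C += eps * np.trace(C) * np.eye(k)       # regularize for numerical stability
    c = np.linalg.solve(C, np.ones(k))
    c /= c.sum()                             # enforce the sum-to-one constraint
    code = np.zeros(len(codebook))
    code[idx] = c
    return code
```

Restricting the code to a few nearby codewords is what keeps the 60,000-dimensional descriptors sparse enough for max pooling and linear classification to stay cheap.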
Sandbox Results (PASCAL 2007)

Comparison to the state of the art (average precision):

            aero  cat   dog   sheep sofa  train bicyc bird  boat  bottl bus   ...  Mean
BoF SP      30.4  17.7  18.0  19.1  14.7  35.7  43.1   6.9   3.5  10.8  35.8       23.0
Ours        48.4  30.7  21.8  28.8  33.0  47.7  48.3  14.1  13.6  15.3  43.9       30.5
LSVM+HOG¹   32.8  21.3   8.8  16.2  24.4  39.2  56.8   2.5  16.8  28.5  39.7       29.1
SP+MKL²     37.6  30.0  21.5  23.9  28.5  45.3  47.8  15.3  15.3  21.9  50.7       32.1

¹[Felzenszwalb et al. '09]  ²[Vedaldi et al. '09]

- part-based, single-feature representation, linear model
- competitive with the state of the art (better for 6 classes)
Live Learning Results

Live learning tested on the PASCAL test set (average precision):

               bird  boat  dog   potted plant  sheep  chair
Ours           15.8  18.9  25.3  11.6          28.4    9.1
Previous best  15.3  16.8  21.5  14.6          23.9   17.9

Improvements over the state of the art on several challenging categories.

Computation Time

                                 Active selection  Training  Detection per image
Ours + active                    10 mins           5 mins    150 secs
LSVM [Felzenszwalb et al. 2009]  3 hours           4 hours   2 secs
SP+MKL [Vedaldi et al. 2009]     93 hours          > 2 days  67 secs

Our approach's efficiency makes live learning feasible.
Live Learning Results

Live learning tested on the Flickr test set, a general test set of web images.

Baselines:
- keyword+image: randomly select a keyword-filtered image and obtain a bounding box
- keyword+window: randomly select a window within a keyword-filtered image and obtain a binary label

[Plot: average precision vs. annotations added (out of 3 million candidate examples) for boat, dog, bird, potted plant, sheep, and chair; curves for Live active (ours), Keyword+image, and Keyword+window.]

- dramatic improvements for most categories
- outperforms the status quo approach of learning from keyword-filtered data
Conclusions

- autonomous online learning: breaks free from sandbox learning
- no manual intervention in example/annotation selection or pruning
- obtains results better than the state of the art on challenging categories