SLIDE 1

An Egocentric Perspective on Active Vision and Visual Object Learning in Toddlers

  • S. Bambach, D. Crandall, L. Smith, C. Yu.

ICDL 2017
Experiment presenters: Arjun, Ginevra

SLIDE 2

Their Experiments

Image source: paper

SLIDE 3

Their Experiments

Authors could not control training set

Image source: paper

SLIDE 4

Our Experiments

  • We generate images where

– Labeled object occupies a fixed percentage of the view
– Background objects do not move

Image source: collages we made from Caltech 256 database

SLIDE 5

Our Experiments

  • Simulate toddler bringing object to face

– We control scale to measure its effect on testing accuracy

Image source: collages we made from Caltech 256 database

SLIDE 6

Our Dataset

  • 5 classes, 3633 images
  • Collages

– Construct ‘scenes of toys’ using Caltech-256
– 1 positive image amongst many negatives
– Simulate toddler perspective

Image source: Caltech 256 database

SLIDE 7

Scene Generation

  • Scene dim: 224 x 224

– Scale largest image dim to 70
– Rotate randomly from -15° to 15°

  • 10 negatives

– Select uniformly from Caltech-256 negatives
– Placed randomly within the scene boundary

  • 1 positive

– Scale 0 (1x), 1 (1.5x), 2 (2x), 3 (3x)
– Place randomly within scene boundary (at scale 1)

  • 2 scenes per training instance
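
A minimal Python/PIL sketch of this scene-generation recipe. The white background, the handling of rotation fill, the clamped placement of large positives, and the helper names `prepare_patch`/`make_scene` are our assumptions, not the code actually used.

```python
import random
from PIL import Image

SCENE_SIZE = 224                              # scene dim: 224 x 224
BASE_DIM = 70                                 # largest image dimension after scaling
POSITIVE_SCALE = {0: 1.0, 1: 1.5, 2: 2.0, 3: 3.0}

def prepare_patch(img, factor=1.0):
    """Scale so the largest dimension is BASE_DIM * factor, rotate -15..15 degrees."""
    ratio = BASE_DIM * factor / max(img.size)
    img = img.resize((max(1, int(img.width * ratio)), max(1, int(img.height * ratio))))
    return img.rotate(random.uniform(-15, 15), expand=True)

def make_scene(positive, negatives, scale=1):
    """Paste 10 random negatives and 1 positive into a 224 x 224 collage."""
    scene = Image.new("RGB", (SCENE_SIZE, SCENE_SIZE), "white")   # background assumed white
    patches = [prepare_patch(n) for n in random.sample(negatives, 10)]
    patches.append(prepare_patch(positive, POSITIVE_SCALE[scale]))  # positive pasted last, on top
    for patch in patches:
        x = random.randint(0, max(0, SCENE_SIZE - patch.width))
        y = random.randint(0, max(0, SCENE_SIZE - patch.height))
        scene.paste(patch, (x, y))
    return scene
```

Per the last bullet above, two such scenes would be generated per training instance.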
SLIDE 8

VGG 16

Image source, and source of some code used in the experiments: https://www.cs.toronto.edu/~frossard/post/vgg16/

SLIDE 9

VGG 16 for 5 classes

Image source: https://www.cs.toronto.edu/~frossard/post/vgg16/, modified by us
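
The experiments adapt VGG-16 to 5 output classes. A hedged sketch of one way to do this with torchvision (the deck used a TensorFlow port from the Frossard post, so the library choice and the freezing of layers here are assumptions):

```python
import torch.nn as nn
import torchvision.models as models

# Load VGG-16 with ImageNet weights (illustrative; not the original code).
model = models.vgg16(pretrained=True)

# Replace the final 1000-way ImageNet classifier with a 5-way layer
# for the five toy categories.
model.classifier[6] = nn.Linear(in_features=4096, out_features=5)

# Optionally freeze the convolutional features and fine-tune only the
# classifier head (whether the original experiments did this is unknown).
for param in model.features.parameters():
    param.requires_grad = False
```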

SLIDE 10

Experiment Setup

  • Experiment 1

– Train on different scales, test on clean image

  • Experiment 2

– Train on different scales and clean, test on different scales

Scale 0: 10% of view | Scale 1: 20% of view | Scale 2: 30% of view | Scale 3: 60% of view | Clean image

Image source: collages we made from Caltech 256 database
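
A small configuration sketch of the two protocols above; the view fractions come from the legend, while treating the clean image as the object filling the view is our assumption.

```python
# View fraction occupied by the labeled object in each condition
# (from the legend above; the value for "clean" is assumed).
VIEW_FRACTION = {"scale0": 0.10, "scale1": 0.20, "scale2": 0.30,
                 "scale3": 0.60, "clean": 1.00}

# Experiment 1: train on collage scenes at the different scales,
# test on clean images only.
EXPERIMENT_1 = {"train": ["scale0", "scale1", "scale2", "scale3"],
                "test": ["clean"]}

# Experiment 2: train on the different scales plus clean images,
# test on collage scenes at each scale.
EXPERIMENT_2 = {"train": ["scale0", "scale1", "scale2", "scale3", "clean"],
                "test": ["scale0", "scale1", "scale2", "scale3"]}
```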

SLIDE 11

Experiment Setup

  • Experiment 1

– Train on different scales, test on clean image

  • Experiment 2

– Train on different scales and clean, test on different scales

Scale 0: 10% of view | Scale 1: 20% of view | Scale 2: 30% of view | Scale 3: 60% of view | Clean image

Image source: collages we made from Caltech 256 database

SLIDE 12

Experiment 1 - objective

  • Test effect of ‘bringing object to face’ for isolated classification

  • Questions to consider

– Effect of viewing at multiple scales?
– Single ideal scale or result of multiple scales?

Image source: https://en.wiktionary.org/wiki/question_mark

SLIDE 13

Experiment 1 - data

Train0

Image source: collages we made from Caltech 256 database

SLIDE 14

Experiment 1 - data

Train1

Image source: collages we made from Caltech 256 database

SLIDE 15

Experiment 1 - data

Train2

Image source: collages we made from Caltech 256 database

SLIDE 16

Experiment 1 - data

Train3

Image source: collages we made from Caltech 256 database

SLIDE 17

Experiment 1 - data

Train3only

Image source: collages we made from Caltech 256 database

SLIDE 18

Experiment 1 - data

We correct the number of training epochs to compensate for train sets with more examples

Image source: collages we made from Caltech 256 database
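
The correction itself is not spelled out on the slide; below is a minimal sketch of one natural choice, scaling epochs inversely with train-set size so each condition sees roughly the same number of examples. The reference numbers are illustrative assumptions.

```python
# Hypothetical epoch correction: keep epochs * dataset_size roughly
# constant so larger train sets do not receive more gradient updates.
REFERENCE_EPOCHS = 30     # assumed epoch budget for the smallest train set
REFERENCE_SIZE = 1000     # assumed size of that train set

def corrected_epochs(train_set_size):
    """Reduce the epoch count for conditions with more training scenes."""
    return max(1, round(REFERENCE_EPOCHS * REFERENCE_SIZE / train_set_size))

print(corrected_epochs(1000))  # -> 30
print(corrected_epochs(2000))  # -> 15 (twice the data, half the epochs)
```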

SLIDE 19

Experiment 1 - data

Test

Image source: collages we made from Caltech 256 database

SLIDE 20

Experiment 1 - results

[Chart: testing accuracy on clean images by train set (Train0, Train1, Train2, Train3, Train3only); y-axis 0.1–1.0]

SLIDE 21

Experiment 1 - results

[Chart: testing accuracy on clean images by train set (Train0, Train1, Train2, Train3, Train3only); y-axis 0.1–1.0]

SLIDE 22

Experiment 1 - results

[Chart: testing accuracy on clean images by train set (Train0, Train1, Train2, Train3, Train3only); y-axis 0.1–1.0]

Training on larger-scale images only yields the best test accuracy.

SLIDE 23

Experiment 1 - results

  • Images misclassified when the network is trained on low scales benefit from training on higher scales

Image source: Caltech 256 database

Misclassified after train0, train1, train2
Correctly classified after train3 and train3only
(Category: bag)

SLIDE 24

Experiment 1 - results

  • Images misclassified when the network is trained on low scales benefit from training on higher scales

Image source: Caltech 256 database

Misclassified after train0, train1, train2, train3
Correctly classified only after train3only
(Category: plane)

SLIDE 25

Experiment 1 - results

  • Images misclassified after train3only were misclassified after all other trainings

Image source: Caltech 256 database

[Example images, categories: bag, plane, plane]

SLIDE 26

Experiment 1 - conclusions

  • Toddler’s data gives better training because the object is closer, not because it is ‘brought to face’
  • Significant jump in accuracy if the object occupies >30% of the view in training
  • Training images where the object occupies <30% of the view do more harm than good

Image source: collages we made from Caltech 256 database

SLIDE 27

Experiment Setup

  • Experiment 1

– Train on different scales, test on clean image

  • Experiment 2

– Train on different scales and clean, test on different scales

Scale 0: 10% of view | Scale 1: 20% of view | Scale 2: 30% of view | Scale 3: 60% of view | Clean image

Image source: collages we made from Caltech 256 database

SLIDE 28

Experiment 2 - objective

  • Effect of ‘bringing to face’ for object-in-scene detection

  • Questions to consider

– Does ‘cleaning’ the scene decrease detection in a cluttered environment?

Image source: https://en.wiktionary.org/wiki/question_mark

SLIDE 29

Experiment 2 - data

Train0

Image source: collages we made from Caltech 256 database

SLIDE 30

Experiment 2 - data

Train1

Image source: collages we made from Caltech 256 database

SLIDE 31

Experiment 2 - data

Train2

Image source: collages we made from Caltech 256 database

SLIDE 32

Experiment 2 - data

Train3

Image source: collages we made from Caltech 256 database

SLIDE 33

Experiment 2 - data

TrainClean

Image source: collages we made from Caltech 256 database

SLIDE 34

Experiment 2 - data

We correct the number of training epochs to compensate for train sets with more examples

Image source: collages we made from Caltech 256 database

SLIDE 35

Experiment 2 - data

Test0

Image source: collages we made from Caltech 256 database

Built from different images than those used in the train sets

SLIDE 36

Experiment 2 - data

Test1only

Image source: collages we made from Caltech 256 database

Built from different images than those used in the train sets

SLIDE 37

Experiment 2 - data

Test2only

Image source: collages we made from Caltech 256 database

Built from different images than those used in the train sets

SLIDE 38

Experiment 2 - data

Test3only

Image source: collages we made from Caltech 256 database

Built from different images than those used in the train sets

SLIDE 39

Experiment 2 - results

[Chart: testing accuracy on Test0, Test1only, Test2only, Test3only by train set (Train0, Train1, Train2, Train3, TrainClean); y-axis 0.1–1.0]

SLIDE 40

Experiment 2 - results

[Chart: testing accuracy on Test0, Test1only, Test2only, Test3only by train set (Train0, Train1, Train2, Train3, TrainClean); y-axis 0.1–1.0]

SLIDE 41

Experiment 2 - results

[Chart: testing accuracy on Test0, Test1only, Test2only, Test3only by train set (Train0, Train1, Train2, Train3, TrainClean); y-axis 0.1–1.0]

SLIDE 42

Experiment 2 - results

[Chart: testing accuracy on Test0, Test1only, Test2only, Test3only by train set (Train0, Train1, Train2, Train3, TrainClean); y-axis 0.1–1.0]

Training by ‘bringing the object to the face’ yields the best accuracy.

SLIDE 43

Experiment 2 - conclusions

  • Can learn more from different scales than from clean images, as long as scale 3 is included
  • Learning from different scales gives better accuracy when tested on lower scales
  • Testing on clean images is much better than testing on scales

Image source: collages we made from Caltech 256 database

SLIDE 44

Conclusions

  • With our controlled datasets, we could verify that the network learns better from larger scales
  • Testing needs to be done on clean images, no matter which scales were used in training
  • Training on scales >30% gives more robustness when testing on all scales
  • Training on scales <30% hurts accuracy