
An Egocentric Perspective on Active Vision and Visual Object Learning

An Egocentric Perspective on Active Vision and Visual Object Learning in Toddlers. S. Bambach, D. Crandall, L. Smith, C. Yu. ICDL 2017. Experiment presenters: Arjun, Ginevra.


  1. An Egocentric Perspective on Active Vision and Visual Object Learning in Toddlers
     S. Bambach, D. Crandall, L. Smith, C. Yu. ICDL 2017
     Experiment presenters: Arjun, Ginevra

  2. Their Experiments Image source: paper

  3. Their Experiments Authors could not control training set Image source: paper

  4. Our Experiments
     • We generate images where
       – Labeled object occupies fixed percentage of view
       – Background objects do not move
     Image source: collages we made from Caltech 256 database

  5. Our Experiments
     • Simulate toddler bringing object to face
       – We control scale to measure its effect on testing accuracy
     Image source: collages we made from Caltech 256 database

  6. Our Dataset
     • 5 classes, 3633 images
     • Collages
       – Construct ‘scenes of toys’ using Caltech-256
       – 1 positive image amongst many negatives
       – Simulate toddler perspective
     Image source: Caltech 256 database

  7. Scene Generation
     • Scene dim: 224 x 224
       – Scale largest image dim to 70
       – Rotate randomly from -15° to 15°
     • 10 negatives
       – Select uniformly from Caltech-256 negatives
       – Placed randomly within scene boundary
     • 1 positive
       – Scale 0 (1x), 1 (1.5x), 2 (2x), 3 (3x)
       – Place randomly within scene boundary (at scale 1)
     • 2 scenes per training instance
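The scene-generation recipe on slide 7 can be sketched in plain Python. This is only an illustration of the layout step, not the presenters' code: it computes bounding boxes and rotation angles rather than compositing pixels, and every name in it (`Placement`, `make_scene`, and so on) is hypothetical. It also makes assumptions that the slide does not settle: negatives are drawn with replacement, the positive is drawn last so it sits on top, and "(at scale 1)" is read as "the position is chosen for the base-size object, which is then enlarged about its center".

```python
import random
from dataclasses import dataclass

SCENE_DIM = 224       # scene is 224 x 224 (slide 7)
BASE_MAX_DIM = 70     # largest image dimension is scaled to 70 (slide 7)
MAX_ROTATION = 15.0   # rotation drawn from [-15, 15] degrees (slide 7)
NUM_NEGATIVES = 10    # negatives per scene (slide 7)
POSITIVE_SCALES = {0: 1.0, 1: 1.5, 2: 2.0, 3: 3.0}  # scale id -> factor (slide 7)


@dataclass
class Placement:
    label: str
    width: int
    height: int
    x: int        # left edge in scene coordinates
    y: int        # top edge in scene coordinates
    angle: float  # rotation in degrees


def fit_to_max_dim(width, height, max_dim):
    """Resize (width, height) so that its larger side equals max_dim."""
    factor = max_dim / max(width, height)
    return max(1, round(width * factor)), max(1, round(height * factor))


def random_placement(label, size, rng):
    """Place a base-size object fully inside the scene with a random rotation."""
    w, h = fit_to_max_dim(size[0], size[1], BASE_MAX_DIM)
    x = rng.randint(0, SCENE_DIM - w)
    y = rng.randint(0, SCENE_DIM - h)
    angle = rng.uniform(-MAX_ROTATION, MAX_ROTATION)
    return Placement(label, w, h, x, y, angle)


def enlarge(p, factor):
    """Scale a placement about its center (assumed reading of 'at scale 1')."""
    cx, cy = p.x + p.width / 2, p.y + p.height / 2
    w, h = round(p.width * factor), round(p.height * factor)
    return Placement(p.label, w, h, round(cx - w / 2), round(cy - h / 2), p.angle)


def make_scene(positive, negatives, scale, rng):
    """Return 10 negative placements followed by the enlarged positive."""
    chosen = [rng.choice(negatives) for _ in range(NUM_NEGATIVES)]  # uniform, with replacement
    scene = [random_placement(label, size, rng) for label, size in chosen]
    base = random_placement(positive[0], positive[1], rng)
    scene.append(enlarge(base, POSITIVE_SCALES[scale]))  # positive last, i.e. on top
    return scene


def scenes_for_instance(positive, negatives, scale, rng, per_instance=2):
    """Slide 7: two scenes are generated per training instance."""
    return [make_scene(positive, negatives, scale, rng) for _ in range(per_instance)]
```

Each `(label, (width, height))` pair stands in for one Caltech-256 image. At larger scales the enlarged positive may extend past the scene edge, so a real renderer would have to clip it.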

  8. VGG 16 Image source, and source of some code used in the experiments: https://www.cs.toronto.edu/~frossard/post/vgg16/

  9. VGG 16 for 5 classes Image source: https://www.cs.toronto.edu/~frossard/post/vgg16/, modified by us
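The slides do not say exactly how the network was adapted to 5 classes. A common approach, assumed here, is to keep the standard VGG-16 layers and replace only the final 1000-way fully connected layer with a 5-way one. The sketch below counts parameters under that assumption to show how little of the network changes. The function names are illustrative and not taken from the presenters' code.

```python
# (in_channels, out_channels) of the thirteen 3x3 convolutions in standard VGG-16
CONV_LAYERS = [
    (3, 64), (64, 64),
    (64, 128), (128, 128),
    (128, 256), (256, 256), (256, 256),
    (256, 512), (512, 512), (512, 512),
    (512, 512), (512, 512), (512, 512),
]


def conv_params(c_in, c_out, kernel=3):
    """Weights plus biases of one convolution layer."""
    return c_in * c_out * kernel * kernel + c_out


def fc_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


def vgg16_params(num_classes):
    """Total parameter count when only the output layer depends on num_classes."""
    conv = sum(conv_params(c_in, c_out) for c_in, c_out in CONV_LAYERS)
    fc6 = fc_params(7 * 7 * 512, 4096)   # 224 x 224 input reaches 7 x 7 x 512 after five poolings
    fc7 = fc_params(4096, 4096)
    fc8 = fc_params(4096, num_classes)   # the only layer that changes (assumption)
    return conv + fc6 + fc7 + fc8


print(vgg16_params(1000))  # 138357544
print(vgg16_params(5))     # 134281029
```

Under this assumption the 5-class network has about 4 million fewer parameters, all of them in the output layer.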

  10-11. Experiment Setup
     • Experiment 1
       – Train on different scales, test on clean image
     • Experiment 2
       – Train on different scales and clean, test on different scales

     Scale 0       Scale 1       Scale 2       Scale 3       Clean
     10% of view   20% of view   30% of view   60% of view   Image

     Image source: collages we made from Caltech 256 database
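The slides do not say how the "% of view" figures were measured. As a rough cross-check against the scene parameters on slide 7 (224 x 224 scene, largest object dimension 70, scale factors 1x to 3x), the sketch below computes the share of the scene covered by an unrotated, unclipped bounding box. The `aspect` parameter (shorter side divided by longer side) is an assumption introduced here. Real objects are rotated, irregular and possibly clipped, so the slide's percentages are best read as nominal.

```python
SCENE_DIM = 224                                    # slide 7
BASE_MAX_DIM = 70                                  # slide 7
SCALE_FACTORS = {0: 1.0, 1: 1.5, 2: 2.0, 3: 3.0}   # slide 7


def view_fraction(scale, aspect=1.0):
    """Fraction of the scene area covered by the object's bounding box.

    aspect is the shorter side divided by the longer side (hypothetical
    parameter; 1.0 means a square object, which gives an upper bound).
    """
    long_side = BASE_MAX_DIM * SCALE_FACTORS[scale]
    short_side = long_side * aspect
    return (long_side * short_side) / (SCENE_DIM * SCENE_DIM)


for scale in sorted(SCALE_FACTORS):
    print(scale, round(view_fraction(scale), 3), round(view_fraction(scale, aspect=0.75), 3))
```

For a square object at scale 0 this gives just under 10% of the view, in line with the slide. At larger scales the result depends strongly on the aspect ratio.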

  12. Experiment 1 - objective
     • Test effect of ‘bringing object to face’ for isolated classification
     • Questions to consider
       – Effect of viewing at multiple scales?
       – Single ideal scale or result of multiple scales?
     Image source: https://en.wiktionary.org/wiki/question_mark

  13. Experiment 1 - data Train0 Image source: collages we made from Caltech 256 database

  14. Experiment 1 - data Train1 Image source: collages we made from Caltech 256 database

  15. Experiment 1 - data Train2 Image source: collages we made from Caltech 256 database

  16. Experiment 1 - data Train3 Image source: collages we made from Caltech 256 database

  17. Experiment 1 - data Train3only Image source: collages we made from Caltech 256 database

  18. Experiment 1 - data Correct number of epochs to compensate for more training examples Image source: collages we made from Caltech 256 database
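The slide does not give the rule used to correct the number of epochs. One simple way to do it, assumed here, is to keep the total number of training images seen roughly constant across training sets of different sizes. The function name and the numbers in the example are hypothetical and do not come from the presentation.

```python
def compensated_epochs(base_epochs, base_size, set_size):
    """Epochs for a set of set_size images so that the total number of
    images seen matches base_epochs passes over base_size images."""
    if base_size <= 0 or set_size <= 0:
        raise ValueError("dataset sizes must be positive")
    return max(1, round(base_epochs * base_size / set_size))


# Hypothetical example: a set four times larger gets a quarter of the epochs.
print(compensated_epochs(base_epochs=40, base_size=1000, set_size=4000))  # 10
```

Without some correction of this kind, a larger training set would also receive more gradient updates, so the effect of scale could not be separated from the effect of longer training.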

  19. Experiment 1 - data Test Image source: collages we made from Caltech 256 database

  20-22. Experiment 1 - results
     [Bar chart: testing accuracy on clean image (y-axis, 0 to 1) for each train set (x-axis: Train0, Train1, Train2, Train3, Train3only)]
     Training on larger-scale images only yields the best test accuracy.

  23. Experiment 1 - results
     • Images misclassified when the network is trained at low scales benefit from training at higher scales
     Misclassified after train0, train1, train2
     Correctly classified after train3 and train3only
     (Category: bag)
     Image source: Caltech 256 database

  24. Experiment 1 - results
     • Images misclassified when the network is trained at low scales benefit from training at higher scales
     Misclassified after train0, train1, train2, train3
     Correctly classified only after train3only
     (Category: plane)
     Image source: Caltech 256 database

  25. Experiment 1 - results
     • Images misclassified after train3only were misclassified after all other trainings
     Bag, Plane, Plane
     Image source: Caltech 256 database

  26. Experiment 1 - conclusions
     • Toddler’s data gives better training because object is closer, not because it is ‘brought to face’
     • Significant jump in accuracy if object occupies >30% of view in training
     • Training images where object occupies <30% of view do more harm than good
     Image source: collages we made from Caltech 256 database

  27. Experiment Setup
     • Experiment 1
       – Train on different scales, test on clean image
     • Experiment 2
       – Train on different scales and clean, test on different scales

     Scale 0       Scale 1       Scale 2       Scale 3       Clean
     10% of view   20% of view   30% of view   60% of view   Image

     Image source: collages we made from Caltech 256 database

  28. Experiment 2 - objective
     • Effect of ‘bringing to face’ for object-in-scene detection
     • Questions to consider
       – Does ‘cleaning’ the scene decrease detection in cluttered environment?
     Image source: https://en.wiktionary.org/wiki/question_mark

  29. Experiment 2 - data Train0 Image source: collages we made from Caltech 256 database

  30. Experiment 2 - data Train1 Image source: collages we made from Caltech 256 database

  31. Experiment 2 - data Train2 Image source: collages we made from Caltech 256 database

  32. Experiment 2 - data Train3 Image source: collages we made from Caltech 256 database

  33. Experiment 2 - data TrainClean Image source: collages we made from Caltech 256 database

  34. Experiment 2 - data Correct number of epochs to compensate for more training examples Image source: collages we made from Caltech 256 database

  35. Experiment 2 - data Test0 On different images compared to train sets Image source: collages we made from Caltech 256 database

  36. Experiment 2 - data Test1only On different images compared to train sets Image source: collages we made from Caltech 256 database

  37. Experiment 2 - data Test2only On different images compared to train sets Image source: collages we made from Caltech 256 database

  38. Experiment 2 - data Test3only On different images compared to train sets Image source: collages we made from Caltech 256 database

  39-42. Experiment 2 - results
     [Bar chart: testing accuracy (y-axis, 0 to 1) for each train set (x-axis: Train0, Train1, Train2, Train3, TrainClean), one series per test set (Test0, Test1only, Test2only, Test3only)]
     Training by ‘bringing to face’ yields the best accuracy.

  43. Experiment 2 - conclusions
     • Can learn more from different scales than from clean, as long as scale 3 is included
     • Learning from different scales gives better accuracies when tested on lower scales
     • Test on clean much better than test on scales
     Image source: collages we made from Caltech 256 database

  44. Conclusions
     • With our controlled datasets, we could verify that network learns better from larger scale
     • Testing needs to be done on clean images, no matter which scales were used in training
     • Training on scales >30% gives more robustness when testing on all scales
     • Training on scales <30% hurts accuracy
