Elisa Maiettini and Dr. Giulia Pasquale
Joint work with:
- Prof. Lorenzo Natale, Prof. Lorenzo Rosasco
MUNICH 10-12 OCT 2017
NATURAL, INTERACTIVE TRAINING OF SERVICE ROBOTS TO DETECT NOVEL OBJECTS
What do we need?
Target scenario
R1 iCub
WINDOW: To close at night
SOFA: Elisa's favorite sofa
LIBRARY: Contains books. To dust every week
PLANT: To water every Friday
PLANT: To water every Monday
Plant Window Sofa Sofa Library
Are we done with Object Recognition? The R1 perspective. Giulia Pasquale, GTC 2017, San Jose, CA [http://on-demand.gputechconf.com/gtc/2017/video/s7295-giulia-pasquale-are-we-done-with-object-recognition-the-r1-robot-perspective.PNG.mp4k]
Sofa Plant
Grid-based approaches[1][2]:
Region-based approaches[3][4][5]:
[1] Redmon J. et al., 2016
[2] Liu W. et al., 2016
[3] Girshick R. et al., 2014
[4] Girshick R. et al., 2015
[5] Ren S. et al., 2015
[6] Uijlings J. R. R. et al., 2013
Sliding window approach:
It’s a sofa!
No object … No object No object
We cannot run a classifier for each possible window!
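To give a feel for why a classifier cannot be run on every possible window, the short sketch below counts the window positions an exhaustive sliding-window search would visit. The image size, window sizes, aspect ratios and stride in the usage line are illustrative assumptions, not values from the talk.

```python
def count_windows(img_w, img_h, win_w, win_h, stride):
    """Number of positions a win_w x win_h window can take on an image."""
    if win_w > img_w or win_h > img_h:
        return 0  # the window does not fit at all
    nx = (img_w - win_w) // stride + 1  # horizontal positions
    ny = (img_h - win_h) // stride + 1  # vertical positions
    return nx * ny


def count_all_windows(img_w, img_h, sizes, aspect_ratios, stride):
    """Total positions over every (size, aspect ratio) combination.

    Each window keeps an area of roughly size**2; ratio is width / height.
    """
    total = 0
    for size in sizes:
        for ratio in aspect_ratios:
            win_w = int(round(size * ratio ** 0.5))
            win_h = int(round(size / ratio ** 0.5))
            total += count_windows(img_w, img_h, win_w, win_h, stride)
    return total


if __name__ == "__main__":
    # Hypothetical settings: a 640x480 image, 4 sizes, 3 aspect ratios, stride 8.
    n = count_all_windows(640, 480, [64, 128, 256, 384], [0.5, 1.0, 2.0], 8)
    print("classifier evaluations per frame:", n)
```

Every counted position would need its own classifier evaluation, which is the cost that region proposals are meant to avoid.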
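[Figure: detection pipeline from [5]. The CNN computes a feature map; the RPN generates region proposals from it (where to look?); the ROI pooling layer extracts features for each ROI, which pass through fc6 and fc7; a classifier outputs classification scores (what?) and a bounding box regressor outputs predicted bounding boxes (where?).]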
[5] Ren S. et al., NIPS, 2015
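The ROI pooling step in this pipeline turns a region of arbitrary size into a fixed-size grid so that the same fully connected layers can process every proposal. A minimal NumPy sketch follows, assuming a single-channel feature map and an ROI already expressed in integer feature-map cells. The function name roi_max_pool and the 2x2 output size are illustrative choices, and this is not the Caffe layer used in the talk.

```python
import numpy as np


def roi_max_pool(feature_map, roi, out_size=(2, 2)):
    """Max-pool one ROI of a 2D feature map into a fixed out_size grid.

    roi = (x0, y0, x1, y1) in feature-map cells, end-exclusive, non-empty.
    """
    x0, y0, x1, y1 = roi
    region = feature_map[y0:y1, x0:x1]
    out_h, out_w = out_size
    h, w = region.shape
    # Split rows and columns into roughly equal bins.
    y_edges = np.linspace(0, h, out_h + 1).astype(int)
    x_edges = np.linspace(0, w, out_w + 1).astype(int)
    pooled = np.empty(out_size, dtype=feature_map.dtype)
    for i in range(out_h):
        for j in range(out_w):
            # Force every bin to cover at least one cell.
            ys, ye = y_edges[i], max(y_edges[i + 1], y_edges[i] + 1)
            xs, xe = x_edges[j], max(x_edges[j + 1], x_edges[j] + 1)
            pooled[i, j] = region[ys:ye, xs:xe].max()
    return pooled


if __name__ == "__main__":
    fmap = np.arange(16).reshape(4, 4)
    # Two ROIs of different sizes both come out as 2x2 grids.
    print(roi_max_pool(fmap, (0, 0, 4, 4)))
    print(roi_max_pool(fmap, (1, 1, 4, 3)))
```

Whatever the proposal size, the output shape is the same, which is what lets one classifier and one bounding box regressor serve all regions.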
Modularity = Flexibility!!
RPN is faster and more efficient than external region proposal methods (e.g. Selective Search [6])
from sensors (e.g. depth)
model training
[8] Pasquale et al., Frontiers 2016
Bounding boxes Labels
Detection system deployed on R1 thanks to the on-board NVIDIA Jetson TX2
[9] Pasquale et al. IROS 2016 [https://robotology.github.io/iCubWorld]
Scenario 1: same HRI setting
1. Predictions compared with automatically acquired Ground Truth (Mean Average Precision = 0.71)
Ground Truth (Mean Average Precision = 0.75)
Even better!
Scenario 2: different scenes
Acquisition and manual annotation of new sequences
Promising results: mAP_floor = 0.55, mAP_table = 0.66, mAP_shelf = 0.53
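For readers unfamiliar with the metric, the sketch below shows one common way to compute Mean Average Precision: detections are matched to ground truth by intersection over union (IoU), average precision (AP) is the area under the precision-recall curve for one class, and mAP is the mean of the per-class APs. The slides do not say which AP variant or IoU threshold was used, so the all-point interpolation here is an assumption, and all function names are illustrative.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x0, y0, x1, y1)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(scores, is_tp, num_gt):
    """Area under the precision-recall curve for one class.

    scores: detection confidences; is_tp: whether each detection matched a
    ground-truth box (e.g. IoU above a chosen threshold); num_gt: number of
    ground-truth boxes for this class.
    """
    if num_gt == 0 or not scores:
        return 0.0
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0
    precisions, recalls = [], []
    for i in order:
        if is_tp[i]:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_gt)
    # Replace each precision by the best precision at equal or higher recall.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(ap_per_class):
    """Mean of the per-class AP values (dict: class name -> AP)."""
    return sum(ap_per_class.values()) / len(ap_per_class)
```

A detection usually counts as a true positive when its IoU with an unmatched ground-truth box of the same class exceeds a threshold; 0.5 is a common choice, but it is not stated in the slides.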
Performed offline on:
GPU: NVIDIA Tesla P100
using:
CNN: Zeiler and Fergus network[9]
DATASET: iCubWorld Transformations
with:
Num images: ~27k
RPN Train: Iterations: 81k, Time: ~40 minutes
Detector Train: Iterations: 54k, Time: ~53 minutes
Total train Time: ~3 hours
[9] Zeiler M. D. and Fergus R., CoRR, 2013
Thanks to NVIDIA Jetson TX2:
fast & easy
fully autonomous platform
easy systems integration: Caffe & YARP & Python
Evaluation of TensorRT framework
Regions per frame    Frames per second
100                  ~4
300                  ~3
1000                 ~2
Future steps:
information to improve precision (e.g. time coherence)
towards scene understanding task
Contribution:
annotation for robotic platforms
NVIDIA Jetson TX2 on board
Carlo Ciliberto, Research Associate, UCL IRIS
Vadim Tikhanoff, Technologist Researcher, iCub Facility
Raffaello Camoriano, Postdoc, LCSL
Ugo Pattacini, Technologist Researcher, iCub Facility
Marco Randazzo, Senior Technician, iCub Facility
Alberto Cardellino, Junior Technician, iCub Facility
Lorenzo Rosasco, Team Leader, LCSL
Lorenzo Natale, Principal Investigator, iCub Facility
Giorgio Metta, Research Director, iCub Facility
Alessandro Rudi, Postdoc, LCSL
Tanis Mar, Postdoc, iCub Facility
Learning
Francesca Odone Assistant Professor,