Semantic web-mining and deep vision for lifelong object discovery Valerio Basile Data Science Meetup Learning Center Sophia Polytech 20/6/2017 valeriobasile@gmail.com
Me Valerio Basile From: Italy PhD in Groningen (NL) Postdoc at WIMMICS, Inria Computational Semantics, Semantic Web, Natural Language Generation, Information Extraction, Linguistic Annotation, Distributional Semantics, General Knowledge Bases, Gamification, Social Media, Sentiment Analysis, Legal Informatics, Argument Mining, Math, Pasta, Videogames, ...
Today Robots Computer vision Objects Semantics Learning Data
Part I: Motivation
We deploy robots in human-inhabited environments. Our robots autonomously collect real-world data. We use information available on the Semantic Web to identify the semantics of objects.
● Background Knowledge ● Room detection ● Object classification ● Frame semantics ● ...
Frame Semantics Bob, du pain svp! ("Bob, some bread please!")
Frame Semantics Frame name Frame type role Frame element
Part II: Deep Vision
Perception and Identification monitor keyboard mousepad
Situated Robot Perception Robot deployments in office environments The robot visits fixed waypoints on the map, taking full 360° RGB-D scans
Situated Robot Perception RGB-D depth segmentation algorithm (Potapova et al., 2014) Pictures → Point clouds → Patches → Surfaces → Clustering → Filtering → Candidate objects
Supervised Object Recognition Training: CNN Prediction: CNN → dbr:Keyboard
Supervised Object Recognition Convolutional Neural Network model trained on 1,000 object classes (More detail in Young et al., ICRA 2017)
21,841 WordNet Synsets (1,000 with SIFT features) 14,197,122 images (1,034,908 with bounding box) http://image-net.org/ http://wordnet-rdf.princeton.edu/
Supervised Object Recognition http://image-net.org/
Performance of Robot Vision Good but not great Mugs on ImageNet (training data) Mugs seen by a robot (validation data)
Part III: Semantics and Vision
Object Identification
Place Classification
Semantic Relatedness
Semantic Relatedness Co-occurrence matrix
              Washing_machine  Ashtray
Bathroom                    5        2
Bedroom                     0        1
Living_room                 1        6
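A co-occurrence matrix like the one on this slide can be built by counting how often each item appears in each context. The sketch below is only an illustration: the observation list is made up so that the counts match the slide, and the function name `cooccurrence_matrix` is hypothetical, not taken from the talk or from NASARI.

```python
from collections import Counter

def cooccurrence_matrix(observations):
    """Count (context, item) pairs and lay the counts out as a matrix.

    observations: list of (context, item) tuples.
    Returns (row_labels, column_labels, matrix) with labels sorted alphabetically.
    """
    counts = Counter(observations)
    rows = sorted({context for context, _ in observations})
    cols = sorted({item for _, item in observations})
    # Counter returns 0 for pairs that were never observed.
    matrix = [[counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, matrix

# Made-up observations chosen to reproduce the counts on the slide.
observations = (
    [("Bathroom", "Washing_machine")] * 5
    + [("Bathroom", "Ashtray")] * 2
    + [("Bedroom", "Ashtray")] * 1
    + [("Living_room", "Washing_machine")] * 1
    + [("Living_room", "Ashtray")] * 6
)

rows, cols, matrix = cooccurrence_matrix(observations)
for label, row in zip(rows, matrix):
    print(label, dict(zip(cols, row)))
```

Columns come out in alphabetical order (Ashtray before Washing_machine), so they are swapped with respect to the slide.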
Semantic Relatedness Co-occurrence matrix
              Washing_machine  Ashtray
Bathroom                    5        2
Bedroom                     0        1
Living_room                 1        6
Singular value decomposition: $M = U \Sigma V^{*}$
Low-rank approximation: $M_k = U_k \Sigma_k V_k^{*}$
NASARI: A Novel Approach to a Semantically-Aware Representation of Items (Camacho-Collados, Pilehvar and Navigli, 2015)
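The decomposition and its rank-k truncation can be tried directly on the small matrix from the slide. This is a minimal sketch using NumPy, an assumption since the talk does not name a library; the helper `low_rank` is hypothetical, and the real NASARI vectors are built from far larger data.

```python
import numpy as np

# Rows: Bathroom, Bedroom, Living_room. Columns: Washing_machine, Ashtray.
M = np.array([[5, 2],
              [0, 1],
              [1, 6]], dtype=float)

def low_rank(matrix, k):
    """Return the rank-k approximation M_k = U_k Sigma_k V_k^* of matrix."""
    # full_matrices=False gives the compact SVD: U is 3x2, s has 2 values, Vt is 2x2.
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

M1 = low_rank(M, 1)  # keeps only the largest singular value
M2 = low_rank(M, 2)  # keeps both singular values, so it reproduces M

print(np.round(M1, 2))
print(np.allclose(M2, M))
```

Keeping fewer singular values trades reconstruction accuracy for a more compact representation of each row and column.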
Semantic Similarity
bn:00008995n Bathroom -0.03750793 0.06731935 -0.02334246 -0.02009913 0.02251291 0.07689607 0.01527985 -0.10780967 0.18232885 0.1234034 -0.0520944 -0.25805958 0.12200121 -0.04875973 -0.03544397 -0.03841146 0.00970973 …
bn:00007365n Washing_machine -0.00911299 0.11549547 -0.04274256 0.03672424 -0.06627292 0.13761881 0.01171631 -0.08721243 0.08270955 0.13095092 -0.00137408 -0.16226186 0.0422162 0.0545828 -0.01007292 0.10094466 -0.05663372 0.09864459 0.10167608 7.534e-05 0.08067719 0.05527394
Cosine similarity: $\frac{A \cdot B}{\|A\| \|B\|} = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^2} \sqrt{\sum_{i=1}^{n} B_i^2}}$
http://lcl.uniroma1.it/nasari/
Semantic Similarity Washing_machine Bathroom Ashtray
sim(Bathroom, Washing_machine) = cos(α) ≈ 0.71
sim(Bathroom, Ashtray) = cos(β) ≈ 0.37
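Cosine similarity between two concept vectors can be computed in a few lines. The sketch below uses made-up 2-dimensional vectors as stand-ins for the NASARI embeddings, so its scores will not match the 0.71 and 0.37 reported on the slide; the function name `cosine_similarity` is also just illustrative.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (sequences of floats)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)

# Toy vectors (assumptions), not real NASARI embeddings.
bathroom = [1.0, 0.1]
washing_machine = [0.9, 0.2]
ashtray = [0.2, 0.9]

print(cosine_similarity(bathroom, washing_machine))
print(cosine_similarity(bathroom, ashtray))
```

With these toy vectors, Washing_machine scores higher than Ashtray against Bathroom, mirroring the ordering on the slide.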
Place Classification = Cosine similarity on NASARI + aggregation, weighting by distance, ...
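One way to read this recipe: score each candidate room by summing the similarity between the room and every detected object, with closer objects counting more. The sketch below is an assumption-heavy illustration. The weighting function 1 / (1 + distance), the toy vectors, and the name `classify_place` are all hypothetical; the slide does not specify the actual aggregation or weighting.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Toy stand-ins for NASARI vectors (assumptions).
ROOM_VECTORS = {"Bathroom": [1.0, 0.1], "Living_room": [0.1, 1.0]}
OBJECT_VECTORS = {"Washing_machine": [0.9, 0.2], "Ashtray": [0.2, 0.9]}

def classify_place(detections):
    """detections: list of (object_label, distance_in_metres) pairs.

    Returns (best_room, scores), where scores maps each room to its
    distance-weighted sum of object-room similarities.
    """
    scores = {}
    for room, room_vec in ROOM_VECTORS.items():
        total = 0.0
        for label, distance in detections:
            weight = 1.0 / (1.0 + distance)  # hypothetical weighting: nearer counts more
            total += weight * cosine(OBJECT_VECTORS[label], room_vec)
        scores[room] = total
    best = max(scores, key=scores.get)
    return best, scores

# A washing machine 1 m away and an ashtray 4 m away.
best_room, room_scores = classify_place([("Washing_machine", 1.0), ("Ashtray", 4.0)])
print(best_room, room_scores)
```

Here the nearby washing machine outweighs the distant ashtray, so the bathroom gets the higher score.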
Place Classification 0° RGB-D scans
Distributional Relational Hypothesis Semantic Relation (Type A, Type B) Semantic Similarity (Entity 1, Entity 2)
Distributional Relational Hypothesis isLocatedAt (Object, Room) Semantic Similarity (Entity 1, Entity 2) Successfully applied to object-location relation extraction (Basile et al., EKAW 2016) and improving object detection (Young et al., ICRA 2017) More on this later
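Read operationally, the hypothesis suggests proposing an isLocatedAt relation whenever an object and a room are semantically similar enough. The following sketch shows that idea with toy vectors and a hypothetical threshold of 0.5; neither the threshold nor the function name `extract_locations` comes from the cited papers.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Toy stand-ins for entity vectors (assumptions).
OBJECTS = {"Washing_machine": [0.9, 0.2], "Ashtray": [0.2, 0.9]}
ROOMS = {"Bathroom": [1.0, 0.1], "Living_room": [0.1, 1.0]}

def extract_locations(objects, rooms, threshold=0.5):
    """Propose (object, 'isLocatedAt', room, score) tuples above the threshold."""
    relations = []
    for obj, obj_vec in objects.items():
        for room, room_vec in rooms.items():
            score = cosine(obj_vec, room_vec)
            if score >= threshold:  # hypothetical cut-off
                relations.append((obj, "isLocatedAt", room, score))
    # Highest-scoring candidates first.
    return sorted(relations, key=lambda r: r[3], reverse=True)

relations = extract_locations(OBJECTS, ROOMS)
for obj, rel, room, score in relations:
    print(obj, rel, room, round(score, 2))
```

In practice the threshold would need tuning against annotated object-location pairs.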
Part IV: More Semantics
http://dbpedia.org/page/Table_knife
http://conceptnet.io/c/en/knife
http://knowrob.org/kb/knowrob.owl
http://babelnet.org/synset?word=table+knife
DB KR BN DK CN
            Taxonomy  Function  Location  Linked Data
DBpedia     ✔         ✘         ✘         ✔
ConceptNet  ✔         ✔         ✔         partly
KnowRob     ✔         ✔         partly    ✘
BabelNet    ✔         ✘         ✘         ✔
DeKO        partly    ✔         ✔         ✔
DB KR BN DK CN Keyword Linking
            Taxonomy  Function  Location  Linked Data
DBpedia     ✔         ✔         ✔         ✔
ConceptNet  ✔         ✔         ✔         ✔
KnowRob     ✔         ✔         ✔         ✔
BabelNet    ✔         ✔         ✔         ✔
DeKO        ✔         ✔         ✔         ✔
Keyword Linking Methods
DBpedia Lookup: "official" search API of DBpedia
String Match (+redirect): Try http://dbpedia.org/resource/{KEYWORD}
Babelfy: State of the art algorithm for Word Sense Disambiguation/Entity Linking
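The String Match method boils down to turning a keyword into a candidate DBpedia resource URI. The sketch below only builds the URI; checking that the resource exists and following redirects would need an HTTP request, which is left out. The normalisation (underscores for spaces, capitalised first letter) and the name `candidate_uri` are assumptions for illustration.

```python
from urllib.parse import quote

DBPEDIA_RESOURCE = "http://dbpedia.org/resource/"

def candidate_uri(keyword):
    """Build the DBpedia resource URI to try for a keyword.

    Assumed normalisation: words joined by underscores, first letter capitalised.
    """
    words = keyword.strip().replace("_", " ").split()
    if not words:
        raise ValueError("empty keyword")
    name = "_".join(words)
    name = name[0].upper() + name[1:]
    # Percent-encode characters that are not safe in a URI path.
    return DBPEDIA_RESOURCE + quote(name)

print(candidate_uri("table knife"))
print(candidate_uri("washing_machine"))
```

A keyword for which no such resource exists counts as missed and can be passed on to another linking method.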
Keyword Linking Methods Vector-based Contextual disambiguation ● Run String Match on the keywords ● Split the missed keywords into tokens ● Run String Match on the tokens ● Compute the semantic similarity of each token-entity with all the previously recognized entities ● Select the highest scoring token-entity e.g., basket_of_banana → dbr:Basket
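The steps above can be sketched end to end with toy data. In this illustration the dictionary `KNOWN` stands in for String Match against DBpedia, the vectors are made up, and the per-candidate score is the sum of cosine similarities with the already-linked entities; the slide does not say how the similarities are combined, so that choice is an assumption.

```python
import math

# Hypothetical stand-in for String Match: keyword or token -> entity.
KNOWN = {"table": "dbr:Table", "chair": "dbr:Chair",
         "basket": "dbr:Basket", "banana": "dbr:Banana"}

# Toy entity vectors (assumptions), not real embeddings.
VECTORS = {"dbr:Table": [1.0, 0.1], "dbr:Chair": [0.9, 0.2],
           "dbr:Basket": [0.8, 0.3], "dbr:Banana": [0.1, 1.0]}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def link_keywords(keywords):
    linked, missed = {}, []
    # Step 1: String Match on the whole keywords.
    for keyword in keywords:
        if keyword in KNOWN:
            linked[keyword] = KNOWN[keyword]
        else:
            missed.append(keyword)
    context = list(linked.values())  # previously recognized entities
    for keyword in missed:
        # Steps 2-3: split into tokens and run String Match on each token.
        candidates = [KNOWN[t] for t in keyword.split("_") if t in KNOWN]
        if not candidates:
            continue  # nothing to link for this keyword
        # Steps 4-5: score each candidate against the context, keep the best.
        linked[keyword] = max(
            candidates,
            key=lambda e: sum(cosine(VECTORS[e], VECTORS[c]) for c in context),
        )
    return linked

result = link_keywords(["table", "chair", "basket_of_banana"])
print(result)
```

With a table and a chair as context, the basket candidate scores higher than the banana candidate, which reproduces the example on the slide.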
The SUN database http://groups.csail.mit.edu/vision/SUN 131,067 Images 908 Scene categories 313,884 Segmented objects 4,479 Object categories After linking (yes, this is hard): 2,493 objects in DBpedia 679 locations in DBpedia 2,935 object-location relations
Epilogue: Putting it all together
Default Knowledge about Objects RDF dataset of common sense knowledge about objects. Object classification, prototypical location, actions, frames... Knowledge extracted from parsing, crowdsourcing, distributional semantics, keyword linking
Autonomous Learning Task-level Loop Keyword Linking Distributional Relational Hypothesis ...
Autonomous Learning Learning-level Loop Robot perception Validation Data collection Validation Knowledge Building
End (Q/A)