SLIDE 1 Semantic web mining and deep vision for lifelong object discovery
Valerio Basile
Data Science Meetup, Learning Center Sophia Polytech, June 2017
valeriobasile@gmail.com
SLIDE 2
Me
Valerio Basile
From: Italy
PhD in Groningen (NL)
Postdoc at WIMMICS, Inria
Computational Semantics, Semantic Web, Natural Language Generation, Information Extraction, Linguistic Annotation, Distributional Semantics, General Knowledge Bases, Gamification, Social Media, Sentiment Analysis, Legal Informatics, Argument Mining, Math, Pasta, Videogames, ...
SLIDE 3 Today
Robots, Computer vision, Objects, Semantics, Learning, Data
SLIDE 4 Part I: Motivation
SLIDE 5
SLIDE 6
We deploy robots in human-inhabited environments. Our robots autonomously collect real-world data. We use information available on the Semantic Web to identify the semantics of objects.
SLIDE 7
SLIDE 8
SLIDE 9
- Background Knowledge
- Room detection
- Object classification
- Frame semantics
- ...
SLIDE 10
Frame Semantics
"Bob, some bread please!"
SLIDE 11
Frame Semantics
Frame name Frame type Frame element role
SLIDE 12
Frame Semantics
SLIDE 13 Part II: Deep Vision
SLIDE 14 Perception and Identification
SLIDE 15 Perception and Identification
SLIDE 16 Perception and Identification
monitor keyboard mousepad
SLIDE 17 Perception and Identification
SLIDE 18 Situated Robot Perception
Robot deployments in office environments
The robot visits fixed waypoints on the map, taking full 360° RGB-D scans
SLIDE 19 Situated Robot Perception
RGB-D depth segmentation algorithm (Potapova et al., 2014)
Pictures → Point clouds → Patches → Surfaces → Clustering → Filtering → Candidate objects
SLIDE 20 Supervised Object Recognition
CNN Training Prediction CNN
dbr:Keyboard
SLIDE 21 Supervised Object Recognition
Convolutional Neural Network model trained
(More detail in Young et al., ICRA 2017)
SLIDE 22
21,841 WordNet Synsets (1,000 with SIFT features)
14,197,122 images (1,034,908 with bounding box)
http://image-net.org/
http://wordnet-rdf.princeton.edu/
SLIDE 23 Supervised Object Recognition
http://image-net.org/
SLIDE 24 Performance
Robot Vision
Good but not great
Mugs on ImageNet (training data)
Mugs seen by a robot (validation data)
SLIDE 25 Part III: Semantics and Vision
SLIDE 26 Object Identification
SLIDE 27 Place Classification
SLIDE 28
Semantic Relatedness
SLIDE 29
Semantic Relatedness
SLIDE 30
Semantic Relatedness
SLIDE 31 Semantic Relatedness
Co-occurrence matrix (rows: rooms; columns: Washing_machine, Ashtray)
Bathroom: 5, 2
Bedroom: 1
Living_room: 1, 6
SLIDE 32 Semantic Relatedness
Co-occurrence matrix
Singular value decomposition: $M = U \Sigma V^{*}$
Low-rank approximation: $M_k = U_k \Sigma_k V_k^{*}$
NASARI: A Novel Approach to a Semantically-Aware Representation of Items (Camacho-Collados, Pilehvar and Navigli, 2015)
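As a rough illustration of the decomposition step, the sketch below truncates the SVD of a small room-by-object co-occurrence matrix. It assumes NumPy is available; the function name `low_rank_approximation` and the counts in `M` are made up for this example and are not taken from the talk or from NASARI.

```python
import numpy as np


def low_rank_approximation(M, k):
    """Return M_k = U_k Sigma_k V_k^*, the rank-k truncation of the SVD of M."""
    # Thin SVD: M = U @ diag(s) @ Vt, with singular values in s sorted descending.
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    # Keep only the k largest singular values and the matching singular vectors.
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]


# Hypothetical co-occurrence counts: rows are rooms, columns are objects.
M = np.array([
    [5.0, 2.0],
    [1.0, 0.0],
    [1.0, 6.0],
])

M1 = low_rank_approximation(M, 1)  # rank-1 approximation
M2 = low_rank_approximation(M, 2)  # full rank here, so it reproduces M
```

With only two columns, `k = 2` already recovers the matrix exactly. On a realistic co-occurrence matrix, a much smaller `k` than the number of columns is what gives compact, dense vectors.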
SLIDE 33 Semantic Similarity
bn:00008995n Bathroom -0.03750793 0.06731935 -0.02334246 -0.02009913 0.02251291 0.07689607 0.01527985 -0.10780967 0.18232885 0.1234034 -0.0520944 -0.25805958 0.12200121 -0.04875973 -0.03544397 -0.03841146 0.00970973 …
bn:00007365n Washing_machine -0.00911299 0.11549547 -0.04274256 0.03672424 -0.06627292 0.13761881 0.01171631 -0.08721243 0.08270955 0.13095092 -0.00137408 -0.16226186 0.0422162 0.0545828 -0.01007292 0.10094466 -0.05663372 0.09864459 0.10167608 7.534e-05 0.08067719 0.05527394
Cosine similarity: $\frac{A \cdot B}{\|A\| \|B\|} = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^2} \sqrt{\sum_{i=1}^{n} B_i^2}}$
http://lcl.uniroma1.it/nasari/
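The cosine similarity formula can be sketched in a few lines of plain Python. The function name `cosine_similarity` is chosen for this example, and the short vectors in the usage lines are toy values, not NASARI vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors: same direction, orthogonal, and opposite direction.
same = cosine_similarity([1.0, 2.0], [2.0, 4.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
opposite = cosine_similarity([1.0, 0.0], [-1.0, 0.0])
```

The value depends only on the angle between the vectors, not on their lengths, which is why it suits comparing embedding vectors of different magnitudes.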
SLIDE 34 Semantic Similarity
Figure: vectors for Bathroom, Ashtray and Washing_machine, with angles α and β
sim(Bathroom, Washing_machine) = cos(α) ≈ 0.71
sim(Bathroom, Ashtray) = cos(β) ≈ 0.37
SLIDE 35 Place Classification
= Cosine similarity on NASARI + aggregation, weighting by distance, ...
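One possible reading of this recipe is sketched below: each candidate room is scored by a distance-weighted average of its cosine similarity to the observed objects. The weighting `1 / (1 + distance)`, the function names, and the two-dimensional vectors are all assumptions made for illustration. The slide does not specify the aggregation or the weighting, and a real system would use NASARI vectors.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def classify_place(observations, object_vectors, room_vectors):
    """Score each room from (object_label, distance) observations.

    Returns the best-scoring room and the full score table.
    """
    if not observations:
        raise ValueError("at least one observed object is required")
    scores = {}
    for room, room_vector in room_vectors.items():
        weighted_sum = 0.0
        total_weight = 0.0
        for label, distance in observations:
            # Assumed weighting: closer objects contribute more to the score.
            weight = 1.0 / (1.0 + distance)
            weighted_sum += weight * cosine(object_vectors[label], room_vector)
            total_weight += weight
        scores[room] = weighted_sum / total_weight
    best_room = max(scores, key=scores.get)
    return best_room, scores


# Toy vectors standing in for real embeddings (hypothetical values).
object_vectors = {"Washing_machine": [0.9, 0.1], "Ashtray": [0.2, 0.8]}
room_vectors = {"Bathroom": [1.0, 0.2], "Living_room": [0.1, 1.0]}

# A washing machine seen 1 m away and an ashtray seen 4 m away.
best_room, scores = classify_place(
    [("Washing_machine", 1.0), ("Ashtray", 4.0)], object_vectors, room_vectors
)
```

In this toy setup the nearby washing machine dominates, so the bathroom scores highest.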
SLIDE 36 Place Classification
0° RGB-D scans
SLIDE 37 Distributional Relational Hypothesis
Entity 1, Entity 2, Type A, Type B, Semantic Relation, Semantic Similarity
SLIDE 38 Distributional Relational Hypothesis
Entity 1, Entity 2, Object, Room, isLocatedAt, Semantic Similarity
SLIDE 39 Distributional Relational Hypothesis
Entity 1, Entity 2, Object, Room, isLocatedAt, Semantic Similarity
Successfully applied to object-location relation extraction (Basile et al., EKAW 2016) and to improving object detection (Young et al., ICRA 2017)
More on this later
SLIDE 40
Part IV: More Semantics
SLIDE 41
http://dbpedia.org/page/Table_knife
SLIDE 42
http://conceptnet.io/c/en/knife
SLIDE 43
http://knowrob.org/kb/knowrob.owl
SLIDE 44
http://babelnet.org/synset?word=table+knife
SLIDE 45
            Taxonomy  Function  Location  Linked Data
DBpedia     ✔         ✘         ✘         ✔
ConceptNet  ✔         ✔         ✔         partly
KnowRob     ✔         ✔         partly    ✘
BabelNet    ✔         ✘         ✘         ✔
DeKO        partly    ✔         ✔         ✔
BN DB CN DK KR
SLIDE 46
            Taxonomy  Function  Location  Linked Data
DBpedia     ✔         ✔         ✔         ✔
ConceptNet  ✔         ✔         ✔         ✔
KnowRob     ✔         ✔         ✔         ✔
BabelNet    ✔         ✔         ✔         ✔
DeKO        ✔         ✔         ✔         ✔
BN DB CN DK KR Keyword Linking
SLIDE 47 Keyword Linking Methods
- DBpedia Lookup: "official" search API of DBpedia
- String Match (+redirect): try http://dbpedia.org/resource/{KEYWORD}
- Babelfy: state-of-the-art algorithm for Word Sense Disambiguation/Entity Linking
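To make the String Match idea concrete, here is a minimal sketch that only builds the candidate URI for a keyword. The normalization (underscores for spaces, capitalized first letter) is an assumption based on how DBpedia resource names mirror Wikipedia titles, and the function name `candidate_uri` is invented for this example. Checking whether the resource exists and following redirects is left out.

```python
from urllib.parse import quote

DBPEDIA_RESOURCE = "http://dbpedia.org/resource/"


def candidate_uri(keyword):
    """Build the DBpedia resource URI that String Match would try."""
    # Collapse whitespace and join words with underscores.
    name = "_".join(keyword.strip().split())
    if not name:
        raise ValueError("keyword must not be empty")
    # Assumed convention: resource names start with an upper-case letter.
    name = name[0].upper() + name[1:]
    # Percent-encode anything that is not safe in a URI path segment.
    return DBPEDIA_RESOURCE + quote(name, safe="_(),'")


knife_uri = candidate_uri("table knife")
keyboard_uri = candidate_uri("keyboard")
```

A full implementation would then request the URI and treat a missing resource as a miss, which is where the other linking methods come in.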
SLIDE 48 Keyword Linking Methods
Vector-based contextual disambiguation
- Run String Match on the keywords
- Split the missed keywords into tokens
- Run String Match on the tokens
- Compute the semantic similarity of each token-entity with all the previously recognized entities
- Select the highest scoring token-entity
e.g., basket_of_banana → dbr:Basket
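The steps above can be sketched as follows. This is an illustrative reading, not the authors' implementation: String Match is replaced by a dictionary lookup, similarities to the context entities are summed (the slide does not say how they are combined), and the entity vectors are toy values.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def link_keywords(keywords, string_match, vectors):
    """Link keywords to entities, disambiguating misses by context."""
    linked, missed = {}, []
    # Step 1: run String Match on the whole keywords.
    for keyword in keywords:
        entity = string_match(keyword)
        if entity is None:
            missed.append(keyword)
        else:
            linked[keyword] = entity
    # Context: entities recognized in step 1 that have a vector.
    context = [entity for entity in linked.values() if entity in vectors]
    for keyword in missed:
        best_entity, best_score = None, float("-inf")
        # Steps 2-3: split into tokens and run String Match on each token.
        for token in keyword.split("_"):
            candidate = string_match(token)
            if candidate is None or candidate not in vectors:
                continue
            # Step 4: similarity to all recognized entities (summed here).
            score = sum(cosine(vectors[candidate], vectors[c]) for c in context)
            # Step 5: keep the highest scoring token-entity.
            if score > best_score:
                best_entity, best_score = candidate, score
        if best_entity is not None:
            linked[keyword] = best_entity
    return linked


# Hypothetical stand-ins for String Match and for entity vectors.
TOY_INDEX = {
    "table": "dbr:Table",
    "shelf": "dbr:Shelf",
    "basket": "dbr:Basket",
    "banana": "dbr:Banana",
}
TOY_VECTORS = {
    "dbr:Table": [1.0, 0.1],
    "dbr:Shelf": [0.9, 0.2],
    "dbr:Basket": [0.8, 0.3],
    "dbr:Banana": [0.1, 1.0],
}

result = link_keywords(
    ["table", "shelf", "basket_of_banana"], TOY_INDEX.get, TOY_VECTORS
)
```

Here the context consists of furniture-like entities, so `dbr:Basket` outscores `dbr:Banana` for the keyword `basket_of_banana`, matching the slide's example.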
SLIDE 49
SLIDE 50
The SUN database
SLIDE 51 The SUN database
http://groups.csail.mit.edu/vision/SUN
131,067 images
908 scene categories
313,884 segmented objects
4,479 object categories
After linking:
2,493 objects in DBpedia
679 locations in DBpedia
2,935 object-location relations
SLIDE 52
The SUN database
Yes, this is hard
SLIDE 53 Epilogue: Putting it all together
SLIDE 54 Default Knowledge about Objects
RDF dataset of common sense knowledge about
Object classification, prototypical location, actions, frames...
Knowledge extracted from parsing, crowdsourcing, distributional semantics, keyword linking
SLIDE 55
SLIDE 56 Autonomous Learning
Task-level Loop
Keyword Linking, Distributional Relational Hypothesis, ...
SLIDE 57 Autonomous Learning
Learning-level Loop
Robot perception, Data collection, Knowledge Building
SLIDE 58 Autonomous Learning
Learning-level Loop
Robot perception, Validation, Data collection, Validation, Knowledge Building
SLIDE 59 Autonomous Learning
Learning-level Loop
Robot perception, Validation, Data collection, Validation, Knowledge Building
SLIDE 60
The End (Q&A)
SLIDE 61 Links
Me: http://valeriobasile.github.io/
Project: https://project.inria.fr/aloof/
Databases & Resources:
- http://image-net.org/
- http://groups.csail.mit.edu/vision/SUN
- http://knowrob.org/kb/knowrob.owl
- http://dbpedia.org/page/
- http://babelnet.org/
- http://conceptnet.io/
- http://lcl.uniroma1.it/nasari/