Semantic web-mining and deep vision for lifelong object discovery



slide-1
SLIDE 1

Semantic web-mining and deep vision for lifelong object discovery

Valerio Basile

Data Science Meetup, Learning Center Sophia Polytech, 2/6/2017

valeriobasile@gmail.com

slide-2
SLIDE 2

Me

Valerio Basile
From: Italy
PhD in Groningen (NL)
Postdoc at WIMMICS, Inria
Computational Semantics, Semantic Web, Natural Language Generation, Information Extraction, Linguistic Annotation, Distributional Semantics, General Knowledge Bases, Gamification, Social Media, Sentiment Analysis, Legal Informatics, Argument Mining, Math, Pasta, Videogames, ...

slide-3
SLIDE 3

Today

Robots Computer vision Objects Semantics Learning Data

slide-4
SLIDE 4

Part I: Motivation
slide-5
SLIDE 5
slide-6
SLIDE 6

We deploy robots in human-inhabited environments. Our robots autonomously collect real-world data. We use information available on the Semantic Web to identify the semantics of objects.

slide-7
SLIDE 7
slide-8
SLIDE 8
slide-9
SLIDE 9
  • Background Knowledge
  • Room detection
  • Object classification
  • Frame semantics
  • ...
slide-10
SLIDE 10

Frame Semantics

"Bob, du pain svp!" ("Bob, some bread please!")

slide-11
SLIDE 11

Frame Semantics

Frame name Frame type Frame element role

slide-12
SLIDE 12

Frame Semantics

slide-13
SLIDE 13

Part II: Deep Vision
slide-14
SLIDE 14

Perception and Identification

slide-15
SLIDE 15

Perception and Identification

slide-16
SLIDE 16

Perception and Identification

monitor keyboard mousepad

slide-17
SLIDE 17

Perception and Identification
slide-18
SLIDE 18

Situated Robot Perception

Robot deployments in office environments

The robot visits fixed waypoints on the map, taking full 360° RGB-D scans

slide-19
SLIDE 19

Situated Robot Perception

RGB-D depth segmentation algorithm (Potapova et al., 2014)

Pictures → Point clouds → Patches → Surfaces → Clustering → Filtering → Candidate objects

slide-20
SLIDE 20

Supervised Object Recognition

CNN Training Prediction CNN

dbr:Keyboard

slide-21
SLIDE 21

Supervised Object Recognition

Convolutional Neural Network model trained on 1,000 object classes

(More detail in Young et al., ICRA 2017)

slide-22
SLIDE 22

21,841 WordNet synsets (1,000 with SIFT features)
14,197,122 images (1,034,908 with bounding box)
http://image-net.org/
http://wordnet-rdf.princeton.edu/

slide-23
SLIDE 23

Supervised Object Recognition

http://image-net.org/

slide-24
SLIDE 24

Performance of Robot Vision

Good but not great

Mugs on ImageNet (training data)
Mugs seen by a robot (validation data)

slide-25
SLIDE 25

Part III: Semantics and Vision
slide-26
SLIDE 26

Object Identification
slide-27
SLIDE 27

Place Classification
slide-28
SLIDE 28

Semantic Relatedness

slide-29
SLIDE 29

Semantic Relatedness

slide-30
SLIDE 30

Semantic Relatedness

slide-31
SLIDE 31

Semantic Relatedness

Washing_machine Ashtray Bathroom 5 2 Bedroom 1 Living_room 1 6

Co-occurrence matrix

slide-32
SLIDE 32

Semantic Relatedness

Washing_machine Ashtray Bathroom 5 2 Bedroom 1 Living_room 1 6

Co-occurrence matrix

Singular value decomposition

$$M = U \Sigma V^{*}$$

Low-rank approximation

$$U_k \Sigma_k V_k^{*} = M_k$$

NASARI: A Novel Approach to a Semantically-Aware Representation of Items (Camacho-Collados, Pilehvar and Navigli, 2015)
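
The low-rank approximation on this slide can be sketched with NumPy's SVD. This is a minimal illustration and not the pipeline used to build NASARI: the function name `low_rank_approximation` and the small room-by-object count matrix are made up for the example.

```python
import numpy as np


def low_rank_approximation(M, k):
    """Return M_k = U_k Σ_k V_k*, the rank-k truncated SVD of M."""
    U, s, Vh = np.linalg.svd(M, full_matrices=False)
    if not 1 <= k <= len(s):
        raise ValueError("k must be between 1 and the number of singular values")
    # Keep only the k largest singular values and their singular vectors.
    return U[:, :k] @ np.diag(s[:k]) @ Vh[:k, :]


# Hypothetical co-occurrence counts (rows: rooms, columns: objects).
# These numbers are illustrative only and are not taken from the slides.
M = np.array([[4.0, 1.0, 0.0],
              [3.0, 0.0, 1.0],
              [0.0, 5.0, 2.0]])

M_1 = low_rank_approximation(M, 1)     # best rank-1 approximation
M_full = low_rank_approximation(M, 3)  # keeping every singular value recovers M
```

Keeping only the largest singular values compresses the sparse count matrix into a dense, lower-dimensional representation, which is the kind of vector the following slides compare with cosine similarity.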

slide-33
SLIDE 33

Semantic Similarity

bn:00008995n Bathroom -0.03750793 0.06731935 -0.02334246 -0.02009913 0.02251291 0.07689607 0.01527985 -0.10780967 0.18232885 0.1234034 -0.0520944 -0.25805958 0.12200121 -0.04875973 -0.03544397 -0.03841146 0.00970973 …

bn:00007365n Washing_machine -0.00911299 0.11549547 -0.04274256 0.03672424 -0.06627292 0.13761881 0.01171631 -0.08721243 0.08270955 0.13095092 -0.00137408 -0.16226186 0.0422162 0.0545828 -0.01007292 0.10094466 -0.05663372 0.09864459 0.10167608 7.534e-05 0.08067719 0.05527394

Cosine similarity:

$$\frac{A \cdot B}{\|A\| \, \|B\|} = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^2} \, \sqrt{\sum_{i=1}^{n} B_i^2}}$$

http://lcl.uniroma1.it/nasari/
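
The cosine similarity on this slide can be written in a few lines of plain Python. The sketch below is only an illustration: the function name `cosine_similarity` is an assumption, and the short vectors are toy values rather than NASARI embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors (not real embeddings): parallel vectors score 1, orthogonal ones 0.
parallel = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

Because the score depends only on the angle between the vectors, two concepts can be compared regardless of the magnitude of their embeddings.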

slide-34
SLIDE 34

Semantic Similarity

Bathroom Ashtray Washing_machine α β

sim(Bathroom, Washing_machine) = cos(α) ≈ 0.71
sim(Bathroom, Ashtray) = cos(β) ≈ 0.37

slide-35
SLIDE 35

Place Classification

= Cosine similarity on NASARI + aggregation, weighting by distance, ...
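
One way to read "cosine similarity + aggregation, weighting by distance" is sketched below. The slide does not spell out the aggregation or the weighting, so the choices here are assumptions made for illustration: a weight of 1 / (1 + distance), a weighted average as the aggregation, the function name `classify_place`, and two-dimensional toy vectors standing in for NASARI embeddings.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def classify_place(observations, room_vectors, object_vectors):
    """Score each room by the distance-weighted mean similarity to observed objects.

    observations: list of (object_label, distance_to_robot) pairs.
    """
    if not observations:
        raise ValueError("at least one observed object is required")
    scores = {}
    for room, room_vec in room_vectors.items():
        weighted_sum = 0.0
        weight_total = 0.0
        for label, distance in observations:
            weight = 1.0 / (1.0 + distance)  # assumed weighting: closer objects count more
            weighted_sum += weight * cosine(room_vec, object_vectors[label])
            weight_total += weight
        scores[room] = weighted_sum / weight_total
    best = max(scores, key=scores.get)
    return best, scores


# Toy vectors, invented for the example.
ROOMS = {"Bathroom": [1.0, 0.1], "Living_room": [0.1, 1.0]}
OBJECTS = {"Washing_machine": [0.9, 0.2], "Ashtray": [0.3, 0.8]}

best, scores = classify_place(
    [("Washing_machine", 1.0), ("Ashtray", 4.0)], ROOMS, OBJECTS
)
```

With these toy values the nearby washing machine outweighs the distant ashtray, so the highest-scoring room is `Bathroom`.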

slide-36
SLIDE 36

Place Classification

0° RGB-D scans

slide-37
SLIDE 37

Distributional Relational Hypothesis

Entity 1 Entity 2 Type A Type B Semantic Relation Semantic Similarity

slide-38
SLIDE 38

Distributional Relational Hypothesis

Entity 1 Entity 2 Object Room isLocatedAt Semantic Similarity

slide-39
SLIDE 39

Distributional Relational Hypothesis

Entity 1 Entity 2 Object Room isLocatedAt Semantic Similarity

Successfully applied to object-location relation extraction (Basile et al., EKAW 2016) and to improving object detection (Young et al., ICRA 2017)

More on this later

slide-40
SLIDE 40

Part IV: More Semantics

slide-41
SLIDE 41

http://dbpedia.org/page/Table_knife

slide-42
SLIDE 42

http://conceptnet.io/c/en/knife

slide-43
SLIDE 43

http://knowrob.org/kb/knowrob.owl

slide-44
SLIDE 44

http://babelnet.org/synset?word=table+knife

slide-45
SLIDE 45

             Taxonomy   Function   Location   Linked Data
DBpedia      ✔          ✘          ✘          ✔
ConceptNet   ✔          ✔          ✔          partly
KnowRob      ✔          ✔          partly     ✘
BabelNet     ✔          ✘          ✘          ✔
DeKO         partly     ✔          ✔          ✔

BN DB CN DK KR

slide-46
SLIDE 46

             Taxonomy   Function   Location   Linked Data
DBpedia      ✔          ✔          ✔          ✔
ConceptNet   ✔          ✔          ✔          ✔
KnowRob      ✔          ✔          ✔          ✔
BabelNet     ✔          ✔          ✔          ✔
DeKO         ✔          ✔          ✔          ✔

BN DB CN DK KR Keyword Linking

slide-47
SLIDE 47

Keyword Linking Methods

  • DBpedia Lookup: the “official” search API of DBpedia
  • String Match (+redirect): try http://dbpedia.org/resource/{KEYWORD}
  • Babelfy: state-of-the-art algorithm for Word Sense Disambiguation/Entity Linking

slide-48
SLIDE 48

Keyword Linking Methods

Vector-based Contextual disambiguation

  • Run String Match on the keywords
  • Split the missed keywords into tokens
  • Run String Match on the tokens
  • Compute the semantic similarity of each token-entity with all the previously recognized entities
  • Select the highest scoring token-entity

e.g., basket_of_banana → dbr:Basket
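
The five steps above can be sketched as follows. This is a hedged illustration, not the implementation from the talk. Several things are assumptions: `string_match` looks labels up in a local set instead of querying DBpedia, keywords are split on underscores, similarities are summed over the entities linked in the first step, and the two-dimensional vectors are toy values.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def string_match(label, known_entities):
    """Stand-in for String Match: capitalise the label and look it up locally."""
    candidate = label[:1].upper() + label[1:]
    return candidate if candidate in known_entities else None


def link_keywords(keywords, known_entities, vectors):
    linked, missed = {}, []
    # Step 1: String Match on the whole keywords.
    for keyword in keywords:
        entity = string_match(keyword, known_entities)
        if entity is None:
            missed.append(keyword)
        else:
            linked[keyword] = entity
    context = list(linked.values())  # entities recognised so far
    for keyword in missed:
        # Steps 2 and 3: split into tokens and String Match each token.
        candidates = [
            entity
            for entity in (string_match(t, known_entities) for t in keyword.split("_"))
            if entity is not None
        ]
        if not candidates or not context:
            continue  # nothing to disambiguate with; leave the keyword unlinked
        # Steps 4 and 5: score each candidate against the context, keep the best.
        linked[keyword] = max(
            candidates,
            key=lambda e: sum(cosine(vectors[e], vectors[c]) for c in context),
        )
    return linked


# Toy data, invented for the example.
KNOWN = {"Table", "Chair", "Basket", "Banana"}
VECTORS = {
    "Table": [1.0, 0.2],
    "Chair": [0.9, 0.3],
    "Basket": [0.8, 0.5],
    "Banana": [0.1, 1.0],
}
result = link_keywords(["table", "chair", "basket_of_banana"], KNOWN, VECTORS)
```

With furniture already recognised as context, the toy vectors place `Basket` closer to it than `Banana`, so `basket_of_banana` is linked to `Basket`, matching the slide's example.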

slide-49
SLIDE 49
slide-50
SLIDE 50

The SUN database

slide-51
SLIDE 51

The SUN database

http://groups.csail.mit.edu/vision/SUN

131,067 images
908 scene categories
313,884 segmented objects
4,479 object categories

After linking:
2,493 objects in DBpedia
679 locations in DBpedia
2,935 object-location relations

slide-52
SLIDE 52

The SUN database

http://groups.csail.mit.edu/vision/SUN

131,067 images
908 scene categories
313,884 segmented objects
4,479 object categories

After linking:
2,493 objects in DBpedia
679 locations in DBpedia
2,935 object-location relations

Yes, this is hard

slide-53
SLIDE 53

Epilogue: Putting it all together

slide-54
SLIDE 54

Default Knowledge about Objects

RDF dataset of common sense knowledge about objects.

Object classification, prototypical location, actions, frames...

Knowledge extracted from parsing, crowdsourcing, distributional semantics, keyword linking

slide-55
SLIDE 55
slide-56
SLIDE 56

Autonomous Learning

Task-level Loop

Keyword Linking
Distributional Relational Hypothesis
...

slide-57
SLIDE 57

Autonomous Learning

Learning-level Loop

Robot perception Data collection Knowledge Building

slide-58
SLIDE 58

Autonomous Learning

Learning-level Loop

Robot perception Validation Data collection Validation Knowledge Building

slide-59
SLIDE 59

Autonomous Learning

Learning-level Loop

Robot perception Validation Data collection Validation Knowledge Building

slide-60
SLIDE 60

Fin (Q/A)

slide-61
SLIDE 61

Links

Me
http://valeriobasile.github.io/

Project
https://project.inria.fr/aloof/

Databases & Resources
http://image-net.org/
http://groups.csail.mit.edu/vision/SUN
http://knowrob.org/kb/knowrob.owl
http://dbpedia.org/page/
http://babelnet.org/
http://conceptnet.io/
http://lcl.uniroma1.it/nasari/