Plausible reasoning based on qualitative entity embeddings
Steven Schockaert (joint work with Joaquin Derrac, Shoaib Jameel, Thomas Ager)
School of Computer Science & Informatics, Cardiff University, Cardiff, UK
SchockaertS1@cardiff
Plausible inference patterns
Similarity based reasoning
- Given: Mary enjoys hiking in the Alps
- Background knowledge: the Alps are similar to the Pyrenees
- Plausible conclusion: Mary enjoys hiking in the Pyrenees
Category based induction
- Given: ITV is regulated by Ofcom
- Given: BBC is regulated by Ofcom
- Background knowledge: BBC and ITV are representative examples of British broadcasters
- Plausible conclusion: All British broadcasters are regulated by Ofcom
Plausible inference patterns
Interpolation
- Given: Sandwich shops in Wales are required to display food hygiene ratings
- Given: Restaurants in Wales are required to display food hygiene ratings
- Background knowledge: cafes are conceptually between sandwich shops and restaurants
- Plausible conclusion: Cafes in Wales are required to display food hygiene ratings
A fortiori reasoning
- Given: University staff are not permitted to travel in business class
- Background knowledge: the underlying reason is that business class is too expensive
- Background knowledge: first class is more expensive than business class
- Plausible conclusion: University staff are not permitted to travel in first class
Plausible inference patterns
Borderline effects
- Given: The item for sale is original art
- Given: The item for sale is a poster
- Background knowledge: The concepts original art and poster are disjoint
- Background knowledge: limited-edition art print is a borderline case of original art
- Background knowledge: limited-edition art print is a borderline case of poster
- Plausible conclusion: The item for sale is a limited-edition art print
Plausible inference patterns
[Figure: several domain theories and unstructured data, leading to a completed domain theory]
- Interpretable machine learning models
- Ontology based data access
- Recognising textual entailment
Motivation
Supporting plausible inference using entity embeddings
Representing lexical information
Key problem: the meaning of many words cannot be captured by necessary and sufficient conditions (e.g. “game”)
Similarity plays a key role in modelling meaning:
- Wittgenstein: concepts as family resemblances
- Prototype and exemplar theories
This leads to the use of geometric models of meaning
- Neural network embeddings
- Information retrieval: probabilistic topic models
- Conceptual spaces (Gärdenfors)
Conceptual spaces
[Figure: venue types (wine bar, pub, gastropub, fine dining restaurant, bistro, bar, restaurant, cafe, sandwich shop, hotel bar, sports bar) placed in a space with the dimensions “focused on food” and “formal”. Successive builds of the slide highlight what such a space captures:]
- is-a relations
- Similarity
- Representativeness
- Conceptual betweenness
- Conceptual neighbourhood
- Interpretable directions
Inducing a conceptual space of films
[Pipeline figure, stages listed from input to output:]
- POS tagger, chunker (Open NLP project)
- PPMI weighted term co-occurrence vectors
- Classical multi-dimensional scaling
- 100-dimensional Euclidean space
- ...
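The two numerical stages of this pipeline (PPMI weighting and classical multi-dimensional scaling) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the count matrix is made-up toy data, the function names are hypothetical, and the use of angular distance between PPMI vectors as the input to MDS is an assumption that the slides do not confirm.

```python
import numpy as np

def ppmi(counts):
    """Positive pointwise mutual information of an entity-by-term count matrix."""
    counts = np.asarray(counts, dtype=float)
    total = counts.sum()
    row = counts.sum(axis=1, keepdims=True)
    col = counts.sum(axis=0, keepdims=True)
    expected = row @ col / total
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(counts / expected)
    pmi[~np.isfinite(pmi)] = 0.0  # zero counts give log(0); treat them as 0
    return np.maximum(pmi, 0.0)

def angular_distance(vectors):
    """Pairwise angle (in radians) between row vectors (assumed dissimilarity)."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / np.where(norms == 0, 1.0, norms)
    dist = np.arccos(np.clip(unit @ unit.T, -1.0, 1.0))
    np.fill_diagonal(dist, 0.0)
    return dist

def classical_mds(dist, n_dims):
    """Classical (Torgerson) MDS from a symmetric distance matrix."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_dims]
    top_vals = np.clip(eigvals[order], 0.0, None)  # drop negative eigenvalues
    return eigvecs[:, order] * np.sqrt(top_vals)

# Toy data: 4 entities (rows) by 4 terms (columns); rows 0-1 and rows 2-3
# share vocabulary, so they should end up close together.
counts = np.array([[10, 0, 3, 0],
                   [8, 1, 4, 0],
                   [0, 9, 0, 7],
                   [1, 7, 0, 9]])
weights = ppmi(counts)
space = classical_mds(angular_distance(weights), n_dims=2)
```

The slide uses 100 dimensions; the toy example uses 2 only because it has four entities.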
Betweenness
- aliens between star trek and cloverfield
- cast away between titanic and into the wild
- lord of the rings between harry potter and troy
- mission impossible between the rock and skyfall
- star wars between lord of the rings and star trek
- troy between braveheart and thor
- wall-e between monsters inc and 2001: a space odyssey
- good will hunting between dead poets society and rain man
- unbreakable between sin city and the sixth sense
- scarface between sin city and the godfather
- forrest gump between million dollar baby and stand by me
- shrek 2 between wedding crashers and the lion king
Betweenness
- abbey between castle and chapel
- bistro between restaurant and tea room
- butcher shop between marketplace and slaughterhouse
- conservatory between greenhouse and playhouse
- duplex between detached house and triplex
- flower shop between garden center and gift shop
- grocery store between convenience store and farmers market
- manor between castle and mansion house
- rice paddy between bamboo forest and cropland
- sushi restaurant between Japanese restaurant and tapas restaurant
- veterinarian between animal shelter and emergency room
- wine shop between gourmet shop and liquor store
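One simple way to score betweenness triples like these in a vector space is to compare the direct distance from a to c with the detour through b. The sketch below is an illustrative measure chosen for brevity, not necessarily the one used to produce the lists above; the function name and the coordinates are hypothetical.

```python
import numpy as np

def betweenness(a, b, c):
    """Score in [0, 1]; equals 1 exactly when b lies on the segment from a to c."""
    a, b, c = (np.asarray(v, dtype=float) for v in (a, b, c))
    detour = np.linalg.norm(a - b) + np.linalg.norm(b - c)
    direct = np.linalg.norm(a - c)
    return 1.0 if detour == 0 else direct / detour

# Hypothetical 2-D coordinates, for illustration only.
on_segment = betweenness([0, 0], [3, 4], [6, 8])   # b is the midpoint of a and c
off_segment = betweenness([0, 0], [0, 8], [6, 8])  # b forms a right angle with a and c
```

Ranking all candidate entities b by this score for a fixed pair (a, c) gives the entities that are most plausibly "between" them under this measure.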
Direction towards more “violent” films
[Figure: films whose associated text contains the word “violent” versus films whose associated text does not contain the word “violent”]
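A direction of this kind can be approximated from the two groups of films. The sketch below uses the difference between the two class means as the direction, which is a deliberately simple stand-in: the slides do not say which classifier is used, and the function names and toy vectors are hypothetical.

```python
import numpy as np

def direction_from_labels(vectors, has_word):
    """Unit vector pointing from the items without the word towards those with it."""
    vectors = np.asarray(vectors, dtype=float)
    has_word = np.asarray(has_word, dtype=bool)
    diff = vectors[has_word].mean(axis=0) - vectors[~has_word].mean(axis=0)
    return diff / np.linalg.norm(diff)

def rank_along(vectors, direction):
    """Indices of the items sorted from least to most along the direction."""
    return np.argsort(np.asarray(vectors, dtype=float) @ direction)

# Toy film vectors; the last two have "violent" in their associated text.
films = [[0.0, 0.0], [1.0, 0.0], [3.0, 1.0], [4.0, 0.0]]
contains_violent = [False, False, True, True]
direction = direction_from_labels(films, contains_violent)
order = rank_along(films, direction)
```

Projecting every film onto the direction yields a ranking from least to most "violent", including films whose text never uses the word.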
Learning interpretable directions
Blair Witch Project
- ADJ: spooky, scary, scarier, not scary, creepy, pretty creepy
- NOUNS: witch, scary movies, spooky, a horror flick, a horror movie, scares
the Godfather
- ADJ: italian, corrupt, immoral, unsurpassed, absolutely wonderful
- NOUNS: organised crime, the gangsters, the mob, gangsters, the assassination, loyalty
Fight Club
- ADJ: insightful, provocative, disturbing, depressed, depressing
- NOUNS: conformity, society, voyeurism, our society, a dark comedy
Gladiator
- ADJ: epic, historically accurate, historical, lavish, magnificent
- NOUNS: epics, the battle scenes, battle scenes, an epic, the epic
Commonsense classifiers
n = 50
Algorithm   Foursquare       GeoNames         OpenCYC
            Acc.    F1       Acc.    F1       Acc.    F1
Col         0.947   0.717    0.881   0.401    0.956   0.383
BtwA        0.949   0.717    0.883   0.395    0.956   0.373
BtwB        0.943   0.617    0.881   0.349    0.954   0.295
AnalogA     0.921   0.636    0.822   0.330    0.933   0.375
AnalogB     0.940   0.707    0.853   0.347    0.945   0.382
AnalogC     0.925   0.686    0.859   0.411    0.942   0.391
FOIL0       0.926   0.564    0.876   0.201    0.950   0.267
FOIL1       0.925   0.596    0.860   0.272    0.943   0.329
FOIL2       0.926   0.627    0.861   0.285    0.946   0.335
FOIL3       0.928   0.594    0.876   0.300    0.949   0.268
1-NN        0.939   0.710    0.853   0.357    0.945   0.380
C4.5_MDS    0.925   0.534    0.849   0.178    0.941   0.245
C4.5_dir    0.918   0.382    0.849   0.374    0.939   0.262
SVM_MDS     0.932   0.656    0.859   0.343    0.912   0.328
SVM_BoW     0.913   0.358    0.874   0.172    0.946   0.205
Classifier families labelled on the slide:
- interpolation
- analogical classifiers
- a fortiori inference
Open-domain semantic spaces
Idea: learn a semantic space representation for every entity that has a Wikipedia page
Problems:
- Each semantic space should only contain entities of the same type
- Number of dimensions of each space needs to be carefully selected
- Relations between entities of different types can provide valuable information (e.g. films directed by similar directors tend to be similar)
Solution:
- Learn a single vector space that has a subspace for each semantic type
- Use nuclear norm regularization to automatically select the number of dimensions for each of these spaces
- Align subspaces based on known relations between entities
Open-domain semantic spaces
\[ J_{\mathrm{text}}^{E} = \sum_{e_i \in E} \sum_{t_j \in W_{e_i}} f(y_{ji}) \left( p_{e_i} \cdot w_j + b_i + b_j - \log y_{ji} \right)^2 \]
Intuition: similar entities should have similar representations
Adaptation of the GloVe model for word embedding
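The GloVe-style text objective can be evaluated with a few lines of NumPy. The sketch below rests on stated assumptions: `cooc[i, j]` stands for the count y_ji of word t_j in the text associated with entity e_i, the sum runs over non-zero counts only, and f is taken to be the standard GloVe weighting function with its usual defaults (cut-off 100, exponent 0.75). The function and parameter names are hypothetical.

```python
import numpy as np

def glove_weight(y, y_max=100.0, alpha=0.75):
    """GloVe weighting function f: damps rare pairs and caps frequent ones at 1."""
    y = np.asarray(y, dtype=float)
    return np.where(y < y_max, (y / y_max) ** alpha, 1.0)

def text_loss(entity_vecs, word_vecs, entity_bias, word_bias, cooc):
    """Weighted squared error over all (entity, word) pairs with a non-zero count."""
    rows, cols = np.nonzero(cooc)
    y = cooc[rows, cols].astype(float)
    # Row-wise dot products p_{e_i} . w_j for the selected pairs.
    pred = np.einsum("ij,ij->i", entity_vecs[rows], word_vecs[cols])
    residual = pred + entity_bias[rows] + word_bias[cols] - np.log(y)
    return float(np.sum(glove_weight(y) * residual ** 2))

# Toy check: one entity, one word, count 100, all parameters zero.
loss = text_loss(np.zeros((1, 3)), np.zeros((1, 3)),
                 np.zeros(1), np.zeros(1), np.array([[100]]))
```

With all parameters at zero the only contribution is the squared log count, which makes the toy value easy to verify by hand.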
Open-domain semantic spaces
All entities of the same semantic type should be located in some subspace:
\[ J_{\mathrm{type}} = \sum_{s \in S} \sum_{e \in E_s} \Big\| p_e - \sum_{j=0}^{n} \lambda_j^{e,s} \, p_j^{s} \Big\|^2 \]
This subspace should be low-dimensional:
\[ J_{\mathrm{reg}}^{1} = \sum_{s \in S} \| M_s \|_{*} \]
All entities that are in a given relation with another entity should be in a low-dimensional subspace ...
\[ J_{\mathrm{rel}}^{\mathrm{dim}} = \sum_{k \in R} \sum_{p \in P_{e,k}} \Big\| p - \sum_{j=0}^{n} \mu_j^{e,k} \, q_j^{e,k} \Big\|^2 + \sum_{p \in P_{k,f}} \Big\| p - \sum_{j=0}^{n} \mu_j^{k,f} \, q_j^{k,f} \Big\|^2 \]
... and close to each other:
\[ J_{\mathrm{rel}}^{\mathrm{dist}} = \sum_{f \in \mathrm{rhs}(e,k)} d(p_f, p_e + r_k)^2 + \sum_{e \in \mathrm{rhs}(k,f)} d(p_e, p_f - r_k)^2 \]
Note: TransE and TransH as special cases
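The nuclear norm used in the regulariser is the sum of the singular values of a matrix, so penalising it pushes the matrix towards low rank. The sketch below illustrates that property on two toy matrices with the same Frobenius norm; it assumes that M_s collects the vectors spanning the subspace for type s, which the slides do not spell out, and the function name is hypothetical.

```python
import numpy as np

def nuclear_norm(matrix):
    """Sum of singular values; a convex surrogate for the rank of the matrix."""
    return float(np.linalg.svd(matrix, compute_uv=False).sum())

# Two matrices with the same Frobenius norm (2.0) but different rank.
rank_one = np.diag([2.0, 0.0])
rank_two = np.diag([np.sqrt(2.0), np.sqrt(2.0)])

low = nuclear_norm(rank_one)    # a single singular value of 2
high = nuclear_norm(rank_two)   # two singular values of sqrt(2) each
```

Because the rank-one matrix has the smaller nuclear norm, minimising this term favours subspaces with few effective dimensions, which is how the number of dimensions per semantic type can be selected automatically.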
Open-domain semantic spaces
Semantic Type     Number of Entities   NN-Dimensions
human             191211               288
railway station   4120                 121
house             2762                 136
organization      1379                 88
national park     1307                 56
building          1269                 52
food              1155                 55
college           858                  33
automobile        31                   12
candy             10                   2
Open-domain semantic spaces
Lowest ranked entities
Population    Inception            Date of Birth
Malta         General Electric     Valmiki
Bermuda       IBM                  Jesus Christ
Monaco        Hewlett Packard      Cleopatra
San Marino    Microsoft            Ptolemy
Barbados      Oracle Corporation   Plato
Highest ranked entities
Population     Inception          Date of Birth
China          Alphabet Inc.      Prince George of Cambridge
India          Tencent Holdings   Isabela Moner
USA            Facebook, Inc.     Justin Bieber
Soviet Union   Uber               Lionel Messi
Brazil         Amazon.com         Kim Kardashian
Tasks: Ranking, Induction, Analogy
Method             ρ       MAP     P@5     MRR     Acc.
Skip-gram          0.155   0.176   0.356   0.505   0.184
CBOW               0.159   0.182   0.350   0.500   0.213
RESCAL             0.081   0.020   0.189   0.423   0.371
TransE             0.110   0.060   0.200   0.451   0.382
TransH             0.142   0.072   0.210   0.415   0.382
TransR             0.100   0.102   0.302   0.489   0.378
CTransR            0.122   0.132   0.323   0.499   0.402
pTransE (anch)     0.099   0.101   0.301   0.488   0.476
pTransE (art)      0.202   0.218   0.475   0.751   0.512
pTransE (full)     0.213   0.224   0.490   0.756   0.532
EECS (full)        0.319   0.231   0.609   0.883   0.591
EECS (no rel)      0.301   0.229   0.588   0.868   0.552
EECS (no type)     0.266   0.225   0.585   0.854   0.549
EECS (no NN)       0.258   0.220   0.581   0.843   0.545
EECS (text)        0.254   0.218   0.579   0.831   0.540
EECS (type-comb)   0.312   0.231   0.601   0.883   0.595
EECS (type-dist)   0.295   0.231   0.585   0.858   0.550
EECS (rel-dim)     0.309   0.225   0.585   0.859   0.551
EECS (rel-dist)    0.299   0.225   0.585   0.855   0.549
Learning rules using autoencoders
- Input space obtained using MDS
- Lower-dimensional space capturing more general properties
[Figure: directions labelled sci-fi, dumb and cheesy]
Example rule: IF sci-fi and dumb THEN cheesy
Qualitative entity embeddings
Limitations of learned semantic spaces
Vector/point representations can only provide a coarse approximation of the meaning of a concept
- Regions are needed to model the diversity of a concept: while some restaurants are similar to ice cream shops, some are very different
- Regions are needed to model is-a relationships, conceptual neighbourhood, disjointness, overlap, typicality/borderline effects
Faithful vector space representations require a sufficient amount of co-occurrence data, which may not be available for rare terms
Learning faithful region representations from data seems only feasible in low-dimensional spaces, as the number of vertices is typically exponential in the number of dimensions
Qualitative semantic representations
- pizzeria is-a italian restaurant
- italian restaurant is-a restaurant
- restaurant is-a venue
- ice cream shop is-a shop
- shop is-a venue
- dessert restaurant is-a restaurant
[Figure: taxonomy tree over venue, restaurant, shop, italian restaurant, ice cream shop, dessert restaurant and pizzeria]
Strengths:
- Easy to use in a variety of KR and NLP tasks
- Easy to understand/correct/extend by human domain experts
- Can be automatically learned/refined from text collections
Limitations:
- Is-a relationships often not sufficient
- Often many ways to structure the terms of a given domain
- Often too shallow to allow for reliable inductive inferences
Region connection calculus
[Figure: each relation is illustrated with two regions a and b]
- a DC b: disconnected
- a EC b: externally connected
- a PO b: partially overlapping
- a EQ b: equal
- a TPP b (equivalently b TPP-1 a): tangential proper part
- a NTPP b (equivalently b NTPP-1 a): non-tangential proper part
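The eight base relations can be made concrete by computing them for a simple family of regions. The sketch below does so for closed disks with positive radius; it is an illustration only (regions in a semantic space need not be disks), the function name is hypothetical, and the exact equality tests would need a tolerance for non-integer input.

```python
from math import hypot

def rcc8_disks(a, b):
    """RCC8 relation between two closed disks given as (x, y, radius)."""
    (ax, ay, ar), (bx, by, br) = a, b
    d = hypot(ax - bx, ay - by)  # distance between the centres
    if d > ar + br:
        return "DC"       # no shared point
    if d == ar + br:
        return "EC"       # boundaries touch from the outside
    if d == 0 and ar == br:
        return "EQ"
    if d + ar < br:
        return "NTPP"     # a strictly inside b
    if d + ar == br:
        return "TPP"      # a inside b, touching b's boundary
    if d + br < ar:
        return "NTPP-1"   # b strictly inside a
    if d + br == ar:
        return "TPP-1"    # b inside a, touching a's boundary
    return "PO"           # interiors overlap, neither contains the other

relations = [
    rcc8_disks((0, 0, 1), (5, 0, 1)),
    rcc8_disks((0, 0, 2), (5, 0, 3)),
    rcc8_disks((0, 0, 3), (4, 0, 3)),
    rcc8_disks((0, 0, 1), (0, 0, 1)),
    rcc8_disks((1, 0, 1), (0, 0, 2)),
    rcc8_disks((0, 0, 1), (0, 0, 3)),
    rcc8_disks((0, 0, 2), (1, 0, 1)),
    rcc8_disks((0, 0, 3), (0, 0, 1)),
]
```

The eight calls are chosen so that each base relation occurs exactly once.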
Qualitative semantic representations
[Figure: the is-a taxonomy over venue, restaurant, shop, italian restaurant, ice cream shop, dessert restaurant and pizzeria, redrawn as a configuration of regions]
- pizzeria is a part of italian restaurant
- shop is a part of venue
- ...
- ice cream shop is adjacent to dessert restaurant
- ice cream shop is disjoint from pizzeria
Betweenness
[Figure: regions A, B, C, D and E, together with the betweenness region A⋈B]
Directions
[Figure: regions A, B, C, D and E with two directions d1 and d2; labelled projections A↓d1, C↓d1, D↓d1, D↓d2 and B↓d2]
Directions
[Figure: ice cream shop and Michelin star restaurant compared along the directions formal, cheap, trendy and healthy]
Strengths:
- Compact and easy-to-understand representation
- Possible to encode (and efficient to reason with) incomplete information (e.g. from text)
Limitations:
- Betweenness, analogy and adjacency can only be approximately represented
- Similarity estimates possibly less accurate
Directions
Modelling vagueness/typicality effects
[Figure: Michelin star restaurant along the directions formal, cheap, trendy and healthy]
Directions
Modelling context effects
[Figure: the directions formal, cheap, trendy and healthy]
Extracting qualitative representations
“Economy class is cheaper than Business Class but at what cost to the employee?”
https://companytravel.wordpress.com/category/business-travel/
“Business class is cheaper than first class, but they are definitely worth the fare.”
http://ezinearticles.com/?What-You-Need-To-Know-Flying-Business-Class&id=8963765
Extracting qualitative representations
“Mediterraneo bar is something between a bar and a club”
http://www.sitges-tourist-guide.com/en/bars/sitges-bars.html
“A raga is something between a scale and a composition, it is richer then a scale, but not as fixed as a composition.”
http://www.jazzguitar.be/exotic_guitar_scales.html
“Brunch is a combination of breakfast and lunch eaten usually during the late morning ...”
http://en.wikipedia.org/wiki/Brunch
“A phablet is a cross between a smartphone and a tablet, it’s bigger than a phone but smaller than a tablet.”
https://www.tigermobiles.com/2014/07/bad-eyesight-smartphones
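Sentences like the ones quoted on these slides can be mined with simple lexical patterns. The sketch below is a hypothetical illustration, not the extraction method behind the talk: the two regular expressions and the `extract` function are invented for this example, and the betweenness pattern only handles single-word concepts on the right-hand side.

```python
import re

# "X is something between / a cross between / a combination of Y and Z"
BETWEEN = re.compile(
    r"^(?:an? )?(?P<x>.+?) is "
    r"(?:something between|a cross between|a combination of) "
    r"(?:an? )?(?P<y>\w+) and (?:an? )?(?P<z>\w+)",
    re.IGNORECASE,
)
# "X is <comparative> than Y", with Y ending at a comma, " but ", or the sentence end
COMPARATIVE = re.compile(
    r"^(?P<x>.+?) is (?P<adj>\w+er) than (?P<y>.+?)(?=,| but |[.?!]|$)",
    re.IGNORECASE,
)

def extract(sentence):
    """Return a qualitative relation found in the sentence, or None."""
    match = BETWEEN.search(sentence)
    if match:
        return ("between", match["x"], match["y"], match["z"])
    match = COMPARATIVE.search(sentence)
    if match:
        return (match["adj"].lower(), match["x"], match["y"])
    return None

phablet = extract("A phablet is a cross between a smartphone and a tablet, "
                  "it's bigger than a phone but smaller than a tablet.")
economy = extract("Economy class is cheaper than Business Class "
                  "but at what cost to the employee?")
```

The first call yields a betweenness triple and the second a direction-style comparison, which are the two kinds of qualitative relation illustrated by the quoted sentences.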
Conclusions
- Plausible reasoning, based on lexical background knowledge, can be used to fill in the gaps or resolve inconsistencies in imperfect knowledge bases
- The required lexical information can be encoded as (mostly) qualitative spatial relations between semantic space representations
- Both point representations and qualitative spatial relations can be obtained from the Web
References
Topics: Qualitative reasoning in semantic spaces; Learning semantic spaces from data; Plausible reasoning based on qualitative lexical relations
- Steven Schockaert, Sanjiang Li. Realizing RCC8 networks using convex regions. Artificial Intelligence, 2015
- Steven Schockaert, Sanjiang Li. Combining RCC5 relations with betweenness information. IJCAI 2013
- Steven Schockaert, Jae Hee Lee. Qualitative reasoning about directions in semantic spaces. IJCAI 2015
- Joaquín Derrac, Steven Schockaert. Inducing semantic relations from conceptual spaces: a data-driven approach to plausible reasoning. Artificial Intelligence, 2015
- Steven Schockaert, Henri Prade. Interpolative and extrapolative reasoning in propositional theories using qualitative knowledge about conceptual spaces. Artificial Intelligence, 2013
- Steven Schockaert, Henri Prade. Solving conflicts in information merging by a flexible interpretation of atomic propositions. Artificial Intelligence, 2011
- Shoaib Jameel, Steven Schockaert. Entity embeddings with conceptual sub-spaces. Under review
- Steven Schockaert, Shoaib Jameel. Plausible reasoning based on qualitative entity embeddings. IJCAI, 2016