An Interpretable Knowledge Transfer Model for Knowledge Base Completion
Qizhe Xie, Xuezhe Ma, Zihang Dai, Eduard Hovy
Carnegie Mellon University Language Technologies Institute
August 2, 2017
◮ Given lots of triples such as
◮ Predict missing facts (Leonardo DiCaprio, Profession, ?)
[Figure: two panels showing Frequency and Log(Frequency) for each Relation]
◮ On WN18, the rarer the relation is, the greater the improvements
◮ Reverse relations, undirected relations and similar relations are
◮ On FB15k, the number of parameters can be reduced to 1/90 of
◮ Training data: (h = Leonardo DiCaprio, r = won award, t = Oscar)
◮ Test data: (h = Leonardo DiCaprio, r = Profession, t = ?)
◮ TransE [Bordes et al., 2013]:
◮ STransE [Nguyen et al., 2016]:
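The two baseline scoring functions named above can be sketched as follows. This is a minimal illustration, assuming the commonly published forms: TransE scores a triple by the norm of h + r - t, and STransE first projects head and tail with two relation-specific matrices before applying the same translation. The function names, the choice of the L1 norm, and the toy vectors are assumptions made for this example; they are not taken from the slides.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def l1_norm(vector):
    """Sum of absolute values; lower scores mean more plausible triples."""
    return sum(abs(x) for x in vector)


def transe_score(h, r, t):
    """TransE: || h + r - t ||_1."""
    return l1_norm([hi + ri - ti for hi, ri, ti in zip(h, r, t)])


def stranse_score(h, r, t, w_head, w_tail):
    """STransE: || W_head h + r - W_tail t ||_1 with per-relation matrices."""
    projected_h = matvec(w_head, h)
    projected_t = matvec(w_tail, t)
    return l1_norm([a + b - c for a, b, c in zip(projected_h, r, projected_t)])


# Toy 2-dimensional embeddings (illustrative values only).
identity = [[1.0, 0.0], [0.0, 1.0]]
h, r, t = [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]

print(transe_score(h, r, t))
print(stranse_score(h, r, t, identity, identity))
```

With identity projection matrices, the STransE score reduces to the TransE score, which is a quick way to sanity-check an implementation.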
◮ Transfer learning can be done effectively through parameter
◮ We can interpret similar relations
◮ All parameters are trainable by SGD
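One way to picture the sharing described in these bullets is a score in which every relation builds its head and tail projections as attention-weighted sums over a common pool of concept matrices. The sketch below assumes that form; `combine_concepts`, `shared_score`, the L1 norm, and all toy values are hypothetical names and numbers chosen for illustration.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def combine_concepts(attention, concepts):
    """Attention-weighted sum of the shared concept matrices."""
    rows, cols = len(concepts[0]), len(concepts[0][0])
    return [
        [sum(a * c[i][j] for a, c in zip(attention, concepts)) for j in range(cols)]
        for i in range(rows)
    ]


def shared_score(h, r, t, att_head, att_tail, concepts):
    """|| (att_head . D) h + r - (att_tail . D) t ||_1 over shared concepts D."""
    projected_h = matvec(combine_concepts(att_head, concepts), h)
    projected_t = matvec(combine_concepts(att_tail, concepts), t)
    return sum(abs(a + b - c) for a, b, c in zip(projected_h, r, projected_t))


# Two shared concept matrices: identity and a coordinate swap (toy values).
concepts = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
# This relation attends to concept 0 for heads and concept 1 for tails.
score = shared_score(
    h=[1.0, 2.0], r=[1.0, -1.0], t=[2.0, 3.0],
    att_head=[1.0, 0.0], att_tail=[0.0, 1.0],
    concepts=concepts,
)
print(score)
```

Because the concept matrices are shared, two relations with similar attention vectors use nearly the same projections, which is what makes the attention weights readable as a similarity signal between relations.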
◮ Uninterpretable: In practice, even with ℓ1 regularization, we get a distributed
◮ Inefficient: Computation involves all concept matrices
◮ Unnecessary: Intuitively, each relation can be composed of at most K concepts
◮ Given current embeddings, a correct mapping should minimize the loss function
◮ For each relation, assign a single concept to the relation and compute the loss
◮ Greedily choose the top K concepts that minimize the loss
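The greedy assignment in these bullets can be sketched as a small selection routine. It assumes a hypothetical callable `single_concept_loss(c)` that returns the training loss of one relation when that relation uses only concept `c`; how the loss is computed is not specified here, so the toy loss table is purely illustrative.

```python
def select_top_k_concepts(single_concept_loss, num_concepts, k):
    """Return the indices of the k concepts with the lowest single-concept loss."""
    scored = [(single_concept_loss(c), c) for c in range(num_concepts)]
    scored.sort()  # ascending loss; ties are broken by the lower concept index
    return sorted(c for _, c in scored[:k])


# Toy losses for one relation over four concepts (made-up numbers).
toy_losses = {0: 0.9, 1: 0.2, 2: 0.5, 3: 0.1}
chosen = select_top_k_concepts(toy_losses.__getitem__, num_concepts=4, k=2)
print(chosen)
```

The routine scores each concept independently rather than searching over all K-subsets, which keeps the cost linear in the number of concepts per relation.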
◮ Optimize embeddings and attention weights with SGD
◮ Reassign mappings
◮ Uniform negative sample: (Steve Jobs, was born in, CMU)
◮ Domain negative sample: (Steve Jobs, was born in, China)
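The difference between the two sampling strategies can be sketched as follows, assuming that a "domain" negative replaces the tail with another entity that has appeared as a tail of the same relation, while a uniform negative draws from all entities. The function names, the toy entity lists, and the decision to corrupt only the tail are assumptions for this example.

```python
import random


def uniform_negative(triple, entities, rng):
    """Corrupt the tail with any other entity in the knowledge base."""
    h, r, t = triple
    candidates = [e for e in entities if e != t]
    return (h, r, rng.choice(candidates))


def domain_negative(triple, tails_by_relation, rng):
    """Corrupt the tail with another entity seen as a tail of the same relation."""
    h, r, t = triple
    candidates = [e for e in tails_by_relation[r] if e != t]
    return (h, r, rng.choice(candidates))


rng = random.Random(0)
entities = ["USA", "China", "CMU", "Apple"]
tails_by_relation = {"was born in": ["USA", "China"]}
positive = ("Steve Jobs", "was born in", "USA")

uniform_sample = uniform_negative(positive, entities, rng)
domain_sample = domain_negative(positive, tails_by_relation, rng)
print(uniform_sample)
print(domain_sample)
```

Domain negatives are harder to tell apart from true triples because the corrupted tail has the right type; this sketch does not filter out corrupted triples that happen to be true, which a full pipeline would usually do.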
Model                               | Additional Information | WN18 Mean Rank | WN18 Hits@10 | FB15k Mean Rank | FB15k Hits@10
------------------------------------+------------------------+----------------+--------------+-----------------+--------------
SE [Bordes et al., 2011]            | No                     | 985            | 80.5         | 162             | 39.8
Unstructured [Bordes et al., 2014]  | No                     | 304            | 38.2         | 979             | 6.3
TransE [Bordes et al., 2013]        | No                     | 251            | 89.2         | 125             | 47.1
TransH [Wang et al., 2014]          | No                     | 303            | 86.7         | 87              | 64.4
TransR [Lin et al., 2015b]          | No                     | 225            | 92.0         | 77              | 68.7
CTransR [Lin et al., 2015b]         | No                     | 218            | 92.3         | 75              | 70.2
KG2E [He et al., 2015]              | No                     | 348            | 93.2         | 59              | 74.0
TransD [Ji et al., 2015]            | No                     | 212            | 92.2         | 91              | 77.3
TATEC [García-Durán et al., 2016]   | No                     |                |              |                 | 76.7
NTN [Socher et al., 2013]           | No                     |                |              |                 |
DISTMULT [Yang et al., 2015]        | No                     |                |              |                 |
STransE [Nguyen et al., 2016]       | No                     | 206 (244)      | 93.4 (94.7)  | 69              | 79.7
ITransF                             | No                     | 205            | 94.2         | 65              | 81.0
ITransF (domain sampling)           | No                     | 223            | 95.2         | 77              | 81.4
RTransE [García-Durán et al., 2015] | Path                   |                |              |                 | 76.2
PTransE [Lin et al., 2015a]         | Path                   |                |              |                 | 84.6
NLFeat [Toutanova and Chen, 2015]   | Node + Link Features   |                |              |                 |
Random Walk [Wei et al., 2016]      | Path                   |                |              |                 |
IRN [Shen et al., 2016]             | External Memory        | 249            | 95.3         | 38              | 92.7
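For readers unfamiliar with the two metrics in the table, the sketch below shows how they are usually computed from the rank of the correct entity in each test query. It assumes ranks start at 1 and that Hits@10 is reported as a percentage; the function names and the example ranks are made up for illustration.

```python
def mean_rank(ranks):
    """Average rank of the correct entity; lower is better."""
    return sum(ranks) / len(ranks)


def hits_at_k(ranks, k=10):
    """Percentage of test queries whose correct entity ranks in the top k."""
    return 100.0 * sum(1 for rank in ranks if rank <= k) / len(ranks)


# Ranks of the correct entity for four hypothetical test triples.
example_ranks = [1, 3, 12, 250]
print(mean_rank(example_ranks))
print(hits_at_k(example_ranks, k=10))
```

A single badly ranked triple can dominate Mean Rank while leaving Hits@10 unchanged, which is one reason the two metrics can order models differently.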
[Figure: two panels showing Hits@10 by Relation Bin (Frequent, Medium, Rare) for ITransF (ours) and STransE]
◮ Reverse relations: hyponym and hypernym; award winning work and
◮ Undirected relations: spouse; similar to.
◮ Similar relations: was nominated for and won award for; instance hypernym
[Figure: two panels showing Hits@10 against # concepts for ITransF, STransE and CTransR]