 
Multilingual Knowledge Graph Embeddings for Cross-lingual Knowledge Alignment
Muhao Chen¹, Yingtao Tian², Mohan Yang¹, and Carlo Zaniolo¹
¹University of California, Los Angeles
²Stony Brook University

Outline
• Background
• MTransE: a multilingual knowledge graph embedding model
• Evaluation
• Open Challenges and Future Work

Knowledge Graphs
• Symbolic representation of entities and relations
• Monolingual knowledge: triples (relation facts of entities), e.g. (California, capital city, Sacramento)
• Cross-lingual knowledge: alignment of monolingual knowledge across languages, e.g. (California, capital city, Sacramento) aligned with (カリフォルニア, 首都, サクラメント)

Knowledge Graph Embeddings
• Encode entities as vectors
• Capture
  – Semantic similarity of entities
  – Relational inferences as vector algebra
    France − Paris ≈ capital
    US − USD ≈ currency
    Bach − German ≈ nationality
    …
• Example embeddings
  – Paris (0.036, -0.12, ..., 0.323)
  – capital (0.102, 0.671, …, -0.101)
  – France (0.138, 0.551, …, 0.222)
  – …
• Applications
  – KG completion
  – Relation extraction from text
  – Question answering
[Figure: a knowledge graph (nodes such as Bach, Eisenach, Germany, Male) is encoded into embeddings, which enable the applications.]

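The example vectors on this slide already illustrate the vector-algebra idea: adding the visible components of Paris and capital gives those of France. A minimal check, using only the three components printed on the slide (the elided middle components are left out, not guessed), might look like this:

```python
import numpy as np

# Only the components visible on the slide; elided ones are omitted.
paris = np.array([0.036, -0.12, 0.323])
capital = np.array([0.102, 0.671, -0.101])
france = np.array([0.138, 0.551, 0.222])

# Relational inference as vector algebra: France - Paris ≈ capital,
# i.e. Paris + capital should land (almost) on France.
residual = float(np.linalg.norm(paris + capital - france))
print(f"||Paris + capital - France|| = {residual:.2e}")
```

In a trained model the residual would be small rather than essentially zero; the slide's numbers are an idealized illustration.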
Current KG Embedding Approaches
• Focused on embedding monolingual triples (h, r, t)
• TransE: h + r ≈ t
• Later approaches
  – TransH [Wang et al. 2014]
  – TransR [Lin et al. 2015]
  – TransD [Ji et al. 2015]
  – HolE [Nickel et al. 2016]
  – ComplEx [Trouillon et al. 2016]
  – …
Embedding of monolingual knowledge seems to be well-addressed. What about cross-lingual knowledge?

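The TransE relation h + r ≈ t can be read as a dissimilarity score: the smaller ‖h + r − t‖, the more plausible the triple. The sketch below is only an illustration with random vectors; the dimension, the noise level, and the choice of the L1 norm are assumptions, not values from the talk.

```python
import numpy as np

def transe_score(h, r, t, norm_ord=1):
    """Dissimilarity ||h + r - t||; lower means a more plausible triple."""
    return float(np.linalg.norm(h + r - t, ord=norm_ord))

rng = np.random.default_rng(0)
k = 8  # hypothetical embedding dimension

h = rng.normal(size=k)
r = rng.normal(size=k)
t_true = h + r + 0.01 * rng.normal(size=k)  # tail that nearly satisfies h + r ≈ t
t_rand = rng.normal(size=k)                 # unrelated tail

good = transe_score(h, r, t_true)
bad = transe_score(h, r, t_rand)
print(f"plausible triple: {good:.3f}, corrupted triple: {bad:.3f}")
```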
Emerging challenge
• Existing works do not characterize cross-lingual knowledge
  – Entity inter-lingual links (ILLs): (ambulance --- krankenwagen)
  – Triple-wise alignment (TWA): ((State of California, capital city, Sacramento) --- (カリフォルニア, 首都, サクラメント))
  – Many KGs store such knowledge
Why important?
• Enables multilingual semantic representations
• Benefits cross-lingual NLP
  – Knowledge alignment
  – Machine translation
  – Cross-lingual Q&A
  – …
Difficult to characterize:
• Fewer samples: Cross-lingual knowledge currently accounts for a small portion of each KB
• Larger domains: Cross-lingual knowledge applies on the entire spaces of involved languages
• Incoherence: Language-specific versions of KG are usually incoherent
• Heterogeneity: Applies to both entities and monolingual relations with inconsistent vocabularies

What does MTransE use and enable?
• Corpora: (partially-aligned) multilingual KGs
• Enabling: inferable embeddings of multilingual semantics
• Can be applied to:
  – Knowledge alignment
  – Cross-lingual Q&A
  – Multilingual chat-bots
  – …
[Figure: the English triple (France, Capital, Paris) combined with its Japanese counterpart (フランス, 首都, パリ).]

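MTransE Model Components
• Knowledge model: $S_K = \sum_{L \in \{L_i, L_j\}} \sum_{T \in G_L} \|\mathbf{h} + \mathbf{r} - \mathbf{t}\|$, where $T = (h, r, t)$
• Alignment model: $S_A = \sum_{(T, T') \in \delta(L_i, L_j)} S_a(T, T')$, summed over all aligned triples
• Objective of learning
  – Minimizing $J(\theta) = S_K + \alpha S_A$
[Figure: a triple (h, r, t) in Space L_1 and its counterpart in Space L_2; the knowledge model acts within each space, the alignment model across the two.]
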
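To make the objective concrete, the sketch below evaluates J(θ) = S_K + α·S_A on a toy pair of graphs. It is not the authors' implementation: the dictionary layout, the Euclidean norm, the value of α, and the pluggable `score` callable (here a simple distance between aligned heads and tails) are all assumptions chosen for illustration, and no training loop is shown.

```python
import numpy as np

def s_k(graphs, ent, rel):
    """Knowledge model: sum of ||h + r - t|| over the triples of every language."""
    return sum(
        np.linalg.norm(ent[lang][h] + rel[lang][r] - ent[lang][t])
        for lang, triples in graphs.items()
        for h, r, t in triples
    )

def s_a(aligned, ent, rel, li, lj, score):
    """Alignment model: sum of score(T, T') over all aligned triple pairs."""
    total = 0.0
    for (h, r, t), (h2, r2, t2) in aligned:
        src = (ent[li][h], rel[li][r], ent[li][t])
        tgt = (ent[lj][h2], rel[lj][r2], ent[lj][t2])
        total += score(src, tgt)
    return total

def objective(graphs, aligned, ent, rel, li, lj, score, alpha):
    """J = S_K + alpha * S_A."""
    return s_k(graphs, ent, rel) + alpha * s_a(aligned, ent, rel, li, lj, score)

def distance_score(src, tgt):
    """Hypothetical alignment score: distance between heads plus distance between tails."""
    (h, _, t), (h2, _, t2) = src, tgt
    return float(np.linalg.norm(h - h2) + np.linalg.norm(t - t2))

rng = np.random.default_rng(0)
k = 4  # hypothetical embedding dimension
ent = {"en": rng.normal(size=(2, k)), "fr": rng.normal(size=(2, k))}
rel = {"en": rng.normal(size=(1, k)), "fr": rng.normal(size=(1, k))}
graphs = {"en": [(0, 0, 1)], "fr": [(0, 0, 1)]}  # one triple per language, as index tuples
aligned = [((0, 0, 1), (0, 0, 1))]               # the two triples are counterparts

J = objective(graphs, aligned, ent, rel, "en", "fr", distance_score, alpha=5.0)
print(f"J = {J:.3f}")
```

Passing the alignment score as a callable keeps the knowledge model fixed while different alignment models are swapped in.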
Different alignment techniques
• Axis calibration
  – Cross-lingual counterparts have close embeddings
• Translation vectors
  – Encoding cross-lingual transitions just like monolingual relations
• Linear transformations
  – Transformations ($M_{ij}$) across embedding spaces of different languages
[Figure: for each technique, Space L_i is related to Space L_j.]

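Alignment Scores and Five Model Variants
• Var i combines the i-th alignment model with the knowledge model
• Axis calibration
  – Var 1: $S_{a_1} = \|\mathbf{h} - \mathbf{h}'\| + \|\mathbf{t} - \mathbf{t}'\|$
  – Var 2: $S_{a_2} = \|\mathbf{h} - \mathbf{h}'\| + \|\mathbf{r} - \mathbf{r}'\| + \|\mathbf{t} - \mathbf{t}'\|$
• Translation vectors
  – Var 3: $S_{a_3} = \|\mathbf{h} + \mathbf{v}^e_{ij} - \mathbf{h}'\| + \|\mathbf{r} + \mathbf{v}^r_{ij} - \mathbf{r}'\| + \|\mathbf{t} + \mathbf{v}^e_{ij} - \mathbf{t}'\|$
  – Remark: $\mathbf{v}^e_{ij} = -\mathbf{v}^e_{ji}$, $\mathbf{v}^r_{ij} = -\mathbf{v}^r_{ji}$
• Linear transforms
  – Var 4: $S_{a_4} = \|\mathbf{M}^e_{ij}\mathbf{h} - \mathbf{h}'\| + \|\mathbf{M}^e_{ij}\mathbf{t} - \mathbf{t}'\|$
  – Var 5: $S_{a_5} = \|\mathbf{M}^e_{ij}\mathbf{h} - \mathbf{h}'\| + \|\mathbf{M}^r_{ij}\mathbf{r} - \mathbf{r}'\| + \|\mathbf{M}^e_{ij}\mathbf{t} - \mathbf{t}'\|$
  – Remark: $\mathbf{M}^e_{ij} \in \mathbb{R}^{k \times k}$, $\mathbf{M}^r_{ij} \in \mathbb{R}^{k \times k}$
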
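The five alignment scores can be sketched as small functions. This is an illustration under assumptions rather than the authors' code: the Euclidean norm, the function names, and the toy data (a second "language" built by rotating the first with a random orthogonal matrix) are all hypothetical.

```python
import numpy as np

def norm(x):
    return float(np.linalg.norm(x))

# Axis calibration: counterparts should simply be close.
def var1(h, r, t, h2, r2, t2):
    return norm(h - h2) + norm(t - t2)

def var2(h, r, t, h2, r2, t2):
    return norm(h - h2) + norm(r - r2) + norm(t - t2)

# Translation vectors: v_e shifts entities, v_r shifts relations.
def var3(h, r, t, h2, r2, t2, v_e, v_r):
    return norm(h + v_e - h2) + norm(r + v_r - r2) + norm(t + v_e - t2)

# Linear transforms: M_e maps entities, M_r maps relations.
def var4(h, r, t, h2, r2, t2, M_e):
    return norm(M_e @ h - h2) + norm(M_e @ t - t2)

def var5(h, r, t, h2, r2, t2, M_e, M_r):
    return norm(M_e @ h - h2) + norm(M_r @ r - r2) + norm(M_e @ t - t2)

rng = np.random.default_rng(0)
k = 6  # hypothetical embedding dimension
h, r, t = rng.normal(size=(3, k))

# Toy target space: a rotated copy of the source space.
Q = np.linalg.qr(rng.normal(size=(k, k)))[0]
h2, r2, t2 = Q @ h, Q @ r, Q @ t

scores = {
    "var1": var1(h, r, t, h2, r2, t2),
    "var2": var2(h, r, t, h2, r2, t2),
    "var3": var3(h, r, t, h2, r2, t2, np.zeros(k), np.zeros(k)),
    "var4": var4(h, r, t, h2, r2, t2, Q),
    "var5": var5(h, r, t, h2, r2, t2, Q, Q),
}
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

On this toy data only the linear-transform scores vanish, because the target space is a rotation of the source space and neither closeness nor a zero shift can undo it. That is a property of the constructed example, not a claim about real KGs.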
Experimental Evaluation
• Cross-lingual knowledge alignment tasks
  – Entity Matching
  – Triple-wise Alignment (TWA) Verification
• Monolingual relation extraction task
• Trilingual data sets
  – Wiki-based (WK3l-15k, WK3l-120k)
  – ConceptNet-based (CN3l)
• Baselines
  – LM [Mikolov et al. 2013] + Knowledge models
  – CCA [Faruqui and Dyer 2014] + Knowledge models
  – OT [Xing et al. 2015] + Knowledge models
These three data sets are available at https://github.com/muhaochen/MTransE

Entity Matching
What is the German entity for the English entity “Regulation of Property”?
• Evaluation protocol
  – For each (e, e’), rank e’ in the neighborhood of $\tau(e)$
• Training sets
  – Pairs of language-specific graphs and corresponding alignment sets
• Test data
  – Entity inter-lingual links {(e, e’)} (unidirectional)

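The evaluation protocol can be sketched as follows: map each source entity e into the target space, rank all target entities by distance to that point, and record where the true counterpart e’ lands. The code is a toy illustration under assumptions. The transition is taken to be a known linear map (as in the linear-transform variants), the data are synthetic, and the helper names are hypothetical.

```python
import numpy as np

def rank_of_counterpart(projected, target_entities, gold_index):
    """1-based rank of the gold target entity, ordered by distance to `projected`."""
    dists = np.linalg.norm(target_entities - projected, axis=1)
    return int(np.sum(dists < dists[gold_index])) + 1

def evaluate(src, tgt, links, transition, k=10):
    """Return (Hits@k, mean rank) over the inter-lingual links (e, e')."""
    ranks = np.array(
        [rank_of_counterpart(transition(src[e]), tgt, e2) for e, e2 in links]
    )
    return float(np.mean(ranks <= k)), float(np.mean(ranks))

rng = np.random.default_rng(0)
n, dim = 50, 8  # hypothetical number of entities and embedding dimension

src = rng.normal(size=(n, dim))
M = np.linalg.qr(rng.normal(size=(dim, dim)))[0]    # stand-in for a learned transform
tgt = src @ M.T + 0.01 * rng.normal(size=(n, dim))  # noisy counterparts in the target space
links = [(i, i) for i in range(n)]                  # entity i in src aligns with entity i in tgt

hits10, mean_rank = evaluate(src, tgt, links, lambda v: M @ v)
print(f"Hits@10 = {hits10:.2f}, mean rank = {mean_rank:.2f}")
```

Hits@10 rewards counterparts that appear among the ten nearest neighbors, while the mean rank is sensitive to the few links that are ranked very badly.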
Entity Matching
[Figure: Hits@10 on WK3l-15k, WK3l-120k, and CN3l, and mean rank (log scale) on WK3l-15k and CN3l, for the En-Fr, Fr-En, En-De, and De-En directions. Methods compared: LM, CCA, OT, Var1, Var2, Var3, Var4, Var5.]
• Var 4 ≈ Var 5 > Var 1 ≈ Var 3 ≈ OT > Var 2 ≫ CCA > LM
• Axis calibration: Var 1, Var 2
• Translation vectors: Var 3
• Linear transforms: Var 4, Var 5

Triple-wise Alignment Verification
[Figure: accuracy of TWA verification (0 to 100) for LM, CCA, OT, Var1, Var2, Var3, Var4, Var5.]
• Var 4 ≈ Var 5 > Var 1 > Var 2 > Var 3 ≈ OT ≫ CCA > LM
• We receive similar evaluation conclusions in all settings.
• Axis calibration: Var 1, Var 2
• Translation vectors: Var 3
• Linear transforms: Var 4, Var 5

Monolingual Relation Extraction (English, French)
• Train/Test
  – Train sets: 90% triples and intersecting alignment sets
  – Test sets: 10% triples
• MTransE preserves well the monolingual relations
[Figure: Hits@10 for predicting missing tails and for predicting missing relations on WK3l-15k/EN, WK3l-15k/FR, WK3l-120k/EN, and WK3l-120k/FR. Methods compared: TransE, Var1, Var2, Var3, Var4, Var5.]
• Axis calibration: Var 1, Var 2
• Translation vectors: Var 3
• Linear transforms: Var 4, Var 5

Applications based on MTransE
• Multilingual Q&A
• Cross-lingual relation prediction
• Improving monolingual KG completion using multilingual correlation
• Knowledge alignment across knowledge bases

Examples of Cross-lingual Question Answering
Bold-faced ones are correct answers, italic ones are close answers.

Improve the embedding model
• Other forms of knowledge models and alignment models
  – Neural knowledge models such as HolE and ComplEx
  – Other alignment models such as affine transformations
  – Alignment models which consider disambiguation
• Encoding more information from multilingual KGs
  – Entity domains, class templates, entity descriptions, etc.
  – Cross-lingual disambiguation
• Jointly embedding with other forms of corpora such as multilingual documents

References
• [Bordes et al., 2013] Antoine Bordes, Nicolas Usunier, Alberto Garcia-Duran, Jason Weston, and Oksana Yakhnenko. Translating embeddings for modeling multi-relational data. In NIPS, pages 2787–2795, 2013.
• [Nickel et al., 2016] Maximilian Nickel, Lorenzo Rosasco, Tomaso Poggio, et al. Holographic embeddings of knowledge graphs. In AAAI, 2016.
• [Saxe et al., 2014] Andrew M Saxe, James L McClelland, and Surya Ganguli. Exact solutions to the nonlinear dynamics of learning in deep linear neural networks. In ICLR, 2014.
• [Wang et al., 2014] Zhen Wang, Jianwen Zhang, Jianlin Feng, and Zheng Chen. Knowledge graph embedding by translating on hyperplanes. In AAAI, 2014.
• [Lin et al., 2015] Yankai Lin, Zhiyuan Liu, Maosong Sun, Yang Liu, and Xuan Zhu. Learning entity and relation embeddings for knowledge graph completion. In AAAI, 2015.
• [Ji et al., 2015] Guoliang Ji, Shizhu He, Liheng Xu, Kang Liu, and Jun Zhao. Knowledge graph embedding via dynamic mapping matrix. In ACL, pages 687–696, 2015.
• [Mikolov et al., 2013] Tomas Mikolov, Quoc V Le, and Ilya Sutskever. Exploiting similarities among languages for machine translation. arXiv, 2013.
• [Faruqui and Dyer, 2014] Manaal Faruqui and Chris Dyer. Improving vector space word representations using multilingual correlation. In EACL, 2014.
• [Xing et al., 2015] Chao Xing, Dong Wang, Chao Liu, and Yiye Lin. Normalized word embedding and orthogonal transform for bilingual word translation. In NAACL HLT, pages 1006–1011, 2015.

Thank You