Knowledge Graph Embedding and Its Applications

Xiaolong Jin

CAS Key Lab of Network Data Science and Technology, Institute of Computing Technology (ICT), Chinese Academy of Sciences (CAS) 2019‐11‐30@Fudan

Agenda

 Background  Knowledge Graph Embedding (KGE)  Applications of KGE  Conclusions

2


Background

 A Knowledge Graph (KG) is a system that understands facts about people, places, and things, and how these entities are all connected

 Examples

 Dbpedia  YAGO  NELL  Freebase  Wolfram Alpha  Probase  Google KG  ……

3

Background

 Typical applications of KGs
   Vertical search
   Intelligent QA
   Disease diagnosis
   Financial anti‐fraud
   Abnormal data analysis
   Machine translation
   ……

4


Vertical Search

5

Intelligent QA

 IBM’s Watson
 Google’s Google Now
 Apple’s Siri
 Amazon’s Alexa
 Microsoft’s Xiaobing & Cortana
 Baidu’s Dumi (度秘)
 Sogou’s Wangzai (旺仔)
 …

6


Disease Diagnosis

 Watson Care Manager
 Knowledge service platform for Traditional Chinese Medicine (TCM)

 …

7

KG‐based cancer research @ MD Anderson Cancer Center & IBM Watson; TCM knowledge service platform @ Institute of Information on Traditional Chinese Medicine, China Academy of Chinese Medical Sciences

Typical Representation of KGs

 Symbolic triples: (head entity, relation, tail entity)

 e.g.,

(Eiffel Tower, is_located_in, Paris)

(Eiffel Tower, is_a, place)

(Bob, is_a_friend_of, Alice)

8
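The triple representation above maps directly onto a simple data structure. A minimal sketch in Python, using the slide's own example triples (the helper name `tails` is an illustrative assumption, not from the slides):

```python
# A KG as a set of symbolic (head, relation, tail) triples -- toy data from the slide.
triples = {
    ("Eiffel Tower", "is_located_in", "Paris"),
    ("Eiffel Tower", "is_a", "place"),
    ("Bob", "is_a_friend_of", "Alice"),
}

def tails(kg, head, relation):
    """Return all tail entities for a given (head, relation) pair."""
    return {t for (h, r, t) in kg if h == head and r == relation}

print(tails(triples, "Eiffel Tower", "is_located_in"))  # {'Paris'}
```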


Inference over KGs

 Logic‐based models
   Pros: easily interpretable
   Cons: highly complex

 Path ranking algorithms
   Pros: easily interpretable
   Cons:
    Cannot handle rare relations
    Cannot handle KGs with low connectivity
    Extracting paths is time‐consuming

 Embedding‐based methods
   Pros:
    Highly efficient
    Can capture semantic information
   Cons: less interpretable

9

Agenda

 Background  Knowledge Graph Embedding (KGE)  Applications of KGE  Conclusions

10


Knowledge Graph Embedding (KGE)

 Map the entities, relations, and even paths of a KG into a low‐dimensional vector space
   Encode semantic information
   Computationally efficient
 Basic idea
   Treat relations as translation operations between the vectors corresponding to entities

 The score function of TransE (Translational Embeddings):

f(h, r, t) = ‖h + r − t‖

 Loss function:

L = Σ_{(h,r,t)∈S} Σ_{(h′,r,t′)∈S′} [γ + f(h, r, t) − f(h′, r, t′)]₊

where S is the positive triple set, S′ the negative triple set, and γ the margin.

 Intuition: h + r ≈ t, e.g., China + Capital ≈ Beijing, France + Capital ≈ Paris

12
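The translation idea can be sketched in a few lines of numpy. This is an illustration with random, untrained embeddings (the entity names come from the slide; the dimension, seed, and margin value are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 50

# Toy embeddings; in a real model these vectors are learned by gradient descent.
entities = {e: rng.normal(size=dim) for e in ["China", "Beijing", "France", "Paris"]}
relations = {"Capital": rng.normal(size=dim)}

def score(h, r, t):
    """TransE score: distance between h + r and t (lower = more plausible)."""
    return np.linalg.norm(entities[h] + relations[r] - entities[t])

def margin_loss(pos, neg, gamma=1.0):
    """Margin-based ranking loss for one positive and one negative triple."""
    return max(0.0, gamma + score(*pos) - score(*neg))

loss = margin_loss(("China", "Capital", "Beijing"), ("China", "Capital", "Paris"))
```

Training pushes the score of positive triples below that of corrupted (negative) triples by at least the margin γ.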


Trans Series of KGE

 TransE cannot well handle 1‐N, N‐1, or N‐M relations
 Follow‐up models address this:
   TransH
   TransR
   …

13
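TransH's remedy for 1‐N relations is to project entities onto a relation‐specific hyperplane before translating, so entities that differ only along the hyperplane's normal become indistinguishable for that relation. A small sketch (the projection formula follows the published model; the toy vectors are assumptions for illustration):

```python
import numpy as np

def transh_score(h, r, t, w_r):
    """TransH score: project h and t onto the hyperplane with normal w_r,
    then measure the translation distance on that hyperplane."""
    w_r = w_r / np.linalg.norm(w_r)   # unit normal of the relation hyperplane
    h_p = h - np.dot(w_r, h) * w_r    # projection of the head
    t_p = t - np.dot(w_r, t) * w_r    # projection of the tail
    return np.linalg.norm(h_p + r - t_p)

# Two tails that differ only along w score identically, so a 1-N relation
# can hold for both without forcing their embeddings to collapse.
h = np.array([1.0, 1.0]); r = np.array([0.0, 2.0]); w = np.array([1.0, 0.0])
t1 = np.array([0.0, 3.0]); t2 = np.array([5.0, 3.0])
print(transh_score(h, r, t1, w), transh_score(h, r, t2, w))  # both 0.0
```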

Agenda

 Background  Knowledge Graph Embedding (KGE)  Applications of KGE  Conclusions

14


The Applications of KGE

 Basic applications
   Link prediction
   Entity alignment
   KG integration
   …
 Advanced applications
   Vertical search
   Intelligent QA
   Disease diagnosis
   …

KG1 (group 1) ↔ aligned entity pairs ↔ KG2 (group 2)

Shared embedding based neural networks for knowledge graph completion
 S. Guan, X. Jin, Y. Wang, et al. The 27th ACM International Conference on Information and Knowledge Management (CIKM’18)

Application 1: Link Prediction

16


Motivation

 Existing methods for link prediction
   Handle three types of tasks: head, relation, and tail prediction
   Do not distinguish them in training
   These prediction tasks exhibit quite different performance
 Link prediction upon reasoning
   Reasoning is a process that gradually approaches the target
   An FCN (fully connected network) with a decreasing number of hidden nodes can imitate such a process

17

The Proposed Method

 Shared Embedding based Neural Network (SENN)
   Explicitly distinguishes the three prediction tasks
   Integrates them into an FCN‐based framework
 Extend SENN to SENN+
   Uses relation prediction results to improve head and tail prediction

18


The SENN Method

 The framework
   2 shared embedding matrices
   3 substructures: head_pred, rel._pred and tail_pred

19

The Three Substructures

 Head_pred
   The score function, computed by the substructure's FCN with the ReLU activation function

20


The Three Substructures

 Head_pred
   The prediction label vector is obtained by applying the sigmoid or softmax function to the score vector
   Each element indicates the probability that the corresponding entity h forms a valid triple (h, r, t)

21

The Three Substructures

 rel._pred and tail_pred work similarly
   The score functions
   The prediction label vectors

22


Model Training

 The general loss function
   Idea: cross entropy of the prediction and target label vectors
   Each prediction task has its own target label vector; for head prediction, an element is 1 if the corresponding entity is a valid head entity in the training set, given the relation and the tail entity
 Use label smoothing, controlled by a label smoothing parameter, to regularize the target label vectors

23
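The slide does not reproduce the smoothing formula; the common formulation, which the description above matches in spirit, pulls each 0/1 target toward the uniform distribution over all candidates (the parameter name `eps` is an assumption):

```python
import numpy as np

def smooth_labels(y, eps=0.1):
    """Label smoothing: mix the hard 0/1 target vector with a uniform
    distribution over its n entries, weighted by the smoothing parameter eps."""
    n = y.shape[0]
    return (1.0 - eps) * y + eps / n

# Two valid head entities among four candidates.
y = np.array([0.0, 1.0, 0.0, 1.0])
print(smooth_labels(y))  # [0.025 0.925 0.025 0.925]
```

Smoothing keeps the targets away from exactly 0 and 1, which regularizes the cross-entropy loss.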

Model Training

 The general loss function
   Binary cross‐entropy losses for the 3 prediction tasks
   The general loss for the given triple

24


Model Training

 The adaptively weighted loss mechanism
   Predictions on the 1‐side vs. the M‐side: punish the model more severely if the deterministic predictions are wrong
   Relation prediction vs. entity prediction: punish wrong predictions on head/tail entities more severely
 The final loss function for the given triple

25

The SENN+ Method

 Employ relation prediction to improve head and tail prediction in testing
 The relation‐aided test mechanism
   Given a prediction task (?, r, t), assume that a candidate h is a valid head entity
   If we then perform relation prediction on (h, ?, t), r most probably has a prediction label higher than those of other relations, and is thus ranked higher

26


The SENN+ Method

 The relation‐aided test mechanism
   Two additional relation‐aided vectors
   The final prediction label vectors for entity prediction

27

Experiments

 Entity prediction

28


Experiments

 Entity prediction in detail
   The adaptively weighted loss mechanism distinguishes, and learns well, the predictions under different mapping properties

29

Experiments

 Relation prediction
   SENN and SENN+ capture the following information to obtain better performance:
    Implicit information interaction among the different predictions
    Prediction‐specific information

30


NaLP: Link Prediction on N‐ary Relational Data

 S. Guan, X. Jin, Y. Wang, X. Cheng. The 2019 International World Wide Web Conference (WWW 2019)

Application 2: Link Prediction on N‐ary Facts

31

Motivation

 N‐ary facts are pervasive in practice
 Existing link prediction methods usually convert n‐ary facts into a few triples (i.e., binary sub‐facts), which has some drawbacks:
   Many triples need to be considered, which is more complicated
   Some conversions lose structural information, which leads to inaccurate link prediction
   The added virtual entities and triples bring in more parameters to be learned

32


33

Related works

 A few link prediction methods focus on n‐ary facts directly
   m‐TransH (IJCAI‐2016)
    A relation is defined by the mapping from a sequence of roles, corresponding to this type of relation, to their values, e.g.,
     Receive_Award: [person, award, point in time]  [Marie Curie, Nobel Prize in Chemistry, 1911]
     “Marie Curie received the Nobel Prize in Chemistry in 1911.”
    Each specific mapping is an instance of the relation
    Generalizes TransH to n‐ary relational data

34

Related works

 A few more link prediction methods also focus on n‐ary facts directly
   RAE (Relatedness Affiliated Embedding, WWW‐2018)
    Improves m‐TransH by further considering the relatedness of values
    Ignores the roles in the above process
    Under different sequences of roles, the relatedness of two values can be greatly different, e.g., Marie Curie and Henri Becquerel under (person, award, point in time, winner) vs. (person, spouse, start time, end time, place of marriage)

The proposed NaLP method explicitly models the relatedness of the role‐value pairs.

The NaLP method

 The presentation of each n‐ary fact

 A set of role‐value pairs  Formally, given an n‐ary fact with roles, each role

having values, the representation is as follows:

 For example,

“Marie Curie received Nobel Prize in Chemistry in 1911.”

is represented as:

{person: Marie Curie, award: Nobel Prize in Chemistry, point in time: 1911}

35
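The role‐value‐pair representation fits an ordinary data structure. A sketch using (role, value) tuples, which, unlike a plain dict, lets a role such as "together with" repeat (both facts are the slides' own examples):

```python
# N-ary facts as lists of (role, value) pairs.
fact1 = [
    ("person", "Marie Curie"),
    ("award", "Nobel Prize in Chemistry"),
    ("point in time", "1911"),
]
fact2 = [
    ("person", "Marie Curie"),
    ("award", "Nobel Prize in Physics"),
    ("point in time", "1903"),
    ("together with", "Henri Becquerel"),   # the same role appears twice,
    ("together with", "Pierre Curie"),      # which a dict could not express
]
print(len(fact1), len(fact2))  # 3 5 -- facts of different arities coexist
```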

The NaLP method

 The framework

 A role and its value are tightly linked to each other, and thus should be bound together
 For a set of role‐value pairs, NaLP decides whether they form a valid n‐ary fact, i.e., whether they are closely related

36

Pipeline: role‐value pair embedding → relatedness evaluation


Role‐value pair embedding

37

Pipeline: form the embedding matrix, then capture the features of the role‐value pairs

Relatedness evaluation

38

 The principle
   A set of role‐value pairs forms a valid fact
   → Every two role‐value pairs are greatly related
   → The values of their relatedness feature vector are large
   → The minimum over each feature dimension among all the pairs is not allowed to be too small
   → Apply element‐wise minimization over the pair‐wise relatedness to approximately evaluate the overall relatedness
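The element‐wise minimization step can be sketched as follows. The relatedness function `g` here is a stand‐in toy (in NaLP it is learned by a neural network), so only the min‐pooling mechanics are meaningful:

```python
import numpy as np
from itertools import combinations

def overall_relatedness(pair_embeddings, g):
    """Element-wise minimum of the relatedness feature vectors of all
    pairs of role-value-pair embeddings: one weakly related pair drags
    the corresponding feature dimensions down."""
    vecs = [g(a, b) for a, b in combinations(pair_embeddings, 2)]
    return np.minimum.reduce(vecs)

# Toy relatedness function and toy 2-d "embeddings" for illustration only.
g = lambda a, b: np.abs(a * b)
pairs = [np.array([1.0, 2.0]), np.array([0.5, 3.0]), np.array([2.0, 0.1])]
print(overall_relatedness(pairs, g))  # [0.5 0.2]
```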


Relatedness evaluation

39

Pipeline: compute the relatedness between role‐value pairs → estimate the overall relatedness of all the role‐value pairs → obtain the evaluation score

Look into NaLP and the loss function

 Look into NaLP
   Permutation‐invariant to the input order of role‐value pairs
   Able to cope with facts of different arities

 The loss function:

40


Experiments

 Datasets
   The public n‐ary dataset JF17K, derived from Freebase
    All the facts in it are of good quality
    The form of a relation type is fixed, e.g., “Marie Curie received the Nobel Prize in Chemistry in 1911.”  [person, award, point in time]
   Our new dataset WikiPeople, derived from Wikidata
    More practical and flexible: a relation type may have multiple variants, e.g., Receive_Award:
     “Marie Curie received the Nobel Prize in Chemistry in 1911.”
     “Marie Curie received the Willard Gibbs Award.”
     “Marie Curie received the Matteucci Medal in 1904 with Pierre Curie.”
     “Marie Curie received the Davy Medal with Pierre Curie.”

41

Experiments

 Datasets  Metrics

 MRR (Mean Reciprocal Rank)  Hits@N (the proportion of the test values/roles ranked in

the top‐N ranking list)

 Baseline: the state‐of‐the‐art method RAE

 Works for link prediction on n‐ary relational data are scarce  m‐TransH is the simplified version of RAE

42
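Given the ranks of the correct answers over a test set, both metrics are one‐liners; a sketch (the example ranks are made up for illustration):

```python
def mrr_and_hits(ranks, n=10):
    """MRR and Hits@N from the ranks of the correct answers (rank 1 = best)."""
    mrr = sum(1.0 / r for r in ranks) / len(ranks)
    hits = sum(r <= n for r in ranks) / len(ranks)
    return mrr, hits

ranks = [1, 3, 2, 10, 50]           # hypothetical ranks of five test queries
mrr, hits10 = mrr_and_hits(ranks)
print(round(mrr, 3), hits10)        # 0.391 0.8
```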


Experiments

 Value prediction
   On the dataset of good quality, i.e., JF17K: MRR ↑0.056, Hits@1 ↑7.1%, Hits@3 ↑5.7%
   On the relatively more practical dataset, i.e., WikiPeople: MRR ↑0.166, Hits@1 ↑17.0%, Hits@3 ↑18.2%
 NaLP copes with diverse data better than RAE
   In RAE, a new relation is defined whenever data incompleteness, insertion, or update appears → this may lead to data sparsity → hence the much worse performance of RAE on WikiPeople

43

Experiments

 Value prediction in detail
   NaLP performs better on both the binary and n‐ary categories
   On JF17K, the gap is pronounced, especially on Hits@1, Hits@3 and MRR
   On WikiPeople, RAE is left even further behind

44


Experiments

 Role prediction in detail
   No other baseline: RAE is deliberately designed only for value prediction
   NaLP achieves excellent results: the reasonable modeling of role‐value pairs not only enhances value prediction, but also benefits role prediction

45

Experiments

 Overall relatedness analysis
   The overall relatedness is the crucial intermediate result
   A valid fact has larger values in a majority of dimensions, compared to its negative samples
   Distinguishability metric

46


Experiments

 Overall relatedness analysis
   Case 1: predict Michael Douglas in Fact 1
    Fact 1: {son: Michael Douglas, father: Kirk Douglas}
   Case 2: predict Nobel Prize in Physics in Fact 2
    Fact 2: {person: Marie Curie, award: Nobel Prize in Physics, point in time: 1903, together with: Henri Becquerel, together with: Pierre Curie}

47

Experiments

 Overall relatedness analysis of Case 1 (left) and Case 2 (right)
   Most distinguishability results lie in the area above 0
   The overall relatedness vector captures many discriminant features to further estimate the validity of the input fact

48


Agenda

 Background  Knowledge Graph Embedding (KGE)  Applications of KGE  Conclusions

49

Conclusions

 Knowledge graph embedding projects traditional symbolic representations into vector spaces
 It further supports both basic and advanced applications
 Application‐oriented knowledge graph embedding is a research hotspot
 Open issues
   Incremental inference
   Link inference between n‐ary facts

50


51

Thank you!