Knowledge Graph Reasoning CSCI 699: ML4Know Instructor: Xiang Ren - PowerPoint PPT Presentation

Knowledge Graph Reasoning CSCI 699: ML4Know Instructor: Xiang Ren USC Computer Science

Overview • Motivation • Path-Based Reasoning • Embedding-Based Reasoning • Bridging Path-Based and Embedding-Based Reasoning: DeepPath & DIVA • Conclusion

Knowledge Graphs are Not Complete [Figure: example knowledge graph. Entities include Band of Brothers, Tom Hanks, Neal McDonough, Graham Yost, Michael Kamen, Caesars Entertain…, United States, English, HBO, Mini-Series, and Actor; relations include castActor, countryOfOrigin, awardWorkWinner, writtenBy, music, tvProgramGenre, tvProgramCreator, profession, nationality^-1, serviceLocation^-1, serviceLanguage, personLanguages, and countrySpokenIn^-1.]

Benefits of Knowledge Graph • Support various applications: Structured Search, Question Answering, Dialogue Systems, Relation Extraction, Summarization • Knowledge graphs can be constructed via information extraction from text, but there will be many missing links. • Goal: complete the knowledge graph.

Reasoning on Knowledge Graph • Query node: Band of Brothers • Query relation: tvProgramLanguage • Query: tvProgramLanguage(Band of Brothers, ?)

Reasoning on Knowledge Graph [Figure: example knowledge graph around Band of Brothers, with entities such as Tom Hanks, Neal McDonough, United States, English, and HBO, and relations such as castActor, countryOfOrigin, personLanguages, and serviceLanguage.]

KB Reasoning Tasks • Predicting the missing link. • Given e1 and e2, predict the relation r. • Predicting the missing entity. • Given e1 and relation r, predict the missing entity e2. • Fact Prediction. • Given a triple, predict whether it is true or false.

Related Work • Path-based methods • Path-Ranking Algorithm (Lao et al., 2011) • ProPPR (Wang et al., 2013) • Subgraph Feature Extraction (Gardner et al., 2015) • RNN + PRA (Neelakantan et al., 2015) • Chains of Reasoning (Das et al., 2017) Why do we need path-based methods? They are accurate and explainable!

Random Walk Inference

Path-Ranking Algorithm (Lao et al., 2011) • 1. Run random walk with restarts to derive many paths. • 2. Use supervised training to rank different paths.
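The first PRA step can be sketched on a toy graph. This is a minimal illustration, not the authors' implementation: the triples, the function names `build_graph`, `sample_paths`, and `path_features`, and the restart probability are all assumptions made for the example. Each walk starts at the source, aborts with a fixed probability (the "restart"), and records the sequence of relations whenever it reaches the target.

```python
import random
from collections import Counter, defaultdict

# Toy triples (head, relation, tail); names are illustrative only.
TRIPLES = [
    ("BandOfBrothers", "countryOfOrigin", "UnitedStates"),
    ("UnitedStates", "countrySpeakLanguage", "English"),
    ("BandOfBrothers", "castActor", "TomHanks"),
    ("TomHanks", "personLanguages", "English"),
    ("TomHanks", "nationality", "UnitedStates"),
]

def build_graph(triples):
    """Map each head entity to its outgoing (relation, tail) edges."""
    graph = defaultdict(list)
    for h, r, t in triples:
        graph[h].append((r, t))
    return graph

def sample_paths(graph, source, target, n_walks=2000, max_len=3,
                 restart=0.2, seed=0):
    """Count the relation paths that lead from source to target."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_walks):
        node, path = source, []
        while len(path) < max_len:
            # Restart (or dead end): drop this walk and start a new one.
            if rng.random() < restart or not graph[node]:
                break
            rel, node = rng.choice(graph[node])
            path.append(rel)
            if node == target:
                counts[tuple(path)] += 1
                break
    return counts

def path_features(counts):
    """Normalize path counts into frequencies usable as features."""
    total = sum(counts.values())
    return {path: c / total for path, c in counts.items()}

counts = sample_paths(build_graph(TRIPLES), "BandOfBrothers", "English")
features = path_features(counts)
```

In the second step, frequencies like these would serve as input features to a supervised model (for example a per-relation logistic regression) that learns a weight for each path type; that training loop is omitted here.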

Path-Ranking Algorithm (Lao et al., 2011) • 1. Run random walk with restarts to derive many paths.

Path-Ranking Algorithm (Lao et al., 2011) • 2. Use supervised training to rank different paths.

ProPPR (Wang et al., 2013; 2015) • ProPPR generalizes PRA with recursive probabilistic logic programs. • Other relations can be used jointly to infer the target relation.

Chains of Reasoning (Das et al., 2017) • 1. Use PRA to derive the paths. • 2. Use RNNs to reason over the paths and predict the target relation.

Related Work • Embedding-based methods • RESCAL (Nickel et al., 2011) • TransE (Bordes et al., 2013) • Neural Tensor Network (Socher et al., 2013) • TransR/CTransR (Lin et al., 2015) • Complex Embeddings (Trouillon et al., 2016) Embedding methods allow us to compare entities and find similar ones in the vector space.

RESCAL (Nickel et al., 2011) • Tensor factorization on the (head) entity-(tail) entity-relation tensor.
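RESCAL's factorization scores a triple with a bilinear form: the head embedding, a relation-specific matrix, and the tail embedding. The sketch below is only an illustration with hand-picked numbers; the function name `rescal_score` and the 2-dimensional vectors are assumptions, and no training is shown.

```python
def rescal_score(a_head, r_matrix, a_tail):
    """Bilinear score a_head^T * R * a_tail for one relation matrix R."""
    return sum(
        a_head[i] * r_matrix[i][j] * a_tail[j]
        for i in range(len(a_head))
        for j in range(len(a_tail))
    )

# Hand-picked toy values (not learned): R only links dimension 0 to dimension 1.
head = [1.0, 0.0]
tail = [0.0, 1.0]
relation = [[0.0, 2.0],
            [0.0, 0.0]]

forward = rescal_score(head, relation, tail)   # head -> tail
backward = rescal_score(tail, relation, head)  # tail -> head
```

Because the relation matrix need not be symmetric, the score for (head, tail) can differ from the score for (tail, head), which lets the model represent directed relations.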

TransE (Bordes et al., 2013) • Assumption: in the vector space, adding the relation vector to the head entity should land close to the target tail entity. • Margin-based loss function: • Minimize the distance between (h+l) and t. • Maximize the distance between (h+l) and a randomly sampled tail t' (negative example).
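The margin-based objective can be written out in a few lines. This is a minimal sketch assuming an L2 distance and a margin of 1.0; the function names and the toy vectors are illustrative, and gradient updates are left out.

```python
import math

def l2_distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def transe_distance(h, l, t):
    """Distance between (h + l) and t; small means the triple is plausible."""
    translated = [a + b for a, b in zip(h, l)]
    return l2_distance(translated, t)

def margin_loss(h, l, t, t_neg, margin=1.0):
    """Hinge loss: the true tail should be closer than the corrupted tail by `margin`."""
    positive = transe_distance(h, l, t)
    negative = transe_distance(h, l, t_neg)
    return max(0.0, margin + positive - negative)

h, l, t = [0.0, 0.0], [1.0, 0.0], [1.0, 0.0]
far_negative = margin_loss(h, l, t, t_neg=[4.0, 0.0])   # already separated
near_negative = margin_loss(h, l, t, t_neg=[1.5, 0.0])  # violates the margin
```

The loss is zero once the negative tail is at least `margin` farther from (h+l) than the true tail, so only negatives that are still too close contribute to training.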

Neural Tensor Networks (Socher et al., 2013) • Model the bilinear interaction between entity pairs with tensors.

Poincaré Embeddings (Nickel and Kiela, 2017) • Idea: learn hierarchical KB representations by looking at hyperbolic space.

ConvE (Dettmers et al., 2018) • 1. Reshape the head and relation embeddings into “images”. • 2. Use CNNs to learn convolutional feature maps.

Bridging Path-Based and Embedding-Based Reasoning with Deep Reinforcement Learning: DeepPath (Xiong et al., 2017)

RL for KB Reasoning: DeepPath (Xiong et al., 2017) • Learn the paths with RL instead of using random walks with restart • Model path finding as an MDP • Train an RL agent to find paths • Represent the KG with pretrained KG embeddings • Use the learned paths as logical formulas

Supervised vs. Reinforcement Learning • Supervised Learning ◦ Training based on supervisor/label/annotation ◦ Feedback is instantaneous ◦ Not much of a temporal aspect • Reinforcement Learning ◦ Training based only on a reward signal ◦ Feedback is delayed ◦ Time matters ◦ Agent actions affect subsequent exploration

Reinforcement Learning • RL is a general-purpose framework for decision making ◦ RL is for an agent with the capacity to act ◦ Each action influences the agent's future state ◦ Success is measured by a scalar reward signal ◦ Goal: select actions to maximize future reward

Reinforcement Learning [Figure: agent-environment interaction loop with states, actions, and rewards exchanged at each time step. Agent: multi-layer neural nets ѱ(s_t). Environment: KG modeled as an MDP.]

DeepPath: RL for KG Reasoning

Components of MDP • Markov decision process $\langle S, A, P, R \rangle$ • $S$: continuous states represented with embeddings • $A$: action space (relations or edges) • $P(S_{t+1} = s' \mid S_t = s, A_t = a)$: transition probability • $R(s, a)$: reward received for each taken step • With pretrained KG embeddings • $s_t = e_t \oplus (e_{target} - e_t)$ • $A = \{r_1, r_2, \dots, r_n\}$, all relations in the KG
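The state definition above concatenates the current entity embedding with its offset to the target embedding. A minimal sketch follows, assuming embeddings are plain lists of floats; the function name `make_state` and the example vectors are illustrative.

```python
def make_state(e_current, e_target):
    """Build s_t = e_t (+) (e_target - e_t), where (+) is concatenation."""
    offset = [t - c for c, t in zip(e_current, e_target)]
    return list(e_current) + offset

# Toy 2-dimensional embeddings (not pretrained values).
state = make_state([1.0, 2.0], [3.0, 5.0])
at_target = make_state([3.0, 5.0], [3.0, 5.0])
```

The state has twice the embedding dimension, and its second half becomes all zeros once the agent stands on the target entity.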

Reward Functions • Global Accuracy • Path Efficiency • Path Diversity
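The slide lists only the names of the three rewards. The sketch below assumes one common reading of them: a global reward of +1 or -1 depending on whether the target is reached, an efficiency reward equal to the inverse path length, and a diversity reward equal to the negative mean cosine similarity to paths already found. These formulas, the function names, and the use of plain vectors as path embeddings are assumptions for illustration.

```python
import math

def global_reward(reached_target):
    """Assumed form: +1 if the walk ends at the target entity, else -1."""
    return 1.0 if reached_target else -1.0

def efficiency_reward(path):
    """Assumed form: shorter relation paths earn a larger reward."""
    return 1.0 / len(path)

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def diversity_reward(path_vec, found_vecs):
    """Assumed form: penalize similarity to previously found paths."""
    if not found_vecs:
        return 0.0
    mean_sim = sum(cosine(path_vec, f) for f in found_vecs) / len(found_vecs)
    return -mean_sim
```

Under these assumptions a path identical to one already found receives a diversity reward of -1, while a path orthogonal to all previous ones receives 0.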

Training with Policy Gradient • Monte-Carlo Policy Gradient (REINFORCE, Williams, 1992)
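A REINFORCE update can be illustrated with a softmax policy over a handful of actions. This is a toy sketch, not the DeepPath training code: the three-action setting, the reward rule (only action 2 "reaches the target"), the learning rate, and all function names are assumptions. It uses the fact that for a softmax policy the gradient of log pi(a) with respect to the logits is onehot(a) minus the action probabilities.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def reinforce_step(logits, action, reward, lr=0.1):
    """One update: logits += lr * reward * grad log pi(action)."""
    probs = softmax(logits)
    return [
        x + lr * reward * ((1.0 if i == action else 0.0) - p)
        for i, (x, p) in enumerate(zip(logits, probs))
    ]

def train(n_episodes=500, seed=0):
    """Sample actions from the policy and reinforce the rewarded one."""
    rng = random.Random(seed)
    logits = [0.0, 0.0, 0.0]
    for _ in range(n_episodes):
        probs = softmax(logits)
        action = rng.choices(range(3), weights=probs)[0]
        # Hypothetical reward: only action 2 counts as a success.
        reward = 1.0 if action == 2 else -1.0
        logits = reinforce_step(logits, action, reward)
    return softmax(logits)

final_probs = train()
```

After training, most of the probability mass sits on the rewarded action. In the KG setting the logits would come from a neural network over the state, and the reward would arrive only at the end of an episode.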

Challenge • Typical RL problems ◦ Atari games (Mnih et al., 2015): 4~18 valid actions ◦ AlphaGo (Silver et al., 2016): ~250 valid actions ◦ Knowledge graph reasoning: >= 400 actions • Issue: ◦ Large action (search) space -> poor convergence properties

Supervised (Imitation) Policy Learning • Use randomized BFS to retrieve a few paths • Do imitation learning using the retrieved paths • All the paths are assigned a +1 reward
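The slide does not say how the BFS is randomized. One simple reading, assumed here, is to shuffle the order in which neighbors are expanded so that repeated searches return different shortest paths. The toy graph, the relation names `r1` to `r4`, and the function name `randomized_bfs` are illustrative.

```python
import random
from collections import deque

# Toy graph: node -> list of (relation, neighbor); names are illustrative.
GRAPH = {
    "A": [("r1", "B"), ("r2", "C")],
    "B": [("r3", "D")],
    "C": [("r4", "D")],
}

def randomized_bfs(graph, source, target, rng):
    """BFS that shuffles neighbor order; returns a relation path or None."""
    queue = deque([(source, [])])
    visited = {source}
    while queue:
        node, path = queue.popleft()
        if node == target:
            return path
        neighbors = list(graph.get(node, []))
        rng.shuffle(neighbors)  # the randomization step (an assumption)
        for rel, nxt in neighbors:
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, path + [rel]))
    return None

rng = random.Random(0)
demo_paths = {tuple(randomized_bfs(GRAPH, "A", "D", rng)) for _ in range(20)}
```

Each retrieved path would then be treated as a demonstration with a +1 reward, and the policy updated to make the relations along it more likely.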

Datasets and Preprocessing

Dataset      # of Entities   # of Relations   # of Triples   # of Tasks
FB15k-237    14,505          237              310,116        20
NELL-995     75,492          200              154,213        12

FB15k-237: Sampled from FB15k (Bordes et al., 2013), redundant relations removed
NELL-995: Sampled from the 995th iteration of the NELL system (Carlson et al., 2010b)
• Dataset processing ◦ Remove useless relations: haswikipediaurl, generalizations, etc. ◦ Add inverse relation links to the knowledge graph ◦ Remove the triples with task relations

Effect of Supervised Policy Learning [Figure: x-axis: number of training epochs; y-axis: success ratio (probability of reaching the target) on the test set] -> Re-train the agent using reward functions

Inference Using Learned Paths • Path as logical formula ◦ FilmCountry: actionFilm^-1 -> personNationality ◦ PersonNationality: placeOfBirth -> locationContains^-1 ◦ etc. … • Bi-directional path-constrained search ◦ Check whether the formulas hold for entity pairs [Figure: uni-directional search vs. bi-directional search]
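Checking whether a path formula holds for an entity pair can be sketched as a search that expands from both ends and tests whether the two frontiers meet. This is an illustrative sketch: the entity names, the `_inv` suffix for inverse relations (stored as explicit triples, in line with the preprocessing step that adds inverse links), and the function names are assumptions.

```python
from collections import defaultdict

# Toy triples; inverse links are stored explicitly with an "_inv" suffix.
TRIPLES = [
    ("PersonA", "placeOfBirth", "CityX"),
    ("CountryY", "locationContains", "CityX"),
    ("CityX", "locationContains_inv", "CountryY"),
]

def index_triples(triples):
    """Forward and backward lookup tables keyed by (node, relation)."""
    fwd, bwd = defaultdict(set), defaultdict(set)
    for h, r, t in triples:
        fwd[(h, r)].add(t)
        bwd[(t, r)].add(h)
    return fwd, bwd

def expand(index, nodes, relations):
    """Follow the relations in order, starting from a set of nodes."""
    for rel in relations:
        nodes = {nxt for n in nodes for nxt in index.get((n, rel), ())}
    return nodes

def path_holds(triples, head, tail, path):
    """True if some instance of the relation path connects head to tail."""
    fwd, bwd = index_triples(triples)
    mid = len(path) // 2
    from_head = expand(fwd, {head}, path[:mid])                    # first half
    from_tail = expand(bwd, {tail}, list(reversed(path[mid:])))    # second half, backwards
    return bool(from_head & from_tail)

formula = ["placeOfBirth", "locationContains_inv"]
holds = path_holds(TRIPLES, "PersonA", "CountryY", formula)
```

Meeting in the middle keeps each frontier short, which matters when intermediate entities have many neighbors.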

Link Prediction Result

Tasks                  PRA     DeepPath   TransE   TransR
worksFor               0.681   0.711      0.677    0.692
athletePlaysForTeam    0.987   0.955      0.896    0.784
athletePlaysInLeague   0.841   0.960      0.773    0.912
athleteHomeStadium     0.859   0.890      0.718    0.722
teamPlaysSports        0.791   0.738      0.761    0.814
orgHirePerson          0.599   0.742      0.719    0.737
personLeadsOrg         0.700   0.795      0.751    0.772
…
Overall                0.675   0.796      0.737    0.789

Mean average precision on NELL-995
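The metric in this table, mean average precision, can be computed from ranked candidate lists as sketched below. This follows the standard definition; how the slides' evaluation ranks candidates is not stated, so the input format (a list of 0/1 relevance labels in ranked order per query) and the function names are assumptions.

```python
def average_precision(ranked_labels):
    """Average of precision@k over the ranks k where a positive appears."""
    hits, total = 0, 0.0
    for rank, is_positive in enumerate(ranked_labels, start=1):
        if is_positive:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

def mean_average_precision(queries):
    """Mean of the per-query average precision values."""
    return sum(average_precision(q) for q in queries) / len(queries)

# Two hypothetical queries; 1 marks a correct candidate at that rank.
ap_first = average_precision([1, 0, 1])
ap_second = average_precision([0, 1])
map_score = mean_average_precision([[1, 0, 1], [0, 1]])
```

For the first query the positives sit at ranks 1 and 3, giving (1/1 + 2/3) / 2; for the second the only positive is at rank 2, giving 1/2.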

Qualitative Analysis [Figure: path length distributions]

Qualitative Analysis: Example Paths

personNationality:
  placeOfBirth -> locationContains^-1
  placeOfBirth -> locationContains
  peoplePlaceLived -> locationContains^-1
  peopleMariage -> locationOfCeremony -> locationContains^-1
tvProgramLanguage:
  tvCountryOfOrigin -> countryOfficialLanguage
  tvCountryOfOrigin -> filmReleaseRegion^-1 -> filmLanguage
  tvCastActor -> personLanguage
athletePlaysForTeam:
  athleteHomeStadium -> teamHomeStadium^-1
  athletePlaysSports -> teamPlaysSports^-1
  atheleteLedSportsTeam

Bridging Path-Finding and Reasoning with Variational Inference: DIVA (Chen et al., NAACL 2018)

DIVA: Variational KB Reasoning (NAACL 2018) • Inferring latent paths connecting entity nodes. • Condition: the entity pair $(e_1, e_2)$ • Observed variable: the relation $r$ • Objective: maximize $\log p(r \mid e_1, e_2)$ [Figure: example with entities United States and English and relation countrySpeakLanguage]
