KNOWLEDGE GRAPH CONSTRUCTION Jay Pujara University of Maryland, College Park Max Planck Institute 7/9/2015

Can Computers Create Knowledge? Internet (massive source of publicly available information) → Knowledge

Computers + Knowledge =

What does it mean to create knowledge? What do we mean by knowledge?

Defining the Questions • Extraction • Representation • Reasoning and Inference

A Revised Knowledge-Creation Diagram • Internet: massive source of publicly available information • Extraction: cutting-edge IE methods • Knowledge Graph (KG): structured representation of entities, their labels and the relationships between them

Knowledge Graphs in the wild

Motivating Problem: Real Challenges • Internet • Extraction: Difficult! • Knowledge Graph: Noisy! Contains many errors and inconsistencies

NELL: The Never-Ending Language Learner • Large-scale IE project (Carlson et al., AAAI10) • Lifelong learning: aims to “read the web” • Ontology of known labels and relations • Knowledge base contains millions of facts

Examples of NELL errors

Entity co-reference errors Kyrgyzstan has many variants: • Kyrgystan • Kyrgistan • Kyrghyzstan • Kyrgzstan • Kyrgyz Republic

Missing and spurious labels Kyrgyzstan is labeled a bird and a country

Missing and spurious relations Kyrgyzstan’s location is ambiguous – Kazakhstan, Russia and US are included in possible locations

Violations of ontological knowledge • Equivalence of co-referent entities (sameAs): SameEntity(Kyrgyzstan, Kyrgyz Republic) • Mutual exclusion (disjointWith) of labels: MUT(bird, country) • Selectional preferences (domain/range) of relations: RNG(countryLocation, continent) • Enforcing these constraints requires jointly considering multiple extractions across documents
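
The constraint types listed on this slide can be illustrated with a minimal sketch that flags mutual-exclusion and range violations in a handful of candidate facts. This is a hard, after-the-fact check rather than the joint reasoning the talk argues for. The function names and the data layout are hypothetical, and the labels for Kazakhstan and Asia are assumed for illustration; only MUT(bird, country) and RNG(countryLocation, continent) come from the slide.

```python
# Candidate facts. The Kyrgyzstan labels mirror the NELL error example;
# the Kazakhstan and Asia labels are assumptions made for this sketch.
labels = {
    ("Kyrgyzstan", "bird"),
    ("Kyrgyzstan", "country"),
    ("Kazakhstan", "country"),
    ("Asia", "continent"),
}
relations = {
    ("Kyrgyzstan", "countryLocation", "Kazakhstan"),
    ("Kyrgyzstan", "countryLocation", "Asia"),
}

# Ontology taken from the slide: MUT(bird, country), RNG(countryLocation, continent).
mutex = {frozenset({"bird", "country"})}
range_of = {"countryLocation": "continent"}


def mutex_violations(labels, mutex):
    """Return (entity, label pair) for every entity holding two exclusive labels."""
    by_entity = {}
    for entity, label in labels:
        by_entity.setdefault(entity, set()).add(label)
    found = []
    for entity, held in sorted(by_entity.items()):
        for pair in mutex:
            if pair <= held:
                found.append((entity, tuple(sorted(pair))))
    return found


def range_violations(relations, labels, range_of):
    """Return relation triples whose object lacks the label required by RNG."""
    found = []
    for subject, relation, obj in sorted(relations):
        required = range_of.get(relation)
        if required is not None and (obj, required) not in labels:
            found.append((subject, relation, obj))
    return found


print(mutex_violations(labels, mutex))
print(range_violations(relations, labels, range_of))
```

A check like this only reports conflicts. It cannot decide which of the conflicting facts to keep, which is the gap that joint inference over uncertain extractions is meant to fill.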

Examples where joint models have succeeded • Information extraction • ER+Segmentation: Poon & Domingos, AAAI07 • SRL: Srikumar & Roth, EMNLP11 • Within-doc extraction: Singh et al., AKBC13 • Social and communication networks • Fusion: Eldardiry & Neville, MLG10 • EMailActs: Carvalho & Cohen, SIGIR05 • GraphID: Namata et al., KDD11

GRAPH IDENTIFICATION

Slides courtesy Getoor, Namata, Kok Graph Identification • Transformation: Input Graph (available but inappropriate for analysis) → Output Graph (appropriate for further analysis)

Slides courtesy Getoor, Namata, Kok Motivation: Different Networks • Communication Network: Nodes: Email Address; Edges: Communication; Node Attributes: Words • Organizational Network: Nodes: Person; Edges: Manages; Node Labels: Title (CEO, Manager, Assistant, Programmer) [Figure: email addresses nsmith@msn.com, neil@email.com, mjones@email.com, mary@email.com, mtaylor@email.com, robert@email.com, acole@email.com; people Neil Smith, Mary Jones, Mary Taylor, Robert Lee, Anne Cole]

Slides courtesy Getoor, Namata, Kok Graph Identification [Figure: input graph of email addresses (nsmith@msn.com, neil@email.com, mjones@email.com, mary@email.com, mtaylor@email.com, robert@email.com, acole@email.com) transformed by graph identification into an output graph of people (Neil Smith, Mary Jones, Mary Taylor, Robert Lee, Anne Cole) with labels CEO, Manager, Assistant, Programmer] Input Graph: Email Communication Network Output Graph: Social Network • What’s involved? • Entity Resolution (ER): Map input graph nodes to output graph nodes • Link Prediction (LP): Predict existence of edges in output graph • Node Labeling (NL): Infer the labels of nodes in the output graph
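
As a rough illustration of how an ER result feeds link prediction, the sketch below lifts email-level communication edges to person-level candidate links. It is a naive pipeline, not the joint inference that graph identification performs, and node labeling is left out. The address-to-person mapping, the communication edges and the name `candidate_links` are all assumptions made for this example; the slide's figure does not specify them.

```python
from collections import Counter

# Assumed ER output: each input node (email address) maps to one output node (person).
entity_of = {
    "nsmith@msn.com": "Neil Smith",
    "neil@email.com": "Neil Smith",
    "mjones@email.com": "Mary Jones",
    "mary@email.com": "Mary Jones",
    "mtaylor@email.com": "Mary Taylor",
    "robert@email.com": "Robert Lee",
    "acole@email.com": "Anne Cole",
}

# Assumed input-graph edges: pairs of addresses that exchanged email.
communication = [
    ("nsmith@msn.com", "mjones@email.com"),
    ("neil@email.com", "mary@email.com"),
    ("mary@email.com", "robert@email.com"),
    ("neil@email.com", "nsmith@msn.com"),
]


def candidate_links(communication, entity_of):
    """Count communications between distinct resolved people (undirected)."""
    counts = Counter()
    for src, dst in communication:
        a, b = entity_of[src], entity_of[dst]
        if a != b:  # two addresses of the same person yield no link
            counts[tuple(sorted((a, b)))] += 1
    return counts


links = candidate_links(communication, entity_of)
for pair, n in sorted(links.items()):
    print(pair, n)
```

The counts are only evidence for an edge in the output graph. Changing one ER decision, for example assigning mary@email.com to Mary Taylor instead, changes the candidate links, which is why the three tasks depend on each other.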

Slides courtesy Getoor, Namata, Kok Problem Dependencies [Diagram: Input Graph, ER, LP, NL] • Most work looks at these tasks in isolation • In graph identification they are: • Evidence-Dependent: inference depends on the observed input graph, e.g., ER depends on the input graph • Intra-Dependent: inferences within a task are dependent, e.g., NL predictions depend on other NL predictions • Inter-Dependent: inferences across tasks are dependent, e.g., LP depends on ER and NL predictions

KNOWLEDGE GRAPH IDENTIFICATION Pujara, Miao, Getoor, Cohen, ISWC 2013 (best student paper)

(Pujara et al., ISWC13) Motivating Problem (revised) • Internet → Large-scale IE → (noisy) Extraction Graph → Joint Reasoning → Knowledge Graph

(Pujara et al., ISWC13) Knowledge Graph Identification • Problem: Extraction Graph → Knowledge Graph Identification → Knowledge Graph • Solution: Knowledge Graph Identification (KGI) • Performs graph identification: entity resolution, node labeling, link prediction • Enforces ontological constraints • Incorporates multiple uncertain sources

(Pujara et al., ISWC13) Illustration of KGI: Extractions Uncertain Extractions: .5: Lbl(Kyrgyzstan, bird) .7: Lbl(Kyrgyzstan, country) .9: Lbl(Kyrgyz Republic, country) .8: Rel(Kyrgyz Republic, Bishkek, hasCapital)

(Pujara et al., ISWC13) Illustration of KGI: Ontology + ER • Uncertain Extractions: .5: Lbl(Kyrgyzstan, bird) .7: Lbl(Kyrgyzstan, country) .9: Lbl(Kyrgyz Republic, country) .8: Rel(Kyrgyz Republic, Bishkek, hasCapital) • Ontology: Dom(hasCapital, country) Mut(country, bird) • Entity Resolution: SameEnt(Kyrgyz Republic, Kyrgyzstan) [Figure: (annotated) extraction graph with nodes Kyrgyzstan, Kyrgyz Republic, Bishkek, country and bird, and edges Lbl, Rel(hasCapital), SameEnt, Dom and Mut]

(Pujara et al., ISWC13) Illustration of KGI • After Knowledge Graph Identification [Figure: knowledge graph with nodes Kyrgyzstan, Kyrgyz Republic, country and Bishkek, and edges Lbl and Rel(hasCapital)]

(Pujara et al., ISWC13) Modeling Knowledge Graph Identification

(Pujara et al., ISWC13) Viewing KGI as a probabilistic graphical model • Lbl(Kyrgyzstan, bird) • Lbl(Kyrgyzstan, country) • Lbl(Kyrgyz Republic, bird) • Lbl(Kyrgyz Republic, country) • Rel(hasCapital, Kyrgyzstan, Bishkek) • Rel(hasCapital, Kyrgyz Republic, Bishkek)

(Pujara et al., ISWC13) Background: Probabilistic Soft Logic (PSL) (Broecheler et al., UAI10; Kimmig et al., NIPS-ProbProg12) • Templating language for hinge-loss MRFs, very scalable! • Model specified as a collection of logical formulas: $\mathrm{SameEnt}(E_1, E_2) \,\tilde{\wedge}\, \mathrm{Lbl}(E_1, L) \Rightarrow \mathrm{Lbl}(E_2, L)$ • Uses soft-logic formulation • Truth values of atoms relaxed to [0,1] interval • Truth values of formulas derived from Lukasiewicz t-norm: $p \,\tilde{\wedge}\, q = \max(0, p + q - 1)$, $p \,\tilde{\vee}\, q = \min(1, p + q)$, $\tilde{\neg} p = 1 - p$, $p \,\tilde{\Rightarrow}\, q = \min(1, q - p + 1)$
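
The four Lukasiewicz operators on this slide translate directly into code. The following is a minimal sketch of the formulas only, not of the PSL library itself; the function names are chosen for this example.

```python
def soft_and(p, q):
    """Lukasiewicz conjunction: max(0, p + q - 1)."""
    return max(0.0, p + q - 1.0)


def soft_or(p, q):
    """Lukasiewicz disjunction: min(1, p + q)."""
    return min(1.0, p + q)


def soft_not(p):
    """Negation: 1 - p."""
    return 1.0 - p


def soft_implies(p, q):
    """Lukasiewicz implication: min(1, q - p + 1)."""
    return min(1.0, q - p + 1.0)


# At the extremes 0 and 1 the operators agree with ordinary Boolean logic.
for p in (0.0, 1.0):
    for q in (0.0, 1.0):
        assert soft_and(p, q) == float(bool(p) and bool(q))
        assert soft_or(p, q) == float(bool(p) or bool(q))
        assert soft_implies(p, q) == float((not bool(p)) or bool(q))

# Between the extremes they interpolate, e.g. for truth values 0.7 and 0.9.
print(soft_and(0.7, 0.9), soft_or(0.7, 0.9), soft_implies(0.9, 0.7))
```

Because each operator is a max or min over linear functions of the truth values, a rule's distance from satisfaction is a hinge function, which is what connects these formulas to the hinge-loss MRFs named on the slide.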

Soft Logic Tutorial: Rules to Groundings • Given a database of evidence, we can convert rule templates to instances (grounding) • Rules are grounded by substituting literals into formulas: the template $\mathrm{SameEnt}(E_1, E_2) \,\tilde{\wedge}\, \mathrm{Lbl}(E_1, L) \Rightarrow \mathrm{Lbl}(E_2, L)$ becomes $\mathrm{SameEnt}(\text{Kyrgyzstan}, \text{Kyrgyz Republic}) \,\tilde{\wedge}\, \mathrm{Lbl}(\text{Kyrgyzstan}, \text{country}) \Rightarrow \mathrm{Lbl}(\text{Kyrgyz Republic}, \text{country})$ • The soft logic interpretation assigns a “satisfaction” value to each ground rule
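
Grounding and scoring the rule above can be sketched in a few lines. The Lbl truth values are the extraction confidences from the earlier illustration slide. The SameEnt value of 1.0 is an assumption, and so is treating unobserved atoms as 0.0: in PSL itself, unobserved atoms are variables whose values are inferred. The helper names are hypothetical.

```python
from itertools import product


def soft_and(p, q):
    return max(0.0, p + q - 1.0)


def soft_implies(p, q):
    return min(1.0, q - p + 1.0)


# Evidence database: ground atom -> truth value in [0, 1].
evidence = {
    ("SameEnt", "Kyrgyzstan", "Kyrgyz Republic"): 1.0,  # assumed value
    ("Lbl", "Kyrgyzstan", "bird"): 0.5,
    ("Lbl", "Kyrgyzstan", "country"): 0.7,
    ("Lbl", "Kyrgyz Republic", "country"): 0.9,
}


def truth(atom):
    # Simplification for this sketch: atoms absent from the evidence are 0.0.
    return evidence.get(atom, 0.0)


def ground_rule(entities, labels):
    """Ground SameEnt(E1,E2) AND Lbl(E1,L) => Lbl(E2,L); return satisfaction per grounding."""
    satisfaction = {}
    for e1, e2, label in product(entities, entities, labels):
        if e1 == e2:
            continue  # skip trivial groundings where both variables bind to one entity
        body = soft_and(truth(("SameEnt", e1, e2)), truth(("Lbl", e1, label)))
        head = truth(("Lbl", e2, label))
        satisfaction[(e1, e2, label)] = soft_implies(body, head)
    return satisfaction


groundings = ground_rule(["Kyrgyzstan", "Kyrgyz Republic"], ["bird", "country"])
for binding, value in sorted(groundings.items()):
    print(binding, value)
```

Under these assumptions the country grounding is fully satisfied, while the bird grounding is only half satisfied because Lbl(Kyrgyz Republic, bird) has no support. Inference would have to trade that rule off against the mutual-exclusion constraint between bird and country.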
