Tutorial Overview https://kgtutorial.github.io Part 1: Knowledge - PowerPoint PPT Presentation

Tutorial Overview https://kgtutorial.github.io Part 1: Knowledge Graphs Part 2: Part 3: Knowledge Graph Extraction Construction Part 4: Critical Analysis 1

Tutorial Outline 1. Knowledge Graph Primer [Jay] 2. Knowledge Extraction Primer [Jay] Coffee Break 3. Knowledge Graph Construction a. Probabilistic Models [Jay] b. Embedding Techniques [Sameer] 4. Critical Overview and Conclusion [Sameer] 2

Critical Overview SUMMARY SUCCESS STORIES DATASETS, TASKS, SOFTWARES EXCITING RESEARCH DIRECTIONS 3

Why do we need Knowledge graphs? •Humans can explore large database in intuitive ways •AI agents get access to human common sense knowledge 5

Knowledge graph construction A 1 E 1 • Who are the entities A 2 (nodes) in the graph? • What are their attributes E 2 and types (labels)? A 1 A 2 • How are they related E 3 (edges)? A 1 A 2 6

Knowledge Graph Construction Extraction Knowledge Knowledge Graph graph Text Extraction Construction graph 7

Two perspectives Extraction graph Knowledge graph • • Who are the entities? Named Entity Entity Linking • (nodes) Recognition Entity Resolution • Entity Coreference • • What are their Entity Typing Collective attributes? (labels) classification • • How are they related? Semantic role Link prediction (edges) labeling • Relation Extraction 8

Knowledge Extraction John was born in Liverpool, to Julia and Alfred Lennon. Text NLP Lennon.. Mrs. Lennon.. his father the Pool John Lennon... .. his mother .. Alfred he Location Person Person Person John was born in Liverpool, to Julia and Alfred Lennon. Annotated text NNP VBD VBD IN NNP TO NNP CC NNP NNP Extraction graph Information Alfred Extraction Lennon childOf birthplace John Liverpool Lennon Julia childOf Lennon 9

Information Extraction Single extractor Supervised Defining domain Learning extractors Semi-supervised Scoring candidate facts Unsupervised Fusing multiple extractors 10

Knowledge Graph Construction Part 2: Extraction Part 3: Knowledge Knowledge Graph graph Text graph Extraction Construction 11

Issues with Extraction Graph Extracted knowledge could be: • ambiguous • incomplete • inconsistent 12

Two approaches for KG construction PROBABILISTIC MODELS EMBEDDING BASED MODELS 13

Two classes of Probabilistic Models GRAPHICAL MODEL BASED RANDOM WALK BASED ◦ Possible facts in KG are ◦ Possible facts posed as variables queries ◦ Logical rules relate facts ◦ Random walks of the KG constitute “proofs” ◦ Probability path ◦ Probability satisfied lengths/transitions rules ◦ Local grounding ◦ Universal-quantification 15

Illustration of KG Identification (Annotated) Extraction Graph Uncertain Extractions: SameEnt .5: Lbl(Fab Four, novel) Fab Four Beatles .7: Lbl(Fab Four, musician) .9: Lbl(Beatles, musician) .8: Rel(Beatles,AlbumArtist, Abbey Road) Ontology: musician Dom(albumArtist, musician) Mut(novel, musician) novel Entity Resolution: Abbey Road SameEnt(Fab Four, Beatles) After Knowledge Graph Identification Beatles Rel(AlbumArtist ) Lbl Abbey Road musician Fab Four PUJARA+ISWC13; PUJARA+AIMAG15

Random Walk Illustration Query: R(Lennon, PlaysInstrument, ?) 17

Why embeddings? Limitations of probabilistic models Embedding based models Limitation to Logical Relations • Representation restricted by manual design • Everything as dense vectors • Clustering? Asymmetric implications? • Captures many relations • Information flows through these relations • Learned from data • Difficult to generalize to unseen entities/relations • Can generalize to unseen entities and relations Computational Complexity of Algorithms • Efficient inference at large scale • • Learning is NP-Hard, difficult to approximate Learning using stochastic • Query-time inference is also NP-Hard gradient, back-propagation • • Not easy to parallelize, or use GPUs Querying is often cheap • • Scalability is badly affected by representation GPU-parallelism friendly

Relation Embeddings 20

Part 1: Knowledge Graphs Part 2: Part 3: Knowledge Graph Extraction Construction 21

Success stories YAGO 23

Success story: OpenIE (ReVerb) openie.allenai.org 24

Success story: NELL 25

Success story: YAGO • Input: Wikipedia infoboxes, WordNet and GeoNames • Output: KG with 350K entity types, 10M entities, 120M facts • Temporal and spatial information 26

Link 27

Success story • DBPedia is automatically extracted structured data from Wikipedia • 17M canonical entities • 88M type statements • 72M infobox statements 28

DeepDive • Machine learning based extraction system • Best Precision/recall/F1 in KBP-slot filling task 2014 evaluations (31 teams participated) 29

IE systems in practice Defining Learning Scoring Fusing domain extractors candidate extractors facts ConceptNet NELL Heuristic rules Knowledge Classifier Vault OpenIE 30

Datasets • KG as datasets • FB15K-237 Knowledge base completion dataset based on Freebase 1 • DBPedia Structured data extracted from Wikipedia • NELL Read the web datasets • AristoKB Tuple knowledge base for Science domain • Text datasets • Clueweb09: 1 billion webpages (sample of Web) • FACC1: Freebase Annotations of the Clueweb09 Corpora • Gigaword: automatically-generated syntactic and discourse structure • NYTimes: The New York Times Annotated Corpus • Datasets related to Semi-supervised learning for information extraction Link: entity typing, concept discovery, aligning glosses to KB, multi-view learning 1 see Dettmers et al, 2017 for details (https://arxiv.org/pdf/1707.01476.pdf) 32

Shared tasks • Text Analysis Conference on Knowledge Base Population (TAC KBP) • Slot filling task • Cold Start KBP Track • Tri-Lingual Entity Discovery and Linking Track (EDL) • Event Track • Validation/Ensembling Track 33

Software: NLP • Stanford CoreNLP: a suite of core NLP tools [link] (Java code) • FIGER: fine-grained entity recognizer assigns over 100 semantic types link (Java code) • FACTORIE: out-of-the-box tools for NLP and information integration link (Scala code) • EasySRL: Semantic role labeling link (Java code) 34

Software: Extracting and Reasoning • Open IE (University of Washington) Open IE 4.2 link (Scala code) Stanford Open IE link (Java code) • Interactive Knowledge Extraction (IKE) (Allen Institute for Artificial Intelligence) link (Scala code) • PSL: Probabilistic soft logic link (Java code) • ProPPR: Programming with Personalized PageRank link (Java code) 35

Exciting Active Research • INTERESTING APPLICATIONS OF KG • MULTI-MODAL INFORMATION EXTRACTION • KNOWLEDGE AS SUPERVISION • COMMON KNOWLEDGE 37

Interesting application of Knowledge Graphs The Literome Project [link] • Automatic system for extracting genomic knowledge from PubMed articles • Web-accessible knowledge base Literome: PubMed-Scale Genomic Knowledge Base in the Cloud , Hoifung Poon et al., Bioinformatics 2014 39

Interesting application of Knowledge Graphs Chronic disease management: develop AI technology for predictive and preventive personalized medicine to reduce the national healthcare expenditure on chronic diseases (90% of total cost) 40

Knowledge Base Completion Scoring Function Table from Dettmers, et al. (2017)

Multimodal KB Embeddings Scoring Function Encoder Object Entity Lookup Images CNN Text LSTM Numbers, etc. FeedFwd

Knowledge as Supervision ✔ spouseOf(Barack, Michelle) Learned Model Update Model Learning User Algorithm Problem 1: Each annotation takes time Problem 2: Each annotation is a drop in the ocean ✔ X husband of Y => spouseOf(X,Y) Learned Model Update Model Learning User Algorithm Many different options - Generalized Expectation - Posterior Regularization - Labeling functions in SNORKEL 45

Tutorial Overview https://kgtutorial.github.io Part 1: Knowledge - PowerPoint PPT Presentation

Tutorial Overview https://kgtutorial.github.io Part 1: Knowledge Graphs Part 2: Part 3: Knowledge Graph Extraction Construction Part 4: Critical Analysis 1 Tutorial Outline 1. Knowledge Graph Primer [Jay] 2. Knowledge Extraction Primer

Tutorial Tutorial A2 is out, its called Inpainting Tutorial Tutorial A2 is out, its called

A GAMS TUTORIAL A GAMS TUTORIAL A GAMS TUTORIAL WHAT IS GAMS ? General Algebraic Modeling

Excel Tutorial 1 Getting Started with Excel Tutorial 2 Formatting a Workbook Tutorial 3

Do Fifty- Two Motivation Overview of the Language

PROGRAMMING TUTORIAL Thierry Lepley, April 4 th 2016 TUTORIAL GOAL Intermediate Tutorial for

UPPAAL Tutorial UPPAAL Tutorial UPPAAL Tutorial Introduction Introduction Alexandre David

PowerPoint Tutorial 1 Creating a Presentation Tutorial 2 Applying and Modifying Text and

Tutorial: TF-Ranking for sparse features Tutorial: TF-Ranking for sparse features This tutorial

Comp 1402 Winter 2008 Tutorial #1 Tutorial 1 The objectives of this tutorial will be:

XDP hands-on tutorial Jesper Dangaard Brouer Toke Hiland-Jrgensen Bornhack Gelsted, August

Prose tutorial Edit New Page Sumit Gulwani edited this page 9 minutes ago 60 revisions

Tutorial on using the Google Cloud Platform (GCP) Tutorial on using the Google Cloud Platform

CS 525M Mobile and Ubiquitous Computing Tutorial 1: Introduction by Bucky Roberts (thenewboston)

CAVE2 Unity Tutorial CAVE2 unity tutorial on github Omicron Cave example unity scene Cave2

NLP Programming Tutorial 0 - Programming Basics Graham Neubig Nara Institute of Science and

CAVE2 Unity Tutorial CAVE2 unity tutorial on github Omicron Cave example unity scene Cave2

The Entity-Relationship Model ER Model - Part 2: Conversion to SQL By Michael Hahsler Based on

Entities The Entity-Relationship Model name ssn lot Employees Database Management Systems, R.

The Entity-Relationship Requirements analysis Model Conceptual design data model

Sustainable Development and Enterprise Value Clay Williams Manager, Governance Science Research

KDI EER: The Extended ER Model Fausto Giunchiglia and Mattia Fumagallli University of Trento

Database Design . CO19-320302 Databases & Web Services (P. Baumann) 1 Core Database Design

CSE 232A Graduate Database Systems Arun Kumar Topic 5: Data Integration and Cleaning Slide

Named Entity Recognition Using BERT and ELMo Group 8 : Mikaela Guerrero Vikash Kumar Nitya

Tutorial Overview https://kgtutorial.github.io Part 1: Knowledge - PowerPoint PPT Presentation

Tutorial Overview https://kgtutorial.github.io Part 1: Knowledge Graphs Part 2: Part 3: Knowledge Graph Extraction Construction Part 4: Critical Analysis 1 Tutorial Outline 1. Knowledge Graph Primer [Jay] 2. Knowledge Extraction Primer

Tutorial Tutorial A2 is out, its called Inpainting Tutorial Tutorial A2 is out, its called

A GAMS TUTORIAL A GAMS TUTORIAL A GAMS TUTORIAL WHAT IS GAMS ? General Algebraic Modeling

Excel Tutorial 1 Getting Started with Excel Tutorial 2 Formatting a Workbook Tutorial 3

Do Fifty- Two Motivation Overview of the Language

PROGRAMMING TUTORIAL Thierry Lepley, April 4 th 2016 TUTORIAL GOAL Intermediate Tutorial for

UPPAAL Tutorial UPPAAL Tutorial UPPAAL Tutorial Introduction Introduction Alexandre David

PowerPoint Tutorial 1 Creating a Presentation Tutorial 2 Applying and Modifying Text and

Tutorial: TF-Ranking for sparse features Tutorial: TF-Ranking for sparse features This tutorial

Comp 1402 Winter 2008 Tutorial #1 Tutorial 1 The objectives of this tutorial will be:

XDP hands-on tutorial Jesper Dangaard Brouer Toke Hiland-Jrgensen Bornhack Gelsted, August

Prose tutorial Edit New Page Sumit Gulwani edited this page 9 minutes ago 60 revisions

Tutorial on using the Google Cloud Platform (GCP) Tutorial on using the Google Cloud Platform

CS 525M Mobile and Ubiquitous Computing Tutorial 1: Introduction by Bucky Roberts (thenewboston)

CAVE2 Unity Tutorial CAVE2 unity tutorial on github Omicron Cave example unity scene Cave2

NLP Programming Tutorial 0 - Programming Basics Graham Neubig Nara Institute of Science and

CAVE2 Unity Tutorial CAVE2 unity tutorial on github Omicron Cave example unity scene Cave2

The Entity-Relationship Model ER Model - Part 2: Conversion to SQL By Michael Hahsler Based on

Entities The Entity-Relationship Model name ssn lot Employees Database Management Systems, R.

The Entity-Relationship Requirements analysis Model Conceptual design data model

Sustainable Development and Enterprise Value Clay Williams Manager, Governance Science Research

KDI EER: The Extended ER Model Fausto Giunchiglia and Mattia Fumagallli University of Trento

Database Design . CO19-320302 Databases &amp; Web Services (P. Baumann) 1 Core Database Design

CSE 232A Graduate Database Systems Arun Kumar Topic 5: Data Integration and Cleaning Slide

Named Entity Recognition Using BERT and ELMo Group 8 : Mikaela Guerrero Vikash Kumar Nitya

Database Design . CO19-320302 Databases & Web Services (P. Baumann) 1 Core Database Design