CORE: Context-Aware Open Relation Extraction with Factorization Machines
Fabio Petroni, Luciano Del Corro, Rainer Gemulla

Open relation extraction
- Open relation extraction is the task of extracting new facts for a potentially unbounded set of relations from various sources, such as natural language text and knowledge bases.

EMNLP 2015. September 17-21, 2015. Lisbon, Portugal.

Input data: facts from natural language text
- An open information extractor extracts all facts in text.
- Example sentence: "Enrico Fermi was a professor in theoretical physics at Sapienza University of Rome."
- Extracted surface fact: "professor at"(Fermi,Sapienza), where "professor at" is the surface relation (rel) and (Fermi,Sapienza) is the tuple (sub, obj).

Input data: facts from knowledge bases
- KB fact: employee(Fermi,Sapienza), where employee is a KB relation.
- Surface fact from natural language text: "professor at"(Fermi,Sapienza).
- Entity mentions in the text (Fermi, Sapienza) are linked to KB entities (entity link), e.g., with a string match heuristic.
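The slide names a string match heuristic for entity linking without giving details. A minimal sketch of one such heuristic follows; the function name `link_entity`, the normalization, and the KB entity names are assumptions made for illustration, not the authors' procedure.

```python
def link_entity(mention, kb_entities):
    """Link a textual mention to a KB entity by simple string matching.

    Hypothetical heuristic: case-insensitive exact match, or the mention
    equals one whitespace-separated token of the KB entity name.
    Returns the first matching KB entity, or None if nothing matches.
    """
    norm = mention.strip().lower()
    for entity in kb_entities:
        name = entity.strip().lower()
        if norm == name or norm in name.split():
            return entity
    return None


# Illustrative KB entity names (assumed, not taken from the slides).
kb = ["Enrico Fermi", "Sapienza University of Rome"]
print(link_entity("Fermi", kb))
print(link_entity("Sapienza", kb))
```

A heuristic this simple returns the first match, so ambiguous mentions (for example a mention that is a token of several KB names) would need an extra disambiguation step.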

Relation extraction techniques taxonomy
- Relation extraction is either closed (in-KB) or open (in-KB and out of-KB).
- Distant supervision: set of predefined relations.
- Relation clustering: "black and white" approach.
- Latent factors models:
  - tensor completion: RESCAL (Nickel et al., 2011), PITF (Drumond et al., 2012); limited scalability with the number of relations; large prediction space
  - matrix completion: NFE (Riedel et al., 2013), CORE (Petroni et al., 2015); restricted prediction space

Matrix completion for open relation extraction
Tuples x relations matrix of observed facts. "born in", "professor at" and "mayor of" are surface relations; employee is a KB relation.

                     born in   professor at   mayor of   employee
  (Caesar,Rome)         1
  (Fermi,Rome)          1
  (Fermi,Sapienza)                  1                        1
  (de Blasio,NY)                                  1

Matrix completion for open relation extraction
The same tuples x relations matrix, with the unobserved cells (?) to be completed.

                     born in   professor at   mayor of   employee
  (Caesar,Rome)         1           ?             ?          ?
  (Fermi,Rome)          1           ?             ?          ?
  (Fermi,Sapienza)      ?           1             ?          1
  (de Blasio,NY)        ?           ?             1          ?

Matrix factorization
- Learn latent semantic representations of tuples and relations: each tuple and each relation gets a latent factor vector, and a fact is scored by the dot product of the two.
- Leverage latent representations to predict new facts. Example with two latent factors:

                            (Fermi,Sapienza)   professor at
    related with science          0.8              0.9
    related with sport           -0.5             -0.3

- In real applications latent factors are uninterpretable.
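As a worked instance of the dot product on this slide, the sketch below scores "professor at"(Fermi,Sapienza) from the two example vectors. Reading 0.8 and -0.5 as the tuple vector and 0.9 and -0.3 as the relation vector is an assumption about the slide layout; the dot product is the same either way.

```python
def dot(u, v):
    """Dot product of two equally long vectors."""
    return sum(a * b for a, b in zip(u, v))


# Latent factors ordered as (science, sport); the assignment of the
# numbers to tuple and relation is assumed from the slide layout.
tuple_vec = [0.8, -0.5]      # (Fermi,Sapienza)
relation_vec = [0.9, -0.3]   # "professor at"

score = dot(tuple_vec, relation_vec)   # 0.8*0.9 + (-0.5)*(-0.3)
print(round(score, 2))  # 0.87
```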

Matrix factorization
- CORE integrates contextual information into such models to improve prediction performance.

Contextual information
- Example sentence: "Tom Peloso joined Modest Mouse to record their fifth studio album."
- Surface relation: "join"(Peloso,Modest Mouse), an unspecific relation.
- Contextual information:
  - entity types: person (Peloso), organization (Modest Mouse); a named entity recognizer labels each entity with a coarse-grained type
  - article topic
  - words: record, album

Contextual information
- How to incorporate contextual information within the model?

CORE - latent representations of variables
- Associates a latent representation $f_v$ (a latent factor vector) with each variable $v \in V$:
  - relation: join
  - tuple: (Peloso,Modest Mouse)
  - entities: Peloso, Modest Mouse
  - context: person, organization, record, album

CORE - modeling facts
- Models the input data in terms of a matrix in which each row corresponds to a fact x and each column to a variable v.
- Groups columns according to the type of the variables.
- In each row the values of each column group sum up to unity.

Columns, by group:
  relations:     "born in"(x,y), "professor at"(x,y), employee(x,y)
  tuples:        Caesar,Rome; Fermi,Rome; Fermi,Sapienza
  entities:      Caesar, Rome, Fermi, Sapienza
  tuple types:   person,organization; person,location
  tuple topics:  physics, history

The relation columns cover surface relations ("born in", "professor at") and a KB relation (employee); the tuple types and tuple topics columns carry the context.

        relations | tuples | entities     | tuple types | tuple topics
  x1    1 0 0     | 1 0 0  | 0.5 0.5 0 0  | 0 1         | 0 1
  x2    0 1 0     | 0 0 1  | 0 0 0.5 0.5  | 1 0         | 1 0
  …
  x3    1 0 0     | 0 1 0  | 0 0.5 0.5 0  | 0 1         | 0.6 0.4
  x4    0 0 1     | 0 0 1  | 0 0 0.5 0.5  | 1 0         | 1 0
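The row encoding described on this slide can be sketched as a sparse mapping from (group, variable) to value. The function names `encode_fact` and `group_sums` and the dictionary layout are illustrative assumptions; only the values (1 for the relation and tuple, 0.5 per entity, normalized context weights) come from the example matrix.

```python
def encode_fact(relation, subj, obj, tuple_type, topics):
    """Encode one fact as a sparse row: (group, variable) -> value.

    Each column group is filled so that its values sum to 1.
    `topics` maps a topic name to its normalized weight.
    """
    row = {
        ("relation", relation): 1.0,
        ("tuple", (subj, obj)): 1.0,
        ("type", tuple_type): 1.0,
    }
    for entity in (subj, obj):
        key = ("entity", entity)
        row[key] = row.get(key, 0.0) + 0.5   # two entities, 0.5 each
    for topic, weight in topics.items():
        row[("topic", topic)] = weight
    return row


def group_sums(row):
    """Sum the values of each column group in a row."""
    sums = {}
    for (group, _), value in row.items():
        sums[group] = sums.get(group, 0.0) + value
    return sums


# Row x3 of the example: "born in"(Fermi,Rome).
x3 = encode_fact("born in", "Fermi", "Rome", "person,location",
                 {"physics": 0.6, "history": 0.4})
print(group_sums(x3))
```

A sparse dictionary is used here because most columns are zero in every row; a dense matrix row would carry the same information.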

CORE - modeling context
- Aggregates and normalizes contextual information by tuple:
  - a fact can be observed multiple times with different context
  - there is no context for new facts (never observed in input)
- This approach allows us to provide comprehensive contextual information for both observed and unobserved facts.
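A minimal sketch of aggregating and normalizing context by tuple follows. It assumes that normalization means dividing per-tuple counts by their total, and the observation counts are hypothetical, chosen only so that the result matches the 0.6 / 0.4 topic weights in the example matrix.

```python
from collections import Counter, defaultdict


def aggregate_context(observations):
    """Aggregate context by tuple and normalize it to sum to 1.

    `observations` is an iterable of (tuple, context_value) pairs, one
    per occurrence of any fact about that tuple in the input.
    """
    counts = defaultdict(Counter)
    for tup, value in observations:
        counts[tup][value] += 1
    normalized = {}
    for tup, counter in counts.items():
        total = sum(counter.values())
        normalized[tup] = {v: n / total for v, n in counter.items()}
    return normalized


# Hypothetical input: (Fermi,Rome) seen in three physics articles and
# two history articles.
obs = [(("Fermi", "Rome"), "physics")] * 3 + [(("Fermi", "Rome"), "history")] * 2
context = aggregate_context(obs)
print(context[("Fermi", "Rome")])
```

Because the context is keyed by tuple rather than by fact, an unobserved fact such as a new relation for (Fermi,Rome) can reuse the same aggregated context.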

CORE - factorization model
- Uses factorization machines as underlying framework.
- Associates a score $s(x)$ with each fact $x$:

  $s(x) = \sum_{v_1 \in V} \sum_{v_2 \in V \setminus \{v_1\}} x_{v_1} x_{v_2} f_{v_1}^T f_{v_2}$

- Weighted pairwise interactions of latent factor vectors.
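The score on this slide can be computed directly from a sparse row. The sketch below follows the double sum literally, so every unordered pair of variables is counted twice (once in each order); the variable names and the two-dimensional vectors are assumptions for illustration.

```python
def dot(u, v):
    """Dot product of two equally long vectors."""
    return sum(a * b for a, b in zip(u, v))


def score(x, f):
    """s(x): sum over ordered pairs v1 != v2 of x[v1] * x[v2] * <f[v1], f[v2]>.

    `x` maps each active variable to its value (zero entries omitted);
    `f` maps each variable to its latent factor vector.
    """
    total = 0.0
    for v1, x1 in x.items():
        for v2, x2 in x.items():
            if v1 != v2:
                total += x1 * x2 * dot(f[v1], f[v2])
    return total


# Toy fact with only a relation and a tuple active (illustrative vectors).
f = {"professor at": [0.9, -0.3], "Fermi,Sapienza": [0.8, -0.5]}
x = {"professor at": 1.0, "Fermi,Sapienza": 1.0}
print(round(score(x, f), 2))  # 1.74
```

This direct form is quadratic in the number of active variables, which is small here because each row has only a handful of non-zero entries.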

CORE - prediction
- Goal: produce a ranked list of tuples for each relation.
- Rank reflects the likelihood that the corresponding fact is true.
- To generate this ranked list:
  - fix a relation r
  - retrieve all tuples t such that the fact r(t) is not observed
  - add tuple context
  - rank unobserved facts by their scores
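The ranking steps above can be sketched as follows. The function `rank_tuples` and its `score_fact` callback are hypothetical; `score_fact` is assumed to attach the tuple's context and return the model score, and the scores used in the example are made-up numbers.

```python
def rank_tuples(relation, tuples, observed, score_fact):
    """Rank the tuples whose fact relation(tuple) is not observed.

    `observed` is a set of (relation, tuple) pairs seen in the input.
    `score_fact(relation, tuple)` is assumed to add the tuple context
    and return the model score of the resulting fact.
    """
    candidates = [t for t in tuples if (relation, t) not in observed]
    return sorted(candidates, key=lambda t: score_fact(relation, t), reverse=True)


# Toy scores standing in for a trained model (illustrative values).
toy_scores = {("Fermi", "Sapienza"): 0.9,
              ("Caesar", "Rome"): 0.1,
              ("de Blasio", "NY"): 0.4}
observed = {("professor at", ("Fermi", "Sapienza"))}

ranking = rank_tuples("professor at", list(toy_scores), observed,
                      lambda r, t: toy_scores[t])
print(ranking)
```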

CORE - parameter estimation
- Parameters: $\Theta = \{ b_v, f_v \mid v \in V \}$
- All our observations are positive; there is no negative training data.
- Bayesian personalized ranking, open-world assumption. Example:
  - observed fact $x$: "professor at"(Fermi,Sapienza), with entities Fermi and Sapienza and tuple context person, organization, physics
  - sampled negative evidence $x^-$: "professor at"(Caesar,Rome), with entities Caesar and Rome and tuple context person, location, history
- Pairwise approach: $x$ is more likely to be true than $x^-$:

  maximize $\sum_x f(s(x) - s(x^-))$

- Stochastic gradient ascent: $\Theta \leftarrow \Theta + \eta \nabla_\Theta (\,\cdot\,)$
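One stochastic gradient ascent step for the pairwise objective can be sketched as below. Several things are assumptions: f is taken to be the log-sigmoid, as is usual in Bayesian personalized ranking; the score is the pairwise-interaction score over ordered pairs of variables; the bias terms $b_v$ are left out; and the learning rate and vectors are illustrative.

```python
import math


def score(x, f):
    """Pairwise-interaction score over ordered pairs of active variables."""
    return sum(x1 * x2 * sum(a * b for a, b in zip(f[v1], f[v2]))
               for v1, x1 in x.items() for v2, x2 in x.items() if v1 != v2)


def grad(x, f, k):
    """Gradient of score(x, f) with respect to f[v], for each active v."""
    g = {}
    for v, xv in x.items():
        acc = [0.0] * k
        for u, xu in x.items():
            if u != v:
                for i in range(k):
                    acc[i] += xu * f[u][i]
        g[v] = [2.0 * xv * a for a in acc]   # v occurs as v1 and as v2
    return g


def bpr_step(x_pos, x_neg, f, eta=0.05):
    """One ascent step on ln sigmoid(s(x_pos) - s(x_neg)); updates f in place."""
    k = len(next(iter(f.values())))
    delta = score(x_pos, f) - score(x_neg, f)
    weight = 1.0 / (1.0 + math.exp(delta))   # derivative of ln sigmoid at delta
    g_pos, g_neg = grad(x_pos, f, k), grad(x_neg, f, k)
    for v in set(g_pos) | set(g_neg):
        gp = g_pos.get(v, [0.0] * k)
        gn = g_neg.get(v, [0.0] * k)
        f[v] = [fv + eta * weight * (p - n) for fv, p, n in zip(f[v], gp, gn)]


# Observed fact versus sampled negative evidence (illustrative vectors).
f = {"professor at": [0.1, 0.1],
     "Fermi,Sapienza": [0.1, -0.1],
     "Caesar,Rome": [0.1, -0.1]}
x_pos = {"professor at": 1.0, "Fermi,Sapienza": 1.0}
x_neg = {"professor at": 1.0, "Caesar,Rome": 1.0}
for _ in range(20):
    bpr_step(x_pos, x_neg, f)
print(score(x_pos, f) - score(x_neg, f))
```

In this toy run the two facts start with equal scores, and each step pushes the observed fact above the sampled negative one.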

Experiments - dataset
- 440k facts extracted from corpus; 15k facts from …; entity mentions linked using string matching.
- Contextual information:
  - article metadata (m): news desk (e.g., foreign desk), descriptors (e.g., finances), online section (e.g., sports), section (e.g., a, d), publication year
  - entity type (t): person, organization, location, miscellaneous
  - bag-of-word (w): sentences where the fact has been extracted
- The letters m, t and w indicate the contextual information considered.

Experiments - methodology
- We consider (to keep experiments feasible):
  - a subsample of 10k tuples
  - 19 Freebase relations
  - 10 surface relations
- For each relation and method:
  - we rank the subsampled tuples
  - we consider the top-100 predictions and label them manually
- Evaluation metrics:
  - number of true facts
  - MAP (quality of the ranking)
- Methods:
  - PITF, tensor factorization method
  - NFE, matrix completion method (context-agnostic)
  - CORE, uses relations, tuples and entities as variables
  - CORE +m, +t, +w, +mt, +mtw
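The two metrics can be computed from manually labeled top-k lists as sketched below. The slides do not say how average precision is normalized; this sketch assumes the common variant that divides by the number of true facts found in the labeled list, and the labels in the example are made up.

```python
def average_precision(labels):
    """Average precision of a ranked list of True/False labels.

    Assumed normalization: divide by the number of true facts in the list.
    """
    hits, total = 0, 0.0
    for rank, is_true in enumerate(labels, start=1):
        if is_true:
            hits += 1
            total += hits / rank   # precision at this rank
    return total / hits if hits else 0.0


def mean_average_precision(labels_per_relation):
    """MAP: mean of the average precision over all relations."""
    aps = [average_precision(l) for l in labels_per_relation.values()]
    return sum(aps) / len(aps)


# Hypothetical manual labels for the top predictions of two relations.
labeled = {"professor at": [True, False, True],
           "born in": [False, True]}
true_facts = {r: sum(l) for r, l in labeled.items()}   # number of true facts
print(true_facts)
print(mean_average_precision(labeled))
```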
