Combining Distant and Partial Supervision for Relation Extraction
SLIDE 1

Combining Distant and Partial Supervision for Relation Extraction

Gabor Angeli, Julie Tibshirani, Jean Y. Wu, Christopher D. Manning

Stanford University

October 28, 2014

Angeli, Tibshirani, Wu, Manning (Stanford) Combining Distant and Partial Supervision ... October 28, 2014 1 / 19

SLIDE 2

Motivation: Knowledge Base Completion

Unstructured Text → Structured Knowledge Base

SLIDE 3

Motivation: Question Answering

SLIDE 11

Relation Extraction

Input: sentences containing an (entity, slot value) pair. Output: the relation between the entity and the slot value.

Consider two approaches:

Supervised: trivial as a supervised classifier. Training data: {(sentence, relation)}. But... this training data is expensive to produce.

Distantly supervised: artificially produce “supervised” data. Training data: {(entity, relation, slot value)}. But... this training data is much noisier.
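The distantly supervised setup can be sketched in a few lines: any sentence mentioning both the entity and the slot value of a KB triple is (noisily) labeled with that triple's relation. The KB triples and sentences below are toy examples for illustration, not the actual training data:

```python
# Distant supervision sketch: match KB triples against sentences to
# produce noisy (sentence, relation) training pairs.
KB = {
    ("Barack Obama", "United States"): "EmployedBy",
    ("Barack Obama", "Hawaii"): "BornIn",
}

def distant_labels(sentences, kb):
    """Label any sentence containing both the entity and the slot value
    with the KB relation -- noisy, since the sentence may not actually
    express that relation."""
    data = []
    for sent in sentences:
        for (entity, slot_value), relation in kb.items():
            if entity in sent and slot_value in sent:
                data.append((sent, relation))
    return data

sentences = [
    "Barack Obama is the 44th president of the United States",
    "Barack Obama visited Hawaii last week",  # wrongly labeled BornIn: the noise problem
]
pairs = distant_labels(sentences, KB)
```

The second pair illustrates exactly why this data is noisy: the sentence mentions both arguments of a true triple without expressing the relation.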

SLIDE 16

Contribution: Combine Benefits of Both

Adding carefully selected supervision improves distantly supervised relation extraction.

What is “carefully selected”? We propose a new active learning criterion and evaluate a number of questions:

Is the proposed criterion better than other methods?
Where is the supervision helping?
How far can we get with a supervised classifier?

SLIDE 17

Distant Supervision

(Barack Obama, EmployedBy, United States)

SLIDE 18

Multiple-Instance Multiple-Label (MIML) Learning

(Barack Obama, EmployedBy, United States)

SLIDE 19

Distant Supervision

[Figure: plain distant supervision — each sentence x independently predicts a label y. Example: x = “Barack Obama is the 44th and current president of the United States”, y = EmployedBy.]

SLIDE 21

Multiple-Instance

[Figure: multiple-instance model — one entity-pair label y over latent per-mention relations z1, z2, z3, one per sentence x1, x2, x3.]

SLIDE 23

Multiple-Instance Multiple-Label (MIML-RE)

[Figure: MIML-RE — multiple entity-pair labels y1 ... yn over latent per-mention relations z1, z2, z3, one per sentence x1, x2, x3.]

SLIDE 28

Active Learning

Old problem: supervision is expensive, but very useful. Old solution: active learning!

Select a subset of the latent z to annotate, and fix these labels during training.

Bonus: this creates a supervised training set; we initialize from a supervised classifier trained on it.

Some statistics: 1,208,524 latent z that we could annotate; $0.13 per annotation; $160,000 to annotate everything.

New spin: we have to get it right the first time.

SLIDE 35

Example Selection Criteria

1. Train k MIML-RE models on k subsets of the data.

[Figure: a committee of k MIML-RE models.]

2. For each latent z, each trained model c predicts a multinomial P_c(z).

3. Calculate the Jensen-Shannon disagreement (1/k) Σ_{c=1..k} KL(P_c(z) || P_mean(z)), where P_mean is the mean of the k distributions.

4. This gives a measure of disagreement for each z.

Three selection criteria:

Sample uniformly (uniform).
Take the z with the highest disagreement (highJS).
Sample z weighted by disagreement (sampleJS).
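Steps 2-4 can be sketched as follows; the committee distributions below are toy values, whereas the real P_c(z) come from the trained MIML-RE models:

```python
import math

def kl(p, q):
    """KL(p || q) for discrete distributions given as aligned lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def js_disagreement(committee):
    """Generalized Jensen-Shannon divergence of k distributions:
    (1/k) * sum_c KL(P_c || P_mean). Near zero when all committee
    members agree; higher when they disagree on this latent z."""
    k = len(committee)
    mean = [sum(col) / k for col in zip(*committee)]
    return sum(kl(p, mean) for p in committee) / k

# Toy committee predictions P_c(z) over (EmployedBy, BornIn, no_relation):
agree = [[0.9, 0.05, 0.05]] * 3
split = [[0.9, 0.05, 0.05], [0.05, 0.9, 0.05], [0.05, 0.05, 0.9]]
# split has far higher disagreement than agree, so it is the better
# annotation candidate under the highJS / sampleJS criteria.
```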

SLIDE 36

Example Selection Criteria

[Figure: histogram of JS disagreement over examples — most of the mass sits at low disagreement (mostly easy examples), while the high-disagreement tail contains potentially non-representative examples.]

SLIDE 42

Example Selection Criteria

Committee Member Judgments

Sentence                      Member A     Member B  Member C
Obama was born in Hawaii      born         born      no relation
Obama grew up in Hawaii       born         lived in  born
Obama Bear visits Hawaii      no relation  born      employee of
President Obama ...           title        title     title
Obama employed president ...  employee of  title     employee of

Uniform: often annotates easy sentences.
High JS (disagreement): more likely to annotate “rare” sentences.
Sample JS (disagreement): a mix of hard and representative sentences.
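The three criteria can be sketched as below, assuming per-example disagreement scores like those produced by the committee (the scores here are toy values):

```python
import random

def select(scores, n, criterion, seed=0):
    """Pick n example indices to annotate, given per-example JS disagreement.
    uniform:  ignore scores, sample uniformly without replacement.
    highJS:   take the top-n by disagreement.
    sampleJS: sample without replacement, weighted by disagreement."""
    rng = random.Random(seed)
    idx = list(range(len(scores)))
    if criterion == "uniform":
        return rng.sample(idx, n)
    if criterion == "highJS":
        return sorted(idx, key=lambda i: scores[i], reverse=True)[:n]
    if criterion == "sampleJS":
        chosen, pool, weights = [], idx[:], list(scores)
        for _ in range(n):
            i = rng.choices(range(len(pool)), weights=weights)[0]
            chosen.append(pool.pop(i))
            weights.pop(i)
        return chosen
    raise ValueError(criterion)

scores = [0.01, 0.9, 0.02, 0.7, 0.05]  # toy JS disagreement per latent z
```

sampleJS trades off the two failure modes above: it favors hard examples like highJS, but the randomness keeps the batch representative like uniform.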

SLIDE 45

Experiments

Recall our questions:

Is the proposed criterion better than other methods?
Where is the supervision helping?
How far can we get with a supervised classifier?

Two experimental setups:

The slot filling evaluation of Surdeanu et al. (2012).
Stanford’s 2013 TAC-KBP slot filling system.

Bonus: a 4.4 F1 improvement in the 2014 TAC-KBP competition.

SLIDE 48

Old News: MIML-RE Works Well

Slot filling evaluation of Surdeanu et al. (2012).

[Figure: precision-recall curves comparing MIML-RE (our baseline), Surdeanu et al. (2012), and plain distant supervision; MIML-RE performs best.]

SLIDE 52

Active learning is important; SampleJS performs well.

Slot filling evaluation of Surdeanu et al. (2012).

[Figure: precision-recall curves for Sample JS, High JS, Uniform, and the MIML-RE baseline; the active learning criteria improve over the baseline, with Sample JS performing best.]

SLIDE 61

SampleJS performs best on the TAC-KBP challenge.

TAC-KBP 2013 Slot Filling Challenge: an end-to-end task, including IR + consistency.
Precision: facts LDC evaluators judged as correct.
Recall: facts other teams (including LDC annotators) also found.

System                    P     R     F1
No active learning        38.0  30.5  33.8
Sample uniformly          34.4  35.0  34.7
Highest JS disagreement   46.2  30.8  37.0
Sample JS disagreement    39.4  36.2  37.7

2014 TAC-KBP Slot Filling Challenge: 27.6 → 32.0 F1.

SLIDE 66

Good initialization is more important than constraining EM.

Is it the initialization or the fixing of the latent z’s during EM that helps? What if we initialize with distant supervision instead?

[Figure: precision-recall curves for Sample JS, High JS, Uniform, and the MIML-RE baseline.]

Hypothesis: Supervision not only smooths the objective but provides better initialization for the non-convex objective.

SLIDE 70

A supervised classifier performs surprisingly well.

TAC-KBP 2013 Slot Filling Challenge: an end-to-end task, including IR + consistency.
Precision: facts LDC evaluators judged as correct.
Recall: facts other teams (including LDC annotators) also found.

System                      P     R     F1
MIML-RE (baseline)          38.0  30.5  33.8
Supervised from SampleJS    33.5  35.0  34.2
MIML-RE, supervised init.   35.1  35.6  35.5
MIML-RE, SampleJS           39.4  36.2  37.7

A bit circular: we need MIML-RE to get the supervised examples.

SLIDE 74

A Case for Supervised Classifiers

Stanford’s KBP system (artist’s rendition) vs. a supervised classifier (150 lines + a featurizer).

Annotating examples: $1,330. For comparison, a flight to Qatar: $1,027; an Apple 27” screen: $999.
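That supervised classifier is essentially an off-the-shelf linear model plus a featurizer. A minimal featurizer sketch follows; the feature templates are illustrative assumptions, not Stanford’s actual feature set:

```python
def featurize(tokens, entity_span, slot_span):
    """Extract simple relation-extraction features for one sentence.
    Spans are (start, end) token indices; features are strings fed to
    a sparse linear classifier such as logistic regression."""
    e0, e1 = entity_span
    s0, s1 = slot_span
    between = tokens[min(e1, s1):max(e0, s0)]  # tokens between the two mentions
    feats = ["between=" + "_".join(between)]
    feats += ["between_word=" + w for w in between]
    feats.append("entity_first" if e0 < s0 else "slot_first")
    feats.append("distance=" + str(max(e0, s0) - min(e1, s1)))
    return feats

toks = "Barack Obama was born in Hawaii".split()
fs = featurize(toks, (0, 2), (5, 6))
```

With hashed or dictionary-vectorized features like these, the whole classifier fits comfortably in the “150 lines” the slide claims.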

SLIDE 77

Conclusions

Things you can use:

A new active learning criterion: sampling by disagreement among a committee of classifiers.
A corpus of supervised examples for TAC-KBP relations.
A 4.4 F1 improvement on 2014 KBP Slot Filling.

Things we’ve learned:

Example selection is very important for performance.
MIML-RE is sensitive to initialization.
Supervised classifiers can perform similarly to distantly supervised methods.

Thank you!

SLIDE 78

Comparison to Pershina et al. (ACL 2014)

Slot filling evaluation of Surdeanu et al. (2012).

[Figure: precision-recall curves comparing Sample JS, Pershina et al. (2014), and MIML-RE.]
