Lecture 24: Relation Extraction

SLIDE 1

Lecture 24: Relation Extraction

Kai-Wei Chang
CS @ University of Virginia
kw@kwchang.net
Course webpage: http://kwchang.net/teaching/NLP16

1 CS6501-NLP

SLIDE 2

Goal

- Acquire structured knowledge from text

SLIDE 3

Information extraction

- Entity recognition
  - Identify named entities: people, organizations, locations, times, dates, etc.
  - Or genes, proteins, diseases, etc.
- Relation extraction
  - Located in, employed by, married to

SLIDE 4

Example

SLIDE 5

Why relation extraction?

- Create structured knowledge bases
- Augment structured knowledge bases
- Support question answering
- The first step for event extraction and storyline extraction
- …

SLIDE 6

Relation types (closed domain)

- 17 relations from Automated Content Extraction (ACE)

Credit: Dan Jurafsky

SLIDE 7

Relation types (closed domain)

- UMLS: Unified Medical Language System
- 134 entity types, 54 relations

SLIDE 8

Relation types (open domain)

- Freebase: thousands of relations, millions of entities

SLIDE 9

Wikipedia Infobox

SLIDE 10

|undergrad = 15,669<ref name=facts/>
|postgrad = 6,316<ref name=facts/>
|city = [[Charlottesville, Virginia|Charlottesville]]
|state = [[Virginia]]
|country = U.S.
|campus = [[Charlottesville, Virginia metropolitan area|Small city]]<br />{{convert|1682|acre|km2}}<br />[[World Heritage Site]]
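
Infobox markup like the above is already close to structured knowledge. As a rough sketch (not a full wikitext parser), the simpler fields can be turned into (subject, attribute, value) triples. The subject name, the function names, and the cleaning rules here are assumptions for illustration; the slide does not name the page the infobox comes from.

```python
import re

# Infobox fields copied from the slide; only the simple ones are included.
INFOBOX = """\
|undergrad = 15,669<ref name=facts/>
|postgrad = 6,316<ref name=facts/>
|city = [[Charlottesville, Virginia|Charlottesville]]
|state = [[Virginia]]
|country = U.S.
"""

def clean_value(raw):
    """Strip self-closing <ref .../> tags and reduce [[target|label]] links to the label."""
    value = re.sub(r"<ref[^>]*/>", "", raw)
    value = re.sub(r"\[\[(?:[^\]|]*\|)?([^\]]*)\]\]", r"\1", value)
    return value.strip()

def infobox_to_triples(subject, text):
    """Turn each '|key = value' line into a (subject, key, value) triple."""
    triples = []
    for line in text.splitlines():
        match = re.match(r"\|\s*(\w+)\s*=\s*(.*)", line)
        if match:
            triples.append((subject, match.group(1), clean_value(match.group(2))))
    return triples

# The subject is an assumed page title, supplied by the caller.
triples = infobox_to_triples("University of Virginia", INFOBOX)
for triple in triples:
    print(triple)
```

Fields that use templates such as {{convert|...}} or <br /> breaks would need extra handling that this sketch leaves out.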

SLIDE 11

How to build relation extractors (closed domain)

- Hand-written patterns
- Supervised machine learning
  - Take each sentence as input
  - Identify named entities (mentions)
  - Perform multi-class classification
    - Plus constraints or features to model correlations
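
The supervised steps above can be sketched as candidate generation plus feature extraction. This is a minimal illustration, assuming an upstream entity recognizer has already produced the mentions; the sentence, the entity types, and the feature set are all hypothetical, and the multi-class classifier that would consume the features is left out.

```python
from itertools import permutations

# Hypothetical input: a tokenized sentence plus mentions given as
# ((start, end) token span, entity type), assumed to come from an NER step.
tokens = ["Alice", "works", "for", "Acme", "in", "Boston"]
mentions = [((0, 1), "PER"), ((3, 4), "ORG"), ((5, 6), "LOC")]

def pair_features(tokens, m1, m2):
    """Features for one ordered mention pair; the feature set is illustrative."""
    (s1, e1), t1 = m1
    (s2, e2), t2 = m2
    # Order the two mentions by position to find the words between them.
    left, right = (m1, m2) if s1 < s2 else (m2, m1)
    between = tokens[left[0][1]:right[0][0]]
    return {
        "arg1_type": t1,
        "arg2_type": t2,
        "arg1_first": s1 < s2,
        "words_between": " ".join(between),
        "distance": len(between),
    }

def candidate_pairs(tokens, mentions):
    """Every ordered pair of distinct mentions becomes one classification instance."""
    for m1, m2 in permutations(mentions, 2):
        yield (m1, m2), pair_features(tokens, m1, m2)

candidates = list(candidate_pairs(tokens, mentions))
for (m1, m2), features in candidates:
    print(m1[1], m2[1], features)
```

Each feature dictionary would be labeled with one relation type (or a "no relation" class) and passed to a multi-class classifier of your choice.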

SLIDE 13

How to build relation extractors (open domain)

- Bootstrap learning [Brin 98, …]
  - Use seed instances to extract a set of relational patterns
- Unsupervised learning
  - Cluster sentences based on relational patterns
- Distant supervision
  - Distant supervision for relation extraction without labeled data [Mintz 09+]
  - Combine the above approaches
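
The labeling heuristic behind distant supervision can be sketched as follows: if a knowledge base lists a relation between two entities, any sentence that mentions both is treated as a (noisy) training example for that relation. The knowledge base, the sentences, and the function name below are made up for illustration, and naive substring matching stands in for a real entity recognizer.

```python
# Toy knowledge base and corpus; all entries are hypothetical.
KB = {
    ("Alice", "Acme"): "employed_by",
    ("Acme", "Boston"): "located_in",
}

SENTENCES = [
    "Alice joined Acme as an engineer .",
    "Acme opened a new office in Boston .",
    "Alice visited Boston last week .",
]

def distant_labels(kb, sentences):
    """Label a sentence with a KB relation when it mentions both arguments.

    Substring matching is a simplification; a real system would align
    recognized entity mentions with knowledge-base entries.
    """
    labeled = []
    for sentence in sentences:
        for (arg1, arg2), relation in kb.items():
            if arg1 in sentence and arg2 in sentence:
                labeled.append((sentence, arg1, arg2, relation))
    return labeled

training_data = distant_labels(KB, SENTENCES)
for example in training_data:
    print(example)
```

The third sentence gets no label because the toy knowledge base has no relation between its two entities. The labels produced this way are noisy, since a sentence can mention both entities without expressing the relation.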

SLIDE 14

- A follow-up approach: Relation Extraction with Matrix Factorization and Universal Schemas [Riedel 13+]
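
As a rough illustration of the data layout in a universal-schema setup, the sketch below builds a small binary matrix whose rows are entity pairs and whose columns mix textual patterns with knowledge-base relations. Only the matrix construction is shown; the factorization model that would score the unobserved cells is omitted. All entity pairs, patterns, and relation names are made up.

```python
# Toy observations: (entity pair, column). Columns prefixed "pattern:" are
# surface patterns from text, columns prefixed "kb:" are knowledge-base
# relations. Every entry is hypothetical.
observations = [
    (("Alice", "Acme"), "pattern:X works for Y"),
    (("Alice", "Acme"), "kb:employed_by"),
    (("Bob", "Initech"), "pattern:X works for Y"),
]

rows = sorted({pair for pair, _ in observations})
cols = sorted({col for _, col in observations})
row_index = {pair: i for i, pair in enumerate(rows)}
col_index = {col: j for j, col in enumerate(cols)}

# 1 marks an observed (entity pair, relation) cell, 0 an unobserved one.
matrix = [[0] * len(cols) for _ in rows]
for pair, col in observations:
    matrix[row_index[pair]][col_index[col]] = 1

# Cells a factorization model would be asked to score.
unobserved = [
    (rows[i], cols[j])
    for i in range(len(rows))
    for j in range(len(cols))
    if matrix[i][j] == 0
]
print(matrix)
print(unobserved)
```

In this toy matrix the only unobserved cell pairs ("Bob", "Initech") with kb:employed_by, which is the kind of cell a learned low-rank model could rank highly because that row shares a textual pattern with a row where the relation is observed.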