A Hybrid Neural Model for Type Classification of Entity Mentions - PowerPoint PPT Presentation


SLIDE 1

A Hybrid Neural Model for Type Classification of Entity Mentions

SLIDE 2

▪ Types group entities into categories
▪ Entity types are important for various NLP tasks

▪ Question answering
▪ Relation extraction
▪ Semantic role labeling
▪ …

▪ Our task

▪ predict an entity mention’s type

Motivation

SLIDE 3

▪ Input

▪ c_{-T} … c_{-1} (left context)   w_1 … w_n (mention)   c_1 … c_T (right context)

▪ Output

▪ Type

▪ [an initiative sponsored by][Bill & Melinda Gates Foundation][to fight HIV infection]

Type Classification of Entity Mentions

Organization

SLIDE 4

▪ Mention

▪ Bill & Melinda Gates Foundation (Organization)
▪ Bill, Melinda, Gates -> {Person Name}
▪ {Person Name} + Foundation -> Organization

▪ Context

▪ [The greater part of ][Gates][' population is in Marion County.] (Location)
▪ [Gates][ was a baseball player.] (Person)

SLIDE 5

▪ Named Entity Recognition

▪ Limited types

▪ Location, Person, Organization, Misc

▪ (e.g.) Question Answering

▪ Questions are classified into a richer set of answer types

▪ Named Entity Linking (1. link the mention to an entity in a knowledge base; 2. query its entity type)

▪ Performance drops for uncommon entities
▪ (e.g.) Question Answering

▪ Extracted answer candidate may not appear in knowledge base

▪ NEL is a harder problem than type classification

▪ Design rich features

▪ N-gram, morphological features, gazetteers, WordNet, ReVerb patterns, POS tags, dependency parsing results, etc.

Related Work

SLIDE 6

Architecture

Mention Model + Context Model (left) + Context Model (right) -> Decision Model (softmax classifier)

z = (1 / Σ_{k=1}^{C} e^{θ_k·x}) · [ e^{θ_1·x} ; e^{θ_2·x} ; ⋮ ; e^{θ_C·x} ]

(x: joint representation from the mention and context models; C: number of types)
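In code, the decision model is just a softmax over the joint representation. A minimal numpy sketch (shapes and names are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

def softmax_decision(x, theta):
    """Decision model: z_k = exp(theta_k . x) / sum_j exp(theta_j . x),
    with one weight row theta_k per type and x the concatenated
    mention + context representation."""
    scores = theta @ x
    scores = scores - scores.max()   # shift for numerical stability
    e = np.exp(scores)
    return e / e.sum()

rng = np.random.default_rng(0)
x = rng.normal(size=6)               # toy joint representation
theta = rng.normal(size=(22, 6))     # 22 top-level DBpedia types
z = softmax_decision(x, theta)       # probability distribution over types
```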

SLIDE 7

▪ Learn composition patterns for entity mention

▪ {Name} + Foundation / University -> (Organization)
▪ {Body Region} + {Disease} -> (Disease)

▪ Recurrent Neural Networks (Elman Networks)

▪ Use a global composition matrix to compute representation recurrently
▪ A natural way to learn composition patterns

RNN-based Mention Model

Bill & Melinda Gates Foundation  (w_1 … w_5)

v_2 = tanh(W [w_1; w_2] + b)
v_3 = tanh(W [v_2; w_3] + b)
…
v_5 = tanh(W [v_4; w_5] + b) -> Mention Representation
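The recurrence above can be sketched in a few lines of numpy; the dimensions and random initialization are assumptions for illustration, not the trained model:

```python
import numpy as np

def mention_representation(ws, W, b):
    """Elman-style recurrence over the mention words, left to right:
    v_2 = tanh(W [w_1; w_2] + b), then v_t = tanh(W [v_{t-1}; w_t] + b).
    The last hidden state is the mention representation."""
    v = ws[0]                        # start from the first word vector
    for w in ws[1:]:
        v = np.tanh(W @ np.concatenate([v, w]) + b)
    return v

dim = 4
rng = np.random.default_rng(1)
W = rng.normal(scale=0.1, size=(dim, 2 * dim))   # shared composition matrix
b = np.zeros(dim)
# toy vectors for "Bill", "&", "Melinda", "Gates", "Foundation"
words = [rng.normal(size=dim) for _ in range(5)]
rep = mention_representation(words, W, b)
```

Because the same matrix W composes every prefix, patterns like "{Name} + Foundation" can generalize to mentions never seen in training.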

SLIDE 8

▪ Use context to disambiguate

▪ [The greater part of ][Gates][' population is in Marion County.] (Location)
▪ [Gates][ was a baseball player.] (Person)

▪ Multilayer Perceptron (MLP)

▪ Location-aware, jointly trained

MLP-based Context Model

Left context "an initiative sponsored by" -> Concatenation Layer -> Hidden Layer -> Left Context Representation
Right context "to fight HIV infection" -> Concatenation Layer -> Hidden Layer -> Right Context Representation
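A minimal sketch of one context side, assuming a fixed window of T context word vectors (names and shapes are illustrative, not the authors' code):

```python
import numpy as np

def context_representation(context_vectors, W_c, b_c):
    """One context side: concatenate the T context word vectors in order
    (so word position is preserved, i.e. location-aware) and apply a
    single tanh hidden layer."""
    h = np.concatenate(context_vectors)   # concatenation layer
    return np.tanh(W_c @ h + b_c)         # hidden layer

dim, T, hidden = 4, 3, 5
rng = np.random.default_rng(2)
# toy vectors for a 3-word left window, e.g. "initiative sponsored by"
left_window = [rng.normal(size=dim) for _ in range(T)]
W_c = rng.normal(scale=0.1, size=(hidden, T * dim))
b_c = np.zeros(hidden)
left_rep = context_representation(left_window, W_c, b_c)
```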

SLIDE 9

▪ Objective function
▪ Back-propagation algorithm

▪ Back-propagate errors of softmax classifier to other layers

▪ Optimization

▪ Mini-batched AdaGrad

Model Training

minimize over θ:   − Σ_j Σ_k  u_k^(j) · log z_k^(j)   (cross-entropy loss)
                   + (λ_θ / 2) ‖θ‖₂²                  (L2 regularization)
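The objective and one AdaGrad update can be sketched with toy shapes (a sketch of the standard formulas, not the authors' training code):

```python
import numpy as np

def objective(Z, U, theta, lam):
    """Per-batch loss: -sum_j sum_k u_k^(j) log z_k^(j)  (cross entropy)
    plus (lam/2) * ||theta||^2  (L2 regularization)."""
    return -np.sum(U * np.log(Z)) + 0.5 * lam * np.sum(theta ** 2)

def adagrad_step(param, grad, hist, lr=0.5, eps=1e-8):
    """One AdaGrad update: each coordinate's step size shrinks with the
    square root of its accumulated squared gradients."""
    hist += grad ** 2
    param -= lr * grad / (np.sqrt(hist) + eps)
    return param, hist

# toy check: AdaGrad drives a single parameter toward the minimum of p^2/2
p, h = np.array([2.0]), np.zeros(1)
for _ in range(200):
    p, h = adagrad_step(p, p.copy(), h)   # gradient of p^2/2 is p
```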

SLIDE 10

Automatically Generating Training Data

Wikipedia Article
▪ anchor link -> Mention + left Context + right Context
▪ Wikipedia ID -> DBpedia Entity -> rdf:type (e.g., Organization)
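The pipeline can be sketched as a function from one anchored sentence to one training row; the type lookup below is a toy stand-in for the real Wikipedia-ID -> DBpedia -> rdf:type mapping:

```python
# Toy stand-in for the Wikipedia-ID -> DBpedia entity -> rdf:type lookup
DBPEDIA_TYPE = {
    "Bill_&_Melinda_Gates_Foundation": "Organization",
    "Marion_County,_Indiana": "Place",
}

def make_example(sentence, anchor_text, anchor_target):
    """Turn one anchored sentence into a (left context, mention,
    right context, type) training row; skip entities with no type."""
    entity_type = DBPEDIA_TYPE.get(anchor_target)
    if entity_type is None:
        return None
    left, _, right = sentence.partition(anchor_text)
    return (left.strip(), anchor_text, right.strip(), entity_type)

row = make_example(
    "an initiative sponsored by Bill & Melinda Gates Foundation to fight HIV infection",
    "Bill & Melinda Gates Foundation",
    "Bill_&_Melinda_Gates_Foundation",
)
```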

SLIDE 11

Automatically Generating Training Data

▪ DBpedia ontology

▪ 22 top-level types

▪ Wiki-22

▪ #Train: 2 million
▪ #Dev: 0.1 million
▪ #Test: 0.28 million

SLIDE 12

▪ micro-F1 / macro-F1 score
▪ Baseline methods

▪ Support Vector Machine (SVM)
▪ Multinomial Naive Bayes (MNB)
▪ Sum word vectors (ADD)

▪ Use a softmax classifier

▪ *-mention

▪ Only use mention

▪ *-context

▪ Only use context

▪ *-joint

▪ Use both mention and context

Evaluation on Wiki-22
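Micro- and macro-F1 for single-label predictions can be computed as follows (standard definitions, sketched):

```python
from collections import Counter

def micro_macro_f1(gold, pred, types):
    """Micro-F1 pools true positives and errors over all types;
    macro-F1 averages the per-type F1 scores."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1
            fn[g] += 1

    def f1(t, p, n):
        prec = t / (t + p) if t + p else 0.0
        rec = t / (t + n) if t + n else 0.0
        return 2 * prec * rec / (prec + rec) if prec + rec else 0.0

    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    macro = sum(f1(tp[t], fp[t], fn[t]) for t in types) / len(types)
    return micro, macro

gold = ["Org", "Org", "Person", "Place"]
pred = ["Org", "Person", "Person", "Place"]
micro, macro = micro_macro_f1(gold, pred, ["Org", "Person", "Place"])
```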

SLIDE 13

▪ HYENA [Yosef et al., 2012]

▪ Support Vector Machine
▪ unigrams, bigrams, and trigrams of mentions, surrounding sentences, mention paragraphs, part-of-speech tags of context words, gazetteer dictionary

▪ FIGER [Ling and Weld, 2012]

▪ Perceptron
▪ unigrams, word shapes, part-of-speech tags, length, Brown clusters, head words, dependency structures, ReVerb patterns

Comparison with Previous Systems

SLIDE 14

▪ Evaluate on unseen mentions (length > 2)

▪ Mentions which do not appear in the train set

▪ Helps the model deal with uncommon or unseen mentions

▪ RNN-based mention model utilizes the compositional nature of mentions

Evaluation on Unseen Mentions

SLIDE 15

▪ Query similar mention examples

▪ cosine similarity of mentions' vector representations

▪ Mentions with similar composition patterns lie closer in the vector space

Examples: Compositionality of Mentions
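Querying similar mentions is a nearest-neighbor search under cosine similarity; a sketch with made-up two-dimensional representations (real mention vectors come from the trained RNN):

```python
import numpy as np

def most_similar(query_vec, mention_vecs, k=2):
    """Rank stored mention representations by cosine similarity
    to the query vector and return the k nearest names."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    scored = sorted(mention_vecs.items(),
                    key=lambda kv: cos(query_vec, kv[1]),
                    reverse=True)
    return [name for name, _ in scored[:k]]

# toy representations; similar composition patterns get nearby vectors
vecs = {
    "Ford Foundation": np.array([0.9, 0.1]),
    "Harvard University": np.array([0.8, 0.3]),
    "Marion County": np.array([-0.7, 0.6]),
}
query = np.array([1.0, 0.2])   # e.g. "Bill & Melinda Gates Foundation"
nearest = most_similar(query, vecs)
```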

SLIDE 16

▪ Web-based QA system [Cucerzan and Agichtein, 2005; Lin, 2007]

▪ Add Q&A type interaction feature template

Evaluation in Question Answering (QA)

Q: who is the ceo of microsoft?
-> Search Engine (Bing) -> Candidates Extracted from Titles and Snippets -> Ranker -> Answers

Answer Type Classifier (18 types): Type(Q) = Person

[left context] [Satya Nadella] [right context] -> Person
[left context] [Xbox] [right context] -> Device

Feature Template: {Type(Q)|Type(A)}
{Person|Person} – positive weight
{Person|Device} – negative weight

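The added feature template amounts to one sparse feature per question-type/answer-type pair; a sketch of the idea with hypothetical learned weights (not the actual ranker):

```python
def type_interaction_feature(q_type, a_type):
    """One sparse binary feature per {Type(Q)|Type(A)} pair."""
    return f"{q_type}|{a_type}"

# hypothetical learned weights: matching types help, mismatches hurt
WEIGHTS = {"Person|Person": 1.3, "Person|Device": -0.9}

def rerank_bonus(q_type, a_type):
    """Score contribution this feature adds to an answer candidate."""
    return WEIGHTS.get(type_interaction_feature(q_type, a_type), 0.0)
```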

slide-17
SLIDE 17

▪ WebQuestions dataset [Berant et al., 2013]

▪ Manually annotated question-answer pairs

▪ Our type classifier improves the accuracy of QA systems

Evaluation in Question Answering (QA)

SLIDE 18

▪ Conclusion

▪ Recurrent Neural Networks are good at learning soft patterns

▪ Capture the compositional nature of entity mentions
▪ Generalize to unseen or uncommon mentions

▪ Automatically generate training data instead of annotating manually
▪ Type information is important for many NLP tasks

▪ Future work

▪ Fine-grained type classification

▪ Person -> doctor, actor, etc.

▪ Utilize hierarchical taxonomy
▪ Multi-label
▪ Utilize global information (e.g., document topic)
▪ …

Conclusion and Future Work