Multi-level Representations for Fine-Grained Typing of Knowledge Base Entities



SLIDE 1

Multi-level Representations for Fine-Grained Typing of Knowledge Base Entities

Yadollah Yaghoobzadeh and Hinrich Schütze, LMU Munich, Germany. Presented by: Xiaotao Gu

SLIDE 2

Knowledge Graph/Base

Image retrieved from: https://www.ambiverse.com/knowledge-graphs-encyclopaedias-for-machines/

SLIDE 3

Entity Typing

Pipeline (example entity: Barack Obama): entity → feature → classifier → label, choosing among candidate types such as City, State, Person, and Country.

This is supervised classification! The key question: how do we extract a high-quality feature representation?
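As a sketch of the supervised-classification view, each candidate type gets its own binary classifier over the entity's feature vector, and every type scoring above a threshold is assigned. All dimensions, weights, and the 0.5 threshold below are illustrative assumptions, not the paper's actual setup.

```python
import numpy as np

rng = np.random.default_rng(42)
types = ["City", "State", "Person", "Country"]  # candidate types from the slide
dim = 10                                        # feature dimension (assumed)

feat = rng.standard_normal(dim)                 # feature vector for one entity
W = rng.standard_normal((len(types), dim))      # one binary classifier per type
b = rng.standard_normal(len(types))

scores = 1 / (1 + np.exp(-(W @ feat + b)))      # sigmoid score per type
assigned = [t for t, s in zip(types, scores) if s > 0.5]
```

An entity can receive several types at once (e.g., both Person and Politician), which is why typing is multi-label rather than a single softmax choice.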

SLIDE 4

Joint Representation

SLIDE 5

Character-level Representation

Architecture: character embedding lookup table → convolution layer → max pooling → concatenation.

Example: the character sequence "s p a n i s h". Starting from the character embedding of each letter (e.g., of "s"), a filter of width 2 produces a 3 × 6 feature map and a filter of width 4 a 4 × 4 feature map; max pooling over positions followed by concatenation yields the character-level representation.
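A minimal numpy sketch of this character-level CNN, with the slide's filter counts (3 filters of width 2, 4 of width 4) but an assumed embedding dimension and random weights:

```python
import numpy as np

rng = np.random.default_rng(0)

chars = "spanish"
d = 5                                   # character embedding dim (assumed)
emb = {c: rng.standard_normal(d) for c in set(chars)}
X = np.stack([emb[c] for c in chars])   # (7, d): one row per character

def conv_max_pool(X, filters, width):
    # Slide each filter over character windows, then max-pool over positions.
    n = X.shape[0] - width + 1
    windows = np.stack([X[i:i + width].ravel() for i in range(n)])  # (n, width*d)
    fmap = windows @ filters.T          # feature map: (positions, n_filters)
    return fmap.max(axis=0)             # one value per filter

f2 = rng.standard_normal((3, 2 * d))    # 3 filters of width 2 -> 3 x 6 feature map
f4 = rng.standard_normal((4, 4 * d))    # 4 filters of width 4 -> 4 x 4 feature map
rep = np.concatenate([conv_max_pool(X, f2, 2), conv_max_pool(X, f4, 4)])
# rep: character-level representation of "spanish" (3 + 4 = 7 dims here)
```

With 7 characters, a width-2 filter sees 6 window positions and a width-4 filter sees 4, matching the feature-map sizes on the slide.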

SLIDE 6

Joint Representation

SLIDE 7

Word-level Representation

The semantic meaning of words is useful for typing! E.g., “XXX Lake” implies $LOCATION.

Image retrieved from: Professor Julia Hockenmaier’s slides for CS447: Natural Language Processing

Word embeddings are based on the Distributional Hypothesis.

Entity embedding = average word embedding: E(Lake Michigan) = 0.5 * { E(Lake) + E(Michigan) }.

Subword/morphological information within words is also useful for typing! E.g., “Spanish” implies $LANGUAGE. FastText: n-gram subword embeddings!

FastText: https://research.fb.com/fasttext/
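Both ideas on this slide fit in a few lines. The averaging is exactly the E(Lake Michigan) formula above; the n-gram extraction mimics FastText's scheme of representing a word by its character n-grams with boundary markers `<` and `>` (n = 3 here; FastText actually uses a range of n, and the random vectors are placeholders for trained embeddings):

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 4  # embedding dimension (assumed)
word_emb = {w: rng.standard_normal(dim) for w in ["lake", "michigan"]}

# Entity embedding = average of the embeddings of its name's words
entity_emb = (word_emb["lake"] + word_emb["michigan"]) / 2

# FastText-style subwords: character n-grams with boundary markers < and >
def char_ngrams(word, n=3):
    w = f"<{word}>"
    return [w[i:i + n] for i in range(len(w) - n + 1)]

print(char_ngrams("spanish"))
# ['<sp', 'spa', 'pan', 'ani', 'nis', 'ish', 'sh>']
```

The suffix n-gram `ish` is shared with "English", "Polish", etc., which is how subwords help signal $LANGUAGE even for rare words.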

SLIDE 8

Joint Representation

SLIDE 9

Entity-level Representation

Goal: capture context information. Each entity (Lake Michigan) is mapped to a unique identifier (Lake_Michigan); entity embeddings are then trained with SkipGram over the identifier-annotated corpus.
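The preprocessing step can be sketched as below: replace each linked mention with its identifier token, so that a SkipGram model later learns one vector per entity. The `entity2id` mapping and the toy sentence are illustrative; naive `str.replace` also ignores mention ambiguity that a real entity-linked corpus would resolve.

```python
# Map surface mentions to unique identifier tokens (illustrative mapping)
entity2id = {"Lake Michigan": "Lake_Michigan", "Barack Obama": "Barack_Obama"}

def linkify(text, entity2id):
    # Replace every known mention with its entity identifier
    for mention, ident in entity2id.items():
        text = text.replace(mention, ident)
    return text

corpus = "Barack Obama visited Lake Michigan last summer ."
print(linkify(corpus, entity2id))
# Barack_Obama visited Lake_Michigan last summer .
```

One would then train a skip-gram model (e.g., gensim's `Word2Vec` with `sg=1`) on the tokenized output, so each identifier such as `Lake_Michigan` receives a single context-derived vector.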

SLIDE 10

Entity-level Representation

SLIDE 11

Joint Representation

SLIDE 12

Experiment

Task: Entity Typing. Dataset: FIGMENT

  • 102 types
  • 200K Freebase entities, 60K for testing
  • 12K head entities (freq > 100), 10K tail entities (freq < 5)

Evaluation Metrics

  • Accuracy: correct iff all types of an entity are inferred correctly and no wrong types are inferred
  • Micro average F1: F1 of all type-entity assignment decisions
  • Macro average F1: F1 of types assigned to an entity, averaged over entities
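The three metrics above differ only in where the averaging happens, which a small worked example makes concrete (the gold/predicted type sets below are toy data, not from FIGMENT):

```python
# Toy gold and predicted type sets per entity
gold = [{"person", "politician"}, {"city"}, {"lake", "location"}]
pred = [{"person", "politician"}, {"city", "location"}, {"location"}]

def f1(tp, fp, fn):
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0

# Strict accuracy: an entity counts only if its predicted set matches exactly
acc = sum(g == p for g, p in zip(gold, pred)) / len(gold)

# Micro F1: pool every (entity, type) decision before computing F1
tp = sum(len(g & p) for g, p in zip(gold, pred))
fp = sum(len(p - g) for g, p in zip(gold, pred))
fn = sum(len(g - p) for g, p in zip(gold, pred))
micro = f1(tp, fp, fn)

# Macro F1: compute F1 per entity, then average over entities
macro = sum(f1(len(g & p), len(p - g), len(g - p))
            for g, p in zip(gold, pred)) / len(gold)

print(acc, micro, macro)  # 0.333..., 0.8, 0.777...
```

Only the first entity is strictly correct (acc = 1/3), yet most individual decisions are right, so micro F1 (0.8) and macro F1 (7/9) are much higher; this is why strict accuracy is the harshest of the three metrics.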
SLIDE 13

Experiment

Results chart: baseline vs. Character-level

SLIDE 14

Experiment

Results chart: baseline, Character-level, Word-level

SLIDE 15

Experiment

Results chart: baseline, Character-level, Word-level, Word + Character

SLIDE 16

Experiment

Results chart: baseline, Character-level, Word-level, Word + Character, Entity-level

SLIDE 17

Experiment

Results chart: baseline, Character-level, Word-level, Word + Character, Entity-level, Entity + Word + Character

SLIDE 18

Experiment

  • Character-level: CNN performs best for capturing local features
  • Word-level: subword information improves performance, especially for rare entities
  • Joint entity-word-character information achieves the best performance
  • External information: adding description text helps tail entities!

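The joint representation can be pictured as stacking the three levels into one feature vector before classification; the dimensions below are illustrative placeholders, not the paper's actual sizes.

```python
import numpy as np

rng = np.random.default_rng(7)
# Illustrative stand-ins for the three learned representations
char_rep   = rng.standard_normal(7)   # character-level (CNN output)
word_rep   = rng.standard_normal(4)   # word-level (averaged embeddings)
entity_rep = rng.standard_normal(8)   # entity-level (SkipGram identifier vector)

# Joint representation: the levels side by side in one vector
joint = np.concatenate([entity_rep, word_rep, char_rep])
# joint (19 dims here) is what the type classifier consumes
```

Concatenation keeps each level's evidence intact, letting the classifier weigh context (entity-level) against surface-name cues (word- and character-level) per type.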

SLIDE 19

Summary

High-quality Entity Representation from:

  • Context information: from large corpus
  • Surface name information: word, subword, character-level information
  • External knowledge: description, relational links

The joint representation is the most informative for entity typing.

Thank you!