Constructing Knowledge Graph from Unstructured Text


SLIDE 1

Constructing Knowledge Graph from Unstructured Text

Image Source: www.ibm.com/smarterplanet/us/en/ibmwatson/

Kundan Kumar, Siddhant Manocha

SLIDE 2

MOTIVATION

Image Source: KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York

SLIDE 3

Image Source: KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York

MOTIVATION

SLIDE 4

Image Source: KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York

MOTIVATION

SLIDE 5

PROBLEM STATEMENT

SLIDE 6

KNOWLEDGE GRAPH

http://courses.cs.washington.edu/courses/cse517/13wi/slides/cse517wi13-RelationExtraction.pdf

SLIDE 7

KNOWLEDGE GRAPH

http://courses.cs.washington.edu/courses/cse517/13wi/slides/cse517wi13-RelationExtraction.pdf

SLIDE 8

QUESTION ANSWERING

SLIDE 9

EXISTING KNOWLEDGE BASES

Image Source: KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York

SLIDE 10

EXISTING KNOWLEDGE BASES

Supervised Models:

  • Learn classifiers from positive/negative examples; typical features: context words with POS tags, the dependency path between entities, and named entity tags
  • Require a large number of tagged training examples
  • Do not generalize to new relation types

Semi-Supervised Models:

  • Bootstrap algorithms: use seed examples to learn an initial set of relations (see the sketch after this list)
  • Generate positive/negative examples to learn a classifier
  • Learn more relations using this classifier

Distant Supervision:

  • Use an existing knowledge base together with unlabeled text to generate training examples
  • Learn models using this set of relations
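As a rough illustration of the bootstrap idea, here is a minimal Python sketch of a Snowball-style loop. The template representation and the helper names (make_template, apply_template) are simplified stand-ins, not the implementation from any cited paper.

```python
import re

def make_template(sentence, e1, e2):
    """Turn a sentence containing a known pair into a textual template."""
    return sentence.replace(e1, "{X}").replace(e2, "{Y}")

def apply_template(template, sentence):
    """Return (x, y) if the sentence instantiates the template, else None."""
    pattern = re.escape(template)
    pattern = pattern.replace(re.escape("{X}"), r"([A-Z]\w+)")
    pattern = pattern.replace(re.escape("{Y}"), r"([A-Z]\w+)")
    m = re.search(pattern, sentence)
    return m.groups() if m else None

def bootstrap(corpus, seed_pairs, iterations=3):
    known = set(seed_pairs)
    templates = set()
    for _ in range(iterations):
        # Step 1: learn templates from sentences containing known pairs.
        for s in corpus:
            for e1, e2 in known:
                if e1 in s and e2 in s:
                    templates.add(make_template(s, e1, e2))
        # Step 2: apply templates to the corpus to harvest new pairs.
        for s in corpus:
            for t in templates:
                pair = apply_template(t, s)
                if pair:
                    known.add(pair)
    return known

corpus = ["Paris is the capital of France.",
          "Berlin is the capital of Germany."]
print(bootstrap(corpus, {("Paris", "France")}))
# learns ("Berlin", "Germany") from the shared template
```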
SLIDE 11

OUR APPROACH

Bootstrapping Relations using Distributed Word Vector Embeddings

1) Words that occur in similar contexts lie close together in the word embedding space.
2) Word vectors are semantically consistent and capture many linguistic regularities (such as 'capital city', 'native language', and plural relations).
3) Obtain word vectors from unstructured text (using Google word2vec, GloVe, etc.).
4) Exploit the properties of this manifold to obtain binary relations between entities (see the sketch below).
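A minimal sketch of step 4, assuming gensim and a pretrained embedding file are available ("vectors.bin" and the seed pairs below are illustrative):

```python
# Propose new pairs for a relation from the average vector offset of seeds.
import numpy as np
from gensim.models import KeyedVectors

wv = KeyedVectors.load_word2vec_format("vectors.bin", binary=True)

# Average the (tail - head) offset over seed pairs of the target relation.
seeds = [("Paris", "France"), ("Berlin", "Germany"), ("Rome", "Italy")]
offset = np.mean([wv[tail] - wv[head] for head, tail in seeds], axis=0)

def propose_tail(head, topn=3):
    """For a candidate head entity, the tail should lie near head + offset."""
    return wv.similar_by_vector(wv[head] + offset, topn=topn)

print(propose_tail("Madrid"))   # expect "Spain" near the top of the list
```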

SLIDE 12

ALGORITHM

Image Source: KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York

SLIDE 13

SIMILARITY METRIC

Image Source: A Survey on Relation Extraction, Nguyen Bach, Carnegie Mellon University

SLIDE 14

KERNEL BASED APPROACHES

SLIDE 15

DEPENDENCY KERNELS

1. Actual sentences
2. Dependency graph
3. Kernel computation

Kernel: K(x, y) = 3 × 1 × 1 × 1 × 2 × 1 × 3 = 18

Image Source: A Shortest Path Dependency Kernel for Relation Extraction, Bunescu and Mooney
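The shortest-path dependency kernel multiplies, position by position, the number of features the two dependency paths share. A toy computation in Python, with feature sets modeled on Bunescu and Mooney's classic example (it also yields 18, though factored over fewer path positions than the slide shows):

```python
# Toy shortest-path dependency kernel: the product, over aligned path
# positions, of how many features (word, POS, generalized class) the two
# paths share. Feature sets here are illustrative.

def sp_kernel(path_x, path_y):
    """Paths are equal-length lists of feature sets; K = 0 otherwise."""
    if len(path_x) != len(path_y):
        return 0
    k = 1
    for fx, fy in zip(path_x, path_y):
        k *= len(fx & fy)              # count of common features
    return k

x = [{"protesters", "NNS", "Noun", "Person"}, {"->"},
     {"seized", "VBD", "Verb"}, {"<-"},
     {"stations", "NNS", "Noun", "Facility"}]
y = [{"troops", "NNS", "Noun", "Person"}, {"->"},
     {"raided", "VBD", "Verb"}, {"<-"},
     {"churches", "NNS", "Noun", "Facility"}]

print(sp_kernel(x, y))   # 3 * 1 * 2 * 1 * 3 = 18
```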

SLIDE 16

PRELIMINARY RESULTS

Word Vector Embedding: Wikipedia Corpus

SLIDE 17

PRELIMINARY RESULTS (Wikipedia corpus)

Seed examples for the 'capital' relationship | Positive relations learnt | Negative relations learnt

SLIDE 18

PRELIMINARY RESULTS (Google News corpus)

Seed examples | Positive relations learned | Negative relations learned

SLIDE 19

References

1) Tomas Mikolov, Wen-tau Yih, and Geoffrey Zweig. Linguistic Regularities in Continuous Space Word Representations. In Proceedings of NAACL HLT, 2013.
2) Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, and Jeffrey Dean. Distributed Representations of Words and Phrases and their Compositionality. In Proceedings of NIPS, 2013.
3) Eugene Agichtein and Luis Gravano. Snowball: Extracting Relations from Large Plain-Text Collections. In Proceedings of the Fifth ACM Conference on Digital Libraries, June 2000.
SLIDE 20

Questions!

SLIDE 21

CBOW MODEL

  • Input words are represented in a 1-of-V encoding
  • The linear sum of the input vectors is projected onto the projection layer
  • A hierarchical softmax layer is used to ensure that the weights in the output layer lie between 0 and 1 (0 <= p <= 1)
  • Weights are learnt using back-propagation (a minimal sketch follows below)
  • The projection matrix from the projection layer to the hidden layer gives the word vector embeddings

Image Source: Linguistic Regularities in Continuous Space Word Representations, Mikolov et al., 2013
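A minimal numpy sketch of one CBOW training step. For brevity it uses a plain softmax output rather than the hierarchical softmax named above, and all sizes are toy values:

```python
import numpy as np

V, D = 10, 4                          # toy vocabulary size, embedding dim
rng = np.random.default_rng(0)
W_in = rng.normal(scale=0.1, size=(V, D))    # input->projection (the embeddings)
W_out = rng.normal(scale=0.1, size=(D, V))   # projection->output weights

def cbow_step(context_ids, target_id, lr=0.05):
    """One gradient step: predict the centre word from averaged context vectors."""
    h = W_in[context_ids].mean(axis=0)        # linear combination of contexts
    scores = h @ W_out
    p = np.exp(scores - scores.max())
    p /= p.sum()                              # softmax, so 0 <= p <= 1
    grad = p.copy()
    grad[target_id] -= 1.0                    # dL/dscores for cross-entropy
    dh = W_out @ grad                         # backprop into the projection
    W_out -= lr * np.outer(h, grad)
    W_in[context_ids] -= lr * dh / len(context_ids)

cbow_step(context_ids=[1, 2, 4, 5], target_id=3)
# After training, the rows of W_in are the word vector embeddings.
```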

SLIDE 22

WORD VECTOR MODEL

SLIDE 23

WORD VECTOR MODEL

SLIDE 24

KERNEL BASED APPROACHES

Image Source: Kernel Methods for Relation Extraction, Zelenko et al.