Extracting semantic relations from unlabeled text



SLIDE 1

Introduction Problem Statement Related Work Infinite Relational Model References

Extracting semantic relations from unlabeled text

Chandra Prakash Vishal Kumar Gupta

Mentor: Dr. Amitabha Mukerjee

March 21, 2013

Chandra Prakash, Vishal Kumar Gupta CS365: Course Project

SLIDE 2

1 Introduction
    Motivation
    Hardness
2 Problem Statement
3 Related Work
4 Infinite Relational Model
    Algorithm
5 References

SLIDE 3

Motivation

A 14yo bxr owned by a reputable breeder is being treated for IBD with pred.

SLIDE 4

Motivation

[A 14yo bxr]ANIMAL owned by [a reputable breeder]HUMAN is being treated for [IBD]DISEASE with [pred]DRUG . [4]

SLIDE 5

Why is the problem hard?

A huge amount of data is available on the web
No manual tags or labels are available
The exact number of entity types is unknown
Many irrelevant relations are present as noise

SLIDE 6

Problem Definition

Given a corpus of extracted relational tuples of the form r(a, b), cluster the data according to their relationships and determine the best match for a given relation.
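One way to picture the input is as one binary matrix per relation, with rows and columns indexed by the objects that appear in the tuples. The following is a minimal sketch under that assumption; the function name `build_relation_matrices` and the toy tuples are hypothetical and are not taken from the project's dataset.

```python
def build_relation_matrices(tuples):
    """Turn (relation, arg1, arg2) triples into one binary matrix per relation.

    Assumption: both arguments of every relation come from a single shared
    set of objects, so every matrix is n x n.
    """
    objects = sorted({x for _, a, b in tuples for x in (a, b)})
    index = {obj: i for i, obj in enumerate(objects)}
    n = len(objects)
    matrices = {}
    for rel, a, b in tuples:
        if rel not in matrices:
            matrices[rel] = [[0] * n for _ in range(n)]
        # R(i, j) = 1 means the tuple rel(a, b) was observed in the corpus.
        matrices[rel][index[a]][index[b]] = 1
    return objects, matrices


# Hypothetical toy tuples, for illustration only.
toy = [
    ("treats", "pred", "IBD"),
    ("treats", "aspirin", "fever"),
    ("owns", "breeder", "boxer"),
]
objects, matrices = build_relation_matrices(toy)
```

Clustering then amounts to finding a partition of `objects` under which each matrix becomes roughly block-structured.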

SLIDE 7

Related Work

TextRunner: identifies relational tuples in one pass over the web [3]
Semantic Network Extractor: jointly clusters relation and object strings [5]
Infinite Relational Model [1]

SLIDE 8

Algorithm Specification

$P(z_1, \ldots, z_n \mid R_1, R_2, \ldots, R_m)$

Generative Model

\[
P(R_1, R_2, \ldots, R_m, z_1, \ldots, z_n) = \prod_{i=1}^{m} P(R_i \mid z_1, \ldots, z_n) \prod_{j=1}^{n} P(z_j)
\]

$P(z_j)$ is calculated using the Chinese Restaurant Process
$R(i, j) \mid z, \eta(a, b)$ is calculated using a Bernoulli distribution
The Chinese Restaurant Process (CRP) also determines the number of clusters
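To illustrate how a CRP prior lets the number of clusters emerge from the data rather than being fixed in advance, here is a minimal sketch of sequential CRP sampling. It assumes the standard seating rule: an object joins an existing cluster with probability proportional to that cluster's current size, or opens a new cluster with probability proportional to a concentration parameter `gamma`. The function name `sample_crp` and the parameter values are illustrative choices, not part of the IRM source code.

```python
import random


def sample_crp(n, gamma, rng):
    """Sample cluster assignments for n objects from a CRP with concentration gamma."""
    assignments = []
    counts = []  # counts[a] = number of objects already placed in cluster a
    for _ in range(n):
        # Existing clusters are weighted by their size, a new cluster by gamma.
        # random.choices normalizes the weights, so the shared denominator
        # (objects seated so far + gamma) does not need to be computed.
        weights = counts + [gamma]
        a = rng.choices(range(len(weights)), weights=weights)[0]
        if a == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[a] += 1
        assignments.append(a)
    return assignments, counts


rng = random.Random(0)
z, counts = sample_crp(20, gamma=1.0, rng=rng)
print(len(counts), "clusters for", len(z), "objects")
```

Larger values of `gamma` tend to produce more clusters; the number of clusters is an outcome of sampling, not an input.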

SLIDE 9

Output Matrices

Figure: Output Matrices

SLIDE 10

References

[1] Kemp Charles, Tenenbaum Joshua B, Griffiths Thomas L, Yamada Takeshi, and Ueda Naonori. Learning systems of concepts with an infinite relational model. 21(1):381, 2006.
[2] Turney Peter D, Pantel Patrick, et al. From frequency to meaning: Vector space models of semantics. Journal of Artificial Intelligence Research, 37(1):141–188, 2010.
[3] Banko Michele. Open information extraction for the web. PhD thesis, University of Washington, 2009.
[4] Huang Ruihong and Riloff Ellen. Inducing domain specific semantic class taggers from (almost) nothing. Proceedings of the Association for Computational Linguistics (ACL), 2010.
[5] Kok Stanley and Domingos Pedro. Extracting semantic networks from text via relational clustering. Proceedings of ECML, 2008.

SLIDE 11

Source code and Dataset

The source code for IRM is publicly available at http://www.psy.cmu.edu/~ckemp/code/irm.html
The dataset is available at http://knight.cis.temple.edu/~yates/data/resolver_data.tar.gz

SLIDE 12

Questions?

SLIDE 13

Formulae Specifications

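Generative Model

\[
P(R_1, R_2, \ldots, R_m, z_1, \ldots, z_n) = \prod_{i=1}^{m} P(R_i \mid z_1, \ldots, z_n) \prod_{j=1}^{n} P(z_j)
\]

Generating Clusters (CRP)

\[
P(z_i = a \mid z_1, \ldots, z_{i-1}) =
\begin{cases}
\dfrac{n_a}{i - 1 + \gamma} & \text{if } n_a > 0 \\[2ex]
\dfrac{\gamma}{i - 1 + \gamma} & \text{if } a \text{ is a new cluster}
\end{cases}
\]

Generating Relations from clusters

\[
\begin{aligned}
z \mid \gamma &\sim \mathrm{CRP}(\gamma) \\
\eta(a, b) \mid \beta &\sim \mathrm{Beta}(\beta, \beta) \\
R(i, j) \mid z, \eta &\sim \mathrm{Bernoulli}(\eta(z_i, z_j))
\end{aligned}
\]

Inference

\[
P(R \mid z) = \prod_{a, b \in \mathbb{N}} \frac{\mathrm{Beta}(m(a, b) + \beta,\ \bar{m}(a, b) + \beta)}{\mathrm{Beta}(\beta, \beta)}
\]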
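The inference formula can be evaluated in log space for a single binary relation. The sketch below assumes the usual IRM reading of the symbols: m(a, b) counts the pairs (i, j) with z_i = a, z_j = b and R(i, j) = 1, and m̄(a, b) counts the corresponding pairs with R(i, j) = 0. It also assumes that all ordered pairs, including i = j, are observed. The function names and the toy matrix are hypothetical and do not come from the published IRM code.

```python
import math


def log_beta(x, y):
    """Log of the Beta function, computed via log-gamma for numerical stability."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)


def log_likelihood(R, z, beta):
    """log P(R | z) for one binary relation R (n x n list of 0/1) and assignments z."""
    k = max(z) + 1
    ones = [[0] * k for _ in range(k)]   # m(a, b): links between clusters a and b
    zeros = [[0] * k for _ in range(k)]  # m_bar(a, b): absent links
    n = len(z)
    for i in range(n):
        for j in range(n):
            if R[i][j]:
                ones[z[i]][z[j]] += 1
            else:
                zeros[z[i]][z[j]] += 1
    total = 0.0
    for a in range(k):
        for b in range(k):
            total += log_beta(ones[a][b] + beta, zeros[a][b] + beta)
            total -= log_beta(beta, beta)
    return total


# Toy relation: objects 0 and 1 each link to objects 2 and 3, nothing else.
R = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
good = log_likelihood(R, [0, 0, 1, 1], beta=1.0)  # clusters match the block structure
bad = log_likelihood(R, [0, 1, 0, 1], beta=1.0)   # clusters cut across the blocks
print(good > bad)
```

A partition that makes every block of the matrix nearly all ones or all zeros scores higher, which is why this quantity can guide a search over cluster assignments.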

SLIDE 14

Semantic Network Extractor

SLIDE 15

Thank You
