SLIDE 1

Efficient induction of probabilistic word classes with LDA

Grzegorz Chrupała

Saarland University

IJCNLP 2011

  • G. Chrupala (Saarland Uni)

Efficient word classes with LDA IJCNLP 2011 1 / 29

SLIDE 2

Word classes

Berlin, Bangkok, Tokyo, Warsaw
Sarkozy, Merkel, Obama, Berlusconi
Mr, Ms, President, Dr

Groups of words sharing syntax/semantics
Useful for generalization and abstraction

SLIDE 3

Word classes as features

Have been successfully used in:
◮ Named entity recognition
◮ Syntactic parsing
◮ Sentence retrieval

SLIDE 4

Brown clustering

Brown et al. proposed their algorithm in 1992
Agglomerative, hard clustering algorithm
Greedily merges classes so as to minimize the loss of mutual information (MI) between adjacent classes
Still the most commonly used word class type

SLIDE 5

Brown’s weaknesses

1 Time complexity: O(K²V)

2 Hard clustering
◮ Each word form assigned to only one class
◮ Need separate classes for:
⋆ first name
⋆ last name
⋆ first name OR last name
⋆ last name OR city

SLIDE 7

Word class induction with LDA addresses both issues

SLIDE 8

LDA for topic modeling

For each topic z, draw φz from a Dirichlet
For each document d:
◮ Draw a topic distribution θd from a Dirichlet
◮ Repeat until all the words in d have been generated:
⋆ Draw a topic z from θd
⋆ Draw a word w from φz
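
The generative story above can be sketched in a few lines of NumPy. This is only an illustration: the function name `generate_corpus`, the symmetric hyperparameters `alpha` and `beta`, and the toy sizes are assumptions made for the example, not values from the talk.

```python
import numpy as np

def generate_corpus(n_topics, vocab_size, doc_lengths, alpha=0.5, beta=0.5, seed=0):
    """Sample documents from the LDA generative process (toy sketch)."""
    rng = np.random.default_rng(seed)
    # For each topic z, draw a word distribution phi_z from a Dirichlet.
    phi = rng.dirichlet(np.full(vocab_size, beta), size=n_topics)
    docs, thetas = [], []
    for length in doc_lengths:
        # Draw the document's topic distribution theta_d from a Dirichlet.
        theta = rng.dirichlet(np.full(n_topics, alpha))
        # For every word slot: draw a topic z from theta_d ...
        topics = rng.choice(n_topics, size=length, p=theta)
        # ... then draw a word w from phi_z.
        words = [int(rng.choice(vocab_size, p=phi[z])) for z in topics]
        docs.append(words)
        thetas.append(theta)
    return docs, np.array(thetas), phi

docs, thetas, phi = generate_corpus(n_topics=3, vocab_size=20, doc_lengths=[5, 8, 12])
```

Each entry of `docs` is a list of word ids; inference runs this story backwards, recovering `thetas` and `phi` from the observed words alone.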

SLIDE 9

LDA

SLIDE 10

Topic vs word classes

Topics → Word classes
Documents → Word types
Words → Context features

SLIDE 11

Krzysztof

arguesL arguesR directorL directorL editsR saidR BledkowskiR KieslowskiR KieslowskiR RutkowskiR SikorskiR andL
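
A minimal sketch of how such a feature list could be collected for every word type. It assumes, as a reading of the example above, that each feature is an immediately adjacent word tagged with `L` (left neighbor) or `R` (right neighbor); the helper `context_documents` and the two toy sentences are hypothetical.

```python
from collections import defaultdict

def context_documents(sentences):
    """Map each word type to the list of its neighbor features."""
    docs = defaultdict(list)
    for sent in sentences:
        for i, word in enumerate(sent):
            if i > 0:
                # Word immediately to the left, marked with L.
                docs[word].append(sent[i - 1] + "L")
            if i + 1 < len(sent):
                # Word immediately to the right, marked with R.
                docs[word].append(sent[i + 1] + "R")
    return dict(docs)

# Toy, made-up sentences for illustration only.
sentences = [
    ["director", "Krzysztof", "Kieslowski", "said"],
    ["and", "Krzysztof", "Sikorski", "argues"],
]
docs = context_documents(sentences)
```

Under this mapping, `docs["Krzysztof"]` plays the role of a document in topic-model LDA, and each neighbor feature plays the role of a word.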

SLIDE 12

Generative process

For each class z, draw φz from a Dirichlet
For each word type d:
◮ Draw a class distribution θd from a Dirichlet
◮ Repeat:
⋆ Draw a word class z from θd
⋆ Draw a context feature w from φz

SLIDE 13

Induced distributions

θd: class distribution given word type
φz: feature distribution given class
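
One common way to obtain these two distributions is to normalize smoothed assignment counts, as is typically done after collapsed Gibbs sampling. The sketch below assumes that setup; the talk does not say which estimator was used, and the count matrices, `alpha`, and `beta` are made-up illustration values.

```python
import numpy as np

def estimate_distributions(n_dz, n_zw, alpha=0.1, beta=0.01):
    """Turn assignment counts into theta (per word type) and phi (per class)."""
    n_dz = np.asarray(n_dz, dtype=float)  # rows: word types, columns: classes
    n_zw = np.asarray(n_zw, dtype=float)  # rows: classes, columns: context features
    # theta_d: class distribution given word type, smoothed by alpha.
    theta = (n_dz + alpha) / (n_dz + alpha).sum(axis=1, keepdims=True)
    # phi_z: feature distribution given class, smoothed by beta.
    phi = (n_zw + beta) / (n_zw + beta).sum(axis=1, keepdims=True)
    return theta, phi

# Hypothetical counts: 3 word types, 2 classes, 4 context features.
n_dz = [[8, 2], [1, 9], [5, 5]]
n_zw = [[10, 4, 0, 0], [0, 1, 7, 8]]
theta, phi = estimate_distributions(n_dz, n_zw)
```

The third word type ends up with an even split between the two classes, which is the kind of ambiguity a hard clustering cannot represent.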

SLIDE 14

Soft clustering

Martin Cameron

chief Gingrich Martin Newt Van Scott Roberts

  • Mr. Ms. John Robert President Dr. David

Street General Texas Fidelity State California

SLIDE 15

Context

Newt, Speaker

  • executive, operating

says, Chairman

  • Clinton, Dole, J.

Wall, West, East • County, AG, Journal

SLIDE 16

Efficiency

Brown: O(K²V)
LDA: O(KN)
Scaling feature counts by 1/m reduces LDA runtime m times
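
Since the LDA cost grows with the number of feature tokens N, shrinking every feature count shrinks the work per sampling pass. A rough sketch of count scaling follows; rounding up with `math.ceil` is an assumption made here so that rare features are not dropped, and the talk does not specify the rounding rule. The helper `scale_counts` and the toy feature list are hypothetical.

```python
import math
from collections import Counter

def scale_counts(features, m):
    """Scale each feature count by 1/m, rounding up (assumed rounding rule)."""
    counts = Counter(features)
    return {f: math.ceil(c / m) for f, c in counts.items()}

# Toy feature list for one word type: 10 feature tokens in total.
features = ["directorL"] * 6 + ["KieslowskiR"] * 3 + ["andL"]
scaled = scale_counts(features, m=3)
total_before = len(features)
total_after = sum(scaled.values())
```

With rounding up, the token total here falls from 10 to 4 rather than to exactly one third, because features seen fewer than m times still keep a count of 1.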

SLIDE 17

Testing efficiency in practice

60M words of North American News Text
LDA, Brown: 100, 200, 500, 1000 classes
LDA counts scaled by 1/3

SLIDE 18

Runtimes

[Plot: runtime in hours for brown and lda; x-axis ticks 50, 100, 200, 500; y-axis ticks 0.2, 1.0, 5.0, 50.0]

SLIDE 19

Semi-supervised learning performance

Use word classes as features

Brown

◮ different levels of hierarchy

LDA

◮ class distributions and context information

Explore several class granularities
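
To make the two feature types concrete, here is a hedged sketch of how a tagger might derive features for one token. Brown classes are assumed to be bit-string paths in the merge hierarchy, so prefixes of different lengths give different levels of the hierarchy; for LDA, the sketch uses the most probable classes from a word's class distribution. The prefix lengths, `top_k`, the feature names, and the toy data are all assumptions rather than the settings used in the talk.

```python
def brown_features(word, brown_paths, prefix_lengths=(4, 6, 10)):
    """One feature per hierarchy level: prefixes of the word's Brown bit path."""
    path = brown_paths.get(word)
    if path is None:
        return []
    return [f"brown{n}={path[:n]}" for n in prefix_lengths]

def lda_features(word, theta, top_k=2):
    """One feature per class among the top_k most probable classes of the word."""
    dist = theta.get(word)
    if dist is None:
        return []
    ranked = sorted(range(len(dist)), key=lambda z: dist[z], reverse=True)
    return [f"lda={z}" for z in ranked[:top_k]]

# Hypothetical lookup tables for a single word type.
brown_paths = {"Martin": "0110100111"}
theta = {"Martin": [0.05, 0.60, 0.30, 0.05]}

b_feats = brown_features("Martin", brown_paths)
l_feats = lda_features("Martin", theta)
```

Because the LDA features come from a distribution, an ambiguous word such as a name that is both a first and a last name can fire more than one class feature.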

SLIDE 20

Fine-grained NER on BBN

animal, cardinal, age, date, duration, disease, building, highway-street, city, country, state-province, law, continent, region, money, nationality, political, ordinal, corporation, educational, government, percent, person, plant, vehicle, weight, chemical, drug, food, time

SLIDE 21

F1 error

[Plot: F1 error for brown and lda; x-axis ticks 50, 200, 1000; y-axis ticks 8, 10, 12, 14]

SLIDE 22

Morphological analysis

Token    Lemma   MSD       Gloss
Pero     pero    cc        but
cuando   cuando  cs        when
era      ser     vsii3s0   he was
niño     niño    ncms000   boy
le       el      pp3csd00  to him
gustaba  gustar  vmii3p0   it pleased

SLIDE 23

MA results with Morfette

Brown: 500 classes
LDA: 50 classes on Spanish, 100 on French

[Bar chart: Baseline, Brown and LDA on Spanish and French; axis ticks 2, 4, 6]

SLIDE 24

Semantic relation classification

Task defined at SemEval 2007 and 2010

The bowl was full of apples, pears and oranges

content-container(pears, bowl)

SLIDE 25

Relation inventory

cause-effect
instrument-agency
product-producer
content-container
entity-origin
entity-destination
component-whole
member-collection
communication-topic

SLIDE 26

Relation classification results

500 Brown classes, 100 LDA classes

[Bar chart: LDA, Brown and Baseline; axis ticks 5, 10, 15, 20, 25]

SLIDE 27

LDA RC would rank third in SemEval 2010
Without PropBank, FrameNet, WordNet, NomLex, Text Runner, Cyc...

SLIDE 28

To conclude:

Efficient induction of probabilistic word classes which match or improve on hierarchical Brown classes

SLIDE 29

Thank you

SLIDE 30

Relation classification

[Plot: baseline and lda; axis ticks 20, 30, 40, 50]
