SLIDE 1

Modeling Syntactic Structures of Topics with a Nested HMM‐LDA

Jing Jiang, Singapore Management University

ICDM 2009, December 7, 2009

SLIDE 2

Topic Models

  • A generative model for discovering hidden topics from documents
  • Topics are represented as word distributions

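To make the second bullet concrete, here is a minimal sketch of a topic as a word distribution (Python, with a made-up five-word vocabulary; nothing here is from the slides):

```python
import numpy as np

# A "topic" is simply a probability distribution over the vocabulary.
vocab = ["neuron", "synapse", "model", "market", "price"]
neuro_topic = np.array([0.40, 0.30, 0.20, 0.05, 0.05])  # must sum to 1

# Drawing words from the topic: a neuroscience-flavored topic mostly
# emits neuroscience-flavored words.
rng = np.random.default_rng(0)
print(rng.choice(vocab, size=8, p=neuro_topic))
```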

SLIDE 3

How to Interpret Topics?

  • List the top‐K frequent words
    – Not easy to interpret
    – How are the top words related to each other?
  • Our method: model the syntactic structures of topics using a combination of hidden Markov models (HMM) and topic models (LDA)
  • A preliminary solution towards meaningful representations of topics


SLIDE 4

Related Work on Syntactic LDA

  • Similar to / based on [Griffiths et al. 05]
    – More general, with multiple “semantic classes”
  • [Boyd‐Graber & Blei 09]
    – Combines parse trees with LDA
    – Expensive to obtain parse trees for large text collections
  • [Gruber et al. 07]
    – Combines HMM with LDA
    – Does not model syntax


SLIDE 5

HMM to Model Syntax

  • In natural language sentences, the syntactic class of a word occurrence (noun, verb, adjective, adverb, preposition, etc.) depends on its context
  • Transitions between syntactic classes follow some structure
  • HMMs can be used to model these transitions
    – e.g. HMM‐based part‐of‐speech taggers

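As a toy illustration of such class-to-class transitions (the classes and probabilities below are invented for this sketch, not taken from the paper):

```python
import numpy as np

# A first-order HMM over three syntactic classes: the class of the next
# word depends only on the class of the current word.
classes = ["DET", "NOUN", "VERB"]
trans = np.array([
    [0.0, 0.9, 0.1],   # after DET:  almost always a NOUN
    [0.1, 0.3, 0.6],   # after NOUN: often a VERB
    [0.5, 0.4, 0.1],   # after VERB: usually a DET or a NOUN
])

rng = np.random.default_rng(1)
state, seq = 0, ["DET"]
for _ in range(6):                        # sample a short class sequence
    state = rng.choice(3, p=trans[state])
    seq.append(classes[state])
print(" -> ".join(seq))                   # e.g. DET -> NOUN -> VERB -> ...
```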

SLIDE 6

Overview of Our Model

  • Assumptions
    – A topic is represented as an HMM
    – Content states: convey semantic meanings of topics (likely to be nouns, verbs, adjectives, etc.)
    – Functional states: serve linguistic functions (e.g. prepositions and articles)
      • Word distributions of these functional states are shared among topics
    – Each document has a mixture of topics
    – Each sentence is generated from a single topic

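A schematic of how these assumptions translate into parameters; shapes and variable names are my own illustration, not the paper's notation:

```python
import numpy as np

rng = np.random.default_rng(0)
K, C, F, V = 40, 3, 5, 18864  # topics, content states, functional states, vocab

# Topic-specific: each topic has its own word distribution per content state.
content_emissions = rng.dirichlet(np.ones(V), size=(K, C))   # shape (K, C, V)

# Shared: functional states (prepositions, articles, ...) use one set of
# word distributions common to all topics.
functional_emissions = rng.dirichlet(np.ones(V), size=F)     # shape (F, V)

# Per document: a mixture over topics. Each sentence will draw ONE topic
# from this mixture and emit its words from that topic's HMM.
doc_topic_mixture = rng.dirichlet(np.ones(K))                # shape (K,)
```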

SLIDE 7

Overview of Our Model


SLIDE 8

Overview of Our Model

[Figure: model overview, showing topics]

SLIDE 9

Overview of Our Model

[Figure: model overview, showing topics and states]

SLIDE 10

Overview of Our Model

[Figure: model overview, showing topics and states]

SLIDE 11

The n‐HMM‐LDA Model


SLIDE 12

Document Generation Process

Sample topics and transition probabilities


SLIDE 13

Document Generation Process

Sample a topic distribution for the document (same as in LDA)


SLIDE 14

Document Generation Process

Sample a topic for a sentence


SLIDE 15

Document Generation Process

Generate the words in the sentence using the HMM corresponding to this topic


SLIDE 16

Variations

  • Transition probabilities between states can be either topic‐specific (left) or shared by all topics (right)

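Putting slides 12–16 together, here is a hedged end-to-end sketch of the generation process (my own simplified rendering: symmetric priors, a uniform initial state instead of explicit start probabilities, and a `topic_specific_trans` flag standing in for the two variants above):

```python
import numpy as np

rng = np.random.default_rng(0)
K, C, F, V = 4, 2, 2, 50           # tiny sizes so the sketch runs instantly
S = C + F                          # HMM states = content + functional
topic_specific_trans = True        # slide 16: per-topic vs shared transitions

# Step 1 (slide 12): sample topics and transition probabilities.
content_emis = rng.dirichlet(np.ones(V), size=(K, C))   # topic-specific
functional_emis = rng.dirichlet(np.ones(V), size=F)     # shared by all topics
n_trans = K if topic_specific_trans else 1
trans = rng.dirichlet(np.ones(S), size=(n_trans, S))    # rows are state dists

def generate_document(n_sentences=3, sent_len=6, alpha=0.5):
    # Step 2 (slide 13): per-document topic distribution, as in LDA.
    theta = rng.dirichlet(np.full(K, alpha))
    doc = []
    for _ in range(n_sentences):
        # Step 3 (slide 14): one topic for the whole sentence.
        z = rng.choice(K, p=theta)
        A = trans[z if topic_specific_trans else 0]
        # Step 4 (slide 15): walk the topic's HMM to emit the words.
        s, words = rng.integers(S), []
        for _ in range(sent_len):
            if s < C:          # content state: topic-specific emissions
                w = rng.choice(V, p=content_emis[z, s])
            else:              # functional state: shared emissions
                w = rng.choice(V, p=functional_emis[s - C])
            words.append(int(w))
            s = rng.choice(S, p=A[s])   # transition to the next state
        doc.append((int(z), words))
    return doc

print(generate_document())
```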

SLIDE 17

Model Inference: Gibbs Sampling

  • Sample a topic for a sentence
  • Sample a state for a word

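The slide only names the two steps. As one hedged example of what the first step can look like, here is a partially collapsed version of the sentence-topic update under the shared-transition variant, where transition and functional-state factors are identical for every candidate topic and cancel from the conditional; the function and its arguments are my own sketch, not the paper's derivation:

```python
import numpy as np

def resample_sentence_topic(words, states, doc_topic_counts,
                            content_emis, alpha, rng):
    """One Gibbs step for a sentence's topic. Assumes transitions and
    functional-state emissions are shared across topics, so only the
    document's topic counts (current sentence excluded) and the
    content-state emissions of the sentence's words matter.

    content_emis: array (K, C, V); state ids < C are content states."""
    K, C, _ = content_emis.shape
    logp = np.log(doc_topic_counts + alpha)        # prior from rest of doc
    for w, s in zip(words, states):
        if s < C:                                  # functional states cancel
            logp += np.log(content_emis[:, s, w])  # likelihood per topic
    p = np.exp(logp - logp.max())                  # stable normalization
    return int(rng.choice(K, p=p / p.sum()))

# Tiny usage example with made-up counts and emissions:
rng = np.random.default_rng(0)
emis = rng.dirichlet(np.ones(20), size=(3, 2))     # 3 topics, 2 content states
print(resample_sentence_topic(words=[4, 9, 0], states=[0, 3, 1],
                              doc_topic_counts=np.array([2.0, 5.0, 1.0]),
                              content_emis=emis, alpha=0.5, rng=rng))
```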

SLIDE 18

Experiments – Data Sets

  • NIPS publications (downloaded from http://nips.djvuzone.org/txt.html)
  • Reuters‐21578

Data set                  NIPS Publications*   Reuters‐21578
Vocabulary                18,864               10,739
Words                     5,305,230            1,460,666
Documents for training    1,314                8,052
Documents for testing     618                  2,665


SLIDE 19

Quantitative Evaluation

  • Perplexity: a commonly used metric for the generalization power of language models
  • For a test document, observe the first K sentences and predict the remaining sentences

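The slide does not show the formula; the standard definition, which I assume is the one used here, is the exponentiated negative average per-word log-likelihood of the held-out text:

```python
import math

def perplexity(word_log_probs):
    """exp(- average per-word log-likelihood); lower is better."""
    return math.exp(-sum(word_log_probs) / len(word_log_probs))

# Sanity check: if every held-out word gets probability 1/100, the model
# is exactly as uncertain as a uniform choice among 100 words.
print(perplexity([math.log(1 / 100)] * 1000))   # ~100.0
```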

SLIDE 20

LDA vs. LDA‐s

  • LDA‐s: n‐HMM‐LDA with a single state for each HMM
    – Same as standard LDA with each sentence having a single topic

[Figure: perplexity on NIPS and Reuters]


One‐topic‐per‐sentence assumption helps.

SLIDE 21

HMM

  • Achieves much lower perplexity, but cannot be used to discover topics

[Figure: perplexity on NIPS]

SLIDE 22

Increase Number of Functional States

  • Fixing the number of content states to 1 and the number of topics to 40

[Figure: perplexity on NIPS and Reuters]

Adding more functional states decreases perplexity.

SLIDE 23

Qualitative Evaluation

  • Use the top frequent words to represent a topic/state

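Extracting such a representation from any word distribution is a one-liner; a small sketch (the vocabulary here is made up):

```python
import numpy as np

def top_words(word_dist, vocab, k=10):
    """The k most probable words under a topic or state distribution."""
    return [vocab[i] for i in np.argsort(word_dist)[::-1][:k]]

vocab = ["the", "of", "neuron", "spike", "model"]
print(top_words(np.array([0.30, 0.25, 0.20, 0.15, 0.10]), vocab, k=3))
# -> ['the', 'of', 'neuron']
```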

SLIDE 24

Sample Topics/States from LDA/HMM

[Figure: sample topics/states from LDA and HMM, NIPS data set]

SLIDE 25

Sample States from n‐HMM‐LDA‐d

[Figure: sample states from n‐HMM‐LDA‐d, NIPS data set]

SLIDE 26

Different Content States


SLIDE 27

Case Study (LDA)

[Figure: a sentence from a NIPS paper in which each segment is annotated with the top words of the LDA topic assigned to it]

Example sentence: “This makes full synchrony of activated units the default condition in the model cortex, as in Brown’s model [Brown and Cooke, 1996], so that the background activation is coherent, and can be read into high order cortical levels which synchronize with it.”

Top words of the assigned LDA topics, which are dominated by function words: “the f is the that”, “a signal and that be are can”, “the in”, “and to in frequency can”, “it to and cells to”, “frequency is * for cell model a response”.

SLIDE 28

Case Study (n‐HMM‐LDA)

[Figure: the same NIPS sentence, with each segment annotated with the top words of the n‐HMM‐LDA content state assigned to it]

Top words of the assigned content states, which are dominated by content words: “cells * receptive synaptic”, “cell * neurons field synaptic inhibitory head excitatory”, “field input response model excitatory direction cell visual”, “model activity synapses pyramidal”.

SLIDE 29

Case Study (Comparison)

[Figure: side‐by‐side annotations of the example sentence, by LDA (left) and by n‐HMM‐LDA (right)]

SLIDE 30

Conclusion

  • We proposed a nested HMM‐LDA to model the syntactic structures of topics
    – Extension of [Griffiths et al. 05]
  • Experiments on two data sets show that
    – The model achieves perplexity between LDA and HMM
    – The model can provide more insights into the structures of topics than standard LDA


SLIDE 31

Thank You!

  • Questions?
