  1. Predicting Sequences: Structured Perceptron (CS 6355: Structured Prediction)

  2. Conditional Random Fields summary
     • An undirected graphical model
       – Decomposes the score over the structure into a collection of factors
       – Each factor assigns a score to an assignment of the random variables it is connected to
     • Training and prediction
       – Final prediction via argmax_y w^T φ(x, y)
       – Train by maximum (regularized) likelihood
     • Connections to other models
       – Effectively a linear classifier
       – A generalization of logistic regression to structures
       – A conditional variant of a Markov Random Field (we will see this soon)

  3. Global features
     The feature function decomposes over the sequence y_0 y_1 y_2 y_3 with input x:
     w^T φ(x, y_0, y_1) + w^T φ(x, y_1, y_2) + w^T φ(x, y_2, y_3)
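As a sketch of this decomposition (the feature names and weights below are hypothetical, not taken from the slides), the global score can be computed factor by factor, one factor per adjacent label pair:

```python
# A minimal sketch: the global score w^T phi(x, y) is a sum of per-factor
# scores, one for each adjacent label pair (y_{i-1}, y_i).
def factor_features(x, y, i):
    """Features of the factor touching positions i-1 and i (hypothetical names)."""
    return [("trans", y[i - 1], y[i]),   # transition feature
            ("emit", y[i], x[i])]        # emission feature

def global_score(w, x, y):
    """Sum of w^T phi(x, y_{i-1}, y_i) over the whole sequence."""
    return sum(w.get(f, 0.0)
               for i in range(1, len(y))
               for f in factor_features(x, y, i))

w = {("trans", "Det", "Noun"): 1.5, ("emit", "Noun", "dog"): 0.5}
print(global_score(w, ["the", "dog"], ["Det", "Noun"]))  # 2.0
```

Features absent from w simply contribute zero, so the same scoring code works for any feature set.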

  4. Outline
     • Sequence models
     • Hidden Markov models
       – Inference with HMM
       – Learning
     • Conditional models and local classifiers
     • Global models
       – Conditional Random Fields
       – Structured Perceptron for sequences

  5. HMM is also a linear classifier
     Consider the HMM:  P(x, y) = ∏_i P(y_i | y_{i-1}) · P(x_i | y_i)

  6. HMM is also a linear classifier
     Consider the HMM:  P(x, y) = ∏_i P(y_i | y_{i-1}) · P(x_i | y_i)
     The factors P(y_i | y_{i-1}) are the transitions; the factors P(x_i | y_i) are the emissions.

  7. HMM is also a linear classifier
     Consider the HMM:  P(x, y) = ∏_i P(y_i | y_{i-1}) · P(x_i | y_i)
     Or equivalently:
     log P(x, y) = Σ_i [ log P(y_i | y_{i-1}) + log P(x_i | y_i) ]
     Log joint probability = transition scores + emission scores
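The product-to-sum step can be checked numerically. The probability tables below are made-up toy numbers, and the initial-state probability is ignored for brevity:

```python
import math

# Toy HMM tables (hypothetical numbers, for illustration only).
trans = {("Det", "Noun"): 0.9}                        # P(y_i | y_{i-1})
emit = {("Det", "the"): 0.6, ("Noun", "dog"): 0.1}    # P(x_i | y_i)

# x = ["the", "dog"], y = ["Det", "Noun"].
# Joint probability as a product of transition and emission terms
# (the initial-state term is omitted for brevity).
p = emit[("Det", "the")] * trans[("Det", "Noun")] * emit[("Noun", "dog")]

# The same quantity in log space: a *sum* of transition and emission scores.
log_p = (math.log(emit[("Det", "the")])
         + math.log(trans[("Det", "Noun")])
         + math.log(emit[("Noun", "dog")]))

assert abs(math.log(p) - log_p) < 1e-12
```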

  8. HMM is also a linear classifier
     log P(x, y) = Σ_i [ log P(y_i | y_{i-1}) + log P(x_i | y_i) ]
     Log joint probability = transition scores + emission scores
     Let us examine this expression using a carefully defined set of indicator functions.

  9. HMM is also a linear classifier
     log P(x, y) = Σ_i [ log P(y_i | y_{i-1}) + log P(x_i | y_i) ]
     Log joint probability = transition scores + emission scores
     Let us examine this expression using a carefully defined set of indicator functions:
     I[z] = 1 if z is true, and 0 if z is false.
     Indicators are functions that map Booleans to 0 or 1.

  10. HMM is also a linear classifier
      log P(x, y) = Σ_i [ log P(y_i | y_{i-1}) + log P(x_i | y_i) ]
      The transition term log P(y_i | y_{i-1}) is equivalent to
      Σ_t Σ_{t'} log P(t | t') · I[y_i = t] · I[y_{i-1} = t']
      where t and t' range over the states. The indicators ensure that only one element
      of the double summation is non-zero.

  11. HMM is also a linear classifier
      log P(x, y) = Σ_i [ log P(y_i | y_{i-1}) + log P(x_i | y_i) ]
      Likewise, the emission term log P(x_i | y_i) is equivalent to
      Σ_t log P(x_i | t) · I[y_i = t]
      The indicators ensure that only one element of the summation is non-zero.
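A quick check of how the indicator collapses the sum (the emission probabilities here are hypothetical):

```python
import math

# Hypothetical emission probabilities P(x_i | t) for one fixed word x_i.
emit = {"Det": 0.6, "Noun": 0.1, "Verb": 0.2}
tags = ["Det", "Noun", "Verb"]

y_i = "Noun"
# I[y_i = t] zeroes out every term of the sum except the one with t = y_i ...
summed = sum(math.log(emit[t]) * (1 if y_i == t else 0) for t in tags)
# ... so the sum equals the single emission score log P(x_i | y_i).
assert summed == math.log(emit["Noun"])
```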

  12. HMM is also a linear classifier
      Substituting the indicator forms gives
      log P(x, y) = Σ_i Σ_t Σ_{t'} log P(t | t') · I[y_i = t] · I[y_{i-1} = t']
                  + Σ_i Σ_t log P(x_i | t) · I[y_i = t]

  13. HMM is also a linear classifier
      Rearranging the summations:
      log P(x, y) = Σ_t Σ_{t'} log P(t | t') · Σ_i I[y_i = t] · I[y_{i-1} = t']
                  + Σ_t Σ_i log P(x_i | t) · I[y_i = t]

  14. HMM is also a linear classifier
      log P(x, y) = Σ_t Σ_{t'} log P(t | t') · Σ_i I[y_i = t] · I[y_{i-1} = t']
                  + Σ_t Σ_i log P(x_i | t) · I[y_i = t]
      The inner sum Σ_i I[y_i = t] · I[y_{i-1} = t'] is the number of times there is a
      transition in the sequence from state t' to state t: Count(t' → t).

  15. HMM is also a linear classifier
      log P(x, y) = Σ_t Σ_{t'} log P(t | t') · Count(t' → t)
                  + Σ_t Σ_i log P(x_i | t) · I[y_i = t]

  16. HMM is also a linear classifier
      log P(x, y) = Σ_t Σ_{t'} log P(t | t') · Count(t' → t)
                  + Σ_t Σ_i log P(x_i | t) · I[y_i = t]
      Similarly, Σ_i I[y_i = t] is the number of times state t occurs in the sequence: Count(t).

  17. HMM is also a linear classifier
      log P(x, y) = Σ_t Σ_{t'} log P(t | t') · Count(t' → t)
                  + Σ_t Σ_w log P(w | t) · Count(t, w)
      Here the emission sum collapses per (state, word) pair, since log P(x_i | t) depends
      on the word at position i; Count(t, w) is the number of positions where state t
      emits word w.

  18. HMM is also a linear classifier
      log P(x, y) = Σ_t Σ_{t'} log P(t | t') · Count(t' → t)
                  + Σ_t Σ_w log P(w | t) · Count(t, w)
      This is a linear function: the log P terms are the weights, and the counts
      (computed via the indicators) are the features.
      It can be written as w^T φ(x, y), and we can add more features.
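This identity is easy to verify numerically. The sketch below uses made-up probability tables; the point is only that the dot product of log-probability weights with indicator counts reproduces log P(x, y):

```python
import math
from collections import Counter

# Hypothetical HMM tables (initial-state term omitted for brevity).
trans = {("Det", "Noun"): 0.9, ("Noun", "Verb"): 0.5, ("Verb", "Det"): 0.7}
emit = {("Det", "the"): 0.6, ("Noun", "dog"): 0.1, ("Verb", "ate"): 0.2,
        ("Noun", "homework"): 0.05}

def log_joint(x, y):
    """Direct computation: sum of log transition and log emission terms."""
    s = sum(math.log(trans[(y[i - 1], y[i])]) for i in range(1, len(y)))
    s += sum(math.log(emit[(y[i], x[i])]) for i in range(len(y)))
    return s

def features(x, y):
    """phi(x, y): counts of each transition and each (state, word) emission."""
    phi = Counter(("trans",) + (y[i - 1], y[i]) for i in range(1, len(y)))
    phi.update(("emit", y[i], x[i]) for i in range(len(y)))
    return phi

# w: one weight per feature, equal to the corresponding log-probability.
w = {("trans",) + k: math.log(v) for k, v in trans.items()}
w.update({("emit",) + k: math.log(v) for k, v in emit.items()})

x = ["the", "dog", "ate", "the", "homework"]
y = ["Det", "Noun", "Verb", "Det", "Noun"]
score = sum(w[f] * c for f, c in features(x, y).items())
assert abs(score - log_joint(x, y)) < 1e-12   # w . phi(x, y) == log P(x, y)
```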

  19. HMM is a linear classifier: An example
      Det  Noun  Verb  Det  Noun
      The  dog   ate   the  homework

  20. HMM is a linear classifier: An example
      Det  Noun  Verb  Det  Noun
      The  dog   ate   the  homework
      Consider the log joint probability of this input and output.

  21. HMM is a linear classifier: An example
      Det  Noun  Verb  Det  Noun
      The  dog   ate   the  homework
      Consider the log joint probability of this input and output:
      Transition scores + Emission scores

  22. HMM is a linear classifier: An example
      Det  Noun  Verb  Det  Noun
      The  dog   ate   the  homework
      Transition scores:
        log P(Det → Noun) × 2
      + log P(Noun → Verb) × 1
      + log P(Verb → Det) × 1
      + Emission scores

  23. HMM is a linear classifier: An example
      Det  Noun  Verb  Det  Noun
      The  dog   ate   the  homework
      Transition scores:
        log P(Det → Noun) × 2
      + log P(Noun → Verb) × 1
      + log P(Verb → Det) × 1
      Emission scores:
      + log P(The | Det) × 1
      + log P(dog | Noun) × 1
      + log P(ate | Verb) × 1
      + log P(the | Det) × 1
      + log P(homework | Noun) × 1

  24. HMM is a linear classifier: An example
      Det  Noun  Verb  Det  Noun
      The  dog   ate   the  homework
      Transition scores:
        log P(Det → Noun) × 2
      + log P(Noun → Verb) × 1
      + log P(Verb → Det) × 1
      Emission scores:
      + log P(The | Det) × 1
      + log P(dog | Noun) × 1
      + log P(ate | Verb) × 1
      + log P(the | Det) × 1
      + log P(homework | Noun) × 1
      The log P terms are w, the parameters of the model.

  25. HMM is a linear classifier: An example
      Det  Noun  Verb  Det  Noun
      The  dog   ate   the  homework
      Transition scores:
        log P(Det → Noun) × 2
      + log P(Noun → Verb) × 1
      + log P(Verb → Det) × 1
      Emission scores:
      + log P(The | Det) × 1
      + log P(dog | Noun) × 1
      + log P(ate | Verb) × 1
      + log P(the | Det) × 1
      + log P(homework | Noun) × 1
      The counts (× 2, × 1, …) are φ(x, y), properties of this output and the input.

  26. HMM is a linear classifier: An example
      Det  Noun  Verb  Det  Noun
      The  dog   ate   the  homework
      w (parameters of the model)  ·  φ(x, y) (properties of this output and the input):
        log P(Det → Noun)       · 2
        log P(Noun → Verb)      · 1
        log P(Verb → Det)       · 1
        log P(The | Det)        · 1
        log P(dog | Noun)       · 1
        log P(ate | Verb)       · 1
        log P(the | Det)        · 1
        log P(homework | Noun)  · 1

  27. HMM is a linear classifier: An example
      Det  Noun  Verb  Det  Noun
      The  dog   ate   the  homework
      w (parameters of the model)  ·  φ(x, y) (properties of this output and the input):
        log P(Det → Noun)       · 2
        log P(Noun → Verb)      · 1
        log P(Verb → Det)       · 1
        log P(The | Det)        · 1
        log P(dog | Noun)       · 1
        log P(ate | Verb)       · 1
        log P(the | Det)        · 1
        log P(homework | Noun)  · 1
      log P(x, y) = a linear scoring function = w^T φ(x, y)
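The feature vector above can be checked mechanically. Since the slide leaves the probabilities unspecified, only the counts in φ(x, y) are computed here:

```python
from collections import Counter

x = ["The", "dog", "ate", "the", "homework"]
y = ["Det", "Noun", "Verb", "Det", "Noun"]

# Transition counts: how often each (previous tag -> current tag) pair occurs.
transitions = Counter((y[i - 1], y[i]) for i in range(1, len(y)))
# Emission counts: how often each (tag, word) pair occurs.
emissions = Counter((y[i], x[i]) for i in range(len(y)))

# Matches phi(x, y) on the slide: Det -> Noun fires twice,
# every other transition and emission fires once.
assert transitions[("Det", "Noun")] == 2
assert transitions[("Noun", "Verb")] == 1
assert transitions[("Verb", "Det")] == 1
assert all(c == 1 for c in emissions.values())
```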
