Sequence Labeling Prof. Sameer Singh CS 295: STATISTICAL NLP



SLIDE 1

Sequence Labeling

  • Prof. Sameer Singh

CS 295: STATISTICAL NLP WINTER 2017

January 31, 2017

Based on slides from Nathan Schneider, Noah Smith, Yejin Choi, and everyone else they copied from.

SLIDE 2

Outline

  • Sequence Labeling and POS Tagging
  • Generative Modeling: HMMs
  • Inference in HMMs: Viterbi and F/B
  • Unsupervised Tagging using EM

SLIDE 3

Outline

  • Sequence Labeling and POS Tagging
  • Generative Modeling: HMMs
  • Inference in HMMs: Viterbi and F/B
  • Unsupervised Tagging using EM

SLIDE 4

Classification

  • Sentiment Analysis
  • Identify Topic
  • Language Model

SLIDE 5

Sequence Labeling

SLIDE 6

Parts of Speech

This  is  a    simple  sentence  .
DET   VB  DET  ADJ     NOUN      .

Applications:

  • Text to speech: record, lead, …
  • Machine translation: run, walk, …
  • Noun phrases: `grep {JJ | NN}* {NN | NNS}`
  • and many others…
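
The noun-phrase pattern above can be sketched in Python as a regular expression over the tag sequence rather than over the words. This is a minimal illustration, assuming Penn Treebank style tags (`JJ`, `NN`, `NNS`) and input already tagged as `(word, tag)` pairs; the function name `noun_phrases` is hypothetical.

```python
import re

# (JJ | NN)* (NN | NNS), written over a space-terminated tag string.
# The lookbehind forces every match to start at a tag boundary.
NP_PATTERN = re.compile(r"(?<!\S)(?:(?:JJ|NN) )*NNS? ")

def noun_phrases(tagged):
    """Return the word spans whose tags match (JJ | NN)* (NN | NNS)."""
    # Encode the tags as "DT JJ NN " so each tag ends with exactly one space.
    tags = "".join(f"{tag} " for _, tag in tagged)
    phrases = []
    for match in NP_PATTERN.finditer(tags):
        # Counting spaces before an offset converts it to a token index.
        start = tags[:match.start()].count(" ")
        end = tags[:match.end()].count(" ")
        phrases.append(" ".join(word for word, _ in tagged[start:end]))
    return phrases
```

Matching on tags instead of words is what makes the pattern general: one expression covers every adjective and noun in the vocabulary.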

SLIDE 7

Parts of Speech: Tags

“Open classes”

  • Nouns, verbs, adjectives, adverbs, numbers

“Closed classes”

  • Modal verbs
  • Prepositions (on, to)
  • Particles (off, up)
  • Determiners (the, some)
  • Pronouns (she, they)
  • Conjunctions (and, or)

SLIDE 8

Named Entity Recognition

Barack  Obama  spoke  from  the  White  House  today  .
PER     PER    O      O     O    LOC    LOC    O      O
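
Per-token labels like these are usually turned back into entity spans by grouping adjacent tokens that share a label. The sketch below assumes the plain `PER`/`LOC`/`O` scheme shown on this slide (no B-/I- prefixes); the function name `entity_spans` is hypothetical.

```python
def entity_spans(tokens, labels):
    """Group contiguous tokens that share a non-O label into (label, text) spans."""
    spans, start = [], None
    # The trailing "O" sentinel closes a span that runs to the end of the sentence.
    for i, label in enumerate(labels + ["O"]):
        if start is not None and label != labels[start]:
            spans.append((labels[start], " ".join(tokens[start:i])))
            start = None
        if start is None and label != "O":
            start = i
    return spans
```

Without begin/inside markers, two adjacent entities of the same type would be merged into one span, which is one reason tagging schemes with explicit boundaries are often used instead.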

SLIDE 9

Field Segmentation: Ads

3BR   flat  in  Bruntsfield  ,  near  main  roads  .  Bright  ,  well  maintained  ...
SIZE  TYPE  O   LOC          O  LOC   LOC   LOC    O  FEAT    O  FEAT  FEAT        ...

SLIDE 10

Field Segmentation: Citations

Authors Title Publication Venue

SLIDE 11

Outline

  • Sequence Labeling and POS Tagging
  • Generative Modeling: HMMs
  • Inference in HMMs: Viterbi and F/B
  • Unsupervised Tagging using EM

SLIDE 12

Naïve Bayes Classifier

SLIDE 13

“Transitions” matter

How do we select a “consistent” set of POS tags?

“Impossible” Transitions

  • Two determiners never follow each other
  • Two base form verbs never follow each other
  • Determiner is followed by adjective or noun

Based on semantics

  • Fruit flies like a bird.
  • Fruit flies like bananas.

SLIDE 14

“Transitions” matter

SLIDE 15

“Transitions” matter

Transition on Words versus Tags

  • Too many words: word-level transitions relearn the same pattern for every word
  • Tags support unseen words: “I like tenguizino!”

SLIDE 16

Hidden Markov Models

[Diagram: HMM from start state S to end state E]
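
For reference, a first-order HMM of this kind factorizes the joint probability of a word sequence and its tag sequence into transition and emission terms. The notation is assumed here for illustration and is not taken from the slide: $x_i$ are words, $y_i$ are tags, $n$ is the sentence length, and $y_0 = S$, $y_{n+1} = E$ are the start and end states.

```latex
P(x_1,\dots,x_n,\, y_1,\dots,y_n)
  = \Big[ \prod_{i=1}^{n} P(y_i \mid y_{i-1})\, P(x_i \mid y_i) \Big]\,
    P(y_{n+1} = E \mid y_n),
  \qquad y_0 = S
```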

SLIDE 17

Example Sentence

   This  is  a    simple  sentence
S  DET   VB  DET  ADJ     NOUN      E

SLIDE 18

Estimating Emissions

Smoothing

  • Unknown/rare words get inaccurate probabilities
  • Reminder: Laplace Smoothing (Add-k)
  • Next lecture: we will look at “features”
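
As a rough sketch of emission estimation with add-k smoothing, the code below counts (tag, word) pairs in a tagged corpus and smooths the relative frequencies. The function name `estimate_emissions`, the default `k=1.0`, and the choice to reserve one extra vocabulary slot for unknown words are assumptions made for illustration, not details from the slide.

```python
from collections import Counter

def estimate_emissions(tagged_sentences, k=1.0):
    """Return a function prob(word, tag) = P(word | tag) with add-k smoothing."""
    pair_counts, tag_counts, vocab = Counter(), Counter(), set()
    for sentence in tagged_sentences:
        for word, tag in sentence:
            pair_counts[(tag, word)] += 1
            tag_counts[tag] += 1
            vocab.add(word)
    # Assumption: one extra slot so all unseen words share a single "unknown" type.
    vocab_size = len(vocab) + 1

    def prob(word, tag):
        # Counter returns 0 for unseen pairs, so unknown words get k / denominator.
        return (pair_counts[(tag, word)] + k) / (tag_counts[tag] + k * vocab_size)

    return prob
```

With `k = 0` this reduces to the unsmoothed relative frequency, under which any unseen word has probability zero for every tag.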

SLIDE 19

Estimating Transitions

Interpolation

  • With too many tags, or too little data, some tag combinations are too rare to estimate reliably
  • As with n-gram language models, “back off” to simpler models
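
One simple way to combine a tag bigram model with a simpler model is linear interpolation with the tag unigram distribution. The sketch below is one possible instance, not necessarily the scheme used in the lecture; the function name `estimate_transitions`, the weight `lam=0.8`, and the state names `"S"` and `"E"` are assumptions.

```python
from collections import Counter

def estimate_transitions(tag_sequences, lam=0.8):
    """Return prob(tag, prev) = lam * bigram MLE + (1 - lam) * unigram MLE."""
    bigrams, context, unigrams = Counter(), Counter(), Counter()
    for tags in tag_sequences:
        padded = ["S"] + list(tags) + ["E"]  # add start and end states
        for prev, tag in zip(padded, padded[1:]):
            bigrams[(prev, tag)] += 1
            context[prev] += 1
            unigrams[tag] += 1
    total = sum(unigrams.values())

    def prob(tag, prev):
        # An unseen context contributes nothing from the bigram term.
        bigram = bigrams[(prev, tag)] / context[prev] if context[prev] else 0.0
        return lam * bigram + (1 - lam) * unigrams[tag] / total

    return prob
```

Because the unigram term is never zero for a tag seen in training, a transition that was never observed still receives a small probability instead of ruling out an entire tag sequence.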

SLIDE 20

Outline

  • Sequence Labeling and POS Tagging
  • Generative Modeling: HMMs
  • Inference in HMMs: Viterbi and F/B
  • Unsupervised Tagging using EM

SLIDE 21

Predicting from HMMs

SLIDE 22

Brute Force Inference

SLIDE 23

Conditional Independence

SLIDE 24

Dynamic Programming

SLIDE 25

State Lattice

   Fruit    flies    like     bananas
S  R(1,N)   R(2,N)   R(3,N)   R(4,N)   E
   R(1,V)   R(2,V)   R(3,V)   R(4,V)
   R(1,IN)  R(2,IN)  R(3,IN)  R(4,IN)

SLIDE 26

Viterbi Decoding Algorithm

  • Initialization
  • Iterative Computation (forward)
  • Follow pointers (backward)
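
The three steps can be sketched as follows for a first-order HMM. This is a minimal log-space version under stated assumptions: `trans(tag, prev)` and `emit(word, tag)` are hypothetical callables returning probabilities, `"S"` and `"E"` name the start and end states, and the sentence is non-empty.

```python
import math

def viterbi(words, tags, trans, emit):
    """Return the most probable tag sequence for words under a first-order HMM."""
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # Initialization: score of each tag for the first word, coming from S.
    score = [{t: log(trans(t, "S")) + log(emit(words[0], t)) for t in tags}]
    back = [{}]
    # Iterative computation (forward): best predecessor for every lattice cell.
    for i in range(1, len(words)):
        score.append({})
        back.append({})
        for t in tags:
            prev = max(tags, key=lambda p: score[i - 1][p] + log(trans(t, p)))
            score[i][t] = (score[i - 1][prev] + log(trans(t, prev))
                           + log(emit(words[i], t)))
            back[i][t] = prev
    # Include the transition into E, then follow pointers (backward).
    last = max(tags, key=lambda t: score[-1][t] + log(trans("E", t)))
    path = [last]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return path[::-1]
```

Working in log space replaces products of small probabilities with sums, which avoids numerical underflow on long sentences.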

SLIDE 27

Computational Complexity
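
As a rough sketch of the comparison, assume a sentence of $n$ words and a tag set of size $K$ (symbols introduced here for illustration). Brute force scores every one of the $K^n$ tag sequences, while Viterbi fills an $n \times K$ lattice and considers $K$ predecessors per cell:

```latex
\text{Brute force: } O(n \cdot K^{n})
\qquad
\text{Viterbi: } O(n \cdot K^{2}) \text{ time},\quad O(n \cdot K) \text{ space}
```

For a four-word sentence with three tags ($n = 4$, $K = 3$), brute force already enumerates $3^4 = 81$ sequences, and the count grows exponentially with sentence length.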

SLIDE 28

Outline

  • Sequence Labeling and POS Tagging
  • Generative Modeling: HMMs
  • Inference in HMMs: Viterbi and F/B
  • Unsupervised Tagging using EM

SLIDE 29

Unsupervised Tagging

Supervision is not always appropriate

  • Linguist has to read and understand each sentence
  • Time consuming and expensive
  • Contains domain-specific signal in the labels
  • WSJ doesn’t generalize to Twitter, for example
  • Difficult to agree on a universal part-of-speech tag set (C5: 61 tags, Brown: 87 tags)
  • Want to apply it to low-resource/unknown languages

Generalize the notion of “clustering” to sequence labeling.

SLIDE 30

Expectation Maximization

                      K-Means
Initialization        Pick K random centroids
Compute Expectations  Cluster all the points
Update Parameters     Update centroids
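
The K-means loop maps directly onto these three steps. The sketch below assumes one-dimensional points and a fixed number of iterations to keep it short; the function name `kmeans` and its parameters are hypothetical.

```python
import random

def kmeans(points, k, iterations=10, seed=0):
    """1-D K-means: initialize, assign points to clusters, update centroids."""
    rng = random.Random(seed)
    # Initialization: pick K random centroids from the data.
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        # Compute expectations (hard assignment): nearest centroid for each point.
        clusters = [[] for _ in range(k)]
        for x in points:
            nearest = min(range(k), key=lambda j: abs(x - centroids[j]))
            clusters[nearest].append(x)
        # Update parameters: move each centroid to the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        centroids = [sum(c) / len(c) if c else centroids[j]
                     for j, c in enumerate(clusters)]
    return sorted(centroids)
```

K-means makes a hard assignment in the expectation step. EM for an HMM instead keeps soft (fractional) counts for each tag at each position, but it follows the same alternation between assignments and parameter updates.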

SLIDE 31

Upcoming…

Homework

  • Homework 2 is due in ~10 days: February 9, 2017
  • Write-up, data, and code for Homework 2 are up
  • Ask questions early!

Project

  • Proposal is due in a week: February 7, 2017
  • Only 2 pages

Summaries

  • Paper summaries: February 17, February 28, March 14
  • Only 1 page each