NLP Programming Tutorial 3 – The Perceptron Algorithm
Graham Neubig Nara Institute of Science and Technology (NAIST)
Binary Prediction (2 choices)
    A book review: is it positive?
        Oh, man I love this book! → yes
        This book is so boring... → no

Multi-class Prediction (several choices)
    A tweet: its language
        On the way to the park! → English
        公園に行くなう! → Japanese

Structured Prediction (millions of choices)
    A sentence: its syntactic parse
        I read a book → [Figure: parse tree of “I read a book” with node labels S, NP, VP, N, VBD, DET, NN]
Given: Gonso was a Sanron sect priest (754-827) in the late Nara and early Heian periods.
Predict: Yes!

Given: Shichikuzan Chigogataki Fudomyoo is a historical site located at Magura, Maizuru City, Kyoto Prefecture.
Predict: No!
Gonso was a Sanron sect priest ( 754 – 827 ) in the late Nara and early Heian periods .
Shichikuzan Chigogataki Fudomyoo is a historical site located at Magura , Maizuru City , Kyoto Prefecture .
Contains “priest” → probably person!
Contains “site” → probably not person!
Contains “(<#>-<#>)” → probably person!
Contains “Kyoto Prefecture” → probably not person!
Each feature has a weight: positive if it indicates “yes”, and negative if it indicates “no”
w_contains “priest” = 2
w_contains “(<#>-<#>)” = 1
w_contains “site” = -3
w_contains “Kyoto Prefecture” = -1

Kuya (903-972) was a priest born in Kyoto Prefecture.
2 + -1 + 1 = 2
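The weighted sum above can be sketched in Python. The slides give only the feature names and weights, so the detection rules below (substring checks and a regular expression for the “(<#>-<#>)” pattern) and the names `WEIGHTS`, `active_features`, and `score` are assumptions made for illustration.

```python
import re

# Hand-set weights copied from the example above.
WEIGHTS = {
    "contains priest": 2,
    "contains (<#>-<#>)": 1,
    "contains site": -3,
    "contains Kyoto Prefecture": -1,
}


def active_features(sentence):
    """Return the names of the features that fire on this sentence.

    The detection rules are illustrative assumptions, not from the slides.
    """
    features = []
    if "priest" in sentence:
        features.append("contains priest")
    # A parenthesized number range such as "(903-972)".
    if re.search(r"\(\d+\s*[-–]\s*\d+\)", sentence):
        features.append("contains (<#>-<#>)")
    if "site" in sentence:
        features.append("contains site")
    if "Kyoto Prefecture" in sentence:
        features.append("contains Kyoto Prefecture")
    return features


def score(sentence):
    """Sum the weights of all features that fire."""
    return sum(WEIGHTS[name] for name in active_features(sentence))


kuya = "Kuya (903-972) was a priest born in Kyoto Prefecture."
print(active_features(kuya))
print(score(kuya))  # 2 + 1 + -1 = 2
```

A positive score reads as “yes” and a negative score as “no”. The plain substring checks are crude (for example, “site” would also match inside “website”), which is acceptable for a sketch.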
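$$y = \operatorname{sign}(w \cdot \varphi(x)) = \operatorname{sign}\left(\sum_{i=1}^{I} w_i \cdot \varphi_i(x)\right)$$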
x = A site , located in Maizuru , Kyoto
φ_unigram “A”(x) = 1
φ_unigram “site”(x) = 1
φ_unigram “,”(x) = 2
φ_unigram “located”(x) = 1
φ_unigram “in”(x) = 1
φ_unigram “Maizuru”(x) = 1
φ_unigram “Kyoto”(x) = 1
φ_unigram “the”(x) = 0
φ_unigram “temple”(x) = 0
…
The rest are all 0
For convenience, we use feature names (φ_unigram “A”) instead of feature indexes (φ_1)
x = A site , located in Maizuru , Kyoto
feature               φ(x)    w
unigram “A”           1       0
unigram “site”        1       -3
unigram “,”           2       0
unigram “located”     1       0
unigram “in”          1       0
unigram “Maizuru”     1       0
unigram “Kyoto”       1       0
unigram “priest”      0       2
unigram “black”       0       0

1*0 + 1*-3 + 2*0 + 1*0 + 1*0 + 1*0 + 1*0 + 0*2 + 0*0 + … = -3
predict_all(model_file, input_file):
    load w from model_file              # so w[name] = w_name
    for each x in input_file
        phi = create_features(x)        # so phi[name] = φ_name(x)
        y' = predict_one(w, phi)        # calculate sign(w*φ(x))
        print y'
predict_one(w, phi)
    score = 0
    for each name, value in phi         # score = w*φ(x)
        if name exists in w
            score += value * w[name]
    if score >= 0
        return 1
    else
        return -1
create_features(x):
    create map phi
    split x into words
    for word in words
        phi[“UNI:”+word] += 1           # We add “UNI:” to indicate unigrams
    return phi
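The three prediction routines above can be sketched as runnable Python. The slides do not specify the model file format, so the tab-separated “name, weight” lines parsed by the hypothetical helper `load_weights` are an assumption; `predict_all` here takes lists of lines rather than file names and returns the labels instead of printing them, so that the sketch stays self-contained.

```python
from collections import defaultdict


def create_features(x):
    """Map a whitespace-tokenized sentence to unigram counts."""
    phi = defaultdict(int)
    for word in x.split():
        phi["UNI:" + word] += 1  # "UNI:" marks unigram features
    return phi


def predict_one(w, phi):
    """Return sign(w * phi(x)), treating a score of 0 as +1."""
    score = 0
    for name, value in phi.items():
        if name in w:
            score += value * w[name]
    return 1 if score >= 0 else -1


def load_weights(lines):
    """Parse 'name<TAB>weight' lines (an assumed model-file format)."""
    w = {}
    for line in lines:
        name, value = line.rstrip("\n").split("\t")
        w[name] = float(value)
    return w


def predict_all(model_lines, input_lines):
    """Predict a label for every input sentence."""
    w = load_weights(model_lines)
    return [predict_one(w, create_features(x)) for x in input_lines]


# Hypothetical two-feature model and two input sentences.
model = ["UNI:site\t-3", "UNI:priest\t2"]
sentences = ["A site , located in Maizuru , Kyoto", "Kuya was a priest"]
print(predict_all(model, sentences))  # [-1, 1]
```

To read from real files, the same functions can be called with `open(model_file)` and `open(input_file)`, since both iterate line by line.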
y     x
1     FUJIWARA no Chikamori ( year of birth and death unknown ) was a samurai and poet who lived at the end of the Heian period .
1     Ryonen ( 1646 - October 29 , 1711 ) was a Buddhist nun of the Obaku Sect who lived from the early Edo period to the mid-Edo period .
-1    A moat settlement is a village surrounded by a moat .
-1    Fushimi Momoyama Athletic Park is located in Momoyama-cho , Kyoto City , Kyoto Prefecture .
create map w
for I iterations
    for each labeled pair x, y in the data
        phi = create_features(x)
        y' = predict_one(w, phi)
        if y' != y
            update_weights(w, phi, y)
– Features for positive examples get a higher weight
– Features for negative examples get a lower weight
→ Every time we update, the score for that example moves toward the correct label!
update_weights(w, phi, y)
    for name, value in phi:
        w[name] += value * y
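Putting the training loop and the update rule together, a minimal Python sketch might look like the following. The function name `train_perceptron` and the two-sentence toy data set are hypothetical and chosen only so that the result is easy to check by hand.

```python
from collections import defaultdict


def create_features(x):
    """Map a whitespace-tokenized sentence to unigram counts."""
    phi = defaultdict(int)
    for word in x.split():
        phi["UNI:" + word] += 1
    return phi


def predict_one(w, phi):
    """Return sign(w * phi(x)), treating a score of 0 as +1."""
    score = sum(value * w.get(name, 0) for name, value in phi.items())
    return 1 if score >= 0 else -1


def update_weights(w, phi, y):
    """Move each active weight toward the correct label y (+1 or -1)."""
    for name, value in phi.items():
        w[name] += value * y


def train_perceptron(data, iterations):
    """Run the online perceptron over (x, y) pairs for a fixed number of passes."""
    w = defaultdict(int)
    for _ in range(iterations):
        for x, y in data:
            phi = create_features(x)
            if predict_one(w, phi) != y:  # update only on mistakes
                update_weights(w, phi, y)
    return w


# Hypothetical toy data, not from the slides.
data = [("a priest", 1), ("a site", -1)]
w = train_perceptron(data, iterations=3)
print(dict(w))
print([predict_one(w, create_features(x)) for x, _ in data])  # [1, -1]
```

On this toy data the weights stop changing after the second pass: the shared word “a” ends at 0, while “priest” and “site” end at 1 and -1.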
x = A site , located in Maizuru , Kyoto y = -1
w_unigram “A” = -1
w_unigram “site” = -1
w_unigram “,” = -2
w_unigram “located” = -1
w_unigram “in” = -1
w_unigram “Maizuru” = -1
w_unigram “Kyoto” = -1
x = Shoken , monk born in Kyoto y = 1
w_unigram “A” = -1
w_unigram “site” = -1
w_unigram “,” = -1
w_unigram “located” = -1
w_unigram “in” = 0
w_unigram “Maizuru” = -1
w_unigram “Kyoto” = 0
w_unigram “Shoken” = 1
w_unigram “monk” = 1
w_unigram “born” = 1
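The two updates in this walkthrough can be replayed in a few lines of Python, assuming all weights start at 0 and a score of 0 is predicted as +1 (as in predict_one). The variable names `after_first` and `after_second` are introduced here only for the illustration.

```python
from collections import defaultdict


def create_features(x):
    """Map a whitespace-tokenized sentence to unigram counts."""
    phi = defaultdict(int)
    for word in x.split():
        phi["UNI:" + word] += 1
    return phi


def predict_one(w, phi):
    """Return sign(w * phi(x)), treating a score of 0 as +1."""
    score = sum(value * w.get(name, 0) for name, value in phi.items())
    return 1 if score >= 0 else -1


def update_weights(w, phi, y):
    """Add y times each feature value to the matching weight."""
    for name, value in phi.items():
        w[name] += value * y


w = defaultdict(int)  # all weights start at 0

# First example: score is 0, so the prediction is +1, but y = -1.
phi = create_features("A site , located in Maizuru , Kyoto")
first_prediction = predict_one(w, phi)
update_weights(w, phi, -1)
after_first = dict(w)

# Second example: ",", "in" and "Kyoto" now carry negative weight,
# so the prediction is -1, but y = 1.
phi = create_features("Shoken , monk born in Kyoto")
second_prediction = predict_one(w, phi)
update_weights(w, phi, 1)
after_second = dict(w)

print(first_prediction, after_first)
print(second_prediction, after_second)
```

Both examples are misclassified before their update, which is why both trigger a weight change; the shared features “,”, “in”, and “Kyoto” are pulled back toward 0 by the second update.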
analyze why