Natural Language Processing
Spring 2017
Professor Liang Huang liang.huang.sh@gmail.com
Unit 1: Sequence Models
Lectures 7-8: Stochastic String Transformations (a.k.a. “channel-models”)
Viterbi (both max and sum)
CS 562 - Lec 5-6: Probs & WFSTs
V IH D IY OW (“VIDEO”)
From the CMU Pronunciation Dictionary (http://www.speech.cs.cmu.edu/cgi-bin/cmudict): 39 phonemes (15 vowels + 24 consonants)
echo 'W H A L E B O N E S' | carmel -sriIEQk 5 epron.wfsa epron-espell.wfst
CMU   IPA    Example translation
AA    /ɑ/
AE    /æ/    at
AH    /ʌ/    hut
AO    /ɔː/   ought (AO T)
AW    /aʊ/   cow
AY    /aɪ/   hide
B     /b/    be
CH    /tʃ/   cheese
D     /d/    dee
DH    /ð/    thee
EH    /ɛ/    Ed
ER    /ɚ/    hurt
EY    /eɪ/   ate
F     /f/    fee
G     /g/    green (G R IY N)
HH    /h/    he
IH    /ɪ/    it
IY    /iː/   eat
JH    /dʒ/   gee
K     /k/    key
L     /l/    lee
M     /m/    me
N     /n/    knee
NG    /ŋ/    ping
OW    /oʊ/   oat
OY    /ɔɪ/   toy
P     /p/    pee
R     /ɹ/    read
S     /s/    sea
SH    /ʃ/    she
T     /t/    tea
TH    /θ/    theta (TH EY T AH)
UH    /ʊ/    hood
UW    /u/    too
V     /v/    vee
W     /w/    we
Y     /j/    yield (Y IY L D)
Z     /z/    zee
ZH    /ʒ/    usual (Y UW ZH UW AH L)
WRONG! missing the SCHWA /ə/ (merged with the STRUT /ʌ/ “AH”), and stresses are not annotated!
A            AH
A            EY
AAA          T R IH P AH L EY
AABERG       AA B ER G
AACHEN       AA K AH N
...
ABOUT        AH B AW T
...
ABRAMOVITZ   AH B R AA M AH V IH T S
ABRAMOWICZ   AH B R AA M AH V IH CH
ABRAMOWITZ   AH B R AA M AH W IH T S
...
FATHER       F AA DH ER
...
ZYDECO       Z AY D EH K OW
ZYDECO       Z IH D AH K OW
ZYDECO       Z AY D AH K OW
...
ZZZZ         Z IY Z
echo 'HH EH L OW' | carmel -sliOEQk 50 epron-espell.wfst espell-eword.wfst eword.wfsa
K EH V IH N N AY T  →  K E B I N N A I T O  →  ケ ビ ン ナ イ ト (“Kevin Knight”)
inventory mismatch
constraint: [consonant] + vowel + [nasal n]
V = 50 syllables
more syllables: use alphabets
tutorial from course page!
[table: syllable structure, general vs. Japanese (roughly 10 consonants × 5 vowels ≈ 50 syllables)]
ㅇ null/ng, ㅈ j, ㅊ ch, ㅋ k, ㅌ t, ㅍ p, ㅎ h
Q: 강남 스타일 (“Gangnam Style”) = ?
Viterbi
from Knight & Sproat 09
Japanese just transliterates almost everything (even though its syllable inventory is really small), but it is quite easy for English speakers to decode, as long as you have a good language model!
Viterbi: decode チャイルドシート (transcribed in Romaji) back to the English words “child seat”
[Knight & Graehl 98]
WFST!
website for Mohri+Riley paper
[Knight & Graehl 98]
HW2: epron-jpron.data (MLE); eword-epron.data
HW3: Viterbi decoding
HW4: epron-jpron.data (EM)
Viterbi (both max and sum)
a language model
context-indep.
Viterbi...
context-dependence (from LM) propagates left and right!
how about unigram?
Q: what about top-down recursive + memoization?
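Yes: top-down recursion plus memoization fills in exactly the same table as bottom-up Viterbi, just on demand. A minimal sketch on a toy two-tag HMM (all tag names and probabilities below are invented for illustration):

```python
from functools import lru_cache

# toy HMM (hypothetical numbers, for illustration only)
tags = ["N", "V"]
trans = {("<s>", "N"): 0.7, ("<s>", "V"): 0.3,
         ("N", "N"): 0.4, ("N", "V"): 0.6,
         ("V", "N"): 0.8, ("V", "V"): 0.2}
emit = {("N", "flies"): 0.1, ("V", "flies"): 0.2,
        ("N", "fast"): 0.3, ("V", "fast"): 0.1}
words = ["flies", "fast"]

@lru_cache(maxsize=None)
def best(i, prev):
    """max probability of tagging words[i:], given the previous tag."""
    if i == len(words):
        return 1.0
    return max(trans[(prev, t)] * emit[(t, words[i])] * best(i + 1, t)
               for t in tags)

print(best(0, "<s>"))   # same number bottom-up Viterbi computes
```

The memo table has the same (position, tag) states as the bottom-up chart, so the asymptotic cost is identical.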
Q1: not normalized?
Q2: to be a V or an N?
Q3: how to train p(w|t)?
time complexity: O(nT³); in general, O(nT^g) for a g-gram model
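For a bigram tag model (g = 2) the bound is O(nT²), visible directly in the loop structure: n positions × T current tags × T previous tags. A bottom-up sketch; the toy tags and probabilities are invented for illustration:

```python
# toy HMM (all numbers hypothetical)
tags = ["N", "V"]
trans = {("<s>", "N"): 0.7, ("<s>", "V"): 0.3,
         ("N", "N"): 0.4, ("N", "V"): 0.6,
         ("V", "N"): 0.8, ("V", "V"): 0.2}
emit = {("N", "flies"): 0.1, ("V", "flies"): 0.2,
        ("N", "fast"): 0.3, ("V", "fast"): 0.1}

def viterbi(words, tags, trans, emit):
    """Bottom-up Viterbi; the triple loop is the O(n * T * T)."""
    best = {t: trans[("<s>", t)] * emit[(t, words[0])] for t in tags}
    back = []
    for i in range(1, len(words)):            # n positions
        new_best, new_back = {}, {}
        for t in tags:                        # T current tags
            score, prev = max((best[p] * trans[(p, t)], p) for p in tags)  # T previous tags
            new_best[t] = score * emit[(t, words[i])]
            new_back[t] = prev
        best, back = new_best, back + [new_back]
    tag = max(best, key=best.get)             # best final tag, then follow backpointers
    seq = [tag]
    for bp in reversed(back):
        tag = bp[tag]
        seq.append(tag)
    return list(reversed(seq)), max(best.values())

seq, prob = viterbi(["flies", "fast"], tags, trans, emit)
print(seq, prob)   # best tag sequence and its probability
```

Widening the context to a trigram model turns the inner `max` into a search over T² predecessor pairs, giving the O(nT³) on the slide.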
how to compute the normalization factor?
α (forward probabilities)
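The α (forward) recurrence is how the normalization factor is computed: replace Viterbi's max with a sum, and the sum of the final α values is Z, the total probability of all tag sequences. A sketch on a toy two-tag model (the numbers are invented):

```python
# toy HMM (hypothetical numbers)
tags = ["N", "V"]
trans = {("<s>", "N"): 0.7, ("<s>", "V"): 0.3,
         ("N", "N"): 0.4, ("N", "V"): 0.6,
         ("V", "N"): 0.8, ("V", "V"): 0.2}
emit = {("N", "flies"): 0.1, ("V", "flies"): 0.2,
        ("N", "fast"): 0.3, ("V", "fast"): 0.1}

def forward(words, tags, trans, emit):
    """alpha[t] = total probability of all tag paths ending in tag t."""
    alpha = {t: trans[("<s>", t)] * emit[(t, words[0])] for t in tags}
    for w in words[1:]:
        alpha = {t: sum(alpha[p] * trans[(p, t)] for p in tags) * emit[(t, w)]
                 for t in tags}
    return sum(alpha.values())       # Z, the normalization factor

Z = forward(["flies", "fast"], tags, trans, emit)
print(Z)
```

Note the cost is the same O(nT²) as Viterbi; only the max has become a sum.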
1. topological sort
2. visit each vertex v in sorted order and do updates:
   for each incoming edge (u, v) with weight w(u, v):   d(v) ⊕= d(u) ⊗ w(u, v)
total time: O(V + E)
see tutorial on DP from course page
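The two-step recipe above is semiring-generic: with (⊕, ⊗) = (max, ×) it is Viterbi, with (+, ×) it is the forward sum. A minimal sketch, assuming the topological order is given, on a made-up four-node DAG:

```python
def dag_dp(topo, in_edges, source, plus, times, zero, one):
    """Generic DAG DP over a semiring: d(v) ⊕= d(u) ⊗ w(u, v)."""
    d = {v: zero for v in topo}
    d[source] = one
    for v in topo:                          # 1. visit in topological order
        for u, w in in_edges.get(v, []):    # 2. relax each incoming edge (u, v)
            d[v] = plus(d[v], times(d[u], w))
    return d

# made-up DAG: s -> a (0.5), s -> b (0.5), a -> t (0.4), b -> t (0.2)
topo = ["s", "a", "b", "t"]
in_edges = {"a": [("s", 0.5)], "b": [("s", 0.5)],
            "t": [("a", 0.4), ("b", 0.2)]}

best = dag_dp(topo, in_edges, "s", max, lambda x, y: x * y, 0.0, 1.0)
total = dag_dp(topo, in_edges, "s", lambda x, y: x + y, lambda x, y: x * y, 0.0, 1.0)
print(best["t"], total["t"])   # Viterbi (max) vs. forward (sum)
```

Each vertex is visited once and each edge relaxed once, giving the O(V + E) bound.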
without spaces between words
Liang Huang (Penn) Dynamic Programming
民主  min-zhu  (people-dominate)  =  “democracy”
江泽民 主席  jiang-ze-min zhu-xi  (...-...-people dominate-podium)  =  “President Jiang Zemin”
this was 5 years ago. now Google is good at segmentation!
下 雨 天 地 面 积 水
xia yu tian di mian ji shui
(“on a rainy day, the ground accumulates water”)
segmentation as graph search vs. as a tagging problem
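Segmentation is the same kind of DP: best[i] is the best-scoring split of the first i characters, maximizing the product of unigram word probabilities. A sketch with a tiny hand-made dictionary (the probabilities are invented):

```python
def segment(s, worddict):
    """best[i] = probability of the best segmentation of s[:i]."""
    n = len(s)
    best = [0.0] * (n + 1)
    back = [0] * (n + 1)
    best[0] = 1.0
    for i in range(1, n + 1):
        for j in range(i):                      # try every word ending at i
            w = s[j:i]
            if w in worddict and best[j] * worddict[w] > best[i]:
                best[i] = best[j] * worddict[w]
                back[i] = j
    words, i = [], n                            # recover the words via backpointers
    while i > 0:
        words.append(s[back[i]:i])
        i = back[i]
    return list(reversed(words))

# hypothetical unigram probabilities
worddict = {"下雨": 0.1, "天": 0.2, "下雨天": 0.05, "地面": 0.1,
            "天地": 0.01, "面": 0.05, "积水": 0.1, "积": 0.02, "水": 0.1}
seg = segment("下雨天地面积水", worddict)
print(seg)
```

A real segmenter would add smoothing for out-of-vocabulary words and a higher-order language model, but the chart structure is identical.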
yu Shalong juxing le huitan
与 沙龙 举行 了 会谈
“held a talk with Sharon”
partial hypotheses: “with”, “with Sharon held a talk”, “talks Sharon held”
coverage vector: _ _ _ _ _ _ _●
yu Shalong juxing le huitan
与 沙龙 举行 了 会谈
held a talk with Sharon
[figure: partial hypotheses, e.g. “held a talk with Sharon”, paired with their coverage vectors (_ _ ●, _ _ _ _ _, ...)]
source-side: coverage vector target-side: grow hypotheses strictly left-to-right
space: O(2ⁿ), time: O(2ⁿ · n²) (cf. the traveling salesman problem)
=> (foreign phrases in English word order) => permutations (2ⁿ) => (foreign)
complexity => distortion limit
(Held and Karp, 1962; Knight, 1999)
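The O(2ⁿ · n²) bound is exactly the Held-Karp construction: a DP state is (coverage bitmask, last translated word), giving 2ⁿ · n states with n extensions each. A sketch that searches all permutations under a hypothetical bigram scoring function:

```python
def decode(words, score):
    """Held-Karp-style DP over (coverage bitmask, last word):
    O(2^n * n) states, each with O(n) extensions -> O(2^n * n^2) time."""
    n = len(words)
    best, back = {}, {}
    for i in range(n):
        best[(1 << i, i)] = score(None, words[i])
    for mask in range(1, 1 << n):          # every coverage vector
        for last in range(n):
            if (mask, last) not in best:
                continue
            for nxt in range(n):           # extend with an uncovered word
                if mask & (1 << nxt):
                    continue
                m2 = mask | (1 << nxt)
                s = best[(mask, last)] + score(words[last], words[nxt])
                if s > best.get((m2, nxt), float("-inf")):
                    best[(m2, nxt)] = s
                    back[(m2, nxt)] = last
    full = (1 << n) - 1
    last = max(range(n), key=lambda i: best[(full, i)])
    perm, mask = [last], full              # recover the permutation
    while (mask, last) in back:
        prev = back[(mask, last)]
        mask ^= 1 << last
        perm.append(prev)
        last = prev
    return [words[i] for i in reversed(perm)]

# hypothetical bigram scores favoring English word order
pairs = {(None, "held"): 1, ("held", "talks"): 2,
         ("talks", "with"): 2, ("with", "Sharon"): 2}
order = decode(["with", "Sharon", "held", "talks"],
               lambda a, b: pairs.get((a, b), 0))
print(order)
```

A distortion limit prunes most of the 2ⁿ masks, which is how real decoders make this tractable.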
a:ε  ε:a  b:ε  ε:b  a:b  b:a  a:a  b:b
O(k) deletion arcs, O(k) insertion arcs, O(k) identity arcs
courtesy of Jason Eisner
clara
.o.
a:ε  ε:a  b:ε  ε:b  a:b  b:a  a:a  b:b
.o.
caca
=
[composed lattice: deletion (c:ε, l:ε, ...), insertion (ε:c, ε:a, ...), and match/substitution (c:c, a:c, ...) arcs over the letters of “clara” and “caca”]
Best path (by Dijkstra’s algorithm)
b) what is the most likely conversion path?
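The best path through that composed machine is just the classic edit-distance DP: each chart cell is the cheapest path to the corresponding lattice state. A direct sketch with unit costs, matching the transducer's deletion, insertion, substitution, and identity arcs:

```python
def edit_distance(x, y):
    """Levenshtein distance: d[i][j] = cheapest path to lattice state (i, j)."""
    m, n = len(x), len(y)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i                              # deletion arcs (x_i : ε)
    for j in range(n + 1):
        d[0][j] = j                              # insertion arcs (ε : y_j)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = d[i-1][j-1] + (x[i-1] != y[j-1])   # identity or substitution arc
            d[i][j] = min(sub, d[i-1][j] + 1, d[i][j-1] + 1)
    return d[m][n]

print(edit_distance("clara", "caca"))   # 2: delete "l", substitute r -> c
```

Tracing backpointers through the chart recovers the same alignment that the best path through the composed WFST encodes.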
a) what is the highest score?