Learning a Language Model from Continuous Speech
Graham Neubig, Masato Mimura, Shinsuke Mori, Tatsuya Kawahara
School of Informatics, Kyoto University, Japan
this is the song that never ends it just goes on and on my friends and if you started singing it not knowing what it was you'll just keep singing it forever just because this is the song that never ends it just goes on and on my friends and if...
[Figure: hierarchical Pitman-Yor language model, drawn as a tree of distributions]
Hε ~ PY(Hbase, d1, Θ1)
Ha, Hb, … ~ PY(Hε, d2, Θ2)
Hba, Hca, … ~ PY(Ha, d3, Θ3)
Hab, Hdb, … ~ PY(Hb, d3, Θ3)
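A rough sketch of how a hierarchy like this is usually queried, assuming the standard Pitman-Yor predictive rule with discount d, strength Θ, customer counts c and table counts t. The class name `PYNode`, the one-table-per-word-type seating, the uniform base distribution, and all numeric values are illustrative assumptions, not details taken from the slides.

```python
from collections import defaultdict


class PYNode:
    """One distribution H_context in the hierarchy (hypothetical helper)."""

    def __init__(self, parent, discount, strength, base_prob=None):
        self.parent = parent          # node for the shorter context, or None
        self.d = discount             # discount d_n for this n-gram order
        self.theta = strength         # strength Theta_n for this order
        self.base_prob = base_prob    # used only at the root (stands in for H_base)
        self.customers = defaultdict(int)  # c(w): times w was observed here
        self.tables = defaultdict(int)     # t(w): tables serving w

    def add(self, word):
        # Simplification: the first occurrence opens one table and sends one
        # customer to the parent; a real sampler seats customers randomly.
        if self.customers[word] == 0:
            self.tables[word] = 1
            if self.parent is not None:
                self.parent.add(word)
        self.customers[word] += 1

    def prob(self, word):
        # Probability under the parent distribution (or the base at the root).
        if self.parent is not None:
            backoff = self.parent.prob(word)
        else:
            backoff = self.base_prob
        c_total = sum(self.customers.values())
        t_total = sum(self.tables.values())
        if c_total == 0:
            return backoff
        denom = self.theta + c_total
        seen = max(self.customers[word] - self.d * self.tables[word], 0.0)
        return seen / denom + (self.theta + self.d * t_total) / denom * backoff


# Toy usage over a four-symbol vocabulary (values are made up).
vocab = ["a", "b", "c", "d"]
h_eps = PYNode(None, 0.5, 1.0, base_prob=1.0 / len(vocab))  # H_eps ~ PY(H_base, d1, Theta1)
h_a = PYNode(h_eps, 0.5, 1.0)                               # H_a ~ PY(H_eps, d2, Theta2)
for w in ["b", "b", "c"]:
    h_a.add(w)
probs = {w: h_a.prob(w) for w in vocab}
```

Words never seen after the context still receive probability through the parent distribution, which is the point of the tree structure.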
Language model: PLM(i|<s>) PLM(am|i) PLM(in|am) PLM(<unk>|in) PLM(now|<unk>) PLM(</s>|now)
Spelling model: PSM(c|<s>) PSM(h|c) PSM(i|h) PSM(b|i) PSM(a|b) PSM(</s>|a)
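The factors above can be read as scoring "i am in chiba now" with a word bigram model, where the unknown word is additionally spelled out character by character. A minimal sketch under that reading follows; the function name `sentence_log_prob`, the dictionary layout, and every probability value are assumptions made for illustration.

```python
import math


def sentence_log_prob(words, lm, sm, vocab):
    """Log probability under a word bigram model (lm) with a character
    bigram spelling model (sm) for words outside vocab.

    Both tables are keyed as (symbol, previous_symbol)."""
    logp = 0.0
    prev = "<s>"
    for word in words + ["</s>"]:
        token = word if (word in vocab or word == "</s>") else "<unk>"
        logp += math.log(lm[(token, prev)])
        if token == "<unk>":
            # Spell the unknown word one character at a time.
            prev_char = "<s>"
            for ch in list(word) + ["</s>"]:
                logp += math.log(sm[(ch, prev_char)])
                prev_char = ch
        prev = token
    return logp


# Toy tables: every factor is set to 0.5 purely for illustration.
vocab = {"i", "am", "in", "now"}
lm = {("i", "<s>"): 0.5, ("am", "i"): 0.5, ("in", "am"): 0.5,
      ("<unk>", "in"): 0.5, ("now", "<unk>"): 0.5, ("</s>", "now"): 0.5}
sm = {("c", "<s>"): 0.5, ("h", "c"): 0.5, ("i", "h"): 0.5,
      ("b", "i"): 0.5, ("a", "b"): 0.5, ("</s>", "a"): 0.5}
logp = sentence_log_prob(["i", "am", "in", "chiba", "now"], lm, sm, vocab)
```

With these toy tables the result is the log of twelve factors, six from each model, matching the twelve terms listed above.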
[Figure: lattice with states s0, s1, s2, s3, s4, s5 and edges p(s1|s0), p(s2|s0), p(s3|s1), p(s4|s1), p(s3|s2), p(s4|s2), p(s5|s3), p(s5|s4). The animation visits s0, s1, s2, s3, s4, s5 in order, then works backward from s5, choosing s3 and then s2, each in proportion (∝) to its probability.]
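The lattice above can be used to illustrate forward filtering followed by backward sampling. The sketch below assumes the usual formulation: the forward pass accumulates the total path probability into each state, and the backward pass picks each predecessor in proportion to its forward probability times the edge probability. The edge values and function names are made up for illustration.

```python
import random

# EDGES[(prev, next)] = p(next | prev); the numbers are illustrative only.
EDGES = {
    ("s0", "s1"): 0.6, ("s0", "s2"): 0.4,
    ("s1", "s3"): 0.5, ("s1", "s4"): 0.5,
    ("s2", "s3"): 0.3, ("s2", "s4"): 0.7,
    ("s3", "s5"): 1.0, ("s4", "s5"): 1.0,
}
ORDER = ["s0", "s1", "s2", "s3", "s4", "s5"]  # topological order


def forward_filter(edges, order):
    """f[s] = total probability of all paths from the first state to s."""
    f = {s: 0.0 for s in order}
    f[order[0]] = 1.0
    for state in order[1:]:
        f[state] = sum(p * f[prev]
                       for (prev, nxt), p in edges.items() if nxt == state)
    return f


def backward_sample(edges, order, f, rng):
    """Sample one path, walking from the last state back to the first."""
    path = [order[-1]]
    while path[-1] != order[0]:
        cur = path[-1]
        preds = [(prev, p * f[prev])
                 for (prev, nxt), p in edges.items() if nxt == cur]
        states, weights = zip(*preds)
        # Choose a predecessor with probability proportional to its weight.
        path.append(rng.choices(states, weights=weights)[0])
    return path[::-1]


f = forward_filter(EDGES, ORDER)
path = backward_sample(EDGES, ORDER, f, random.Random(0))
```

Sampling backward rather than forward means every sampled path is guaranteed to reach the final state, and complete paths are drawn according to their probability under the lattice.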
[Figure: WFST arcs (input/output): i/i, a/a, m/m, a/ε, m/ε, ε/am_w, m/m_c]
[Figure: states ε, w1, w2, <s>, c1, c2, ε, with arcs (label:weight):
w1:p(w1), w2:p(w2), w2:p(w2|w1), ε:p(FB), ε:p(FB|w1), ε:p(FB|w2),
c1:p(c1), c2:p(c2), c1:p(c1|<s>), ε:p(FB|<s>), ε:p(FB|c1), ε:p(FB|c2), ε:p(</s>|c1), ε:p(</s>|c2)]
[Figure: WFST arcs (input/output:weight), where outputs carry a subscript w or c:
i/i_w:PL(i), a/a_w:PL(a), m/am_w:PL(am), a/ε, three arcs weighted PL(FB),
i/i_c:PS(i), a/a_c:PS(a), m/m_c:PS(m), three arcs ε/</s>:PS(</s>), and ε/</s>:PL(</s>)]
[Figure: acoustic model lattice for "i a m", with arcs (phoneme/weight): i/PAM(i), e/PAM(e), y/PAM(y), a/PAM(a), a/PAM(a), m/PAM(m)]
[Figure: Phoneme Error Rate (25.00% to 28.00%) vs. Size of Training Data (7.9, 16.1, 31.1, 58.7, 116.7 minutes) for Proposed 3-gram, Proposed 2-gram, and Proposed 1-gram]
AM Only 34.2%
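The slides do not define how phoneme error rate is computed. A minimal sketch, assuming the common definition (edit distance between the recognized and reference phoneme sequences, divided by the reference length), is given below; the function names and the example sequences are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences, one row at a time."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(prev[j] + 1,               # deletion
                           cur[j - 1] + 1,            # insertion
                           prev[j - 1] + (r != h)))   # substitution or match
        prev = cur
    return prev[-1]


def phoneme_error_rate(ref, hyp):
    """Errors per reference phoneme (assumed definition)."""
    return edit_distance(ref, hyp) / len(ref)


# Made-up example: one deletion and one substitution over six phonemes.
ref = ["k", "a", "n", "g", "a", "e"]
hyp = ["k", "a", "g", "a", "i"]
per = phoneme_error_rate(ref, hyp)
```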
[Figure: Phoneme Error Rate (24.00% to 29.00%) vs. Size of Training Data (0.0, 7.9, 16.1, 31.1, 58.7, 116.7 minutes)]
rimasukeredomo, mo:shiage, yu:fu:ni jo:kyo:, kangae, chi:ki, toki, shiteki
[Figure: Per-Syllable Entropy (4 to 7) vs. Minutes of Training Data (7.9, 16.1, 31.1, 58.7, 116.7)]
[Figure: Phoneme Error Rate (24.00% to 28.00%) vs. Minutes of Training Data (0.0, 7.9, 16.1, 31.1, 58.7, 116.7) for Proposed 3-gram, Oracle 3-gram, and Transcript 3-gram]
– 0.5-1 times real time
– Perform beam-search trimming during forward filtering
– Parallel sampling
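[Figure: WFST composition example (input/output:weight). Arcs a/s:2 and b/t:1 composed with arcs s/x:0.5 and t/y:3 give a/x:2.5 and b/y:4.]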
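The arc weights above are consistent with WFST composition in which weights are added (2 + 0.5 = 2.5 and 1 + 3 = 4). A minimal sketch of that operation follows, assuming single-state transducers represented as plain lists of arcs; the function name `compose` and this flat representation are illustrative simplifications, and real transducers also track states.

```python
def compose(first, second):
    """Compose two single-state transducers given as lists of arcs
    (input, output, weight). Weights are added, as for costs."""
    result = []
    for in1, out1, w1 in first:
        for in2, out2, w2 in second:
            # An arc survives only when the first transducer's output
            # matches the second transducer's input.
            if out1 == in2:
                result.append((in1, out2, w1 + w2))
    return result


first = [("a", "s", 2.0), ("b", "t", 1.0)]
second = [("s", "x", 0.5), ("t", "y", 3.0)]
composed = compose(first, second)
```

In practice this would be done with a WFST toolkit rather than by hand; the sketch only shows why the composed weights come out as 2.5 and 4.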
[Figure: states ε, <s>, c1, c2, with arcs ε:p(FB), ε:p(</s>|c1), ε:p(</s>|c2)]
$P(w_i \mid w_{i-2}, \mathrm{UNK}) = P(w_i \mid \mathrm{UNK}) = P(w_i)$
* technically not true if the same word appears twice in a single sentence