Natural Language Processing
Acoustic Models
Dan Klein – UC Berkeley
The Noisy Channel Model
Acoustic model: HMMs over word positions with mixtures of Gaussians as emissions
Language model: distributions over sequences of words (sentences)
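In symbols, recognition combines the two models through Bayes' rule, with A the acoustics and W the word sequence:

```latex
W^* = \arg\max_W P(W \mid A)
    = \arg\max_W \underbrace{P(A \mid W)}_{\text{acoustic model}}\;
                 \underbrace{P(W)}_{\text{language model}}
```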
Figure: J & M
Figure: J & M
Figure: Bryan Pellom
Frames: 25 ms windows every 10 ms, yielding acoustic feature vectors a1, a2, a3, …
Figure: Simon Arnfield
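A minimal sketch of this framing step (25 ms windows every 10 ms); the 16 kHz sample rate is an assumption for illustration:

```python
import numpy as np

def frames(signal, rate=16000, win_ms=25, step_ms=10):
    """Slice a waveform into overlapping analysis frames.

    25 ms windows every 10 ms follow the slide; the 16 kHz
    rate is an illustrative assumption.
    """
    win = int(rate * win_ms / 1000)    # 400 samples per window
    step = int(rate * step_ms / 1000)  # 160-sample hop
    n = 1 + max(0, (len(signal) - win) // step)
    return np.stack([signal[i * step : i * step + win] for i in range(n)])

# One second of audio -> 98 overlapping frames (a1, a2, a3, ...)
a = frames(np.zeros(16000))
```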
Mel scale: more resolution in the lower freqs; equal sample spacing above and below 1 kHz
[Graph: Wikipedia]
Vector quantization (VQ): partition the D-dimensional feature space with a codebook, mapping each acoustic vector to one of a set of discrete symbols; emissions can then be estimated just by counting codes. This moves work from the model to the preprocessing; it is not used for high-quality ASR any more, but it is a starting point.
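A toy sketch of the VQ idea; the two-entry codebook below is invented for illustration:

```python
import numpy as np

def quantize(x, codebook):
    """Map an acoustic vector to the index of its nearest codebook
    entry (Euclidean distance); codebook is a k x D array."""
    return int(np.argmin(np.linalg.norm(codebook - x, axis=1)))

# With vectors reduced to discrete codes, emission probabilities for
# a state can be estimated just by counting code frequencies.
codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
codes = [quantize(v, codebook) for v in np.array([[0.1, 0.2], [0.9, 1.1]])]
```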
The possible values of the features look roughly normally distributed, so why not model the likelihood function as a Gaussian? From bartus.org/akustyk
[Figure: a 1-D Gaussian density P(x) over x: P(x) is highest at the mean and low far from the mean]
A Gaussian is parameterized by a mean and a variance:
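Written out, the density with mean $\mu$ and variance $\sigma^2$ is:

```latex
\mathcal{N}(x;\mu,\sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}
  \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
```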
Text and figures from Andrew Ng
[Figure: 2-D Gaussian over the value of x and the value of y]
Text and figures from Andrew Ng
A single Gaussian, even with full covariances, does a bad job of modeling a complex distribution in any dimension; the standard remedy is mixtures of Gaussians.
From openlearn.open.ac.uk
From robots.ox.ac.uk http://www.itee.uq.edu.au/~comp4702
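The point shows up already in 1-D: a mixture can be bimodal where a single Gaussian cannot. The weights and components below are made-up numbers for illustration:

```python
import math

def gauss(x, mu, var):
    # 1-D Gaussian density with mean mu and variance var
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def gmm_density(x, weights, means, variances):
    """Mixture of 1-D Gaussians: a weighted sum of component densities."""
    return sum(w * gauss(x, m, v)
               for w, m, v in zip(weights, means, variances))

# Two well-separated components produce two modes:
p = lambda x: gmm_density(x, [0.5, 0.5], [-2.0, 2.0], [0.5, 0.5])
```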
Emission distribution P(x|s) (likelihood function) parameterized by:
- mixtures of Gaussians with D×1 mean and diagonal variance vectors, or
- learned codebook entries, a learned codebook distance function, and a multinomial over codes
(inventories, cf. next week)
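As an illustrative sketch (not the slide's exact parameterization), a D-dimensional Gaussian with a D×1 mean and diagonal variance vector has a cheap log-likelihood, since the covariance never needs to be inverted as a full matrix:

```python
import numpy as np

def log_lik_diag(x, mean, var):
    """Log-density of a D-dim Gaussian with diagonal covariance;
    mean and var are D-length vectors. Working in log space
    avoids underflow for large D."""
    return float(-0.5 * np.sum(np.log(2 * np.pi * var)
                               + (x - mean) ** 2 / var))
```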
[Figure: decoding lattice over the words "the", "cat", "chased", "dog", "has"]
Figure: J & M
Figure: J & M
Figure from Huang et al page 618
[Figure: spectrogram, time 0.48152–0.937203 s, frequency 0–5000 Hz, showing the phones "ay" and "k"]
Figure: J & M
Figure: J & M
Figure: J & M
Which triphones to cluster together? (based on 'broad phonetic classes') [1994]
Figure: J & M
Most likely word sequence:
d-ae-d
Most likely state sequence:
d1-d6-d6-d4-ae5-ae2-ae3-ae0-d2-d2-d3-d7-d5
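The state sequence above is what Viterbi decoding returns. A minimal sketch over a hypothetical two-state HMM (all probabilities below are invented for illustration, not the subphone model on the slide):

```python
import math

def viterbi(obs, states, start, trans, emit):
    """Most likely state sequence under an HMM, computed in log space."""
    # Initialize with start probabilities times first emission.
    V = [{s: math.log(start[s]) + math.log(emit[s][obs[0]]) for s in states}]
    back = []
    for o in obs[1:]:
        col, ptr = {}, {}
        for s in states:
            # Best predecessor for state s at this time step.
            prev = max(states, key=lambda p: V[-1][p] + math.log(trans[p][s]))
            col[s] = V[-1][prev] + math.log(trans[prev][s]) + math.log(emit[s][o])
            ptr[s] = prev
        V.append(col)
        back.append(ptr)
    # Follow backpointers from the best final state.
    best = max(states, key=lambda s: V[-1][s])
    path = [best]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]

# Toy model: two states emitting "lo"/"hi" observations.
states = ["d", "ae"]
start = {"d": 0.9, "ae": 0.1}
trans = {"d": {"d": 0.6, "ae": 0.4}, "ae": {"d": 0.4, "ae": 0.6}}
emit = {"d": {"lo": 0.8, "hi": 0.2}, "ae": {"lo": 0.2, "hi": 0.8}}
path = viterbi(["lo", "hi", "lo"], states, start, trans, emit)
```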
Figure: Enrique Benimeli
Figure: Enrique Benimeli
(pushed weights)
Figure: Aubert, 02
[Figure: hypotheses expanded each time step: "the b", "the m", "and then", "at then", "the ba", "the be", "the bi", "the ma", "the me", "the mi", "then a", "then e", "then i"; surviving hypotheses: "the ba", "the be", "the ma", "then a"]
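The expand-then-prune loop behind such hypothesis lists can be written generically; `expand` and `score` here are hypothetical stand-ins for a real decoder's hypothesis extension and scoring:

```python
def beam_search(init, expand, score, width, steps):
    """Generic beam search: expand every hypothesis, then keep only
    the `width` best-scoring ones at each time step. A sketch of the
    pruning idea, not a full decoder."""
    beam = [init]
    for _ in range(steps):
        candidates = [h for hyp in beam for h in expand(hyp)]
        beam = sorted(candidates, key=score, reverse=True)[:width]
    return beam

# Toy usage: grow strings over {"a", "b"}, preferring more "a"s.
beam = beam_search("",
                   expand=lambda h: [h + c for c in "ab"],
                   score=lambda h: h.count("a"),
                   width=2, steps=2)
```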
assumptions in the model