EE 225D, Lecture 10: Statistical Pattern Recognition (Professors N. Morgan / B. Gold, University of California, Berkeley, Spring 1999)


SLIDE 1

EE 225D N.MORGAN / B.GOLD LECTURE 10 10.1 LECTURE ON STATISTICAL PATTERN RECOGNITION

University of California, Berkeley

College of Engineering, Department of Electrical Engineering and Computer Sciences

Professors: N. Morgan / B. Gold, EE 225D, Spring 1999

Statistical Pattern Recognition

Lecture 10

SLIDE 2

Last Time

Choose the class maximizing the discriminant $\log p(x \mid \omega_i) + \log p(\omega_i)$.
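The discriminant can be sketched in code. In this minimal 1-D example, all class parameters and priors are made up for illustration; the classifier picks the class with the largest $\log p(x \mid \omega_i) + \log p(\omega_i)$:

```python
import numpy as np

# Hypothetical 1-D problem: two classes with Gaussian likelihoods.
# All numbers below are invented for illustration.
means = np.array([0.0, 3.0])
stds = np.array([1.0, 1.0])
priors = np.array([0.7, 0.3])

def log_gauss(x, mu, sigma):
    """Log of a univariate Gaussian density."""
    return -0.5 * np.log(2 * np.pi * sigma**2) - (x - mu)**2 / (2 * sigma**2)

def classify(x):
    """Pick the class maximizing log p(x|w_i) + log p(w_i)."""
    scores = log_gauss(x, means, stds) + np.log(priors)
    return int(np.argmax(scores))
```

Note how the larger prior of class 0 shifts the decision boundary toward class 1: at the midpoint x = 1.5 the likelihoods tie, and the prior breaks the tie.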

SLIDE 3

SLIDE 4

Discrete Density Estimation

Classes $\omega_0, \omega_1, \ldots, \omega_M$ (class index $i$); clusters $y_j$, $j = 0, \ldots, K-1$; $n_{ij}$ = count of training samples of class $i$ assigned to cluster $j$.

$$p(\omega_i) = \frac{\sum_j n_{ij}}{\sum_{i,j} n_{ij}} = \frac{\text{row total}}{\text{total}}$$

$$p(\omega_i \mid x) \approx p(\omega_i \mid y_j) = \frac{p(\omega_i, y_j)}{p(y_j)} = \frac{n_{ij}}{\sum_i n_{ij}}$$

SLIDE 5

K-means Clustering

1. Choose N centers
2. Assign each point to the nearest center
3. Recompute the centers as the means of their assigned points
4. Assess convergence; if not converged, return to step 2
5. Write the "codebook" of centers
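A minimal sketch of these steps (names are ours; for determinism the initial centers are simply the first N points, rather than a random choice):

```python
import numpy as np

def kmeans(points, n_centers, n_iter=100):
    """Minimal K-means sketch of the five steps above."""
    # 1. Choose N centers; here simply the first N points, for determinism
    #    (a real implementation would pick them at random or spread them out).
    centers = points[:n_centers].astype(float)
    labels = np.zeros(len(points), dtype=int)
    for _ in range(n_iter):
        # 2. Assign each point to the nearest center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # 3. Recompute each center as the mean of its assigned points.
        new_centers = np.array([
            points[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
            for k in range(n_centers)])
        # 4. Assess: stop once the centers no longer move.
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # 5. "Write" the codebook: the final centers (plus the assignments).
    return centers, labels
```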

SLIDE 6

Example: speech frame classification

  • 256-point DFT, keep 128 spectral values, take log power
  • Use K-means to find 64 centers, make a table
  • Assign each spectrum to a codebook entry, count co-occurrences with phoneme labels, get probabilities
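The counting step can be sketched as follows. The (codeword, label) pairs below are hypothetical stand-ins for the codebook assignments described above; the code builds the $n_{ij}$ table from the discrete density estimation slide and normalizes it:

```python
import numpy as np

# Hypothetical (codeword index, phoneme label index) pairs, e.g. produced by
# assigning each spectrum to its nearest codebook entry.
pairs = [(0, 0), (0, 0), (0, 1), (1, 1), (1, 1), (1, 1), (2, 0)]
n_codes, n_classes = 3, 2

# n[i, j]: number of times class i co-occurred with codeword j.
n = np.zeros((n_classes, n_codes))
for j, i in pairs:
    n[i, j] += 1

# p(w_i) = row total / total; p(w_i | y_j) = n_ij / sum_i n_ij.
p_class = n.sum(axis=1) / n.sum()
p_class_given_code = n / n.sum(axis=0)
```

(A real system would also smooth the counts so that unseen (class, codeword) pairs do not get probability exactly zero.)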

SLIDE 7

Estimators requiring iterative training

  • Gaussian mixtures
  • Neural networks
SLIDE 8

Gaussian Mixtures

$$p(x \mid \omega_k) = \sum_{j=1}^{M} p(j \mid \omega_k)\, p(x \mid \omega_k, j) = \sum_{j=1}^{M} c_j\, p(x \mid \omega_k, j)$$

where $c_j = p(j \mid \omega_k)$ is the probability that $x$ originated from distribution $j$, for components $c_1, \ldots, c_M$.
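A sketch of evaluating such a mixture density with Gaussian components (the component weights, means, and widths below are made up):

```python
import numpy as np

def gauss(x, mu, sigma):
    """Univariate Gaussian density (vectorized over components)."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / np.sqrt(2 * np.pi * sigma**2)

def mixture_density(x, c, mu, sigma):
    """p(x) = sum_j c_j p(x | j); the weights c must sum to 1."""
    return float(np.sum(c * gauss(x, mu, sigma)))
```

Because the weights sum to 1 and each component is a density, the mixture is itself a valid density (it integrates to 1).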

SLIDE 9

Expectation Maximization

(Also sometimes called Estimate-and-Maximize)

  • Potentially quite general
  • Used when the parameters cannot be determined analytically
  • E step: conditional expectation of the unknown variables, given what is known
  • M step: choose parameters to maximize that expectation
SLIDE 10

For class $\omega_i$: $p(x \mid \omega_i)$ and $p(\omega_i \mid x)$ become, once parameterized by $\theta$, $p(x \mid \omega_i, \theta)$ and $p(\omega_i \mid x, \theta)$. Maximum likelihood (ML) training chooses

$$\hat{\theta} = \arg\max_{\theta}\, p(x \mid \theta) \quad \text{for class } \omega_i$$

SLIDE 11

Let $k$ be the hidden variables, $x$ the observed data, $\theta$ the parameters, and $\theta^{old}$ the old parameters.

$$E\{\log p(k, x \mid \theta)\} = \sum_k p(k \mid x, \theta^{old}) \log p(k, x \mid \theta) = \sum_k p(k \mid x, \theta^{old}) \log\left[ p(k \mid x, \theta)\, p(x \mid \theta) \right]$$

SLIDE 12

$$Q(\theta, \theta^{old}) = \sum_k p(k \mid x, \theta^{old}) \log p(k \mid x, \theta) + \sum_k p(k \mid x, \theta^{old}) \log p(x \mid \theta)$$

In the second sum, $\log p(x \mid \theta)$ is independent of $k$, and the posteriors sum to 1, so that term reduces to $\log p(x \mid \theta)$. In particular,

$$Q(\theta^{old}, \theta^{old}) = \sum_k p(k \mid x, \theta^{old}) \log p(k \mid x, \theta^{old}) + \log p(x \mid \theta^{old})$$

Subtracting,

$$\text{diff} = \log p(x \mid \theta) - \log p(x \mid \theta^{old}) = Q(\theta, \theta^{old}) - Q(\theta^{old}, \theta^{old}) - \sum_k p(k \mid x, \theta^{old}) \log \frac{p(k \mid x, \theta)}{p(k \mid x, \theta^{old})}$$

The subtracted sum is non-positive (it is minus a Kullback-Leibler divergence), so any $\theta$ that increases $Q$ also increases the likelihood.
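The identity and the sign of the subtracted sum can be checked numerically on a toy model; the joint probabilities below are invented, for a hidden variable $k$ with two values and a fixed observation $x$:

```python
import numpy as np

# Hypothetical joint probabilities p(k, x_obs | theta) for k = 0, 1,
# under the old and a candidate new parameter setting (numbers made up).
joint_old = np.array([0.10, 0.30])
joint_new = np.array([0.20, 0.35])

px_old, px_new = joint_old.sum(), joint_new.sum()
post_old = joint_old / px_old            # p(k | x, theta_old)
post_new = joint_new / px_new            # p(k | x, theta)

# Q(theta, theta_old) = sum_k p(k | x, theta_old) log p(k, x | theta)
Q_new = (post_old * np.log(joint_new)).sum()
Q_old = (post_old * np.log(joint_old)).sum()

diff = np.log(px_new) - np.log(px_old)   # likelihood improvement
sub = (post_old * np.log(post_new / post_old)).sum()  # the subtracted sum
```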

SLIDE 13

Gaussian Mixture

$$p(x \mid \theta) = \sum_{k=1}^{K} p(x, k \mid \theta) = \sum_{k=1}^{K} p(k \mid \theta)\, p(x \mid k, \theta)$$

Log joint density:

$$\log p(x, k \mid \theta) = \log\left[ p(k \mid \theta)\, p(x \mid k, \theta) \right]$$

$$Q = \sum_{n=1}^{N} \sum_{k=1}^{K} p(k \mid x_n, \theta^{old}) \log\left[ p(k \mid \theta)\, p(x_n \mid k, \theta) \right] = \sum_{n=1}^{N} \sum_{k=1}^{K} p(k \mid x_n, \theta^{old}) \log p(k \mid \theta) + \sum_{n=1}^{N} \sum_{k=1}^{K} p(k \mid x_n, \theta^{old}) \log p(x_n \mid k, \theta)$$

SLIDE 14

Let

$$p(x_n \mid k) = \frac{1}{\sqrt{2\pi\sigma_k^2}} \exp\left[ -\frac{1}{2} \left( \frac{x_n - \mu_k}{\sigma_k} \right)^2 \right]$$

Then

$$Q = \sum_{n=1}^{N} \sum_{k=1}^{K} p(k \mid x_n, \theta^{old}) \log p(k \mid \theta) + \sum_{n=1}^{N} \sum_{k=1}^{K} p(k \mid x_n, \theta^{old}) \left[ -\log \sigma_k + C - \frac{(x_n - \mu_k)^2}{2\sigma_k^2} \right]$$

SLIDE 15

$$\frac{\partial Q}{\partial \mu_j} = \sum_{n=1}^{N} p(j \mid x_n, \theta^{old}) \left( \frac{x_n}{\sigma_j^2} - \frac{\mu_j}{\sigma_j^2} \right) = 0$$

$$\Rightarrow \sum_{n=1}^{N} p(j \mid x_n, \theta^{old})\, x_n = \sum_{n=1}^{N} p(j \mid x_n, \theta^{old})\, \mu_j$$

$$\Rightarrow \mu_j = \frac{\sum_{n=1}^{N} p(j \mid x_n, \theta^{old})\, x_n}{\sum_{n=1}^{N} p(j \mid x_n, \theta^{old})}$$
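A quick numeric check of this update (toy data and old parameters, all made up): setting $\mu_j$ to the posterior-weighted mean drives $\partial Q / \partial \mu_j$ to zero.

```python
import numpy as np

# Hypothetical "old" parameters of a 2-component 1-D mixture (made up).
w_old = np.array([0.4, 0.6])
mu_old = np.array([0.0, 4.0])
sig_old = np.array([1.0, 1.5])
x = np.array([-0.5, 0.2, 0.9, 3.1, 4.0, 5.2])

def gauss(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / np.sqrt(2 * np.pi * sigma**2)

# E step: posteriors p(j | x_n, theta_old), shape (N, K).
joint = w_old * gauss(x[:, None], mu_old, sig_old)
post = joint / joint.sum(axis=1, keepdims=True)

# M step for the means: mu_j = sum_n p(j|x_n) x_n / sum_n p(j|x_n).
mu_new = (post * x[:, None]).sum(axis=0) / post.sum(axis=0)

# At mu_new, dQ/dmu_j = sum_n p(j|x_n,theta_old) (x_n - mu_j) / sig_j^2 = 0.
grad = (post * (x[:, None] - mu_new)).sum(axis=0) / sig_old**2
```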
SLIDE 16

EM Summary

  • Choose parametric form
  • Choose initial values
  • Compute posterior estimates for hidden variables
  • Choose parameters to maximize the expectation of the joint density (observed, hidden)
  • Assess goodness of fit
  • Good enough? If yes, stop; if not, return to the posterior-estimation step.
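The summary steps above can be sketched end-to-end for a 1-D Gaussian mixture (function name and initialization scheme are ours). EM guarantees the log-likelihood never decreases, which the sketch tracks per iteration:

```python
import numpy as np

def gauss(x, mu, sigma):
    """Univariate Gaussian density (vectorized)."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / np.sqrt(2 * np.pi * sigma**2)

def em_gmm_1d(x, n_comp, n_iter=50):
    """EM for a 1-D Gaussian mixture, following the summary steps above."""
    # Choose parametric form: mixture of n_comp Gaussians.
    # Choose initial values: uniform weights, means spread over the data range.
    w = np.full(n_comp, 1.0 / n_comp)
    mu = np.linspace(x.min(), x.max(), n_comp)
    sig = np.full(n_comp, x.std())
    log_liks = []
    for _ in range(n_iter):
        # Compute posterior estimates for the hidden component variables.
        joint = w * gauss(x[:, None], mu, sig)          # shape (N, K)
        px = joint.sum(axis=1)                          # p(x_n | theta)
        log_liks.append(np.log(px).sum())
        post = joint / px[:, None]                      # p(k | x_n, theta_old)
        # Choose parameters maximizing the expected joint density.
        nk = post.sum(axis=0)
        w = nk / len(x)
        mu = (post * x[:, None]).sum(axis=0) / nk
        sig = np.sqrt((post * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
    return w, mu, sig, log_liks
```

A fixed iteration count stands in for the "good enough?" test; a real implementation would stop once the log-likelihood improvement falls below a threshold.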