Lecture 12. Expectation Maximization - Variational Inference - PowerPoint PPT Presentation


slide-1
SLIDE 1 Lecture 12. Expectation Maximization - Variational Inference. Scribes: Daniel Zeiberg, Alesia Chernihova
slide-2
SLIDE 2 Maximum Likelihood Estimation in a GMM

Easy: estimate $\eta$ when $z$ is "observed":
$$\eta^* = \arg\max_\eta \log p(y, z \mid \eta)$$

Problem: need to marginalize over $z$:
$$p(y \mid \eta) = \int dz\, p(y, z \mid \eta) = \prod_{n=1}^N \int dz_n\, p(y_n, z_n \mid \eta), \qquad \log p(y \mid \eta) = \log \int dz\, p(y, z \mid \eta)$$

For an exponential-family joint, $p(y, z \mid \eta) \propto \exp(\eta^\top t(y, z))$, the marginal requires $\int dz\, \exp(\eta^\top t(y, z))$, which has no closed-form maximizer in $\eta$: the integral throws a spanner in the works.
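To make the marginalization concrete: for a tiny 1-D mixture with two components, the "integral" over $z_n$ is a finite sum, so $\log p(y_n)$ can be evaluated directly via log-sum-exp. This is a minimal sketch, not part of the lecture; the weights, means, and standard deviations below are made-up illustration values. The point is that the log of a sum does not split into a sum of logs, which is what blocks a closed-form maximum.

```python
import math

# Toy 1-D mixture of two Gaussians (made-up parameters, for illustration).
pi = [0.5, 0.5]      # mixture weights
mu = [-2.0, 2.0]     # component means
sigma = [1.0, 1.0]   # component standard deviations

def log_norm_pdf(y, m, s):
    """log Norm(y; m, s^2)."""
    return -0.5 * math.log(2 * math.pi * s**2) - (y - m)**2 / (2 * s**2)

def log_marginal(y):
    """log p(y) = log sum_h pi_h Norm(y; mu_h, sigma_h^2), via log-sum-exp."""
    terms = [math.log(pi[h]) + log_norm_pdf(y, mu[h], sigma[h]) for h in range(2)]
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))

print(log_marginal(0.0))
```

With more components, or with the parameters shared across many coupled terms, this sum-inside-the-log is exactly what EM works around.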
slide-3
SLIDE 3 Expectation Maximization

Objective:
$$\theta^{\mathrm{ML}} = \arg\max_{\mu, \Sigma, \pi} \log p(y \mid \mu, \Sigma, \pi)$$

Repeat until convergence (objective unchanged):

1. For $n$ in $1, \ldots, N$:
$$r_{nh} := \mathbb{E}[\mathbb{I}(z_n = h)] = \int dz_n\, p(z_n \mid y_n, \theta)\, \mathbb{I}(z_n = h), \qquad N_h := \sum_n r_{nh} \;\;\text{(number of points in cluster } h\text{)}$$

2. For $h$ in $1, \ldots, K$:
$$\mu_h = \frac{1}{N_h} \sum_{n=1}^N r_{nh}\, y_n \;\;\text{(empirical mean)}, \qquad \Sigma_h = \frac{1}{N_h} \sum_{n=1}^N r_{nh}\, y_n y_n^\top - \mu_h \mu_h^\top \;\;\text{(empirical covariance)}, \qquad \pi_h = \frac{N_h}{N} \;\;\text{(fraction in cluster } h\text{)}$$
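The two steps above can be sketched in pure Python for the 1-D case. This is a minimal sketch, not the lecture's code: the function name `em_gmm_1d`, the deterministic min/max initialization, and the fixed iteration count are all choices made here for illustration.

```python
import math

def em_gmm_1d(y, iters=25):
    """EM for a 1-D, 2-component Gaussian mixture.
    E-step: responsibilities r_nh = p(z_n = h | y_n, theta).
    M-step: weighted empirical moments, as on the slide."""
    K = 2
    mu = [min(y), max(y)]   # deterministic initialization (a choice, not from the slide)
    var = [1.0] * K
    pi = [1.0 / K] * K
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point
        r = []
        for yn in y:
            logw = [math.log(pi[h])
                    - 0.5 * math.log(2 * math.pi * var[h])
                    - (yn - mu[h]) ** 2 / (2 * var[h]) for h in range(K)]
            mx = max(logw)
            w = [math.exp(l - mx) for l in logw]
            tot = sum(w)
            r.append([wh / tot for wh in w])
        # M-step: weighted mean, variance, and cluster fraction
        for h in range(K):
            Nh = sum(r[n][h] for n in range(len(y)))
            pi[h] = Nh / len(y)
            mu[h] = sum(r[n][h] * y[n] for n in range(len(y))) / Nh
            var[h] = sum(r[n][h] * (y[n] - mu[h]) ** 2 for n in range(len(y))) / Nh
    return mu, var, pi
```

On data drawn from two well-separated Gaussians, the recovered means land near the true cluster centers.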
slide-4
SLIDE 4 Expectation Maximization: Example. Iteration: 0
slide-5
SLIDE 5 Expectation Maximization: Example. Iteration: 1
slide-6
SLIDE 6 Expectation Maximization: Example. Iteration: 2
slide-7
SLIDE 7 Expectation Maximization: Example. Iteration: 3
slide-8
SLIDE 8 Expectation Maximization: Example. Iteration: 4
slide-9
SLIDE 9 Expectation Maximization: Example. Iteration: 5
slide-10
SLIDE 10 Expectation Maximization: Example. Iteration: 6
slide-11
SLIDE 11 Intermezzo: Jensen's Inequality

Convex functions: the area above the curve is a convex set, so
$$f(t x_1 + (1 - t) x_2) \le t f(x_1) + (1 - t) f(x_2)$$

Concave functions: the area below the curve is a convex set, so
$$f(t x_1 + (1 - t) x_2) \ge t f(x_1) + (1 - t) f(x_2)$$

Corollary (random variables):
$$f(\mathbb{E}[x]) \le \mathbb{E}[f(x)] \;\text{ for convex } f, \qquad f(\mathbb{E}[x]) \ge \mathbb{E}[f(x)] \;\text{ for concave } f$$

slide-12
SLIDE 12 Lower Bounds on Marginal Likelihoods

Idea: use Jensen's inequality to define a lower bound. For any distribution $q(z)$:
$$\log p(y) = \log \int dz\, p(y, z) = \log \int dz\, q(z)\, \frac{p(y, z)}{q(z)} = \log \mathbb{E}_{q}\!\left[\frac{p(y, z)}{q(z)}\right] \ge \mathbb{E}_{q}\!\left[\log \frac{p(y, z)}{q(z)}\right] =: \mathcal{L}$$

(log is concave, so $\log \mathbb{E}[\cdot] \ge \mathbb{E}[\log(\cdot)]$.)

Gaussian mixture model:
$$p(y; \theta) = \int dz\, p(y, z; \theta), \qquad \mathcal{L}(\theta, \phi) := \mathbb{E}_{z \sim q(z; \phi)}\!\left[\log \frac{p(y, z; \theta)}{q(z; \phi)}\right] \le \log p(y; \theta)$$
slide-13
SLIDE 13 Intermezzo: Kullback-Leibler Divergence

$$\mathrm{KL}(q(x)\,\|\,p(x)) := \int dx\, q(x) \log \frac{q(x)}{p(x)}$$

measures how much $q(x)$ deviates from $p(x)$.

Properties:

1. $\mathrm{KL}(q(x)\,\|\,p(x)) \ge 0$ (non-negative), by Jensen's inequality:
$$-\mathrm{KL}(q(x)\,\|\,p(x)) = \int dx\, q(x) \log \frac{p(x)}{q(x)} = \mathbb{E}_{q}\!\left[\log \frac{p(x)}{q(x)}\right] \le \log \mathbb{E}_{q}\!\left[\frac{p(x)}{q(x)}\right] = \log \int dx\, p(x) = \log 1 = 0$$

2. $\mathrm{KL}(q(x)\,\|\,p(x)) = 0$ when $q(x) = p(x)$:
$$\int dx\, q(x) \log \frac{q(x)}{p(x)} = \int dx\, p(x) \log 1 = 0$$
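Both properties (and the asymmetry of KL) can be checked on discrete distributions. A minimal sketch; the two distributions below are made-up illustration values.

```python
import math

def kl(q, p):
    """KL(q || p) for discrete distributions given as probability lists."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)

q = [0.7, 0.2, 0.1]
p = [0.4, 0.4, 0.2]

print(kl(q, p) >= 0)          # property 1: non-negativity -> True
print(kl(q, q) == 0.0)        # property 2: zero when q = p -> True
print(kl(q, p) != kl(p, q))   # note: KL is not symmetric -> True
```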
slide-14
SLIDE 14 KL Divergence vs Lower Bound

$$\mathcal{L}(\theta, \phi) = \mathbb{E}_{z \sim q(z; \phi)}\!\left[\log \frac{p(y, z; \theta)}{q(z; \phi)}\right] = \mathbb{E}_{z \sim q(z; \phi)}\!\left[\log p(y; \theta) + \log \frac{p(z \mid y; \theta)}{q(z; \phi)}\right]$$

The first term does not depend on $z$, so we can rewrite this as

$$\mathcal{L}(\theta, \phi) = \log p(y; \theta) - \mathrm{KL}(q(z; \phi)\,\|\,p(z \mid y; \theta))$$

where $\log p(y; \theta)$ does not depend on $\phi$ and the KL term depends on $\phi$.

Implication: maximizing $\mathcal{L}(\theta, \phi)$ w.r.t. $\phi$ is equivalent to minimizing $\mathrm{KL}(q(z; \phi)\,\|\,p(z \mid y; \theta))$.
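This decomposition is an exact identity, which a tiny discrete model makes easy to verify. A minimal sketch with made-up numbers: a single observed $y$ with $K = 3$ latent values.

```python
import math

# Check of the identity L(theta, phi) = log p(y) - KL(q || p(z|y)).
p_joint = [0.10, 0.25, 0.15]         # p(y, z = h) for the observed y (made-up)
p_y = sum(p_joint)                   # marginal p(y)
post = [pj / p_y for pj in p_joint]  # posterior p(z | y)

q = [0.5, 0.3, 0.2]                  # arbitrary variational distribution

elbo = sum(qh * math.log(pj / qh) for qh, pj in zip(q, p_joint))
kl = sum(qh * math.log(qh / ph) for qh, ph in zip(q, post))

print(abs(elbo - (math.log(p_y) - kl)) < 1e-9)  # identity holds -> True
print(elbo <= math.log(p_y))                    # ELBO lower-bounds log p(y) -> True
```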
slide-15
SLIDE 15 Algorithm: Generalized Expectation Maximization

Objective:
$$\mathcal{L}(\theta, \phi) = \mathbb{E}_{q(z; \phi)}\!\left[\log \frac{p(y, z; \theta)}{q(z; \phi)}\right] \le \log p(y; \theta)$$

Initialize $\theta$, $\phi$. Repeat until $\mathcal{L}(\theta, \phi)$ unchanged:

1. Expectation step: $\phi = \arg\max_\phi \mathcal{L}(\theta, \phi)$ (computes expected sufficient statistics).
2. Maximization step: $\theta = \arg\max_\theta \mathcal{L}(\theta, \phi)$ (maximizes $\theta$ given the computed statistics).
slide-16
SLIDE 16 Algorithm: Generalized Expectation Maximization

Objective:
$$\mathcal{L}(\theta, \phi) = \mathbb{E}_{q(z; \phi)}\!\left[\log \frac{p(y, z; \theta)}{q(z; \phi)}\right] \le \log p(y; \theta)$$

Initialize $\theta$, with $q(z_n = h; \phi) = r_{nh} = p(z_n = h \mid y_n, \theta)$. Repeat until $\mathcal{L}(\theta, \phi)$ unchanged:

1. Expectation step: determine the distribution $q$ (its moments $\mathbb{E}[t(y)]$):
$$\phi = \arg\max_\phi \mathcal{L}(\theta, \phi) = \arg\min_\phi \mathrm{KL}(q(z; \phi)\,\|\,p(z \mid y; \theta))$$
2. Maximization step: $\theta = \arg\max_\theta \mathcal{L}(\theta, \phi)$.
slide-17
SLIDE 17 Maximization Step: Update Parameters

$$\mathcal{L}(\theta, \phi) = \mathbb{E}_{q(z; \phi)}\!\left[\log \frac{p(y, z; \theta)}{q(z; \phi)}\right]$$

For an exponential-family likelihood,
$$\log p(y, z \mid \theta) = \sum_{n=1}^N \sum_{h=1}^K \mathbb{I}(z_n = h)\left(\eta_h^\top t(y_n) - a(\eta_h)\right) + \ldots$$

Differentiating w.r.t. $\eta_h$ (using $\nabla a(\eta_h) = \mathbb{E}_{\eta_h}[t(y)]$) and setting the gradient to zero:
$$\frac{\partial}{\partial \eta_h} \mathcal{L}(\theta, \phi) = \sum_{n=1}^N r_{nh}\left(t(y_n) - \mathbb{E}_{\eta_h}[t(y)]\right) = 0 \;\Rightarrow\; \mathbb{E}_{\eta_h}[t(y)] = \frac{1}{N_h} \sum_{n=1}^N t(y_n)\, r_{nh}, \qquad N_h = \sum_n r_{nh}$$
slide-18
SLIDE 18 Maximization Step: Update Parameters

$$\mathcal{L}(\theta, \phi) = \mathbb{E}_{q(z; \phi)}\!\left[\log \frac{p(y, z; \theta)}{q(z; \phi)}\right], \qquad \log p(y, z \mid \theta) = \sum_{n=1}^N \sum_{h=1}^K \mathbb{I}(z_n = h)\left(\eta_h^\top t(y_n) - a(\eta_h)\right) + \ldots$$

Maximum likelihood: match moments to expected sufficient statistics:
$$\frac{\partial}{\partial \eta_h} \mathcal{L}(\theta, \phi) = 0 \;\Rightarrow\; \mathbb{E}_{\eta_h}[t(y)] = \frac{1}{N_h} \sum_{n=1}^N t(y_n)\, r_{nh}, \qquad N_h = \sum_n r_{nh}$$
slide-19
SLIDE 19 Algorithm: Generalized Expectation Maximization

Objective:
$$\mathcal{L}(\theta, \phi) = \mathbb{E}_{q(z; \phi)}\!\left[\log \frac{p(y, z; \theta)}{q(z; \phi)}\right] \le \log p(y; \theta)$$

Initialize $\theta$, with $q(z_n = h; \phi) = r_{nh} = p(z_n = h \mid y_n, \theta)$. Repeat until $\mathcal{L}(\theta, \phi)$ unchanged:

1. Expectation step: determine the distribution $q$ (its moments $\mathbb{E}[t(y)]$):
$$\phi = \arg\max_\phi \mathcal{L}(\theta, \phi) = \arg\min_\phi \mathrm{KL}(q(z; \phi)\,\|\,p(z \mid y; \theta))$$
2. Maximization step: use the computed sufficient statistics to update the parameters by matching moments: $\theta = \arg\max_\theta \mathcal{L}(\theta, \phi)$.
slide-20
SLIDE 20 Variational Inference

Idea: approximate the posterior $p(z, \theta \mid y)$ by maximizing a variational lower bound:
$$\mathcal{L}(\phi) = \mathbb{E}_{q(z, \theta; \phi)}\!\left[\log \frac{p(y, z, \theta)}{q(z, \theta; \phi)}\right] = \log p(y) - \mathrm{KL}(q(z, \theta; \phi)\,\|\,p(z, \theta \mid y)) \le \log p(y)$$

Maximizing $\mathcal{L}(\phi)$ is the same as minimizing the KL divergence.
slide-21
SLIDE 21 Intuition: Minimizing KL Divergences

$$p(y, x_1, x_2) = p(y \mid x_1, x_2)\, p(x_1, x_2)$$

Mean-field approximation:
$$q(x_1, x_2) := q(x_1)\, q(x_2), \qquad q(x_1) := \mathrm{Norm}(x_1; \mu_1, \sigma_1^2), \qquad q(x_2) := \mathrm{Norm}(x_2; \mu_2, \sigma_2^2)$$

$$\mathcal{L}(q) = \mathbb{E}_{q}\!\left[\log \frac{p(y, x_1, x_2)}{q(x_1, x_2)}\right] = \log p(y) - \mathrm{KL}(q(x_1, x_2)\,\|\,p(x_1, x_2 \mid y))$$

Intuition: the KL divergence minimized here, $\mathrm{KL}(q\,\|\,p)$, underestimates the posterior variance.
slide-22
SLIDE 22 Intuition: Minimizing KL Divergences

$$p(y, x_1, x_2) = p(y \mid x_1, x_2)\, p(x_1, x_2), \qquad q(x_1, x_2) := q(x_1)\, q(x_2)$$
$$q(x_1) := \mathrm{Norm}(x_1; \mu_1, \sigma_1^2), \qquad q(x_2) := \mathrm{Norm}(x_2; \mu_2, \sigma_2^2)$$

Compare $\mathrm{KL}(p(x_1, x_2 \mid y)\,\|\,q(x_1, x_2))$ with
$$\mathrm{KL}(q(x_1, x_2)\,\|\,p(x_1, x_2 \mid y)) = \int dx_1\, dx_2\, q(x_1, x_2) \log \frac{q(x_1, x_2)}{p(x_1, x_2 \mid y)}$$

Since $\lim_{p \to 0} q \log \frac{q}{p} = \infty$ for $q > 0$:

Intuition: $q(x_1, x_2) \to 0$ whenever $p(x_1, x_2 \mid y) \to 0$.
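The zero-forcing behavior shows up even in a discrete toy example. A minimal sketch with made-up numbers: a bimodal target $p$ with near-zero mass between the modes, compared against a mode-hugging and a mode-covering candidate $q$.

```python
import math

def kl(a, b):
    """KL(a || b) for discrete distributions given as probability lists."""
    return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0)

# Bimodal target over 5 states, (near-)zero mass in the middle (made-up numbers).
p = [0.45, 0.05, 0.0001, 0.05, 0.4499]

q_one_mode = [0.90, 0.07, 0.01, 0.01, 0.01]  # hugs one mode, ~zero where p ~ 0
q_spread = [0.25, 0.20, 0.10, 0.20, 0.25]    # covers both modes, mass where p ~ 0

# Reverse KL (the one VI minimizes) punishes q for putting mass where p ~ 0,
# so the mode-hugging q wins; forward KL prefers the mode-covering q.
print(kl(q_one_mode, p) < kl(q_spread, p))  # reverse KL: zero-forcing -> True
print(kl(p, q_spread) < kl(p, q_one_mode))  # forward KL: mass-covering -> True
```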
slide-23
SLIDE 23 Algorithm: Variational Expectation Maximization

Define: $q(z, \theta; \phi) = q(z; \phi_z)\, q(\theta; \phi_\theta)$

Objective:
$$\mathcal{L}(\phi_z, \phi_\theta) = \mathbb{E}_{q(z, \theta; \phi)}\!\left[\log \frac{p(y, z, \theta)}{q(z, \theta; \phi)}\right] \le \log p(y)$$

Repeat until $\mathcal{L}(\phi_z, \phi_\theta)$ converges (change smaller than some threshold):

1. Expectation step: $\phi_z = \arg\max_{\phi_z} \mathcal{L}(\phi_z, \phi_\theta)$ (analogous to the EM step for $\phi$).
2. Maximization step: $\phi_\theta = \arg\max_{\phi_\theta} \mathcal{L}(\phi_z, \phi_\theta)$ (updates the distribution $q(\theta; \phi_\theta)$ instead of a point estimate $\theta$).
slide-24
SLIDE 24 Example: Gaussian Mixture (Simplified)

(Figure: graphical model.)

Generative model:
$$\mu_{h,d} \sim \mathrm{Norm}(\mu_{0,d}, s_0), \qquad z_n \sim \mathrm{Discrete}(\pi_1, \ldots, \pi_K), \qquad y_n \mid z_n = h \sim \mathrm{Norm}(\mu_h, \sigma^2 I)$$
slide-25
SLIDE 25 Model Selection

Marginal likelihood ("evidence", average likelihood):
$$\mathcal{L} \le \log p(y), \qquad \log p(y) = \log \int dz\, d\theta\, p(y, z, \theta)$$

(Figure: $\mathcal{L}$ plotted against the number of clusters, with $K = 2$ marked.)

Intuition: we can avoid overfitting by keeping the model with the highest $\mathcal{L}$.
slide-26
SLIDE 26 Gaussian Mixture: Derivation of Updates

$$\mathcal{L}(r, m, s) = \mathbb{E}_{q(z)\,q(\mu)}\!\left[\log \frac{p(y, z, \mu)}{q(z)\,q(\mu; m, s)}\right] = \mathbb{E}_{q(z)\,q(\mu)}[\log p(y \mid z, \mu)] - \mathrm{KL}(q(z)\,\|\,p(z)) - \mathrm{KL}(q(\mu)\,\|\,p(\mu))$$

The first term depends on $r, m, s$; the second depends on $r$; the third depends on $m, s$.

E-step: solve $\frac{\partial \mathcal{L}}{\partial r} = 0$.
M-step: solve $\frac{\partial \mathcal{L}}{\partial m} = 0$, $\frac{\partial \mathcal{L}}{\partial s} = 0$.
slide-27
SLIDE 27 Gaussian Mixture: Derivation of Updates

Idea: exploit exponential families, $p(y_n \mid z_n = h, \mu) \propto \exp(\eta_h^\top t(y_n) - a(\eta_h))$:
$$\mathbb{E}_{q(z)\,q(\mu)}[\log p(y \mid z, \mu)] = \sum_{n=1}^N \sum_{h=1}^K \mathbb{E}_{q(z)}[\mathbb{I}(z_n = h)]\left(\mathbb{E}_{q(\mu)}[\eta_h]^\top t(y_n) - \mathbb{E}_{q(\mu)}[a(\eta_h)]\right)$$

The first factor depends on $\phi_z = r$; the remaining terms depend on $\phi_\mu = (m, s)$.
slide-28
SLIDE 28 Example: Gaussian Mixture (Simplified)

Generative model (as on Slide 24):
$$\mu_{h,d} \sim \mathrm{Norm}(\mu_{0,d}, s_0), \qquad z_n \sim \mathrm{Discrete}(\pi_1, \ldots, \pi_K), \qquad y_n \mid z_n = h \sim \mathrm{Norm}(\mu_h, \sigma^2 I)$$

Variational distribution:
$$q(\mu, z) = q(\mu)\, q(z), \qquad q(z) = \prod_n q(z_n), \qquad q(\mu) = \prod_{h=1}^K \mathrm{Norm}(\mu_h; m_h, s_h)$$

E-step:
$$r_{nh} \propto \pi_h \exp\!\left(\mathbb{E}_{q(\mu)}\!\left[\log p(y_n \mid z_n = h, \mu)\right]\right)$$

M-step (conjugate update for each mean):
$$s_h = \left(\frac{1}{s_0} + \frac{N_h}{\sigma^2}\right)^{-1}, \qquad m_h = s_h\left(\frac{m_0}{s_0} + \frac{1}{\sigma^2}\sum_{n=1}^N r_{nh}\, y_n\right), \qquad N_h = \sum_n r_{nh}$$
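The E-step/M-step above can be sketched in pure Python. This is a minimal sketch, not the lecture's code, and it makes two simplifying assumptions beyond the slide: 1-D data with unit observation variance ($\sigma^2 = 1$), and $\pi$ held uniform rather than updated. With unit variance, $\mathbb{E}_{q(\mu)}[\log \mathrm{Norm}(y; \mu_h, 1)] = -\tfrac{1}{2}(y - m_h)^2 - \tfrac{1}{2} s_h + \text{const}$.

```python
import math

def vem_gmm_1d(y, m0=0.0, s0=100.0, iters=50):
    """Variational EM for a 1-D, 2-component mixture with unit observation
    variance and uniform pi (simplifying assumptions made here).
    q(mu_h) = Norm(m_h, s_h), q(z_n) = Discrete(r_n)."""
    K = 2
    pi = [1.0 / K] * K
    m = [min(y), max(y)]   # deterministic initialization (a choice made here)
    s = [1.0] * K
    for _ in range(iters):
        # E-step: r_nh proportional to pi_h exp(E_{q(mu)}[log p(y_n | z_n = h, mu)])
        r = []
        for yn in y:
            logw = [math.log(pi[h]) - 0.5 * (yn - m[h]) ** 2 - 0.5 * s[h]
                    for h in range(K)]
            mx = max(logw)
            w = [math.exp(l - mx) for l in logw]
            tot = sum(w)
            r.append([wh / tot for wh in w])
        # M-step: conjugate Gaussian update for each q(mu_h)
        for h in range(K):
            Nh = sum(r[n][h] for n in range(len(y)))
            s[h] = 1.0 / (1.0 / s0 + Nh)
            m[h] = s[h] * (m0 / s0 + sum(r[n][h] * y[n] for n in range(len(y))))
    return m, s
```

Unlike plain EM, the output is a distribution over each mean: $m_h$ lands near the cluster center and the posterior variance $s_h$ shrinks as more points are assigned to cluster $h$.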