Lecture 20: Doubly Reparameterized Estimators (Scribes: Xiongyi Zhang, Jered McInerney)

slide-1
SLIDE 1 Lecture 20: Doubly Reparameterized Estimators
Scribes: Xiongyi Zhang, Jered McInerney
slide-2
SLIDE 2 Auto-encoding Variational Methods: General View
Importance-weighted auto-encoders & auto-encoding SMC: Replace $w$ with an unbiased estimator $\hat{Z}$, $\mathbb{E}[\hat{Z}] = p_\theta(x)$
$\mathcal{L}(\theta,\phi) = \mathbb{E}_{q_\phi(z \mid x)}[\log w] \le \log p_\theta(x)$, $w = \frac{p_\theta(x,z)}{q_\phi(z \mid x)}$
Wake-sleep methods: Replace the lower bound with an upper bound
$\mathcal{U}(\theta,\phi) \ge \log p_\theta(x)$
slide-3
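SLIDE 3 Gradient Estimation in Variational Inference
REINFORCE-style:
$\frac{d}{d\phi}\mathcal{L}(\theta,\phi) = \frac{d}{d\phi}\mathbb{E}_{q_\phi(z \mid x)}[\log w_{\theta,\phi}(x,z)]$, $w_{\theta,\phi}(x,z) = \frac{p_\theta(x,z)}{q_\phi(z \mid x)}$
$\approx \frac{1}{K}\sum_{k=1}^{K} \frac{d}{d\phi}\log q_\phi(z^k \mid x)\,\big(\log w_{\theta,\phi}(x,z^k) + b\big)$, $z^k \sim q_\phi(z \mid x)$
Reparameterized (Normal: $z_\phi(\epsilon, x) = \mu_\phi(x) + \sigma_\phi(x)\,\epsilon$):
$\frac{d}{d\phi}\mathcal{L}(\theta,\phi) = \frac{d}{d\phi}\mathbb{E}_{p(\epsilon)}[\log w_{\theta,\phi}(x,\epsilon)]$, $w_{\theta,\phi}(x,\epsilon) = \frac{p_\theta(x, z_\phi(\epsilon,x))}{q_\phi(z_\phi(\epsilon,x) \mid x)}$
$\approx \frac{1}{K}\sum_{k=1}^{K} \frac{d}{d\phi}\log w_{\theta,\phi}(x,\epsilon^k)$, $\epsilon^k \sim p(\epsilon)$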
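The two estimators above can be compared numerically. The following is a minimal sketch under assumptions that are not in the slides: a toy model $p(z) = N(0,1)$, $p(x \mid z) = N(z,1)$, a proposal $q_\phi(z \mid x) = N(\mu, \sigma^2)$, and a gradient taken only with respect to $\mu$. For this model the exact gradient of the single-sample bound is $x - 2\mu$. The helper names `log_normal`, `log_w` and `grad_mu_samples` are hypothetical.

```python
import math
import random


def log_normal(v, mean, std):
    """Log density of N(mean, std^2) at v."""
    return -0.5 * math.log(2.0 * math.pi) - math.log(std) - 0.5 * ((v - mean) / std) ** 2


def log_w(x, z, mu, sigma):
    """log w(x, z) = log p(z) + log p(x | z) - log q(z | x) for the toy model (assumed)."""
    return log_normal(z, 0.0, 1.0) + log_normal(x, z, 1.0) - log_normal(z, mu, sigma)


def grad_mu_samples(x, mu, sigma, n, b=0.0, seed=0):
    """Return per-sample estimates of dL/dmu: (REINFORCE-style, reparameterized)."""
    rng = random.Random(seed)
    reinforce, reparam = [], []
    for _ in range(n):
        eps = rng.gauss(0.0, 1.0)
        z = mu + sigma * eps                    # z(eps, x) = mu + sigma * eps
        score = (z - mu) / sigma ** 2           # d/dmu log q(z | x), with z held fixed
        reinforce.append(score * (log_w(x, z, mu, sigma) + b))
        reparam.append(x - 2.0 * z)             # d/dmu log w(x, eps) for this toy model
    return reinforce, reparam


def mean(values):
    return sum(values) / len(values)


def variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


x, mu, sigma = 1.0, 0.0, 1.0
reinforce, reparam = grad_mu_samples(x, mu, sigma, n=200_000)
exact = x - 2.0 * mu                            # analytic gradient for this toy model
print("exact:", exact)
print("REINFORCE-style mean/var:", mean(reinforce), variance(reinforce))
print("reparameterized mean/var:", mean(reparam), variance(reparam))
```

Both sample means should land near the analytic value. In this particular setting the reparameterized samples also have the smaller variance, though that ordering is a property of the toy problem and not a general guarantee.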
slide-4
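SLIDE 4 Gradient Estimation in Variational Inference
Reparameterized:
$\frac{d}{d\phi}\mathcal{L}(\theta,\phi) = \frac{d}{d\phi}\mathbb{E}_{p(\epsilon)}[\log w_{\theta,\phi}(x,\epsilon)]$, $w_{\theta,\phi}(x,\epsilon) = \frac{p_\theta(x, z_\phi(\epsilon,x))}{q_\phi(z_\phi(\epsilon,x) \mid x)}$
$\approx \frac{1}{K}\sum_{k=1}^{K} \frac{d}{d\phi}\log w_{\theta,\phi}(x,\epsilon^k)$, $\epsilon^k \sim p(\epsilon)$
Samples are equally weighted.
Importance-weighted (replace weight with average weight):
$\mathcal{L}^{IW}(\theta,\phi) = \mathbb{E}_{p(\epsilon^{1:K})}\Big[\log \frac{1}{K}\sum_{k=1}^{K} w_{\theta,\phi}(x,\epsilon^k)\Big]$
$\frac{d}{d\phi}\mathcal{L}^{IW}(\theta,\phi) \approx \sum_{k=1}^{K} \frac{w_{\theta,\phi}(x,\epsilon^k)}{\sum_{l} w_{\theta,\phi}(x,\epsilon^l)}\,\frac{d}{d\phi}\log w_{\theta,\phi}(x,\epsilon^k)$, $\epsilon^k \sim p(\epsilon)$
Better samples have a higher weight.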
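As a rough illustration of replacing the weight with the average weight, the sketch below estimates the importance-weighted bound for several values of $K$. It assumes a toy model that is not in the slides ($p(z) = N(0,1)$, $p(x \mid z) = N(z,1)$, so $p(x) = N(0,2)$) and a fixed proposal $N(\mu, \sigma^2)$. The function names are hypothetical.

```python
import math
import random


def log_normal(v, mean, std):
    """Log density of N(mean, std^2) at v."""
    return -0.5 * math.log(2.0 * math.pi) - math.log(std) - 0.5 * ((v - mean) / std) ** 2


def log_w(x, z, mu, sigma):
    """log w(x, z) for the assumed toy model and Gaussian proposal."""
    return log_normal(z, 0.0, 1.0) + log_normal(x, z, 1.0) - log_normal(z, mu, sigma)


def logsumexp(values):
    """Numerically stable log(sum(exp(values)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def self_normalized(log_ws):
    """Self-normalized weights w^k / sum_l w^l computed from log weights."""
    lse = logsumexp(log_ws)
    return [math.exp(v - lse) for v in log_ws]


def iw_bound(x, mu, sigma, K, reps, seed=0):
    """Monte Carlo estimate of E[log (1/K) sum_k w(x, eps^k)]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        log_ws = [log_w(x, mu + sigma * rng.gauss(0.0, 1.0), mu, sigma) for _ in range(K)]
        total += logsumexp(log_ws) - math.log(K)
    return total / reps


x = 1.0
log_px = log_normal(x, 0.0, math.sqrt(2.0))     # exact log p(x) for the toy model
bounds = {K: iw_bound(x, 0.0, 1.0, K, reps=4000) for K in (1, 4, 32)}
for K, value in bounds.items():
    print(f"K={K:2d}  bound estimate={value:.4f}  log p(x)={log_px:.4f}")
```

With $K = 1$ this reduces to the standard single-sample bound; larger $K$ should move the estimate closer to $\log p(x)$ from below, up to Monte Carlo noise.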
slide-5
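SLIDE 5 Gradient Estimation in Variational Inference
Importance-weighted (reparameterized):
$\frac{d}{d\phi}\mathcal{L}^{IW}(\theta,\phi) = \frac{d}{d\phi}\mathbb{E}_{p(\epsilon^{1:K})}\Big[\log \frac{1}{K}\sum_{k=1}^{K} w_{\theta,\phi}(x,\epsilon^k)\Big]$
$\approx \sum_{k=1}^{K} \frac{w_{\theta,\phi}(x,\epsilon^k)}{\sum_{l} w_{\theta,\phi}(x,\epsilon^l)}\,\frac{d}{d\phi}\log w_{\theta,\phi}(x,\epsilon^k)$, $\epsilon^k \sim p(\epsilon)$
Wake-sleep style (not reparameterized): approximate with self-normalized importance sampling
$-\frac{d}{d\phi}\mathcal{U}(\theta,\phi) = -\frac{d}{d\phi}\mathbb{E}_{p_\theta(z \mid x)}[\log w_{\theta,\phi}(x,z)]$
$\approx -\sum_{k=1}^{K} \frac{w_{\theta,\phi}(x,z^k)}{\sum_{l} w_{\theta,\phi}(x,z^l)}\,\frac{d}{d\phi}\log w_{\theta,\phi}(x,z^k)$, $z^k \sim q_\phi(z \mid x)$
Opposite sign relative to the importance-weighted gradient.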
slide-6
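SLIDE 6 Aside: Reweighted Wake-Sleep Methods
Wake phase: Sample $x^b \sim p^{data}(x)$ and approximate $z^{b,k} \sim p_\theta(z \mid x^b)$ using self-normalized importance sampling with proposal $z^{b,k} \sim q_\phi(z \mid x^b)$
$\nabla_\theta\,\mathbb{E}_{p^{data}(x)}[\log p_\theta(x)] \approx \frac{1}{B}\sum_{b}\sum_{k} \frac{w^{b,k}}{\sum_{l} w^{b,l}}\,\nabla_\theta \log p_\theta(x^b, z^{b,k})$
$-\nabla_\phi\,\mathbb{E}_{p^{data}(x)}\big[\mathrm{KL}\big(p_\theta(z \mid x)\,\|\,q_\phi(z \mid x)\big)\big] \approx \frac{1}{B}\sum_{b}\sum_{k} \frac{w^{b,k}}{\sum_{l} w^{b,l}}\,\nabla_\phi \log q_\phi(z^{b,k} \mid x^b)$
Sleep phase: Sample $x^b, z^b \sim p_\theta(x,z)$ from the generative model and compute the gradient (often skipped)
$-\nabla_\phi\,\mathbb{E}_{p_\theta(x)}\big[\mathrm{KL}\big(p_\theta(z \mid x)\,\|\,q_\phi(z \mid x)\big)\big] \approx \frac{1}{B}\sum_{b} \nabla_\phi \log q_\phi(z^b \mid x^b)$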
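The wake-phase update for $\phi$ can be sketched for a single data point. This is a minimal sketch under assumptions that are not in the slides: a toy model $p(z) = N(0,1)$, $p(x \mid z) = N(z,1)$, whose posterior is $N(x/2, 1/2)$, a proposal $N(\mu, \sigma^2)$, and a gradient taken only with respect to $\mu$. For this model the exact value of $-\frac{d}{d\mu}\mathrm{KL}(p_\theta(z \mid x)\,\|\,q_\phi(z \mid x))$ is $(x/2 - \mu)/\sigma^2$. The function names are hypothetical.

```python
import math
import random


def log_normal(v, mean, std):
    """Log density of N(mean, std^2) at v."""
    return -0.5 * math.log(2.0 * math.pi) - math.log(std) - 0.5 * ((v - mean) / std) ** 2


def log_w(x, z, mu, sigma):
    """log w(x, z) for the assumed toy model and Gaussian proposal."""
    return log_normal(z, 0.0, 1.0) + log_normal(x, z, 1.0) - log_normal(z, mu, sigma)


def wake_phi_grad_mu(x, mu, sigma, K, seed=0):
    """Self-normalized estimate of sum_k (w^k / sum_l w^l) d/dmu log q(z^k | x)."""
    rng = random.Random(seed)
    zs = [mu + sigma * rng.gauss(0.0, 1.0) for _ in range(K)]   # z^k ~ q(z | x)
    log_ws = [log_w(x, z, mu, sigma) for z in zs]
    m = max(log_ws)                                             # stabilise the exponentials
    ws = [math.exp(v - m) for v in log_ws]
    total = sum(ws)
    # d/dmu log q(z | x) = (z - mu) / sigma^2, with z held fixed (not reparameterized)
    return sum((w / total) * (z - mu) / sigma ** 2 for w, z in zip(ws, zs))


x, mu, sigma = 1.0, 0.0, 1.0
estimate = wake_phi_grad_mu(x, mu, sigma, K=50_000)
exact = (x / 2.0 - mu) / sigma ** 2      # analytic value for the toy model
print("estimate:", estimate, "exact:", exact)
```

Self-normalized importance sampling is biased for finite $K$, so the estimate is only expected to approach the analytic value as $K$ grows.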
slide-7
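SLIDE 7 Gradient Estimation in Variational Inference
Importance-weighted (reparameterized):
$\frac{d}{d\phi}\mathcal{L}^{IW}(\theta,\phi) = \frac{d}{d\phi}\mathbb{E}_{p(\epsilon^{1:K})}\Big[\log \frac{1}{K}\sum_{k=1}^{K} w_{\theta,\phi}(x,\epsilon^k)\Big] \approx \sum_{k=1}^{K} \frac{w_{\theta,\phi}(x,\epsilon^k)}{\sum_{l} w_{\theta,\phi}(x,\epsilon^l)}\,\frac{d}{d\phi}\log w_{\theta,\phi}(x,\epsilon^k)$, $\epsilon^k \sim p(\epsilon)$
(Wake-)sleep style (not reparameterized; sleep phase generally skipped):
$-\frac{d}{d\phi}\mathcal{U}(\theta,\phi) = -\frac{d}{d\phi}\mathbb{E}_{p_\theta(z \mid x)}[\log w_{\theta,\phi}(x,z)] \approx -\sum_{k=1}^{K} \frac{w_{\theta,\phi}(x,z^k)}{\sum_{l} w_{\theta,\phi}(x,z^l)}\,\frac{d}{d\phi}\log w_{\theta,\phi}(x,z^k)$, $z^k \sim q_\phi(z \mid x)$
Opposite sign.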
slide-8
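SLIDE 8 REINFORCE-style Gradient
$\frac{d}{d\phi}\mathcal{L} \approx \frac{1}{K}\sum_{k} \frac{d}{d\phi}\log q_\phi(z^k \mid x)\,\big(\log w(x,z^k) + b\big)$
Idea: move $q_\phi(z \mid x)$ towards samples $z^k$ that have
1. Higher $\frac{d}{d\phi}\log q_\phi(z^k \mid x)$
2. Higher $\log w_{\theta,\phi}(x,z^k)$
[Figure: $\log p_\theta(x,z)$ and $\log q_\phi(z \mid x)$ plotted against $z$, with a sample $z^k$ marked.]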
slide-12
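SLIDE 12 REINFORCE-style Gradient vs. Reparameterized Gradient
REINFORCE-style gradient:
$\frac{d}{d\phi}\mathcal{L} \approx \frac{1}{K}\sum_{k} \frac{d}{d\phi}\log q_\phi(z^k \mid x)\,\big(\log w(x,z^k) + b\big)$
Idea: move $q_\phi(z \mid x)$ towards samples $z^k$ that have
1. Higher $\frac{d}{d\phi}\log q_\phi(z^k \mid x)$
2. Higher $\log w_{\theta,\phi}(x,z^k)$
Reparameterized gradient:
$\frac{d}{d\phi}\mathcal{L} \approx \frac{1}{K}\sum_{k} \frac{d}{d\phi}\log w_{\theta,\phi}(x,\epsilon^k)$
Idea: move $q_\phi(z \mid x)$ in the direction that will increase $\log w_{\theta,\phi}(x,\epsilon)$
$\frac{d}{d\phi}\log w = \frac{\partial \log w}{\partial z}\frac{\partial z}{\partial \phi} - \frac{\partial}{\partial \phi}\log q_\phi(z \mid x)$
[Figures: for each estimator, $\log p_\theta(x,z)$ and $\log q_\phi(z \mid x)$ plotted against $z$, with a sample $z^k$ marked.]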
slide-15
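SLIDE 15 Importance-weighted
$\frac{d}{d\phi}\mathcal{L} \approx \sum_{k} \frac{w(x,\epsilon^k)}{\sum_{l} w(x,\epsilon^l)}\,\frac{d}{d\phi}\log w(x,\epsilon^k)$
Idea: Multiply $\frac{d}{d\phi}\log w(x,\epsilon^k)$ by the self-normalized weight $\frac{w(x,\epsilon^k)}{\sum_{l} w(x,\epsilon^l)}$
[Figure: $\log p_\theta(x,z)$ and $\log q_\phi(z \mid x)$ plotted against $z$; one sample is labelled "high self-normalized weight" and another "low self-normalized weight".]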
slide-17
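SLIDE 17 Importance-weighted
$\frac{d}{d\phi}\mathcal{L} \approx \sum_{k} \frac{w(x,\epsilon^k)}{\sum_{l} w(x,\epsilon^l)}\,\frac{d}{d\phi}\log w(x,\epsilon^k)$
Idea: Multiply $\frac{d}{d\phi}\log w(x,\epsilon^k)$ by the self-normalized weight $\frac{w(x,\epsilon^k)}{\sum_{l} w(x,\epsilon^l)}$
Problem: High-weight samples have a low signal-to-noise ratio. Samples that are closer to the peak of $\log p_\theta(x,z)$ have a higher weight $w(x,\epsilon)$ but also have a larger $-\frac{\partial}{\partial \phi}\log q_\phi(z^k \mid x)$.
[Figure: $\log p_\theta(x,z)$ and $\log q_\phi(z \mid x)$ plotted against $z$.]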

slide-18
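SLIDE 18 Importance-weighted vs. Wake-sleep Style
Importance-weighted:
$\frac{d}{d\phi}\mathcal{L} \approx \sum_{k} \frac{w^k}{\sum_{l} w^l}\,\frac{d}{d\phi}\log w(x,\epsilon^k)$
Idea: Multiply $\frac{d}{d\phi}\log w(x,\epsilon^k)$ by the self-normalized weight $\frac{w^k}{\sum_{l} w^l}$
Wake-sleep style:
$-\frac{d}{d\phi}\mathcal{U} \approx \sum_{k} \frac{w^k}{\sum_{l} w^l}\,\frac{d}{d\phi}\log q_\phi(z^k \mid x)$
Idea: Multiply $\frac{d}{d\phi}\log q_\phi(z^k \mid x)$ by the self-normalized weight $\frac{w^k}{\sum_{l} w^l}$
[Figures: for each estimator, $\log p_\theta(x,z)$ and $\log q_\phi(z \mid x)$ plotted against $z$.]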
slide-19
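SLIDE 19 Doubly Reparameterized Estimators
$\frac{d}{d\phi}\mathcal{L}^{IW}_K = \mathbb{E}_{p(\epsilon^{1:K})}\Big[\sum_{k} \frac{w^k_{\theta,\phi}}{\sum_{l} w^l_{\theta,\phi}}\Big(\frac{\partial \log w^k_{\theta,\phi}}{\partial z^k}\frac{\partial z^k}{\partial \phi} - \frac{\partial}{\partial \phi}\log q_\phi(z^k \mid x)\Big)\Big]$
Idea: Rewrite the $\frac{\partial}{\partial \phi}\log q_\phi(z \mid x)$ term using the identity
$\mathbb{E}_{q_\phi(z \mid x)}\Big[f(z)\,\frac{\partial}{\partial \phi}\log q_\phi(z \mid x)\Big] = \mathbb{E}_{p(\epsilon)}\Big[\frac{\partial f}{\partial z}\frac{\partial z}{\partial \phi}\Big]$
(for proof see Appendix 8.1 of Tucker et al.)
With $f = \frac{w^k_{\theta,\phi}}{\sum_{l} w^l_{\theta,\phi}}$ (use REINFORCE trick):
$\frac{\partial f}{\partial z^k} = \Big(\frac{w^k_{\theta,\phi}}{\sum_{l} w^l_{\theta,\phi}} - \Big(\frac{w^k_{\theta,\phi}}{\sum_{l} w^l_{\theta,\phi}}\Big)^2\Big)\frac{\partial}{\partial z^k}\log w^k_{\theta,\phi}$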
slide-20
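SLIDE 20 Doubly Reparameterized Estimators
$\frac{d}{d\phi}\mathcal{L}^{IW}_K = \mathbb{E}_{p(\epsilon^{1:K})}\Big[\sum_{k} \frac{w^k_{\theta,\phi}}{\sum_{l} w^l_{\theta,\phi}}\Big(\frac{\partial \log w^k_{\theta,\phi}}{\partial z^k}\frac{\partial z^k}{\partial \phi} - \frac{\partial}{\partial \phi}\log q_\phi(z^k \mid x)\Big)\Big]$
Idea: Rewrite the $\frac{\partial}{\partial \phi}\log q_\phi(z \mid x)$ term using the identity
$\mathbb{E}_{q_\phi(z \mid x)}\Big[f(z)\,\frac{\partial}{\partial \phi}\log q_\phi(z \mid x)\Big] = \mathbb{E}_{p(\epsilon)}\Big[\frac{\partial f}{\partial z}\frac{\partial z}{\partial \phi}\Big]$
Result:
$\frac{d}{d\phi}\mathcal{L}^{IW}_K = \mathbb{E}_{p(\epsilon^{1:K})}\Big[\sum_{k} \Big(\frac{w^k_{\theta,\phi}}{\sum_{l} w^l_{\theta,\phi}}\Big)^2 \frac{\partial \log w^k_{\theta,\phi}}{\partial z^k}\frac{\partial z^k}{\partial \phi}\Big]$
TL;DR: Square the weights & drop the $\frac{\partial}{\partial \phi}\log q_\phi(z^k \mid x)$ terms.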
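The "square the weights and drop the score terms" recipe can be checked against the singly reparameterized estimator on a small example. This is a minimal sketch under assumptions that are not in the slides: a toy model $p(z) = N(0,1)$, $p(x \mid z) = N(z,1)$, a proposal $N(\mu, \sigma^2)$ chosen close to the true posterior $N(x/2, 1/2)$, and a gradient taken only with respect to $\mu$ (so $\partial z / \partial \mu = 1$). All function names are hypothetical.

```python
import math
import random


def log_normal(v, mean, std):
    """Log density of N(mean, std^2) at v."""
    return -0.5 * math.log(2.0 * math.pi) - math.log(std) - 0.5 * ((v - mean) / std) ** 2


def log_w(x, z, mu, sigma):
    """log w(x, z) for the assumed toy model and Gaussian proposal."""
    return log_normal(z, 0.0, 1.0) + log_normal(x, z, 1.0) - log_normal(z, mu, sigma)


def dlogw_dz(x, z, mu, sigma):
    """Partial derivative of log w(x, z) with respect to z, with (mu, sigma) held fixed."""
    return (x - 2.0 * z) + (z - mu) / sigma ** 2


def grad_estimates(x, mu, sigma, K, rng):
    """One draw of the singly and doubly reparameterized estimates of dL_K/dmu."""
    zs = [mu + sigma * rng.gauss(0.0, 1.0) for _ in range(K)]
    log_ws = [log_w(x, z, mu, sigma) for z in zs]
    m = max(log_ws)
    ws = [math.exp(v - m) for v in log_ws]
    total = sum(ws)
    wbar = [w / total for w in ws]                      # self-normalized weights
    # Singly reparameterized: the total derivative d/dmu log w(x, eps^k) is x - 2 z^k here.
    singly = sum(wb * (x - 2.0 * z) for wb, z in zip(wbar, zs))
    # Doubly reparameterized: squared weights, path derivative only (dz/dmu = 1).
    doubly = sum(wb ** 2 * dlogw_dz(x, z, mu, sigma) for wb, z in zip(wbar, zs))
    return singly, doubly


def mean(values):
    return sum(values) / len(values)


def variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


rng = random.Random(0)
draws = [grad_estimates(1.0, 0.4, 0.8, K=8, rng=rng) for _ in range(20_000)]
singly = [d[0] for d in draws]
doubly = [d[1] for d in draws]
print("singly reparameterized mean/var:", mean(singly), variance(singly))
print("doubly reparameterized mean/var:", mean(doubly), variance(doubly))
```

Both estimators target the same gradient, so their sample means should agree up to Monte Carlo error. In this near-optimal-proposal setting the doubly reparameterized samples should show a much smaller variance.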
slide-21
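SLIDE 21 Doubly Reparameterized Estimators
Singly reparameterized: $\sum_{k} \frac{w^k_{\theta,\phi}}{\sum_{l} w^l_{\theta,\phi}}\,\frac{d}{d\phi}\log w^k_{\theta,\phi}$
Doubly reparameterized: $\sum_{k} \Big(\frac{w^k_{\theta,\phi}}{\sum_{l} w^l_{\theta,\phi}}\Big)^2 \frac{\partial \log w^k_{\theta,\phi}}{\partial z^k}\frac{\partial z^k}{\partial \phi}$
[Figures: for each estimator, $\log p_\theta(x,z)$ and $\log q_\phi(z \mid x)$ plotted against $z$.]
No more signal-to-noise problems: all terms are proportional to $\frac{\partial}{\partial z^k}\log w^k_{\theta,\phi}$.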
slide-22
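SLIDE 22 Combining Importance-Weighting and Wake-sleep
Idea: Take a convex combination of the upper bound $\mathcal{U}$ and the lower bound $\mathcal{L}$
$\frac{d}{d\phi}\mathcal{L}_\alpha = (1-\alpha)\,\frac{d}{d\phi}\mathcal{L} - \alpha\,\frac{d}{d\phi}\mathcal{U}$
Use the doubly reparameterized estimator for both bounds:
$\frac{d}{d\phi}\mathcal{L}_\alpha = \mathbb{E}\Big[\sum_{k=1}^{K}\Big(\alpha\,\frac{w^k_{\theta,\phi}}{\sum_{l} w^l_{\theta,\phi}} + (1-2\alpha)\Big(\frac{w^k_{\theta,\phi}}{\sum_{l} w^l_{\theta,\phi}}\Big)^2\Big)\frac{\partial \log w^k_{\theta,\phi}}{\partial z^k}\frac{\partial z^k}{\partial \phi}\Big]$
$\alpha = 0$: Importance-weighted
$\alpha = 0.5$: STL
$\alpha = 1$: Reweighted wake-sleep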
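The three special cases can be read off from the per-sample coefficient that multiplies $\frac{\partial \log w^k}{\partial z^k}\frac{\partial z^k}{\partial \phi}$. The sketch below assumes the coefficient has the form $\alpha \bar{w} + (1 - 2\alpha)\bar{w}^2$, where $\bar{w}$ denotes a self-normalized weight, as reconstructed on this slide. The function name `combined_coefficients` and the example weights are hypothetical.

```python
def combined_coefficients(wbar, alpha):
    """Per-sample coefficients alpha * w + (1 - 2 * alpha) * w^2 for self-normalized weights."""
    return [alpha * w + (1.0 - 2.0 * alpha) * w * w for w in wbar]


wbar = [0.5, 0.3, 0.2]                              # example self-normalized weights (sum to 1)
iw_coeffs = combined_coefficients(wbar, 0.0)        # squared weights
stl_coeffs = combined_coefficients(wbar, 0.5)       # half the self-normalized weights
rws_coeffs = combined_coefficients(wbar, 1.0)       # w - w^2
print(iw_coeffs, stl_coeffs, rws_coeffs)
```

At $\alpha = 0.5$ the squared term cancels, leaving an estimator that is linear in the self-normalized weights.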