Lecture 19: Variational Autoencoders. Scribes: Ankur Bambhanoliya, Donald Hamnett.



SLIDE 1: Lecture 19: Variational Autoencoders. Scribes: Ankur Bambhanoliya, Donald Hamnett.
SLIDE 2: Motivation: Inferring Latent Variables from Images

Dataset: MNIST, 60k images of hand-written digits.
Goal: Infer two variables:
1. Digit labels y_n ∈ {0, ..., 9}
2. Style variables z_n ∈ R^D
SLIDE 3: Deep Generative Models

Assume all digits are equally frequent.
Idea 1: Use a neural network to define a generative model:
  y_n ~ Discrete(0.1, ..., 0.1)            (digit; can supervise)
  z_n ~ Normal(0, I)                       (style)
  x_n ~ Bernoulli(μ(y_n, z_n; θ))          (image; μ is a neural network)
Questions:
1. How do we train this? (no supervision)
2. How do we do inference?
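The generative process on this slide can be sketched in a few lines of numpy. The layer sizes, the tanh/sigmoid decoder, and the random weights below are illustrative assumptions, not values from the lecture:

```python
import numpy as np

# Ancestral sampling from the generative model on slide 3, with a tiny
# randomly initialized network standing in for the learned decoder mu.
# D (latent dims), H (hidden units), and P (pixels) are assumptions.
rng = np.random.default_rng(0)
D, H, P = 2, 16, 784

# Hypothetical decoder weights theta = (W1, b1, W2, b2).
W1 = rng.normal(scale=0.1, size=(D + 10, H))  # input: z concatenated with one-hot y
b1 = np.zeros(H)
W2 = rng.normal(scale=0.1, size=(H, P))
b2 = np.zeros(P)

def mu(y, z):
    """Decoder network: maps (digit label, style code) to Bernoulli means."""
    one_hot = np.eye(10)[y]
    h = np.tanh(np.concatenate([z, one_hot]) @ W1 + b1)
    return 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))  # sigmoid -> probabilities in (0, 1)

# y_n ~ Discrete(0.1, ..., 0.1), z_n ~ Normal(0, I), x_n ~ Bernoulli(mu(y_n, z_n; theta))
y_n = rng.choice(10, p=np.full(10, 0.1))
z_n = rng.normal(size=D)
x_n = (rng.uniform(size=P) < mu(y_n, z_n)).astype(int)
```

With trained weights, repeating the last three lines would yield synthetic digit images; here they only demonstrate the sampling order.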
SLIDE 4: Training Deep Generative Models

Idea 2: Use stochastic gradient ascent on a lower bound to approximate maximum likelihood estimation.

Minimize the KL divergence between the true data distribution and the generative model with network weights θ:

  ∇_θ KL(p_data(x) || p(x; θ)) = −E_{p_data}[∇_θ log p(x; θ)]
                               ≈ −(1/N) Σ_{n=1}^N ∇_θ log p(x_n; θ),   x_n ~ p_data,

where the x_n are samples from an unknown data distribution.
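A toy check of this estimator, with the deep model replaced by the simplest possible choice, p(x; θ) = Normal(x; θ, 1), so that the per-sample gradient has a closed form and the maximum-likelihood answer (the data mean) is known. The model, learning rate, and batch size are assumptions for illustration:

```python
import numpy as np

# Stochastic gradient ascent on the average log-likelihood, using
# minibatches drawn from p_data. Here grad_theta log N(x; theta, 1) = x - theta.
rng = np.random.default_rng(1)
data = rng.normal(loc=3.0, scale=1.0, size=5000)  # samples from an "unknown" p_data

theta = 0.0
lr = 0.05
for _ in range(2000):
    batch = rng.choice(data, size=32)  # x_n ~ p_data
    grad = np.mean(batch - theta)      # (1/N) sum_n grad_theta log p(x_n; theta)
    theta += lr * grad                 # ascent step

# theta approaches the data mean (the MLE for this model), near 3.0
```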
SLIDE 5: Training Deep Generative Models

Idea 2 (continued): Introduce a proposal / variational distribution q(y, z | x; φ), which depends on x, to define a lower bound L(θ, φ):

  E_{p_data}[log p(x; θ)] = L(θ, φ) + E_{p_data}[KL(q(y, z | x; φ) || p(y, z | x; θ))]

When q(y, z | x; φ) = p(y, z | x; θ), the KL term vanishes and the gradient of the bound matches the maximum-likelihood gradient:

  ∇_θ L(θ, φ) = E_{p_data}[ E_{q(y,z|x;φ)}[∇_θ log p(x, y, z; θ)] ]
              ≈ (1/N) Σ_n ∇_θ log p(x_n, y_n, z_n; θ),
    x_n ~ p_data,   (y_n, z_n) ~ q(y, z | x_n; φ).
SLIDE 6: Training Deep Generative Models

Idea 3: Use stochastic gradient ascent to perform variational inference.

Combining 1 + 2: Perform gradient ascent on both sets of parameters:
1. Max likelihood: ∇_θ L(θ, φ);   θ* = argmax_θ log p_θ(x)   (learn the generative model)
2. Variational inference: ∇_φ L(θ, φ);   φ* = argmin_φ KL(q(y, z; φ) || p(y, z | x))   (approximate posterior)
SLIDE 7: Training Deep Generative Models

Idea 4: Use a neural network to define the inference model (a.k.a. the variational distribution):
  y_n ~ Discrete(π(x_n; φ))                        (digit)
  z_n ~ Normal(μ(x_n, y_n; φ), σ²(x_n, y_n; φ))    (style)
where π, μ, σ are neural networks, giving the variational distribution

  q(y, z | x; φ) = Π_n q(y_n, z_n | x_n; φ).
SLIDE 8: Variational Autoencoders

Objective: Learn a deep generative model and a corresponding inference model by optimizing the lower bound (ELBO):

  L(θ, φ) = E_{p_data}[ E_{q(y,z|x;φ)}[ log ( p(x, y, z; θ) / q(y, z | x; φ) ) ] ]
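The lower-bound property can be checked numerically on a model small enough to integrate exactly. The linear-Gaussian model and the particular q below are assumptions chosen so the true evidence has a closed form:

```python
import numpy as np

# Monte Carlo ELBO for z ~ Normal(0, 1), x | z ~ Normal(z, 1), so the exact
# evidence is p(x) = Normal(x; 0, 2). The ELBO under any q(z | x) must
# lower-bound log p(x), with equality only at the true posterior.
rng = np.random.default_rng(2)

def log_normal(v, mean, var):
    return -0.5 * (np.log(2 * np.pi * var) + (v - mean) ** 2 / var)

x = 1.5                      # a single observation
q_mean, q_var = 0.5, 0.8     # an arbitrary variational distribution q(z | x)

z = rng.normal(q_mean, np.sqrt(q_var), size=100_000)              # z ~ q(z | x)
log_joint = log_normal(z, 0.0, 1.0) + log_normal(x, z, 1.0)       # log p(x, z)
elbo = np.mean(log_joint - log_normal(z, q_mean, q_var))          # E_q[log p(x,z)/q(z|x)]

log_evidence = log_normal(x, 0.0, 2.0)  # exact log p(x)
# gap = log_evidence - elbo = KL(q(z|x) || p(z|x)) >= 0
```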
SLIDE 9: Intermezzo: Autoencoders

x_n → z_n → x̂_n   (all continuous)
SLIDE 10: Encoder: Mapping from image x to latent code z

Multi-layer perceptron (784 → 256 → 2):
  h_{n,i} = σ( Σ_j W^h_{ij} x_{n,j} + b^h_i )    (activation)
  z_{n,i} = Σ_j W^z_{ij} h_{n,j} + b^z_i         (linear map to outputs)
with activation function σ.
SLIDE 11: Decoder: Mapping from code z to image x

Multi-layer perceptron:
  h_{n,i} = σ( Σ_j W^h_{ij} z_{n,j} + b^h_i )
  x̂_{n,i} = σ( Σ_j W^x_{ij} h_{n,j} + b^x_i )

Loss: binary cross-entropy

  L(W, b) = − Σ_n Σ_i [ x_{n,i} log x̂_{n,i} + (1 − x_{n,i}) log(1 − x̂_{n,i}) ]

Minimize with SGD.
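The encoder, decoder, and loss of slides 10-11 fit in a short numpy sketch. The weights are random stand-ins and the batch is fake binary data; only the layer sizes (784 → 256 → 2 → 256 → 784) and the formulas come from the slides:

```python
import numpy as np

rng = np.random.default_rng(3)
sigma = lambda a: 1.0 / (1.0 + np.exp(-a))  # logistic sigmoid activation

# Hypothetical encoder weights (W^h, b^h, W^z, b^z) and decoder counterparts.
Wh_e, bh_e = rng.normal(scale=0.05, size=(784, 256)), np.zeros(256)
Wz,   bz   = rng.normal(scale=0.05, size=(256, 2)),   np.zeros(2)
Wh_d, bh_d = rng.normal(scale=0.05, size=(2, 256)),   np.zeros(256)
Wx,   bx   = rng.normal(scale=0.05, size=(256, 784)), np.zeros(784)

def encode(x):
    h = sigma(x @ Wh_e + bh_e)   # h_n = sigma(W^h x_n + b^h)
    return h @ Wz + bz           # z_n = W^z h_n + b^z  (linear map)

def decode(z):
    h = sigma(z @ Wh_d + bh_d)
    return sigma(h @ Wx + bx)    # xhat_n, entries strictly in (0, 1)

def bce(x, xhat):
    # L = -sum_n sum_i [x log xhat + (1 - x) log(1 - xhat)]
    return -np.sum(x * np.log(xhat) + (1 - x) * np.log(1 - xhat))

x = (rng.uniform(size=(8, 784)) < 0.3).astype(float)  # a fake binary batch
loss = bce(x, decode(encode(x)))                       # minimize this with SGD
```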
SLIDE 12: Autoencoder: Learned Latent Codes
SLIDE 13: Variational Autoencoder: Treat z as a latent variable
SLIDE 14: Variational Autoencoder: Treat z as a latent variable

Inference model (encoder), q(z_n | x_n; φ):
  h_n = σ(W^h x_n + b^h)
  μ_n = W^μ h_n + b^μ
  σ_n = exp(W^σ h_n + b^σ)
  z_n ~ Normal(μ_n, σ_n²)

Generative model (decoder), p(x_n, z_n; θ):
  z_n ~ Normal(0, I)
  h_n = σ(W^h z_n + b^h)
  γ_n = σ(W^γ h_n + b^γ)
  x_n ~ Bernoulli(γ_n)

Objective:
  L(θ, φ) = E_{p_data}[ E_{q(z|x;φ)}[ log ( p(x, z; θ) / q(z | x; φ) ) ] ]
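One forward pass through this encoder/decoder pair, again with random stand-in weights and assumed sizes (784 pixels, 256 hidden units, 2 latent dimensions). The point is the order of operations: the encoder outputs a mean and a positive standard deviation (via exp), z is sampled, and the decoder outputs Bernoulli means:

```python
import numpy as np

rng = np.random.default_rng(4)
sigma = lambda a: 1.0 / (1.0 + np.exp(-a))

# Hypothetical weights for encoder (phi) and decoder (theta).
Wh,   bh   = rng.normal(scale=0.05, size=(784, 256)), np.zeros(256)
Wmu,  bmu  = rng.normal(scale=0.05, size=(256, 2)),   np.zeros(2)
Wsig, bsig = rng.normal(scale=0.05, size=(256, 2)),   np.zeros(2)
Wg,   bg   = rng.normal(scale=0.05, size=(2, 256)),   np.zeros(256)
Wy,   by   = rng.normal(scale=0.05, size=(256, 784)), np.zeros(784)

def encoder(x):
    """q(z_n | x_n; phi): returns mean mu_n and standard deviation sigma_n."""
    h = sigma(x @ Wh + bh)                          # h_n = sigma(W^h x_n + b^h)
    return h @ Wmu + bmu, np.exp(h @ Wsig + bsig)   # exp keeps sigma_n > 0

def decoder(z):
    """p(x_n | z_n; theta): returns Bernoulli means gamma_n."""
    h = sigma(z @ Wg + bg)
    return sigma(h @ Wy + by)

x = (rng.uniform(size=(1, 784)) < 0.3).astype(float)
mu_n, sig_n = encoder(x)
z_n = rng.normal(mu_n, sig_n)                       # z_n ~ Normal(mu_n, sigma_n^2)
gamma_n = decoder(z_n)
x_recon = (rng.uniform(size=gamma_n.shape) < gamma_n).astype(int)  # x_n ~ Bernoulli(gamma_n)
```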
SLIDE 15: Variational Autoencoder: Learned Latent Codes
SLIDE 16: Autoencoder vs Variational Autoencoder
SLIDE 17: Training: The Reparameterization Trick

Goal: Compute the gradient on a batch of images x^b ~ Uniform({x_1, ..., x_N}):

  ∇_φ E_{q(z|x^b;φ)}[ log ( p_θ(x^b, z) / q_φ(z | x^b) ) ]

Analogue of BBVI: the REINFORCE-style estimator

  ∇_φ L(θ, φ) ≈ (1/B) Σ_b (1/S) Σ_s ∇_φ log q(z^{b,s} | x^b; φ) · log ( p_θ(x^b, z^{b,s}) / q_φ(z^{b,s} | x^b) ),
    z^{b,s} ~ q(z | x^b; φ).

Problem: this estimator will have very high variance.
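A toy illustration of the REINFORCE-style (score-function) estimator on a problem with a known answer: take q(z; φ) = Normal(φ, 1) and f(z) = z², standing in for the log-ratio on the slide, so that ∇_φ E_q[f(z)] = ∇_φ(φ² + 1) = 2φ. The estimator is unbiased but its per-sample variance is large even in one dimension:

```python
import numpy as np

rng = np.random.default_rng(5)
phi, S = 1.0, 100_000

z = rng.normal(phi, 1.0, size=S)
score = z - phi                # grad_phi log Normal(z; phi, 1)
samples = (z ** 2) * score     # f(z) * grad_phi log q(z; phi)

estimate = samples.mean()      # unbiased estimate of 2 * phi = 2.0
variance = samples.var()       # large even for this 1-d toy problem
```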
SLIDE 18: Training: The Reparameterization Trick

Idea: Sample z^{b,s} using a reparameterized distribution:
  ε^{b,s} ~ Normal(0, I)
  z^{b,s} = μ(x^b; φ) + σ(x^b; φ) ⊙ ε^{b,s}
so that z^{b,s} ~ q(z | x^b; φ), with μ and σ given by neural networks.

Result: the reparameterized estimator

  ∇_φ L(θ, φ) ≈ (1/B) Σ_b (1/S) Σ_s ∇_φ log ( p_θ(x^b, z(ε^{b,s}; φ)) / q_φ(z(ε^{b,s}; φ) | x^b) ).

In practice: S = 1 is often enough.
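A toy check of the reparameterized estimator, on the same kind of known-answer problem used to illustrate the score-function estimator: q(z; φ) = Normal(φ, 1) and f(z) = z², so the true gradient is 2φ. Writing z = φ + ε and differentiating through f gives an unbiased estimator with variance 4 here, versus roughly 30 for the score-function version:

```python
import numpy as np

rng = np.random.default_rng(6)
phi, S = 1.0, 100_000

eps = rng.normal(size=S)       # eps^{b,s} ~ Normal(0, 1)
z = phi + eps                  # z(eps; phi), so dz/dphi = 1
samples = 2.0 * z              # d f(z)/d phi = f'(z) * dz/dphi = 2 z

estimate = samples.mean()      # again estimates 2 * phi = 2.0
variance = samples.var()       # much smaller than the score-function estimator's
```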
SLIDE 19: Variational Autoencoders

Objective: Learn a deep generative model and a corresponding inference model by optimizing the lower bound (ELBO):

  L(θ, φ) = E_{p_data}[ E_{q(y,z|x;φ)}[ log ( p(x, y, z; θ) / q(y, z | x; φ) ) ] ]
SLIDE 20

Continuous: p_θ(x_n, z_n), q_φ(z_n | x_n)   (z encodes both style & digit)
Continuous + discrete: p_θ(x_n, y_n, z_n), q_φ(y_n, z_n | x_n)   (z encodes style)
SLIDE 21: Disentangled Representations

Goal: Learn interpretable features z_{n,1}, z_{n,2}, ..., z_{n,D} from x_n.
(Figure: reconstructions as a single latent dimension z_{n,i} is varied over −3, 0, 3.)