SLIDE 1 Lecture 16: Variational Autoencoders
Scribes: Ming, Colin
SLIDE 2 Motivation: Inferring Latent Variables from Images
Dataset: MNIST; 60k images of handwritten digits.
Goal: Infer two variables:
1. Digit labels $y \in \{0, \ldots, 9\}$
2. Style variables $z \in \mathbb{R}^D$
SLIDE 3 Deep Generative Models
Idea 1: Use a neural network to define a generative model.

Generative model (digit labels $y_n \in \{0, \ldots, 9\}$, all equally probable; image $x_n \in \mathbb{R}^P$ of pixels):
$$y_n \sim \text{Discrete}(0.1, \ldots, 0.1)$$
$$z_n \sim \text{Normal}(0, I)$$
$$x_n \sim \text{Bernoulli}\big(\mu^x(y_n, z_n; \theta)\big)$$
Here $N = 60\text{k}$ images, style dimension $D = 2$ to $50$, and $P = 784$ pixels.

Joint distribution over $x = \{x_1, \ldots, x_N\}$:
$$p(x, y, z; \theta) = \prod_{n=1}^{N} p(x_n, y_n, z_n; \theta)$$
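The generative story above is easy to run forward. Below is a minimal PyTorch sketch of ancestral sampling from this model; the network standing in for $\mu^x(y_n, z_n; \theta)$ and its layer sizes are assumptions, only the three distributional steps come from the slide.

```python
import torch
import torch.nn as nn

P, D = 784, 20                          # pixels per image, style dimensions
mu_x = nn.Sequential(                   # stand-in for mu^x(y, z; theta)
    nn.Linear(10 + D, 500), nn.Tanh(),
    nn.Linear(500, P), nn.Sigmoid(),
)

y = torch.distributions.Categorical(probs=torch.full((10,), 0.1)).sample()  # y_n ~ Discrete(0.1, ..., 0.1)
z = torch.randn(D)                                                          # z_n ~ Normal(0, I)
y_onehot = nn.functional.one_hot(y, 10).float()
x = torch.distributions.Bernoulli(probs=mu_x(torch.cat([y_onehot, z]))).sample()  # x_n ~ Bernoulli(mu^x)
```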
SLIDE 4 Training Deep Generative Models
Idea 2: Use stochastic gradient ascent to perform maximum likelihood estimation.
$$\ell(\theta) = \mathbb{E}_{p(y,z)}\big[\log p(x \mid y, z; \theta)\big] \le \log p(x; \theta)$$
$$\nabla_\theta \ell(\theta) = \mathbb{E}_{p(y,z)}\big[\nabla_\theta \log p(x, y, z; \theta)\big] \simeq \frac{1}{S} \sum_{s=1}^{S} \nabla_\theta \log p\big(x, y^{(s)}, z^{(s)}; \theta\big), \qquad y^{(s)} \sim p(y), \; z^{(s)} \sim p(z)$$
Problem: the prior is too broad to propose good samples.
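To make the failure mode concrete, here is a hedged sketch of Idea 2's estimator with a stand-in decoder and assumed sizes: with a broad prior, almost none of the $S$ samples explain a given image, so the averaged gradient is nearly uninformative.

```python
import torch
import torch.nn as nn

P, D, S = 784, 20, 16
dec = nn.Linear(10 + D, P)                       # stand-in decoder mu^x(y, z; theta)
x = torch.bernoulli(torch.rand(P))               # one stand-in observed image

logps = []
for _ in range(S):
    y = torch.randint(10, ())                    # y^(s) ~ Discrete(0.1, ..., 0.1)
    z = torch.randn(D)                           # z^(s) ~ Normal(0, I)
    y_onehot = nn.functional.one_hot(y, 10).float()
    logits = dec(torch.cat([y_onehot, z]))
    # the prior terms don't depend on theta, so only log p(x | y^(s), z^(s)) matters
    logps.append(torch.distributions.Bernoulli(logits=logits).log_prob(x).sum())
(-torch.stack(logps).mean()).backward()          # ascend (1/S) sum_s grad log p
```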
SLIDE 5 Training Deep Generative Models
Idea 3: Use stochastic gradient ascent to perform variational inference.
$$\mathcal{L}(\theta, \phi) = \mathbb{E}_{q(y,z;\phi)}\!\left[\log \frac{p(x, y, z; \theta)}{q(y, z; \phi)}\right] \le \log p(x; \theta)$$
Combining 2 + 3: perform gradient ascent on both $\theta$ and $\phi$.
1. Max likelihood: $\nabla_\theta \mathcal{L}(\theta, \phi)$ (learns the generative model), $\theta^* = \operatorname{argmax}_\theta \log p(x; \theta)$
2. Variational inference: $\nabla_\phi \mathcal{L}(\theta, \phi)$ (learns the posterior), $\phi^* = \operatorname{argmin}_\phi \mathrm{KL}\big(q(y, z; \phi) \,\|\, p(y, z \mid x)\big)$
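A one-sample Monte Carlo estimate makes the bound $\mathcal{L}(\theta, \phi)$ concrete. The sketch below drops the label $y$ for brevity and uses stand-in networks and parameters; only the form $\log p(x, z; \theta) - \log q(z; \phi)$ comes from the slide.

```python
import torch
import torch.distributions as dist

P, D = 784, 2
x = torch.bernoulli(torch.rand(P))                 # stand-in observed image
q_mu, q_sigma = torch.zeros(D), torch.ones(D)      # stand-in parameters of q(z; phi)
dec = torch.nn.Linear(D, P)                        # stand-in decoder logits, theta

q = dist.Normal(q_mu, q_sigma)
z = q.sample()                                     # z ~ q(z; phi)
log_p = (dist.Bernoulli(logits=dec(z)).log_prob(x).sum()   # log p(x | z; theta)
         + dist.Normal(0.0, 1.0).log_prob(z).sum())        # + log p(z)
elbo = log_p - q.log_prob(z).sum()                 # one-sample estimate of L <= log p(x; theta)
```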
SLIDE 6 Training Deep Generative Models
Idea 4: Use a neural network to define the inference model (a.k.a. the variational distribution).
For an image $x_n \in \mathbb{R}^P$, digit label $y_n \in \{0, \ldots, 9\}$ (have supervision), and style variables $z_n \in \mathbb{R}^D$ (no supervision), neural nets output the variational distribution's parameters:
$$y_n \sim \text{Discrete}\big(\pi^y(x_n; \phi)\big)$$
$$z_n \sim \text{Normal}\big(\mu^z(x_n, y_n; \phi),\, \sigma^z(x_n, y_n; \phi)\big)$$
$$q(y, z \mid x) = \prod_{n=1}^{N} q_\phi(y_n, z_n \mid x_n)$$
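A minimal sketch of such an inference network follows; the layer sizes and the choice to produce $\mu^z$ and $\log \sigma^z$ from one linear layer are assumptions, only the distributional forms (Discrete over labels, Normal over styles) come from the slide.

```python
import torch
import torch.nn as nn

P, D = 784, 20
pi_y = nn.Sequential(nn.Linear(P, 500), nn.Tanh(),
                     nn.Linear(500, 10), nn.Softmax(dim=-1))  # pi^y(x; phi)
mu_sigma_z = nn.Linear(P + 10, 2 * D)            # outputs mu^z and log sigma^z jointly

x = torch.rand(P)                                # stand-in image
y = torch.distributions.Categorical(probs=pi_y(x)).sample()   # y_n ~ Discrete(pi^y(x_n; phi))
y_onehot = nn.functional.one_hot(y, 10).float()
mu, log_sigma = mu_sigma_z(torch.cat([x, y_onehot])).chunk(2)
z = torch.distributions.Normal(mu, log_sigma.exp()).sample()  # z_n ~ Normal(mu^z, sigma^z)
```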
SLIDE 7 Variational Autoencoders
Objective: Learn a deep generative model and a corresponding inference model by optimizing
$$\theta^*, \phi^* = \operatorname{argmax}_{\theta, \phi} \; \mathbb{E}_{q(y, z \mid x; \phi)}\!\left[\log \frac{p(x, y, z; \theta)}{q(y, z \mid x; \phi)}\right]$$

SLIDE 8 Intermezzo: Autoencoders
SLIDE 9 Differentiable Encoder: Mapping from image $x$ to latent code $z$
Multi-layer perceptron, with a hidden layer of roughly 500 units:
$$h_n = \sigma(W^h x_n + b^h)$$
$$z_n = \sigma(W^z h_n + b^z)$$
where $\sigma(\cdot)$ denotes the activation function.
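The two encoder equations translate directly into code. This sketch assumes $\tanh$ for $\sigma(\cdot)$ and a 784 to 500 to 2 shape (the slide's hidden size; the 2-d code matches the latent-code plots later in the deck).

```python
import torch

P, H, K = 784, 500, 2                   # image, hidden, and code sizes (assumed)
W_h, b_h = 0.01 * torch.randn(H, P), torch.zeros(H)
W_z, b_z = 0.01 * torch.randn(K, H), torch.zeros(K)

x_n = torch.rand(P)                     # stand-in image
h_n = torch.tanh(W_h @ x_n + b_h)       # h_n = sigma(W^h x_n + b^h)
z_n = torch.tanh(W_z @ h_n + b_z)       # z_n = sigma(W^z h_n + b^z)
```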
SLIDE 10 Decoder: Mapping from code $z$ to image $x$
Multi-layer perceptron:
$$h_n = \sigma(W^h z_n + b^h)$$
$$\hat{x}_n = \sigma(W^x h_n + b^x)$$
Loss: binary cross-entropy (minimize this):
$$\mathcal{L}(\theta, \phi) = -\sum_{n=1}^{N} \sum_{p=1}^{P} \Big[ x_{n,p} \log \hat{x}_{n,p} + (1 - x_{n,p}) \log(1 - \hat{x}_{n,p}) \Big]$$
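A sketch of the decoder and its loss, with assumed layer sizes; PyTorch's built-in binary cross-entropy with `reduction="sum"` computes exactly the double sum above.

```python
import torch
import torch.nn as nn

K, H, P = 2, 500, 784
decoder = nn.Sequential(nn.Linear(K, H), nn.Tanh(),    # h_n = sigma(W^h z_n + b^h)
                        nn.Linear(H, P), nn.Sigmoid()) # xhat_n = sigma(W^x h_n + b^x)

x = torch.bernoulli(torch.rand(8, P))   # batch of stand-in binary images
z = torch.randn(8, K)                   # batch of codes
x_hat = decoder(z)
# L = -sum_n sum_p [x log xhat + (1 - x) log(1 - xhat)]
loss = nn.functional.binary_cross_entropy(x_hat, x, reduction="sum")
```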
SLIDE 11 Autoencoder: Learned Latent Codes
SLIDE 12 Variational Autoencoder: Treat $z$ as a latent variable
SLIDE 13 Variational Autoencoder: Treat $z$ as a latent variable
Inference model (encoder), $q(z_n \mid x_n; \phi)$:
$$h_n = \sigma(W^h x_n + b^h)$$
$$\mu_n^z = W^\mu h_n + b^\mu, \qquad \sigma_n^z = \exp(W^\sigma h_n + b^\sigma)$$
$$z_n \sim \text{Normal}\big(\mu_n^z,\, (\sigma_n^z)^2 I\big)$$
Generative model (decoder), $p(x_n, z_n; \theta)$:
$$z_n \sim \text{Normal}(0, I)$$
$$h_n = \sigma(W^h z_n + b^h)$$
$$\mu_n^x = \sigma(W^x h_n + b^x)$$
$$x_n \sim \text{Bernoulli}(\mu_n^x)$$
Objective:
$$\mathcal{L}(\theta, \phi) = \mathbb{E}_{q(z \mid x; \phi)}\!\left[\log \frac{p(x, z; \theta)}{q(z \mid x; \phi)}\right]$$
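All of the equations on this slide fit in one small module. The following is a sketch rather than the scribes' code: the hidden size of 500 is an assumption, and the ELBO estimate uses a reparameterized sample (anticipating Slide 17) so the objective can be trained by gradient ascent.

```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, P=784, H=500, D=2):
        super().__init__()
        self.enc_h = nn.Sequential(nn.Linear(P, H), nn.Tanh())  # h_n = sigma(W^h x_n + b^h)
        self.mu = nn.Linear(H, D)                    # mu_n^z = W^mu h_n + b^mu
        self.log_sigma = nn.Linear(H, D)             # sigma_n^z = exp(W^sigma h_n + b^sigma)
        self.dec = nn.Sequential(nn.Linear(D, H), nn.Tanh(),    # h_n = sigma(W^h z_n + b^h)
                                 nn.Linear(H, P))               # logits of mu_n^x

    def forward(self, x):
        h = self.enc_h(x)
        q = torch.distributions.Normal(self.mu(h), self.log_sigma(h).exp())
        z = q.rsample()                              # reparameterized sample (Slide 17)
        p_x = torch.distributions.Bernoulli(logits=self.dec(z))
        p_z = torch.distributions.Normal(0.0, 1.0)
        # L(theta, phi) = E_q[log p(x, z; theta) - log q(z | x; phi)]
        elbo = p_x.log_prob(x).sum(-1) + p_z.log_prob(z).sum(-1) - q.log_prob(z).sum(-1)
        return elbo.mean()
```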

SLIDE 14 Variational Autoencoder: Learned Latent Codes
SLIDE 15 Autoencoder vs. Variational Autoencoder
SLIDE 16 Training: The Reparameterization Trick
Goal: Compute the gradient on a minibatch of images $\{x_1, \ldots, x_B\}$.
Gradient with respect to $\theta$:
$$\nabla_\theta \mathcal{L}(\theta, \phi) \simeq \frac{N}{B} \sum_{b=1}^{B} \mathbb{E}_{q_\phi(z \mid x_b)}\big[\nabla_\theta \log p_\theta(x_b, z)\big]$$
Gradient with respect to $\phi$, the analogue of BBVI (a REINFORCE-style estimator):
$$\nabla_\phi \mathcal{L}(\theta, \phi) \simeq \frac{N}{B} \sum_{b=1}^{B} \frac{1}{S} \sum_{s=1}^{S} \nabla_\phi \log q_\phi\big(z^{b,s} \mid x_b\big) \log \frac{p_\theta(x_b, z^{b,s})}{q_\phi(z^{b,s} \mid x_b)}, \qquad z^{b,s} \sim q_\phi(z \mid x_b)$$
Problem: this estimator might have high variance.
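For concreteness, a sketch of the REINFORCE-style estimator for a single image with $S = 1$: the $\phi$-gradient is $\nabla_\phi \log q_\phi(z \mid x)$ weighted by the log-ratio, which is held constant via `detach`. The networks and parameters are stand-ins.

```python
import torch

P, D = 784, 2
x = torch.bernoulli(torch.rand(P))               # stand-in observed image
mu = torch.zeros(D, requires_grad=True)          # stand-in variational parameter, phi
dec = torch.nn.Linear(D, P)                      # stand-in decoder, theta

q = torch.distributions.Normal(mu, 1.0)
z = q.sample()                                   # z ~ q_phi(z | x), no reparameterization
log_w = (torch.distributions.Bernoulli(logits=dec(z)).log_prob(x).sum()
         + torch.distributions.Normal(0.0, 1.0).log_prob(z).sum()
         - q.log_prob(z).sum()).detach()         # log [p(x, z) / q(z | x)], held constant
surrogate = q.log_prob(z).sum() * log_w          # grad of this = REINFORCE estimate
surrogate.backward()                             # mu.grad now holds the (high-variance) estimate
```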
SLIDE 17 Training: The Reparameterization Trick
Idea: Sample $z^{b,s}$ using a reparameterized distribution:
$$\epsilon^{b,s} \sim \text{Normal}(0, I)$$
$$z^{b,s} = \mu^z(x_b; \phi) + \sigma^z(x_b; \phi) \odot \epsilon^{b,s} \quad \Longleftrightarrow \quad z^{b,s} \sim \text{Normal}\big(\mu^z(x_b; \phi),\, \sigma^z(x_b; \phi)\big)$$
Result: the reparameterized estimator
$$\nabla_\phi \mathcal{L}(\theta, \phi) = \nabla_\phi \frac{N}{B} \sum_{b=1}^{B} \mathbb{E}_{p(\epsilon)}\!\left[\log \frac{p_\theta\big(x_b, z(x_b, \epsilon; \phi)\big)}{q_\phi\big(z(x_b, \epsilon; \phi) \mid x_b\big)}\right] \simeq \frac{N}{B} \sum_{b=1}^{B} \frac{1}{S} \sum_{s=1}^{S} \nabla_\phi \log \frac{p_\theta\big(x_b, z(x_b, \epsilon^{b,s}; \phi)\big)}{q_\phi\big(z(x_b, \epsilon^{b,s}; \phi) \mid x_b\big)}$$
In practice: $S = 1$ is often enough.
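The same single-sample gradient with the trick applied, again with stand-in networks and assumed sizes: because $z$ is a deterministic, differentiable function of $\epsilon$, autograd carries the gradient into $\phi$ directly, with no score-function weighting.

```python
import torch

P, D = 784, 2
x = torch.bernoulli(torch.rand(P))               # stand-in observed image
mu = torch.zeros(D, requires_grad=True)          # stand-in mu^z(x; phi)
log_sigma = torch.zeros(D, requires_grad=True)   # stand-in log sigma^z(x; phi)
dec = torch.nn.Linear(D, P)                      # stand-in decoder, theta

eps = torch.randn(D)                             # eps ~ Normal(0, I)
z = mu + log_sigma.exp() * eps                   # z = mu + sigma * eps (differentiable)
log_p = (torch.distributions.Bernoulli(logits=dec(z)).log_prob(x).sum()
         + torch.distributions.Normal(0.0, 1.0).log_prob(z).sum())
log_q = torch.distributions.Normal(mu, log_sigma.exp()).log_prob(z).sum()
(-(log_p - log_q)).backward()                    # gradients for both theta and phi
```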
SLIDE 18 Variational Autoencoders
Objective: Learn a deep generative model and a corresponding inference model by optimizing
$$\theta^*, \phi^* = \operatorname{argmax}_{\theta, \phi} \; \mathbb{E}_{q(y, z \mid x; \phi)}\!\left[\log \frac{p(x, y, z; \theta)}{q(y, z \mid x; \phi)}\right]$$

SLIDE 19 Practical Implementations: TensorFlow / PyTorch
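As a hedged end-to-end example, a PyTorch training loop reusing the `VAE` module sketched under Slide 13 (assumed to be in scope here); the optimizer, batch size, epoch count, and binarization scheme are arbitrary choices, not from the slides.

```python
import torch
from torchvision import datasets, transforms

data = datasets.MNIST("data/", download=True, transform=transforms.ToTensor())
loader = torch.utils.data.DataLoader(data, batch_size=128, shuffle=True)

model = VAE()                                    # from the Slide 13 sketch
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(10):
    for x, _ in loader:                          # labels unused (unsupervised variant)
        x = x.view(x.size(0), -1).bernoulli()    # binarize 28x28 images to 784-d vectors
        loss = -model(x)                         # maximize the ELBO
        opt.zero_grad()
        loss.backward()
        opt.step()
```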
SLIDE 20 Continuous: $p_\theta(x_n, z_n)$, $q_\phi(z_n \mid x_n)$ ($z$ encodes both style and digit)
Continuous + Discrete: $p_\theta(x_n, y_n, z_n)$, $q_\phi(y_n, z_n \mid x_n)$ ($z$ encodes style)
SLIDE 21 Disentangled Representations: Learn Interpretable Features
Map each image $x_n$ to interpretable latent coordinates $z_{n,1}, z_{n,2}, \ldots$