Neural Network Part 5: Unsupervised Models
CS 760@UW-Madison
Goals for the lecture

You should understand the following concepts:
- autoencoder
- restricted Boltzmann machine (RBM)
- Nash equilibrium
- minimax game
- generative adversarial network (GAN)
Autoencoder

An autoencoder is a neural network trained to copy its input to its output, and thus can hopefully learn useful properties of the data (Bourlard and Kamp, 1988; Hinton and Zemel, 1994).

[Figure: the input y is mapped by an encoder f to the hidden representation (the code) h = f(y); a decoder g maps the code to the reconstruction r = g(h) = g(f(y)).]

Training minimizes a reconstruction loss measuring how close r is to y:

    L(y, r) = L(y, g(f(y)))
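A minimal sketch of this encoder/decoder pipeline in NumPy; the layer sizes, tanh nonlinearity, squared-error loss, and training schedule here are illustrative choices, not taken from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 20 points in R^4 that lie in a 2-D linear subspace,
# so a 2-D code can capture most of the structure.
basis = rng.normal(size=(2, 4))
Y = rng.normal(size=(20, 2)) @ basis

d_in, d_code = 4, 2                                 # code is lower-dimensional
W1 = rng.normal(scale=0.1, size=(d_in, d_code))     # encoder weights
b1 = np.zeros(d_code)
W2 = rng.normal(scale=0.1, size=(d_code, d_in))     # decoder weights
b2 = np.zeros(d_in)

def forward(Y):
    H = np.tanh(Y @ W1 + b1)          # h = f(y): the code
    R = H @ W2 + b2                   # r = g(h): the reconstruction
    return H, R

lr = 0.05
for step in range(500):               # plain gradient descent on L(y, g(f(y)))
    H, R = forward(Y)
    err = R - Y                       # gradient of squared error w.r.t. R (up to scale)
    gW2 = H.T @ err / len(Y)
    gb2 = err.mean(axis=0)
    dH = err @ W2.T * (1 - H ** 2)    # backprop through tanh
    gW1 = Y.T @ dH / len(Y)
    gb1 = dH.mean(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

_, R = forward(Y)
print("final reconstruction error:", ((R - Y) ** 2).mean())
```

The bottleneck (2 code units for 4 inputs) is what forces the network to learn structure rather than memorize: it cannot represent arbitrary inputs exactly.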
Regularized autoencoder

Add a penalty term R(h) that encourages the model to have other properties (e.g. sparsity of the code):

    L_R = L(y, g(f(y))) + R(h)
Why does a penalty make sense? Take a probabilistic view and maximize the likelihood of the data:

    max log p(y) = max log Σ_{h'} p(h', y)

If h is a good representation of y, then Σ_{h'} p(h', y) can be approximated by p(h, y), so

    max log p(h, y) = max [ log p(y|h) + log p(h) ]

where log p(y|h) corresponds to the reconstruction loss and log p(h) acts as the regularization term.
Sparse autoencoder: penalize the L1 norm of the code, which corresponds to placing a Laplace prior p(h_i) = (λ/2) exp(−λ|h_i|) on each code unit:

    L_R = L(y, g(f(y))) + λ‖h‖₁
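As a concrete instance, the L1 penalty simply adds λ‖h‖₁ to whatever reconstruction loss is in use; a sketch with made-up values:

```python
import numpy as np

def sparse_loss(y, r, h, lam=0.1):
    """Squared reconstruction error plus an L1 penalty on the code h."""
    return ((y - r) ** 2).sum() + lam * np.abs(h).sum()

y = np.array([1.0, 0.0, 2.0])          # input
r = np.array([0.9, 0.1, 1.8])          # reconstruction g(f(y))
h = np.array([0.5, 0.0, 0.0, -0.2])    # a sparse code: mostly zeros
print(sparse_loss(y, r, h))            # 0.06 reconstruction + 0.1 * 0.7 penalty ≈ 0.13
```

Because the L1 term has a constant-magnitude gradient away from zero, it drives small code entries exactly to zero, which is why it induces sparsity where an L2 penalty would not.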
Denoising autoencoder

Corrupt the input before encoding, which prevents the autoencoder from simply learning to be the identity:

    L(y, r) = L(y, g(f(ỹ)))  where ỹ is y + noise
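The only change from the plain autoencoder is the corruption step: the encoder sees the noisy ỹ, but the loss compares against the clean y. A sketch (noise level, loss, and the identity encoder/decoder used in the demo are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def denoising_loss(Y, encode, decode, noise_std=0.1):
    """Corrupt the input, but score the reconstruction against the CLEAN input."""
    Y_tilde = Y + rng.normal(scale=noise_std, size=Y.shape)  # y~ = y + noise
    R = decode(encode(Y_tilde))                              # r = g(f(y~))
    return ((R - Y) ** 2).mean()                             # L(y, g(f(y~)))

# Demo with an identity encoder/decoder: a pure copier pays a nonzero
# price because it copies the noise too, so identity is no longer a
# trivial minimizer of the training objective.
Y = rng.normal(size=(100, 5))
loss = denoising_loss(Y, lambda x: x, lambda x: x, noise_std=0.1)
print(loss)  # roughly noise_std**2 = 0.01
```

A model that maps noisy points back toward the data manifold can beat the copier, which is exactly the behavior the denoising objective rewards.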
Boltzmann machine

A Boltzmann machine defines a probability distribution over binary vectors y:

    p(y) = exp(−E(y)) / Z

where the energy function is

    E(y) = −yᵀU y − bᵀy

with U the weight matrix, b the bias parameter, and Z the partition function.

With latent variables, split the units into y = (y_v, y_h), where y_v are visible and y_h are hidden:

    E(y) = −y_vᵀR y_v − y_vᵀW y_h − y_hᵀS y_h − bᵀy_v − cᵀy_h
Given training data y_v¹, y_v², …, y_vⁿ, maximum likelihood training maximizes

    log p(data) = Σ_i log p(y_vⁱ)

where the hidden units are marginalized out:

    p(y_v) = Σ_{y_h} p(y_v, y_h) = Σ_{y_h} (1/Z) exp(−E(y_v, y_h))
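For a tiny model, both the sum over hidden configurations and the partition function Z can be computed by brute-force enumeration; a sketch with made-up parameters, feasible only because there are just 2⁵ total states:

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
n_v, n_h = 3, 2                          # 3 visible, 2 hidden binary units
n = n_v + n_h
U = np.triu(rng.normal(scale=0.2, size=(n, n)), k=1)  # weights on distinct pairs
b = rng.normal(scale=0.2, size=n)

def energy(y):
    return -(y @ U @ y) - b @ y          # E(y) = -y^T U y - b^T y

states = [np.array(s) for s in itertools.product([0, 1], repeat=n)]
Z = sum(np.exp(-energy(y)) for y in states)          # partition function

def p_visible(y_v):
    """p(y_v) = sum over hidden configs y_h of exp(-E(y_v, y_h)) / Z."""
    total = 0.0
    for y_h in itertools.product([0, 1], repeat=n_h):
        total += np.exp(-energy(np.concatenate([y_v, y_h])))
    return total / Z

probs = [p_visible(np.array(v)) for v in itertools.product([0, 1], repeat=n_v)]
print(sum(probs))   # marginal probabilities over y_v sum to 1
```

Both sums grow exponentially in the number of units, which is why real training relies on sampling-based approximations rather than exact enumeration.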
Restricted Boltzmann machine (RBM)

An RBM keeps only the visible–hidden connections (no connections within a layer):

    p(v, h) = exp(−E(v, h)) / Z

where the energy function is

    E(v, h) = −vᵀW h − bᵀv − cᵀh

with the weight matrix W and the biases b, c, and the partition function

    Z = Σ_v Σ_h exp(−E(v, h))
Figure from Deep Learning, Goodfellow, Bengio and Courville
Because of the bipartite structure, the conditional over the hidden units factorizes:

    p(h|v) = p(v, h) / p(v) = Π_j p(h_j|v),  with  p(h_j = 1|v) = σ(c_j + vᵀW_{:,j})

where σ is the logistic function.
Similarly, the conditional over the visible units factorizes:

    p(v|h) = p(v, h) / p(h) = Π_i p(v_i|h),  with  p(v_i = 1|h) = σ(b_i + W_{i,:}h)

where σ is the logistic function.
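These factorized conditionals are what make block Gibbs sampling in an RBM cheap: sample all of h given v in one shot, then all of v given h. A sketch with random (untrained) parameters; the sizes and chain length are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n_v, n_h = 6, 4
W = rng.normal(scale=0.5, size=(n_v, n_h))   # weight matrix
b = rng.normal(scale=0.5, size=n_v)          # visible biases
c = rng.normal(scale=0.5, size=n_h)          # hidden biases

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_chain(steps=1000):
    v = (rng.random(n_v) < 0.5).astype(float)     # random initial visible state
    samples = []
    for _ in range(steps):
        # p(h_j = 1 | v) = sigma(c_j + v^T W_{:,j}), all j sampled in parallel
        h = (rng.random(n_h) < sigmoid(c + v @ W)).astype(float)
        # p(v_i = 1 | h) = sigma(b_i + W_{i,:} h), all i sampled in parallel
        v = (rng.random(n_v) < sigmoid(b + W @ h)).astype(float)
        samples.append(v)
    return np.array(samples)

samples = gibbs_chain()
print("empirical mean of visible units:", samples.mean(axis=0))
```

Alternating these two block updates is the sampling core of contrastive-divergence-style RBM training; with within-layer connections (a general Boltzmann machine) each unit would instead need to be sampled one at a time.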
See Ian Goodfellow's tutorial slides: http://www.iangoodfellow.com/slides/2018-06-22-gan_tutorial.pdf
Some of the slides in these lectures have been adapted/borrowed from materials developed by Mark Craven, David Page, Jude Shavlik, Tom Mitchell, Nina Balcan, Matt Gormley, Elad Hazan, Tom Dietterich, Pedro Domingos, Geoffrey Hinton, and Ian Goodfellow.