Deep Learning Basics Lecture 1: Feedforward
Princeton University COS 495 Instructor: Yingyu Liang
Motivation I: representation learning

Machine learning 1-2-3:
- Collect data and extract features
- Build model: choose hypothesis class
Example features: color histogram over the Red, Green, and Blue channels.
Pipeline: input x → extract features φ(x) → build hypothesis y = wᵀφ(x).
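A sketch of this pipeline (the function name, the 8-bin choice, and the toy image are illustrative, not from the slides): a fixed color-histogram feature map followed by a linear hypothesis.

```python
import numpy as np

def color_histogram(image, bins=8):
    """Fixed feature map phi(x): per-channel histograms of an RGB image
    with pixel values in [0, 255], normalized and concatenated."""
    feats = []
    for c in range(3):  # Red, Green, Blue channels
        hist, _ = np.histogram(image[..., c], bins=bins, range=(0, 256))
        feats.append(hist / image[..., c].size)  # frequencies, sum to 1
    return np.concatenate(feats)

# Hypothesis y = w^T phi(x) on top of the fixed features
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3))  # toy "image"
phi = color_histogram(image)                    # 3 channels * 8 bins = 24 features
w = rng.standard_normal(phi.shape[0])           # weight vector (random here)
y = w @ phi
```

The features φ(x) are hand-designed and fixed; only w would be learned.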
Fixed vs. learned features:
- Linear model with fixed features: y = sign(wᵀφ(x) + b), where φ(x) is fixed in advance
- Nonlinear model with learned features: y = wᵀφ(x), where both w and φ(x) are learned
[Diagram: two-layer network, input x → features φ(x) → output y = wᵀφ(x)]
Linear feature maps φ(x) = θx don't work: wᵀ(θx) is still linear in x, so we need some nonlinearity.
Typical nonlinearity: φ(x) = σ(θx), where σ(⋅) is some nonlinear function.
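A minimal forward pass for this two-layer model, assuming σ is the sigmoid and using illustrative shapes:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def two_layer(x, theta, w):
    """y = w^T phi(x) with a learned feature map phi(x) = sigma(theta x)."""
    phi = sigmoid(theta @ x)  # nonlinear features, parametrized by theta
    return w @ phi

rng = np.random.default_rng(0)
x = rng.standard_normal(4)           # input
theta = rng.standard_normal((5, 4))  # first-layer weights (learned)
w = rng.standard_normal(5)           # output weights (learned)
y = two_layer(x, theta, w)
```

Unlike the histogram example, here both θ and w are trainable, so the features themselves are learned from data.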
Stacking such layers gives a deep feedforward network: input x, hidden layers h₁, h₂, …, hₙ, output y.
[Figure from Deep Learning, by Goodfellow, Bengio, Courville. Dark boxes are things to be learned.]
[Figure from Wikipedia: a biological neuron]
A neuron fires when the correlation between the input and a pattern θ exceeds some threshold b.
[Diagram: output y computed from inputs x₁, x₂, …, xₙ]
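The firing rule above can be sketched directly (the pattern θ and threshold b below are made-up illustrative values):

```python
import numpy as np

def perceptron_fire(x, theta, b):
    """Fire (output 1) when the correlation theta . x exceeds the threshold b."""
    return 1 if theta @ x > b else 0

theta = np.array([1.0, -2.0, 0.5])  # the stored "pattern" (illustrative)
b = 0.0                             # threshold

on = perceptron_fire(np.array([1.0, 0.0, 0.0]), theta, b)   # correlation 1.0 > 0
off = perceptron_fire(np.array([0.0, 1.0, 0.0]), theta, b)  # correlation -2.0 < 0
```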
Network structure: the input x feeds the first layer, which produces hidden variables h₁, h₂, …, hₙ; the output layer then produces y.
Output layer: the earlier layers can be viewed as preprocessing that expands the input into a representation h; the output layer is then a simple model on h with weights w, e.g., logistic regression on h.
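For binary classification, the output layer as logistic regression on h could be sketched as follows (shapes and values are illustrative):

```python
import numpy as np

def output_layer(h, w, b):
    """Logistic regression on the last hidden representation h:
    P(y = 1 | h) = sigmoid(w^T h + b)."""
    return 1.0 / (1.0 + np.exp(-(w @ h + b)))

rng = np.random.default_rng(0)
h = rng.standard_normal(5)  # representation from the hidden layers
w = rng.standard_normal(5)  # output-layer weights
p = output_layer(h, w, 0.0)
```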
Hidden layers: each layer computes a linear combination of the previous layer's values, then applies a nonlinearity to produce the value for the next layer: hᶦ → hᶦ⁺¹.
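Iterating this rule layer by layer gives the whole forward pass; a sketch using ReLU as the nonlinearity (an assumption — any σ works) and illustrative layer sizes:

```python
import numpy as np

def forward(x, layers):
    """Each hidden layer: linear combination of the previous layer's values,
    then a nonlinearity, producing h^{i+1} from h^i."""
    h = x
    for W, b in layers:
        h = np.maximum(W @ h + b, 0.0)  # ReLU nonlinearity
    return h

rng = np.random.default_rng(0)
layers = [(rng.standard_normal((6, 4)), np.zeros(6)),   # h^1: 4 -> 6
          (rng.standard_normal((3, 6)), np.zeros(3))]   # h^2: 6 -> 3
h_out = forward(rng.standard_normal(4), layers)
```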
Activation function σ(⋅): maps the linear combination to the unit's output, e.g., the sigmoid. [Figure borrowed from Pattern Recognition and Machine Learning, Bishop]
Problem with the sigmoid: too small gradient in the saturated regions. [Figure from Deep Learning, by Goodfellow, Bengio, Courville.]
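The "too small gradient" problem is easy to see numerically: the sigmoid's derivative σ′(z) = σ(z)(1 − σ(z)) peaks at 0.25 and is nearly zero once |z| is large.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)  # sigma'(z) = sigma(z) * (1 - sigma(z))

peak = sigmoid_grad(0.0)        # 0.25, the largest the gradient ever gets
saturated = sigmoid_grad(10.0)  # ~4.5e-5: the gradient has all but vanished
```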
ReLU: g_ReLU(z) = max{z, 0}, with gradient 0 for z < 0 and gradient 1 for z > 0.
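A sketch of the ReLU and its piecewise-constant gradient:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)  # g_ReLU(z) = max{z, 0}

def relu_grad(z):
    return (np.asarray(z) > 0).astype(float)  # gradient 0 for z < 0, 1 for z > 0

z = np.array([-2.0, -0.5, 0.5, 2.0])
out = relu(z)         # [0.0, 0.0, 0.5, 2.0]
grads = relu_grad(z)  # [0.0, 0.0, 1.0, 1.0]
```

Unlike the sigmoid, the gradient does not shrink for large positive inputs, which helps gradients propagate through deep networks.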