  1. Deep Learning Basics Lecture 1: Feedforward Princeton University COS 495 Instructor: Yingyu Liang

  2. Motivation I: representation learning

  3. Machine learning 1-2-3 • Collect data and extract features • Build model: choose hypothesis class ℋ and loss function l • Optimization: minimize the empirical loss

  4. Features • Input x (e.g., an image) → extract features → φ(x), e.g., a color histogram over red, green, and blue → build hypothesis → y = w^T φ(x)

  5. Features: part of the model • φ(x): the nonlinear part of the model • Build hypothesis y = w^T φ(x): a linear model on top of the features

  6. Example: polynomial kernel SVM • y = sign(w^T φ(x) + b) • φ(x) is fixed

  7. Motivation: representation learning • Why don't we also learn φ(x)? • Learn φ(x) and learn w, so that y = w^T φ(x)

  8. Feedforward networks • View each dimension of φ(x) as something to be learned • y = w^T φ(x)

  9. Feedforward networks • Linear functions φ_i(x) = θ_i^T x don't work: need some nonlinearity • y = w^T φ(x)

  10. Feedforward networks • Typically, set φ_i(x) = r(θ_i^T x), where r(·) is some nonlinear function • y = w^T φ(x)
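
  A minimal NumPy sketch of this hypothesis, with made-up dimensions and tanh chosen as the nonlinearity r: each learned feature is φ_i(x) = r(θ_i^T x), followed by the linear readout y = w^T φ(x).

  ```python
  import numpy as np

  def phi(x, Theta):
      """Learned features: phi_i(x) = r(theta_i^T x), here with r = tanh."""
      return np.tanh(Theta.T @ x)

  def predict(x, Theta, w):
      """Hypothesis y = w^T phi(x)."""
      return w @ phi(x, Theta)

  # Hypothetical sizes: 4-dimensional input, 3 learned features
  rng = np.random.default_rng(0)
  x = rng.normal(size=4)
  Theta = rng.normal(size=(4, 3))   # column theta_i defines feature i
  w = rng.normal(size=3)
  print(predict(x, Theta, w))
  ```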

  11. Feedforward deep networks • What if we go deeper? Stack hidden layers: x → h^1 → h^2 → … → h^L → y
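
  Going deeper just repeats the layer computation above. A sketch of the forward pass x → h^1 → … → h^L → y, again with tanh as the assumed nonlinearity and hypothetical layer sizes:

  ```python
  import numpy as np

  def forward(x, Ws, w):
      """Forward pass x -> h^1 -> ... -> h^L -> y through a stack of layers."""
      h = x
      for W in Ws:            # each hidden layer: h^{i+1} = r(W^T h^i)
          h = np.tanh(W.T @ h)
      return w @ h            # final linear readout y = w^T h^L

  rng = np.random.default_rng(1)
  x = rng.normal(size=4)
  Ws = [rng.normal(size=(4, 5)), rng.normal(size=(5, 3))]  # two hidden layers
  w = rng.normal(size=3)
  print(forward(x, Ws, w))
  ```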

  12. Figure from Deep Learning, by Goodfellow, Bengio, Courville. Dark boxes are things to be learned.

  13. Motivation II: neurons

  14. Motivation: neurons Figure from Wikipedia

  15. Motivation: abstract neuron model • Neuron activated when the correlation between the input x and a pattern θ exceeds some threshold b • y = threshold(θ^T x − b) or y = r(θ^T x − b) • r(·) is called the activation function
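
  A small illustration (with made-up inputs) of the abstract neuron: it fires when the correlation θ^T x exceeds the threshold b, either with a hard threshold or with a smooth activation r such as the sigmoid.

  ```python
  import numpy as np

  def neuron_hard(x, theta, b):
      """Hard threshold unit: y = 1 if theta^T x exceeds b, else 0."""
      return float(theta @ x - b >= 0)

  def neuron_soft(x, theta, b, r=lambda z: 1 / (1 + np.exp(-z))):
      """Soft unit: y = r(theta^T x - b), here with r = sigmoid."""
      return r(theta @ x - b)

  x = np.array([0.5, -1.0, 2.0])
  theta = np.array([1.0, 0.0, 0.5])
  print(neuron_hard(x, theta, b=1.0), neuron_soft(x, theta, b=1.0))
  ```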

  16. Motivation: artificial neural networks

  17. Motivation: artificial neural networks • Put neurons into layers: feedforward deep networks x → h^1 → h^2 → … → h^L → y

  18. Components in Feedforward networks

  19. Components • Representations: input, hidden variables • Layers/weights: hidden layers, output layer

  20. Components • Input x → first layer → hidden variables h^1, h^2, …, h^L → output layer → output y

  21. Input • Represented as a vector • Sometimes requires some preprocessing, e.g., • Subtract the mean • Normalize to [-1, 1]

  22. Output layers • Regression: y = w^T h + b • Linear units: no nonlinearity

  23. Output layers • Multi-dimensional regression: y = W^T h + b • Linear units: no nonlinearity

  24. Output layers • Binary classification: y = σ(w^T h + b) • Corresponds to using logistic regression on h

  25. Output layers • Multi-class classification: y = softmax(z), where z = W^T h + b • Corresponds to using multi-class logistic regression on h
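
  The output layers of slides 22–25, written as a small NumPy sketch; the hidden vector h and the parameters w, W, b are hypothetical placeholders.

  ```python
  import numpy as np

  def linear_output(h, w, b):
      """Regression: y = w^T h + b (no nonlinearity)."""
      return w @ h + b

  def sigmoid_output(h, w, b):
      """Binary classification: y = sigma(w^T h + b), i.e. logistic regression on h."""
      return 1 / (1 + np.exp(-(w @ h + b)))

  def softmax_output(h, W, b):
      """Multi-class classification: y = softmax(z), z = W^T h + b."""
      z = W.T @ h + b
      z = z - z.max()                 # subtract max for numerical stability
      e = np.exp(z)
      return e / e.sum()

  rng = np.random.default_rng(2)
  h = rng.normal(size=5)
  print(softmax_output(h, rng.normal(size=(5, 3)), rng.normal(size=3)))
  ```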

  26. Hidden layers • Each neuron takes a weighted linear combination of the previous layer h^i • So it can be thought of as outputting one value for the next layer h^{i+1}

  27. Hidden layers • y = r(w^T x + b) • Typical activation functions r(·): • Threshold: t(z) = 𝕀[z ≥ 0] • Sigmoid: σ(z) = 1/(1 + exp(−z)) • Tanh: tanh(z) = 2σ(2z) − 1
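
  The three classic activation functions from this slide as short NumPy functions; the identity tanh(z) = 2σ(2z) − 1 can be checked numerically.

  ```python
  import numpy as np

  def threshold(z):
      """Hard threshold: t(z) = 1 if z >= 0, else 0."""
      return (z >= 0).astype(float)

  def sigmoid(z):
      """Sigmoid: sigma(z) = 1 / (1 + exp(-z))."""
      return 1 / (1 + np.exp(-z))

  z = np.linspace(-3, 3, 7)
  print(threshold(z))
  print(sigmoid(z))
  print(np.allclose(np.tanh(z), 2 * sigmoid(2 * z) - 1))  # tanh(z) = 2*sigma(2z) - 1
  ```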

  28. Hidden layers • Problem: saturation, where the gradient of r(·) becomes too small. Figure borrowed from Pattern Recognition and Machine Learning, Bishop.

  29. Hidden layers • Activation function ReLU (rectified linear unit) • ReLU(z) = max{z, 0}. Figure from Deep Learning, by Goodfellow, Bengio, Courville.

  30. Hidden layers • Activation function ReLU (rectified linear unit) • ReLU(z) = max{z, 0} • Gradient 1 for z > 0, gradient 0 for z < 0

  31. Hidden layers • Generalizations of ReLU: gReLU(z) = max{z, 0} + α min{z, 0} • Leaky-ReLU(z) = max{z, 0} + 0.01 min{z, 0} • Parametric-ReLU(z): α learnable
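
  The ReLU family from slides 29–31 in code: α = 0 gives plain ReLU, α = 0.01 gives the leaky variant, and in a parametric ReLU α would itself be a learned parameter.

  ```python
  import numpy as np

  def grelu(z, alpha=0.0):
      """Generalized ReLU: gReLU(z) = max{z, 0} + alpha * min{z, 0}."""
      return np.maximum(z, 0) + alpha * np.minimum(z, 0)

  z = np.array([-2.0, -0.5, 0.0, 1.5])
  print(grelu(z))              # ReLU
  print(grelu(z, alpha=0.01))  # Leaky-ReLU
  ```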
