CS489/698 Lecture 9: Feb 1, 2017
Multi-layer Neural Networks, Error Backpropagation
(c) 2017 P. Poupart


  1. CS489/698 Lecture 9: Feb 1, 2017. Multi-layer Neural Networks, Error Backpropagation. Readings: [D] Chapt. 10, [HTF] Chapt. 11, [B] Sec. 5.2, 5.3, [M] Sec. 16.5, [RN] Sec. 18.7

  2. Quick Recap: Linear Models • Linear regression • Linear classification

  3. Quick Recap: Non-linear Models • Non-linear classification • Non-linear regression

  4. Non-linear Models • Convenient modeling assumption: linearity • Extension: non-linearity can be obtained by mapping the inputs to a non-linear feature space • Limitation: the basis functions are chosen a priori and remain fixed • Question: can we work with unrestricted non-linear models?

  5. Flexible Non-Linear Models • Idea 1: Select basis functions that correspond to the training data and retain only a subset of them (e.g., Support Vector Machines) • Idea 2: Learn non-linear basis functions (e.g., Multi-layer Neural Networks)

  6. Two-Layer Architecture • Feed-forward neural network • Hidden units: $z_j = h_1\left(\sum_i w_{ji}^{(1)} x_i\right)$ • Output units: $y_k = h_2\left(\sum_j w_{kj}^{(2)} z_j\right)$ • Overall: $y_k(\mathbf{x}, \mathbf{w}) = h_2\left(\sum_j w_{kj}^{(2)}\, h_1\left(\sum_i w_{ji}^{(1)} x_i\right)\right)$
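
A minimal NumPy sketch of this two-layer forward computation; the function and variable names, and the choice of tanh/identity for h1/h2, are illustrative assumptions rather than anything fixed by the slides:

    import numpy as np

    def forward(x, W1, W2, h1=np.tanh, h2=lambda a: a):
        """Two-layer feed-forward pass."""
        z = h1(W1 @ x)   # hidden units: z_j = h1(sum_i w_ji * x_i)
        y = h2(W2 @ z)   # output units: y_k = h2(sum_j w_kj * z_j)
        return y, z

    # Example: 2 inputs, 3 hidden units, 1 output
    rng = np.random.default_rng(0)
    W1 = rng.normal(size=(3, 2))
    W2 = rng.normal(size=(1, 3))
    y, z = forward(np.array([0.5, -1.0]), W1, W2)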

  7. Common activation functions • Threshold: $h(a) = \begin{cases} 1 & a \ge 0 \\ 0 & a < 0 \end{cases}$ • Sigmoid: $\sigma(a) = \frac{1}{1 + e^{-a}}$ • Gaussian: $h(a) = e^{-a^2/2}$ • Tanh: $h(a) = \frac{e^{a} - e^{-a}}{e^{a} + e^{-a}}$ • Identity: $h(a) = a$
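
For quick reference, the same activations in NumPy (a sketch; the unit-width Gaussian is an assumption, since the slide's exact parameterization did not survive extraction):

    import numpy as np

    def threshold(a):
        return (a >= 0).astype(float)      # step function

    def sigmoid(a):
        return 1.0 / (1.0 + np.exp(-a))    # 1 / (1 + e^-a)

    def gaussian(a):
        return np.exp(-a**2 / 2.0)         # assumed unit width

    tanh = np.tanh                         # (e^a - e^-a) / (e^a + e^-a)

    def identity(a):
        return a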

  8. Adaptive non-linear basis functions • Non-linear regression: $h_1$ is a non-linear function and $h_2$ is the identity • Non-linear classification: $h_1$ is a non-linear function and $h_2$ is the sigmoid

  9. Weight training • Parameters: all the network weights $\mathbf{w} = \{w^{(1)}, w^{(2)}\}$ • Objectives: – Error minimization • Backpropagation (aka “backprop”) – Maximum likelihood – Maximum a posteriori – Bayesian learning

  10. Least squared error • Error function: $E(\mathbf{w}) = \frac{1}{2} \sum_n \left(y(\mathbf{x}_n, \mathbf{w}) - t_n\right)^2$ • When the output activation is the identity, the network output is a linear combination of the non-linear basis functions computed by the hidden units, so we are optimizing a linear combination of non-linear basis functions
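
A small self-contained sketch of this objective for the tanh/identity network above (names and shapes are assumptions):

    import numpy as np

    def squared_error(X, T, W1, W2):
        """E(w) = 1/2 * sum_n (y(x_n, w) - t_n)^2."""
        total = 0.0
        for x, t in zip(X, T):
            # Identity output: a linear combination (rows of W2)
            # of the learned basis functions tanh(W1 @ x).
            y = W2 @ np.tanh(W1 @ x)
            total += 0.5 * np.sum((y - t) ** 2)
        return total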

  11. Sequential Gradient Descent • For each example $(\mathbf{x}_n, t_n)$, adjust the weights as follows: $\mathbf{w} \leftarrow \mathbf{w} - \eta \nabla E_n(\mathbf{w})$ • How can we compute the gradient efficiently given an arbitrary network structure? • Answer: the backpropagation algorithm
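
The per-example update loop, sketched in Python; grad_En stands in for the backprop gradient routine derived on the following slides, and the learning rate eta is an assumed hyperparameter:

    def sgd(W1, W2, examples, grad_En, eta=0.1, epochs=10):
        """Sequential gradient descent: one weight update per example."""
        for _ in range(epochs):
            for x, t in examples:
                g1, g2 = grad_En(x, t, W1, W2)
                W1 = W1 - eta * g1   # w <- w - eta * grad E_n(w)
                W2 = W2 - eta * g2
        return W1, W2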

  12. Backpropagation Algorithm • Two phases: – Forward phase: compute the output of each unit – Backward phase: compute the delta (error signal) at each unit

  13. Forward phase • Propagate the inputs forward to compute the output of each unit • Output at unit $j$: $z_j = h(a_j)$ where $a_j = \sum_i w_{ji} z_i$

  14. Backward phase • Use the chain rule to recursively compute the gradient – For each weight $w_{ji}$: $\frac{\partial E_n}{\partial w_{ji}} = \frac{\partial E_n}{\partial a_j} \frac{\partial a_j}{\partial w_{ji}}$ – Let $\delta_j \equiv \frac{\partial E_n}{\partial a_j}$ – Since $a_j = \sum_i w_{ji} z_i$, then $\frac{\partial E_n}{\partial w_{ji}} = \delta_j z_i$
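
A handy way to validate a chain-rule gradient like this is a central finite-difference check (my own sketch, not part of the lecture):

    import numpy as np

    def numeric_grad(E, W, eps=1e-6):
        """Approximate dE/dW entrywise: (E(W + eps) - E(W - eps)) / (2 eps)."""
        g = np.zeros_like(W, dtype=float)
        for i in range(W.size):
            Wp, Wm = W.copy(), W.copy()
            Wp.flat[i] += eps
            Wm.flat[i] -= eps
            g.flat[i] = (E(Wp) - E(Wm)) / (2 * eps)
        return g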

  15. Simple Example • Consider a network with two layers: – Hidden nodes: $z_j = \tanh\left(\sum_i w_{ji} x_i\right)$ • Tip: $\frac{\partial \tanh(a)}{\partial a} = 1 - \tanh^2(a)$ – Output node: $y_k = \sum_j w_{kj} z_j$ • Objective: squared error $E_n = \frac{1}{2} \sum_k (y_k - t_k)^2$

  16. Simple Example • Forward propagation: – Hidden units: $a_j = \sum_i w_{ji} x_i$, $z_j = \tanh(a_j)$ – Output units: $y_k = \sum_j w_{kj} z_j$ • Backward propagation: – Output units: $\delta_k = y_k - t_k$ – Hidden units: $\delta_j = (1 - z_j^2) \sum_k w_{kj} \delta_k$ • Gradients: – Hidden layer: $\frac{\partial E_n}{\partial w_{ji}} = \delta_j x_i$ – Output layer: $\frac{\partial E_n}{\partial w_{kj}} = \delta_k z_j$
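
The whole example as one NumPy routine (a sketch of the tanh-hidden/identity-output network; array shapes are my own convention):

    import numpy as np

    def backprop(x, t, W1, W2):
        """One forward/backward pass; returns dE_n/dW1 and dE_n/dW2."""
        # Forward phase
        a = W1 @ x                            # a_j = sum_i w_ji x_i
        z = np.tanh(a)                        # z_j = tanh(a_j)
        y = W2 @ z                            # y_k = sum_j w_kj z_j
        # Backward phase
        d_out = y - t                         # delta_k = y_k - t_k
        d_hid = (1 - z**2) * (W2.T @ d_out)   # delta_j = (1 - z_j^2) sum_k w_kj delta_k
        # Gradients
        g2 = np.outer(d_out, z)               # dE_n/dw_kj = delta_k z_j
        g1 = np.outer(d_hid, x)               # dE_n/dw_ji = delta_j x_i
        return g1, g2

This is exactly the kind of grad_En routine the sequential gradient descent loop above expects.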

  17. Non-linear regression examples • Two-layer network: – 3 tanh hidden units and 1 identity output unit: $y = \sum_j w_j^{(2)} \tanh\left(\sum_i w_{ji}^{(1)} x_i\right)$

  18. Analysis • Efficiency: – Fast gradient computation: linear in the number of weights • Convergence: – Slow convergence (linear rate) – May get trapped in local optima • Prone to overfitting – Solutions: early stopping, regularization (add a penalty term to the objective)
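
For the regularization fix, a penalized objective $E_n(\mathbf{w}) + \frac{\lambda}{2}\|\mathbf{w}\|^2$ simply adds $\lambda \mathbf{w}$ to each gradient; a sketch reusing the backprop routine above (lam is an assumed hyperparameter):

    def regularized_grads(x, t, W1, W2, lam=1e-3):
        """Gradients of E_n(w) + (lam/2) * ||w||^2."""
        g1, g2 = backprop(x, t, W1, W2)
        return g1 + lam * W1, g2 + lam * W2

Early stopping would instead monitor validation error during training and halt once it starts to rise.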
