
CS480/680 Lecture 9: June 5, 2019 - Perceptrons, Neural Networks



  1. CS480/680 Lecture 9: June 5, 2019. Perceptrons, Neural Networks. Readings: [D] Chapt. 4; [HTF] Chapt. 11; [B] Sec. 4.1.7, 5.1; [M] Sec. 8.5.4; [RN] Sec. 18.7. University of Waterloo, CS480/680 Spring 2019, Pascal Poupart

  2. Outline • Neural networks – Perceptron – Supervised learning algorithms for neural networks

  3. Brain • Seat of human intelligence • Where memory/knowledge resides • Responsible for thoughts and decisions • Can learn • Consists of nerve cells called neurons

  4. Neuron • (figure: diagram of a biological neuron)

  5. Comparison • Brain – Network of neurons – Nerve signals propagate in a neural network – Parallel computation – Robust (neurons die every day without any impact) • Computer – Bunch of gates – Electrical signals directed by gates – Sequential and parallel computation – Fragile (if a gate stops working, the computer crashes)

  6. Artificial Neural Networks • Idea: mimic the brain to do computation • Artificial neural network: – Nodes (a.k.a. units) correspond to neurons – Links correspond to synapses • Computation: – Numerical signals transmitted between nodes correspond to chemical signals between neurons – Nodes modifying numerical signals correspond to neurons' firing rates

  7. ANN Unit
     • For each unit $i$:
       – Weights: strength $w_{ij}$ of the link from unit $j$ to unit $i$
       – Input signals $x_j$ weighted by $w_{ij}$ and linearly combined: $a_i = \sum_j w_{ij} x_j + b_i = \bar{w}^T \bar{x}$
     • Activation function $h$: numerical signal produced: $y_i = h(a_i)$
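A minimal NumPy sketch of this computation (the function names are mine, not from the slides; `h` is any activation function passed in):

```python
import numpy as np

def unit_output(w, b, x, h):
    """One ANN unit: a_i = sum_j w_ij * x_j + b_i, then y_i = h(a_i)."""
    a = np.dot(w, x) + b   # linear combination of the input signals
    return h(a)            # activation function produces the output signal
```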

  8. ANN Unit • (figure: a single unit with inputs $x_j$, weights $w_{ij}$, a summation node, and activation function $h$)

  9. Activation Function • Should be nonlinear – Otherwise network is just a linear function • Often chosen to mimic firing in neurons – Unit should be “active” (output near 1) when fed with the “right” inputs – Unit should be “inactive” (output near 0) when fed with the “wrong” inputs

  10. Common Activation Functions • Threshold (step function) • Sigmoid (the slide plots both)
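As a sketch, the two activations can be written as follows (vectorized with NumPy; function names are my own):

```python
import numpy as np

def threshold(a):
    """Step activation: output 1 when the combined input is positive, else 0."""
    return np.where(a > 0, 1.0, 0.0)

def sigmoid(a):
    """Smooth version of the step: 1 / (1 + exp(-a)), output in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-a))
```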

  11. Logic Gates • McCulloch and Pitts (1943) – Design ANNs to represent Boolean functions • What should be the weights of the following units to code AND, OR, NOT?
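One possible answer, as a sketch: a threshold unit fires when $\bar{w}^T \bar{x} + b > 0$, so the weight/bias pairs below (one valid choice among many) implement the three gates.

```python
# A threshold unit fires when w.x + b > 0; these weights are one
# illustrative choice, not the only one that works.
AND = ([1.0, 1.0], -1.5)   # fires only when both inputs are 1
OR  = ([1.0, 1.0], -0.5)   # fires when at least one input is 1
NOT = ([-1.0], 0.5)        # inverts its single input

def gate(params, x):
    w, b = params
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

# Verify the truth tables over inputs (0,0), (0,1), (1,0), (1,1).
assert [gate(AND, [a, b]) for a in (0, 1) for b in (0, 1)] == [0, 0, 0, 1]
assert [gate(OR,  [a, b]) for a in (0, 1) for b in (0, 1)] == [0, 1, 1, 1]
assert [gate(NOT, [a]) for a in (0, 1)] == [1, 0]
```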

  12. Network Structures • Feed-forward network – Directed acyclic graph – No internal state – Simply computes outputs from inputs • Recurrent network – Directed cyclic graph – Dynamical system with internal states – Can memorize information

  13. Feed-forward network • Simple network with two inputs, one hidden layer of two units, one output unit
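The slide's network is generic; as an illustration (the weights below are my own choice, not from the slide), a 2-2-1 network of threshold units can compute XOR, which a single perceptron cannot represent:

```python
import numpy as np

def step(a):
    return (a > 0).astype(float)

# Hypothetical weights for a 2-2-1 threshold network computing XOR.
W1 = np.array([[1.0, 1.0],    # hidden unit 1 acts like OR
               [1.0, 1.0]])   # hidden unit 2 acts like AND
b1 = np.array([-0.5, -1.5])
w2 = np.array([1.0, -1.0])    # output fires for "OR and not AND"
b2 = -0.5

def forward(x):
    hidden = step(W1 @ x + b1)      # hidden layer of two units
    return step(w2 @ hidden + b2)   # single output unit

for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(x, forward(np.array(x, dtype=float)))  # prints the XOR truth table
```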

  14. Perceptron • Single-layer feed-forward network

  15. Supervised Learning • Given list of $(\bar{x}, y)$ pairs • Train feed-forward ANN – To compute proper outputs $y$ when fed with inputs $\bar{x}$ – Consists of adjusting weights $w_{ij}$ • Simple learning algorithm for threshold perceptrons

  16. Threshold Perceptron Learning
     • Learning is done separately for each unit $i$ – since units do not share weights
     • Perceptron learning for unit $i$:
       – For each $(\bar{x}, y)$ pair do:
         • Case 1: correct output produced: $\forall j\ w_{ij} \leftarrow w_{ij}$
         • Case 2: output produced is 0 instead of 1: $\forall j\ w_{ij} \leftarrow w_{ij} + x_j$
         • Case 3: output produced is 1 instead of 0: $\forall j\ w_{ij} \leftarrow w_{ij} - x_j$
       – Until correct output for all training instances
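A runnable sketch of this procedure for one unit, assuming NumPy, targets in {0, 1}, and inputs with a constant 1 appended so the bias lives inside $\bar{w}$; the epoch cap is my own safeguard (the slide loops until all instances are correct):

```python
import numpy as np

def perceptron_learning(X, y, max_epochs=100):
    """Threshold perceptron learning: leave weights alone on a correct
    output, add x when 0 was produced instead of 1, subtract x when
    1 was produced instead of 0."""
    w = np.zeros(X.shape[1])
    for _ in range(max_epochs):
        errors = 0
        for x, target in zip(X, y):
            out = 1 if w @ x > 0 else 0
            if out != target:
                w += x if target == 1 else -x   # cases 2 and 3
                errors += 1
        if errors == 0:   # correct output for all training instances
            break
    return w   # may not converge if the data is not linearly separable
```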

  17. Threshold Perceptron Learning
     • Perceptron computes 1 when $\bar{w}^T \bar{x} = \sum_j w_j x_j + w_0 > 0$ and 0 when $\bar{w}^T \bar{x} < 0$
     • Dot products: $\bar{x}^T \bar{x} \geq 0$ and $-\bar{x}^T \bar{x} \leq 0$
     • If output should be 1 instead of 0 then $\bar{w} \leftarrow \bar{w} + \bar{x}$ since $(\bar{w} + \bar{x})^T \bar{x} \geq \bar{w}^T \bar{x}$
     • If output should be 0 instead of 1 then $\bar{w} \leftarrow \bar{w} - \bar{x}$ since $(\bar{w} - \bar{x})^T \bar{x} \leq \bar{w}^T \bar{x}$

  18. Alternative Approach
     • Let $y_n \in \{-1, 1\}\ \forall n$
     • Let $M = \{(\bar{x}_n, y_n)\}$ be the set of misclassified examples – i.e., $y_n \bar{w}^T \bar{x}_n < 0$
     • Find $\bar{w}$ that minimizes the misclassification error $E(\bar{w}) = -\sum_{(\bar{x}_n, y_n) \in M} y_n \bar{w}^T \bar{x}_n$
     • Algorithm: gradient descent $\bar{w} \leftarrow \bar{w} - \eta \nabla E$, where $\eta$ is the learning rate (step length)

  19. Sequential Gradient Descent
     • Gradient: $\nabla E = -\sum_{(\bar{x}_n, y_n) \in M} y_n \bar{x}_n$
     • Sequential gradient descent: adjust $\bar{w}$ based on one example $(\bar{x}, y)$ at a time: $\bar{w} \leftarrow \bar{w} + \eta y \bar{x}$
     • When $\eta = 1$, we recover the threshold perceptron learning algorithm
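A sketch of the sequential update with labels $y \in \{-1, +1\}$; the misclassification test and the defaults are assumptions consistent with the slide:

```python
import numpy as np

def perceptron_sgd(X, y, lr=1.0, epochs=100):
    """Sequential gradient descent on the perceptron criterion.
    With lr = 1 this recovers threshold perceptron learning."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x, t in zip(X, y):
            if t * (w @ x) <= 0:   # example is misclassified
                w += lr * t * x    # w <- w + eta * y * x
    return w
```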

  20. Threshold Perceptron Hypothesis Space
     • Hypothesis space $h_{\bar{w}}$: all binary classifications with parameters $\bar{w}$ s.t. $\bar{w}^T \bar{x} > 0 \rightarrow +1$ and $\bar{w}^T \bar{x} < 0 \rightarrow -1$
     • Since $\bar{w}^T \bar{x}$ is linear in $\bar{w}$, the perceptron is called a linear separator
     • Theorem: threshold perceptron learning converges iff the data is linearly separable

  21. Linear Separability • Examples: (plots of a linearly separable and a non-linearly separable data set)

  22. Sigmoid Perceptron • Represent “soft” linear separators • Same hypothesis space as logistic regression

  23. Sigmoid Perceptron Learning
     • Possible objectives:
       – Minimum squared error: $E(\bar{w}) = \frac{1}{2} \sum_n E_n(\bar{w})^2 = \frac{1}{2} \sum_n \left( y_n - \sigma(\bar{w}^T \bar{x}_n) \right)^2$
       – Maximum likelihood (same algorithm as for logistic regression)
       – Maximum a posteriori hypothesis
       – Bayesian learning

  24. Gradient
     • Gradient: $\frac{\partial E}{\partial \bar{w}} = \sum_n E_n \frac{\partial E_n}{\partial \bar{w}} = -\sum_n E_n\, \sigma'(\bar{w}^T \bar{x}_n)\, \bar{x}_n = -\sum_n E_n\, \sigma(\bar{w}^T \bar{x}_n) \left( 1 - \sigma(\bar{w}^T \bar{x}_n) \right) \bar{x}_n$
     • Recall that $\sigma' = \sigma(1 - \sigma)$

  25. Sequential Gradient Descent
     • Perceptron-Learning(examples, network):
       – Repeat
         • For each $(\bar{x}_n, y_n)$ in examples do:
           $E_n \leftarrow y_n - \sigma(\bar{w}^T \bar{x}_n)$
           $\bar{w} \leftarrow \bar{w} + \eta E_n\, \sigma(\bar{w}^T \bar{x}_n) \left( 1 - \sigma(\bar{w}^T \bar{x}_n) \right) \bar{x}_n$
       – Until some stopping criterion is satisfied
       – Return learnt network
     • N.B. $\eta$ is a learning rate corresponding to the step size in gradient descent
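A sketch of this loop in NumPy; the learning rate, epoch count, and fixed-epochs stopping rule are illustrative choices, since the slide leaves the stopping criterion open:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def sigmoid_perceptron_learning(X, y, lr=0.1, epochs=1000):
    """Sequential gradient descent on the squared error of a sigmoid
    perceptron: E_n = y_n - sigma(w.x_n), then
    w += lr * E_n * sigma(w.x_n) * (1 - sigma(w.x_n)) * x_n,
    using sigma' = sigma * (1 - sigma)."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x, t in zip(X, y):
            s = sigmoid(w @ x)
            w += lr * (t - s) * s * (1.0 - s) * x
    return w
```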

  26. Multilayer Networks • Adding two sigmoid units with parallel but opposite “cliffs” produces a ridge

  27. Multilayer Networks • Adding two intersecting ridges (and thresholding) produces a bump
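A numerical sketch of the ridge-and-bump construction from the last two slides; the steepness and offset values are arbitrary illustrative choices:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Two steep sigmoids with parallel but opposite "cliffs" sum to a ridge
# of height ~1 over roughly [-1, 1]; steepness 10 and offsets +/-1 are
# my own choices.
def ridge(x1):
    return sigmoid(10 * (x1 + 1)) + sigmoid(-10 * (x1 - 1)) - 1

# Two intersecting ridges, thresholded (here via max with 0), give a
# localized bump near the origin.
def bump(x1, x2):
    return np.maximum(ridge(x1) + ridge(x2) - 1, 0)

print(bump(0.0, 0.0))   # ~1 at the center of the bump
print(bump(5.0, 5.0))   # ~0 far away from it
```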

  28. Multilayer Networks • By tiling bumps of various heights together, we can approximate any function • Training algorithm: – Back-propagation – Essentially sequential gradient descent performed by propagating errors backward into the network – Derivation next class

  29. Neural Net Applications • Neural nets can approximate any function, hence millions of applications – Speech recognition – Word embeddings – Machine translation – Vision-based object recognition – Vision-based autonomous driving – Etc.

