SLIDE 1

CS344: Introduction to Artificial Intelligence (associated lab: CS386)

Pushpak Bhattacharyya

CSE Dept., IIT Bombay

Lecture 32: Sigmoid neuron; Feedforward N/W; Error Backpropagation

29th March, 2011

SLIDE 2

The Perceptron Model

Output = y

y = 1 if Σwixi ≥ θ
y = 0 otherwise

[Diagram: inputs x1, …, xn−1, xn with weights w1, …, wn−1, wn feeding a threshold unit, threshold = θ]
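As a minimal sketch in Python (function and variable names are mine, not from the slides), the decision rule above can be written as:

```python
# Perceptron decision rule: y = 1 if sum(w_i * x_i) >= theta, else 0.
def perceptron_output(w, x, theta):
    s = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if s >= theta else 0
```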

SLIDE 3

[Plot: the perceptron's step-function characteristic: output y vs. Σwixi, jumping from 0 to 1 at the threshold θ]

SLIDE 4

Perceptron Training Algorithm

1. Start with a random value of w, e.g., <0, 0, 0, …>

2. Test for w·xi > 0. If the test succeeds for i = 1, 2, …, n, then return w.

3. Modify w: wnext = wprev + xfail
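A hedged Python sketch of this loop (names are mine; it assumes the usual PTA convention in which the threshold is absorbed as an extra input and vectors of the 0-class are negated, so that every training vector must satisfy w·x > 0):

```python
def train_perceptron(vectors, max_iters=10000):
    """vectors: training inputs (threshold absorbed, 0-class negated)."""
    w = [0.0] * len(vectors[0])                     # step 1: start with w = <0,0,0,...>
    for _ in range(max_iters):
        x_fail = next((x for x in vectors
                       if sum(wi * xi for wi, xi in zip(w, x)) <= 0), None)
        if x_fail is None:                          # step 2: test succeeds for all i
            return w
        w = [wi + xi for wi, xi in zip(w, x_fail)]  # step 3: w_next = w_prev + x_fail
    raise RuntimeError("did not converge (data may not be linearly separable)")
```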
SLIDE 5

Feedforward Network

SLIDE 6

Example - XOR

Calculation of XOR: x1 XOR x2 = x1x̄2 + x̄1x2

Output neuron (OR of the two hidden outputs x1x̄2 and x̄1x2): w1 = 1, w2 = 1, θ = 0.5

Calculation of x̄1x2 (inputs x1, x2): w1 = −1, w2 = 1.5, θ = 1

Truth-table constraints for x̄1x2 (output 1 only for (0,1)):
0 < θ, w2 ≥ θ, w1 < θ, w1 + w2 < θ

SLIDE 7

Example - XOR

[Diagram: the complete XOR network. Output neuron: w1 = 1, w2 = 1, θ = 0.5, taking the hidden outputs x1x̄2 and x̄1x2. Each hidden neuron (θ = 1) takes x1 and x2 with weights 1.5 and −1 for x1x̄2, and −1 and 1.5 for x̄1x2.]
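The hand-set weights above can be checked directly; a small Python sketch (network structure inferred from the slides, helper names mine):

```python
def step(s, theta):                        # perceptron threshold function
    return 1 if s >= theta else 0

def xor_net(x1, x2):
    h1 = step(1.5 * x1 - 1.0 * x2, 1.0)    # x1 AND (NOT x2)
    h2 = step(-1.0 * x1 + 1.5 * x2, 1.0)   # (NOT x1) AND x2
    return step(1.0 * h1 + 1.0 * h2, 0.5)  # OR of the hidden outputs

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, xor_net(x1, x2))     # prints the XOR truth table: 0, 1, 1, 0
```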

SLIDE 8

Can Linear Neurons Work?

[Diagram: inputs x1, x2 feed hidden neurons h1, h2, which feed the output; each neuron has linear I-O behavior y = mx + c]

h1 = m1(w1x1 + w2x2 + c1)
h2 = m2(w3x1 + w4x2 + c2)
Out = w5h1 + w6h2 + c3 = k1x1 + k2x2 + k3
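A quick numeric check of this reduction (the parameter values are arbitrary, chosen only for illustration):

```python
import math

# arbitrary linear I-O parameters y = m*x + c for the three neurons
m1, c1, m2, c2, c3 = 2.0, 0.3, -1.0, 0.5, 0.1
w1, w2, w3, w4, w5, w6 = 0.4, -0.2, 0.7, 0.9, 1.1, -0.6

def out(x1, x2):                           # the two-layer linear network
    h1 = m1 * (w1 * x1 + w2 * x2 + c1)
    h2 = m2 * (w3 * x1 + w4 * x2 + c2)
    return w5 * h1 + w6 * h2 + c3

# coefficients of the equivalent single linear neuron
k1 = w5 * m1 * w1 + w6 * m2 * w3
k2 = w5 * m1 * w2 + w6 * m2 * w4
k3 = w5 * m1 * c1 + w6 * m2 * c2 + c3

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1), (0.3, -2.0)]:
    assert math.isclose(out(x1, x2), k1 * x1 + k2 * x2 + k3)
```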

SLIDE 9

Note: The whole structure shown in the earlier slide is reducible to a single neuron with the behavior

Out = k1x1 + k2x2 + k3

Claim: A neuron with linear I-O behavior can't compute X-OR.

Proof: Consider all possible cases [assuming 0.1 and 0.9 as the lower and upper thresholds].

For (0,0), Zero class: m(w1·0 + w2·0 − θ) + c < 0.1  ⇒  c − mθ < 0.1

For (0,1), One class: m(w1·0 + w2·1 − θ) + c > 0.9  ⇒  mw2 − mθ + c > 0.9

SLIDE 10

For (1,0), One class: m(w1·1 + w2·0 − θ) + c > 0.9  ⇒  mw1 − mθ + c > 0.9

For (1,1), Zero class: m(w1·1 + w2·1 − θ) + c < 0.1  ⇒  mw1 + mw2 − mθ + c < 0.1

These equations are inconsistent: adding the two One-class inequalities gives mw1 + mw2 − 2mθ + 2c > 1.8, while adding the two Zero-class inequalities gives mw1 + mw2 − 2mθ + 2c < 0.2. Hence X-OR can't be computed.

Observations:

1. A linear neuron can't compute X-OR.

2. A multilayer FFN with linear neurons is collapsible to a single linear neuron, hence there is no additional power due to the hidden layer.

3. Non-linearity is essential for power.

SLIDE 11

Multilayer Perceptron

SLIDE 12

Gradient Descent Technique

Let E be the error at the output layer

E = (1/2) Σ(j=1 to p) Σ(i=1 to n) (tij − oij)²

ti = target output; oi = observed output

i is the index going over the n neurons in the outermost layer.

j is the index going over the p patterns (1 to p).

Ex: XOR: p = 4 and n = 1
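For instance, a sketch computing E for the XOR case (the targets are XOR's truth table; the observed outputs here are made-up illustrative numbers):

```python
t = [[0], [1], [1], [0]]          # targets t[j][i]: one output neuron (n = 1), p = 4 patterns
o = [[0.1], [0.8], [0.7], [0.2]]  # observed outputs o[j][i] (illustrative values)

E = 0.5 * sum((t[j][i] - o[j][i]) ** 2
              for j in range(4) for i in range(1))
print(E)                          # 0.5 * (0.01 + 0.04 + 0.09 + 0.04) = 0.09
```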

SLIDE 13

Weights in a FF NN

wmn is the weight of the connection from the nth neuron to the mth neuron.

[Diagram: neuron n feeding neuron m through weight wmn]

The E vs. W surface is a complex surface in the space defined by the weights wij.

−δE/δwmn gives the direction in which a movement of the operating point in the wmn coordinate space will result in the maximum decrease in error:

∆wmn ∝ −δE/δwmn

SLIDE 14

Sigmoid neurons

Gradient descent needs a derivative computation, which is not possible in the perceptron due to the discontinuous step function used! Sigmoid neurons, with easy-to-compute derivatives, are used instead.

Computing power comes from the non-linearity of the sigmoid function:

y → 1 as x → ∞
y → 0 as x → −∞

SLIDE 15

Derivative of Sigmoid function

y = 1 / (1 + e^(−x))

dy/dx = −(1 + e^(−x))^(−2) · (−e^(−x))
      = e^(−x) / (1 + e^(−x))²
      = [1 / (1 + e^(−x))] · [1 − 1 / (1 + e^(−x))]
      = y(1 − y)
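A numeric sanity check of this identity (my own sketch, using a central-difference approximation):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

for x in (-2.0, 0.0, 1.5):
    y = sigmoid(x)
    eps = 1e-6
    numeric = (sigmoid(x + eps) - sigmoid(x - eps)) / (2 * eps)
    print(x, numeric, y * (1 - y))   # the two derivative columns agree closely
```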

SLIDE 16

Training algorithm

Initialize weights to random values.

For input x = <xn, xn−1, …, x0>, modify weights as follows (target output = t, observed output = o):

E = (1/2)(t − o)²

∆wi ∝ −δE/δwi

Iterate until E < δ (threshold).

SLIDE 17

Calculation of ∆wi

δE/δwi = (δE/δnet) × (δnet/δwi), where net = Σ(i=1 to n) wixi

Since o is the sigmoid of net, δo/δnet = o(1 − o), so:

δE/δnet = δ/δnet [(1/2)(t − o)²] = −(t − o) · o(1 − o)
δnet/δwi = xi

∆wi = −η · δE/δwi   (η = learning constant, 0 ≤ η ≤ 1)
    = η (t − o) o(1 − o) xi
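Putting the pieces together, a hedged sketch of one weight update for a single sigmoid neuron (names are mine, not from the slides):

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def update_weights(w, x, t, eta=0.5):
    """One gradient-descent step: dw_i = eta * (t - o) * o * (1 - o) * x_i."""
    net = sum(wi * xi for wi, xi in zip(w, x))
    o = sigmoid(net)
    delta = eta * (t - o) * o * (1 - o)
    return [wi + delta * xi for wi, xi in zip(w, x)], o
```

Repeatedly calling update_weights over the training patterns, until E falls below the threshold δ, gives the training loop of the previous slide.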

SLIDE 18

Observations

Does the training technique support our intuition?

The larger the xi, the larger is ∆wi.

The error burden is borne by the weight values corresponding to large input values.