

slide-1
SLIDE 1

Lecture 11

Supervised Learning Artificial Neural Networks

Marco Chiarandini

Department of Mathematics & Computer Science University of Southern Denmark

Slides by Stuart Russell and Peter Norvig

slide-2
SLIDE 2

Neural Networks Other Methods and Issues

Course Overview

✔ Introduction
  ✔ Artificial Intelligence
  ✔ Intelligent Agents

✔ Search
  ✔ Uninformed Search
  ✔ Heuristic Search

✔ Uncertain knowledge and Reasoning
  ✔ Probability and Bayesian approach
  ✔ Bayesian Networks
  ✔ Hidden Markov Chains
  ✔ Kalman Filters

Learning
  Supervised: Decision Trees, Neural Networks, Learning Bayesian Networks
  Unsupervised: EM Algorithm
  Reinforcement Learning

Games and Adversarial Search
  Minimax search and Alpha-beta pruning
  Multiagent search

Knowledge representation and Reasoning
  Propositional logic
  First order logic
  Inference
  Planning

2

slide-3
SLIDE 3

Neural Networks Other Methods and Issues

Outline

  • 1. Neural Networks

Feedforward Networks

Single-layer perceptrons
Multi-layer perceptrons

  • 2. Other Methods and Issues

3

slide-4
SLIDE 4

Neural Networks Other Methods and Issues

A neuron in a living biological system

[Figure: a biological neuron, with labels: cell body (soma), nucleus, dendrites, axon, axonal arborization, synapses, and an axon from another cell]

Signals are noisy “spike trains” of electrical potential

4

slide-5
SLIDE 5

Neural Networks Other Methods and Issues

In the brain: > 20 types of neurons with 10^14 synapses (compare with the world population of 7 × 10^9). Additionally, the brain is parallel and reorganizing, while computers are serial and static. The brain is fault tolerant: neurons can be destroyed.

5

slide-6
SLIDE 6

Neural Networks Other Methods and Issues

Observations of neuroscience

Neuroscientists: view brains as a web of clues to the biological mechanisms of cognition.
Engineers: the brain is an example solution to the problem of cognitive computing.

6

slide-7
SLIDE 7

Neural Networks Other Methods and Issues

Applications

  • supervised learning: regression and classification
  • associative memory
  • optimization
  • grammatical induction (aka grammatical inference), e.g. in natural language processing
  • noise filtering
  • simulation of biological brains

7

slide-8
SLIDE 8

Neural Networks Other Methods and Issues

Artificial Neural Networks

“The neural network” does not exist. There are different paradigms for neural networks, how they are trained and where they are used.

Artificial Neuron

Each input is multiplied by a weighting factor. Output is 1 if sum of weighted inputs exceeds the threshold value; 0 otherwise.

Network is programmed by adjusting weights using feedback from examples.

8

slide-9
SLIDE 9

Neural Networks Other Methods and Issues

McCulloch–Pitts “unit” (1943)

Output is a function of weighted inputs:

  ai = g(ini) = g( Σj Wj,i aj )

[Figure: a unit with input links carrying activations aj, a bias weight W0,i on the fixed input a0 = −1, the input function ini = Σj Wj,i aj, the activation function g, and the output ai = g(ini) sent along the output links]

A gross oversimplification of real neurons, but its purpose is to develop understanding of what networks of simple units can do
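As a minimal sketch (not part of the slides; the function name and the built-in step activation are assumptions), such a unit can be written in R:

# McCulloch-Pitts unit: weighted sum of inputs passed through a step activation.
# w holds the bias weight W0 first, then the input weights Wj; a0 = -1 is the fixed bias input.
mcp.unit <- function(w, a) {
  in.i <- sum(w * c(-1, a))   # input function: in_i = sum_j W_j,i a_j (with a0 = -1)
  as.numeric(in.i > 0)        # step activation: 1 if the weighted sum exceeds 0, else 0
}

The logical-gate weight settings shown on a later slide can be checked directly with this function.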

9

slide-10
SLIDE 10

Neural Networks Other Methods and Issues

Activation functions

Non linear activation functions

[Figure: (a) a step (threshold) activation function and (b) a sigmoid activation function, both plotted as g(ini) against ini]

(a) is a step function or threshold function (mostly used in theoretical studies); (b) is a continuous activation function, e.g., the sigmoid function 1/(1 + e^(−x)) (mostly used in practical applications). Changing the bias weight W0,i moves the threshold location.
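For concreteness, the two activation functions can be written in R (a small illustrative sketch, not from the slides; the function names are assumptions):

# (a) step / threshold activation
step.g <- function(x) as.numeric(x > 0)
# (b) sigmoid (logistic) activation, 1 / (1 + exp(-x))
sigmoid.g <- function(x) 1 / (1 + exp(-x))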

10

slide-11
SLIDE 11

Neural Networks Other Methods and Issues

Implementing logical functions

AND: W0 = 1.5, W1 = 1, W2 = 1

OR: W0 = 0.5, W1 = 1, W2 = 1

NOT: W0 = −0.5, W1 = −1

McCulloch and Pitts: every (basic) Boolean function can be implemented (if necessary by connecting a large number of units in networks, possibly recurrent, of arbitrary depth)
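Using the mcp.unit sketch from the McCulloch–Pitts slide, the weight settings above can be verified directly (illustrative only):

mcp.unit(c(1.5, 1, 1), c(1, 1))   # AND: 1 AND 1 -> 1
mcp.unit(c(1.5, 1, 1), c(1, 0))   # AND: 1 AND 0 -> 0
mcp.unit(c(0.5, 1, 1), c(0, 1))   # OR:  0 OR 1  -> 1
mcp.unit(c(-0.5, -1), c(1))       # NOT: NOT 1   -> 0
mcp.unit(c(-0.5, -1), c(0))       # NOT: NOT 0   -> 1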

11

slide-12
SLIDE 12

Neural Networks Other Methods and Issues

Network structures

Architecture: definition of number of nodes, interconnection structures, and activation functions g, but not weights.

Feed-forward networks: no cycles in the connection graph
  single-layer perceptrons (no hidden layers)
  multi-layer perceptrons (one or more hidden layers)
Feed-forward networks implement functions, have no internal state.

Recurrent networks:
  – Hopfield networks have symmetric weights (Wi,j = Wj,i), g(x) = sign(x), ai ∈ {1, 0}; associative memory
  – recurrent neural nets have directed cycles with delays ⇒ have internal state (like flip-flops), can oscillate etc.

13

slide-13
SLIDE 13

Neural Networks Other Methods and Issues

Use

Neural Networks are used in classification and regression; a small sketch after the list below shows how classes are read off the outputs. Boolean classification:

  • value over 0.5 one class
  • value below 0.5 other class

k-way classification

  • divide single output into k portions
  • k separate output units

continuous output

  • identity activation function in output unit
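A small illustrative sketch in R (assumed names, not from the slides) of how classes are read off the network outputs:

# Boolean classification from one output unit: threshold the value at 0.5
bool.class <- function(out) ifelse(out > 0.5, "one class", "other class")

# k-way classification from k output units: pick the unit with the largest output
kway.class <- function(out) which.max(out)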

14

slide-14
SLIDE 14

Neural Networks Other Methods and Issues

Single-layer NN (perceptrons)

[Figure: a single-layer perceptron (input units connected through weights Wj,i directly to output units) and a plot of the perceptron output over inputs x1 and x2, showing a soft threshold surface]

Output units all operate separately: no shared weights. Adjusting weights moves the location, orientation, and steepness of the cliff.

15

slide-15
SLIDE 15

Neural Networks Other Methods and Issues

Expressiveness of perceptrons

Consider a perceptron with g = step function (Rosenblatt, 1957, 1960). The output is 1 when

  Σj Wj xj > 0,  or  W · x > 0

Hence, it represents a linear separator in input space:
  • a hyperplane in multidimensional space
  • a line in 2 dimensions

Minsky & Papert (1969) pricked the neural network balloon

16

slide-16
SLIDE 16

Neural Networks Other Methods and Issues

Perceptron learning

Learn by adjusting weights to reduce error on the training set. The squared error for an example with input x and true output y is

  E = (1/2) Err² ≡ (1/2)(y − hW(x))²

Find local optima for the minimization of the function E(W) in the vector of variables W by gradient methods. Note that the function E depends on the constant values x that are the inputs to the perceptron. The function E depends on h, which is non-convex; hence the optimization problem cannot be solved just by solving ∇E(W) = 0.

17

slide-17
SLIDE 17

Neural Networks Other Methods and Issues

Digression: Gradient methods

Gradient methods are iterative approaches:
  find a descent direction with respect to the objective function E
  move W in that direction by a step size

The descent direction can be computed by various methods, such as gradient descent, the Newton-Raphson method and others. The step size can be computed either exactly or loosely by solving a line search problem.

Example: gradient descent

  1. Set iteration counter t = 0, and make an initial guess W0 for the minimum
  2. Repeat:
  3.   Compute a descent direction pt = ∇E(Wt)
  4.   Choose αt to minimize f(α) = E(Wt − α pt) over α ∈ R+
  5.   Update Wt+1 = Wt − αt pt, and t = t + 1
  6. Until ‖∇E(Wt)‖ < tolerance

The line search in step 4 can be solved ’loosely’ by taking a fixed, small enough value α > 0
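As a small illustrative sketch (the names and the quadratic example objective are assumptions, not from the slides), gradient descent with a fixed step size α can be written in R as:

# Gradient descent with a fixed step size alpha (illustrative sketch).
grad.descent <- function(grad, w0, alpha = 0.1, tol = 1e-6, max.iter = 1000) {
  w <- w0
  for (t in 1:max.iter) {
    p <- grad(w)                      # descent direction: gradient of E at w
    if (sqrt(sum(p^2)) < tol) break   # stop when the gradient norm is below tolerance
    w <- w - alpha * p                # fixed step size instead of an exact line search
  }
  w
}

# Example: minimize E(w) = (w1 - 1)^2 + (w2 + 2)^2; its gradient is 2 * (w - c(1, -2))
grad.descent(function(w) 2 * (w - c(1, -2)), w0 = c(0, 0))   # converges to about c(1, -2)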

18

slide-18
SLIDE 18

Neural Networks Other Methods and Issues

Perceptron learning

In the specific case of the perceptron, the descent direction is computed by the gradient:

  ∂E/∂Wj = Err · ∂Err/∂Wj = Err · ∂/∂Wj ( y − g(Σj=0..n Wj xj) ) = −Err · g′(in) · xj

and the weight update rule (perceptron learning rule) in step 5 becomes:

  Wj^(t+1) = Wj^(t) + α · Err · g′(in) · xj

For the threshold perceptron, g′(in) is undefined: the original perceptron learning rule (Rosenblatt, 1957) simply omits g′(in).

19

slide-19
SLIDE 19

Neural Networks Other Methods and Issues

Perceptron learning contd.

function Perceptron-Learning(examples, network) returns perceptron weights
   inputs: examples, a set of examples, each with input x = x1, x2, . . . , xn and output y
           network, a perceptron with weights Wj, j = 0, . . . , n, and activation function g
   repeat
      for each e in examples do
         in ← Σj=0..n Wj xj[e]
         Err ← y[e] − g(in)
         Wj ← Wj + α · Err · g′(in) · xj[e]
      end
   until all examples correctly predicted or stopping criterion is reached
   return network

Perceptron learning rule converges to a consistent function for any linearly separable data set

20

slide-20
SLIDE 20

Neural Networks Other Methods and Issues

Numerical Example

The (Fisher’s or Anderson’s) iris data set gives the measurements in centimeters of the variables petal length and width, respectively, for 50 flowers from each of 2 species of iris. The species are “Iris setosa”, and “versicolor”.

[Figure: “Petal Dimensions in Iris Blossoms”, a scatter plot of length vs. width with Setosa (S) and Versicolor (V) blossoms]

> head(iris.data)
   Sepal.Length Sepal.Width    Species id
6           5.4         3.9     setosa -1
4           4.6         3.1     setosa -1
84          6.0         2.7 versicolor  1
31          4.8         3.1     setosa -1
77          6.8         2.8 versicolor  1
15          5.8         4.0     setosa -1

21

slide-21
SLIDE 21

> sigma <- function(w, point) {
+     x <- c(point, 1)
+     sign(w %*% x)
+ }
> w.0 <- c(runif(1), runif(1), runif(1))
> w.t <- w.0
> for (j in 1:1000) {
+     i <- (j - 1) %% 50 + 1
+     diff <- iris.data[i, 4] - sigma(w.t, c(iris.data[i, 1], iris.data[i, 2]))
+     w.t <- w.t + 0.2 * diff * c(iris.data[i, 1], iris.data[i, 2], 1)
+ }

[Figure: “Petal Dimensions in Iris Blossoms”, the same scatter plot of Setosa (S) and Versicolor (V) blossoms with the linear separator learned by the perceptron]

slide-22
SLIDE 22

Neural Networks Other Methods and Issues

Multilayer Feed-forward

[Figure: a feed-forward network with input units 1 and 2, hidden units 3 and 4, output unit 5, and weights W1,3, W1,4, W2,3, W2,4, W3,5, W4,5]

Feed-forward network = a parametrized family of nonlinear functions:

  a5 = g(W3,5 · a3 + W4,5 · a4)
     = g(W3,5 · g(W1,3 · a1 + W2,3 · a2) + W4,5 · g(W1,4 · a1 + W2,4 · a2))

Adjusting weights changes the function: do learning this way!
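A minimal sketch of this forward computation in R (illustrative; the sigmoid activation and the weight values are assumptions, and bias weights are omitted to match the formula above):

sigmoid <- function(x) 1 / (1 + exp(-x))

# Forward pass for the network above: inputs 1, 2; hidden units 3, 4; output unit 5.
forward <- function(a1, a2, W, g = sigmoid) {
  a3 <- g(W["1,3"] * a1 + W["2,3"] * a2)   # hidden unit 3
  a4 <- g(W["1,4"] * a1 + W["2,4"] * a2)   # hidden unit 4
  g(W["3,5"] * a3 + W["4,5"] * a4)         # output unit 5: a5
}

W <- c("1,3" = 0.5, "1,4" = -0.3, "2,3" = 0.8, "2,4" = 0.1, "3,5" = 1.2, "4,5" = -0.7)
forward(1, 0, W)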

24

slide-23
SLIDE 23

Neural Networks Other Methods and Issues

Neural Network with two layers

25

slide-24
SLIDE 24

Neural Networks Other Methods and Issues

Expressiveness of MLPs

All continuous functions can be represented with 2 layers, all functions with 3 layers

[Figures: surface plots of hW(x1, x2) over x1 and x2, showing a ridge and a bump produced by combining threshold units]

Combine two opposite-facing threshold functions to make a ridge.
Combine two perpendicular ridges to make a bump.
Add bumps of various sizes and locations to fit any surface.
Proof requires exponentially many hidden units.

27

slide-25
SLIDE 25

Neural Networks Other Methods and Issues

Backpropagation Algorithm

Supervised learning method to train multilayer feed-forward NNs with differentiable transfer functions.
Adjust weights along the negative of the gradient of the performance function.
Forward-backward pass. Sequential or batch mode.
Convergence time can vary exponentially with the number of inputs.
Avoid local minima by simulated annealing and other metaheuristics.

28

slide-26
SLIDE 26

Neural Networks Other Methods and Issues

Multilayer perceptrons

Layers are usually fully connected; numbers of hidden units typically chosen by hand

[Figure: a multilayer perceptron with input units (activations ak), hidden units (aj), output units (ai), and weights Wk,j and Wj,i between successive layers]

29

slide-27
SLIDE 27

Neural Networks Other Methods and Issues

Back-propagation learning

Output layer: same as for the single-layer perceptron,

  Wj,i ← Wj,i + α × aj × ∆i    where ∆i = Erri × g′(ini)

Note: the general case has multiple output units, hence Err = (y − hW(x)) is a vector of errors.

Hidden layer: back-propagate the error from the output layer:

  ∆j = g′(inj) Σi Wj,i ∆i    (sum over the multiple output units)

Update rule for weights in the hidden layer:

  Wk,j ← Wk,j + α × ak × ∆j

(Most neuroscientists deny that back-propagation occurs in the brain.)
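For illustration only (not the lecture's code), one gradient step implementing these two update rules for a single-hidden-layer network with sigmoid units could look like this in R; the matrix layout and names are assumptions, and bias units are omitted:

sigmoid <- function(x) 1 / (1 + exp(-x))

# One back-propagation update. x: inputs (a_k), y: targets (y_i),
# W.kj: input-to-hidden weights (hidden x inputs), W.ji: hidden-to-output weights (outputs x hidden).
backprop.step <- function(x, y, W.kj, W.ji, alpha = 0.1) {
  a.j <- sigmoid(W.kj %*% x)                            # hidden activations
  a.i <- sigmoid(W.ji %*% a.j)                          # output activations
  delta.i <- (y - a.i) * a.i * (1 - a.i)                # Delta_i = Err_i * g'(in_i)
  delta.j <- a.j * (1 - a.j) * (t(W.ji) %*% delta.i)    # Delta_j = g'(in_j) * sum_i W_j,i Delta_i
  W.ji <- W.ji + alpha * delta.i %*% t(a.j)             # W_j,i <- W_j,i + alpha * a_j * Delta_i
  W.kj <- W.kj + alpha * delta.j %*% t(x)               # W_k,j <- W_k,j + alpha * a_k * Delta_j
  list(W.kj = W.kj, W.ji = W.ji)                        # return the updated weights
}

# Example with 2 inputs, 2 hidden units, 1 output unit and small random weights
set.seed(1)
backprop.step(x = c(1, 0), y = 1,
              W.kj = matrix(runif(4, -0.5, 0.5), 2, 2),
              W.ji = matrix(runif(2, -0.5, 0.5), 1, 2))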

30

slide-28
SLIDE 28

Neural Networks Other Methods and Issues

Back-propagation derivation

The squared error on a single example is defined as

  E = (1/2) Σi (yi − ai)²

where the sum is over the nodes in the output layer.

  ∂E/∂Wj,i = −(yi − ai) ∂ai/∂Wj,i
           = −(yi − ai) ∂g(ini)/∂Wj,i
           = −(yi − ai) g′(ini) ∂ini/∂Wj,i
           = −(yi − ai) g′(ini) ∂/∂Wj,i ( Σj Wj,i aj )
           = −(yi − ai) g′(ini) aj
           = −aj ∆i

31

slide-29
SLIDE 29

Neural Networks Other Methods and Issues

Back-propagation derivation contd.

For the hidden layer:

  ∂E/∂Wk,j = −Σi (yi − ai) ∂ai/∂Wk,j
           = −Σi (yi − ai) ∂g(ini)/∂Wk,j
           = −Σi (yi − ai) g′(ini) ∂ini/∂Wk,j
           = −Σi ∆i ∂/∂Wk,j ( Σj Wj,i aj )
           = −Σi ∆i Wj,i ∂aj/∂Wk,j
           = −Σi ∆i Wj,i ∂g(inj)/∂Wk,j
           = −Σi ∆i Wj,i g′(inj) ∂inj/∂Wk,j
           = −Σi ∆i Wj,i g′(inj) ∂/∂Wk,j ( Σk Wk,j ak )
           = −Σi ∆i Wj,i g′(inj) ak
           = −ak ∆j

32

slide-30
SLIDE 30

Neural Networks Other Methods and Issues

Numerical Example

The (Fisher’s or Anderson’s) iris data set gives the measurements in centimeters of the variables petal length and width, respectively, for 50 flowers from each of 3 species of iris. The species are “Iris setosa”, “versicolor”, and “virginica”.

[Figure: pairwise scatter plots of Petal.Length, Petal.Width, and Sepal.Length for the setosa, versicolor, and virginica blossoms]

33

slide-31
SLIDE 31

Neural Networks Other Methods and Issues

Numerical Example

> library(nnet)
> samp <- c(sample(1:50, 25), sample(51:100, 25), sample(101:150, 25))
> Target <- class.ind(iris$Species)
> ir.nn <- nnet(Target ~ Sepal.Length * Petal.Length * Petal.Width, data = iris, subset = samp,
+     size = 2, rang = 0.1, decay = 5e-04, maxit = 200, trace = FALSE)
> test.cl <- function(true, pred) {
+     true <- max.col(true)
+     cres <- max.col(pred)
+     table(true, cres)
+ }
> test.cl(Target[-samp, ], predict(ir.nn, iris[-samp, c(1, 3, 4)]))
    cres
true  1  2  3
   1 25  0  0
   2  0 22  3
   3  0  2 23

34

slide-32
SLIDE 32

Neural Networks Other Methods and Issues

Learning structures

Beside weights, also the structure can be learned:
Optimal brain damage: iteratively remove single edges or units if performance does not worsen after the weights are re-learned.
Tiling: iteratively add units and links and re-learn the weights.

35

slide-33
SLIDE 33

Neural Networks Other Methods and Issues

Handwritten digit recognition

400–300–10 unit MLP = 1.6% error
LeNet: 768–192–30–10 unit MLP = 0.9% error (http://yann.lecun.com/exdb/lenet/)
Current best (kernel machines, vision algorithms) ≈ 0.6% error
Humans are at 0.2% – 2.5% error

36

slide-34
SLIDE 34

Neural Networks Other Methods and Issues

Another Practical Example

37

slide-35
SLIDE 35

Neural Networks Other Methods and Issues

Directions of research in ANN

Representational capability, assuming an unlimited number of neurons (no training)
Numerical analysis or approximation theoretic: how many hidden units are necessary to achieve a certain approximation error? (no training) Results exist for a single hidden layer and for multiple hidden layers.
Sample complexity: how many samples are needed to characterize a certain unknown mapping?
Efficient learning: backpropagation has the curse of dimensionality problem.

38

slide-36
SLIDE 36

Neural Networks Other Methods and Issues

Approximation properties

NNs with 2 hidden layers and arbitrarily many nodes can approximate any real-valued function up to any desired accuracy, using continuous activation functions.
E.g., the required number of hidden units can grow exponentially with the number of inputs: 2^n/n hidden units are needed to encode all Boolean functions of n inputs.
However, the proofs are not constructive.
More interest in efficiency issues: NNs with small size and depth.
Size-depth trade-off: more layers are more costly to simulate.

39

slide-37
SLIDE 37

Neural Networks Other Methods and Issues

Outline

  • 1. Neural Networks

Feedforward Networks

Single-layer perceptrons
Multi-layer perceptrons

  • 2. Other Methods and Issues

40

slide-38
SLIDE 38

Neural Networks Other Methods and Issues

Training and Assessment

Use different data for different tasks:
  Training and test data: holdout cross validation
  If little data: k-fold cross validation
Avoid peeking:
  Weights are learned on the training data.
  Parameters such as the learning rate α and the net topology are compared on validation data.
  Final assessment on test data.
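A small sketch in R of how k-fold splits can be produced (the function name and layout are assumptions, not from the slides):

# Assign each of n examples to one of k folds at random; return train/test index pairs.
k.fold.splits <- function(n, k = 10) {
  folds <- sample(rep(1:k, length.out = n))
  lapply(1:k, function(i) list(train = which(folds != i), test = which(folds == i)))
}

# Example: 5-fold splits for the 150 iris examples
splits <- k.fold.splits(150, k = 5)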

41

slide-39
SLIDE 39

Neural Networks Other Methods and Issues

Ensemble Methods

Use majority rule to predict among the K hypotheses learned. If the hypotheses are independent, this yields a considerable reduction of misclassification.
Boosting: adaptively weight the examples.
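As a tiny illustrative sketch (assumed names), a majority vote over the predictions of K hypotheses, stored one per column, can be taken in R as:

# preds: a matrix with one row per example and one column per hypothesis.
majority.vote <- function(preds) {
  apply(preds, 1, function(p) names(which.max(table(p))))
}

# Example: 3 hypotheses voting on 2 examples; returns "b" for the first and "a" for the second
majority.vote(matrix(c("a", "a", "b", "a", "b", "b"), nrow = 2))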

42

slide-40
SLIDE 40

Neural Networks Other Methods and Issues

Learning Theory

Probably approximately correct (PAC) learning
Vapnik–Chervonenkis (VC) dimensions provide information-theoretic bounds on sample complexities in continuous function classes

43

slide-41
SLIDE 41

Neural Networks Other Methods and Issues

Summary

Supervised learning
  Decision trees
  Linear models
  Neural Networks

Perceptron learning rule: an algorithm for learning weights in single-layer networks.
Perceptrons: linear separators, insufficiently expressive.
Multi-layer networks are sufficiently expressive.
Many applications: speech, driving, handwriting, fraud detection, etc.

k nearest neighbor, non-parametric regression

44