Anartificialneuron Artificialneuralnetworks y = f ( S ) x 0 =+1 - PDF document

✂ ✂ An�artificial�neuron Artificial�neural�networks y = f ( S ) x 0 =�+1 • Background n n = �✁� − θ = w 0 =�– θ S w x w x x 1 i i i i • Artificial�neurons,�what�they�can�and�cannot�do = = w 1 i 1 i 0 • The�multilayer�perceptron�(MLP) w 2 Σ f ( S )�=�any non-linear,� x 2 f y saturating�function,�e.g.�a� • Three�forms�of�learning step�function�or�a�sigmoid: w n x n • The�back�propagation�algorithm 1 = 1 • Radial�basis�function�networks f ( S ) + − S e • Competitive�learning�(and�relatives) 1 2 A�single�neuron�as�a�classifier The�XOR�problem The�neuron�can�be�used�as�a�classifier Not�linearly�separable�– must�combine�two�linear�discriminants. y<0.5� class�0 y>0.5� class�1 x 2 Linear�discriminant�=�a�hyper plane θ w 1 x 2 = − + 1 x x 2 1 w w 2 2 2D�example: Only�linearly�separable� A�line classification�problems� 0 x 1 0 1 can�be�solved. AND Two�sigmoids�implement�fuzzy� NOR x 1 AND�and�NOR 3 4 The�multilayer�perceptron Artificial�neural�networks�... • store�information�in�the�weights,�not�in�the�nodes • are�trained,�by�adjusting�the�weights,�not� programmed Inputs Outputs • can�generalize�to�previously�unseen�data • are�adaptive • are�fast�computational�devices,�well�suited�for� Linear�(func.�approx.)�or Sigmoidal�(classification) parallel�simulation�and/or�hardware�implementation Can�implement�any�function,�given�a�sufficiently�rich� • are�fault�tolerant internal�structure�(number�of�nodes�and�layers) 5 6 1

✁✂ � ✁ ✆ ✄ ☎ Why�neural�networks? Application�areas (statistical�methods�are�always�at�least�as�good,�right?) Finance Industry • Neural�networks�are statistical�methods • Forecasting • Adaptive�control • Model�independence • Fraud�detection • Signal�analysis • Adaptivity/Flexibility Medicine • Data�mining • Image�analysis • Concurrency Consumer market • Economical�reasons�(rapid�prototyping) • Household�equipment • Character�recognition • Speech�recognition 7 8 Three�forms�of�learning Back�propagation Output�( y ) Unsupervised Supervised Reinforcement Input Desired�output�( d ) Environment Target� Input Error�function function Reward State Action Agent Learning� The�contribution�to�the�error� E – ∂ system E from�a�particular�weight� w ji is� ∂ w ji Error Learning� Action� The�weight�should�be�moved�in� Error�function�and�transfer�function� Suggested� system actions selector proportion�to�that�contribution,�but�in�the� must�both�be�differentiable. other�direction: ∂ E ∆ = − η w ∂ ji w ji 9 10 Back�propagation�update�rule Training�procedure�(1) Network�is�initialised�with�small�random�weights w ji i j k Split�data�in�two�– a� training�set and�a� test�set Assumptions ∂ E ∆ = − η = ηδ w x Error�is�squared�error: ji ∂ j i w ji n The�training�set�is�used�for�training� 1 The�test�set�is�used�to�test�for� = − − 2 ( ) derivative�of�error E d y j j 2 and�is�passed�through�many�times. generalization�(to�see�how� = j 1 − − well�the�net�does�on� y ( 1 y )( d y ) If�node� j is�an�output�node Transfer�(activation)� j j j j δ = Update�weights�after�each� previously�unseen�data). − δ function�is�sigmoid: y ( 1 y ) w Otherwise j j j kj k presentation�( pattern�learning ) k 1 = = or This�is�the�result�the�counts! y f ( S ) − j j + S sum�over�all�nodes�in�the� 1 e j Accumulate�weight�changes�( ∆ w )� derivative�of�sigmoid ’next’�layer�(closer�to�the� until�the�end�of�the�training�set�is� outputs) reached�( epoch or� batch�learning ) 11 12 2

� � � � � Overtraining Network�size E Typical�error�curves • Overtraining�is�more�likely�to�occur�… if�we�train�on�too�little�data Test�or�validation�set�error if�the�network�has�too�many�hidden�nodes if�we�train�for�too�long Training�set�error • The�network�should�be� slightly larger�than�the�size� necessary�to�represent�the�target�function Time�(epochs) • Unfortunately,�the�target�function�is�unknown�... Overtraining • Need�much�more�training�data�than�the�number�of� weights! Cross�validation:�Use�a�third�set,�a� validation�set ,�to�decide�when�to�stop�(find� the�minimum�for�this�set,�and�retrain�for�that�number�of�epochs) 13 14 Training�procedure�(2) Practical�considerations 1. Start�with�a�small�network,�train,�increase�the�size,� • What�happens�if�the�mapping�represented�by�the�data�is�not�a� train�again,�etc.,�until�the�error�on�the� training�set can� function?�For�example,�what�if�the�same�input�does�not�always� be�reduced�to�acceptable�levels. lead�to�the�same�output? • In�what�order�should�data�be�presented?�Sequentially?�At� 2. If�an�acceptable�error�level�was�found,�increase�the� random? size�by�a�few�percent�and�retrain�again,�this�time�using� • How�should�data�be�represented?�Compact?�Distributed? the�cross-validation�procedure�to�decide�when�to�stop.� Publish�the�result�on�the�independent�test�set. • What�can�be�done�about�missing�data? 3. If�the�network�failed�to�reduce�the�error�on�the�training� • Trick�of�the�trade:�Monotonic�functions�are�easier�to�learn�than non-monotonic�functions!�(at�least�for�the�MLP) set,�despite�a�large�number�of�nodes�and�attempts,� something�is�likely�to�be�wrong�with�the�data. 15 16 Radial�basis�functions�(RBF) Geometric�interpretation • Layered�structure,�like�the�MLP,�with�one�hidden�layer • The�input�space�is�covered�with�overlapping�Gaussians. • Output�nodes�are�conventional • Each�hidden�node�… measures�the�distance�between�its�weight�vector�and�the� input�vector�(instead�of�a�weighted�sum) • In�classification,�the�discriminants�become�hyper�spheres� (circles�in�2D). feeds�that�through�a�Gaussian�(instead�of�a�sigmoid) Inputs Outputs 17 18 3

Anartificialneuron Artificialneuralnetworks y = f ( S ) x 0 =+1 - PDF document

Anartificialneuron Artificialneuralnetworks y = f ( S ) x 0 =+1 Background n n = = w 0 = S w x w x x 1 i i i i

Outline Why model neural networks? Modeling Neural Networks A brief look at the neuron.

How Neural Networks (NN) Biological Neuron: A . . . Can (Hopefully) Learn Artificial Neural . .

CHAPTER I CHAPTER I From Biological From Biological to Artificial Neuron Model to Artificial

Learning Neural Networks Learning Neural Networks Neural Networks can represent complex Neural

Neural Circuits Underlie Brain Function interneuron inter- neuron pyramidal neurons Neural

STEP 5. Slides of a neuron, nerve, and spinal cord #1 Neuron slide This step can wait until

Python + NEURON Interpreter HOC Section Neuron specific syntax Range Variable Mechanism

Neural Networks and Handwriting Recognition Background Neural Networks Neural Network Steven

NEURAL NETWORKS NEURAL NETWORKS THE IDEA BEHIND ARTIFICIAL NEURONS Initially a simplified

Neural Networks Neural networks arise from attempts to model Neural Networks human/animal

Introduction to Artificial Intelligence Neural Networks - Deep Learning for NLP Janyl Jumadinova

Modeling Neural Networks Paul Nuytten CPSC 607 Outline Why model neural networks? A

CS 472 - Perceptron 1 Basic Neuron CS 472 - Perceptron 2 Expanded Neuron CS 472 - Perceptron

Sequential Data with Neural Networks Recurrent Neural Networks Sequential input / output Greg

Introduction to Neural Networks Jakob Verbeek 2017-2018 Biological motivation Neuron is basic

CHAPTER I CHAPTER I From Biological From Biological to Artificial Neuron Model to Artificial

Logistic Regression INFO-4604, Applied Machine Learning University of Colorado Boulder September

Feedforward neural nets CSE 250B Outline 1 Architecture 2 Expressivity 3 Learning The

Lecture 12: Computational Graph Backpropagation Aykut Erdem March 2016 Hacettepe

NLP Programming Tutorial 8 - Recurrent Neural Nets Graham Neubig Nara Institute of Science and

Deep Networks Andrea Passerini passerini@disi.unitn.it Machine Learning Deep Networks Need for

Gradients of Deep Networks Chris Cremer March 29 2017 Neural Net $ %

CS6220: DATA MINING TECHNIQUES Image Data: Classification via Neural Networks Instructor: Yizhou

Neural Net Backpropagation 3/20/17 Recall: Limitations of Perceptrons vs. AND and OR are