SLIDE 1

Introduction to Artificial Neural Networks (ANNs)

Keith L. Downing

The Norwegian University of Science and Technology (NTNU) Trondheim, Norway keithd@idi.ntnu.no

January 19, 2015

SLIDE 2

NETtalk (Sejnowski + Rosenberg, 1986)

[Figure: NETtalk architecture — a sliding window over the input text (e.g., "...text wind...") feeds letter units to a hidden "concepts" layer, which maps to phoneme outputs (including a silent phoneme, e.g., for a silent "c").]

DEC's DECtalk: several person-years of engineering → a reading machine. NETtalk: 10 hours of backprop training on a 1000-word text, T1000. 95% accuracy on T1000; 78% accuracy on novel text. Its improvement during training sounds like a child learning to read. The concept layer is key: 79 different (overlapping) clouds of neurons gradually form, each mapping to one of the 79 phonemes.

SLIDE 3

Sample ANN Applications: Forecasting

1. Train the ANN (typically using backprop) on historical data to learn the mapping $[X(t_{-k}), X(t_{-k+1}), \ldots, X(t_0)] \rightarrow [X(t_1), \ldots, X(t_{m-1}), X(t_m)]$ (see the sketch below).

2. Use it to predict future value(s) based on the past $k$ values.

Sample applications (Ungar, in Handbook of Brain Theory and NNs, 2003): car sales, airline passengers, currency exchange rates, electrical loads on regional power systems, flour prices, stock prices (warning: often tried, but few good, documented results).
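A minimal sketch of this sliding-window setup (the series, window size, and network parameters are made up for illustration; it uses scikit-learn's MLPRegressor, not anything from the lecture):

import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
t = np.arange(300)
series = np.sin(0.1 * t) + 0.1 * rng.standard_normal(300)  # toy signal

k = 10                                       # number of past values per input
X = np.array([series[i:i + k] for i in range(len(series) - k)])
y = series[k:]                               # target: the next value X(t_1)

net = MLPRegressor(hidden_layer_sizes=(16,), max_iter=5000, random_state=0)
net.fit(X[:250], y[:250])                    # train on the "historical" part
print("forecast:", net.predict(X[250:251])[0], "actual:", y[250])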

SLIDE 4

Brain-Computer Interfaces (BCI)

[Figure: BCI recording routes — scalp EEG or implanted neural ensembles provide the neural context from which an action is decoded.]

1. Ask the subject to think about an activity (e.g., moving a joystick left).

2. Register brain activity: EEG waves (non-invasive) or neural ensembles (invasive).

3. ANN training case = (brain readings, joystick motion).

Sample applications (Millan, in Handbook of Brain Theory and NNs, 2003): keyboards (3 keystrokes per minute), artificial (prosthetic) hands, wheelchairs, computer games.

SLIDE 5

Brains as Bio-Inspiration

"Watermelon" Grandmother "The truth? You can't handle the truth." "I got a 69 Chevy with a 396..." Texas

Distributed memory: a key to the brain's success, and a major difference between the brain and computers. Brain operations are slower than computer operations, but massively parallel. How can the brain inspire AI advances? What is the proper level of abstraction?

SLIDE 6

Signal Transmission in the Brain

[Figure: a neuron — dendrites carry synaptic potentials (SPs) toward the nucleus (soma); axons carry action potentials (APs) away from it.]

Action Potential (AP): a wave of voltage change along an axon. The nucleus (soma) generates an AP if the sum of its incoming synaptic potentials (SPs: similar, but weaker, voltage changes along the dendrites) is strong enough. Unlike neuroscientists, AI people rarely distinguish between APs and SPs; both are just signals.

SLIDE 7

Ion Channels

[Figure: ion channels in the membrane — Na+ and Ca++ flow inward during depolarization; K+ flows outward during repolarization.]

SLIDE 8

Depolarization and Repolarization

[Figure: membrane potential over time during an action potential — resting potential −65 mV, overshoot peaking near +40 mV, then an undershoot below rest.]

Sequence: Na+ gates open → Na+ influx (depolarization); K+ gates open → K+ efflux; Na+ gates close while the K+ efflux continues (repolarization); K+ gates close, leaving a brief undershoot before the return to the resting potential.

SLIDE 9

Transferring APs across a Synapse

[Figure: a synapse — an action potential (AP) arriving at the presynaptic terminal causes vesicles to release neurotransmitter (NT), which opens NT-gated ion channels on the postsynaptic terminal.]

Neurotransmitters: Excitatory — glutamate (acting via AMPA receptors), opening Na+ and Ca++ channels. Inhibitory — GABA, opening K+ channels.

SLIDE 10

Location, Location, Location... of Synapses

[Figure: synapses at varying distances from the soma along the dendrites (I1, I2, P1, P2).]

Distal and Proximal Synapses: synapses closer to the soma normally have a stronger effect.

SLIDE 11

Donald Hebb (1949)

Fire Together, Wire Together: "When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells, such that A's efficiency, as one of the cells firing B, is increased."

Hebb Rule: $\Delta w_{i,j} = \lambda o_i o_j$ (see the sketch below)

Instrumental in the binding of: pieces of an image; words of a song; multisensory input (e.g., words and images); sensory inputs and the proper motor outputs; simple movements of a complex action sequence.
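A minimal numpy sketch of the Hebb rule (the outputs and learning rate are made-up values, purely for illustration):

import numpy as np

lam = 0.1                            # learning rate (lambda)
o_pre = np.array([1.0, 0.0, 0.5])    # presynaptic outputs o_j (made up)
o_post = np.array([1.0, 0.2])        # postsynaptic outputs o_i (made up)

W = np.zeros((2, 3))                 # W[i, j] = weight from neuron j to neuron i
W += lam * np.outer(o_post, o_pre)   # Hebb rule: dw_ij = lambda * o_i * o_j
print(W)                             # co-active pairs strengthen the most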

SLIDE 12

Coincidence Detection and Synaptic Change

2 Key Synaptic Changes

1. The propensity to release neurotransmitter (and the amount released) at the pre-synaptic terminal.

2. The ease with which the post-synaptic terminal depolarizes in the presence of neurotransmitter.

Coincidences

1. Pre-synaptic: adenyl cyclase (AC) detects the simultaneous presence of Ca++ and serotonin.

2. Post-synaptic: NMDA receptors detect the co-occurrence of glutamate (a neurotransmitter) and depolarization.

SLIDE 13

Pre-synaptic Modification

[Figure: pre-synaptic modification — adenyl cyclase (AC) in the pre-synaptic terminal detects Ca++ together with serotonin (5HT, signalling a salient event) and converts ATP to cAMP, which activates PKA; glutamate release and post-synaptic depolarization act on the Mg++-blocked NMDA receptor.]

SLIDE 14

Post-synaptic Modification

[Figure: post-synaptic modification — in the polarized (relaxed) state, the net negative charge holds an Mg++ ion that blocks the NMDA receptor; in the depolarized (firing) state, the net positive charge expels the Mg++ block, so the glutamate-bound NMDA receptor admits Ca++.]

SLIDE 15

Neurochemical Basis of Hebbian Learning

Fire together: When the pre- and post-synaptic terminals of a synapse depolarize at about the same time, the NMDA channels on the post-synaptic side notice the coincidence and open, allowing Ca++ to flow into the post-synaptic terminal.

Wire together: Ca++ (via CaMKII and protein kinase C) promotes post- and pre-synaptic changes that enhance the efficiency of future AP transmission.

SLIDE 16

Hebbian Basis of Classical Conditioning

[Figure: classical-conditioning circuit — "hear bell" (CS) and "see food" (US) inputs converge through synapses S1 and S2 onto the "salivate" (R) output.]

Unconditioned Stimulus (US): sensory input normally associated with a response (R), e.g., the sight of food stimulates salivation.
Conditioned Stimulus (CS): sensory input having no previous correlation with a response, but which becomes associated with it, e.g., Pavlov's bell.

SLIDE 17

Long-Term Potentiation (LTP)

Early Phase: chemical changes to the pre- and post-synaptic terminals, due to AC and NMDA activity respectively, increase the probability (and efficiency) of AP transmission for minutes to hours after training.
Late Phase: structural changes to the link between the upstream and downstream neurons. This often involves increases in the number of axonal and dendritic connections between the two, and seems to be driven by chemical processes triggered by high concentrations of Ca++ in the post-synaptic soma.

SLIDE 18

Abstraction

Human Brains: $10^{11}$ neurons; $10^{14}$ connections (a.k.a. synapses) between them, many modifiable; complex physical and chemical activity to transmit ONE action potential (AP, a.k.a. signal) along ONE connection.

Artificial Neural Networks: $N = 10^1$ to $10^4$ nodes; at most $N^2$ connections; all physics and chemistry represented by a few parameters associated with nodes and arcs.

SLIDE 19

Structural Abstraction

[Figure: structural abstraction — biological neurons (somas linked by axons, dendrites, and synapses, with APs travelling through axonal and dendritic compartments) reduce to nodes connected by weighted arcs ($w$).]

SLIDE 20

Diverse ANN Topologies

[Figure: six example ANN topologies, labelled A through F.]

SLIDE 21

Functional Abstraction

[Figure: functional abstraction — the membrane circuit (lipid bilayer = capacitor $C_M$; ion channels = resistors, with potentials $E_K$, $E_{Na}$ and membrane potential $V_M$) reduces to a node $N_1$ performing Integrate, Activate, Learn, and Reset on weighted inputs $w_{12}$, $w_{13}$ from $N_2$ and $N_3$.]

SLIDE 22

Main Functional Components

[Figure: node $N_1$ receiving weighted inputs $w_{12}$, $w_{13}$ from $N_2$ and $N_3$, with the four operations Integrate, Activate, Reset, and Learn.]

Integrate: $net_i = \sum_{j=1}^{n} x_j w_{i,j}$; $V_i \leftarrow V_i + net_i$
Activate: $x_i = \frac{1}{1+e^{-V_i}}$
Reset: $V_i \leftarrow 0$
Learn: $\Delta w_{i,j} = \lambda x_i x_j$

(A code sketch of these four operations follows.)
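A minimal Python sketch of the four operations (class name, initial weights, and learning rate are invented for illustration):

import math

class Node:
    """One abstract ANN node with the four operations from this slide."""
    def __init__(self, n_inputs, lam=0.1):
        self.w = [0.1] * n_inputs    # incoming weights w_ij (made-up initial value)
        self.V = 0.0                 # internal potential V_i
        self.x = 0.0                 # output x_i
        self.lam = lam               # learning rate lambda

    def integrate(self, xs):         # V_i <- V_i + sum_j x_j * w_ij
        self.V += sum(xj * wj for xj, wj in zip(xs, self.w))

    def activate(self):              # x_i = 1 / (1 + exp(-V_i))
        self.x = 1.0 / (1.0 + math.exp(-self.V))
        return self.x

    def reset(self):                 # V_i <- 0
        self.V = 0.0

    def learn(self, xs):             # Hebbian: w_ij <- w_ij + lambda * x_i * x_j
        self.w = [wj + self.lam * self.x * xj for wj, xj in zip(self.w, xs)]

node = Node(2)
node.integrate([1.0, 0.5])
node.activate()
node.learn([1.0, 0.5])
node.reset()
print(node.x, node.w)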

SLIDE 23

Functional Options

[Figure: two integrate-activate-reset loops relating $V_i$ and $x_i$.]

Spiking neuron model: reset $V_i$ only when it exceeds a threshold.
Neurons without state: always reset $V_i$.
Neurons with state: never reset $V_i$, so that $V_i \leftarrow V_i + net_i$ accumulates over time.

SLIDE 24

Activation Functions: $x_i = f(V_i)$

[Figure: five activation functions plotted as $x_i$ against $V_i$ — Identity; Step (threshold $T$); Ramp (threshold $T$); Logistic; and Hyperbolic Tangent (tanh), which ranges from −1 to 1.]

(A code sketch of these functions follows.)
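A sketch of the five functions in numpy (the ramp convention and threshold values are one common choice, not necessarily the lecture's):

import numpy as np

def identity(V):     return V
def step(V, T=0.0):  return np.where(V > T, 1.0, 0.0)
def ramp(V, T=1.0):  return np.clip(V / T, 0.0, 1.0)  # linear from 0 up to T, then saturates
def logistic(V):     return 1.0 / (1.0 + np.exp(-V))
def tanh_act(V):     return np.tanh(V)                # ranges over (-1, 1)

V = np.linspace(-3.0, 3.0, 7)
for f in (identity, step, ramp, logistic, tanh_act):
    print(f.__name__.ljust(9), np.round(f(V), 2))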

SLIDE 25

Diverse Model Semantics

What does $x_i$ represent?

1. The occurrence of an action-potential spike,

2. The instantaneous membrane potential of a neuron,

3. The firing rate of a neuron (APs/sec),

4. The average firing rate of a neuron over a time window,

5. The difference between a neuron's current firing rate and its average firing rate.

SLIDE 26

Circuit Models of Neurons

Lipid bilayer acts as a capacitor; ion channels act as resistors.

[Figure: equivalent circuit of the membrane — capacitance $C_M$ in parallel with resistor-battery branches ($R_K$, $E_K$) for K+ and ($R_{Na}$, $E_{Na}$) for Na+, across the membrane potential $V_M$.]

SLIDE 27

Using Kirchhoff's Current Law

The sum of all currents into the cell must be zero. The currents:

Capacitance: $I_{cap} = C_M \frac{dV_M}{dt}$
Ionic (potassium): $I_K = \frac{V_M - E_K}{r_K} = g_K(V_M - E_K)$
Ionic (sodium): $I_{Na} = \frac{V_M - E_{Na}}{r_{Na}} = g_{Na}(V_M - E_{Na})$
Ionic (leak): $I_L = \frac{V_M - E_L}{r_L} = g_L(V_M - E_L)$ — the passive flow of ions through ungated channels.

where $I$ = current, $r$ = resistance, $g$ = conductance ($\frac{1}{r}$), and $V_M$ = membrane potential.

$$I_{cap} + I_K + I_{Na} + I_L = 0$$
$$C_M \frac{dV_M}{dt} = -g_K(V_M - E_K) - g_{Na}(V_M - E_{Na}) - g_L(V_M - E_L)$$

(A numerical sketch follows.)
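A minimal Euler-integration sketch of the final equation (all constants are illustrative placeholders; conductances are held fixed, with no voltage gating):

# Euler integration of C_M dV/dt = -g_K(V-E_K) - g_Na(V-E_Na) - g_L(V-E_L)
C_M = 1.0                              # membrane capacitance (illustrative)
g_K, g_Na, g_L = 0.3, 0.1, 0.05        # fixed conductances (no gating here)
E_K, E_Na, E_L = -70.0, 50.0, -60.0    # reversal potentials (mV)

V = -65.0                              # initial membrane potential (mV)
dt = 0.01                              # time step (ms)
for _ in range(2000):
    dV = (-g_K * (V - E_K) - g_Na * (V - E_Na) - g_L * (V - E_L)) / C_M
    V += dt * dV
print(round(V, 2))   # V relaxes toward a conductance-weighted mix of E_K, E_Na, E_L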

SLIDE 28

Modeling Voltage-Gated Channels

$g_K$ and $g_{Na}$ are sensitive to the membrane potential, $V_M$. The gating variables $m$, $n$, and $h$ are gating probabilities (between 0 and 1); they are complex functions of $V_M$, determined empirically by Hodgkin and Huxley's work on the giant squid axon. Conductances are functions of the gating probabilities:

$g_K = \bar{g}_K n^4$ — since 4 identical and independent parts of a K gate need to be open; $\bar{g}_K$ = maximum K conductance.

$g_{Na} = \bar{g}_{Na} m^3 h$ — since 3 identical and independent parts (along with a different, 4th part) of an Na gate need to be open; $\bar{g}_{Na}$ = maximum Na conductance.

SLIDE 29

A Basic Version of the Hodgkin-Huxley Model

[Figure: ion flows across the membrane — Na+ and Ca++ move inward during depolarization; K+ moves outward during repolarization.]

$$\tau_m \frac{dV_M}{dt} = -g_K(V_M - E_K) - g_{Na}(V_M - E_{Na}) - g_L(V_M - E_L)$$

$\Delta V_M \propto$ inflow(Na+) − outflow(K+) − leak current.
$E_L \approx -60$ mV, $E_K \approx -70$ mV, and $E_{Na} \approx 50$ mV. $\tau_m$ includes the capacitance, $C_M$.

SLIDE 30

Leaky Integrate and Fire Neurons

[Figure: a leaky integrate-and-fire neuron $i$ — inputs $x_a$, $x_b$, $x_c$ arrive over weights $w_{ia}$, $w_{ib}$, $w_{ic}$; the potential $V_i$ leaks toward $E_L = -65$ mV; the output is $x_i$.]

These models ignore ion channels and activity along axons and dendrites.

SLIDE 31

A Simple Leak-and-Integrate Model

$$\tau_m \frac{dV_i}{dt} = c_L(E_L - V_i) + c_I \sum_{j=1}^{N} x_j w_{ij} \quad (1)$$

$V_i$ = intracellular potential of neuron $i$; $x_i$ = output (current) from neuron $i$; $w_{ij}$ = weight on the connection from $j$ to $i$; $E_L$ = extracellular potential; $\tau_m$ = membrane time constant (higher $\tau_m$ → slower change); $c_L$, $c_I$ = leak and integration constants.

A Common Abstraction

$$\tau_m \frac{dV_i}{dt} = -V_i + \sum_{j=1}^{N} x_j w_{ij} \quad (2)$$

SLIDE 32

Firing Models

Continuous: Sigmoid Function

$$x_i = \frac{1}{1+e^{-c_s V_i}} \quad (3)$$

Often used for rate coding, where $x_i$ = the neuron's firing rate and $c_s$ is a scaling constant.

Discrete: Step Function with Reset

$$x_i = \begin{cases} 1 & \text{if } V_i > T_f \\ 0 & \text{otherwise} \end{cases} \quad (4)$$

$V_i \leftarrow V_{reset}$ after $V_i$ exceeds the threshold $T_f$. Typical values: $V_{reset} = -65$ mV, $T_f = -50$ mV. Often used in spiking neuron models, where $x_i$ is binary, denoting the presence or absence of an action potential. (A leaky integrate-and-fire sketch combining the leak of equation (1) with the reset of equation (4) follows.)
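A minimal leaky integrate-and-fire sketch (the constants and input drive are invented; the leak pulls toward rest as in equation (1), here taking $E_L = V_{reset}$):

tau_m, dt = 10.0, 1.0           # membrane time constant and step (ms)
V_reset, T_f = -65.0, -50.0     # reset potential and firing threshold (mV)
drive = 20.0                    # made-up weighted input sum_j x_j * w_ij

V = V_reset
spikes = []
for t in range(100):
    V += dt / tau_m * (-(V - V_reset) + drive)  # leak toward rest + integrate
    if V > T_f:                                 # step function with reset, eq. (4)
        spikes.append(t)
        V = V_reset
print("spike times (ms):", spikes)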

SLIDE 33

Temporal Abstraction

[Figure: temporal abstraction — full voltage traces of neurons A, B, and C (spikes from −65 mV up to +40 mV over time) abstract to single activation levels, e.g., 0.8, 0.5, and 0.4.]

SLIDE 34

Spike Response Model (SRM) — Gerstner et al., 2002

$$V_i(t) = \kappa(I_{ext}) + \eta(t - \hat{t}_i) + \sum_{j=1}^{N} w_{ij} \sum_{h=1}^{H} \varepsilon_{ij}(t - \hat{t}_i, t - t_j^h)$$

[Figure: spike trains of neurons $i$, $j$, and $k$, with their most recent spike times $t^*$ marked and kernels $\varepsilon_{ki}$, $\varepsilon_{kj}$ linking upstream spikes to the downstream neuron.]

The timing of each spike is very important in determining its effects upon downstream neurons.

SLIDE 35

Spiking Neurons

Eugene Izhikevich, 2003. A Simple Model of Spiking Neurons. IEEE Transactions on Neural Networks, 14(6).

$$\tau_m \frac{dV_i}{dt} = 0.04 V_i^2 + 5 V_i + 140 - U_i + c_I \sum_{j=1}^{N} x_j w_{ij} \quad (5)$$

$$\tau_m \frac{dU_i}{dt} = a(b V_i - U_i) \quad (6)$$

$U_i$ = recovery factor. If $V_i \geq 30$ mV, then $V_i \leftarrow V_{reset}$ and $U_i \leftarrow U_i + U_{reset}$. (A simulation sketch follows.)
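A runnable sketch of the model in its original published form (a constant input current replaces the weighted sum, and the lecture's $\tau_m$ and $c_I$ scalings are dropped; a = 0.02, b = 0.2, V_reset = −65, U_reset = 8 are Izhikevich's regular-spiking parameters, and the step size is illustrative):

a, b, V_reset, U_reset = 0.02, 0.2, -65.0, 8.0  # regular-spiking parameters (Izhikevich, 2003)
V, U = V_reset, b * V_reset                     # initial state
dt, I = 0.5, 10.0                               # step (ms) and constant input current

spikes = []
for step in range(2000):
    V += dt * (0.04 * V**2 + 5 * V + 140 - U + I)  # eq. (5)
    U += dt * a * (b * V - U)                      # eq. (6)
    if V >= 30.0:                                  # spike: reset V, bump U
        spikes.append(step * dt)
        V, U = V_reset, U + U_reset
print("first spike times (ms):", spikes[:5])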

SLIDE 36

Parameterized Spiking Patterns

[Figure: voltage traces $V_i$ over time for four firing patterns — regular spiking, intrinsic bursting, chattering, and thalamocortical.]

Key parameters: $a$, $b$, $V_{reset}$, and $U_{reset}$ → spike patterns.

SLIDE 37

Continuous Time Recurrent Neural Networks

[Figure: a CTRNN with a five-node sensory input layer, a two-node hidden layer, a two-node motor output layer, and a bias node (B).]

CTRNNs abstract away spikes but achieve complex dynamics through neuron-specific time constants, gains and biases. All weights are evolved; none are modified by learning. Invented by Randall Beer in the early 1990s and used in many evolved, minimally cognitive agents.

SLIDE 38

The Simple CTRNN Model

$$s_i = \sum_{j=1}^{n} x_j w_{i,j} + I_i \qquad \frac{dV_i}{dt} = \frac{1}{\tau_i}\left[-V_i + s_i + \theta_i\right] \qquad x_i = \frac{1}{1+e^{-g_i V_i}}$$

$\theta_i$ = bias; $g_i$ = gain; $\tau_i$ = time constant for neuron $i$. Each neuron implicitly runs at a different temporal resolution. (An Euler-step sketch follows.)
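A minimal Euler-step sketch of this update for two neurons (weights, biases, gains, time constants, and inputs are all invented):

import numpy as np

W = np.array([[0.0, 2.0], [-2.0, 0.0]])  # W[i, j] = weight from j to i (made up)
theta = np.array([0.5, -0.5])            # biases theta_i
g = np.array([1.0, 1.0])                 # gains g_i
tau = np.array([1.0, 2.5])               # per-neuron time constants tau_i
I_ext = np.array([0.2, 0.0])             # external inputs I_i

V = np.zeros(2)
dt = 0.05
for _ in range(400):
    x = 1.0 / (1.0 + np.exp(-g * V))     # x_i = logistic(g_i * V_i)
    s = W @ x + I_ext                    # s_i = sum_j x_j w_ij + I_i
    V += dt / tau * (-V + s + theta)     # Euler step on dV_i/dt
print(np.round(V, 3))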

SLIDE 39

Essence of Learning in Neural Networks

[Figure: presynaptic neurons $u_1, u_2, \ldots, u_n$ feeding a postsynaptic neuron $v$ through weights $w_1, w_2, \ldots, w_n$; the question is how to compute $\Delta w$.]

Most ANNs model neither spikes nor STDP. Learning is based on a comparison of the recent firing rates of neuron pairs.

SLIDE 40

Spike-Timing Dependent Plasticity (STDP)

[Figure: the STDP curve — $\Delta s$ plotted against $\Delta t$ over roughly −40 ms to +40 ms, ranging from about −0.4 to +0.4 (percent of maximum strength).]

Change in synaptic strength ($\Delta s$) as a function of $\Delta t = t_{pre} - t_{post}$, the times of the most recent pre- and post-synaptic spikes. The maximum magnitude of change is roughly 0.4% of the maximum possible synaptic strength/conductance. (A sketch of a standard STDP window follows.)
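A sketch of a commonly used double-exponential STDP window (the exponential shape and time constant are a standard modelling choice, not taken from the lecture; the ±0.4% peak follows the slide):

import numpy as np

A_plus, A_minus = 0.004, 0.004    # peak change: 0.4% of maximum strength
tau = 15.0                        # decay time constant (ms); illustrative

def stdp(dt_ms):
    """dt_ms = t_pre - t_post. Pre-before-post (dt < 0) strengthens;
    post-before-pre (dt > 0) weakens, under this sign convention."""
    if dt_ms < 0:
        return A_plus * np.exp(dt_ms / tau)
    return -A_minus * np.exp(-dt_ms / tau)

for dt in (-40, -10, 10, 40):
    print(dt, round(stdp(dt), 5))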

SLIDE 41

3 Fundamental ANN Learning Paradigms

Supervised: constant, detailed feedback that includes the correct response to each input; an omnipresent teacher.
Reinforced: simple feedback, mainly at the end of a problem-solving attempt (possibly with a few intermediate rewards or penalties), but no direct response recommendations.
Unsupervised: no feedback whatsoever. The ANN normally tries to intelligently cluster the inputs and/or learn the proper correlations between components of the input space.

SLIDE 42

Supervised Learning

[Figure: supervised learning — the teacher's message ("You should have turned RIGHT at the last intersection") supplies the correct action; the error between it and the motor output produced from the sensory input drives $\Delta W$.]

SLIDE 43

Reinforced Learning

[Figure: reinforced learning — a reinforcement signal ("You are at the goal!") arrives at the end of the attempt and drives the weight updates $\Delta w$ throughout the network.]

SLIDE 44

Unsupervised Learning

[Figure: unsupervised learning — with no feedback signal, the weight updates $\Delta w$ are driven by the inputs alone, capturing regularities such as "a long trip down a corridor is followed by a left turn".]

SLIDE 45

Hebbian Learning Rules

General Hebbian: $\Delta w_i = \lambda u_i v$
Basic Heterosynaptic: $\Delta w_i = \lambda v (u_i - \theta_i)$
Basic Homosynaptic: $\Delta w_i = \lambda (v - \theta_v) u_i$
BCM: $\Delta w_i = \lambda u_i v (v - \theta_v)$
Oja: $\Delta w_i = u_i v - w_i v^2$

Homosynaptic All active synapses are modified the same way, depending only on the strength of the postsynaptic activity. Heterosynaptic Active synapses can be modified differently, depending upon the strength of their presynaptic activity.
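A small numpy sketch contrasting the general Hebbian rule with Oja's rule on made-up 2-D inputs; Oja's decay term $-w_i v^2$ keeps the weight vector bounded while plain Hebb grows without limit:

import numpy as np

rng = np.random.default_rng(1)
U = rng.standard_normal((300, 2)) * np.array([2.0, 0.5])  # toy inputs, larger variance in dim 0

lam = 0.01
w_hebb = np.array([0.1, 0.1])
w_oja = np.array([0.1, 0.1])
for u in U:
    v = w_hebb @ u
    w_hebb = w_hebb + lam * u * v                  # general Hebbian: unbounded growth
    v = w_oja @ u
    w_oja = w_oja + lam * (u * v - w_oja * v**2)   # Oja: Hebb plus normalizing decay
print("|w_hebb| =", round(float(np.linalg.norm(w_hebb)), 2),
      "|w_oja| =", round(float(np.linalg.norm(w_oja)), 2))  # Oja stays near 1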

SLIDE 46

Modelling Options to Consider

1. Single or multiple neurons?

2. Can neuron A send more than one axon to neuron B?

3. Are connections modeled as cables or just as simple connector points (i.e., a single weight)?

4. Do neurons have state? I.e., does $V_i(t+1)$ depend on $V_i(t)$?

5. Do outputs ($x_i$) represent individual spikes, spike rates, or something else?

6. Are neurons organized by layers?

7. Do layers follow a feed-forward topology, or is there recurrence (i.e., looping)?

8. Are neurons connected within layers or only between layers?

9. Is learning supervised, unsupervised, or reinforced?

10. Is spike-timing dependent plasticity (STDP) involved in the learning rule?
