ARTIFICIAL INTELLIGENCE
Lecturer: Silja Renooij
Artificial Neural Networks
Utrecht University The Netherlands
These slides are part of the INFOB2KI Course Notes available from www.cs.uu.nl/docs/vakken/b2ki/schema.html
Artificial neuron (node) vs. natural neuron
[Figure: artificial neuron — inputs x1 … xn with weights w1 … wn feed a linear combiner; its sum passes through a hard delimiter (aka hard limiter / threshold function) to produce the output y]
The neuron computes the weighted sum z = w1·x1 + … + wn·xn of its inputs, then applies the activation function to it: y = g(z).
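As a sketch, the two-stage computation (linear combiner, then activation g; function names are mine) looks like:

```python
def linear_combiner(x, w):
    # z = w1*x1 + ... + wn*xn
    return sum(wi * xi for wi, xi in zip(w, x))

def neuron_output(x, w, g):
    # y = g(z), where g is the activation, e.g. a hard limiter
    return g(linear_combiner(x, w))

# hard limiter that fires once the weighted sum reaches 0
hard = lambda z: 1 if z >= 0 else 0
```

For example, neuron_output((1, 1), (2, 4), hard) combines z = 2·1 + 4·1 = 6 and fires.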
[Worked example: computing the output of a neuron with weights w = 2 and w = 4]
Input nodes feed a single layer of neurons (d = desired output).
NB: in the book the learning rate is called Gain, with notation η.
Training loop: one pass over all training inputs is one 'epoch'. After each epoch, ask: were the weights changed for any t? If so, run another epoch; if all weights are unchanged, training is ready.
Example: a perceptron for AND, starting from w1 = 0.3, w2 = −0.1 and threshold θ = 0.2, with training set:

x1 x2 d(t)
t1: 0 0 0
t2: 0 1 0
t3: 1 0 0
t4: 1 1 1

Alternative: use bias b = −θ with unit step function.
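The equivalence of a threshold θ and a bias b = −θ is easy to check in code (a sketch; the unit step fires for z ≥ 0):

```python
def step(z):
    # unit step function
    return 1 if z >= 0 else 0

def neuron_threshold(x, w, theta):
    # fires when the weighted sum reaches the threshold theta
    return step(sum(wi * xi for wi, xi in zip(w, x)) - theta)

def neuron_bias(x, w, b):
    # the same unit, with the bias b = -theta folded into the sum
    return step(sum(wi * xi for wi, xi in zip(w, x)) + b)
```

With the example's weights, neuron_threshold(x, (0.3, -0.1), 0.2) and neuron_bias(x, (0.3, -0.1), -0.2) agree on every input.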
[Worked training steps: each misclassified input triggers a weight update, passing through intermediate values such as w1 = 0.2, w2 = −0.1, until training converges to w1 = 0.1, w2 = 0.1 with θ = 0.2.]

The resulting perceptron computes AND correctly:

x1 x2 d y
0  0  0 0
0  1  0 0
1  0  0 0
1  1  1 1
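The procedure can be sketched as a small trainer for AND; the starting weights are those of the example, but the learning rate 0.1 and the threshold-as-bias update are my assumptions, not stated on the slides:

```python
def step(z, theta):
    # hard limiter: fire iff the weighted sum reaches the threshold
    return 1 if z >= theta else 0

def train_perceptron(data, w, theta, alpha=0.1, max_epochs=100):
    # Perceptron learning: repeat epochs until no weights change.
    for _ in range(max_epochs):
        changed = False
        for x, d in data:
            z = sum(wi * xi for wi, xi in zip(w, x))
            e = d - step(z, theta)              # e(t) = d(t) - y(t)
            if e != 0:
                w = [wi + alpha * e * xi for wi, xi in zip(w, x)]
                theta -= alpha * e              # threshold learns like a (negated) bias
                changed = True
        if not changed:                         # all weights unchanged: ready
            break
    return w, theta

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, theta = train_perceptron(AND, [0.3, -0.1], 0.2)
```

After convergence the trained unit classifies all four AND cases correctly (the exact final weights depend on the update order and learning rate).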
XOR is not linearly separable, so a single perceptron cannot represent it:

x1 x2 d
0  0  0
0  1  1
1  0  1
1  1  0
(XOR)
[Figure: XOR realized by a two-layer network of threshold units with ϴ = 1]
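One classic two-layer construction uses XOR(x1, x2) = AND(OR(x1, x2), NAND(x1, x2)); the particular weights below are my illustration, not necessarily the slide's:

```python
def step(z):
    # threshold unit: fires iff z >= 0
    return 1 if z >= 0 else 0

def xor_net(x1, x2):
    # hidden layer: one OR unit and one NAND unit
    h_or = step(x1 + x2 - 1)        # fires iff x1 + x2 >= 1
    h_nand = step(1.5 - x1 - x2)    # fires iff not both inputs are 1
    # output layer: AND of the two hidden units
    return step(h_or + h_nand - 2)  # fires iff both hidden units fire
```

No single threshold unit can compute XOR, but two layers can.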
[Figure: multilayer network — input nodes, a hidden layer of neurons, and an output neuron layer]
[Figure: backpropagation — input signals flow forward through the network, error signals flow backward]
[Figure: sigmoid neuron — inputs x1 … xn with weights w1 … wn feed a linear combiner; its sum passes through a sigmoid function to produce the output y]
[Worked example: a neuron with weights w = 2 and w = 4 and sigmoid activation; output ≈ 0.119]
The sigmoid function and its derivative:

g(z) = 1 / (1 + e^(−z))
g′(z) = g(z) · (1 − g(z))
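A sketch of the sigmoid g(z) = 1/(1 + e^(−z)) and its derivative g′(z) = g(z)(1 − g(z)):

```python
import math

def sigmoid(z):
    # g(z) = 1 / (1 + e^-z)
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_prime(z):
    # g'(z) = g(z) * (1 - g(z))
    g = sigmoid(z)
    return g * (1.0 - g)
```

For instance, sigmoid(-2) ≈ 0.119.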
Initialize weights and threshold (or bias) to random numbers; choose a learning rate 0 < β < 1.
For each training input t = <x1,…,xn>: calculate the output y(t) and the error e(t) = d(t) − y(t), and update the weights.
Weights changed for any t? Run another epoch. All weights unchanged? Ready.
Weight updates use Δwij = β · yi · δj, with the error term δj given by:

δj = yj · (1 − yj) · ej             if j is an output node
δj = yj · (1 − yj) · Σk wjk · δk    if j is a hidden node
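The backpropagation error terms — δj = yj(1 − yj)·ej for an output node, δj = yj(1 − yj)·Σk wjk δk for a hidden node, with update Δwij = β·yi·δj — translate directly into helper functions (a sketch; function names are mine):

```python
def output_delta(y, d):
    # delta_j = y_j * (1 - y_j) * e_j, with e_j = d_j - y_j
    return y * (1.0 - y) * (d - y)

def hidden_delta(y, downstream):
    # delta_j = y_j * (1 - y_j) * sum_k w_jk * delta_k
    # downstream: list of (w_jk, delta_k) pairs for the nodes k fed by j
    return y * (1.0 - y) * sum(w * dk for w, dk in downstream)

def weight_update(beta, y_i, delta_j):
    # Delta w_ij = beta * y_i * delta_j
    return beta * y_i * delta_j
```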
Worked example: a network for XOR with input nodes 1 and 2, hidden nodes 3 and 4, and output node 5; weights w13 = 10, w14 = −5, w23 = −5, w24 = 10, w35 = 5, w45 = 5.
Activation function for nodes 3–5: y = 1 / (1 + e^(−(z−6))), i.e. a sigmoid with threshold 6. Set β = 0.9.

x1 x2 d
0  0  0
0  1  1
1  0  1
1  1  0

First case, input (0, 0): y3 ≈ 0.002, y4 ≈ 0.002, y5 ≈ 0.003.
To simplify computation, if the absolute value of e(t) < 0.1, we consider the outcome correct. With the sigmoid as approximation of the step function, we consider this outcome correct: no weight updates required for the first case, for now…
Second case, input (0, 1) with d = 1: y3 ≈ 0.000, y4 = 0.982, y5 = 0.252, so e = d − y5 ≈ 0.748.

δ5 = y5 · (1 − y5) · e ≈ 0.141
Δw35 = β · y3 · δ5 ≈ 0.000
Δw45 = β · y4 · δ5 ≈ 0.125
δ3 = y3 · (1 − y3) · w35 · δ5 ≈ 0.000
δ4 = y4 · (1 − y4) · w45 · δ5 ≈ 0.012
Δw13 = β · y1 · δ3 = β · x1 · δ3 = 0 = Δw14
Δw23 = β · x2 · δ3 ≈ 0.000
Δw24 = β · x2 · δ4 ≈ 0.011

(Weights still w35 = 5, w45 = 5, w13 = 10, w24 = 10, w14 = −5, w23 = −5; activation for nodes 3–5: y = 1 / (1 + e^(−(z−6))), β = 0.9.)
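These numbers can be reproduced directly; a sketch of the forward and backward pass for this 2–2–1 network, reading the activation as the threshold-6 sigmoid y = 1/(1 + e^(−(z−6))) with β = 0.9:

```python
import math

def act(z):
    # sigmoid with threshold 6: y = 1 / (1 + e^-(z - 6))
    return 1.0 / (1.0 + math.exp(-(z - 6.0)))

beta = 0.9
w13, w14, w23, w24 = 10.0, -5.0, -5.0, 10.0   # input -> hidden
w35, w45 = 5.0, 5.0                           # hidden -> output

x1, x2, d = 0.0, 1.0, 1.0                     # training input (0, 1)

# forward pass
y3 = act(w13 * x1 + w23 * x2)                 # ~ 0.000
y4 = act(w14 * x1 + w24 * x2)                 # ~ 0.982
y5 = act(w35 * y3 + w45 * y4)                 # ~ 0.252
e = d - y5                                    # ~ 0.748

# backward pass
d5 = y5 * (1 - y5) * e                        # ~ 0.141
dw45 = beta * y4 * d5                         # ~ 0.125
d4 = y4 * (1 - y4) * w45 * d5                 # ~ 0.012
dw24 = beta * x2 * d4                         # ~ 0.011
```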
Adjust the weights that require changing: Δw45 ≈ 0.125, so w45 becomes 5.125; Δw24 ≈ 0.011, so w24 becomes 10.011 (the other weights stay at w35 = 5, w13 = 10, w14 = −5, w23 = −5).

Recomputing input (0, 1) with the new weights: y3 ≈ 0.000, y4 ≈ 0.982, y5 ≈ 0.276 — closer to d = 1 than before.
Final weights after training: w35 = 13, w45 = 13, w13 = 12, w24 = 13, w14 = −13, w23 = −11 (activation for nodes 3–5: y = 1 / (1 + e^(−(z−6))), β = 0.9). The network now computes XOR:

x1 x2 d y
0  0  0 0.003
0  1  1 0.999
1  0  1 0.999
1  1  0 0.003
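As a check, these final weights reproduce the output column y (a sketch, again reading the activation as the threshold-6 sigmoid):

```python
import math

def act(z):
    # sigmoid with threshold 6: y = 1 / (1 + e^-(z - 6))
    return 1.0 / (1.0 + math.exp(-(z - 6.0)))

# final weights
w13, w14, w23, w24 = 12.0, -13.0, -11.0, 13.0
w35, w45 = 13.0, 13.0

def forward(x1, x2):
    y3 = act(w13 * x1 + w23 * x2)
    y4 = act(w14 * x1 + w24 * x2)
    return act(w35 * y3 + w45 * y4)
```

forward rounds to 0.003, 0.999, 0.999, 0.003 on the four XOR inputs.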
– Feed-forward network
[Example: face recognition — a network trained to distinguish images of two people, Steve and David]
[Plot: total error against the number of sweeps (epochs) through the training data]
– 100%
– 100%
– Sonar mine/rock recognition (Gorman & Sejnowski, 1988)
– Navigation of a car (Pomerleau, 1989)
– Stock-market prediction
– Pronunciation (NETtalk: Sejnowski & Rosenberg, 1987)
Acyclic: feedforward
Cyclic: recurrent
Source: NIPS 2015 tutorial by Y. LeCun
https://www.youtube.com/watch?v=mzpW10DPHeQ
https://www.youtube.com/watch?v=S9Y_I9vY8Qw https://www.youtube.com/watch?v=TS8QlL-3NXk