Applied Machine Learning in Biomedicine
Enrico Grisan (enrico.grisan@dei.unipd.it)
Neuron basics
Neuron: real and simulated
A bit of history
From biology to models
Biological models?
Careful with brain analogies:
- Many different types of neurons
- Dendrites can perform complex non-linear computations
- Synapses are not single weights but complex dynamical systems
- Rate code may not be adequate
Single neuron classifier
Neuron and logistic classifier
Forward flow (weights x_i, inputs y_i; with y_0 = 1 the weight x_0 acts as the bias):

a = x_0 + Σ_i x_i y_i
z = f(a) = σ(a) = 1 / (1 + e^(−a))

[Figure: neuron diagram. Each input y_i arrives along the axon of an upstream neuron, crosses a synapse onto a dendrite and is weighted by x_i; the cell body sums the weighted inputs, the activation function f is applied, and the result z travels down the output axon.]
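As a minimal sketch (variable names are mine; weights x, inputs y and bias x_0 follow the slide's notation), the forward flow of a single sigmoid neuron:

```python
import math

def neuron_forward(x, y, x0):
    """Single sigmoid neuron: z = sigma(x0 + sum_i x_i * y_i)."""
    a = x0 + sum(xi * yi for xi, yi in zip(x, y))  # weighted sum in the cell body
    return 1.0 / (1.0 + math.exp(-a))              # squashing activation function

# With a zero weighted sum the neuron sits at sigma(0) = 0.5:
z = neuron_forward([1.0, -1.0], [2.0, 2.0], 0.0)
print(z)  # -> 0.5
```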
Linking output to input
How does a change in the output (loss) affect the weights?

Backward flow: the error is measured at the output by a loss

L(x, z) ∝ (z − ẑ)²

and its gradient must be propagated backwards, through the activation function and the weighted sum Σ_i x_i y_i + x_0, to every weight x_i.

[Figure: the same neuron diagram, now traversed backwards from the output z to the weights.]
Linking output to input
[Figure: neuron diagram annotated with the local gradients along the backward path.]

Backward flow: by the chain rule, the effect of each weight x_i on the loss factors through the activation a and the output z:

∂L/∂x_i = (∂L/∂z) (∂z/∂a) (∂a/∂x_i),  for x_0, x_1, …, x_n
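The chain rule above can be sketched numerically (the squared-error loss and the example values are mine, not from the slide):

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def neuron_backward(x, y, x0, z_hat):
    """dL/dx_i for L = 0.5*(z - z_hat)**2, via dL/dx_i = dL/dz * dz/da * da/dx_i."""
    a = x0 + sum(xi * yi for xi, yi in zip(x, y))
    z = sigmoid(a)
    dL_dz = z - z_hat        # derivative of the squared-error loss
    dz_da = z * (1.0 - z)    # analytical sigmoid derivative
    return [dL_dz * dz_da * yi for yi in y]   # da/dx_i = y_i

grads = neuron_backward([0.5, -0.5], [1.0, 1.0], 0.0, 1.0)
```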
Activation function
σ(a) = 1 / (1 + e^(−a)),  z = σ(Σ_i x_i y_i)

Ups
1) Easy analytical derivative: σ'(a) = σ(a)(1 − σ(a))
2) Squashes numbers to the range [0, 1]
3) Biological interpretation as the saturating «firing rate» of a neuron

Downs
1) Saturated neurons kill the gradients
2) Sigmoid outputs are not zero-centered
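The first "Down" can be seen directly: a small sketch evaluating the analytical derivative σ'(a) = σ(a)(1 − σ(a)) near zero and deep in the saturated tail:

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def sigmoid_grad(a):
    s = sigmoid(a)
    return s * (1.0 - s)   # easy analytical derivative

print(sigmoid_grad(0.0))   # -> 0.25, the largest the gradient ever gets
print(sigmoid_grad(10.0))  # ~4.5e-05: a saturated neuron kills the gradient
```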
Sigmoid backpropagation
Assume the inputs of a neuron are always positive (as sigmoid outputs are). What about the gradient on the weights x?

a = Σ_i x_i y_i + x_0,  x^(u+1) = x^(u) − α ∂L/∂x

Since ∂L/∂x_i = (∂L/∂a) y_i and every y_i > 0, all components of the gradient share the sign of ∂L/∂a: the gradient is all positive or all negative!
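A tiny numerical illustration (the values of δ = ∂L/∂a and y are made up): with all-positive inputs, every component of the gradient inherits the sign of δ.

```python
# dL/dx_i = delta * y_i: with all y_i > 0, every gradient shares delta's sign.
delta = -0.7                 # hypothetical upstream gradient dL/da
y = [0.2, 1.3, 0.5]          # all-positive inputs, e.g. sigmoid outputs
grads = [delta * yi for yi in y]
print(all(g < 0 for g in grads))  # -> True: the weights can only move together
```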
Improving activation function
tanh(y) = (e^y − e^(−y)) / (e^y + e^(−y)),  d tanh(y)/dy = 1 − tanh²(y)

Ups
1) Still analytical derivatives
2) Squashes numbers to the range [−1, 1]
3) Zero-centered!

Downs
1) Saturated neurons kill the gradients
Activation function 2
Rectifying Linear Unit: ReLU
f(y) = max(0, y),  df/dy = 1 if y > 0, 0 if y < 0

Ups
1) Does not saturate (for y > 0)
2) Computationally efficient
3) Converges faster in practice

Downs
1) What happens for y < 0?
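A minimal ReLU sketch showing the zero-gradient regime the "Downs" question points at:

```python
def relu(y):
    return max(0.0, y)

def relu_grad(y):
    return 1.0 if y > 0 else 0.0   # exactly zero gradient for y < 0

# A neuron whose input stays negative receives no gradient at all:
print(relu(-3.0), relu_grad(-3.0))  # -> 0.0 0.0
```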
ReLU: dying neurons
Activation function 3
Leaky ReLU
f(y) = y if y > 0, βy if y < 0 (β small, e.g. 0.01)
df/dy = 1 if y > 0, β if y < 0

Ups
1) Does not saturate
2) Computationally efficient
3) Converges faster in practice
4) Keeps neurons alive!

Downs
1) The slope β is one more hyperparameter to choose
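A leaky-ReLU sketch (the slope β = 0.01 is a common default, not prescribed by the slide):

```python
def leaky_relu(y, beta=0.01):
    return y if y > 0 else beta * y

def leaky_relu_grad(y, beta=0.01):
    return 1.0 if y > 0 else beta   # never exactly zero: the neuron stays alive

print(leaky_relu(-3.0))  # a small negative value: signal still flows for y < 0
```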
Activation function 4
Maxout
f(y) = max(w_1ᵀ y, w_2ᵀ y)

Ups
1) Does not saturate
2) Computationally efficient
3) Linear regime
4) Keeps neurons alive!
5) Generalizes ReLU and leaky ReLU

Downs
1) Is not a dot product
2) Doubles the parameters
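A maxout sketch; with w_1 = 0 it reduces to a ReLU on w_2ᵀy (the weights below are illustrative):

```python
def dot(w, y):
    return sum(wi * yi for wi, yi in zip(w, y))

def maxout(w1, w2, y):
    """Maxout unit: max of two linear responses (hence twice the parameters)."""
    return max(dot(w1, y), dot(w2, y))

# With w1 = 0, maxout = max(0, w2 . y): exactly a ReLU on w2's response.
print(maxout([0.0, 0.0], [1.0, 1.0], [2.0, -0.5]))  # -> 1.5
```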
Neural Networks: architecture
2-layer Neural Network = 1-hidden-layer Neural Network
3-layer Neural Network = 2-hidden-layer Neural Network
Neural Networks: architecture
Number of neurons? Number of weights? Number of parameters?
Neural Networks: architecture
Number of neurons: 4 + 2 = 6
Number of weights: 4×3 + 2×4 = 20
Number of parameters (weights + biases): 20 + 6 = 26
Neural Networks: architecture
Number of neurons: 4 + 2 = 6; number of weights: 4×3 + 2×4 = 20; number of parameters: 20 + 6 = 26
Number of neurons: 4 + 4 + 1 = 9; number of weights: 4×3 + 4×4 + 1×4 = 32; number of parameters: 32 + 9 = 41
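The counting rule generalizes to any fully connected architecture; a small helper (my own naming) that reproduces the two tallies above:

```python
def count_parameters(layer_sizes):
    """Weights and biases of a fully connected network.

    layer_sizes e.g. [3, 4, 2]: 3 inputs, a hidden layer of 4, 2 outputs.
    Weights: n_in * n_out summed over consecutive layers; one bias per neuron.
    """
    weights = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
    biases = sum(layer_sizes[1:])
    return weights, biases, weights + biases

print(count_parameters([3, 4, 2]))     # -> (20, 6, 26)
print(count_parameters([3, 4, 4, 1]))  # -> (32, 9, 41)
```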
Neural Networks: architecture
Modern CNNs: ~10 million artificial neurons
Human Visual Cortex: ~5 billion neurons
ANN representation
[Figure: a network with 3 inputs y_1, y_2, y_3 feeding a hidden layer of 4 neurons and an output layer of 2; the first-layer weights are labeled x_{1,1}, …, x_{1,4} and the second-layer weights x_{2,1}, x_{2,2}.]

Stack the samples one per row, with a leading 1 multiplying the bias weight:

Y = | 1  y_11 …  y_13 |   | y_1ᵀ |
    | 1  y_21 …  y_23 |   | y_2ᵀ |
    | 1  y_31 …  y_33 | = | y_3ᵀ |
    | ⋮        ⋮      |   |  ⋮   |
    | 1  y_n1 …  y_n3 |   | y_nᵀ |

Each neuron i stores its weights as a row vector:
first layer: x_{L1,i} = [x_{0,i}  x_{1,i}  x_{2,i}  x_{3,i}]  (bias + 3 inputs)
second layer: x_{L2,i} = [x_{0,i}  x_{1,i}  x_{2,i}  x_{3,i}  x_{4,i}]  (bias + 4 hidden outputs)
ANN representation
With the data matrix Y as before, collect the weight rows of each layer into a matrix, e.g. for a first layer of 3 neurons:

X_L1 = | x_{0,1}  x_{1,1}  x_{2,1}  x_{3,1} |   | x_{L1,1} |
       | x_{0,2}  x_{1,2}  x_{2,2}  x_{3,2} | = | x_{L1,2} |
       | x_{0,3}  x_{1,3}  x_{2,3}  x_{3,3} |   | x_{L1,3} |

A whole layer then acts on a sample y with one matrix product, and the network is a composition of layers:

o^(1) = σ(X_L1 y),  z = σ(X_L2 o^(1)) = σ(X_L2 σ(X_L1 y))
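The layered matrix form can be sketched directly (the weights below are made-up examples; each row holds one neuron's weights, the first entry multiplying the bias input 1):

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def layer(X, inp):
    """One layer: prepend the bias input 1, then sigma(row . input) per neuron."""
    inp = [1.0] + inp
    return [sigmoid(sum(w * v for w, v in zip(row, inp))) for row in X]

X_L1 = [[0.0, 1.0, -1.0],    # hidden neuron 1: bias weight, then 2 input weights
        [0.5, 0.5, 0.5]]     # hidden neuron 2
X_L2 = [[0.0, 1.0, 1.0]]     # output neuron over the 2 hidden outputs

y = [1.0, 1.0]
z = layer(X_L2, layer(X_L1, y))   # z = sigma(X_L2 sigma(X_L1 y))
```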
ANN becoming popular
ANN training: forward flow
Define a loss function L(o_k; ẑ). Each neuron computes

a_k = Σ_j x_{jk} o_j

and passes to the following layer

o_k = f(a_k)
ANN training: backward flow
We need to compute

∂L/∂x_{jk} = (∂L/∂a_k) (∂a_k/∂x_{jk}) = δ_k o_j

Considering that for the output neurons δ_k = z_k − ẑ_k, we get for the hidden neurons

δ_j = f′(a_j) Σ_k x_{jk} δ_k
ANN training: backpropagation
Updating all weights by gradient descent:

x_{jk}^(u+1) = x_{jk}^(u) − η δ_k o_j
ANN training ex: forward
With f(a) = tanh(a) and the squared-error loss

L = ½ Σ_{k=1}^{K} (z_k − ẑ_k)²

the forward pass is

a_j = Σ_{i=0}^{D} x_{ij}^(1) y_i,  o_j = tanh(a_j),  z_k = Σ_{j=1}^{M} x_{jk}^(2) o_j
ANN training ex: backward
δ_k = z_k − ẑ_k
δ_j = (1 − o_j²) Σ_{k=1}^{K} x_{jk}^(2) δ_k

∂L/∂x_{ij}^(1) = δ_j y_i,  ∂L/∂x_{jk}^(2) = δ_k o_j
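The whole example can be checked end to end: a sketch (weights and targets are arbitrary test values; biases are omitted for brevity) comparing the backpropagated gradient with a finite difference of the loss:

```python
import math

# tanh hidden layer, linear output, L = 0.5 * sum_k (z_k - z_hat_k)**2
X1 = [[0.1, 0.4], [-0.3, 0.2]]   # X1[i][j]: input i -> hidden j
X2 = [[0.5], [-0.6]]             # X2[j][k]: hidden j -> output k
y, z_hat = [1.0, 2.0], [0.3]

def forward(X1, X2, y):
    a = [sum(X1[i][j] * y[i] for i in range(len(y))) for j in range(len(X1[0]))]
    o = [math.tanh(aj) for aj in a]
    z = [sum(X2[j][k] * o[j] for j in range(len(o))) for k in range(len(X2[0]))]
    return o, z

o, z = forward(X1, X2, y)
delta_k = [zk - zhk for zk, zhk in zip(z, z_hat)]        # output deltas
delta_j = [(1 - o[j] ** 2) * sum(X2[j][k] * delta_k[k]   # hidden deltas
           for k in range(len(delta_k))) for j in range(len(o))]
grad_00 = delta_j[0] * y[0]       # backprop gradient dL/dx_00^(1) = delta_j * y_i

def loss(X1):
    _, z = forward(X1, X2, y)
    return 0.5 * sum((zk - zhk) ** 2 for zk, zhk in zip(z, z_hat))

eps = 1e-6
X1p = [row[:] for row in X1]
X1p[0][0] += eps
numeric = (loss(X1p) - loss(X1)) / eps    # finite-difference estimate
print(abs(grad_00 - numeric) < 1e-5)      # -> True: backprop matches
```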