Applied Machine Learning in Biomedicine: Neuron basics
Enrico Grisan (enrico.grisan@dei.unipd.it)


SLIDE 1

Applied Machine Learning in Biomedicine

Enrico Grisan enrico.grisan@dei.unipd.it

SLIDE 2

Neuron basics

SLIDE 3

Neuron: real and simulated

SLIDE 4

A bit of history

SLIDE 5

From biology to models

SLIDE 6

Biological models?

Careful with brain analogies: there are many different types of neurons; dendrites can perform complex non-linear computations; synapses are not single weights but complex dynamical systems; and a rate code may not be adequate.

SLIDE 7

Single neuron classifier

SLIDE 8

Neuron and logistic classifier

๐‘ ๐’š = ๐œ ๐’™๐‘ˆ๐’š = 1 1 + ๐‘“โˆ’(๐‘ฅ0+ ๐‘— ๐‘ฅ๐‘—๐‘ฆ๐‘—) Forward flow x x ๐‘ง

cell body activation function

  • utput axon

axon from a neuron synapse dendrite

๐‘ฅ๐‘—๐‘ฆ๐‘—

๐‘—

๐‘ฅ๐‘—๐‘ฆ๐‘— + ๐‘ ๐‘”

๐‘ฅ0๐‘ฆ0 ๐‘ฅ1๐‘ฆ1 ๐‘ฆ0 ๐‘ฅ0

๐‘”

๐‘—

๐‘ฅ๐‘—๐‘ฆ๐‘— + ๐‘

๐‘ง
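As a sketch, the forward pass of this single neuron can be written in a few lines of NumPy (function and variable names are mine, not from the slides):

```python
import numpy as np

def neuron_forward(w, x, b=0.0):
    """Single logistic neuron: y = sigmoid(b + sum_i w_i * x_i)."""
    a = b + np.dot(w, x)              # cell-body sum (pre-activation)
    return 1.0 / (1.0 + np.exp(-a))   # sigmoid activation

# with these values the pre-activation is exactly 0, so the output is 0.5
y = neuron_forward(np.array([1.0, -2.0]), np.array([0.5, 0.25]))
```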

SLIDE 9

Linking output to input

How does a change in the output (loss) affect the weights?

[Diagram: the same neuron, y = f(Σ_i w_i x_i + b), with cell body, activation function, output axon, synapses and dendrites.]

Backward flow: the loss L(w, y) ≈ (y − ŷ)² is evaluated at the output, and its effect has to be traced back through the neuron to the weights.

SLIDE 10

Linking output to input

[Diagram: the neuron y = f(a), with a = Σ_i w_i x_i + b, traversed backwards.]

Backward flow: starting from ∂f/∂a, the chain rule distributes the gradient to every weight:

∂f/∂a · ∂a/∂w_0,  ∂f/∂a · ∂a/∂w_1,  …,  ∂f/∂a · ∂a/∂w_i

How does a change in the output (loss) affect the weights? It enters the neuron as ∂L/∂a.

SLIDE 11

Activation function

1 1 + ๐‘“โˆ’๐‘ง(๐’™๐‘ˆ๐’š) = ๐œ ๐’™๐‘ˆ๐’š = ๐œ(๐’œ) Ups 1) Easy analytical derivatives 2) Squashes numbers to range [0,1] 3) Biological interpretation as saturating ยซfiring rateยป of a neuron Downs 1) Saturated neurons kill the gradients 2) Sigmoid output are not zero- centered

SLIDE 12

Sigmoid backpropagation

Assume the input of a neuron is always positive. What happens to the gradient on w?

The neuron computes f(Σ_i w_i x_i + b) and the weights are updated by a gradient step,

w^{t+1} = w^t − α ∇_w L

Since all x_i > 0, every component ∂L/∂w_i = (∂L/∂a) x_i takes the sign of ∂L/∂a: the gradient is all positive or all negative!
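A sketch of this effect, assuming a sigmoid neuron and a squared-error loss (these specific choices and values are mine):

```python
import numpy as np

x = np.array([0.2, 0.5, 1.0, 0.3, 0.8])     # all-positive inputs
w = np.array([0.1, -0.4, 0.25, -0.3, 0.05])

a = w @ x                                    # pre-activation
s = 1.0 / (1.0 + np.exp(-a))                 # sigmoid output
upstream = s - 1.0                           # dL/ds for L = (s - 1)^2 / 2 (target 1)
grad_w = upstream * s * (1.0 - s) * x        # dL/dw_i = upstream * sigma'(a) * x_i

# every component inherits the sign of the upstream gradient
all_negative = bool(np.all(grad_w < 0))
```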

SLIDE 13

Improving activation function

tanh(x) = (e^x − e^{−x}) / (e^x + e^{−x}),   d tanh(x)/dx = 1 − tanh²(x)

Ups:
1) Still analytical derivatives
2) Squashes numbers to the range [−1, 1]
3) Zero-centered!

Downs:
1) Saturated neurons kill the gradients
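The derivative formula can be verified against a finite difference (a minimal sketch; names are mine):

```python
import numpy as np

def tanh_grad(a):
    t = np.tanh(a)
    return 1.0 - t * t                # d tanh(a)/da = 1 - tanh^2(a)

# central finite-difference check at a = 0.7
eps = 1e-6
numeric = (np.tanh(0.7 + eps) - np.tanh(0.7 - eps)) / (2 * eps)
```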

SLIDE 14

Activation function 2

Rectified Linear Unit: ReLU

f(x) = max(0, x),   df/dx = 1 for x > 0 and 0 for x < 0

Ups:
1) Does not saturate (for x > 0)
2) Computationally efficient
3) Converges faster in practice

Downs:
1) What happens for x < 0?
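A sketch of ReLU and its gradient, which is identically zero for negative inputs (the question raised above):

```python
import numpy as np

def relu(a):
    return np.maximum(0.0, a)

def relu_grad(a):
    return (a > 0).astype(float)      # 1 for a > 0, 0 for a < 0: no gradient flows

a = np.array([-2.0, -0.5, 0.5, 2.0])
out, grad = relu(a), relu_grad(a)
```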

SLIDE 15

ReLU neuron killing

SLIDE 16

Activation function 3

Leaky ReLU

f(x) = x for x ≥ 0 and αx for x < 0,   df/dx = 1 for x > 0 and α for x < 0

Ups:
1) Does not saturate
2) Computationally efficient
3) Converges faster in practice
4) Keeps neurons alive!
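A sketch with a small slope α (α = 0.01 is a common choice, not specified on the slide):

```python
import numpy as np

def leaky_relu(a, alpha=0.01):
    return np.where(a > 0, a, alpha * a)

def leaky_relu_grad(a, alpha=0.01):
    return np.where(a > 0, 1.0, alpha)   # never exactly zero: neurons stay alive

a = np.array([-10.0, 3.0])
out, grad = leaky_relu(a), leaky_relu_grad(a)
```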

SLIDE 17

Activation function 4

Maxout

f(x) = max(w_1^T x, w_2^T x)

Ups:
1) Does not saturate
2) Computationally efficient
3) Linear regime
4) Keeps neurons alive!
5) Generalizes ReLU and leaky ReLU

Downs:
1) Is not a dot product
2) Doubles the parameters
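A maxout unit takes the maximum of two full linear responses; a sketch (the weights below are illustrative, chosen so the unit reduces to a leaky ReLU):

```python
import numpy as np

def maxout(w1, w2, x):
    return max(np.dot(w1, x), np.dot(w2, x))   # max of two dot products

# with w1 = [1, 0] and w2 = [0.01, 0] the unit behaves like a leaky ReLU on x[0]
w1, w2 = np.array([1.0, 0.0]), np.array([0.01, 0.0])
pos = maxout(w1, w2, np.array([2.0, 0.0]))
neg = maxout(w1, w2, np.array([-2.0, 0.0]))
```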

SLIDE 18

Neural Networks: architecture

SLIDE 19

Neural Networks: architecture

A 2-layer Neural Network is a 1-hidden-layer Neural Network; a 3-layer Neural Network is a 2-hidden-layer Neural Network.

SLIDE 20

Neural Networks: architecture

Number of neurons? Number of weights? Number of parameters?

SLIDE 21

Neural Networks: architecture

Number of neurons: 4 + 2 = 6. Number of weights: 4×3 + 2×4 = 20. Number of parameters: 20 + 6 = 26.

SLIDE 22

Neural Networks: architecture

First network: number of neurons 4 + 2 = 6; number of weights 4×3 + 2×4 = 20; number of parameters 20 + 6 = 26.
Second network: number of neurons 4 + 4 + 1 = 9; number of weights 4×3 + 4×4 + 1×4 = 32; number of parameters 32 + 9 = 41.
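The counting rule generalizes to any fully-connected architecture; a small helper (my own, not from the slides):

```python
def count_network(layer_sizes):
    """Neurons, weights and parameters of a fully-connected net.

    layer_sizes = [n_inputs, n_hidden_1, ..., n_outputs];
    each neuron also carries one bias parameter.
    """
    neurons = sum(layer_sizes[1:])
    weights = sum(n_out * n_in for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))
    return neurons, weights, weights + neurons

first = count_network([3, 4, 2])      # the 1-hidden-layer network above
second = count_network([3, 4, 4, 1])  # the 2-hidden-layer network above
```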

SLIDE 23

Neural Networks: architecture

Modern CNNs have ~10 million artificial neurons; the human visual cortex has ~5 billion neurons.

SLIDE 24

ANN representation

๐‘ฆ1 ๐‘ฆ2 ๐‘ฆ3 ๐’™1,1 ๐’™1,4 ๐’™2,1 ๐’™2,2

๐’€ = 1 ๐‘ฆ11 โ‹ฏ ๐‘ฆ13 1 ๐‘ฆ21 1 ๐‘ฆ31 โ‹ฎ โ‹ฏ โ‹ฏ โ‹ฏ ๐‘ฆ23 ๐‘ฆ33 โ‹ฎ 1 ๐‘ฆ๐‘‚1 โ‹ฏ ๐‘ฆ๐‘‚3 = ๐’š1

๐‘ˆ

๐’š2

๐‘ˆ

๐’š3

๐‘ˆ

โ‹ฎ

๐‘ˆ

๐’™๐Ÿ,๐’‹ = ๐‘ฅ0,๐‘— ๐‘ฅ1,๐‘— ๐‘ฅ2,๐‘— ๐‘ฅ3,๐‘— ๐’™๐Ÿ‘,๐’‹ = ๐‘ฅ0,๐‘— ๐‘ฅ1,๐‘— ๐‘ฅ2,๐‘— ๐‘ฅ3,๐‘— ๐‘ฅ4,๐‘—

SLIDE 25

ANN representation

๐’€ = 1 ๐‘ฆ11 โ‹ฏ ๐‘ฆ13 1 ๐‘ฆ21 1 ๐‘ฆ31 โ‹ฎ โ‹ฏ โ‹ฏ โ‹ฏ ๐‘ฆ23 ๐‘ฆ33 โ‹ฎ 1 ๐‘ฆ๐‘‚1 โ‹ฏ ๐‘ฆ๐‘‚3 = ๐’š1

๐‘ˆ

๐’š2

๐‘ˆ

๐’š3

๐‘ˆ

โ‹ฎ ๐’š๐‘‚

๐‘ˆ ๐‘ฟ1 =

๐‘ฅ0,1 ๐‘ฅ1,1 ๐‘ฅ2,1 ๐‘ฅ3,1 ๐‘ฅ0,2 ๐‘ฅ1,2 ๐‘ฅ2,2 ๐‘ฅ3,2 ๐‘ฅ0,3 ๐‘ฅ1,3 ๐‘ฅ2,3 ๐‘ฅ3,3 = ๐’™๐Ÿ,๐Ÿ ๐’™๐Ÿ,๐Ÿ‘ ๐’™๐Ÿ,๐Ÿ’ ๐‘ฟ2 = ๐‘ฅ1,2 ๐‘ฅ2,2 ๐’‚๐Ÿ ๐’€ = ๐‘”

1 ๐‘ฟ๐Ÿ๐’€

๐’ ๐’€ = ๐‘”

2 ๐‘ฟ๐Ÿ‘๐’‚1 = ๐‘” 2 ๐‘ฟ2๐‘” 1 ๐‘ฟ1๐’€
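This matrix-form forward pass can be sketched in NumPy (sizes and activation choices are illustrative; rows of X are samples, with a leading 1 for the bias):

```python
import numpy as np

rng = np.random.default_rng(1)
N, D, M, K = 5, 3, 4, 2                  # samples, inputs, hidden neurons, outputs

X = np.hstack([np.ones((N, 1)), rng.standard_normal((N, D))])  # bias column of ones
W1 = rng.standard_normal((M, D + 1))     # first-layer weights, bias included
W2 = rng.standard_normal((K, M + 1))     # second-layer weights

A1 = np.tanh(X @ W1.T)                   # hidden activations f1(W1 X), shape N x M
A1 = np.hstack([np.ones((N, 1)), A1])    # prepend bias for the next layer
Y = 1.0 / (1.0 + np.exp(-(A1 @ W2.T)))   # outputs f2(W2 f1(W1 X)), shape N x K
```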

SLIDE 26

ANN becoming popular

SLIDE 27

ANN training: forward flow

Define a loss function L(w; y). Each neuron computes the weighted sum of the activations of the previous layer,

a_j = Σ_i w_{ji} z_i

and passes on to the following layer its activation

z_j = f(a_j)

SLIDE 28

ANN training: backward flow

We need to compute

∂L/∂w_{ji} = (∂L/∂a_j)(∂a_j/∂w_{ji}) = δ_j z_i

Considering that for the output neurons δ_k = y_k − ŷ_k, we get for the hidden neurons

δ_j = f′(a_j) Σ_k w_{kj} δ_k

SLIDE 29

ANN training: backpropagation

Updating all the weights by gradient descent:

w_{ji}^{t+1} = w_{ji}^{t} − η δ_j z_i
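One such update step, sketched for a generic weight array (the learning rate η and the gradient values below are placeholders of mine):

```python
import numpy as np

def sgd_step(w, grad, eta=0.1):
    """Gradient-descent update: w <- w - eta * dL/dw."""
    return w - eta * grad

w_new = sgd_step(np.array([0.5, -0.5]), np.array([1.0, -2.0]))
```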

SLIDE 30

ANN training ex: forward

With tanh hidden units, f(a) = tanh(a), and a sum-of-squares loss

L_n = ½ Σ_{k=1}^{K} (y_k − ŷ_k)²

the forward pass is

a_j = Σ_{i=0}^{D} w_{ji}^{(1)} x_i
z_j = tanh(a_j)
y_k = Σ_{j=1}^{M} w_{kj}^{(2)} z_j

SLIDE 31

ANN training ex: backward

๐œ€๐‘™ = ๐‘ง๐‘™ โˆ’ ๐‘ง๐‘™ ๐œ€

๐‘˜ = (1 โˆ’ ๐‘จ ๐‘˜2) ๐‘™=1 ๐ฟ

๐‘ฅ๐‘™๐‘˜๐œ€๐‘™ ๐œ–๐‘€๐‘œ ๐œ–๐‘ฅ

๐‘˜๐‘—(1) = ๐œ€ ๐‘˜๐‘ฆ๐‘—

๐œ–๐‘€๐‘œ ๐œ–๐‘ฅ๐‘™๐‘˜(2) = ๐œ€๐‘™๐‘จ

๐‘˜

SLIDE 32

What can an ANN represent?

SLIDE 33

What can an ANN classify?

SLIDE 34

Regularization