

  1. Applied Machine Learning in Biomedicine Enrico Grisan enrico.grisan@dei.unipd.it

  2. Neuron basics

  3. Neuron: real and simulated

  4. A bit of history

  5. From biology to models

  6. Biological models? Careful with brain analogies: there are many different types of neurons; dendrites can perform complex non-linear computations; synapses are not single weights but complex dynamical systems; a rate code may not be adequate.

  7. Single neuron classifier

  8. Neuron and logistic classifier. A single neuron computes the same function as a logistic classifier: $p(y = 1 \mid \mathbf{x}) = \sigma(\mathbf{w}^T \mathbf{x}) = \frac{1}{1 + e^{-(w_0 + \sum_j w_j x_j)}}$. In the biological analogy, the axon from an upstream neuron carries the input $x_j$, the synapse applies the weight $w_j$, the dendrite delivers $w_j x_j$ to the cell body, which sums $\sum_j w_j x_j + b$, and the activation function $g$ sends $z = g\big(\sum_j w_j x_j + b\big)$ down the output axon (forward flow).
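A minimal NumPy sketch of this forward flow, assuming a sigmoid activation (the variable names and values are illustrative, not from the slides):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def neuron_forward(w, b, x):
    """Cell body: weighted sum of inputs plus bias, then activation."""
    a = np.dot(w, x) + b          # sum_j w_j * x_j + b
    return sigmoid(a)             # output carried down the axon

x = np.array([0.5, -1.2, 3.0])    # inputs arriving on the dendrites
w = np.array([0.8, 0.1, -0.4])    # synaptic weights
b = 0.2
print(neuron_forward(w, b, x))    # a value in (0, 1)
```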

  9. Linking output to input. How does a change in the output (loss) propagate back to the weights? Attach a squared-error loss to the neuron's output: $L(\mathbf{w}, y) \approx \left( y - g\big( \textstyle\sum_j w_j x_j + b \big) \right)^2$ (backward flow).

  10. Linking output to input. How does a change in the output (loss) propagate back to the weights? By the chain rule, gradients flow backwards through the same diagram: with pre-activation $a = \sum_j w_j x_j + b$ and output $z = g(a)$, $\frac{\partial L}{\partial w_j} = \frac{\partial L}{\partial z} \frac{\partial z}{\partial a} \frac{\partial a}{\partial w_j} = \frac{\partial L}{\partial z}\, g'(a)\, x_j$ (backward flow).
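A hedged sketch of the same chain rule in code, for one sigmoid neuron under the squared-error loss of the previous slide (names and values are illustrative):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def neuron_backward(w, b, x, y):
    a = np.dot(w, x) + b
    z = sigmoid(a)                # forward pass
    dL_dz = 2.0 * (z - y)         # dL/dz for L = (y - z)^2
    dz_da = z * (1.0 - z)         # sigmoid derivative g'(a)
    dL_da = dL_dz * dz_da         # chain rule through the activation
    dL_dw = dL_da * x             # da/dw_j = x_j
    dL_db = dL_da                 # da/db = 1
    return dL_dw, dL_db

x = np.array([0.5, -1.2, 3.0])
w = np.array([0.8, 0.1, -0.4])
grad_w, grad_b = neuron_backward(w, 0.2, x, y=1.0)
print(grad_w, grad_b)
```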

  11. Activation function: sigmoid. $\sigma(a) = \sigma(\mathbf{w}^T \mathbf{x}) = \frac{1}{1 + e^{-\mathbf{w}^T \mathbf{x}}}$. Ups: 1) easy analytical derivative; 2) squashes numbers into the range [0, 1]; 3) biological interpretation as the saturating "firing rate" of a neuron. Downs: 1) saturated neurons kill the gradients; 2) sigmoid outputs are not zero-centered.

  12. Sigmoid backpropagation. Assume the input of a neuron is always positive. What about the gradient on $\mathbf{w}$? Since $\frac{\partial L}{\partial w_j} = \delta\, x_j$ with a single scalar $\delta$ shared by all components, when every $x_j > 0$ the update $\mathbf{w}_{t+1} = \mathbf{w}_t - \eta\, \nabla_{\mathbf{w}} L$ uses a gradient that is all positive or all negative!
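A tiny numeric illustration of this pathology, assuming strictly positive inputs and an illustrative upstream scalar $\delta$:

```python
import numpy as np

# dL/dw_j = delta * x_j: if every x_j > 0, all gradient entries
# share the sign of delta, forcing zig-zag weight updates.
x = np.array([0.3, 1.5, 0.7])       # strictly positive inputs
for delta in (+0.8, -0.8):          # scalar dL/da from upstream
    grad_w = delta * x
    print(delta, np.sign(grad_w))   # all +1 or all -1
```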

  13. Improving the activation function: tanh. $\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$, with derivative $\frac{d \tanh(x)}{dx} = 1 - \tanh^2(x)$. Ups: 1) still analytical derivatives; 2) squashes numbers into the range [-1, 1]; 3) zero-centered! Downs: 1) saturated neurons still kill the gradients.
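A short sketch checking the tanh formulas numerically (values are illustrative):

```python
import numpy as np

# tanh activation and its analytical derivative, 1 - tanh(x)^2
def tanh_grad(x):
    return 1.0 - np.tanh(x) ** 2

x = np.array([-4.0, -1.0, 0.0, 1.0, 4.0])
print(np.tanh(x))       # zero-centered, squashed into [-1, 1]
print(tanh_grad(x))     # near 0 at |x| = 4: saturation still kills gradients
```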

  14. Activation function 2: Rectified Linear Unit (ReLU). $g(x) = \max(0, x)$, with $\frac{dg}{dx} = 1$ for $x > 0$ and $0$ for $x < 0$. Ups: 1) does not saturate; 2) computationally efficient; 3) converges faster in practice. Downs: 1) what happens for $x < 0$?
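A minimal sketch of ReLU and its (sub)gradient (values are illustrative):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def relu_grad(x):
    return (x > 0).astype(float)   # 1 for x > 0, 0 for x < 0

x = np.array([-2.0, -0.5, 0.5, 2.0])
print(relu(x))        # negative inputs are clipped to 0
print(relu_grad(x))   # gradient is exactly 0 for x < 0: the "dead" regime
```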

  15. ReLU neuron killing

  16. Activation function 3: Leaky ReLU. $g(x) = \mathbb{1}(x < 0)\, \alpha x + \mathbb{1}(x \geq 0)\, x$, with $\frac{dg}{dx} = 1$ for $x > 0$ and $\alpha$ for $x < 0$. Ups: 1) does not saturate; 2) computationally efficient; 3) converges faster in practice; 4) keeps neurons alive!
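A sketch of the leaky variant; the slope $\alpha = 0.01$ is a common default assumed here for illustration:

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    return np.where(x >= 0, x, alpha * x)

def leaky_relu_grad(x, alpha=0.01):
    return np.where(x > 0, 1.0, alpha)

x = np.array([-2.0, -0.5, 0.5, 2.0])
print(leaky_relu(x))        # negative inputs are scaled, not zeroed
print(leaky_relu_grad(x))   # gradient is alpha (not 0) for x < 0
```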

  17. Activation function 4: Maxout. $g(\mathbf{x}) = \max(\mathbf{w}_1^T \mathbf{x},\ \mathbf{w}_2^T \mathbf{x})$. Ups: 1) does not saturate; 2) computationally efficient; 3) linear regime; 4) keeps neurons alive! 5) generalizes ReLU and leaky ReLU. Downs: 1) is not a dot product; 2) doubles the parameters.
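A sketch of a single maxout unit; note that with $\mathbf{w}_1 = \mathbf{0}$ it reduces to a plain ReLU, which is the sense in which it generalizes it (weights are illustrative):

```python
import numpy as np

# Maxout unit: the max of two linear responses.
# Two weight vectors per unit, hence double the parameters.
def maxout(w1, w2, x):
    return max(np.dot(w1, x), np.dot(w2, x))

x = np.array([0.5, -1.2, 3.0])
w1 = np.zeros(3)                  # with w1 = 0, maxout = max(0, w2.x):
w2 = np.array([0.8, 0.1, -0.4])   # this recovers a ReLU neuron
print(maxout(w1, w2, x))
```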

  18. Neural Networks: architecture

  19. Neural Networks: architecture. A 2-layer neural network (1 hidden layer) and a 3-layer neural network (2 hidden layers): the input layer is not counted.

  20. Neural Networks: architecture Number of neurons? Number of weights? Number of parameters?

  21. Neural Networks: architecture. Number of neurons: 4 + 2 = 6. Number of weights: 4×3 + 2×4 = 20. Number of parameters: 20 + 6 = 26 (one bias per neuron).

  22. Neural Networks: architecture. 2-layer network: neurons 4 + 2 = 6, weights 4×3 + 2×4 = 20, parameters 20 + 6 = 26. 3-layer network: neurons 4 + 4 + 1 = 9, weights 4×3 + 4×4 + 1×4 = 32, parameters 32 + 9 = 41.
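A small helper that reproduces these counts for any fully connected architecture (a sketch; the function name is ours, not from the slides):

```python
# Counting neurons, weights, and parameters for a fully connected
# network given its layer sizes.
def count_params(layer_sizes):
    """layer_sizes = [inputs, hidden..., outputs]."""
    neurons = sum(layer_sizes[1:])                  # inputs are not neurons
    weights = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
    return neurons, weights, weights + neurons      # one bias per neuron

print(count_params([3, 4, 2]))      # (6, 20, 26): the 2-layer network
print(count_params([3, 4, 4, 1]))   # (9, 32, 41): the 3-layer network
```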

  23. Neural Networks: architecture. Modern CNNs: ~10 million artificial neurons. Human visual cortex: ~5 billion neurons.

  24. ANN representation. Each neuron $j$ of layer $l$ has its own weight vector $\mathbf{w}_{l,j} = (w_{0,j}, w_{1,j}, \ldots, w_{4,j})^T$. Stacking the samples as rows, with a leading 1 per sample to absorb the bias, gives the data matrix $X = \begin{pmatrix} 1 & x_{11} & \cdots & x_{13} \\ 1 & x_{21} & \cdots & x_{23} \\ 1 & x_{31} & \cdots & x_{33} \\ \vdots & \vdots & & \vdots \\ 1 & x_{N1} & \cdots & x_{N3} \end{pmatrix}$.

  25. ANN representation. Collecting the weight vectors of layer 1 as the rows of a matrix $W_1$, and those of layer 2 as the rows of $W_2$, the whole network becomes matrix products wrapped in activation functions: $\mathbf{a}_1 = g_1(W_1 X)$ and $\mathbf{z} = g_2(W_2 \mathbf{a}_1) = g_2\big(W_2\, g_1(W_1 X)\big)$.
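A sketch of this matrix form in NumPy; the samples are stored as columns (the transpose of the stacked-rows matrix above) so the products conform, and all sizes are illustrative:

```python
import numpy as np

# z = g2(W2 @ g1(W1 @ X)), with a bias row of ones prepended to the data.
rng = np.random.default_rng(0)
N, d = 5, 3                               # samples, input features
X = np.vstack([np.ones(N), rng.standard_normal((d, N))])   # (1+d, N)

W1 = rng.standard_normal((4, 1 + d))      # 4 hidden neurons
W2 = rng.standard_normal((2, 4))          # 2 output neurons

a1 = np.tanh(W1 @ X)                      # g1: hidden activations, (4, N)
z = 1.0 / (1.0 + np.exp(-(W2 @ a1)))      # g2: sigmoid outputs, (2, N)
print(z.shape)                            # (2, 5): one column per sample
```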

  26. ANN becoming popular

  27. ANN training: forward flow. Define a loss function $L(W; y)$. Each neuron computes $a_k = \sum_j w_{kj} z_j$ and passes $z_k = g(a_k)$ to the following layer.

  28. ANN training: backward flow. Need to compute $\frac{\partial L}{\partial w_{kj}} = \frac{\partial L}{\partial a_k} \frac{\partial a_k}{\partial w_{kj}} = \delta_k z_j$. Considering that for the output neurons $\delta_k = \hat{y}_k - y_k$, we get for the hidden neurons $\delta_j = g'(a_j) \sum_k w_{kj} \delta_k$.

  29. ANN training: backpropagation. Updating all weights: $w_{kj}^{(t+1)} = w_{kj}^{(t)} - \eta\, \delta_k z_j$.
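A one-layer sketch of this update rule, with illustrative shapes and a made-up learning rate:

```python
import numpy as np

# One backprop update for a single fully connected layer:
# dL/dw_kj = delta_k * z_j, then a gradient-descent step of size eta.
rng = np.random.default_rng(1)
W = rng.standard_normal((2, 4))     # weights w_kj of one layer
z = rng.standard_normal(4)          # activations z_j from the layer below
delta = rng.standard_normal(2)      # delta_k backpropagated from above

grad_W = np.outer(delta, z)         # grad_W[k, j] = delta_k * z_j
eta = 0.1
W -= eta * grad_W                   # w_kj <- w_kj - eta * delta_k * z_j
print(W.shape)
```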

  30. ANN training example: forward. Take $g(a) = \tanh(a)$ and the loss $L_n = \frac{1}{2} \sum_{k=1}^{K} (\hat{y}_k - y_k)^2$. Forward pass: $a_j = \sum_{i=0}^{d} w_{ji}^{(1)} x_i$, $z_j = \tanh(a_j)$, $\hat{y}_k = \sum_{j=1}^{M} w_{kj}^{(2)} z_j$.

  31. ANN training example: backward. $\delta_k = \hat{y}_k - y_k$ and $\delta_j = (1 - z_j^2) \sum_{k=1}^{K} w_{kj}^{(2)} \delta_k$ (since $\tanh'(a) = 1 - \tanh^2(a)$). The gradients are $\frac{\partial L_n}{\partial w_{ji}^{(1)}} = \delta_j x_i$ and $\frac{\partial L_n}{\partial w_{kj}^{(2)}} = \delta_k z_j$.
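Putting slides 27–31 together, a self-contained sketch of the worked example: a one-hidden-layer tanh network with squared-error loss, trained by exactly these formulas (architecture, data, and learning rate are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
d, M, K = 3, 4, 2                            # inputs, hidden units, outputs
W1 = 0.1 * rng.standard_normal((M, d + 1))   # +1 column for the bias
W2 = 0.1 * rng.standard_normal((K, M))

x = np.array([1.0, 0.5, -1.2, 3.0])    # input with leading 1 for the bias
y = np.array([0.3, -0.7])              # target

eta = 0.05
for step in range(200):
    # forward: a_j = sum_i W1_ji x_i ; z_j = tanh(a_j) ; yhat_k = sum_j W2_kj z_j
    a = W1 @ x
    z = np.tanh(a)
    yhat = W2 @ z

    # backward: delta_k = yhat_k - y_k ; delta_j = (1 - z_j^2) sum_k W2_kj delta_k
    delta_out = yhat - y
    delta_hid = (1.0 - z ** 2) * (W2.T @ delta_out)

    # updates: dL/dW2_kj = delta_k z_j and dL/dW1_ji = delta_j x_i
    W2 -= eta * np.outer(delta_out, z)
    W1 -= eta * np.outer(delta_hid, x)

loss = 0.5 * np.sum((yhat - y) ** 2)
print(loss)   # should be close to 0 after training
```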

  32. What can ANN represent?

  33. What can ANN classify?

  34. Regularization
