Learning From Data, Lecture 20: Multilayer Perceptron
Multiple layers; Universal Approximation; The Neural Network
M. Magdon-Ismail, CSCI 4100/6100


  1. Learning From Data, Lecture 20: Multilayer Perceptron. Multiple layers; universal approximation; the neural network. M. Magdon-Ismail, CSCI 4100/6100.

  2. Recap: Unsupervised Learning. k-means clustering gives a ‘hard’ partition into k clusters; the Gaussian mixture model gives a ‘soft’ probability density estimate P(x). [Figure: a k-means partition of the data alongside a Gaussian mixture density P(x).]

  3. The Neural Network: Biologically Inspired. Engineering success may start with biological inspiration, but then take a totally different path.

  4. Planes Don’t Flap Wings to Fly. Engineering success may start with biological inspiration, but then take a totally different path.

  5. xor: A Limitation of the Linear Model. [Figure: the four xor points in the (x1, x2) plane, labeled +1 and −1 so that no single linear separator classifies all four correctly.]

  6. Decomposing xor: f = h1 h̄2 + h̄1 h2 (f is +1 exactly when h1 and h2 disagree), where h1(x) = sign(w1ᵀx) and h2(x) = sign(w2ᵀx) are two linear separators. [Figure: the xor regions obtained by overlaying the two linear separators h1 and h2.]

  7. Perceptrons for or and and: or(x1, x2) = sign(x1 + x2 + 1.5) and and(x1, x2) = sign(x1 + x2 − 1.5). [Figure: the two perceptrons drawn as networks, each with inputs 1, x1, x2; weights (1.5, 1, 1) for or and (−1.5, 1, 1) for and.]
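To make the slide's two formulas concrete, here is a minimal Python check (my own sketch, not from the lecture) that evaluates the or and and perceptrons on all four ±1 input pairs:

```python
import numpy as np

def sign(s):
    # the threshold used throughout the lecture; sign(0) taken as +1 here
    return np.where(s >= 0, 1, -1)

def perceptron_or(x1, x2):
    # or(x1, x2) = sign(x1 + x2 + 1.5): -1 only when both inputs are -1
    return sign(x1 + x2 + 1.5)

def perceptron_and(x1, x2):
    # and(x1, x2) = sign(x1 + x2 - 1.5): +1 only when both inputs are +1
    return sign(x1 + x2 - 1.5)

for x1 in (-1, 1):
    for x2 in (-1, 1):
        print(x1, x2, perceptron_or(x1, x2), perceptron_and(x1, x2))
```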

  8. Representing f Using or and and: f = h1 h̄2 + h̄1 h2 = or(and(h1, h̄2), and(h̄1, h2)). [Figure: h1, h2 and their negations feed two and nodes, whose outputs feed an or node (bias 1.5) that outputs f.]

  9. Representing f Using or and and: the and nodes expanded into perceptrons. [Figure: the same network with each and realized as a perceptron with weights ±1 on h1, h2 and bias −1.5, and the output or realized with weights 1, 1 and bias 1.5.]

  10. Representing f Using or and and: h1 and h2 expanded into perceptrons. [Figure: the full network; the first layer computes w1ᵀx and w2ᵀx from inputs 1, x1, x2, the second layer computes the two and terms, and the output node computes the or, giving f.]

  11. The Multilayer Perceptron (MLP). Every node in the network is itself a perceptron, sign(wᵀx) = sign(w0 + w1 x1 + w2 x2), applied to the outputs of the previous layer. More layers allow us to implement f; these additional layers are called hidden layers. [Figure: the three-layer xor network, with a zoomed view of a single perceptron node.]
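Putting slides 6-10 together, the whole xor network can be written out in a few lines. A sketch, where the choice h1 = sign(x1), h2 = sign(x2) is my own illustrative pick of two linear separators (the slides leave w1, w2 generic):

```python
import numpy as np

def sign(s):
    return np.where(s >= 0, 1, -1)

def mlp_xor(x1, x2):
    # hidden layer 1: two linear separators (illustrative choice)
    h1 = sign(x1)
    h2 = sign(x2)
    # hidden layer 2: the two and-terms of f = h1*h2bar + h1bar*h2;
    # for +/-1 values, "not h" is just -h
    a1 = sign(h1 - h2 - 1.5)    # and(h1, not h2)
    a2 = sign(-h1 + h2 - 1.5)   # and(not h1, h2)
    # output layer: or of the two and-terms
    return sign(a1 + a2 + 1.5)

for x1 in (-1, 1):
    for x2 in (-1, 1):
        print(x1, x2, mlp_xor(x1, x2))   # +1 exactly when x1 != x2
```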

  12. Universal Approximation. Any target function f that can be decomposed into linear separators can be implemented by a 3-layer MLP.

  13. Universal Approximation. A sufficiently smooth separator can “essentially” be decomposed into linear separators. [Figure: a smooth closed target boundary separating + from − points, approximated by 8 perceptrons and, more closely, by 16 perceptrons.]
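One concrete way to read the figure: a disk-shaped region can be over-approximated by intersecting m half-planes, i.e. an and of m perceptrons, and the fit tightens as m grows. A sketch under the assumption that the target is a disk of radius r (the slide's target is only drawn, not specified):

```python
import numpy as np

def polygon_region(x, m, r=1.0):
    # Intersect m half-planes u_i^T x <= r, with unit normals u_i spread
    # evenly around the circle: an and of m perceptrons, giving a regular
    # m-gon that contains the disk ||x|| <= r (m = 8 or 16 as in the figure).
    angles = 2 * np.pi * np.arange(m) / m
    U = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # (m, 2) normals
    return 1 if np.all(U @ x <= r) else -1

print(polygon_region(np.array([0.5, 0.5]), m=8))   # inside the disk: +1
print(polygon_region(np.array([1.2, 0.0]), m=8))   # outside: -1
```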

  14. Approximation Versus Generalization. The size of the MLP controls the approximation-generalization tradeoff: more nodes per hidden layer ⟹ approximation ↑ and generalization ↓.

  15. Minimizing E_in. A combinatorial problem, even harder for the MLP than for the perceptron: E_in is not smooth (due to the sign function), so we cannot use gradient descent. Approximating sign(x) ≈ tanh(x) makes E_in smooth, and gradient descent can then be used to minimize it.
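A minimal sketch of why the substitution helps (my own assumptions: squared error and a single tanh unit; the lecture only states the sign → tanh idea):

```python
import numpy as np

def ein_and_grad(w, X, y):
    # Smooth surrogate: E_in(w) = mean (tanh(x_n^T w) - y_n)^2.
    # Unlike with sign, this is differentiable everywhere.
    s = X @ w
    t = np.tanh(s)
    e = t - y
    grad = 2.0 * X.T @ (e * (1.0 - t ** 2)) / len(y)
    return np.mean(e ** 2), grad

rng = np.random.default_rng(0)
X = np.c_[np.ones(20), rng.normal(size=(20, 2))]  # bias column + two features
y = np.sign(X[:, 1] - X[:, 2])                    # a linearly separable target
w = np.zeros(3)
for _ in range(200):                              # plain gradient descent
    E, g = ein_and_grad(w, X, y)
    w -= 0.5 * g
print(E, np.mean(np.sign(X @ w) != y))  # surrogate error and fraction misclassified
```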

  16. The Neural Network. [Figure: the general network; the input layer ℓ = 0 holds 1, x1, ..., xd, each hidden-layer node (0 < ℓ < L) applies the transfer function θ(s) to its input signal s, and the output layer ℓ = L produces h(x).]

  17. Zooming into a Hidden Node. Layers are indexed ℓ = 0, 1, 2, ..., L; layer ℓ has “dimension” d(ℓ), i.e. d(ℓ) + 1 nodes counting the constant node. Going into layer ℓ: the d(ℓ)-dimensional vector of input signals s(ℓ), and the weight matrix W(ℓ) of dimension (d(ℓ−1) + 1) × d(ℓ), whose columns are the weight vectors w1(ℓ), ..., wd(ℓ)(ℓ). Coming out of layer ℓ: the (d(ℓ) + 1)-dimensional output vector x(ℓ), feeding the outgoing weight matrix W(ℓ+1) of dimension (d(ℓ) + 1) × d(ℓ+1). Each layer thus computes s(ℓ) = (W(ℓ))ᵀ x(ℓ−1) and x(ℓ) = [1; θ(s(ℓ))]. [Figure: layers ℓ − 1, ℓ, ℓ + 1, with W(ℓ) and θ mapping x(ℓ−1) to s(ℓ) to x(ℓ), and W(ℓ+1) carrying x(ℓ) forward.]
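This bookkeeping maps directly to a forward pass. A minimal sketch following the slide's conventions (W(ℓ) has shape (d(ℓ−1) + 1) × d(ℓ), and each x(ℓ) prepends the constant node 1); the code itself is mine, not from the lecture:

```python
import numpy as np

def forward(x, Ws, theta=np.tanh):
    # Ws[ell - 1] is W^(ell), a (d^(ell-1) + 1) x d^(ell) matrix, ell = 1..L
    x_l = np.concatenate(([1.0], x))               # x^(0): prepend constant node
    for W in Ws:
        s_l = W.T @ x_l                            # s^(ell) = (W^(ell))^T x^(ell-1)
        x_l = np.concatenate(([1.0], theta(s_l)))  # x^(ell) = [1, theta(s^(ell))]
    return x_l[1]                                  # h(x): the single output node

# Example: a 2-3-1 network, i.e. d^(0) = 2, d^(1) = 3, d^(2) = 1
rng = np.random.default_rng(0)
Ws = [rng.normal(size=(3, 3)), rng.normal(size=(4, 1))]
print(forward(np.array([0.5, -1.0]), Ws))
```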

  18. The Neural Network: Biology → Engineering. [Figure: the general network diagram again: input layer ℓ = 0 with 1, x1, ..., xd; hidden layers 0 < ℓ < L with transfer function θ; output layer ℓ = L producing h(x).]
