Lecture 4: Backpropagation and Neural Networks part 1


  1. Lecture 4: Backpropagation and Neural Networks part 1. Fei-Fei Li & Andrej Karpathy & Justin Johnson. Lecture 4, 13 Jan 2016.

  2. Administrative: A1 is due Jan 20 (Wednesday), ~150 hours left. Warning: Jan 18 (Monday) is a holiday (no class/office hours). Also note: lectures are non-exhaustive; read the course notes for completeness. I’ll hold make-up office hours on Wed Jan 20, 5pm @ Gates 259.

  3. Where we are... scores function, SVM loss, data loss + regularization, want
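
The formulas on this slide were images and did not survive extraction. As a hedged reconstruction, assuming the usual multiclass SVM setup with a linear score function and a regularization term (the symbols $s$, $L_i$, $y_i$, $N$, $\lambda$ and $R(W)$ are assumptions, not taken from the slide text), the four labels would correspond roughly to:

```latex
\begin{aligned}
s &= f(x; W) = Wx && \text{(scores function)} \\
L_i &= \sum_{j \neq y_i} \max\bigl(0,\; s_j - s_{y_i} + 1\bigr) && \text{(SVM loss)} \\
L &= \frac{1}{N} \sum_{i=1}^{N} L_i + \lambda R(W) && \text{(data loss + regularization)} \\
\text{want: } & \nabla_W L
\end{aligned}
```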

  4. Optimization (image credits to Alec Radford)

  5. Gradient Descent
       Numerical gradient: slow :(, approximate :(, easy to write :)
       Analytic gradient: fast :), exact :), error-prone :(
       In practice: derive the analytic gradient, then check your implementation against the numerical gradient.
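
The "in practice" advice on this slide can be sketched as a small gradient check. This is only an illustration: the toy loss, the helper names (`numerical_gradient`, `relative_error`, `analytic_gradient`) and the step size `h` are all assumptions, not something from the lecture.

```python
def numerical_gradient(f, x, h=1e-5):
    """Centered finite-difference estimate of the gradient of f at x (list of floats)."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad


def relative_error(a, b):
    """Largest element-wise relative error between two gradient vectors."""
    return max(abs(p - q) / max(abs(p) + abs(q), 1e-12) for p, q in zip(a, b))


def loss(x):
    # Hypothetical toy loss: f(x) = x0^2 + 3 * x0 * x1
    return x[0] ** 2 + 3 * x[0] * x[1]


def analytic_gradient(x):
    # Derived by hand: df/dx0 = 2*x0 + 3*x1, df/dx1 = 3*x0
    return [2 * x[0] + 3 * x[1], 3 * x[0]]


point = [1.5, -2.0]
num = numerical_gradient(loss, point)   # slow, approximate, easy to write
ana = analytic_gradient(point)          # fast, exact, error-prone
err = relative_error(num, ana)
print("numerical:", num)
print("analytic: ", ana)
print("relative error:", err)
```

A small relative error suggests the hand-derived gradient is implemented correctly; a large one usually points to a bug in the analytic version.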

  6. Computational Graph [figure: nodes x, W, *, s (scores), hinge loss, R, +, L]

  7. Convolutional Network (AlexNet) [figure labels: input image, weights, loss]

  8. Neural Turing Machine [figure labels: input tape, loss]

  9. Neural Turing Machine

  10. e.g. x = -2, y = 5, z = -4

  11-18. e.g. x = -2, y = 5, z = -4. Want:

  19. e.g. x = -2, y = 5, z = -4. Chain rule: Want:

  20. e.g. x = -2, y = 5, z = -4. Want:

  21. e.g. x = -2, y = 5, z = -4. Chain rule: Want:
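
The function for this example was an image and is not in the extracted text. As a sketch only, assuming it is f(x, y, z) = (x + y) * z (an assumption; the intermediate name `q` is also hypothetical), the forward pass and the chain-rule backward pass for the given inputs could look like this:

```python
# Assumed function (not recoverable from the slide text): f(x, y, z) = (x + y) * z
x, y, z = -2.0, 5.0, -4.0

# Forward pass: compute the intermediate value, then the output
q = x + y          # q = 3
f = q * z          # f = -12

# Backward pass: start at the output and apply the chain rule toward the inputs
df_df = 1.0            # gradient of the output with respect to itself
df_dq = z * df_df      # multiply gate: d(q*z)/dq = z
df_dz = q * df_df      # multiply gate: d(q*z)/dz = q
df_dx = 1.0 * df_dq    # add gate: dq/dx = 1, chained with df/dq
df_dy = 1.0 * df_dq    # add gate: dq/dy = 1, chained with df/dq

print(f, df_dx, df_dy, df_dz)
```

Under that assumption the "want" quantities are df/dx, df/dy and df/dz, each obtained by multiplying a local derivative by the gradient arriving from the node above it.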

  22. [figure: gate f; labels: activations]

  23. [figure: gate f; labels: activations, “local gradient”]

  24-27. [figure: gate f; labels: activations, “local gradient”, gradients]
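
The pattern in these slides (a gate sees its activations on the forward pass, then multiplies its "local gradient" by the gradient flowing in from above) can be sketched with two tiny gate classes. The class names, the `forward`/`backward` method names and the input values are assumptions made for illustration.

```python
class MultiplyGate:
    """z = x * y. Caches its input activations for use in the backward pass."""

    def forward(self, x, y):
        self.x, self.y = x, y
        return x * y

    def backward(self, dz):
        # dz: gradient of the final output with respect to this gate's output
        dx = self.y * dz   # local gradient dz/dx = y
        dy = self.x * dz   # local gradient dz/dy = x
        return dx, dy


class AddGate:
    """z = x + y. Both local gradients are 1, so the gradient passes through."""

    def forward(self, x, y):
        return x + y

    def backward(self, dz):
        return 1.0 * dz, 1.0 * dz


# Usage with hypothetical values: out = (a + b) * c
add, mul = AddGate(), MultiplyGate()
a, b, c = 1.0, 2.0, 4.0
s = add.forward(a, b)
out = mul.forward(s, c)

ds, dc = mul.backward(1.0)   # gradient of out with respect to itself is 1
da, db = add.backward(ds)
print(out, da, db, dc)
```

Each gate only needs its own inputs and the incoming gradient, which is what lets the same two methods be reused in arbitrarily large graphs.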

  28-36. Another example:

  37. Another example: (-1) * (-0.20) = 0.20

  38. Another example:

  39. Another example: [local gradient] x [its gradient]: [1] x [0.2] = 0.2 and [1] x [0.2] = 0.2 (both inputs!)

  40. Another example:

  41. Another example: [local gradient] x [its gradient]
       x0: [2] x [0.2] = 0.4
       w0: [-1] x [0.2] = -0.2
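
The numbers that survive on these slides are consistent with backpropagating through a small sigmoid-style neuron, 1 / (1 + exp(-(w0*x0 + w1*x1 + w2))). The sketch below assumes that form. Only w0 = 2 and x0 = -1 are implied by the slide text (they appear as the local gradients for x0 and w0); the values of w1, x1 and w2 and all intermediate variable names are assumptions chosen so that the output comes out near 0.73.

```python
import math

# Implied by the slide's local gradients: w0 = 2, x0 = -1
w0, x0 = 2.0, -1.0
# Assumed values for illustration (not in the extracted text)
w1, x1 = -3.0, -2.0
w2 = -3.0

# Forward pass, one primitive gate at a time
a = w0 * x0          # -2
b = w1 * x1          #  6
c = a + b            #  4
d = c + w2           #  1
e = -d               # -1   (the "*-1" gate)
g = math.exp(e)      # about 0.37
h = g + 1.0          # about 1.37
f = 1.0 / h          # about 0.73

# Backward pass: [local gradient] x [gradient from above] at every gate
df = 1.0
dh = (-1.0 / h ** 2) * df   # 1/x gate
dg = 1.0 * dh               # +1 gate
de = math.exp(e) * dg       # exp gate, about -0.20
dd = -1.0 * de              # *-1 gate: (-1) * (-0.20) = 0.20
dc = 1.0 * dd               # add gate: [1] x [0.2] = 0.2 ...
dw2 = 1.0 * dd              # ... for both inputs
da = 1.0 * dc
db = 1.0 * dc
dx0 = w0 * da               # [2] x [0.2], about 0.4
dw0 = x0 * da               # [-1] x [0.2], about -0.2
dx1 = w1 * db
dw1 = x1 * db

print(round(f, 2), round(dd, 2), round(dw0, 2), round(dx0, 2))
```

The slides round the intermediate gradient to 0.2 before multiplying, which is why they show 0.4 for x0 while the unrounded value is slightly smaller.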

  42. sigmoid function; sigmoid gate

  43. sigmoid function; sigmoid gate: (0.73) * (1 - 0.73) = 0.2
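
The arithmetic on this slide uses the identity that the derivative of the sigmoid is sigmoid(x) * (1 - sigmoid(x)), so several primitive gates can be collapsed into one "sigmoid gate". A minimal sketch follows; the class and method names are hypothetical, and the input 1.0 is an assumed value picked because it gives an output near 0.73.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class SigmoidGate:
    """Single gate computing sigmoid(x); caches its output for the backward pass."""

    def forward(self, x):
        self.out = sigmoid(x)
        return self.out

    def backward(self, dout):
        # Local gradient of the sigmoid: (1 - sigmoid(x)) * sigmoid(x)
        return (1.0 - self.out) * self.out * dout


gate = SigmoidGate()
out = gate.forward(1.0)   # assumed input; output is about 0.73
dx = gate.backward(1.0)   # about (0.73) * (1 - 0.73) = 0.2
print(round(out, 2), round(dx, 2))
```

Grouping gates this way is a design choice: any sub-graph can be treated as one gate as long as its local gradient can be written down.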
