SLIDE 1

Backpropagation and Gradient Descent

Brian Carignan, Dec 5 2016

SLIDE 2

Overview

▪ Notation/background
  | Neural networks
  | Activation functions
  | Vectorization
  | Cost functions
▪ Introduction
▪ Algorithm Overview
▪ Four fundamental equations
  | Definitions (all 4) and proofs (1 and 2)
▪ Example from thesis related work

SLIDE 3

Neural Networks 1

SLIDE 4

Neural Networks 2

▪ a – activation of a neuron; related to the activations in the previous layer
▪ b – bias of a neuron
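The relation on the slide was an image; in the notation of Nielsen (2015), which this deck follows, the activation of neuron j in layer l reads:

  a^l_j = \sigma\Big( \sum_k w^l_{jk} a^{l-1}_k + b^l_j \Big)

where w^l_{jk} is the weight from neuron k in layer l-1 to neuron j in layer l, and \sigma is the activation function.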

SLIDE 5

Activation Functions

▪ Similar to an ON/OFF switch
▪ Required properties
  | Nonlinear
  | Continuously differentiable
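As a concrete example, the sigmoid (whose derivative appears later on slide 13) satisfies both properties:

  \sigma(z) = \frac{1}{1 + e^{-z}}, \qquad \sigma'(z) = \sigma(z)\,(1 - \sigma(z))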

SLIDE 6

Vectorization

▪ Represent each layer as a vector
  | Simplifies notation
  | Leads to faster computation by exploiting vector math
▪ z – weighted input vector
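In vectorized form (again Nielsen's notation), the weighted input and activation of layer l are:

  z^l = w^l a^{l-1} + b^l, \qquad a^l = \sigma(z^l)

with the activation function applied elementwise.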

SLIDE 7

Cost Function

▪ Objective function
▪ Optimization problem
▪ Assumptions
  | Can be written as an average over per-example costs Cx
  | Is a function of the network's outputs
▪ x – individual training examples (fixed)
▪ Example:
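The example on the slide was an image; a standard cost satisfying both assumptions, and the one Nielsen (2015) uses in this context, is the quadratic cost, shown here as a plausible reconstruction:

  C = \frac{1}{2n} \sum_x \lVert y(x) - a^L(x) \rVert^2, \qquad C_x = \frac{1}{2} \lVert y(x) - a^L(x) \rVert^2

where y(x) is the desired output and a^L(x) the network's output for training example x.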

SLIDE 8

Introduction

▪ Backpropagation
  | Backward propagation of errors
  | Calculates gradients
  | One way to train neural networks
▪ Gradient Descent
  | Optimization method
  | Finds a local minimum
  | Takes steps proportional to the negative of the gradient at the current point
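Written out, one gradient-descent step with learning rate \eta updates every weight and bias as:

  w \rightarrow w' = w - \eta \frac{\partial C}{\partial w}, \qquad b \rightarrow b' = b - \eta \frac{\partial C}{\partial b}

Backpropagation supplies these partial derivatives.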

SLIDE 9

Algorithm Overview
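The slide itself was a diagram; below is a minimal NumPy sketch of the algorithm, assuming sigmoid activations and quadratic cost as in Nielsen (2015). The function names (backprop, gradient_descent_step) and shapes are illustrative, not from the slides; comments flag where the four fundamental equations (BP1–BP4, defined on the next slides) enter.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def sigmoid_prime(z):
        s = sigmoid(z)
        return s * (1.0 - s)

    def backprop(weights, biases, x, y):
        """One forward/backward pass for a single example (x, y).
        weights[l] maps layer l to layer l+1; quadratic cost assumed."""
        # Forward pass: store weighted inputs z and activations a per layer.
        a, activations, zs = x, [x], []
        for w, b in zip(weights, biases):
            z = w @ a + b
            zs.append(z)
            a = sigmoid(z)
            activations.append(a)
        # BP1: output-layer error, with nabla_a C = (a^L - y) for quadratic cost.
        delta = (activations[-1] - y) * sigmoid_prime(zs[-1])
        nabla_b = [np.zeros_like(b) for b in biases]
        nabla_w = [np.zeros_like(w) for w in weights]
        nabla_b[-1] = delta                             # BP3
        nabla_w[-1] = np.outer(delta, activations[-2])  # BP4
        # BP2: push the error backwards through the transposed weight matrices.
        for l in range(2, len(weights) + 1):
            delta = (weights[-l + 1].T @ delta) * sigmoid_prime(zs[-l])
            nabla_b[-l] = delta                              # BP3
            nabla_w[-l] = np.outer(delta, activations[-l - 1])  # BP4
        return nabla_w, nabla_b

    def gradient_descent_step(weights, biases, batch, eta):
        """One gradient-descent step using the average gradient over a batch."""
        grad_w = [np.zeros_like(w) for w in weights]
        grad_b = [np.zeros_like(b) for b in biases]
        for x, y in batch:
            nabla_w, nabla_b = backprop(weights, biases, x, y)
            grad_w = [gw + nw for gw, nw in zip(grad_w, nabla_w)]
            grad_b = [gb + nb for gb, nb in zip(grad_b, nabla_b)]
        # Step proportional to the negative gradient (slide 8).
        weights = [w - (eta / len(batch)) * gw for w, gw in zip(weights, grad_w)]
        biases = [b - (eta / len(batch)) * gb for b, gb in zip(biases, grad_b)]
        return weights, biases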

SLIDE 10

Equation 1

▪ Definition of error:
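The formulas on the slide were images; in Nielsen's notation, the error of neuron j in layer l is defined as

  \delta^l_j \equiv \frac{\partial C}{\partial z^l_j}

and the first fundamental equation gives the error in the output layer L:

  \delta^L = \nabla_a C \odot \sigma'(z^L) \qquad \text{(BP1)}

where \odot denotes the elementwise (Hadamard) product.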

SLIDE 11

Equation 2

▪ Key difference
  | Transpose of the weight matrix
▪ Pushes the error backwards
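In Nielsen's notation, the second fundamental equation expresses the error of layer l in terms of the error of layer l+1:

  \delta^l = \big( (w^{l+1})^T \delta^{l+1} \big) \odot \sigma'(z^l) \qquad \text{(BP2)}

Applying the transposed weight matrix is what moves the error one layer back through the network.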

SLIDE 12

Equation 3

▪ Note that the previous equations computed the error; this equation relates it to the cost gradient
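The third fundamental equation states that the gradient of the cost with respect to any bias equals the error of that neuron:

  \frac{\partial C}{\partial b^l_j} = \delta^l_j \qquad \text{(BP3)}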

SLIDE 13

Equation 4

▪ Describes the rate at which weights learn
▪ General insights
  | Learning is slow when:
  | the input activation approaches 0
  | the output activation approaches 0 or 1 (from the derivative of the sigmoid)
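The fourth fundamental equation gives the gradient of the cost with respect to any weight:

  \frac{\partial C}{\partial w^l_{jk}} = a^{l-1}_k \, \delta^l_j \qquad \text{(BP4)}

The factor a^{l-1}_k explains the slow learning for small input activations, and the \sigma'(z^l_j) inside \delta^l_j vanishes when the sigmoid output saturates near 0 or 1.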

SLIDE 14

Proof – Equation 1

▪ Steps (sketched below)
  • 1. Definition of error
  • 2. Chain rule
  • 3. k = j
  • 4. BP1 (components)
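The slide's algebra was an image; a reconstruction of the four steps in Nielsen's notation:

  \delta^L_j = \frac{\partial C}{\partial z^L_j}                                              (1. definition of error)
             = \sum_k \frac{\partial C}{\partial a^L_k} \frac{\partial a^L_k}{\partial z^L_j} (2. chain rule)
             = \frac{\partial C}{\partial a^L_j} \frac{\partial a^L_j}{\partial z^L_j}        (3. only the k = j term survives)
             = \frac{\partial C}{\partial a^L_j} \, \sigma'(z^L_j)                            (4. BP1 in components)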

SLIDE 15

Proof – Equation 2

▪ Steps (sketched below)
  • 1. Definition of error
  • 2. Chain rule
  • 3. Substitute definition of error
  • 4. Derivative of weighted input vector
  • 5. BP2 (components)

▪ Recall: z^{l+1}_k = \sum_j w^{l+1}_{kj} \, \sigma(z^l_j) + b^{l+1}_k
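A reconstruction of the five steps in Nielsen's notation, using the recalled expression for z^{l+1}_k:

  \delta^l_j = \frac{\partial C}{\partial z^l_j}                                                     (1. definition of error)
             = \sum_k \frac{\partial C}{\partial z^{l+1}_k} \frac{\partial z^{l+1}_k}{\partial z^l_j} (2. chain rule)
             = \sum_k \delta^{l+1}_k \frac{\partial z^{l+1}_k}{\partial z^l_j}                        (3. substitute definition of error)
             = \sum_k \delta^{l+1}_k \, w^{l+1}_{kj} \, \sigma'(z^l_j)                                (4. differentiate z^{l+1}_k)

Step 5 recognizes this as the component form of BP2, \delta^l = ((w^{l+1})^T \delta^{l+1}) \odot \sigma'(z^l).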

SLIDE 16

Example – Thesis Related Work

SLIDE 17

References

▪ Michael A. Nielsen, "Neural Networks and Deep Learning", Determination Press, 2015.
▪ Antoine Bordes et al., "Translating Embeddings for Modeling Multi-Relational Data", NIPS'13, 2013.
 

