
SLIDE 1

Deep Learning for Classification

CS293S, Yang, 2017

SLIDE 2

Computational graph for classification

[Diagram: a single linear unit. The features f1, f2, f3 are multiplied by weights w1, w2, w3, summed ($\Sigma$), and thresholded (>0?) to produce the predicted label.]

  • Objective: classification accuracy

$$l_{\text{acc}}(w) = \frac{1}{m} \sum_{i=1}^{m} \mathbf{1}\left[\operatorname{sign}\left(w^\top f(x^{(i)})\right) = y^{(i)}\right]$$

– Issue: how do we find these parameters?
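
As a concrete illustration (not from the slides), the objective in NumPy; the array names are my own:

    import numpy as np

    def l_acc(w, X, y):
        # X: (m, d) matrix whose rows are the feature vectors f(x^(i));
        # y: (m,) labels in {-1, +1}; w: (d,) weight vector.
        preds = np.sign(X @ w)          # sign(w^T f(x^(i)))
        return np.mean(preds == y)      # fraction of correct predictions

Note that l_acc is piecewise constant in w, so its gradient is zero almost everywhere; this is why the next slide replaces it with a differentiable soft-max objective.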

SLIDE 3

Neural Net with Soft-Max

  • Score for y = 1: $w^\top f(x)$. Score for y = -1: $-w^\top f(x)$
  • Probability of label:

$$p(y = 1 \mid f(x); w) = \frac{e^{w^\top f(x)}}{e^{w^\top f(x)} + e^{-w^\top f(x)}} \qquad p(y = -1 \mid f(x); w) = \frac{e^{-w^\top f(x)}}{e^{w^\top f(x)} + e^{-w^\top f(x)}}$$

  • Objective:

$$l(w) = \prod_{i=1}^{m} p(y = y^{(i)} \mid f(x^{(i)}); w)$$

  • Log:

$$ll(w) = \sum_{i=1}^{m} \log p(y = y^{(i)} \mid f(x^{(i)}); w)$$
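
A minimal NumPy sketch of these formulas (illustrative only; the function and argument names are my own):

    import numpy as np

    def p_label(w, f_x, label):
        # p(y = label | f(x); w) for label in {-1, +1}
        s = w @ f_x                                   # score w^T f(x)
        return np.exp(label * s) / (np.exp(s) + np.exp(-s))

    def log_likelihood(w, X, y):
        # ll(w) = sum_i log p(y = y^(i) | f(x^(i)); w)
        return sum(np.log(p_label(w, f_x, y_i)) for f_x, y_i in zip(X, y))

Maximizing ll(w) is equivalent to maximizing l(w), since log is monotone, and the sum is far better behaved numerically than the product.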

SLIDE 4

Two-Layer Neural Network

[Diagram: three hidden units, each computing a weighted sum ($\Sigma$) of the features f1, f2, f3 with its own weights $w_1^{(k)}, w_2^{(k)}, w_3^{(k)}$ for k = 1, 2, 3, followed by a nonlinearity (>0?); their outputs feed a final unit with weights w1, w2, w3.]

Hidden-unit nonlinearity:

$$z \rightarrow \tanh(z) = \frac{e^z - e^{-z}}{e^z + e^{-z}}$$
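
A sketch of the forward pass this diagram describes, with tanh as the hidden nonlinearity; the shapes and names are assumptions for illustration:

    import numpy as np

    def two_layer_score(f, W_hidden, w_out):
        # f: (3,) features; W_hidden: (3, 3), row k holding w_1^(k)..w_3^(k);
        # w_out: (3,) weights of the final unit.
        h = np.tanh(W_hidden @ f)   # hidden unit k: tanh(sum_j w_j^(k) f_j)
        return w_out @ h            # final score: sum_k w_k h_k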

SLIDE 5

N-Layer Neural Network

[Diagram: the same construction stacked N layers deep (… … …): every unit computes a weighted sum ($\Sigma$) of the previous layer's outputs followed by the nonlinearity (>0?), layer after layer, down to a final output unit.]
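
Generalizing the two-layer sketch above, the N-layer forward pass is just a loop; `weights` is a hypothetical list of per-layer matrices:

    import numpy as np

    def n_layer_forward(f, weights):
        # weights: one matrix per layer; each layer maps the previous
        # layer's activations through a weighted sum and a nonlinearity.
        h = f
        for W in weights:
            h = np.tanh(W @ h)
        return h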

SLIDE 6

Convolutional Network (AlexNet)

[Figure: the AlexNet pipeline, from input image through the network weights to the loss.]

(Slide credit: Fei-Fei Li & Andrej Karpathy & Justin Johnson, Lecture 5, 20 Jan 2016.)

SLIDE 7

Activation Functions

  • Sigmoid
  • tanh: tanh(x)
  • ReLU: max(0, x)
  • Leaky ReLU: max(0.1x, x)
  • Maxout
  • ELU
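
The listed activations as NumPy one-liners (a sketch; the sigmoid and ELU formulas are the standard definitions rather than ones printed on the slide, and Maxout is omitted because it combines several linear inputs instead of transforming a single value):

    import numpy as np

    sigmoid    = lambda x: 1.0 / (1.0 + np.exp(-x))
    tanh       = np.tanh
    relu       = lambda x: np.maximum(0.0, x)
    leaky_relu = lambda x: np.maximum(0.1 * x, x)
    elu        = lambda x, alpha=1.0: np.where(x > 0, x, alpha * (np.exp(x) - 1))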

SLIDE 8

Multi-class Softmax

  • 3-class softmax
    – classes A, B, C
    – 3 weight vectors: $w_A, w_B, w_C$
  • Probability of label A (similar for B, C):

$$p(y = A \mid f(x); w) = \frac{e^{w_A^\top f(x)}}{e^{w_A^\top f(x)} + e^{w_B^\top f(x)} + e^{w_C^\top f(x)}}$$

  • Objective:

$$l(w) = \prod_{i=1}^{m} p(y = y^{(i)} \mid f(x^{(i)}); w)$$

  • Log:

$$ll(w) = \sum_{i=1}^{m} \log p(y = y^{(i)} \mid f(x^{(i)}); w)$$
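
The three-class probability in NumPy (illustrative; stacking the three weight vectors as rows of one matrix is my choice):

    import numpy as np

    def p_classes(f_x, W):
        # W: rows are w_A, w_B, w_C; returns [p(A), p(B), p(C)].
        scores = W @ f_x                        # w_A^T f(x), w_B^T f(x), w_C^T f(x)
        exp_s = np.exp(scores - scores.max())   # shift by max for numerical stability
        return exp_s / exp_s.sum()              # the shift cancels in the ratio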

SLIDE 9

Multi-class Two-Layer Neural Network

[Diagram: the two-layer network of Slide 4 with three output units instead of one. The shared hidden layer (weights $w_j^{(k)}$, nonlinearity >0? / tanh) feeds three output units with weights $w_1^A, w_2^A, w_3^A$, $w_1^B, w_2^B, w_3^B$, and $w_1^C, w_2^C, w_3^C$, producing a score for A, a score for B, and a score for C.]

Hidden-unit nonlinearity:

$$z \rightarrow \tanh(z) = \frac{e^z - e^{-z}}{e^z + e^{-z}}$$
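
Putting the last two slides together, a sketch of the hidden layer plus three-way softmax (shapes and names are assumptions):

    import numpy as np

    def two_layer_class_probs(f, W_hidden, W_out):
        # W_hidden: (3, 3) hidden-layer weights; W_out: (3, 3) with rows
        # holding the output weights for classes A, B, C.
        h = np.tanh(W_hidden @ f)               # shared hidden layer
        scores = W_out @ h                      # score for A, B, C
        exp_s = np.exp(scores - scores.max())
        return exp_s / exp_s.sum()              # softmax over the three scores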

SLIDE 10

Gradient Descent Method for Optimization

  • How to find parameters that minimize an objective function?
  • Idea:

– Start somewhere
– Repeat: take a step in the steepest descent direction

[Figure: gradient-descent steps on the contours of an objective function. Figure source: Mathworks]

SLIDE 11

Generally, Steepest Direction

  • Steepest direction = direction of the gradient

$$\nabla g = \begin{bmatrix} \partial g / \partial w_1 \\ \partial g / \partial w_2 \\ \vdots \\ \partial g / \partial w_n \end{bmatrix}$$

§ Gradient Descent

  • Init:
  • For i = 1, 2, …

$$w \leftarrow w - \alpha \cdot \nabla g(w)$$
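
The update rule as code, on a toy objective (the quadratic, step size, and iteration count are illustrative choices, not from the slides):

    import numpy as np

    def gradient_descent(grad_g, w, alpha=0.1, steps=100):
        for _ in range(steps):
            w = w - alpha * grad_g(w)    # w <- w - alpha * grad g(w)
        return w

    # Example: g(w) = ||w||^2 has gradient 2w; the minimizer is the origin.
    w_star = gradient_descent(lambda w: 2 * w, np.array([3.0, -2.0]))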

SLIDE 12

What is the Steepest Descent Direction?

  • First-order Taylor expansion:

$$g(w + \Delta) \approx g(w) + \frac{\partial g}{\partial w_1}\Delta_1 + \frac{\partial g}{\partial w_2}\Delta_2$$

  • Steepest descent direction:

$$\min_{\Delta:\, \Delta_1^2 + \Delta_2^2 \le \epsilon} g(w + \Delta) \;\rightarrow\; \min_{\Delta:\, \Delta_1^2 + \Delta_2^2 \le \epsilon} \frac{\partial g}{\partial w_1}\Delta_1 + \frac{\partial g}{\partial w_2}\Delta_2$$

  • Recall: $\nabla g = \begin{bmatrix} \partial g / \partial w_1 \\ \partial g / \partial w_2 \end{bmatrix}$, and $\min_{a:\, \|a\| \le \epsilon} a^\top b$ is attained at $a = -\frac{\epsilon}{\|b\|}\, b$

  • Hence, solution: $\Delta = -\frac{\epsilon}{\|\nabla g\|}\, \nabla g$
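
A quick numerical check of this solution (a toy, not from the slides): among many random steps of the same length, none improves on the linearized decrease achieved by the step along $-\nabla g$.

    import numpy as np

    w = np.array([1.0, 1.0])
    grad = np.array([2 * w[0], 6 * w[1]])       # gradient of g(w) = w_1^2 + 3 w_2^2
    eps = 1e-3
    steepest = -eps * grad / np.linalg.norm(grad)

    rng = np.random.default_rng(0)
    dirs = rng.normal(size=(1000, 2))
    dirs = eps * dirs / np.linalg.norm(dirs, axis=1, keepdims=True)

    # first-order change g(w + d) - g(w) ~ grad . d for each candidate step
    assert grad @ steepest <= (dirs @ grad).min()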

SLIDE 13

How to Calculate a Partial Derivative in a Computational Graph

Given a function f(x, y, z) = (x + y)z, what is the partial derivative of f with respect to x, y, and z?

SLIDES 14-25

[Worked example (slide figures not recoverable): the graph for f(x, y, z) = (x + y)z is evaluated at x = -2, y = 5, z = -4 (the values come from a training example). Forward pass: the intermediate node q = x + y = 3, and f = qz = -12. Want: ∂f/∂x, ∂f/∂y, ∂f/∂z. Chain rule, walking backward through the graph: ∂f/∂z = q = 3, ∂f/∂q = z = -4, and since ∂q/∂x = ∂q/∂y = 1, we get ∂f/∂x = ∂f/∂y = -4.]
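
The whole walkthrough fits in a few lines of Python; this just reproduces the slide arithmetic (q is my name for the intermediate node):

    # forward pass
    x, y, z = -2, 5, -4
    q = x + y           # q = 3
    f = q * z           # f = -12

    # backward pass: chain rule from the output back to the inputs
    df_dz = q           # d(q*z)/dz = q          -> 3
    df_dq = z           # d(q*z)/dq = z          -> -4
    df_dx = df_dq * 1   # chain rule, dq/dx = 1  -> -4
    df_dy = df_dq * 1   # chain rule, dq/dy = 1  -> -4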

SLIDES 26-31

[Figure sequence (images not recoverable): a single gate f inside a larger graph. In the forward pass the gate computes its output from its input activations. In the backward pass it receives the gradient of the final output with respect to its own output, multiplies it by its "local gradient" (the derivative of its output with respect to each input), and passes the resulting gradients back to its inputs.]
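
A sketch of what any such gate implements, shown for a hypothetical multiply gate: the forward pass caches its activations, and the backward pass multiplies the incoming gradient by the local gradient for each input.

    class MultiplyGate:
        def forward(self, a, b):
            self.a, self.b = a, b       # cache activations for the backward pass
            return a * b

        def backward(self, grad_out):
            # local gradients: d(a*b)/da = b and d(a*b)/db = a,
            # each multiplied by the gradient flowing in from above
            return self.b * grad_out, self.a * grad_out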

SLIDES 32-45

[Worked example (images not recoverable): backpropagation through a larger graph, one gate at a time, each step computing [local gradient] x [its gradient]. Steps recoverable from the slides:

– (-1) * (-0.20) = 0.20
– [local gradient] x [its gradient]: [1] x [0.2] = 0.2 and [1] x [0.2] = 0.2 (both inputs: an add gate routes the same gradient to each of its inputs)
– [local gradient] x [its gradient]: x0: [2] x [0.2] = 0.4; w0: [-1] x [0.2] = -0.2]

SLIDES 46-47

[Figures: the chain of elementary gates computing the sigmoid function can be collapsed into a single "sigmoid gate". Its local gradient has the closed form dσ(x)/dx = σ(x)(1 - σ(x)); at the forward value 0.73 shown on the slide, this gives (0.73) * (1 - 0.73) ≈ 0.2.]
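
The slide images are not recoverable here, but the printed numbers (forward value 0.73; gradients 0.4 for x0 and -0.2 for w0) are consistent with the example f(w, x) = sigmoid(w0*x0 + w1*x1 + w2) at w0 = 2, x0 = -1, w1 = -3, x1 = -2, w2 = -3. Assuming those values, a sketch of the full pass through the single sigmoid gate:

    import numpy as np

    w0, x0, w1, x1, w2 = 2.0, -1.0, -3.0, -2.0, -3.0

    # forward pass
    s = w0 * x0 + w1 * x1 + w2        # 1.0
    f = 1.0 / (1.0 + np.exp(-s))      # sigmoid(1.0) ~= 0.73

    # backward pass: the sigmoid gate's local gradient is f * (1 - f)
    ds = f * (1 - f)                  # ~= 0.20
    dw0, dx0 = x0 * ds, w0 * ds       # ~= -0.2, 0.4
    dw1, dx1 = x1 * ds, w1 * ds       # ~= -0.4, -0.6
    dw2 = ds                          # ~= 0.2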

SLIDE 48

Gradients add at branches

[Diagram: a value feeding two downstream branches; during backpropagation, the gradients arriving from the branches are summed (+).]
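
A minimal illustration of the rule (hypothetical example): when x feeds two branches, its gradient is the sum of the contributions flowing back from each branch.

    # x feeds two branches: f = x*y + x*z
    x, y, z = 3.0, 4.0, 5.0
    df_dx = y + z                     # branch 1 contributes y, branch 2 contributes z

    # finite-difference check
    f = lambda x: x * y + x * z
    h = 1e-6
    assert abs((f(x + h) - f(x)) / h - df_dx) < 1e-4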

SLIDE 49

Summary

  • Deep learning

– New direction for text processing, given its success in image/audio processing
– Frameworks and software

  • TensorFlow (Google).
  • Others: Theano, Torch, Caffe, Computation Graph Toolkit (CGT)