Introduction to Neural Networks
I2DL: Prof. Niessner, Prof. Leal-Taixé

Lecture 2 Recap

Linear Regression
= a supervised learning method to find a linear model of the form

Goal: find a model that explains a target $y$ given the input $x$:

$\hat{y}_i = \theta_0 + \sum_{k=1}^{d} x_{ik}\,\theta_k = \theta_0 + x_{i1}\theta_1 + x_{i2}\theta_2 + \dots + x_{id}\theta_d$

where $\theta_0$ is the bias.
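As a quick illustration, a minimal NumPy sketch of this prediction (function and variable names are our own, not from the slides):

```python
import numpy as np

def linear_predict(X, theta, theta_0):
    """X: (n, d) inputs, theta: (d,) weights, theta_0: scalar bias."""
    return theta_0 + X @ theta  # y_hat_i = theta_0 + sum_k x_ik * theta_k

X = np.array([[1.0, 2.0],
              [3.0, 4.0]])
print(linear_predict(X, np.array([0.5, -1.0]), 0.1))  # one prediction per row
```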
Logistic Regression: Minimization

Per-sample loss (binary cross-entropy):

$\mathcal{L}(y_i, \hat{y}_i) = -[\,y_i \cdot \log \hat{y}_i + (1 - y_i) \cdot \log(1 - \hat{y}_i)\,]$

Cost over the dataset:

$\mathcal{L}(\boldsymbol{\theta}) = -\sum_{i=1}^{n} \big(y_i \cdot \log \hat{y}_i + (1 - y_i) \cdot \log[1 - \hat{y}_i]\big)$

with predictions $\hat{y}_i = \sigma(\boldsymbol{x}_i \boldsymbol{\theta})$ (sigmoid).
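A minimal NumPy sketch of the sigmoid and this cost (the clipping against log(0) is our own numerical safeguard, not from the slides):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce_cost(y, y_hat, eps=1e-12):
    y_hat = np.clip(y_hat, eps, 1 - eps)  # numerical safety: avoid log(0)
    return -np.sum(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

scores = np.array([2.0, -1.0, 0.5])  # x_i @ theta for three samples
y = np.array([1.0, 0.0, 1.0])
print(bce_cost(y, sigmoid(scores)))
```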
[Plot: data with labels y=1 and y=0; sigmoid fit vs. linear fit]
Logistic regression: predictions are guaranteed to be within $[0, 1]$. Linear regression: predictions can exceed that range
→ for classification in $[0, 1]$ this becomes a real issue
The general setup: data points $\boldsymbol{x}$, model parameters $\boldsymbol{\theta}$, estimation $\hat{\boldsymbol{y}}$, labels (ground truth) $\boldsymbol{y}$, a loss function comparing $\hat{\boldsymbol{y}}$ to $\boldsymbol{y}$, and optimization.
$\hat{y}_i = \sum_{k} w_{i,k}\, x_k \quad\Leftrightarrow\quad \hat{\boldsymbol{y}} = \boldsymbol{W}\boldsymbol{x}$ (matrix notation)
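In code this is a single matrix-vector product; a tiny NumPy sketch with shapes chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
W = 0.01 * rng.normal(size=(10, 4))  # e.g. 10 outputs, 4 input features
x = rng.normal(size=4)
y_hat = W @ x                        # y_hat_i = sum_k w_ik * x_k, all at once
```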
[Figure: learned weights visualized on CIFAR-10 and on ImageNet]
Source: Li/Karpathy/Johnson
Logistic Regression: linear separation impossible! [Figure: two classes that no single line can separate]
– Multiply with another weight matrix $\boldsymbol{W_2}$:
$\hat{\boldsymbol{y}} = \boldsymbol{W_2} \cdot \hat{\boldsymbol{y}} = \boldsymbol{W_2} \cdot \boldsymbol{W_1} \cdot \boldsymbol{x}$
$\boldsymbol{W} = \boldsymbol{W_2} \cdot \boldsymbol{W_1} \;\Rightarrow\; \hat{\boldsymbol{y}} = \boldsymbol{W}\boldsymbol{x}$ — still a linear model!
– 2-layers: $\hat{\boldsymbol{y}} = \boldsymbol{W_2} \max(\boldsymbol{0}, \boldsymbol{W_1}\boldsymbol{x})$
– 3-layers: $\hat{\boldsymbol{y}} = \boldsymbol{W_3} \max(\boldsymbol{0}, \boldsymbol{W_2} \max(\boldsymbol{0}, \boldsymbol{W_1}\boldsymbol{x}))$
– 4-layers: $\hat{\boldsymbol{y}} = \boldsymbol{W_4} \tanh(\boldsymbol{W_3} \max(\boldsymbol{0}, \boldsymbol{W_2} \max(\boldsymbol{0}, \boldsymbol{W_1}\boldsymbol{x})))$
– 5-layers: $\hat{\boldsymbol{y}} = \boldsymbol{W_5}\, \sigma(\boldsymbol{W_4} \tanh(\boldsymbol{W_3} \max(\boldsymbol{0}, \boldsymbol{W_2} \max(\boldsymbol{0}, \boldsymbol{W_1}\boldsymbol{x}))))$
– … up to hundreds of layers (a minimal sketch of this nesting follows below)
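A NumPy sketch of the 2- and 3-layer compositions; the layer sizes are hypothetical, chosen only to make the nesting concrete:

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

rng = np.random.default_rng(0)
x  = rng.normal(size=64)                # input
W1 = 0.01 * rng.normal(size=(100, 64))
W2 = 0.01 * rng.normal(size=(100, 100))
W3 = 0.01 * rng.normal(size=(10, 100))

y2 = W2 @ relu(W1 @ x)                  # 2 layers
y3 = W3 @ relu(W2 @ relu(W1 @ x))       # 3 layers: one more matrix + nonlinearity
```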
Source: http://beamlab.org/deeplearning/2017/02/23/deep_learning_101_part1.html
Logistic Regression vs. Neural Networks
On CIFAR-10: visualizing the activations of the first layer.
Source: ConvNetJS
1-layer network: $\hat{\boldsymbol{y}} = \boldsymbol{W}\boldsymbol{x}$
[Diagram: input $\boldsymbol{x}$ with $128 \times 128 = 16384$ values → $\boldsymbol{W}$ → output $\hat{\boldsymbol{y}}$ with 10 values]
Why is this structure useful?
2-layer network: $\hat{\boldsymbol{y}} = \boldsymbol{W_2} \max(\boldsymbol{0}, \boldsymbol{W_1}\boldsymbol{x})$
[Diagram: input $\boldsymbol{x}$ ($128 \times 128 = 16384$) → $\boldsymbol{W_1}$ → hidden layer (1000) → $\boldsymbol{W_2}$ → output $\hat{\boldsymbol{y}}$ (10)]
2-layer network: $\hat{\boldsymbol{y}} = \boldsymbol{W_2} \max(\boldsymbol{0}, \boldsymbol{W_1}\boldsymbol{x})$
[Diagram as above, with the layers named: input layer ($128 \times 128 = 16384$), hidden layer (1000), output layer (10)]
[Diagram: inputs $x_1, x_2, x_3$ feed a hidden layer of neurons $g(\boldsymbol{w}_{0,i}\,\boldsymbol{x} + b_{0,i})$, $i = 0, \dots, 3$, followed by a layer of neurons $g(\boldsymbol{w}_{1,j}\,\boldsymbol{x} + b_{1,j})$, $j = 0, 1, 2$, and an output neuron $g(\boldsymbol{w}_{2,0}\,\boldsymbol{x} + b_{2,0})$; each neuron applies an affine map followed by the activation $g$]
Source: https://towardsdatascience.com/training-deep-neural-networks-9fdb1964b964
Sigmoid: $\sigma(x) = \frac{1}{1 + e^{-x}}$
tanh: $\tanh(x)$
ReLU: $\max(0, x)$
Leaky ReLU: $\max(0.1x, x)$
Parametric ReLU: $\max(\alpha x, x)$
Maxout: $\max(\boldsymbol{w}_1^T \boldsymbol{x} + b_1, \boldsymbol{w}_2^T \boldsymbol{x} + b_2)$
ELU: $f(x) = \begin{cases} x & \text{if } x > 0 \\ \alpha(e^x - 1) & \text{if } x \le 0 \end{cases}$
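Most of these are one-liners in NumPy; a minimal sketch (tanh is built in, and maxout needs two affine maps, so it is only noted in a comment):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    return np.maximum(0.0, x)

def leaky_relu(x):
    return np.maximum(0.1 * x, x)

def prelu(x, alpha):
    return np.maximum(alpha * x, x)

def elu(x, alpha=1.0):
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

# tanh: np.tanh(x); maxout: np.maximum(w1 @ x + b1, w2 @ x + b2)
```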
$\hat{\boldsymbol{y}} = \boldsymbol{W_3} \cdot (\boldsymbol{W_2} \cdot (\boldsymbol{W_1} \cdot \boldsymbol{x}))$
Why activation functions? Simply concatenating linear layers would be so much cheaper — but the stack collapses into a single linear map, as the toy check below shows.
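A quick numerical check of that collapse (matrix sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
x  = rng.normal(size=5)
W1 = rng.normal(size=(4, 5))
W2 = rng.normal(size=(3, 4))
W3 = rng.normal(size=(2, 3))

deep   = W3 @ (W2 @ (W1 @ x))     # three "layers" without activations
single = (W3 @ W2 @ W1) @ x       # one collapsed matrix W = W3 W2 W1
print(np.allclose(deep, single))  # True: stacking linear layers adds nothing
```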
Why organize a neural network into layers?
Credit: Stanford CS 231n
Credit: Stanford CS 231n
Artificial neural networks are inspired by the brain, but not even close in terms of complexity! The comparison is great for the media and news articles, however...
[Diagram repeated from before: inputs $x_1, x_2, x_3$ feeding layers of neurons $g(\boldsymbol{w}_{l,i}\,\boldsymbol{x} + b_{l,i})$]
– Given a dataset with ground-truth training pairs $\{\boldsymbol{x}_i; \boldsymbol{y}_i\}$,
– Find optimal weights $\boldsymbol{W}$ using stochastic gradient descent, such that the loss function is minimized (more later).
Computational graphs consist of compute nodes:
log(), exp() …
Example: $f(x, y, z) = (x + y) \cdot z$ as a compute graph with a sum node and a mult node.
Initialization: $x = 1$, $y = -3$, $z = 4$
sum node: $d = x + y = -2$
mult node: $f = d \cdot z = -8$
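The forward pass through this toy graph, node by node:

```python
# f(x, y, z) = (x + y) * z as two compute nodes
x, y, z = 1.0, -3.0, 4.0
d = x + y   # sum node:  d = -2
f = d * z   # mult node: f = -8
print(d, f)
```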
The same idea scales up: a deep network such as
$\hat{\boldsymbol{y}} = \boldsymbol{W_5}\,\sigma(\boldsymbol{W_4}\tanh(\boldsymbol{W_3}\max(\boldsymbol{0}, \boldsymbol{W_2}\max(\boldsymbol{0}, \boldsymbol{W_1}\boldsymbol{x}))))$
is also just a compute graph of sums, products, and activations.
A neural network can be represented as a computational graph...
– it has compute nodes (operations) – it has edges that connect nodes (data flow) – it is directional – it can be organized into ‘layers’
$z_i^{(2)} = \sum_j x_j\, w_{ji}^{(2)} + b_i^{(2)} \qquad a_i^{(2)} = g\big(z_i^{(2)}\big) \qquad z_i^{(3)} = \sum_j a_j^{(2)}\, w_{ji}^{(3)} + b_i^{(3)} \qquad \dots$

[Diagram: inputs $x_1, x_2, x_3$ and a $+1$ bias unit connect via weights $w_{ji}^{(2)}$ and biases $b_i^{(2)}$ to hidden units $z_1^{(2)}, z_2^{(2)}, z_3^{(2)}$ with activations $a_i^{(2)} = g(z_i^{(2)})$, which connect via weights $w_{ji}^{(3)}$, biases $b_i^{(3)}$, and another $+1$ unit to outputs $z_1^{(3)}, z_2^{(3)}$]
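This forward pass is a few lines of NumPy; a minimal sketch (the layer sizes follow the diagram, and using the sigmoid for $g$ is our own choice):

```python
import numpy as np

def g(z):                      # activation, e.g. sigmoid
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x  = rng.normal(size=3)        # inputs x_1..x_3
W2 = rng.normal(size=(3, 3))   # weights w^(2), one row per hidden unit
b2 = rng.normal(size=3)
W3 = rng.normal(size=(2, 3))   # weights w^(3)
b3 = rng.normal(size=2)

z2 = W2 @ x + b2               # z_i^(2) = sum_j x_j w_ji^(2) + b_i^(2)
a2 = g(z2)                     # a_i^(2) = g(z_i^(2))
z3 = W3 @ a2 + b3              # z_i^(3) = sum_j a_j^(2) w_ji^(3) + b_i^(3)
```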
[Szegedy et al.,CVPR’15] Going Deeper with Convolutions
Underlying meanings:
– The multiplication of $\boldsymbol{W_i}$ and $\boldsymbol{x}$: encodes the input information
– The activation function: selects the key features
Source: https://www.zybuluo.com/liuhui0803/note/981434
Underlying meanings:
– The convolutional layers: extract useful features with shared weights
Source: https://www.zcfy.cc/original/understanding-convolutions-colah-s-blog
Underlying meanings:
– The convolutional layers: extract useful features with shared weights
Source: https://www.zybuluo.com/liuhui0803/note/981434
[Diagram: Inputs → Neural Network → Outputs, compared against Targets]
Are these reasonably close? We need a way to describe how close the network's outputs are to the targets.
Idea: calculate a ‘distance’ between prediction and target!
[Figure: a prediction far from the target → large distance, bad prediction; a prediction near the target → small distance, good prediction]
A loss function measures the goodness of the predictions (or equivalently, the network's performance). Intuitively, ...
– a large loss indicates bad predictions/performance (→ performance needs to be improved by training the model)
– the choice of the loss function depends on the concrete problem or the distribution of the target variable
L1 loss: $L(\boldsymbol{y}, \hat{\boldsymbol{y}}; \boldsymbol{\theta}) = \frac{1}{n}\sum_{i}^{n} \lVert y_i - \hat{y}_i \rVert_1$

MSE (L2) loss: $L(\boldsymbol{y}, \hat{\boldsymbol{y}}; \boldsymbol{\theta}) = \frac{1}{n}\sum_{i}^{n} \lVert y_i - \hat{y}_i \rVert_2^2$
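Both losses in a short NumPy sketch, with a toy comparison:

```python
import numpy as np

def l1_loss(y, y_hat):
    return np.mean(np.abs(y - y_hat))   # mean absolute error

def mse_loss(y, y_hat):
    return np.mean((y - y_hat) ** 2)    # mean squared error

y     = np.array([1.0, 2.0, 3.0])
y_hat = np.array([1.1, 1.8, 3.5])
print(l1_loss(y, y_hat), mse_loss(y, y_hat))
```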
Yes! (0.8) No! (0.2) The network predicts the probability of the input belonging to the "yes" class!
Binary cross-entropy:
$L(\boldsymbol{y}, \hat{\boldsymbol{y}}; \boldsymbol{\theta}) = -\sum_{i=1}^{n} \big(y_i \cdot \log \hat{y}_i + (1 - y_i) \cdot \log[1 - \hat{y}_i]\big)$
Cross-entropy = loss function for multi-class classification
[Example: predicted class probabilities dog (0.1), rabbit (0.2), duck (0.7), …]
$L(\boldsymbol{y}, \hat{\boldsymbol{y}}; \boldsymbol{\theta}) = -\sum_{i=1}^{n} \sum_{k=1}^{K} y_{ik} \cdot \log \hat{y}_{ik}$
This generalizes the binary case from the slide before!
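A minimal NumPy sketch of this loss on the duck example above (the clip against log(0) is our own safeguard):

```python
import numpy as np

def cross_entropy(Y, Y_hat, eps=1e-12):
    """Y, Y_hat: (n, K) one-hot targets and predicted class probabilities."""
    return -np.sum(Y * np.log(np.clip(Y_hat, eps, 1.0)))

Y     = np.array([[0, 0, 1]])        # ground truth: duck
Y_hat = np.array([[0.1, 0.2, 0.7]])  # dog, rabbit, duck
print(cross_entropy(Y, Y_hat))       # = -log(0.7)
```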
– minimize the loss ⇔ find better predictions $\hat{\boldsymbol{y}}$
– predictions are generated by the NN
– find better predictions ⇔ find a better NN
[Figures: prediction vs. targets at training times $t_1$, $t_2$, $t_3$ — the initially bad prediction improves and the loss decreases over training time]
[Plot: loss $L(\boldsymbol{y}, f_{\boldsymbol{\theta}}(\boldsymbol{x}))$ as a function of the model parameters $\boldsymbol{\theta}$]
Plotting the loss against the model parameters.
Optimization! We train compute graphs with optimization techniques:
– prediction: $\hat{\boldsymbol{y}} = f_{\boldsymbol{\theta}}(\boldsymbol{x})$
– loss: $L(\boldsymbol{y}, f_{\boldsymbol{\theta}}(\boldsymbol{x}))$
– minimize the loss w.r.t. $\boldsymbol{\theta}$
Gradient Descent

[Plot: loss $L(\boldsymbol{y}, f_{\boldsymbol{\theta}}(\boldsymbol{x}))$ over parameters $\boldsymbol{\theta}$]

$\boldsymbol{\theta}^* = \arg\min_{\boldsymbol{\theta}} L(\boldsymbol{y}, f_{\boldsymbol{\theta}}(\boldsymbol{x}))$ — minimize the loss w.r.t. $\boldsymbol{\theta}$

Follow the negative direction of the gradient $\nabla_{\boldsymbol{\theta}} L(\boldsymbol{y}, f_{\boldsymbol{\theta}}(\boldsymbol{x}))$:

$\boldsymbol{\theta} \leftarrow \boldsymbol{\theta} - \alpha\, \nabla_{\boldsymbol{\theta}} L(\boldsymbol{y}, f_{\boldsymbol{\theta}}(\boldsymbol{x}))$

where $\alpha$ is the learning rate (update sketched below).
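The update rule as a generic Python loop, applied to a toy 1-D loss (our own example, not from the slides):

```python
def gradient_descent(theta, grad_fn, lr=0.1, steps=100):
    """Repeatedly apply theta <- theta - lr * grad L(theta)."""
    for _ in range(steps):
        theta = theta - lr * grad_fn(theta)
    return theta

# toy loss L(theta) = (theta - 3)^2 with gradient 2*(theta - 3)
print(gradient_descent(0.0, lambda t: 2 * (t - 3)))  # -> approx. 3
```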
For linear regression: $f_{\boldsymbol{\theta}}(\boldsymbol{x}) = \boldsymbol{W}\boldsymbol{x}$ with $\boldsymbol{\theta} = \boldsymbol{W}$ (later: $\boldsymbol{\theta} = \{\boldsymbol{W}, \boldsymbol{b}\}$)

$L(\boldsymbol{y}, \hat{\boldsymbol{y}}; \boldsymbol{\theta}) = \frac{1}{n}\sum_{i}^{n} \lVert y_i - \hat{y}_i \rVert_2^2$
$L(\boldsymbol{y}; \boldsymbol{\theta}) = \frac{1}{n}\sum_{i}^{n} \lVert y_i - \boldsymbol{W}\boldsymbol{x}_i \rVert_2^2$
[Compute graph: $\boldsymbol{x}$ → Multiply ($\boldsymbol{W}$) → $\hat{\boldsymbol{y}}$ → $L$; the gradient flows backwards through the graph]
With $f_{\boldsymbol{\theta}}(\boldsymbol{x}) = \boldsymbol{W}\boldsymbol{x}$, $\boldsymbol{\theta} = \boldsymbol{W}$:

$L(\boldsymbol{y}; \boldsymbol{\theta}) = \frac{1}{n}\sum_{i}^{n} \lVert \boldsymbol{W}\boldsymbol{x}_i - \boldsymbol{y}_i \rVert_2^2$

$\nabla_{\boldsymbol{W}} L(\boldsymbol{y}, f_{\boldsymbol{\theta}}(\boldsymbol{x})) = \frac{1}{n}\sum_{i}^{n} 2\,(\boldsymbol{W}\boldsymbol{x}_i - \boldsymbol{y}_i)\,\boldsymbol{x}_i^{T}$

(the factor 2 comes from differentiating the square)
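Putting the analytic gradient into the update rule; a NumPy sketch on synthetic data (data, shapes, and learning rate are our own choices):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))            # n = 50 inputs x_i
W_true = rng.normal(size=(2, 3))
Y = X @ W_true.T                        # targets y_i = W_true x_i

W = np.zeros((2, 3))
for _ in range(500):
    R = X @ W.T - Y                     # residuals W x_i - y_i
    grad = 2.0 / len(X) * R.T @ X       # (1/n) sum_i 2 (W x_i - y_i) x_i^T
    W -= 0.1 * grad                     # gradient descent step

print(np.allclose(W, W_true, atol=1e-3))  # True: W_true is recovered
```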
$\hat{\boldsymbol{y}} = \boldsymbol{W_5}\,\sigma(\boldsymbol{W_4}\tanh(\boldsymbol{W_3}\max(\boldsymbol{0}, \boldsymbol{W_2}\max(\boldsymbol{0}, \boldsymbol{W_1}\boldsymbol{x}))))$
How to compute $\nabla_{\boldsymbol{\theta}} L(\boldsymbol{y}, f_{\boldsymbol{\theta}}(\boldsymbol{x}))$ for such a deep model?
– Need to propagate gradients from the end back to the first layer ($\boldsymbol{W_1}$).
[Compute graph: $\boldsymbol{x}$ → Multiply ($\boldsymbol{W_1}$) → $\max(\boldsymbol{0}, \cdot)$ → Multiply ($\boldsymbol{W_2}$) → $\hat{\boldsymbol{y}}$ → $L$; the gradient flows backwards through the graph]
$\hat{\boldsymbol{y}} = \boldsymbol{W_5}\,\sigma(\boldsymbol{W_4}\tanh(\boldsymbol{W_3}\max(\boldsymbol{0}, \boldsymbol{W_2}\max(\boldsymbol{0}, \boldsymbol{W_1}\boldsymbol{x}))))$
– Need to propagate gradients from the end back to the first layer ($\boldsymbol{W_1}$)
– Need an efficient way to compute all the gradients
– Compute graphs come in handy!
Gradient descent:
– Easy to compute using compute graphs
Alternatives:
– Newton's method
– L-BFGS
– Adaptive moments
– Conjugate gradient
– Many options (more in the next lectures)
– Nice, because complex functions can easily be modularized
Next Lecture:
– Backpropagation and optimization of neural networks
– Check for updates on website/moodle regarding the exercises
– http://cs231n.github.io/optimization-1/
– http://www.deeplearningbook.org/contents/optimization.html
– Pattern Recognition and Machine Learning – C. Bishop
– http://www.deeplearningbook.org/