Neural Networks, Oskar Taubert (SCC), 15.01.2020 (PowerPoint presentation)


SLIDE 1

Neural Networks

Oskar Taubert (SCC)

KIT – The Research University in the Helmholtz Association

www.kit.edu

SLIDE 2

Neural Network Concept

(very) remotely brain-inspired computational system
directed graph, encoding an ordered system of simple mathematical transformations
successor of the perceptron concept (i.e. logistic regression)
a more complicated 'fit', i.e. a universal function approximator
usually supervised machine learning

SLIDE 3

Motivation

automation of tasks machines are traditionally bad at:
image recognition, natural language processing, ...
image processing challenges, e.g. character recognition (MNIST)

SLIDE 4

Perceptron

decide whether an image depicts a 0:

o = Θ(w · x + b), where
Output: o ∈ {0, 1}
Input: x ∈ R^|Pixels|
Parameter: w ∈ R^|Pixels|
Parameter: b ∈ R

(Figure: perceptron unit with inputs x1, x2)
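The decision rule above can be sketched in a few lines of NumPy; the toy "image", weights, and bias below are made-up values for illustration only:

```python
import numpy as np

def heaviside(z):
    # step activation Θ: 1 if z >= 0, else 0
    return (z >= 0).astype(int)

def perceptron(x, w, b):
    # o = Θ(w · x + b), o ∈ {0, 1}
    return heaviside(np.dot(w, x) + b)

# hypothetical 2x2 "image" flattened to 4 pixels, with made-up parameters
x = np.array([0.9, 0.1, 0.1, 0.8])
w = np.array([1.0, -1.0, -1.0, 1.0])
b = -0.5
o = perceptron(x, w, b)
```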

SLIDE 5

Perceptron

decide whether an image depicts a 0 (as before):

o = Θ(w · x + b)

Problem: non-linear decision boundaries

SLIDE 6

XOR

OR-gate: h1 = Θ(x1 + x2 − 0.5)

(Figure: two-input unit with weights 1 and threshold 0.5)

SLIDE 7

XOR

OR-gate: h1 = Θ(x1 + x2 − 0.5)
NAND-gate: h2 = Θ(−x1 − x2 + 1.5)

(Figure: two two-input units on the same inputs x1, x2)

SLIDE 8

XOR

OR-gate: h1 = Θ(x1 + x2 − 0.5)
NAND-gate: h2 = Θ(−x1 − x2 + 1.5)
AND-gate: ŷ = Θ(h1 + h2 − 1.5)

(Figure: two-layer network of the three gates, computing XOR)
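The three gates compose into XOR, which a single perceptron cannot represent. A direct check in Python (the thresholds follow the reconstructed gate equations above):

```python
def theta(z):
    # Heaviside step: 1 if z >= 0, else 0
    return int(z >= 0)

def xor(x1, x2):
    h1 = theta(x1 + x2 - 0.5)        # OR gate
    h2 = theta(-x1 - x2 + 1.5)       # NAND gate
    return theta(h1 + h2 - 1.5)      # AND of h1, h2 -> XOR overall

truth_table = [xor(0, 0), xor(0, 1), xor(1, 0), xor(1, 1)]
```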

SLIDE 9

Multilayer Perceptron

o = f(W · h + b)
h = g(V · x + c)

o ∈ R^10, W ∈ R^10×3, b ∈ R^10, h ∈ R^3
V ∈ R^3×|Pixels|, x ∈ R^|Pixels|, c ∈ R^3

(Figure: network with input layer x0, x1, ..., xn, hidden layer h0, h1, h2, and output layer)

SLIDE 10

Multilayer Perceptron

o = f(W · h + b)
h = g(V · x + c)

What are f and g? What values for W and V?

(Figure: the same network as before)
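A minimal NumPy sketch of this forward pass; sigmoid and softmax are one common (assumed) choice for g and f, the weights are random placeholders, and the 28×28 input size is the MNIST example from earlier:

```python
import numpy as np

rng = np.random.default_rng(0)

n_pixels = 784                        # e.g. a flattened 28x28 MNIST image
V = rng.normal(size=(3, n_pixels))    # input -> hidden weights
c = np.zeros(3)
W = rng.normal(size=(10, 3))          # hidden -> output weights
b = np.zeros(10)

def g(z):
    # sigmoid hidden activation (one possible choice for g)
    return 1.0 / (1.0 + np.exp(-z))

def f(z):
    # softmax output activation (one possible choice for f)
    e = np.exp(z - z.max())
    return e / e.sum()

def forward(x):
    h = g(V @ x + c)     # h = g(V · x + c)
    o = f(W @ h + b)     # o = f(W · h + b)
    return o

o = forward(rng.normal(size=n_pixels))
```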

SLIDE 11

Training

parameters = {W, V, b, c}
Error measure: E(o, t) = (o − t)²

SLIDE 12

Training

parameters = {W, V, b, c}
Error measure: E(o, t) = (o − t)²
parameters ← parameters − λ · ∂E/∂parameters

SLIDE 13

Training

parameters = {W, V, b, c}
Error measure: E(o, t) = (o − t)²
parameters ← parameters − λ · ∂E/∂parameters

∂E/∂W = ∂E/∂o · ∂o/∂W = ∂E/∂o · ∂o/∂f · ∂f/∂W · h
∂E/∂b = ∂E/∂o · ∂o/∂b = ∂E/∂o · ∂o/∂f · ∂f/∂b

SLIDE 14

Training

parameters = {W, V, b, c}
Error measure: E(o, t) = (o − t)²
parameters ← parameters − λ · ∂E/∂parameters

∂E/∂W = ∂E/∂o · ∂o/∂W = ∂E/∂o · ∂o/∂f · ∂f/∂W · h
∂E/∂b = ∂E/∂o · ∂o/∂b = ∂E/∂o · ∂o/∂f · ∂f/∂b
∂E/∂V = ∂E/∂o · ∂o/∂f · ∂f/∂h · ∂h/∂g · ∂g/∂V · x

SLIDE 15

Training

More generally: E(t, f_n(W_n · f_{n−1}(… f_1(W_1 · x))))

∂E/∂W_i = δ_i · h_{i−1}
δ_{i−1} = δ_i · ∂f_{i−1} · W_i    (with ∂f_{i−1} evaluated at the layer's pre-activation)
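The chain rule above can be sketched end-to-end for a tiny two-layer network. The sigmoid hidden layer, linear output, learning rate, and single training point below are illustrative assumptions, not values from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

# tiny network: h = g(V·x + c), o = W·h + b (linear output here)
# error measure E(o, t) = (o − t)^2
V = rng.normal(scale=0.5, size=(3, 2))
c = np.zeros(3)
W = rng.normal(scale=0.5, size=(1, 3))
b = np.zeros(1)

def g(z):
    # sigmoid, with derivative g'(z) = g(z) * (1 - g(z))
    return 1.0 / (1.0 + np.exp(-z))

def forward(x):
    h = g(V @ x + c)
    o = (W @ h + b)[0]
    return h, o

def grads(x, t):
    # chain rule exactly as on the slide
    h, o = forward(x)
    dE_do = 2.0 * (o - t)                  # ∂E/∂o
    dW = dE_do * h[None, :]                # ∂E/∂W (the factor h appears here)
    db = np.array([dE_do])                 # ∂E/∂b
    delta = dE_do * W[0] * h * (1.0 - h)   # backpropagated through g
    dV = delta[:, None] * x[None, :]       # ∂E/∂V (the factor x appears here)
    dc = delta                             # ∂E/∂c
    return dW, db, dV, dc

lam = 0.1  # learning rate λ (illustrative value)
x_train, t_train = np.array([1.0, 0.0]), 1.0
for step in range(200):
    dW, db, dV, dc = grads(x_train, t_train)
    # parameters ← parameters − λ · ∂E/∂parameters
    W -= lam * dW
    b -= lam * db
    V -= lam * dV
    c -= lam * dc
```

After a few hundred gradient-descent steps on this single example, the network output approaches the target.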

SLIDE 16

Error functions

Regression: MSE, KL-divergence
Classification: cross-entropy, NLL loss
Segmentation: hinge losses, overlap/dissimilarity losses
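Two of the listed losses as a minimal sketch; the toy predictions and targets are made up for illustration:

```python
import numpy as np

def mse(o, t):
    # mean squared error, the typical regression loss
    return np.mean((o - t) ** 2)

def cross_entropy(p, t_idx):
    # classification loss: -log of the probability the model
    # assigns to the true class t_idx (p is a softmax output)
    return -np.log(p[t_idx])

p = np.array([0.7, 0.2, 0.1])   # hypothetical softmax output
ce = cross_entropy(p, 0)
```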

SLIDE 17

Convolutions

(Figure: illustration of a convolution, © Machine Learning Guru)

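The sliding-window operation the figure illustrates can be sketched directly (strictly a cross-correlation, which is what deep-learning frameworks usually call "convolution"); the vertical-edge kernel and toy image are illustrative choices:

```python
import numpy as np

def conv2d(image, kernel):
    # "valid" 2D cross-correlation: slide the kernel over the image
    # and take the elementwise product-sum at each position
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# toy image with a vertical edge, detected by a [1, -1] kernel
img = np.array([[0., 0., 1., 1.],
                [0., 0., 1., 1.],
                [0., 0., 1., 1.]])
k = np.array([[1., -1.]])
edges = conv2d(img, k)
```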

SLIDE 20

Activation Functions

Activation functions f(x) introduce non-linearity, e.g. the sigmoid.
Other non-linear choices: tanh(x), relu(x) = max(0, x), softmax_i(x) = exp(x_i) / Σ_j exp(x_j), etc.
Some have better numerical properties, e.g. avoid the vanishing gradient.

(Figure: plots of sigmoid, tanh, ReLU, and SeLU on [−2, 2])
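The listed activations in a few lines of NumPy (tanh is already built in); the max-subtraction in softmax is the standard numerical-stability trick:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    return np.maximum(0.0, x)

def softmax(x):
    # subtracting the max does not change the result but avoids overflow
    e = np.exp(x - np.max(x))
    return e / e.sum()

x = np.array([-2.0, 0.0, 2.0])
probs = softmax(x)
```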

SLIDE 21

Regularization

(Figure: polynomial fits of degree 0, 1, 3, and 9 to the same data, illustrating underfitting and overfitting)

SLIDE 22

Regularization

(Figure: training and test loss J over epochs; the early-stopping optimum is where the test loss is minimal)

early stopping
weight decay
weight sharing
dropout
batch normalization
data augmentation
more data
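Of the listed techniques, early stopping follows directly from the loss curves in the figure; a minimal sketch with a made-up validation-loss history:

```python
def early_stopping(val_losses, patience=3):
    # stop once the validation loss has not improved for `patience`
    # epochs; return the epoch with the best loss seen so far
    best, best_epoch = float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return best_epoch
    return best_epoch

# hypothetical validation losses: improving, then overfitting
losses = [1.0, 0.6, 0.4, 0.45, 0.5, 0.55, 0.6]
stop = early_stopping(losses)
```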

SLIDE 23

Hyperparameters

guessing
experience
non-gradient-based optimization:
grid search, random search, particle swarm, genetic algorithms
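Random search, one of the listed non-gradient methods, as a minimal sketch; the two-parameter search space and toy objective are hypothetical stand-ins for a real validation run:

```python
import random

random.seed(0)

def random_search(objective, space, n_trials=20):
    # sample hyperparameter configurations at random, keep the best
    best_cfg, best_score = None, float("inf")
    for _ in range(n_trials):
        cfg = {name: random.choice(values) for name, values in space.items()}
        score = objective(cfg)
        if score < best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

# hypothetical search space and objective (stands in for validation loss)
space = {"learning_rate": [0.001, 0.01, 0.1], "hidden_units": [3, 10, 30]}

def toy_objective(cfg):
    return abs(cfg["learning_rate"] - 0.01) + abs(cfg["hidden_units"] - 10) / 100

best_cfg, best_score = random_search(toy_objective, space)
```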

SLIDE 24

Out of Scope

residual models
generative models
recurrent models
attention models
and lots more
reinforcement learning (next week)

SLIDE 25

Sources

http://nyu-cds.sparksites.io/wp-content/uploads/2015/10/header_4@2x.png
https://github.com/Markus-Goetz/gks-2019/blob/solutions/slides/slides.pdf