Deep Learning - Theory and Practice: Linear Regression, Least Squares


Feb 02, 2023


SLIDE 1

Deep Learning - Theory and Practice

20-02-2020

Linear Regression, Least Squares Classification and Logistic Regression

http://leap.ee.iisc.ac.in/sriram/teaching/DL20/ deeplearning.cce2020@gmail.com

SLIDE 2

Least Squares for Classification

Bishop - PRML book (Chap 4)

❖ K-class classification problem
❖ With 1-of-K (one-hot) encoding and least squares regression
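
As a worked form of the setup above, here is a sketch in the standard textbook notation. The symbols are assumptions, not taken from the slides: the N inputs, each augmented with a constant 1, are stacked as rows of $\tilde{X}$, and the 1-of-K target vectors as rows of $T$. Minimizing the sum-of-squares error then has a closed-form solution, and a new input is assigned to the class with the largest output.

```latex
\begin{aligned}
E(\tilde{W}) &= \tfrac{1}{2}\,\mathrm{Tr}\left\{ (\tilde{X}\tilde{W} - T)^{\top} (\tilde{X}\tilde{W} - T) \right\} \\
\tilde{W} &= (\tilde{X}^{\top}\tilde{X})^{-1}\,\tilde{X}^{\top}\, T \\
y(x) &= \tilde{W}^{\top}\tilde{x}, \qquad \text{predicted class} = \arg\max_{k}\; y_k(x)
\end{aligned}
```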

SLIDE 3

SLIDE 4

SLIDE 5

Logistic Regression

Bishop - PRML book (Chap 4)

❖ 2-class logistic regression
❖ K-class logistic regression
❖ Maximum likelihood solution for each case
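
The 2-class case can be illustrated with a minimal sketch, assuming a single scalar input, targets in {0, 1}, and plain gradient descent on the negative log-likelihood. The toy data, learning rate, and function names are hypothetical and not from the slides. The classes overlap on purpose so that the maximum likelihood solution is finite.

```python
import math

def sigmoid(a):
    """Logistic sigmoid, written to avoid overflow for large |a|."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)

def neg_log_likelihood(w, b, xs, ts):
    """Cross-entropy error: -sum_n [t ln y + (1 - t) ln(1 - y)]."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for x, t in zip(xs, ts):
        y = sigmoid(w * x + b)
        total -= t * math.log(y + eps) + (1 - t) * math.log(1 - y + eps)
    return total

def fit_logistic(xs, ts, lr=0.1, steps=500):
    """Gradient descent; the gradient of the error is sum_n (y_n - t_n) x_n."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        grad_w = sum((sigmoid(w * x + b) - t) * x for x, t in zip(xs, ts))
        grad_b = sum((sigmoid(w * x + b) - t) for x, t in zip(xs, ts))
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical toy data with overlapping classes.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ts = [0, 0, 1, 0, 1, 1]
w, b = fit_logistic(xs, ts)
print("w =", w, "b =", b, "P(class 1 | x = 2) =", sigmoid(w * 2.0 + b))
```

The K-class case follows the same pattern with a softmax in place of the sigmoid and one weight vector per class.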

SLIDE 6

SLIDE 7

SLIDE 8

SLIDE 9

SLIDE 10

SLIDE 11

Typical Error Surfaces

Typical Error Surface as a function of parameters (weights and biases)

SLIDE 12

SLIDE 13

SLIDE 14

Learning with Gradient Descent

Error surface close to a local optimum

SLIDE 15

Learning Using Gradient Descent

SLIDE 16

Parameter Learning

❖ Solving a non-convex optimization problem.
❖ Iterative solution.
❖ Depends on the initialization.
❖ Convergence to a local optimum.
❖ Judicious choice of learning rate.
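
The dependence on initialization can be seen in a small sketch, assuming a hypothetical one-dimensional non-convex error function with two local minima. The function, learning rate, and starting points are illustrative choices, not from the slides.

```python
def error(w):
    """A hypothetical non-convex error surface with two local minima."""
    return w**4 - 3.0 * w**2 + w

def grad(w):
    """Derivative of error(w) with respect to w."""
    return 4.0 * w**3 - 6.0 * w + 1.0

def gradient_descent(w0, lr=0.01, steps=2000):
    """Iterate w <- w - lr * dE/dw starting from w0."""
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)
    return w

# Two initializations end up in two different local minima.
w_left = gradient_descent(-2.0)
w_right = gradient_descent(2.0)
print("from -2.0:", w_left, "error:", error(w_left))
print("from +2.0:", w_right, "error:", error(w_right))
```

Both runs stop where the gradient vanishes, but only one of them reaches the lower of the two minima. A much larger learning rate would make the iterates overshoot, which is why the step size has to be chosen with care.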

SLIDE 17

Least Squares versus Logistic Regression

Bishop - PRML book (Chap 4)

SLIDE 18

Least Squares versus Logistic Regression

Bishop - PRML book (Chap 4)

SLIDE 19

Neural Networks

SLIDE 20

Perceptron Algorithm

Perceptron Model [McCulloch, 1943; Rosenblatt, 1957]
Targets are binary classes {-1, +1}
What if the data is not linearly separable?
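
A minimal sketch of the classic perceptron update rule, assuming targets in {-1, +1} and a threshold at zero. The AND and XOR toy datasets and the function names are illustrative, not from the slides. On the linearly separable AND data the updates stop after a few passes. On XOR, which is not linearly separable, every pass still contains at least one mistake.

```python
def predict(w, b, x):
    """Threshold unit: +1 if w.x + b >= 0, else -1."""
    a = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if a >= 0 else -1

def train_perceptron(data, lr=1.0, max_epochs=100):
    """On each misclassified point, move w by lr * t * x and b by lr * t."""
    w = [0.0] * len(data[0][0])
    b = 0.0
    mistakes = 0
    for _ in range(max_epochs):
        mistakes = 0
        for x, t in data:
            if predict(w, b, x) != t:
                w = [wi + lr * t * xi for wi, xi in zip(w, x)]
                b += lr * t
                mistakes += 1
        if mistakes == 0:  # a full pass with no errors: converged
            break
    return w, b, mistakes

and_data = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), 1)]
xor_data = [((0, 0), -1), ((0, 1), 1), ((1, 0), 1), ((1, 1), -1)]

w, b, mistakes = train_perceptron(and_data)
_, _, xor_mistakes = train_perceptron(xor_data)
print("AND: weights", w, "bias", b, "mistakes in last pass", mistakes)
print("XOR: mistakes in last pass", xor_mistakes)
```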

SLIDE 21

Multi-layer Perceptron

Multi-layer Perceptron [Hopfield, 1982]
Thresholding function replaced by a non-linear function (tanh, sigmoid)

SLIDE 22

Neural Networks

Multi-layer Perceptron [Hopfield, 1982]
Thresholding function replaced by a non-linear function (tanh, sigmoid)

Useful for classifying data with non-linear class boundaries: non-linear class separation can be realized given enough data.
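
To make the non-linear separation concrete, here is a sketch of a two-layer network with tanh hidden units that realizes XOR, a problem no single perceptron can solve. The weights are hand-picked for illustration rather than learned, and the gain value is an assumption.

```python
import math

def mlp_xor(x1, x2, gain=10.0):
    """Two tanh hidden units followed by a linear output unit.

    h1 switches on when x1 + x2 > 0.5, h2 when x1 + x2 > 1.5, so
    h1 - h2 - 1 is positive only when exactly one input is 1.
    """
    h1 = math.tanh(gain * (x1 + x2 - 0.5))
    h2 = math.tanh(gain * (x1 + x2 - 1.5))
    return h1 - h2 - 1.0

outputs = {(x1, x2): mlp_xor(x1, x2) for x1 in (0, 1) for x2 in (0, 1)}
labels = {x: (1 if y > 0 else -1) for x, y in outputs.items()}
for x in sorted(labels):
    print(x, "->", labels[x])
```

In practice the weights would be learned by gradient descent, which is possible because tanh and sigmoid are differentiable while a hard threshold is not.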

SLIDE 23

Neural Networks

Types of Non-linearities: tanh, sigmoid, ReLU

Cost Function (computed against the desired outputs): Mean Square Error, Cross Entropy
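
The non-linearities and cost functions named here can be written out as a short sketch. The function names are illustrative. The mean square error is assumed to average over output units, and the cross entropy is assumed to take 1-of-K targets with outputs that already sum to one. Other scaling conventions exist.

```python
import math

def tanh(a):
    return math.tanh(a)

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def relu(a):
    return max(0.0, a)

def mean_square_error(outputs, targets):
    """Average squared difference between outputs and desired outputs."""
    return sum((y - t) ** 2 for y, t in zip(outputs, targets)) / len(targets)

def cross_entropy(outputs, targets, eps=1e-12):
    """-sum_k t_k ln y_k; eps guards against log(0)."""
    return -sum(t * math.log(y + eps) for y, t in zip(outputs, targets))

y = [0.5, 0.5]   # network outputs (hypothetical)
t = [1.0, 0.0]   # desired outputs, 1-of-K encoded
print("MSE:", mean_square_error(y, t))
print("Cross entropy:", cross_entropy(y, t))
```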

Learning Posterior Probabilities with NNs

Choice of target function

probabilities