Machine Learning 2007: Lecture 7. Instructor: Tim van Erven.



SLIDE 1

Machine Learning 2007: Lecture 7 Instructor: Tim van Erven (Tim.van.Erven@cwi.nl) Website: www.cwi.nl/~erven/teaching/0708/ml/

October 18, 2007

SLIDE 2

Overview


  • Organisational Matters
  • Answers Exercises 2
  • Linear Functions as Inner Products
  • Vector Valued Outputs in Regression and Classification
  • Neural Networks and the Perceptron

      • Neural Networks
      • The Perceptron
      • Implementing Boolean Functions with a Perceptron

  • Convex Functions
  • Gradient Descent (part 1)
SLIDES 3–6

Course Organisation

  • Room of the intermediate exam changed to: Q105.
  • It is not necessary to enroll on tisvu.
  • The next lecture (in two weeks) will be on Wednesday, 13.30–15.15, in room KC159.
  • Do not submit Office 2007 (.docx) files for the homework. PDF is preferred; older Office (.doc) is acceptable.

Mitchell:

  • Read: Chapter 4, sections 4.1–4.4.

This Lecture:

  • The explanation of linear functions as inner products is needed to understand Mitchell.
  • Neural networks are in Mitchell. I have some extra pictures.
  • Convex functions are not discussed in Mitchell.
  • I will give more background on gradient descent.
SLIDES 9–10

Linear Functions as Inner Products

Linear Function:

hw(x) = w0 + w1x1 + … + wdxd

  • x = (x1, …, xd)⊤ is a d-dimensional feature vector.
  • w = (w0, w1, …, wd)⊤ is a (d + 1)-dimensional weight vector.

As Inner Products (a standard trick):

We may change x into a (d + 1)-dimensional vector x′ by adding an imaginary extra feature x0, which always has value 1:

x = (x1, …, xd)⊤ ⇒ x′ = (1, x1, …, xd)⊤

hw(x) = Σ_{i=0}^{d} wi x′i = ⟨w, x′⟩

  • Mitchell writes w · x′ for ⟨w, x′⟩.
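The bias trick above can be sketched in a few lines of Python: prepend the constant feature x0 = 1 and the whole linear function becomes a single inner product (the function name `h` and the example weights are illustrative, not from the slides).

```python
def h(w, x):
    """Linear function h_w(x) = w0 + w1*x1 + ... + wd*xd,
    computed as the inner product <w, x'> with x' = (1, x1, ..., xd)."""
    x_prime = [1.0] + list(x)  # prepend the imaginary extra feature x0 = 1
    return sum(wi * xi for wi, xi in zip(w, x_prime))

w = [2.0, -1.0, 0.5]      # (w0, w1, w2)
print(h(w, [3.0, 4.0]))   # w0 + w1*3 + w2*4 = 2 - 3 + 2 = 1.0
```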
slide-11
SLIDE 11

Overview

Organisational Matters Answers Exercises 2 Linear Functions as Inner Products Vector Valued Outputs in Regression and Classification Neural Networks and the Perceptron Convex Functions Gradient Descent 7 / 26

  • Organisational Matters
  • Answers Exercises 2
  • Linear Functions as Inner Products
  • Vector Valued Outputs in Regression and Classification
  • Neural Networks and the Perceptron

Neural Networks

The Perceptron

Implementing Boolean Functions with a Perceptron

  • Convex Functions
  • Gradient Descent (part 1)
SLIDES 12–13

Vector Valued Outputs

Reminder:

  • Regression: Predict the label y for any feature vector x. Typically y can take infinitely many values.
  • Classification: Predict the class label y for any new feature vector x. There are only finitely many categories for y.

Vector Valued Outputs:

  • In our definition the label y is a single value.
  • This can be generalised to a label vector y.
  • Neural networks typically output label vectors.
SLIDE 15

Biology

A Neuron [Wikimedia Commons]:

[Diagram of a neuron: dendrites, cell body, nucleus, axon, myelin sheath, Schwann cells, nodes of Ranvier, axon terminal]

The Brain:

  • The brain is a complex network of approximately 10^11 = 100 000 000 000 neurons.
  • On average each neuron is connected to approximately 10^4 = 10 000 other neurons.
  • Each neuron has many input channels (dendrites) and one output channel (axon).
SLIDES 16–17

Artificial Neurons

An Artificial Neuron:

An (artificial) neuron is some function h that gets a feature vector x as input and outputs a (single) label y.

The Perceptron:

The most famous type of (artificial) neuron is the perceptron:

hw(x) = 1 if w0 + w1x1 + … + wdxd > 0, and −1 otherwise.

  • Applies a threshold to a linear function of x.
  • Has parameters w.
SLIDE 18

Artificial Neural Networks

Organisational Matters Answers Exercises 2 Linear Functions as Inner Products Vector Valued Outputs in Regression and Classification Neural Networks and the Perceptron Convex Functions Gradient Descent 12 / 26 INPUTS HIDDEN NEURONS OUTPUT NEURONS x1 x2 x4 x3 x5 x6 y1 y2 y3 y4 OUTPUTS

  • We can create an (artificial) neural network (NN) by

connecting neurons together.

  • We hook up our feature vector x to the input neurons in the
  • network. We get a label vector y from the output neurons.
slide-19
SLIDE 19

Artificial Neural Networks

Organisational Matters Answers Exercises 2 Linear Functions as Inner Products Vector Valued Outputs in Regression and Classification Neural Networks and the Perceptron Convex Functions Gradient Descent 12 / 26 INPUTS HIDDEN NEURONS OUTPUT NEURONS x1 x2 x4 x3 x5 x6 y1 y2 y3 y4 OUTPUTS

  • We can create an (artificial) neural network (NN) by

connecting neurons together.

  • We hook up our feature vector x to the input neurons in the
  • network. We get a label vector y from the output neurons.
  • The parameters of the neural network w consist of all the

parameters of the neurons in the network taken together in

  • ne vector.
SLIDES 20–22

Why Study Neural Networks?

Modelling Biology:

  • Some researchers want to study biological learning processes.
  • They may try to model them using artificial neural networks.
  • This is not us!
  • In machine learning we often use artificial neural networks that are poor models of biological neural networks.

Obtaining Effective ML Algorithms:

  • We want effective machine learning algorithms.
  • An (artificial) neural network is a hypothesis space H.
  • Each setting of the parameters w corresponds to a different hypothesis hw ∈ H.
  • This hypothesis space may be used for regression or classification.

SLIDE 23

NN Example: ALVINN

[ALVINN network diagram: a 30×32 sensor input retina feeds 4 hidden units, which feed 30 output units ranging from Sharp Left through Straight Ahead to Sharp Right]

SLIDES 25–27

Different Views of The Perceptron

Simple Neural Network (Mitchell's Drawing):

[Diagram: inputs x1, …, x4 feed a single output neuron, which produces the output y1]

Equation:

hw(x) = 1 if w0 + w1x1 + … + wdxd > 0, and −1 otherwise.

  • One of the simplest neural networks consists of just one perceptron neuron.
  • A perceptron does classification.
  • The network has no hidden units, and just one output.
  • It may have any number of inputs.
SLIDES 29–30

Decision Boundary of the Perceptron

Decision boundary: w0 + w1x1 + … + wdxd = 0

  • This is where the perceptron changes its output y from −1 (-) to +1 (+) if we change x a little bit.
  • Always a line (a hyperplane in more than two dimensions).

Examples of Different Weights (with Boolean inputs: −1 = false, 1 = true):

[Two plots in the (x1, x2)-plane showing the decision lines for AND and OR]

  • AND: w0 = −0.8, w1 = 0.5, w2 = 0.5
  • OR: w0 = 0.3, w1 = 0.5, w2 = 0.5 (Wrong in Mitchell!)
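The AND and OR weights above can be checked on all four Boolean inputs (−1 = false, 1 = true); the `perceptron` helper just restates the slide's equation in code:

```python
def perceptron(w, x):
    # Perceptron from the slides: +1 if w0 + w1*x1 + w2*x2 > 0, else -1.
    return 1 if w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)) > 0 else -1

AND_W = [-0.8, 0.5, 0.5]  # AND weights from the slide
OR_W = [0.3, 0.5, 0.5]    # OR weights from the slide

for x1 in (-1, 1):
    for x2 in (-1, 1):
        # AND outputs +1 only for (1, 1); OR outputs -1 only for (-1, -1).
        print(x1, x2, perceptron(AND_W, [x1, x2]), perceptron(OR_W, [x1, x2]))
```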

SLIDES 31–32

Perceptron Cannot Represent Exclusive Or

Exclusive Or:

[Plot of the four Boolean inputs in the (x1, x2)-plane, labelled '+' where exactly one input is true and '-' otherwise]

  • There exists no line that separates the inputs with label '-' from the inputs with label '+'. They are not linearly separable.
  • The decision boundary for the perceptron is always a line.
  • Hence a perceptron can never implement the 'exclusive or' function, whichever weights we choose.
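A network with a hidden layer escapes this limitation. A minimal sketch (this construction is not from the slides): XOR(x1, x2) = OR(x1, x2) AND NOT AND(x1, x2), where the output neuron's negative weight on the AND unit plays the role of NOT.

```python
def perceptron(w, x):
    # +1 if w0 + w1*x1 + ... > 0, else -1.
    return 1 if w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)) > 0 else -1

def xor_net(x1, x2):
    # Hidden layer: the OR and AND perceptrons, using the weights from the
    # decision-boundary slide.
    h_or = perceptron([0.3, 0.5, 0.5], [x1, x2])
    h_and = perceptron([-0.8, 0.5, 0.5], [x1, x2])
    # Output neuron: fires only when OR is true and AND is false.
    return perceptron([-0.8, 0.5, -0.5], [h_or, h_and])

for x1 in (-1, 1):
    for x2 in (-1, 1):
        print(x1, x2, xor_net(x1, x2))  # +1 exactly when x1 != x2
```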

SLIDES 34–36

Convex Functions

Intuition:

[Plot of f(x) = x² with a chord drawn between two points on its graph]

  • A function is convex if it lies below the line between any two of its points, for example between f(−3) and f(7).

Definition: A function f(x) is convex if

f(αx1 + (1 − α)x2) ≤ αf(x1) + (1 − α)f(x2)

for any two inputs x1, x2 and any 0 ≤ α ≤ 1.
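The definition can be spot-checked numerically: for fixed x1, x2, sweep α over a grid and test the inequality. The helper `convex_on` is a hypothetical name for this check; it is evidence, not a proof of convexity.

```python
def convex_on(f, x1, x2, steps=100):
    """Check f(a*x1 + (1-a)*x2) <= a*f(x1) + (1-a)*f(x2)
    on a grid of mixing weights a in [0, 1] (a numeric spot check)."""
    for i in range(steps + 1):
        a = i / steps
        if f(a * x1 + (1 - a) * x2) > a * f(x1) + (1 - a) * f(x2) + 1e-12:
            return False
    return True

print(convex_on(lambda x: x * x, -3.0, 7.0))   # True: x^2 lies below its chords
print(convex_on(lambda x: x ** 3, -2.0, 2.0))  # False: x^3 rises above the chord
```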

SLIDES 37–43

Examples

Convex:

[Plots of convex functions, including f(x) = x²]

Not Convex:

[Plots of non-convex functions, including f(x) = x³]

SLIDES 45–47

Gradient Descent

  • Gradient descent is a method to find the minimum minx f(x) of a function.
  • It works for convex functions.
  • But not for some other functions.

General Idea:

1. Pick a random starting point x1.
2. Take a little step in the direction opposite to the derivative: −f′(x1).
3. Now we are at x2.
4. Take a little step in the direction opposite to the derivative: −f′(x2).
5. Keep taking little steps until f′(xm) ≈ 0: we have reached the minimum.

To be continued next lecture. . .
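The general idea above, in a minimal one-dimensional sketch (the slides defer the details to the next lecture; the step size `lr` and stopping tolerance `tol` are choices made here for illustration):

```python
def gradient_descent(df, x, lr=0.1, tol=1e-8, max_steps=10_000):
    """Minimise a differentiable function given its derivative df.
    Repeatedly steps against the derivative until |f'(x)| is tiny."""
    for _ in range(max_steps):
        g = df(x)
        if abs(g) < tol:  # f'(x) ~ 0: we have reached the minimum
            break
        x -= lr * g       # little step opposite to the derivative
    return x

# Minimise f(x) = x^2, whose derivative is f'(x) = 2x; the minimum is at x = 0.
x_min = gradient_descent(lambda x: 2 * x, x=5.0)
print(round(x_min, 6))    # approximately 0.0
```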

SLIDE 49

References

  • Picture of a neuron taken from Wikimedia Commons, http://commons.wikimedia.org/wiki/Image:Neuron.svg. Originally Neuron.jpg, a US Federal (public domain) image (Nerve Tissue, retrieved March 2007), redrawn by User:Dhp1080 in Illustrator. Source: "Anatomy and Physiology" by the US National Cancer Institute's Surveillance, Epidemiology and End Results (SEER) Program.
  • S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004.
  • T.M. Mitchell. Machine Learning. McGraw-Hill, 1997.