Machine Learning 2007: Lecture 7. Instructor: Tim van Erven.



SLIDE 1

Machine Learning 2007: Lecture 7 Instructor: Tim van Erven (Tim.van.Erven@cwi.nl) Website: www.cwi.nl/~erven/teaching/0708/ml/

October 18, 2007

SLIDE 2

Overview


  • Organisational Matters
  • Answers Exercises 2
  • Linear Functions as Inner Products
  • Vector Valued Outputs in Regression and Classification
  • Neural Networks and the Perceptron

      • Neural Networks
      • The Perceptron
      • Implementing Boolean Functions with a Perceptron

  • Convex Functions
  • Gradient Descent (part 1)
SLIDES 3–6

Course Organisation

  • Room of the intermediate exam changed to: Q105.
  • It is not necessary to enroll on tisvu.
  • The next lecture (in two weeks) will be on Wednesday, 13.30–15.15, in room KC159.
  • Do not submit Office 2007 (.docx) files for the homework. PDF is preferred; older Office (.doc) is acceptable.

Mitchell:

  • Read: Chapter 4, sections 4.1–4.4.

This Lecture:

  • The explanation of linear functions as inner products is needed to understand Mitchell.
  • Neural networks are in Mitchell. I have some extra pictures.
  • Convex functions are not discussed in Mitchell.
  • I will give more background on gradient descent.
SLIDES 9–10

Linear Functions as Inner Products

Linear Function:

hw(x) = w0 + w1x1 + … + wdxd

  • x = (x1, …, xd)⊤ is a d-dimensional feature vector.
  • w = (w0, w1, …, wd)⊤ is a (d + 1)-dimensional weight vector.

As Inner Products (a standard trick):

We may change x into a (d + 1)-dimensional vector x′ by adding an imaginary extra feature x0, which always has value 1:

x = (x1, …, xd)⊤ ⇒ x′ = (1, x1, …, xd)⊤

hw(x) = Σ_{i=0}^{d} wi x′i = ⟨w, x′⟩

  • Mitchell writes w · x′ for ⟨w, x′⟩.
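The bias trick above can be sketched in a few lines of Python: prepend the constant feature x0 = 1 and the whole linear function becomes a single inner product (the function name `h` and the example weights are illustrative, not from the slides).

```python
def h(w, x):
    """Linear function h_w(x) = w0 + w1*x1 + ... + wd*xd,
    computed as the inner product <w, x'> with x' = (1, x1, ..., xd)."""
    x_prime = [1.0] + list(x)  # prepend the imaginary extra feature x0 = 1
    return sum(wi * xi for wi, xi in zip(w, x_prime))

w = [2.0, -1.0, 0.5]      # (w0, w1, w2)
print(h(w, [3.0, 4.0]))   # w0 + w1*3 + w2*4 = 2 - 3 + 2 = 1.0
```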
slide-11
SLIDE 11

Overview

Organisational Matters Answers Exercises 2 Linear Functions as Inner Products Vector Valued Outputs in Regression and Classification Neural Networks and the Perceptron Convex Functions Gradient Descent 7 / 26

  • Organisational Matters
  • Answers Exercises 2
  • Linear Functions as Inner Products
  • Vector Valued Outputs in Regression and Classification
  • Neural Networks and the Perceptron

Neural Networks

The Perceptron

Implementing Boolean Functions with a Perceptron

  • Convex Functions
  • Gradient Descent (part 1)
SLIDES 12–13

Vector Valued Outputs

Reminder:

  • Regression: Predict the label y for any feature vector x. Typically y can take infinitely many values.
  • Classification: Predict the class label y for any new feature vector x. There are only finitely many categories for y.

Vector Valued Outputs:

  • In our definition the label y is a single value.
  • This can be generalised to a label vector y.
  • Neural networks typically output label vectors.
SLIDE 15

Biology

A Neuron [Wikimedia Commons]:

[Diagram of a neuron: dendrites, cell body, nucleus, axon, myelin sheath, Schwann cells, nodes of Ranvier, axon terminal]

The Brain:

  • The brain is a complex network of approximately 10^11 = 100 000 000 000 neurons.
  • On average each neuron is connected to approximately 10^4 = 10 000 other neurons.
  • Each neuron has many input channels (dendrites) and one output channel (axon).
SLIDES 16–17

Artificial Neurons

An Artificial Neuron:

An (artificial) neuron is some function h that gets a feature vector x as input and outputs a (single) label y.

The Perceptron:

The most famous type of (artificial) neuron is the perceptron:

hw(x) = 1 if w0 + w1x1 + … + wdxd > 0, and −1 otherwise.

  • Applies a threshold to a linear function of x.
  • Has parameters w.
SLIDE 18

Artificial Neural Networks

Organisational Matters Answers Exercises 2 Linear Functions as Inner Products Vector Valued Outputs in Regression and Classification Neural Networks and the Perceptron Convex Functions Gradient Descent 12 / 26 INPUTS HIDDEN NEURONS OUTPUT NEURONS x1 x2 x4 x3 x5 x6 y1 y2 y3 y4 OUTPUTS

  • We can create an (artificial) neural network (NN) by

connecting neurons together.

  • We hook up our feature vector x to the input neurons in the
  • network. We get a label vector y from the output neurons.
slide-19
SLIDE 19

Artificial Neural Networks

Organisational Matters Answers Exercises 2 Linear Functions as Inner Products Vector Valued Outputs in Regression and Classification Neural Networks and the Perceptron Convex Functions Gradient Descent 12 / 26 INPUTS HIDDEN NEURONS OUTPUT NEURONS x1 x2 x4 x3 x5 x6 y1 y2 y3 y4 OUTPUTS

  • We can create an (artificial) neural network (NN) by

connecting neurons together.

  • We hook up our feature vector x to the input neurons in the
  • network. We get a label vector y from the output neurons.
  • The parameters of the neural network w consist of all the

parameters of the neurons in the network taken together in

  • ne vector.
SLIDES 20–22

Why Study Neural Networks?

Modelling Biology:

  • Some researchers want to study biological learning processes.
  • They may try to model them using artificial neural networks.
  • This is not us!
  • In machine learning we often use artificial neural networks that are poor models of biological neural networks.

Obtaining Effective ML Algorithms:

  • We want effective machine learning algorithms.
  • An (artificial) neural network is a hypothesis space H.
  • Each setting of the parameters w corresponds to a different hypothesis hw ∈ H.
  • This hypothesis space may be used for regression or classification.

SLIDE 23

NN Example: ALVINN

[ALVINN network diagram: a 30×32 sensor input retina feeds 4 hidden units, which feed 30 output units ranging from Sharp Left through Straight Ahead to Sharp Right]

SLIDES 25–27

Different Views of The Perceptron

Simple Neural Network (Mitchell's Drawing):

[Diagram: inputs x1, …, x4 feed a single output neuron, which produces the output y1]

Equation:

hw(x) = 1 if w0 + w1x1 + … + wdxd > 0, and −1 otherwise.

  • One of the simplest neural networks consists of just one perceptron neuron.
  • A perceptron does classification.
  • The network has no hidden units, and just one output.
  • It may have any number of inputs.
SLIDES 29–30

Decision Boundary of the Perceptron

Decision boundary: w0 + w1x1 + … + wdxd = 0

  • This is where the perceptron changes its output y from −1 (-) to +1 (+) if we change x a little bit.
  • Always a line (a hyperplane in more than two dimensions).

Examples of Different Weights (with Boolean inputs: −1 = false, 1 = true):

[Two plots in the (x1, x2)-plane showing the decision lines for AND and OR]

  • AND: w0 = −0.8, w1 = 0.5, w2 = 0.5
  • OR: w0 = 0.3, w1 = 0.5, w2 = 0.5 (Wrong in Mitchell!)
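The AND and OR weights above can be checked on all four Boolean inputs (−1 = false, 1 = true); the `perceptron` helper just restates the slide's equation in code:

```python
def perceptron(w, x):
    # Perceptron from the slides: +1 if w0 + w1*x1 + w2*x2 > 0, else -1.
    return 1 if w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)) > 0 else -1

AND_W = [-0.8, 0.5, 0.5]  # AND weights from the slide
OR_W = [0.3, 0.5, 0.5]    # OR weights from the slide

for x1 in (-1, 1):
    for x2 in (-1, 1):
        # AND outputs +1 only for (1, 1); OR outputs -1 only for (-1, -1).
        print(x1, x2, perceptron(AND_W, [x1, x2]), perceptron(OR_W, [x1, x2]))
```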

SLIDES 31–32

Perceptron Cannot Represent Exclusive Or

Exclusive Or:

[Plot of the four Boolean inputs in the (x1, x2)-plane, labelled '+' where exactly one input is true and '-' otherwise]

  • There exists no line that separates the inputs with label '-' from the inputs with label '+'. They are not linearly separable.
  • The decision boundary for the perceptron is always a line.
  • Hence a perceptron can never implement the 'exclusive or' function, whichever weights we choose.
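A network with a hidden layer escapes this limitation. A minimal sketch (this construction is not from the slides): XOR(x1, x2) = OR(x1, x2) AND NOT AND(x1, x2), where the output neuron's negative weight on the AND unit plays the role of NOT.

```python
def perceptron(w, x):
    # +1 if w0 + w1*x1 + ... > 0, else -1.
    return 1 if w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)) > 0 else -1

def xor_net(x1, x2):
    # Hidden layer: the OR and AND perceptrons, using the weights from the
    # decision-boundary slide.
    h_or = perceptron([0.3, 0.5, 0.5], [x1, x2])
    h_and = perceptron([-0.8, 0.5, 0.5], [x1, x2])
    # Output neuron: fires only when OR is true and AND is false.
    return perceptron([-0.8, 0.5, -0.5], [h_or, h_and])

for x1 in (-1, 1):
    for x2 in (-1, 1):
        print(x1, x2, xor_net(x1, x2))  # +1 exactly when x1 != x2
```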

SLIDES 34–36

Convex Functions

Intuition:

[Plot of f(x) = x² with a chord drawn between two points on its graph]

  • A function is convex if it lies below the line between any two of its points, for example between f(−3) and f(7).

Definition: A function f(x) is convex if

f(αx1 + (1 − α)x2) ≤ αf(x1) + (1 − α)f(x2)

for any two inputs x1, x2 and any 0 ≤ α ≤ 1.
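The definition can be spot-checked numerically: for fixed x1, x2, sweep α over a grid and test the inequality. The helper `convex_on` is a hypothetical name for this check; it is evidence, not a proof of convexity.

```python
def convex_on(f, x1, x2, steps=100):
    """Check f(a*x1 + (1-a)*x2) <= a*f(x1) + (1-a)*f(x2)
    on a grid of mixing weights a in [0, 1] (a numeric spot check)."""
    for i in range(steps + 1):
        a = i / steps
        if f(a * x1 + (1 - a) * x2) > a * f(x1) + (1 - a) * f(x2) + 1e-12:
            return False
    return True

print(convex_on(lambda x: x * x, -3.0, 7.0))   # True: x^2 lies below its chords
print(convex_on(lambda x: x ** 3, -2.0, 2.0))  # False: x^3 rises above the chord
```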

SLIDES 37–43

Examples

Convex:

[Plots of convex functions, including f(x) = x²]

Not Convex:

[Plots of non-convex functions, including f(x) = x³]

SLIDES 45–47

Gradient Descent

  • Gradient descent is a method to find the minimum minx f(x) of a function.
  • It works for convex functions.
  • But not for some other functions.

General Idea:

1. Pick a random starting point x1.
2. Take a little step in the direction opposite to the derivative: −f′(x1).
3. Now we are at x2.
4. Take a little step in the direction opposite to the derivative: −f′(x2).
5. Keep taking little steps until f′(xm) ≈ 0: we have reached the minimum.

To be continued next lecture. . .
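The general idea above, in a minimal one-dimensional sketch (the slides defer the details to the next lecture; the step size `lr` and stopping tolerance `tol` are choices made here for illustration):

```python
def gradient_descent(df, x, lr=0.1, tol=1e-8, max_steps=10_000):
    """Minimise a differentiable function given its derivative df.
    Repeatedly steps against the derivative until |f'(x)| is tiny."""
    for _ in range(max_steps):
        g = df(x)
        if abs(g) < tol:  # f'(x) ~ 0: we have reached the minimum
            break
        x -= lr * g       # little step opposite to the derivative
    return x

# Minimise f(x) = x^2, whose derivative is f'(x) = 2x; the minimum is at x = 0.
x_min = gradient_descent(lambda x: 2 * x, x=5.0)
print(round(x_min, 6))    # approximately 0.0
```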

SLIDE 49

References

  • Picture of a neuron taken from Wikimedia Commons, http://commons.wikimedia.org/wiki/Image:Neuron.svg. Originally Neuron.jpg, a US Federal (public domain) image (Nerve Tissue, retrieved March 2007), redrawn by User:Dhp1080 in Illustrator. Source: "Anatomy and Physiology" by the US National Cancer Institute's Surveillance, Epidemiology and End Results (SEER) Program.
  • S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004.
  • T.M. Mitchell. Machine Learning. McGraw-Hill, 1997.