Regression: Simple and Linear (Introduction to Machine Learning) - PowerPoint PPT Presentation

SLIDE 1

INTRODUCTION TO MACHINE LEARNING

Regression: Simple and Linear

SLIDE 2

Regression Principle

PREDICTORS → REGRESSION → RESPONSE

SLIDE 3

Example

Shop Data: sales, competition, district size, ...

Data Analyst: Relationship?

  • Predictors: competition, advertisement, …
  • Response: sales

Shopkeeper: Predictions!

SLIDE 4

Simple Linear Regression

  • Simple: one predictor to model the response
  • Linear: approximately linear relationship

Linearity Plausible? Scatterplot!

SLIDE 5

Example

  • Relationship: advertisement → sales
  • Expectation: positively correlated
SLIDE 6

Example

  • Observation: upwards linear trend
  • First Step: simple linear regression

[Scatterplot: advertisement (x) vs. sales (y)]

SLIDE 7

Model

Fitting a line:

y = β₀ + β₁x + ε

  • Response: y
  • Predictor: x
  • Intercept: β₀
  • Slope: β₁
  • Statistical Error: ε
SLIDE 8

[Scatterplot: advertisement vs. sales with the fitted line; residuals are the vertical distances]

Estimating Coefficients

  • Residuals: true response − fitted response, e_i = y_i − ŷ_i

Minimize!

RSS = Σ_{i=1..n} (y_i − ŷ_i)²  (n = #observations)

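Minimizing the RSS has a closed-form solution when there is a single predictor. A minimal sketch in Python rather than the slides' R, with made-up toy data:

```python
# Ordinary least squares for one predictor: minimize RSS = sum((y_i - b0 - b1*x_i)^2).
# Closed form: b1 = cov(x, y) / var(x), b0 = mean(y) - b1 * mean(x).

def fit_simple_lm(x, y):
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
    b0 = my - b1 * mx
    return b0, b1

# Toy data lying exactly on y = 2 + 3x: the fit recovers the coefficients.
b0, b1 = fit_simple_lm([1, 2, 3, 4], [5, 8, 11, 14])
print(b0, b1)  # → 2.0 3.0
```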
SLIDE 9

Estimating Coefficients

> my_lm <- lm(sales ~ ads, data = shop_data)

Response: sales; Predictor: ads

> my_lm$coefficients

Returns coefficients

SLIDE 10

Prediction with Regression

Predicting new outcomes

> y_new <- predict(my_lm, x_new, interval = "confidence")

Provides confidence interval

  • y_new: estimated response
  • x_new: new predictor instance (must be a data frame)
  • my_lm: model with the estimated coefficients

Example: Ads = 11.000$ → Predicted Sales = 380.000$
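Once the coefficients are estimated, a point prediction is just the fitted line evaluated at the new predictor value. A minimal Python sketch with hypothetical coefficients chosen so the numbers roughly echo the example above (R's `predict` would additionally return the confidence interval):

```python
# Hypothetical fitted coefficients (intercept and slope) for sales ~ ads.
b0, b1 = 100.0, 25.0

def predict_sales(x_new):
    # Point prediction: y_hat = b0 + b1 * x_new.
    return b0 + b1 * x_new

print(predict_sales(11))  # → 375.0
```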

SLIDE 11

Accuracy: RMSE

Measure of accuracy:

RMSE = √( (1/n) Σ_{i=1..n} (y_i − ŷ_i)² )

  • ŷ_i: estimated response
  • y_i: true response
  • n: #observations

RMSE has the unit and scale of the response → difficult to interpret! Example: RMSE = 76.000$. Meaning?
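The formula above is easy to compute by hand; a short Python sketch with made-up numbers (in R one would work from `my_lm$residuals`):

```python
import math

def rmse(y_true, y_pred):
    # Root mean squared error: square root of the average squared residual.
    n = len(y_true)
    return math.sqrt(sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred)) / n)

# Every residual is ±10 here, so the RMSE is exactly 10.
print(rmse([100, 200, 300], [110, 190, 310]))  # → 10.0
```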

SLIDE 12

Accuracy: R-squared

Interpretation: % of explained variance

R² = 1 − RSS / TSS

  • TSS: total sum of squares, Σ_{i=1..n} (y_i − ȳ)², with ȳ the sample mean response
  • close to 1 → good fit!

> summary(my_lm)$r.squared

Example: 0.84
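The same ratio in a self-contained Python sketch; the two toy calls show the extremes of the scale:

```python
def r_squared(y_true, y_pred):
    # R^2 = 1 - RSS/TSS: fraction of the response variance explained by the model.
    mean_y = sum(y_true) / len(y_true)
    rss = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    tss = sum((yt - mean_y) ** 2 for yt in y_true)
    return 1 - rss / tss

# Perfect predictions give R^2 = 1; predicting the mean everywhere gives R^2 = 0.
print(r_squared([1, 2, 3], [1, 2, 3]))  # → 1.0
print(r_squared([1, 2, 3], [2, 2, 2]))  # → 0.0
```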

SLIDE 13

Let’s practice!

SLIDE 14

Multivariable Linear Regression

SLIDE 15

[Two scatterplots: advertisement vs. sales; nearby competition vs. sales]

Example

Simple Linear Regression:

> lm(sales ~ ads, data = shop_data)


> lm(sales ~ comp, data = shop_data)

Each simple model uses only one predictor: loss of information!

SLIDE 16

Multi-Linear Model

Solution: combine the predictors in one multi-linear model!

Captures the individual effect of each predictor:

  • Higher predictive power
  • Higher accuracy
SLIDE 17

Multi-Linear Regression Model

y = β₀ + β₁x₁ + β₂x₂ + … + βₚxₚ + ε

  • Response: y
  • Predictors: x₁, …, xₚ
  • Coefficients: β₀, β₁, …, βₚ
  • Statistical Error: ε
SLIDE 18

Estimating Coefficients

Minimize the residual sum of squares:

RSS = Σ_{i=1..n} (y_i − ŷ_i)²

  • y_i: true response; ŷ_i: fitted response; residuals: y_i − ŷ_i; n: #observations

SLIDE 19

Extending!

> my_lm <- lm(sales ~ ads + comp + ..., data = shop_data)

More predictors: total inventory, district size, …

Response: sales; Predictors: ads, comp, …

Extend methodology to p predictors:
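The extension to p predictors can be sketched with NumPy's least-squares solver (an illustration, not the slides' R workflow); the column names `ads` and `comp` and the toy values are taken from the running example:

```python
import numpy as np

# Design matrix: a column of ones (intercept) plus one column per predictor.
ads = np.array([1.0, 2.0, 3.0, 4.0])
comp = np.array([2.0, 1.0, 2.0, 1.0])
sales = 10 + 3 * ads - 2 * comp  # noise-free toy responses

X = np.column_stack([np.ones_like(ads), ads, comp])
# lstsq minimizes RSS = ||sales - X @ beta||^2 over all coefficient vectors beta.
beta, *_ = np.linalg.lstsq(X, sales, rcond=None)
print(beta)  # recovers intercept 10, ads coefficient 3, comp coefficient -2
```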

SLIDE 20

RMSE & Adjusted R-Squared

More predictors →

  • Lower RMSE and higher R-squared
  • Higher complexity and cost

Solution: adjusted R-squared

  • Penalizes more predictors
  • Used to compare models

> summary(my_lm)$adj.r.squared

In Example: 0.819 → 0.906
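The penalty is a simple formula; a Python sketch, with n the number of observations and p the number of predictors (the sample values are made up for illustration):

```python
def adjusted_r_squared(r2, n, p):
    # Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1):
    # a larger p shrinks the denominator, inflating the unexplained share,
    # so an extra predictor must add real explanatory power to raise the score.
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Same R^2 = 0.9 on 30 observations: more predictors -> lower adjusted value.
print(round(adjusted_r_squared(0.9, 30, 2), 3))   # → 0.893
print(round(adjusted_r_squared(0.9, 30, 10), 3))  # → 0.847
```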

SLIDE 21

Influence of predictors

  • p-value: indicator of the influence of a parameter
  • a low p-value makes it more likely that the parameter has a significant influence

> summary(my_lm)

Call:
lm(formula = sales ~ ads + comp, data = shop_data)

Residuals:
     Min       1Q   Median       3Q      Max
-131.920  -23.009   -4.448   33.978  146.486

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)   228.740     80.592   2.838 0.009084 **
ads            25.521      5.900   4.325 0.000231 ***
comp          -19.234      4.549  -4.228 0.000296 ***

(Pr(>|t|) column: the p-values)

SLIDE 22

Example

  • Want 95% confidence → p-value ≤ 0.05
  • Want 99% confidence → p-value ≤ 0.01

Note: Do not mix up R-squared with p-values!

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)   228.740     80.592   2.838 0.009084 **
ads            25.521      5.900   4.325 0.000231 ***
comp          -19.234      4.549  -4.228 0.000296 ***

(Pr(>|t|) column: the p-values)

SLIDE 23

Assumptions

  • Just make a model, make a summary and look at p-values?
  • Not that simple!
  • We made some implicit assumptions
SLIDE 24

Verifying Assumptions

[Residual Plot: residuals vs. estimated sales; Normal Q-Q Plot: residual quantiles vs. theoretical quantiles]

> plot(lm_shop$fitted.values, lm_shop$residuals)
> qqnorm(lm_shop$residuals)

Residuals:

  • Identical Normal: is the Q-Q plot approximately a line? (qqnorm draws the normal Q-Q plot)
  • Independent: no pattern in the residual plot?

SLIDE 25

Verifying Assumptions

[Normal Q-Q Plot and Residual Plot, as on the previous slide]

  • Important to avoid mistakes!
  • Alternative tests exist
SLIDE 26

Let’s practice!

SLIDE 27

k-Nearest Neighbors and Generalization

SLIDE 28

Non-Parametric Regression

[Scatterplot: x vs. y, with a clearly nonlinear pattern]

Problem: Visible pattern, but not linear

SLIDE 29

Non-Parametric Regression

Problem: Visible pattern, but not linear

Solutions:

  • Transformation (tedious)
  • Multi-linear Regression (advanced)
  • Non-Parametric Regression (doable)

SLIDE 30

Non-Parametric Regression

Problem: Visible pattern, but not linear

Techniques:

  • k-Nearest Neighbors
  • Kernel Regression
  • Regression Trees

No parameter estimations required!

SLIDE 31

k-NN: Algorithm

[Scatterplot: training set with the new observation marked]

New observation

Given a training set and a new observation:

SLIDE 32


Given a training set and a new observation:

k-NN: Algorithm

1. Calculate the distance in the predictors


SLIDE 33


k-NN: Algorithm


k = 4

2. Select the k nearest

Given a training set and a new observation:

SLIDE 34


k-NN: Algorithm

Mean of 4 responses

3. Aggregate the response of the k nearest

Given a training set and a new observation:

SLIDE 35


k-NN: Algorithm

Prediction

4. The outcome is your prediction

Given a training set and a new observation:

SLIDE 36

Choosing k

  • k = 1: perfect fit on the training set, but poor predictions
  • k = #obs in training set: always predicts the mean, also poor predictions

Reasonable choice: k = 20% of #obs in the training set
Bias-variance trade-off!
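The four algorithm steps, together with a fixed k, can be sketched in a few lines of Python (a toy one-predictor illustration, not a library implementation):

```python
def knn_regress(x_train, y_train, x_new, k):
    # 1. Calculate the distance in the predictors (absolute difference, one predictor).
    dists = [(abs(x - x_new), y) for x, y in zip(x_train, y_train)]
    # 2. Select the k nearest neighbors.
    nearest = sorted(dists)[:k]
    # 3./4. Aggregate their responses: the mean of the k responses is the prediction.
    return sum(y for _, y in nearest) / k

x_train = [1.0, 2.0, 3.0, 10.0]
y_train = [10.0, 20.0, 30.0, 100.0]
# Nearest two neighbors of 2.4 are x = 2.0 and x = 3.0, so the prediction is (20 + 30) / 2.
print(knn_regress(x_train, y_train, 2.4, k=2))  # → 25.0
```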

SLIDE 37

Generalization in Regression

  • You built your own regression model
  • It worked on the training set
  • Does it generalize well?
  • Two techniques:
      • Hold out: simply split the dataset
      • k-fold cross-validation
SLIDE 38

Hold Out Method for Regression

Test set / Training set

1. Build the regression model on the training set
2. Calculate the RMSE within the training set
3. Predict the outcome of the test set
4. Calculate the RMSE within the test set

Compare Test RMSE and Training RMSE
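The hold-out steps can be sketched in Python; the 70/30 split ratio and the synthetic noisy-line data are assumptions made purely for illustration:

```python
import math
import random

# Hold out: split the data, fit on the training part, compare the two RMSEs.
random.seed(42)
data = [(x, 3 * x + random.gauss(0, 1)) for x in range(30)]  # noisy y = 3x
random.shuffle(data)
split = int(0.7 * len(data))  # 70/30 split (an arbitrary choice)
train, test = data[:split], data[split:]

# Fit a simple linear model on the training set only (closed-form OLS).
mx = sum(x for x, _ in train) / len(train)
my = sum(y for _, y in train) / len(train)
b1 = sum((x - mx) * (y - my) for x, y in train) / sum((x - mx) ** 2 for x, _ in train)
b0 = my - b1 * mx

def rmse(pairs):
    # RMSE of the fitted line on any subset of the data.
    return math.sqrt(sum((y - (b0 + b1 * x)) ** 2 for x, y in pairs) / len(pairs))

# Similar training and test RMSE suggests the model generalizes well.
print(rmse(train), rmse(test))
```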

SLIDE 39

[Three fits of the same data: an underfit, a good fit, and an overfit]

Under- and Overfitting

Good fit:
  • Fit: ✔
  • Generalize: ✔
  • Prediction: ✔

Underfit:
  • Fit: ✘
  • Generalize: ✔
  • Prediction: ✘

Overfit:
  • Fit: ✔
  • Generalize: ✘
  • Prediction: ✘

SLIDE 40

Let’s practice!