

SLIDE 1

Linear Regression via Normal Equations

some material thanks to Andrew Ng @Stanford

SLIDE 2

Course Map / Module 1

  • two basic supervised learning algorithms
    • decision trees
    • linear regression
  • two simple datasets
    • housing
    • spam emails

[Course-map diagram: RAW DATA (housing data, spam data) → LABELS / FEATURES → SUPERVISED LEARNING (decision tree, linear regression), alongside CLUSTERING, EVALUATION, ANALYSIS, SELECTION, DIMENSIONS, DATA PROCESSING, TUNING; pipeline stages: DATA → PROBLEM → REPRESENTATION → LEARNING → PERFORMANCE]

SLIDE 3

Module 1 Objectives/Linear Regression

  • Linear Algebra Primer
    • matrix equations, notations
    • matrix manipulations
  • Linear Regression
    • objective, convexity
    • matrix form
    • derivation of normal equations
  • Run regression in practice

SLIDE 4

Matrix data

  • m datapoints/objects Xi = (xi1, xi2, …, xid); i = 1…m
  • d features/columns f1, f2, …, fd
  • label(Xi) = yi, given for each datapoint in the training set

X = \begin{bmatrix}
x_{11} & x_{12} & x_{13} & \cdots & x_{1d} \\
x_{21} & x_{22} & x_{23} & \cdots & x_{2d} \\
\vdots &        &        &        & \vdots \\
x_{m1} & x_{m2} & x_{m3} & \cdots & x_{md}
\end{bmatrix}

(each row is a datapoint, each column a feature)

SLIDE 5

Matrix data / training vs. testing

[Diagram: the data matrix split row-wise into a training block, whose labels are given, and a testing block, whose labels are to be predicted]

SLIDE 6

Regression goal

  • housing data, two features (toy example)
  • regressor = a linear predictor h(x) = θ0 + θ1x1 + θ2x2
  • such that h(x) approximates label(x) = y as closely as possible, measured by square error

SLIDE 7

Regression: Normal Equations

  • Linear regression has a well-known exact solution, given by linear algebra
  • X = training matrix of feature values
  • Y = corresponding labels vector
  • then the regression coefficients θ that minimize objective J are given by the normal equations (a runnable sketch follows the list):

\theta = (X^{\top} X)^{-1} X^{\top} Y
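
To make the "run regression in practice" objective concrete, here is a minimal NumPy sketch of the closed-form fit; the dataset values below are illustrative assumptions, not taken from the deck. Solving the linear system XᵀXθ = XᵀY with np.linalg.solve is the usual, numerically safer alternative to forming the inverse explicitly.

```python
import numpy as np

# toy housing-style data: m = 4 datapoints, d = 2 features (values assumed)
X = np.array([[2104.0, 3.0],
              [1600.0, 3.0],
              [2400.0, 3.0],
              [1416.0, 2.0]])
Y = np.array([400.0, 330.0, 369.0, 232.0])   # labels, e.g. prices

# prepend the bias feature x0 = 1 to every datapoint
Xb = np.hstack([np.ones((X.shape[0], 1)), X])

# normal equations: theta = (X^T X)^{-1} X^T Y,
# solved as a linear system rather than via an explicit inverse
theta = np.linalg.solve(Xb.T @ Xb, Xb.T @ Y)
print(theta)   # (theta0, theta1, theta2)
```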

SLIDE 8

Normal equations: matrix derivatives

  • if a function f takes a matrix A and outputs a real number, then its derivative is the matrix of partial derivatives:

\nabla_A f(A) = \left[ \frac{\partial f}{\partial A_{ij}} \right]_{ij}

  • example: see the worked example below
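
The example itself was a slide image; as a stand-in, here is a worked example in the same style (this particular f is an assumption drawn from the Stanford notes the deck credits, not necessarily the slide's own):

```latex
f(A) = \tfrac{3}{2}A_{11} + 5A_{12}^{2} + A_{21}A_{22}, \qquad A \in \mathbb{R}^{2\times 2}

\nabla_A f(A) =
\begin{bmatrix}
\partial f/\partial A_{11} & \partial f/\partial A_{12}\\
\partial f/\partial A_{21} & \partial f/\partial A_{22}
\end{bmatrix}
=
\begin{bmatrix}
\tfrac{3}{2} & 10\,A_{12}\\
A_{22} & A_{21}
\end{bmatrix}
```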

SLIDE 9

Normal equations: matrix trace

  • trace(A) = sum of the entries on the main diagonal of A
  • easy properties (listed below)
  • advanced properties (listed below)
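
The property lists on this slide were images; the standard trace identities below, from the same Stanford notes, are very likely what was shown, though the exact split into easy and advanced is an assumption:

```latex
% easy properties
\operatorname{tr}(A) = \operatorname{tr}(A^{\top}), \qquad
\operatorname{tr}(A + B) = \operatorname{tr}(A) + \operatorname{tr}(B), \qquad
\operatorname{tr}(cA) = c\operatorname{tr}(A), \qquad
\operatorname{tr}(AB) = \operatorname{tr}(BA)

% advanced properties (trace derivatives used in the normal-equation derivation)
\nabla_A \operatorname{tr}(AB) = B^{\top}, \qquad
\nabla_{A^{\top}} f(A) = \bigl(\nabla_A f(A)\bigr)^{\top}, \qquad
\nabla_A \operatorname{tr}(ABA^{\top}C) = CAB + C^{\top}AB^{\top}
```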

SLIDE 10

Regression checkpoint: matrix derivative and trace

  • 1) in the example a few slides ago, explain how the matrix of derivatives was calculated
  • 2) derive on paper the first three advanced matrix trace properties

SLIDE 11

Normal equations: mean square error

  • data X and labels Y
  • error (difference) for regressor h on datapoint i: h(Xi) − yi
  • square error objective J (written out below)
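
The formulas were slide images; here is the standard least-squares objective they describe, reconstructed to agree with the normal-equations solution on slide 7 (the ½ factor is the conventional choice and an assumption):

```latex
\text{error vector: } X\theta - Y, \qquad
J(\theta) = \frac{1}{2}\sum_{i=1}^{m}\bigl(h_{\theta}(X_i) - y_i\bigr)^{2}
          = \frac{1}{2}(X\theta - Y)^{\top}(X\theta - Y)
```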

SLIDE 12

Normal equations: mean square error differential

  • minimize J => set the derivative to zero (derivation below):
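
The derivation itself was an image; the standard steps, consistent with the objective above, are as follows (the slide's trace-based route reaches the same result):

```latex
\nabla_{\theta} J(\theta) = X^{\top}(X\theta - Y) = 0
\;\Longrightarrow\; X^{\top}X\,\theta = X^{\top}Y
\;\Longrightarrow\; \theta = (X^{\top}X)^{-1}X^{\top}Y
```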

SLIDE 13

Linear regression: use on test points

  • x = (x1, x2, …, xd) test point
  • h = (θ0, θ1, …, θd) regression model
  • apply the regressor to get a predicted label (add the bias feature x0 = 1): h(x) = θ0 + θ1x1 + … + θdxd
  • if y = label(x) is given, measure the error
    • absolute difference |y − h(x)|
    • square error (y − h(x))²
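
Continuing the NumPy sketch from slide 7 (the theta values and test point below are illustrative assumptions), prediction and the two error measures look like this:

```python
import numpy as np

theta = np.array([0.5, 0.15, 20.0])   # (theta0, theta1, theta2), assumed values
x = np.array([1800.0, 3.0])           # test point with d = 2 features
y = 350.0                             # its label, if given

# add the bias feature x0 = 1, then apply the regressor h(x) = theta^T [1; x]
h_x = theta @ np.concatenate(([1.0], x))

abs_err = abs(y - h_x)                # absolute difference |y - h(x)|
sq_err = (y - h_x) ** 2               # square error (y - h(x))^2
print(h_x, abs_err, sq_err)
```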

SLIDE 14

Logistic regression

  • Logistic transformation: g(z) = 1 / (1 + e^(−z))
  • Logistic differential: g′(z) = g(z)(1 − g(z))
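
For completeness, the one-line derivation of the differential:

```latex
g'(z) = \frac{d}{dz}\,\frac{1}{1 + e^{-z}}
      = \frac{e^{-z}}{(1 + e^{-z})^{2}}
      = \frac{1}{1 + e^{-z}} \cdot \frac{e^{-z}}{1 + e^{-z}}
      = g(z)\bigl(1 - g(z)\bigr)
```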

SLIDE 15

Logistic regression

  • Logistic regression function: hθ(x) = g(θᵀx) = 1 / (1 + e^(−θᵀx))
  • solve the same optimization problem as before
  • no exact solution this time; will use gradient descent (numerical methods) next module (a brief preview follows)
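
As a brief preview of the gradient-descent approach (the data, learning rate, and iteration count below are illustrative assumptions; the next module covers this properly):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# toy binary-labeled data (assumed); rows already include the bias feature x0 = 1
X = np.array([[1.0, 0.5], [1.0, 2.0], [1.0, -1.0], [1.0, 3.0]])
y = np.array([0.0, 1.0, 0.0, 1.0])

theta = np.zeros(X.shape[1])
lr = 0.1                                     # learning rate (assumed)
for _ in range(1000):
    grad = X.T @ (sigmoid(X @ theta) - y)    # gradient of the logistic loss
    theta -= lr * grad                       # one gradient-descent step
print(theta)
```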

SLIDE 16

Linear Regression Screencast

  • http://www.screencast.com/t/U3usp6TyrOL