
RECSM Summer School: Machine Learning for Social Sciences – Session 1.3: Supervised Learning and Model Accuracy



  1. RECSM Summer School: Machine Learning for Social Sciences. Session 1.3: Supervised Learning and Model Accuracy. Reto Wüest, Department of Political Science and International Relations, University of Geneva.

  2. Supervised Learning

  3. Supervised Learning: Statistical Decision Theory

  4. Statistical Decision Theory
     • Let $X \in \mathbb{R}^p$ be a vector of input variables and $Y \in \mathbb{R}$ an output variable, with joint distribution $\Pr(X, Y)$.
     • Our goal is to find a function $f(X)$ for predicting $Y$ given values of $X$.
     • We need a loss function $L(Y, f(X))$ that penalizes errors in prediction.
     • The most common loss function is squared error loss
       $$L(Y, f(X)) = (Y - f(X))^2. \tag{1.3.1}$$

  5. Statistical Decision Theory
     • The expected prediction error, or expected test error, is
       $$\mathbb{E}(Y - f(X))^2. \tag{1.3.2}$$
     • We choose $f$ so as to minimize the expected test error.
     • The solution is the conditional expectation
       $$f(x) = \mathbb{E}(Y \mid X = x). \tag{1.3.3}$$
     • Hence, the best prediction of $Y$ at the point $X = x$ is the conditional expectation.
     • Let's look at two simple methods that differ in how they approximate the conditional expectation. A short derivation of why (1.3.3) solves the minimization problem follows below.
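The slide states the solution without the intermediate step. The following standard derivation (not part of the original deck) shows it: conditioning on $X = x$ reduces the problem to choosing a single constant $c$ pointwise, and the quadratic in $c$ is minimized at the conditional mean.

```latex
% Pointwise at X = x, minimize the conditional expected squared error over c:
\mathbb{E}\!\left[(Y - c)^2 \mid X = x\right]
  = \mathbb{E}\!\left[Y^2 \mid X = x\right]
    - 2c\,\mathbb{E}\!\left[Y \mid X = x\right] + c^2.
% Setting the derivative with respect to c to zero:
-2\,\mathbb{E}\!\left[Y \mid X = x\right] + 2c = 0
  \quad\Longrightarrow\quad
  c = \mathbb{E}\!\left[Y \mid X = x\right],
% which is exactly the conditional expectation in (1.3.3).
```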

  6. Supervised Learning: Method I – Linear Model and Least Squares

  7. Linear Model and Least Squares
     • In linear regression, we specify a model to estimate the conditional expectation in (1.3.3):
       $$f(x) = x^T \beta. \tag{1.3.4}$$
     • Using the method of least squares, we choose $\beta$ to minimize the residual sum of squares
       $$\mathrm{RSS}(\beta) = \sum_{i=1}^{N} (y_i - x_i^T \beta)^2. \tag{1.3.5}$$
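As a minimal numerical sketch of (1.3.4)–(1.3.5), not from the slides: the sample size, coefficient values, and noise scale below are invented for illustration, and `numpy.linalg.lstsq` computes the $\beta$ that minimizes the residual sum of squares.

```python
import numpy as np

# Simulated training data (illustrative values, not from the slides):
# N observations of p = 2 inputs, plus an intercept column.
rng = np.random.default_rng(0)
N = 100
X = np.column_stack([np.ones(N), rng.normal(size=(N, 2))])
beta_true = np.array([1.0, 2.0, -0.5])   # hypothetical coefficients
y = X @ beta_true + rng.normal(scale=0.5, size=N)

# Least squares: choose beta to minimize
# RSS(beta) = sum_i (y_i - x_i^T beta)^2, as in (1.3.5).
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

print("estimated beta:", beta_hat)
print("RSS:", np.sum((y - X @ beta_hat) ** 2))
```

Solving via `lstsq` (an SVD-based routine) is numerically more stable than forming the normal equations $(X^T X)^{-1} X^T y$ explicitly, though both yield the same minimizer here.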

  8. Linear Model and Least Squares – Example
     • The goal is to predict the outcome variable $G \in \{\text{blue}, \text{orange}\}$ on the basis of training data on inputs $X_1 \in \mathbb{R}$ and $X_2 \in \mathbb{R}$.
     • We fit a linear regression to the training data, with $Y$ coded as 0 for blue and 1 for orange.
     • Fitted values $\hat{Y}$ are converted to a fitted class $\hat{G}$ as follows:
       $$\hat{G} = \begin{cases} \text{orange} & \text{if } \hat{Y} > 0.5, \\ \text{blue} & \text{if } \hat{Y} \leq 0.5. \end{cases} \tag{1.3.6}$$
     • In the figure on the original slide, the set of points classified as orange is $\{x \in \mathbb{R}^2 : x^T \hat{\beta} > 0.5\}$ and the set of points classified as blue is $\{x \in \mathbb{R}^2 : x^T \hat{\beta} \leq 0.5\}$. The linear decision boundary separating the two predicted classes is $\{x \in \mathbb{R}^2 : x^T \hat{\beta} = 0.5\}$.
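The procedure on this slide can be sketched end to end. The data below are synthetic (the class means, sample sizes, and seed are invented for illustration and are not the data shown in the slide's figure); the point is the 0/1 coding, the least-squares fit, and the 0.5 threshold from (1.3.6).

```python
import numpy as np

# Synthetic two-class training data in R^2 (class means and sizes are
# invented for illustration): blue coded Y = 0, orange coded Y = 1.
rng = np.random.default_rng(1)
X_blue = rng.normal(loc=[0.0, 0.0], size=(50, 2))
X_orange = rng.normal(loc=[2.0, 2.0], size=(50, 2))
X = np.column_stack([np.ones(100), np.vstack([X_blue, X_orange])])
y = np.concatenate([np.zeros(50), np.ones(50)])

# Fit the linear regression of the 0/1-coded Y on (1, X1, X2).
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Convert fitted values to fitted classes via the 0.5 threshold in (1.3.6).
y_hat = X @ beta_hat
G_hat = np.where(y_hat > 0.5, "orange", "blue")

G_true = np.where(y == 1, "orange", "blue")
print("training error rate:", np.mean(G_hat != G_true))
```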
