
CS70: Lecture 35. Regression (contd.): Linear and Beyond

Outline:
1. Review: Linear Regression (LR), LLSE
2. LR: Examples
3. Beyond LR: Quadratic Regression
4. Conditional Expectation (CE)


Estimation Error

We saw that the LLSE of Y given X is

$$\hat{Y} = L[Y \mid X] = E[Y] + \frac{\operatorname{cov}(X,Y)}{\operatorname{var}(X)}\,(X - E[X]).$$

How good is this estimator? That is, what is the mean squared estimation error? We find

$$\begin{aligned}
E[|Y - L[Y \mid X]|^2] &= E\big[\big(Y - E[Y] - (\operatorname{cov}(X,Y)/\operatorname{var}(X))(X - E[X])\big)^2\big] \\
&= E[(Y - E[Y])^2] - 2\,(\operatorname{cov}(X,Y)/\operatorname{var}(X))\,E[(Y - E[Y])(X - E[X])] \\
&\qquad + (\operatorname{cov}(X,Y)/\operatorname{var}(X))^2\,E[(X - E[X])^2] \\
&= \operatorname{var}(Y) - \frac{\operatorname{cov}(X,Y)^2}{\operatorname{var}(X)}.
\end{aligned}$$

Without observations, the best estimate is $E[Y]$ and the error is $\operatorname{var}(Y)$. Observing X reduces the error by $\operatorname{cov}(X,Y)^2/\operatorname{var}(X)$.
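As a quick numerical check of this formula (my own sketch, not part of the slides): simulate a linear-plus-noise model, form the plug-in LLSE from sample moments, and compare the empirical mean squared error to var(Y) − cov(X,Y)²/var(X).

```python
import numpy as np

# Assumed toy model (not from the slides): Y = 2X + noise,
# so the LLSE should leave only the noise variance as error.
rng = np.random.default_rng(0)
n = 100_000
X = rng.normal(0.0, 1.0, n)
Y = 2.0 * X + rng.normal(0.0, 1.0, n)

# Plug-in LLSE: L[Y|X] = E[Y] + cov(X,Y)/var(X) * (X - E[X])
cov_xy = np.cov(X, Y)[0, 1]
var_x = X.var()
L = Y.mean() + (cov_xy / var_x) * (X - X.mean())

empirical_mse = np.mean((Y - L) ** 2)
predicted_mse = Y.var() - cov_xy ** 2 / var_x
print(empirical_mse, predicted_mse)  # both close to var(noise) = 1
```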

Wrap-up of Linear Regression

1. Linear Regression: $L[Y \mid X] = E[Y] + \frac{\operatorname{cov}(X,Y)}{\operatorname{var}(X)}(X - E[X])$
2. Non-Bayesian: minimize $\sum_n (Y_n - a - bX_n)^2$
3. Bayesian: minimize $E[(Y - a - bX)^2]$
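The non-Bayesian version has the same closed form as the Bayesian one, with sample moments in place of expectations. A minimal sketch (the helper name fit_linear is my own):

```python
import numpy as np

# Sample-based ("non-Bayesian") linear regression: the (a, b) minimizing
# sum_n (Y_n - a - b X_n)^2 are the sample analogues of the LLSE formula:
# b = cov(X, Y)/var(X) computed from the data, a = mean(Y) - b*mean(X).
def fit_linear(X: np.ndarray, Y: np.ndarray) -> tuple[float, float]:
    b = np.cov(X, Y, bias=True)[0, 1] / X.var()
    a = Y.mean() - b * X.mean()
    return a, b

rng = np.random.default_rng(0)
X = rng.normal(size=10_000)
Y = 1.0 + 2.0 * X + rng.normal(size=10_000)
print(fit_linear(X, Y))  # approximately (1.0, 2.0)
```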

Beyond Linear Regression: Discussion

Goal: guess the value of Y in the expected squared error sense.

We know nothing about Y other than its distribution. Our best guess is? E[Y].

Now assume we make some observation X related to Y. How do we use that observation to improve our guess about Y? Idea: use a function g(X) of the observation to estimate Y.

LR: restriction to linear functions g(X) = a + bX. With no such constraints, what is the best g(X)? Answer: E[Y | X]. This is called the Conditional Expectation (CE).
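To see why dropping the linearity restriction can matter, consider a toy case (an assumed example, not from the slides) where X carries no linear information about Y at all, yet E[Y | X] pins Y down exactly:

```python
import numpy as np

# Toy example: Y = X^2 with X uniform on [-1, 1]. Then cov(X, Y) = 0,
# so the best *linear* estimate degenerates to the constant E[Y],
# while E[Y|X] = X^2 guesses Y perfectly.
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, 100_000)
Y = X ** 2

cov_xy = np.cov(X, Y)[0, 1]             # ~ 0: no linear information
L = Y.mean() + (cov_xy / X.var()) * (X - X.mean())
g = X ** 2                              # E[Y|X] for this model

print(np.mean((Y - L) ** 2))  # ~ var(Y) = 4/45 > 0
print(np.mean((Y - g) ** 2))  # 0: the conditional expectation is exact
```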

Nonlinear Regression: Motivation

There are many situations where a good guess about Y given X is not linear. E.g., (diameter of object, weight), (school years, income), (PSA level, cancer risk).

Our goal: explore estimates $\hat{Y} = g(X)$ for nonlinear functions $g(\cdot)$.

Quadratic Regression

Let X, Y be two random variables defined on the same probability space.

Definition: The quadratic regression of Y over X is the random variable

$$Q[Y \mid X] = a + bX + cX^2$$

where a, b, c are chosen to minimize $E[(Y - a - bX - cX^2)^2]$.

Derivation: We set to zero the derivatives w.r.t. a, b, c. We get

$$\begin{aligned}
0 &= E[Y - a - bX - cX^2] \\
0 &= E[(Y - a - bX - cX^2)X] \\
0 &= E[(Y - a - bX - cX^2)X^2]
\end{aligned}$$

We solve these three equations in the three unknowns (a, b, c).
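A minimal sketch of solving this 3×3 system numerically, with each expectation replaced by a sample average (the helper name fit_quadratic is my own):

```python
import numpy as np

# Sample-moment version of the three normal equations above:
# E[Y] = a + b E[X] + c E[X^2], E[YX] = a E[X] + b E[X^2] + c E[X^3],
# E[YX^2] = a E[X^2] + b E[X^3] + c E[X^4] -- a linear system in (a, b, c).
def fit_quadratic(X: np.ndarray, Y: np.ndarray) -> np.ndarray:
    m = [np.mean(X ** k) for k in range(5)]    # E[X^0] .. E[X^4]
    M = np.array([[m[0], m[1], m[2]],
                  [m[1], m[2], m[3]],
                  [m[2], m[3], m[4]]])
    rhs = np.array([Y.mean(),
                    np.mean(Y * X),
                    np.mean(Y * X ** 2)])
    return np.linalg.solve(M, rhs)             # (a, b, c)

rng = np.random.default_rng(2)
X = rng.uniform(-1, 1, 50_000)
Y = X ** 2
print(fit_quadratic(X, Y))  # approximately [0, 0, 1]
```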

Conditional Expectation

Definition: Let X and Y be RVs on Ω. The conditional expectation of Y given X is defined as

$$E[Y \mid X] = g(X) \quad\text{where}\quad g(x) := E[Y \mid X = x] := \sum_y y \Pr[Y = y \mid X = x].$$
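Concretely, for discrete RVs this is a weighted average over the conditional pmf. A small sketch with a made-up joint distribution (my own illustration):

```python
import numpy as np

# Joint pmf table: rows index values of X, columns values of Y.
x_vals = np.array([0, 1])
y_vals = np.array([0, 1, 2])
joint = np.array([[0.10, 0.20, 0.10],   # Pr[X=0, Y=y]
                  [0.30, 0.10, 0.20]])  # Pr[X=1, Y=y]

p_x = joint.sum(axis=1)        # marginal Pr[X = x]
cond = joint / p_x[:, None]    # Pr[Y = y | X = x]
g = cond @ y_vals              # g(x) = sum_y y * Pr[Y=y | X=x]
print(dict(zip(x_vals, g)))    # E[Y|X=0] = 1.0, E[Y|X=1] = 5/6
```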

Deja vu, all over again?

Have we seen this before? Yes. Is anything new? Yes.
