
Learning From Data, Lecture 1: The Learning Problem



  1. Slide 1: Learning From Data, Lecture 1: The Learning Problem
       Introduction
       Motivation
       Credit Default - A Running Example
       Summary of the Learning Problem
       M. Magdon-Ismail, CSCI 4100/6100
       © AML. Creator: Malik Magdon-Ismail
     Slide 2: Resources
       1. Web Page: www.cs.rpi.edu/~magdon/courses/learn.php
          – course info: www.cs.rpi.edu/~magdon/courses/learn/info.pdf
          – slides: www.cs.rpi.edu/~magdon/courses/learn/slides.html
          – assignments: www.cs.rpi.edu/~magdon/courses/learn/assign.html
       2. Text Book: Learning From Data (Abu-Mostafa, Magdon-Ismail, Lin)
       3. Book Forum: book.caltech.edu/bookforum
          – discussion about any material in the book, including problems and exercises.
          – additional material
       4. TA.
       5. Professor.
       6. Prerequisites? Assignment #0.
     Slide 3: The Storyline
       1. What is Learning?
       2. Can We do it?
       3. How to do it?
       4. How to do it well?
       5. General principles?
       6. Advanced techniques.
       7. Other Learning Paradigms.
       (Side labels on the slide group these items under: concepts, theory, practice.)
       Our language will be mathematics ... our sword will be computer algorithms.
     Slide 4: The applications
       (No slide text survives extraction.)

  2. Slide 5: Let's Define a Tree?
     Slide 6: Let's Define a Tree?
       A brown trunk moving upwards and branching with leaves ...
     Slide 7: Are These Trees?
     Slide 8: Learning "What are Trees" is 'Easy'

  3. Slide 9: Defining is Hard; Recognizing is Easy
       Hard to give a complete mathematical definition of a tree.
       Even a 3 year old can tell a tree from a non-tree.
       The 3 year old has learned from data.
       (Other tasks like graphics or GAN?)
     Slide 10: Learning to Rate Movies
       • Can we predict how a viewer would rate a movie?
       • Why? So that Netflix can make better movie recommendations, and get more rentals.
       • $1 million prize for a mere 10% improvement in their recommendation system.
     Slide 11: Previous Ratings Reflect Future Ratings
       [Figure: the viewer's factors (likes comedy? likes action? prefers blockbusters? likes Tom Cruise?) are lined up against the movie's factors (comedy content, action content, blockbuster? Tom Cruise in it?). Match corresponding factors, then add their contributions, to get the predicted rating.]
       • Viewer taste & movie content imply viewer rating.
       • No magical formula to predict viewer rating.
       • Netflix has data. We can learn to identify movie "categories" as well as viewer "preferences".
       Class Motto: A pattern exists. We don't know it. We have data to learn it.
     Slide 12: Credit Approval
       Let's use a conceptual example to crystallize the issues.
         age              32 years
         gender           male
         salary           40,000
         debt             26,000
         years in job     1 year
         years at home    3 years
         ...              ...
       Approve for credit?
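     The "match corresponding factors, then add their contributions" idea from the movie-rating slide can be sketched as a sum of products. This is only an illustration: the factor names, the numeric values, and the helper `predicted_rating` are assumptions made for this example and do not come from the slides, and in practice the factors would be learned from rating data rather than written by hand.

```python
# Hypothetical factor names and values, chosen only for illustration.
FACTORS = ["comedy", "action", "blockbuster", "Tom Cruise"]


def predicted_rating(viewer, movie):
    """Match corresponding factors, then add their contributions."""
    return sum(viewer[k] * movie[k] for k in FACTORS)


# How much this (made-up) viewer likes each factor.
viewer = {"comedy": 0.9, "action": 0.1, "blockbuster": 0.5, "Tom Cruise": 0.0}

# How much of each factor two (made-up) movies contain.
comedy_movie = {"comedy": 1.0, "action": 0.0, "blockbuster": 1.0, "Tom Cruise": 0.0}
action_movie = {"comedy": 0.0, "action": 1.0, "blockbuster": 0.0, "Tom Cruise": 1.0}

print(predicted_rating(viewer, comedy_movie))
print(predicted_rating(viewer, action_movie))
```

     With these made-up numbers the comedy blockbuster scores higher for this viewer than the action movie, because the viewer's strong factors line up with the movie's content.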

  4. Slide 13: Credit Approval
       Let's use a conceptual example to crystallize the issues.
         age              32 years
         gender           male
         salary           40,000
         debt             26,000
         years in job     1 year
         years at home    3 years
         ...              ...
       Approve for credit?
       • Using salary, debt, years in residence, etc., approve for credit or not.
       • No magic credit approval formula.
       • Banks have lots of data.
         – customer information: salary, debt, etc.
         – whether or not they defaulted on their credit.
       A pattern exists. We don't know it. We have data to learn it.
     Slide 14: The Key Players
       • Salary, debt, years in residence, ...: input $x \in \mathbb{R}^d = \mathcal{X}$.
       • Approve credit or not: output $y \in \{-1, +1\} = \mathcal{Y}$.
       • True relationship between $x$ and $y$: target function $f : \mathcal{X} \mapsto \mathcal{Y}$. (The target $f$ is unknown.)
       • Data on customers: data set $\mathcal{D} = (x_1, y_1), \ldots, (x_N, y_N)$, where $y_n = f(x_n)$.
       $\mathcal{X}$, $\mathcal{Y}$ and $\mathcal{D}$ are given by the learning problem; the target $f$ is fixed but unknown.
       We learn the function $f$ from the data $\mathcal{D}$.
     Slide 15: Learning
       • Start with a set of candidate hypotheses $\mathcal{H}$ which you think are likely to represent $f$.
         $\mathcal{H} = \{h_1, h_2, \ldots\}$ is called the hypothesis set or model.
       • Select a hypothesis $g$ from $\mathcal{H}$. The way we do this is called a learning algorithm.
       • Use $g$ for new customers. We hope $g \approx f$.
       $\mathcal{X}$, $\mathcal{Y}$ and $\mathcal{D}$ are given by the learning problem; the target $f$ is fixed but unknown.
       We choose $\mathcal{H}$ and the learning algorithm.
       This is a very general setup (e.g. choose $\mathcal{H}$ to be all possible hypotheses).
     Slide 16: Summary of the Learning Setup
       [Diagram with the following components:]
       • UNKNOWN TARGET FUNCTION $f : \mathcal{X} \mapsto \mathcal{Y}$ (ideal credit approval formula), with $y_n = f(x_n)$
       • TRAINING EXAMPLES $(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)$ (historical records of credit customers)
       • HYPOTHESIS SET $\mathcal{H}$ (set of candidate formulas)
       • LEARNING ALGORITHM $\mathcal{A}$
       • FINAL HYPOTHESIS $g \approx f$ (learned credit approval formula)
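     The learning setup can be sketched in a few lines of Python. Everything concrete here is an assumption made for illustration: the two-feature input (salary, debt), the hand-made data set `D`, the family of threshold rules used as the hypothesis set `H`, and the "fewest training errors" rule used as the learning algorithm. The slides leave the choice of hypothesis set and learning algorithm open.

```python
from typing import Callable, List, Tuple

Input = Tuple[float, float]          # hypothetical x = (salary, debt)
Hypothesis = Callable[[Input], int]  # maps x to +1 (approve) or -1 (deny)
Example = Tuple[Input, int]          # one (x_n, y_n) pair from the data set D


def make_threshold_rule(t: float) -> Hypothesis:
    """Illustrative hypothesis h_t: approve when salary minus debt exceeds t."""
    return lambda x: 1 if x[0] - x[1] > t else -1


def training_errors(h: Hypothesis, data: List[Example]) -> int:
    """Count the examples in D that h gets wrong."""
    return sum(1 for x, y in data if h(x) != y)


def learn(hypotheses: List[Hypothesis], data: List[Example]) -> Hypothesis:
    """A simple learning algorithm: pick the hypothesis with fewest errors on D."""
    return min(hypotheses, key=lambda h: training_errors(h, data))


# Hypothetical historical records; y = +1 means credit went well, -1 means default.
D = [
    ((40000, 26000), 1),
    ((30000, 29000), -1),
    ((80000, 10000), 1),
    ((25000, 30000), -1),
    ((52000, 45000), -1),
]

# Hypothesis set H: a small, finite family of candidate formulas.
H = [make_threshold_rule(t) for t in (0, 5000, 10000, 20000)]

g = learn(H, D)                 # final hypothesis; we hope g is close to f
print(training_errors(g, D))    # errors of g on the training examples
print(g((60000, 20000)))        # decision for a new (made-up) customer
```

     The target f never appears in the code: only its outputs on the training examples are available, and agreement with D is what the algorithm uses to select g from H.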
