
Instance Based Learning [Read Ch. 8]



  1. Instance Based Learning [Read Ch. 8]
     • k-Nearest Neighbor
     • Locally weighted regression
     • Radial basis functions
     • Case-based reasoning
     • Lazy and eager learning
     Lecture slides for textbook Machine Learning, © Tom M. Mitchell, McGraw Hill, 1997

  2. Instance-Based Learning
     Key idea: just store all training examples $\langle x_i, f(x_i) \rangle$
     Nearest neighbor:
     • Given query instance $x_q$, first locate nearest training example $x_n$, then estimate $\hat{f}(x_q) \leftarrow f(x_n)$
     k-Nearest neighbor:
     • Given $x_q$, take vote among its k nearest neighbors (if discrete-valued target function)
     • Take mean of f values of k nearest neighbors (if real-valued):
       $\hat{f}(x_q) \leftarrow \frac{\sum_{i=1}^{k} f(x_i)}{k}$
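The two k-nearest neighbor rules on this slide (vote for a discrete-valued target, mean for a real-valued one) can be sketched as follows. This is a minimal illustration, assuming Euclidean distance and a brute-force scan over the stored examples; the function names and the toy data are not from the slides.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def k_nearest(train, xq, k):
    """Return the k stored pairs (x, f(x)) closest to the query xq."""
    return sorted(train, key=lambda pair: euclidean(pair[0], xq))[:k]

def knn_classify(train, xq, k):
    """Discrete-valued target: majority vote among the k nearest neighbors."""
    votes = Counter(label for _, label in k_nearest(train, xq, k))
    return votes.most_common(1)[0][0]

def knn_regress(train, xq, k):
    """Real-valued target: mean of f over the k nearest neighbors."""
    neighbors = k_nearest(train, xq, k)
    return sum(value for _, value in neighbors) / len(neighbors)

# Toy data, made up for illustration.
labeled = [((0.0, 0.0), "-"), ((0.0, 1.0), "-"), ((1.0, 0.0), "-"),
           ((5.0, 5.0), "+"), ((5.0, 6.0), "+"), ((6.0, 5.0), "+")]
valued = [((0.0,), 0.0), ((1.0,), 1.0), ((2.0,), 4.0), ((3.0,), 9.0)]

print(knn_classify(labeled, (4.5, 4.5), k=3))
print(knn_regress(valued, (1.1,), k=2))
```

Note that "training" here is only storing the list; all the work happens at query time, which is the lazy behavior the later slides contrast with eager learners.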

  3. When To Consider Nearest Neighbor
     • Instances map to points in $\Re^n$
     • Less than 20 attributes per instance
     • Lots of training data
     Advantages:
     • Training is very fast
     • Learn complex target functions
     • Don't lose information
     Disadvantages:
     • Slow at query time
     • Easily fooled by irrelevant attributes

  4. Voronoi Diagram
     [Figure: positive (+) and negative (−) training examples surrounding a query point $x_q$]

  5. Behavior in the Limit
     Consider p(x) defines probability that instance x will be labeled 1 (positive) versus 0 (negative).
     Nearest neighbor:
     • As number of training examples $\to \infty$, approaches Gibbs Algorithm
       Gibbs: with probability p(x) predict 1, else 0
     k-Nearest neighbor:
     • As number of training examples $\to \infty$ and k gets large, approaches Bayes optimal
       Bayes optimal: if p(x) > .5 then predict 1, else 0
     Note Gibbs has at most twice the expected error of Bayes optimal
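The factor-of-two note can be checked pointwise from the definitions on this slide. The sketch below assumes the Gibbs prediction at x is drawn independently of the true label at x; the names $\text{err}_{\text{Gibbs}}$ and $\text{err}_{\text{Bayes}}$ are introduced here only for the derivation.

```latex
\begin{align*}
\text{err}_{\text{Bayes}}(x) &= \min\bigl(p(x),\, 1 - p(x)\bigr) \\
\text{err}_{\text{Gibbs}}(x) &= p(x)\bigl(1 - p(x)\bigr) + \bigl(1 - p(x)\bigr)\,p(x)
                              = 2\,p(x)\bigl(1 - p(x)\bigr) \\
                             &= 2\,\min\bigl(p(x),\, 1 - p(x)\bigr)\,\max\bigl(p(x),\, 1 - p(x)\bigr)
                              \;\le\; 2\,\text{err}_{\text{Bayes}}(x)
\end{align*}
```

The last step uses $\max(p(x), 1 - p(x)) \le 1$. Because the bound holds at every x, it also holds after taking the expectation over x.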

  6. Distance-Weighted kNN
     Might want to weight nearer neighbors more heavily...
       $\hat{f}(x_q) \leftarrow \frac{\sum_{i=1}^{k} w_i f(x_i)}{\sum_{i=1}^{k} w_i}$
     where
       $w_i \equiv \frac{1}{d(x_q, x_i)^2}$
     and $d(x_q, x_i)$ is distance between $x_q$ and $x_i$
     Note now it makes sense to use all training examples instead of just k
     → Shepard's method
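A minimal sketch of the distance-weighted estimate, assuming Euclidean distance for d. Passing `k=None` uses every stored example, which is the variant the slide calls Shepard's method. Returning the stored value when the query coincides with a training point is a convention adopted here to avoid dividing by zero; the function names and data are illustrative.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def distance_weighted_predict(train, xq, k=None):
    """Weighted mean of f(x_i) with weights w_i = 1 / d(xq, x_i)^2.

    k=None uses all training examples; otherwise only the k nearest.
    """
    ranked = sorted(train, key=lambda pair: euclidean(pair[0], xq))
    neighbors = ranked if k is None else ranked[:k]
    numerator = 0.0
    denominator = 0.0
    for x, fx in neighbors:
        d = euclidean(x, xq)
        if d == 0.0:
            return fx  # exact match: the weight would be infinite
        w = 1.0 / d ** 2
        numerator += w * fx
        denominator += w
    return numerator / denominator

# Toy data, made up for illustration.
train = [((0.0,), 0.0), ((2.0,), 10.0)]
print(distance_weighted_predict(train, (1.0,)))
print(distance_weighted_predict(train, (0.5,)))
```

Midway between the two points the weights are equal, so the estimate is the plain mean; closer to one point, that point dominates.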

  7. Curse of Dimensionality
     Imagine instances described by 20 attributes, but only 2 are relevant to target function
     Curse of dimensionality: nearest neighbor is easily misled when X is high-dimensional
     One approach:
     • Stretch jth axis by weight $z_j$, where $z_1, \ldots, z_n$ chosen to minimize prediction error
     • Use cross-validation to automatically choose weights $z_1, \ldots, z_n$
     • Note setting $z_j$ to zero eliminates this dimension altogether
     See [Moore and Lee, 1994]
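The axis-stretching idea can be sketched with leave-one-out cross-validation. This is a simplified illustration, not the procedure of the cited paper: it assumes a 1-nearest-neighbor classifier and searches only the weights 0 and 1 for each axis, so it reduces to choosing which attributes to keep. The toy data has one relevant attribute and one irrelevant attribute with a large spread.

```python
import itertools
import math

def stretched_distance(a, b, z):
    """Euclidean distance after multiplying axis j by the weight z[j]."""
    return math.sqrt(sum((zj * (aj - bj)) ** 2 for aj, bj, zj in zip(a, b, z)))

def loo_error(train, z):
    """Leave-one-out error rate of 1-nearest neighbor under axis weights z."""
    mistakes = 0
    for i, (xq, label) in enumerate(train):
        rest = train[:i] + train[i + 1:]
        _, predicted = min(rest, key=lambda pair: stretched_distance(pair[0], xq, z))
        mistakes += predicted != label
    return mistakes / len(train)

def choose_weights(train, candidates=(0.0, 1.0)):
    """Pick the weight vector with the lowest leave-one-out error."""
    n = len(train[0][0])
    return min(itertools.product(candidates, repeat=n),
               key=lambda z: loo_error(train, z))

# Toy data: the label depends only on the first attribute.
train = [((0.1, 9.0), 0), ((0.2, 1.0), 0), ((0.3, 5.0), 0),
         ((0.7, 9.2), 1), ((0.8, 1.1), 1), ((0.9, 5.1), 1)]

print(loo_error(train, (1.0, 1.0)))
print(choose_weights(train))
```

With both axes at full weight, the irrelevant attribute dominates the distance and every held-out point is matched to a neighbor of the wrong class; setting its weight to zero removes that dimension, as the slide notes.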

  8. Locally Weighted Regression
     Note kNN forms local approximation to f for each query point $x_q$
     Why not form an explicit approximation $\hat{f}(x)$ for region surrounding $x_q$?
     • Fit linear function to k nearest neighbors
     • Fit quadratic, ...
     • Produces "piecewise approximation" to f
     Several choices of error to minimize:
     • Squared error over k nearest neighbors
       $E_1(x_q) \equiv \frac{1}{2} \sum_{x \in\, k \text{ nearest nbrs of } x_q} (f(x) - \hat{f}(x))^2$
     • Distance-weighted squared error over all neighbors
       $E_2(x_q) \equiv \frac{1}{2} \sum_{x \in D} (f(x) - \hat{f}(x))^2 \, K(d(x_q, x))$
     • ...
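A minimal sketch of locally weighted regression under the second error criterion on this slide (distance-weighted squared error over all of D). It assumes scalar inputs, a linear $\hat{f}(x) = w_0 + w_1 x$, and a Gaussian kernel K whose bandwidth `sigma` is a parameter introduced here; the weights are found with the closed-form weighted least-squares solution rather than gradient descent.

```python
import math

def gaussian_kernel(d, sigma):
    """K(d): weight that decays with distance d; sigma is an assumed bandwidth."""
    return math.exp(-d ** 2 / (2.0 * sigma ** 2))

def locally_weighted_linear(train, xq, sigma=1.0):
    """Fit f_hat(x) = w0 + w1 * x around the query xq and return f_hat(xq).

    Minimizes 1/2 * sum over (x, f(x)) in D of
    (f(x) - f_hat(x))^2 * K(d(xq, x)). Assumes some example gets nonzero weight.
    """
    s = sx = sy = sxx = sxy = 0.0
    for x, fx in train:
        k = gaussian_kernel(abs(x - xq), sigma)
        s += k
        sx += k * x
        sy += k * fx
        sxx += k * x * x
        sxy += k * x * fx
    denom = s * sxx - sx * sx
    if denom == 0.0:
        return sy / s  # degenerate case: fall back to the weighted mean
    w1 = (s * sxy - sx * sy) / denom
    w0 = (sy - w1 * sx) / s
    return w0 + w1 * xq

# Toy data, made up for illustration: a straight line and a parabola.
line = [(float(x), 2.0 * x + 1.0) for x in range(5)]
curve = [(float(x), float(x * x)) for x in range(5)]

print(locally_weighted_linear(line, 2.5))
print(locally_weighted_linear(curve, 2.0, sigma=0.5))
```

A new linear fit is computed for every query, so the overall predictor can follow a curved target even though each local model is a straight line.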

  9. Radial Basis Function Networks
     • Global approximation to target function, in terms of linear combination of local approximations
     • Used, e.g., for image classification
     • A different kind of neural network
     • Closely related to distance-weighted regression, but "eager" instead of "lazy"

  10. Radial Basis Function Networks
      [Figure: network with inputs $a_1(x), a_2(x), \ldots, a_n(x)$, a layer of kernel units, and output $f(x)$ formed with weights $w_0, w_1, \ldots, w_k$]
      where $a_i(x)$ are the attributes describing instance x, and
        $f(x) = w_0 + \sum_{u=1}^{k} w_u K_u(d(x_u, x))$
      One common choice for $K_u(d(x_u, x))$ is
        $K_u(d(x_u, x)) = e^{-\frac{1}{2\sigma_u^2} d^2(x_u, x)}$
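The network output defined on this slide can be evaluated directly. The sketch below assumes Euclidean distance for d and the Gaussian kernel given above; the centers, widths, and weights are made-up values chosen only to show the computation.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def gaussian_kernel(d, sigma):
    """K_u(d) = exp(-d^2 / (2 * sigma_u^2))."""
    return math.exp(-d ** 2 / (2.0 * sigma ** 2))

def rbf_output(x, centers, sigmas, weights, w0):
    """f(x) = w0 + sum over u of w_u * K_u(d(x_u, x))."""
    total = w0
    for xu, sigma, wu in zip(centers, sigmas, weights):
        total += wu * gaussian_kernel(euclidean(xu, x), sigma)
    return total

# Illustrative parameters: two kernel units placed far apart.
centers = [(0.0, 0.0), (100.0, 100.0)]
sigmas = [1.0, 1.0]
weights = [2.0, 3.0]
w0 = 0.5

print(rbf_output((0.0, 0.0), centers, sigmas, weights, w0))
print(rbf_output((100.0, 100.0), centers, sigmas, weights, w0))
```

Each kernel unit contributes only near its own center, so the output at a center is approximately the bias plus that unit's weight. This is the sense in which the network is a linear combination of local approximations.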

  11. Training Radial Basis Function Networks
      Q1: What $x_u$ to use for each kernel function $K_u(d(x_u, x))$
      • Scatter uniformly throughout instance space
      • Or use training instances (reflects instance distribution)
      Q2: How to train weights (assume here Gaussian $K_u$)
      • First choose variance (and perhaps mean) for each $K_u$
        - e.g., use EM
      • Then hold $K_u$ fixed, and train linear output layer
        - efficient methods to fit linear function
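A sketch of the two-stage procedure under simplifying assumptions: the kernel centers are the training instances themselves (the second option under Q1), a single fixed width `sigma` is chosen by hand instead of by EM, and the linear output layer is fit with batch gradient descent on squared error. Gradient descent stands in here for the "efficient methods" the slide mentions; the learning rate, epoch count, and data are illustrative.

```python
import math

def gaussian_kernel(d, sigma):
    """K_u(d) = exp(-d^2 / (2 * sigma^2))."""
    return math.exp(-d ** 2 / (2.0 * sigma ** 2))

def hidden_features(x, centers, sigma):
    """Constant 1 (multiplies w0) followed by one kernel activation per center."""
    return [1.0] + [gaussian_kernel(abs(x - xu), sigma) for xu in centers]

def predict(x, centers, sigma, weights):
    """Linear output layer applied to the fixed kernel activations."""
    features = hidden_features(x, centers, sigma)
    return sum(w * phi for w, phi in zip(weights, features))

def squared_error(data, centers, sigma, weights):
    """1/2 * sum of squared residuals over the training data."""
    return 0.5 * sum((y - predict(x, centers, sigma, weights)) ** 2 for x, y in data)

def train_output_layer(data, centers, sigma, lr=0.05, epochs=1000):
    """Hold the kernels fixed and fit w0, w1, ..., wk by batch gradient descent."""
    features = [hidden_features(x, centers, sigma) for x, _ in data]
    weights = [0.0] * (len(centers) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for phi, (_, y) in zip(features, data):
            err = y - sum(w * p for w, p in zip(weights, phi))
            for j, p in enumerate(phi):
                grad[j] += err * p
        weights = [w + lr * g for w, g in zip(weights, grad)]
    return weights

# Toy 1-D data, made up for illustration.
data = [(0.0, 0.0), (1.0, 1.0), (2.0, 4.0), (3.0, 9.0), (4.0, 16.0)]
centers = [x for x, _ in data]
sigma = 0.5

initial_error = squared_error(data, centers, sigma, [0.0] * (len(centers) + 1))
weights = train_output_layer(data, centers, sigma)
final_error = squared_error(data, centers, sigma, weights)
print(initial_error, final_error)
```

Because the kernels are fixed before the output weights are trained, the second stage is an ordinary linear fitting problem, and all of it happens before any query arrives.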

  12. Case-Based Reasoning
      Can apply instance-based learning even when $X \neq \Re^n$
      → need different "distance" metric
      Case-Based Reasoning is instance-based learning applied to instances with symbolic logic descriptions
        ((user-complaint error53-on-shutdown)
         (cpu-model PowerPC)
         (operating-system Windows)
         (network-connection PCIA)
         (memory 48meg)
         (installed-applications Excel Netscape VirusScan)
         (disk 1gig)
         (likely-cause ???))
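One very simple "distance" for symbolic descriptions is the number of attributes on which two cases disagree. The sketch below uses that count to retrieve the closest stored case for a query like the one on this slide. The metric is an assumption made for illustration (the slides do not specify one), and the stored cases, including their `likely-cause` values, are hypothetical placeholders.

```python
def mismatch_distance(case_a, case_b, attributes):
    """Number of listed attributes on which the two cases disagree."""
    return sum(case_a.get(attr) != case_b.get(attr) for attr in attributes)

def retrieve_nearest(stored_cases, query, attributes):
    """Return the stored case with the fewest mismatches against the query."""
    return min(stored_cases,
               key=lambda case: mismatch_distance(case, query, attributes))

ATTRIBUTES = ["user-complaint", "cpu-model", "operating-system", "memory"]

# Hypothetical stored cases; the likely-cause values are placeholders.
stored_cases = [
    {"user-complaint": "error53-on-shutdown", "cpu-model": "PowerPC",
     "operating-system": "Windows", "memory": "32meg",
     "likely-cause": "cause-A"},
    {"user-complaint": "error12-on-startup", "cpu-model": "Pentium",
     "operating-system": "Windows", "memory": "48meg",
     "likely-cause": "cause-B"},
]

# Query built from a subset of the attributes shown on the slide.
query = {"user-complaint": "error53-on-shutdown", "cpu-model": "PowerPC",
         "operating-system": "Windows", "memory": "48meg"}

best = retrieve_nearest(stored_cases, query, ATTRIBUTES)
print(best["likely-cause"])
```

Real case-based reasoners typically use richer matching than a flat mismatch count, but the retrieval step has the same shape: score every stored case against the query and reuse the best match.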

  13. Case-Based Reasoning in CADET
      CADET: 75 stored examples of mechanical devices
      • each training example: ⟨qualitative function, mechanical structure⟩
      • new query: desired function
      • target value: mechanical structure for this function
      Distance metric: match qualitative function descriptions

  14. Case-Based Reasoning in CADET
      [Figure, two panels. A stored case: T-junction pipe. Structure: a pipe junction labeled $Q_1, T_1$; $Q_2, T_2$; $Q_3, T_3$ (T = temperature, Q = waterflow). Function: $Q_1$ and $Q_2$ each positively (+) influence $Q_3$; $T_1$ and $T_2$ each positively (+) influence $T_3$. A problem specification: Water faucet. Structure: ? (unknown). Function: a graph of signed (+/−) influences among $C_t$, $C_f$, $Q_c$, $Q_h$, $Q_m$, $T_c$, $T_h$, $T_m$.]

  15. Case-Based Reasoning in CADET
      • Instances represented by rich structural descriptions
      • Multiple cases retrieved (and combined) to form solution to new problem
      • Tight coupling between case retrieval and problem solving
      Bottom line:
      • Simple matching of cases useful for tasks such as answering help-desk queries
      • Area of ongoing research

  16. Lazy and Eager Learning
      Lazy: wait for query before generalizing
      • k-Nearest Neighbor, Case-based reasoning
      Eager: generalize before seeing query
      • Radial basis function networks, ID3, Backpropagation, NaiveBayes, ...
      Does it matter?
      • Eager learner must create global approximation
      • Lazy learner can create many local approximations
      • If they use same H, lazy can represent more complex functions (e.g., consider H = linear functions)
