Computational Learning Theory [read Chapter 7] [Suggested exercises: 7.1, 7.2, 7.5, 7.8]

Lecture slides for the textbook Machine Learning, © Tom M. Mitchell, McGraw-Hill, 1997.


  1. Computational Learning Theory [read Chapter 7] [Suggested exercises: 7.1, 7.2, 7.5, 7.8]
     - Computational learning theory
     - Setting 1: learner poses queries to teacher
     - Setting 2: teacher chooses examples
     - Setting 3: randomly generated instances, labeled by teacher
     - Probably approximately correct (PAC) learning
     - Vapnik-Chervonenkis dimension
     - Mistake bounds

  2. Computational Learning Theory
     What general laws constrain inductive learning? We seek theory to relate:
     - Probability of successful learning
     - Number of training examples
     - Complexity of hypothesis space
     - Accuracy to which target concept is approximated
     - Manner in which training examples are presented

  3. Prototypical Concept Learning Task
     - Given:
       - Instances X: possible days, each described by the attributes Sky, AirTemp, Humidity, Wind, Water, Forecast
       - Target function c: EnjoySport: X -> {0, 1}
       - Hypotheses H: conjunctions of literals, e.g. <?, Cold, High, ?, ?, ?>
       - Training examples D: positive and negative examples of the target function <x1, c(x1)>, ..., <xm, c(xm)>
     - Determine:
       - A hypothesis h in H such that h(x) = c(x) for all x in D?
       - A hypothesis h in H such that h(x) = c(x) for all x in X?
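
A minimal Python sketch (my own illustration, not code from the text) of how such a conjunctive hypothesis can be represented and applied: '?' marks an unconstrained attribute, and an instance is classified positive only if every constrained attribute matches.

    # Conjunctive hypothesis over the six EnjoySport attributes:
    # '?' matches any value; a stated value must match exactly.
    def classify(hypothesis, instance):
        """Return 1 if the instance satisfies every constraint, else 0."""
        return int(all(h == '?' or h == x for h, x in zip(hypothesis, instance)))

    h = ('?', 'Cold', 'High', '?', '?', '?')   # <?, Cold, High, ?, ?, ?>
    x = ('Sunny', 'Cold', 'High', 'Strong', 'Warm', 'Same')
    print(classify(h, x))                      # -> 1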

  4. Sample Complexity
     How many training examples are sufficient to learn the target concept?
     1. If learner proposes instances, as queries to teacher
        - learner proposes instance x, teacher provides c(x)
     2. If teacher (who knows c) provides training examples
        - teacher provides sequence of examples of form <x, c(x)>
     3. If some random process (e.g., nature) proposes instances
        - instance x generated randomly, teacher provides c(x)

  5. Sample Complexity: 1
     Learner proposes instance x, teacher provides c(x). (Assume c is in learner's hypothesis space H.)
     Optimal query strategy: play 20 questions
     - pick instance x such that half of the hypotheses in VS classify x positive, half classify x negative
     - when this is possible, need ceil(log2 |H|) queries to learn c
     - when not possible, need even more
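
The halving idea is easy to simulate. Below is a hedged sketch on a toy hypothesis space of my own choosing (thresholds t in {0, ..., 15}, with h_t(x) = 1 iff x >= t; not from the slides): each query is picked to split the version space as evenly as possible, so the target is identified in ceil(log2 |H|) = 4 queries.

    import math

    X = list(range(16))
    H = list(range(16))                  # hypothesis h_t classifies x as int(x >= t)
    target_t = 11                        # the teacher's secret concept c

    version_space = list(H)
    queries = 0
    while len(version_space) > 1:
        # choose the instance whose answer splits the version space most evenly
        def imbalance(x):
            pos = sum(1 for t in version_space if x >= t)
            return abs(2 * pos - len(version_space))
        x = min(X, key=imbalance)
        label = int(x >= target_t)       # teacher answers c(x)
        version_space = [t for t in version_space if int(x >= t) == label]
        queries += 1

    print(queries, math.ceil(math.log2(len(H))))   # 4 queries = ceil(log2 16)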

  6. Sample Complexity: 2
     Teacher (who knows c) provides training examples. (Assume c is in learner's hypothesis space H.)
     Optimal teaching strategy: depends on H used by learner.
     Consider the case H = conjunctions of up to n boolean literals and their negations,
     e.g., (AirTemp = Warm) ^ (Wind = Strong), where AirTemp, Wind, ... each have 2 possible values.
     - if n possible boolean attributes in H, n + 1 examples suffice
     - why? (One standard construction is sketched below.)
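
The "why?" can be answered constructively. A hedged sketch of one standard (n+1)-example teaching sequence (my construction, following the usual argument, not code from the text): one positive example tells the learner which n literals could appear; flipping each attribute once then reveals which of those literals are actually required.

    # Target conjunction over attributes 0..n-1: None = unconstrained,
    # True/False = required literal value.
    def teach_and_learn(target, n):
        # 1 positive example: an assignment satisfying the target conjunction.
        x = [target[i] if target[i] is not None else True for i in range(n)]
        # Learner adopts the most specific conjunction consistent with x.
        h = list(x)
        # n more examples: flip one attribute at a time; the label reveals
        # whether that attribute's literal is really part of the target.
        for i in range(n):
            y = list(x)
            y[i] = not y[i]
            label = all(t is None or t == v for t, v in zip(target, y))
            if label:            # flipping attribute i kept the example positive
                h[i] = None      # so literal i is not required: drop it
        return h

    target = [True, None, False, None]     # a1 AND (not a3), n = 4
    print(teach_and_learn(target, 4))      # -> [True, None, False, None]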

  7. Sample Complexity: 3
     Given:
     - set of instances X
     - set of hypotheses H
     - set of possible target concepts C
     - training instances generated by a fixed, unknown probability distribution 𝒟 over X
     Learner observes a sequence D of training examples of form <x, c(x)>, for some target concept c in C:
     - instances x are drawn from distribution 𝒟
     - teacher provides target value c(x) for each
     Learner must output a hypothesis h estimating c:
     - h is evaluated by its performance on subsequent instances drawn according to 𝒟
     Note: randomly drawn instances, noise-free classifications.

  8. True Error of a Hypothesis
     Definition: The true error (denoted error_𝒟(h)) of hypothesis h with respect to target concept c and distribution 𝒟 is the probability that h will misclassify an instance drawn at random according to 𝒟:
         error_𝒟(h) ≡ Pr_{x ∈ 𝒟}[c(x) ≠ h(x)]
     [Figure: instance space X, with the regions where c and h disagree shaded.]

  9. Two Notions of Error
     Training error of hypothesis h with respect to target concept c:
     - how often h(x) ≠ c(x) over training instances
     True error of hypothesis h with respect to c:
     - how often h(x) ≠ c(x) over future random instances
     Our concern:
     - Can we bound the true error of h given the training error of h?
     - First consider the case where the training error of h is zero (i.e., h ∈ VS_{H,D}).
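
The contrast is easy to see numerically. A sketch with a toy one-dimensional distribution and hypotheses of my own choosing (not from the slides): training error is measured on the 20 training instances, while the true error Pr[c(x) ≠ h(x)] is estimated on fresh draws from the same distribution.

    import random

    random.seed(0)
    c = lambda x: int(x >= 0.30)     # target concept
    h = lambda x: int(x >= 0.35)     # learned hypothesis; disagrees on [0.30, 0.35)

    train = [random.random() for _ in range(20)]
    training_error = sum(h(x) != c(x) for x in train) / len(train)

    fresh = [random.random() for _ in range(100_000)]
    true_error_estimate = sum(h(x) != c(x) for x in fresh) / len(fresh)

    print(training_error, true_error_estimate)   # true error is 0.05 exactly here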

  10. Exhausting the Version Space
      (r = training error, error = true error)
      Definition: The version space VS_{H,D} is said to be ε-exhausted with respect to c and 𝒟 if every hypothesis h in VS_{H,D} has error less than ε with respect to c and 𝒟:
          (∀h ∈ VS_{H,D}) error_𝒟(h) < ε
      [Figure: hypothesis space H with VS_{H,D} inside it; each hypothesis is annotated with its true error and its training error r, and the hypotheses with r = 0 make up VS_{H,D}.]
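
A direct transcription of the definition (the hypothesis names and error values below are hypothetical, chosen to echo the figure): the version space is ε-exhausted exactly when every member's true error is below ε.

    def is_epsilon_exhausted(vs_true_errors, epsilon):
        """vs_true_errors: true error of each hypothesis in VS_{H,D} w.r.t. c and D."""
        return all(err < epsilon for err in vs_true_errors.values())

    vs_errors = {'h1': 0.1, 'h2': 0.2, 'h3': 0.2}   # hypotheses with training error 0
    print(is_epsilon_exhausted(vs_errors, 0.25))     # True
    print(is_epsilon_exhausted(vs_errors, 0.15))     # False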

  11. How Many Examples Will ε-Exhaust the VS?
      Theorem [Haussler, 1988]: If the hypothesis space H is finite, and D is a sequence of m ≥ 1 independent random examples of some target concept c, then for any 0 ≤ ε ≤ 1, the probability that the version space with respect to H and D is not ε-exhausted (with respect to c) is less than
          |H| e^(-εm)
      Interesting! This bounds the probability that any consistent learner will output a hypothesis h with error(h) ≥ ε.
      If we want this probability to be below δ,
          |H| e^(-εm) ≤ δ
      then
          m ≥ (1/ε)(ln|H| + ln(1/δ))
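
The final bound turns directly into a sample-size calculator (a direct transcription of the inequality; the function name is my own):

    import math

    def sample_complexity(h_size, epsilon, delta):
        """Smallest m satisfying m >= (1/epsilon)(ln|H| + ln(1/delta))."""
        return math.ceil((math.log(h_size) + math.log(1 / delta)) / epsilon)

    print(sample_complexity(1000, epsilon=0.1, delta=0.05))   # -> 100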

  12. Learning Conjunctions of Boolean Literals
      How many examples are sufficient to assure with probability at least (1 - δ) that every h in VS_{H,D} satisfies error_𝒟(h) ≤ ε?
      Use our theorem:
          m ≥ (1/ε)(ln|H| + ln(1/δ))
      Suppose H contains conjunctions of constraints on up to n boolean attributes (i.e., n boolean literals). Then |H| = 3^n, and
          m ≥ (1/ε)(ln 3^n + ln(1/δ))
      or
          m ≥ (1/ε)(n ln 3 + ln(1/δ))
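
Plugging |H| = 3^n into the bound shows the linear growth in n (same arithmetic as the calculator above; ε = 0.1 and δ = 0.05 are example values of my choosing):

    import math

    for n in (5, 10, 20):
        m = math.ceil((n * math.log(3) + math.log(1 / 0.05)) / 0.1)
        print(n, m)   # -> 5 85, 10 140, 20 250: about ln(3)/0.1 ~ 11 examples per literal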

  13. How About EnjoySport?
          m ≥ (1/ε)(ln|H| + ln(1/δ))
      If H is as given in EnjoySport, then |H| = 973, and
          m ≥ (1/ε)(ln 973 + ln(1/δ))
      ... if we want to assure with probability 95% that VS contains only hypotheses with error_𝒟(h) ≤ .1, then it is sufficient to have m examples, where
          m ≥ (1/.1)(ln 973 + ln(1/.05))
          m ≥ 10(ln 973 + ln 20)
          m ≥ 10(6.88 + 3.00)
          m ≥ 98.8
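
The arithmetic above can be checked in one line (a sketch reproducing the slide's numbers):

    import math

    m = (math.log(973) + math.log(1 / 0.05)) / 0.1
    print(round(m, 1))   # -> 98.8, so 99 examples suffice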

  14. PAC Learning
      Consider a class C of possible target concepts defined over a set of instances X of length n, and a learner L using hypothesis space H.
      Definition: C is PAC-learnable by L using H if for all c ∈ C, distributions 𝒟 over X, ε such that 0 < ε < 1/2, and δ such that 0 < δ < 1/2, learner L will with probability at least (1 - δ) output a hypothesis h ∈ H such that error_𝒟(h) ≤ ε, in time that is polynomial in 1/ε, 1/δ, n, and size(c).
