
Bayesian Learning (Read Ch. 6, Suggested exercises) - PDF document

  1. Bayesian Learning [Read Ch. 6] [Suggested exercises]
     - Bayes Theorem
     - MAP, ML hypotheses
     - MAP learners
     - Minimum description length principle
     - Bayes optimal classifier
     - Naive Bayes learner
     - Example: Learning over text data
     - Bayesian belief networks
     - Expectation Maximization algorithm
     (lecture slides for textbook Machine Learning, T. Mitchell, McGraw Hill, 1997)

  2. Two Roles for Bayesian Methods
     Provides practical learning algorithms:
     - Naive Bayes learning
     - Bayesian belief network learning
     - Combine prior knowledge (prior probabilities) with observed data
     - Requires prior probabilities
     Provides useful conceptual framework:
     - Provides "gold standard" for evaluating other learning algorithms
     - Additional insight into Occam's razor

  3. Bayes Theorem

         P(h|D) = P(D|h) P(h) / P(D)

     - P(h) = prior probability of hypothesis h
     - P(D) = prior probability of training data D
     - P(h|D) = probability of h given D
     - P(D|h) = probability of D given h
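
To make the theorem concrete, here is a minimal Python sketch (not from the slides) that computes the posterior of every hypothesis in a finite hypothesis space, obtaining P(D) by summing P(D|h) P(h) over H. The function and argument names are illustrative assumptions.

```python
# A minimal sketch of Bayes theorem over a finite hypothesis space:
#   P(h|D) = P(D|h) P(h) / P(D),  with  P(D) = sum_h P(D|h) P(h).
def posteriors(priors, likelihoods):
    """priors[h] = P(h); likelihoods[h] = P(D|h). Returns P(h|D) per h."""
    evidence = sum(likelihoods[h] * priors[h] for h in priors)  # P(D)
    return {h: likelihoods[h] * priors[h] / evidence for h in priors}
```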

  4. Choosing Hypotheses

         P(h|D) = P(D|h) P(h) / P(D)

     Generally we want the most probable hypothesis given the training data,
     the maximum a posteriori hypothesis h_MAP:

         h_MAP = argmax_{h in H} P(h|D)
               = argmax_{h in H} P(D|h) P(h) / P(D)
               = argmax_{h in H} P(D|h) P(h)

     If we assume P(h_i) = P(h_j) for all i, j, then we can further simplify
     and choose the maximum likelihood (ML) hypothesis:

         h_ML = argmax_{h_i in H} P(D|h_i)
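
A hedged sketch of the two decision rules, reusing the dict-of-probabilities convention from the snippet above; the helper names are assumptions, not from the slides. Since P(D) is constant across hypotheses, it can be dropped from the argmax.

```python
# h_MAP: maximize P(D|h) P(h); P(D) is a constant factor, so it is omitted.
def h_map(priors, likelihoods):
    return max(priors, key=lambda h: likelihoods[h] * priors[h])

# h_ML: with equal priors, the MAP rule reduces to maximizing P(D|h).
def h_ml(likelihoods):
    return max(likelihoods, key=lambda h: likelihoods[h])
```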

  5. Bayes Theorem: Does the patient have cancer or not?
     A patient takes a lab test and the result comes back positive. The test
     returns a correct positive result in only 98% of the cases in which the
     disease is actually present, and a correct negative result in only 97%
     of the cases in which the disease is not present. Furthermore, .008 of
     the entire population have this cancer.

         P(cancer) = .008           P(¬cancer) = .992
         P(+|cancer) = .98          P(-|cancer) = .02
         P(+|¬cancer) = .03         P(-|¬cancer) = .97
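
A small worked computation (not on the slide itself) applying Bayes theorem and the theorem of total probability to these numbers:

```python
# Worked cancer example: compute P(cancer|+) from the slide's figures.
p_cancer, p_not = 0.008, 0.992
p_pos_given_cancer, p_pos_given_not = 0.98, 0.03

# P(+) by the theorem of total probability (see slide 6)
p_pos = p_pos_given_cancer * p_cancer + p_pos_given_not * p_not   # ~0.0376
# P(cancer|+) = P(+|cancer) P(cancer) / P(+)
p_cancer_given_pos = p_pos_given_cancer * p_cancer / p_pos        # ~0.21
```

So P(cancer|+) is only about 0.21: even after a positive test, ¬cancer remains the more probable hypothesis, because the disease is so rare.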

  6. Basic Formulas for Probabilities
     - Product rule: probability P(A ∧ B) of a conjunction of two events A and B:
           P(A ∧ B) = P(A|B) P(B) = P(B|A) P(A)
     - Sum rule: probability of a disjunction of two events A and B:
           P(A ∨ B) = P(A) + P(B) - P(A ∧ B)
     - Theorem of total probability: if events A_1, ..., A_n are mutually
       exclusive with sum_{i=1}^n P(A_i) = 1, then
           P(B) = sum_{i=1}^n P(B|A_i) P(A_i)
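
The following illustrative check (made-up joint distribution, not from the slides) verifies all three formulas numerically:

```python
# Joint distribution P(A = a, B = b) over two binary events (numbers invented).
p_ab = {(True, True): 0.2, (True, False): 0.3,
        (False, True): 0.1, (False, False): 0.4}

p_a = p_ab[(True, True)] + p_ab[(True, False)]       # P(A)  = 0.5
p_b = p_ab[(True, True)] + p_ab[(False, True)]       # P(B)  = 0.3
p_b_given_a     = p_ab[(True, True)] / p_a           # P(B|A)
p_b_given_not_a = p_ab[(False, True)] / (1 - p_a)    # P(B|¬A)

# product rule: P(A and B) = P(B|A) P(A)
assert abs(p_ab[(True, True)] - p_b_given_a * p_a) < 1e-12
# sum rule: P(A or B) = P(A) + P(B) - P(A and B) = 0.6
assert abs((p_a + p_b - p_ab[(True, True)]) - 0.6) < 1e-12
# total probability: P(B) = P(B|A) P(A) + P(B|¬A) P(¬A)
assert abs(p_b - (p_b_given_a * p_a + p_b_given_not_a * (1 - p_a))) < 1e-12
```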

  7. Brute Force MAP Hypothesis Learner
     1. For each hypothesis h in H, calculate the posterior probability
            P(h|D) = P(D|h) P(h) / P(D)
     2. Output the hypothesis h_MAP with the highest posterior probability
            h_MAP = argmax_{h in H} P(h|D)
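
A direct Python sketch of this two-step procedure; `prior` and `likelihood` are assumed user-supplied callables, since the slides do not fix an interface.

```python
# Brute-force MAP learning: enumerate H, compute each posterior, take argmax.
def brute_force_map(H, D, prior, likelihood):
    joint = {h: likelihood(D, h) * prior(h) for h in H}  # P(D|h) P(h)
    p_D = sum(joint.values())                            # P(D) by total probability
    posterior = {h: j / p_D for h, j in joint.items()}   # P(h|D)
    return max(posterior, key=posterior.get)             # h_MAP
```

As the slide notes, this is practical only when H is small enough to enumerate; its value is as a gold standard that other learners can be compared against.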

  8. Relation to Concept Learning
     Consider our usual concept learning task:
     - instance space X, hypothesis space H, training examples D
     - consider the FindS learning algorithm (outputs the most specific
       hypothesis from the version space VS_{H,D})
     What would Bayes rule produce as the MAP hypothesis?
     Does FindS output a MAP hypothesis??

  9. Relation to Concept Learning
     Assume a fixed set of instances <x_1, ..., x_m>.
     Assume D is the set of classifications D = <c(x_1), ..., c(x_m)>.
     Choose P(D|h):

  10. Relation to Concept Learning
      Assume a fixed set of instances <x_1, ..., x_m>.
      Assume D is the set of classifications D = <c(x_1), ..., c(x_m)>.
      Choose P(D|h):
      - P(D|h) = 1 if h is consistent with D
      - P(D|h) = 0 otherwise
      Choose P(h) to be the uniform distribution:
      - P(h) = 1/|H| for all h in H
      Then:
          P(h|D) = 1 / |VS_{H,D}|   if h is consistent with D
                 = 0                otherwise
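
A sketch of this special case, assuming a user-supplied `consistent(h, D)` helper (not defined on the slides) that checks whether h agrees with every training classification: the uniform prior and 0/1 likelihood make the posterior uniform over the version space.

```python
# Posterior over H under a uniform prior and a 0/1 consistency likelihood:
# every consistent hypothesis gets 1/|VS_{H,D}|, every other one gets 0.
def concept_posterior(H, D, consistent):
    VS = [h for h in H if consistent(h, D)]   # version space VS_{H,D}
    return {h: (1.0 / len(VS) if h in VS else 0.0) for h in H}
```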

  11. Evolution of Posterior Probabilities
      [Figure: three panels over the hypothesis space, (a) the prior P(h),
      (b) the posterior P(h|D1), (c) the posterior P(h|D1, D2); the posterior
      concentrates on fewer hypotheses as training data accumulates.]

  12. Characterizing Learning Algorithms by Equivalent MAP Learners
      [Figure: an inductive system (the Candidate Elimination Algorithm,
      taking training examples D and hypothesis space H, producing output
      hypotheses) behaves identically to an equivalent Bayesian inference
      system (a brute-force MAP learner over the same D and H, with P(h)
      uniform and P(D|h) = 0 if inconsistent, 1 if consistent). The Bayesian
      view makes the prior assumptions explicit.]

  13. Learning A Real Valued Function
      [Figure: noisy training points e around a target function f, and the
      maximum likelihood hypothesis h_ML, plotted in the (x, y) plane.]
      Consider any real-valued target function f.
      Training examples <x_i, d_i>, where d_i is a noisy training value:
      - d_i = f(x_i) + e_i
      - e_i is a random variable (noise) drawn independently for each x_i
        according to some Gaussian distribution with mean = 0
      Then the maximum likelihood hypothesis h_ML is the one that minimizes
      the sum of squared errors:

          h_ML = argmin_{h in H} sum_{i=1}^m (d_i - h(x_i))^2
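
To illustrate the result, a short sketch under assumed synthetic data and a linear hypothesis class: an ordinary least-squares fit is exactly the maximum likelihood fit when the noise is Gaussian.

```python
import numpy as np

# Synthetic data: d_i = f(x_i) + e_i with f(x) = 2x + 1 and Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
d = 2.0 * x + 1.0 + rng.normal(0.0, 0.1, size=x.shape)

# Least squares over h(x) = w1*x + w0 is the ML fit under Gaussian noise:
#   argmin_h sum_i (d_i - h(x_i))^2
w1, w0 = np.polyfit(x, d, deg=1)
sse = np.sum((d - (w1 * x + w0)) ** 2)
```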

  14. Learning A Real Valued Function

          h_ML = argmax_{h in H} p(D|h)
               = argmax_{h in H} prod_{i=1}^m p(d_i | h)
               = argmax_{h in H} prod_{i=1}^m (1 / sqrt(2 pi sigma^2)) e^{-(1/2)((d_i - h(x_i)) / sigma)^2}

      Maximize the natural log of this instead:

          h_ML = argmax_{h in H} sum_{i=1}^m [ ln(1 / sqrt(2 pi sigma^2)) - (1/2)((d_i - h(x_i)) / sigma)^2 ]
               = argmax_{h in H} sum_{i=1}^m -(1/2)((d_i - h(x_i)) / sigma)^2
               = argmax_{h in H} sum_{i=1}^m -(d_i - h(x_i))^2
               = argmin_{h in H} sum_{i=1}^m (d_i - h(x_i))^2
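
A numeric sanity check of the derivation (synthetic data and a one-parameter grid of candidate hypotheses are assumptions for the sketch): the Gaussian log-likelihood is, up to an additive constant, -1/(2 sigma^2) times the sum of squared errors, so its argmax coincides with the argmin of the SSE.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
d = 2.0 * x + 1.0 + rng.normal(0.0, 0.1, size=x.shape)

sigma = 0.1
slopes = np.linspace(1.5, 2.5, 101)                # candidates h(x) = w*x + 1
resid = d[None, :] - (slopes[:, None] * x[None, :] + 1.0)
log_lik = np.sum(-0.5 * (resid / sigma) ** 2, axis=1)  # ln p(D|h) up to a constant
sse = np.sum(resid ** 2, axis=1)
assert slopes[np.argmax(log_lik)] == slopes[np.argmin(sse)]
```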

  15. Learning to Predict Probabilities
      Consider predicting survival probability from patient data.
      Training examples <x_i, d_i>, where d_i is 1 or 0.
      We want to train a neural network to output a probability given x_i
      (not a 0 or 1). In this case one can show

          h_ML = argmax_{h in H} sum_{i=1}^m d_i ln h(x_i) + (1 - d_i) ln(1 - h(x_i))

      Weight update rule for a sigmoid unit:

          w_jk <- w_jk + Delta_w_jk

      where

          Delta_w_jk = eta sum_{i=1}^m (d_i - h(x_i)) x_ijk
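
A minimal sketch of this update rule for a single sigmoid unit; the learning rate, epoch count, and data shapes are assumptions for the sketch, not from the slides.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_sigmoid_unit(X, d, eta=0.1, epochs=1000):
    """X: (m, n) inputs; d: (m,) labels in {0, 1}. Returns weights w (n,)."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        h = sigmoid(X @ w)            # h(x_i): predicted probabilities
        w += eta * (X.T @ (d - h))    # Delta_w = eta * sum_i (d_i - h(x_i)) x_i
    return w
```

The gradient step ascends the log-likelihood sum_i d_i ln h(x_i) + (1 - d_i) ln(1 - h(x_i)), so the trained unit outputs ML probability estimates rather than hard 0/1 classifications.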
