Bayesian Learning
[Read Ch. 6] [Suggested exercises]

Lecture slides for textbook Machine Learning, T. Mitchell, McGraw Hill.


slide-1
SLIDE 1 Bayesian Learning [Read Ch. 6] [Suggested exercises]
  • Bayes Theorem
  • MAP, ML hypotheses
  • MAP learners
  • Minimum description length principle
  • Bayes optimal classifier
  • Naive Bayes learner
  • Example: Learning over text data
  • Bayesian belief networks
  • Expectation Maximization algorithm
slide-2
SLIDE 2 Two Roles for Bayesian Methods

Provides practical learning algorithms:
  • Naive Bayes learning
  • Bayesian belief network learning
  • Combine prior knowledge (prior probabilities) with observed data
  • Requires prior probabilities

Provides useful conceptual framework:
  • Provides "gold standard" for evaluating other learning algorithms
  • Additional insight into Occam's razor
slide-3
SLIDE 3 Bayes Theorem

P(h|D) = P(D|h) P(h) / P(D)

  • P(h) = prior probability of hypothesis h
  • P(D) = prior probability of training data D
  • P(h|D) = probability of h given D
  • P(D|h) = probability of D given h
slide-4
SLIDE 4 Choosing Hypotheses

P(h|D) = P(D|h) P(h) / P(D)

Generally we want the most probable hypothesis given the training data: the Maximum a posteriori hypothesis h_MAP:

h_MAP = argmax_{h ∈ H} P(h|D)
      = argmax_{h ∈ H} P(D|h) P(h) / P(D)
      = argmax_{h ∈ H} P(D|h) P(h)

If we assume P(h_i) = P(h_j) for all i, j, then we can simplify further and choose the Maximum likelihood (ML) hypothesis:

h_ML = argmax_{h_i ∈ H} P(D|h_i)
slide-5
SLIDE 5 Bayes Theorem: Does the patient have cancer or not?

A patient takes a lab test and the result comes back positive. The test returns a correct positive result in only 98% of the cases in which the disease is actually present, and a correct negative result in only 97% of the cases in which the disease is not present. Furthermore, .008 of the entire population have this cancer.

  P(cancer) = .008          P(¬cancer) = .992
  P(+|cancer) = .98         P(−|cancer) = .02
  P(+|¬cancer) = .03        P(−|¬cancer) = .97
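A quick check of the posterior these numbers imply, as a minimal sketch (the variable names are mine, not from the slides):

```python
# Posterior for the cancer-test example via Bayes theorem.
p_cancer = 0.008            # prior P(cancer)
p_pos_given_cancer = 0.98   # P(+|cancer)
p_pos_given_healthy = 0.03  # P(+|¬cancer)

# P(+) by the theorem of total probability (see the next slide)
p_pos = (p_pos_given_cancer * p_cancer
         + p_pos_given_healthy * (1 - p_cancer))

# Bayes theorem: P(cancer|+) = P(+|cancer) P(cancer) / P(+)
print(p_pos_given_cancer * p_cancer / p_pos)  # ≈ 0.21, so h_MAP = ¬cancer
```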
slide-6
SLIDE 6 Basic Formulas for Probabilities

  • Product rule: probability P(A ∧ B) of a conjunction of two events A and B:
    P(A ∧ B) = P(A|B) P(B) = P(B|A) P(A)
  • Sum rule: probability of a disjunction of two events A and B:
    P(A ∨ B) = P(A) + P(B) − P(A ∧ B)
  • Theorem of total probability: if events A_1, ..., A_n are mutually exclusive with Σ_{i=1}^n P(A_i) = 1, then
    P(B) = Σ_{i=1}^n P(B|A_i) P(A_i)
slide-7
SLIDE 7 Brute Force MAP Hypothesis Learner

1. For each hypothesis h in H, calculate the posterior probability
   P(h|D) = P(D|h) P(h) / P(D)
2. Output the hypothesis h_MAP with the highest posterior probability:
   h_MAP = argmax_{h ∈ H} P(h|D)
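A minimal sketch of this learner for a finite hypothesis space; the coin-bias hypothesis space and the data below are illustrative, not from the slides:

```python
def brute_force_map(hypotheses, prior, likelihood, data):
    # argmax_h P(D|h) P(h); P(D) is constant across h, so it is dropped
    return max(hypotheses, key=lambda h: likelihood(data, h) * prior(h))

# Example: H = possible coin biases, uniform prior, data = flips (1 = heads)
def likelihood(data, p):
    out = 1.0
    for x in data:
        out *= p if x == 1 else (1 - p)
    return out

hyps = [0.3, 0.5, 0.9]
print(brute_force_map(hyps, lambda h: 1 / len(hyps), likelihood,
                      [1, 1, 0, 1, 1]))  # 0.9
```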
slide-8
SLIDE 8 Relation to Concept Learning

Consider our usual concept learning task:
  • instance space X, hypothesis space H, training examples D
  • consider the FindS learning algorithm (outputs the most specific hypothesis from the version space VS_{H,D})

What would Bayes rule produce as the MAP hypothesis?
Does FindS output a MAP hypothesis?
slide-9
SLIDE 9 Relation to Concept Learning

Assume a fixed set of instances ⟨x_1, ..., x_m⟩.
Assume D is the set of classifications D = ⟨c(x_1), ..., c(x_m)⟩.
Choose P(D|h):
slide-10
SLIDE 10 Relation to Concept Learning

Assume a fixed set of instances ⟨x_1, ..., x_m⟩.
Assume D is the set of classifications D = ⟨c(x_1), ..., c(x_m)⟩.

Choose P(D|h):
  • P(D|h) = 1 if h is consistent with D
  • P(D|h) = 0 otherwise

Choose P(h) to be the uniform distribution:
  • P(h) = 1/|H| for all h in H

Then:
  P(h|D) = 1/|VS_{H,D}| if h is consistent with D, 0 otherwise
slide-11
SLIDE 11 Evolution of Posterior Probabilities

[Figure: three panels over the hypothesis space, showing (a) the prior P(h), (b) P(h|D1), and (c) P(h|D1, D2) — the posterior concentrates on fewer hypotheses as data arrives.]
slide-12
SLIDE 12 Characterizing Learning Algorithms by Equivalent MAP Learners

[Figure: an inductive system (the Candidate Elimination Algorithm, taking training examples D and hypothesis space H and producing output hypotheses) shown as equivalent to a Bayesian inference system (a brute force MAP learner over the same inputs) whose prior assumptions are made explicit: P(h) uniform, P(D|h) = 1 if consistent, = 0 if inconsistent.]
slide-13
SLIDE 13 Learning a Real-Valued Function

[Figure: noisy training points y scattered around a target function f(x), with the maximum likelihood hypothesis h_ML and the error e.]

Consider any real-valued target function f.
Training examples ⟨x_i, d_i⟩, where d_i is a noisy training value:
  • d_i = f(x_i) + e_i
  • e_i is a random variable (noise) drawn independently for each x_i according to some Gaussian distribution with mean 0

Then the maximum likelihood hypothesis h_ML is the one that minimizes the sum of squared errors:

h_ML = argmin_{h ∈ H} Σ_{i=1}^m (d_i − h(x_i))²
slide-14
SLIDE 14 Learning a Real-Valued Function

h_ML = argmax_{h ∈ H} p(D|h)
     = argmax_{h ∈ H} ∏_{i=1}^m p(d_i|h)
     = argmax_{h ∈ H} ∏_{i=1}^m (1/√(2πσ²)) e^{−(1/2)((d_i − h(x_i))/σ)²}

Maximize the natural log of this instead:

h_ML = argmax_{h ∈ H} Σ_{i=1}^m [ ln(1/√(2πσ²)) − (1/2)((d_i − h(x_i))/σ)² ]
     = argmax_{h ∈ H} Σ_{i=1}^m −(1/2)((d_i − h(x_i))/σ)²
     = argmax_{h ∈ H} Σ_{i=1}^m −(d_i − h(x_i))²
     = argmin_{h ∈ H} Σ_{i=1}^m (d_i − h(x_i))²
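The derivation says that under Gaussian noise, least-squares fitting is maximum likelihood estimation. A minimal sketch with synthetic data (the linear hypothesis class and noise level are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
f = lambda t: 2.0 * t + 1.0                # true target function
d = f(x) + rng.normal(0.0, 0.1, x.size)    # d_i = f(x_i) + e_i, Gaussian e_i

# Least squares fit of h(x) = w1*x + w0, i.e. argmin_h Σ_i (d_i − h(x_i))²
A = np.stack([x, np.ones_like(x)], axis=1)
(w1, w0), *_ = np.linalg.lstsq(A, d, rcond=None)
print(w1, w0)  # ML estimates, close to the true 2.0 and 1.0
```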
slide-15
SLIDE 15 Learning to Predict Probabilities

Consider predicting survival probability from patient data.
Training examples ⟨x_i, d_i⟩, where d_i is 1 or 0.
Want to train a neural network to output a probability given x_i (not a 0 or 1).
In this case one can show

h_ML = argmax_{h ∈ H} Σ_{i=1}^m [ d_i ln h(x_i) + (1 − d_i) ln(1 − h(x_i)) ]

Weight update rule for a sigmoid unit:

w_jk ← w_jk + Δw_jk,  where  Δw_jk = η Σ_{i=1}^m (d_i − h(x_i)) x_ijk
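A minimal sketch of that update rule for a single sigmoid unit; the data, the learning rate η, and the epoch count are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))                            # instances x_i
d = (X @ np.array([1.0, -2.0, 0.5]) > 0).astype(float)   # targets d_i ∈ {0, 1}

w = np.zeros(3)
eta = 0.1
for _ in range(200):
    h = sigmoid(X @ w)         # unit output, read as an estimate of P(d=1|x)
    w += eta * X.T @ (d - h)   # Δw_jk = η Σ_i (d_i − h(x_i)) x_ijk
print(w)
```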
slide-16
SLIDE 16 Minimum Description Length Principle

Occam's razor: prefer the shortest hypothesis.
MDL: prefer the hypothesis h that minimizes

h_MDL = argmin_{h ∈ H} L_C1(h) + L_C2(D|h)

where L_C(x) is the description length of x under encoding C.

Example: H = decision trees, D = training data labels
  • L_C1(h) is # bits to describe tree h
  • L_C2(D|h) is # bits to describe D given h
  • Note L_C2(D|h) = 0 if the examples are classified perfectly by h; need only describe the exceptions
  • Hence h_MDL trades off tree size for training errors
slide-17
SLIDE 17 Minimum Description Length Principle

h_MAP = argmax_{h ∈ H} P(D|h) P(h)
      = argmax_{h ∈ H} [ log2 P(D|h) + log2 P(h) ]
      = argmin_{h ∈ H} [ −log2 P(D|h) − log2 P(h) ]

Interesting fact from information theory: the optimal (shortest expected coding length) code for an event with probability p uses −log2 p bits.

So interpret:
  • −log2 P(h) is the length of h under the optimal code
  • −log2 P(D|h) is the length of D given h under the optimal code

→ prefer the hypothesis that minimizes length(h) + length(misclassifications)
slide-18
SLIDE 18 Most Probable Classification of New Instances

So far we've sought the most probable hypothesis given the data D (i.e., h_MAP).
Given a new instance x, what is its most probable classification?
  • h_MAP(x) is not the most probable classification!

Consider three possible hypotheses:
  P(h1|D) = .4,  P(h2|D) = .3,  P(h3|D) = .3
Given a new instance x:
  h1(x) = +,  h2(x) = −,  h3(x) = −
What's the most probable classification of x?
slide-19
SLIDE 19 Bayes Optimal Classifier

Bayes optimal classification:

argmax_{v_j ∈ V} Σ_{h_i ∈ H} P(v_j|h_i) P(h_i|D)

Example:
  P(h1|D) = .4,  P(−|h1) = 0,  P(+|h1) = 1
  P(h2|D) = .3,  P(−|h2) = 1,  P(+|h2) = 0
  P(h3|D) = .3,  P(−|h3) = 1,  P(+|h3) = 0

therefore
  Σ_{h_i ∈ H} P(+|h_i) P(h_i|D) = .4
  Σ_{h_i ∈ H} P(−|h_i) P(h_i|D) = .6

and
  argmax_{v_j ∈ V} Σ_{h_i ∈ H} P(v_j|h_i) P(h_i|D) = −
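The weighted vote above as a few lines of code (a sketch; the dictionaries just encode the numbers on this slide):

```python
posteriors = {"h1": 0.4, "h2": 0.3, "h3": 0.3}       # P(h_i|D)
p_class = {"h1": {"+": 1.0, "-": 0.0},               # P(v_j|h_i)
           "h2": {"+": 0.0, "-": 1.0},
           "h3": {"+": 0.0, "-": 1.0}}

# argmax_v Σ_i P(v|h_i) P(h_i|D)
score = {v: sum(p_class[h][v] * posteriors[h] for h in posteriors)
         for v in ("+", "-")}
print(max(score, key=score.get), score)  # '-' with scores {'+': 0.4, '-': 0.6}
```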
slide-20
SLIDE 20 Gibbs Classifier

The Bayes optimal classifier provides the best result, but can be expensive if there are many hypotheses.

Gibbs algorithm:
1. Choose one hypothesis at random, according to P(h|D)
2. Use this to classify the new instance

Surprising fact: assume the target concepts are drawn at random from H according to the priors on H. Then:

E[error_Gibbs] ≤ 2 E[error_BayesOptimal]

Suppose a correct, uniform prior distribution over H. Then:
  • Pick any hypothesis from VS with uniform probability
  • Its expected error is no worse than twice Bayes optimal
slide-21
SLIDE 21 Naive Bayes Classifier

Along with decision trees, neural networks, and nearest neighbor, one of the most practical learning methods.

When to use:
  • Moderate or large training set available
  • Attributes that describe instances are conditionally independent given classification

Successful applications:
  • Diagnosis
  • Classifying text documents
slide-22
SLIDE 22 Naive Bayes Classifier

Assume a target function f: X → V, where each instance x is described by attributes ⟨a_1, a_2, ..., a_n⟩.

The most probable value of f(x) is:

v_MAP = argmax_{v_j ∈ V} P(v_j|a_1, a_2, ..., a_n)
      = argmax_{v_j ∈ V} P(a_1, a_2, ..., a_n|v_j) P(v_j) / P(a_1, a_2, ..., a_n)
      = argmax_{v_j ∈ V} P(a_1, a_2, ..., a_n|v_j) P(v_j)

Naive Bayes assumption:

P(a_1, a_2, ..., a_n|v_j) = ∏_i P(a_i|v_j)

which gives the Naive Bayes classifier:

v_NB = argmax_{v_j ∈ V} P(v_j) ∏_i P(a_i|v_j)
slide-23
SLIDE 23 Naive Bayes Algorithm

Naive_Bayes_Learn(examples):
  For each target value v_j
    P̂(v_j) ← estimate P(v_j)
    For each attribute value a_i of each attribute a
      P̂(a_i|v_j) ← estimate P(a_i|v_j)

Classify_New_Instance(x):

v_NB = argmax_{v_j ∈ V} P̂(v_j) ∏_{a_i ∈ x} P̂(a_i|v_j)
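A compact sketch of those two procedures for discrete attributes (plain frequency estimates, no smoothing; the smoothing fix appears on the m-estimate slide below):

```python
from collections import Counter, defaultdict

def naive_bayes_learn(examples):
    # examples: list of (attribute_tuple, target_value) pairs
    class_counts = Counter(v for _, v in examples)
    attr_counts = defaultdict(Counter)
    for attrs, v in examples:
        for i, a in enumerate(attrs):
            attr_counts[v][(i, a)] += 1
    p_v = {v: c / len(examples) for v, c in class_counts.items()}
    p_a = {v: {k: c / class_counts[v] for k, c in cnt.items()}
           for v, cnt in attr_counts.items()}
    return p_v, p_a

def classify_new_instance(x, p_v, p_a):
    def score(v):  # P̂(v_j) Π_i P̂(a_i|v_j)
        s = p_v[v]
        for i, a in enumerate(x):
            s *= p_a[v].get((i, a), 0.0)
        return s
    return max(p_v, key=score)
```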
slide-24
SLIDE 24 Naive Bayes Example

Consider PlayTennis again, and the new instance

⟨Outlook = sunny, Temp = cool, Humid = high, Wind = strong⟩

We want to compute

v_NB = argmax_{v_j ∈ V} P(v_j) ∏_i P(a_i|v_j)

P(y) P(sun|y) P(cool|y) P(high|y) P(strong|y) = .005
P(n) P(sun|n) P(cool|n) P(high|n) P(strong|n) = .021

→ v_NB = n
slide-25
SLIDE 25 Naive Bayes: Subtleties

1. The conditional independence assumption is often violated:

P(a_1, a_2, ..., a_n|v_j) = ∏_i P(a_i|v_j)

  • ...but it works surprisingly well anyway. Note that we don't need the estimated posteriors P̂(v_j|x) to be correct; we need only that

argmax_{v_j ∈ V} P̂(v_j) ∏_i P̂(a_i|v_j) = argmax_{v_j ∈ V} P(v_j) P(a_1, ..., a_n|v_j)

  • see [Domingos & Pazzani, 1996] for analysis
  • Naive Bayes posteriors are often unrealistically close to 1 or 0
slide-26
SLIDE 26 Naive Bayes: Subtleties

2. What if none of the training instances with target value v_j have attribute value a_i? Then

P̂(a_i|v_j) = 0, and P̂(v_j) ∏_i P̂(a_i|v_j) = 0

The typical solution is a Bayesian estimate for P̂(a_i|v_j):

P̂(a_i|v_j) ← (n_c + m p) / (n + m)

where
  • n is the number of training examples for which v = v_j
  • n_c is the number of examples for which v = v_j and a = a_i
  • p is a prior estimate for P̂(a_i|v_j)
  • m is the weight given to the prior (i.e., the number of "virtual" examples)
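The m-estimate as a one-liner; the numbers in the usage line are illustrative, not from the slides:

```python
def m_estimate(n_c, n, p, m):
    """Smoothed estimate of P(a_i|v_j) = (n_c + m*p) / (n + m)."""
    return (n_c + m * p) / (n + m)

# An attribute value never seen among 5 examples of a class, with 2
# possible values (so prior p = 1/2), m = 2 virtual examples:
print(m_estimate(n_c=0, n=5, p=0.5, m=2))  # ≈ 0.143 rather than 0
```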
slide-27
SLIDE 27 Learning to Classify Text

Why?
  • Learn which news articles are of interest
  • Learn to classify web pages by topic

Naive Bayes is among the most effective algorithms.
What attributes shall we use to represent text documents?
slide-28
SLIDE 28 Learning to Classify Text

Target concept Interesting?: Document → {+, −}

1. Represent each document by a vector of words
  • one attribute per word position in the document
2. Learning: use training examples to estimate
  • P(+), P(−), P(doc|+), P(doc|−)

Naive Bayes conditional independence assumption:

P(doc|v_j) = ∏_{i=1}^{length(doc)} P(a_i = w_k|v_j)

where P(a_i = w_k|v_j) is the probability that the word in position i is w_k, given v_j.

One more assumption:

P(a_i = w_k|v_j) = P(a_m = w_k|v_j), ∀ i, m
slide-29
SLIDE 29 Learn_naive_Bayes_text(Examples, V)

1. Collect all words and other tokens that occur in Examples
  • Vocabulary ← all distinct words and other tokens in Examples

2. Calculate the required P(v_j) and P(w_k|v_j) probability terms
  • For each target value v_j in V do
    – docs_j ← subset of Examples for which the target value is v_j
    – P(v_j) ← |docs_j| / |Examples|
    – Text_j ← a single document created by concatenating all members of docs_j
    – n ← total number of words in Text_j (counting duplicate words multiple times)
    – for each word w_k in Vocabulary
      * n_k ← number of times word w_k occurs in Text_j
      * P(w_k|v_j) ← (n_k + 1) / (n + |Vocabulary|)
slide-30
SLIDE 30 Classify_naive_Bayes_text(Doc)

  • positions ← all word positions in Doc that contain tokens found in Vocabulary
  • Return v_NB, where

v_NB = argmax_{v_j ∈ V} P(v_j) ∏_{i ∈ positions} P(a_i|v_j)
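A runnable sketch of the two procedures above; working in log space to avoid underflow on long documents is my addition, not part of the slides:

```python
import math
from collections import Counter

def learn_naive_bayes_text(examples):
    # examples: list of (token_list, target_value) pairs
    vocabulary = {w for doc, _ in examples for w in doc}
    p_v, p_w = {}, {}
    for v in {t for _, t in examples}:
        docs_j = [doc for doc, t in examples if t == v]
        p_v[v] = len(docs_j) / len(examples)
        counts = Counter(w for doc in docs_j for w in doc)
        n = sum(counts.values())
        p_w[v] = {w: (counts[w] + 1) / (n + len(vocabulary))
                  for w in vocabulary}              # (n_k + 1) / (n + |V|)
    return vocabulary, p_v, p_w

def classify_naive_bayes_text(doc, vocabulary, p_v, p_w):
    positions = [w for w in doc if w in vocabulary]
    def log_score(v):
        return math.log(p_v[v]) + sum(math.log(p_w[v][w]) for w in positions)
    return max(p_v, key=log_score)
```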
slide-31
SLIDE 31 Twenty NewsGroups

Given 1000 training documents from each group, learn to classify new documents according to which newsgroup they came from:

comp.graphics             misc.forsale
comp.os.ms-windows.misc   rec.autos
comp.sys.ibm.pc.hardware  rec.motorcycles
comp.sys.mac.hardware     rec.sport.baseball
comp.windows.x            rec.sport.hockey
alt.atheism               sci.space
soc.religion.christian    sci.crypt
talk.religion.misc        sci.electronics
talk.politics.mideast     sci.med
talk.politics.misc        talk.politics.guns

Naive Bayes: 89% classification accuracy
slide-32
SLIDE 32 Article from rec.sport.hockey

Path: cantaloupe.srv.cs.cmu.edu!das-news.harvard.e...
From: xxx@yyy.zzz.edu (John Doe)
Subject: Re: This year's biggest and worst (opinion)...
Date: ... Apr ... GMT

I can only comment on the Kings, but the most obvious candidate for pleasant surprise is Alex Zhitnik. He came highly touted as a defensive defenseman, but he's clearly much more than that. Great skater and hard shot (though wish he were more accurate). In fact, he pretty much allowed the Kings to trade away that huge defensive liability Paul Coffey. Kelly Hrudey is only the biggest disappointment if you thought he was any good to begin with. But at best he's only a mediocre goaltender. A better choice would be Tomas Sandstrom, though not through any fault of his own, but because some thugs in Toronto decided ...
slide-33
SLIDE 33 Learning Curve for 20 Newsgroups

[Figure: accuracy vs. training set size (1/3 withheld for test) on 20News, with curves for Bayes, TFIDF, and PRTFIDF; accuracy axis 10–100%, training set size axis 100 to 10000.]
slide-34
SLIDE 34 Bayesian Belief Networks

Interesting because:
  • The Naive Bayes assumption of conditional independence is too restrictive
  • But inference is intractable without some such assumptions
  • Bayesian belief networks describe conditional independence among subsets of variables
  • This allows combining prior knowledge about (in)dependencies among variables with observed training data

(also called Bayes Nets)
slide-35
SLIDE 35 Conditional Independence

Definition: X is conditionally independent of Y given Z if the probability distribution governing X is independent of the value of Y given the value of Z; that is, if

(∀ x_i, y_j, z_k) P(X = x_i|Y = y_j, Z = z_k) = P(X = x_i|Z = z_k)

More compactly, we write P(X|Y, Z) = P(X|Z).

Example: Thunder is conditionally independent of Rain, given Lightning:

P(Thunder|Rain, Lightning) = P(Thunder|Lightning)

Naive Bayes uses conditional independence to justify:

P(X, Y|Z) = P(X|Y, Z) P(Y|Z)
          = P(X|Z) P(Y|Z)
slide-36
SLIDE 36 Bayesian Belief Network

[Figure: directed acyclic graph over Storm, BusTourGroup, Lightning, Campfire, Thunder, ForestFire, with the conditional probability table for Campfire:]

         S,B    S,¬B   ¬S,B   ¬S,¬B
  C      0.4    0.1    0.8    0.2
  ¬C     0.6    0.9    0.2    0.8

The network represents a set of conditional independence assertions:
  • Each node is asserted to be conditionally independent of its nondescendants, given its immediate predecessors
  • Directed acyclic graph
slide-37
SLIDE 37 Bayesian Belief Network

[Figure: the same Storm/BusTourGroup/Lightning/Campfire/Thunder/ForestFire network and Campfire CPT as on the previous slide.]

The network represents the joint probability distribution over all variables:
  • e.g., P(Storm, BusTourGroup, ..., ForestFire)
  • in general,

P(y_1, ..., y_n) = ∏_{i=1}^n P(y_i|Parents(Y_i))

    where Parents(Y_i) denotes the immediate predecessors of Y_i in the graph
  • so the joint distribution is fully defined by the graph, plus the P(y_i|Parents(Y_i))
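A tiny sketch of evaluating that factored joint on a fragment of the network: the Campfire CPT is the one from the slide, while the root priors for Storm and BusTourGroup are made-up placeholders just so the product can be computed:

```python
# P(Campfire = T | Storm, BusTourGroup), from the CPT on the slide
P_CAMPFIRE = {(True, True): 0.4, (True, False): 0.1,
              (False, True): 0.8, (False, False): 0.2}

def p_campfire(c, s, b):
    p = P_CAMPFIRE[(s, b)]
    return p if c else 1.0 - p

def joint(s, b, c, p_storm=0.1, p_bus=0.5):  # root priors: assumed values
    # P(S, B, C) = P(S) P(B) P(C|S, B) — the product over this fragment
    return ((p_storm if s else 1 - p_storm)
            * (p_bus if b else 1 - p_bus)
            * p_campfire(c, s, b))

print(joint(s=True, b=True, c=True))  # 0.1 * 0.5 * 0.4 = 0.02
```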
slide-38
SLIDE 38 Inference in Bayesian Networks

How can one infer the (probabilities of) values of one or more network variables, given observed values of others?
  • The Bayes net contains all the information needed for this inference
  • If only one variable has an unknown value, it is easy to infer it
  • In the general case, the problem is NP-hard

In practice, we can succeed in many cases:
  • Exact inference methods work well for some network structures
  • Monte Carlo methods "simulate" the network randomly to calculate approximate solutions
slide-39
SLIDE 39 Learning of Bayesian Networks

Several variants of this learning task:
  • Network structure might be known or unknown
  • Training examples might provide values of all network variables, or just some

If structure is known and we observe all variables:
  • Then it's as easy as training a Naive Bayes classifier
slide-40
SLIDE 40 Learning Bayes Nets

Suppose structure is known and variables are partially observable.

E.g., observe ForestFire, Storm, BusTourGroup, Thunder, but not Lightning, Campfire...
  • Similar to training a neural network with hidden units
  • In fact, we can learn the network's conditional probability tables using gradient ascent
  • Converge to the network h that (locally) maximizes P(D|h)
slide-41
SLIDE 41 Gradient Ascent for Bayes Nets

Let w_ijk denote one entry in the conditional probability table for variable Y_i in the network:

w_ijk = P(Y_i = y_ij | Parents(Y_i) = the list u_ik of values)

E.g., if Y_i = Campfire, then u_ik might be ⟨Storm = T, BusTourGroup = F⟩.

Perform gradient ascent by repeatedly:
1. updating all w_ijk using the training data D:

w_ijk ← w_ijk + η Σ_{d ∈ D} P_h(y_ij, u_ik|d) / w_ijk

2. then renormalizing the w_ijk to assure

Σ_j w_ijk = 1 and 0 ≤ w_ijk ≤ 1
slide-42
SLIDE 42 More on Learning Bayes Nets

The EM algorithm can also be used. Repeatedly:
1. Calculate probabilities of unobserved variables, assuming h
2. Calculate new w_ijk to maximize E[ln P(D|h)], where D now includes both the observed variables and the calculated probabilities of unobserved variables

When structure is unknown:
  • Algorithms use greedy search to add/subtract edges and nodes
  • Active research topic
slide-43
SLIDE 43 Summary: Bayesian Belief Networks

  • Combine prior knowledge with observed data
  • Impact of prior knowledge (when correct!) is to lower the sample complexity
  • Active research area:
    – Extend from boolean to real-valued variables
    – Parameterized distributions instead of tables
    – Extend to first-order instead of propositional systems
    – More effective inference methods
slide-44
SLIDE 44 Expectation Maximization (EM)

When to use:
  • Data is only partially observable
  • Unsupervised clustering (target value unobservable)
  • Supervised learning (some instance attributes unobservable)

Some uses:
  • Train Bayesian Belief Networks
  • Unsupervised clustering (AUTOCLASS)
  • Learning Hidden Markov Models
slide-45
SLIDE 45 Generating Data from a Mixture of k Gaussians

[Figure: p(x) as a mixture of overlapping Gaussian bumps along x.]

Each instance x is generated by:
1. Choosing one of the k Gaussians with uniform probability
2. Generating an instance at random according to that Gaussian
slide-46
SLIDE 46 EM for Estimating k Means

Given:
  • Instances from X generated by a mixture of k Gaussian distributions
  • Unknown means ⟨μ_1, ..., μ_k⟩ of the k Gaussians
  • Don't know which instance x_i was generated by which Gaussian

Determine:
  • Maximum likelihood estimates of ⟨μ_1, ..., μ_k⟩

Think of the full description of each instance as y_i = ⟨x_i, z_i1, z_i2⟩, where
  • z_ij is 1 if x_i was generated by the jth Gaussian
  • x_i is observable
  • z_ij is unobservable
slide-47
SLIDE 47 EM for Estimating k Means

EM Algorithm: pick a random initial h = ⟨μ_1, μ_2⟩, then iterate:

E step: Calculate the expected value E[z_ij] of each hidden variable z_ij, assuming the current hypothesis h = ⟨μ_1, μ_2⟩ holds:

E[z_ij] = p(x = x_i|μ = μ_j) / Σ_{n=1}^2 p(x = x_i|μ = μ_n)
        = e^{−(x_i − μ_j)²/(2σ²)} / Σ_{n=1}^2 e^{−(x_i − μ_n)²/(2σ²)}

M step: Calculate a new maximum likelihood hypothesis h′ = ⟨μ′_1, μ′_2⟩, assuming the value taken on by each hidden variable z_ij is its expected value E[z_ij] calculated above. Replace h = ⟨μ_1, μ_2⟩ by h′ = ⟨μ′_1, μ′_2⟩, where

μ_j ← Σ_{i=1}^m E[z_ij] x_i / Σ_{i=1}^m E[z_ij]
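A runnable sketch of this two-Gaussian EM loop (σ is taken as known; the synthetic data, the true means, and the iteration count are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(2)
sigma = 1.0
x = np.concatenate([rng.normal(0.0, sigma, 200),
                    rng.normal(5.0, sigma, 200)])   # true means 0 and 5

mu = np.array([1.0, 2.0])                           # initial h = <mu_1, mu_2>
for _ in range(50):
    # E step: E[z_ij] ∝ exp(−(x_i − μ_j)² / (2σ²)), normalized over j
    w = np.exp(-(x[:, None] - mu[None, :]) ** 2 / (2 * sigma ** 2))
    w /= w.sum(axis=1, keepdims=True)
    # M step: μ_j ← Σ_i E[z_ij] x_i / Σ_i E[z_ij]
    mu = (w * x[:, None]).sum(axis=0) / w.sum(axis=0)
print(mu)  # close to [0.0, 5.0]
```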
slide-48
SLIDE 48 EM Algorithm

Converges to a local maximum likelihood h, and provides estimates of the hidden variables z_ij.

In fact, it finds a local maximum of E[ln P(Y|h)]:
  • Y is the complete (observable plus unobservable variables) data
  • the expected value is taken over the possible values of the unobserved variables in Y
slide-49
SLIDE 49 General EM Problem

Given:
  • Observed data X = {x_1, ..., x_m}
  • Unobserved data Z = {z_1, ..., z_m}
  • Parameterized probability distribution P(Y|h), where
    – Y = {y_1, ..., y_m} is the full data, y_i = x_i ∪ z_i
    – h are the parameters

Determine:
  • h that (locally) maximizes E[ln P(Y|h)]

Many uses:
  • Train Bayesian belief networks
  • Unsupervised clustering (e.g., k means)
  • Hidden Markov Models
slide-50
SLIDE 50 General EM Method

Define a likelihood function Q(h′|h), which calculates Y = X ∪ Z using the observed X and the current parameters h to estimate Z:

Q(h′|h) ← E[ln P(Y|h′) | h, X]

EM Algorithm:

Estimation (E) step: Calculate Q(h′|h) using the current hypothesis h and the observed data X to estimate the probability distribution over Y:

Q(h′|h) ← E[ln P(Y|h′) | h, X]

Maximization (M) step: Replace hypothesis h by the hypothesis h′ that maximizes this Q function:

h ← argmax_{h′} Q(h′|h)