
Bayesian Classification: Autonomous Agents, Vasilis Papageorgiou



  1. Bayesian Classification: Autonomous Agents. Vasilis Papageorgiou, Technical University of Crete, February 23, 2020

  2. Bayesian Networks
  • Bayesian networks are graphical models that represent joint distributions of random variables
  • They consist of a directed acyclic graph (DAG):
    • vertices: random variables
    • edges: dependencies between random variables
  • The conditional probability distribution of each random variable depends only on the distributions of its parent vertices: $\Pr(X_i \mid \bigcap_{j \neq i} X_j) = \Pr(X_i \mid \mathrm{parents}(X_i))$
  • These distributions are stored in tables called Conditional Probability Tables (CPTs)
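
As a concrete illustration (not part of the original slides), here is a minimal Python sketch of one way a network's structure and CPTs could be stored, using the alarm network of Figure 1; the representation and helper names are assumptions, and the probability values are the standard textbook ones rather than values taken from the presentation:

```python
# A minimal sketch (an assumed representation, not the author's code) of the
# alarm network of Figure 1: each variable maps to its parents and to a CPT
# keyed by the tuple of parent values; probabilities are the textbook ones.
alarm_net = {
    "Burglary":   {"parents": [], "cpt": {(): 0.001}},
    "Earthquake": {"parents": [], "cpt": {(): 0.002}},
    "Alarm": {
        "parents": ["Burglary", "Earthquake"],
        "cpt": {(True, True): 0.95, (True, False): 0.94,
                (False, True): 0.29, (False, False): 0.001},
    },
    "JohnCalls": {"parents": ["Alarm"], "cpt": {(True,): 0.90, (False,): 0.05}},
    "MaryCalls": {"parents": ["Alarm"], "cpt": {(True,): 0.70, (False,): 0.01}},
}

def prob(net, var, value, assignment):
    """Pr(var = value | parents(var)), read off the variable's CPT."""
    key = tuple(assignment[p] for p in net[var]["parents"])
    p_true = net[var]["cpt"][key]
    return p_true if value else 1.0 - p_true
```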

  3. Bayesian Networks
  • As a result, the joint distribution of all the random variables of the network factorizes as: $\Pr(x_1, \ldots, x_n) = \prod_{i=1}^{n} \Pr(x_i \mid \mathrm{parents}(X_i))$
  • Below we can see an example of a Bayesian network:
  Figure 1: Alarm Bayesian network.
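
Assuming the `alarm_net` and `prob` helpers from the previous sketch, the factorization can be evaluated directly; the joint probability of a full assignment is just the product of the CPT entries:

```python
import math

def joint(net, assignment):
    """Pr(x1, ..., xn) = prod_i Pr(xi | parents(Xi)) for a full assignment."""
    return math.prod(prob(net, v, assignment[v], assignment) for v in net)

# e.g. Pr(b, not e, a, j, m) = 0.001 * 0.998 * 0.94 * 0.90 * 0.70
full = {"Burglary": True, "Earthquake": False, "Alarm": True,
        "JohnCalls": True, "MaryCalls": True}
print(joint(alarm_net, full))  # ~0.000591
```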

  4. Exact Inference
  • Exact inference is the calculation of the posterior probability distribution of a set of query variables, given the observed values of a set of evidence variables
  • The algorithm that has been implemented is the variable enumeration algorithm: $\Pr(X \mid \mathbf{e}) = \alpha \Pr(X, \mathbf{e}) = \alpha \sum_{\mathbf{y}} \Pr(X, \mathbf{e}, \mathbf{y})$, where $\alpha$ is a normalization constant, $\mathbf{e}$ is the set of evidence variables, and $\mathbf{y}$ is the set of hidden variables
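
A sketch of this formula in Python, building on the `joint` helper above: it brute-forces the sum over the hidden variables rather than interleaving sums and products as the recursive enumeration algorithm would, but it computes the same quantity (boolean variables are an assumption here):

```python
from itertools import product

def enumerate_ask(net, query, evidence):
    """Pr(query | evidence) = alpha * sum_y Pr(query, evidence, y), computed
    by brute-force enumeration over the hidden variables y (assumed boolean)."""
    hidden = [v for v in net if v != query and v not in evidence]
    dist = {}
    for qval in (True, False):
        total = 0.0
        for combo in product((True, False), repeat=len(hidden)):
            a = {**evidence, **dict(zip(hidden, combo)), query: qval}
            total += joint(net, a)          # sum over the hidden variables
        dist[qval] = total
    alpha = 1.0 / sum(dist.values())        # normalization constant
    return {v: alpha * p for v, p in dist.items()}

# e.g. Pr(Burglary | JohnCalls = true, MaryCalls = true) ~ 0.284
print(enumerate_ask(alarm_net, "Burglary",
                    {"JohnCalls": True, "MaryCalls": True}))
```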

  5. Bayesian Classifiers
  • Given a dataset $D$, a statistical classifier is a function $f : \Omega_X \to \Omega_C$ that maps the attribute values $\mathbf{x} \in \mathbb{R}^n$ to a unique class label $c^* \in \Omega_C = \{c_1, \ldots, c_m\}$ in a way that: $c^* = \operatorname{argmax}_j \{\Pr(c_j \mid \mathbf{x})\}$
  • Using Bayes' theorem, we can rewrite the equation above as: $c^* = \operatorname{argmax}_j \{\Pr(c_j) \Pr(\mathbf{x} \mid c_j)\}$, which is the basis of every Bayesian classifier
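
The decision rule itself is a one-liner; in this hypothetical sketch, `prior` and `likelihood` are assumed callables that a concrete classifier (Naive Bayes, TAN, ...) would supply:

```python
def classify(x, classes, prior, likelihood):
    """c* = argmax_j Pr(c_j) * Pr(x | c_j); `prior` and `likelihood` are
    assumed callables supplied by the concrete classifier."""
    return max(classes, key=lambda c: prior(c) * likelihood(x, c))
```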

  6. Learning Bayesian Networks
  • Given a dataset $D$, learning a Bayesian network consists of two phases:
    1. Learning the structure of the DAG
    2. Estimating the values of the CPTs (in our case, using maximum likelihood estimation; see the sketch after this list)
  the combination of which aims to induce a Bayesian network that best describes $D$.
  • The challenge arises when the attribute space $X$ is high-dimensional; in that case, estimating $\Pr(X \mid c)$ is a hard task.
  • The solution is to make simplifying assumptions about the dependencies among the random variables, which lower the complexity of the problem.
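
The CPT-estimation phase can be sketched as simple relative-frequency counting; this hypothetical helper covers only the maximum likelihood step (structure learning is a separate problem), and the sample format is an assumption:

```python
from collections import Counter

def mle_cpt(data, child, parents):
    """Maximum likelihood CPT for Pr(child | parents): relative frequencies
    of counts in `data`, a list of {variable: value} samples."""
    pair_counts = Counter(
        (tuple(row[p] for p in parents), row[child]) for row in data)
    parent_counts = Counter(tuple(row[p] for p in parents) for row in data)
    return {(pa, val): n / parent_counts[pa]
            for (pa, val), n in pair_counts.items()}

# e.g. mle_cpt(samples, "Alarm", ["Burglary", "Earthquake"]) would return
# entries such as {((True, False), True): 0.93, ...} for some dataset.
```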

  7. Naive Bayes Classifiers
  • Naive Bayes classifiers make the strongest independence assumption: all the random variables are conditionally independent given the value of the label
  • Hence: $\Pr(\mathbf{x} \mid c) = \prod_i \Pr(X_i = x_i \mid c)$, and as a result each label is assigned by: $c^* = \operatorname{argmax}_j \{\Pr(c_j) \prod_i \Pr(X_i = x_i \mid c_j)\}$
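
A minimal sketch of this rule, assuming `prior` and `cond` are lookup tables precomputed from the data (e.g. by the MLE helper above); summing log-probabilities instead of multiplying is a standard trick to avoid floating-point underflow with many attributes:

```python
import math

def nb_classify(x, classes, prior, cond):
    """Naive Bayes rule: c* = argmax_j Pr(c_j) * prod_i Pr(X_i = x_i | c_j).
    `prior[c]` and `cond[(i, x_i, c)]` are assumed precomputed tables;
    log-probabilities are summed to avoid underflow."""
    def score(c):
        return math.log(prior[c]) + sum(
            math.log(cond[(i, xi, c)]) for i, xi in enumerate(x))
    return max(classes, key=score)
```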

  8. Naive Bayes Classifiers
  • The topology of the DAG of such a classifier implies that every attribute of the problem has exactly one parent vertex: the vertex of the label random variable.
  • Such an example is given below:
  Figure 2: Naive Bayes classifier DAG example.

  9. Tree Augmented Naive (TAN) Bayes Classifiers
  • Tree Augmented Naive (TAN) Bayes classifiers relax the conditional independence assumption made by conventional Naive Bayes classifiers
  • They allow each random variable to have at most one parent vertex besides the vertex that corresponds to the label.
  • An example of such a network is given below:
  Figure 3: Tree Augmented Naive Bayes classifier DAG example.

  10. Tree Augmented Naive (TAN) Bayes Classifiers
  • We can see that, initially, the structure of the DAG is unknown.
  • Hence, TAN classifiers utilize a modification of the Chow-Liu algorithm (see the sketch below)
  • This algorithm is used to induce the structures of graphical models, under the restriction of a bounded number of parent vertices for each node.
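
A sketch of the plain Chow-Liu step (function names are assumptions): it weights every pair of variables by their empirical mutual information and keeps a maximum-weight spanning tree. In the TAN modification described above, the weight would instead be the conditional mutual information given the class label, and the resulting tree is then oriented and augmented with the label as an extra parent:

```python
import math
from collections import Counter
from itertools import combinations

def mutual_information(data, a, b):
    """Empirical mutual information I(X_a; X_b) from {variable: value} rows."""
    n = len(data)
    pa = Counter(row[a] for row in data)
    pb = Counter(row[b] for row in data)
    pab = Counter((row[a], row[b]) for row in data)
    # I = sum_{x,y} p(x,y) * log( p(x,y) / (p(x) p(y)) )
    return sum((c / n) * math.log((c * n) / (pa[x] * pb[y]))
               for (x, y), c in pab.items())

def chow_liu_tree(data, variables):
    """Maximum-weight spanning tree over pairwise MI (Kruskal with union-find)."""
    edges = sorted(((mutual_information(data, a, b), a, b)
                    for a, b in combinations(variables, 2)), reverse=True)
    root = {v: v for v in variables}
    def find(v):
        while root[v] != v:
            v = root[v]
        return v
    tree = []
    for _, a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:               # edge joins two components: stays acyclic
            root[ra] = rb
            tree.append((a, b))
    return tree
```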

  11. Test Cases
  The aforementioned algorithms were tested with the Bayesian networks whose DAGs are shown below:

  12. Classification Error
  Below we can see the percentage of samples that are wrongly classified by the methods discussed earlier, as well as the results of the initial networks (BN), for various labels.

  13. Classification Error
  • We can see that, in the case of the smaller alarm network, both the TAN and Naive Bayes classifiers perform well, fairly close to exact inference on the initial network. This leads us to the conclusion that they have modeled the random variable dependencies well enough to achieve a low classification error.
  • On the other hand, when tested on the somewhat more complex medical network, they still manage to approximate the random variable dependencies well enough in most cases. However, there is one case where Naive Bayes has significantly lower performance.
