Bayesian networks
  1. CZECH TECHNICAL UNIVERSITY IN PRAGUE, Faculty of Electrical Engineering, Department of Cybernetics. Bayesian networks. Petr Pošík. Significant parts of this material come from the lectures on Bayesian networks, which are part of the Artificial Intelligence course by Pieter Abbeel and Dan Klein. The original lectures can be found at http://ai.berkeley.edu

  2. Introduction

  3. Uncertainty. Probabilistic reasoning is one of the frameworks that allow us to maintain our beliefs and knowledge in uncertain environments. The usual scenario:
     ■ Observed variables (evidence): known things related to the state of the world; often imprecise and noisy (information from sensors, symptoms of a patient, etc.).
     ■ Unobserved, hidden variables: unknown but important aspects of the world that we need to reason about (what the position of an object is, whether a disease is present, etc.).
     ■ Model: describes the relations among hidden and observed variables; allows us to reason.
     Models (including probabilistic ones)
     ■ describe how (a part of) the world works;
     ■ are always approximations or simplifications:
        ■ They cannot account for everything (they would be as complex as the world itself).
        ■ They represent only a chosen subset of variables and the interactions between them.
        ■ "All models are wrong; some are useful." (George E. P. Box)
     A probabilistic model is a joint distribution over a set of random variables.

  4. Notation
     ■ Random variables start with capital letters: X, Y, Weather, ...
     ■ Values of random variables start with lower-case letters: x_1, e_i, rainy, ...
     ■ Probability distribution of a random variable: P(X) or P_X
     ■ Probability of a random event: P(X = x_1) or P_X(x_1)
     ■ Shorthand for the probability of a random event (if there is no chance of confusion): P(+r) meaning P(Rainy = true), or P(r) meaning P(Weather = rainy)
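
     To make the notation concrete, here is a minimal Python sketch, not part of the original slides, of how a distribution and the shorthand above might be represented in code; the variable Weather and its values are illustrative assumptions:

        # P(Weather): a distribution over the values of the random variable Weather.
        P_weather = {"sunny": 0.6, "rainy": 0.3, "snowy": 0.1}

        # P(r), i.e. P(Weather = rainy) in the shorthand above.
        p_r = P_weather["rainy"]                            # 0.3

        # A proper distribution sums to 1.
        assert abs(sum(P_weather.values()) - 1.0) < 1e-9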

  5. Probability cheatsheet
     ■ Conditional probability: P(X | Y) = P(X, Y) / P(Y)
     ■ Product rule: P(X, Y) = P(X | Y) P(Y)
     ■ Bayes rule: P(x | y) = P(y | x) P(x) / P(y) = P(y | x) P(x) / ∑_i P(y | x_i) P(x_i)
     ■ Chain rule: P(X_1, X_2, ..., X_n) = P(X_1) P(X_2 | X_1) P(X_3 | X_1, X_2) ··· = ∏_{i=1}^{n} P(X_i | X_1, ..., X_{i−1})
     ■ X ⊥⊥ Y (X and Y are independent) iff ∀x, y: P(x, y) = P(x) P(y)
     ■ X ⊥⊥ Y | Z (X and Y are conditionally independent given Z) iff ∀x, y, z: P(x, y | z) = P(x | z) P(y | z)
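
     These identities are easy to check numerically. The following sketch, with toy numbers that are not from the lecture, verifies the conditional-probability definition and Bayes rule on a small joint table P(X, Y):

        # Toy joint distribution P(X, Y); the probabilities are made up.
        P_xy = {
            ("x1", "y1"): 0.30, ("x1", "y2"): 0.20,
            ("x2", "y1"): 0.10, ("x2", "y2"): 0.40,
        }

        def marginal(joint, index):
            """Sum out all variables except the one at position `index`."""
            out = {}
            for assignment, p in joint.items():
                out[assignment[index]] = out.get(assignment[index], 0.0) + p
            return out

        P_x, P_y = marginal(P_xy, 0), marginal(P_xy, 1)

        # Conditional probability: P(x1 | y1) = P(x1, y1) / P(y1) = 0.75
        p_x1_given_y1 = P_xy[("x1", "y1")] / P_y["y1"]

        # Bayes rule gives the same number: P(x1 | y1) = P(y1 | x1) P(x1) / P(y1)
        p_y1_given_x1 = P_xy[("x1", "y1")] / P_x["x1"]
        assert abs(p_x1_given_y1 - p_y1_given_x1 * P_x["x1"] / P_y["y1"]) < 1e-9

     A dict keyed by full assignments is the most direct encoding of a joint table; as discussed later in the deck, it is also exponentially large in the number of variables.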

  6. Joint probability distribution. A joint distribution over a set of variables X_1, ..., X_n (here discrete) assigns a probability to each combination of values:
     P(X_1 = x_1, ..., X_n = x_n) = P(x_1, ..., x_n)
     For a proper probability distribution:
     ∀x_1, ..., x_n: P(x_1, ..., x_n) ≥ 0 and ∑_{x_1, ..., x_n} P(x_1, ..., x_n) = 1
     Probabilistic inference:
     ■ Compute a desired probability from other known probabilities (e.g. a marginal or conditional from the joint).
     ■ Conditional probabilities turn out to be the most interesting ones: they represent our or the agent's beliefs given the evidence (measured values of the observable variables), e.g. P(bus on time | rush hour) = 0.8.
     ■ Probabilities change with new evidence:
        ■ P(bus on time) = 0.95
        ■ P(bus on time | rush hour) = 0.8
        ■ P(bus on time | rush hour, dry roads) = 0.85
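
     As an illustration of such inference, the sketch below computes a marginal and a conditional by summing entries of a joint table. The joint table itself is an assumption, not given in the slides; its numbers are chosen only to be consistent with the bus example above:

        # Joint distribution P(Rush, OnTime) over two binary variables. The
        # entries are assumed; they are picked so that P(on time) = 0.95 and
        # P(on time | rush hour) = 0.8, matching the example above.
        P_joint = {
            (True,  True):  0.16,   # rush hour,    bus on time
            (True,  False): 0.04,   # rush hour,    bus late
            (False, True):  0.79,   # no rush hour, bus on time
            (False, False): 0.01,   # no rush hour, bus late
        }

        def p(event):
            """Probability of an event, given as a predicate over (rush, on_time)."""
            return sum(prob for assignment, prob in P_joint.items() if event(assignment))

        # Marginal: sum over the values of the unobserved variable Rush.
        p_on_time = p(lambda a: a[1])                                          # 0.95

        # Conditional: P(OnTime = true | Rush = true) = P(on time, rush) / P(rush).
        p_on_time_given_rush = p(lambda a: a[0] and a[1]) / p(lambda a: a[0])  # 0.8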

  7. Contents
     ■ What is a Bayesian network?
     ■ How does it encode the joint probability distribution?
     ■ What independence assumptions does it encode?
     ■ How to perform reasoning using a BN?

  8. Bayesian networks

  9. What's wrong with the joint distribution? How many free parameters n_params does a probability distribution over n variables, each having at least d possible values, have?
     ■ With all variables binary (d = 2): n_params = 2^n − 1
     ■ In general: n_params ≥ d^n − 1
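
     A short computation, added here as an illustration rather than taken from the slides, shows how quickly the full joint table blows up:

        def n_params(n, d=2):
            """Free parameters of a full joint table over n variables with d values each."""
            return d ** n - 1

        for n in (5, 10, 20, 30):
            print(n, n_params(n))   # 31, 1023, 1048575, 1073741823

     Already at 30 binary variables the table has over a billion free parameters, which is what motivates the factored representation that Bayesian networks provide.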
