Course Overview and Introduction: Probabilistic Graphical Models

SLIDE 1

Course Overview and Introduction

Probabilistic Graphical Models
Sharif University of Technology
Soleymani
Spring 2018

Some slides have been adapted from Eric Xing, CMU

SLIDE 2

Course info

- Instructor: Mahdieh Soleymani
- Email: soleymani@sharif.edu
- Teaching assistants:
  - Fatemeh Seyyedsalehi
  - Ehsan Montahaei
  - Amirshayan Haghipour
  - Seyed Mohammad Chavosian
  - Sajad Behfar

SLIDE 3

Text book

- D. Koller and N. Friedman, "Probabilistic Graphical Models: Principles and Techniques", MIT Press, 2009.
- M.I. Jordan, "An Introduction to Probabilistic Graphical Models", Preprint.
- Other:
  - C.M. Bishop, "Pattern Recognition and Machine Learning", Springer, 2006. Chapters 8-11, 13.
  - K.P. Murphy, "Machine Learning: A Probabilistic Perspective", MIT Press, 2012.

SLIDE 4

Evaluation policy

- Mid-term: 20%
- Final: 30%
- Mini-exams: 10%
- Homeworks & project: 40%

SLIDE 5

Why use probabilistic models?

- Partial knowledge of the state of the world
  - Noisy or incomplete observations
  - We may not know or cover all the phenomena involved in our model
- Partial knowledge can make the world appear stochastic
- To deal with partial knowledge and/or stochastic worlds, we need reasoning under uncertainty

SLIDE 6

Reasoning under uncertainty

SLIDE 7

Representation, inference, and learning

- We will cover three aspects of probabilistic models:
  - Representation of probabilistic knowledge
  - Inference algorithms on these models
  - Learning: using the data to acquire the distribution

SLIDE 8

Representation, inference, and learning

- Representation: variables tend to interact directly with only a few other variables (local structure)
- Inference: answering queries using the model
  - algorithms for answering questions/queries according to the model and/or based on given observations
- Learning of both the parameters and the structure of the graphical model

SLIDE 9

Main problems

- Representation: what is the joint probability distribution on multiple variables, $P(X_1, \dots, X_n)$?
  - How many state configurations are there in total? (see the sketch below)
  - Do they all need to be represented explicitly?
  - Do we get any scientific/medical insight?
- Inference: if not all variables are observable, how do we compute the conditional distribution of latent variables given evidence?
- Learning: where do we get all these probabilities?
  - Maximum-likelihood estimation? But how much data do we need?
  - Are there other estimation principles?
  - Where do we put domain knowledge, in terms of plausible relationships between variables and plausible values of the probabilities?
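
To make the counting question concrete, here is a minimal sketch in Python comparing an explicit joint table over n binary variables with a factored model in which each variable conditions on at most k parents (the values of n and k are illustrative):

```python
# Size of an explicit joint table over n binary variables
# versus a factored model with at most k parents per variable.

def full_joint_params(n: int) -> int:
    """Free parameters of an explicit joint table: 2^n - 1."""
    return 2 ** n - 1

def factored_params(n: int, k: int) -> int:
    """Upper bound for a factored model: each of the n variables stores
    at most 2^k conditional probabilities (one per parent configuration)."""
    return n * 2 ** k

for n in (10, 20, 30):
    print(n, full_joint_params(n), factored_params(n, k=3))
# n = 30: 1,073,741,823 joint entries vs. at most 240 local parameters.
```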

SLIDE 10

Probability review

- Marginal probabilities:
  $P(X) = \sum_{y} P(X, Y = y)$
- Conditional probabilities:
  $P(X \mid Y) = \frac{P(X, Y)}{P(Y)}$
- Bayes rule:
  $P(X \mid Y) = \frac{P(Y \mid X) \, P(X)}{P(Y)}$
- Chain rule:
  $P(X_1, \dots, X_n) = \prod_{i=1}^{n} P(X_i \mid X_1, \dots, X_{i-1})$
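
As a quick numeric sanity check of these identities, a small sketch with a hand-built joint distribution over two binary variables (the numbers are arbitrary):

```python
# A joint distribution P(X, Y) over two binary variables, as a dict.
P = {(0, 0): 0.3, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.4}

# Marginals: P(X = x) = sum_y P(X = x, Y = y), and likewise for Y.
P_X = {x: sum(p for (xv, _), p in P.items() if xv == x) for x in (0, 1)}
P_Y = {y: sum(p for (_, yv), p in P.items() if yv == y) for y in (0, 1)}

# Conditional: P(X = x | Y = y) = P(X = x, Y = y) / P(Y = y).
def cond(x, y):
    return P[(x, y)] / P_Y[y]

# Bayes rule: P(X = x | Y = y) = P(Y = y | X = x) P(X = x) / P(Y = y).
def bayes(x, y):
    p_y_given_x = P[(x, y)] / P_X[x]
    return p_y_given_x * P_X[x] / P_Y[y]

# The two routes agree, and the chain rule P(X, Y) = P(X) P(Y | X) holds.
assert all(abs(cond(x, y) - bayes(x, y)) < 1e-12 for x, y in P)
assert all(abs(P[(x, y)] - P_X[x] * (P[(x, y)] / P_X[x])) < 1e-12 for x, y in P)
```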

SLIDE 11

Medical diagnosis example

- Representation

[Figure: bipartite network with disease variables $d_1, \dots, d_4$ on top and finding variables $f_1, \dots, f_5$ (symptoms & tests) below; each finding conditions only on its parent diseases, e.g. $P(f_1 \mid d_1)$, $P(f_2 \mid d_1, d_2, d_3)$, $P(f_3 \mid d_3)$.]
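
A hypothetical sketch of how this bipartite structure could be encoded: each finding stores a conditional table over its parent diseases only, not over all four diseases. The variable names follow the figure; all probabilities below are made up for illustration.

```python
# Bipartite disease-finding network: each finding's CPT conditions only
# on its parent diseases (local structure). All numbers are illustrative.
parents = {
    "f1": ("d1",),
    "f2": ("d1", "d2", "d3"),
    "f3": ("d3",),
}

# P(f = 1 | parent diseases); keys are tuples of parent values in {0, 1}.
cpt = {
    "f1": {(0,): 0.05, (1,): 0.90},
    "f3": {(0,): 0.10, (1,): 0.70},
    # 2^3 = 8 rows for f2, instead of a table over every disease:
    "f2": {(a, b, c): min(0.02 + 0.30 * (a + b + c), 0.95)
           for a in (0, 1) for b in (0, 1) for c in (0, 1)},
}

def p_finding(f: str, value: int, disease_values: dict) -> float:
    """P(f = value | its parent diseases), reading only the parents."""
    row = tuple(disease_values[d] for d in parents[f])
    p1 = cpt[f][row]
    return p1 if value == 1 else 1.0 - p1

print(p_finding("f2", 1, {"d1": 1, "d2": 0, "d3": 1}))  # 0.62
```

With four binary diseases and five binary findings, an explicit joint over nine variables would need $2^9 - 1 = 511$ parameters; the factored form above pays only for each finding's small local table.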

SLIDE 12

Example Graphical Model

SLIDE 13

Representation: summary of advantages

- Representing large multivariate distributions directly and exhaustively is hopeless:
  - The number of parameters is exponential in the number of random variables
  - Inference can be exponential in the number of variables
- PGM representation:
  - Compact representation of the joint distribution
  - Transparent
  - We can combine expert knowledge and accumulated data to learn the model
  - Effective for inference and learning

SLIDE 14

Why use a graph for representation?

- An intuitively appealing interface by which we can model highly interacting sets of variables
- It allows us to design efficient general-purpose inference algorithms

SLIDE 15

Graph structure

- Denotes the conditional dependence structure between random variables
- One view: the graph represents a set of independencies
- Another view: the graph provides a skeleton for factorizing a joint distribution

SLIDE 16

PGMs as a framework

- A general-purpose framework for representing uncertain knowledge, and for learning and inference under uncertainty
- A graph-based representation as the basis for encoding a complex distribution compactly
- Allows a declarative representation (with clear semantics) of the probabilistic knowledge

SLIDE 17

PGMs as a framework

- Intuitive & compact data structure for representation
- Efficient reasoning using general-purpose algorithms
- Sparse parameterization (enables us to elicit parameters or learn them from data)

SLIDE 18

PGM: declarative representation

- Separation of knowledge and reasoning
- We need to specify a model for a specific application that represents our probabilistic knowledge
- There is a general suite of reasoning algorithms that can then be used

SLIDE 19

Data Integration

- Due to the local structure of the graph, we can handle different sources of information
- Modular combination of heterogeneous parts: data fusion
- Combining different data modalities
- Example: Text + Image + Network => Holistic Social Media

SLIDE 20

History

- Wright 1921, 1934 and before
- Bayesian networks were developed independently by Spiegelhalter and Lauritzen in statistics and by Pearl in computer science in the late 1980s
- First applications (1990s): expert systems and information retrieval

SLIDE 21

PGMs: some application areas

- Machine Learning and computational statistics
- Computer vision: e.g., segmenting and denoising images
- Robotics: e.g., robot localization and mapping
- Natural Language Processing
- Speech recognition
- Information Retrieval
- AI: game playing, planning
- Computational Biology
- Networks: decoding messages (sent over a noisy channel)
- Medical diagnosis and prognosis
- ...

SLIDE 22

Graphical models: directed & undirected

- Two kinds of graphical models:
  - Directed: Bayesian Networks (BNs), capturing causality relations
  - Undirected: Markov Random Fields (MRFs), capturing correlation of variables

[Figure: two four-node graphs over A, B, C, D: a directed graph labeled "causality relations" and an undirected graph labeled "correlation of variables".]
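
To illustrate the two factorizations (the exact edge structures in the figure are not recoverable here, so the graphs below are assumed): a BN multiplies per-node conditionals P(node | parents), while an MRF multiplies non-negative clique potentials and divides by a normalizing constant Z.

```python
import itertools

# Directed (BN), assumed structure: P(A,B,C,D) = P(A) P(B|A) P(C|A) P(D|B,C).
P_A    = {0: 0.6, 1: 0.4}
P_B_A  = {(b, a): 0.8 if b == a else 0.2 for a in (0, 1) for b in (0, 1)}
P_C_A  = {(c, a): 0.7 if c == a else 0.3 for a in (0, 1) for c in (0, 1)}
P_D_BC = {(d, b, c): 0.9 if d == (b & c) else 0.1
          for d in (0, 1) for b in (0, 1) for c in (0, 1)}

def bn_joint(a, b, c, d):
    return P_A[a] * P_B_A[(b, a)] * P_C_A[(c, a)] * P_D_BC[(d, b, c)]

# Undirected (MRF), assumed chain A-B-C-D: P proportional to
# phi(A,B) phi(B,C) phi(C,D), with a shared agreement potential phi.
phi = {(x, y): 2.0 if x == y else 1.0 for x in (0, 1) for y in (0, 1)}
Z = sum(phi[(a, b)] * phi[(b, c)] * phi[(c, d)]
        for a, b, c, d in itertools.product((0, 1), repeat=4))

def mrf_joint(a, b, c, d):
    return phi[(a, b)] * phi[(b, c)] * phi[(c, d)] / Z

# Both define valid joint distributions: each sums to 1 over all states.
assert abs(sum(bn_joint(*v) for v in itertools.product((0, 1), repeat=4)) - 1) < 1e-12
assert abs(sum(mrf_joint(*v) for v in itertools.product((0, 1), repeat=4)) - 1) < 1e-12
```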

SLIDE 23

Graphical models: directed & undirected

[Pathfinder Project, 1992]

SLIDE 24

Medical diagnosis example

- Representation
- Inference: given symptoms, what disease is likely?
- Eliciting or learning the required probabilities from the data

[Figure: the same bipartite network of disease variables $d_1, \dots, d_4$ and finding variables $f_1, \dots, f_5$ (symptoms & tests).]

SLIDE 25

Image denoising example

[Bishop]
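
Bishop's example places a pairwise MRF (Ising model) over the pixel labels. Below is a minimal sketch of iterated conditional modes (ICM) on that energy, assuming binary pixels in {-1, +1}; the coefficients beta and eta weight the neighbor-agreement and data terms, and the specific values here are illustrative rather than Bishop's.

```python
import numpy as np

def icm_denoise(y, beta=1.0, eta=2.0, h=0.0, sweeps=5):
    """Iterated conditional modes on the Ising energy
    E(x, y) = h*sum_i x_i - beta*sum_{i~j} x_i x_j - eta*sum_i x_i y_i,
    where x, y are {-1, +1} arrays and i~j ranges over 4-neighbors."""
    x = y.copy()
    H, W = x.shape
    for _ in range(sweeps):
        for i in range(H):
            for j in range(W):
                # Sum over the 4-neighborhood of pixel (i, j).
                nb = sum(x[a, b]
                         for a, b in ((i-1, j), (i+1, j), (i, j-1), (i, j+1))
                         if 0 <= a < H and 0 <= b < W)
                # Setting x[i, j] = +1 lowers the energy exactly when
                # beta*nb + eta*y[i, j] - h > 0.
                x[i, j] = 1 if beta * nb + eta * y[i, j] - h > 0 else -1
    return x

# Usage: a toy block image, 10% of pixels flipped, then denoised.
rng = np.random.default_rng(0)
clean = -np.ones((32, 32), dtype=int)
clean[8:24, 8:24] = 1
noisy = clean * np.where(rng.random(clean.shape) < 0.1, -1, 1)
denoised = icm_denoise(noisy)
print((denoised != clean).mean())  # typically well below the 10% noise level
```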

SLIDE 26

Genetic pedigree example

[Figure: pedigree over individuals A, B, C, D, E; each individual X has allele variables X0, X1 and a phenotype variable Xg (A0, A1, Ag, ..., E0, E1, Eg).]
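
A hedged sketch of the local models such a pedigree encodes, assuming each child allele is copied uniformly at random from one parent's pair and the phenotype depends only on the individual's own two alleles (names follow the figure; the dominance rule is illustrative):

```python
import random

def inherit(parent):
    """Each child allele is one of the parent's two alleles, chosen
    uniformly: P(child allele = a | parent = (a1, a2)) = 1/2 per slot."""
    return random.choice(parent)

def phenotype(a0, a1):
    """Phenotype depends only on the individual's own genotype; here a
    dominant allele 'A' shows whenever at least one copy is present."""
    return "affected" if "A" in (a0, a1) else "unaffected"

# Founders A and B; child C inherits one allele from each parent,
# mirroring the pedigree's local structure (C0 | A0, A1) and (C1 | B0, B1).
A = ("A", "a")
B = ("a", "a")
C = (inherit(A), inherit(B))
print(C, phenotype(*C))
```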

SLIDE 27

Plan in our course

- Fundamentals of graphical models:
  - Representation
    - Bayesian Networks
    - Markov Random Fields
  - Exact inference
  - Basics of learning
- Case studies: popular graphical models
  - Multivariate Gaussian models
  - FA, PPCA
  - HMM, CRF, Kalman filter
- Approximate inference
  - Variational methods
  - Monte Carlo algorithms