Course Overview and Introduction
Probabilistic Graphical Models
Sharif University of Technology
Soleymani, Spring 2018
Some slides have been adapted from Eric Xing, CMU.

Course info
Instructor: Mahdieh Soleymani
Email: soleymani@sharif.edu
Teacher assistants:
Fatemeh Seyyedsalehi, Ehsan Montahaei, Amirshayan Haghipour, Seyed Mohammad Chavosian, Sajad Behfar
Textbook
D. Koller and N. Friedman, “Probabilistic Graphical Models: Principles and Techniques”, MIT Press, 2009.
M.I. Jordan, “An Introduction to Probabilistic Graphical Models”, preprint.

Other references
C.M. Bishop, “Pattern Recognition and Machine Learning”, Springer, 2006 (Chapters 8-11, 13).
K.P. Murphy, “Machine Learning: A Probabilistic Perspective”, MIT Press, 2012.
Evaluation policy
Mid-term: 20%
Final: 30%
Mini-exams: 10%
Homework & project: 40%
Why use probabilistic models?
Partial knowledge of the state of the world:
Noisy or incomplete observations
We may not know, or may not cover in our model, all the phenomena involved
Partial knowledge can make the world appear stochastic
To deal with partial knowledge and/or stochastic worlds, we need reasoning under uncertainty.
Reasoning under uncertainty
Representation, inference, and learning
We will cover three aspects of probabilistic models:
Representation of probabilistic knowledge
Inference algorithms on these models
Learning: using the data to acquire the distribution
Representation, inference, and learning
Representation: variables tend to interact directly with only a few other variables (local structure)
Inference: answering queries using the model, i.e., algorithms for answering questions/queries according to the model and/or based on given observations
Learning: learning both the parameters and the structure of the graphical model
Main problems
Representation: what is the joint probability distribution over multiple variables, P(X1, ..., Xn)?
How many state configurations are there in total?
Do they all need to be represented explicitly?
Do we get any scientific/medical insight?
Inference: if not all variables are observable, how do we compute the conditional distribution of the latent variables given the evidence?
Learning: where do we get all these probabilities?
Maximum-likelihood estimation? But how much data do we need?
Are there other estimation principles?
Where do we put domain knowledge, in terms of plausible relationships between variables and plausible values of the probabilities?
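To see the blow-up concretely (the numbers below are illustrative, not from the slides): n binary variables have 2^n joint configurations, so a full table needs 2^n - 1 independent parameters.

    n = 30:   2^30 - 1 = 1,073,741,823 parameters
    n = 100:  2^100 ≈ 1.27 × 10^30 parameters

This is already impossible to store or estimate from data, which is exactly what the structured representations in this course avoid.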
Probability review
Marginal probability:
    P(X) = Σ_y P(X, Y = y)
Conditional probability:
    P(X|Y) = P(X, Y) / P(Y)
Bayes' rule:
    P(X|Y) = P(Y|X) P(X) / P(Y)
Chain rule:
    P(X1, ..., Xn) = Π_{i=1}^{n} P(Xi | X1, ..., Xi-1)
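A minimal numeric sketch of these identities (the joint table values below are made up for illustration):

    import numpy as np

    # Joint distribution P(X, Y) over two binary variables;
    # rows indexed by x, columns by y (illustrative values).
    P_xy = np.array([[0.3, 0.1],
                     [0.2, 0.4]])

    # Marginals: sum out the other variable.
    P_x = P_xy.sum(axis=1)                   # P(X)
    P_y = P_xy.sum(axis=0)                   # P(Y)

    # Conditional: P(X | Y=1) = P(X, Y=1) / P(Y=1)
    P_x_given_y1 = P_xy[:, 1] / P_y[1]

    # Bayes' rule recovers it from the other conditional.
    P_y1_given_x = P_xy[:, 1] / P_x          # P(Y=1 | X)
    assert np.allclose(P_y1_given_x * P_x / P_y[1], P_x_given_y1)

    # Chain rule: P(X, Y) = P(X) P(Y | X)
    P_y_given_x = P_xy / P_x[:, None]
    assert np.allclose(P_x[:, None] * P_y_given_x, P_xy)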
Medical diagnosis example
Representation
[Figure: a bipartite network with diseases d1, d2, d3, d4 in one layer and findings (symptoms & tests) f1, ..., f5 in the other, annotated with local conditionals such as P(f1|d1), P(f2|d1,d2,d3), P(f3|d3), ...]
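Read as a factorization (a sketch based on the conditionals shown; the parent sets of the remaining findings are not visible in the figure and are left as "..."):

    P(d1, ..., d4, f1, ..., f5) = P(d1) P(d2) P(d3) P(d4) · P(f1|d1) P(f2|d1,d2,d3) P(f3|d3) · ...

Each finding only needs a conditional table over its parent diseases, not over all nine variables.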
Example Graphical Model
Representation: summary of advantages
Representing large multivariate distributions directly and exhaustively is hopeless:
The number of parameters is exponential in the number of random variables
Inference can be exponential in the number of variables
PGM representation:
Compact representation of the joint distribution
Transparent: we can combine expert knowledge and accumulated data to learn the model
Effective for inference and learning
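To make "compact" concrete for the diagnosis network above (assuming all nine variables are binary, which the slides do not state): a full joint table needs 2^9 - 1 = 511 parameters, while the factored model needs one prior per disease plus one conditional table per finding; with at most 3 parent diseases per finding, that is at most 4 + 5 · 2^3 = 44 parameters.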
Why use a graph for representation?
An intuitively appealing interface by which we can model highly interacting sets of variables
It allows us to design efficient general-purpose inference algorithms
Graph structure
Denotes the conditional dependence structure between random variables
One view: the graph represents a set of independencies
Another view: the graph provides a skeleton for factorizing a joint distribution
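A minimal example of the two views (a three-node chain, chosen here for illustration): for the chain X → Y → Z,

    Independence view:    X ⊥ Z | Y
    Factorization view:   P(X, Y, Z) = P(X) P(Y|X) P(Z|Y)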
PGMs as a framework
A general-purpose framework for representing uncertain knowledge, and for learning and inference under uncertainty
A graph-based representation as the basis for encoding a complex distribution compactly
Allows a declarative representation (with clear semantics) of the probabilistic knowledge
PGMs as a framework
Intuitive & compact data structure for representation
Efficient reasoning using general-purpose algorithms
Sparse parameterization (enables us to elicit the parameters or learn them from data)
PGM: declarative representation
Separation of knowledge and reasoning:
We need to specify a model for a specific application that represents our probabilistic knowledge
There is a general suite of reasoning algorithms that can then be used
Data Integration
Due to the local structure of the graph, we can handle different sources of information
Modular combination of heterogeneous parts (data fusion)
Combining different data modalities
Example: Text + Image + Network => Holistic Social Media
History
Wright 1921, 1934, and earlier precursors
Bayesian networks were developed independently by Spiegelhalter and Lauritzen in statistics and by Pearl in computer science in the late 1980s
First applications (1990s): expert systems and information retrieval
PGMs: some application areas
Machine learning and computational statistics
Computer vision: e.g., segmenting and denoising images
Robotics: e.g., robot localization and mapping
Natural language processing
Speech recognition
Information retrieval
AI: game playing, planning
Computational biology
Networks: decoding messages sent over a noisy channel
Medical diagnosis and prognosis
...
Graphical models: directed & undirected
Two kinds of graphical models:
Directed: Bayesian networks (BNs), which express causality-like relations
Undirected: Markov random fields (MRFs), which express correlations between variables
[Figure: two four-node graphs over A, B, C, D, one directed (BN) and one undirected (MRF)]
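The two families also factorize differently; the standard forms are

    BN:   P(X1, ..., Xn) = Π_i P(Xi | Pa(Xi))       (Pa(Xi): parents of Xi in the DAG)
    MRF:  P(X1, ..., Xn) = (1/Z) Π_c ψ_c(X_c)       (ψ_c: potential on clique c; Z: normalizing constant)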
Graphical models: directed & undirected
[Pathfinder Project, 1992]
Medical diagnosis example
Representation
Inference: given symptoms, which disease is likely?
Learning: eliciting or learning the required probabilities from the data
[Figure: the same disease-finding network as before, with diseases d1, ..., d4 and findings (symptoms & tests) f1, ..., f5]
Image denoising example
[Bishop]
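This example (Bishop, Ch. 8) is usually set up as an Ising-model MRF over binary pixel labels. Below is a minimal sketch of iterated conditional modes (ICM) for that model; the energy form follows Bishop, but the coupling weights, image size, and noise level are illustrative choices, not values from the slides.

    import numpy as np

    def icm_denoise(y, h=0.0, beta=2.0, eta=1.5, n_iters=5):
        # ICM for the Ising MRF with energy
        #   E(x, y) = h*sum_i x_i - beta*sum_{ij} x_i x_j - eta*sum_i x_i y_i,
        # where y is the noisy image and x the denoised one,
        # both with pixel values in {-1, +1}.
        x = y.copy()
        rows, cols = x.shape
        for _ in range(n_iters):
            for i in range(rows):
                for j in range(cols):
                    # Sum of the 4-neighborhood labels of pixel (i, j).
                    nb = sum(x[a, b]
                             for a, b in ((i-1, j), (i+1, j), (i, j-1), (i, j+1))
                             if 0 <= a < rows and 0 <= b < cols)
                    # Assign the label (+1 or -1) with the lower local energy.
                    x[i, j] = 1 if beta * nb + eta * y[i, j] - h >= 0 else -1
        return x

    # Usage: flip 10% of the pixels of a two-tone image, then restore it.
    rng = np.random.default_rng(0)
    clean = -np.ones((32, 32), dtype=int)
    clean[:, 16:] = 1
    noisy = np.where(rng.random(clean.shape) < 0.1, -clean, clean)
    restored = icm_denoise(noisy)
    print((restored != clean).mean())   # fraction of pixels still wrong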
Genetic pedigree example
[Figure: a family pedigree over individuals A, B, C, D, E, each with genotype/allele variables (A0, A1, Ag; B0, B1, Bg; C0, C1, Cg; D0, D1, Dg; E0, E1, Eg)]
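Reading the pedigree as a Bayesian network (assuming A and B are the parents of C, and C and D the parents of E; the figure layout suggests this but does not state it), each child's genotype depends only on its parents' genotypes, so the joint factorizes as

    P(Ag, Bg, Cg, Dg, Eg) = P(Ag) P(Bg) P(Dg) P(Cg | Ag, Bg) P(Eg | Cg, Dg)

instead of one table over all individuals.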
Plan in our course
Fundamentals of graphical models:
    Representation: Bayesian networks, Markov random fields
    Exact inference
    Basics of learning
Case studies: popular graphical models:
    Multivariate Gaussian models
    FA, PPCA
    HMM, CRF, Kalman filter
Approximate inference:
    Variational methods
    Monte Carlo algorithms