Probabilistic Graphical Models Part III: Example Applications
Selim Aksoy
Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr
CS 551, Fall 2019
CS 551, Fall 2019. © 2019, Selim Aksoy (Bilkent University)
◮ We will look at example uses of Bayesian networks, topic models, and random field models:
◮ Alarm network for monitoring intensive care patients — Bayesian networks
◮ Recommendation system — Bayesian networks
◮ Diagnostic systems — Bayesian networks
◮ Statistical text analysis — probabilistic latent semantic analysis
◮ Statistical text analysis — latent Dirichlet allocation
◮ Scene classification — probabilistic latent semantic analysis
◮ Object detection — probabilistic latent semantic analysis
◮ Image segmentation — Markov random fields
◮ Contextual classification — conditional random fields
◮ Internal medicine knowledge base
◮ Quick Medical Reference, Decision Theoretic (QMR-DT)
◮ INTERNIST-1 → QMR → QMR-DT
◮ 600 diseases and 4000 symptoms
◮ M. A. Shwe, B. Middleton, D. E. Heckerman, M. Henrion, E. J. Horvitz, H. P. Lehmann, G. F. Cooper, "Probabilistic Diagnosis Using a Reformulation of the INTERNIST-1/QMR Knowledge Base," Methods of Information in Medicine, vol. 30, 1991.
◮ Given user preferences, the system can suggest movies the user is likely to enjoy.
◮ Input: movie preferences of many users.
◮ Output: a model of the correlations between movie features.
◮ Users that like comedy often also like drama.
◮ Users that like action often do not like cartoons.
◮ Users that like Robert De Niro films often also like Al Pacino films.
◮ Given user preferences, the system can predict the probability that a user will like a new movie.
◮ Input: an unorganized collection of documents.
◮ Output: an organized collection, and a description of how the documents are related to each other.
◮ T. Hofmann, "Unsupervised Learning by Probabilistic Latent Semantic Analysis," Machine Learning, vol. 42, pp. 177-196, 2001.
◮ The probabilistic latent semantic analysis (PLSA) algorithm models each document as a mixture of latent topics.
◮ PLSA uses a graphical model for the joint probability of the documents and the words in their content.
◮ Suppose there are N documents having content coming from a vocabulary of M words.
◮ The collection of documents is summarized in an N-by-M co-occurrence table n, where n(di, wj) is the number of times word wj occurs in document di.
◮ In addition, there is a latent topic variable zk associated with each observation, i.e., each occurrence of a word wj in a document di.
◮ The generative model P(di, wj) = P(di) P(wj|di) for the word content of documents is obtained by marginalizing over the latent topics:
  P(wj|di) = Σ_{k=1}^{K} P(wj|zk) P(zk|di).
◮ P(wj|zk) denotes the topic-conditional probability of word wj given topic zk.
◮ P(zk|di) denotes the probability of topic zk observed in document di.
◮ K is the number of topics.
◮ Then, the topic-specific word distributions P(wj|zk) and the document-specific topic distributions P(zk|di) define a mixture model for each document.
◮ In PLSA, the goal is to identify the probabilities P(wj|zk) and P(zk|di) that maximize the likelihood of the observed word counts.
◮ These probabilities are learned using the EM algorithm.
◮ In the E-step, the posterior probability of the latent variables is computed given the current parameter estimates:
  P(zk|di, wj) = P(wj|zk) P(zk|di) / Σ_{l=1}^{K} P(wj|zl) P(zl|di).
◮ In the M-step, the parameters are updated as
  P(wj|zk) = Σ_{i=1}^{N} n(di, wj) P(zk|di, wj) / Σ_{m=1}^{M} Σ_{i=1}^{N} n(di, wm) P(zk|di, wm),
  P(zk|di) = Σ_{j=1}^{M} n(di, wj) P(zk|di, wj) / Σ_{j=1}^{M} n(di, wj).
◮ D. M. Blei, A. Y. Ng, M. I. Jordan, "Latent Dirichlet Allocation," Journal of Machine Learning Research, vol. 3, pp. 993-1022, 2003.
◮ D. M. Blei, "Probabilistic Topic Models," Communications of the ACM, vol. 55, no. 4, pp. 77-84, 2012.
◮ Latent Dirichlet allocation (LDA) is a similar topic model with Dirichlet priors on the topic distributions.
◮ The PLSA model is used for scene classification by treating images as documents and quantized local descriptors as visual words.
◮ The topic (aspect) probabilities are used as features as an input to a classifier.
◮ H. G. Akcay, S. Aksoy, "Automatic Detection of Geospatial Objects Using Multiple Hierarchical Segmentations," IEEE Transactions on Geoscience and Remote Sensing, vol. 46, no. 7, 2008.
◮ We used the PLSA technique for object detection to model the distribution of visual words observed in image segments via latent object types.
[Figure: feature extraction pipeline in which local features are quantized by k-means and each segment is summarized as a visual-word histogram]
[Figure: graphical model linking segments s, latent topics t, and features x through the distributions P(s), P(t|s), and P(x|t), illustrated for the object type "building"]
◮ After learning the parameters of the model, we want to find the segments that are most likely to contain each object type.
◮ This is done by comparing the object-specific feature distribution to the feature distribution of each segment.
◮ The similarity between two distributions can be measured, for example, using the Kullback-Leibler divergence.
◮ Then, for each object type, the segments can be sorted according to this similarity and the best matches can be reported as detections.
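The ranking step above can be sketched as follows. KL divergence is used here as one concrete choice of similarity measure (the slides leave the measure open); the function names and the smoothing constant are illustrative.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions.
    A small eps avoids log(0) and division by zero."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)))

def rank_segments(object_dist, segment_dists):
    """Return segment indices sorted by increasing KL divergence
    from the object-specific feature distribution (best match first)."""
    scores = [kl_divergence(object_dist, s) for s in segment_dists]
    return sorted(range(len(scores)), key=lambda i: scores[i])
```

Since KL divergence is a dissimilarity (0 means identical), the segments at the front of the returned list are the strongest candidates for the object type.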
◮ Z. Kato, T.-C. Pong, "A Markov Random Field Image Segmentation Model for Color Textured Images," Image and Vision Computing, vol. 24, 2006.
◮ Markov random fields are used as a neighborhood model that encourages neighboring pixels to receive the same label.
◮ The goal is to assign each pixel a label w from a set of labels Ω.
◮ Pixels are modeled using color and texture features.
◮ Pixel features are modeled using multivariate Gaussians, one per class.
◮ A first-order neighborhood system is used as the prior for the pixel labels.
◮ The prior is modeled as a Gibbs distribution P(w) ∝ exp(−Σ_C V_C(w)), where the sum runs over the cliques C of the neighborhood graph.
◮ Each clique corresponds to a pair of neighboring pixels.
◮ The potentials favor similar classes in neighboring pixels, as in the Potts model: V(wi, wj) = −β if wi = wj, and +β otherwise.
◮ The prior is proportional to the length of the region boundaries, so labelings with smoother boundaries are favored.
◮ The final labeling for each pixel is done by maximizing the posterior probability of the labels given the observed features (the MAP estimate).
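A minimal sketch of MAP labeling with Gaussian unary costs and a Potts smoothness prior, using iterated conditional modes (ICM) as the optimizer. ICM is only one of several optimizers for this posterior (simulated annealing is another common choice); the function name, the cost encoding, and the default β are my own illustrative assumptions.

```python
import numpy as np

def icm_segment(unary, beta=1.0, n_iter=5):
    """MAP labeling by iterated conditional modes.
    unary[i, j, c] holds the cost -log P(features at (i,j) | class c);
    a Potts prior adds beta for every disagreeing 4-neighbor pair."""
    H, W, C = unary.shape
    labels = unary.argmin(axis=2)          # initialize with per-pixel ML labels
    for _ in range(n_iter):
        for i in range(H):
            for j in range(W):
                costs = unary[i, j].copy()
                for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < H and 0 <= nj < W:
                        # Potts doubleton: penalize disagreeing with a neighbor.
                        costs += beta * (np.arange(C) != labels[ni, nj])
                labels[i, j] = costs.argmin()
    return labels
```

ICM is greedy: each pixel takes the label minimizing its unary cost plus the smoothness penalty, given its neighbors' current labels, so an isolated noisy pixel that weakly prefers the wrong class gets smoothed away by its neighborhood.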
◮ A. Rabinovich, A. Vedaldi, C. Galleguillos, E. Wiewiora, S. Belongie, "Objects in Context," IEEE International Conference on Computer Vision, 2007.
◮ Semantic context among objects is used for improving object categorization.
◮ A conditional random field (CRF) framework is used to incorporate semantic context into the labeling of image segments.
◮ Given an image I and its segmentation S1, . . . , Sk, the goal is to find a label for each segment that agrees both with its appearance and with the labels of the other segments.
◮ This interaction is modeled as a probability distribution over the segment labels that combines per-segment appearance terms Π_{i=1}^{k} A(i), where A(i) measures how well label li fits the appearance of segment Si, with pairwise context terms over label pairs.
◮ The semantic context information is modeled using context matrices that record how frequently pairs of labels co-occur in the training data.
◮ x is a sequence of observations: x = (x1, . . . , xn).
◮ y is the corresponding sequence of labels: y = (y1, . . . , yn).
◮ CRF model definition: p(y|x) = (1/Z(x)) exp(Σ_j λj fj(x, y)), where the fj are feature functions, the λj are their weights, and Z(x) is the normalization term.
◮ Without any further assumptions on the structure of y, the computation of the normalization term Z(x) is intractable.
◮ One needs to enumerate all possible sequences y for computing Z(x) = Σ_y exp(Σ_j λj fj(x, y)), which is exponential in the sequence length n.
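To make the intractability concrete, here is a brute-force computation of Z(x) that literally enumerates all |L|^n label sequences; the function name and arguments are illustrative, and this is only feasible for toy inputs.

```python
import itertools
import math

def z_bruteforce(x, labels, feats, weights):
    """Z(x) = sum over ALL label sequences y of exp(sum_j lambda_j f_j(x, y)).
    Enumerates len(labels) ** len(x) sequences: exponential in n."""
    total = 0.0
    for y in itertools.product(labels, repeat=len(x)):
        score = sum(w * f(x, y) for f, w in zip(feats, weights))
        total += math.exp(score)
    return total
```

With no features at all, every sequence scores exp(0) = 1, so Z(x) is simply the number of sequences, |L|^n; this is exactly the sum that linear-chain assumptions later reduce to a polynomial-time recursion.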
◮ Linear-chain CRFs: consider feature functions that depend only on neighboring pairs of labels, fj(yt−1, yt, x, t), so that
  p(y|x) = (1/Z(x)) exp(Σ_{t=1}^{n} Σ_j λj fj(yt−1, yt, x, t)).
◮ Example application: sequence labeling problem for named entity recognition.
◮ Example feature functions: f1(yt−1, yt, x, t) = 1 if xt = "John" and yt = PERSON, and 0 otherwise.
◮ For example, if λ1 > 0, whenever f1 is active (i.e., we observe the word John), the model prefers assigning the tag PERSON to it.
◮ If λ1 < 0, the model will try to avoid the tag PERSON for John.
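The effect of the sign of λ1 can be illustrated with a toy single-position model; the feature f1 follows the example above, while the scoring function, tag set, and weight values are illustrative assumptions.

```python
import math

def f1(y_prev, y_t, x, t):
    """Indicator feature from the example: active when the current
    word is "John" and its tag is PERSON."""
    return 1.0 if x[t] == "John" and y_t == "PERSON" else 0.0

def tag_scores(x, t, tags, feats, weights, y_prev="O"):
    """Unnormalized score exp(sum_j lambda_j f_j(y_prev, y, x, t))
    for each candidate tag y at position t."""
    return {y: math.exp(sum(w * f(y_prev, y, x, t)
                            for f, w in zip(feats, weights)))
            for y in tags}

x = ["John", "lives", "here"]
pos = tag_scores(x, 0, ["PERSON", "O"], [f1], [2.0])    # lambda1 > 0
neg = tag_scores(x, 0, ["PERSON", "O"], [f1], [-2.0])   # lambda1 < 0
```

With λ1 = 2 the PERSON tag gets score exp(2) versus exp(0) for the alternative, so the model prefers PERSON for "John"; flipping the sign of λ1 reverses that preference, exactly as described above.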