

Cmput 651 – Probabilistic Graphical Models

Probabilistic Graphical Models (Cmput 651): Undirected Graphical Models 1

Matthew Brown
17/10/2008
Space of Topics
• Semantics
• Inference
• Learning
• Directed
• Undirected
• Discrete
• Continuous


What is an undirected model (or Markov net)?

Some examples: the graph structure has undirected edges (understandably).
Why use undirected models? (Misconception Example)

[Figure: three graphs over A, B, C, D arranged in a diamond: one undirected Markov net and two directed attempts]

This works: the Markov net captures exactly
(A ⊥ C | B, D) and (B ⊥ D | A, C).

NO! Each directed version implies the wrong thing: one gives ¬(B ⊥ D | A, C), the other ¬(A ⊥ C | B, D).

Bayesian networks cannot represent some distributions.
• Misconception example from Koller-Friedman (Fig 3.16)


Why use undirected models? (Lab Dynamics Example)

[Figure: two graphs over A–F (Professor: A; PhD students: B, C; MSc students: D, E; Undergrad: F): one with two-way communication (undirected edges), one with top-down communication (directed edges)]

Some distributions can be represented by both Bayes and Markov nets. Sometimes the Markov net is more natural (namely, when there is no obvious directionality). (We'll come back to Bayes vs. Markov nets later.)
Outline
• What are Markov networks
• Relating undirected graphs and PDFs
• Beyond Markov networks
• Bringing it all together: tumour segmentation example
• Markov nets vs. Bayes nets


Parameterization: Recall Bayes net CPDs

[Figure: Bayes net diamond over A, B, C, D with CPD tables]

a | P(B=1|A=a)
0 | 0.6
1 | 0.8

b, d | P(C=1|B=b, D=d)
0, 0 | 0.1
0, 1 | 0.9
1, 0 | 0.5
1, 1 | 0.5

Recall: In Bayes nets, conditional probability distributions (CPDs) describe the relationship between nodes joined by a (directed) edge.
Parameterization: Factors

[Figure: Markov net diamond over A, B, C, D with factor tables; KF Fig 4.1]

Factors describe the weighting between connected nodes.
• Factor values are always ≥ 0
• Not necessarily normalized
• PDFs and CPDs are special cases
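A factor can be stored concretely as a table mapping each joint assignment of its scope to a non-negative weight. A minimal Python sketch (the factor values are illustrative, not taken from the figure):

```python
# A factor over (A, B): maps each joint assignment to a weight >= 0.
# Unlike a CPD, there is no requirement that anything sums to 1.
phi_AB = {
    (0, 0): 30.0,
    (0, 1): 5.0,
    (1, 0): 1.0,
    (1, 1): 10.0,
}
assert all(v >= 0 for v in phi_AB.values())  # factor values are always >= 0

# A CPD is the special case where, for each parent assignment, the
# entries sum to 1 -- here the P(B|A) table from the previous slide:
cpd_B_given_A = {
    (0, 0): 0.4, (0, 1): 0.6,  # P(B=0|A=0), P(B=1|A=0)
    (1, 0): 0.2, (1, 1): 0.8,  # P(B=0|A=1), P(B=1|A=1)
}
for a in (0, 1):
    assert abs(cpd_B_given_A[a, 0] + cpd_B_given_A[a, 1] - 1.0) < 1e-9
```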


Parameterization: Factors and Factorization

[Figure: Markov net diamond over A, B, C, D]

The probability distribution is derived by multiplying the factors and then normalizing:

$$P(a,b,c,d) = \frac{1}{Z}\,\phi_1[a,b] \cdot \phi_2[b,c] \cdot \phi_3[c,d] \cdot \phi_4[a,d]$$

$$Z = \sum_{a,b,c,d} \phi_1[a,b] \cdot \phi_2[b,c] \cdot \phi_3[c,d] \cdot \phi_4[a,d]$$

Any probability distribution that can be expressed as a normalized product of factors in this way is called a Gibbs distribution. (We'll come back to this below; also see KF Definition 4.3.4.)
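A brute-force sketch of this computation for binary A, B, C, D (the factor values are made up for illustration; any non-negative numbers work):

```python
from itertools import product

# Illustrative pairwise factors on the diamond A-B-C-D.
phi1 = {(0, 0): 30.0, (0, 1): 5.0, (1, 0): 1.0, (1, 1): 10.0}    # phi1[a,b]
phi2 = {(0, 0): 100.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 100.0}  # phi2[b,c]
phi3 = {(0, 0): 1.0, (0, 1): 100.0, (1, 0): 100.0, (1, 1): 1.0}  # phi3[c,d]
phi4 = {(0, 0): 100.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 100.0}  # phi4[a,d]

def unnorm(a, b, c, d):
    """Unnormalized measure: the product of all four factors."""
    return phi1[a, b] * phi2[b, c] * phi3[c, d] * phi4[a, d]

# Partition function Z: sum the factor product over every joint assignment.
Z = sum(unnorm(*xs) for xs in product((0, 1), repeat=4))

def P(a, b, c, d):
    """The Gibbs distribution P(a,b,c,d) = unnorm(a,b,c,d) / Z."""
    return unnorm(a, b, c, d) / Z

assert abs(sum(P(*xs) for xs in product((0, 1), repeat=4)) - 1.0) < 1e-9
```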
Parameterization: Example (slide 1/2)

[Figure: Markov net diamond over A, B, C, D with factor tables; KF Fig 4.1]

What happens when you multiply over all the factors below? (Answer on next slide.)


Parameterization: Example

Product over factors:

[Figure: table of the factor product for each joint assignment of A, B, C, D]
Parameterization: Factor products

$$\phi[A,B,C] = \phi_1[A,B] \cdot \phi_2[B,C]$$

Match up the shared variable assignments (B in this example).

[Figure: tables for φ₁[A,B] and φ₂[B,C] matched on B; KF Fig 4.3]
(Also see Join from RG's slides: Variable elimination – slide 25.)
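A sketch of the factor-product operation itself, for binary variables with named scopes (illustrative table values):

```python
from itertools import product

def factor_product(scope1, phi1, scope2, phi2):
    """Multiply two factors, matching assignments of the shared variables.

    Scopes are tuples of variable names; tables map assignment tuples
    of 0/1 values to non-negative weights.
    """
    scope = tuple(dict.fromkeys(scope1 + scope2))  # ordered union of scopes
    table = {}
    for xs in product((0, 1), repeat=len(scope)):
        values = dict(zip(scope, xs))
        key1 = tuple(values[v] for v in scope1)  # shared vars (B) must agree
        key2 = tuple(values[v] for v in scope2)
        table[xs] = phi1[key1] * phi2[key2]
    return scope, table

phi1 = {(0, 0): 0.5, (0, 1): 0.8, (1, 0): 0.1, (1, 1): 0.0}
phi2 = {(0, 0): 0.5, (0, 1): 0.7, (1, 0): 0.1, (1, 1): 0.2}
scope, phi = factor_product(("A", "B"), phi1, ("B", "C"), phi2)
print(scope)         # ('A', 'B', 'C')
print(phi[0, 1, 1])  # phi1[0,1] * phi2[1,1] = 0.8 * 0.2
```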


Parameterization: General thoughts

General factors and Markov nets:
• Advantage: not normalized
  • Computations are easier; you don't have to normalize until the end
• Disadvantage: not normalized
  • Harder to intuit how changes to a factor affect the whole PDF
  • Harder to train
Factorization of PDFs: Gibbs distributions

A PDF P(X₁,...,Xₙ) is a Gibbs distribution if it factorizes thus:

$$P(X_1,\ldots,X_n) = \frac{1}{Z}\,\phi_1[D_1] \cdot \phi_2[D_2] \cdot \ldots \cdot \phi_m[D_m]$$

$$Z = \sum_{X_1,\ldots,X_n} \phi_1[D_1] \cdot \phi_2[D_2] \cdot \ldots \cdot \phi_m[D_m]$$

where D₁, D₂, etc. are (possibly overlapping) subsets of X₁,...,Xₙ.

Dᵢ is called the scope of factor φᵢ.

Z is the partition function and normalizes the factor product in the numerator. (Also see KF Definition 4.3.4.)
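The same definition in code, now for arbitrary (possibly overlapping) scopes; a minimal sketch assuming all variables are binary:

```python
from itertools import product

def gibbs_pdf(variables, factors):
    """Build P(X1,...,Xn) = (1/Z) * prod_i phi_i[D_i].

    `variables` is a tuple of names; `factors` is a list of (scope, table)
    pairs, where each scope D_i is a tuple of names from `variables`
    (scopes may overlap) and each table maps 0/1 assignment tuples of its
    scope to non-negative weights.
    """
    def unnorm(values):  # values: dict mapping variable name -> 0 or 1
        p = 1.0
        for scope, table in factors:
            p *= table[tuple(values[v] for v in scope)]
        return p

    # Z, the partition function, normalizes the factor product.
    Z = sum(unnorm(dict(zip(variables, xs)))
            for xs in product((0, 1), repeat=len(variables)))
    return lambda values: unnorm(values) / Z

# Usage: P = gibbs_pdf(("A", "B", "C", "D"), [(("A", "B"), phi1), ...])
#        P({"A": 0, "B": 1, "C": 1, "D": 0})
```

Note that Z sums over all 2ⁿ joint assignments, which is exactly why leaving factors unnormalized until the end (previous slide) is attractive.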


Factorization of PDFs

A PDF P(X₁,...,Xₙ) factorizes over a Markov net H if

1. P(X₁,...,Xₙ) is a Gibbs distribution:

$$P(X_1,\ldots,X_n) = \frac{1}{Z}\,\phi_1[D_1] \cdot \phi_2[D_2] \cdot \ldots \cdot \phi_m[D_m], \qquad Z = \sum_{X_1,\ldots,X_n} \phi_1[D_1] \cdot \ldots \cdot \phi_m[D_m]$$

and

2. D₁, D₂, etc. are (maximal or non-maximal) cliques of H.

Recall:
clique = a complete (fully-connected) subgraph of H
maximal clique = a clique that is not a subgraph of a larger clique

(K&F use the terms "clique" and "subclique" for what Russ and I (and the graphical modeling community) call "maximal clique" and "clique".)
Cliques

[Figure: undirected graph over A–G]

Maximal cliques:
{A,E}
{B,C,D,E}
{D,E,F}
{D,F,G}

Examples of cliques:
{A}, {B}, {C}, etc. (i.e. single nodes)
{B,C}, {B,D}, {B,E}, {E,D}, {F,G}, etc.
{B,C,D}, {C,D,E}, etc.

{D,E,F,G} is NOT a clique (no E–G edge).
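These clique claims can be verified mechanically. A sketch using the third-party networkx library, with the edge list read off the maximal cliques above:

```python
from itertools import combinations
import networkx as nx  # third-party: pip install networkx

# Undirected graph consistent with the maximal cliques listed above.
G = nx.Graph([("A", "E"), ("B", "C"), ("B", "D"), ("B", "E"), ("C", "D"),
              ("C", "E"), ("D", "E"), ("D", "F"), ("E", "F"),
              ("D", "G"), ("F", "G")])

def is_clique(nodes):
    """A clique is fully connected: every pair of nodes must share an edge."""
    return all(G.has_edge(u, v) for u, v in combinations(nodes, 2))

print(sorted(sorted(c) for c in nx.find_cliques(G)))
# [['A', 'E'], ['B', 'C', 'D', 'E'], ['D', 'E', 'F'], ['D', 'F', 'G']]
print(is_clique({"B", "C", "D"}))       # True: a non-maximal clique
print(is_clique({"D", "E", "F", "G"}))  # False: no E-G edge
```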


Factorization of PDFs: example

[Figure: undirected graph over A, B, C, D, E]

$$P(A,B,C,D,E) = \frac{1}{Z}\,\phi_1[A,B] \cdot \phi_2[B,C,D] \cdot \phi_3[B,D,E] \cdot \phi_4[B,C] \cdot \phi_5[B,D] \cdot \phi_6[C,D] \cdot \phi_7[B,E] \cdot \phi_8[D,E] \cdot \phi_9[A] \cdot \phi_{10}[B] \cdot \phi_{11}[C] \cdot \phi_{12}[D] \cdot \phi_{13}[E]$$

NOTE: There is more than one way to define the factors. For example, one can use only the maximal cliques (because the maximal clique factors can subsume the (sub)clique factors):

$$P(A,B,C,D,E) = \frac{1}{Z}\,\psi_1[A,B] \cdot \psi_2[B,C,D] \cdot \psi_3[B,D,E]$$
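The subsumption step can be made explicit: multiply each (sub)clique factor into some maximal-clique factor whose scope covers it, leaving the distribution unchanged. A minimal sketch for binary variables (the example factors at the end are made up):

```python
from itertools import product

def absorb(maximal_scopes, factors):
    """Fold each factor into the first maximal-clique scope covering it.

    Returns one table psi per maximal clique; the product of the psis
    equals the product of the original factors, so P is unchanged.
    """
    psis = {scope: {xs: 1.0 for xs in product((0, 1), repeat=len(scope))}
            for scope in maximal_scopes}
    for scope, table in factors:
        home = next(s for s in maximal_scopes if set(scope) <= set(s))
        for xs in product((0, 1), repeat=len(home)):
            values = dict(zip(home, xs))
            psis[home][xs] *= table[tuple(values[v] for v in scope)]
    return psis

# e.g. phi9[A] folds into psi1[A,B]; phi4[B,C] into psi2[B,C,D]; etc.
phi9 = {(0,): 2.0, (1,): 1.0}                                # unary factor on A
phi4 = {(0, 0): 1.0, (0, 1): 3.0, (1, 0): 3.0, (1, 1): 1.0}  # factor on B,C
psis = absorb([("A", "B"), ("B", "C", "D"), ("B", "D", "E")],
              [(("A",), phi9), (("B", "C"), phi4)])
```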
Independence: global Markov assumption

[Figure: undirected graph over A–G; fill denotes conditioning]

Active path: a path from X to Y with no conditioned nodes on it.

Global Markov assumption: if there is no active path from X to Y after conditioning on some set of nodes Z, then X and Y are independent given Z.

(A ⊥ B | E)
(A ⊥ {B,C,D,F,G} | E)
¬(B ⊥ G | E)

Also see KF section 4.3.1.
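Checking the global Markov assumption is just graph reachability: delete the conditioned nodes and see whether any path survives. A sketch (again using the third-party networkx library) on the same graph as the cliques example:

```python
import networkx as nx  # third-party: pip install networkx

G = nx.Graph([("A", "E"), ("B", "C"), ("B", "D"), ("B", "E"), ("C", "D"),
              ("C", "E"), ("D", "E"), ("D", "F"), ("E", "F"),
              ("D", "G"), ("F", "G")])

def independent(X, Y, Z):
    """X and Y are independent given Z iff removing Z kills every path
    between them (i.e. no active path remains)."""
    H = G.copy()
    H.remove_nodes_from(Z)
    return not any(nx.has_path(H, x, y) for x in X for y in Y)

print(independent({"A"}, {"B"}, {"E"}))                      # True:  (A ⊥ B | E)
print(independent({"B"}, {"G"}, {"E"}))                      # False: B-D-G is active
print(independent({"A"}, {"B", "C", "D", "F", "G"}, {"E"}))  # True
```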


Independence: global Markov assumption

More examples (fill denotes conditioning):

({B,C} ⊥ {F,G} | D, E)
(A ⊥ {B,C,F,G} | D, E)

Independence: global Markov assumption

With no conditioning, there are no independencies among any nodes, e.g.:

¬(A ⊥ {B,C,D,E,F,G} | {})


Independence: global Markov assumption

[Figure: two disconnected components, one over A–G and one over H, I, J, K]

Without conditioning, only non-connected graphs have independencies:

({H,I,J,K} ⊥ {A,B,C,D,E,F,G} | {})
Global Markov independence is monotonic

[Figure: two copies of a graph over A–J; in both, F and G are conditioned (filled)]

After conditioning on F and G, {A,B} is independent of {H,I,J}. Adding more nodes to the conditioned set does NOT change this.

Monotonicity: adding more nodes to the conditioned set (cut set) does not change independence relations in Markov nets.
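Monotonicity follows directly from the reachability view: removing extra nodes can only destroy paths, never create them. A quick check on an illustrative graph (the edge list below is assumed for the demo, not read off the slide):

```python
from itertools import combinations
import networkx as nx  # third-party: pip install networkx

# Assumed graph: {A,B,C,D,E} and {H,I,J} joined only through F and G.
G = nx.Graph([("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "A"),
              ("C", "F"), ("D", "G"), ("F", "H"), ("G", "I"),
              ("H", "I"), ("I", "J")])

def independent(X, Y, Z):
    H2 = G.copy()
    H2.remove_nodes_from(Z)
    return not any(nx.has_path(H2, x, y) for x in X for y in Y)

X, Y, cut = {"A", "B"}, {"H", "I", "J"}, {"F", "G"}
assert independent(X, Y, cut)  # conditioning on the cut set separates X and Y

# Enlarging the conditioning set never breaks the independence.
rest = set(G.nodes) - X - Y - cut  # remaining nodes: {C, D, E}
for k in range(len(rest) + 1):
    for extra in combinations(sorted(rest), k):
        assert independent(X, Y, cut | set(extra))
print("independence preserved under every larger conditioning set")
```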

