SLIDE 1

Markov Random Fields

Umamahesh Srinivas

iPAL Group Meeting

February 25, 2011

SLIDE 2

Outline

1. Basic graph-theoretic concepts
2. Markov chain
3. Markov random field (MRF)
4. Gauss-Markov random field (GMRF), and applications
5. Other popular MRFs

SLIDE 3

References

1. Charles Bouman, Markov random fields and stochastic image models. Tutorial presented at ICIP 1995.
2. Mario Figueiredo, Bayesian methods and Markov random fields. Tutorial presented at CVPR 1998.

SLIDE 4

Basic graph-theoretic concepts

A graph G = (V, E) is a finite collection of nodes (or vertices) V = {n1, n2, . . . , nN} and a set of edges E ⊆ V × V. We consider only undirected graphs.

Neighbors: two nodes ni, nj ∈ V are neighbors if (ni, nj) ∈ E.
Neighborhood of a node: N(ni) = {nj : (ni, nj) ∈ E}. Neighborhood is a symmetric relation: ni ∈ N(nj) ⇔ nj ∈ N(ni).
Complete graph: every pair of distinct nodes is connected, i.e. ∀ni ∈ V, N(ni) = {nj : j ∈ {1, 2, . . . , N}\{i}}.
Clique: a complete subgraph of G.
Maximal clique: a clique that cannot be extended by adding any other node while retaining complete connectedness.

SLIDE 7

Illustration

V = {1, 2, 3, 4, 5, 6}
E = {(1, 2), (1, 3), (2, 4), (2, 5), (3, 4), (3, 6), (4, 6), (5, 6)}
N(4) = {2, 3, 6}
Examples of cliques: {1}, {2, 5}, {3, 4, 6}
Set of all cliques: all singletons in V, all edges in E, and {3, 4, 6}
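As a quick check of this example (a sketch of my own, not from the slides), the neighborhood of node 4 and the maximal cliques can be reproduced with the networkx library:

```python
import networkx as nx

G = nx.Graph()
G.add_edges_from([(1, 2), (1, 3), (2, 4), (2, 5), (3, 4), (3, 6), (4, 6), (5, 6)])

print(sorted(G.neighbors(4)))                    # [2, 3, 6]
print(sorted(map(sorted, nx.find_cliques(G))))   # maximal cliques; includes [3, 4, 6]
```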

SLIDE 8

Separation

Let A, B, C be three disjoint subsets of V.
C separates A from B if every path from a node in A to a node in B contains some node in C.
Example: C = {1, 4, 6} separates A = {3} from B = {2, 5}.

SLIDE 9

Markov chains

Graphical model: associate each node of a graph with a random variable (or a collection thereof).

Homogeneous 1-D Markov chain: p(xn | xi, i < n) = p(xn | xn−1)

Probability of a sequence: p(x) = p(x0) ∏_{n=1}^{N} p(xn | xn−1)
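A minimal sketch (not from the slides) of this factorization, assuming a two-state chain with an initial distribution `pi` and transition matrix `P` that I made up for illustration:

```python
import numpy as np

pi = np.array([0.5, 0.5])                    # assumed p(x_0)
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])                   # assumed p(x_n = j | x_{n-1} = i)
x = [0, 0, 1, 1, 1]                          # an example state sequence

log_p = np.log(pi[x[0]]) + sum(np.log(P[a, b]) for a, b in zip(x[:-1], x[1:]))
print(np.exp(log_p))                         # p(x) = p(x_0) * prod_n p(x_n | x_{n-1})
```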

SLIDE 10

2-D Markov chains

Advantages:
  • Simple expressions for probability
  • Simple parameter estimation

Disadvantages:
  • No natural ordering of image pixels
  • Anisotropic model behavior

SLIDE 11

Random fields on graphs

Consider a collection of random variables x = (x1, x2, . . . , xN) with associated joint probability distribution p(x).
Let A, B, C be three disjoint subsets of V, and let xA denote the collection of random variables indexed by A.

Conditional independence: A ⊥⊥ B | C ⇔ p(xA, xB|xC) = p(xA|xC) p(xB|xC)

Markov random field: an undirected graphical model in which each node corresponds to a random variable (or a collection of random variables), and the edges identify conditional dependencies.

SLIDE 13

Markov properties

Pairwise Markovianity: (ni, nj) ∉ E ⇒ xi and xj are independent when conditioned on all other variables:
p(xi, xj | x\{i,j}) = p(xi | x\{i,j}) p(xj | x\{i,j})

Local Markovianity: given its neighborhood, a variable is independent of the rest of the variables:
p(xi | xV\{i}) = p(xi | xN(i))

Global Markovianity: let A, B, C be three disjoint subsets of V. If "C separates A from B" implies p(xA, xB|xC) = p(xA|xC) p(xB|xC), then p(·) is global Markov w.r.t. G.

SLIDE 16

Hammersley-Clifford Theorem

Consider a random field x on a graph G, such that p(x) > 0. Let C denote the set of all maximal cliques of the graph. If the field has the local Markov property, then p(x) can be written as a Gibbs distribution:

p(x) = (1/Z) exp{ −Σ_{C∈C} VC(xC) },

where Z, the normalizing constant, is called the partition function, and the VC(xC) are the clique potentials.

Conversely, if p(x) can be written in Gibbs form for the cliques of some graph, then it has the global Markov property.

Fundamental consequence: every Markov random field can be specified via clique potentials.
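A toy sketch (my own, not from the slides) of the Gibbs form on a small binary MRF: a 3-node graph I made up, with pairwise disagreement potentials (the same form as the Ising potentials that appear later in the deck), and the partition function Z computed by brute-force enumeration:

```python
import itertools
import numpy as np

edges = [(0, 1), (1, 2), (2, 0)]   # assumed 3-node example graph
beta = 1.0                         # assumed model parameter

def energy(x):
    # sum of pairwise clique potentials V(x_i, x_j) = beta * [x_i != x_j]
    return beta * sum(x[i] != x[j] for i, j in edges)

states = list(itertools.product([0, 1], repeat=3))
unnorm = np.array([np.exp(-energy(x)) for x in states])
Z = unnorm.sum()                   # partition function
p = unnorm / Z
print(dict(zip(states, p.round(3))))
```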

SLIDE 19

Regular rectangular lattices

V = {(i, j) : i = 1, . . . , M; j = 1, . . . , N}

Order-K neighborhood system: N^K(i, j) = {(m, n) : (i − m)² + (j − n)² ≤ K}
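A small sketch of this definition (mine, not from the slides): the order-K neighborhood of a site on an M × N lattice, which gives the usual 4-connected system for K = 1 and the 8-connected system for K = 2:

```python
def neighborhood(i, j, K, M, N):
    # all lattice sites within squared Euclidean distance K of (i, j), excluding (i, j)
    return [(m, n)
            for m in range(M) for n in range(N)
            if (m, n) != (i, j) and (i - m) ** 2 + (j - n) ** 2 <= K]

print(neighborhood(2, 2, 1, 5, 5))  # K = 1: 4 neighbors
print(neighborhood(2, 2, 2, 5, 5))  # K = 2: 8 neighbors
```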

SLIDE 20

Auto-models

Only pairwise interactions.
In terms of clique potentials: |C| > 2 ⇒ VC(·) = 0
Simplest possible neighborhood models.

SLIDE 21

Gauss-Markov Random Fields (GMRF)

Joint probability density (assuming zero mean):

p(x) = (1 / ((2π)^{n/2} |Σ|^{1/2})) exp{ −(1/2) xᵀΣ⁻¹x }

Quadratic form in the exponent:

xᵀΣ⁻¹x = Σ_i Σ_j xi xj (Σ⁻¹)_{i,j}  ⇒  auto-model

The neighborhood system is determined by the potential matrix Σ⁻¹.
Local conditionals are univariate Gaussian.
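A sketch (not from the slides, using numpy/scipy) of the point that the zero pattern of Σ⁻¹ defines the neighborhood system: an assumed tridiagonal precision matrix makes each xi a neighbor of only xi−1 and xi+1:

```python
import numpy as np
from scipy.stats import multivariate_normal

# assumed tridiagonal precision matrix Sigma^{-1}
Q = np.diag([2.0] * 5) + np.diag([-1.0] * 4, 1) + np.diag([-1.0] * 4, -1)
Sigma = np.linalg.inv(Q)

x = np.zeros(5)
print(multivariate_normal(mean=np.zeros(5), cov=Sigma).pdf(x))   # GMRF density at x
print((np.abs(Q) > 1e-12).astype(int))   # off-diagonal nonzeros = neighbor pairs
```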

SLIDE 23

Gauss-Markov Random Fields

Specification via clique potentials:

VC(xC) = (1/2) ( Σ_{i∈C} α^C_i xi )² = (1/2) ( Σ_{i∈V} α^C_i xi )²,  with α^C_i = 0 whenever i ∉ C.

The exponent of the GMRF density becomes:

−Σ_{C∈C} VC(xC) = −(1/2) Σ_{C∈C} ( Σ_{i∈V} α^C_i xi )²
                = −(1/2) Σ_{i∈V} Σ_{j∈V} ( Σ_{C∈C} α^C_i α^C_j ) xi xj
                = −(1/2) xᵀΣ⁻¹x.

SLIDE 24

GMRF: Application to image processing

Classical image “smoothing” prior.
Consider an image to be a rectangular lattice with first-order pixel neighborhoods.
Cliques: pairs of vertically or horizontally adjacent pixels.
Clique potentials: squares of first-order differences (an approximation of the continuous derivative):

V_{(i,j),(i,j−1)}(x_{i,j}, x_{i,j−1}) = (1/2)(x_{i,j} − x_{i,j−1})²

Resulting Σ⁻¹: block-tridiagonal with tridiagonal blocks.
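One way to see the block-tridiagonal structure is to build the precision matrix explicitly; this is my own construction (a sketch, not the authors' code), assuming Σ⁻¹ = DᵀD where D stacks the horizontal and vertical first-difference operators of the lattice:

```python
import numpy as np

def smoothing_precision(M, N):
    I_M, I_N = np.eye(M), np.eye(N)
    d_M = np.diff(np.eye(M), axis=0)        # (M-1) x M first-difference matrix
    d_N = np.diff(np.eye(N), axis=0)
    D = np.vstack([np.kron(d_N, I_M),       # horizontal differences
                   np.kron(I_N, d_M)])      # vertical differences
    return D.T @ D                          # block-tridiagonal, tridiagonal blocks

Q = smoothing_precision(4, 4)
print(Q.shape, int((np.abs(Q) > 1e-12).sum()))
```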

SLIDE 25

Bayesian image restoration with GMRF prior

Observation model: y = Hx + n, n ∼ N(0, σ²I)

Smoothing GMRF prior: p(x) ∝ exp{ −(1/2) xᵀΣ⁻¹x }

MAP estimate: x̂ = [σ²Σ⁻¹ + HᵀH]⁻¹ Hᵀy
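A self-contained sketch of this MAP formula (my own simplification, not from the slides): denoising (H = I) of a smooth 1-D signal standing in for an image row, with the first-difference smoothing prior:

```python
import numpy as np

n, sigma = 100, 0.2
D = np.diff(np.eye(n), axis=0)                 # first-difference operator
Q = D.T @ D                                    # smoothing prior precision Sigma^{-1}

rng = np.random.default_rng(0)
x_true = np.sin(np.linspace(0, 3, n))          # smooth "image"
H = np.eye(n)                                  # pure denoising: no blur
y = H @ x_true + sigma * rng.standard_normal(n)

x_map = np.linalg.solve(sigma**2 * Q + H.T @ H, H.T @ y)   # [sigma^2 Q + H^T H]^{-1} H^T y
print(np.mean((y - x_true)**2), np.mean((x_map - x_true)**2))
```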

SLIDE 26

Bayesian image restoration with GMRF prior

Figure: (a) Original image, (b) Blurred and slightly noisy image, (c) Restored version of (b), (d) No blur, severe noise, (e) Restored version of (d).

Deblurring: good.
Denoising: oversmoothing; “edge discontinuities” are smoothed out.

How to preserve discontinuities?
  • Other prior models
  • Hidden/latent binary random variables
  • Robust potential functions (e.g. L2 vs. L1-norm)

SLIDE 28

Compound GMRF

Insert binary variables v to “turn off” clique potentials.

Modified clique potentials: V(x_{i,j}, x_{i,j−1}, v_{i,j}) = (1/2)(1 − v_{i,j})(x_{i,j} − x_{i,j−1})²

Intuitive explanation:
  • v = 0 ⇒ clique potential is quadratic (“on”)
  • v = 1 ⇒ VC(·) = 0 → no smoothing; the image has an edge at this location

Can choose separate latent variables v and h for vertical and horizontal edges respectively:

p(x|h, v) ∝ exp{ −(1/2) xᵀΣ⁻¹(h, v)x }

MAP estimate: x̂ = [σ²Σ⁻¹(h, v) + HᵀH]⁻¹ Hᵀy
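A sketch of how Σ⁻¹(h, v) depends on the latent edge variables (my own reading of the slide, not the authors' code), restricted to horizontal cliques along a single 1-D row of pixels for brevity:

```python
import numpy as np

def row_precision(v):
    """v[j] = 1 turns off the clique between pixels j-1 and j (an edge)."""
    n = len(v) + 1
    Q = np.zeros((n, n))
    for j, vj in enumerate(v, start=1):
        if vj == 0:                           # potential (1/2)(x_j - x_{j-1})^2 is "on"
            d = np.zeros(n); d[j] = 1.0; d[j - 1] = -1.0
            Q += np.outer(d, d)
    return Q

print(row_precision([0, 0, 1, 0]))            # no coupling across the "edge" at j = 3
```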

SLIDE 32

Discontinuity-preserving restoration

Convex potentials:
  • Generalized Gaussians: V(x) = |x|^p, p ∈ [1, 2]
  • Stevenson: V(x) = x² for |x| < a; 2a|x| − a² for |x| ≥ a
  • Green: V(x) = 2a² log cosh(x/a)

Non-convex potentials:
  • Blake, Zisserman: V(x) = (min{|x|, a})²
  • Geman, McClure: V(x) = x² / (x² + a²)
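For reference, a vectorized sketch of these potential functions (mine, not from the slides; the default parameter values are arbitrary):

```python
import numpy as np

def gen_gaussian(x, p=1.5):          # convex, p in [1, 2]
    return np.abs(x) ** p

def stevenson(x, a=1.0):             # convex: quadratic near 0, linear in the tails
    return np.where(np.abs(x) < a, x ** 2, 2 * a * np.abs(x) - a ** 2)

def green(x, a=1.0):                 # convex, smooth
    return 2 * a ** 2 * np.log(np.cosh(x / a))

def blake_zisserman(x, a=1.0):       # non-convex, truncated quadratic
    return np.minimum(np.abs(x), a) ** 2

def geman_mcclure(x, a=1.0):         # non-convex, bounded
    return x ** 2 / (x ** 2 + a ** 2)

x = np.linspace(-3, 3, 7)
print(stevenson(x), geman_mcclure(x))
```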

SLIDE 33

Ising model (2-D MRF)

V(xi, xj) = β δ(xi ≠ xj), where β is a model parameter.

Energy function: Σ_{C∈C} VC(xC) = β · (boundary length)

Longer boundaries are less probable.
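A quick sketch of the Ising energy (mine, not from the slides): β times the boundary length, counted as the number of disagreeing 4-neighbor pairs in a binary label image:

```python
import numpy as np

def ising_energy(x, beta):
    horiz = np.sum(x[:, 1:] != x[:, :-1])
    vert = np.sum(x[1:, :] != x[:-1, :])
    return beta * (horiz + vert)

x = np.zeros((8, 8), dtype=int)
x[:, 4:] = 1                                  # straight vertical boundary
print(ising_energy(x, beta=1.0))              # 8: the shortest boundary across the image
```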

SLIDE 34

Application: Image segmentation

A discrete MRF is used to model the segmentation field.
Each class is represented by a value Xs ∈ {0, . . . , M − 1}.

Joint distribution: P{Y ∈ dy, X = x} = p(y|x) p(x)

(Bayesian) MAP estimation: X̂ = arg max_x p_{x|Y}(x|Y) = arg max_x ( log p(Y|x) + log p(x) )

SLIDE 37

MAP optimization for segmentation

Data model: p_{y|x}(y|x) = ∏_{s∈S} p(ys|xs)

Prior (Ising) model: p_x(x) = (1/Z) exp{−β t1(x)}, where t1(x) is the number of horizontal and vertical neighbor pairs of x with differing values

MAP estimate: x̂ = arg min_x { −log p_{y|x}(y|x) + β t1(x) }

Hard optimization problem.

SLIDE 39

Some proposed approaches

  • Iterated conditional modes (ICM): iterative minimization w.r.t. each pixel (see the sketch below the figure)
  • Simulated annealing: generate samples from the prior distribution
  • Multi-scale resolution segmentation

Figure: (a) Synthetic image with three textures, (b) ICM, (c) Simulated annealing, (d) Multi-resolution approach.
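Following up on the ICM bullet above, here is a minimal sketch (my own simplification, not the authors' code) of ICM for MAP segmentation with a Gaussian data term and the Ising prior, assuming known class means and noise level:

```python
import numpy as np

def icm_segment(y, means, sigma, beta, n_iter=10):
    labels = np.arange(len(means))
    x = np.argmin((y[..., None] - np.asarray(means)) ** 2, axis=-1)  # ML initialization
    for _ in range(n_iter):
        for i in range(y.shape[0]):
            for j in range(y.shape[1]):
                nbrs = [x[m, n] for m, n in ((i-1, j), (i+1, j), (i, j-1), (i, j+1))
                        if 0 <= m < y.shape[0] and 0 <= n < y.shape[1]]
                data = (y[i, j] - np.asarray(means)) ** 2 / (2 * sigma ** 2)   # -log p(y_s|x_s)
                prior = beta * np.array([sum(k != lab for k in nbrs) for lab in labels])
                x[i, j] = int(np.argmin(data + prior))                         # per-pixel minimization
    return x

rng = np.random.default_rng(1)
truth = np.zeros((32, 32), dtype=int); truth[:, 16:] = 1
y = truth + 0.8 * rng.standard_normal(truth.shape)
print(np.mean(icm_segment(y, means=[0.0, 1.0], sigma=0.8, beta=1.0) == truth))
```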

SLIDE 40

Summary

  • Graphical models describe probability distributions whose conditional dependencies arise from specific graph structures.
  • A Markov random field is an undirected graphical model with special factorization properties.
  • 2-D MRFs have been widely used as priors in image processing problems.
  • The choice of potential functions leads to different optimization problems.
