

slide-1
SLIDE 1

Inference and computing with decomposable graphs

Peter Green (1), Alun Thomas (2)

(1) School of Mathematics, University of Bristol

(2) Genetic Epidemiology, University of Utah

6 September 2011 / Bayes 250

Green/Thomas (Bristol/Utah) Decomposable graphs in statistics Edinburgh, September 2011 1 / 54

slide-2
SLIDE 2

Outline

1. Decomposable graphs

2. Bayesian model determination

3. Examples

slide-3
SLIDE 3

Decomposable graphs

Graphical models

The conditional independence graph of a multivariate distribution (for a random vector X, say) tells us much about the structure of the distribution.

Recall that G = (V, E), where the vertex set V is the set of indices of the components of X, and there is an (undirected) edge between vertices i and j, written i ∼ j, unless

$$X_i \perp\!\!\!\perp X_j \mid X_{V \setminus \{i,j\}}$$

Under conditions (positivity is sufficient), global and local Markov properties also hold.

Given i.i.d. observations on X, we are often interested in inferring G, sometimes known as structural learning.

slide-4
SLIDE 4

Decomposable graphs

Decomposable graphical models

The case where G is decomposable has been much studied. Decomposability is a graph theory concept with statistical and computational implications.

A graph is complete if every pair of vertices is joined by an edge. A maximal complete subgraph is called a clique.

An ordering of the cliques of an undirected graph, $(C_1, C_2, \ldots, C_c)$, is said to be perfect if for each i = 2, 3, . . . , c, there exists h = h(i) < i such that

$$S_i = C_i \cap \bigcup_{j=1}^{i-1} C_j \subseteq C_h$$

The sets $S_i$ are called separators. If an undirected graph admits a perfect ordering, it is said to be decomposable.

slide-5
SLIDE 5

Decomposable graphs

Decomposability: junction trees

Decomposable graphs are also known as triangulated: a graph is decomposable if and only if it has no chordless k-cycles for k ≥ 4.

A perfect ordering guides the construction of a junction tree: a graph whose vertices are cliques, and with edges between $C_i$ and $C_{h(i)}$, often labelled with $S_i$, for i = 2, 3, . . . , c.

There may be many perfect orderings, and many junction trees, for a given decomposable graph.
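The definition of a perfect ordering translates almost directly into code. The following is a minimal sketch, not taken from the talk: the function name `junction_tree_from_ordering` and the 0-based indexing are our own choices, and cliques are assumed to be given as Python sets. It computes each separator S_i, looks for a valid h(i), and reports the ordering as not perfect if none exists.

```python
def junction_tree_from_ordering(cliques):
    """Check that an ordering of cliques is perfect.

    Returns (separators, parents), where separators[i] is S_i and
    parents[i] is one valid h(i); entry 0 of both lists is None.
    Raises ValueError if the ordering is not perfect.
    """
    separators = [None]
    parents = [None]
    seen = set(cliques[0])  # union of C_1, ..., C_{i-1}
    for i in range(1, len(cliques)):
        s_i = frozenset(set(cliques[i]) & seen)
        # h(i): any earlier clique that contains the whole separator
        h = next((k for k in range(i) if s_i.issubset(cliques[k])), None)
        if h is None:
            raise ValueError(f"ordering is not perfect at position {i}")
        separators.append(s_i)
        parents.append(h)
        seen |= set(cliques[i])
    return separators, parents


# Cliques of a small decomposable graph on vertices 1..7 (hypothetical ordering).
ordering = [{2, 3, 6}, {2, 6, 7}, {3, 4, 5, 6}, {1, 2}]
separators, parents = junction_tree_from_ordering(ordering)
for i in range(1, len(ordering)):
    print(sorted(ordering[i]), "joins", sorted(ordering[parents[i]]),
          "via separator", sorted(separators[i]))
```

The pairs (i, h(i)) returned in `parents` are the edges of one junction tree; a different valid choice of h(i) gives a different tree for the same graph.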

slide-6
SLIDE 6

Decomposable graphs

A small decomposable graph

Non-uniqueness of junction tree

[Figure: a decomposable graph on vertices 1 to 7, with cliques 267, 236, 3456 and 12 and separators 26, 36 and 2, shown with junction trees for it.]


slide-8
SLIDE 8

Decomposable graphs

Probabilistic significance of decomposability

If the distribution of a random vector X has a decomposable conditional independence graph, then it has a remarkable representation in terms of (often low-dimensional) marginals:

$$p(X) = \frac{\prod_{i=1}^{c} p(X_{C_i})}{\prod_{i=2}^{c} p(X_{S_i})}$$

This is the ultimate generalisation of the fact that for an ordinary Markov chain

$$p(X) = p(X_0) \prod_{i=1}^{N} p(X_i \mid X_{i-1}) = \frac{\prod_{i=1}^{N} p(X_{\{i-1,i\}})}{\prod_{i=2}^{N} p(X_{i-1})}$$

For a general decomposable graph, the same kind of factorisation follows the edges of the junction tree.
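The Markov chain identity can be checked numerically. The sketch below is our own illustration with made-up numbers: a binary chain X_0, ..., X_3 with an arbitrary initial distribution `p0` and transition matrix `trans`. It compares the joint probability with the ratio of clique marginals to separator marginals for every configuration.

```python
from itertools import product

# Hypothetical binary Markov chain X_0, ..., X_N (numbers chosen arbitrarily).
N = 3
p0 = [0.6, 0.4]                       # p(X_0)
trans = [[0.7, 0.3], [0.2, 0.8]]      # trans[a][b] = p(X_i = b | X_{i-1} = a)
states = list(product([0, 1], repeat=N + 1))


def joint(x):
    """p(X) = p(X_0) * prod_i p(X_i | X_{i-1})."""
    p = p0[x[0]]
    for a, b in zip(x, x[1:]):
        p *= trans[a][b]
    return p


def marginal(indices, values):
    """Marginal probability that X takes the given values at the given indices."""
    return sum(joint(x) for x in states
               if all(x[i] == v for i, v in zip(indices, values)))


def clique_separator_form(x):
    """Product of clique marginals p(X_{i-1}, X_i) over separator marginals p(X_{i-1})."""
    numerator = 1.0
    for i in range(1, N + 1):
        numerator *= marginal((i - 1, i), (x[i - 1], x[i]))
    denominator = 1.0
    for i in range(2, N + 1):
        denominator *= marginal((i - 1,), (x[i - 1],))
    return numerator / denominator


max_gap = max(abs(joint(x) - clique_separator_form(x)) for x in states)
print("largest discrepancy:", max_gap)
```

The cliques here are the pairs {i-1, i} and the separators are the single interior variables X_1, ..., X_{N-1}, so the discrepancy should be zero up to floating-point rounding.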

slide-9
SLIDE 9

Decomposable graphs

Computational significance of decomposability

There are many consequences for computing with distributions on decomposable graphs, including junction tree algorithms (message passing/probability propagation) for Bayes nets (discrete graphical models).


slide-10
SLIDE 10

Decomposable graphs

Message passing

[Figure: the chain A – B – C and its junction tree with cliques AB and BC and separator B. Initial potentials: AB holds .2, .4 (row B=0) and .1, .3 (row B=1), in columns A=1, A=0; the separator B holds 1; BC holds 1/3, 2/3 (row B=0) and 1/4, 3/4 (row B=1), in columns C=1, C=0. The message from AB carries the B-marginals .6 (B=0) and .4 (B=1), so the rows of BC are multiplied by .6/1 and .4/1.]

slide-11
SLIDE 11

Decomposable graphs

Message passing

[Figure: the same junction tree after the message from AB has been absorbed. The separator B now holds .6 (B=0) and .4 (B=1), and BC holds .2, .4 (row B=0) and .1, .3 (row B=1), in columns C=1, C=0.]
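A single message from clique AB to clique BC can be sketched in a few lines. The numbers below are our reading of the slide's tables (AB holding p(A, B), BC holding p(C | B), separator initialised to 1), and the function name `pass_message` and the dictionary layout are hypothetical choices made for this illustration.

```python
# Potentials keyed by (b, other), following our reading of the slide's tables.
phi_AB = {(0, 1): 0.2, (0, 0): 0.4, (1, 1): 0.1, (1, 0): 0.3}      # (b, a): p(A, B)
phi_B = {0: 1.0, 1: 1.0}                                           # separator
phi_BC = {(0, 1): 1 / 3, (0, 0): 2 / 3, (1, 1): 1 / 4, (1, 0): 3 / 4}  # (b, c): p(C | B)


def pass_message(source, separator, target):
    """Send a message from `source` to `target` across their separator B.

    The new separator potential is the marginal of the source over B;
    the target is multiplied by (new separator) / (old separator).
    """
    new_separator = {b: sum(value for (bb, _), value in source.items() if bb == b)
                     for b in separator}
    new_target = {(b, x): value * new_separator[b] / separator[b]
                  for (b, x), value in target.items()}
    return new_separator, new_target


phi_B, phi_BC = pass_message(phi_AB, phi_B, phi_BC)
print("separator B:", phi_B)
print("clique BC:  ", phi_BC)
```

After this one message BC holds the joint marginal p(B, C). Passing a message back from BC to AB would leave AB unchanged here, because AB already held p(A, B).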

slide-12
SLIDE 12

Decomposable graphs

Scheduling the messages

[Figure: junction trees with the root clique marked.]


slide-14
SLIDE 14

Decomposable graphs

Statistical significance of decomposability

Maximum likelihood estimates can be computed exactly for contingency tables and multivariate Gaussian distributions on decomposable graphs, and there are exact tests for conditional independence. Some of this theory extends to mixed data models based on CG distributions.

In Bayesian modelling, the ideas of hyper Markov modelling allow the construction of prior distributions respecting the graphical structure, which in turn supports the adoption of priors that are guaranteed to be consistent across models.

The clique–separator factorisation yields dramatic speed-ups in computing MCMC updates in structural learning, and in simulation and posterior analysis of fitted models.

slide-15
SLIDE 15

Decomposable graphs

How restrictive is decomposability?

How many graphs are decomposable? There are $2^{\binom{v}{2}}$ graphs altogether on v vertices.

For v ≤ 3 vertices, all are decomposable; for 4 vertices, 61/64; for 6, ≈ 80%; for 16, ≈ 45%.

The 3 non-decomposable 4-vertex graphs:

[Figure: the three graphs on 4 labelled vertices that are not decomposable, each a chordless 4-cycle.]
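For very small v the proportion can be checked by brute force. The sketch below is our own illustration, not the authors' code: it enumerates every labelled graph and tests decomposability by repeatedly deleting a simplicial vertex (one whose remaining neighbours are all mutually adjacent), which succeeds exactly when the graph has no chordless cycle of length 4 or more. The function names are hypothetical, and the approach is only practical for a handful of vertices.

```python
from itertools import combinations


def is_decomposable(n, edges):
    """True if the graph on vertices 0..n-1 is chordal (decomposable)."""
    adj = {v: set() for v in range(n)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    remaining = set(range(n))
    while remaining:
        # A simplicial vertex: its remaining neighbours form a complete subgraph.
        simplicial = next(
            (v for v in remaining
             if all(b in adj[a] for a, b in combinations(adj[v] & remaining, 2))),
            None)
        if simplicial is None:
            return False  # stuck: some chordless cycle remains
        remaining.remove(simplicial)
    return True


def count_decomposable(n):
    """Count decomposable graphs among all labelled graphs on n vertices."""
    pairs = list(combinations(range(n), 2))
    total = 0
    for mask in range(2 ** len(pairs)):
        edges = [p for k, p in enumerate(pairs) if (mask >> k) & 1]
        total += is_decomposable(n, edges)
    return total


for v in (3, 4):
    print(v, "vertices:", count_decomposable(v), "of", 2 ** (v * (v - 1) // 2))
```

For v = 4 this reproduces the count 61 of 64 quoted above; the three exceptions are the three labelled 4-cycles.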

slide-16
SLIDE 16

Bayesian model determination

Bayesian graphical model determination

Given n i.i.d. samples $X = (X_1, X_2, \ldots, X_n)$ from a multivariate distribution on $\mathbb{R}^v$ parameterised by the graph G and parameters ψ, a typical formulation takes the form

$$p(G, \psi, X) = p(G)\, p(\psi \mid G)\, p(X \mid G, \psi)$$

and we perform joint structural/quantitative learning by computing the posterior $p(G, \psi \mid X) \propto p(G, \psi, X)$.

Decomposable G: see Giudici & Green (1999) (Gaussian case) and Giudici, Green & Tarantola (2000) (contingency table case). These follow the important work of Dawid & Lauritzen (1993) on hyper-Markov laws that encode parameter priors p(ψ|G) that are consistent across G.


slide-18
SLIDE 18

Bayesian model determination

Bayesian graphical model determination

General G: earlier and later work, by Dellaportas & Forster and others, but these use non-hierarchical, not necessarily consistent formulations. See also Jones et al, Stat. Sci., 2005.

slide-19
SLIDE 19

Bayesian model determination

Bayesian graphical model determination

The Giudici & Green work on decomposable graphical Gaussian model determination considers the joint posterior p(G, ψ|X).

In the Gaussian case $X \sim N_v(\mu, \Sigma)$, the graph G is encoded in the pattern of zeroes in the concentration (inverse variance) matrix:

$$(\Sigma^{-1})_{ij} = 0 \iff X_i \perp\!\!\!\perp X_j \mid X_{V \setminus \{i,j\}}$$

The model places a hyper inverse Wishart prior on $\Sigma^{-1}$, in various versions, and exploits ideas of covariance selection and positive definite matrix completion.

slide-20
SLIDE 20

Bayesian model determination

Bayesian graphical model determination

In MCMC sampling using single-edge moves, a junction tree representation of the current G permits both

• cheap pre-testing that the proposed new graph G′ is decomposable

• fast local updating of the graph from G to G′ when the move passes the Metropolis–Hastings acceptance test

slide-21
SLIDE 21

Bayesian model determination

Pre-tests for maintaining decomposability

Frydenberg & Lauritzen: Let G and G′ be decomposable graphs on the same vertex set, with G′ formed from G by the addition of exactly one edge. Then this edge must be contained in exactly one clique of G′.

Giudici & Green: If a and b are non-adjacent vertices in a decomposable graph G, then the graph G′ formed from G by connecting (a, b) is decomposable if and only if either

1. a and b are in different connected components, or

2. they are in the same component of G, and there exist cliques {a} ∪ R and {b} ∪ T, for which S = R ∩ T is a separator on the path between {a} ∪ R and {b} ∪ T in a junction forest representing G.
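The Giudici & Green condition can be written as a short pre-test on a junction forest. This is a sketch under our own assumptions rather than the authors' implementation: cliques are frozensets, the forest is a list of index pairs, and the names `tree_path` and `can_add_edge` are hypothetical. The example uses a graph on vertices 1 to 7 with cliques 267, 236, 3456 and 12.

```python
from collections import deque


def tree_path(n_nodes, tree_edges, start, goal):
    """Edges on the path from start to goal in a forest, or None if disconnected."""
    neighbours = {k: [] for k in range(n_nodes)}
    for u, v in tree_edges:
        neighbours[u].append(v)
        neighbours[v].append(u)
    previous = {start: None}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in neighbours[u]:
            if w not in previous:
                previous[w] = u
                queue.append(w)
    if goal not in previous:
        return None
    path, node = [], goal
    while previous[node] is not None:
        path.append((previous[node], node))
        node = previous[node]
    return path


def can_add_edge(cliques, tree_edges, a, b):
    """Pre-test: does adding edge (a, b), with a and b non-adjacent, keep G decomposable?"""
    for i, clique_a in enumerate(cliques):
        if a not in clique_a:
            continue
        for j, clique_b in enumerate(cliques):
            if b not in clique_b:
                continue
            path = tree_path(len(cliques), tree_edges, i, j)
            if path is None:
                return True  # a and b lie in different connected components
            s = (clique_a - {a}) & (clique_b - {b})  # S = R ∩ T
            if any(cliques[u] & cliques[v] == s for u, v in path):
                return True  # S is a separator on the path
    return False


cliques = [frozenset({2, 6, 7}), frozenset({2, 3, 6}),
           frozenset({3, 4, 5, 6}), frozenset({1, 2})]
tree_edges = [(0, 1), (1, 2), (1, 3)]  # one junction tree for this graph
print("add (1,7):", can_add_edge(cliques, tree_edges, 1, 7))
print("add (1,4):", can_add_edge(cliques, tree_edges, 1, 4))
```

The sketch searches all pairs of cliques and recomputes the path each time, which is simple but wasteful; a practical sampler would work locally on the tree.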


slide-23
SLIDE 23

Bayesian model determination

Single edge move

You can add edge (1,7) since {1} ∪ R and {7} ∪ T are cliques (with R = {2} and T = {2,6}) and R ∩ T = {2} is a separator on the path between them.

[Figure: the graph on vertices 1 to 7 and a junction tree with cliques 267, 236, 3456 and 12 and separators 26, 36 and 2.]

slide-24
SLIDE 24

Bayesian model determination

Single edge move

You cannot add edge (1,4) since the only cliques containing 1 and 4 respectively are {1,2} and {3,4,5,6}, and {2} ∩ {3,5,6} is not a separator on the path between them.

[Figure: the graph on vertices 1 to 7 and a junction tree with cliques 267, 236, 3456 and 12 and separators 26, 36 and 2.]

slide-25
SLIDE 25

Bayesian model determination

Single edge move

Once the test is complete, actually committing to adding or deleting the edge is little work.

[Figure: the graph on vertices 1 to 7 and a junction tree with cliques 267, 236, 3456 and 12 and separators 26, 36 and 2.]

slide-26
SLIDE 26

Bayesian model determination

Single edge move

Once the test is complete, actually committing to adding or deleting the edge is little work.

It makes only a (relatively) local change to the junction tree.

[Figure: the junction tree before and after a single-edge move: clique 12 with separator 2 becomes clique 127 with separator 27; cliques 267, 236 and 3456 and separators 26 and 36 are unchanged.]

slide-27
SLIDE 27

Bayesian model determination

Single edge move

[Figure: a further step of the same example, showing a junction tree with cliques 127, 267, 236 and 3456.]

slide-28
SLIDE 28

Bayesian model determination

Single edge move

[Figure: a further step of the same example, showing a junction tree with cliques 127, 267, 236, 356 and 345 and separators 27, 26, 36 and 35; the clique 3456 has been replaced by 356 and 345.]

slide-29
SLIDE 29

Bayesian model determination

Bayesian graphical model determination

In MCMC sampling using single-edge moves, a junction tree representation of the current G permits both

• cheap pre-testing that the proposed new graph G′ is decomposable

• fast local updating of the graph from G to G′ when the move passes the Metropolis–Hastings acceptance test

However, the current junction tree may need to be manipulated to a different tree representing the current graph G before the move can be completed.

slide-30
SLIDE 30

Bayesian model determination Sampling junction trees

Using the junction tree as the state

Can we by-pass this manipulation by using directly the junction tree J as part of the model parameterisation, in place of the graph G?

This means augmenting the model so that, conditional on G, the junction tree J is a priori drawn uniformly from among all equivalent junction trees, thus replacing the prior p(G) on decomposable graphs by

$$p(J) = \frac{p(G(J))}{\mu(G(J))}$$

where G(J) is the decomposable graph determined by J and µ(G) is the number of equivalent junction trees representing G.

slide-31
SLIDE 31

Bayesian model determination Sampling junction trees

Using the junction tree as the state

Trade-off between

• faster, more restrictive choice of proposed vertex pairs (x, y) specifying edges to be added, and avoidance of the manipulation from one junction tree to another, and

• the space of possible (junction tree) states of the chain being less connected

slide-32
SLIDE 32

Bayesian model determination Sampling junction trees

Using the junction tree as the state

Two decomposable graphs differing in only one edge.

slide-33
SLIDE 33

Bayesian model determination Sampling junction trees

Using the junction tree as the state

Whether they are adjacent in the junction tree representation depends on the choice of junction tree.

[Figure: three junction trees, labelled (a1), (a2) and (b): two with cliques 12, 267, 236 and 3456 and separators 2, 26 and 36, and one with cliques 127, 267, 236 and 3456 and separators 27, 26 and 36.]

slide-34
SLIDE 34

Bayesian model determination Sampling junction trees

Using the junction tree as the state

Paraphrasing the conditions for maintaining decomposability:

(C) Connecting x and y by adding an edge (x, y) to G will result in a decomposable graph if and only if x and y are contained in cliques that are adjacent in some junction tree of G.

(D) Disconnecting x and y by removing an edge (x, y) from G will result in a decomposable graph if and only if x and y are contained in exactly one clique.

Our new approach means that we only have to look at the current junction tree in (C).
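Condition (D) needs only the list of cliques. The sketch below is our own illustration with hypothetical function names: it applies the test to every edge of a small example graph (cliques 267, 236, 3456 and 12) and compares the answer with a direct chordality check of the graph with that edge removed.

```python
from itertools import combinations


def can_remove_edge(cliques, x, y):
    """Condition (D): the edge (x, y) is contained in exactly one clique."""
    return sum(1 for c in cliques if x in c and y in c) == 1


def is_decomposable(vertices, edges):
    """Chordality check by repeatedly deleting a simplicial vertex."""
    adj = {v: set() for v in vertices}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    remaining = set(vertices)
    while remaining:
        simplicial = next(
            (v for v in remaining
             if all(b in adj[a] for a, b in combinations(adj[v] & remaining, 2))),
            None)
        if simplicial is None:
            return False
        remaining.remove(simplicial)
    return True


cliques = [frozenset({2, 6, 7}), frozenset({2, 3, 6}),
           frozenset({3, 4, 5, 6}), frozenset({1, 2})]
vertices = set().union(*cliques)
edges = {frozenset(pair) for c in cliques for pair in combinations(sorted(c), 2)}

# For each edge: (answer from condition (D), answer from brute-force check).
results = {}
for edge in edges:
    x, y = sorted(edge)
    rest = [tuple(other) for other in edges if other != edge]
    results[(x, y)] = (can_remove_edge(cliques, x, y),
                       is_decomposable(vertices, rest))

for edge in sorted(results):
    print(edge, results[edge])
```

In this example the edges (2,6) and (3,6) each lie in two cliques, so removing either one leaves a chordless 4-cycle; every other edge can be removed.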


slide-36
SLIDE 36

Bayesian model determination Sampling junction trees

Multiple-edge perturbations

We can make bigger perturbations, without losing the ability to pre-test for maintaining decomposability and make local updates.

We say two disjoint non-empty connected sets of vertices X and Y are completely connected if every vertex in X is connected to every vertex in Y. They are completely disconnected if no vertices in X are connected to any vertices in Y.

(C) If X and Y are completely disconnected and subsets of cliques that are adjacent in some junction tree, then X and Y can be completely connected, resulting in a new decomposable graph.

(D) If X and Y are completely connected, and subsets of exactly one clique, and some other stuff too complicated to fit in here but all checkable locally, then X and Y can be completely disconnected, resulting in a new decomposable graph.

slide-37
SLIDE 37

Bayesian model determination Sampling junction trees

Manipulations to junction tree on connecting or disconnecting X and Y.

[Figure: four cases, (a) to (d), of the junction tree before and after the move, involving cliques of the forms XS, YS, XSP, YSQ and XYS and the separators between them.]

slide-38
SLIDE 38

Bayesian model determination Sampling junction trees

Manipulations to junction tree on connecting or disconnecting X and Y.

[Figure: case (a): cliques XS and YS joined by separator S on one side, and the single clique XYS on the other.]

slide-39
SLIDE 39

Bayesian model determination Enumerating junction trees

Enumerating junction trees

To use this, we need to know the number of equivalent junction trees for the graph G. We do! It is

$$\mu(G) = \prod_{i=1}^{s} \nu(S_{[i]})$$

where $S_{[i]}$, i = 1, 2, . . . , s are the distinct separators, and

$$\nu(S) = t_S^{\,m_S - 1} \prod_{j=1}^{m_S + 1} f_j.$$

Here $t_S$ is the number of nodes in $T_S$, the subtree of J induced by the cliques containing S, $m_S$ is the multiplicity of separator S, and $f_j$, j = 1, 2, . . . , $m_S$ + 1 are the sizes of the components of the forest $F_S$ obtained from $T_S$ by deleting links associated with S.
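The counting formula can be evaluated directly from any one junction tree. The following is a minimal sketch under our own assumptions, not the authors' code: cliques are frozensets, the tree is a list of index pairs, components that are disconnected in G are assumed to be linked in J by empty separators, and the name `count_junction_trees` is hypothetical.

```python
from collections import defaultdict
from math import prod


def count_junction_trees(cliques, tree_edges):
    """Evaluate mu(G) as the product over distinct separators S of nu(S)."""
    separator_of = {(u, v): cliques[u] & cliques[v] for u, v in tree_edges}
    multiplicity = defaultdict(int)          # m_S for each distinct separator
    for s in separator_of.values():
        multiplicity[s] += 1

    total = 1
    for s, m_s in multiplicity.items():
        # Nodes of T_S: the cliques containing S.
        nodes = [k for k, c in enumerate(cliques) if s.issubset(c)]
        t_s = len(nodes)
        # Forest F_S: keep links of T_S whose separator is not S (union-find).
        parent = {k: k for k in nodes}

        def find(k):
            while parent[k] != k:
                parent[k] = parent[parent[k]]
                k = parent[k]
            return k

        for (u, v), sep in separator_of.items():
            if u in parent and v in parent and sep != s:
                parent[find(u)] = find(v)
        sizes = defaultdict(int)             # f_j: component sizes of F_S
        for k in nodes:
            sizes[find(k)] += 1
        total *= t_s ** (m_s - 1) * prod(sizes.values())
    return total


# Example graph with cliques 267, 236, 3456, 12.
cliques = [frozenset({2, 6, 7}), frozenset({2, 3, 6}),
           frozenset({3, 4, 5, 6}), frozenset({1, 2})]
print(count_junction_trees(cliques, [(0, 1), (1, 2), (1, 3)]))

# Graph on 7 vertices with no edges: seven singleton cliques, empty separators.
singletons = [frozenset({k}) for k in range(7)]
print(count_junction_trees(singletons, [(k, k + 1) for k in range(6)]))
```

For the first example only the separator {2} contributes a factor other than 1, since clique 12 can hang off either 267 or 236. For the empty graph the formula reduces to 7^5, the number of labelled trees on 7 nodes.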

slide-40
SLIDE 40

Bayesian model determination Enumerating junction trees

Example of enumerating junction trees

A decomposable graph G containing 23 vertices in 4 disjoint components.


slide-41
SLIDE 41

Bayesian model determination Enumerating junction trees

Example of enumerating junction trees

One possible junction tree J for the graph shown before.


slide-42
SLIDE 42

Bayesian model determination Enumerating junction trees

Example of enumerating junction trees

$T_{\{3\}}$, the connected subtree of the junction graph J induced by the cliques that contain the separator {3}.

slide-43
SLIDE 43

Bayesian model determination Enumerating junction trees

Example of enumerating junction trees

$F_{\{3\}}$, the forest obtained from the tree $T_{\{3\}}$ by deleting edges associated with the separator {3}.

slide-44
SLIDE 44

Bayesian model determination Enumerating junction trees

Example of enumerating junction trees

The decomposable graph G can be represented by 57,802,752 different junction trees!


slide-45
SLIDE 45

Examples

Demo


slide-46
SLIDE 46

Examples

All decomposable graphs on 7 vertices

We iterated through all 2,097,152 undirected graphs on 7 labelled vertices and identified the 617,675 decomposable ones. A list of the cliques of each decomposable graph was found and used as an index into a table of counters. The decomposable graphs were sorted from those with most representations (16,807 for the trivial graph) to least (187,447 have a single junction tree). To test the uniformity of sampling with the new sampler, we used it to sample both uniformly on decomposable graphs, and uniformly on junction trees: 1,000,000 graphs sampled in each case.


slide-47
SLIDE 47

Examples

All decomposable graphs on 7 vertices

Comparing theoretical and empirical distributions over graphs, when sampling uniformly (a) on trees, (b) on graphs

[Figure: cumulative frequency (0 to 1) against index of decomposable graph (0 to about 6 × 10^5), with curves for cases (a) and (b).]

slide-48
SLIDE 48

Examples

A graphical Gaussian intra-class model

Given a decomposable graph G on v vertices labelled 1, 2, . . . , v, and real scalar parameters σ² > 0 and ρ, we define a non-negative definite matrix $V = V_G(\sigma^2, \rho)$ by

$$V_{ij} = \begin{cases} \sigma^2 & \text{if } i = j \\ \rho\sigma^2 & \text{if } (i, j) \text{ is an edge in } G, \end{cases}$$

and $(V^{-1})_{ij} = 0$ if (i, j) is not an edge in G.

By Grone et al (1984), since G is decomposable and V restricted to each clique is positive definite, V exists and is unique, in fact the unique completion of the specified entries that is positive definite; it is the variance matrix of a v-variate Gaussian distribution for which G is the conditional independence graph.

We call this the graphical Gaussian intra-class model (GGIM).
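As a small worked case (our own illustration, not from the talk), take the path graph 1 – 2 – 3 with cliques {1,2} and {2,3} and separator {2}, and assume |ρ| < 1 so that V restricted to each clique is positive definite. The diagonal and the entries for the two edges are specified; the only free entry is $V_{13}$. The constraint $(V^{-1})_{13} = 0$ says that $X_1$ and $X_3$ are conditionally independent given $X_2$, which for a Gaussian vector fixes the missing covariance:

```latex
V_{13} = V_{12}\, V_{22}^{-1}\, V_{23}
       = \rho\sigma^2 \cdot \frac{1}{\sigma^2} \cdot \rho\sigma^2
       = \rho^2 \sigma^2,
\qquad
V = \sigma^2
\begin{pmatrix}
1      & \rho & \rho^2 \\
\rho   & 1    & \rho   \\
\rho^2 & \rho & 1
\end{pmatrix}.
```

So for this graph the completion is the familiar first-order autoregressive covariance pattern.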

slide-49
SLIDE 49

Examples

A 50-vertex graphical Gaussian intra-class model

We simulated 1000 GGIM observations on 50 variables with σ² = 30 and ρ = 0.2. We used a second order Markov chain graphical structure, that is, $(V^{-1})_{ij} = 0$ for all i and j such that |i − j| > 2.

In each case we started from the initial conditions of σ² = 1, ρ = 0 and G set to have no edges, indicating complete independence between the 50 variables.

We made 1,000,000 Metropolis–Hastings updates with each sampler and output values indicating the state of the chain after every 100 iterations. The parameters σ² and ρ were updated after each 1,000 Metropolis–Hastings steps. For the junction tree samplers we also randomized the junction tree after every 1,000 Metropolis–Hastings steps.
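The true graph in this experiment has a simple banded structure. The sketch below is our own illustration (the function name `band_graph` and its `bandwidth` parameter are hypothetical): it lists the edges, cliques and separators of the second order Markov chain graph on v vertices, assuming v is larger than the bandwidth.

```python
def band_graph(v, bandwidth=2):
    """Edges, cliques and separators of the graph with i ~ j iff 0 < |i - j| <= bandwidth.

    Vertices are labelled 1..v; assumes v > bandwidth.
    """
    edges = [(i, j)
             for i in range(1, v + 1)
             for j in range(i + 1, min(i + bandwidth, v) + 1)]
    # Cliques are runs of bandwidth + 1 consecutive vertices.
    cliques = [frozenset(range(i, i + bandwidth + 1))
               for i in range(1, v - bandwidth + 1)]
    # Consecutive cliques overlap in `bandwidth` vertices: the separators.
    separators = [cliques[k] & cliques[k + 1] for k in range(len(cliques) - 1)]
    return edges, cliques, separators


edges, cliques, separators = band_graph(50)
print(len(edges), "edges,", len(cliques), "cliques,", len(separators), "separators")
print("first clique:", sorted(cliques[0]), "first separator:", sorted(separators[0]))
```

Listing the cliques in this order gives a perfect ordering, so the junction tree is itself a chain of three-vertex cliques linked by two-vertex separators.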

slide-50
SLIDE 50

Examples

A 50-vertex graphical Gaussian intra-class model

Log likelihoods and parameter estimates for three samplers for the GGIM model, plotted by sample number. The values of the parameters used to generate the data are shown by the red horizontal lines.

[Figure: nine trace plots against sample number (0 to about 10^6): log likelihood, sampled variance and sampled correlation, each shown for the junction tree sampler, the multi pair junction tree sampler and the Giudici Green sampler.]

slide-51
SLIDE 51

Examples

A 50-vertex graphical Gaussian intra-class model

Cumulative acceptance rates and times taken by the three samplers for the GGIM model. In each case the curve (a) is the single edge junction tree sampler, (b) is the multi edge junction tree sampler, and (c) is the Giudici–Green sampler.

[Figure: two panels against sample number (0 to 10^6), each with curves (a), (b) and (c): cumulative acceptance rate in %, and cumulative time taken in seconds.]

slide-52
SLIDE 52

Examples

A 50-vertex graphical Gaussian intra-class model

A graph typical of the type sampled early in their runs by all three samplers for the GGIM model. The edge between variables 1 and 39 is spurious, and has to be removed before the correct edges near variables 25 and 26 can be added.

[Figure: a sampled graph on the 50 variables, with numbered vertices.]

slide-53
SLIDE 53

Examples

A 1000-vertex graphical Gaussian intra-class model


slide-56
SLIDE 56

Examples

A 1000-vertex graphical Gaussian intra-class model

[Figure: two panels, Variance (vertical axis 28.5 to 31.5) and Correlation (vertical axis 0.00 to 0.20), each with horizontal axis from 5 to 20.]

slide-57
SLIDE 57

Examples

“Enumerating the decomposable neighbours of a decomposable graph under a simple perturbation scheme”, CSDA, 2009, by Thomas and Green

“Enumerating the junction trees of a decomposable graph”, JCGS, 2009, by Thomas and Green

“Sampling decomposable graphs using a Markov chain on junction trees”, submitted, 2011, by Green and Thomas

Webpage: www.stats.bris.ac.uk/~peter/

Email: P.J.Green@bristol.ac.uk