SLIDE 1

Statistical Inference in Gaussian Graphical Models

Y. Baraud (1), C. Giraud (1,2), S. Huet (2), N. Verzelen (3)

(1) Université de Nice, (2) INRA Jouy-en-Josas, (3) Université Paris Sud

Vienna 2008.

Christophe GIRAUD Statistical Inference of Gaussian Graphs

SLIDE 2

Gene-gene regulation network of E. coli

SLIDE 3

Protein-protein network of S. cerevisiae

1458 proteins (vertices) and their 1948 known interactions (edges)

SLIDE 4

Inferring gene regulation networks

Data: massive transcriptomic data sets produced by microarrays.

Differential analysis of data obtained in different conditions: with or without deletion of a gene, with or without stress, etc.

Analysis of the conditional dependences in the data (exploits the whole data set).

SLIDE 5

A few statistical tools

Descriptive tools:

  • Kernel methods (supervised learning)

Model-based tools:

  • Bayesian Networks
  • Gaussian Graphical Models

SLIDE 6

Gaussian Graphical Models

SLIDE 7

Gaussian Graphical Models

Statistical model: the transcription levels $(X^{(1)}, \ldots, X^{(p)})$ of the $p$ genes are modeled by a Gaussian law in $\mathbb{R}^p$.

Graph of the conditional dependences: graph $g$ with an edge $i \overset{g}{\sim} j$ between the genes $i$ and $j$ iff $X^{(i)}$ and $X^{(j)}$ are not independent given $\{X^{(k)},\ k \neq i, j\}$.

regulation network ←→ graph $g$

SLIDE 8

The task of the statistician

Goal: estimate $g$ from a sample $X_1, \ldots, X_n$.

Main difficulty: $n \ll p$

  • $p \approx$ a few hundred to a few thousand genes
  • $n \approx$ a few tens

New algorithms: based on thresholding or regularization

  → many of them have quite disappointing numerical performances (Villers et al. 2008)
  → no theoretical results, or results only in an asymptotic framework (with strong hypotheses on the covariance)

SLIDE 9

Estimation by model selection

SLIDE 10

Partial correlations

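Hypothesis: $(X^{(1)}, \ldots, X^{(p)}) \sim \mathcal{N}(0, C)$ in $\mathbb{R}^p$, with $C \succ 0$.

Notation: we write $\theta = \big[\theta^{(j)}_k\big]$ for the $p \times p$ matrix such that $\theta^{(j)}_j = 0$ and

$$E\big[X^{(j)} \mid X^{(k)},\ k \neq j\big] = \sum_{k \neq j} \theta^{(j)}_k X^{(k)}.$$

Skeleton of $\theta$: we have

$$\theta^{(j)}_i = \frac{\mathrm{Cov}\big(X^{(i)}, X^{(j)} \mid X^{(k)},\ k \neq i, j\big)}{\mathrm{Var}\big(X^{(j)} \mid X^{(k)},\ k \neq j\big)}$$

so $\theta^{(j)}_i \neq 0 \iff i \overset{g}{\sim} j$.

Goal: estimate $\theta$ from a sample $X_1, \ldots, X_n$ with quality criterion

$$\mathrm{MSEP}(\hat\theta) = E\big[\|C^{1/2}(\hat\theta - \theta)\|^2_{p \times p}\big] = E\big[\|X_{\mathrm{new}}^T(\hat\theta - \theta)\|^2_{1 \times p}\big].$$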
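To make the link between $\theta$ and the graph concrete, here is a minimal Python sketch that computes $\theta$ from a known covariance matrix and reads the edges off its non-zero entries. It relies on the standard identity $\theta^{(j)}_i = -K_{ij}/K_{jj}$ with $K = C^{-1}$ for the regression coefficients of a Gaussian vector, which is not stated on the slide. The function names (`invert`, `partial_regression_matrix`, `edges`) and the three-variable chain example are assumptions made for illustration only.

```python
def invert(M):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    # Augment M with the identity matrix.
    A = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        pv = A[col][col]
        A[col] = [v / pv for v in A[col]]
        for r in range(n):
            if r != col:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [row[n:] for row in A]


def partial_regression_matrix(C):
    """Return theta with theta[i][j] = theta^{(j)}_i = -K[i][j] / K[j][j], K = C^{-1}."""
    K = invert(C)
    p = len(C)
    return [[0.0 if i == j else -K[i][j] / K[j][j] for j in range(p)]
            for i in range(p)]


def edges(theta, tol=1e-9):
    """Edges i ~ j of the graph: pairs with a non-zero entry in theta."""
    p = len(theta)
    return {(i, j) for i in range(p) for j in range(i + 1, p)
            if abs(theta[i][j]) > tol}


# Hypothetical example: a chain 0 - 1 - 2, specified through its precision matrix.
K_true = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]
C = invert(K_true)                  # covariance of the chain
theta = partial_regression_matrix(C)
chain_edges = edges(theta)          # variables 0 and 2 are not linked
```

In this example variables 0 and 2 are marginally correlated (the corresponding entry of `C` is non-zero) but conditionally independent given variable 1, which is why the graph is defined through $\theta$ rather than through $C$.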

SLIDE 13

Partial correlations

Hypothesis: (X (1), . . . , X (p)) ∼ N(0, C) in Rp, with C ≻ 0. Notation: We write θ =

  • θ(j)

k

  • for the p × p matrix such that

θ(j)

j

= 0 and E

  • X (j) | X (k), k = j
  • =

k=j θ(j) k X (k).

Skeleton of θ: we have θ(j)

i

=

Cov(X (i),X (j)|X (k), k=i,j) Var(X (j)|X (k), k=j)

so θ(j)

i

= 0 ⇐ ⇒ i

g

∼ j Goal: Estimate θ from a sample X1, . . . , Xn with quality criterion MSEP(ˆ θ) = E

  • C 1/2(ˆ

θ − θ)2

p×p

  • = E
  • X T

new(ˆ

θ − θ)2

1×p

  • Christophe GIRAUD

Statistical Inference of Gaussian Graphs

slide-14
SLIDE 14

Estimation strategy

Estimation procedure

1. Choose a collection $\mathcal{G}$ of candidate graphs, e.g. all the graphs with $p$ vertices and degree $\leq D$.

2. Associate to each graph $g \in \mathcal{G}$ an estimator $\hat\theta_g$:

$$\hat\theta_g = \operatorname*{argmin}_{A \sim g} \|X(I - A)\|^2_{n \times p} \quad \text{(empirical MSEP)}$$

3. Select one $\hat\theta_{\hat g}$ by minimizing a penalized empirical risk, with a criterion inspired by that in Baraud et al.
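Step 2 of the procedure above can be sketched in a few lines of Python. The sketch assumes that $A \sim g$ means that $A$ has a zero diagonal and that its entry $(i, j)$ may be non-zero only when $i$ and $j$ are linked in $g$; under that assumption the minimization splits into one least-squares regression per column. The helper names `solve` and `fit_graph` are hypothetical, and the code assumes the neighbour columns are linearly independent. Step 3 is left out because the slide does not give the penalty.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(map(float, A[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x


def fit_graph(X, graph_edges):
    """Return (theta_hat_g, empirical risk ||X(I - theta_hat_g)||^2) for one graph.

    X is an n x p data matrix (list of rows); graph_edges is a set of pairs (i, j).
    theta[i][j] is the coefficient of variable i in the regression of variable j.
    """
    n, p = len(X), len(X[0])
    cols = [[row[j] for row in X] for j in range(p)]
    theta = [[0.0] * p for _ in range(p)]
    risk = 0.0
    for j in range(p):
        # Neighbours of j in the graph: the only regressors allowed for column j.
        nb = sorted({a if b == j else b for a, b in graph_edges if j in (a, b)})
        coef = []
        if nb:
            # Normal equations of the regression of column j on its neighbours.
            gram = [[sum(u * v for u, v in zip(cols[a], cols[b])) for b in nb]
                    for a in nb]
            rhs = [sum(u * v for u, v in zip(cols[a], cols[j])) for a in nb]
            coef = solve(gram, rhs)
        for a, c in zip(nb, coef):
            theta[a][j] = c
        for t in range(n):
            resid = X[t][j] - sum(c * X[t][a] for a, c in zip(nb, coef))
            risk += resid ** 2
    return theta, risk


# Hypothetical toy data: the third column is exactly twice the first one.
X = [[1, 1, 2], [2, -1, 4], [3, 0, 6], [-1, 2, -2]]
theta_hat, risk_edge = fit_graph(X, {(0, 2)})
_, risk_empty = fit_graph(X, set())
```

On this toy data, adding the edge between variables 0 and 2 removes their contribution to the empirical risk entirely. An unpenalized comparison always favours the larger graph, which is why step 3 adds a penalty.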

SLIDE 18

Theorem: risk bound.

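When $\deg(\mathcal{G}) = \max\{\deg(g),\ g \in \mathcal{G}\}$ fulfills

$$\deg(\mathcal{G}) \leq \rho\, \frac{n}{2\,(1.1 + \sqrt{\log p})^2}, \quad \text{for some } \rho < 1,$$

then the MSEP of $\hat\theta$ is bounded by

$$\mathrm{MSEP}(\hat\theta) \leq c_\rho \log(p) \inf_{g \in \mathcal{G}} \left\{ \mathrm{MSEP}(\hat\theta_g) \vee \frac{\|C^{1/2}(I - \theta)\|^2}{n} \right\} + R_n$$

where $R_n = O\big(\mathrm{Tr}(C)\, e^{-\kappa_\rho n}\big)$.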
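To get a feel for how restrictive the degree condition is, the following Python sketch evaluates the largest admissible integer degree for given $n$, $p$ and $\rho$. The function name `max_admissible_degree` and the default value of `rho` are assumptions made for illustration, and the logarithm is taken to be the natural one.

```python
import math


def max_admissible_degree(n, p, rho=0.99):
    """Largest integer D with D <= rho * n / (2 * (1.1 + sqrt(log p))**2).

    Assumes the natural logarithm and 0 < rho < 1, as in the degree condition.
    """
    if not 0 < rho < 1:
        raise ValueError("rho must lie in (0, 1)")
    bound = rho * n / (2.0 * (1.1 + math.sqrt(math.log(p))) ** 2)
    return math.floor(bound)


# Hypothetical sample sizes and numbers of genes.
degree_n100_p100 = max_admissible_degree(100, 100)
degree_n15_p10 = max_admissible_degree(15, 10)
degree_n15_p100 = max_admissible_degree(15, 100)
```

With these inputs the condition allows degree 4 for $n = p = 100$, degree 1 for $n = 15$, $p = 10$, and no edge at all for $n = 15$, $p = 100$, so the theorem only covers very sparse graphs when $n$ is a few tens.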

SLIDE 23

Theory

Condition on the degree

SLIDE 24

How far can we trust the empirical MSEP?

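Prediction error:

$$\mathrm{MSEP}(\hat\theta) = E\big(\|C^{1/2}(\theta - \hat\theta)\|^2\big) = E\big(\|C^{1/2}(I - \hat\theta)\|^2\big) - \|C^{1/2}(I - \theta)\|^2$$

Proposition: From empirical to population MSEP

Under the previous condition on the degree, we have with large probability

$$(1 - \delta)\, \|C^{1/2}(I - \hat\theta)\|_{p \times p} \leq \frac{1}{\sqrt{n}}\, \|X(I - \hat\theta)\|_{n \times p} \leq (1 + \delta)\, \|C^{1/2}(I - \hat\theta)\|_{p \times p}$$

for all matrices $\hat\theta \in \bigcup_{g \in \mathcal{G}} \Theta_g$.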
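The prediction-error identity at the top of this slide can be checked column by column. The derivation below is a sketch under two assumptions that the slide does not state explicitly: $\hat\theta$ has a zero diagonal, and $X_{\mathrm{new}} \sim \mathcal{N}(0, C)$ is independent of $\hat\theta$. The symbol $\varepsilon_j$ is introduced here for the residual of the conditional expectation.

```latex
% Assumptions: \hat\theta has zero diagonal; X_new ~ N(0, C) is independent of \hat\theta.
\begin{align*}
X_{\mathrm{new}}^{(j)} - \sum_{k \neq j} \hat\theta^{(j)}_k X_{\mathrm{new}}^{(k)}
  &= \underbrace{X_{\mathrm{new}}^{(j)} - \sum_{k \neq j} \theta^{(j)}_k X_{\mathrm{new}}^{(k)}}_{\varepsilon_j}
   + \sum_{k \neq j} \big(\theta^{(j)}_k - \hat\theta^{(j)}_k\big) X_{\mathrm{new}}^{(k)}, \\
% \varepsilon_j is centered and independent of (X_new^{(k)}, k \neq j), so the cross term vanishes:
E\Big[\big(X_{\mathrm{new}}^T (I - \hat\theta)_{\cdot j}\big)^2 \,\Big|\, \hat\theta\Big]
  &= E\big[\varepsilon_j^2\big]
   + E\Big[\big(X_{\mathrm{new}}^T (\theta - \hat\theta)_{\cdot j}\big)^2 \,\Big|\, \hat\theta\Big], \\
% E[(X_new^T a)^2] = a^T C a = \|C^{1/2} a\|^2; summing over j = 1, ..., p:
\|C^{1/2}(I - \hat\theta)\|^2
  &= \|C^{1/2}(I - \theta)\|^2 + \|C^{1/2}(\theta - \hat\theta)\|^2 .
\end{align*}
```

Taking the expectation over $\hat\theta$ and rearranging gives the identity stated on the slide.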

SLIDE 26

Lemma: Restricted Inf / Sup of Random Matrices

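Consider an $n \times p$ matrix $Z$ with $n < p$ and i.i.d. $Z_{i,j} \sim \mathcal{N}(0, 1)$. Consider also a collection $V_1, \ldots, V_N$ of subspaces of $\mathbb{R}^p$ with dimension $d < n$. Then for any $x > 0$

$$P\left( \inf_{v \in V_1 \cup \ldots \cup V_N} \frac{\frac{1}{\sqrt{n}} \|Zv\|}{\|v\|} \leq 1 - \frac{\sqrt{d} + \sqrt{2 \log N} + \delta_N + x}{\sqrt{n}} \right) \leq e^{-x^2/2}$$

where $\delta_N = \dfrac{1}{N \sqrt{8 \log N}}$.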
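The lemma is easier to read with numbers plugged in. The Python sketch below evaluates the threshold inside the probability and the bound $e^{-x^2/2}$ on the right-hand side; the function name `restricted_inf_bound` and the sample values are assumptions made for illustration, and the logarithm is taken to be the natural one.

```python
import math


def restricted_inf_bound(n, d, N, x):
    """Return (threshold, failure_prob) from the lemma.

    threshold    = 1 - (sqrt(d) + sqrt(2 log N) + delta_N + x) / sqrt(n)
    failure_prob = exp(-x^2 / 2), the bound on the probability that the
                   restricted infimum falls below the threshold.
    """
    if N < 2 or not 0 < d < n or x <= 0:
        raise ValueError("need N >= 2, 0 < d < n and x > 0")
    delta_N = 1.0 / (N * math.sqrt(8.0 * math.log(N)))
    threshold = 1.0 - (math.sqrt(d) + math.sqrt(2.0 * math.log(N)) + delta_N + x) / math.sqrt(n)
    failure_prob = math.exp(-x * x / 2.0)
    return threshold, failure_prob


# Hypothetical values: N = 10 subspaces of dimension d = 4 with n = 100 observations.
threshold, failure_prob = restricted_inf_bound(n=100, d=4, N=10, x=1.0)
```

The threshold is informative only when it is positive, which requires $\sqrt{d} + \sqrt{2 \log N}$ to stay below $\sqrt{n}$; this mirrors the condition on the degree.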

SLIDE 27

A geometrical constraint

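When $C = I$, there exists some constant $c(\delta) > 0$ such that for any $n$, $p$, $\mathcal{G}$ fulfilling

$$\deg(\mathcal{G}) \geq c(\delta)\, \frac{n}{1 + \log(p/n)},$$

there exists no $n \times p$ matrix $X$ fulfilling

$$(1 - \delta)\, \|C^{1/2}(I - \hat\theta)\| \leq \frac{1}{\sqrt{n}}\, \|X(I - \hat\theta)\| \leq (1 + \delta)\, \|C^{1/2}(I - \hat\theta)\|$$

for all $\hat\theta \in \bigcup_{g \in \mathcal{G}} \Theta_g$.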

SLIDE 28

In practice

Numerical performance

SLIDE 29

Random graphs, n = 15 and p increases

SLIDE 30

Conclusion

Some nice features:

  • good theoretical properties: non-asymptotic control of the MSEP with no condition on the covariance matrix $C$
  • good numerical performances: even when $n \ll p$

BUT

  • very high numerical complexity: typically $n \times p^{\deg(\mathcal{G})+1}$ ⟹ cannot be used in practice when $p > 50$ ...

Ongoing work: with S. Huet and N. Verzelen

  • Reduction of the size of the collection of graphs, using data-driven collections.
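The complexity figure quoted above grows very quickly with $p$. The short Python sketch below evaluates the order of magnitude $n \times p^{\deg(\mathcal{G})+1}$ with all constants ignored; the function name `model_selection_cost` and the sample values are assumptions made for illustration.

```python
def model_selection_cost(n, p, max_degree):
    """Order of magnitude n * p**(max_degree + 1) of the procedure's cost (constants ignored)."""
    return n * p ** (max_degree + 1)


# Hypothetical settings: n = 15 observations, maximal degree 3.
cost_p50 = model_selection_cost(15, 50, 3)
cost_p200 = model_selection_cost(15, 200, 3)
```

Going from $p = 50$ to $p = 200$ at fixed degree 3 multiplies the cost by $4^4 = 256$, which illustrates why reducing the collection of candidate graphs is the natural next step.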

SLIDE 33

References

Main reference of the talk

  • C. Giraud. Estimation of Gaussian graphs by model selection. Electronic Journal of Statistics, Vol. 2 (2008), pp. 542–563.

Related references

  • Y. Baraud, C. Giraud, S. Huet. Gaussian model selection with unknown variance. To appear in the Annals of Statistics (2008).
  • N. Verzelen. High-dimensional Gaussian model selection on a Gaussian design. Personal communication.
