Probabilistic Graphical Models for Cellular Pathways
Florian Markowetz florian.markowetz@molgen.mpg.de Max Planck Institute for Molecular Genetics Computational Diagnostics Group Berlin, Germany
- IPM workshop
Tehran, 2005 April
Figure from http://array.mbb.yale.edu/yeast/transcription/ Florian Markowetz, Probabilistic Graphical Models for Cellular Pathways, 2005 April 1
High-throughput assays can probe cells at a genome-wide scale. Very prominent: microarrays that measure mRNA transcript quantities. Need to use probabilistic models, which account for
the model.
[Figure: gene expression data matrix; the axis labels are sample IDs and Affymetrix probe set IDs.]
Assumption: Coexpression ∼ coregulation. If genes show the same expression profiles, they follow the same regulatory regimes [7, 25].
An expression profile is a random vector $X = (X_1, \dots, X_p)$.
Correlation graph: Depict genes as vertices of a graph and draw an edge $(i, j)$ iff the correlation coefficient $\rho_{ij} \neq 0$.
Advantage: This representation of the marginal dependence structure is easy to interpret and can be accurately estimated even if $p \gg N$.
Application: Stuart et al. [28] build a graph from coexpression across multiple organisms.
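A minimal sketch of a correlation graph, assuming a fixed cutoff on the absolute sample correlation stands in for a proper test of $\rho_{ij} \neq 0$. The names `pearson`, `correlation_graph`, the cutoff 0.8 and the toy data are illustrative assumptions, not part of the talk.

```python
import math

def pearson(a, b):
    """Sample Pearson correlation of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def correlation_graph(profiles, threshold=0.8):
    """Return edges (i, j), i < j, with |correlation| above the cutoff.

    profiles[i] holds the N measurements of gene i. The cutoff is an
    assumption; a real analysis would test rho_ij != 0 instead.
    """
    p = len(profiles)
    return [(i, j) for i in range(p) for j in range(i + 1, p)
            if abs(pearson(profiles[i], profiles[j])) > threshold]

# Made-up toy data: gene 1 follows gene 0, gene 2 is unrelated to both.
toy = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [1.1, 2.1, 2.9, 4.2, 4.8],
    [2.0, -1.0, 2.0, -1.0, 2.0],
]
edges = correlation_graph(toy)  # only genes 0 and 1 end up connected
```

Only pairwise quantities are needed, which is why this graph can still be estimated when there are far more genes than samples.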
We cannot distinguish direct from indirect dependencies! Three reasons why X, Y, and Z are highly correlated:
[Figure: three graphs on the nodes X, Y, Z; the third contains an additional node H.]
As a cure: search for correlations which cannot be explained by other variables.
Let X, Y, Z be random variables with joint distribution P. X is conditionally independent of Y given Z, written $X \perp\!\!\!\perp Y \mid Z$, if and only if
$$P(X = x, Y = y \mid Z = z) = P(X = x \mid Z = z) \cdot P(Y = y \mid Z = z),$$
or equivalently
$$P(X = x \mid Y = y, Z = z) = P(X = x \mid Z = z).$$
Interpret random variables as abstract pieces of knowledge obtained from, say, reading books [16]. Then $X \perp\!\!\!\perp Y \mid Z$ means:
Knowing Z, reading Y is irrelevant for reading X.
If I already know Z, then Y offers me no new information to understand X.
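Assume X is multivariate Gaussian with covariance matrix $\Sigma$, and let $K = \Sigma^{-1} = (k_{ij})$ be its inverse (the precision matrix). Then it holds for $i, j \in \{1, \dots, p\}$ with $i \neq j$ that
$$X_i \perp\!\!\!\perp X_j \mid X_{\text{rest}} \iff k_{ij} = 0,$$
where $\text{rest} = \{1, \dots, p\} \setminus \{i, j\}$ [16].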
Given a random vector $X = (X_1, \dots, X_p)$. A Gaussian graphical model [16, 6] is an undirected graph on vertex set V, with $|V| = p$. To each vertex $i \in V$ corresponds a random variable $X_i \in X$. Draw an edge between vertices i and j if and only if $k_{ij} \neq 0$.
Note: In correlation graphs we modeled via $\Sigma$, in GGMs we use $K = \Sigma^{-1}$.
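A small sketch of the step from $\Sigma$ to $K$, assuming a known, invertible covariance matrix. The helper names `invert` and `ggm_edges`, the tolerance `tol` and the example matrix are made up for illustration; with estimated covariances $k_{ij}$ is never exactly zero, so the tolerance would be replaced by a test. The edge weights use the standard identity that the partial correlation given the rest equals $-k_{ij} / \sqrt{k_{ii} k_{jj}}$.

```python
import math

def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular (e.g. p > N); cannot invert")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def ggm_edges(sigma, tol=1e-8):
    """Map each edge (i, j) with k_ij != 0 to its partial correlation."""
    K = invert(sigma)
    p = len(K)
    edges = {}
    for i in range(p):
        for j in range(i + 1, p):
            if abs(K[i][j]) > tol:
                edges[(i, j)] = -K[i][j] / math.sqrt(K[i][i] * K[j][j])
    return edges

# Made-up covariance in which X0 and X2 are related only through X1.
sigma = [
    [1.00, 0.50, 0.25],
    [0.50, 1.00, 0.50],
    [0.25, 0.50, 1.00],
]
edges = ggm_edges(sigma)  # contains (0, 1) and (1, 2) but not (0, 2)
```

In this example the correlation graph would connect all three variables, because every entry of $\Sigma$ is nonzero, while the GGM drops the indirect edge between X0 and X2.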
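Missing edges indicate independencies: $X_i \perp\!\!\!\perp X_j \mid X_{\text{rest}}$
$X_1 \perp\!\!\!\perp X_4 \mid \{X_2, X_3\}$
$X_2 \perp\!\!\!\perp X_3 \mid \{X_1, X_4\}$
$X_2 \perp\!\!\!\perp X_4 \mid \{X_1, X_3\}$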
Likelihood:
$$n(x; K) = (2\pi)^{-p/2} \, |K|^{1/2} \exp\left\{ -\tfrac{1}{2} x^T K x \right\}$$
The likelihood ratio test statistic is asymptotically $\chi^2$ distributed [16].
Full conditional relationships can only be accurately estimated if the number of samples N is relatively large compared to the number of variables p. Thus, if $p \gg N$, you can ... either improve your estimators of partial correlations (e.g. Schäfer and Strimmer [23] use the Moore-Penrose pseudoinverse and bootstrap aggregation (bagging) to stabilize the estimator).
Do not condition on the complete rest as in GGMs. Instead explore the dependency of two variables conditioned on a third [30, 31, 17, 5]. Draw an edge between vertices i and j ($i \neq j$) if and only if the correlation coefficient $\rho_{ij} \neq 0$ and no third variable can explain the correlation:
$$X_i \not\perp\!\!\!\perp X_j \mid X_k \quad \text{for all } k \in \text{rest},$$
where again $\text{rest} = \{1, \dots, p\} \setminus \{i, j\}$.
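The rule above can be sketched with first-order partial correlations, assuming Gaussian data so that a zero partial correlation corresponds to conditional independence. The formula for $\rho_{ij \cdot k}$ is the standard one; the function names, the cutoff 0.1 (a stand-in for a statistical test) and the example matrix are assumptions.

```python
import math

def partial_corr(r, i, j, k):
    """First-order partial correlation of i and j given k, from correlation matrix r."""
    num = r[i][j] - r[i][k] * r[j][k]
    den = math.sqrt((1 - r[i][k] ** 2) * (1 - r[j][k] ** 2))
    return num / den

def first_order_graph(r, threshold=0.1):
    """Keep edge (i, j) if rho_ij and every rho_ij.k exceed the cutoff in absolute value."""
    p = len(r)
    edges = []
    for i in range(p):
        for j in range(i + 1, p):
            rest = [k for k in range(p) if k not in (i, j)]
            if abs(r[i][j]) > threshold and all(
                    abs(partial_corr(r, i, j, k)) > threshold for k in rest):
                edges.append((i, j))
    return edges

# Made-up correlation matrix: the correlation of X0 and X2 is explained by X1.
R = [
    [1.00, 0.50, 0.25],
    [0.50, 1.00, 0.50],
    [0.25, 0.50, 1.00],
]
edges = first_order_graph(R)  # the edge (0, 2) is dropped
```

Each test conditions on a single variable, so no matrix inversion is needed and the procedure remains usable when p is much larger than N.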
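We have seen methods to build graphs from
$X_i \perp\!\!\!\perp X_j$,
$X_i \perp\!\!\!\perp X_j \mid X_{\{1,\dots,p\} \setminus \{i,j\}}$,
$X_i \perp\!\!\!\perp X_j \mid X_k \quad \forall k \in \{1, \dots, p\} \setminus \{i, j\}$.
Where does this lead us?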
Draw an edge between vertices i and j if
$$X_i \not\perp\!\!\!\perp X_j \mid X_S \quad \text{for all } S \subseteq \{1, \dots, p\} \setminus \{i, j\}.$$
This includes testing marginal, first order and full conditional independencies.
In the next part we will see:
Given a random vector $X = (X_1, \dots, X_p)$ we can always decompose
$$p(x) = p(x_1, \dots, x_p) = p(x_1, \dots, x_{p-1}) \, p(x_p \mid x_1, \dots, x_{p-1}) = p(x_1) \prod_{v=2}^{p} p(x_v \mid x_1, \dots, x_{v-1})$$
Example: $p(x_1, x_2, x_3) = p(x_1) \, p(x_2 \mid x_1) \, p(x_3 \mid x_1, x_2)$ ⇒ completely connected directed acyclic graph on the nodes 1, 2, 3.
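A Bayesian Network for a random vector X consists of a directed acyclic graph (DAG) with vertex set V and, for each node v, a local probability distribution with parameters $\theta_v$, such that the joint distribution factors as
$$p(x) = \prod_{v \in V} p(x_v \mid x_{pa(v)}, \theta_v),$$
where pa(v) denotes the parents of v in the DAG.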
→ Conditional Gaussian networks
→ Global Directed Markov Property
→ Constraint-based algorithm (and a Bayesian one in Part III)
→ Equivalence of network structures
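The DAG defines families. Relationships are further characterized by local probability distributions:
Example: X and Y have states {0, 1} and are the parents of Z, which has states {0, 1, 2}.
$$p(x) = (0.6,\ 0.4), \qquad p(y) = (0.2,\ 0.8)$$
$$p(z \mid x, y) = \begin{cases} (0.8,\ 0.1,\ 0.1) & \text{if } (X, Y) = (0, 0) \\ (0.1,\ 0.8,\ 0.1) & \text{if } (X, Y) = (0, 1) \\ (0.1,\ 0.8,\ 0.1) & \text{if } (X, Y) = (1, 0) \\ (0.1,\ 0.1,\ 0.8) & \text{if } (X, Y) = (1, 1) \end{cases}$$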
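As an illustration, the local distributions of this example can be multiplied into the joint distribution $p(x, y, z) = p(x) \, p(y) \, p(z \mid x, y)$. The numbers are the ones from the slide; the variable and function names are assumptions made for this sketch.

```python
from itertools import product

# Local probability distributions from the example.
p_x = [0.6, 0.4]
p_y = [0.2, 0.8]
p_z_given_xy = {
    (0, 0): [0.8, 0.1, 0.1],
    (0, 1): [0.1, 0.8, 0.1],
    (1, 0): [0.1, 0.8, 0.1],
    (1, 1): [0.1, 0.1, 0.8],
}

def joint(x, y, z):
    """Joint probability as the product over the families of the DAG."""
    return p_x[x] * p_y[y] * p_z_given_xy[(x, y)][z]

# Marginal of Z: sum the joint over all parent configurations.
p_z = [sum(joint(x, y, z) for x, y in product(range(2), range(2)))
       for z in range(3)]

# The factorization defines a proper distribution: all entries sum to one.
total = sum(joint(x, y, z)
            for x, y, z in product(range(2), range(2), range(3)))
```

Summing out the parents gives the marginal $p(z) = (0.184,\ 0.492,\ 0.324)$.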
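Discrete node with discrete parents
$$X_v \mid x_{pa(v)}, \theta_v \sim \text{Multin}(1, \theta_{v \mid x_{pa(v)}})$$
Parametrization: $\theta_v = \{\theta_{v \mid x_{pa(v)}}\}$ is a set of probability vectors, one for each configuration $x_{pa(v)}$ of the parents.
Density [12]:
$$p(x_v \mid x_{pa(v)}, \theta_v) = \prod_{x'_v} \theta_{x'_v \mid x_{pa(v)}}^{\mathbf{1}(x'_v = x_v)}$$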
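Continuous node with continuous parents
$$X_v \mid x_{pa(v)}, \theta_v \sim N(\mu_v, \sigma_v^2), \quad \text{where } \mu_v = \beta_v^{(0)} + \sum_{i \in pa(v)} \beta_v^{(i)} x_i.$$
Parametrization: $\theta_v = (\beta_v, \sigma_v^2)$ contains a vector of regression coefficients and a variance for node v.
Density:
$$p(x_v \mid x_{pa(v)}, \theta_v) = \frac{1}{\sqrt{2\pi}\,\sigma_v} \exp\left\{ -\frac{(x_v - \mu_v)^2}{2\sigma_v^2} \right\}$$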
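A minimal sketch of this local distribution: the mean is a linear regression on the parent values and the density is the univariate normal density. The function name, the argument names and the numbers in the usage line are hypothetical.

```python
import math

def gaussian_lpd(x_v, parent_values, beta0, betas, sigma2):
    """Density of a continuous node given the values of its continuous parents.

    beta0 is the intercept, betas holds one regression coefficient per
    parent, sigma2 is the variance of the node.
    """
    mu = beta0 + sum(b * x for b, x in zip(betas, parent_values))
    return math.exp(-(x_v - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

# Hypothetical node with two parents; here mu = 1 + 2*0.5 - 1*1.0 = 1.0.
density_at_mean = gaussian_lpd(1.0, [0.5, 1.0], beta0=1.0, betas=[2.0, -1.0], sigma2=1.0)
```

A node without parents is covered by passing empty lists, in which case the mean reduces to the intercept.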
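Continuous node with mixed parents
Calling continuous variables Y and discrete variables I [16], we can write
$$Y_v \mid i_{pa(v)}, y_{pa(v)}, \theta_v \sim N(\mu_{v \mid i_{pa(v)}}, \sigma^2_{v \mid i_{pa(v)}}), \quad \text{where } \mu_{v \mid i_{pa(v)}} = \beta^{(0)}_{i_{pa(v)}} + \sum_{i \in pa(v)} \beta^{(i)}_{i_{pa(v)}} x_i.$$
Parametrization: $\theta_v = (\beta_{v \mid i_{pa(v)}}, \sigma^2_{v \mid i_{pa(v)}})$ contains a vector of regression coefficients and a variance for node v, which depend on the configuration $i_{pa(v)}$ of the discrete parents.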
We can combine the different local probability distributions (LPDs) in the framework of Conditional Gaussian (CG) networks:
The random vector X has a discrete part I and a continuous part Y and the distribution decomposes as p(x) = p(i, y) = p(i) p(y|i). These are the general parametric networks used in statistics [16].
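Chain/linear: X → Y → Z
$X \perp\!\!\!\perp Z \mid Y$ and $X \not\perp\!\!\!\perp Z \mid \emptyset$
$$p(x, z \mid y) = \frac{p(x, y, z)}{p(y)} = \frac{p(x) \, p(y \mid x) \, p(z \mid y)}{p(y)} = p(x \mid y) \, p(z \mid y)$$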
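Fork/diverging: X ← Y → Z
$X \perp\!\!\!\perp Z \mid Y$ and $X \not\perp\!\!\!\perp Z \mid \emptyset$
$$p(x, z \mid y) = \frac{p(x, y, z)}{p(y)} = \frac{p(x \mid y) \, p(y) \, p(z \mid y)}{p(y)} = p(x \mid y) \, p(z \mid y)$$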
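Collider/converging: X → Y ← Z
$X \perp\!\!\!\perp Z \mid \emptyset$ and $X \not\perp\!\!\!\perp Z \mid Y$
$$p(x, y, z) = p(x) \, p(y \mid x, z) \, p(z) = p(x) \, p(z) \, \frac{p(x, y, z)}{p(x, z)}$$
Hence $p(x, z) = p(x) \, p(z)$: X and Z are marginally independent.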
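The collider case can be checked numerically on a small discrete example. The probability tables below are made up for illustration and the helper names `joint` and `prob` are assumptions; the computation itself is exact enumeration over all states.

```python
from itertools import product

# Hypothetical collider X -> Y <- Z with binary variables.
p_x = [0.6, 0.4]
p_z = [0.2, 0.8]
p_y1 = {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 0.9}  # p(Y = 1 | x, z)

def joint(x, y, z):
    """p(x, y, z) = p(x) p(y | x, z) p(z)."""
    p_y = p_y1[(x, z)] if y == 1 else 1 - p_y1[(x, z)]
    return p_x[x] * p_y * p_z[z]

def prob(**fixed):
    """Probability of the event that fixes some of x, y, z to given values."""
    total = 0.0
    for x, y, z in product(range(2), repeat=3):
        state = {"x": x, "y": y, "z": z}
        if all(state[name] == value for name, value in fixed.items()):
            total += joint(x, y, z)
    return total

# Marginally, p(x, z) - p(x) p(z) vanishes: X and Z are independent.
marginal_gap = prob(x=1, z=1) - prob(x=1) * prob(z=1)

# Given Y = 1, p(x, z | y) differs from p(x | y) p(z | y): X and Z become dependent.
p_y = prob(y=1)
conditional_gap = (prob(x=1, y=1, z=1) / p_y
                   - (prob(x=1, y=1) / p_y) * (prob(y=1, z=1) / p_y))
```

This is the pattern that distinguishes colliders from chains and forks, where conditioning on the middle node removes the dependence instead of creating it.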
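How to find the skeleton of a Bayesian network [26, 21]
Form the complete undirected graph on node set $\{1, \dots, p\}$. For each pair of variables $X_i$ and $X_j$: remove the edge between i and j if there is a set $S \subseteq \{1, \dots, p\} \setminus \{i, j\}$ such that $X_i \perp\!\!\!\perp X_j \mid X_S$.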
independencies.
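A brute-force sketch of the skeleton search, assuming a conditional independence test is available as a function. The argument `independent(i, j, S)` is a hypothetical placeholder for such a test; here it is replaced by an oracle that knows the true structure. Searching all subsets is exponential in p, so this is only meant for very small examples.

```python
from itertools import combinations

def find_skeleton(p, independent):
    """Remove edge i-j as soon as some set S makes X_i and X_j independent.

    independent(i, j, S) -> bool is a user-supplied test or oracle.
    Returns the remaining edges and the separating set found for each removed pair.
    """
    edges = {frozenset(pair) for pair in combinations(range(p), 2)}
    sepset = {}
    for i, j in combinations(range(p), 2):
        others = [k for k in range(p) if k not in (i, j)]
        # Try conditioning sets of increasing size, starting with the empty set.
        for size in range(len(others) + 1):
            found = next((S for S in combinations(others, size)
                          if independent(i, j, set(S))), None)
            if found is not None:
                edges.discard(frozenset((i, j)))
                sepset[(i, j)] = set(found)
                break
    return edges, sepset

def collider_oracle(i, j, S):
    """Oracle for the DAG 0 -> 2 <- 1: nodes 0 and 1 are independent unless we condition on 2."""
    return {i, j} == {0, 1} and 2 not in S

skeleton, sepset = find_skeleton(3, collider_oracle)
```

Recording the separating sets is a design choice made here because they are what the later edge orientation step typically needs.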
How to direct the edges [26, 21]
Once we have the skeleton, we can start putting directions on the edges.
First, identify v-structures: orient X - Y - Z (with X and Z not adjacent) into X → Y ← Z whenever $X \not\perp\!\!\!\perp Z \mid Y$.
Second, direct as many edges as possible while respecting acyclicity and the independence constraints from step 1.
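The first orientation step might look as follows. This sketch assumes the skeleton and the separating sets from the skeleton search are given, and uses the common formulation that the middle node of an unshielded triple becomes a collider when it is not in the separating set of the two end nodes. Function and variable names are hypothetical, and the second step (propagating orientations) is not shown.

```python
from itertools import combinations

def orient_v_structures(p, edges, sepset):
    """Return directed edges (parent, child) implied by v-structures.

    edges is a set of frozensets {i, j}; sepset maps non-adjacent pairs
    (i, j), i < j, to the set that separated them.
    """
    def adjacent(a, b):
        return frozenset((a, b)) in edges

    directed = set()
    for i, j in combinations(range(p), 2):
        if adjacent(i, j):
            continue  # only unshielded triples i - k - j qualify
        for k in range(p):
            if k in (i, j):
                continue
            if adjacent(i, k) and adjacent(j, k) and k not in sepset.get((i, j), set()):
                directed.add((i, k))
                directed.add((j, k))
    return directed

# Skeleton 0 - 2 - 1 where 0 and 1 were separated by the empty set: a collider at 2.
collider = orient_v_structures(
    3, {frozenset((0, 2)), frozenset((1, 2))}, {(0, 1): set()})

# Skeleton 0 - 1 - 2 where 0 and 2 were separated by {1}: nothing can be oriented.
chain = orient_v_structures(
    3, {frozenset((0, 1)), frozenset((1, 2))}, {(0, 2): {1}})
```

The chain example stays undirected because chains and forks imply the same independencies, so the data cannot tell them apart.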
Two network structures are equivalent if both represent the same set of independence assertions.
[Figure: six graphs on the nodes X, Y, Z.]
Model: We assume that the dependency structure of a random vector X follows an unknown DAG D. The distribution p(x) is Conditional Gaussian and factors according to D.
Data: We observe independent and identically distributed data $d = \{x_1, \dots, x_N\}$. Each observation is a realization of X.
Goal: Estimate D from d.
Devise a scoring function that evaluates each network with respect to the training data.
Search for the optimal network according to this score.
Use MCMC or bootstrap.
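The posterior distribution of structure and parameters given data is
$$p(D, \theta \mid d) \propto p(d \mid D, \theta) \cdot p(\theta \mid D) \cdot p(D).$$
Integrating out nuisance parameters yields
$$p(D \mid d) \propto p(D) \cdot \int p(d \mid D, \theta) \, p(\theta \mid D) \, d\theta.$$
The right-hand side will be our score for network fitness. It consists of the structure prior $p(D)$ and the marginal likelihood $p(d \mid D)$, which is the integral over the parameters.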
We zoom in on one discrete family of nodes with a fixed configuration of parents. Assuming parameter independence [13] we will solve the integral
$$p(\text{batch} \mid D) = \int p(\text{batch} \mid D, \theta) \, p(\theta \mid D) \, d\theta,$$
where “batch” means the part of data d corresponding to this one family. To solve it analytically, we need priors that fit the likelihood (conjugate priors).
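Discrete part: Multinomial likelihood with Dirichlet prior:
$$p(\text{batch} \mid D, \theta) = \prod_k \theta_k^{n_k}, \qquad p(\theta \mid D) = \frac{\Gamma(\alpha_+)}{\prod_k \Gamma(\alpha_k)} \prod_k \theta_k^{\alpha_k - 1}, \quad \text{with } \alpha_+ = \sum_k \alpha_k.$$
Mixed part: Gaussian likelihood with Normal-inverse-$\chi^2$ prior. The data likelihood is multivariate Normal, the vector of regression coefficients $\beta$ has a Normal prior, and the variance $\sigma^2$ has an inverse-$\chi^2$ prior [1, 18].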
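$$p(\text{batch} \mid D) = \int p(\text{batch} \mid D, \theta) \, p(\theta \mid D) \, d\theta = \frac{\Gamma(\alpha_+)}{\prod_k \Gamma(\alpha_k)} \int \prod_k \theta_k^{n_k + \alpha_k - 1} \, d\theta = \frac{\Gamma(\alpha_+)}{\Gamma(\alpha_+ + n_+)} \prod_k \frac{\Gamma(\alpha_k + n_k)}{\Gamma(\alpha_k)}$$
with counts $n_k$ and Dirichlet parameters $\alpha_k$, where $n_+ = \sum_k n_k$ and $\alpha_+ = \sum_k \alpha_k$. For the marginal likelihood of the complete network, you have to multiply terms like this for all nodes and all configurations of discrete parents [3, 13].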
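The closed form for one batch is easy to evaluate on the log scale, which avoids overflow of the Gamma function for large counts. A minimal sketch; the function name and the example counts are assumptions.

```python
import math

def log_marginal_likelihood(counts, alphas):
    """log p(batch | D) for one node and one parent configuration.

    counts[k] is the number of observations in state k (n_k),
    alphas[k] is the matching Dirichlet parameter (alpha_k).
    """
    a_plus, n_plus = sum(alphas), sum(counts)
    result = math.lgamma(a_plus) - math.lgamma(a_plus + n_plus)
    for n_k, a_k in zip(counts, alphas):
        result += math.lgamma(a_k + n_k) - math.lgamma(a_k)
    return result

# Binary node, uniform prior (alpha = 1, 1), two observations of state 0 and one of state 1.
value = math.exp(log_marginal_likelihood([2, 1], [1.0, 1.0]))
```

For the complete network the log terms of all nodes and parent configurations are added, which corresponds to the product described above.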
We learned, in the case of discrete networks, how to compute the marginal likelihood $p(d \mid D)$. This is the right part of the score:
$$p(D \mid d) \propto p(D) \cdot p(d \mid D)$$
To complete the score, we need a structure prior $p(D)$. And after that, we have to come up with a smart strategy to find high-scoring network structures.
Exhaustive search: infeasible for more than 5 nodes! [22]
If the topological order of the nodes is known: start with the empty network and iteratively add parents [3].
Hillclimbing (with random restarts)
Simulated annealing: choose a suboptimal neighbor with decreasing probability.
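To see why exhaustive search breaks down so quickly, the number of DAGs on n labeled nodes can be computed with the well-known recurrence $a_n = \sum_{k=1}^{n} (-1)^{k+1} \binom{n}{k} 2^{k(n-k)} a_{n-k}$ with $a_0 = 1$, which is usually attributed to Robinson [22]. The function name is an assumption.

```python
from math import comb

def count_dags(n):
    """Number of directed acyclic graphs on n labeled nodes."""
    a = [1]  # a[0] = 1: the empty graph
    for m in range(1, n + 1):
        a.append(sum((-1) ** (k + 1) * comb(m, k) * 2 ** (k * (m - k)) * a[m - k]
                     for k in range(1, m + 1)))
    return a[n]

# Growth of the search space for 1 to 5 nodes: 1, 3, 25, 543, 29281.
sizes = [count_dags(n) for n in range(1, 6)]
```

Already 5 nodes give 29281 candidate structures, and the count grows super-exponentially from there.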
A quote from Edwards [6]: “Any method (or statistician) that takes a complex multivariate dataset and, from it, claims to identify one true model, is both naive and misleading.” What we have found is just a simple model consistent with the data — nothing more, nothing less.
Predicting the best network tells us nothing about the robustness of the solution.
MCMC: Use Markov Chain Monte Carlo to sample from the posterior distribution [14, 10].
Bootstrap: Computationally efficient approach to address confidence in network features [9, 11].
Bias-corrected bootstrap: Graphical models learned from bootstrap samples are biased towards too complex models. Steck and Jaakkola [27] suggest a bootstrap procedure corrected for this bias.
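A rough sketch of the plain (uncorrected) bootstrap idea: resample the observations with replacement, relearn the network each time, and report how often each edge appears. The argument `learn_edges` is a hypothetical placeholder for any structure learning routine; the toy learner and data below are made up and are not a real learning algorithm.

```python
import random

def bootstrap_edge_confidence(data, learn_edges, n_boot=200, seed=0):
    """Fraction of bootstrap samples in which each edge is learned.

    data is a list of observations; learn_edges(sample) must return an
    iterable of edges learned from the sample.
    """
    rng = random.Random(seed)
    counts = {}
    for _ in range(n_boot):
        # Draw N observations with replacement from the original data.
        resample = [rng.choice(data) for _ in data]
        for edge in learn_edges(resample):
            counts[edge] = counts.get(edge, 0) + 1
    return {edge: c / n_boot for edge, c in counts.items()}

def toy_learner(sample):
    """Toy stand-in: report edge (0, 1) if both values agree in most observations."""
    agree = sum(1 for row in sample if row[0] == row[1])
    return [(0, 1)] if agree > len(sample) / 2 else []

toy_data = [(0, 0), (1, 1), (0, 0), (1, 1), (0, 1), (1, 0), (1, 1), (0, 0)]
confidence = bootstrap_edge_confidence(toy_data, toy_learner)
```

Edges that show up in nearly all bootstrap networks can be reported with more confidence than edges that appear only occasionally.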
If the expression of gene A is regulated by proteins B and C, then A's expression level is a function of the joint activity levels of B and C. We treat the expression of A as a stochastic function of its regulators.
Problem 1: In most current biological data sets, however, we do not have access to measurements of protein activity levels.
Resort: Use expression levels of genes as a proxy for the activity level of the proteins they encode.
Problem 2: There are numerous examples where the activation or silencing of a regulator is carried out by posttranscriptional protein modifications.
Much more on http://www.cs.ubc.ca/~murphyk/Software/BNT/bnsoft.html.
Clustering, Graphical Gaussian models, Bayesian networks;
Constraint-based approach and Bayesian scoring.
[1] Susanne Gammelgaard Bøttcher. Learning Bayesian Networks with Mixed Variables. PhD thesis, Aalborg University, Denmark, 2004.
[2] Susanne Gammelgaard Bøttcher and Claus Dethlefsen. deal: A package for learning Bayesian networks. Journal of Statistical Software, 8(20), 2003.
[3] Gregory F. Cooper and Edward Herskovits. A Bayesian Method for the Induction of Probabilistic Networks from Data. Machine Learning, 9:309–347, 1992.
[4] R.G. Cowell, A.P. Dawid, S.L. Lauritzen, and D.J. Spiegelhalter. Probabilistic Networks and Expert Systems. Springer-Verlag, New York, 1999.
[5] Alberto de la Fuente, Nan Bing, Ina Hoeschele, and Pedro Mendes. Discovery of meaningful associations in genomic data using partial correlation coefficients. Bioinformatics, 20(18):3565–3574, 2004.
[6] David Edwards. Introduction to Graphical Modelling. Springer, 2000.
[7] MB Eisen, PT Spellman, PO Brown, and D Botstein. Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci U S A, 95(25):14863–8, Dec 1998.
[8] Nir Friedman. Inferring Cellular Networks Using Probabilistic Graphical Models. Science, 303(5659):799–805, 2004.
[9] Nir Friedman, Moises Goldszmidt, and Abraham Wyner. Data analysis with Bayesian networks: A bootstrap approach. In Uncertainty in Artificial Intelligence: Proceedings of the Fifteenth Conference (UAI-1999), pages 196–205, San Francisco, CA, 1999. Morgan Kaufmann Publishers.
[10] Nir Friedman and Daphne Koller. Being Bayesian about network structure: A Bayesian approach to structure discovery in Bayesian [...]
[11] Nir Friedman, Michal Linial, Iftach Nachman, and Dana Pe'er. Using Bayesian networks to analyze expression data. Journal of Computational Biology, 7(3):601–620, August 2000.
[12] [...]
[13] David Heckerman, Dan Geiger, and David Maxwell Chickering. Learning Bayesian Networks: The Combination of Knowledge and Statistical Data. Machine Learning, 20(3):197–243, Sep. 1995.
[14] Dirk Husmeier. Sensitivity and specificity of inferring genetic regulatory interactions from microarray experiments with dynamic Bayesian [...]
[15] Michael I. Jordan, editor. Learning in Graphical Models. MIT Press, Cambridge, MA, 1999.
[16] Steffen L. Lauritzen. Graphical Models. Clarendon Press, Oxford, 1996.
[17] Paul M Magwene and Junhyong Kim. Estimating genomic coexpression networks using first-order conditional independence. Genome Biol, 5(12):R100, 2004.
[18] Florian Markowetz, Steffen Grossmann, and Rainer Spang. Probabilistic soft interventions in conditional Gaussian networks. In Robert Cowell and Zoubin Ghahramani, editors, Proc. Tenth International Workshop on Artificial Intelligence and Statistics, Jan 2005.
[19] Kevin Murphy. The Bayes Net Toolbox for Matlab. Computing Science and Statistics, 33, 2001.
[20] Judea Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, 1988.
[21] Judea Pearl. Causality: Models, Reasoning and Inference. Cambridge University Press, Cambridge, 2000.
[22] Robert W. Robinson. Counting labeled acyclic digraphs. In F. Harary, editor, New Directions in the Theory of Graphs, pages 239–273. Academic Press, New York, 1973.
[23] Juliane Schäfer and Korbinian Strimmer. An empirical Bayes approach to inferring large-scale gene association networks. Bioinformatics, 21(6):754–64, Mar 2005.
[24] Peter W. F. Smith and Joe Whittaker. Edge exclusion tests for graphical Gaussian models. In Michael Jordan, editor, Learning in Graphical Models, pages 555–574. MIT Press, 1999.
[25] PT Spellman, G Sherlock, MQ Zhang, VR Iyer, K Anders, MB Eisen, PO Brown, D Botstein, and B Futcher. Comprehensive identification [...]
[26] Peter Spirtes, Clark Glymour, and Richard Scheines. Causation, Prediction, and Search. MIT Press, Cambridge, MA, second edition, 2000.
[27] Harald Steck and Tommi S. Jaakkola. Bias-corrected bootstrap and model uncertainty. In Sebastian Thrun, Lawrence Saul, and Bernhard Schölkopf [...]
[28] Joshua M Stuart, Eran Segal, Daphne Koller, and Stuart K Kim. A gene-coexpression network for global discovery of conserved genetic [...]
[29] Junbai Wang, Ola Myklebost, and Eivind Hovig. Mgraph: graphical models for microarray data analysis. Bioinformatics, 19(17):2210–2211, 2003.
[30] Anja Wille and Peter Bühlmann [...] Seminar for Statistics, ETH Zürich, 2004.
[31] Anja Wille, Philip Zimmermann, Eva Vranová, Andreas Fürholz, Oliver Laule, Stefan Bleuler, Lars Hennig, Amela Prelic, Peter von Rohr, Lothar Thiele, Eckart Zitzler, Wilhelm Gruissem, and Peter Bühlmann [...] network in Arabidopsis thaliana. Genome Biol, 5(11):R92, 2004.