

SLIDE 1

Statistical learning of biological networks: a brief overview

Florence d’Alché–Buc

IBISC CNRS, Université d’Evry, GENOPOLE, Evry, France Email: florence.dalche@ibisc.fr

Statistical learning of biological networks: a brief overview 1 / 30

SLIDE 2

Biological networks

SLIDE 3

Motivation

Identify and understand the complex mechanisms at work in the cell.

Biological networks:

• signaling pathways
• gene regulatory networks
• protein-protein interaction networks
• metabolic pathways

Use experimental data, prior knowledge, AND statistical inference to unravel biological networks and predict their behaviour.

SLIDE 4

How to learn biological networks from data?

• Data-mining approaches: extract co-expressed and/or co-regulated patterns, reduce dimension [large-scale data, often preliminary to more accurate modelling or prediction]
• Modeling approaches: model the network behaviour; can be used to simulate and predict the network as a system [smaller-scale data]
• Predictive approaches: predict (only) edges, in an unsupervised or supervised way [large- or medium-scale data]

SLIDE 5

Learning (biological) networks

SLIDE 6

Outline

1. Introduction
2. Supervised Predictive approaches
3. Modelling approaches
4. Conclusion

SLIDE 7

Supervised learning of the regulation concept

Instance Problem 1 (transcriptional regulatory networks): Training sample S = {(wi = (vi, vi′), yi), i = 1...n}, where the wi are pairs of components vi and vi′ (think transcription factor and potential regulee) and yi ∈ Y indicates whether vi is a transcription factor for vi′. We wish to be able to predict new regulations. Reference: Qian et al. 2003, Bioinformatics.

In symbolic machine learning, this corresponds to the framework of relational learning, classically associated with inductive logic programming (ILP) and more recently with statistical ILP: the predicate interaction(X,Y) can be learned from labeled examples.
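As a concrete (and deliberately simplified) illustration of learning the regulation concept from labeled pairs, the sketch below builds pair features for candidate (regulator, regulee) couples and classifies them with a nearest-centroid rule. The gene names, expression profiles, and the centroid classifier are all illustrative assumptions, not the method of Qian et al. (2003).

```python
# Hypothetical sketch of Instance Problem 1: learn interaction(X, Y) from
# labeled pairs (w_i = (v_i, v'_i), y_i). Data and classifier are toy choices.

def pair_features(x, x_prime):
    """Represent a candidate (regulator, regulee) pair by concatenating
    the two expression profiles."""
    return x + x_prime  # list concatenation

def fit_centroids(samples):
    """Compute one centroid per class label from (features, label) pairs."""
    sums, counts = {}, {}
    for feats, label in samples:
        acc = sums.setdefault(label, [0.0] * len(feats))
        for j, v in enumerate(feats):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

def predict(centroids, feats):
    """Assign the label of the nearest centroid (squared Euclidean)."""
    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return min(centroids, key=lambda lab: dist2(centroids[lab], feats))

# Toy expression profiles (assumed data, not from any real experiment).
expr = {"tf_a": [1.0, 0.9], "gene_b": [0.8, 1.1], "gene_c": [-1.0, -0.9]}
train = [
    (pair_features(expr["tf_a"], expr["gene_b"]), 1),  # known regulation
    (pair_features(expr["tf_a"], expr["gene_c"]), 0),  # known non-regulation
]
centroids = fit_centroids(train)
print(predict(centroids, pair_features(expr["tf_a"], expr["gene_b"])))
```

A real system would of course use richer pair features and a stronger classifier (SVMs in the cited work), but the training-sample shape is the same.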

SLIDE 8

Supervised learning of interactions

From a known network where each vertex is described by some input feature vector x, predict the edges involving new vertices described by their input feature vector

SLIDE 9

Supervised prediction of protein-protein interaction network

Instance Problem 2 (protein-protein interaction networks): Training sample S = {(wi = (vi, vi′), yi), i = 1...n}, where the wi are couples of components vi and vi′ (think proteins) and yi ∈ Y indicates whether there is an edge between vi and vi′. We wish to predict interactions for test and training input data.

First addressed by Noble et al. 2005 (SVM with kernel combination); further studied by Biau and Bleakley 2006, Bleakley et al. 2007.

SLIDE 10

Similarity or kernel learning

In the case of non-oriented graphs, a similarity between components can be learnt instead of a classification function.
Yamanishi and Vert's work (2005) first introduced this kind of approach.
We proposed a new way of formulating the problem, as regression in an output space endowed with a kernel (Geurts et al. 2006, 2007).

SLIDE 11

Supervised learning with output (kernel) feature space

Suppose we have a learning sample LS = {xi = x(vi), i = 1, ..., N} drawn from a fixed but unknown probability distribution, together with additional information provided by a Gram matrix K = {kij = k(vi, vj), i, j = 1, ..., N} that expresses how close the objects vi, i = 1...N, are to each other. Let us call φ the implicit output feature map and k the positive definite kernel defined on V × V such that < φ(v), φ(v′) > = k(v, v′). From a learning sample {(xi, Kij) | i = 1, ..., N, j = 1, ..., N} with xi ∈ X, find a function f : X → F that minimizes the expectation of some loss function ℓ : F × F → ℝ over the joint distribution of input/output pairs:

E_{x, φ(v)} { ℓ(f(x), φ(v)) }

SLIDE 12

Application to supervised inference of edges in a graph 1

For objects v1, ..., vN, let us assume we have: feature vectors x(vi), i = 1...N, and a Gram matrix K defined as Ki,j = k(vi, vj). The kernel k reflects the proximity between the objects v as vertices in the known graph. Reminder: a kernel k is a positive definite (similarity) function. For such a function, there exists a function φ, called a feature map φ : V → F, such that k(v, v′) = < φ(v), φ(v′) >.

SLIDE 13

Supervised inference of edges in a graph

Use a machine learning method that can infer a function h : X → F to get, for a given x(v), an approximation of φ(v), and hence an approximation g(x(v), x(v′)) = < h(x(v)), h(x(v′)) > of the kernel value between v and v′, described by their input feature vectors x(v) and x(v′).
Connect these two vertices if g(x(v), x(v′)) > θ.

(By varying θ we get different tradeoffs between true positive and false positive rates.)
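The two steps above can be sketched as follows; the learned map h is replaced here by a stand-in identity function, so only the scoring-and-thresholding logic is genuine.

```python
# Minimal sketch of supervised edge inference: score a candidate pair by the
# inner product g(x, x') = <h(x), h(x')> and connect the vertices when
# g > theta. The map `h` below is a placeholder (identity); in the slides it
# would come from a trained regressor such as output kernel trees.

def h(x):
    return x  # placeholder for the learned feature map into F

def g(x, x_prime):
    """Approximate kernel value between two vertices from their inputs."""
    hx, hxp = h(x), h(x_prime)
    return sum(a * b for a, b in zip(hx, hxp))

def infer_edges(vertices, features, theta):
    """Return all unordered pairs whose predicted kernel value exceeds theta."""
    edges = []
    for i, v in enumerate(vertices):
        for w in vertices[i + 1:]:
            if g(features[v], features[w]) > theta:
                edges.append((v, w))
    return edges

# Toy input feature vectors for three proteins (assumed data).
feats = {"p1": [1.0, 0.0], "p2": [0.9, 0.1], "p3": [0.0, 1.0]}
print(infer_edges(["p1", "p2", "p3"], feats, theta=0.5))
```

Sweeping theta over a grid and recording true/false positive rates against a held-out network is exactly how the ROC tradeoff mentioned above would be traced.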

SLIDE 14

A kernel on graph nodes

Diffusion kernel (Kondor and Lafferty, 2002): the Gram matrix K with Ki,j = k(vi, vj) is given by K = exp(−βL), where the graph Laplacian L is defined by:

• Li,j = di, the degree of node vi, if i = j;
• Li,j = −1 if vi and vj are connected;
• Li,j = 0 otherwise.
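A minimal sketch of this computation for a toy 3-node path graph, assuming an arbitrary β = 1.0 and using the eigendecomposition of the symmetric Laplacian to form the matrix exponential:

```python
import numpy as np

# Diffusion kernel K = exp(-beta * L) for a toy path graph 1 - 2 - 3.
# beta = 1.0 is an arbitrary illustrative choice.

def graph_laplacian(adj):
    """L = D - A, with D the diagonal degree matrix of adjacency A."""
    A = np.asarray(adj, dtype=float)
    return np.diag(A.sum(axis=1)) - A

def diffusion_kernel(adj, beta=1.0):
    """Matrix exponential of -beta * L via eigendecomposition (L is symmetric)."""
    L = graph_laplacian(adj)
    w, V = np.linalg.eigh(L)
    return V @ np.diag(np.exp(-beta * w)) @ V.T

adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
K = diffusion_kernel(adj)
# K is symmetric positive definite; nodes close in the graph get larger values.
print(K.round(3))
```

Note that K[0, 1] (adjacent nodes) exceeds K[0, 2] (nodes two hops apart), which is exactly the "proximity in the known graph" the kernel is meant to encode.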

SLIDE 15

Interpretability: rules and clusters (an example with a protein-protein network)

SLIDE 16

Network completion and function prediction for yeast data

SLIDE 17

Challenges and limitations in supervised predictive approaches

• Semi-supervised learning, or even transductive learning
• Issue: unbalanced distribution of positive and negative examples
• Local approach (the graph is not seen as a single variable)
• Data (labeled examples) are not i.i.d.: regulations are not independent

SLIDE 18

Outline

1. Introduction
2. Supervised Predictive approaches
3. Modelling approaches
4. Conclusion

SLIDE 19

Graphical models : from simple interactions models to complex ones

• Graphical Gaussian model estimation: estimating partial correlation as a measure of conditional independence (classified as graph prediction in my terminology)
• Bayesian network estimation: modelling directed interactions
• Dynamic Bayesian network estimation: modelling directed interactions through time
• State-space model estimation: modelling observed as well as hidden dynamical processes

SLIDE 20

Focus on state-space models

Goal:

• Quantitative models (easier to learn, encompass mechanistic models: biological relevance)
• Taking time into account
• Some variables are not measured: assumption of a hidden process
• Linear Gaussian models: parameters encapsulate the network structure (Perrin et al. 03, Rangel et al. 04)
• Nonlinear models (more biologically relevant): the structure is encapsulated in the form of the transition function (Nachman 04, Rogers et al. 06, Quach et al. 07)

x(tk+1) = F(x(tk), u; θ) + εh(tk)
y(tk) = H(x(tk), u(tk); θ) + ε(tk)

SLIDE 21

System of Ordinary Differential Equations (ODE)

dx/dt = f(x(t), u(t); θ)

Let us focus on gene regulatory networks.

• x(t): state variables at time t (protein concentrations, mRNA concentrations)
• f: the form of f encodes the nature of the interactions (and their structure): linear/nonlinear models, Michaelis-Menten kinetics, mass action kinetics, ...
• θ: parameter set (kinetic parameters, rate constants, ...)
• u(t): input variables at time t

SLIDE 23

Reverse Engineering of Biological Networks

Given

• An ODE model: dx(t)/dt = f(x(t), u(t); θ)
• A partial and noisy observation model: y(t) = H(x(t), u(t); θ) + ε(t), where H is a nonlinear observation function and ε(t) is i.i.d. noise
• A sequence of observed data: y1:K = {y1, ..., yK} at times t1, t2, ..., tK

Goal

• Structure estimation
• Parameter estimation: θ
• State estimation: x(t)

SLIDE 24

Structure learning

Case 1: very few variables are involved; a combinatorial search over structures can then be performed, where parameters have to be estimated for each candidate structure.
Case 2: more than a few tens of variables are involved; it is then worth using an algorithm dedicated to structure learning. Structure learning in nonlinear dynamical models, as in static Bayesian networks, can be solved by a stochastic exploration of the (huge) set of candidates, using an appropriate criterion that takes into account the data and the parameter estimates given the candidate structure. MCMC methods and evolutionary approaches are used.

In the following, we assume that the network structure is given

SLIDE 25

An example of Nonlinear State-Space Model

Continuous-time ODE model:

dx(t)/dt = f(x(t), u(t); θ)
y(t) = H(x(t), u(t); θ) + ε(t)

The system at discrete time points t1, ..., tK:

x(tk+1) = F(x(tk), u; θ)
y(tk) = H(x(tk), u(tk); θ) + ε(tk)

with F(x(tk), u; θ) = x(tk) + ∫_{tk}^{tk+1} f(x(τ), u(τ); θ) dτ
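The integral defining F can be approximated numerically. A minimal sketch, assuming a toy linear-decay dynamics f(x; θ) = −θx and forward-Euler substeps (any ODE solver would do):

```python
# Sketch of the discrete-time transition F obtained by numerically integrating
# dx/dt = f(x; theta) between observation times t_k and t_{k+1}.
# f(x; theta) = -theta * x is a toy choice; Euler substepping is an assumption.

def F(x, t_k, t_k1, theta, substeps=100):
    """Approximate x(t_{k+1}) = x(t_k) + integral of f(x(tau)) over [t_k, t_{k+1}]."""
    dt = (t_k1 - t_k) / substeps
    for _ in range(substeps):
        x = x + dt * (-theta * x)  # forward-Euler step with f(x) = -theta * x
    return x

# For linear decay the exact solution is x(t_k) * exp(-theta * (t_{k+1} - t_k)),
# so the Euler result should be close to exp(-0.5) here.
x_next = F(1.0, 0.0, 1.0, theta=0.5)
print(round(x_next, 4))
```

Increasing `substeps` drives the approximation toward the exact flow, which is what makes a discrete state-space model consistent with the underlying continuous ODE.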

SLIDE 27

Bayesian inference

Given:

• Prior distribution over the initial state and parameters: p(x1, θ)
• A state transition model: p(xk | xk−1, θ)
• An observation model: p(yk | xk, θ)
• A sequence of observations: y1:K = {y1, ..., yK}

Estimating the posterior distributions:

• Focus on the filtering distribution: p(xk, θ | y1:k)
• Tool: the Unscented Kalman Filter, to deal with nonlinearities (Quach et al., 2007)
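For intuition, the predict/update cycle behind such filters can be written in closed form in the linear-Gaussian special case, as in the scalar Kalman filter below; the UKF used in Quach et al. (2007) follows the same cycle but propagates sigma points through nonlinear F and H. All model constants here are illustrative assumptions.

```python
# Scalar linear-Gaussian filtering of p(x_k | y_{1:k}) (assumed toy model):
#   x_k = a * x_{k-1} + w,  w ~ N(0, q)    (transition)
#   y_k = x_k + v,          v ~ N(0, r)    (observation)

def kalman_filter(ys, a=0.9, q=0.1, r=0.5, m0=0.0, p0=1.0):
    """Return the sequence of filtered means and variances (m_k, P_k)."""
    m, P, out = m0, p0, []
    for y in ys:
        # predict: push the previous posterior through the transition model
        m_pred = a * m
        P_pred = a * a * P + q
        # update: correct the prediction with the new observation
        K = P_pred / (P_pred + r)      # Kalman gain
        m = m_pred + K * (y - m_pred)
        P = (1.0 - K) * P_pred
        out.append((m, P))
    return out

estimates = kalman_filter([1.0, 1.1, 0.9, 1.0])
print([round(m, 3) for m, _ in estimates])
```

The UKF replaces the two linear-Gaussian pushes (predict and update) with deterministic sampling around the current mean, which is why it can handle the nonlinear transition F induced by the ODEs.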

SLIDE 28

Example: the Repressilator

[Elowitz and Leibler, Nature 2000]

dr1/dt = v1max · k12^n / (k12^n + p2^n) − k1mRNA · r1
dr2/dt = v2max · k23^n / (k23^n + p3^n) − k2mRNA · r2
dr3/dt = v3max · k31^n / (k31^n + p1^n) − k3mRNA · r3
dp1/dt = k1 · r1 − k1protein · p1
dp2/dt = k2 · r2 − k2protein · p2
dp3/dt = k3 · r3 − k3protein · p3

mRNAs are observed; proteins are hidden. The mRNA and protein degradation rate constants are supposed to be known. Estimate 9 parameters.
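A minimal forward-Euler simulation of these equations; the kinetic constants are illustrative assumptions (and, unlike in the 9-parameter estimation problem above, shared across the three genes), not the values of Elowitz and Leibler (2000).

```python
# Forward-Euler simulation of the repressilator ODEs above. Gene i's
# transcription is repressed by protein i+1 (cyclically): r1 <- p2, r2 <- p3,
# r3 <- p1. All kinetic values below are toy, shared-across-genes assumptions.

def repressilator_step(r, p, dt, vmax=5.0, k=1.0, n=2.0,
                       k_mrna=1.0, k_tl=1.0, k_prot=1.0):
    """One Euler step for mRNAs r = (r1, r2, r3) and proteins p = (p1, p2, p3)."""
    dr = [vmax * k**n / (k**n + p[(i + 1) % 3]**n) - k_mrna * r[i]
          for i in range(3)]                       # Hill repression - decay
    dp = [k_tl * r[i] - k_prot * p[i] for i in range(3)]  # translation - decay
    return ([r[i] + dt * dr[i] for i in range(3)],
            [p[i] + dt * dp[i] for i in range(3)])

r, p = [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]  # asymmetric start to break symmetry
for _ in range(5000):
    r, p = repressilator_step(r, p, dt=0.01)
print([round(v, 2) for v in r])
```

In the inference setting of the slides, only the simulated mRNA trajectories (plus noise) would be observed, and the protein trajectories would have to be recovered as hidden states.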

SLIDE 29

Parameter Estimation

SLIDE 30

Challenges in (dynamical) modelling approaches

• Identifiability of dynamical models
• Theoretical results about sample complexity
• Scaling to large networks
• Non-stationarity
• Incorporating other components: space, cellular compartments, ...
• Coupled systems: metabolic and regulatory networks, protein-protein interactions and regulatory networks
• MORE DATA: benchmark problems, challenges

SLIDE 31

General conclusion and perspective

• Different views of the learning problem, different scales, different prior knowledge
• Some of these methods could be linked to take part in the same discovery process
• Need for building data repositories, and demand for biological validation

SLIDE 32

References

• C. Auliac, V. Frouin, X. Gidrol, F. d'Alché-Buc, Evolutionary approaches for the reverse-engineering of gene regulatory networks: a study on a realistic biological dataset, accepted at BMC Bioinformatics, to appear in 2008.
• P. Geurts, N. Touleimat, M. Dutreix, F. d'Alché-Buc, Inferring biological networks with output kernel trees, BMC Bioinformatics, to appear, May 3, 2007.
• Kato, K. Tsuda, EM-based algorithm for kernel matrix completion, Bioinformatics, vol. 21, 2005.
• B.-E. Perrin, L. Ralaivola, A. Mazurie, S. Bottani, J. Mallet, F. d'Alché-Buc, Inference of gene regulatory networks with dynamic Bayesian networks, Bioinformatics, vol. 19, 2003.
• M. Quach, N. Brunel, F. d'Alché-Buc, Estimating parameters and hidden variables in nonlinear state-space models based on ODEs for biological networks inference, Bioinformatics, vol. 23, pp. 3209-3216, 2007.
• Y. Yamanishi, J.-P. Vert, M. Kanehisa, Supervised enzyme network inference from the integration of genomic data and chemical information, Bioinformatics, vol. 21, 2005.
