Graph Algorithms and Graph Measures for the Life Sciences



slide-1
SLIDE 1

Falk Schreiber

Graph Algorithms and Graph Measures for the Life Sciences

23/10/2014

slide-2
SLIDE 2

Networks and Graphs in the Life Sciences

Graph Network

slide-3
SLIDE 3

Network Representation

Network is an informal description for a set of elements with connections or interactions between them and data attached to them.
Graph is a formal description: it is a mathematical object consisting of vertices and edges representing elements and connections, respectively.
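A minimal sketch of this distinction, assuming a tiny made-up network: the element names, the attached data and the helper `successors` are hypothetical and not taken from the slides.

```python
# Formal graph G = (V, E) for an informally described network.
# All element names and attributes below are illustrative assumptions.
vertices = {"geneA", "proteinA", "metaboliteX"}
edges = {("geneA", "proteinA"), ("proteinA", "metaboliteX")}  # directed (source, target)

# Data attached to the elements is kept next to the graph, keyed by vertex.
vertex_data = {
    "geneA": {"type": "gene"},
    "proteinA": {"type": "protein"},
    "metaboliteX": {"type": "metabolite"},
}

def successors(v):
    """Vertices reachable from v along a single edge."""
    return {target for (source, target) in edges if source == v}

print(successors("geneA"))
```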

slide-4
SLIDE 4

Interactions à Networks à Pathways

A collection of interactions and/or transformations defines a network Pathways are subsets of networks All pathways are networks, however not all networks are pathways Difference: level of annotation/understanding We can define a pathway as a biological network that relates to a known physiological process or phenotype There is no precise biological definition of a pathway Partitioning of networks into pathways is somewhat arbitrary

slide-5
SLIDE 5

Networks a Decade Ago

slide-6
SLIDE 6

Can you Spot the Error?

[from Milo et al., Science, 2002]

slide-7
SLIDE 7

Retraction and Impact Factor

slide-8
SLIDE 8

Just an Example …

slide-9
SLIDE 9

From Biological Building Blocks to Complex Systems

Genome: set of hereditary instructions needed to build, run and maintain a particular organism

Genes Transcripts Proteins Metabolites

slide-10
SLIDE 10

From Biological Building Blocks to Complex Systems

Transcriptome: set of RNA transcribed from genes within the genome by a particular cell at a particular time
Depends on the tissue, the developmental stage of the organism and the metabolic state of the cell

Genes Transcripts Proteins Metabolites

slide-11
SLIDE 11

From Biological Building Blocks to Complex Systems

Proteome: set of proteins translated from RNA within a transcriptome by a particular cell at a particular time
Complete proteome of a cell: set of all potential proteins that could be synthesised by the cell

Genes Transcripts Proteins Metabolites

slide-12
SLIDE 12

From Biological Building Blocks to Complex Systems

Metabolome: set of all the metabolites inside a particular cell at a particular time

Genes Transcripts Proteins Metabolites

slide-13
SLIDE 13

From Biological Building Blocks to Complex Systems

Genes Transcripts Proteins Metabolites

slide-14
SLIDE 14

From Biological Building Blocks to Complex Systems

Genes Transcripts Proteins Metabolites

20th century biology (reductionist approach)
Phenylketonuria is caused by a mutated gene for the enzyme phenylalanine hydroxylase (PAH)

slide-15
SLIDE 15

From Biological Building Blocks to Complex Systems

Genes Transcripts Proteins Metabolites

20th century biology (reductionist approach)
Cancer, heart diseases, …: multiple, complex changes
21st century biology (integrative approach)

slide-16
SLIDE 16

From Biological Building Blocks to Complex Systems

Genes Transcripts Proteins Metabolites

slide-17
SLIDE 17

Biological Pathways and Networks - Examples

Signal transduction pathways and networks: cellular processes that recognize extra- or intra-cellular signals and induce appropriate cellular responses
Gene regulatory networks: pathways that regulate a cell's behaviors, including transcription and translation
Metabolic pathway: a series of enzymatic reactions that produce a specific product
Protein interaction networks: interaction of proteins (e.g. activation, non-covalent binding)

slide-18
SLIDE 18

Biological Pathways and Networks

[Figure: multi-level biological network; labels: gene regulation, protein clustering, protein interaction, metabolism, chromosome location of genes; level 1, level 2]

[from Andreas Kerren, Helen C. Purchase, Matthew O. Ward (Eds.), Multivariate Network Visualization, State-of-the-Art Survey, LNCS 8380; Dagstuhl Seminar #13201, Dagstuhl Castle, Germany, May 12–17, 2013, Revised Discussions]

slide-19
SLIDE 19

Many Informatics Areas

Areas: health informatics / medical informatics, environmental informatics, bioinformatics, chemoinformatics
Networks: evolutionary networks, infection networks, ecological networks / food webs, neuronal networks, hormonal networks, signalling networks, gene regulatory networks, protein interaction networks, metabolic networks, chemical structure graphs

slide-20
SLIDE 20

Network Usage - Examples

Representation/exploration
Network analysis
Data context/analysis
Simulation

slide-21
SLIDE 21

Network Analysis - Network Centralities

Centrality of a graph G = (V, E)
Function c: V → ℝ with c(u) > c(v) if u ∈ V is more important than v ∈ V
Ranking of vertices according to importance, based on the network structure
Application examples: hypothesis generation for experiments; which patients should be vaccinated first
Problem: does not work well with existing algorithms

[from Jeong et al., Nature, 2001]
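A centrality is simply a function from vertices to real numbers that induces a ranking. As a hedged illustration, the sketch below uses degree centrality, which is an assumed stand-in chosen for brevity and not necessarily the measure discussed in the talk. The example edge list is made up, and the graph is assumed to be simple and undirected.

```python
from collections import defaultdict

def degree_centrality(edges):
    """c(v) = number of neighbours of v in a simple undirected graph."""
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    return {v: len(ns) for v, ns in neighbours.items()}

def rank(centrality):
    """Vertices ordered from most to least important (ties broken by name)."""
    return sorted(centrality, key=lambda v: (-centrality[v], v))

# Hypothetical example network.
example_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
c = degree_centrality(example_edges)
ranking = rank(c)
print(c, ranking)
```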

slide-22
SLIDE 22

New Centrality Measure

Based on network motifs
Sub-graphs representing patterns of local interconnections
May represent basic building blocks and design patterns of functional modules

[from Babu et al., Current Opinion in Structural Biology, 2004]

slide-23
SLIDE 23

Motifs in Gene Regulatory Networks: Feed-forward Loop

Example of functional properties: noise filtering (responds only to persistent activations)

[from Shen-Orr et al., Nature Genetics, 2002]

slide-24
SLIDE 24

Motif-based Centrality

Combines centrality measures and network motifs
Uses occurrences of a motif in the network
Incorporates functional substructures into centrality analysis

Motif M (feed-forward loop), target graph G

Set of matches of M in G:

$$\mathcal{G}_M = \{\, G_M \mid G_M \subseteq G \wedge G_M \simeq M \,\}$$

Motif-based centrality of a vertex v:

$$c_M(v) = \left|\{\, G_M \mid G_M \in \mathcal{G}_M \wedge v \in V(G_M) \,\}\right|$$

vertex  centrality
v2      3
v3      2
v4      2
v1      1
v5      1
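The motif-based centrality of a vertex is the number of motif occurrences in the target graph that contain it. The sketch below counts this for the feed-forward loop. The edge list is a hypothetical target graph (the slide's figure is not recoverable), chosen so that the resulting counts agree with the centrality values listed on this slide. The function names are illustrative.

```python
from collections import defaultdict

def feed_forward_loops(edges):
    """Return all matches (a, b, c) with edges a->b, b->c and a->c."""
    succ = defaultdict(set)
    for s, t in edges:
        if s != t:  # self-loops cannot be part of a feed-forward loop
            succ[s].add(t)
    matches = []
    for a in list(succ):
        for b in succ[a]:
            for c in succ.get(b, ()):
                if c in succ[a]:
                    matches.append((a, b, c))
    return matches

def motif_centrality(edges):
    """Number of feed-forward loop matches that contain each vertex."""
    counts = {v: 0 for e in edges for v in e}
    for match in feed_forward_loops(edges):
        for v in match:
            counts[v] += 1
    return counts

# Hypothetical directed target graph (an assumption, not the slide's graph).
target_edges = [
    ("v1", "v2"), ("v1", "v3"), ("v2", "v3"),
    ("v2", "v4"), ("v4", "v3"),
    ("v2", "v5"), ("v5", "v4"),
]
centrality = motif_centrality(target_edges)
print(centrality)
```

For this assumed graph the three matches are (v1, v2, v3), (v2, v4, v3) and (v2, v5, v4), giving v2 = 3, v3 = 2, v4 = 2, v1 = 1 and v5 = 1. Because the feed-forward loop has no non-trivial symmetry, each ordered triple corresponds to exactly one subgraph, so no match is counted twice.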

slide-25
SLIDE 25

Motif-based Centrality with Roles

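Different vertices have different roles
Count number of matches according to roles

Motif M (feed-forward loop) with roles A, B, C; target graph G

Set of matches of M in G:

$$\mathcal{G}_M = \{\, G_M \mid G_M \subseteq G \wedge G_M \simeq M \,\}$$

Motif-based centrality of a vertex v for role r:

$$c_M(v, r) = \left|\{\, G_M \mid G_M \in \mathcal{G}_M \wedge v \in V(G_M) \wedge \mathrm{role}(v, G_M) = r \,\}\right|$$

[Table: centrality per role (columns A, B, C) for vertices v1 to v5; the column layout was lost in extraction. Entries as extracted: 1 (v1), 2 (v2), and further entries 1, 1, 1, 2, 1]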
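Counting matches per role can be sketched as follows. The assignment of role names to positions in the feed-forward loop (A regulates B and C, B regulates C) is an assumption, as is the example graph. Neither is confirmed by the extracted slide.

```python
from collections import defaultdict

ROLES = ("A", "B", "C")  # assumed: A -> B, B -> C, A -> C

def role_centrality(edges):
    """For each vertex, count feed-forward loop matches per role."""
    succ = defaultdict(set)
    for s, t in edges:
        if s != t:  # ignore self-loops
            succ[s].add(t)
    counts = defaultdict(lambda: dict.fromkeys(ROLES, 0))
    for a in list(succ):
        for b in succ[a]:
            for c in succ.get(b, ()):
                if c in succ[a]:
                    # (a, b, c) is one match; credit each vertex with its role.
                    for vertex, role in zip((a, b, c), ROLES):
                        counts[vertex][role] += 1
    return dict(counts)

# Hypothetical directed target graph (an assumption, not the slide's graph).
role_edges = [
    ("v1", "v2"), ("v1", "v3"), ("v2", "v3"),
    ("v2", "v4"), ("v4", "v3"),
    ("v2", "v5"), ("v5", "v4"),
]
by_role = role_centrality(role_edges)
for vertex in sorted(by_role):
    print(vertex, by_role[vertex])
```

Vertices that take part in no match are absent from the result. Summing a vertex's three role counts gives its plain motif-based centrality.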

slide-26
SLIDE 26

Gene Regulatory Network of E. coli

Based on data from RegulonDB (http://regulondb.ccg.unam.mx/)
1250 vertices and 2515 edges
Global regulators?

slide-27
SLIDE 27

Motif-based Centrality with Roles for E. coli

Top 20 genes (of 1250): 11 of 18 global regulators (Martínez-Antonio and Collado-Vides)
Method also works for other networks
Even better results with different motifs

gene    centrality
crp     254
fnr     150
ihfAB   61
arcA    58
fis     40
modE    18
soxS    18
hns     14
fhlA    11
gadE    11
cpxR    11
rob     10
galR    8
gadX    8
gntR    6
fur     6
oxyR    6
tdcR    6
srlR    5
narL    5

[Role columns A, B, C: further entries whose row assignment was lost in extraction: 70, 53, 53, 39, 1, 26, 11, 1, 36, 1, 1, 95]

slide-28
SLIDE 28

Two Vague Ideas

Are scale-free and small-world networks relevant, or more of an artifact?

THE INTERNET, mapped on the opposite page, is a scale-free network in that some sites (starbursts and detail above) have a seemingly unlimited number of connections to other sites. This map, made on February 6, 2003, traces the shortest routes from a test Web site to about 100,000 others, using like colors for similar Web addresses.

Scientists have recently discovered that various complex systems have an underlying architecture governed by shared organizing principles. This insight has important implications for a host of applications, from drug development to Internet security.

[from Barabási and Bonabeau, Scientific American, May 2003]

slide-29
SLIDE 29

Degree Distribution - Examples

slide-30
SLIDE 30

Models for Networks of Complex Topology

Erdős-Rényi (1960)
Watts-Strogatz (1998)
Barabási-Albert (1999)

slide-31
SLIDE 31

The Erdős-Rényi [ER] Model (1960)

Start with n nodes and 0 edges
Connect each pair of vertices with probability p_ER
Many properties of these graphs appear quite suddenly, at a threshold value of p_ER
If p_ER ~ c/n with c < 1, then almost all nodes belong to isolated trees
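The ER construction and the statement about isolated trees can be explored with a short sketch. The function names, the union-find helper and the parameter values in the usage lines are assumptions made for illustration. The random result is not meant to reproduce any figure from the talk.

```python
import random
from collections import Counter
from itertools import combinations

def erdos_renyi(n, p_er, seed=None):
    """Connect each of the n*(n-1)/2 vertex pairs independently with probability p_er."""
    rng = random.Random(seed)
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p_er]

def tree_node_fraction(n, edges):
    """Fraction of nodes lying in components that are trees (edges = nodes - 1)."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)
    size = Counter(find(v) for v in range(n))
    edge_count = Counter(find(u) for u, _ in edges)
    in_trees = sum(s for root, s in size.items() if edge_count[root] == s - 1)
    return in_trees / n

# Usage: p_ER = c/n with c < 1 (values chosen arbitrarily for illustration).
n, c = 2000, 0.5
graph = erdos_renyi(n, c / n, seed=1)
print(tree_node_fraction(n, graph))
```

An isolated vertex counts as a trivial tree here. For c < 1 the printed fraction should be close to 1, and it drops once c grows past the threshold.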

slide-32
SLIDE 32

The Watts-Strogatz [WS] Model (1998)

Start with a regular network with n nodes
Rewire each edge with probability p
For p = 0 (regular network): high clustering coefficient C, high characteristic path length L
For p = 1 (random network): low clustering coefficient C, low characteristic path length L
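The rewiring procedure can be sketched as below. Several details are assumptions not stated on the slide: the regular network is taken to be a ring lattice in which each node is linked to its k nearest neighbours (k even, n > k), and rewiring moves one endpoint of an edge to a uniformly chosen node while avoiding self-loops and duplicate edges.

```python
import random

def ring_lattice(n, k):
    """Undirected ring: node i is linked to its k/2 nearest neighbours on each side."""
    return {frozenset((i, (i + j) % n)) for i in range(n) for j in range(1, k // 2 + 1)}

def watts_strogatz(n, k, p, seed=None):
    """Rewire each lattice edge with probability p; the edge count stays n*k/2."""
    rng = random.Random(seed)
    edges = ring_lattice(n, k)
    for i in range(n):
        for j in range(1, k // 2 + 1):
            if rng.random() < p:
                old = frozenset((i, (i + j) % n))
                # New endpoints must not create a self-loop or a duplicate edge.
                candidates = [w for w in range(n)
                              if w != i and frozenset((i, w)) not in edges]
                if candidates:
                    edges.remove(old)
                    edges.add(frozenset((i, rng.choice(candidates))))
    return edges

regular = watts_strogatz(20, 4, 0.0)
rewired = watts_strogatz(20, 4, 1.0, seed=1)
print(len(regular), len(rewired))
```

With p = 0 the lattice is returned unchanged. With p = 1 every edge is considered for rewiring, while the number of edges is preserved.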

slide-33
SLIDE 33

The Watts-Strogatz [WS] Model (1998)

There is a broad interval of p for which the characteristic path length L is small but the clustering coefficient C remains large
Small-world networks are common
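The two quantities can be computed as in the following sketch. It assumes C is the average of the local clustering coefficients and L is the mean shortest-path length over connected vertex pairs. The slide does not spell out either definition, so both are assumptions. The example evaluates the p = 0 case on a ring lattice with n = 20 nodes and 4 neighbours per node.

```python
from collections import deque

def adjacency(n, edges):
    """Adjacency sets of an undirected graph on nodes 0..n-1."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def clustering_coefficient(adj):
    """Mean over nodes of (links among neighbours) / (possible links among neighbours)."""
    total = 0.0
    for ns in adj.values():
        k = len(ns)
        if k < 2:
            continue  # nodes with fewer than two neighbours contribute 0
        links = sum(1 for a in ns for b in ns if a < b and b in adj[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)

def characteristic_path_length(adj):
    """Mean shortest-path length over all connected ordered pairs (BFS per node)."""
    total = pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0

# Regular ring lattice (p = 0): 20 nodes, each linked to its 4 nearest neighbours.
n_nodes = 20
lattice = [(i, (i + j) % n_nodes) for i in range(n_nodes) for j in (1, 2)]
adj = adjacency(n_nodes, lattice)
C = clustering_coefficient(adj)
L = characteristic_path_length(adj)
print(C, L)
```

For this lattice C is 0.5 and L is 55/19, roughly 2.89. Applying the same two functions to rewired versions of the lattice gives a way to look for the interval of p in which L has already dropped while C is still large.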

slide-34
SLIDE 34

The Barabási-Albert [BA] Model (1999)

Look at the distribution of degrees k
A scale-free network is a network where a small proportion of the nodes have a high degree of connection ("highly connected hubs")
The probability of finding a highly connected node decreases with k according to a power law
p(k) ~ k^(-γ): a given node has k connections to other nodes with a probability that follows a power-law distribution with γ ∈ [2, 3]
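A degree distribution with a few highly connected hubs can be generated with the growth and preferential-attachment mechanism commonly associated with the BA model. The slide itself does not describe this mechanism, so the sketch below is an assumption-laden illustration. The star-shaped seed graph, the function names and the parameters are chosen here for convenience.

```python
import random
from collections import Counter

def barabasi_albert(n, m, seed=None):
    """Grow a graph to n nodes; each new node attaches to m distinct existing
    nodes chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    edges = []
    pool = []  # every node appears in the pool once per incident edge
    for v in range(1, m + 1):  # assumed seed graph: a star centred on node 0
        edges.append((0, v))
        pool += [0, v]
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(pool))  # degree-proportional choice
        for target in chosen:
            edges.append((new, target))
            pool += [new, target]
    return edges

def degree_distribution(edges):
    """Empirical p(k): fraction of nodes that have degree k."""
    degree = Counter(v for e in edges for v in e)
    histogram = Counter(degree.values())
    return {k: count / len(degree) for k, count in sorted(histogram.items())}

ba_edges = barabasi_albert(200, 2, seed=0)
p_k = degree_distribution(ba_edges)
print(p_k)
```

Plotting `p_k` on log-log axes is the usual way to check for an approximately straight line, whose slope would estimate the exponent. With only 200 nodes the estimate is noisy.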

slide-35
SLIDE 35

The Barabási-Albert [BA] Model (1999)

slide-36
SLIDE 36

Protein Interaction Networks

Also other networks, e.g. transcript correlation networks

slide-37
SLIDE 37

Two Vague Ideas

Are scale-free and small-world networks relevant, or more of an artifact?
Taxonomy for centrality measures

slide-38
SLIDE 38

Taxonomy for Centrality Measures