Introduction to Microarray Data Analysis and Gene Networks lecture - PowerPoint PPT Presentation

Introduction to Microarray Data Analysis and Gene Networks lecture 8 Alvis Brazma European Bioinformatics Institute

Lecture 8 • Gene networks – part 2 – Network topology (part 2) – Network logics – Network dynamics

Gene Networks - four levels of hierarchical description • Parts list – genes, transcription factors, promoters, binding sites, … • Topology – a graph describing the connections between the parts • Control logics – how combinations of regulatory signals interact (e.g., promoter logics) • Dynamics – how does it all work in real time

The arcs can have different meanings - The product of gene G1 is a transcription factor that binds to the promoter of gene G2 (as detected in a ChIP-chip experiment): the arc G1 → G2 belongs to a physical interaction network (direct network) - The disruption of gene G1 changes the expression level of gene G2: the arc G1 → G2 belongs to a data interpretation network (indirect network)

How do the two networks compare? • How much do the two networks have in common? • We can look at the intersection of the networks and check whether the common parts are supported by existing knowledge • Are the target sets of the transcription factors present in both networks similar? • Are the network topology properties (e.g., connectivity) similar?

A couple of simple notions • Any gene (node in the graph) with outgoing edges is called a source gene (source node) • Any gene with incoming edges is called a target gene (target node) • The target set of a source gene is the set of its target genes
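These notions can be sketched in a few lines, assuming the network is stored as a list of directed (source, target) pairs. The gene names and the helper `target_sets` are illustrative and do not come from the lecture.

```python
from collections import defaultdict

def target_sets(edges):
    """Map each source gene to the set of its target genes."""
    targets = defaultdict(set)
    for source, target in edges:
        targets[source].add(target)
    return dict(targets)

# Hypothetical toy network: G1 -> G2, G1 -> G3, G2 -> G3.
edges = [("G1", "G2"), ("G1", "G3"), ("G2", "G3")]
tsets = target_sets(edges)

source_genes = set(tsets)              # genes with outgoing edges
target_genes = {t for _, t in edges}   # genes with incoming edges

print(sorted(source_genes))  # ['G1', 'G2']
print(sorted(target_genes))  # ['G2', 'G3']
```

A gene can be both a source and a target, as G2 is here.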

A problem: • Both networks depend on the chosen significance threshold, i.e., the level of microarray signal required to draw an edge in the network
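The dependence on the threshold can be illustrated with a small sketch. It assumes each candidate edge carries a numeric signal and that an edge is drawn when the absolute signal reaches a cut-off γ; the scores and the function name `edges_at_threshold` are hypothetical.

```python
def edges_at_threshold(scores, threshold):
    """Keep an edge (source, target) when |signal| >= threshold."""
    return {pair for pair, signal in scores.items() if abs(signal) >= threshold}

# Hypothetical signals for (disrupted gene, affected gene) pairs.
scores = {
    ("G1", "G2"): 3.4,
    ("G1", "G3"): -2.2,
    ("G2", "G3"): 2.6,
    ("G3", "G1"): 0.7,
}

for gamma in (2.0, 2.5, 3.0):
    # A stricter threshold can only remove edges, never add them.
    print(gamma, len(edges_at_threshold(scores, gamma)))
```

With these toy values the network shrinks from 3 edges to 2 and then to 1 as γ increases.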

The size of the networks for different significance thresholds

Network                   ChIP       ChIP       mutant     mutant     mutant
Threshold                 p<0.01     p<0.001    γ=2.0      γ=2.5      γ=3.0
source genes              202        169        250        236        226
target genes              4939       2845       5396       4778       3920
genes                     4980       2930       5654       4798       3959
edges                     18842      6170       32017      17436      10356
edges, same role (*)      3694       857        4096       2425       1507
                          (19.6%)    (13.9%)    (12.8%)    (13.9%)    (14.6%)
edges per source gene     93.3       36.5       135.7      73.8       45.6

(*) edges where the source gene and the target gene have the same cellular role annotation in YPD (http://www.proteome.com)

How do the two networks compare? • How much do the two networks have in common? • We can look at the intersection of the networks and check whether the common parts are supported by existing knowledge • Are the target sets of the transcription factors present in both networks similar? • Are the network topology properties (e.g., connectivity) similar?
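When each network is stored as a set of directed edges, the intersection is a plain set operation. The sketch below uses made-up edges and a hypothetical helper `compare_networks`; it is not the procedure used in the lecture.

```python
def compare_networks(edges_a, edges_b):
    """Split the edges of two networks into shared and network-specific parts."""
    return {
        "shared": edges_a & edges_b,
        "only_a": edges_a - edges_b,
        "only_b": edges_b - edges_a,
    }

# Hypothetical edge sets for a ChIP-chip network and a disruption network.
chip = {("G1", "G2"), ("G1", "G3"), ("G2", "G4")}
disruption = {("G1", "G2"), ("G1", "G4"), ("G2", "G4")}

result = compare_networks(chip, disruption)
print(sorted(result["shared"]))  # [('G1', 'G2'), ('G2', 'G4')]
```

An edge counts as shared only when both the source and the target agree, so edge direction matters.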

Intersection of the networks: many connections are consistent with our a priori knowledge

[Figure 6: graph of the intersection of the two networks. Node labels: YNL313C YOX1 PDS1 GPA1 YJR030C UFE1 ARG10 YLR104W YDR115W KAR4 ARO1 ARG5 SPT21 FUS1 CDC21 STE12 MUT5 RAD27 CPA2 GCN4 LEU4 PDS5 RFA2 STE2 SST2 IRR1 MET22 ECM40 DIN7 HOM3 GSH1 ERP3 GIC2 YJL073W YAP1 YBR070C YHR149C GIN4 MNN5 SMC3 SWI6 SGA1 YLR460C DUN1 PCL1 YLR103C PCL2 MBP1 YER079W RNR1 PRY2 PLB3 SVS1 YHR150W ABF1 SIC1 YKL185W YDR528W YPL158C YGR086C YLR297W SWI5 YLR194C YER128W HCM1 SWI4 CHS1 MCD1 YPL267W PST1 CCW6 SWE1 YLR049C YPR157W MNN1 CIS3 SCW10 CLB2 YER078C]

How do the two networks compare? • How much do the two networks have in common? • We can look at the intersection of the networks and check whether the common parts are supported by existing knowledge • Are the target sets of the transcription factors present in both networks similar? • Are the network topology properties (e.g., connectivity) similar?

How do the ChIP-chip and disruption networks relate? [Diagram: a transcription factor t with its regulation set and a disrupted gene h with its effectual set, both drawn as subsets of all genes]

How do the ChIP-chip and disruption networks relate? [Diagram: a gene g that is both a transcription factor and a disrupted gene, with its regulation set and its effectual set drawn as subsets of all genes]

How to estimate whether the overlap is larger than expected by chance? Let G be the set of all genes, R the regulation set and E the effectual set. We assume that the elements of the set E are marked, and pick a set of size |R| at random from G. Then the size x = |R ∩ E| of the intersection is distributed according to the hypergeometric distribution. The probability of observing an intersection of size k or larger can be computed according to the formula:

$$P(x \ge k) = 1 - \sum_{i=0}^{k-1} \frac{\binom{|E|}{i}\binom{|G|-|E|}{|R|-i}}{\binom{|G|}{|R|}}$$
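A minimal sketch of this tail probability follows, using only the Python standard library (`math.comb`, Python 3.8+). The function name `overlap_p_value` and the toy numbers are assumptions for illustration; the arguments correspond to |G|, |E|, |R| and k.

```python
from math import comb

def overlap_p_value(n_all, n_marked, n_drawn, k):
    """P(x >= k) when n_drawn genes are picked at random from n_all genes,
    n_marked of which are marked (hypergeometric distribution)."""
    total = comb(n_all, n_drawn)
    # Number of draws with fewer than k marked genes (i = 0 .. k-1).
    below = sum(
        comb(n_marked, i) * comb(n_all - n_marked, n_drawn - i)
        for i in range(min(k, n_drawn + 1))
    )
    return (total - below) / total

# Toy example: 10 genes, 4 marked, 3 drawn, overlap of at least 2.
p = overlap_p_value(10, 4, 3, 2)
print(round(p, 4))  # 0.3333
```

The counts are kept as exact integers until the final division, which avoids rounding problems for genome-sized sets.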

How do the ChIP-chip and disruption networks relate? [Diagram labels: transcription factors 146; overlap 23 (9); disrupted genes 213] Of the 23 transcription factors studied in both networks, only 9 have target sets that overlap more than expected by chance

Of the 23 transcription factors studied in both networks, only 9 have target sets that overlap more than expected by chance • Is it as bad as it may look? – We would expect many indirect connections in the disruption network that are not present in the ChIP network. Is this the case?

Direct vs. indirect interactions [Diagram: three nodes X, Y and Z; two arcs are labelled "Direct" and one arc is labelled "Indirect"]
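One way to make the distinction concrete is to list the pairs connected by a chain of two or more direct arcs. The sketch assumes the reading X → Y and Y → Z direct, X → Z indirect, which the slide text does not spell out; the helper names `reachable` and `indirect_pairs` are hypothetical.

```python
def reachable(succ, start):
    """Nodes reachable from `start` by one or more direct arcs."""
    seen, stack = set(), list(succ.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(succ.get(node, ()))
    return seen

def indirect_pairs(direct_edges):
    """Pairs (x, z), x != z, joined by a path of two or more direct arcs."""
    succ = {}
    for x, y in direct_edges:
        succ.setdefault(x, set()).add(y)
    pairs = set()
    for x, successors in succ.items():
        for y in successors:
            pairs.update((x, z) for z in reachable(succ, y) if z != x)
    return pairs

direct = {("X", "Y"), ("Y", "Z")}
print(indirect_pairs(direct))  # {('X', 'Z')}
```

A disruption-network edge that is absent from the direct network but appears in `indirect_pairs(direct)` could then be explained as an indirect effect.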

[Figure: network diagram. Node labels: GLN3 RTG1 YAP1 GCN4 BAS1 YAP6 ROX1 HIS4 ADE3 ADE13 ADE17 ADE4 YOL158C FET4 LYS2 YHM1 ARO3 ARO1 ARG4 YJL200C CPA2 MBP1 SWI6 SWI4 RNR1 NDD1 YBR070C GIC2 SVS1 SOK2 YNL058C GDH3 ECM33 SWI5 SLY1 YDR451C YER189W YER190W PMA1 YGL114W YHL029C YIL158W YJL051W CIS3 SUR7 CDC5 CLN1 SRL1 YOR248W YOR315W CLB2 NCE102 YBL029W UTR2]

Of the 23 transcription factors studied in both networks, only 9 have target sets that overlap more than expected by chance • Is it as bad as it may look? – We would expect many indirect connections in the disruption network that are not present in the ChIP network. Is this the case? There is anecdotal evidence that it is – What about the connections present in the ChIP network but not in the disruption network? They can be explained by nonfunctional relationships in the ChIP network and by combinatorial regulatory effects

Conclusions • We would like to think that the networks have enough in common for both to be meaningful, but at the same time there is apparently a lot of noise in at least one of them

How do the two networks compare? • How much do the two networks have in common? • We can look at the intersection of the networks and check whether the common parts are supported by existing knowledge • Are the target sets of the transcription factors present in both networks similar? • Are the network topology properties (e.g., connectivity) similar, and what are they?

Degree of a node in a graph • The central node has degree = 7, indegree = 3, outdegree = 4
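The degree counts on this slide can be reproduced with a short sketch. The edge list is a made-up star around a node named "c" with 3 incoming and 4 outgoing edges; the function name `degrees` is illustrative.

```python
from collections import Counter

def degrees(edges):
    """Return (indegree, outdegree) counters for a list of directed edges."""
    outdegree = Counter(source for source, _ in edges)
    indegree = Counter(target for _, target in edges)
    return indegree, outdegree

# Hypothetical central node "c": 3 incoming and 4 outgoing edges.
edges = [("a", "c"), ("b", "c"), ("d", "c"),
         ("c", "e"), ("c", "f"), ("c", "g"), ("c", "h")]
indegree, outdegree = degrees(edges)

# The degree of a node is the sum of its indegree and outdegree.
print(indegree["c"], outdegree["c"], indegree["c"] + outdegree["c"])  # 3 4 7
```

`Counter` returns 0 for nodes with no edges of the given kind, so no special case is needed for pure sources or pure targets.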

Important genes and genes with complex regulation • Most genes have only a few incoming or outgoing edges, but some have very many (>500) [Two plots, titled Indegree and Outdegree]

Genes with highest in- and out-degree

γ     Outdegree group                 m     n     Indegree group               m     n
2.0   Carbohydrate metabolism         363   4     Amino-acid metabolism        9     194
2.0   RNA turnover                    353   4     Nucleotide metabolism        6     82
2.0   Meiosis                         244   3     Energy generation            5     242
2.0   Cell stress                     207   9     Small molecule transport     5     343
2.0   Protein translocation           197   3     Other metabolism             5     148
2.8   RNA turnover                    110   4     Amino-acid metabolism        4     167
2.8   Cell stress                     62    8     Nucleotide metabolism        3     67
2.8   Meiosis                         54    3     Energy generation            2     184
2.8   Protein synthesis               53    7     Differentiation              2     43
2.8   Cell wall maintenance           47    6     Small molecule transport     2     286
3.6   RNA turnover                    48    4     Small molecule transport     2     230
3.6   RNA processing/modification     41    4     Other metabolism             2     96
3.6   Cell stress                     27    8     Nucleotide metabolism        2     58
3.6   Small molecule transport        19    8     Mating response              2     57
3.6   Cell wall maintenance           19    6     Amino-acid metabolism        2     133

Cellular role table showing the top 5 groups with the highest median degrees for the networks with γ = 2.0, 2.8 and 3.6, with a minimum group size of 3 for outdegree and 40 for indegree (m: median degree, n: number of genes per group)
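The summary statistic in this table (median degree per cellular-role group, subject to a minimum group size) could be computed along the following lines. This is a sketch under assumptions: the degree values, role annotations and the function name `top_groups_by_median_degree` are invented for illustration.

```python
from statistics import median

def top_groups_by_median_degree(degree, roles, min_size, top=5):
    """Rank role groups by median degree, skipping groups below min_size.

    degree: gene -> degree; roles: gene -> cellular role annotation.
    Returns a list of (role, median degree m, group size n).
    """
    groups = {}
    for gene, role in roles.items():
        groups.setdefault(role, []).append(degree.get(gene, 0))
    ranked = [
        (role, median(values), len(values))
        for role, values in groups.items()
        if len(values) >= min_size
    ]
    return sorted(ranked, key=lambda row: row[1], reverse=True)[:top]

# Hypothetical outdegrees and role annotations.
degree = {"g1": 10, "g2": 30, "g3": 20, "g4": 1, "g5": 3, "g6": 100}
roles = {"g1": "Meiosis", "g2": "Meiosis", "g3": "Meiosis",
         "g4": "Cell stress", "g5": "Cell stress", "g6": "Other"}

for role, m, n in top_groups_by_median_degree(degree, roles, min_size=2):
    print(role, m, n)
```

The single-gene group "Other" is dropped by the size filter even though its one gene has the highest degree, which is the purpose of the minimum group size.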
