Random matrix analysis for gene co-expression experiments in cancer cells



SLIDE 1

OIST-iTHES-CTSR 2016, July 9th, 2016

Random matrix analysis for gene co-expression experiments in cancer cells

Ayumi KIKKAWA (MTPU, OIST)

SLIDE 2

Introduction : What is co-expression of genes?

Jonsson, P.F. and Bates, P.A. (2006) Global topological features of cancer proteins in the human interactome. Bioinformatics, 22, 2291–2297.

  • There are roughly 20k to 30k genes in human DNA.
  • They include both coding and non-coding genes.
  • Complex networks of various transcripts.
  • Gene interaction network (regulatory network)
  • Protein-protein interactions.
  • mRNA, non-coding RNA, microRNA, etc.
  • Transcriptomes
  • Systems biology
SLIDE 3

From microarray experiments to gene interaction networks

  • NCBI GEO https://www.ncbi.nlm.nih.gov/gds

GEO is an international public repository that archives and freely distributes microarray, next-generation sequencing, and other forms of high-throughput functional genomics data submitted by the research community.

  • Series 70,997
  • Platforms 16,042
  • Samples 1,858,012
  • More than 10k gene expression levels are measured in a single assay.
  • Meta-analysis over many experiments is possible.
  • The gene interaction network should change its topology in various cellular states, including disease.
  • Gene interaction networks are inferred with a Bayesian network algorithm (SiGN-NNSR).
SLIDE 4

The cancer gene interaction database: TCNG

  • The Cancer Network Galaxy (TCNG) http://tcng.hgc.jp
  • Nonparametric Bayesian network algorithm (SiGN)
  • Tamada, Y. et al. Estimating genome-wide gene networks using nonparametric Bayesian network models on massively parallel computers. IEEE/ACM Trans. Comput. Biol. Bioinforma. 8, 683–697 (2011).
  • K computer (京), the RIKEN supercomputer
  • Based on 256 GEO datasets.
  • Total nodes = 22,820; total edges ~ 16M

[Figure: experimental data (expression of Gene1, Gene2, … across samples 1, 2, 3, …, n) → learning Bayesian network → Bayesian network 1, Bayesian network 2, …]

SLIDE 5

RMT analysis for gene interaction

  • Random matrix theory (RMT) can be applied to various biological networks; we have previously studied protein-protein interaction (PPI) networks.
  • In many organisms, the PPI network shows universal behavior: the nearest-neighbor level (NNL) spacing distribution P(s) follows the Wigner distribution.
  • The important feature of these level statistics is that the eigenvalues (levels) of the adjacency matrix repel each other.
  • This contrasts with the opposite case, where the levels are mutually uncorrelated and the spacing distribution follows the Poisson distribution.
  • The difference between the gene networks of normal and diseased cells is very important.
  • We apply RMT to cancer gene networks to study whether cancer cells show distinctive topological behavior.
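
For reference, the two limiting spacing distributions named on this slide can be written in their conventional forms, with the mean spacing normalized to 1. The slides do not spell out the formulas, so the Wigner form given here (the surmise for real symmetric matrices) is an assumption about which variant is meant.

```latex
P_{\mathrm{Wigner}}(s) = \frac{\pi}{2}\, s \exp\!\left(-\frac{\pi s^{2}}{4}\right),
\qquad
P_{\mathrm{Poisson}}(s) = \exp(-s)
```

Level repulsion appears as P_Wigner(0) = 0, whereas the Poisson form has its maximum at s = 0, so uncorrelated levels can lie arbitrarily close together.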

SLIDE 6

The workflow

SLIDE 7

The statistics of the TCNG data

  • Number of inferred edges
  • Number of samples

Frequency (edge attribute): an edge attribute calculated by SiGN-BN NNSR. It represents the frequency with which the edge was estimated during the iterations of the NNSR algorithm, and ranges from 0 to 1. By default, an edge with Freq greater than 0.2 is regarded as estimated. This value can be read as the confidence of the estimated edge; it represents neither the accuracy nor the strength of the edge.
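
As a minimal sketch of how the Freq threshold might be applied before a spectral analysis, the following builds an adjacency matrix from an edge list. The `(source, target, freq)` triple format, the function name `adjacency_from_edges`, and the toy data are hypothetical; the actual TCNG file format is not reproduced here. Ignoring edge direction to obtain a symmetric matrix is also an assumption, not something the slides state.

```python
import numpy as np

def adjacency_from_edges(edges, n_nodes, min_freq=0.2):
    """Build a symmetric 0/1 adjacency matrix from (i, j, freq) triples,
    keeping only edges whose frequency exceeds min_freq."""
    adj = np.zeros((n_nodes, n_nodes), dtype=float)
    for i, j, freq in edges:
        if freq > min_freq and i != j:
            adj[i, j] = 1.0
            adj[j, i] = 1.0  # assumption: edge direction is ignored
    return adj

# Hypothetical toy edge list: (source node, target node, Freq)
toy_edges = [(0, 1, 0.9), (1, 2, 0.25), (2, 3, 0.15), (0, 3, 0.6)]

adj_default = adjacency_from_edges(toy_edges, n_nodes=4)               # Freq > 0.2
adj_strict = adjacency_from_edges(toy_edges, n_nodes=4, min_freq=0.5)  # stricter cut

print(int(adj_default.sum()) // 2)  # number of undirected edges kept
print(int(adj_strict.sum()) // 2)
```

Raising `min_freq` keeps only the higher-confidence edges, which gives a sparser network over the same node set.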

SLIDE 8

Change from Poisson to Wigner-Dyson (WD) distribution with network size

#236 (GSE7904): 51 samples, 8000 nodes, 32,124 edges
#165 (GSE29013): 50 samples, 8000 nodes, 51,702 edges
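
The kind of spacing comparison shown on this slide can be sketched as follows. This is an illustration on synthetic matrices only, not the TCNG data. The unfolding by polynomial fit, the trimming of the spectrum edges, and all function names are assumptions, since the slides do not state which unfolding procedure was used.

```python
import numpy as np

def wigner_pdf(s):
    """Wigner surmise for real symmetric matrices, mean spacing 1."""
    return (np.pi / 2.0) * s * np.exp(-np.pi * s**2 / 4.0)

def poisson_pdf(s):
    """Spacing density for uncorrelated levels, mean spacing 1."""
    return np.exp(-s)

def unfolded_spacings(matrix, poly_degree=9, trim=0.1):
    """Nearest-neighbor spacings of the eigenvalues of a symmetric matrix,
    unfolded with a polynomial fit and rescaled to mean spacing 1."""
    levels = np.sort(np.linalg.eigvalsh(matrix))
    k = int(trim * levels.size)
    levels = levels[k:levels.size - k]           # keep the bulk of the spectrum
    counts = np.arange(1, levels.size + 1)       # empirical staircase N(E)
    x = (levels - levels.mean()) / levels.std()  # standardize for a stable fit
    smooth = np.polyval(np.polyfit(x, counts, poly_degree), x)
    spacings = np.diff(smooth)
    return spacings / spacings.mean()

def closest_reference(spacings, bins=20):
    """Label a spacing sample by the reference density its histogram is nearer to."""
    hist, edges = np.histogram(spacings, bins=bins, range=(0.0, 3.0), density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    err_wigner = np.mean((hist - wigner_pdf(centers)) ** 2)
    err_poisson = np.mean((hist - poisson_pdf(centers)) ** 2)
    return "Wigner" if err_wigner < err_poisson else "Poisson"

rng = np.random.default_rng(0)
n = 600
g = rng.normal(size=(n, n))
goe_like = (g + g.T) / 2.0                   # random real symmetric matrix
uncorrelated = np.diag(rng.uniform(size=n))  # independent levels on the diagonal

s_goe = unfolded_spacings(goe_like)
s_unc = unfolded_spacings(uncorrelated)
print(closest_reference(s_goe), closest_reference(s_unc))
```

For a network, `matrix` would be the adjacency matrix of the inferred gene interaction graph. Real adjacency spectra often contain degenerate eigenvalues, which this sketch does not treat specially.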

SLIDE 9

Change from Poisson to Wigner-Dyson (WD) distribution with the confidence factor of the edges

#18 (GSE11135): 204 samples, 21,001 edges
#26 (GSE12276): 204 samples, 51,994 edges

SLIDE 10

#92: 111 samples, 26,717 edges

SLIDE 11

Summary

i. From the viewpoint of RMT, we have observed universal behavior in the gene interaction networks of cancer cells, using data from the TCNG database.
ii. The nearest-neighbor spacing (NNS) distribution of the gene interaction matrix changes from the Poisson distribution to the Wigner distribution as the network size increases.
iii. The same change of the NNS distribution from Poisson to Wigner is also observed when the confidence threshold for inferred edges is made stricter.
iv. As far as our study goes, the Poisson distribution has so far been observed only in cancer-related molecular networks (PPI or gene interaction networks).