Using Network Flow to Bridge the Gap between Genotype and Phenotype
Teresa Przytycka
NIH / NLM / NCBI
Journal “Wisla” (1902): picture from a local fair in Lublin, Poland
Phenotypes Genotypes
Association studies
Genome 1, Genome 2, Genome 3, …, Genome n
Genotype: effects of genotypic variation:
- change in amino acid
- change in gene structure
- copy number variations ….
Phenotype (e.g. disease)
Genotype: effects of genotypic variation:
- change in amino acid
- change in gene structure
- copy number variations ….
Goals:
- A method for system-level analysis of the propagation of such perturbations in the network
- Prediction of “causal” mutations
- Prediction of master regulators (network hubs) involved in disease
- Prediction of pathways dys-regulated in disease
Propagation of the effects of copy number aberrations in glioma
[Figure: CNV data (chromosomes by cancer cases), gene expression data (genes 1 to n by cancer cases), and an integrated protein-protein, protein-DNA, and phosphorylation network]
[Figure, built up over several slides: copy number aberrations and/or mutations; gene expression; signature genes]
Method outline
- 1. Selecting marker genes to be used as “phenotype”
- 2. Genotype-phenotype association
- 3. Uncovering information flow between genotype and phenotype
- 4. Inferring dys-regulated genes, pathways, and causal mutations
Selecting “phenotype” genes
[Figure: gene expression data (genes 1 to n by cancer cases), with target genes marked]
Select the smallest set of genes such that each case is “covered” at least a specified number of times.
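The selection step above is a set multicover problem. The following is a minimal sketch using the standard greedy heuristic, not necessarily the method used in the talk. It assumes that a gene "covers" a case when it is, for example, differentially expressed in that case; the slide does not define coverage, so the `covers` mapping and all gene and case names are hypothetical. The greedy choice gives a small set but is not guaranteed to find the smallest one.

```python
def greedy_multicover(covers, cases, k):
    """Pick genes greedily until every case is covered at least k times.

    covers: dict mapping gene -> set of cases that the gene covers
            (what "covers" means is an assumption of this sketch).
    cases:  iterable of case identifiers.
    k:      required number of covering genes per case.
    """
    need = {case: k for case in cases}   # coverage still missing per case
    unused = dict(covers)                # genes not selected yet
    chosen = []

    def gain(gene):
        # Number of cases with missing coverage that this gene would help.
        return sum(1 for case in unused[gene] if need.get(case, 0) > 0)

    while any(n > 0 for n in need.values()):
        # Iterating in sorted order makes ties resolve alphabetically.
        best = max(sorted(unused), key=gain, default=None)
        if best is None or gain(best) == 0:
            raise ValueError("cannot cover every case k times")
        chosen.append(best)
        for case in unused.pop(best):
            if need.get(case, 0) > 0:
                need[case] -= 1
    return chosen


# Hypothetical toy data: four genes, four cancer cases.
covers = {
    "geneA": {"c1", "c2", "c3"},
    "geneB": {"c1", "c2"},
    "geneC": {"c3", "c4"},
    "geneD": {"c4"},
}
print(greedy_multicover(covers, ["c1", "c2", "c3", "c4"], k=1))
```

Raising `k` makes the selected "phenotype" more robust to noise in any single gene, at the cost of a larger gene set.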
Associations between copy number variations and gene expression of selected target genes
[Figure: gene expression data and CNV data, each by cancer cases]
Significant correlation between CNV and expression
[Figure, built up over several slides: gene expression data (genes 1 to n by cancer cases); labels: target gene, locus, candidate causal genes]
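As an illustration of the association step, the sketch below computes the Pearson correlation between the copy number at one locus and the expression of one target gene across cases, and estimates significance with a permutation test. The slides do not say which correlation measure or test was used, so both are assumptions, and the data values are invented for the example.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def permutation_pvalue(x, y, n_perm=1000, seed=0):
    """Two-sided permutation p-value for the correlation of x and y."""
    rng = random.Random(seed)
    observed = abs(pearson(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the pairing between cases
        if abs(pearson(x, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Invented values: copy number at one locus and expression of one
# target gene, measured in the same eight cases.
cnv = [2, 2, 3, 4, 1, 2, 3, 4]
expression = [5.0, 5.2, 6.1, 7.3, 3.9, 5.1, 6.0, 7.0]
print(pearson(cnv, expression), permutation_pvalue(cnv, expression))
```

In a genome-wide setting this test would be repeated for every locus and target gene pair, so some correction for multiple testing would be needed as well.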
Uncovering pathways of information flow between CNV and target gene
Using expression to guide path discovery
Translating probabilities into resistances
Resistance: set to favor the most likely path, based on gene expression values (inversely proportional to the average correlation of the expression of the adjacent genes with the expression of the target gene)
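A minimal sketch of this resistance rule for a single network edge follows. It assumes that absolute correlation values are used and that a small floor `eps` guards against division by zero; neither detail is stated on the slide, and the gene names and correlation values are hypothetical.

```python
def edge_resistance(corr_with_target, u, v, eps=1e-6):
    """Resistance of the edge (u, v) for one target gene.

    corr_with_target: dict mapping gene -> correlation of that gene's
    expression with the expression of the target gene.
    The resistance is inversely proportional to the average absolute
    correlation of the two adjacent genes (absolute value and eps floor
    are assumptions of this sketch).
    """
    average = (abs(corr_with_target[u]) + abs(corr_with_target[v])) / 2
    return 1.0 / max(average, eps)


# Hypothetical correlations with the target gene.
corr = {"A": 0.8, "B": 0.4, "C": 0.1}
print(edge_resistance(corr, "A", "B"))  # low resistance: likely path
print(edge_resistance(corr, "B", "C"))  # higher resistance
```

Edges between genes whose expression tracks the target gene get low resistance, so most of the current runs through them.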
Finding subnetworks with significant current flow
Putative causal variation
(with lots of additional caveats)
Causal copy number aberrations
Solve current flow for all pairs and find nodes belonging to many paths
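One way to turn per-pair current flows into hub candidates is sketched below. It assumes the edge currents for each (causal gene, target gene) pair have already been computed, and it counts, for every intermediate node, the number of pairs in which the current through that node reaches a threshold. The threshold, the throughput definition (half the sum of absolute currents on incident edges), and all names are assumptions of this sketch rather than details from the slides.

```python
from collections import Counter


def node_throughput(currents):
    """Current passing through each node for one source-target pair.

    currents: dict mapping edge (u, v) -> signed current on that edge.
    For an intermediate node, inflow equals outflow, so half the sum of
    absolute incident currents is the amount flowing through it.
    """
    total = Counter()
    for (u, v), current in currents.items():
        total[u] += abs(current) / 2
        total[v] += abs(current) / 2
    return total


def count_hub_memberships(currents_by_pair, threshold):
    """Count, per node, the pairs in which it carries >= threshold current."""
    counts = Counter()
    for pair, currents in currents_by_pair.items():
        for node, flow in node_throughput(currents).items():
            if node not in pair and flow >= threshold:
                counts[node] += 1
    return counts


# Hypothetical currents for two (causal gene, target gene) pairs.
currents_by_pair = {
    ("c1", "t1"): {("c1", "h"): 1.0, ("h", "t1"): 1.0},
    ("c2", "t1"): {("c2", "h"): 0.7, ("h", "t1"): 0.7,
                   ("c2", "x"): 0.3, ("x", "t1"): 0.3},
}
print(count_hub_memberships(currents_by_pair, threshold=0.5))
```

Nodes with the highest counts lie on high-current paths for many pairs and are natural candidates for network hubs.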
Hubs
Are there common functional pathways?
Common GO pathways
Design details under the hood
- Current flow reduces to solving a set of linear equations (Kirchhoff's laws). Caveat: we had to solve a linear system with 20,000 variables thousands of times for the permutation test, which required new methodology.
- Many biological interactions are directional. This can be taken care of by solving a linear program with corresponding constraints. Caveat: the network is too big for solving thousands of linear programs.
- Null model and p-value estimations
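To make the first point concrete, here is a toy sketch of current flow as a linear system on an undirected network. One unit of current enters at the source, the target is grounded at potential 0, and Kirchhoff's current law at every other node gives one equation per node. The small dense Gaussian elimination is only for illustration; it is not the methodology developed for the 20,000-variable system, and it ignores edge direction. Node names and resistances are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system (is the network connected?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        rest = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - rest) / m[r][r]
    return x


def current_flow(edges, source, target):
    """Return (potentials, edge currents) for unit current source -> target.

    edges: dict mapping (u, v) -> positive resistance of that edge.
    """
    nodes = sorted({node for edge in edges for node in edge})
    free = [node for node in nodes if node != target]  # target is grounded
    index = {node: i for i, node in enumerate(free)}
    size = len(free)
    laplacian = [[0.0] * size for _ in range(size)]
    for (u, v), resistance in edges.items():
        conductance = 1.0 / resistance
        for a, b in ((u, v), (v, u)):
            if a in index:
                laplacian[index[a]][index[a]] += conductance
                if b in index:
                    laplacian[index[a]][index[b]] -= conductance
    injected = [0.0] * size
    injected[index[source]] = 1.0  # one unit of current enters here
    solution = solve_linear(laplacian, injected)
    potential = {node: solution[index[node]] for node in free}
    potential[target] = 0.0
    # Ohm's law on every edge.
    currents = {(u, v): (potential[u] - potential[v]) / r
                for (u, v), r in edges.items()}
    return potential, currents


# Two parallel routes from s to t with total resistance 2 and 4.
edges = {("s", "a"): 1.0, ("a", "t"): 1.0, ("s", "b"): 2.0, ("b", "t"): 2.0}
potential, currents = current_flow(edges, "s", "t")
print(currents)
```

In this example the lower-resistance route through `a` carries two thirds of the current and the route through `b` carries one third, which is the behavior the resistance assignment relies on to favor likely paths.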
Kim, Wuchty, Przytycka – PloS Comp Bio 2011
Kim, Przytycki, Wuchty, Przytycka – Phys. Bio. 2011
Acknowledgments
Group members: Yoo-Ah Kim, DongYeon Cho, Yang Huang, Jan Hoinka, Xiangjun Du, Damian Wojtowicz, Raheleh Salari
Collaborators: Jozef Przytycki (GWU), Stefan Wuchty (NCBI)
Journal “Wisla” (1902): picture from a local fair in Lublin, Poland; my great-great uncle (the “Giant”)
Acknowledgments
Group members: Yoo-Ah Kim, DongYeon Cho, Yang Huang, Jan Hoinka, Xiangjun Du, Damian Wojtowicz, Raheleh Salari
Collaborators: Stefan Wuchty (NCBI), Brian Oliver (NIDDK), John Malone, Nicolas Mattiuzzo, Justin Andrews (Indiana University), Jozef Przytycki (GWU)
Impact of gene copy number on gene expression in Drosophila melanogaster
[Figure: expression fold change (log2) versus expression (wild type)]
Collaboration with Brian Oliver group (NIDDK)
CNV-related perturbations propagate through the interaction network
Co-complex network from Artavanis-Tsakonas group (unpublished)
Impact of copy number on gene expression in glioma
[Figure: CNV by chromosome; correlation between CNV and expression]
Genotype: effects of genotypic variation:
- change in amino acid
- change in gene structure
- copy number variations …
Molecular phenotypes:
- gene expression
- metabolite level
Phenotype
Copy number variations (CNV) (gene dosage)
- implicated in a large number of human diseases (cancer, Crohn's disease, autism)
- 28,025 structural variants identified in the 1000 Genomes study (2,000 changes affecting full genes or exons)
- a frequent type of somatic mutation in cancer