Analysis of Gene Expression Profiles - PowerPoint PPT Presentation

Analysis of Gene Expression Profiles and Drug Activity Patterns for the Molecular Pharmacology of Cancer
Jeong-Ho Chang, Kyu-Baek Hwang, and Byoung-Tak Zhang
School of Computer Science and Engineering, Seoul National University, 151-742 Seoul, Korea
http://bi.snu.ac.kr

Outline
• Introduction
• Analyzing Cell-Cell Relations through Clustering
  ♦ Experimental Results
• Analyzing Gene-Drug Relations Using Bayesian Networks
  ♦ Experimental Results
• Concluding Remarks

Mining on Gene Expression and Drug Activity Data
• Relationships among human cancer, gene expression, and drug activity
  [Figure: diagram linking human cancer, gene expression, and drug activity]
• Revealing these relationships can point to
  ♦ Causes and mechanisms of cancer development
  ♦ New molecular targets for anti-cancer drugs

NCI60 Cell Lines Data Set
• From 60 human cancer cell lines [Scherf 00]
  ♦ Colorectal, renal, ovarian, breast, prostate, lung, and central nervous system cancers, as well as leukemias and melanomas
• Gene expression patterns
  ♦ cDNA microarray
• Individual targets
  ♦ Analysis of molecular characteristics other than mRNA expression
• Drug activity patterns
  ♦ Sulphorhodamine B assay: changes in total cellular protein after 48 hours of drug treatment

Analytical Effort
• Analysis of cell-cell relationships using cluster analysis
  ♦ Clustering of cell lines based on
    – Gene expression patterns only
    – Drug activity patterns only
    – Both patterns combined with weighted similarity
• Analysis of gene-drug correlations using Bayesian networks
  ♦ Analysis of gene expression-drug activity dependencies
    – Each cell line is represented by its gene expression profile and drug activity pattern.
    – Bayesian networks are constructed and analyzed to discover dependencies between gene expression and drug activity.

Analyzing Cell-Cell Relations through Clustering

Clustering Methods
• Soft Topographic Vector Quantization (STVQ) [Graepel 98]
  ♦ Based on statistical physics
  ♦ Soft clustering + topographic mapping
  ♦ Clustering as an optimization problem
  ♦ Learned by deterministic annealing
• Probability that data point x_i is assigned to cluster C_j:
  $$P(x_i \in C_j) = \frac{\exp\left(-\beta \sum_k h_{jk}\, e_{ik}(x_i, c_k)\right)}{\sum_{j'} \exp\left(-\beta \sum_k h_{j'k}\, e_{ik}(x_i, c_k)\right)}$$
  – h_{jk}: neighborhood function between clusters j and k
[Figure: deterministic annealing, with phase transitions]
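The assignment probability above can be sketched in a few lines of Python. This is only an illustration under assumptions the slide does not state: the error e is taken to be half the squared Euclidean distance, the clusters sit on a 1-D chain, and the neighborhood function is a Gaussian normalized so that each row sums to one. The names `neighborhood_row`, `half_squared_distance`, and `assignment_probabilities` are hypothetical.

```python
import math

def neighborhood_row(j, m, sigma):
    """Assumed Gaussian neighborhood h_jk on a 1-D chain, normalized over k."""
    raw = [math.exp(-((j - k) ** 2) / (2.0 * sigma ** 2)) for k in range(m)]
    total = sum(raw)
    return [value / total for value in raw]

def half_squared_distance(x, c):
    """Assumed error e(x, c): half the squared Euclidean distance."""
    return 0.5 * sum((a - b) ** 2 for a, b in zip(x, c))

def assignment_probabilities(x, centers, beta, sigma=0.5):
    """Return P(x in C_j) for every cluster j at inverse temperature beta."""
    m = len(centers)
    errors = [half_squared_distance(x, c) for c in centers]
    # Neighborhood-smoothed cost of assigning x to cluster j.
    costs = []
    for j in range(m):
        h_j = neighborhood_row(j, m, sigma)
        costs.append(sum(h_j[k] * errors[k] for k in range(m)))
    # Shift by the minimum cost before exponentiating, for numerical stability.
    low = min(costs)
    weights = [math.exp(-beta * (cost - low)) for cost in costs]
    total = sum(weights)
    return [w / total for w in weights]

centers = [[0.0, 0.0], [2.0, 2.0], [4.0, 4.0]]
x = [1.9, 2.1]
soft = assignment_probabilities(x, centers, beta=0.1)   # high temperature
hard = assignment_probabilities(x, centers, beta=50.0)  # low temperature
```

At small β the assignments are spread over all clusters; as β grows during annealing they sharpen toward a single cluster, which is the behavior deterministic annealing relies on.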

Clustering of Cell Lines Based on Gene Expression Profiles
• Of ten runs, the result with the best cost value is shown here.
• Neighboring clusters show similar patterns, as in the SOM.
• The clusters formed tend to reflect the tissue of origin.
  ♦ CNS, RE, ME, LE, and CO

Using Drug Activity Information in the Analysis of Cell-Cell Relations (1/3)
• Questions
  ♦ Are drug activity patterns in cell lines also related to the tissue of origin?
  ♦ Is this relationship similar to that of gene expression profiles?
• Cluster analysis based on combined gene-drug information
  $$e_{jk} = (1 - \alpha)\, e^{g}_{jk} + \alpha\, e^{d}_{jk}$$
• A linear interpolation of the distances to cluster k based on gene expression (e^g_{jk}) and drug activity (e^d_{jk}).
• If both patterns depend on the tissue of origin, the cluster structure will not differ strongly.
[Figure: cluster k, gene expressions, drug activities]
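The interpolated distance can be illustrated with a short Python sketch. It assumes squared Euclidean distance for both the gene expression part and the drug activity part, which the slide does not specify; `squared_distance` and `combined_distance` are hypothetical names.

```python
def squared_distance(x, c):
    """Assumed distance: squared Euclidean distance between x and prototype c."""
    return sum((a - b) ** 2 for a, b in zip(x, c))

def combined_distance(gene_x, drug_x, gene_c, drug_c, alpha):
    """e = (1 - alpha) * e_gene + alpha * e_drug, with alpha in [0, 1]."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    e_gene = squared_distance(gene_x, gene_c)
    e_drug = squared_distance(drug_x, drug_c)
    return (1.0 - alpha) * e_gene + alpha * e_drug

# Toy cell line (gene part, drug part) and toy cluster prototype.
gene_only = combined_distance([1.0, 2.0], [0.5], [0.0, 2.0], [2.5], alpha=0.0)
drug_only = combined_distance([1.0, 2.0], [0.5], [0.0, 2.0], [2.5], alpha=1.0)
mixed = combined_distance([1.0, 2.0], [0.5], [0.0, 2.0], [2.5], alpha=0.5)
```

Setting α = 0 clusters on gene expression alone and α = 1 on drug activity alone; intermediate values weight the two sources.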

Using Drug Activity Information in the Analysis of Cell-Cell Relations (2/3)
• Quantitative comparison between the clustering analyses
  ♦ Entropy
    $$E = \sum_{j=1}^{m} \frac{n_j E_j}{n}, \qquad E_j = -\sum_i p_{ij} \log p_{ij} \qquad (0 \le E_j \le \log n_j)$$
    – p_{ij}: the ratio of members in cluster j which belong to class i
    – n_j: the number of members in cluster j
    – If the number of clusters is fixed, a higher entropy means that the original class structure is reflected less well.
  ♦ Averaged Pearson correlation
    $$R = \sum_{j=1}^{m} \frac{n_j R_j}{n}, \qquad R_j = \frac{2}{n_j (n_j - 1)} \sum_{i < k} r(x_i, x_k)$$
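Both measures are easy to compute directly from a clustering. The following Python sketch is one possible reading of the formulas: it assumes natural logarithms and lets clusters with fewer than two members contribute R_j = 0, neither of which is stated on the slide. The function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

def clustering_entropy(label_clusters):
    """E = sum_j (n_j / n) * E_j, where each cluster is a list of class labels."""
    n = sum(len(members) for members in label_clusters)
    total = 0.0
    for members in label_clusters:
        n_j = len(members)
        if n_j == 0:
            continue
        # E_j = -sum_i p_ij log p_ij over the classes present in cluster j.
        e_j = -sum((c / n_j) * math.log(c / n_j)
                   for c in Counter(members).values())
        total += n_j / n * e_j
    return total

def pearson(x, y):
    """Pearson correlation coefficient r(x, y) of two equal-length profiles."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def average_correlation(profile_clusters):
    """R = sum_j (n_j / n) * R_j, where each cluster is a list of profiles."""
    n = sum(len(members) for members in profile_clusters)
    total = 0.0
    for members in profile_clusters:
        n_j = len(members)
        if n_j < 2:
            continue  # assumption: no pairs, so the cluster contributes 0
        pairs = list(combinations(members, 2))  # n_j (n_j - 1) / 2 pairs
        r_j = sum(pearson(a, b) for a, b in pairs) / len(pairs)
        total += n_j / n * r_j
    return total

pure = clustering_entropy([["CNS", "CNS"], ["LE", "LE"]])
mixed = clustering_entropy([["CNS", "LE"], ["CNS", "LE"]])
up, up2, down = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 2.0, 1.0]
balanced = average_correlation([[up, up2], [up, down]])
```

A clustering that separates the classes perfectly has entropy 0, while the averaged correlation rewards clusters whose members have similar profiles.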

Using Drug Activity Information in the Analysis of Cell-Cell Relations (3/3)
[Figure: two plots against the value of alpha (0.0 to 1.0): average Pearson correlation with varying α, and clustering entropy with varying α. Legend: 15Clusters_Gene, 15Clusters_Drug, 11Clusters_Gene, 11Clusters_Drug.]

Clustering of Cell Lines Based on Drug Activity Patterns
• Of ten runs, the result with the best cost value is shown here.
• The clusters do not reflect the tissue of origin as well as those based on gene expression profiles.

Analyzing Gene-Drug Relations Using Bayesian Networks

Bayesian Networks
• The joint probability distribution over all the variables in the Bayesian network [Heckerman 96]:
  $$P(X_1, X_2, \ldots, X_n) = \prod_{i=1}^{n} P(X_i \mid Pa_i)$$
  – P(X_i | Pa_i): local probability distribution for X_i
  – Pa_i: the set of parents of X_i
• Parameters of the local distributions
  $$\Theta_i = (\theta_{i1}, \ldots, \theta_{iq_i}): \text{parameter for } P(X_i \mid Pa_i)$$
  $$P(\theta_{ij}) = \mathrm{Dir}(\theta_{ij} \mid \alpha_{ij1}, \ldots, \alpha_{ijr_i})$$
  – q_i: number of configurations of Pa_i
  – r_i: number of states of X_i
• Example with five variables
  $$P(A, B, C, D, E) = P(A)\, P(B \mid A)\, P(C \mid A, B)\, P(D \mid A, B, C)\, P(E \mid A, B, C, D)$$
  $$= P(A)\, P(B)\, P(C \mid A, B)\, P(D \mid B)\, P(E \mid C)$$
  [Figure: network over the nodes A, B, C, D, and E]
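The five-variable factorization can be checked numerically. The Python sketch below uses binary variables and made-up conditional probability tables (every number is hypothetical) to build the joint distribution from the local distributions P(A), P(B), P(C | A, B), P(D | B), and P(E | C).

```python
from itertools import product

# Hypothetical probabilities of the value 1 for binary variables (values 0/1).
p_a1 = 0.3
p_b1 = 0.6
p_c1_given_ab = {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.4, (1, 1): 0.9}
p_d1_given_b = {0: 0.2, 1: 0.7}
p_e1_given_c = {0: 0.05, 1: 0.8}

def bernoulli(p_one, value):
    """Probability of `value` (0 or 1) when P(value = 1) is p_one."""
    return p_one if value == 1 else 1.0 - p_one

def joint(a, b, c, d, e):
    """P(A,B,C,D,E) = P(A) P(B) P(C|A,B) P(D|B) P(E|C)."""
    return (bernoulli(p_a1, a)
            * bernoulli(p_b1, b)
            * bernoulli(p_c1_given_ab[(a, b)], c)
            * bernoulli(p_d1_given_b[b], d)
            * bernoulli(p_e1_given_c[c], e))

# The factored joint is a proper distribution: it sums to one.
total = sum(joint(*values) for values in product((0, 1), repeat=5))

# Marginals follow by summing out the other variables, e.g. P(E = 1).
p_e1 = sum(joint(a, b, c, d, 1) for a, b, c, d in product((0, 1), repeat=4))
```

For binary variables the factored form needs 1 + 1 + 4 + 2 + 2 = 10 free parameters, compared with 31 for an unrestricted joint table over five variables.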

Bayesian Network Learning
• Learning the local probability distributions
  $$P(\theta_{ij}) = \mathrm{Dir}(\theta_{ij} \mid \alpha_{ij1}, \ldots, \alpha_{ijr_i})$$
  $$P(\theta_{ij} \mid D) = \mathrm{Dir}(\theta_{ij} \mid \alpha_{ij1} + N_{ij1}, \ldots, \alpha_{ijr_i} + N_{ijr_i})$$
• Learning the network structure [Friedman and Goldszmidt 99]
  ♦ Search for the best-scoring network structure (greedy search)
  ♦ BD (Bayesian Dirichlet) score [Heckerman et al. 95]
    $$p(D, S) = p(S) \cdot p(D \mid S) = p(S) \cdot \prod_{i=1}^{n} \prod_{j=1}^{q_i} \frac{\Gamma(\alpha_{ij})}{\Gamma(\alpha_{ij} + N_{ij})} \prod_{k=1}^{r_i} \frac{\Gamma(\alpha_{ijk} + N_{ijk})}{\Gamma(\alpha_{ijk})}$$
    $$\alpha_{ij} = \sum_k \alpha_{ijk}, \qquad N_{ij} = \sum_k N_{ijk}$$
    $$\Gamma(1) = 1, \qquad \Gamma(x + 1) = x\, \Gamma(x)$$
    – D: training data
    – S: network structure
    – α_{ijk}: prior
    – N_{ijk}: sufficient statistics calculated from D
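As a rough illustration of the two learning steps, the Python sketch below updates Dirichlet hyperparameters with counts and evaluates the BD score in log space with `math.lgamma`, which avoids overflow in the Gamma function. The data layout (one list of per-state values for each parent configuration) and the function names are assumptions made for this example.

```python
import math

def posterior_alphas(alphas, counts):
    """Dirichlet update: alpha_ijk + N_ijk for one parent configuration j."""
    return [a + n for a, n in zip(alphas, counts)]

def log_bd_family(alpha, counts):
    """Log BD contribution of one node X_i.

    alpha[j][k] and counts[j][k] hold alpha_ijk and N_ijk for parent
    configuration j and state k.
    """
    score = 0.0
    for a_j, n_j in zip(alpha, counts):
        a_sum, n_sum = sum(a_j), sum(n_j)  # alpha_ij and N_ij
        score += math.lgamma(a_sum) - math.lgamma(a_sum + n_sum)
        for a_jk, n_jk in zip(a_j, n_j):
            score += math.lgamma(a_jk + n_jk) - math.lgamma(a_jk)
    return score

def log_bd_score(families, log_structure_prior=0.0):
    """log p(D, S) = log p(S) + sum over nodes of the family scores."""
    return log_structure_prior + sum(log_bd_family(a, n) for a, n in families)

# A binary node without parents, uniform prior, observed counts (2, 1).
alpha = [[1.0, 1.0]]
counts = [[2, 1]]
updated = posterior_alphas(alpha[0], counts[0])
score = log_bd_score([(alpha, counts)])
```

A greedy structure search would compare such scores for candidate edge additions, deletions, and reversals, and keep the change that improves the score most.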

Schematic View of the Modeling Approach
[Figure: modeling pipeline]
• Inputs: gene expression data and drug activity data
• Preprocessing
  ♦ Thresholding
  ♦ Clustering
  ♦ Discretization
• Output of preprocessing: selected genes, drugs, and a cancer type node
• Bayesian network learning, which yields a learned Bayesian network over nodes such as Gene A, Gene B, Drug A, Drug B, and Cancer
• Use of the learned network
  ♦ Dependency analysis
  ♦ Probabilistic inference

Data Preparation
• cDNA microarray data
  ♦ Gene expression profiles on 60 cell lines
  ♦ 1376 × 60 matrix (1376 genes, 60 samples)
• Drug activity data
  ♦ Drug activity patterns on 60 cell lines
  ♦ 118 × 60 matrix (118 drugs, 60 samples)
• Combined: (1376 + 118) × 60 data matrix

Preprocessing
• Thresholding
  ♦ Elimination of unknown ESTs: 1376 genes reduced to 805 genes
  ♦ Elimination of drugs which have more than 4 missing values: 118 drugs reduced to 84 drugs
  ♦ All 60 samples are retained.
• Discretization
  ♦ Local probability model for Bayesian networks: multinomial distribution
  ♦ Values are discretized into three levels (-1, 0, 1), with boundaries at μ - c·σ and μ + c·σ.
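The missing-value filter and the three-level discretization might look as follows in Python. This is a sketch under assumptions: missing values are represented as `None`, μ and σ are computed per row (one gene or drug across cell lines) using the population standard deviation, and c = 0.5 is an arbitrary placeholder because the slide does not give the value of c.

```python
import math

def keep_drugs(rows, max_missing=4):
    """Keep rows (one per drug) that have at most `max_missing` missing values."""
    return [row for row in rows if sum(v is None for v in row) <= max_missing]

def discretize(values, c=0.5):
    """Map values to -1, 0, or 1 using the boundaries mu - c*sigma, mu + c*sigma.

    Missing values (None) are passed through unchanged.
    """
    observed = [v for v in values if v is not None]
    mu = sum(observed) / len(observed)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in observed) / len(observed))
    low, high = mu - c * sigma, mu + c * sigma
    levels = []
    for v in values:
        if v is None:
            levels.append(None)
        elif v < low:
            levels.append(-1)
        elif v > high:
            levels.append(1)
        else:
            levels.append(0)
    return levels

levels = discretize([0.0, 1.0, 2.0, 3.0, 4.0], c=0.5)
sparse_drug = [None] * 5 + [1.0] * 55   # 5 missing values: dropped
usable_drug = [None] * 4 + [1.0] * 56   # 4 missing values: kept
kept = keep_drugs([sparse_drug, usable_drug])
```

Discretizing to three states keeps each local distribution multinomial, which is what the Dirichlet-based learning on the earlier slides requires.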

Bayesian Network Learning for Gene-Drug Analysis
• Large-scale Bayesian network
  ♦ Several hundred nodes (up to 890)
  ♦ General greedy search is inapplicable because of its time and space complexity.
• Search heuristics
  ♦ Local-to-global search heuristics
  ♦ Exploit the locality of Bayesian networks to reduce the entire search space.
    – The local structure: Markov blanket [Pearl 88]
    – Find the candidate Markov blanket (of predetermined size k) of each node, which reduces the global search space.
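One way to picture the candidate selection step is the Python sketch below. The slides do not say how candidates are ranked, so the use of pairwise empirical mutual information here is an assumption made for illustration, not the authors' stated method; `mutual_information` and `candidate_blankets` are hypothetical names.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Empirical mutual information (in nats) of two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

def candidate_blankets(data, k):
    """For each variable, return the k other variables with the highest MI.

    `data` maps a variable name to its list of discretized values.
    Ties are broken alphabetically by name.
    """
    result = {}
    for target, column in data.items():
        scored = [(mutual_information(column, other), name)
                  for name, other in data.items() if name != target]
        scored.sort(key=lambda item: (-item[0], item[1]))
        result[target] = [name for _, name in scored[:k]]
    return result

data = {
    "gene_a": [-1, -1, 0, 0, 1, 1],
    "drug_a": [-1, -1, 0, 0, 1, 1],   # copies gene_a: maximal dependence
    "gene_b": [0, 1, 0, 1, 0, 1],     # independent of gene_a in this sample
}
blankets = candidate_blankets(data, k=1)
```

With candidates fixed in advance, the structure search only has to consider edges between a node and its k candidates instead of all pairs among the (up to) 890 nodes.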
