CAMDA ’03: Weakest Link Models for Detecting Small Groups of Genes

CAMDA ’03: Weakest Link Models for Detecting Small Groups of Genes to Predict Lung Cancer Survival Presenter: Thomas J. Richards, Ph.D. November 13, 2003

Affiliation: Dorothy P. & Richard P. Simmons Center for Interstitial Lung Diseases in the Division of Pulmonary, Allergy, and Critical Care Medicine University of Pittsburgh

In Collaboration with: Roger S. Day, Sc.D. University of Pittsburgh Department of Biostatistics and University of Pittsburgh Cancer Institute

Weakest Link Models • Make sense in biology; • Can be applied to gene expression data; • May identify novel gene interactions.

Response: Plant Growth • 5 Necessary factors: • Water; • Sunlight; • P; • K; • Ca; • How do factors combine to affect plant growth?

They don’t work together like this…

They may work together like this…

[Figure: contour plots of E(Y | X) on Sun and Water axes. Panels: “Reality?”, “Traditional Models”, “Weakest link model”. Annotations: excess sun; excess water; “Curve of Optimal Use (COU)”.]

They may work together like this…

Or like this…

Source: H. Frederik Nijhout, American Scientist (2003)

The Weakest Link Idea $E(Y_i) = \min_j \{\phi_j(x_{ij}; \theta_j) : j = 1, \dots, m\}$ • Usually, $\phi_j = \phi$ for all $j$; • Weakest link gene minimizes $\phi$; • Each patient has his/her own weakest link.

WL Model for Binary Response Data: $\phi_j(x_{ij}; \theta_j) = \operatorname{logit}^{-1}(\alpha_j + \beta_j x_{ij})$ and $\theta_j = (\alpha_j, \beta_j)$, so $E(Y_i) = \min_j \{\operatorname{logit}^{-1}(\alpha_j + \beta_j x_{ij}) : j = 1, \dots, m\}$. This is the Parametric Weakest Link (PWL) Model.
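As a rough illustration of the PWL mean for binary data, the sketch below evaluates E(Y_i) = min_j logit^-1(α_j + β_j x_ij) for one patient at a time and reports which gene is that patient's weakest link. The function names, parameter values, and expression values are invented for illustration and do not come from the talk.

```python
import math

def inv_logit(z):
    """Inverse logit (logistic function)."""
    return 1.0 / (1.0 + math.exp(-z))

def pwl_mean(x_row, thetas):
    """Return (weakest-link gene index, E(Y_i)) for one patient.

    x_row  : expression values x_ij for genes j = 0..m-1 (hypothetical data)
    thetas : list of (alpha_j, beta_j) pairs, one per gene (hypothetical values)
    """
    values = [inv_logit(a + b * x) for x, (a, b) in zip(x_row, thetas)]
    j = min(range(len(values)), key=values.__getitem__)
    return j, values[j]

# Same curve for every gene (phi_j = phi), as the slide says is usual.
thetas = [(0.0, 1.0)] * 3
patients = [[2.0, -1.0, 0.5], [-0.3, 1.2, 0.4]]

for i, row in enumerate(patients):
    j, mean = pwl_mean(row, thetas)
    print(f"patient {i}: weakest link is gene {j}, E(Y) = {mean:.4f}")
```

Each patient can have a different weakest-link gene, which is the point of the third bullet on the "Weakest Link Idea" slide.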

Parametric Weakest Link Model For Survival Data: $\lambda(t; x_{ij}) = \lambda_0(t) \exp[\min_j \{\phi_j(x_{ij}; \theta_j)\}]$, for example $\lambda(t; x_{ij}) = \lambda_0(t) \exp[\min_j \{\beta_j x_{ij}\}]$.
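For the survival version, the quantity that multiplies the baseline hazard is exp[min_j β_j x_ij]. A minimal sketch of that relative hazard follows; the coefficient and expression values are hypothetical, and the baseline hazard λ_0(t) is left unspecified, as on the slide.

```python
import math

def relative_hazard(x_row, betas):
    """exp(min_j beta_j * x_ij): the factor applied to the baseline hazard."""
    return math.exp(min(b * x for x, b in zip(x_row, betas)))

# Hypothetical patient with two genes and hypothetical coefficients.
x_row = [1.0, 2.0]
betas = [0.5, -0.2]
print(relative_hazard(x_row, betas))  # uses the smaller of 0.5 and -0.4
```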

Quantile-Matching Weakest Link (QWL) Model:

Curve of Optimal Use: $f_\Delta(p) = 1 - F\left(F^{-1}(1 - p) - \Delta\right)$, where $F$ is a CDF (Normal or Logistic). Its inverse is $f_\Delta^{-1}(p) = f_{-\Delta}(p) = 1 - F\left(F^{-1}(1 - p) + \Delta\right)$, and $f_{\Delta_1 + \Delta_2}(p) = f_{\Delta_1}\left(f_{\Delta_2}(p)\right)$.
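The curve f_Δ(p) = 1 − F(F^-1(1 − p) − Δ) can be checked numerically. The sketch below takes F to be the standard logistic CDF (one of the two choices named on the slide) and confirms that f_0 is the identity, that f_-Δ undoes f_Δ, and that shifts add under composition. The function names and the test values of p and Δ are illustrative assumptions.

```python
import math

def logistic_cdf(z):
    """Standard logistic CDF, F(z)."""
    return 1.0 / (1.0 + math.exp(-z))

def logistic_quantile(p):
    """Inverse of the standard logistic CDF, F^-1(p), for 0 < p < 1."""
    return math.log(p / (1.0 - p))

def f_delta(p, delta):
    """f_Delta(p) = 1 - F(F^-1(1 - p) - Delta) with logistic F."""
    return 1.0 - logistic_cdf(logistic_quantile(1.0 - p) - delta)

p, d1, d2 = 0.3, 0.7, -0.25

identity_gap = abs(f_delta(p, 0.0) - p)
inverse_gap = abs(f_delta(f_delta(p, d1), -d1) - p)
compose_gap = abs(f_delta(f_delta(p, d2), d1) - f_delta(p, d1 + d2))

print(identity_gap, inverse_gap, compose_gap)  # all three should be near zero
```

With a logistic F the curve reduces to a shift by Δ on the logit scale, which is why the three gaps vanish up to rounding.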

Data Pre-processing: Simplify! Simplify the process, minimize data handling: • Affy: • Run RMA, then generate ratios. • cDNA arrays: use ratios. • Focus on known genes only; • 2000 LocusLink IDs in all 4 data sets;

Approach to Data Analysis Gene Selection: Based on substantive hypotheses; • Use DAVID at NIAID to get gene classes: • Not optimal, but necessary in this case;

Approach to Data Analysis Groups of genes, from DAVID: • Cell Cycle (CELL, 24 genes); • Apoptosis (AP, 12 genes); • Extracellular Matrix (ECM, 18 genes); • Matrix Metalloproteinases (MMPs, 10 genes); • WNT Pathway (11 genes).

Approach to Data Analysis Form dyads of genes, for testing: • CELL.AP (288), CELL.ECM (432), … • AP.ECM (216), AP.MMP (120), … Etc. • Pair up all of the above genes with 45 genes from the Beer supplemental data.
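The dyad counts quoted on the slides are the products of the group sizes. The short sketch below reproduces them, assuming no gene is counted in two groups at once; the dictionary keys are labels chosen here to match the slide abbreviations.

```python
from itertools import combinations

# Group sizes from the DAVID gene classes listed in the talk.
group_sizes = {"CELL": 24, "AP": 12, "ECM": 18, "MMP": 10, "WNT": 11}
BEER_GENES = 45  # genes from the Beer supplemental data

# Every cross-group pairing: one gene from each of two different groups.
dyad_counts = {
    f"{a}.{b}": group_sizes[a] * group_sizes[b]
    for a, b in combinations(group_sizes, 2)
}
# Every group paired against the Beer genes.
dyad_counts.update(
    {f"{g}.BEER": n * BEER_GENES for g, n in group_sizes.items()}
)

for name in ("CELL.AP", "CELL.ECM", "AP.ECM", "AP.MMP", "ECM.MMP",
             "ECM.BEER", "WNT.BEER"):
    print(name, dyad_counts[name])
```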

Approach to Data Analysis Use profile likelihood to estimate a COU for each pair of genes; Use Bonferroni-by-4 on the p-values; For the direction, take the smallest of the four p-values.
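One plausible reading of "Bonferroni-by-4" is that each gene pair is fitted in four directions, the direction with the smallest p-value is kept, and that p-value is multiplied by 4 (capped at 1). The sketch below follows that reading; the function name and the p-values are hypothetical, and the talk does not spell out the exact procedure.

```python
def bonferroni_by_4(p_by_direction):
    """Pick the direction with the smallest p-value and adjust it by 4.

    p_by_direction: dict mapping the four direction labels to raw p-values.
    Returns (chosen direction, Bonferroni-adjusted p-value).
    """
    if len(p_by_direction) != 4:
        raise ValueError("expected p-values for exactly four directions")
    direction = min(p_by_direction, key=p_by_direction.get)
    adjusted = min(1.0, 4 * p_by_direction[direction])
    return direction, adjusted

# Hypothetical raw p-values for one gene pair.
raw = {"minp1p2": 0.20, "maxp1p2": 0.004, "maxp1q2": 0.50, "minp1q2": 0.03}
direction, p_adj = bonferroni_by_4(raw)
print(direction, p_adj, p_adj < 0.05)
```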

Selected Results CELL.AP: 60 of 288 had adjusted p < 0.05. ECM.MMP: 37 of 180 had adjusted p < 0.05. ECM.BEER: 299 of 810 had adjusted p < 0.05. WNT.BEER: 152 of 495 had adjusted p < 0.05.

Selected Results CELL.AP, 60 significant pairs: 5 minp1p2; 17 maxp1p2; 13 maxp1q2; 25 minp1q2. ECM.MMP, 37 significant pairs: 2 minp1p2; 6 maxp1p2; 11 maxp1q2; 18 minp1q2. ECM.BEER, 299 significant pairs: 60 minp1p2; 65 maxp1p2; 100 maxp1q2; 74 minp1q2. WNT.BEER, 152 significant pairs: 32 minp1p2; 19 maxp1p2; 56 maxp1q2; 45 minp1q2.

Selected Results LocusLink ID = 4175, a Cell Cycle component, MCM6, minichromosome maintenance deficient 6 (S. cerevisiae), involved in initiating replication. Biological interaction with 7 LocusLink IDs in the apoptosis class (5 in same direction): 2 minp1p2: TRAF1, TNFRSF1B; 3 maxp1p2: SFRS2IP, MCL1, TRADD; 1 maxp1q2: CRADD (good prognosis); 1 minp1q2: BCL2L2.

MCM6 • MCMs 2-7 bind to DNA after mitosis and enable DNA replication. • MCM2 is a biomarker of proliferating cells and a marker for premalignant lung cells. • MCM6 is in a chromosomal region that is amplified in lung cancer, and its mRNA level is also increased (Kaminski, Dehan unpublished data).

Selected Results II Can we find unexpected interactions? Biological interactions between Beer & ECM? ECM genes show up in every cancer dataset. Fibronectin is a predictor of melanoma invasiveness.

PAI-1 (Plasminogen Activator Inhibitor 1) • Is a known marker of bad prognosis • Interacts significantly with at least 4 ECM genes • Vitronectin maxp1p2 (Good Prognosis!) • Collagen 1A2 maxp1q2 • Collagen 9A2 minp1q2 • Collagen 5A1 minp1q2

Does it make sense? • Elevated PAI-1 activities are associated with coronary thrombosis and with a poor prognosis in many cancers • Vitronectin binding extends the lifetime of active PAI-1, which controls hemostasis and has also been implicated in angiogenesis. • The PAI-1 effects on cell adhesion and motility depend on vitronectin binding…

Conclusions Weakest Link Models: • Make sense in biology; • Can be applied to gene expression data; • May identify novel gene interactions.

Next Steps • Validation on independent data set; • Extend from dyads to triads; • Use triads to explore pathways; • Extend to arbitrary number of genes.

Acknowledgements: Naftali Kaminski, M.D. Director, Dorothy P. & Richard P. Simmons Center for Interstitial Lung Diseases Public Defenders’ Association

Supplementary Slides

Potential Problems with Linear Models [Introduction: Motivation for Model, 18-Nov-03] • Mechanistic model, not just predictive. • Several covariates impact a response. – Example: immune response in Melanoma. • Each covariate is “necessary.” – Necessary = “Necessary to impact response probability.” • Logistic Model is unrealistic:

– Increasing a covariate always has an effect. – One covariate can be traded off for another. • Example: Branch, Bryant, et al. (1997): N-acetyltransferase Metabolic Activity and Bladder Cancer. – Goal: determine the role of the N-acetyltransferase slow acetylator phenotype in susceptibility to occupationally related aggressive bladder cancer. – Problem: possible interaction without main effect.

Interaction without Main Effects • For categorical data, not a new idea: – “Synergism” in BFH (1975). – 2 x 2 x 2 contingency table. – BFH cite Worcester [1971] model, for thromboembolism data. • My adaptation of BFH… – (To SWP3.0)

Est. RR (controlling for age, sex, alcohol, tobacco), by acetylator phenotype and occupational exposure:

Acetylator phenotype    Unexposed    Exposed
Fast                    1.0          1.0
Slow                    1.1          8.0

(1.9, 3.4) = 95% CI; p < 0.01. Is there “synergy”, or “synergism”, here?

$\pi_i = E[Y_i \mid X_1, X_2]$ and $\operatorname{logit}(\pi_i) = \alpha + \beta \rho_i$, where $\rho_i$ is one of: $\min\{p_1, f_\Delta^{-1}(p_2)\}$; or $\max\{p_1, f_\Delta^{-1}(p_2)\}$; or $\max\{p_1, 1 - f_\Delta^{-1}(p_2)\}$; or $\min\{p_1, 1 - f_\Delta^{-1}(p_2)\}$. Here $f_\Delta : [0,1] \to [0,1]$ is defined by $f_\Delta(p) = 1 - F\left(F^{-1}(1 - p) - \Delta\right)$, where $F$ is a symmetric distribution function.

The Quantile-Matching Weakest Link (QWL) Model: In $p_1$-$p_2$ space, the unit square, define a new covariate, one of: $\rho = \min\{p_1, p_2\}$ (minp1p2); $\rho = \max\{p_1, p_2\}$ (maxp1p2); $\rho = \max\{p_1, 1 - p_2\}$ (maxp1q2); $\rho = \min\{p_1, 1 - p_2\}$ (minp1q2).
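The four candidate covariates are simple to compute once p1 and p2 are in hand. In the sketch below, p1 and p2 are assumed to be values in the unit interval (for instance, each gene's expression converted to a quantile; that reading is an assumption), and the function name and input values are illustrative.

```python
def qwl_covariates(p1, p2):
    """Return the four QWL candidate covariates for one patient.

    p1, p2: values in [0, 1] for gene 1 and gene 2.
    """
    if not (0.0 <= p1 <= 1.0 and 0.0 <= p2 <= 1.0):
        raise ValueError("p1 and p2 must lie in the unit square")
    q2 = 1.0 - p2  # the "q" in the direction labels
    return {
        "minp1p2": min(p1, p2),
        "maxp1p2": max(p1, p2),
        "maxp1q2": max(p1, q2),
        "minp1q2": min(p1, q2),
    }

rho = qwl_covariates(0.2, 0.7)
for label, value in rho.items():
    print(f"{label}: {value:.2f}")
```

Each of the four values is then a single covariate that can be passed to an ordinary regression fit, which is consistent with the four-direction comparison used in the analysis slides.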

QWL Model: For binary response data: $\operatorname{logit} E[Y_i \mid X_1, X_2] = \alpha + \beta \rho_i$. For survival data: $\lambda(t; x_i) = \lambda_0(t) \exp(\beta \rho_i)$. Fitting this QWL Model: Done.
