Combination of protein biomarkers, UseR! 2009, Rennes, July 8, 2009



SLIDE 1

Combination of protein biomarkers

UseR! 2009

Rennes, July 8, 2009

Xavier Robin

SLIDE 2

Outline

Introduction: clinical problem, biomarkers

Combining biomarkers

ROC curves comparison

Comparing panels with single biomarkers

Conclusion

Acknowledgements

SLIDE 3

Aneurysmal Subarachnoid hemorrhage (aSAH)

SAH: rupture of a blood vessel just outside the brain

Main cause (80%): aneurysm (dilation of a blood vessel): aSAH

1/10 000 people each year

“Young patients” (mean age: 55)

Many patients are chronically disabled

Needs: prognosis tools to aid the physician in the management of patient and family

SLIDE 4

Biomarkers

Biomarkers are “characteristics objectively measured” whose concentrations differ between two groups of patients.

Uses: diagnosis, prognosis, therapeutic monitoring, …

At the BPRG we are interested in several brain damage markers, discovered by comparing ante- and post-mortem cerebrospinal fluid.

When several proteins are considered in a single classifier (potentially with clinical information), one calls this a panel.

Panels bring new overfitting and reproducibility problems.

SLIDE 5

Biomarkers

Name                                            Biological role           Marker for
H-FABP (Fatty acid-binding protein)             Lipid binding             Cardiac, brain damage
NDKA (Nucleoside diphosphate kinase A)          Regulation of apoptosis   Brain damage
UFD1 (Ubiquitin fusion degradation protein 1)   Protein degradation       Brain damage
DJ1 (Protein DJ-1)                              Protein binding           Brain damage, Parkinson
S100B (Protein S100-B)                          Protein binding           Brain damage
Troponin-I (Troponin I, cardiac muscle)         Protein binding           Cardiac (but also brain)

SLIDE 6

Cohorts and Goal of the study

Cohort:

113 patients

Validation: 25 patients from the same hospital, collected later

Goal:

Predict outcome after 6 months

Focus attention on patients at risk of poor outcome

Want a high specificity to avoid false positives (good outcome patients classified as poor outcome) and give them the best management

Use the partial area under the ROC curve (pAUC)

With biomarkers or a combination of them
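To make the partial area under the ROC curve concrete, here is a minimal Python sketch that integrates sensitivity over a band of high specificity. The band used in the example (90% to 100% specificity) is an assumption for illustration; the slides do not state which range was used. The function names are hypothetical, and labels are assumed to be coded 1 for poor outcome and 0 for good outcome.

```python
def roc_points(labels, scores):
    """Return ROC points as (specificity, sensitivity) pairs.

    labels: 1 = poor outcome (positive), 0 = good outcome (negative).
    A higher score is assumed to mean a higher risk of poor outcome.
    Both classes must be present.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    pairs = sorted(zip(scores, labels), reverse=True)
    points = [(1.0, 0.0)]  # strictest threshold: nobody called positive
    tp = fp = 0
    i = 0
    while i < len(pairs):
        j = i
        # Samples with tied scores cross the threshold together.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((1.0 - fp / neg, tp / pos))
        i = j
    return points


def partial_auc(labels, scores, spec_lo=0.9, spec_hi=1.0):
    """Trapezoidal area under the ROC curve for specificity in [spec_lo, spec_hi]."""
    pts = roc_points(labels, scores)
    area = 0.0
    for (s0, t0), (s1, t1) in zip(pts, pts[1:]):
        hi = min(s0, spec_hi)
        lo = max(s1, spec_lo)
        if hi <= lo:
            continue  # segment is vertical or lies outside the band
        # Linear interpolation of sensitivity at the clipped ends.
        sens_hi = t0 + (t1 - t0) * (s0 - hi) / (s0 - s1)
        sens_lo = t0 + (t1 - t0) * (s0 - lo) / (s0 - s1)
        area += (hi - lo) * (sens_hi + sens_lo) / 2.0
    return area


# Toy data (made up): a marker that separates the two groups perfectly.
labels = [0, 0, 1, 1]
scores = [0.1, 0.2, 0.8, 0.9]
print(partial_auc(labels, scores, 0.9, 1.0))  # at most 0.1 for this band
```

With this definition, a band of width 0.1 can contribute at most 0.1 (10% of the total area), which puts small pAUC values into perspective.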

SLIDE 7

Data Description

Quantitative protein measurements (continuous) and clinical data (discrete)

Box-Cox transformation (Yeo and Johnson, 2000)
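The reference to Yeo and Johnson (2000) points to their power transformation, which extends the Box-Cox family to zero and negative values. The sketch below shows that transformation for a single value; the parameter `lam` and how it was chosen for each marker are not given in the slides, so the values in the example are assumptions.

```python
import math


def yeo_johnson(y, lam):
    """Yeo-Johnson power transformation of one value y with parameter lam."""
    if y >= 0:
        if abs(lam) < 1e-12:
            return math.log1p(y)  # limit of the power form as lam -> 0
        return ((y + 1.0) ** lam - 1.0) / lam
    if abs(lam - 2.0) < 1e-12:
        return -math.log1p(-y)  # limit of the power form as lam -> 2
    return -(((1.0 - y) ** (2.0 - lam)) - 1.0) / (2.0 - lam)


# Made-up concentrations; lam = 0 gives log(1 + y) for non-negative values.
concentrations = [0.0, 0.5, 2.0, 10.0]
print([round(yeo_johnson(c, 0.0), 3) for c in concentrations])
```

With `lam = 1` the transformation leaves the data unchanged, which makes a convenient sanity check.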

SLIDE 8

Biomarkers & Clinical parameters

S100B is the best protein biomarker

WFNS is the best clinical marker

Their accuracies are low (3.4% of the total area)

[Figure: ROC curves, sensitivity (%) against specificity (%), one panel each for NDKA, H-FABP, S100B, Troponin-I, UFD1, DJ1, WFNS, Fisher and Age. Legend: 113-set, 25-set, pAUC.]

SLIDE 9

Combining biomarkers

RIL: simple threshold-based method

Packages used: kernlab (svm), stats (lm & glm), pls, kknn

[Figure: ROC curves, sensitivity (%) against specificity (%), one panel each for RIL, SVM, LM, GLM, PLS and KKNN. Legend: 113-set, 25-set, pAUC.]
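The slides describe RIL only as a simple threshold-based method. One plausible form of such a panel, sketched here as an assumption and not as the RIL algorithm itself, is to count how many markers exceed a per-marker cutoff and use that count as the panel score. The function name and the cutoff values are hypothetical.

```python
def panel_score(sample, thresholds):
    """Count the markers in `sample` whose value exceeds its cutoff."""
    return sum(1 for name, cutoff in thresholds.items() if sample[name] > cutoff)


# Hypothetical cutoffs and patient values, for illustration only.
thresholds = {"S100B": 0.5, "H-FABP": 3.0, "NDKA": 10.0}
patient = {"S100B": 0.6, "H-FABP": 2.0, "NDKA": 12.0}
print(panel_score(patient, thresholds))  # 2 markers above their cutoff
```

A count like this stays easy to explain to non-experts, and it yields an ordinal score that can be fed into a ROC analysis like any single marker.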

SLIDE 10

Combining biomarkers

Different methods to compute pAUC give different results

Validation cohort is small (25 patients)

Method   Mean of k*n pAUCs   pAUC of means of n predictions   Validation
RIL      5.6                 4.0                              3.6
SVM      3.1                 3.1                              5.3
PLS      4.2                 2.3                              3.2
LM       5.1                 3.7                              2.8
GLM      4.6                 3.1                              2.8
KNN      2.9                 1.0                              2.6

RIL: best on cross-validation

SVM: best on validation cohort
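The two cross-validation columns can be read as two ways of aggregating the same out-of-fold predictions from n repeats of k-fold cross-validation. The sketch below illustrates that reading under stated assumptions: each fold is a dictionary mapping a patient id to its out-of-fold score, the full AUC stands in for the pAUC to keep the code short, and all names and numbers are made up.

```python
from statistics import mean


def auc(labels, scores):
    """Full AUC by pairwise comparison of cases and controls (ties count 1/2)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def mean_of_fold_metrics(repeats, labels, metric=auc):
    """Compute the metric in each of the k*n folds, then average the values."""
    values = []
    for folds in repeats:
        for fold in folds:
            ids = list(fold)
            values.append(metric([labels[i] for i in ids], [fold[i] for i in ids]))
    return mean(values)


def metric_of_mean_predictions(repeats, labels, metric=auc):
    """Average each patient's n out-of-fold scores, then compute one metric."""
    ids = sorted(labels)
    averaged = [
        mean(fold[i] for folds in repeats for fold in folds if i in fold)
        for i in ids
    ]
    return metric([labels[i] for i in ids], averaged)


# Toy example: 4 patients, n = 2 repeats of k = 2 folds (made-up scores).
labels = {"a": 1, "b": 0, "c": 1, "d": 0}
repeats = [
    [{"a": 0.9, "b": 0.1}, {"c": 0.4, "d": 0.6}],
    [{"a": 0.8, "d": 0.2}, {"c": 0.7, "b": 0.3}],
]
print(mean_of_fold_metrics(repeats, labels))        # 0.75
print(metric_of_mean_predictions(repeats, labels))  # 1.0
```

Even on this tiny example the two aggregations disagree, which mirrors the point that the way a cross-validated pAUC is computed affects the reported value. Note that the per-fold variant needs both outcome classes in every fold.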

SLIDE 11

Comparing ROC Curves

Several methods are available:

Bootstrapping

DeLong, 1988

We will compare:

The best individual predictor (S100B)

The best combination method (RIL)

and see how comparison methods perform

[Figure: ROC curves of S100B and RIL, sensitivity (%) against specificity (%).]

SLIDE 12

Comparing ROC curves: Bootstrap

$$D = \frac{AUC_1 - AUC_2}{\sigma}$$

Sigma (σ) computed by bootstrapping

D ~ N(0, 1)

(see Hanley & McNeil, Radiology, 1983)
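A minimal Python sketch of this test follows, assuming both markers are measured on the same patients. The resampling scheme (cases and controls drawn separately, with replacement), the number of replicates and the function names are assumptions made for illustration, not details taken from the slides. The full AUC is used here; a partial AUC function could be substituted in the same place.

```python
import math
import random
from statistics import stdev


def auc(labels, scores):
    """Full AUC by pairwise comparison of cases and controls (ties count 1/2)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_test(labels, scores1, scores2, n_boot=2000, seed=0):
    """Return (D, two-sided p-value) for the difference between two paired AUCs."""
    rng = random.Random(seed)
    pos = [i for i, l in enumerate(labels) if l == 1]
    neg = [i for i, l in enumerate(labels) if l == 0]
    observed = auc(labels, scores1) - auc(labels, scores2)
    diffs = []
    for _ in range(n_boot):
        # Resample patients within each outcome group; the same resampled
        # patients are used for both markers so the comparison stays paired.
        idx = rng.choices(pos, k=len(pos)) + rng.choices(neg, k=len(neg))
        lab = [labels[i] for i in idx]
        a1 = auc(lab, [scores1[i] for i in idx])
        a2 = auc(lab, [scores2[i] for i in idx])
        diffs.append(a1 - a2)
    sigma = stdev(diffs)  # bootstrap estimate of the sd of the difference
    if sigma == 0:
        raise ValueError("bootstrap differences have zero variance")
    d = observed / sigma
    p_value = math.erfc(abs(d) / math.sqrt(2.0))  # 2 * (1 - Phi(|D|))
    return d, p_value


# Made-up data: 5 poor-outcome and 5 good-outcome patients, two markers.
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
marker1 = [0.9, 0.8, 0.7, 0.6, 0.3, 0.5, 0.4, 0.2, 0.1, 0.05]
marker2 = [0.6, 0.2, 0.7, 0.4, 0.5, 0.5, 0.3, 0.6, 0.1, 0.45]
d, p = bootstrap_auc_test(labels, marker1, marker2, n_boot=500)
print(d, p)
```

The loop over replicates is what makes the approach slow compared with a closed-form variance, and also what makes it flexible: any statistic computed from the resampled data can replace `auc`.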

SLIDE 13

Comparing ROC curves: Bootstrap

Advantages:

Flexible

Applicable to pAUCs

Disadvantages:

Slow

Same observations

[Figure: ROC curves of S100B and RIL, sensitivity (%) against specificity (%), p = 0.082; histogram of p-values (frequency against p-value) on resampled data.]

SLIDE 14

Comparing ROC curves: DeLong

Based on U statistics: variance computed according to Hoeffding's theory

$$AUC = \frac{1}{mn} \sum_{j=1}^{n} \sum_{i=1}^{m} \psi(X_i, Y_j)$$

$$\psi(X, Y) = \begin{cases} 1 & Y < X \\ \tfrac{1}{2} & Y = X \\ 0 & Y > X \end{cases}$$

DeLong et al., Biometrics, 1988
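As a hedged illustration of the DeLong et al. (1988) approach for two markers measured on the same patients, the sketch below computes each AUC as the average of a pairwise kernel (1 when the case scores higher than the control, ½ for a tie, 0 otherwise) and derives the variance of the AUC difference from the per-patient components. The function names are hypothetical; X denotes the m cases and Y the n controls.

```python
import math


def psi(x, y):
    """Pairwise kernel: 1 if the control y is below the case x, 1/2 if tied, else 0."""
    return 1.0 if y < x else 0.5 if y == x else 0.0


def components(cases, controls):
    """Per-case (V10) and per-control (V01) structural components."""
    m, n = len(cases), len(controls)
    v10 = [sum(psi(x, y) for y in controls) / n for x in cases]
    v01 = [sum(psi(x, y) for x in cases) / m for y in controls]
    return v10, v01


def cov(a, b):
    """Sample covariance with an (n - 1) denominator."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / (len(a) - 1)


def delong_test(labels, scores1, scores2):
    """Return (AUC1, AUC2, z, two-sided p-value) for two paired ROC curves."""
    comps = []
    for scores in (scores1, scores2):
        cases = [s for l, s in zip(labels, scores) if l == 1]
        controls = [s for l, s in zip(labels, scores) if l == 0]
        comps.append(components(cases, controls))
    (v10_a, v01_a), (v10_b, v01_b) = comps
    m, n = len(v10_a), len(v01_a)
    auc_a, auc_b = sum(v10_a) / m, sum(v10_b) / m
    # Variance of AUC1 - AUC2 built from the case and control components.
    var = (
        (cov(v10_a, v10_a) + cov(v10_b, v10_b) - 2.0 * cov(v10_a, v10_b)) / m
        + (cov(v01_a, v01_a) + cov(v01_b, v01_b) - 2.0 * cov(v01_a, v01_b)) / n
    )
    if var <= 0:
        raise ValueError("estimated variance of the AUC difference is zero")
    z = (auc_a - auc_b) / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # 2 * (1 - Phi(|z|))
    return auc_a, auc_b, z, p_value


# Made-up data: 5 poor-outcome and 5 good-outcome patients, two markers.
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
marker1 = [0.9, 0.8, 0.7, 0.6, 0.3, 0.5, 0.4, 0.2, 0.1, 0.05]
marker2 = [0.6, 0.2, 0.7, 0.4, 0.5, 0.5, 0.3, 0.6, 0.1, 0.45]
print(delong_test(labels, marker1, marker2))
```

No resampling is involved, so the result is deterministic and fast to compute. This sketch covers the full AUC only.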

SLIDE 15

Comparing ROC curves: DeLong

Advantages:

Fast and easy

Based on robust statistics

Non-parametric

[Figure: ROC curves of S100B and RIL, sensitivity (%) against specificity (%), p = 0.081; histogram of p-values (frequency against p-value) on resampled data.]

SLIDE 16

Comparing ROC curves

Bootstrap is flexible and displays good results

DeLong’s method works equally well

pAUC computations should be straightforward

Combinations do not appear significantly better than individual biomarkers

SLIDE 17

Comparing panels with single biomarkers

We want to be sure that the chosen panel performs better than the biomarkers taken individually

Panel performances are cross-validated; individual biomarkers are not

How can we compare them fairly?

Do we absolutely need a “validation” cohort?

SLIDE 18

Conclusion

The use of protein biomarkers is already widespread

We are not sure whether a combination of several proteins or clinical parameters can significantly increase accuracy

We do not know how the absence of cross-validation affects the results for single molecules

Acceptance by the medical community

The model must be simple, clear and understandable to non-experts

SLIDE 19

Acknowledgements

Natacha Turck, Alexandre Hainard, Loïc Dayon, Natalia Tiberti, Catherine Fouda, Nadia Walter, Jean-Charles Sanchez, Markus Müller, Frédérique Lisacek

Other collaborations: Louis Puybasset, Paola Sanchez, Laszlo Vutskits, Marianne Gex-Fabry