The Analysis of Placement Values for Evaluating Discriminatory - PowerPoint PPT Presentation

The Analysis of Placement Values for Evaluating Discriminatory Measures
Margaret Sullivan Pepe & Tianxi Cai, Biometrics (2004)
Allison Meisner · May 27, 2014

Overview
When we have a continuous test $Y$ and a binary outcome $D$, the ROC curve plots the (FPR, TPR) pairs for each possible cutoff of the test.
Problem: The ROC curve may differ by patient characteristics. Identifying such variability helps us to apply the test in an optimal way.
Solution: ROC regression with placement values
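As a small illustration of the (FPR, TPR) construction, here is a minimal Python sketch. The marker values and the function name `empirical_roc` are made up for illustration, and the rule "test positive if Y >= cutoff" is an assumption about the direction of the marker.

```python
def empirical_roc(y_diseased, y_nondiseased):
    """Return (FPR, TPR) pairs for every observed cutoff c, calling a subject positive if Y >= c."""
    cutoffs = sorted(set(y_diseased) | set(y_nondiseased), reverse=True)
    points = [(0.0, 0.0)]  # a cutoff above every observation classifies nobody as positive
    for c in cutoffs:
        fpr = sum(y >= c for y in y_nondiseased) / len(y_nondiseased)
        tpr = sum(y >= c for y in y_diseased) / len(y_diseased)
        points.append((fpr, tpr))
    return points


# Hypothetical marker values, for illustration only.
cases = [2.1, 3.4, 1.8, 4.0]
controls = [0.9, 1.5, 2.0, 1.1]
roc_points = empirical_roc(cases, controls)
```

Each pair in `roc_points` corresponds to one cutoff; lowering the cutoff moves the point up and to the right.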

Motivating Example
Prostate-specific antigen (PSA) is a popular, though controversial, way to screen men for prostate cancer (PCa). The biology of PSA and PCa has implications for the usefulness of PSA as a screening tool:
◮ PSA levels differ by age: older men typically have higher PSA, regardless of PCa status
◮ Age can potentially affect the ability of PSA to discriminate PCa cases
◮ Among PCa cases, PSA measured closer to diagnosis does a better job of discriminating PCa

Background: FPR, TPR, ROC (figure slides; images not recovered)

Background: Effect of Covariates on ROC (figure slides; images not recovered)
Recall, $\mathrm{ROC}(u)$ = (TPR at FPR $= u$).

ROC Model
◮ ROC model (Pepe, 1997): $\mathrm{ROC}_{Z_D}(u) = g(\beta^T Z_D + H_\alpha(u))$
◮ $\alpha$ = underlying shape of ROC curve
◮ $\beta$ = impact of $Z_D$ on shape of ROC curve
◮ Problem: estimation
  ◮ Pepe (2000) and Alonzo and Pepe (2002) create indicators $I(Y_{Di} \ge F_{\bar D}^{-1}(1 - u))$ for some set of FPRs $u$ and then use binary regression techniques
  ◮ Pepe & Cai propose using placement values and what is known about their distribution to estimate the parameters more efficiently

Placement Values
◮ Definitions
  ◮ Placement values: $U_{Di} = 1 - F_{\bar D}(Y_{Di})$ for the $i$th diseased subject. In words, the placement value for the $i$th diseased subject is the proportion of the reference (non-diseased) population with marker $Y$ values above $Y_{Di}$.
  ◮ If $Z_{\bar D}$ affects the distribution of $Y$ in the reference population, $U_{Di} = 1 - F_{\bar D, Z_{\bar D}}(Y_{Di})$.
  ◮ ROC curve: $\mathrm{ROC}(u) = P(Y_D \ge F_{\bar D}^{-1}(1 - u))$ = (TPR at FPR $= u$)
◮ Relationship between ROC and placement values
$$
\begin{aligned}
\mathrm{ROC}(u) &= P(Y_D \ge F_{\bar D}^{-1}(1 - u)) \\
&= P(1 - u \le F_{\bar D}(Y_D)) \\
&= P(1 - F_{\bar D}(Y_D) \le u) \\
&= P(U_D \le u)
\end{aligned}
$$
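The relationship ROC(u) = P(U_D <= u) can be checked on a toy sample. The sketch below uses the empirical distribution of the controls in place of $F_{\bar D}$; the data and function names are hypothetical.

```python
def placement_value(y_case, y_controls):
    """Empirical U_D = 1 - F_Dbar(y_case): share of controls with marker values above y_case."""
    return sum(y > y_case for y in y_controls) / len(y_controls)


def roc_from_placement_values(u, placement_values):
    """Empirical ROC(u) = P(U_D <= u), the CDF of the placement values at u."""
    return sum(pv <= u for pv in placement_values) / len(placement_values)


# Hypothetical marker values, for illustration only.
cases = [2.1, 3.4, 1.8, 4.0]
controls = [0.9, 1.5, 2.0, 1.1]
pvs = [placement_value(y, controls) for y in cases]
tpr_at_fpr_025 = roc_from_placement_values(0.25, pvs)
```

A case whose marker value exceeds every control gets placement value 0, so well-separated cases pile up near 0 and the ROC curve rises quickly.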


Proposed Method
◮ ROC model (Pepe, 1997): $\mathrm{ROC}_{Z_D}(u) = g(\beta^T Z_D + H_\alpha(u))$
◮ Proposed model: $H_\alpha(U_D) = -\beta^T Z_D + \epsilon$, where $\epsilon \sim g$
◮ Proof of equivalence (the first step uses that $H_\alpha$ is increasing):
$$
\begin{aligned}
\Pr(U_D \le u) &= \Pr(H_\alpha(U_D) \le H_\alpha(u)) \\
&= \Pr(-\beta^T Z_D + \epsilon \le H_\alpha(u)) \\
&= \Pr(\epsilon \le \beta^T Z_D + H_\alpha(u)) \\
&= g(\beta^T Z_D + H_\alpha(u)) \\
&= \mathrm{ROC}_{Z_D}(u)
\end{aligned}
$$
Recall that if $Z_{\bar D}$ affects the distribution of $Y$ in the reference population, $U_{Di} = 1 - F_{\bar D, Z_{\bar D}}(Y_{Di})$; then we may write
$$H_\alpha(U_D) = -\beta^T Z_D + \epsilon \iff \mathrm{ROC}_{Z_D, Z_{\bar D}}(u) = g(\beta^T Z_D + H_\alpha(u))$$
◮ In our example, $Z_{\bar D}$ = age and $Z_D$ = (age, time).

Proposed Method: Algorithm
Since $\Pr(U_D \le u) = g(\beta^T Z_D + H_\alpha(u))$, we know the density function is
$$f(u) = \frac{\partial g(\beta^T Z_D + H_\alpha(u))}{\partial u}.$$
Then, for $[a, b] \subset (0, 1)$, the log likelihood is
$$
\begin{aligned}
\ell(\theta) = \sum_{i=1}^{n_D} \big[ & I(U_{Di} < a) \log\{ g(\beta^T Z_{Di} + H_\alpha(a)) \} \\
& + I(U_{Di} > b) \log\{ 1 - g(\beta^T Z_{Di} + H_\alpha(b)) \} \\
& + I(U_{Di} \in (a, b)) \log f(U_{Di}) \big]
\end{aligned}
$$
where $\theta = (\alpha, \beta)$.
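To make the three pieces of the log likelihood concrete, here is a minimal Python sketch. It assumes the binormal form used later in the talk, $g = \Phi$ and $H_\alpha(u) = \alpha_0 + \alpha_1 \Phi^{-1}(u)$, so that $f(u) = \phi(\eta(u)) \, \alpha_1 / \phi(\Phi^{-1}(u))$ with $\eta(u) = \alpha_0 + \alpha_1 \Phi^{-1}(u) + \beta^T Z_D$. The function name and argument layout are hypothetical, and placement values exactly equal to $a$ or $b$ are treated as interior points, which is a choice of this sketch.

```python
from math import log
from statistics import NormalDist

STD_NORMAL = NormalDist()


def log_likelihood(alpha, beta, placement_values, covariates, a=0.01, b=0.99):
    """Censored log likelihood of placement values under ROC(u) = Phi(a0 + a1*Phi^{-1}(u) + beta'z)."""
    alpha0, alpha1 = alpha

    def eta(u, z):
        # Linear predictor beta'z + H_alpha(u) with H_alpha(u) = alpha0 + alpha1 * Phi^{-1}(u).
        return alpha0 + alpha1 * STD_NORMAL.inv_cdf(u) + sum(bk * zk for bk, zk in zip(beta, z))

    total = 0.0
    for u, z in zip(placement_values, covariates):
        if u < a:
            # Left-censored at a: contributes log ROC(a).
            total += log(STD_NORMAL.cdf(eta(a, z)))
        elif u > b:
            # Right-censored at b: contributes log{1 - ROC(b)}.
            total += log(1.0 - STD_NORMAL.cdf(eta(b, z)))
        else:
            # Interior point: contributes the log density f(u) = d ROC(u) / du.
            density = STD_NORMAL.pdf(eta(u, z)) * alpha1 / STD_NORMAL.pdf(STD_NORMAL.inv_cdf(u))
            total += log(density)
    return total
```

With $\alpha = (0, 1)$ and $\beta = 0$ the model reduces to ROC(u) = u (a useless test), the density is 1 on the interior, and interior observations contribute 0 to the log likelihood, which is a convenient sanity check. In practice this function would be passed to a numerical optimizer with estimated placement values plugged in.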

Proposed Method: Algorithm
Estimating $F_{\bar D, Z_{\bar D}}$
◮ Pepe and Cai advise estimating $F_{\bar D, Z_{\bar D}}$ nonparametrically if $Z_{\bar D}$ is discrete and semiparametrically otherwise.
◮ For semiparametric estimation, Pepe and Cai recommend the semiparametric regression quantile estimation procedure developed by Heagerty and Pepe (1999).
The estimates of the placement values, $\hat U_{Di}$, are substituted into $\ell(\theta)$, yielding a pseudo-log-likelihood, which is maximized to estimate $\theta$.

Competing Method: Algorithm
Alonzo and Pepe proposed an algorithm for fitting ROC regression based on binary regression methods.
1. For $[a, b] \subset (0, 1)$, let $T = \{u_1, \ldots, u_{n_T}\} = \{1 - j/n_{\bar D};\ j = 1, \ldots, n_{\bar D} - 1\} \cap [a, b]$ (the maximal set).
2. Then for each diseased subject $i$, the $n_T$ binary variables $B_{ui}$ are calculated: $B_{ui} = I[\hat U_{Di} \le u]$, $u \in T$.
3. The binary generalized linear regression model $E\{B_{ui}\} = g\{\beta^T Z_D + H_\alpha(u)\}$ is fit using standard techniques.
The Pepe and Cai method is claimed to be more efficient than that of Alonzo and Pepe.
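Steps 1 and 2 amount to expanding each diseased subject into $n_T$ binary records. A minimal Python sketch follows; the function names are hypothetical, and the set T is built from the reference (non-diseased) sample size, which is this sketch's reading of the slide. Step 3 is not shown: the resulting records would be handed to any binary regression routine with link $g^{-1}$ (probit for $g = \Phi$).

```python
def maximal_fpr_set(n_ref, a, b):
    """Step 1: T = {1 - j/n_ref; j = 1, ..., n_ref - 1} intersected with [a, b]."""
    candidates = [1 - j / n_ref for j in range(1, n_ref)]
    return [u for u in candidates if a <= u <= b]


def binary_records(placement_values, covariates, fpr_set):
    """Step 2: one record (B_ui, u, z_i) per diseased subject i and FPR u in T."""
    return [
        (int(pv <= u), u, z)
        for pv, z in zip(placement_values, covariates)
        for u in fpr_set
    ]


# Hypothetical inputs: two diseased subjects, four reference subjects.
T = maximal_fpr_set(n_ref=4, a=0.01, b=0.99)
records = binary_records([0.0, 0.6], [(1,), (0,)], T)
```

Because every subject contributes $n_T$ correlated records, standard errors from the binary fit cannot be taken at face value; this is one reason resampling by subject is commonly used with this approach.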

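Simulations
Set-up
◮ $Y_D = \alpha_1^{-1}\{\alpha_0 + \beta_1 Z_1 + (\beta_2 + 0.5\alpha_1) Z_2 + \epsilon_D\}$, $\quad Y_{\bar D} = 0.5 Z_2 + \epsilon_{\bar D}$
◮ $Z_1 \sim \mathrm{Bernoulli}(0.5)$, $Z_2 \sim \mathrm{Uniform}(0, 1)$
◮ $\epsilon_D \sim N(0, 1)$, $\epsilon_{\bar D} \sim N(0, 1)$
Induced ROC curve:
$$
\begin{aligned}
\mathrm{ROC}_{Z_D, Z_{\bar D}}(u) &= \Pr(U_D \le u) = \Pr(1 - F_{\bar D}(Y_D) \le u) \\
&= \Pr\left(F_{\bar D}^{-1}(1 - u) \le \alpha_1^{-1}\{\alpha_0 + \beta_1 z_1 + (\beta_2 + 0.5\alpha_1) z_2 + \epsilon_D\}\right) \\
&= \Pr\left(\Phi^{-1}(1 - u) + 0.5 z_2 \le \alpha_1^{-1}\{\alpha_0 + \beta_1 z_1 + (\beta_2 + 0.5\alpha_1) z_2 + \epsilon_D\}\right) \\
&= \Pr\left(\epsilon_D \le -\alpha_1 \Phi^{-1}(1 - u) + \alpha_0 + \beta_1 z_1 + \beta_2 z_2\right) \\
&= \Phi\left(\alpha_1 \Phi^{-1}(u) + \alpha_0 + \beta_1 z_1 + \beta_2 z_2\right) \\
&= g(\beta^T Z_D + H_\alpha(u))
\end{aligned}
$$
The fourth line uses $\alpha_1 > 0$ and the symmetry of $\epsilon_D \sim N(0, 1)$; the fifth uses $\Phi^{-1}(1 - u) = -\Phi^{-1}(u)$. Here $g = \Phi$ and $H_\alpha(u) = \alpha_0 + \alpha_1 \Phi^{-1}(u)$.
Recall, $\alpha$ = shape of ROC, $\beta$ = effects of $Z_D$ on ROC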
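The set-up can be written down directly, and the induced ROC curve checked by Monte Carlo. This is a sketch using only the Python standard library; the sample sizes, seed, and function names are arbitrary choices for illustration and are not taken from the paper.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def induced_roc(u, z1, z2, alpha0=1.0, alpha1=1.0, beta1=0.5, beta2=0.7):
    """Closed form: Phi(alpha0 + alpha1 * Phi^{-1}(u) + beta1*z1 + beta2*z2)."""
    return STD_NORMAL.cdf(alpha0 + alpha1 * STD_NORMAL.inv_cdf(u) + beta1 * z1 + beta2 * z2)


def draw_case(z1, z2, rng, alpha0=1.0, alpha1=1.0, beta1=0.5, beta2=0.7):
    """One diseased marker value Y_D at fixed covariates."""
    return (alpha0 + beta1 * z1 + (beta2 + 0.5 * alpha1) * z2 + rng.gauss(0, 1)) / alpha1


def simulate(n_cases, n_controls, rng):
    """One simulated data set: cases as (y, z1, z2), controls as (y, z2)."""
    cases = []
    for _ in range(n_cases):
        z1 = 1 if rng.random() < 0.5 else 0  # Bernoulli(0.5)
        z2 = rng.random()                    # Uniform(0, 1)
        cases.append((draw_case(z1, z2, rng), z1, z2))
    controls = []
    for _ in range(n_controls):
        z2 = rng.random()
        controls.append((0.5 * z2 + rng.gauss(0, 1), z2))
    return cases, controls


def monte_carlo_roc(u, z1, z2, n, rng):
    """Share of simulated cases at (z1, z2) whose true placement value is <= u."""
    hits = 0
    for _ in range(n):
        y = draw_case(z1, z2, rng)
        true_pv = 1.0 - STD_NORMAL.cdf(y - 0.5 * z2)  # reference distribution is N(0.5*z2, 1)
        hits += true_pv <= u
    return hits / n
```

For example, `monte_carlo_roc(0.2, 1, 0.5, 20000, random.Random(0))` should land close to `induced_roc(0.2, 1, 0.5)`, up to Monte Carlo error.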

Simulations
Note that here $Z_{\bar D} = Z_2$ and $Z_D = (Z_1, Z_2)$.
Despite their recommendations, Pepe and Cai did not use the semiparametric method of Heagerty and Pepe to estimate placement values. Instead, Pepe and Cai regressed $Y$ on $Z_2$ among the non-diseased subjects:
$$E(Y_{\bar D} \mid Z_2 = z_2) = \gamma_0 + \gamma_1 z_2 \quad\Rightarrow\quad \hat\epsilon_{\bar D i} = Y_{\bar D i} - \hat\gamma_0 - \hat\gamma_1 z_{2 \bar D i}.$$
Then the placement value for subject $i$ was estimated to be
$$\hat U_{Di} = \frac{1}{n_{\bar D}} \sum_{j=1}^{n_{\bar D}} I\left(\hat\epsilon_{\bar D j} > Y_{Di} - \hat\gamma_0 - \hat\gamma_1 z_{2 Di}\right).$$
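The residual-based estimate can be sketched in a few lines of Python. The least-squares fit is written out by hand to keep the example self-contained; the function names and toy data are hypothetical.

```python
def fit_simple_ols(x, y):
    """Least-squares intercept and slope for y = g0 + g1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def estimate_placement_values(case_y, case_z2, control_y, control_z2):
    """U_hat_Di = share of control residuals exceeding the case's standardized value."""
    g0, g1 = fit_simple_ols(control_z2, control_y)  # fit among non-diseased subjects only
    control_resid = [y - g0 - g1 * z for y, z in zip(control_y, control_z2)]
    estimates = []
    for y, z in zip(case_y, case_z2):
        case_resid = y - g0 - g1 * z  # case value on the controls' residual scale
        estimates.append(sum(r > case_resid for r in control_resid) / len(control_resid))
    return estimates


# Toy data, for illustration only.
u_hat = estimate_placement_values(
    case_y=[10.0, 0.5, -5.0], case_z2=[0.0, 0.0, 1.0],
    control_y=[0.0, 1.0, 1.0, 2.0], control_z2=[0.0, 1.0, 0.0, 1.0],
)
```

Pooling the control residuals this way relies on the residual distribution not depending on $Z_2$, which is exactly what the set-up $Y_{\bar D} = 0.5 Z_2 + \epsilon_{\bar D}$ provides.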

Simulations
Two sets of simulations (1000 simulations each):
1. Pepe and Cai method only
  ◮ Bias
  ◮ Empirical SE
  ◮ Mean estimated SE
  ◮ Empirical coverage probability
  ◮ Note: $\alpha_0 = 1$, $\alpha_1 = 1$, $\beta_1 = 0.5$, $\beta_2 = 0.7$ throughout
  ◮ Considered $[a, b] = [0.01, 0.99]$ and $[a, b] = [0.01, 0.20]$
2. Pepe and Cai vs. Alonzo and Pepe
  ◮ Bias
  ◮ MSE
  ◮ Two sets of parameter values considered
    ◮ $\alpha_0 = 1$, $\alpha_1 = 1$, $\beta_1 = 0.5$, $\beta_2 = 0.7$
    ◮ $\alpha_0 = 1.5$, $\alpha_1 = 0.9$, $\beta_1 = 0.5$, $\beta_2 = 0.7$
  ◮ Considered $[a, b] = [0.01, 0.99]$ and $[a, b] = [0.01, 0.50]$

Simulations: Pepe & Cai (results slide; table not recovered)
◮ $[a, b] = [0.01, 0.99]$

Simulations: Pepe & Cai vs. Alonzo & Pepe (results slide; table not recovered)
◮ $\alpha_0 = 1$, $\alpha_1 = 1$, $\beta_1 = 0.5$, $\beta_2 = 0.7$
◮ $[a, b] = [0.01, 0.99]$

Application
The proposed method was applied to data from a study on PSA and PCa screening.
◮ 88 PCa cases, 88 age-matched controls
◮ Recall, $Z_{\bar D}$ = age and $Z_D$ = (age, time)
◮ Model: $\mathrm{ROC}_{Z_D, Z_{\bar D}}(u) = \Phi(\alpha_0 + \alpha_1 \Phi^{-1}(u) + \beta_1\,\text{time} + \beta_2\,\text{age})$
◮ SE estimates from the bootstrap (500 replications)

Parameter    Estimate (SE)
$\alpha_0$    4.30 (0.93)
$\alpha_1$    0.84 (0.09)
$\beta_1$    -0.16 (0.03)
$\beta_2$    -0.04 (0.01)
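To read the fitted model, one can plug the point estimates into the ROC formula. The sketch below does only that; the units of time and age are not stated on the slide, so the inputs in the example (time = 1 or 5, age = 60) are hypothetical values chosen purely to show the direction of the effects.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def fitted_roc(u, time, age, alpha0=4.30, alpha1=0.84, beta_time=-0.16, beta_age=-0.04):
    """Fitted TPR at FPR = u: Phi(alpha0 + alpha1 * Phi^{-1}(u) + beta_time*time + beta_age*age)."""
    linear_predictor = alpha0 + alpha1 * STD_NORMAL.inv_cdf(u) + beta_time * time + beta_age * age
    return STD_NORMAL.cdf(linear_predictor)


# Hypothetical covariate values, in whatever units the study used.
tpr_near = fitted_roc(0.10, time=1, age=60)
tpr_far = fitted_roc(0.10, time=5, age=60)
```

Because both coefficients are negative, the fitted TPR at any fixed FPR falls as time or age increases, which matches the motivating claim that PSA measured closer to diagnosis discriminates better.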

Conclusions
◮ The proposed method has nice intuition behind it and makes full use of the data through placement values, as opposed to creating indicators.
◮ Implementation of the proposed method is less straightforward and is not particularly computationally efficient.
◮ In most scenarios, the proposed method is more statistically efficient than the binary regression technique.
◮ Both methods are susceptible to misspecification in both the estimation of $F_{\bar D}$ and the form of the ROC model.

Effects of Misspecification
What happens when $Y_{\bar D} = 0.5 Z_2^2 + N(0, (Z_2 + 0.5)^2)$ but we still assume $Y_{\bar D} = 0.5 Z_2 + N(0, 1)$? This will impact
1. estimates of placement values
2. form of the induced ROC curve (used in the likelihood calculation)
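A rough way to see the first effect is to compare, for a given marker value, the placement value under the true reference distribution with the one implied by the working model taken literally. This is a simplified sketch: it reads $N(0, (Z_2 + 0.5)^2)$ as mean 0 and variance $(Z_2 + 0.5)^2$, and it uses the population working model rather than the regression-plus-empirical-residuals estimate used in the simulations. Function names are hypothetical.

```python
from statistics import NormalDist


def true_placement_value(y, z2):
    """1 - F under Y_Dbar = 0.5*Z2^2 + N(0, (Z2 + 0.5)^2), i.e. standard deviation Z2 + 0.5."""
    return 1.0 - NormalDist(mu=0.5 * z2 ** 2, sigma=z2 + 0.5).cdf(y)


def working_placement_value(y, z2):
    """1 - F under the assumed model Y_Dbar = 0.5*Z2 + N(0, 1)."""
    return 1.0 - NormalDist(mu=0.5 * z2, sigma=1.0).cdf(y)


# Example at z2 = 1, where both models share the mean 0.5 but differ in spread.
y, z2 = 2.0, 1.0
gap = true_placement_value(y, z2) - working_placement_value(y, z2)
```

At $z_2 = 1$ the working model understates the spread of the reference distribution, so it assigns a smaller placement value to a high marker value than the true model does; the size and sign of the gap change with $z_2$.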

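Effects of Misspecification (results slide; table not recovered)
◮ $\alpha_0 = 1$, $\alpha_1 = 1$, $\beta_1 = 0.5$, $\beta_2 = 0.7$

Effects of Misspecification (results slide; table not recovered)
◮ $\alpha_0 = 1.5$, $\alpha_1 = 0.9$, $\beta_1 = 0.5$, $\beta_2 = 0.7$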


Simulations: Pepe & Cai (results slide; table not recovered)
◮ $[a, b] = [0.01, 0.20]$

Simulations: Pepe & Cai vs. Alonzo & Pepe (results slide; table not recovered)
◮ $\alpha_0 = 1$, $\alpha_1 = 1$, $\beta_1 = 0.5$, $\beta_2 = 0.7$
◮ $[a, b] = [0.01, 0.50]$

Simulations: Pepe & Cai vs. Alonzo & Pepe (results slide; table not recovered)
◮ $\alpha_0 = 1.5$, $\alpha_1 = 0.9$, $\beta_1 = 0.5$, $\beta_2 = 0.7$
◮ $[a, b] = [0.01, 0.99]$

Simulations: Pepe & Cai vs. Alonzo & Pepe (results slide; table not recovered)
◮ $\alpha_0 = 1.5$, $\alpha_1 = 0.9$, $\beta_1 = 0.5$, $\beta_2 = 0.7$
◮ $[a, b] = [0.01, 0.50]$
