  1. Nearest Neighbor and Kernel Survival Analysis: Nonasymptotic Error Bounds and Strong Consistency Rates
 George H. Chen
 Assistant Professor of Information Systems
 Carnegie Mellon University
 June 11, 2019

  2–5. Survival Analysis
 [Example table: each row is a patient, with binary feature columns (High BMI, Gluten allergy, Immunosuppressant, Low resting heart rate, Irregular heart beat) and an observed time-of-death column with entries Day 2, Day 10, and Day ≥ 6.]
 Each patient's features form a feature vector X; the recorded time is the observed time Y.
 When we stop collecting training data, not everyone has died!
 Goal: Estimate S(t | x) = P(survive beyond time t | feature vector x)
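 The "Day ≥ 6" entry is exactly this situation: a patient known only to have survived past day 6 when data collection stopped. As a reading aid (my encoding, using the (Y, δ) notation formalized in the next section), the three patients become:

```python
# The slide's three example patients as (observed time Y, event indicator delta)
# pairs; delta = 1 marks an observed death, delta = 0 a censored observation.
patients = [
    (2.0, 1),   # death observed on day 2
    (10.0, 1),  # death observed on day 10
    (6.0, 0),   # "Day >= 6": still alive at day 6, so censored
]
```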

  6–23. Problem Setup
 Model: Generate each data point (X, Y, δ) as follows:
 1. Sample feature vector X ∼ P_X, a Borel probability measure on a feature space that is a separable metric space with intrinsic dimension d.
 2. Sample time of death T ∼ P_{T|X}, a continuous random variable in time that is smooth with respect to the feature space (Hölder index α).
 3. Sample time of censoring C ∼ P_{C|X}.
 4. If death happens before censoring (T ≤ C): set Y = T and δ = 1. Otherwise: set Y = C and δ = 0.
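 These four steps translate directly into a sampling routine. Below is a minimal sketch; the uniform features and the exponential death and censoring times are illustrative assumptions of mine, since the model leaves P_X, P_{T|X}, and P_{C|X} unspecified:

```python
import numpy as np

def generate_data_point(rng):
    """Sample one (X, Y, delta) triple following the model above."""
    # 1. Sample feature vector X ~ P_X (illustrative: uniform on [0, 1]^2).
    X = rng.uniform(size=2)
    # 2. Sample time of death T ~ P_{T|X} (illustrative: exponential with a
    #    feature-dependent mean, so T is continuous and varies smoothly in X).
    T = rng.exponential(scale=1.0 + X.sum())
    # 3. Sample time of censoring C ~ P_{C|X} (illustrative: exponential).
    C = rng.exponential(scale=2.0)
    # 4. Observe whichever of death and censoring happens first.
    if T <= C:
        return X, T, 1   # death observed: Y = T, delta = 1
    return X, C, 0       # censored: Y = C, delta = 0

rng = np.random.default_rng(0)
data = [generate_data_point(rng) for _ in range(100)]
```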

 Estimator (Beran 1981): to estimate Ŝ(t | x), find the k training data points closest to x, then apply the Kaplan–Meier estimator to those k points. A kernel variant is similar.
 Error: sup_{t ∈ [0, τ]} |Ŝ(t | x) − S(t | x)|, for a time horizon τ chosen so that enough of the n training data have Y values greater than τ.
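 A minimal sketch of the k-NN version follows. The Euclidean distance and the array layout are assumptions of mine; the theory allows any separable metric space, and the kernel variant would weight all training points by a kernel applied to their distances from x instead of hard-selecting the k closest:

```python
import numpy as np

def beran_knn_survival(x, X_train, Y_train, delta_train, k, t):
    """Beran-style k-NN estimate of S(t | x): find the k training points
    closest to x, then apply Kaplan-Meier to their (Y, delta) pairs."""
    # Select the k nearest neighbors of x (Euclidean distance assumed).
    dists = np.linalg.norm(X_train - x, axis=1)
    nn = np.argsort(dists)[:k]
    Y_k, delta_k = Y_train[nn], delta_train[nn]

    # Kaplan-Meier product over the distinct observed death times up to t.
    S_hat = 1.0
    for y in np.unique(Y_k[(delta_k == 1) & (Y_k <= t)]):
        at_risk = np.sum(Y_k >= y)                    # neighbors still at risk at y
        deaths = np.sum((Y_k == y) & (delta_k == 1))  # deaths exactly at y
        S_hat *= 1.0 - deaths / at_risk
    return S_hat
```

 With the simulated data from the earlier sketch, X_train, Y_train, and delta_train are just the stacked components of data.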

  24–28. Theory (Informal)
 The k-NN estimator with k = Θ̃(n^{2α/(2α+d)}) has strong consistency rate
 sup_{t ∈ [0, τ]} |Ŝ(t | x) − S(t | x)| ≤ Õ(n^{−α/(2α+d)}).
 If there is no censoring, the problem reduces to conditional CDF estimation.
 → The error upper bound matches, up to a log factor, the conditional CDF estimation lower bound of Chagny & Roche (2014).
 Proof ideas also give finite sample rates for:
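 To make the tuning of k concrete, here is the arithmetic for one illustrative setting (the values of α, d, and n are mine, not from the talk), ignoring the log factors hidden in the tilde notation:

```python
# Lipschitz survival functions (alpha = 1) in intrinsic dimension d = 2:
alpha, d, n = 1.0, 2.0, 10_000
k = n ** (2 * alpha / (2 * alpha + d))     # n^(1/2) = 100 neighbors
rate = n ** (-alpha / (2 * alpha + d))     # n^(-1/4) = 0.1 error scale
print(f"k = {k:.0f}, error rate = {rate:.2f}")
```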
