 
Gaussian Process Regression with Censored Data Using Expectation Propagation
Perry Groot, Peter Lucas
Radboud University Nijmegen
{perry, peterl}@cs.ru.nl
6th European Workshop on Probabilistic Graphical Models, PGM 2012

Outline: Introduction; Problem Setting; Bayesian Framework (Prior - GPs, Likelihood, Inference); Experiments; Conclusions

Introduction
[Figure: curve plotted for x from 0 to 10, vertical axis from 0 to 0.35]

Introduction - Truncation
[Figure: illustration of truncation, x-axis from 0 to 10]

Introduction - Censoring
[Figure: illustration of censoring, x-axis from 0 to 10]

Introduction
Truncation and censoring both represent a limitation:
  Truncation limits the population from which the data is drawn.
  Censoring limits the variable of interest.
The key difference lies in the explanatory variables:
  Truncation: missing
  Censoring: fully observable
Examples: survival analysis, reliability testing

Problem Setting
Learn a function
$$f : \mathbb{R}^D \to \mathbb{R}$$
given a set of observations
$$\mathcal{D} = \{(x_1, y_1), \ldots, (x_n, y_n)\}$$
where $y$ is a censored version of $y^*$:
$$y = \begin{cases} l & \text{if } y^* \le l \\ y^* & \text{if } l < y^* < u \\ u & \text{if } y^* \ge u \end{cases}$$
We are interested in the posterior
$$p(f \mid \mathcal{D}) = \frac{p(f)\, p(\mathcal{D} \mid f)}{p(\mathcal{D})}$$

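The censoring rule can be illustrated with a minimal sketch. The function name `censor` and the bounds 0 and 5 are assumptions chosen for illustration, not values from the slides.

```python
def censor(y_star, lower, upper):
    """Return the observed value y for a latent value y_star."""
    if y_star <= lower:
        return lower          # left-censored: only "at most lower" is known
    if y_star >= upper:
        return upper          # right-censored: only "at least upper" is known
    return y_star             # uncensored: the latent value is observed directly


# Hypothetical latent values and bounds, for illustration only.
latent = [-2.0, 0.5, 3.0, 7.5]
observed = [censor(v, lower=0.0, upper=5.0) for v in latent]
print(observed)
```

Note that the inputs x remain fully observed for every point; only the target is clipped.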
Gaussian Processes
A Gaussian process (GP) is a collection of random variables $\{f_i\}$ with the property that any finite subset has a joint Gaussian distribution.
A GP specifies a probability distribution over functions, $f(x) \sim \mathcal{GP}(m(x), k(x, x'))$, and is fully specified by its mean function $m(x)$ and covariance (or kernel) function $k(x, x')$.
Typically $m(x) = 0$, which gives
$$\{f(x_1), \ldots, f(x_I)\} \sim \mathcal{N}(0, K) \quad \text{with } K_{ij} = k(x_i, x_j)$$

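As a rough sketch of what the zero-mean prior means in practice, the snippet below builds K on a grid and draws one function sample. The squared-exponential kernel, its length scale, the grid, and the jitter value are all assumptions; the slides do not say which kernel was used.

```python
import numpy as np


def sq_exp_kernel(A, B, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel matrix between rows of A and rows of B."""
    sq_dist = (np.sum(A**2, axis=1)[:, None]
               + np.sum(B**2, axis=1)[None, :]
               - 2.0 * A @ B.T)
    # Clip tiny negative values caused by floating-point rounding.
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


# 50 one-dimensional inputs on [0, 10], stored as an (n, D) array with D = 1.
X = np.linspace(0.0, 10.0, 50)[:, None]
K = sq_exp_kernel(X, X)

# Draw f ~ N(0, K) through a Cholesky factor; the jitter keeps it stable.
rng = np.random.default_rng(0)
L = np.linalg.cholesky(K + 1e-6 * np.eye(len(X)))
f_sample = L @ rng.standard_normal(len(X))
```

Each call with a fresh random vector yields a different smooth function drawn from the prior.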
Gaussian Processes - Posterior process
A priori, given data $\mathcal{D} = \{X, y\}$ with $y = f(X)$ and test points $X_*$, we have
$$\begin{bmatrix} f(X) \\ f(X_*) \end{bmatrix} \sim \mathcal{N}\left(0, \begin{bmatrix} K(X, X) & K(X, X_*) \\ K(X_*, X) & K(X_*, X_*) \end{bmatrix}\right)$$
and after conditioning
$$f(X_*) \mid X_*, X, y \sim \mathcal{N}(\mu, \Sigma)$$
with
$$\mu = K(X_*, X)\, K(X, X)^{-1} y$$
$$\Sigma = K(X_*, X_*) - K(X_*, X)\, K(X, X)^{-1} K(X, X_*)$$
Computing $K(X, X)^{-1}$ costs $O(n^3)$.

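The conditioning formulas for the mean and covariance can be sketched with a Cholesky factorization in place of an explicit inverse. The kernel choice, the jitter term, and the names `sq_exp_kernel` and `gp_posterior` are assumptions made for this example.

```python
import numpy as np


def sq_exp_kernel(A, B, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel matrix between rows of A and rows of B."""
    sq_dist = (np.sum(A**2, axis=1)[:, None]
               + np.sum(B**2, axis=1)[None, :]
               - 2.0 * A @ B.T)
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


def gp_posterior(X, y, X_star, jitter=1e-8):
    """Posterior mean and covariance of f(X_star) given noise-free y = f(X)."""
    K = sq_exp_kernel(X, X) + jitter * np.eye(len(X))
    K_s = sq_exp_kernel(X_star, X)         # K(X*, X)
    K_ss = sq_exp_kernel(X_star, X_star)   # K(X*, X*)
    L = np.linalg.cholesky(K)              # the O(n^3) step
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # K(X, X)^{-1} y
    V = np.linalg.solve(L, K_s.T)          # L^{-1} K(X, X*)
    mu = K_s @ alpha
    Sigma = K_ss - V.T @ V                 # K** - K*x K^{-1} Kx*
    return mu, Sigma


# Hypothetical training data: three noise-free samples of sin(x).
X = np.array([[1.0], [3.0], [6.0]])
y = np.sin(X).ravel()
mu, Sigma = gp_posterior(X, y, X)
```

Evaluated at the training inputs themselves, the posterior mean reproduces y and the posterior variance collapses to roughly zero, which is a quick sanity check on the formulas.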
Gaussian Processes - 1D demo
[Figure: four panels, each with x-axis from 0 to 10 and y-axis from -10 to 10]

Likelihood
Assume that the latent function values are contaminated with Gaussian noise with zero mean and unknown variance.
The likelihood becomes a mixture of Gaussian and probit likelihood terms:
$$L = \prod_{i=1}^{n} p(y_i \mid f_i) = \prod_{y_i = l} \left[1 - \Phi\left(\frac{f_i - l}{\sigma}\right)\right] \prod_{l < y_i < u} \left[\frac{1}{\sigma}\,\phi\left(\frac{y_i - f_i}{\sigma}\right)\right] \prod_{y_i = u} \left[\Phi\left(\frac{f_i - u}{\sigma}\right)\right]$$
which is well known as the Tobit likelihood.

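A per-observation version of the Tobit likelihood might look like the sketch below, which uses only the standard library. The function names are assumptions, and observed values at or beyond a bound are treated as censored.

```python
import math


def norm_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal cumulative distribution Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tobit_likelihood(y, f, sigma, lower, upper):
    """p(y | f) for one observation under the Tobit likelihood."""
    if y <= lower:                                  # left-censored term
        return 1.0 - norm_cdf((f - lower) / sigma)
    if y >= upper:                                  # right-censored term
        return norm_cdf((f - upper) / sigma)
    return norm_pdf((y - f) / sigma) / sigma        # uncensored Gaussian term


def tobit_log_likelihood(ys, fs, sigma, lower, upper):
    """log L = sum_i log p(y_i | f_i)."""
    return sum(math.log(tobit_likelihood(y, f, sigma, lower, upper))
               for y, f in zip(ys, fs))
```

When the latent value sits exactly on a bound, a censored observation at that bound has probability one half, since half of the noise mass falls on the censored side.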
Expectation Propagation
The posterior $p(f \mid \mathcal{D}) = \frac{p(f)\, p(\mathcal{D} \mid f)}{p(\mathcal{D})}$ is intractable.
EP approximates the likelihood by a Gaussian distribution, making the posterior tractable.
Local likelihood approximations:
$$p(y_i \mid f_i) \simeq t_i(f_i \mid \tilde{Z}_i, \tilde{\mu}_i, \tilde{\sigma}_i^2) = \tilde{Z}_i\, \mathcal{N}(f_i \mid \tilde{\mu}_i, \tilde{\sigma}_i^2)$$
The approximation is updated iteratively.
In the Gaussian case the update step turns out to be the same as moment matching.
The zeroth, first, and second moments of the Tobit likelihood can be computed analytically.

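A single moment-matching step for one site can be sketched as follows. This is not the analytic update from the slides: the moments are computed here by trapezoidal quadrature purely for illustration, and all function names, the grid size, and the integration width are assumptions. The conversion from matched moments to site parameters is the standard EP relation for Gaussian sites.

```python
import math


def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tobit_likelihood(y, f, sigma, lower, upper):
    if y <= lower:
        return 1.0 - norm_cdf((f - lower) / sigma)
    if y >= upper:
        return norm_cdf((f - upper) / sigma)
    return norm_pdf((y - f) / sigma) / sigma


def tilted_moments(y, mu_cav, var_cav, sigma, lower, upper,
                   n_grid=4001, width=10.0):
    """Moments of p(y | f) * N(f | mu_cav, var_cav) by trapezoidal quadrature."""
    sd_cav = math.sqrt(var_cav)
    start = mu_cav - width * sd_cav
    step = 2.0 * width * sd_cav / (n_grid - 1)
    m0 = m1 = m2 = 0.0
    for k in range(n_grid):
        f = start + k * step
        w = 0.5 if k in (0, n_grid - 1) else 1.0     # trapezoid end weights
        cavity = norm_pdf((f - mu_cav) / sd_cav) / sd_cav
        g = w * step * tobit_likelihood(y, f, sigma, lower, upper) * cavity
        m0 += g
        m1 += g * f
        m2 += g * f * f
    mean = m1 / m0
    var = m2 / m0 - mean * mean
    return m0, mean, var          # zeroth moment, matched mean, matched variance


def site_from_moments(mean, var, mu_cav, var_cav):
    """Gaussian site whose product with the cavity has the matched moments."""
    site_var = 1.0 / (1.0 / var - 1.0 / var_cav)
    site_mu = site_var * (mean / var - mu_cav / var_cav)
    return site_mu, site_var
```

For an uncensored observation the likelihood term is already Gaussian, so the recovered site should have mean y and variance sigma squared, which gives a simple check of the update.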
Experiments
[Figures: three plots, each with x-axis from 0 to 1 and y-axis from -10 to 20]

Experiments
Concordance index:
$$c(\mathcal{D}, G, f) = \frac{1}{|E|} \sum_{E_{ij}} \mathbb{1}_{f(x_i) < f(x_j)}$$
where $G = (X, E)$ is an order graph with edges $E$ according to (1)-(4).
[Figure: the four cases (1)-(4) that define edges, drawn relative to the bounds l and u]
The c-index is the fraction of all pairs of inputs whose predicted values are correctly ordered, among all pairs that can be ordered.

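Given an order graph, the c-index reduces to a simple count. The sketch below takes the edge list as input and does not reproduce rules (1)-(4) for constructing it; the edges and predictions shown are hypothetical.

```python
def concordance_index(edges, predictions):
    """Fraction of edges (i, j) for which predictions[i] < predictions[j].

    Each edge (i, j) states that the target of input i is known to be
    smaller than the target of input j.
    """
    if not edges:
        raise ValueError("the order graph has no edges")
    correct = sum(1 for i, j in edges if predictions[i] < predictions[j])
    return correct / len(edges)


# Hypothetical order graph over four inputs and hypothetical predictions.
edges = [(0, 1), (1, 2), (0, 2), (3, 2)]
predictions = [1.0, 2.5, 4.0, 5.0]
c = concordance_index(edges, predictions)
print(c)
```

Here three of the four orderable pairs are predicted in the right order, so the index is 0.75. Pairs that cannot be ordered because of censoring simply never appear as edges.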
Experiments
Housing data:
  506 observations on 14 real-valued variables
  Median values greater than $50,000 appear as $50,000
  16 observations were censored (3.2% of the data)

Table: Concordance results for the housing data (mean c-index and standard deviation).

  method           c-index
  GP               0.866 ± 0.003
  Tobit-GP (LA)    0.879 ± 0.008
  Tobit-GP (EP)    0.892 ± 0.007

Conclusions
GPs provide a flexible, non-parametric Bayesian framework that can be extended to censored observations.
The intractable posterior in the case of a Tobit likelihood can be approximated with EP using analytic update steps, leading to a stable algorithm.