Super-resolution using Gaussian Process Regression


  1. Super-resolution using Gaussian Process Regression: Final Year Project Interim Report. He He, Department of Electronic and Information Engineering, The Hong Kong Polytechnic University. December 30, 2010.

  2. Outline: (1) Introduction; (2) Gaussian Process Regression (Multivariate Normal Distribution, Gaussian Process Regression, Training); (3) GPR for Super-resolution (Framework, Covariance Function).

  3. Outline (current section: Introduction).

  4. The goal of super-resolution (SR) is to estimate a high-resolution (HR) image from one or a set of low-resolution (LR) images. It is widely applied in face recognition, medical imaging, HDTV, etc. Figure: Face recognition in video.

  5. SR is widely applied in face recognition, medical imaging, HDTV, etc. Figure: Super-resolution in medical imaging.

  6. Super-resolution Methods. Interpolation-based methods: fast, but the HR image is usually blurred (e.g., bicubic interpolation, NEDI). Learning-based methods: hallucinate textures from an HR/LR image-pair database. Reconstruction-based methods: formulate an optimization problem constrained by the LR image, with various priors.
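
As a concrete point of reference for the interpolation-based baseline, cubic spline upscaling via scipy is a close stand-in for bicubic interpolation. This is a generic illustration, not the report's code; the image and scale factor below are placeholders.

```python
import numpy as np
from scipy import ndimage

def interpolation_sr(lr_image, scale):
    """Interpolation-based SR baseline: cubic spline upscaling (order-3 zoom)."""
    return ndimage.zoom(lr_image.astype(float), scale, order=3)

lr = np.random.rand(32, 32)        # stand-in for a low-resolution image
hr = interpolation_sr(lr, 2)       # 64 x 64 estimate; typically looks blurred
```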

  7. Outline (current section: Gaussian Process Regression).

  8. Multivariate Normal Distribution. Definition: a random vector X = (X_1, X_2, ..., X_p) is said to be multivariate normally (MVN) distributed if every linear combination of its components, Y = a^T X, has a univariate normal distribution. Real-world random variables can often be approximated as following a multivariate normal distribution. The probability density function of X is
     f(x) = 1 / ((2π)^(p/2) |Σ|^(1/2)) · exp(−(1/2) (x − µ)^T Σ^(−1) (x − µ))   (1)
where µ is the mean of X and Σ is the covariance matrix.
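
As a sanity check of Equation (1), the density can be evaluated directly and compared against scipy's implementation; the particular mean, covariance, and query point below are arbitrary.

```python
import numpy as np
from scipy.stats import multivariate_normal

mu = np.array([0.0, 0.0])
Sigma = np.array([[2.0, 0.3],
                  [0.3, 1.0]])
x = np.array([0.5, -0.2])

# Eq. (1), written out directly.
p = x.size
diff = x - mu
pdf_manual = np.exp(-0.5 * diff @ np.linalg.inv(Sigma) @ diff) / (
    (2 * np.pi) ** (p / 2) * np.sqrt(np.linalg.det(Sigma)))

pdf_scipy = multivariate_normal(mean=mu, cov=Sigma).pdf(x)
assert np.isclose(pdf_manual, pdf_scipy)
```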

  9. Multivariate Normal Distribution. Example: a bivariate normal distribution with
     µ = [1, 1]^T,   Σ = [[1, 0], [0, 1]].
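
A minimal numpy sketch of this example, drawing samples and checking that the empirical mean and covariance are close to µ and Σ (the sample size and seed are arbitrary):

```python
import numpy as np

mu = np.array([1.0, 1.0])
Sigma = np.eye(2)                       # the identity covariance from the example

rng = np.random.default_rng(0)
samples = rng.multivariate_normal(mu, Sigma, size=2000)

print(samples.mean(axis=0))             # approximately [1, 1]
print(np.cov(samples, rowvar=False))    # approximately the identity matrix
```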

  10. Multivariate Normal Distribution. Property 1: the joint distribution of two MVN random variables is also an MVN distribution. Given X_1 ∼ N(µ_1, Σ_11), X_2 ∼ N(µ_2, Σ_22) and X = [X_1; X_2], we have X ∼ N_p(µ, Σ) with
     µ = [µ_1; µ_2],   Σ = [[Σ_11, Σ_12], [Σ_21, Σ_22]].

  11. Multivariate Normal Distribution. Property 2: the conditional distributions of the components of an MVN are (multivariate) normal. The distribution of X_1, given that X_2 = x_2, is normal with
     Mean = µ_1 + Σ_12 Σ_22^(−1) (x_2 − µ_2)   (2)
     Covariance = Σ_11 − Σ_12 Σ_22^(−1) Σ_21   (3)
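
Equations (2) and (3) map directly onto code. The sketch below is illustrative only; the function name is my own, and the explicit inverse is kept for readability rather than numerical robustness.

```python
import numpy as np

def conditional_gaussian(mu1, mu2, S11, S12, S21, S22, x2):
    """Mean and covariance of X1 given X2 = x2, for jointly Gaussian (X1, X2)."""
    S22_inv = np.linalg.inv(S22)
    mean = mu1 + S12 @ S22_inv @ (x2 - mu2)   # Eq. (2)
    cov = S11 - S12 @ S22_inv @ S21           # Eq. (3)
    return mean, cov
```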

  12. Gaussian Process. Definition: a Gaussian process (GP) defines a distribution over a function f, where f is a mapping from the input space X to R, such that for any finite subset of X the marginal distribution P(f(x_1), f(x_2), ..., f(x_n)) is a multivariate normal distribution:
     f | X ∼ N(m(x), K(X, X))   (4)
where
     X = {x_1, x_2, ..., x_n}   (5)
     m(x) = E[f(x)]   (6)
     k(x_i, x_j) = E[(f(x_i) − m(x_i))(f(x_j) − m(x_j))]   (7)
and K(X, X) denotes the covariance matrix with K_ij = k(x_i, x_j).

  13. Gaussian Process. Formally, we write the Gaussian process as
     f(x) ∼ GP(m(x), k(x_i, x_j))   (8)
Without loss of generality, the mean is usually taken to be zero. The GP is parameterized by the mean function m(x) and the covariance function k(x_i, x_j), and inference is carried out directly in function space.
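
Because a zero-mean GP is fully specified by its covariance function, sample functions can be drawn from the prior by evaluating the covariance on a grid of inputs. A minimal sketch; the squared-exponential kernel here anticipates Equation (29) later in the report, and the jitter term is a standard numerical safeguard rather than part of the model.

```python
import numpy as np

def se_kernel(X1, X2, sigma_f=1.0, ell=1.0):
    """Squared-exponential covariance between two sets of scalar inputs."""
    d2 = (X1[:, None] - X2[None, :]) ** 2
    return sigma_f**2 * np.exp(-0.5 * d2 / ell**2)

# Draw three sample functions from the zero-mean GP prior on a 1-D grid.
X = np.linspace(0.0, 5.0, 100)
K = se_kernel(X, X) + 1e-8 * np.eye(len(X))   # small jitter for numerical stability
rng = np.random.default_rng(1)
f_samples = rng.multivariate_normal(np.zeros(len(X)), K, size=3)
```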

  14. Gaussian Process Regression. Model:
     f(x) ∼ GP(m(x), k(x_i, x_j))   (9)
Given the test inputs X_*, the output f_* is
     f_* ∼ N(0, K(X_*, X_*))   (10)
Under the Gaussian prior, the joint distribution of the training outputs f and the test outputs f_* is
     [f; f_*] ∼ N(0, [[K(X, X), K(X, X_*)], [K(X_*, X), K(X_*, X_*)]])   (11)

  15. Noisy Model. In reality we do not have access to the true function values but only noisy observations. Assuming independent, identically distributed noise, the noisy model is
     y = f(x) + ε,   ε ∼ N(0, σ_n²)   (12)
     f(x) ∼ GP(m(x), K(X, X))   (13)
     Var(y) = Var(f(x)) + Var(ε) = K(X, X) + σ_n² I   (14)
Thus the joint distribution used for prediction is
     [y; f_*] ∼ N(0, [[K(X, X) + σ_n² I, K(X, X_*)], [K(X_*, X), K(X_*, X_*)]])   (15)

  16. Prediction. Applying the earlier property of the conditional distribution, we obtain
     f_* | y ∼ N(f̄_*, V(f_*))   (16)
     f̄_* = K(X_*, X) [K(X, X) + σ_n² I]^(−1) y   (17)
     V(f_*) = K(X_*, X_*) − K(X_*, X) [K(X, X) + σ_n² I]^(−1) K(X, X_*)   (18)
where y are the training outputs and f_* are the test outputs, predicted by the mean f̄_*.
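
Equations (16) to (18) translate into the standard GP prediction routine. This is a sketch under the assumption that kernel is a callable returning the covariance matrix K(A, B); a Cholesky factorization replaces the explicit matrix inverse, which is mathematically equivalent but numerically more stable.

```python
import numpy as np

def gp_predict(X, y, X_star, kernel, sigma_n):
    """Predictive mean and covariance of f_* from Eqs. (17) and (18)."""
    K = kernel(X, X)
    K_s = kernel(X, X_star)
    K_ss = kernel(X_star, X_star)
    Ky = K + sigma_n**2 * np.eye(len(X))      # K(X, X) + sigma_n^2 I
    L = np.linalg.cholesky(Ky)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    f_mean = K_s.T @ alpha                    # Eq. (17)
    v = np.linalg.solve(L, K_s)
    f_cov = K_ss - v.T @ v                    # Eq. (18)
    return f_mean, f_cov
```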

  17. Marginal Likelihood. GPR model:
     y = f + ε   (19)
     f ∼ GP(m(x), K)   (20)
     ε ∼ N(0, σ_n² I)   (21)
where y is an n-dimensional vector of observations. Without loss of generality, let m(x) = 0. Then y | X follows a normal distribution with
     E(y | X) = 0   (22)
     Var(y | X) = K(X, X) + σ_n² I   (23)

  18. Marginal Likelihood. Let K_y = Var(y | X). Then
     p(y | X) = 1 / ((2π)^(n/2) |K_y|^(1/2)) · exp(−(1/2) y^T K_y^(−1) y)   (24)
The log marginal likelihood is
     L = log p(y | X) = −(n/2) log 2π − (1/2) log|K_y| − (1/2) y^T K_y^(−1) y   (25)
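
A corresponding sketch of Equation (25); the determinant term is obtained from the diagonal of a Cholesky factor of K_y, and the function and argument names are my own.

```python
import numpy as np

def log_marginal_likelihood(X, y, kernel, sigma_n):
    """log p(y | X) from Eq. (25)."""
    n = len(X)
    Ky = kernel(X, X) + sigma_n**2 * np.eye(n)
    L = np.linalg.cholesky(Ky)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (-0.5 * y @ alpha                       # -(1/2) y^T Ky^{-1} y
            - np.sum(np.log(np.diag(L)))           # -(1/2) log|Ky|
            - 0.5 * n * np.log(2 * np.pi))         # -(n/2) log 2*pi
```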

  19. Maximum a posteriori. Matrix derivatives:
     ∂Y^(−1)/∂θ_i = −Y^(−1) (∂Y/∂θ_i) Y^(−1)   (26)
     ∂ log|Y| / ∂θ_i = tr(Y^(−1) ∂Y/∂θ_i)   (27)
Gradient ascent:
     ∂L/∂θ_i = (1/2) y^T K^(−1) (∂K/∂θ_i) K^(−1) y − (1/2) tr(K^(−1) ∂K/∂θ_i)   (28)
where ∂K/∂θ_i is the matrix of element-wise derivatives of K with respect to θ_i.
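
Equation (28) is the quantity needed for gradient ascent on each hyperparameter. A sketch; the explicit inverse and the single-parameter update in the comment are for illustration only, and dK_dtheta must be supplied by differentiating the chosen covariance function.

```python
import numpy as np

def lml_gradient(y, Ky, dK_dtheta):
    """Eq. (28): derivative of log p(y | X) with respect to one hyperparameter,
    given Ky = K + sigma_n^2 I and its element-wise derivative matrix."""
    Ky_inv = np.linalg.inv(Ky)
    alpha = Ky_inv @ y
    return 0.5 * alpha @ dK_dtheta @ alpha - 0.5 * np.trace(Ky_inv @ dK_dtheta)

# One gradient-ascent step on a hyperparameter theta_i might look like:
#   theta_i += step_size * lml_gradient(y, Ky(theta_i), dKy_dtheta(theta_i))
```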

  20. Outline (current section: GPR for Super-resolution).

  21. Graphical Representation. Model: y = f(x) + ε. Squares: observed pixels. Circles: the unknown Gaussian field. Inputs (x): the neighbors (predictors) of the target pixel. Outputs (y): the pixel at the center of each 3 × 3 patch. The thick horizontal line denotes a set of fully connected nodes.
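
One way the training pairs described above could be assembled from an image; the function name and the row-major ordering of the 8 neighbors are my own choices.

```python
import numpy as np

def make_training_set(img):
    """For every interior pixel, use its 8 neighbors as the input x and the
    pixel itself (the center of the 3 x 3 patch) as the target y."""
    H, W = img.shape
    X, y = [], []
    for i in range(1, H - 1):
        for j in range(1, W - 1):
            patch = img[i - 1:i + 2, j - 1:j + 2].ravel()
            X.append(np.delete(patch, 4))   # the 8 neighbors (center removed)
            y.append(patch[4])              # the center pixel
    return np.array(X), np.array(y)
```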

  22. Workflow. Stage 1: interpolation. Input LR patch.

  23. Workflow. Stage 1: interpolation. Sample training targets.

  24. Workflow. Stage 1: interpolation. SR based on bicubic interpolation. Stage 2: deblurring.

  25. Workflow. Stage 1: interpolation. Stage 2: deblurring. Sample training targets.

  26. Workflow. Stage 1: interpolation. Stage 2: deblurring. Obtain neighbors from the downsampled patch.

  27. Workflow. Stage 1: interpolation. Stage 2: deblurring. SR based on the simulated blurring process.

  28. Covariance Equation. The covariance equation defines the similarity between two points (vectors) and indicates the underlying distribution of functions in the GP. Squared-exponential covariance function:
     k(x_i, x_j) = σ_f² exp(−(1/2) (x_i − x_j)^T (x_i − x_j) / ℓ²)   (29)
where σ_f² represents the signal variance and ℓ defines the characteristic length scale. Given an image I, the covariance between two pixels I_(i,j) and I_(m,n) is calculated as k(I_(i,j),N , I_(m,n),N), where the subscript N denotes the 8 nearest pixels around the pixel. Therefore, the similarity is based on the Euclidean distance between the pixels' neighborhoods.
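
A sketch of Equation (29) applied to two image pixels via their 8-neighborhoods, as described above; the helper assumes both pixels are interior pixels, and the function name is my own.

```python
import numpy as np

def pixel_covariance(img, p, q, sigma_f=1.0, ell=1.0):
    """Eq. (29) between pixels p = (i, j) and q = (m, n) of image `img`,
    computed on their 8-neighborhoods (both pixels must be interior)."""
    def neighbors(i, j):
        patch = img[i - 1:i + 2, j - 1:j + 2].ravel()
        return np.delete(patch, 4)          # the 8 nearest pixels, center removed
    d = neighbors(*p) - neighbors(*q)
    return sigma_f**2 * np.exp(-0.5 * (d @ d) / ell**2)
```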

  29. Covariance Equation. (a) Test point, (b) training patch, (c) covariance matrix. Local similarity: high responses (red regions) in the training patch are concentrated on edges. Global similarity: high-response regions also include other similar edges within the patch. Conclusion: pixels embedded in a structure similar to that of the target pixel's neighborhood tend to receive higher weights during prediction.

  30. Hyperparameter Adaptation. Hyperparameters: σ_f² (signal variance), σ_n² (noise variance), ℓ (characteristic length scale). Figure panels: (a) test patch, (b) training patch, (c) ℓ = 0.50, σ_n = 0.01, (d) ℓ = 0.05, σ_n = 0.001, (e) ℓ = 1.65, σ_n = 0.14. (c): MAP estimation. (d): quickly varying field with low noise. (e): slowly varying field with high noise.
