  1. CSE 802 Spring 2017: Logistic Regression. Inci M. Baytas, Computer Science, Michigan State University. March 29, 2017

  2. Introduction
  ◮ Consider a two-class classification problem; the posterior probability of class C_1 can be written as (a small numeric sketch follows below):
    p(C_1 | Φ) = y(Φ) = σ(w^T Φ)   (1)
  ◮ σ(·) is the logistic sigmoid function.
  ◮ p(C_2 | Φ) = 1 − p(C_1 | Φ)
  ◮ Φ is a feature vector, a non-linear transformation Φ(x) of the original observation x.
  ◮ The model in Eq. 1 is called logistic regression in the terminology of statistics.
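
A minimal numeric sketch of Eq. (1), assuming plain NumPy; the weight vector w and the feature vector Φ(x) below are made-up toy values, not anything from the lecture:

```python
import numpy as np

# A toy evaluation of Eq. (1): the class-1 posterior as a sigmoid of w^T Phi.
def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

w = np.array([0.5, -1.0, 2.0])        # hypothetical weights
phi = np.array([1.0, 0.3, 0.8])       # Phi(x), with a leading bias term
p_c1 = sigmoid(w @ phi)               # p(C_1 | Phi)
p_c2 = 1.0 - p_c1                     # p(C_2 | Phi) = 1 - p(C_1 | Phi)
print(p_c1, p_c2)
```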

  3. Logistic Regression I
  ◮ A classification model rather than a regression model.
  ◮ A probabilistic discriminative model.
  ◮ We estimate the parameter w directly.
  ◮ Comparison of logistic regression and a generative model in an M-dimensional feature space Φ (see the quick check below):
    ◮ Logistic regression: M adjustable parameters.
    ◮ Generative model: if we fit Gaussian class-conditional densities with a shared covariance using maximum likelihood, we need M(M+5)/2 + 1 parameters: means (2M) + shared covariance (M(M+1)/2) + prior p(C_1) (1).
  ◮ Maximum likelihood is used to determine the parameters of the logistic regression model.
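
A quick arithmetic check of the parameter counts above; M = 5 is an arbitrary illustrative choice:

```python
# Parameter count for the generative model with Gaussian class-conditional
# densities and a shared covariance, as quoted on the slide.
M = 5                              # arbitrary feature-space dimension
means = 2 * M                      # one M-dimensional mean per class
shared_cov = M * (M + 1) // 2      # symmetric M x M covariance matrix
prior = 1                          # p(C_1); p(C_2) follows as 1 - p(C_1)
total = means + shared_cov + prior
assert total == M * (M + 5) // 2 + 1
print(f"generative: {total} parameters, logistic regression: {M}")
```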

  4. Logistic Regression II
  ◮ Definition and properties of the logistic sigmoid function (verified numerically below):
    σ(a) = 1 / (1 + exp(−a))   (2)
    σ(−a) = 1 − σ(a)
    dσ/da = σ(1 − σ)
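
The two identities can be checked numerically; a minimal sketch with NumPy, using an arbitrary grid of points:

```python
import numpy as np

# Numerical check of the sigmoid identities on the slide.
def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

a = np.linspace(-5.0, 5.0, 11)
s = sigmoid(a)

# Symmetry: sigma(-a) = 1 - sigma(a)
assert np.allclose(sigmoid(-a), 1.0 - s)

# Derivative: d sigma / da = sigma * (1 - sigma), via central differences
eps = 1e-6
numeric = (sigmoid(a + eps) - sigmoid(a - eps)) / (2 * eps)
assert np.allclose(numeric, s * (1.0 - s), atol=1e-6)
print("sigmoid identities verified")
```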

  5. Logistic Regression III - How to Estimate w
  ◮ For a training data set {Φ_n, t_n}, where t_n ∈ {0, 1} and Φ_n = Φ(x_n), with n = 1, ..., N, the likelihood can be written as:
    p(t | w) = ∏_{n=1}^{N} y_n^{t_n} (1 − y_n)^{1 − t_n}   (3)
    where t = (t_1, ..., t_N)^T and y_n = p(C_1 | Φ_n).
  ◮ The error function is the negative logarithm of the likelihood, known as the cross-entropy error function (see the sketch below):
    E(w) = −ln p(t | w) = −Σ_{n=1}^{N} { t_n ln y_n + (1 − t_n) ln(1 − y_n) }   (4)
    where y_n = σ(a_n) and a_n = w^T Φ_n.
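
A small sketch of the cross-entropy in Eq. (4); Φ, t, and w below are made-up toy values:

```python
import numpy as np

# Cross-entropy error E(w) of Eq. (4) for a given weight vector.
def cross_entropy(w, Phi, t):
    y = 1.0 / (1.0 + np.exp(-Phi @ w))          # y_n = sigma(w^T Phi_n)
    return -np.sum(t * np.log(y) + (1 - t) * np.log(1 - y))

Phi = np.array([[1.0, 0.5], [1.0, -1.2], [1.0, 2.0]])  # rows are Phi_n
t = np.array([1.0, 0.0, 1.0])                          # targets in {0, 1}
w = np.zeros(2)
print(cross_entropy(w, Phi, t))   # equals N * ln 2 at w = 0, since y_n = 0.5
```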

  6. Logistic Regression IV - How to Estimate w
  ◮ There is no analytical (closed-form) solution.
  ◮ The cross-entropy loss is a convex function, so there is a global minimum.
  ◮ We can use an iterative approach.
  ◮ Calculate the gradient with respect to w:
    ∇E(w) = Σ_{n=1}^{N} (y_n − t_n) Φ_n   (5)
  ◮ Use gradient descent, batch or online (a batch sketch follows below):
    w^{τ+1} = w^τ − η ∇E(w^τ)   (6)
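
A minimal batch gradient descent loop implementing Eqs. (5) and (6) on synthetic data; the step size η, the iteration count, and the per-N averaging of the gradient are illustrative choices, not prescribed by the slides:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Synthetic data drawn from a known logistic model (toy setup).
rng = np.random.default_rng(0)
Phi = np.column_stack([np.ones(100), rng.normal(size=(100, 2))])  # bias + 2 features
w_true = np.array([-0.5, 2.0, -1.0])
t = (rng.random(100) < sigmoid(Phi @ w_true)).astype(float)

w = np.zeros(3)
eta = 0.1
for _ in range(500):
    y = sigmoid(Phi @ w)
    grad = Phi.T @ (y - t)        # Eq. (5): sum_n (y_n - t_n) Phi_n
    w -= eta * grad / len(t)      # Eq. (6), gradient averaged over N here
print(w)                          # should land near w_true
```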

  7. Logistic Regression V - How to Estimate w
  ◮ Newton-Raphson algorithm (a sketch follows below):
    w^(new) = w^(old) − H^{−1} ∇E(w)   (7)
  ◮ It uses a local quadratic approximation to the cross-entropy error function to update w iteratively.
  ◮ For logistic regression, the Newton-Raphson algorithm is also known as iterative reweighted least squares (IRLS).
  ◮ Convexity: H is positive definite (all eigenvalues of H are positive).
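
A sketch of the IRLS update in Eq. (7). For logistic regression the Hessian takes the form H = Φ^T R Φ with R = diag(y_n(1 − y_n)), a form from Bishop (Ch. 4); the toy data and the small ridge term added for numerical safety are assumptions:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Toy data from a known logistic model.
rng = np.random.default_rng(1)
Phi = np.column_stack([np.ones(200), rng.normal(size=(200, 2))])
t = (rng.random(200) < sigmoid(Phi @ np.array([0.3, -1.5, 0.8]))).astype(float)

w = np.zeros(3)
for _ in range(10):                      # Newton usually converges in a few steps
    y = sigmoid(Phi @ w)
    grad = Phi.T @ (y - t)               # gradient of E(w), Eq. (5)
    R = y * (1.0 - y)                    # diagonal of the weighting matrix
    H = Phi.T @ (Phi * R[:, None]) + 1e-8 * np.eye(3)  # ridge term is an assumption
    w = w - np.linalg.solve(H, grad)     # Eq. (7), without forming H^{-1} explicitly
print(w)
```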

  8. Multi-class Logistic Regression
  ◮ Cross-entropy for the multi-class classification problem (see the sketch below):
    E(w_1, ..., w_K) = −Σ_{n=1}^{N} Σ_{k=1}^{K} t_{nk} ln y_{nk}   (8)
    where y_k(Φ) = p(C_k | Φ) = exp(w_k^T Φ) / Σ_j exp(w_j^T Φ), which is called the softmax function.
  ◮ Use maximum likelihood to estimate the parameters.
  ◮ Use an iterative approach such as Newton-Raphson.
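
A sketch of the softmax and the multi-class cross-entropy in Eq. (8); W stacks the K weight vectors as columns, and all values are made up:

```python
import numpy as np

# Softmax over each row of a score matrix A (rows index samples).
def softmax(A):
    A = A - A.max(axis=1, keepdims=True)      # stabilization, an added detail
    expA = np.exp(A)
    return expA / expA.sum(axis=1, keepdims=True)

Phi = np.array([[1.0, 0.2], [1.0, -0.7], [1.0, 1.5]])   # N = 3, M = 2
W = np.array([[0.1, -0.3, 0.2], [0.5, 0.0, -0.5]])      # M x K, K = 3 classes
T = np.eye(3)                                            # one-hot targets t_nk

Y = softmax(Phi @ W)                  # y_nk = p(C_k | Phi_n)
E = -np.sum(T * np.log(Y))            # Eq. (8)
print(E)
```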

  9. Over-fitting in Logistic Regression
  ◮ Maximum likelihood can suffer from severe over-fitting.
  ◮ This can be overcome by finding a MAP solution for w (a Bayesian treatment).
  ◮ Another alternative is regularization: add a regularizer to the loss function, giving a regularized log-likelihood (an ℓ2 sketch follows below).
    ◮ ℓ2 norm
    ◮ ℓ1 norm (Lasso)
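
A minimal sketch of the ℓ2-regularized version, assuming the penalty (λ/2)‖w‖² is added to Eq. (4); λ and the data are illustrative choices:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Cross-entropy plus an l2 penalty; the gradient gains a lambda * w term.
def regularized_loss_and_grad(w, Phi, t, lam):
    y = sigmoid(Phi @ w)
    loss = -np.sum(t * np.log(y) + (1 - t) * np.log(1 - y)) + 0.5 * lam * (w @ w)
    grad = Phi.T @ (y - t) + lam * w
    return loss, grad

Phi = np.array([[1.0, 3.0], [1.0, -2.5], [1.0, 0.4]])
t = np.array([1.0, 0.0, 1.0])
print(regularized_loss_and_grad(np.zeros(2), Phi, t, lam=0.1))
```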

  10. References
  ◮ Classification lecture of Dr. Jiayu Zhou.
  ◮ Christopher Bishop, Pattern Recognition and Machine Learning (Information Science and Statistics), Springer-Verlag New York, 2006.
