Linear Models for Classification II
Henrik I Christensen
Robotics & Intelligent Machines @ GT
Georgia Institute of Technology, Atlanta, GA 30332-0280


1. Linear Models for Classification II
   Henrik I Christensen
   Robotics & Intelligent Machines @ GT
   Georgia Institute of Technology, Atlanta, GA 30332-0280
   hic@cc.gatech.edu
   Henrik I Christensen (RIM@GT), Linear Bayes Classification, slide 1 / 25

2. Outline
   1. Introduction
   2. Probabilistic Generative Models
   3. Probabilistic Discriminative Models
   4. Class Projects
   5. Summary

3. Introduction
   Recap: last time we treated linear classification as an optimization problem.
   Today: Bayesian models for classification.
   Discussion of possible class projects.
   Summary.

4. Outline (repeat of slide 2)

5. Probabilistic Generative Models
   Objective: model p(C_k | x).
   Modelling uses:
   - p(C_k), the class priors
   - p(x | C_k), the class conditionals
   For two classes:
     p(C_1 | x) = \frac{p(x | C_1) p(C_1)}{p(x | C_1) p(C_1) + p(x | C_2) p(C_2)}

6. Sigmoid Formulation
   Reformulation:
     p(C_1 | x) = \frac{1}{1 + e^{-a}} = \sigma(a)
   where
     a = \ln \frac{p(x | C_1) p(C_1)}{p(x | C_2) p(C_2)}
   The logistic sigmoid \sigma(a) is defined by
     \sigma(a) = \frac{1}{1 + e^{-a}}
   Note that \sigma(-a) = 1 - \sigma(a).
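The sigmoid and its symmetry property translate directly into code; the following is an illustrative sketch in Python/NumPy (not part of the original slides, helper names are my own):

```python
import numpy as np

def sigmoid(a):
    """Logistic sigmoid: sigma(a) = 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + np.exp(-a))

# The two-class posterior p(C1|x) is sigma(a), with a the log-odds
# a = ln[ p(x|C1)p(C1) / (p(x|C2)p(C2)) ].
a = np.linspace(-5.0, 5.0, 11)
print(sigmoid(0.0))                                 # 0.5: equal posteriors
print(np.allclose(sigmoid(-a), 1.0 - sigmoid(a)))   # True: the symmetry note
```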

7. Sigmoid Function
   [Figure: the logistic sigmoid \sigma(a) for a \in [-5, 5], rising from 0 through \sigma(0) = 0.5 toward 1.]

8. Generalization to K > 2 Classes
   Consider
     p(C_k | x) = \frac{p(x | C_k) p(C_k)}{\sum_i p(x | C_i) p(C_i)} = \frac{e^{a_k}}{\sum_i e^{a_i}}
   where a_k = \ln ( p(x | C_k) p(C_k) ); this is the normalized exponential (softmax).
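A minimal Python sketch of this normalized exponential (the function name is illustrative, not from the slides; the max-shift is a standard numerical precaution, not something the slide states):

```python
import numpy as np

def softmax_posterior(a):
    """p(C_k|x) = exp(a_k) / sum_i exp(a_i), with a_k = ln(p(x|C_k) p(C_k)).
    Shifting by max(a) leaves the ratios unchanged but avoids overflow."""
    a = np.asarray(a, dtype=float)
    e = np.exp(a - a.max())
    return e / e.sum()

post = softmax_posterior([1.0, 2.0, 3.0])   # a valid probability vector
```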

9. The Case with Normal Distributions
   Consider D-dimensional Gaussian class conditionals with means \mu_k and a shared covariance \Sigma.
   The result is
     p(C_1 | x) = \sigma(w^T x + w_0)
   where
     w = \Sigma^{-1} (\mu_1 - \mu_2)
     w_0 = -\frac{1}{2} \mu_1^T \Sigma^{-1} \mu_1 + \frac{1}{2} \mu_2^T \Sigma^{-1} \mu_2 + \ln \frac{p(C_1)}{p(C_2)}
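A small sketch of these two formulas, assuming NumPy (the function name and the example means/priors are mine, chosen so the boundary passes through the origin):

```python
import numpy as np

def two_class_gaussian_weights(mu1, mu2, Sigma, p1, p2):
    """w and w0 such that p(C1|x) = sigma(w^T x + w0) for Gaussian
    class conditionals sharing the covariance Sigma."""
    Si = np.linalg.inv(Sigma)
    w = Si @ (mu1 - mu2)
    w0 = (-0.5 * mu1 @ Si @ mu1
          + 0.5 * mu2 @ Si @ mu2
          + np.log(p1 / p2))
    return w, w0

# Symmetric means, equal priors, identity covariance.
mu1, mu2 = np.array([1.0, 0.0]), np.array([-1.0, 0.0])
w, w0 = two_class_gaussian_weights(mu1, mu2, np.eye(2), 0.5, 0.5)
```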

10. The Multi-Class Normal Case
   For each class,
     a_k(x) = w_k^T x + w_{k0}
   where
     w_k = \Sigma^{-1} \mu_k
     w_{k0} = -\frac{1}{2} \mu_k^T \Sigma^{-1} \mu_k + \ln p(C_k)
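The per-class scores a_k(x) can be sketched as follows (function name and the three example means are mine; the largest score picks the predicted class):

```python
import numpy as np

def class_scores(x, mus, Sigma, priors):
    """a_k(x) = w_k^T x + w_k0, with w_k = Sigma^{-1} mu_k and
    w_k0 = -0.5 mu_k^T Sigma^{-1} mu_k + ln p(C_k)."""
    Si = np.linalg.inv(Sigma)
    return np.array([(Si @ mu) @ x - 0.5 * mu @ Si @ mu + np.log(p)
                     for mu, p in zip(mus, priors)])

# Three classes with shared identity covariance and equal priors;
# a point near the first mean should score highest for class 0.
mus = [np.array([2.0, 0.0]), np.array([-2.0, 0.0]), np.array([0.0, 2.0])]
a = class_scores(np.array([1.5, 0.2]), mus, np.eye(2), [1/3, 1/3, 1/3])
```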

11. Small Multi-Class Normal Distribution Example
   [Figure: a two-dimensional multi-class example with Gaussian class conditionals, plotted over roughly [-2, 2] x [-2.5, 2.5].]

12. The Maximum Likelihood Solution
   For a two-class example with priors (\pi, 1 - \pi) we have
     p(x_n, C_1) = p(C_1) p(x_n | C_1) = \pi N(x_n | \mu_1, \Sigma)
   The joint likelihood function is then
     p(t | \pi, \mu_1, \mu_2, \Sigma) = \prod_{n=1}^{N} [\pi N(x_n | \mu_1, \Sigma)]^{t_n} [(1 - \pi) N(x_n | \mu_2, \Sigma)]^{1 - t_n}
   where t_n is the class label of the n'th sample; we can compute the maximum of p(.).

13. The Maximum Likelihood Solution (2)
   The class prior is then
     \pi = \frac{N_1}{N_1 + N_2}
   the class means are
     \mu_1 = \frac{1}{N_1} \sum_{n=1}^{N} t_n x_n, \qquad \mu_2 = \frac{1}{N_2} \sum_{n=1}^{N} (1 - t_n) x_n
   and the covariance is the weighted average of the per-class covariances
     \Sigma = S = \frac{N_1}{N} S_1 + \frac{N_2}{N} S_2
   The results are not surprising: they are simply the per-class sample statistics.
   Could we compute the optimal ML solution directly?
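The ML estimates above can be sketched directly (illustrative helper, assuming labels t_n in {0, 1} with t_n = 1 for class C1):

```python
import numpy as np

def ml_fit(X, t):
    """ML estimates for the two-class shared-covariance Gaussian model.
    X: (N, D) samples; t: (N,) labels, 1.0 for C1 and 0.0 for C2."""
    N = len(t)
    N1 = t.sum()
    N2 = N - N1
    pi = N1 / N                                   # class prior for C1
    mu1 = (t[:, None] * X).sum(axis=0) / N1       # mean of C1 samples
    mu2 = ((1 - t)[:, None] * X).sum(axis=0) / N2 # mean of C2 samples
    d1 = X[t == 1] - mu1
    d2 = X[t == 0] - mu2
    S1 = d1.T @ d1 / N1
    S2 = d2.T @ d2 / N2
    Sigma = (N1 / N) * S1 + (N2 / N) * S2         # weighted covariance
    return pi, mu1, mu2, Sigma

X = np.array([[0., 0.], [1., 1.], [2., 0.], [4., 4.], [6., 4.]])
t = np.array([1., 1., 1., 0., 0.])
pi, mu1, mu2, Sigma = ml_fit(X, t)
```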

14. Outline (repeat of slide 2)

15. Probabilistic Discriminative Models
   Could we analyze the problem directly rather than through a generative model?
   I.e., could we perform ML directly on p(C_k | x)?
   This could involve fewer parameters!

16. Logistic Regression
   Consider the two-class problem, formulated as a sigmoid:
     p(C_1 | \phi) = y(\phi) = \sigma(w^T \phi)
   then
     p(C_2 | \phi) = 1 - p(C_1 | \phi)
   Note the derivative
     \frac{d\sigma}{da} = \sigma (1 - \sigma)

17. Logistic Regression - II
   For a dataset \{\phi_n, t_n\} we have
     p(t | w) = \prod_{n=1}^{N} y_n^{t_n} \{1 - y_n\}^{1 - t_n}
   The associated error function (the cross-entropy) is
     E(w) = -\ln p(t | w) = -\sum_{n=1}^{N} \{ t_n \ln y_n + (1 - t_n) \ln(1 - y_n) \}
   The gradient is then
     \nabla E(w) = \sum_{n=1}^{N} (y_n - t_n) \phi_n
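The error and its gradient can be sketched and sanity-checked with finite differences; the toy design matrix below is hypothetical, chosen only for illustration:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def cross_entropy(w, Phi, t):
    """E(w) = -sum_n [ t_n ln y_n + (1 - t_n) ln(1 - y_n) ], y = sigma(Phi w)."""
    y = sigmoid(Phi @ w)
    return -np.sum(t * np.log(y) + (1.0 - t) * np.log(1.0 - y))

def gradient(w, Phi, t):
    """grad E(w) = Phi^T (y - t) = sum_n (y_n - t_n) phi_n."""
    return Phi.T @ (sigmoid(Phi @ w) - t)
```

Checking the analytic gradient against a central difference of E(w) is a cheap way to catch sign errors in either formula.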

18. Newton-Raphson Optimization
   We want to find an extremum of a function f(.):
     f(x + \Delta x) \approx f(x) + f'(x) \Delta x + \frac{1}{2} f''(x) \Delta x^2
   The extremum is reached when \Delta x solves
     f'(x) + f''(x) \Delta x = 0
   In vector form:
     x_{n+1} = x_n - [H f(x_n)]^{-1} \nabla f(x_n)
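A scalar sketch of the update (function names are mine; for a quadratic f the step is exact, so a single iteration lands on the extremum):

```python
def newton_extremum(x, fprime, fsecond, iters=25):
    """Newton-Raphson on f'(x) = 0: repeat x <- x - f'(x) / f''(x)."""
    for _ in range(iters):
        x = x - fprime(x) / fsecond(x)
    return x

# Minimizing f(x) = (x - 3)^2: f'(x) = 2(x - 3), f''(x) = 2.
x_star = newton_extremum(0.0, lambda x: 2.0 * (x - 3.0), lambda x: 2.0)
```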

19. Iteratively Reweighted Least Squares
   Formulate the optimization problem as
     w^{(\tau+1)} = w^{(\tau)} - H^{-1} \nabla E(w)
   For the sum-of-squares error the gradient and Hessian are given by
     \nabla E(w) = \Phi^T \Phi w - \Phi^T t
     H = \Phi^T \Phi
   The solution is "obvious":
     w^{(\tau+1)} = w^{(\tau)} - (\Phi^T \Phi)^{-1} \{ \Phi^T \Phi w^{(\tau)} - \Phi^T t \}
   which results in
     w = (\Phi^T \Phi)^{-1} \Phi^T t
   i.e. the LSQ solution!
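One can verify numerically that a single Newton step reaches the LSQ solution from any starting point; the design matrix and targets below are hypothetical:

```python
import numpy as np

# Hypothetical 1-D regression data: bias column plus one feature.
rng = np.random.default_rng(0)
Phi = np.column_stack([np.ones(20), rng.standard_normal(20)])
t = Phi @ np.array([0.5, -1.0]) + 0.01 * rng.standard_normal(20)

# One Newton step from an arbitrary w:
#   w <- w - (Phi^T Phi)^{-1} (Phi^T Phi w - Phi^T t)
w0 = np.ones(2)
w1 = w0 - np.linalg.solve(Phi.T @ Phi, Phi.T @ Phi @ w0 - Phi.T @ t)

# Closed-form least-squares solution for comparison.
w_lsq = np.linalg.solve(Phi.T @ Phi, Phi.T @ t)
```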

20. Optimization for the Cross-Entropy
   For the cross-entropy error E(w),
     \nabla E(w) = \Phi^T (y - t)
     H = \Phi^T R \Phi
   where R is a diagonal matrix with R_{nn} = y_n (1 - y_n).
   The regression/discrimination update is then
     w^{(\tau+1)} = (\Phi^T R \Phi)^{-1} \Phi^T R \{ \Phi w^{(\tau)} - R^{-1} (y - t) \}
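A sketch of this IRLS update (variable names are mine; the small dataset is hypothetical and deliberately non-separable, since perfectly separable data drives R toward singularity and w toward infinity):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def irls(Phi, t, iters=25):
    """IRLS for logistic regression:
    w <- (Phi^T R Phi)^{-1} Phi^T R z,  z = Phi w - R^{-1} (y - t),
    with R diagonal and R_nn = y_n (1 - y_n)."""
    w = np.zeros(Phi.shape[1])
    for _ in range(iters):
        y = sigmoid(Phi @ w)
        r = y * (1.0 - y)                # diagonal of R
        z = Phi @ w - (y - t) / r        # R^{-1}(y - t), elementwise
        w = np.linalg.solve(Phi.T @ (r[:, None] * Phi), Phi.T @ (r * z))
    return w

# Hypothetical non-separable 1-D data: bias column plus one feature.
Phi = np.column_stack([np.ones(6), np.array([-2., -1., 0., 0.5, 1., 2.])])
t = np.array([0., 0., 1., 0., 1., 1.])
w = irls(Phi, t)
```

At convergence the gradient Phi^T (y - t) should vanish, which gives a simple check of the fitted weights.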

21. Outline (repeat of slide 2)

22. Class Projects - Examples
   - Feature integration for robust detection
   - Multi-recognition strategies
   - Comparison of recognition methods
   - Space categorization
   - Learning of obstacle-avoidance strategies

23. Class Projects - II
   Problems:
   - Novel "research" in robotics / mobile systems / manipulation
   - Comparative evaluation
   - Integration of methods
   Aspects:
   - Modelling: what is a good/adequate model?
   - What is a good benchmark/evaluation?
   - Evaluation of the method, alone or in comparison
   Teaming: 2-3 students per group

24. Outline (repeat of slide 2)

25. Summary
   - Consideration of a Bayesian formulation for class discrimination
   - For linear systems, Newton's method yields the LSQ solution
   - Iterative solutions (IRLS) for logistic regression
   - Discussion of class projects
