SLIDE 1

Gaussian Discriminant Analysis

material thanks to Andrew Ng @Stanford

SLIDE 2

Course Map / module3

  • Gaussian Discriminant Analysis

[course-map figure: pipeline stages DATA / PROBLEM / REPRESENTATION / LEARNING / PERFORMANCE; module 3 (generative methods) covers likelihoods, GDA, naive Bayes, graphical models, and the EM algorithm]

SLIDE 3

Density Estimation Problem

  • P(y|x) = P(y|x1,x2,…,xd): estimating it requires the joint (d+1)-dim distribution over (x1,…,xd, y)
  • … in practice we cannot estimate this joint directly
  • if each feature has 10 buckets, and we have 100 features (very reasonable assumptions)
  • then the joint distribution has 10^100 cells: impossible to estimate from data (see the sketch below)
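A quick sanity check of that count, using the slide's bucket and feature numbers (a throwaway sketch, nothing deck-specific):

```python
# Cells needed to tabulate the joint over 100 features with 10 buckets each.
buckets, features = 10, 100
cells = buckets ** features       # 10**100, a googol
print(f"{cells:.1e} cells")       # ~1e+100; no dataset comes remotely close
```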

SLIDE 4

how to get around estimating the joint P(x1,x2,…,xd|y)?

  • SOLUTION: model/restrict the joint, instead of estimating any possible such joint distribution
  • for example, with a well-known parametrized form, such as the multi-dim gaussian distribution
  • estimate the parameters of the imposed model
  • called Gaussian Discriminant Analysis (when the imposed model is gaussian)
  • easy to implement, thanks to math tools that make estimating the gaussian parameters (mean, covariance) straightforward; see the sketch below
  • multi-dim implies a "covariance" matrix instead of a simple variance
  • doesn't fit the data in many cases
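A minimal sketch of the idea in NumPy (the toy data and shapes here are illustrative assumptions, not from the deck):

```python
import numpy as np

# Toy data: 500 samples with 3 features each.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))

# Instead of tabulating the joint, impose a gaussian and estimate
# only its parameters: a mean vector and a covariance *matrix*.
mu = X.mean(axis=0)              # shape (3,)
Sigma = np.cov(X, rowvar=False)  # shape (3, 3); replaces a simple variance
```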
SLIDE 5

Gaussian Fit

  • Idea: fit a parametrized distribution to the histogram (density or counts)
  • The gaussian (normal) density is controlled by its mean and variance
  • the best fit is the one that maximizes the likelihood of the data

$$
P(x \mid \mu, \sigma^2) = \mathrm{normal}(x, \mu, \sigma^2)
= \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}
$$

$$
\log L = \log \prod_{i=1}^{m} P(x_i \mid \mu, \sigma^2)
= \sum_{i=1}^{m} \log P(x_i \mid \mu, \sigma^2)
$$
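A minimal NumPy sketch of this fit on toy data: the maximum-likelihood estimates for a 1-D gaussian are the sample mean and the biased (1/m) variance, and the log likelihood is the sum above:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.5, size=1000)   # toy 1-D sample

# Maximum-likelihood fit of a gaussian.
mu_hat = x.mean()
var_hat = x.var()        # note 1/m, not 1/(m-1): the MLE is biased

# log L = sum_i log P(x_i | mu, sigma^2)
log_lik = np.sum(-0.5 * np.log(2 * np.pi * var_hat)
                 - (x - mu_hat) ** 2 / (2 * var_hat))
print(mu_hat, var_hat, log_lik)
```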

SLIDE 6

Let's impose a nice probabilistic model

  • Multi-variate normal distribution
  • plotted with Σ = identity (or independent variables)
  • θ = (µ, Σ)
SLIDE 7

Let's impose a nice probabilistic model

  • Multi-variate normal distribution
  • plotted with Σ = variance only (diagonal), i.e. independent variables
  • Σ = diag(1, 1)
  • Σ = diag(0.6, 0.6)
  • Σ = diag(2, 2)

SLIDE 8

Let's impose a nice probabilistic model

  • Multi-variate normal distribution
  • plotted with Σ ≠ identity
  • dependent variables
SLIDE 9

Let's impose a nice probabilistic model

  • Multi-variate normal distribution
  • Σ ≠ identity => dependent variables (see the sketch below)
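A sketch evaluating these densities with scipy.stats.multivariate_normal; the diagonal Σ values are the ones listed on slide 7, while the off-diagonal 0.8 in the correlated case is an assumed example (the plots do not state it):

```python
import numpy as np
from scipy.stats import multivariate_normal

mu = np.zeros(2)
settings = {
    "identity":        np.eye(2),               # slide 6: independent, unit variance
    "diag(0.6, 0.6)":  np.diag([0.6, 0.6]),     # slide 7: tighter peak
    "diag(2, 2)":      np.diag([2.0, 2.0]),     # slide 7: wider spread
    "correlated":      np.array([[1.0, 0.8],    # slides 8-9: off-diagonal terms,
                                 [0.8, 1.0]]),  # 0.8 is an assumed value
}
point = np.array([0.5, 0.5])
for name, Sigma in settings.items():
    print(name, multivariate_normal(mean=mu, cov=Sigma).pdf(point))
```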

SLIDE 10

GDA Setup

  • multivariate normal density estimation for each class y (common Σ)
  • log likelihood (the standard form is reproduced below)
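The log-likelihood formula itself did not survive extraction; in the standard formulation from the CS229 notes this deck credits, the imposed model and its log likelihood are:

$$
y \sim \mathrm{Bernoulli}(\phi), \qquad
x \mid y = 0 \sim \mathcal{N}(\mu_0, \Sigma), \qquad
x \mid y = 1 \sim \mathcal{N}(\mu_1, \Sigma)
$$

$$
\ell(\phi, \mu_0, \mu_1, \Sigma)
= \log \prod_{i=1}^{m} P(x_i, y_i \mid \phi, \mu_0, \mu_1, \Sigma)
= \sum_{i=1}^{m} \Big( \log P(x_i \mid y_i) + \log P(y_i) \Big)
$$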
SLIDE 11

GDA parameter solution

  • max likelihood for GDA has a closed-form solution!
  • it can be derived using differentials
  • estimate the mean for each class
  • estimate the covariance for the entire training set, or separately for each class
  • no need for Gradient Descent or other optimizers (see the sketch below)
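A minimal sketch of that closed-form fit in NumPy (the function name and the 0/1 label convention are assumptions, not from the deck); this is the common-Σ variant from the setup slide:

```python
import numpy as np

def fit_gda(X, y):
    """Closed-form maximum-likelihood estimates for GDA with a shared Sigma."""
    m = len(y)
    phi = y.mean()                        # P(y = 1)
    mu0 = X[y == 0].mean(axis=0)          # per-class means
    mu1 = X[y == 1].mean(axis=0)
    # Shared covariance: center each point by its own class mean,
    # then average the outer products over the whole training set.
    centered = X - np.where(y[:, None] == 1, mu1, mu0)
    Sigma = centered.T @ centered / m
    return phi, mu0, mu1, Sigma
```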
SLIDE 12

GDA visual classification

  • if Σ is common, the two gaussians are identical except for their means
  • the separation is a line of points equidistant (in the Mahalanobis metric of Σ) from the two means (see the sketch below)
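With a shared Σ the log-posterior ratio is linear in x, which is exactly why the boundary is a line. A sketch computing it from fitted parameters (gda_boundary is a hypothetical helper, meant to pair with the fit_gda sketch above):

```python
import numpy as np

def gda_boundary(phi, mu0, mu1, Sigma):
    """Linear boundary w.x + b = 0 implied by shared-Sigma GDA."""
    Sinv = np.linalg.inv(Sigma)
    w = Sinv @ (mu1 - mu0)
    b = 0.5 * (mu0 @ Sinv @ mu0 - mu1 @ Sinv @ mu1) + np.log(phi / (1 - phi))
    return w, b   # predict class 1 when w @ x + b > 0
```

When phi = 0.5 the log-prior term vanishes and the line passes through the midpoint of the two means, matching the "equidistant points" picture on this slide.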