Generalization Ability of Majority Vote Point classifiers


SLIDE 1

Generalization Ability of Majority Vote Point classifiers

Akshat Agarwal Rahul K Sevakula

Department of Electrical Engineering Indian Institute of Technology Kanpur

August 22, 2015

Akshat Agarwal, Rahul K Sevakula (IITK) Generalization Ability of Majority Vote Point classifiers August 22, 2015 1 / 16

SLIDE 2

Outline

1. Problem Description
   - Health Monitoring of Industrial Machines
   - Need for Highly Generalized Classifiers

2. Mathematical Background
   - VC Dimension
   - Growth Function

3. The Majority Vote Point (MVP) Classifier
   - Features
   - Upper Bound on VC Dimension
   - Empirical Estimate of VC Dimension

4. Case Study

5. Conclusion

6. References

SLIDE 3

Health Monitoring of Industrial Machines

Health monitoring of machines has been a subject of great interest for many decades. Diagnosis and prognosis of machine components are generally done by analysing machine parameters such as vibration, acoustics, temperature and pressure. A common observation is that these parameters can be very inconsistent. An example of this inconsistency is seen in acoustic fault diagnosis of air compressors, where the nature of the acoustic recordings changes with time, with wear and tear of the machine, and even after its repair.

SLIDE 4

Need for Classifiers with High Generalization

Figure: Though both recordings are taken in the same machine state and from the same sensor position, they are quite different from each other.

In such a situation, performing real-time diagnosis with low-level features [1] can be very difficult. This brings out the need for a classifier that is highly generalized. Classification problems with a small number of samples and high dimensionality also need highly generalized classifiers.

SLIDE 5

Vapnik-Chervonenkis (VC) Dimension

A measure of the capacity or complexity of a classification algorithm. It is defined as the cardinality of the largest set of points that the algorithm can shatter [2]. A set of points is said to be shattered by a class of functions if, no matter how we assign a binary label to each point, some member of the class can separate them perfectly.

Figure: The VC dimension of linear classifiers in 2D is 3 [3]
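The figure's claim can be checked by brute force: three non-collinear points in the plane admit all 2^3 = 8 binary labelings under a linear classifier, while four points cannot all be shattered (the XOR labeling of a square is not linearly separable). The following minimal Python sketch is illustrative, not from the slides; the grid of candidate separators and the example point sets are our own choices:

```python
import itertools

def linearly_separable(points, labels, grid):
    """Brute-force search for (w1, w2, b) such that sign(w1*x + w2*y + b)
    matches the given binary labels on every point."""
    for w1, w2, b in itertools.product(grid, repeat=3):
        if all((w1 * x + w2 * y + b > 0) == bool(lab)
               for (x, y), lab in zip(points, labels)):
            return True
    return False

def shattered_dichotomies(points):
    """Count how many of the 2^m labelings of `points` a linear classifier realizes."""
    grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
    return sum(linearly_separable(points, labels, grid)
               for labels in itertools.product([0, 1], repeat=len(points)))

# Three non-collinear points: all 8 dichotomies are realizable.
print(shattered_dichotomies([(0, 0), (1, 0), (0, 1)]))  # 8
# Four points in a square: the XOR labeling fails, so fewer than 16.
print(shattered_dichotomies([(0, 0), (1, 0), (0, 1), (1, 1)]))
```

The coarse half-integer grid suffices for these small examples because every separable labeling of the chosen points has a separator with half-integer coefficients.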

SLIDE 6

Growth function

The growth function Π_H(m) of a classifier space H is the maximum number of ways in which m points can be classified by H:

Π_H(m) = max{ |Π_H(S)| : S ⊆ Ω, |S| = m }    (1)

where Ω is the sample space and Π_H(S) ⊆ {0, 1}^m denotes the set of labelings that H induces on S. Therefore, if VCD(H) = d, then for m ≤ d, Π_H(m) = 2^m. For m ≥ d, an upper bound can be placed on the growth function using Sauer's lemma [4]:

Sauer's lemma

Π_H(m) ≤ Φ_d(m) := Σ_{i=0}^{d} C(m, i) ≤ (em/d)^d    (2)
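Since Φ_d(m) in (2) is just a partial sum of binomial coefficients, it is easy to tabulate and compare against 2^m and the (em/d)^d relaxation. A small sketch (the function name is ours):

```python
import math

def sauer_bound(m, d):
    """Phi_d(m) = sum_{i=0}^{d} C(m, i): Sauer's bound on the growth function
    of a hypothesis space with VC dimension d, evaluated on m points."""
    return sum(math.comb(m, i) for i in range(d + 1))

# For m <= d the bound equals 2^m (all dichotomies are possible)...
assert sauer_bound(3, 3) == 2 ** 3
# ...while for m > d it grows only polynomially in m:
assert sauer_bound(10, 3) == 176  # 1 + 10 + 45 + 120
assert sauer_bound(10, 3) <= (math.e * 10 / 3) ** 3
```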

SLIDE 7

The Majority Vote Point (MVP) classifier

The domain of the hypothesis space is in R. The range for class labels is {0, 1}, meaning there are only two classes spanning the entire data, namely 0 and 1. Each individual classifier is trained on a single feature of the data, with minimization of training error as its objective. Learning an individual classifier is similar to finding a threshold point on a line that carries direction information regarding the class label. The number of classifiers selected for majority voting equals the number of features in the data.
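As a concrete reading of the description above, each base classifier is a one-dimensional threshold with direction (a decision stump) trained to minimize training error on one feature, and the final label is the majority vote over one stump per feature. The following sketch is our reconstruction, not the authors' code; an odd number of features is assumed so that votes cannot tie:

```python
def train_stump(values, labels):
    """Pick the threshold and direction on one feature minimizing training error.
    direction=+1 predicts class 1 for value > threshold; -1 for the reverse."""
    vals = sorted(set(values))
    candidates = [vals[0] - 1.0] + [(a + b) / 2 for a, b in zip(vals, vals[1:])]
    best = None
    for t in candidates:
        for direction in (+1, -1):
            errors = sum((direction * (v - t) > 0) != bool(y)
                         for v, y in zip(values, labels))
            if best is None or errors < best[0]:
                best = (errors, t, direction)
    return best[1], best[2]

def train_mvp(X, y):
    """Train one stump per feature; classify by majority vote (odd #features)."""
    n = len(X[0])
    stumps = [train_stump([row[j] for row in X], y) for j in range(n)]
    def predict(x):
        votes = sum(d * (x[j] - t) > 0 for j, (t, d) in enumerate(stumps))
        return 1 if 2 * votes > n else 0
    return predict

# Tiny sanity check on separable data with 3 features.
X = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [0.1, 0.2, 0.1], [0.9, 0.8, 1.0]]
y = [0, 1, 0, 1]
clf = train_mvp(X, y)
print([clf(row) for row in X])  # [0, 1, 0, 1]
```

Candidate thresholds need only be the midpoints between consecutive distinct feature values (plus one below the minimum for the constant classifiers), since any other threshold induces the same labeling.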

SLIDE 8

SLIDE 9

Upper Bound on VC dimension

Consider a hypothesis space H with VC(H) = d, and let H_N be a majority vote classifier combining N (≥ 1) classifiers in H. Let VC(H_N) = p. Then there exists a subset S of the sample space Ω with p elements such that S is shattered by H_N. Then, from (1),

Π_{H_N}(p) = 2^p    (3)

Since H_N consists of a combination of N classifiers from H,

Π_{H_N}(p) ≤ (Π_H(p))^N    (4)

(Π_H(p))^N ≤ (ep/d)^{dN}    (5)

From (3), (4) and (5),

2^p ≤ (ep/d)^{dN}

SLIDE 10

Solving, we get the following two bounds on the value of p:

p_1 = −W_0(−ln 2 / (eN)) · Nd / ln 2,   p_2 = −W_{−1}(−ln 2 / (eN)) · Nd / ln 2    (6)

where W_0(x) and W_{−1}(x) denote the main branch and a lower branch of the Lambert W function. Here p_1 ≤ p ≤ p_2. The lower bound p_1 is a monotonically decreasing function of N with a maximum value of 1.0627. The upper bound p_2 is a monotonically increasing function of N.

Figure: Upper bound p_2 on the VC dimension versus the number of features N
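Equation (6) can be cross-checked numerically without a Lambert-W implementation, since p_1 and p_2 are just the two roots of 2^p = (ep/d)^{dN}. The sketch below uses plain bisection; taking d = 2 for the base threshold classifiers is our assumption, chosen because it reproduces a maximum of p_1 near the quoted 1.06 at N = 1:

```python
import math

def vc_bounds(N, d=2):
    """Return the two roots p1 < p2 of 2**p = (e*p/d)**(d*N), i.e. the bounds
    of Eq. (6), found by bisection on g(p) = p*ln2 - d*N*(1 + ln(p/d))."""
    g = lambda p: p * math.log(2) - d * N * (1 + math.log(p / d))

    def bisect(lo, hi):
        # g(lo) and g(hi) have opposite signs; halve the bracket repeatedly.
        for _ in range(200):
            mid = (lo + hi) / 2
            if g(lo) * g(mid) <= 0:
                hi = mid
            else:
                lo = mid
        return (lo + hi) / 2

    # g -> +inf as p -> 0+, g(d) = d*(ln2 - N) < 0, and g -> +inf as p -> inf,
    # so one root lies below d and one above it.
    p1 = bisect(1e-12, d)
    hi = 2 * d
    while g(hi) < 0:
        hi *= 2
    p2 = bisect(d, hi)
    return p1, p2
```

For d = 2 this gives p_1 ≈ 1.06 at N = 1, decreasing with N, while p_2 grows steadily with N, consistent with the figure.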

SLIDE 11

Empirical Estimation of VC dimension

The VC dimension of any classifier space can be found by examining a plot of the ratio Π_H(m)/2^m versus m. The last value of m at which the ratio equals unity is the VC dimension of H. The growth function Π_{H_N}(m) of the MVP classifier space was calculated by exhaustively searching through the sample space. The size of the search space was drastically reduced from the infinitely large real space R^{m×n} to a set of C(n + m! − 1, m! − 1) inputs that are representative of all possible input combinations. To find the exact value of Π_{H_N}(m) for a given m and n, its value was computed for all C(N + g − 1, g − 1) such inputs, and the maximum value among them was reported.

SLIDE 12

Figure: Plot of the ratio Π_{H_N}(m)/2^m versus the number of samples m, for 5, 7 and 9 features. Each graph departs from unity beyond m = 3.

Hence it appears that the VC dimension of the MVP classifier is 3, irrespective of the number of features.
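The empirical estimate can be reproduced on a small scale by enumerating, for a fixed data matrix, every labeling reachable by a majority vote of per-feature thresholds, and maximizing over random data matrices. This is our own brute-force sketch (random search rather than the slides' exhaustive enumeration of representative inputs); an odd number of features is assumed:

```python
import itertools, random

def stump_dichotomies(col):
    """All binary labelings a threshold-with-direction can induce on one column."""
    vals = sorted(set(col))
    thresholds = [vals[0] - 1.0] + [(a + b) / 2 for a, b in zip(vals, vals[1:])]
    out = set()
    for t in thresholds:
        lab = tuple(int(v > t) for v in col)
        out.add(lab)
        out.add(tuple(1 - b for b in lab))  # opposite direction
    return out

def mvp_dichotomies(X):
    """Labelings achievable on the rows of X by majority vote of one stump per feature."""
    n = len(X[0])
    per_feature = [stump_dichotomies([row[j] for row in X]) for j in range(n)]
    achieved = set()
    for combo in itertools.product(*per_feature):
        # combo holds one labeling per feature; tally the votes for each sample.
        achieved.add(tuple(int(2 * sum(col) > n) for col in zip(*combo)))
    return achieved

def growth_estimate(m, n, trials=200, seed=0):
    """Lower-bound estimate of Pi_{H_N}(m) by sampling random m x n data matrices."""
    rng = random.Random(seed)
    return max(len(mvp_dichotomies([[rng.random() for _ in range(n)]
                                    for _ in range(m)]))
               for _ in range(trials))
```

With three features, two points are always shattered (ratio 1 at m = 2), and three points are shattered for suitably ordered data, consistent with the estimated VC dimension of 3.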

SLIDE 13

Case study on acoustic fault diagnosis

The generalization ability of the MVP classifier was compared with that of linear and RBF-kernel SVMs on acoustic data obtained from air compressors [5]. The training and testing sets each consisted of 256 samples with 286 features. Since the number of samples was less than the number of features, there was a risk of overfitting. Performance was therefore checked twice: 1) with all 286 features, and 2) with a reduced set of 25 features.

SLIDE 14

Conclusion

A class of majority vote classifiers, the MVP classifier, was proposed; it is more generalized than a linear SVM on account of its lower VC dimension. An upper bound on the VC dimension was formulated, and the exact value was empirically estimated to be 3. A case study on a real-world application demonstrated the high generalization ability of the MVP classifier in comparison to SVM on low-level feature data.

SLIDE 15

Future Work

Checking the performance of the MVP classifier on multi-class problems. A limitation of the MVP classifier is that in many problems it lacks sufficient flexibility to fit the training data well. Hence a possible extension of this work could involve deep learning techniques for transforming low-level features into higher-level features, so as to achieve low training error with the MVP classifier.

SLIDE 16

References I

[1] Y. Bengio, "Learning deep architectures for AI," Foundations and Trends in Machine Learning, vol. 2, no. 1, pp. 1–127, 2009.

[2] V. N. Vapnik, Statistical Learning Theory. Wiley, New York, 1998.

[3] C. J. Burges, "A tutorial on support vector machines for pattern recognition," Data Mining and Knowledge Discovery, vol. 2, no. 2, pp. 121–167, 1998.

[4] N. Sauer, "On the density of families of sets," Journal of Combinatorial Theory, Series A, vol. 13, no. 1, pp. 145–147, 1972.

[5] N. Verma, R. Sevakula, S. Dixit, and A. Salour, "Intelligent condition based monitoring using acoustic signals for air compressors," IEEE Transactions on Reliability, vol. PP, no. 99, pp. 1–19, 2015.
