Lecture Notes for Chapter 5 Slides by Tan, Steinbach, Kumar adapted - PowerPoint PPT Presentation

Classifcation - Alternative Techniques Lecture Notes for Chapter 5 Slides by Tan, Steinbach, Kumar adapted by Michael Hahsler Look for accompanying R code on the course web site.

Topics • Rule-Based Classifier • Nearest Neighbor Classifier • Naive Bayes Classifier • Artificial Neural Networks • Support Vector Machines • Ensemble Methods

Rule-Based Classifer • Classify records by using a collection of “if… then…” rules • Rule: ( Condition )  y - where • Condition is a conjunctions of attributes • y is the class label - LHS : rule antecedent or condition - RHS : rule consequent - Examples of classification rules: • (Blood Type=Warm)  (Lay Eggs=Yes)  Birds • (Taxable Income < 50K)  (Refund=Yes)  Evade=No

Rule-based Classifer (Example) Name Blood Type Give Birth Can Fly Live in Water Class human warm yes no no mammals python cold no no no reptiles salmon cold no no yes fishes whale warm yes no yes mammals frog cold no no sometimes amphibians komodo cold no no no reptiles bat warm yes yes no mammals pigeon warm no yes no birds cat warm yes no no mammals leopard shark cold yes no yes fishes turtle cold no no sometimes reptiles penguin warm no no sometimes birds porcupine warm yes no no mammals R1 eel cold no no yes fishes salamander cold no no sometimes amphibians gila monster cold no no no reptiles platypus warm no no no mammals owl warm no yes no birds dolphin warm yes no yes mammals eagle warm no yes no birds R1: (Give Birth = no)  (Can Fly = yes)  Birds R2: (Give Birth = no)  (Live in Water = yes)  Fishes R3: (Give Birth = yes)  (Blood Type = warm)  Mammals R4: (Give Birth = no)  (Can Fly = no)  Reptiles R5: (Live in Water = sometimes)  Amphibians

Application of Rule-Based Classifer • A rule r covers an instance x if the attributes of the instance satisfy the condition of the rule R1: (Give Birth = no)  (Can Fly = yes)  Birds R2: (Give Birth = no)  (Live in Water = yes)  Fishes R3: (Give Birth = yes)  (Blood Type = warm)  Mammals R4: (Give Birth = no)  (Can Fly = no)  Reptiles R5: (Live in Water = sometimes)  Amphibians Name Blood Type Give Birth Can Fly Live in Water Class hawk warm no yes no ? grizzly bear warm yes no no ? The rule R1 covers a hawk => Bird The rule R3 covers the grizzly bear => Mammal

Ordered Rule Set vs. Voting • Rules are rank ordered according to their priority - An ordered rule set is known as a decision list • When a test record is presented to the classifier - It is assigned to the class label of the highest ranked rule it has triggered - If none of the rules fired, it is assigned to the default class R1: (Give Birth = no)  (Can Fly = yes)  Birds R2: (Give Birth = no)  (Live in Water = yes)  Fishes R3: (Give Birth = yes)  (Blood Type = warm)  Mammals R4: (Give Birth = no)  (Can Fly = no)  Reptiles R5: (Live in Water = sometimes)  Amphibians Name Blood Type Give Birth Can Fly Live in Water Class turtle cold no no sometimes ? • Alternative: (weighted) voting by all matching rules.

Rule Coverage and Accuracy • Coverage of a rule: Tid Refund Marital Taxable Class Status Income - Fraction of records 1 Yes Single 125K No that satisfy the 2 No Married 100K No antecedent of a rule 3 No Single 70K No 4 Yes Married 120K No • Accuracy of a rule: 5 No Divorced 95K Yes - Fraction of records 6 No Married 60K No 7 Yes Divorced 220K No that satisfy both the 8 No Single 85K Yes antecedent and 9 No Married 75K No consequent of a rule 10 No Single 90K Yes 1 0 (Status=Single)  No Coverage = 40%, Accuracy = 50%

Rules From Decision Trees Aquatic Creature = No was pruned • Rules are mutually exclusive and exhaustive (cover all training cases) • Rule set contains as much information as the tree • Rules can be simplified (similar to pruning of the tree) • Example: C4.5rules

Direct Methods of Rule Generation • Extract rules directly from the data • Sequential Covering (Example: try to cover class +) d R1 R1 c ... R2 a b x (iii) Step 2 (iv) Step 3 (ii) Step 1 R1: a > x > b ∧ c > y > d ⟶ c l a s s +

Advantages of Rule-Based Classifers • As highly expressive as decision trees • Easy to interpret • Easy to generate • Can classify new instances rapidly • Performance comparable to decision trees

Nearest Neighbor Classifers • Basic idea: - If it walks like a duck, quacks like a duck, then it’s probably a duck Compute Test Distance Record Training Choose k of the Records “nearest” records

Nearest-Neighbor Classifers • Requires three things Unknown record - The set of stored records k = 3 y – - Distance Metric to compute – – + distance between records – + + - The value of k, the number of – + nearest neighbors to retrieve – + • To classify an unknown record: – – – – - Compute distance to other – training records – – - Identify k nearest neighbors – – – – - Use class labels of nearest – + + + + + neighbors to determine the – – + + + class label of unknown record – (e.g., by taking majority vote) x

Defnition of Nearest Neighbor X X X (a) 1-nearest neighbor (b) 2-nearest neighbor (c) 3-nearest neighbor K-nearest neighbors of a record x are data points that have the k smallest distance to x

Nearest Neighbor Classifcation • Compute distance between two points: - Euclidean distance ( p i − q i ) 2 d ( p, q )= √ ∑ i • Determine the class from nearest neighbor list - take the majority vote of class labels among the k-nearest neighbors - Weigh the vote according to distance • weight factor, w 2 = 1 / d

Nearest Neighbor Classifcation… • Choosing the – y – value of k: – – – - If k is too small, – + – – sensitive to – – noise points – – - If k is too large, + – – – + + neighborhood + – – – may include – – – points from – + + + + + – – other classes + + + – x k is too large!

Scaling issues • Attributes may have to be scaled to prevent distance measures from being dominated by one of the attributes • Example: • height of a person may vary from 1.5m to 1.8m • weight of a person may vary from 90lb to 300lb • income of a person may vary from $10K to $1M -> Income will dominate Euclidean distance • Solution: scaling/standardization (Z-Score) Z = X − barX sd ( X )

Nearest neighbor Classifcation… • k-NN classifiers are lazy learners - It does not build models explicitly (unlike eager learners such as decision trees) - Needs to store all the training data - Classifying unknown records are relatively expensive (find the k- nearest neighbors) • Advantage: Can create non-linear decision boundaries – y – + – – – – + + – – – – + – – + – – + + – – – + + x k=1

Bayes Classifer • A probabilistic framework for solving classification problems • Conditional Probability: P ( C ∣ A )= P ( A ,C ) C and A P ( A ) are events. P ( A ∣ C )= P ( A ,C ) A is called evidence. P ( C ) • Bayes theorem: P ( C ∣ A )= P ( A ∣ C ) P ( C ) P ( A )

Example of Bayes Theorem - A doctor knows that meningitis causes stiff neck 50% of the time → P ( S | M ) = . 5 - Prior probability of any patient having meningitis is P ( M ) = 1 / 5 0 , 0 0 0 = 0 . 0 0 0 0 2 - Prior probability of any patient having stiff neck is P ( S ) = 1 / 2 0 = 0 . 0 5 • If a patient has stiff neck, what’s the probability he/ she has meningitis? P ( M ∣ S )= P ( S ∣ M ) P ( M ) = 0.5 × 1 / 50000 = 0.0002 1 / 20 P ( S ) Increases the probability by x10!

Bayesian Classifers • Consider each attribute and class label as random variables • Given a record with attributes ( A , A , …, A ) 1 2 n - Goal is to predict class C - Specifically, we want to find the value of C that maximizes P ( C | A , A , …, A ) 1 2 n

Bayesian Classifers • compute the posterior probability P for all ( C | A , A , …, A ) 1 2 n values of C using the Bayes theorem P ( C ∣ A 1 A 2 … A n )= P ( A 1 A 2 … A n ∣ C ) P ( C ) P ( A 1 A 2 … A n ) • Choose value of C that maximizes P ( C | A , A , …, A ) this is a constant! 1 2 n • Equivalent to choosing value of C that maximizes P ( A , A , …, A | C ) P ( C ) 1 2 n • How to estimate P ? ( A , A , …, A | C ) 1 2 n

Naïve Bayes Classifer Assume independence among attributes A when class is i given: - P ( A , A , …, A | C ) = P ( A | C ) P ( A | C ) … P ( A | C ) 1 2 n 1 j 2 j n j - Can estimate P for all A and C . ( A | C ) i j i j - New point is classified to C such that: j max j ( P ( C j ) ∏ P ( A j ∣ C j ) )

Lecture Notes for Chapter 5 Slides by Tan, Steinbach, Kumar adapted - PowerPoint PPT Presentation

Classifcation - Alternative Techniques Lecture Notes for Chapter 5 Slides by Tan, Steinbach, Kumar adapted by Michael Hahsler Look for accompanying R code on the course web site. Topics Rule-Based Classifier Nearest Neighbor Classifier

Pocket Lecture Pocket Lecture Pocket Lecture Pocket Lecture Listen Audio Notes Progress

Topics 11/13/2006 Chapter 11, start Chapter 12 11/20/2006 Chapter 12 11/27/2006 Chapter 13

C R RAO AIMSCS Lecture Notes Series Author (s): B.L.S. PRAKASA RAO Title of the Notes : Brief

Alexander Volya 2016, Feb. GGI Lecture notes www.volya.net Alexander Volya 2016, Feb. GGI

IBM Model 701 (Early 1950's) CS 140 Lecture Notes: Introduction Slide 1 IBM 7094 (Early 1960's)

Problem solved: IBM Notes Replacement 2 IBM Notes Replacement Migrating from IBM Notes to

Printout Tuesday, October 29, 2019 7:38 PM Quick Notes Page 1 Quick Notes Page 2 Quick Notes

Briefing Notes The Briefing Notes Page The Briefing Notes include: An introduction to the

Slides from lecture Friday, April 26, 2019 12:02 PM Unfiled Notes Page 1 Unfiled Notes Page 2

AMath 483/583 Lecture 8 Notes: This lecture: Fortran subroutines and functions Arrays

Topics 11/13/2006 Chapter 11, start Chapter 12 11/20/2006 Chapter 12 Inheritance Concepts

Malaysian Healthy Ageing Society Plenary Lecture Plenary Lecture Plenary Lecture Plenary

CSE 527, Additional notes on MLE & EM Based on earlier notes by C. Grant & M. Narasimhan

Signals & Systems The Continuous-Time Fourier Transform Adapted From: Lecture Notes From MIT

Chapter 13 Chapter 13 1 What is this? Chapter 13 2 What is this? Chapter 13 3 What is

Cell History and Structure Quiz on Block Day January 18-19, 2016 Admit Ticket NOTES: Take notes

Communication Issues in Collective Decision Making Nicolas Maudet nicolas.maudet@lip6.fr

YOLO9000: Better, Faster, Stronger Date: January 24, 2018 Prepared by Haris Khan (University of

Preprocessing input data for machine learning by FCA Jan OUTRATA Dept. Computer Science

How Travis AFB Transformed its Cleanup Program into an Award Winning Green Sustainable

Districting and Gerrymandering Andrea Scozzari University Niccol` o Cusano Caen, July 8-12 2014

WHAT WE ARE WORKING TOWARDS Automated process of generating complex, bespoke assemblies in BIM

Boundaries and novelty: the correspondence between points of change and perceived boundaries

ECO 199 GAMES OF STRATEGY Spring Term 2004 April 1 STRATEGIC MOVES SUMMARY AND

Lecture Notes for Chapter 5 Slides by Tan, Steinbach, Kumar adapted - PowerPoint PPT Presentation

Classifcation - Alternative Techniques Lecture Notes for Chapter 5 Slides by Tan, Steinbach, Kumar adapted by Michael Hahsler Look for accompanying R code on the course web site. Topics Rule-Based Classifier Nearest Neighbor Classifier

Pocket Lecture Pocket Lecture Pocket Lecture Pocket Lecture Listen Audio Notes Progress

Topics 11/13/2006 Chapter 11, start Chapter 12 11/20/2006 Chapter 12 11/27/2006 Chapter 13

C R RAO AIMSCS Lecture Notes Series Author (s): B.L.S. PRAKASA RAO Title of the Notes : Brief

Alexander Volya 2016, Feb. GGI Lecture notes www.volya.net Alexander Volya 2016, Feb. GGI

IBM Model 701 (Early 1950's) CS 140 Lecture Notes: Introduction Slide 1 IBM 7094 (Early 1960's)

Problem solved: IBM Notes Replacement 2 IBM Notes Replacement Migrating from IBM Notes to

Printout Tuesday, October 29, 2019 7:38 PM Quick Notes Page 1 Quick Notes Page 2 Quick Notes

Briefing Notes The Briefing Notes Page The Briefing Notes include: An introduction to the

Slides from lecture Friday, April 26, 2019 12:02 PM Unfiled Notes Page 1 Unfiled Notes Page 2

AMath 483/583 Lecture 8 Notes: This lecture: Fortran subroutines and functions Arrays

Topics 11/13/2006 Chapter 11, start Chapter 12 11/20/2006 Chapter 12 Inheritance Concepts

Malaysian Healthy Ageing Society Plenary Lecture Plenary Lecture Plenary Lecture Plenary

CSE 527, Additional notes on MLE &amp; EM Based on earlier notes by C. Grant &amp; M. Narasimhan

Signals &amp; Systems The Continuous-Time Fourier Transform Adapted From: Lecture Notes From MIT

Chapter 13 Chapter 13 1 What is this? Chapter 13 2 What is this? Chapter 13 3 What is

Cell History and Structure Quiz on Block Day January 18-19, 2016 Admit Ticket NOTES: Take notes

Communication Issues in Collective Decision Making Nicolas Maudet nicolas.maudet@lip6.fr

YOLO9000: Better, Faster, Stronger Date: January 24, 2018 Prepared by Haris Khan (University of

Preprocessing input data for machine learning by FCA Jan OUTRATA Dept. Computer Science

How Travis AFB Transformed its Cleanup Program into an Award Winning Green Sustainable

Districting and Gerrymandering Andrea Scozzari University Niccol` o Cusano Caen, July 8-12 2014

WHAT WE ARE WORKING TOWARDS Automated process of generating complex, bespoke assemblies in BIM

Boundaries and novelty: the correspondence between points of change and perceived boundaries

ECO 199 GAMES OF STRATEGY Spring Term 2004 April 1 STRATEGIC MOVES SUMMARY AND

CSE 527, Additional notes on MLE & EM Based on earlier notes by C. Grant & M. Narasimhan

Signals & Systems The Continuous-Time Fourier Transform Adapted From: Lecture Notes From MIT