decision tree learning

Decision tree learning: Introduction to Machine Learning


  1. INTRODUCTION TO MACHINE LEARNING Decision tree learning

  2. Introduction to Machine Learning Task of classification ● Automatically assign a class to observations based on their features ● Observation: a vector of features, with a class ● Automatically assign a class to a new observation from its features, using previous observations ● Binary classification: two classes ● Multiclass classification: more than two classes

  3. Introduction to Machine Learning Example ● A dataset consisting of persons ● Features: age, weight and income ● Class: ● binary: happy or not happy ● multiclass: happy, satisfied or not happy

  4. Introduction to Machine Learning Examples of features ● Features can be numerical ● age: 23, 25, 75, … ● height: 175.3, 179.5, … ● Features can be categorical ● travel_class: first class, business class, coach class ● smokes?: yes, no

  5. Introduction to Machine Learning The decision tree ● Suppose you’re classifying patients as sick or not sick ● Intuitive way of classifying: ask questions Is the patient young or old?

  6. Introduction to Machine Learning The decision tree ● Suppose you’re classifying patients as sick or not sick ● Intuitive way of classifying: ask questions Is the patient young or old? Old

  7. Introduction to Machine Learning The decision tree ● Suppose you’re classifying patients as sick or not sick ● Intuitive way of classifying: ask questions Is the patient young or old? Old Smoked for more than 10 years?

  8. Introduction to Machine Learning The decision tree ● Suppose you’re classifying patients as sick or not sick ● Intuitive way of classifying: ask questions Is the patient young or old? Young Old Vaccinated against the measles? Smoked for more than 10 years?

  9. Introduction to Machine Learning The decision tree ● Suppose you’re classifying patients as sick or not sick ● Intuitive way of classifying: ask questions Is the patient young or old? Young Old Vaccinated against the measles? Smoked for more than 10 years? Yes No Yes No … … … …

  10. Introduction to Machine Learning The decision tree ● Suppose you’re classifying patients as sick or not sick ● Intuitive way of classifying: ask questions Is the patient young or old? Young Old Vaccinated against the measles? Smoked for more than 10 years? Yes No Yes No … … … … It’s a decision tree!!!

  11. Introduction to Machine Learning Define the tree A B C D E F G

  12. Introduction to Machine Learning Define the tree A Nodes B C D E F G

  13. Introduction to Machine Learning Define the tree A Edges B C D E F G

  14. Introduction to Machine Learning Define the tree Root A B C D E F G

  15. Introduction to Machine Learning Define the tree Root A B C D E F G Leaves

  16. Introduction to Machine Learning Define the tree Root A Children of A B C Children of B, C D E F G Grandchildren of A

  17. Introduction to Machine Learning Define the tree Root A Children of A B C Children of B, C D E F G Grandchildren of A Leaves

  18. Introduction to Machine Learning Questions to ask ● Root test: age <= 18 ● yes branch: vaccinated? (yes / no) ● no branch: smoked? (yes / no) ● Leaves: sick or not sick

  19. Introduction to Machine Learning Categorical feature ● Can be a feature test by itself ● travel_class: coach, business or first ● Test on travel_class has one branch per value: coach, first, business

  20. Introduction to Machine Learning Classifying with the tree ● Observation: patient of 40 years, vaccinated and didn’t smoke ● Tree: age <= 18? yes: vaccinated? (yes / no); no: smoked? (yes / no) ● Leaves: sick or not sick

  25. Introduction to Machine Learning Classifying with the tree ● Observation: patient of 40 years, vaccinated and didn’t smoke ● Tree: age <= 18? yes: vaccinated? (yes / no); no: smoked? (yes / no) ● Path: age <= 18? no, then smoked? no ● Prediction: not sick
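
The walk from root to leaf can be sketched in a few lines of Python. This is only an illustration: the nested-dict encoding and the names `TREE` and `classify` are hypothetical, and the leaf labels are partly assumed, since the slides only confirm that the "did not smoke" branch ends in "not sick".

```python
# Hypothetical encoding of the slide's tree as nested dicts.
# Leaf labels are an assumption, except the "no" branch under "smoked",
# which matches the prediction shown on the slide.
TREE = {
    "test": lambda p: p["age"] <= 18,
    "yes": {
        "test": lambda p: p["vaccinated"],
        "yes": "not sick",
        "no": "sick",
    },
    "no": {
        "test": lambda p: p["smoked"],
        "yes": "sick",
        "no": "not sick",
    },
}


def classify(node, patient):
    """Walk from the root to a leaf, answering one feature test per node."""
    while isinstance(node, dict):
        branch = "yes" if node["test"](patient) else "no"
        node = node[branch]
    return node  # a leaf is just a class label


patient = {"age": 40, "vaccinated": True, "smoked": False}
print(classify(TREE, patient))  # not sick
```

Note that the vaccination answer is never consulted for this patient: only the tests on the path from the root to the leaf matter.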

  26. Introduction to Machine Learning Learn a tree ● Use training set ● Come up with queries (feature tests) at each node

  27. Introduction to Machine Learning Split into parts ● A binary test splits the training set into 2 parts ● Example test: age <= 18 ● TRUE (yes): one part of the training set ● FALSE (no): the other part of the training set
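
A binary split is just a partition of the rows by the outcome of the test. The sketch below assumes a made-up training set of `(age, class)` tuples; the helper name `split` is hypothetical.

```python
def split(rows, test):
    """Partition rows into (true_part, false_part) according to a binary test."""
    true_part = [r for r in rows if test(r)]
    false_part = [r for r in rows if not test(r)]
    return true_part, false_part


# Made-up training rows: (age, class).
training_set = [(12, "not sick"), (16, "sick"), (40, "not sick"), (67, "sick")]

young, old = split(training_set, lambda row: row[0] <= 18)
print(young)  # [(12, 'not sick'), (16, 'sick')]
print(old)    # [(40, 'not sick'), (67, 'sick')]
```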

  28. Introduction to Machine Learning ● Each part of the training set gets its own feature test (yes / no) ● Every test splits its part into smaller parts of the training set

  29. Introduction to Machine Learning ● Keep splitting until the leaves contain only a small portion of the training set

  30. Introduction to Machine Learning Learn the tree ● Goal: end up with pure leaves, i.e. leaves that contain observations of one particular class ● Diagram: a leaf holds part of the training set, labeled by class (class 1, class 2)

  31. Introduction to Machine Learning Learn the tree ● Goal: end up with pure leaves, i.e. leaves that contain observations of one particular class ● In practice: almost never the case, because of noise ● When classifying new instances ● end up in leaf

  32. Introduction to Machine Learning Learn the tree ● Goal: end up with pure leaves, i.e. leaves that contain observations of one particular class ● In practice: almost never the case, because of noise ● When classifying new instances ● end up in leaf ● assign class of majority of training instances in that leaf
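
The majority rule for an impure leaf can be sketched as follows. The function name `leaf_prediction` and the example labels are made up for illustration.

```python
from collections import Counter


def leaf_prediction(leaf_labels):
    """Return the most common class among the training labels in a leaf."""
    return Counter(leaf_labels).most_common(1)[0][0]


# An impure leaf: noise left one "class 2" observation among "class 1" ones.
impure_leaf = ["class 1", "class 2", "class 1", "class 1"]
print(leaf_prediction(impure_leaf))  # class 1
```

With an exact tie, `Counter.most_common` returns the class that was encountered first; a real implementation would want an explicit tie-breaking rule.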

  33. Introduction to Machine Learning Learn the tree ● At each node ● Iterate over different feature tests ● Choose the best one ● Comes down to two parts ● Make list of feature tests ● Choose test with best split

  34. Introduction to Machine Learning Construct list of tests ● Categorical features ● Parents/grandparents/… didn’t use the test yet ● Numerical features ● Choose feature ● Choose threshold
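
One possible way to build the list of tests is sketched below. The slides do not say how thresholds are chosen; taking the midpoints between consecutive distinct values is a common convention and is an assumption here, as are the function name `candidate_tests` and the row format (dicts keyed by feature name).

```python
def candidate_tests(rows, categorical, numerical, used_categorical):
    """List (feature, threshold) tests; threshold is None for categorical features."""
    tests = []
    # Categorical features: only those no ancestor node has tested yet.
    for feature in categorical:
        if feature not in used_categorical:
            tests.append((feature, None))
    # Numerical features: one test per midpoint between distinct sorted values
    # (assumed threshold rule, not taken from the slides).
    for feature in numerical:
        values = sorted({row[feature] for row in rows})
        for low, high in zip(values, values[1:]):
            tests.append((feature, (low + high) / 2))
    return tests


rows = [
    {"age": 10, "travel_class": "coach"},
    {"age": 20, "travel_class": "first"},
    {"age": 40, "travel_class": "coach"},
]
print(candidate_tests(rows, ["travel_class"], ["age"], set()))
# [('travel_class', None), ('age', 15.0), ('age', 30.0)]
```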

  35. Introduction to Machine Learning Choose best feature test ● More complex ● Use splitting criteria to decide which test to use ● Information gain ~ entropy

  36. Introduction to Machine Learning Information gain ● Information gained from split based on feature test ● Test leads to nicely divided classes -> high information gain ● Test leads to scrambled classes -> low information gain ● Test with highest information gain will be chosen
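
Information gain can be computed as the entropy of the parent node minus the size-weighted entropy of the parts produced by the split. The sketch below uses the standard Shannon entropy with base-2 logarithms; the function names and the example labels are made up for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(parent, parts):
    """Parent entropy minus the size-weighted entropy of the split parts."""
    total = len(parent)
    weighted = sum(len(p) / total * entropy(p) for p in parts if p)
    return entropy(parent) - weighted


parent = ["sick"] * 4 + ["not sick"] * 4

# Nicely divided classes: every part is pure.
clean = [["sick"] * 4, ["not sick"] * 4]
# Scrambled classes: every part is as mixed as the parent.
scrambled = [["sick", "sick", "not sick", "not sick"]] * 2

print(information_gain(parent, clean))      # 1.0
print(information_gain(parent, scrambled))  # 0.0
```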

  37. Introduction to Machine Learning Pruning ● Number of nodes influences the chance of overfitting ● Restrict size: higher bias ● Decrease chance of overfitting ● Pruning the tree

  38. INTRODUCTION TO MACHINE LEARNING Let’s practice!

  39. INTRODUCTION TO MACHINE LEARNING k-Nearest Neighbors

  40. Introduction to Machine Learning Instance-based learning ● Save training set in memory ● No real model like decision tree ● Compare unseen instances to training set ● Predict using the comparison of unseen data and the training set

  41. Introduction to Machine Learning k-Nearest Neighbor ● Form of instance-based learning ● Simplest form: 1-Nearest Neighbor or Nearest Neighbor

  42. Introduction to Machine Learning Nearest Neighbor - example ● 2 features: X1 and X2 ● Class: red or blue ● Binary classification

  43. Introduction to Machine Learning Nearest Neighbor - example

  44. Introduction to Machine Learning Nearest Neighbor - example ● Save complete training set

  45. Introduction to Machine Learning Nearest Neighbor - example ● Save complete training set ● Given: unseen observation with features X = (1.3, -2)

  46. Introduction to Machine Learning Nearest Neighbor - example ● Save complete training set ● Given: unseen observation with features X = (1.3, -2) ● Compare training set with new observation

  47. Introduction to Machine Learning Nearest Neighbor - example ● Save complete training set ● Given: unseen observation with features X = (1.3, -2) ● Compare training set with new observation ● Find closest observation (the nearest neighbor) and assign the same class ● Closeness is just Euclidean distance, nothing fancy
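
A minimal 1-Nearest Neighbor sketch, using the unseen observation X = (1.3, -2) from the slide. The training points are invented for illustration (the slide's plot is not recoverable), and `nearest_neighbor` is a hypothetical helper name. `math.dist` computes the Euclidean distance and requires Python 3.8 or later.

```python
from math import dist  # Euclidean distance between two points


def nearest_neighbor(training_set, x):
    """Return the class of the training observation closest to x."""
    _, label = min(training_set, key=lambda obs: dist(obs[0], x))
    return label


# Made-up training set: ((X1, X2), class).
training_set = [
    ((1.0, -1.5), "red"),
    ((-2.0, 0.5), "blue"),
    ((3.0, 2.0), "blue"),
]

print(nearest_neighbor(training_set, (1.3, -2)))  # red
```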

  48. Introduction to Machine Learning k-Nearest Neighbors ● k is the number of neighbors ● If k = 5 ● Use 5 most similar observations (neighbors) ● Assigned class will be the most represented class among the 5 neighbors
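
Extending the idea to k neighbors only adds a majority vote. The sketch below again uses invented training points and a hypothetical function name `knn_predict`, with Euclidean distance as the similarity measure.

```python
from collections import Counter
from math import dist


def knn_predict(training_set, x, k=5):
    """Predict the most represented class among the k closest observations."""
    neighbors = sorted(training_set, key=lambda obs: dist(obs[0], x))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Made-up training set: ((X1, X2), class).
training_set = [
    ((1.0, -1.5), "red"),
    ((1.5, -2.5), "red"),
    ((0.5, -2.0), "blue"),
    ((2.0, -1.0), "red"),
    ((1.0, -3.0), "blue"),
    ((-3.0, 3.0), "blue"),
    ((-2.5, 2.0), "blue"),
]

print(knn_predict(training_set, (1.3, -2), k=5))  # red (3 red vs 2 blue)
```

Setting k to the size of the whole training set (7 here) reduces the prediction to the overall majority class, "blue" in this made-up data, which shows why k should stay small relative to the training set.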
