Decision Trees: Representation

  1. Decision Trees: Representation
     Machine Learning
     Some slides from Tom Mitchell, Dan Roth, and others

  2. Key issues in machine learning
     • Modeling: How to formulate your problem as a machine learning problem? How to represent data? Which algorithms to use? What learning protocols?
     • Representation: Good hypothesis spaces and good features
     • Algorithms
       – What is a good learning algorithm?
       – What is success?
       – Generalization vs. overfitting
       – The computational question: How long will learning take?

  3. Coming up… (the rest of the semester)
     Different hypothesis spaces and learning algorithms
     – Decision trees and the ID3 algorithm
     – Linear classifiers
       • Perceptron
       • SVM
       • Logistic regression
     – Combining multiple classifiers
       • Boosting, bagging
     – Non-linear classifiers
     – Nearest neighbors

  4. Coming up… (the rest of the semester)
     Different hypothesis spaces and learning algorithms
     – Decision trees and the ID3 algorithm
     – Linear classifiers
       • Perceptron
       • SVM
       • Logistic regression
     – Combining multiple classifiers
       • Boosting, bagging
     – Non-linear classifiers
     – Nearest neighbors
     Important issues to consider
     1. What do these hypotheses represent?
     2. Implicit assumptions and tradeoffs
     3. Generalization?
     4. How do we learn?

  5. This lecture: Learning Decision Trees
     1. Representation: What are decision trees?
     2. Algorithm: Learning decision trees
        – The ID3 algorithm: A greedy heuristic
     3. Some extensions

  7. Representing data
     Data can be represented as a big table, with columns denoting different attributes.

     | Name                  | Label |
     |-----------------------|-------|
     | Claire Cardie         | -     |
     | Peter Bartlett        | +     |
     | Eric Baum             | +     |
     | Haym Hirsh            | -     |
     | Leslie Pack Kaelbling | +     |
     | Yoav Freund           | -     |

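  8. Representing data
     Data can be represented as a big table, with columns denoting different attributes.

     | Name                  | Name has punctuation? | Second character of first name | Length of first name > 5? | Same first letter in two names? | Label |
     |-----------------------|-----------------------|--------------------------------|---------------------------|---------------------------------|-------|
     | Claire Cardie         | No                    | l                              | Yes                       | Yes                             | -     |
     | Peter Bartlett        | No                    | e                              | No                        | No                              | +     |
     | Eric Baum             | No                    | r                              | No                        | No                              | +     |
     | Haym Hirsh            | No                    | a                              | No                        | Yes                             | -     |
     | Leslie Pack Kaelbling | No                    | e                              | Yes                       | No                              | +     |
     | Yoav Freund           | No                    | o                              | No                        | No                              | -     |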
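
The four attribute columns can be computed directly from each name. The following is a minimal sketch, not code from the slides: the function name `extract_features` and the dictionary keys are made up for illustration, and "same first letter in two names" is assumed to compare the first and last name tokens.

```python
import string

def extract_features(name):
    """Map a full name to the four attributes used in the table."""
    parts = name.split()
    first, last = parts[0], parts[-1]
    return {
        "has_punctuation": any(ch in string.punctuation for ch in name),
        "second_char": first[1].lower(),
        "first_name_longer_than_5": len(first) > 5,
        # Assumption: compare the initials of the first and last tokens.
        "same_first_letter": first[0].lower() == last[0].lower(),
    }

names = ["Claire Cardie", "Peter Bartlett", "Eric Baum",
         "Haym Hirsh", "Leslie Pack Kaelbling", "Yoav Freund"]

# One feature dictionary per name, i.e. one table row per example.
rows = {name: extract_features(name) for name in names}

for name, features in rows.items():
    print(name, features)
```

Under these assumptions the computed values agree with the table rows for all six names.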

  9. Representing data
     Data can be represented as a big table, with columns denoting different attributes. (Table as on slide 8.)
     With these four attributes, how many unique rows are possible? 2 × 26 × 2 × 2 = 208
     If there are 100 attributes, all binary, how many unique rows are possible? 2^100

  10. Representing data
      Data can be represented as a big table, with columns denoting different attributes. (Table as on slide 8.)
      With these four attributes, how many unique rows are possible? 2 × 26 × 2 × 2 = 208
      If there are 100 attributes, all binary, how many unique rows are possible? 2^100

  12. Representing data
      Data can be represented as a big table, with columns denoting different attributes. (Table as on slide 8.)
      With these four attributes, how many unique rows are possible? 2 × 26 × 2 × 2 = 208
      If there are 100 attributes, all binary, how many unique rows are possible? 2 × 2 × 2 × ⋯ × 2 (100 times) = 2^100

  13. Representing data
      Data can be represented as a big table, with columns denoting different attributes. (Table as on slide 8.)
      With these four attributes, how many unique rows are possible? 2 × 26 × 2 × 2 = 208
      If there are 100 attributes, all binary, how many unique rows are possible? 2 × 2 × 2 × ⋯ × 2 (100 times) = 2^100
      If we wanted to store all possible rows, this number is too large.
      We need to figure out how to represent data in a better, more efficient way.
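
The row counts are just products of the number of values each attribute can take. A small sketch of that arithmetic follows; the attribute names in `domain_sizes` are illustrative labels, not identifiers from the slides.

```python
from math import prod

# Number of distinct values per attribute: three Yes/No attributes
# and one attribute that is a single letter of the alphabet.
domain_sizes = {
    "has_punctuation": 2,
    "second_char": 26,
    "first_name_longer_than_5": 2,
    "same_first_letter": 2,
}

# Unique rows = product of the domain sizes.
num_rows = prod(domain_sizes.values())

# With 100 binary attributes the product is 2 multiplied by itself 100 times.
num_rows_100_binary = prod([2] * 100)

print(num_rows)
print(num_rows_100_binary)
print(len(str(num_rows_100_binary)), "decimal digits")
```

The second count has 31 decimal digits, which is why enumerating and storing every possible row is not an option.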

  14. What are decision trees?
      A hierarchical data structure that represents data using a divide-and-conquer strategy.
      Can be used as a hypothesis class for non-parametric classification or regression.
      General idea: Given a collection of examples, learn a decision tree that represents it.

  15. What are decision trees?
      • Decision trees are a family of classifiers for instances that are represented by collections of attributes (i.e., features)
      • Nodes are tests for feature values
      • There is one branch for every value that the feature can take
      • Leaves of the tree specify the class labels
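
One possible way to write this structure down in code is sketched below. The class names `Node` and `Leaf`, the helper `leaves`, and the one-test `toy` tree are all hypothetical: the tree is hand-built for illustration, not learned and not taken from the slides.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union

@dataclass
class Leaf:
    """A leaf specifies a class label."""
    label: str

@dataclass
class Node:
    """An internal node tests one feature, with one branch per feature value."""
    feature: str
    children: Dict[str, "Tree"] = field(default_factory=dict)

Tree = Union[Leaf, Node]

def leaves(tree: Tree) -> List[str]:
    """Collect the labels at the leaves, left to right."""
    if isinstance(tree, Leaf):
        return [tree.label]
    labels: List[str] = []
    for child in tree.children.values():
        labels.extend(leaves(child))
    return labels

# Hypothetical hand-built tree with a single test on a Yes/No feature.
toy = Node("same_first_letter", {"Yes": Leaf("-"), "No": Leaf("+")})

print(leaves(toy))
```

Storing the children in a dictionary keyed by feature value mirrors the "one branch for every value" bullet directly.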

  16. Let’s build a decision tree for classifying shapes
      [Figure: example shapes in three groups, labeled A, B, and C]

  17. Let’s build a decision tree for classifying shapes
      [Figure: example shapes in three groups, labeled A, B, and C]
      Before building a decision tree: What is the label for a red triangle? And why?

  18. Let’s build a decision tree for classifying shapes
      What are some attributes of the examples?

  19. Let’s build a decision tree for classifying shapes
      What are some attributes of the examples? Color, Shape

  20. Let’s build a decision tree for classifying shapes
      What are some attributes of the examples? Color, Shape
      [Figure: tree so far. Root test: Color?]

  21. Let’s build a decision tree for classifying shapes
      What are some attributes of the examples? Color, Shape
      [Figure: tree so far. Root test Color? with branches Blue, Green, Red]

  22. Let’s build a decision tree for classifying shapes
      What are some attributes of the examples? Color, Shape
      [Figure: tree so far. Root test Color? with branches Blue, Green, Red; one branch ends in a leaf labeled B]

  23. Let’s build a decision tree for classifying shapes
      What are some attributes of the examples? Color, Shape
      [Figure: tree so far. Root test Color? with branches Blue, Green, Red; a leaf labeled B; a Shape? test with branches circle, triangle, square and leaves B, C, A]

  24. Let’s build a decision tree for classifying shapes
      What are some attributes of the examples? Color, Shape
      [Figure: full tree. Root test Color? with branches Blue, Green, Red; a leaf labeled B; a Shape? test with branches circle, triangle, square and leaves B, C, A; a second Shape? test with branches circle, square and leaves B, A]

  25. Let’s build a decision tree for classifying shapes
      1. How do we learn a decision tree? Coming up soon…
      2. How to use a decision tree for prediction?
         • What is the label for a red triangle?
           – Just follow a path from the root to a leaf
         • What about a green triangle?
      [Figure: the example shapes (groups labeled A, B, C) and the full tree from slide 24]
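
Prediction by following a path from the root to a leaf might be sketched as follows. This is an illustration under stated assumptions: the nested-dictionary layout and the function name `predict` are made up here, and `shape_tree` contains only the blue branch, whose rules are spelled out on slide 27. The green and red branches are left out because the flattened figure text does not make clear which subtree hangs under which color.

```python
def predict(tree, example):
    """Follow one root-to-leaf path; return the leaf label, or None if no branch matches."""
    node = tree
    while isinstance(node, dict):          # internal node: a feature test
        value = example[node["feature"]]   # the example's value for that feature
        branches = node["branches"]
        if value not in branches:          # no branch for this value in the tree
            return None
        node = branches[value]
    return node                            # leaf: a class label

# Partial tree (assumption: only the blue branch is reproduced).
shape_tree = {
    "feature": "Color",
    "branches": {
        "blue": {
            "feature": "Shape",
            "branches": {"triangle": "B", "square": "A", "circle": "C"},
        },
    },
}

print(predict(shape_tree, {"Color": "blue", "Shape": "triangle"}))
print(predict(shape_tree, {"Color": "red", "Shape": "triangle"}))
```

Returning `None` for a missing branch is one design choice among several; in this partial tree it simply signals that the red branch was not reproduced.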

  27. Expressivity of decision trees
      What Boolean functions can decision trees represent?
      – Any Boolean function
      Every path from the root to a leaf is a rule.
      The full tree is equivalent to the conjunction of all the rules:
      (Color=blue AND Shape=triangle ⇒ Label=B) AND
      (Color=blue AND Shape=square ⇒ Label=A) AND
      (Color=blue AND Shape=circle ⇒ Label=C) AND …
      Any Boolean function can be represented as a decision tree.
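
To see why any Boolean function fits, one can test the variables one after another and place the function's value at each leaf. The sketch below does this for XOR and then lists the root-to-leaf rules. The helper names `build_tree`, `rules`, and `evaluate` are hypothetical, and the construction is the naive full tree, not a learned or minimal one.

```python
from itertools import product

def build_tree(f, variables, assignment=None):
    """Full decision tree for Boolean function f, testing variables in the given order."""
    assignment = assignment or {}
    if len(assignment) == len(variables):
        return f(**assignment)                      # leaf: the function's value
    var = variables[len(assignment)]
    return {
        "feature": var,
        "branches": {
            value: build_tree(f, variables, {**assignment, var: value})
            for value in (False, True)
        },
    }

def rules(tree, path=()):
    """Yield one (conditions, label) pair per root-to-leaf path."""
    if not isinstance(tree, dict):
        yield path, tree
        return
    for value, subtree in tree["branches"].items():
        yield from rules(subtree, path + ((tree["feature"], value),))

def evaluate(tree, example):
    """Follow the path selected by the example down to a leaf."""
    while isinstance(tree, dict):
        tree = tree["branches"][example[tree["feature"]]]
    return tree

def xor(x, y):
    return x != y

xor_tree = build_tree(xor, ["x", "y"])
xor_rules = list(rules(xor_tree))

# The tree agrees with the function on every input.
agrees = all(evaluate(xor_tree, {"x": x, "y": y}) == xor(x, y)
             for x, y in product([False, True], repeat=2))

for conditions, label in xor_rules:
    print(" AND ".join(f"{name}={value}" for name, value in conditions), "=>", label)
```

This construction has one leaf per input combination, so it grows as 2^n with n variables; it shows that a representation exists, not that it is compact.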
