Decision Trees: Representation


SLIDE 1

Machine Learning

Decision Trees: Representation


Some slides from Tom Mitchell, Dan Roth and others

SLIDE 2

Key issues in machine learning

  • Modeling
    – How to formulate your problem as a machine learning problem?
    – How to represent data?
    – Which algorithms to use? What learning protocols?

  • Representation
    – Good hypothesis spaces and good features

  • Algorithms
    – What is a good learning algorithm?
    – What is success?
    – Generalization vs. overfitting
    – The computational question: How long will learning take?


SLIDE 4

Coming up… (the rest of the semester)

Different hypothesis spaces and learning algorithms

– Decision trees and the ID3 algorithm
– Linear classifiers

  • Perceptron
  • SVM
  • Logistic regression

– Combining multiple classifiers

  • Boosting, bagging

– Combining multiple classifiers (boosting, bagging)
– Non-linear classifiers
– Nearest neighbors

Important issues to consider

  • 1. What do these hypotheses represent?
  • 2. Implicit assumptions and tradeoffs
  • 3. Generalization?
  • 4. How do we learn?
SLIDE 5

This lecture: Learning Decision Trees

  • 1. Representation: What are decision trees?
  • 2. Algorithm: Learning decision trees

– The ID3 algorithm: A greedy heuristic

  • 3. Some extensions



SLIDE 7

Representing data

Data can be represented as a big table, with columns denoting different attributes

Name                     Label
Claire Cardie            −
Peter Bartlett           +
Eric Baum                +
Haym Hirsh               −
Leslie Pack Kaelbling    +
Yoav Freund              −
SLIDE 8

Representing data

Data can be represented as a big table, with columns denoting different attributes

Name                    Name has punctuation?   Second character of first name   Length of first name > 5?   Same first letter in two names?   Label
Claire Cardie           No                      l                                Yes                         Yes                               −
Peter Bartlett          No                      e                                No                          No                                +
Eric Baum               No                      r                                No                          No                                +
Haym Hirsh              No                      a                                No                          Yes                               −
Leslie Pack Kaelbling   No                      e                                Yes                         No                                +
Yoav Freund             No                      o                                No                          No                                −
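To make the attribute columns concrete, here is a small sketch (my illustration, not from the slides; the function name and the exact punctuation test are assumptions) of how each row could be computed from the raw name:

```python
def attributes(name: str) -> dict:
    """Compute the four attribute columns from a raw name string."""
    parts = name.split()
    first = parts[0]
    return {
        "has_punctuation": any(not (c.isalpha() or c.isspace()) for c in name),
        "second_char_of_first_name": first[1].lower(),
        "first_name_longer_than_5": len(first) > 5,
        "same_first_letter": len({p[0].lower() for p in parts}) == 1,
    }

print(attributes("Claire Cardie"))
# {'has_punctuation': False, 'second_char_of_first_name': 'l',
#  'first_name_longer_than_5': True, 'same_first_letter': True}
```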

SLIDE 13

Representing data

Data can be represented as a big table, with columns denoting different attributes.

(Same table as Slide 8.)

With these four attributes, how many unique rows are possible? 2 × 26 × 2 × 2 = 208
If there are 100 attributes, all binary, how many unique rows are possible? 2 × 2 × 2 × ⋯ × 2 (100 times) = 2^100

If we wanted to store all possible rows, this number is too large. We need to figure out how to represent data in a better, more efficient way.
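To make the counting concrete, a tiny sketch (my illustration, not part of the slides) that reproduces both numbers:

```python
import math

# Four attributes: punctuation? (2 values), second character (26),
# first name longer than 5? (2), same first letter? (2)
print(math.prod([2, 26, 2, 2]))   # 208 distinct rows

# 100 binary attributes: 2 * 2 * ... * 2, one factor per attribute
print(2 ** 100)                   # 1267650600228229401496703205376 distinct rows
```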

SLIDE 14

What are decision trees?

  • A hierarchical data structure that represents data using a divide-and-conquer strategy
  • Can be used as a hypothesis class for non-parametric classification or regression
  • General idea: Given a collection of examples, learn a decision tree that represents it


SLIDE 15

What are decision trees?

  • Decision trees are a family of classifiers for instances that are represented by collections of attributes (i.e., features)

  • Nodes are tests for feature values
  • There is one branch for every value that the feature can take
  • Leaves of the tree specify the class labels

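A minimal way to render this structure in code (an illustrative sketch; the class names are mine, not from the lecture):

```python
from dataclasses import dataclass, field
from typing import Dict, Union

@dataclass
class Leaf:
    label: str                 # leaves specify the class label

@dataclass
class Node:
    feature: str               # the attribute this node tests
    # one child per value that the feature can take
    children: Dict[str, Union["Node", "Leaf"]] = field(default_factory=dict)
```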

SLIDE 16

Let’s build a decision tree for classifying shapes

[Figure: a collection of colored shapes (circles, squares, triangles), each labeled A, B, or C]

SLIDE 17

Let’s build a decision tree for classifying shapes


Before building a decision tree: What is the label for a red triangle? And why?



SLIDE 19

Let’s build a decision tree for classifying shapes

What are some attributes of the examples?

Color, Shape



SLIDE 25

Let's build a decision tree for classifying shapes

What are some attributes of the examples? Color, Shape

[Figure: the finished decision tree. Root: Color? with branches Blue, Red, Green. Blue → Shape? (square → A, triangle → B, circle → C); Red → Shape? (circle and square branches, with leaves A and B); Green → B]

  • 1. How do we learn a decision tree? Coming up soon…
  • 2. How to use a decision tree for prediction?
    – What is the label for a red triangle? Just follow a path from the root to a leaf
    – What about a green triangle?
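As a sketch of the prediction question (illustrative code, not from the slides; the Red subtree's exact leaf assignment is hard to read off the figure, so it is a guess here), the tree can be written as nested dicts, and prediction really is just following a path from the root to a leaf:

```python
# Internal nodes: {feature: {value: subtree}}; leaves are plain label strings.
shape_tree = {
    "Color": {
        "Blue":  {"Shape": {"square": "A", "triangle": "B", "circle": "C"}},
        "Red":   {"Shape": {"circle": "A", "square": "B"}},   # leaf assignment assumed
        "Green": "B",
    }
}

def predict(tree, example):
    """Walk from the root to a leaf, branching on the example's feature values."""
    while isinstance(tree, dict):
        feature, branches = next(iter(tree.items()))
        tree = branches[example[feature]]
    return tree

print(predict(shape_tree, {"Color": "Blue", "Shape": "circle"}))     # C
print(predict(shape_tree, {"Color": "Green", "Shape": "triangle"}))  # B (Green is already a leaf)
```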


SLIDE 27

Expressivity of decision trees

What Boolean functions can decision trees represent?
– Any Boolean function

(Color=blue AND Shape=triangle → Label=B) AND
(Color=blue AND Shape=square → Label=A) AND
(Color=blue AND Shape=circle → Label=C) AND …

Every path from the root to a leaf is a rule. The full tree is equivalent to the conjunction of all the rules.

Any Boolean function can be represented as a decision tree.
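As a concrete instance of this claim (my example, not from the slides): XOR, which a single linear threshold cannot represent, fits in a decision tree of depth two, with one test per variable along every path:

```python
# XOR(x1, x2) as a decision tree: test x1 at the root, then test x2 on each branch.
xor_tree = {
    "x1": {
        0: {"x2": {0: 0, 1: 1}},
        1: {"x2": {0: 1, 1: 0}},
    }
}

for x1 in (0, 1):
    for x2 in (0, 1):
        leaf = xor_tree["x1"][x1]["x2"][x2]
        assert leaf == (x1 ^ x2)   # every path from root to leaf agrees with XOR
```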


SLIDE 30

Decision Trees

  • Outputs are discrete categories
  • But real-valued outputs are also possible (regression trees; see the sketch below)
  • Well-studied methods exist for handling noisy data (noise in the label or in the features) and for handling missing attributes
    – Pruning trees helps with noise
    – More on this later…
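For the regression-tree remark, a one-line sketch of what changes (an assumption on my part: a common choice, not spelled out on the slide, is to store the mean of the training targets at each leaf):

```python
# A regression-tree leaf predicts a number instead of a class label;
# a common choice is the mean of the training targets that reach the leaf.
def leaf_prediction(targets):
    return sum(targets) / len(targets)

print(leaf_prediction([2.0, 3.5, 4.5]))   # 3.333...
```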

SLIDE 35

Numeric attributes and decision boundaries

  • We have seen instances represented as attribute-value pairs (color=blue, second letter=e, etc.)
    – Values have been categorical
  • How do we deal with numeric feature values? (e.g., length = ?)
    – Discretize them, or use thresholds on the numeric values
    – This example divides the feature space into axis-parallel rectangles

[Figure, left: points labeled + and − in the (X, Y) plane, with thresholds at X = 1, X = 3, Y = 5, and Y = 7 carving it into axis-parallel rectangles]

[Figure, right: the corresponding decision tree, whose internal nodes test thresholds such as X < 3, Y < 5, Y > 7, and X < 1, and whose leaves are labeled + or −]

Decision boundaries can be non-linear
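A sketch of such a tree over two numeric features (illustrative only: the thresholds 1, 3, 5, 7 come from the figure, but its exact structure and leaf labels are not recoverable from the text, so they are assumptions here):

```python
def predict(x: float, y: float) -> str:
    # Each test compares one feature to a threshold, so every split is an
    # axis-parallel line and the resulting regions are axis-parallel rectangles.
    if x < 3:
        if x < 1:
            return "-"
        return "+" if y < 5 else "-"
    else:
        return "+" if y > 7 else "-"

print(predict(2.0, 4.0))   # "+"  (1 <= x < 3 and y < 5)
print(predict(6.0, 9.0))   # "+"  (x >= 3 and y > 7)
```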

SLIDE 36

Summary: Decision trees

  • Decision trees can represent any Boolean function
  • A way to represent a lot of data
  • A natural representation (think 20 questions)
  • Predicting with a decision tree is easy
  • Clearly, given a dataset, there are many decision trees that can represent it. [Exercise: Why?]
  • Learning a good representation from data is the next question


SLIDE 38

Exercises

1. Write down the decision tree for the shapes data if the root node was Shape instead of Color.
2. Will the two trees make the same predictions for unseen shape/color combinations?
3. Show that multiple structurally different decision trees can represent the same Boolean function of two or more variables. (Think about what it means for two trees to be structurally different.)