SLIDE 1

Machine Learning

Learning Decision Trees

1

Some slides from Tom Mitchell, Dan Roth and others

SLIDE 2

This lecture: Learning Decision Trees

  • 1. Representation: What are decision trees?
  • 2. Algorithm: Learning decision trees
    – The ID3 algorithm: A greedy heuristic
  • 3. Some extensions

2


SLIDE 4

History of Decision Tree Research

  • Full-search decision tree methods to model human concept learning: Hunt et al., 1960s, psychology
  • Quinlan developed the ID3 (Iterative Dichotomiser 3) algorithm, with the information gain heuristic, to learn expert systems from examples (late 70s)
  • Breiman, Friedman and colleagues in statistics developed CART (Classification And Regression Trees)
  • A variety of improvements in the 80s: coping with noise, continuous attributes, missing data, non-axis-parallel splits, etc.
  • Quinlan's updated algorithms, C4.5 (1993) and C5, are more commonly used
  • Boosting (or Bagging) over decision trees is a very good general-purpose algorithm

4

SLIDE 5

Will I play tennis today?

  • Features
    – Outlook: {Sun, Overcast, Rain}
    – Temperature: {Hot, Mild, Cool}
    – Humidity: {High, Normal, Low}
    – Wind: {Strong, Weak}

  • Labels

– Binary classification task: Y = {+, -}

5

SLIDE 6

Will I play tennis today?

Outlook: Sunny, Overcast, Rainy
Temperature: Hot, Medium, Cool
Humidity: High, Normal, Low
Wind: Strong, Weak

6

 #   O  T  H  W  Play?
 1   S  H  H  W   -
 2   S  H  H  S   -
 3   O  H  H  W   +
 4   R  M  H  W   +
 5   R  C  N  W   +
 6   R  C  N  S   -
 7   O  C  N  S   +
 8   S  M  H  W   -
 9   S  C  N  W   +
10   R  M  N  W   +
11   S  M  N  S   +
12   O  M  H  S   +
13   O  H  N  W   +
14   R  M  H  S   -

SLIDE 10

Basic Decision Tree Learning Algorithm

  • Data is processed in Batch (i.e. all the data available)
  • Recursively build a decision tree top down.

10

[Data table: the 14 training examples from SLIDE 6]

  • Outlook?
    Sunny → Humidity?  (High → No, Normal → Yes)
    Overcast → Yes
    Rain → Wind?  (Strong → No, Weak → Yes)

  • 1. Decide what attribute goes at the top
  • 2. Decide what to do for each value the root attribute takes
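
To make the tree above concrete, here is a minimal sketch of one way to represent such a tree and use it to classify an example. The nested-dict layout and the predict helper are illustrative assumptions, not something prescribed by the slides.

```python
# A decision tree as nested dicts: an internal node names the attribute it tests,
# its branches map attribute values to subtrees, and leaves are class labels.
tree = {
    "Outlook": {
        "Sunny":    {"Humidity": {"High": "No", "Normal": "Yes"}},
        "Overcast": "Yes",
        "Rain":     {"Wind": {"Strong": "No", "Weak": "Yes"}},
    }
}

def predict(node, example):
    """Walk from the root to a leaf, following the branch chosen by the example."""
    while isinstance(node, dict):
        attribute = next(iter(node))               # the attribute tested at this node
        node = node[attribute][example[attribute]]
    return node

print(predict(tree, {"Outlook": "Sunny", "Humidity": "Normal", "Wind": "Weak"}))  # Yes
```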

SLIDE 11

Basic Decision Tree Algorithm: ID3

Input: S, the set of examples; Attributes, the set of measured attributes.

ID3(S, Attributes):
  • 1. If all examples have the same label:
        Return a single-node tree with that label
  • 2. Otherwise:
    • 1. Create a Root node for the tree
    • 2. A = the attribute in Attributes that best classifies S   (decide what attribute goes at the top)
    • 3. For each possible value v that A can take:   (decide what to do for each value the root attribute takes)
      • 1. Add a new tree branch corresponding to A = v
      • 2. Let Sv be the subset of examples in S with A = v
      • 3. If Sv is empty: add a leaf node with the most common value of Label in S
              (why? for generalization at test time)
           Else: below this branch add the subtree ID3(Sv, Attributes − {A})
              (a recursive call to ID3 with all the remaining attributes)
    • 4. Return Root node

11
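
The sketch below restates the ID3 pseudocode above in Python. It is illustrative rather than a definitive implementation: the data layout (each example as an attribute-to-value dict), the explicit domains argument, and the pluggable choose heuristic are assumptions; the information-gain heuristic that would fill the choose slot is introduced on the following slides.

```python
from collections import Counter

def id3(examples, labels, attributes, domains, choose):
    """ID3 sketch.
    examples:   list of dicts mapping attribute -> value
    labels:     class labels, aligned with examples
    attributes: attributes still available for splitting
    domains:    dict mapping each attribute to its set of possible values
    choose:     heuristic choose(examples, labels, attributes) -> attribute
    """
    # 1. All examples share one label: return a single-node tree with that label.
    if len(set(labels)) == 1:
        return labels[0]
    majority = Counter(labels).most_common(1)[0][0]
    # No attributes left to test: fall back to the majority label.
    if not attributes:
        return majority
    # 2. A = the attribute that best classifies the current examples.
    a = choose(examples, labels, attributes)
    tree = {a: {}}
    # 3. Add one branch per possible value v of A.
    for v in domains[a]:
        sv = [(ex, y) for ex, y in zip(examples, labels) if ex[a] == v]
        if not sv:
            # Empty subset: leaf with the most common label in S (for generalization).
            tree[a][v] = majority
        else:
            sub_examples, sub_labels = zip(*sv)
            tree[a][v] = id3(list(sub_examples), list(sub_labels),
                             [b for b in attributes if b != a], domains, choose)
    return tree
```

With a choose function based on the information-gain heuristic defined later in the lecture, running this on the 14 PlayTennis examples should reproduce the Outlook / Humidity / Wind tree shown on the previous slide.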

SLIDE 22

Picking the Root Attribute

  • Goal: have the resulting decision tree be as small as possible (Occam's Razor)
    – But finding the minimal decision tree consistent with the data is NP-hard
  • The recursive algorithm is a greedy heuristic search for a simple tree, but it cannot guarantee optimality
  • The main decision in the algorithm is the selection of the next attribute to split on

22

SLIDE 27

Picking the Root Attribute

Consider data with two Boolean attributes (A, B):
    < (A=0, B=0), - >: 50 examples
    < (A=0, B=1), - >: 50 examples
    < (A=1, B=0), - >: 0 examples
    < (A=1, B=1), + >: 100 examples

What should be the first attribute we select?
  • Splitting on A: we get purely labeled nodes.
  • Splitting on B: we don't get purely labeled nodes.

[Figure: the two candidate splits, on A and on B]

What if we instead have < (A=1, B=0), - >: 3 examples?

27

SLIDE 33

Picking the Root Attribute

Consider data with two Boolean attributes (A, B):
    < (A=0, B=0), - >: 50 examples
    < (A=0, B=1), - >: 50 examples
    < (A=1, B=0), - >: 3 examples
    < (A=1, B=1), + >: 100 examples

Which attribute should we choose? The two trees look structurally similar!

[Figure: the two candidate trees: splitting on B first (leaves with 53, 50, and 100 examples) vs. splitting on A first (leaves with 100, 3, and 100 examples)]

Advantage A. But… we need a way to quantify things.

33

SLIDE 34

Picking the Root Attribute

Goal: have the resulting decision tree be as small as possible (Occam's Razor)

  • The main decision in the algorithm is the selection of the next attribute for splitting the data
  • We want attributes that split the examples into sets that are relatively pure in one label
    – This way we are closer to a leaf node.
  • The most popular heuristic is information gain, which originated with the ID3 system of Quinlan

34

SLIDE 35

Reminder: Entropy

Entropy (impurity, disorder) of a set of examples S with respect to binary classification is

    Entropy(S) = H(S) = − p+ log2(p+) − p− log2(p−)

  • The proportion of positive examples is p+
  • The proportion of negative examples is p−

In general, for a discrete probability distribution with K possible values, with probabilities {p1, p2, ⋯ , pK}, the entropy is given by

    H(p1, p2, ⋯ , pK) = − Σk pk log2(pk)

35
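
As a quick check of this definition, here is a small helper (an illustrative sketch, not part of the deck) that computes the entropy of a label distribution given as counts.

```python
from math import log2

def entropy(counts):
    """Entropy, in bits, of a discrete distribution given as a list of counts."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]   # 0 * log2(0) is treated as 0
    return -sum(p * log2(p) for p in probs)

print(entropy([7, 7]))    # 1.0   -> a 50/50 split has maximal entropy
print(entropy([14, 0]))   # 0.0   -> a pure set has zero entropy
print(entropy([9, 5]))    # ~0.94 -> the PlayTennis label distribution (9+, 5-)
```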

SLIDE 38

Reminder: Entropy

Entropy (impurity, disorder) of a set of examples S with respect to binary classification is

    Entropy(S) = H(S) = − p+ log2(p+) − p− log2(p−)

  • The proportion of positive examples is p+
  • The proportion of negative examples is p−
  • If all examples belong to the same category, then entropy = 0
  • If p+ = p− = 1/2, then entropy = 1

38

Entropy can be viewed as the number of bits required, on average, to encode information. If the probability for + is 0.5, a single bit is required for each example; if it is 0.8, we can use less than 1 bit.

SLIDE 42

Reminder: Entropy

Entropy (impurity, disorder) of a set of examples S with respect to binary classification is

    Entropy(S) = H(S) = − p+ log2(p+) − p− log2(p−)

The uniform distribution has the highest entropy

[Figure: example label distributions, from uniform (highest entropy) to highly skewed (lowest entropy)]

42

High Entropy: high level of uncertainty
Low Entropy: low uncertainty

SLIDE 43

Picking the Root Attribute

Goal: have the resulting decision tree be as small as possible (Occam's Razor)

  • The main decision in the algorithm is the selection of the next attribute for splitting the data
  • We want attributes that split the examples into sets that are relatively pure in one label
    – This way we are closer to a leaf node.
  • The most popular heuristic is information gain, which originated with the ID3 system of Quinlan

43

Intuition: Choose the attribute that reduces the label entropy the most

SLIDE 44

Information Gain

The information gain of an attribute A is the expected reduction in entropy caused by partitioning on this attribute:

    Gain(S, A) = Entropy(S) − Σv ( |Sv| / |S| ) · Entropy(Sv)

where Sv is the subset of examples in which attribute A takes value v. The entropy of the partitioned data is calculated by weighting the entropy of each partition by its size relative to the original set.

    – Partitions of low entropy (imbalanced splits) lead to high gain

Go back and check which of the A, B splits is better.

44
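
The sketch below (illustrative, not from the deck) implements this definition and applies it to the A/B example from the earlier "Picking the Root Attribute" slides, using the variant with 3 negative (A=1, B=0) examples; splitting on A scores much higher than splitting on B.

```python
from math import log2
from collections import Counter

def entropy(labels):
    """Entropy (bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def information_gain(examples, labels, attribute):
    """Gain(S, A) = Entropy(S) - sum over values v of |Sv|/|S| * Entropy(Sv)."""
    total = len(examples)
    expected = 0.0
    for v in {ex[attribute] for ex in examples}:
        sv = [y for ex, y in zip(examples, labels) if ex[attribute] == v]
        expected += len(sv) / total * entropy(sv)
    return entropy(labels) - expected

# 50 x <(A=0,B=0),->, 50 x <(A=0,B=1),->, 3 x <(A=1,B=0),->, 100 x <(A=1,B=1),+>
examples, labels = [], []
for a, b, y, n in [(0, 0, "-", 50), (0, 1, "-", 50), (1, 0, "-", 3), (1, 1, "+", 100)]:
    examples += [{"A": a, "B": b}] * n
    labels += [y] * n

print(round(information_gain(examples, labels, "A"), 3))  # ~0.903 (splitting on A)
print(round(information_gain(examples, labels, "B"), 3))  # ~0.321 (splitting on B)
```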

SLIDE 48

Will I play tennis today?

Outlook: S(unny), O(vercast), R(ainy)
Temperature: H(ot), M(edium), C(ool)
Humidity: H(igh), N(ormal), L(ow)
Wind: S(trong), W(eak)

48

[Data table: the 14 training examples from SLIDE 6]

SLIDE 49

Will I play tennis today?

Current entropy:
    p = 9/14, n = 5/14
    H(Play?) = −(9/14) log2(9/14) − (5/14) log2(5/14) ≈ 0.94

49

[Data table: the 14 training examples from SLIDE 6]

SLIDE 52

Information Gain: Outlook

Outlook = Sunny: 5 of 14 examples, p = 2/5, n = 3/5, HSunny = 0.971
Outlook = Overcast: 4 of 14 examples, p = 4/4, n = 0, HOvercast = 0
Outlook = Rainy: 5 of 14 examples, p = 3/5, n = 2/5, HRainy = 0.971

Expected entropy: (5/14)×0.971 + (4/14)×0 + (5/14)×0.971 = 0.694
Information gain: 0.940 − 0.694 = 0.246

52

[Data table: the 14 training examples from SLIDE 6]

SLIDE 55

Information Gain: Humidity

Humidity = High: 7 of 14 examples, p = 3/7, n = 4/7, HHigh = 0.985
Humidity = Normal: 7 of 14 examples, p = 6/7, n = 1/7, HNormal = 0.592

Expected entropy: (7/14)×0.985 + (7/14)×0.592 = 0.7885
Information gain: 0.940 − 0.7885 = 0.1515

55

[Data table: the 14 training examples from SLIDE 6]

SLIDE 59

Which feature to split on?

Information gain:
    Outlook: 0.246
    Humidity: 0.151
    Wind: 0.048
    Temperature: 0.029
→ Split on Outlook

59

[Data table: the 14 training examples from SLIDE 6]
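
As a cross-check of these numbers, here is a short self-contained sketch (illustrative, not from the deck) that recomputes the information gain of every attribute on the 14 training examples. It reports roughly 0.247, 0.029, 0.152 and 0.048; the slides' 0.246 and 0.151 come from rounding the intermediate entropies, and the ranking, with Outlook on top, is the same.

```python
from math import log2
from collections import Counter

# The 14 PlayTennis examples, encoded as (Outlook, Temperature, Humidity, Wind, label).
rows = ["SHHW-", "SHHS-", "OHHW+", "RMHW+", "RCNW+", "RCNS-", "OCNS+",
        "SMHW-", "SCNW+", "RMNW+", "SMNS+", "OMHS+", "OHNW+", "RMHS-"]
attributes = ["Outlook", "Temperature", "Humidity", "Wind"]
examples = [dict(zip(attributes, r[:4])) for r in rows]
labels = [r[4] for r in rows]

def entropy(ys):
    n = len(ys)
    return -sum((c / n) * log2(c / n) for c in Counter(ys).values())

def gain(attribute):
    n = len(examples)
    expected = 0.0
    for v in {ex[attribute] for ex in examples}:
        sv = [y for ex, y in zip(examples, labels) if ex[attribute] == v]
        expected += len(sv) / n * entropy(sv)
    return entropy(labels) - expected

for a in attributes:
    print(a, round(gain(a), 3))   # Outlook 0.247, Temperature 0.029, Humidity 0.152, Wind 0.048
```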

SLIDE 60

An Illustrative Example

Gain(S, Outlook) = 0.246
Gain(S, Humidity) = 0.151
Gain(S, Wind) = 0.048
Gain(S, Temperature) = 0.029

→ Outlook is placed at the root.

60

SLIDE 62

An Illustrative Example

Outlook
    Sunny: examples 1,2,8,9,11 (2+, 3-) → ?
    Overcast: examples 3,7,12,13 (4+, 0-) → Yes
    Rain: examples 4,5,6,10,14 (3+, 2-) → ?

Continue until:
  • Every attribute is included in the path, or
  • All examples in the leaf have the same label

62

[Data table: the 14 training examples from SLIDE 6]

SLIDE 63

An Illustrative Example

Gain(Ssunny, Humidity) = 0.97 − (3/5)·0 − (2/5)·0 = 0.97
Gain(Ssunny, Temperature) = 0.97 − 0 − (2/5)·1 = 0.57
Gain(Ssunny, Wind) = 0.97 − (2/5)·1 − (3/5)·0.92 = 0.02

The Sunny subset:

 Day  Outlook  Temperature  Humidity  Wind    PlayTennis
  1   Sunny    Hot          High      Weak    No
  2   Sunny    Hot          High      Strong  No
  8   Sunny    Mild         High      Weak    No
  9   Sunny    Cool         Normal    Weak    Yes
 11   Sunny    Mild         Normal    Strong  Yes

(Partial tree so far: Outlook at the root; Overcast → Yes; the Sunny and Rain branches are still open.)

63
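
The same computation can be repeated on the Sunny subset to verify these second-level gains (an illustrative sketch, not from the deck); Humidity comes out on top, which is why it is placed under the Sunny branch.

```python
from math import log2
from collections import Counter

# The five Sunny examples (days 1, 2, 8, 9, 11): (Temperature, Humidity, Wind) -> PlayTennis.
attributes = ["Temperature", "Humidity", "Wind"]
rows = ["HHW-", "HHS-", "MHW-", "CNW+", "MNS+"]
examples = [dict(zip(attributes, r[:3])) for r in rows]
labels = [r[3] for r in rows]

def entropy(ys):
    n = len(ys)
    return -sum((c / n) * log2(c / n) for c in Counter(ys).values())

def gain(attribute):
    n = len(examples)
    expected = 0.0
    for v in {ex[attribute] for ex in examples}:
        sv = [y for ex, y in zip(examples, labels) if ex[attribute] == v]
        expected += len(sv) / n * entropy(sv)
    return entropy(labels) - expected

for a in attributes:
    print(a, round(gain(a), 3))   # Temperature 0.571, Humidity 0.971, Wind 0.02
```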

SLIDE 65

An Illustrative Example

Final tree:

Outlook
    Sunny (examples 1,2,8,9,11; 2+, 3-) → Humidity
        High → No
        Normal → Yes
    Overcast (examples 3,7,12,13; 4+, 0-) → Yes
    Rain (examples 4,5,6,10,14; 3+, 2-) → Wind
        Strong → No
        Weak → Yes

65

SLIDE 66

Hypothesis Space in Decision Tree Induction

  • Search over decision trees, which can represent all possible discrete functions (has pros and cons)
  • Goal: to find the best decision tree
  • Finding a minimal decision tree consistent with a set of data is NP-hard.
  • ID3 performs a greedy heuristic search
    – hill climbing without backtracking
  • Makes statistical decisions using all data

66

SLIDE 67

Summary: Learning Decision Trees

  • 1. Representation: What are decision trees?
    – A hierarchical data structure that represents data
  • 2. Algorithm: Learning decision trees
    – The ID3 algorithm: A greedy heuristic
      • If all the examples have the same label, create a leaf with that label
      • Otherwise, find the "most informative" attribute and split the data for the different values of that attribute
      • Recurse on the splits

67