Explaining the result of a Decision Tree to the End-User
Isabelle Alvarez 1 2
Abstract. This paper addresses the problem of explaining the result given by a decision tree when it is used to predict the class of new cases. In order to evaluate this result, the end-user relies on some estimate of the error rate and on the trace of the classification. Unfortunately the trace does not contain the information necessary to understand the case at hand. We propose a new method to qualify the result given by a decision tree when the data are continuous-valued. We perform a geometric study of the decision surface (the boundary of the inverse image of the different classes). This analysis gives the list of the tests of the tree that are the most sensitive to a change in the input data. Unlike the trace, this list can easily be ordered and pruned so that only the most important tests are presented. We also show how the metric can be used to interact with the end-user.
1 INTRODUCTION
Real-world applications use decision trees (DT) as decision support systems in various domains [13]. DT algorithms are also integrated in data mining and decision support software (see for instance the software lists on http://www.kdnuggets.com or http://www.mlnet.org/). These tools offer many possibilities to build, prune, manipulate or validate decision trees. However, when it comes to the final use of the DT, to classify real cases and make a decision, end-users find little information to assess the relevance of the result. This kind of information is generally available by means of error rates or probability estimators [4] [11] [7]. In practice, these estimators are not always available, since they are developed for the construction of the tree and not for the end-user's needs (see examples in [15] and [6]). They are also not necessarily accurate ([11]; [10]). Besides, little information is provided to help the end-user link the result to the input data in order to assess its relevance. This is actually a difficult problem, since it depends on both the user and the system. Works on tree intelligibility are an attempt to answer this question, mainly through pruning methods (see [8] for a review). Works on feature selection (see [20]) also contribute to this objective. It is also one of the main objectives of fuzzy DT [17]. But with these methods, intelligibility is sought for the tree itself, considered as a model. The relevance of a particular result is only available by means of the trace of the classification, that is, the path followed in the tree: the list of tests passed by the case from the root to the leaf that finally gives the class.
Unfortunately the trace does not hold the information that is necessary to understand the situation of a case. The change of some
1 LIP6, Paris VI University, Paris, France, email: isabelle.alvarez@lip6.fr
2 Cemagref, Aubière, France
3 This paper is the extended version of I. Alvarez (2004), "Explaining the result of a Decision Tree to the End-User". In Proceedings of the 16th European Conference on Artificial Intelligence, pp. 411–415, IOS Press.
tests that are in the trace can have no consequence on the result. Conversely, a small change in the value of an attribute that does not appear in the trace can lead to a modification of the resulting class. The fact is that the trace does not exploit the information that is embedded in the partition realized by the DT in the input space.
We propose a geometric method that takes into account the complete partition of the input space, whenever it is possible to define a metric. This method is based on the study of the decision surface (DS), that is, the boundary of the inverse image of the different classes in the input space. We consider that the position of a case relative to the DS can give a good description of the situation to the end-user. It makes it possible to identify the tests of the DT that are the most sensitive to a change in the input data. Contrary to the trace, this list of tests is relevant to explain the particular classification of a case, since if the tests of the list are no longer verified, the class changes.
The paper is organized as follows: Section 2 presents the drawbacks of the trace as an explanation support. Simple geometric examples show why these drawbacks cannot be bypassed by any processing of the trace. The same examples suggest a geometric method to identify more relevant tests to describe the situation of a case. Section 3 presents the geometric sensitivity analysis method, some interesting properties of the sensitive tests (uniqueness, robustness, ordering relation) and general results. Section 4 focuses on one example and studies the role of the metric. Possible complementary viewpoints are discussed in the concluding section.
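The geometric idea can be sketched in a few lines: when the decision surface is made of pieces of hyperplanes (as for the linear decision trees considered in the next section), the test most sensitive to a change in the input is the one whose boundary lies closest to the case, since the smallest perturbation that can change the class crosses that boundary first. The hyperplanes and the point below are hypothetical, chosen for illustration; the sketch also ignores the fact that the surface pieces are bounded regions, not full hyperplanes.

```python
import numpy as np

def signed_distance(p, w, b):
    """Algebraic (signed) distance from point p to the hyperplane {x : w.x + b = 0}."""
    return (np.dot(w, p) + b) / np.linalg.norm(w)

def most_sensitive_test(p, boundaries):
    """Among decision-surface pieces, each approximated by a hyperplane (w, b),
    return the index of the closest one and its distance to p: the smallest
    change of the input that can flip the class crosses this boundary first."""
    dists = [abs(signed_distance(p, w, b)) for w, b in boundaries]
    return int(np.argmin(dists)), min(dists)

# Hypothetical boundaries in a 2D input space, for illustration only.
boundaries = [(np.array([1.0, 0.0]), -3.0),   # the line x = 3
              (np.array([0.0, 1.0]), -1.0)]   # the line y = 1
idx, d = most_sensitive_test(np.array([2.0, 1.4]), boundaries)
# The case sits 1.0 away from x = 3 but only 0.4 away from y = 1,
# so the test on the second boundary is the most sensitive one.
```

Presenting the boundaries ordered by this distance, rather than in trace order, is what allows the list of tests to be pruned to the few that actually constrain the class.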
2 LIMITS OF THE TRACE AS AN EXPLANATION SUPPORT
Software that integrates decision tree algorithms generally allows the user to visualize the trace of the classification of a new case. But the trace is not easy to read, all the more so as it grows in size. Moreover it has drawbacks similar to those of the trace of reasoning in rule-based systems (see [5]), since it is easy to translate a decision tree into an ordered list of rules (by following every path from the root to the different leaves). In fact, work on traces of reasoning eventually moved toward reconstructive explanation [14], [18], [9]. The following examples illustrate why the trace cannot provide the end-user with relevant information about the case. We consider binary linear decision trees (LDT): a test consists in computing the algebraic distance h of a new case (the point P) to a hyperplane H. The point P passes the test depending on the sign of h(P, H). So the area classified by a leaf is the intersection of halfspaces E(H). The tree induces a partition of the input space, and we call the decision surface the union of the boundaries of the different areas corresponding to the different classes. In the case of an LDT, it consists of pieces of hyperplanes. Figures 2 and 3 show examples of partitions induced by the trees in Figure 1. We consider the trace of the classification given by the trees for several points. DT1 classifies P1 at the first test, so the trace of the