
MORPHOLOGICAL EXPLANATION IN EXPERT SYSTEMS

I. ALVAREZ

CEMAGREF, Electronics and Artificial Intelligence Laboratory, BP 121, F-92185 Antony, FRANCE; LAFORIA, University of Paris VI, F-75005 Paris, FRANCE. Tel: (33-1) 40 96 61 77, Fax: (33-1) 40 96 60 80.
Topic: Principles of AI Applications. Category: ECAI-92 Workshop "Improving the use of Knowledge-based systems with explanations", 1992, pp 75-81.


Abstract
Knowledge-based systems are able to provide an explanation of their reasoning automatically, generally with the help of an additional unit. However, their abilities are still limited compared to human "natural" explanatory capabilities. Human explanation is very rich and depends on several parameters that are not yet mastered. In this paper we present a type of explanation centered not on the reasoning but on the case to explain. It consists in studying the behavior of the system for input data slightly different from the initial set. Analyzing the results given by the system makes it possible to define relevant directions relative to the case to explain, by analogy with the mathematical notion of gradient. These directions can be used to produce explanations, not in a logical or chronological way, but in a geometrical way. This morphological explanation leads us to reconsider how to take into account the user's knowledge and the explanatory semantics of the domain.

Introduction
Explanation of reasoning has always been considered one of the main advantages of expert systems over classical algorithmic programs, for which the trace of the computation has to be specified. However, it is well known that the trace of the reasoning can be very far from what a human being calls an explanation. Since expert systems claim to be standard development tools, some weaknesses of trace-based explanation have been corrected, mainly by using model-based systems or several reasoning modules. Part of the implicit knowledge is made explicit in this way, and it becomes easier to define different levels of abstraction. But some major defects remain, although the explanation task is now generally treated separately from the resolution task: main points are still extracted from details through the process of reasoning (although this may not be relevant); the explanation has to be generated in natural language, even when pieces of knowledge are expressed in a quasi-natural language; and so on. This lack of explanation ability is as important an obstacle to the operational use of expert systems as rule-base updating, validation and real-time problems. The explanation abilities of expert systems do not currently seem to live up to their promising beginnings.
The complexity of explanatory discourse between human beings certainly accounts for the difficulties encountered by automatic explanation. In a first part we recall the general characteristics of human explanation and relate them to the solutions applied in expert systems. In a second part we focus on the notion of the case to explain. Explanation is often restricted to explanation of reasoning, and more precisely to justification, which is only one form of explanation, alongside context description, comparison, and so on; we propose to consider the explanation issue from the point of view of the case to explain rather than from that of the reasoning. We defend the need for selection criteria between the elements of the explanatory discourse, criteria that depend primarily on the case to explain and comparatively little on the other parameters of explanation.
The purpose of explanation is to make the user understand the result given by the expert system; explanation of reasoning generally makes the user understand how the result was obtained. It does not directly explain why this result is the "good" one. In a third part we propose a type of explanation which focuses on the validity of the result, no matter how it was deduced. Morphological explanation, as we call it, is not based on the trace of reasoning. It consists in a local study of the behavior of the expert system in a small neighborhood of the case to explain. It proposes explanations in a geometrical way rather than in a logical way. This type of explanation has been applied to SCORPIO, a diagnosis expert system for diesel engines, and in a fourth part we describe the type of explanation we can provide.

  • 1. "Natural" and automatic explanations

Explanation is a task usually performed in everyday life, in many ways: justification, dispute, teaching, decision making, and so on. It is part of common sense, which is so difficult to capture in models. This is certainly a main reason for the dissatisfaction of end-users faced with automatic explanations [Schank], whatever efforts are made to approximate human explanation, and it is not only a problem of natural language.
1.1. Natural explanation characteristics
Natural explanation is very rich, adjusted to each case, and depends on contextual parameters. Many types of explanation are used in free discourse and selected through a process that involves at least the search for the user's needs, the choice of a strategy, the choice of a level of abstraction, and the choice of the factual elements to emphasize. The main strategies of explanation are paraphrase, justification, proof by contradiction, example (and counter-example), analogy, and comment. These strategies apply to both positive and negative natural explanation. The choice of a strategy and of the content of an explanation depends principally on several parameters:

  • the object of the explanation; human beings use different ways to explain different objects, for instance a scientific statement like a theorem in mathematics, a political decision, a behavior, a feeling, and so on.

  • the author's purpose; "making the interlocutor understand" is a rather imprecise definition, because the degree of understanding varies continuously, and the psychological context is also important. The author's purpose may be:
    to deepen the understanding of one or both interlocutors; this case includes free discussion, where the function of the explanation is to make new elements emerge, either by reorganizing facts that are already known or by going deeper into the analysis;
    to convince the hearer; the speaker wants him to accept a statement, right or wrong (when it is wrong on purpose, he wants to deceive his interlocutor);
    to teach; the purpose of the explanation in this case is to pass on knowledge.

  • the recipient's representation; an explanation differs depending on the supposed level of knowledge of the recipient, on his reactions (including questions, gestures, nods, ...), and on his relationship with the author.

  • the degree of interactivity; the content of an explanation differs deeply depending on the means of communication (oral, written, visual, ...) and on the possibility of exchanging information.

1.2. Automatic explanation processing
From the point of view of form, natural explanation uses (when possible) means of communication other than language, mainly gestures and graphical illustrations. Automatic explanation is necessarily different: the user's perception is acquired through filtered text or the selection of pre-defined screen areas, which gives very little information. On the other hand, communication through a screen gives a completely different form to the explanatory discourse, and the graphical possibilities of computers may offer new means of expression. In any case, the problem of the form of explanation is part of the more general problem of man-machine communication, and our interest here is the substance of explanation.
Independently of the problems of understanding and generating natural language, it is presently impossible to take into account all the parameters involved in natural explanation. Work on explanation is centered on particular fields (diagnosis, debugging, theorem proving, ...), and the explanation purpose is generally supposed to be teaching (ITS) or knowledge transfer (expert to novice) [Wognum], with a high degree of interactivity.

Finding out the user's needs is a very complicated problem, since the user's representation is very hard to manage: it implies building and updating a model of the interlocutor, and the construction and use of such models are still difficult [Goguen], [Wiener]. In practice it is easier to design a system for users whose knowledge is above a minimum level. This excludes ITS and knowledge transfer, but it is the case for decision-support systems (including diagnosis).
Concerning the choice of a level of abstraction, the use of the trace of reasoning in rule-based systems presents well-known drawbacks [Clancey1] [Chandrasekaran]. The structuring of the knowledge used by the system and of the reasoning process is a good way to organize the trace of reasoning: it corresponds to the decomposition of a problem into sub-problems with a specific type of reasoning, specific knowledge or level of detail. Splitting the trace in this way is therefore relevant to the user and well accepted in an interactive mode [Haziza]. But if some fields or purposes are very interactive (debugging, on-line diagnosis, knowledge transfer, ...), others are less so. In that case users are interested in a global explanation that brings out the specific characteristics of their case [Swartout]. In this context the structuring of the trace by the reasoning is not necessarily a good way to find the right level of abstraction.
Regarding strategies, most explanation work deals with justification, because it is based on the trace of reasoning. Explanations often try to answer the following questions [Clancey2], [Swartout], [Safar], [David&Krivine]:
  • how was a result deduced? (justification of the result)
  • why was a result not deduced? (justification in negative explanation)
  • in what way is a particular fact important in the reasoning process? (justification of a step)


However, the purpose of explanation is not justification [Paris&Wick]. Nor is it to convince the user that the result is good (it may be wrong). The purpose of explanation is to make the user understand why a particular result was deduced, and thereby to convince him of the validity of this result. For instance, if a system explains itself correctly, the user may be able to understand that the result given by the system is false, and why (that is, what was false in the reasoning of the system, or what piece of knowledge is missing or badly used).
In this perspective, other strategies, like example-based explanation, analogy [Mittal&Paris], or comment, have to be examined. Concerning the choice of the elements to present to the user (relative to a specific strategy and level of abstraction), it remains a difficult question, particularly in the case of global explanation [Kassel]. As regards strategy and the choice of elements to emphasize, we propose a new type of explanation that makes it possible to define relevant elements bound to the case to explain (and comparatively not to the strategy).

2. Definition of object-dependent criteria for explanation

The explanation process generally focuses on the result to explain and on the line of reasoning. Most of the difficulties come from the fact that the line of reasoning has to be correctly placed back in the context of the set of potential lines of reasoning. In this way it is possible to make the user understand the strategy followed by the system (and even to understand it when he does not agree). Without criteria for choosing the important points in a line of reasoning, it is very hard to select the few relevant elements to present to the user. In the case of global explanation, the issue is not to summarize the reasoning, but to point out the most relevant factors regarding the situation. This brings us to the definition of "the case to explain".
Definition: the case to explain consists of the pair (set of initial data, result given by the system for these data).
This definition can be used for many systems (functional systems) that associate a result with a vector of input data. Studying the behavior of the system near the case to explain gives information on the validity of the result. It makes it possible to find privileged directions in the input space, along which the result changes the most rapidly (figure 1).

figure 1: relevant points and directions
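To make the gradient analogy concrete, the following is a hedged formal sketch in our own notation (the paper only uses the analogy informally): the system is viewed as a function f from the input space to the set of results, and a relevant direction is one along which the result changes for the smallest displacement from the case to explain.

% Hedged formalization; the notation is ours, not the paper's.
% E is the input space, C the set of possible results (classes).
\[
  f : E \to C, \qquad \text{case to explain} = (X_0,\ f(X_0))
\]
% A unit direction u is relevant at X_0 if the result changes for the
% smallest possible displacement along u:
\[
  u^{*} = \arg\min_{\|u\|=1}\ \inf\{\, t > 0 \ :\ f(X_0 + t\,u) \neq f(X_0) \,\}
\]
% In the smooth, real-valued setting this is the direction of the gradient,
% since f(X) \approx f(X_0) + \nabla f(X_0)\cdot(X - X_0) near X_0.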


Definition: these directions, and the corresponding points for which the result changes (when they can be found), are called relevant.
They are interesting for several reasons:
  • relevant directions extend the notion of sensitivity; if it is necessary to test the sensitivity of a result when data are inaccurate or uncertain [Dubois&Prade], it can be useful to do so in any case, because sensitivity is an objective piece of information that depends "only" on the case to explain;
  • relevant points and directions are found independently of the type of reasoning (and, if the result is correct, independently of the system: any other system should conclude with the same result for all these points). They are defined only by the structure of the input space and the topology induced by the system (which depends on the nature of the problem). In this sense they are objective. They are strongly determined by the case to explain and are therefore very meaningful for the explanation concern; in particular, they should be useful whatever strategy is applied;
  • in the case of a global trace, they can be used to provide a comment on the result. Since they are not (directly) bound to the reasoning process, they give a form-based explanation that is neither logical nor historical like the trace of reasoning. This kind of explanation may complement a justification, or be used as a filter to point out relevant elements.

3. Morphological explanation

The purpose of morphological explanation is to give information about the structure of the case to explain, in a geometrical approach. The method consists in a relaxation of the input data in order to find the "closest" input points for which the result of the system is different.
3.1. The notion of proximity in the input space
The definition of relevant points assumes that a distance can be defined on the input space. In practice, the definition of a proximity between two sets of data is sufficient. This definition has to be given in a constructive way, that is, as a declarative or algorithmic method to compute the sets of data close to a given input set.
Let us clarify the significance of the notion of distance. For a system that performs a classification, the results it gives belong to a set of classes: the system induces a partition of the input space. In the expert's view, all cases in a same class are very similar, because they lead to the same result. In this sense it would not be difficult to define a distance on the input space for which two sets of data are close if they give the same result. On the other hand, for someone totally ignorant of the domain, the input space is flat.

In a discrete space, the relevant distance is then a kind of Hamming distance*. The distance thus expresses, in a certain way, the user's level of knowledge. With regard to the explanation purpose, a distance is interesting when it integrates the properties of the input space that hold everywhere (global properties that are part of the definition of the domain, as opposed to local properties relative to the problem).
The notion of proximity therefore has to be as general as possible; it has to be defined relative to the domain, not to the specific problem the system tries to solve, nor to a particular case to explain. The distance must not miss general properties of the domain; otherwise the relevant directions would always be bound to these missing properties. For instance, when playing bridge, using the Hamming distance will always produce relevant directions bound to the exchange between small cards and honors, because hands that are close in Hamming distance may differ by a great number of points. In the bridge case (whatever the question: bid, answer, etc.) two hands are close if they differ only in small cards, as illustrated in the sketch below.
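As an illustration (our own sketch, not the paper's code), the Python fragment below contrasts a naive Hamming-like distance between bridge hands with a domain-aware proximity for which two hands are close only when they hold the same honors; the card encoding and the honor threshold are assumptions made for the example.

# Illustrative sketch: naive vs domain-aware proximity between bridge hands.
# A card is (rank, suit) with ranks 2..10 and 11..14 for J, Q, K, A.
HONOR = 11  # hypothetical threshold: J, Q, K, A count as honors

def hamming_like(hand1, hand2):
    """Number of cards present in one hand but not in the other."""
    return len(set(hand1) ^ set(hand2))

def domain_distance(hand1, hand2):
    """Hands are comparable only if they hold the same honors;
    the distance then counts the differing small cards."""
    honors1 = {c for c in hand1 if c[0] >= HONOR}
    honors2 = {c for c in hand2 if c[0] >= HONOR}
    if honors1 != honors2:
        return float("inf")  # not close: the honor content differs
    return len(set(hand1) ^ set(hand2))

# Swapping a small club for the ace of spades is a tiny change for the
# Hamming-like distance, but a huge one for the domain-aware proximity.
base = {(14, "H"), (13, "H"), (2, "C"), (5, "D")}            # truncated hand
swap_honor = (base - {(2, "C")}) | {(14, "S")}
swap_small = (base - {(2, "C")}) | {(3, "C")}
print(hamming_like(base, swap_honor), domain_distance(base, swap_honor))  # 2 inf
print(hamming_like(base, swap_small), domain_distance(base, swap_small))  # 2 2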

3.2. The search for relevant points and directions
Once the notion of proximity has been defined, the search for relevant points consists in relaxing the input data until there is a change in the result given by the system. When the input space is not discrete, a discretisation is necessary: the constructive definition of proximity is used to compute the neighbors of the case to explain.
We consider a set of initial conditions for the system as a point X in the multi-dimensional input space. If Xo represents the initial conditions of the case to explain, there are generally several points Xi near Xo for which the system answers differently from Xo. For each Xi we determine its closest neighbors Yij (with XoYij ≤ XoXi) for which the system gives the same result as for Xo. The points Yij lie on the frontier between two areas of the input space for which the system gives two different results. The resultant Ri of the vectors YijXi (from Yij to Xi) gives the elementary change in the input space that makes the result change near Xi. (Its direction is slightly different from that of XoXi because of the discretisation.)
figure 2: calculus of a relevant direction

* The Hamming distance between two boolean vectors is the number of bits that differ, e.g. d[(0 0 1), (0 1 1)] = 1.
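The following is a minimal sketch of this relaxation search in Python, under our own assumptions: points are tuples of numbers, system() is the expert system seen as a black box, neighbors() implements the constructive definition of proximity, and the Euclidean norm plays the role of the distance. Only the first level of relaxation is shown; the actual search keeps relaxing level by level until the result changes.

import math
from itertools import islice

def dist(a, b):
    """Euclidean distance between two input points (assumed numeric tuples)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def relevant_directions(system, neighbors, x0, max_cases=200):
    """Search for relevant points Xi and resultant directions Ri near x0.

    system(x)    -> result of the expert system for input x (black box)
    neighbors(x) -> iterable of the direct neighbors of x (constructive proximity)
    """
    r0 = system(x0)
    explored = list(islice(neighbors(x0), max_cases))
    relevant = []
    for xi in explored:
        if system(xi) == r0:
            continue                      # same result as Xo: not a relevant point
        # Yij: neighbors of Xi that are no farther from Xo than Xi
        # and for which the system gives the same result as for Xo
        yij = [y for y in neighbors(xi)
               if dist(x0, y) <= dist(x0, xi) and system(y) == r0]
        # Resultant Ri of the vectors Yij -> Xi: the elementary change
        # in the input space that makes the result change near Xi
        ri = tuple(sum(xi[k] - y[k] for y in yij) for k in range(len(xi)))
        relevant.append((xi, ri))
    return relevant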


The set of pairs (Xi, Ri) describes the behavior of the system near the case to explain. In the continuous case, if the system is represented by a sufficiently smooth function, the relevant directions are gradients. In fact, the method we describe amounts to explaining the result of a function with its differential [Alvarez1]. Relevant directions are thus characteristic of the case to explain and should be meaningful to the user.
3.3. The semantic terms of a domain
Relevant directions cannot be used as they stand to give an explanation to the user. Even if they are significant for the case to explain, they constitute an operational explanation, not a symbolic one [Varela]. To produce a symbolic explanation it is necessary to know the semantic terms that are valid for the field. These terms are used by the expert to formulate his own explanations and can be different from the terms and concepts used for the resolution [Clancey2]. They have to be gathered separately from the resolution knowledge, and the relevant directions must be described with these semantic terms.

4. Generating explanations for SCORPIO

4.1. Definition of proximity and semantics of explanation
SCORPIO is a diagnosis expert system for the engines of agricultural tractors. Sensors are set up on the tractor and record the operating points of significant magnitudes (power, pump output, engine output (SC), torque). These points are compared with references and analyzed, together with the operator's comments, to identify faults and to propose maintenance advice. About three faults are generally found for a tractor. Although the tractor is a physical system, it is presently not possible to use a model or a simulation of the engine to produce a diagnosis, so we have developed a rule-based expert system to make the diagnosis automatically [Alvarez2]. The aim of the system is to give a diagnosis as quickly as possible, so it is not very interactive.
Apart from the operator's comments, which deal with the history of the tractor, the input data of the system include about ten test and reference points on three curves. Some of the operator's comments vary continuously; for each of them, the expert gave the relevant relaxation step. For instance, for the number of working hours, a good relaxation step is the average number of working hours per year (about 300 hours). Concerning the test curves, only two curves can be shifted independently (the other ones are calculated). The expert defined the notion of proximity between two sets of curves by answering the question: "when can we say that two tests are close?". The direct neighbors of a curve are obtained by propagating a "wave" on three points, as in figure 3. Because of the reference, the direction that brings a curve closer to its reference is privileged (the average gap diminishes).
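As a rough illustration of these expert-given relaxation steps (only the 300-hour step for the working hours comes from the paper; the other feature name and value are invented), the direct neighbors of a case can be generated by shifting one continuous input at a time by its step. The "wave" neighbors of the test curves (figure 3) would need a more elaborate generator and are not sketched here.

# Hypothetical illustration: direct neighbors of a case obtained by shifting
# one continuous input at a time by its expert-given relaxation step.
RELAXATION_STEPS = {
    "working_hours": 300.0,  # average working hours per year (from the paper)
    "power_gap": 1.0,        # hypothetical step for the gap to the reference curve
}

def direct_neighbors(case):
    """Yield the direct neighbors of a case given as a dict feature -> value."""
    for feature, step in RELAXATION_STEPS.items():
        for sign in (+1, -1):
            neighbor = dict(case)
            neighbor[feature] = case[feature] + sign * step
            yield neighbor

# Example: the neighbors of a tractor case with 2500 working hours.
for n in direct_neighbors({"working_hours": 2500.0, "power_gap": 3.0}):
    print(n)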


figure 3: direct and indirect (level 2) neighbors of a curve
Concerning the semantic terms, the expert uses several specific expressions to explain a diagnosis, for instance:
  • "the engine transforms its fuel well": a relation in proportion between the power, the engine output and the pump output (∆power ≥ 2 ∆engine output; ∆pump output = ∆power + ∆engine output);
  • "the evolution of engine output and pump output is correlated": a relation between the variations of the gaps of both curves with respect to their references.
A relevant direction is translated into these terms when the corresponding relation is verified, applied to Xo (the input data of the case to explain), Xi (one of the closest initial conditions that change the result) and Ri (the resultant of the elementary changes near Xi).

4.2. Emphasizing explanation to point out the validity of the result
When explaining a result, the neighbors of the initial data are computed and the system gives the diagnosis for each of them. Since the resolution is quick (less than 3 seconds), a large number of cases can be explored in a reasonable time. In the definition of the notion of proximity, we have seen that a curve has six direct neighbors and that we generally use only the three of them that reduce the gap with the reference. Both modes are available, but when the reference is taken into account the explanation is of the kind "why is there a fault rather than nothing", that is, an answer from a repair point of view. On the contrary, when all the neighbors are scanned, the explanation answers a question of the kind "why this fault rather than another one"; it is an explanation from a diagnosis point of view.
Concerning the repair-based explanation, a case has 6 direct neighbors since two curves are independent. As an example, we consider a case with the diagnosis "pump advance badly tuned". It may not be the only fault recorded, but we explore the neighborhood of the case to explain until this diagnosis is no longer obtained. The time allowed for the exploration is one minute. Figure 4 presents the study of a real case. In the neighborhood of the set of initial conditions Xo, 13 cases are generated and analyzed in about 30 seconds. Near Xo there are 5 independent directions of variation, labelled a to e. The first configurations for which the system does not give the diagnosis "pump advance badly tuned" are the cases (c,b) and (b,b). They are level-2 configurations: (b) is their common neighbor, and (c) is a neighbor of the case (c,b) that also leads to the same result as Xo. These cases are described as follows:

  • (b,c): "∆ pomp output increases and ∆ SC increases in the same proportion"
  • (b,b): "∆ pomp output unchanged and ∆ SC unchanged"

Both cases verify the definition of correlation (and their neighbors do not). The explanation given by the system is then: «"pump advance badly tuned" because the pump output and SC curves are NOT correlated.» This comment is the result of a (geometrical) study of the validity of the diagnosis near the case to explain. It presents the facts shared by the two relevant points (because they have a common neighbor; otherwise there would have been two separate comments).
The aim of this comment is to make the user understand the answer of the system. If he does not agree, it is either because the comment should lead to a different result or because he needs more information. In the first case, other types of explanation must be tried that detail the reasoning, for instance a trace-based justification (at least the user will either understand the comment or correct the system). In the second case a more interactive system is necessary.


figure 4: study of a case to explain
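A minimal sketch, under our own assumptions, of the time-bounded exploration used above: neighbors of the initial case are generated level by level until configurations are found for which the target diagnosis is no longer given, or until the one-minute budget is exhausted. Here system() is assumed to return the set of faults diagnosed for a case, neighbors() is the proximity-based generator assumed in the earlier sketches, and cases are assumed hashable (e.g. tuples).

import time

def first_opposing_cases(system, neighbors, x0, target, budget_s=60.0):
    """Breadth-first exploration of the neighborhood of x0, level by level,
    until cases are found for which `target` is no longer diagnosed, or the
    time budget runs out. Returns the opposing cases and their level."""
    start = time.monotonic()
    level, frontier, seen = 0, [x0], {x0}
    while frontier and time.monotonic() - start < budget_s:
        level += 1
        next_frontier, opposing = [], []
        for x in frontier:
            for y in neighbors(x):
                if y in seen:
                    continue
                seen.add(y)
                next_frontier.append(y)
                if target not in system(y):   # diagnosis no longer obtained
                    opposing.append(y)
        if opposing:
            return opposing, level            # e.g. cases (c,b) and (b,b) at level 2
        frontier = next_frontier
    return [], level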


Conclusion
The type of explanation we have presented is very useful to lower the threshold of misunderstanding when a result is presented to the user. It is different from negative explanation, although it can propose comparisons with other results; the only results that are taken into account are those that can be reached with a slight change of the initial conditions. For the other results the method gives no information. On the other hand, since it is relatively independent of the resolution process, it can be applied to systems for which the trace is not available or useless [Alvarez&al].
This method assumes that it is possible to define a notion of proximity, and that high-level explanations are possible (this is not always the case, for instance in pattern recognition). The resolution process has to be quick, because the complexity of this explanation process is exponential. We are presently working on these questions to optimize the search for the first opposing case (the closest case for which the result changes). Another direction of work is to filter the trace of reasoning with the relevant points. This should be possible, because these points are significant relative to the case to explain.

Bibliography
[Alvarez&al] Alvarez I., Bochereau L., Bourgine P., Deffuant G. "Classifieurs multi-couches et explication", Applica 90, Lille, September 1990.
[Alvarez1] Alvarez I. "Explication comparative dans les systèmes experts", Proceedings of the XIth International Conference Expert Systems and their Applications, Second Generation E.S., Avignon, 1991, 173-184.
[Alvarez2] Alvarez I. "SCORPIO, an expert system for diagnosis of agricultural tractors", Agricultural Engineering, Berlin, 1990.
[Chandrasekaran] Chandrasekaran B., Tanner M.C., Josephson J.R. "Explaining control strategies in problem solving", IEEE Expert, Spring 1989, 9-25.
[Clancey1] Clancey W.J. "From Guidon to Neomycin and Heracles in 20 short lessons: ONR final report 1979-1985", AI Magazine, August 1986, 40-187.
[Clancey2] Clancey W.J. "The Guidon program", MIT Press, 1987.
[David&Krivine] David J-M., Krivine J-P., Tiarri J-P., Ricard B. "DIVA: système expert pour la surveillance vibratoire des groupes turbo-alternateurs", Convention IA 89, Paris, 1989, 539-557.
[Farreny&Prade] Farreny H., Prade H. "Explications de raisonnements dans l'incertain", Journées du PRC-IA sur l'Explication, March 1989.
[Goguen] Goguen J.A., Wiener J.L., Linde C. "Reasoning and natural explanation", International Journal of Man-Machine Studies, 1983, Volume 19, 521-559.
[Haziza] Haziza M. "DIAMS, an expert system shell for satellite fault isolation. The user's feedback", Proceedings of Human-Machine Interaction and AI in Aeronautics & Space, Toulouse, September 1990, 313-331.
[Kassel] Kassel G. "CQFE", Thèse de l'Université de Paris XI, December 1986.
[Mittal&Paris] Mittal V.O., Paris C.L. "Analogical explanation in the EES framework", Proceedings of the 5th Explanation Workshop, University of Manchester, 1990.
[Paris&Wick] Paris C.L., Wick M.R., Thompson W.B. "The Line of Reasoning versus the Line of Explanation", AAAI'88 Workshop on Explanation, 4-7.
[Safar] Safar B. "Le problème des explications négatives dans les systèmes experts: le système Pourquoi-Pas?", Thèse de l'Université de Paris XI, December 1987.
[Schank] Schank R.C. "Explanation Patterns: understanding mechanically and creatively", LEA, 1986.


[Swartout] Swartout W.R. "XPLAIN: a system for creating and explaining expert consulting programs", Artificial Intelligence, 1983, Volume 21, 285-325.
[Varela] Varela F. "Principles of biological autonomy", North Holland, 1979.
[Wiener] Wiener J.L. "Blah, a system which explains its reasoning", Artificial Intelligence, 1980, Volume 15, 19-48.
[Wognum] Wognum P.M., Mars N.J.I. "Why is explanation still limited in practice?", Proceedings of the IJCAI Workshop on Explanation Generation for KBS, 1991, 83-90.