Verification and Validation of Knowledge-Based Systems



SLIDE 1

Verification and Validation of Knowledge-Based Systems

Prepared by Dr Ahmed Rafea

SLIDE 2

Survey on KBS Development

SLIDE 3

V&V Definition

SLIDE 4

WHY IS V&V OF KS DIFFICULT?

  • Knowledge Representation Formalisms
  • Evolving and Large KBs
  • Characteristics of KBs
  • Domain Experts in V&V
  • KS Anomalies

– For instance, a rule-based KB may have potential inconsistencies, incompleteness, circularity, or redundancies among its rules.

  • Testing of KS

– Testing Criteria
– Difficulties in Generation of Test Case Inputs
– Difficulties in Generation of Test Case Outputs
– Input and Output Spaces for Selection of Test Cases Can Be Huge
– High Costs of Testing

SLIDE 5

V&V TECHNIQUES

  • Broadly speaking, V&V techniques can be categorized into two groups:

– static methods (analysis)
– dynamic methods (testing)

SLIDE 6

Static Methods

  • Static V&V methods can have different objectives, such as detecting completeness and consistency faults and proving the correctness of programs. Techniques in this category are:

– informal (reading/reviews, inspections, and walkthroughs)
– semiformal checks, such as the type-checking performed by compilers
– formal techniques (axiomatic mathematical proofs)

  • Some recent attempts include informal analysis, such as the checklist approach, and formal analysis, such as the assertional approach and the object-oriented specification approach.

SLIDE 7

Dynamic Methods

  • Dynamic V&V methods require the execution of a system through the use of test suites.

  • Test cases can be derived from either a functional or a structural viewpoint.

  • In functional testing, also known as black-box testing, a program is treated as a black box.

  • Structural testing, also known as white-box testing, constructs the test cases based on the implementation details.

SLIDE 8

A Structured Testing Methodology

  • Verification

– Domain Knowledge Verification
– Inference Layer Verification

  • Validation

– Test Case Generation Methods
– Judging system acceptability
– Regression Testing

SLIDE 9

Domain Knowledge Verification(1)

  • The domain knowledge verification process detects most of the coded KB errors. The verification process can be divided into three phases:

– Consistency checking phase
– Completeness checking phase
– Path checking phase

SLIDE 10

Domain Knowledge Verification(2)

SLIDE 11

Consistency checking

  • The consistency checker works on the relations between expressions of the domain layer, one relation at a time.

  • Inconsistencies in the KB appear as:

– undefined objects,
– undefined attributes,
– undefined attribute values,
– duplicate rule pairs,
– conflicting rule pairs, and
– subsumed rule pairs.
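
The three rule-pair checks can be sketched as follows. This is only a minimal illustration: the `Rule` representation (a set of attribute-value conditions plus one attribute-value conclusion) and the function names are assumptions made for the example, since the slides do not describe the checker's actual data structures or algorithm.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Rule:
    # Hypothetical rule shape: IF all conditions hold THEN conclusion.
    name: str
    conditions: frozenset  # set of (attribute, value) pairs
    conclusion: tuple      # one (attribute, value) pair


def classify_pair(a, b):
    """Return 'duplicate', 'conflict', 'subsumed', or None for a rule pair."""
    same_conclusion = a.conclusion == b.conclusion
    same_attribute = a.conclusion[0] == b.conclusion[0]
    if a.conditions == b.conditions:
        if same_conclusion:
            return "duplicate"
        if same_attribute:
            return "conflict"  # same premises, different value concluded
        return None
    # Same conclusion, and one condition set strictly contains the other:
    # the more specific rule is subsumed by the more general one.
    if same_conclusion and (a.conditions < b.conditions or b.conditions < a.conditions):
        return "subsumed"
    return None


def check_rule_pairs(rules):
    """Check every pair of rules in one relation and list the anomalies."""
    findings = []
    for a, b in combinations(rules, 2):
        kind = classify_pair(a, b)
        if kind is not None:
            findings.append((a.name, b.name, kind))
    return findings


rules = [
    Rule("r1", frozenset({("leaf", "yellow")}), ("disorder", "deficiency")),
    Rule("r2", frozenset({("leaf", "yellow")}), ("disorder", "deficiency")),
    Rule("r3", frozenset({("leaf", "yellow")}), ("disorder", "virus")),
    Rule("r4", frozenset({("leaf", "yellow"), ("stem", "soft")}), ("disorder", "deficiency")),
]
findings = check_rule_pairs(rules)
```

Comparing every pair is quadratic in the number of rules, which is tolerable here because the checker works on one relation at a time.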

SLIDE 12

Creating the relation between expressions table

  • The relation between expressions table contains the needed information about all the relations between expressions in the KB.

  • The basic idea behind constructing this table is to accelerate searching for any defined attribute in the KB, an operation that is heavily used in subsequent phases.

  • This table consists of the following fields:

– Relation name: The name of the relation between expressions as defined in the KB.
– Input attribute: The names of the object-attribute pairs given in the rule antecedents.
– Output attribute: The names of the object-attribute pairs given in the rule consequents.
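
A table with these three fields could be built along the following lines. The input format (a dictionary mapping each relation name to its rules, each rule given as antecedent and consequent object-attribute pairs) and all names in the sketch are assumptions for illustration, not the tool's real interface.

```python
from dataclasses import dataclass, field


@dataclass
class RelationEntry:
    relation_name: str
    input_attributes: set = field(default_factory=set)   # pairs from antecedents
    output_attributes: set = field(default_factory=set)  # pairs from consequents


def build_relation_table(kb):
    """kb maps relation name -> list of (antecedent_pairs, consequent_pairs)."""
    table = {}
    for relation_name, rules in kb.items():
        entry = RelationEntry(relation_name)
        for antecedents, consequents in rules:
            entry.input_attributes.update(antecedents)
            entry.output_attributes.update(consequents)
        table[relation_name] = entry
    return table


def relations_using(table, pair):
    """Names of the relations in which an object-attribute pair occurs."""
    return sorted(
        name for name, entry in table.items()
        if pair in entry.input_attributes or pair in entry.output_attributes
    )


# Hypothetical two-relation knowledge base.
kb = {
    "diagnose": [([("plant", "leaf_color")], [("plant", "disorder")])],
    "treat": [([("plant", "disorder")], [("plant", "treatment")])],
}
table = build_relation_table(kb)
```

Once the table exists, later phases can look up an attribute without rescanning the rules.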

SLIDE 13

Completeness checking

  • As the number of rules grows large, it becomes impossible to check every possible path through the system. Four situations indicate gaps in the knowledge base:

– unused attribute values,
– missing rules,
– unfireable rules, and
– unused consequences.
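
Three of these four indicators lend themselves to simple set comparisons, sketched below. The definitions used are assumed interpretations, since the slides do not define the terms: a value is "unused" if it is declared but never appears in any rule, a rule is "unfireable" if one of its conditions tests an attribute that is neither a user input nor concluded by any rule, and a consequence is "unused" if its attribute is neither a goal nor tested by any condition. Detecting missing rules is not covered.

```python
def completeness_report(legal_values, rules, inputs, goals):
    """
    legal_values: {attribute: set of declared values}
    rules: list of (name, conditions, conclusion); conditions is a list of
           (attribute, value) pairs, conclusion is one (attribute, value) pair
    inputs: attributes supplied by the user
    goals: attributes that are final outputs of the KB
    """
    used = {cond for _, conds, _ in rules for cond in conds}
    concluded = {concl for _, _, concl in rules}
    condition_attrs = {attr for attr, _ in used}
    concluded_attrs = {attr for attr, _ in concluded}

    unused_values = sorted(
        (attr, value)
        for attr, values in legal_values.items()
        for value in values
        if (attr, value) not in used and (attr, value) not in concluded
    )
    unfireable = sorted(
        name for name, conds, _ in rules
        if any(attr not in inputs and attr not in concluded_attrs for attr, _ in conds)
    )
    unused_consequences = sorted(
        concl for concl in concluded
        if concl[0] not in condition_attrs and concl[0] not in goals
    )
    return {
        "unused_values": unused_values,
        "unfireable_rules": unfireable,
        "unused_consequences": unused_consequences,
    }


# Hypothetical example data.
report = completeness_report(
    legal_values={"leaf": {"yellow", "green"}, "soil": {"dry"}, "disorder": {"deficiency"}},
    rules=[
        ("r1", [("leaf", "yellow")], ("disorder", "deficiency")),
        ("r2", [("pest", "present")], ("note", "spray")),
    ],
    inputs={"leaf", "soil"},
    goals={"disorder"},
)
```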

SLIDE 14

Path checking

  • Path checking consists of detecting:

– Circular paths. A circular path is detected when an attribute appears as an input attribute of one relation and as an output attribute of another relation, and a path connecting the other ends of these two relations exists.
– Redundant paths. A redundant path is found when it is possible to reach the same conclusion from the same inputs through different paths.

  • These paths are detected from a graph data structure. This graph links the input attributes to the output attributes using the relations between expressions.
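
Both checks can be illustrated on a small attribute graph. The sketch below assumes each relation is given as a pair of input and output attribute sets and uses a depth-first search for circular paths and plain path counting for redundant paths; these are standard graph techniques chosen for the example, not necessarily the ones the described tool uses.

```python
from collections import defaultdict


def build_graph(relations):
    """relations: {name: (input_attributes, output_attributes)} -> adjacency sets."""
    graph = defaultdict(set)
    for inputs, outputs in relations.values():
        for src in inputs:
            for dst in outputs:
                graph[src].add(dst)
    return graph


def find_cycle(graph):
    """Return one circular path as a list of attributes, or None."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = defaultdict(int)
    stack = []

    def visit(node):
        color[node] = GREY
        stack.append(node)
        for nxt in sorted(graph.get(node, ())):
            if color[nxt] == GREY:  # back edge: nxt is on the current path
                return stack[stack.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for node in sorted(graph):
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


def count_paths(graph, src, dst):
    """Number of distinct paths from src to dst; assumes the graph is acyclic."""
    if src == dst:
        return 1
    return sum(count_paths(graph, nxt, dst) for nxt in graph.get(src, ()))


acyclic = build_graph({
    "r1": ({"a"}, {"b"}), "r2": ({"b"}, {"d"}),
    "r3": ({"a"}, {"c"}), "r4": ({"c"}, {"d"}),
})
circular = build_graph({"r1": ({"x"}, {"y"}), "r2": ({"y"}, {"x"})})
```

A path count above one between an input and a conclusion flags a candidate redundant path. Because edges are stored per attribute pair, two relations with identical endpoints collapse into one edge, so a fuller version would label edges with relation names.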

SLIDE 15

Inference Layer Verification

SLIDE 16

Step checker phase

  • The main function of the step checker is:

– detecting inference step consistency errors

  • An inconsistency arises when the input-role or output-role refers to a data element that is not defined in any relation between expressions of this inference step.

SLIDE 17

Inference checker phase

  • The inference checker works on the input/output roles of the inference layer.

  • Since each inference step has a defined input-role and output-role, each output-role should either be an input-role to the following inference step or be the last output.
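
The chaining rule above can be checked mechanically. The sketch assumes the inference layer is available as an ordered list of steps, each with a name, a list of input-roles, and a list of output-roles; that representation and the step names are hypothetical.

```python
def check_role_chain(steps):
    """
    steps: ordered list of (step_name, input_roles, output_roles).
    Returns (step_name, role) for every output-role that is not consumed
    by the following step. Outputs of the last step count as final outputs.
    """
    dangling = []
    for index, (name, _inputs, outputs) in enumerate(steps[:-1]):
        next_inputs = set(steps[index + 1][1])
        for role in outputs:
            if role not in next_inputs:
                dangling.append((name, role))
    return dangling


# Hypothetical three-step inference structure with one broken link.
steps = [
    ("abstract", ["observations"], ["findings"]),
    ("match", ["findings"], ["hypotheses"]),
    ("refine", ["candidates"], ["diagnosis"]),
]
dangling = check_role_chain(steps)
```

Here `match` produces `hypotheses`, but `refine` expects `candidates`, so the pair is reported.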

SLIDE 18

Task Layer Verification

  • The task applies the inference steps defined in the inference layer.

  • The task's dynamic input role must be consistent with the input role of the first inference step or sub-task applied.

  • The task's dynamic output role must be consistent with the output role of the last inference step or sub-task applied.

SLIDE 19

Validation

  • There are three main concerns in the proposed validation process:

– First, determine the appropriate method to automatically generate test cases for each KB component.
– Second, human experts should judge the KB’s solution.
– Third, when a KB component is modified, it should be re-tested (regression testing).

SLIDE 20

Test Case Generation Methods

  • We have to define two types of tests for each KB component: an automatic test and a developer test.

  • Automatic test: Test cases are automatically generated in the form of input concept-attribute pairs and their suggested values.

– The tester provides both the knowledge engineer and the domain expert with a list of test cases. This list serves two functions.
– First, the knowledge engineer compares the test cases with the requirements specification to check that the two are consistent (the test cases thus serve the verification activity).
– Second, the domain expert determines whether the test cases are valid.

  • Developer test: With this test the developer can randomly test the system and can run different KB components independently.

– A screen holding the input concept-attribute pairs used in the KB component is automatically generated and displayed.
– This screen contains the possible legal values (in the case of a nominal attribute) or the boundaries (in the case of a numerical attribute) for each attribute.
– Thus, the developer can supply any combination of values, run the KB component, and observe the result.
– The presence of a domain expert enriches this test, since the expert can apply different combinations according to his or her expertise.

SLIDE 21

Judging system acceptability(1)

  • It is always difficult to define a standard against which to judge the acceptability of the system. There is no such standard; instead, a so-called agreement method must be employed, where the performance of the system is compared with that of other performers.

  • Agreement should be measured using some principled approach. The most well-known agreement method is based on Turing's famous test for intelligence.

  • Test cases are given to human experts, who are asked to determine the outcome. If the experts agree on the outcome for a sufficient share of the test cases (85%, for example), the tested component is accepted. This continues until a complete system is provided.
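
The acceptance rule can be expressed as a simple agreement rate. The sketch below assumes that agreement means the expert's outcome matches the system's outcome for a test case and that the threshold applies to the fraction of matching cases; the slides leave the exact measure open, and the function names are illustrative.

```python
def agreement_rate(system_outcomes, expert_outcomes):
    """Fraction of test cases on which the expert outcome matches the system's."""
    if not system_outcomes or len(system_outcomes) != len(expert_outcomes):
        raise ValueError("need two non-empty outcome lists of equal length")
    matches = sum(s == e for s, e in zip(system_outcomes, expert_outcomes))
    return matches / len(system_outcomes)


def is_accepted(system_outcomes, expert_outcomes, threshold=0.85):
    """Accept the component when agreement reaches the threshold (85% by default)."""
    return agreement_rate(system_outcomes, expert_outcomes) >= threshold


# Hypothetical outcomes for 20 test cases: the expert disagrees on two of them.
system = ["irrigate"] * 20
expert = ["irrigate"] * 18 + ["wait"] * 2
accepted = is_accepted(system, expert)
```

With several experts, one option would be to compute the rate per expert and require each, or their average, to reach the threshold; which variant applies is a project decision.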

SLIDE 22

Judging system acceptability(2)

  • A second step is to let a group of human experts anonymously examine the outputs obtained from the system and the outputs obtained by other humans for the same cases. If the third-party experts cannot distinguish between the problem-solving abilities of the system and those of the other humans, then the system is deemed to be acceptable.

  • Third, the system should be monitored in the real environment for 3 months, and feedback on system effectiveness and the user interface is to be assessed.

  • The system is then evaluated in a classroom setting consisting of the domain expert and end users. The expert assesses the correctness of the KB, the quality of the explanation, and the quality of the answers. The user assesses his/her ability to interface with the system, the timeliness of the response, the reasonableness of the output and explanation, and how the system fits in with the operating environment.

SLIDE 23

Regression testing

  • Regression testing in its most basic form is simply testing done to determine whether a product has regressed to a less functional state than in the previous build.

  • It is common practice to evaluate KBS acceptance every six or twelve months to keep track of whether its competence and accuracy remain high.

  • Strict regression testing requires that all previous test cases be reapplied.

  • This is of course an expensive overhead to impose on a testing scheme. We adopt it by keeping a database of the old test cases that have been applied to each KB component.

  • When a change is applied to any component, the component is re-tested and a comparison between the old and the new cases takes place.

  • The result is fed back to the developer to record the effect of the modification.
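
A minimal sketch of this re-test-and-compare loop is shown below. It assumes the stored test cases are kept as pairs of input values and the previously recorded output, and that a KB component can be called as a function; the component, attribute names, and storage format are all hypothetical.

```python
def regression_report(stored_cases, run_component):
    """
    stored_cases: list of (inputs, old_output) recorded from earlier runs.
    run_component: callable that runs the modified KB component on inputs.
    Returns the cases whose output changed after the modification.
    """
    changed = []
    for inputs, old_output in stored_cases:
        new_output = run_component(inputs)
        if new_output != old_output:
            changed.append({"inputs": inputs, "old": old_output, "new": new_output})
    return changed


# Hypothetical component after a modification: the threshold moved from 30 to 25.
def irrigation_advice(inputs):
    return "irrigate" if inputs["soil_moisture"] < 25 else "wait"


stored_cases = [
    ({"soil_moisture": 20}, "irrigate"),
    ({"soil_moisture": 28}, "irrigate"),
    ({"soil_moisture": 40}, "wait"),
]
changes = regression_report(stored_cases, irrigation_advice)
```

The returned list is what would be fed back to the developer: an empty list means the modification did not alter any previously recorded result.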

SLIDE 24

Automatic Test Case Generation techniques (Rules)

  • The Generic Testing Method (GTM) creates three kinds of cases for each rule condition:

– the case where the condition is exactly true and false;
– two cases where the condition is “minimally” true and false, respectively; and
– two cases where the condition is “extremely” true and false.

  • The exponential increase in the number of test cases required affects the performance of the testing process.

  • Considering that software engineering suggests the optimum tested values to be valid data, invalid data, and default values, one approach suggests producing two cases for a condition: one exactly true and the other minimally false.

  • Nonnumeric attributes are treated as a special case of the = operator test.

  • Thus, for a rule cluster that contains n rules, the number of generated cases is 2*n.
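
The reduced two-case scheme can be sketched for numeric conditions as follows. The slides do not spell out how "exactly true" and "minimally false" values are chosen, so this example assumes one reading: the true case is the value nearest the boundary that satisfies the condition, and the false case is the nearest value that violates it, using a hypothetical `step` as the smallest meaningful increment. Each rule is assumed to have a single condition, which matches the 2*n count.

```python
import operator

OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt,
       "<=": operator.le, "=": operator.eq}


def two_cases(op, threshold, step=1):
    """Return (true_value, false_value) for the condition `attribute op threshold`."""
    if op == ">":
        return threshold + step, threshold
    if op == ">=":
        return threshold, threshold - step
    if op == "<":
        return threshold - step, threshold
    if op == "<=":
        return threshold, threshold + step
    if op == "=":
        return threshold, threshold + step
    raise ValueError(f"unsupported operator: {op}")


def generate_cases(rules, step=1):
    """rules: list of (rule_name, attribute, op, threshold), one condition each."""
    cases = []
    for name, attribute, op, threshold in rules:
        true_value, false_value = two_cases(op, threshold, step)
        cases.append((name, {attribute: true_value}, True))
        cases.append((name, {attribute: false_value}, False))
    return cases


# Hypothetical rule cluster with n = 2 rules.
rules = [("r1", "temperature", ">", 30), ("r2", "ph", "<=", 7)]
cases = generate_cases(rules)
```

For a nominal attribute, the same idea would take the tested value as the true case and any other legal value as the false case.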

SLIDE 25

Automatic Test Case Generation techniques (Inference and Task)

  • Use the test cases generated for domain models as the test cases for the inference step that uses the model.

  • Use the test cases generated for an inference step as the test cases for the task.