Learning Retrieval Knowledge from Data (PowerPoint presentation)


SLIDE 1

Learning Retrieval Knowledge from Data

Helge Langseth

Norwegian University of Science and Technology, Dept. of Mathematical Sciences

Agnar Aamodt

Norwegian University of Science and Technology, Dept. of Computer and Information Science

Ole Martin Winnem

SINTEF Telecom and Informatics, Dept. of Computer Science

NTNU

Work partly performed within NOEMIE, ESPRIT project no. 22312 Participants: NTNU, SINTEF, Saga, JRC, Schlumberger, Matra, Acknosoft, Dauphine

SLIDE 2

NTNU

Slide no.: 2

Outline

  • Background / NOEMIE-project
  • CREEK
  • A data mining method
  • Integrating semantic networks with automatically generated network structures:
    – Problems with the semantics
    – Benefits
  • Initial empirical results
SLIDE 3

Data and User views

The Task Reality

SLIDE 4

Study of the task reality

[Figure: study of the task reality. Experience gathering yields past cases and general domain knowledge (the CBR path); data capturing fills a data warehouse (the DM path).]

SLIDE 5

An example case

case-16
  instance-of: case
  has-activity: tripping-in, circulating
  has-depth-of-occurrence: 5318
  has-task: solve-lc-problem
  has-observable-parameter: high-pump-pressure, high-mud-density-1.41-1.7kg/l, high-viscosity-30-40cp, normal-yield-point-10-30-lb/100ft2, large-final-pit-volume-loss->100m3, long-lc-repair-time->15h, low-pump-rate, low-running-in-speed-<2m/s, complete-initial-loss, decreasing-loss-when-pump-off, very-depleted-reservoir->0.3kg/l, tight-spot, high-mud-solids-content->20%, small-annular-hydraulic-diameter-2-4in, small-leak-off/mw-margin-0.021-0.050kg/l, very-long-stands-still-time->2h
  has-well-section-position: in-reservoir-section
  has-failure: induced-fracture-lc
  has-repair-activity: pooh-to-casing-shoe, waited-<1h, increased-pump-rate-stepwise, lost-circulation-again, pumped-numerous-lcm-pills, no-return-obtained, set-and-squeezed-balanced-cement-plug
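In code, such a case is simply a frame of attribute-value pairs. A minimal Python sketch (slot names and a few values are copied from the slide; the dict layout and the toy matcher are our own illustration, not Creek's actual representation):

```python
# A sketch of case-16 as a frame of attribute-value pairs.
# Slot names/values come from the slide; the dict layout and the
# helper below are illustrative, not Creek's actual API.
case_16 = {
    "instance-of": ["case"],
    "has-activity": ["tripping-in", "circulating"],
    "has-depth-of-occurrence": [5318],
    "has-task": ["solve-lc-problem"],
    "has-observable-parameter": [
        "high-pump-pressure",
        "high-mud-density-1.41-1.7kg/l",
        # ... remaining observables from the slide elided
    ],
    "has-well-section-position": ["in-reservoir-section"],
    "has-failure": ["induced-fracture-lc"],
}

def shared_findings(case_a, case_b, slot="has-observable-parameter"):
    """Count the observables two cases have in common (toy matcher)."""
    return len(set(case_a.get(slot, [])) & set(case_b.get(slot, [])))
```

A retrieval step could then rank stored cases by `shared_findings` against the new problem description.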

SLIDE 6

Initial design

[Figure: initial design. A Controller connects Data Mining (over the DW) and Case-based reasoning; inputs are user experiences, problem descriptions, and solutions.]
SLIDE 7

Tangled Creek Network

[Figure: a tangled Creek network. Entities such as thing, domain-object, case, car, van, case#54, electrical-fault, battery-fault, engine-fault, fuel-system-fault, broken-carburettor-membrane, test-procedure, engine-test, test-step, battery-low, starter-motor, engine-turns, ignition-key, diagnostic-case, diagnosis, diagnostic-hypothesis, wheel, vehicle, transportation, finding, observed-finding, and N-DD-234567 are connected by relations including subclass-of, instance-of, part-of, case-of, has-status, has-fault, has-function, tested-by, test-for, described-in, has-state, and the goals find-fault and find-treatment.]

Link legend:
  • hsc: has-subclass
  • hi: has-instance
  • hp: has-part
  • hd: has-descriptor
SLIDE 8

Suitable DM methods must be:

  • Able to generate structures from data, including a method for using (and updating) the domain expert's model
  • Able to learn new entities when exposed to new data
  • Expressive: limited models (like decision trees) are not suitable
  • Open for inspection, since our system performs explanation-driven CBR
  • Non-deterministic: as we work in open, weak-theory domains, we cannot expect a deterministic structure to capture the main effects
  • Semantically similar to a semantic network structure
  • Bayesian networks are our initial method of choice, although there are significant differences which impose some limitations on the integration
  • Other methods (e.g. ILP) are candidates for future activities
SLIDE 9

Bayesian networks (BN) Bayesian networks (BN)

Left: Alarm (A) is caused by earthquake (E) and burglary (B). Alarm is independent of radio (R) given E and B. Right: the degree of belief in A (and not A) given the state of E and B. E.g.: belief in A is 0.2 given E and not B (2nd row).

  • A computationally efficient representation of probability distributions via conditional independence among the attributes/states of a domain.
  • Has a qualitative part (below left), representing statistical dependence/independence statements. Can often be interpreted as a causal model among states.
  • Has a quantitative part (below right), representing conditional probability values for a specific state given one or more other states. Can be interpreted as a degree of belief in one state given other states.
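The alarm example can be made concrete in a few lines of Python. Only the entry P(A | E, not B) = 0.2 comes from the slide; the priors and the remaining CPT entries are made-up illustration values:

```python
# Toy version of the slide's alarm network: A caused by E and B.
# Only P(A=1 | E=1, B=0) = 0.2 comes from the slide; all other
# numbers here are made-up illustration values.
P_E = {1: 0.01, 0: 0.99}
P_B = {1: 0.02, 0: 0.98}
P_A_given_EB = {            # P(A=1 | E, B)
    (1, 1): 0.95,
    (1, 0): 0.2,            # the entry shown on the slide
    (0, 1): 0.9,
    (0, 0): 0.01,
}

def joint(a, e, b):
    """P(A=a, E=e, B=b) via the chain rule the structure licenses."""
    p_a = P_A_given_EB[(e, b)] if a else 1 - P_A_given_EB[(e, b)]
    return P_E[e] * P_B[b] * p_a

# The joint sums to 1 over all 8 configurations.
total = sum(joint(a, e, b) for a in (0, 1) for e in (0, 1) for b in (0, 1))
```

Summing the joint over all eight configurations gives 1, and conditioning on E=1, B=0 recovers the 0.2 from the table.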
SLIDE 10
[Figure: revised design. A Controller mediates between Data Mining and Case-based reasoning over the DW; the DM side splits into causal DM (BNs), which feeds knowledge-intensive CBR (Creek), and general DM (clustering, time series, etc.), which feeds "data driven" CBR. Inputs: user experiences, problem descriptions, solutions.]

Information flow:
1) Data preprocessing/cleaning
2) Structure learning and parameter tuning in the Bayesian network
3) Generation of similarity matrices etc.
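Step 3 of the information flow, generating similarity matrices, might be sketched as follows. The pairwise measure here (Jaccard overlap of feature sets) is a stand-in of our own; the actual system would derive similarities from the learned BN:

```python
# Sketch of step 3: building a case-by-case similarity matrix.
# The pairwise measure (Jaccard overlap of feature sets) is a
# stand-in; the paper's system derives similarities from the BN.
def jaccard(f1, f2):
    """Set overlap divided by set union (0.0 for two empty sets)."""
    f1, f2 = set(f1), set(f2)
    return len(f1 & f2) / len(f1 | f2) if f1 | f2 else 0.0

def similarity_matrix(cases):
    """cases: dict name -> feature list. Returns nested dict of scores."""
    return {a: {b: jaccard(fa, fb) for b, fb in cases.items()}
            for a, fa in cases.items()}

# Hypothetical mini case base using feature names from slide 5.
cases = {
    "case-1": ["high-pump-pressure", "tight-spot"],
    "case-2": ["high-pump-pressure", "low-pump-rate"],
}
sim = similarity_matrix(cases)
```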

SLIDE 11

CBR and BN integration: General picture

[Figure: the case base, user DBs, and general-purpose DBs feed general and causal data mining (machine-generated knowledge); knowledge-intensive CBR combines this with human-generated general domain knowledge.]

SLIDE 12

SLIDE 13

The experiment of Heckerman et al.

[Table: 10,000 cases over 37 variables (x1, x2, x3, ..., x37), with the values of some variables deleted in each case (missing data).]

SLIDE 14

Generating Networks:

Initialize network
repeat
    Propose some change to the structure
    Fit parameters to the new structure
    Evaluate the new network according to some measure (like BIC, AIC, MDL)
    If the new network is better than the previous, keep the change
until finished
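The loop above is a generic score-based search. A runnable Python skeleton, where the proposal and scoring functions are stubs standing in for real structure changes and BIC/AIC/MDL scoring:

```python
import random

# Runnable skeleton of the slide's search loop: propose a change,
# fit, score, keep if better. The proposal and scoring functions
# are stubs; a real system would score structures on the data.
def hill_climb(initial, propose, score, steps=100, seed=0):
    """Greedy structure search over candidate networks."""
    rng = random.Random(seed)
    best, best_score = initial, score(initial)
    for _ in range(steps):
        candidate = propose(best, rng)      # propose a change
        s = score(candidate)                # fit + evaluate
        if s > best_score:                  # keep only improvements
            best, best_score = candidate, s
    return best, best_score

# Toy usage: "structures" are integers and the score peaks at 10.
best, s = hill_climb(
    initial=0,
    propose=lambda x, rng: x + rng.choice([-1, 1]),
    score=lambda x: -(x - 10) ** 2,
)
```

The greedy acceptance rule makes the search monotone: once at the optimum, no proposed neighbor scores higher, so the result is stable.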

SLIDE 15

BNs are powered by Conditional Independencies

[Figure: a BN over Age, Gender, Exposure To Toxic, Smoking, Cancer, Serum Calcium, and Lung Tumour.]

Cancer is independent of Age and Gender given Exposure To Toxic and Smoking

SLIDE 16

Bayesian Networks: semantics

[Figure: a BN over the variables S, C, L, E, X, D.]

Conditional independencies in the BN structure + local probability models = full joint distribution over the domain:

P(S, C, L, E, X, D) = P(S) P(C) P(L | S) P(E | S, C) P(X | L) P(D | L, E)

  • Compact & natural representation:
    – nodes have ≤ k parents ⇒ O(2^k · n) vs. O(2^n) parameters
    – parameters natural and easy to elicit.

Slide taken from Nir Friedman: "Learning the Structure of Probabilistic Models"
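The O(2^k · n) vs. O(2^n) claim is easy to check by counting CPT rows. A small Python illustration, using the parent counts of the six-node factorization on this slide:

```python
# Illustrating the compactness claim: a binary node with p parents
# needs 2**p free CPT parameters, so a network whose nodes have at
# most k parents needs at most n * 2**k parameters, versus
# 2**n - 1 for the unfactored joint over n binary variables.
def cpt_parameters(parent_counts):
    """Free parameters of a binary BN given each node's parent count."""
    return sum(2 ** p for p in parent_counts)

# The slide's six-node factorization: S and C are roots, L has
# parent S, E has parents {S, C}, X has parent L, D has {L, E}.
factored = cpt_parameters([0, 0, 1, 2, 1, 2])   # 1+1+2+4+2+4 = 14
full_joint = 2 ** 6 - 1                          # 63
```

Already at six variables the factored form needs 14 parameters instead of 63; the gap grows exponentially with n.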

SLIDE 17

Can we learn causation from data?

SLIDE 18

Can we learn causation … (continued)

The newspaper's theory vs. "the Bimbo theory": [Figure: two alternative networks over the variables Sex, Clothes, IQ, and Test result.]

The "meaning" is different, but the two networks are equally plausible given the newspaper story.

SLIDE 19

Inferred Causation

SLIDE 20

Integration of BN and EDoMo

[Figure: an EDoMo fragment around a car-engine example. Causal links (causes) connect states such as condensation-in-gas-tank, water-in-gas-tank, water-in-gas-mixture, carburettor-valve-stuck, too-rich-gas-mixture-in-cylinder, no-chamber-ignition, engine-does-not-fire, and engine-turns; taxonomic and structural links (hsc, hp, hi, has-fault) tie them to fuel-system, carburettor, fuel-system-fault, carburettor-fault, carburettor-valve-fault, observable-state, and observed-finding.]

SLIDE 21

Integration Level

Low: Data source: separate data files. Typical BN inference task: no dedicated BN inference unit. EDoMo verification: no verification.
Medium: Purpose: domain-level integration. Data source: common data format, different use. Typical BN inference task: RetrieveCases. EDoMo verification: verify substructures by examining "hidden nodes" and KL divergence.
High: Purpose: inference-level integration. Data source: everything represented as frames. Typical BN inference task: ExplainSimilarity(AttrA, AttrB). EDoMo verification: on arc level (IMPOSSIBLE?).

SLIDE 22

Effect of Evidence During BN-retrieve

[Figure: observed evidence propagates through the domain-model attributes to the cases.]

SLIDE 23

Case Indexing During BN-retain

[Figure: index structure in the BN vs. index structure in Creek, showing remindings (solid lines) and causal links (dotted lines) between Feature#1, Feature#2, Case#1, and Case#2.]

Annotations: Case#1: only F#1 observed; F#2 is relevant through its influence on F#1. Case#2: both F#1 and F#2 are relevant; F#1 is relevant through its influence on F#2.

SLIDE 24

Validation of EDoMo

[Figure: the slide-20 network annotated with validation outcomes. Substructures with no hidden nodes and KL-divergence < α are marked OK; a substructure that would require a hidden node is marked "?".]
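The KL-divergence check used here can be written out in a few lines; the distributions and the threshold α = 0.05 below are illustrative values of our own, not the paper's:

```python
import math

# Sketch of the slide's validation step: a learned substructure is
# accepted when D_KL(P_data || P_model) falls below a threshold
# alpha. Distributions and alpha here are illustrative values.
def kl_divergence(p, q):
    """D_KL(P || Q) for distributions given as aligned lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def substructure_ok(p_model, p_data, alpha=0.05):
    """Accept the substructure if it diverges little from the data."""
    return kl_divergence(p_data, p_model) < alpha

# Identical distributions: divergence 0, so the check passes.
ok = substructure_ok([0.25, 0.25, 0.5], [0.25, 0.25, 0.5])
```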

SLIDE 25

Advantages of BN+CBR combination

The BN model strengthens:

CBR Retrieval by
  • reducing the number of indexes needed to identify a case, due to the interdependency of indexes in the BN
  • matching cases with syntactically different but semantically similar features

CBR Reuse by
  • suggesting solution adaptation based on a causal explanation from within the BN
  • explaining results to the user

CBR Retain by
  • checking the inter-consistency of case features (indexes) and identifying relevant features when storing a new case
  • learning general domain knowledge by updating the BN
SLIDE 26

Setup of Empirical Study

  • Generate a BN from the semantic network of the drilling-fluid domain
    – Select entities manually
    – Use causal and taxonomic links as prior
    – Structural learning
  • Enter parts of a known case (Case-16) as a new situation to both the CBR system and the BN
  • Evaluate differences in retrieved cases, and compare the quality of the retrieval with regard both to scoring similar cases high and to punishing weaker correspondence

SLIDE 27

Preliminary empirical results

  • Generated a BN with 146 links between 128 cases from the semantic net of 1254 entities and 2434 relationships. Structural learning was difficult because of too small an overlap between data and user views
  • Both methods selected Case-16 as the best fit; there were discrepancies otherwise
  • The BN separated well between good and not-so-good matches

[Chart: number of cases (10-40) per similarity score band: 0-0.5, 0.5-0.75, >0.75.]

SLIDE 28

Further research/Still to come:

  • Perform a more elaborate empirical study
  • Examine other machine learning methods in addition to BNs (ILP is a strong candidate)
  • Look into different ways of collaboration between the two models (e.g. BN used only to activate)
  • Continue our effort to make BNs as well suited as possible for the integration with CBR
  • Extend the methods to handle time sequences (e.g. to handle a planning task)
  • Examine the use of "event-type" DBs (discrepancy DBs) for automatic case generation through data mining

SLIDE 29

Others doing the job for us:

  • Daphne Koller's group at Stanford: extending the expressiveness of a BN
  • Elisabeth van de Stadt (TU Delft): spread-activation algorithm for BNs
  • Judea Pearl's group at UCLA: causation in probabilistic models
  • Friedman & Goldszmidt: learning BN structure from data
  • Many more …
SLIDE 30

Finishing Statement

"His world is built up by rules. Therefore he can never be as quick or as smart as we can be."

Morpheus describes an opposing agent in the movie "The Matrix"