Ontology Maintenance Support: Text, Tools, and Theories



slide-1
SLIDE 1

Ontology Maintenance Support

Text, Tools, and Theories
Chris Welty
IBM Research

slide-2
SLIDE 2

Outline

  • Opening joke
  • Motivation
  • Maintenance
  • Support
      – Tools
      – Theories
      – Text Analysis

slide-3
SLIDE 3

Motivation

  • Given: Ontologies matter
      – Does quality matter?

slide-4
SLIDE 4

Does quality matter?

  • Good quality ontologies cost more
      – Coverage, correctness, richness, commitment [Kashyap, 2003]
      – Organization, meta-level consistency [Guarino & Welty, 2000] [Rector, 2002]
      – Required for some applications
  • Improvements in quality can improve performance [Welty, et al, 2004]
      – 18% f-improvement in search
      – Cleanup cost ~ 1mw/3000 classes
      – BUT … low quality ontology still improved base

slide-5
SLIDE 5

Motivation

  • Given: Ontologies matter
      – Does quality matter? Sometimes
  • Problem: How to create them
  • Bigger problem: how to maintain them
      – From SE: 80% of the cost is maintenance [Schrobe, 1996]

slide-6
SLIDE 6

Software Maintenance

  • Fixing Bugs
  • Testing
  • Enhancing

slide-7
SLIDE 7

Ontology Maintenance

  • Fixing Bugs
      – Inconsistent
      – Inaccurate
      – Inefficient
  • Testing
      – Regression tests
      – Test Suites
      – Meta tag sets for test content
      – Ablation tests
  • Enhancing
      – Tweaking
          • Richness
          • Correctness
          • Organization
          • Meta-level consistency
          • Efficiency
      – Extending
          • Improving coverage
          • Extending commitment
          • Integration
      – Refactoring
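
One way to read "regression tests" for an ontology is as a fixed set of queries with expected answers that is re-run after every edit. The sketch below is only an illustration under that assumption: the taxonomy, the class names, and the `descendants` / `run_suite` helpers are all hypothetical and are not taken from the talk.

```python
def descendants(cls, subclass_of):
    """Return every class reachable below cls via subclass-of links."""
    found, frontier = set(), [cls]
    while frontier:
        parent = frontier.pop()
        for child, sup in subclass_of:
            if sup == parent and child not in found:
                found.add(child)
                frontier.append(child)
    return found


def run_suite(subclass_of, suite):
    """Return (query class, expected, actual) for every failed check."""
    failures = []
    for cls, expected in suite:
        actual = descendants(cls, subclass_of)
        if actual != expected:
            failures.append((cls, expected, actual))
    return failures


# Hypothetical taxonomy and expectations, for illustration only.
SUITE = [("Vehicle", {"Car", "Truck"}), ("Car", set())]
before = [("Car", "Vehicle"), ("Truck", "Vehicle")]
after = before + [("Engine", "Car")]  # a later edit files a part under a whole

print(run_suite(before, SUITE))  # no failures expected
print(run_suite(after, SUITE))   # both checks now fail
```

A suite like this catches an edit that silently changes query answers, which is the ontology analogue of a software regression.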

slide-8
SLIDE 8

A looming problem

  • Prediction
      – Ontology maintenance will become the significant problem as ontologies become more mainstream
      – Will follow the SE model (80% of cost)
  • Observation/Conjecture
      – High quality ontologies are easier to maintain

slide-9
SLIDE 9

Tool Support

  • Hierarchical view of classes
  • Hierarchical view of properties
  • Consistency Reasoning
      – But… no “segmentation faults”
  • Inferential Reasoning
  • View non-tree taxonomies
  • View relations between classes
  • Global axioms
  • View meta-level
  • Basic Upper-level Theories
      – Space, Time, Parts, …
  • Assistance for integration

slide-10
SLIDE 10

Theory Support

  • Meta-level analysis
      – OntoClean [Guarino & Welty, 2000]
  • Good organizing principles
      – R-Normalization [Rector, 2002]
  • Well-founded upper levels
      – Dolce [Gangemi, et al., 2003]
      – DAML-Time [Hobbs, 2003]
      – RCC [Randell, Cui & Cohn, 1992]

slide-11
SLIDE 11

OntoClean

  • Draw fundamental notions from Formal Ontology
  • Establish a set of useful meta-properties, based on behavior wrt above notions
  • Explore the way these meta-properties combine to form relevant property kinds
  • Explore the taxonomic constraints imposed by these property kinds
      – Expose common modeling pitfalls
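
As an illustration of a taxonomic constraint, OntoClean's rigidity meta-property yields the rule that an anti-rigid property cannot subsume a rigid one. The slide does not name rigidity, so that choice, the `+R` / `~R` tags, and the Person/Student classes are assumptions made for this sketch.

```python
def rigidity_violations(subsumptions, rigidity):
    """Return (sub, super) links where an anti-rigid class subsumes a rigid one."""
    return [
        (sub, sup)
        for sub, sup in subsumptions
        if rigidity.get(sup) == "~R" and rigidity.get(sub) == "+R"
    ]


# Illustrative tagging: "+R" marks a rigid class, "~R" an anti-rigid one.
rigidity = {"Person": "+R", "Student": "~R"}

ok_taxonomy = [("Student", "Person")]   # every student is a person: allowed
bad_taxonomy = [("Person", "Student")]  # every person is a student: flagged

print(rigidity_violations(ok_taxonomy, rigidity))
print(rigidity_violations(bad_taxonomy, rigidity))
```

The check needs only the meta-property tags and the subsumption links, which is why it can run as a mechanical pass over an existing taxonomy.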

slide-12
SLIDE 12

Overloading Subsumption

Common modeling pitfalls

  • Instantiation
  • Constitution
  • Composition
  • Disjunction
  • Polysemy

slide-13
SLIDE 13

Instantiation

[Diagram nodes: T21, My ThinkPad (s# xx123), ThinkPad Model. Annotation: Ooops…]

Question: What ThinkPad models do you sell?
Answer should NOT include My ThinkPad -- nor yours.

Does this ontology mean that My ThinkPad is a ThinkPad Model?
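
The pitfall can be sketched in a few lines, assuming the diagram places T21 under ThinkPad Model by subsumption. The dictionaries and the `instances_of` helper are hypothetical; the "repaired" variant shows one common fix (treating T21 as an instance of ThinkPad Model), not necessarily the one presented in the talk.

```python
def instances_of(cls, subclass_of, instance_of):
    """Return individuals whose asserted class is cls or a subclass of cls."""
    def is_subclass(c, target):
        while c is not None:
            if c == target:
                return True
            c = subclass_of.get(c)  # walk up to the superclass, if any
        return False

    return sorted(i for i, c in instance_of.items() if is_subclass(c, cls))


# Overloaded modeling: T21 is filed *under* ThinkPad Model via subsumption.
overloaded_sub = {"T21": "ThinkPad Model"}
overloaded_inst = {"My ThinkPad (s# xx123)": "T21"}

# Repaired modeling: T21 is an *instance* of ThinkPad Model.
# Instance-of is not transitive, so My ThinkPad is no longer a model.
repaired_sub = {}
repaired_inst = {"T21": "ThinkPad Model", "My ThinkPad (s# xx123)": "T21"}

print(instances_of("ThinkPad Model", overloaded_sub, overloaded_inst))
print(instances_of("ThinkPad Model", repaired_sub, repaired_inst))
```

With the overloaded taxonomy the query returns the individual laptop; with the repaired one it returns only T21.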

slide-14
SLIDE 14

Composition

[Diagram nodes: Memory, Disk Drive, Computer, Micro Drive]

Question: What Computers do you sell?
Answer should NOT include Disk Drives or Memory.

slide-15
SLIDE 15

Disjunction

[Diagram nodes: Memory, Disk Drive, Computer, Micro Drive, Computer Part, Flashcard-110, Camera-15. Edge label: has-part (twice)]

Unintended model: flashcard-110 is a computer-part

slide-16
SLIDE 16

Polysemy

[Diagram nodes: Abstract Entity, Physical Object, Book]

Question: How many books do you have on Hemingway?
Answer: 5,000 …

slide-17
SLIDE 17

Constitution

[Diagram nodes: Amount of Matter, Physical Object, Entity, Computer, Clay, Metal]

Question: What types of matter will conduct electricity?
Answer should NOT include computers.

slide-18
SLIDE 18
slide-19
SLIDE 19

Text Analysis Support

  • Document Classification
      – Subject hierarchies
      – Identify relevant concepts
  • Information Extraction
      – Find individuals
      – Glossary extraction [Park, 2004]

slide-20
SLIDE 20

Concept-specific Ontology Building through Search

  • Human expert knows what she is interested in: anchor concept
  • Find relations and other related concepts for the anchor concept
  • Active acquisition of knowledge sources through search
      – Concept-defining knowledge source: glossaries or dictionaries
      – Up-to-date knowledge source: web documents
  • Very useful for recognizing missing terms

slide-21
SLIDE 21

Domain Term Recognition

  • Nominal Expressions
      – acute radiation syndrome
      – intercontinental and submarine-launched ballistic missile
      – highly enriched uranium
  • New Domain Word Identification
      – agroterrorism, astrobiology, biocomputation
  • Generic Premodifier Filtering
      – average radial first harmonic runout
      – absolute amazement/zero

slide-22
SLIDE 22

Domain Term Aggregation

  • Abbreviations
      • 5-HT-3R --- 5-hydroxytryptamine type 3 receptor
      • D2T2 --- Dye Diffusion Thermal Transfer
      • nAchRs --- nicotinic acetylcholine receptors
  • Aliases: T1 .. {known as|called} T2
      – Zomig, formerly known as 311C90
      – Eleutherococcus senticosus (ES), also known as Siberian ginseng or ciwuija
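
The alias pattern "T1 .. {known as|called} T2" lends itself to a regular expression. The following is a minimal sketch, not the system described in the talk: the `ALIAS` pattern, the `extract_aliases` helper, and the choice to split alternatives on "or" are assumptions, and the split would mishandle an alias that itself contains "or".

```python
import re

# T1, optionally followed by a comma and "formerly"/"also", then the cue phrase and T2.
ALIAS = re.compile(
    r"(?P<t1>[^,]+?),?\s+(?:formerly\s+|also\s+)?(?:known as|called)\s+(?P<t2>.+)"
)


def extract_aliases(phrase):
    """Return (term, [aliases]) for an alias phrase, or None if no cue is found."""
    m = ALIAS.match(phrase.strip())
    if not m:
        return None
    term = m.group("t1").strip()
    tail = m.group("t2").strip().rstrip(".")
    names = [a.strip() for a in re.split(r"\s+or\s+", tail)]
    # A trailing parenthetical such as "(ES)" is treated as one more alias.
    short = re.fullmatch(r"(.+?)\s*\(([^)]+)\)", term)
    if short:
        term = short.group(1)
        names.insert(0, short.group(2))
    return term, names


print(extract_aliases("Zomig, formerly known as 311C90"))
print(extract_aliases(
    "Eleutherococcus senticosus (ES), also known as Siberian ginseng or ciwuija"
))
```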

slide-23
SLIDE 23

Domain Term Aggregation

  • Spelling errors or alternative spellings
      • anesthesia --- anaesthesia
  • Orthographic variants
      • audio/visual input --- audio-visual input
      • electro-magnetic clutch --- electromagnetic clutch
      • Passenger airbag --- passenger air bag
  • Morphological variants
      • multiprocessing ps/2 --- multiprocessing ps/2s
      • CD ROM --- CD ROMs, CD-ROMs
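
One simple way to aggregate such variants is to map each term to a normalized key and group terms that share a key. This sketch is deliberately crude and is an assumption, not the method from the talk: `variant_key` ignores case, spaces, hyphens and slashes, rewrites "ae" as "e", and strips one trailing "s", all of which would over-merge on a real vocabulary.

```python
import re
from collections import defaultdict


def variant_key(term):
    """Map a term to a crude key shared by its spelling and plural variants."""
    tokens = re.findall(r"[a-z0-9]+", term.lower())    # drop spaces, "-", "/"
    tokens = [t.replace("ae", "e") for t in tokens]    # anaesthesia -> anesthesia
    key = "".join(tokens)                              # "air bag" == "airbag"
    if len(key) > 3 and key.endswith("s"):
        key = key[:-1]                                 # naive plural stripping
    return key


def aggregate(terms):
    """Group terms by key and return only the groups with more than one member."""
    groups = defaultdict(list)
    for term in terms:
        groups[variant_key(term)].append(term)
    return [group for group in groups.values() if len(group) > 1]


terms = [
    "anesthesia", "anaesthesia",
    "audio/visual input", "audio-visual input",
    "electro-magnetic clutch", "electromagnetic clutch",
    "Passenger airbag", "passenger air bag",
    "CD ROM", "CD ROMs", "CD-ROMs",
    "highly enriched uranium",
]
for group in aggregate(terms):
    print(group)
```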

slide-24
SLIDE 24

Related Concept Recognition

  • A term G is related to term T if
      – T and G share some words
          • Ballistic missile -- medium-range ballistic missile
      – T and G often appear together in same sentences
  • Select a set of semantically related terms with higher domain specificity
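
The two relatedness tests can be sketched directly. The threshold for "often" (`min_cooccur`), the helper names, and the sample sentences are invented for illustration; the slide does not say how co-occurrence is counted or how domain specificity is scored, so the latter is left out.

```python
import re


def words(term):
    """Lower-cased word set of a term, splitting on hyphens and spaces."""
    return set(re.findall(r"[a-z0-9]+", term.lower()))


def shares_words(t, g):
    """True if the two terms have at least one word in common."""
    return bool(words(t) & words(g))


def cooccurrence(t, g, sentences):
    """Number of sentences that mention both terms (case-insensitive)."""
    t, g = t.lower(), g.lower()
    return sum(1 for s in sentences if t in s.lower() and g in s.lower())


def related(t, g, sentences, min_cooccur=2):
    """Either criterion from the slide is enough; min_cooccur is an assumed cutoff."""
    return shares_words(t, g) or cooccurrence(t, g, sentences) >= min_cooccur


# Made-up sentences, for illustration only.
sentences = [
    "Highly enriched uranium can be delivered by a ballistic missile.",
    "A ballistic missile may carry highly enriched uranium.",
    "Astrobiology is a young field.",
]
print(related("ballistic missile", "medium-range ballistic missile", sentences))
print(related("ballistic missile", "highly enriched uranium", sentences))
print(related("ballistic missile", "astrobiology", sentences))
```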

slide-25
SLIDE 25

Relation Extraction (IS-A)

  • Structurally Suggested ISA Relation
      – Ballistic missile. A guided rocket-powered delivery vehicle for use against ground targets
      – Position defense. The type of defense in which ….
      – Hyperspectral imagery. A term used to describe the imagery derived from ..
  • Lexically Suggested ISA Relation
      – Ballistic missile ---ISA--- guided rocket-powered delivery vehicle
      – guided rocket-powered delivery vehicle ---ISA--- delivery vehicle
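
A lexically suggested IS-A link can be approximated by dropping leading modifiers until a known term remains. This sketch assumes a set of already recognized terms (`known_terms`, hypothetical) and takes the longest matching suffix as the parent; real head-noun detection would need more than whitespace splitting.

```python
def lexical_isa(term, known_terms):
    """Return the longest proper suffix of term that is itself a known term."""
    parts = term.split()
    for i in range(1, len(parts)):
        candidate = " ".join(parts[i:])
        if candidate in known_terms:
            return candidate
    return None


# Hypothetical set of terms recognized so far.
known_terms = {"delivery vehicle", "ballistic missile"}

print(lexical_isa("guided rocket-powered delivery vehicle", known_terms))
print(lexical_isa("medium-range ballistic missile", known_terms))
print(lexical_isa("highly enriched uranium", known_terms))
```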

slide-26
SLIDE 26

Lexical Patterns for IS-A

  • T is a kind/type of H
  • (T1, T2, …, Tn) and/or other H
      – rescue, meteorological information, navigational aid, communications facilities and other services
  • H1, H2, H3 {such as/including} (T1, T2, …) and/or T
      – conditions such as fractures, wounds, sprains, strains, dislocations, concussions, and compressions
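
These patterns can be approximated with regular expressions over a single phrase. The sketch below is an assumption-laden illustration, not the extractor from the talk: it handles one hypernym H per phrase, treats the whole phrase as the match, and splits term lists on commas and "and"/"or". The pattern names and the `isa_pairs` helper are hypothetical.

```python
import re

KIND_OF = re.compile(r"^(?P<t>.+?) is an? (?:kind|type) of (?P<h>.+?)\.?$", re.I)
AND_OTHER = re.compile(r"^(?P<ts>.+?),? (?:and|or) other (?P<h>.+?)\.?$", re.I)
SUCH_AS = re.compile(r"^(?P<h>.+?),? (?:such as|including) (?P<ts>.+?)\.?$", re.I)


def split_list(text):
    """Split 'a, b, and c' or 'a and b' into its items."""
    parts = re.split(r",\s*(?:and\s+|or\s+)?|\s+(?:and|or)\s+", text)
    return [p.strip() for p in parts if p.strip()]


def isa_pairs(phrase):
    """Return (term, hypernym) pairs suggested by the first pattern that matches."""
    phrase = phrase.strip()
    m = KIND_OF.match(phrase)
    if m:
        return [(m.group("t"), m.group("h"))]
    m = AND_OTHER.match(phrase) or SUCH_AS.match(phrase)
    if m:
        return [(t, m.group("h")) for t in split_list(m.group("ts"))]
    return []


print(isa_pairs("rescue, meteorological information, navigational aid, "
                "communications facilities and other services"))
print(isa_pairs("conditions such as fractures, wounds, sprains, strains, "
                "dislocations, concussions, and compressions"))
```

Each pair reads as "term ISA hypernym", so the first call yields pairs such as (rescue, services) and the second yields pairs such as (fractures, conditions).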

slide-27
SLIDE 27

Ontology Construction

agroterrorism. Terrorist attacks aimed at reducing the food supply by destroying crops using natural pests such as the potato beetle, animal diseases such as hoof and mouth disease and anthrax, molds and other plant diseases.

[Diagram: ontology fragment built from the definition above. Nodes: agroterrorism, Terrorist Attack, Attack, Food supply, crop, Natural pest, Animal disease, Plant disease, mold, Potato beetle, hoof, Mouth disease, anthrax. Edge labels: ISA, reduce, destroy, use.]

slide-28
SLIDE 28
slide-29
SLIDE 29

Conclusions

  • Ontology maintenance is a critical problem
  • Need support
      – Tools help
      – Theories help
      – Text analysis helps
  • All together helps more
      – Embedded in Protégé

…Bring Research into Practice…