A Semantic Similarity Measure for Formal Ontologies Mark Hall - PowerPoint PPT Presentation

A Semantic Similarity Measure for Formal Ontologies
Mark Hall
Final presentation for the master's thesis, 17.03.2006

Overview I
A Semantic Similarity Model and Algorithm
  Motivation - Heterogeneous Data
  Ontologies
  Similarity Measures
  Hybrid Model & Similarity Calculation
  Application
Evaluation of the Model and Algorithm
Summary

Heterogeneous Data
[Illustration: speech bubbles in different languages: "Hello, is this the train station?", "Je m’appelle Jane et je t’emmerde", "Frische Tomaten! Frische Tomaten!"]
Heterogeneous data sources are the norm
Integration poses two main problems
  Syntactic differences
  Semantic differences

Integration = Matching
[Illustration: Syntactic: "I have a thousand CDs" vs. "J’ai 1000 disques". Semantic: "Forest: A wooded area" vs. "Forest: Land that belongs to the forestry commission"]
Integration depends on finding matches
  Syntactic problems
  Semantic problems
Matching requires similarity and a similarity measure

Ontologies
[Figure: example concepts Houses, Industry, Shack, Villa, Iron Foundry, Textiles]
Ontology is the study of the existence of entities
Shared specification of domain knowledge
Ontologies are one way to encode semantics
Tool of choice for the Semantic Web

Similarity Measure
Provides a means of comparing two entities to determine how similar they are
Not based on a cognitive model
  Description Logics, Word based, Structure based
Based on a cognitive model
  Feature, Network, Cognitive Spaces
Hybrid Semantic Similarity Measure

Hybrid Cognitive Model I
[Figure: two example objects described by features: Red, Filled, Round and Blue, Filled, Square]
Combines the approaches of the feature and the network model
Basis is the feature model, but each feature has an inner structure in the form of the network model

Hybrid Cognitive Model II
[Figure: class Forest with the properties Has Surface, Has Vegetation, Has Use]
Every class is represented by a set of properties
Shared vocabulary is structured hierarchically
Property values reference a shared vocabulary
Property value ranges are sets of shared vocabulary terms

Similarity Calculation I
[Figure: Coniferous Forest and Broad-leaved Forest, each with the properties Has Surface, Has Vegetation, Has Use]
Similarity of two classes is the aggregate of the similarities of their properties
Property similarities can be weighted to emphasise certain aspects
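
The aggregation step can be sketched as follows. The slide only says "aggregate", so the weighted mean used here is an assumption, as are the function name, the property identifiers and the example values; a missing weight is assumed to default to 1.

```python
from typing import Dict, Optional


def class_similarity(property_sims: Dict[str, float],
                     weights: Optional[Dict[str, float]] = None) -> float:
    """Aggregate per-property similarities (each in [0, 1]) into one value.

    Assumption: the aggregate is a weighted mean; the slides do not
    name the aggregation function.
    """
    if not property_sims:
        return 0.0
    weights = weights or {}
    # Properties without an explicit weight count with weight 1.0.
    total_weight = sum(weights.get(name, 1.0) for name in property_sims)
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(sim * weights.get(name, 1.0)
                       for name, sim in property_sims.items())
    return weighted_sum / total_weight


# Hypothetical property similarities for two forest classes.
sims = {"hasSurface": 1.0, "hasVegetation": 0.5, "hasUse": 1.0}
unweighted = class_similarity(sims)
# Emphasise vegetation by giving it twice the weight of the others.
emphasised = class_similarity(sims, {"hasVegetation": 2.0})
```

Raising the weight of a property on which the two classes differ pulls the overall similarity down, which is one way to read "emphasise certain aspects".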

Similarity Calculation II
[Figure: the properties Has Surface and Has Vegetation matched between two classes]
Properties are matched based on their quantifier and name
Similarity for two matching properties is the similarity of their ranges
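
A minimal sketch of the matching step, under stated assumptions: properties are keyed by a (quantifier, name) pair, ranges are plain sets of vocabulary terms, and range similarity is approximated by set overlap (Jaccard). The slides do not specify the range similarity, and a faithful version would presumably use the hierarchy of the shared vocabulary rather than plain overlap. All names, quantifier labels and vocabulary terms below are hypothetical.

```python
from typing import Dict, FrozenSet, Tuple

# A property is identified by (quantifier, name); its range is a set of
# terms from the shared vocabulary.
PropertyKey = Tuple[str, str]
ClassDescription = Dict[PropertyKey, FrozenSet[str]]


def range_similarity(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Stand-in range similarity: Jaccard overlap of the two term sets."""
    union = a | b
    if not union:
        return 1.0  # two empty ranges are treated as identical
    return len(a & b) / len(union)


def matched_property_similarities(source: ClassDescription,
                                  target: ClassDescription
                                  ) -> Dict[PropertyKey, float]:
    """Pair properties with equal quantifier and name, then compare ranges."""
    return {key: range_similarity(rng, target[key])
            for key, rng in source.items()
            if key in target}


coniferous: ClassDescription = {
    ("some", "hasVegetation"): frozenset({"Conifers"}),
    ("some", "hasSurface"): frozenset({"Soil"}),
}
broad_leaved: ClassDescription = {
    ("some", "hasVegetation"): frozenset({"BroadLeavedTrees"}),
    ("some", "hasSurface"): frozenset({"Soil"}),
    ("all", "hasUse"): frozenset({"Forestry"}),
}
property_sims = matched_property_similarities(coniferous, broad_leaved)
```

Properties that exist in only one class produce no entry here; how unmatched properties should count towards the class similarity is left open in this sketch.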

Application - HarmonISA
Austria - Realraumanalyse
Slovenia - Corine
Italy - Moland

Overview II
A Semantic Similarity Model and Algorithm
Evaluation of the Model and Algorithm
  Expert evaluation
  Shortcomings of the model
  Modelling errors
  Performance analysis
Summary

Expert Evaluation I
Mappings evaluated by domain experts
  Realraumanalyse => Corine: 136 total / 116 correct / 20 incorrect
  Corine => Realraumanalyse: 64 total / 34 correct / 30 incorrect
Incorrect mappings grouped by reasons
  Shortcomings of the model
  Modelling errors
  Correct but reclassified

Model Shortcomings I
[Figure: Knee timber example; labels: Vegetation, Surface, "refers to", Knee timber, Alpine turf, Rocks, with the proportions 90% : 10% and 80% : 20%]
Non built-up areas belonging to the public administration
  No negation possible
Knee timber partially with rocks and alpine turf
  Internal structure and relations between properties can’t be defined

Model Shortcomings II
[Figure: the concepts Alluvial Forest and River]
No relations between concepts in the land-use ontologies
Workaround via special properties such as “Lies next to”

Model Shortcomings III
[Figure: "higher than" relation between elevation concepts; labels as extracted: Elevation, Greenland, Mountain, Sub alpine, Alpine, and Woods, Pasture]
No relations between concepts in the skeleton ontology

Modelling Errors
Additional incorrect knowledge specified
  Bare Rocks, which included a value for the property Vegetation
Knowledge left out or none specified
  Green Urban Areas, which somehow managed to only have one property specified
Incorrect metadata
  Incomplete settlement along a road, which in the metadata was specified as belonging to the continuous urban fabric and was thus modelled as such

Reclassification of concepts
Correctly mapped to the most similar concept, but would be handled differently by the experts
  Sea and Ocean, Olive Groves, Annual crops associated with permanent crops
Suggested strategy for dealing with these
  Leave them out. Create no mapping
  Reclassify based on additional knowledge
    Some knowledge could be added to the system
    Some knowledge is basically a hunch

Expert Evaluation II
Initial evaluation result not too good
  Realraumanalyse => Corine: 85% correct
  Corine => Realraumanalyse: 53% correct
Analysis of errors revealed (out of a total of 200 mappings):
  3 erroneous mappings due to model shortcomings
  17 erroneous mappings due to modelling errors
  30 reclassifications of correct mappings

Expert Evaluation III
Modelling errors can be corrected
Reclassifications are not actual errors but differing methodologies
Updated number of correct mappings
  Realraumanalyse => Corine: 134 out of 136 (98%)
  Corine => Realraumanalyse: 63 out of 64 (98%)
Analysis of the evaluations of the other mappings reveals an average error rate between 0 and 5%
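
The evaluation figures can be cross-checked with a little arithmetic. The sketch below uses only the counts quoted on the evaluation slides; the helper names are illustrative. The 50 initially incorrect mappings (20 + 30) equal the sum of the three error groups (3 + 17 + 30), and the 3 mappings still incorrect after correction equal the number attributed to model shortcomings. The whole-number percentages on the slides correspond to the computed values rounded down.

```python
# (correct, total) mapping counts as quoted on the evaluation slides.
initial = {
    "Realraumanalyse => Corine": (116, 136),
    "Corine => Realraumanalyse": (34, 64),
}
updated = {
    "Realraumanalyse => Corine": (134, 136),
    "Corine => Realraumanalyse": (63, 64),
}


def percent_correct(correct: int, total: int) -> float:
    """Share of correct mappings in percent."""
    return 100.0 * correct / total


def incorrect_total(results: dict) -> int:
    """Number of incorrect mappings summed over both directions."""
    return sum(total - correct for correct, total in results.values())


initially_incorrect = incorrect_total(initial)  # 20 + 30
still_incorrect = incorrect_total(updated)      # 2 + 1

for direction in initial:
    before = percent_correct(*initial[direction])
    after = percent_correct(*updated[direction])
    print(f"{direction}: {before:.1f}% -> {after:.1f}%")
print(f"incorrect before: {initially_incorrect}, after: {still_incorrect}")
```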

Performance Analysis I
[Figure: source ontology (Corine) and target ontology (Realraum); Corine labels: Agricultural, Artificial, Arable, Pastures; Realraum labels: Agricultural, Settlement, Arable, Dense, Transport]
Every source concept is compared with each target concept, and the best match is then selected.
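
The exhaustive comparison can be sketched as a nested search. This is a minimal illustration, not the HarmonISA implementation: the function name, the toy score table and the tie-breaking rule (first best target wins) are assumptions, and the similarity function is passed in as a parameter.

```python
from typing import Callable, Dict, Iterable, Tuple


def best_matches(sources: Iterable[str],
                 targets: Iterable[str],
                 similarity: Callable[[str, str], float]
                 ) -> Dict[str, Tuple[str, float]]:
    """Compare every source concept with every target concept and keep,
    for each source, the target with the highest similarity."""
    target_list = list(targets)
    if not target_list:
        return {}
    matches = {}
    for source in sources:
        scored = [(similarity(source, target), target)
                  for target in target_list]
        # max() returns the first entry with the highest score on ties.
        best_score, best_target = max(scored, key=lambda pair: pair[0])
        matches[source] = (best_target, best_score)
    return matches


# Hypothetical similarity values, for illustration only.
toy_scores = {
    ("Arable", "Arable"): 0.9,
    ("Arable", "Settlement"): 0.1,
    ("Pastures", "Arable"): 0.6,
    ("Pastures", "Settlement"): 0.2,
}


def toy_similarity(source: str, target: str) -> float:
    return toy_scores.get((source, target), 0.0)


mapping = best_matches(["Arable", "Pastures"],
                       ["Arable", "Settlement"],
                       toy_similarity)

# The number of comparisons is |sources| * |targets|, for example
# 64 Corine concepts against 136 Realraumanalyse concepts.
comparisons = 64 * 136
```

Because every pair is evaluated, the number of similarity calculations grows with the product of the two ontology sizes.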

Performance Analysis II
[Figure: processing pipeline; labels as extracted: Heuristics, OWL, Hierarchy, Mapping, Static, DL Reasoning, Similarity Calculation, Hierarchy]
Total complexity of the similarity calculation: Polynomial time (O(N^5))
Loading and hierarchy calculation in Description Logics: Exponential time
Optimisation required for larger ontologies
  Removing the Description Logics reasoning
  Heuristics / Parallelisation for the similarity calculation

Performance Analysis III

| Ontology | # Concepts | Avg # Prop. | Load Time |
|---|---|---|---|
| Corine | 64 | 3 | 31 sec |
| Moland | 96 | 5 | 3 min 19 sec |
| Realraumanalyse | 136 | 6 | 5 min 16 sec |

| From / To | Corine | Moland | Realraumanalyse |
|---|---|---|---|
| Corine | 5 sec | 10 sec | 15 sec |
| Moland | 11 sec | 20 sec | 31 sec |
| Realraumanalyse | 18 sec | 34 sec | 52 sec |

Overview III
A Semantic Similarity Model and Algorithm
Evaluation of the Model and Algorithm
Summary

Summary
Cognitive model is capable of describing most real-world situations
Similarity algorithm works sufficiently well to be used in real-world situations (average correctness of above 95%)
Performance is the major bottleneck. Without improvement it is unusable for larger ontologies
Cognitive model needs to be extended in some areas

Statistics
101 pages (a nice prime number)
  77 pages with actual content
  24 pages of structural padding
6 Chapters (average 12.8 pages per chapter)
29208 words
  Average of 379 words per page
  Most frequent word: similarity (239x)
62 Figures and 3 Tables
65 References
Total size: Source: 1.5MB, PDF: 1.4MB

Questions, Comments, Praise
Thank you for listening
