Linking and Building Ontologies of Linked Data

  1. Linking and Building Ontologies of Linked Data Rahul Parundekar, Craig A. Knoblock and Jose-Luis Ambite {parundek,knoblock,ambite}@isi.edu University of Southern California

  2. Web of Linked Data • Vast collection of interlinked information • Different sources with different schemas

  3. Web of Linked Data • Interlinked instances in the various domains • Equivalent instances linked with owl:sameAs Geospatial Domain

  4. Interlinked Instances • Schema level: PopulatedPlace (Source 1), City (Source 2) • Instance level: City of Los Angeles (Source 1) owl:sameAs Los Angeles (Source 2)

  5. Disjoint Schemas • Schema level: NO LINKS between PopulatedPlace (Source 1) and City (Source 2) • Instance level: City of Los Angeles (Source 1) owl:sameAs Los Angeles (Source 2)

  6. Objective 1: Find Schema Alignments • Schema level: PopulatedPlace (Source 1) = City (Source 2) • Instance level: City of Los Angeles (Source 1) owl:sameAs Los Angeles (Source 2)

  7. Ontologies of Linked Data • Ontologies can be highly specialized • e.g. DBpedia has classes for Educational Institutions, Bridges, Airports, etc. • But some can be rudimentary • e.g. in Geonames all instances only belong to a single class – ‘Feature’ • Derived from RDBMS schemas from which Linked Data was generated

  8. Traditional Alignments • There might not exist exact equivalences between classes in two sources • Only subset relations are possible • Schema level: Feature (Geonames) ⊃ Educational Institution (DBpedia) • Instance level: University of Southern California (Geonames) owl:sameAs University of Southern California (DBpedia)

  9. Restriction Classes • A specialized class can be created by restricting the value of one or more properties • Example (Venn diagram): a restriction class in Geonames that restricts the value of the featureCode property to ‘S.SCH’ • Original class: the set of all instances with rdf:type=Feature • Restriction class: the set of all instances with rdf:type=Feature & featureCode=S.SCH
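A restriction class can be read as a filter over a source's instances. The sketch below is only an illustration: the three toy records and the `extension` helper are assumptions, and each property is treated as single-valued, which real RDF data does not guarantee.

```python
# Toy Geonames-like instances; the records themselves are made up.
instances = [
    {"uri": "gn:1", "rdf:type": "Feature", "featureCode": "S.SCH"},
    {"uri": "gn:2", "rdf:type": "Feature", "featureCode": "P.PPL"},
    {"uri": "gn:3", "rdf:type": "Feature", "featureCode": "S.SCH"},
]


def extension(instances, restriction):
    """Return the URIs of instances that satisfy every property=value constraint."""
    return {
        inst["uri"]
        for inst in instances
        if all(inst.get(prop) == val for prop, val in restriction.items())
    }


# Original class: rdf:type=Feature
original = extension(instances, {"rdf:type": "Feature"})
# Restriction class: rdf:type=Feature & featureCode=S.SCH
restricted = extension(instances, {"rdf:type": "Feature", "featureCode": "S.SCH"})
```

Adding a constraint can only shrink the extension, so `restricted` is always a subset of `original`.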

  10. Objective 2: Find Alignments Between Restriction Classes • Find and model specialized descriptions of classes • Schema level: (rdf:type=Feature & featureCode=S.SCH) in Geonames = (rdf:type=Educational Institution) in DBpedia • Instance level: University of Southern California (Geonames) owl:sameAs University of Southern California (DBpedia)

  11. Domains • Geospatial: DBpedia, LinkedGeoData, Geonames • Zoology: Geospecies, DBpedia • Genetics (Bio2RDF): GeneID, MGI

  12. Approach • Aligning restriction classes R1 and R2

  13. Approach • Aligning restriction classes R1 and R2 • Find the relation between the two restriction classes • Equivalent • Subset

  14. Extensional Approach to Ontology Alignment
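One way to read an extensional comparison is to map the instances of one restriction class through the owl:sameAs links and compare the resulting set with the instances of the other. The sketch below assumes one-to-one links held in a plain dictionary; the `relation` function, its return labels, and the toy URIs are hypothetical and not taken from the slides.

```python
def relation(ext1, ext2, same_as):
    """Classify r1 against r2 by comparing their linked extensions.

    ext1, ext2: sets of instance URIs in source 1 and source 2.
    same_as: dict mapping source-1 URIs to their equivalent source-2 URIs.
    """
    # Image of r1 in source 2, limited to instances that have a link.
    img1 = {same_as[u] for u in ext1 if u in same_as}
    # Only linked instances of source 2 can take part in the comparison.
    linked2 = ext2 & set(same_as.values())
    if not img1 or not linked2:
        return "no evidence"
    if img1 == linked2:
        return "equivalent"
    if img1 < linked2:
        return "r1 subset of r2"
    if linked2 < img1:
        return "r2 subset of r1"
    return "no relation"


# Toy data: three Geonames instances linked to three DBpedia instances.
same_as = {"gn:1": "db:A", "gn:2": "db:B", "gn:3": "db:C"}
features = {"gn:1", "gn:2", "gn:3"}   # rdf:type=Feature
schools = {"gn:1", "gn:3"}            # ... & featureCode=S.SCH
edu = {"db:A", "db:C"}                # rdf:type=Educational Institution

print(relation(schools, edu, same_as))
print(relation(features, edu, same_as))
```

On this toy data the restricted class matches the DBpedia class exactly, while the unrestricted Feature class only contains it, which mirrors the Geonames example on the earlier slides.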

  15. Lattice of Restriction Classes • Instances belonging to a restriction class also belong to parent restriction class • e.g. restrictions from Geonames below • This also results in a hierarchy in the alignments, which our algorithm exploits

  16. Exploration of Hypotheses Search Space (LinkedGeoData with DBpedia) • Seed hypotheses generation: [(lgd:gnis%3AST_alpha=NJ), (dbpedia:Place#type=http://dbpedia.org/resource/City_(New_Jersey))]; [(rdf:type=lgd:country), (rdf:type=owl:Thing)]; [(rdf:type=lgd:node), (rdf:type=dbpedia:BodyOfWater)]; [(rdf:type=lgd:node), (rdf:type=dbpedia:PopulatedPlace)]; [(rdf:type=lgd:node), (dbpedia:Place#type=dbpedia:City)] • Seed hypothesis pruning: owl:Thing covers all instances • [(rdf:type=lgd:node), (dbpedia:Place#type=dbpedia:City & rdf:type=owl:Thing)]: prune as no change in the extension set • [(rdf:type=lgd:node), (rdf:type=dbpedia:BodyOfWater & dbpedia:Place#type=dbpedia:City)]: pruning on empty set, r2 = Ø • [(rdf:type=lgd:node), (rdf:type=dbpedia:PopulatedPlace & dbpedia:Place#type=dbpedia:City)]

  17. Pruning rule 1 • Prune a seed hypothesis if either restriction covers all instances in that source • Example: [(rdf:type=lgd:country), (rdf:type=owl:Thing)] is pruned because owl:Thing covers all instances

  18. Pruning rule 2 • The number of instance pairs supporting a hypothesis must be above a threshold • Example: [(rdf:type=lgd:node), (rdf:type=dbpedia:BodyOfWater & dbpedia:Place#type=dbpedia:City)] is pruned on the empty set, r2 = Ø

  19. Pruning rule 3 • Prune if the added constraint does not change the extension • Example: [(rdf:type=lgd:node), (dbpedia:Place#type=dbpedia:City & rdf:type=owl:Thing)] is pruned because adding rdf:type=owl:Thing leaves the extension set unchanged
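Pruning rules 1 to 3 can be sketched as a single check applied to one side of a hypothesis. This is a minimal illustration, not the authors' implementation: the function name `prune_reason`, its parameters, and the default `min_support=2` are assumptions, since the slides do not state the threshold value.

```python
def prune_reason(r_ext, parent_ext, source_size, support, min_support=2):
    """Return why a hypothesis is pruned, or None to keep exploring it.

    r_ext: extension of the restriction (after adding a constraint, if any).
    parent_ext: extension before the constraint was added; None for a seed.
    source_size: total number of instances in that source.
    support: number of linked instance pairs supporting the hypothesis.
    min_support: hypothetical threshold for rule 2.
    """
    # Rule 1: a seed restriction that covers the whole source says nothing.
    if parent_ext is None and len(r_ext) == source_size:
        return "rule 1: restriction covers all instances"
    # Rule 2: too few supporting pairs (this includes the empty set).
    if support < min_support:
        return "rule 2: too few supporting instance pairs"
    # Rule 3: the new constraint did not narrow the extension.
    if parent_ext is not None and r_ext == parent_ext:
        return "rule 3: added constraint does not change the extension"
    return None
```

A caller would apply the check to both restrictions of a hypothesis and stop exploring that branch as soon as either side returns a reason.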

  20. Pruning rule 4: Lexicographic ordering • Lexicographic ordering provides a systematic search by pruning hypotheses whose constraints would be added in reverse order • Example: the hypothesis [r1: (p5=v5), r2: (p8=v8)] is specialized to [(p5=v5 & p6=v6), (p8=v8)] and [(p5=v5 & p7=v7), (p8=v8)] • [(p5=v5 & p6=v6 & p7=v7), (p8=v8)] is reached from (p5=v5 & p6=v6); reaching it from (p5=v5 & p7=v7) is pruned
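The lexicographic ordering can be sketched as an enumeration that only appends constraints sorting after the last one already present, so every conjunction is generated exactly once. The function names are hypothetical, and plain string comparison stands in for whatever fixed total order over constraints is actually used.

```python
def specializations(current, candidates):
    """Yield one-constraint extensions of `current` in lexicographic order.

    current: tuple of constraints already in the restriction, in sorted order.
    candidates: sorted list of all available constraints.
    Only constraints that sort after the last element of `current` are added,
    so (p5, p7) is never extended with p6.
    """
    last = current[-1] if current else None
    for c in candidates:
        if last is None or c > last:
            yield current + (c,)


def enumerate_all(candidates, current=()):
    """Depth-first enumeration of every non-empty conjunction, each one once."""
    for child in specializations(current, candidates):
        yield child
        yield from enumerate_all(candidates, child)


candidates = ["p5=v5", "p6=v6", "p7=v7"]
found = list(enumerate_all(candidates))
```

With three candidate constraints the enumeration visits the seven non-empty conjunctions without repeats, and the branch that would add `p6=v6` to `(p5=v5 & p7=v7)` is never generated.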

  21. Relaxed Scoring • Compensates for missing or inconsistent data
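The slide does not give the scoring formula, so the following is only one possible relaxation, written under stated assumptions: instead of requiring exact set equality or containment, it accepts a relation when the overlap fraction reaches a cutoff. The function `relaxed_relation` and the default `threshold=0.9` are hypothetical.

```python
def relaxed_relation(img1, ext2, threshold=0.9):
    """Classify two extensions while tolerating a fraction of mismatches.

    img1: source-2 URIs linked to the instances of r1.
    ext2: source-2 URIs of the linked instances of r2.
    threshold: hypothetical cutoff; 1.0 reproduces strict set comparison.
    """
    common = len(img1 & ext2)
    if common == 0:
        return "no relation"
    frac1 = common / len(img1)  # share of r1 that falls inside r2
    frac2 = common / len(ext2)  # share of r2 that falls inside r1
    if frac1 >= threshold and frac2 >= threshold:
        return "equivalent"
    if frac1 >= threshold:
        return "r1 subset of r2"
    if frac2 >= threshold:
        return "r2 subset of r1"
    return "no relation"
```

With a cutoff below 1.0, a single missing link or a single mistyped value no longer turns an otherwise clean equivalence into "no relation".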

  22. Post-processing: Removing Implied Alignments • Keep the simpler definition • Remove the implied definition

  23. Removing Implied Alignments • Cascading (figure: alignment between r1 and r2 and the implied alignment between r'1 and r'2)
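A rough sketch of the idea, under the assumption that a restriction is a set of property=value constraints: if r1 is a subset of r2, then any r'1 that adds constraints to r1 is also a subset of r2, so that second alignment is implied and can be dropped. The `remove_implied` function and the `countryCode=US` constraint are made up for illustration, and the cascading shown on the slide is not modeled here.

```python
def remove_implied(subset_alignments):
    """Drop subset alignments implied by a simpler one with the same target.

    subset_alignments: set of (r1, r2) pairs meaning "r1 is a subset of r2",
    where each restriction is a frozenset of property=value constraints.
    """
    kept = set()
    for r1, r2 in subset_alignments:
        implied = any(
            # Same target, and the other left side has strictly fewer constraints.
            other_r2 == r2 and other_r1 < r1
            for other_r1, other_r2 in subset_alignments
        )
        if not implied:
            kept.add((r1, r2))
    return kept


simple = frozenset({"featureCode=S.SCH"})
specific = frozenset({"featureCode=S.SCH", "countryCode=US"})  # made-up constraint
target = frozenset({"rdf:type=EducationalInstitution"})
result = remove_implied({(simple, target), (specific, target)})
```

Only the alignment with the simpler definition survives in `result`.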

  24. Results: Geospatial Domain

  25. Results: Zoology Domain

  26. Results: Genetics Domain

  27. Results: Alignments Found • Equivalence and subset alignments, before and after removing implied alignments

  28. Datasets: http://www.isi.edu/integration/data/LinkedData

  29. Related Work • Euzenat et al., Ontology Matching: terminological, structural and semantic techniques • FCA-Merge, Duckham et al.: use extensional techniques • GLUE: uses an extensional technique after performing machine learning operations

  30. Conclusion • Our algorithm generates alignments consisting of conjunctions of restriction classes • Extensional approach on Linked Data • Use of restriction classes • Alignments based on the actual data • We determine the relationships based on the data • Schemas of linked sources can be readily modeled and used • The algorithm is also able to • Specialize ontologies where the originals were rudimentary • Find a complementary hierarchy across an ontology

  31. Future Work • How to actually understand these alignments • Scalability • Pre-processing of the sources • Faster alignment processing
