
Semantic Web Techniques for Multiple Views on Heterogeneous Collections



  1. Semantic Web Techniques for Multiple Views on Heterogeneous Collections: A Case Study. Marjolein van Gendt, Antoine Isaac, Lourens van der Meij, Stefan Schlobach. ECDL 2006

  2. Outline
     • Motivations and project
     • Experiment
     • Collection formalization
     • Collection integration
     • Integrated collection access
     • Conclusion

  3. Motivation
     • Current cultural heritage (CH) trend: portals that build on heterogeneous collections
     • Different databases
     • Documents described/accessed according to different points of view (controlled vocabularies/MD schemes)

  4. [Diagram: two metadata schemes (MDS 1, MDS 2) with nested fields; document descriptions in Collection X (Base X) and Collection Y (Base Y); Thesaurus x and Thesaurus y]

  5. CH Interoperability Problems
     • Current CH trend: portals that build on heterogeneous collections
       Different databases/vocabularies/MD schemes
     • Syntactic interoperability problem is being solved
       Access can be granted, cf. deployed portals
     • Semantic interoperability still to be addressed
       Links with original vocabularies/MD structures are lost

  6. [Diagram: MDS 1 and MDS 2 combined into a Unified MD Scheme; DB X and DB Y combined into a Unified (Virtual) Description Base. Caption: No semantic information for description vocabulary]

  7. STITCH General Goals [SemanTic Interoperability To access Cultural Heritage]
     Allow heterogeneous CH collections to be accessed
     • In a seamless way
     • Still benefiting from specific collection commitments
       Keeping original metadata schemes and vocabularies

  8. [Diagram: metadata schemes MDS 1 and MDS 2 with nested fields; DB X, Knowledge base, DB Y]

  9. STITCH General Goals (2)
     Allow heterogeneous CH collections to be accessed
     • In a seamless way
     • Still benefiting from specific collection commitments
       Keeping original metadata schemes and vocabularies
     Using Semantic Web means for
     • Representation of the different points of view in one system
     • Creation and use of the alignment knowledge
     2 methodological concerns
     • Generalize as much as possible
     • Automate as much as possible

  10. Experiment
     On a reduced scale
     • 2 collections and associated vocabularies
     Desired output: insights on
     • Use of off-the-shelf Semantic Web (SW) techniques with CH-specific resources
     • Impact of turning to standard proposals (SW-linked tools and methods)
     • In a context of natural semantics (thesauri)
     • Added value of this effort
     • Quantitative and qualitative evaluation
     • Simple prototype for accessing documents

  11. 1st Collection: KB Illustrated Manuscripts

  12. 1st Collection: KB Illustrated Manuscripts

  13. 2nd Collection: Rijksmuseum ARIA collection

  14. 2nd Collection: Rijksmuseum ARIA collection

  15. Outline
     • Motivations and project
     • Experiment
     • Collection formalization
     • Collection integration
     • Integrated collection access
     • Conclusion

  16. Experiment Steps

  17. Steps

  18. Steps
     • Gathering vocabulary and collection data
     • Analyzing it
     • Transforming it using SW standards
     All record/vocabulary information in one repository

  19. Collection Formalization Choices
     • Representation of vocabularies
     • Standard RDFS/OWL encoding scheme: SKOS
     • Representation of records
     • Ad hoc ontologies for collection MD schemes
     • Linking to SKOS concepts
     • RDF Schema repository: Sesame
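
The formalization choices on this slide can be illustrated with a minimal sketch. It assumes hypothetical namespaces under `example.org`, a hypothetical record property `subject`, a hypothetical concept identifier, and a plain Python list standing in for a real RDF repository such as Sesame. Only the SKOS and RDF property names are standard; nothing here is taken from the project's actual data model.

```python
# Minimal sketch: a vocabulary concept and a record as RDF-style triples.
# The example.org namespaces and the "subject" property are hypothetical.
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ARIA = "http://example.org/aria/"    # hypothetical vocabulary namespace
EX = "http://example.org/schema/"    # hypothetical ad hoc collection ontology


def concept_triples(scheme, concept_id, label, broader_id=None):
    """Return the triples describing one SKOS concept."""
    subject = scheme + concept_id
    triples = [
        (subject, RDF + "type", SKOS + "Concept"),
        (subject, SKOS + "inScheme", scheme.rstrip("/")),
        (subject, SKOS + "prefLabel", label),
    ]
    if broader_id is not None:
        triples.append((subject, SKOS + "broader", scheme + broader_id))
    return triples


def record_triples(record_uri, concept_uris):
    """Link a collection record to the SKOS concepts that index it."""
    return [(record_uri, EX + "subject", c) for c in concept_uris]


store = []  # stands in for a triple repository
store += concept_triples(ARIA, "flowers-plants", "Flowers, plants")
store += record_triples("http://example.org/record/1", [ARIA + "flowers-plants"])

for s, p, o in store:
    print(s, p, o)
```

Keeping vocabulary triples and record triples in the same store is what lets a later step follow a link from a record to a concept and from there to concepts of another vocabulary.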

  20. Vocabulary Formalisation: ARIA in SKOS

  21. Steps

  22. Steps
     • Provide mappers with vocabulary data
     • Proceed to evaluation/selection of their results
     • Put the alignment in the repository

  23. [Diagram: metadata schemes MDS 1 and MDS 2 with nested fields; DB X, Knowledge base, DB Y]

  24. Collection Integration: Ontology Mapping Tools
     Tests with 2 mapping tools
     • S-Match, Trento
       • Tree-like structures mapper
     • Falcon-AO, Nanjing
       • Standard OWL ontology mapper
     • Using
       • Lexical comparisons
       • Structural comparisons
       • Third resource (WordNet as ‘oracle’)
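
To give a feel for what a lexical comparison between vocabulary labels can look like, here is a minimal sketch based on token overlap (Jaccard similarity). This is not the algorithm used by S-Match or Falcon-AO; the function names and the `threshold` value are assumptions made for illustration, and the labels are taken from the mappings slide.

```python
import re


def tokens(label):
    """Lowercase a label and split it into a set of word tokens."""
    return set(re.findall(r"[a-z]+", label.lower()))


def jaccard(label_a, label_b):
    """Token-overlap similarity between two labels, in [0, 1]."""
    a, b = tokens(label_a), tokens(label_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def lexical_candidates(source_labels, target_labels, threshold=0.2):
    """Propose (source, target, score) pairs whose labels overlap enough."""
    pairs = []
    for s in source_labels:
        for t in target_labels:
            score = jaccard(s, t)
            if score >= threshold:  # threshold is an arbitrary illustrative value
                pairs.append((s, t, score))
    return pairs


iconclass = ["plants (in general)", "language of flowers"]
aria = ["Flowers, plants", "Marine and other animals"]
for pair in lexical_candidates(iconclass, aria):
    print(pair)
```

A purely lexical measure like this one cannot tell a good match from a spurious one that shares a word, which is one reason mapping tools also use structure and background knowledge.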

  25. Collection Integration: Mappings

      | IC code | IC label | ARIA label |
      |---|---|---|
      | 29B | plants behaving as human beings or animals | Flowers, plants |
      | 25G1 | plants (in general) | Flowers, plants |
      | 25G7 | language of flowers | Flowers, plants |
      | 25GG3 | fabulous trees | Flowers, plants |
      | 42G | family, relationship, descent | Brothel scenes |
      | 25GG5 | fabulous lower plants | Flowers, plants |
      | 25H151 | deciduous forest | Flowers, plants |
      | 25H152 | forest of coniferous trees | Flowers, plants |
      | 25H153 | bush, shrubs ~ forest | Flowers, plants |
      | 29A | animals acting as human beings | Marine and other animals |
      | 29B | plants behaving as human beings or animals | Marine and other animals |

  26. Partial evaluation
     • Conceptual level
     • Evaluating links, not results of document searches
     • S-Match: 46% precision (subset of IC: 1500 concepts)
     • Falcon-AO: 16% precision (subset of IC)
     Not much sense?
     • Difficult to carry out a complete evaluation
     • Qualitative analysis reveals that improvement is possible
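
The precision figures above are computed over mapping links, not over retrieved documents. A minimal sketch of that measure follows, assuming each proposed link has been judged correct or incorrect by an evaluator; the sample judgements are made up and are not the study's data.

```python
def precision(judgements):
    """Share of evaluated mapping links that were judged correct.

    judgements: iterable of booleans, one per proposed link
    (True = the evaluator accepted the link).
    """
    judgements = list(judgements)
    if not judgements:
        raise ValueError("no evaluated links")
    return sum(judgements) / len(judgements)


# Hypothetical judgements for four proposed links (illustration only).
sample = [True, False, True, False]
print(f"{precision(sample):.0%}")
```

Recall is not covered here: it would require knowing all correct links between the two vocabularies, which ties in with the difficulty of a complete evaluation noted on the slide.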

  27. Nice results (S-Match)
     • Lexical matching: 23L
     • Lemmatization: 25A271
     • Background knowledge: 23U1

  28. Errors
     • Not enough NLP – 23H
     • Wrong WordNet disambiguation – 29D

  29. Steps

  30. Steps
     • Adapted faceted browsing paradigm (Flamenco)
     • Search by navigating through several dimensions
     • Adaptation of the paradigm: from facets corresponding to orthogonal dimensions of object description (‘material’, ‘location’) to facets corresponding to different conceptual schemes (ARIA, IconClass)
     • 3 views (sets of facet definitions) on integrated collections
       • Single view
       • Combined view
       • Merged view

  31. Collections Access: Single View
     • Facets based on 1 concept scheme
     • Access to objects indexed against concepts from other schemes, if there is a mapping between their index concepts and the selected concepts
     A single point of view on the integrated data set
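
The single view can be sketched as a lookup that expands the selected concept with its mapped concepts before matching records. The record identifiers, the prefixes `aria:`, `ic:` and `kb:`, and the dictionary layout are assumptions for illustration; the concept pairs come from the mappings slide.

```python
# Hypothetical index: record id -> set of concepts the record is indexed with.
index = {
    "aria:obj1": {"aria:Flowers, plants"},
    "kb:ms1": {"ic:25G1"},
    "kb:ms2": {"ic:29A"},
}

# Alignment: concept of the facet scheme -> mapped concepts of the other scheme.
alignment = {
    "aria:Flowers, plants": {"ic:25G1", "ic:25G7"},
    "aria:Marine and other animals": {"ic:29A"},
}


def single_view(selected, index, alignment):
    """Return records indexed with the selected concept or a concept mapped to it."""
    wanted = {selected} | alignment.get(selected, set())
    return sorted(rec for rec, concepts in index.items() if concepts & wanted)


print(single_view("aria:Flowers, plants", index, alignment))
```

With this expansion, a user browsing only ARIA facets still reaches the manuscript `kb:ms1`, which is indexed with an IconClass concept.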

  32. Collections Access: Combined View
     • Search based on 2 concept schemes
     Facets attached to the different vocabularies are presented
     Simultaneous access from different points of view on the same data

  33. Collections Access: Merged View
     • Facets using a merged concept scheme with hierarchical links coming from schemes and alignment
     Making the links between vocabularies more visible during search
     A way to ‘enrich’ weakly structured vocabularies
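
A merged concept scheme can be sketched as one hierarchy built from two kinds of links: broader/narrower links inside each scheme and links contributed by the alignment. In the sketch below, the IconClass parent `ic:25G` and the decision to treat an alignment link as a hierarchical link are assumptions made for illustration, not statements about the prototype.

```python
from collections import defaultdict


def merge_hierarchies(*link_sets):
    """Combine (narrower, broader) links from several sources into one child map."""
    children = defaultdict(set)
    for links in link_sets:
        for narrower, broader in links:
            children[broader].add(narrower)
    return children


def descendants(children, root):
    """Collect every concept reachable below root in the merged hierarchy."""
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        for child in children.get(node, ()):
            if child not in seen:  # guards against cycles introduced by mappings
                seen.add(child)
                stack.append(child)
    return seen


# Assumed links inside IconClass (narrower, broader).
iconclass_links = [("ic:25G1", "ic:25G"), ("ic:25G7", "ic:25G")]
# Hypothetical alignment link, treated here as hierarchical.
alignment_links = [("ic:25G", "aria:Flowers, plants")]

merged = merge_hierarchies(iconclass_links, alignment_links)
print(sorted(descendants(merged, "aria:Flowers, plants")))
```

In this toy case the flat ARIA concept gains a subtree borrowed from IconClass, which is the sense in which a merged view can enrich a weakly structured vocabulary.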

  34. Outline
     • Motivations and project
     • Experiment
     • Collection formalization
     • Collection integration
     • Integrated collection access
     • Conclusion

  35. Steps

  36. Lessons learned: Collection Formalization
     Representing different vocabulary types using formal standards is feasible, but not trivial
     • Influence of the use of vocabularies on interpretation
     • Expressivity level is variable (weakly structured models vs. complex ones)
     • Implies some loss of data
     Part of the formalization is application- and system-specific
     • E.g. depending on standard RDF Schema reasoning services for SKOS axioms

  37. Steps
