The Semantic Web: Web of (integrated) Data Frank van Harmelen - PDF document

  1. The Semantic Web: Web of (integrated) Data
     Frank van Harmelen, Vrije Universiteit Amsterdam

     Take home message
     - Semantic Web = Web of Data (no longer only a web of text and a web of pictures)
     - Set of open, stable W3C standards
     - Rapidly emerging tools & vendors
     - Use cases:
       - data integration
       - web services
       - knowledge management
       - search (intranets)

  2. Outline
     - The vision
     - What is required
     - Machine representation
     - XML, RDF, OWL
     - Where are we now?
     - Examples

     Things we would like to do on the Web

  3. “Intelligent” things we can’t do today
     - Search engines
       - concepts, not keywords
       - semantic narrowing/widening of queries
     - Shopbots
       - semantic interchange, not screenscraping
     - E-commerce
       - negotiation, catalogue mapping, personalisation
     - Web Services
       - need semantic characterisations to find them and to combine them
     - Navigation
       - by semantic proximity, not hardwired links
     - ...

     Why can’t Google do this…

  4. Other use cases are
     - personalisation
     - semantic linking
     - data integration
     - web services
     - ...

     Sounds good, so... how is this tackled?

  5. Outline
     - The vision
     - What is required
     - Machine representation
     - XML, RDF, OWL
     - Where are we now?
     - Examples

     Machine accessible meaning (what it’s like to be a machine)
     (figure labels: name, symptoms, disease, drug, administration)
     Meta-data!

  6. What is meta-data?
     (figure labels: name, symptoms, disease, drug, administration)
     - it's just data
     - it's data describing other data
     - it's meant for machine consumption

     Meta-data + ontologies
     (figure: the tags <treatment>, <name>, <symptoms>, <disease>, <drug> and <drug administration>, with the link labels "IS-A" and "reduces")

  7. What’s inside an ontology?
     - terms + specialisation hierarchy
     - classes + class-hierarchy
     - instances
     - slots/values
     - inheritance (multiple? defaults?)
     - restrictions on slots (type, cardinality)
     - properties of slots (symmetric, transitive, …)
     - relations between classes (disjoint, covers)
     - reasoning tasks: classification, subsumption
     (arrow alongside the list: increasing semantic “weight”)

     In short (for the duration of this tutorial)
     - Ontologies are not definitive descriptions of what exists in the world (= philosophy)
     - Ontologies are shared models of the world, constructed to facilitate communication
     - Yes, ontologies exist (because we build them)
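The "classes + class-hierarchy" and "subsumption" items above can be illustrated with a minimal sketch. It assumes a toy hierarchy stored as a plain dictionary; the class names and the helper functions `ancestors` and `is_subsumed_by` are hypothetical and do not come from the slides or from any real thesaurus.

```python
# Toy ontology: each class maps to its direct superclasses.
# Multiple inheritance is allowed because the value is a list.
# All class names are illustrative assumptions.
SUPERCLASSES = {
    "Treatment": [],
    "Drug": ["Treatment"],
    "Antibiotic": ["Drug"],
    "Penicillin": ["Antibiotic"],
    "Disease": [],
}


def ancestors(cls, hierarchy):
    """Return every class above `cls`, following the hierarchy transitively."""
    seen = set()
    stack = list(hierarchy.get(cls, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(hierarchy.get(parent, []))
    return seen


def is_subsumed_by(sub, sup, hierarchy):
    """Subsumption check: does `sup` equal `sub` or lie above it in the hierarchy?"""
    return sub == sup or sup in ancestors(sub, hierarchy)


print(is_subsumed_by("Penicillin", "Treatment", SUPERCLASSES))
print(is_subsumed_by("Penicillin", "Disease", SUPERCLASSES))
```

Real ontology reasoners do considerably more (slot restrictions, disjointness, classification of defined classes); this sketch covers only the lightweight end of the "semantic weight" scale.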

  8. Real life examples
     - handcrafted (often by communities)
     - music (classes/depth levels): CDnow (2410/5), MusicMoz (1073/7)
     - biomedical: SNOMED (200k), GO (15k), Emtree (45k + 190k)
     - ranging from lightweight (Yahoo, UNSPSC) to heavyweight (Cyc)
     - ranging from small (METAR) to large (UNSPSC)

     All right, but how do we represent all this in a computer?

  9. Outline
     - The vision
     - What is required
     - Machine representation
     - XML, RDF, OWL
     - Where are we now?
     - Examples

     Semantic Web “architecture”

  10. What was XML again?

      <country name="Netherlands">
        <capital name="Amsterdam">
          <areacode>020</areacode>
        </capital>
      </country>

      (tree diagram: country has name "Netherlands" and a capital; capital has name "Amsterdam" and areacode "020")

      So why not just use XML?

      <country name="Netherlands">
        <capital name="Amsterdam">
          <areacode>020</areacode>
        </capital>
      </country>

      <nation>
        <name>Netherlands</name>
        <capital>Amsterdam</capital>
        <capital_areacode>020</capital_areacode>
      </nation>

      No agreement on:
      - structure
        - is country an object? a class? an attribute? a relation? something else?
        - what does nesting mean?
      - vocabulary
        - is country the same as nation?

      Questions:
      - Are the above XML documents the same?
      - Do they convey the same information?
      - Is the answer machine-derivable?
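The two questions on the slide (same documents? same information?) can be made concrete with a short sketch using Python's standard `xml.etree.ElementTree` parser. The helpers `shape` and `leaf_values` are hypothetical names introduced here for illustration. The sketch only shows that a parser sees two different trees carrying the same literal values; it does not decide whether they mean the same thing.

```python
import xml.etree.ElementTree as ET

DOC_A = """<country name="Netherlands">
  <capital name="Amsterdam">
    <areacode>020</areacode>
  </capital>
</country>"""

DOC_B = """<nation>
  <name>Netherlands</name>
  <capital>Amsterdam</capital>
  <capital_areacode>020</capital_areacode>
</nation>"""


def shape(element):
    """Reduce an element to (tag, sorted attributes, stripped text, child shapes)."""
    text = (element.text or "").strip()
    return (
        element.tag,
        tuple(sorted(element.attrib.items())),
        text,
        tuple(shape(child) for child in element),
    )


def leaf_values(element):
    """Collect every attribute value and non-empty text value, ignoring structure."""
    values = set(element.attrib.values())
    text = (element.text or "").strip()
    if text:
        values.add(text)
    for child in element:
        values |= leaf_values(child)
    return values


tree_a = ET.fromstring(DOC_A)
tree_b = ET.fromstring(DOC_B)

# Structurally the documents differ: different tags, nesting and attribute use.
same_structure = shape(tree_a) == shape(tree_b)
# Yet both carry exactly the same literal values.
same_values = leaf_values(tree_a) == leaf_values(tree_b)
print(same_structure, same_values)
```

Nothing in either document tells the program that `country` and `nation` name the same concept, which is the gap the slide points at.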

  11. So: XML ≠ machine accessible meaning
      (figure: each tag as a human reads it, next to the same tag in an unreadable script)
      - <name>       <ναμε>
      - <education>  <εδυχατιον>
      - <CV>         <Χς>
      - <work>       <ωορκ>
      - <private>    <πριϖατε>

      The semantic pyramid again

  12. W3C Stack
      - XML: surface syntax, no semantics
      - XML Schema: describes structure of XML documents
      - RDF: datamodel for “relations” between “things”
      - RDF Schema: RDF Vocabulary Definition Language
      - OWL: a more expressive Vocabulary Definition Language

      RDF & RDF Schema
      - RDF =
        - relations between things
        - all objects are URLs (both things and relations)
      - RDF Schema =
        - hierarchical organisation of an RDF vocabulary
        - all things are URLs (classes of things, subclass relations)
      - For more details: see slides later today
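As a rough illustration of "relations between things" where both things and relations are URLs, the sketch below stores RDF-style triples as plain Python tuples and applies the RDF Schema subclass rule by hand. The `http://example.org/` namespace, the resource names and the function `infer_types` are assumptions made for this example; only the `rdf:type` and `rdfs:subClassOf` URLs are the standard W3C ones. This is not an RDF library, just the data model in miniature.

```python
# Standard W3C vocabulary URLs.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"

# Hypothetical namespace for the example resources.
EX = "http://example.org/"

# Each triple is (subject, predicate, object); every position holds a URL.
triples = {
    (EX + "Amsterdam", RDF_TYPE, EX + "Capital"),
    (EX + "Capital", RDFS_SUBCLASS_OF, EX + "City"),
    (EX + "City", RDFS_SUBCLASS_OF, EX + "Place"),
    (EX + "Amsterdam", EX + "capitalOf", EX + "Netherlands"),
}


def infer_types(graph):
    """Add (x type D) whenever (x type C) and (C subClassOf D), until nothing new appears."""
    inferred = set(graph)
    changed = True
    while changed:
        changed = False
        subclass_edges = [(s, o) for s, p, o in inferred if p == RDFS_SUBCLASS_OF]
        for subject, predicate, cls in list(inferred):
            if predicate != RDF_TYPE:
                continue
            for sub, sup in subclass_edges:
                if sub == cls and (subject, RDF_TYPE, sup) not in inferred:
                    inferred.add((subject, RDF_TYPE, sup))
                    changed = True
    return inferred


closure = infer_types(triples)
print((EX + "Amsterdam", RDF_TYPE, EX + "Place") in closure)
```

The point of the sketch is that the class hierarchy is itself expressed as triples, so the same data model carries both the data and its vocabulary.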

  13. The semantic pyramid again

      OWL: things RDF Schema can’t do
      - equality
      - enumeration
      - number restrictions
        - single-valued/multi-valued
        - optional/required values
      - inverse, symmetric, transitive
      - boolean algebra
        - union, complement

      Again: for more details, see slides later today
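To give a feel for what "inverse, symmetric, transitive" add beyond RDF Schema, here is a small sketch that treats a property as a set of (subject, object) pairs and computes what each characteristic lets a reasoner derive. The property names `located_in` and `borders`, the example pairs and the three helper functions are hypothetical; this is plain set manipulation, not OWL syntax or an OWL reasoner.

```python
def symmetric_closure(pairs):
    """If (a, b) holds for a symmetric property, (b, a) holds as well."""
    return set(pairs) | {(b, a) for a, b in pairs}


def transitive_closure(pairs):
    """If (a, b) and (b, c) hold for a transitive property, (a, c) holds as well."""
    closure = set(pairs)
    while True:
        new_pairs = {(a, d) for a, b in closure for c, d in closure if b == c}
        if new_pairs <= closure:
            return closure
        closure |= new_pairs


def inverse(pairs):
    """The inverse property holds for (b, a) exactly when the original holds for (a, b)."""
    return {(b, a) for a, b in pairs}


# Illustrative data, assumed for this example only.
located_in = {("Amsterdam", "Netherlands"), ("Netherlands", "Europe")}
borders = {("Netherlands", "Belgium")}

print(transitive_closure(located_in))
print(symmetric_closure(borders))
print(inverse(located_in))
```

Declaring such characteristics once in the vocabulary means the derived facts need not be stored by every data source.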

  14. Sounds good in theory. How far are you with this in practice?

      Where are we now: tools
      - Languages are stable (W3C)
      - Tooling is rapidly emerging
        - HP, IBM, Oracle, Adobe, …
        - parsers
        - editors
        - visualisers
        - large scale storage and querying
        - portal generation
      (vendors named on the slide: Aduna, Intellidimension)

  15. Three example use-cases
      - Closed-world data integration: DOPE browser @ Elsevier
      - Open-world data integration: streaming media @ Philips
      - Semantic Web services
      - Conclusions

      Closed-world data integration: DOPE Browser @ Elsevier
      This section is joint work with Aduna and Anita de Waard @ Elsevier

  16. Background
      - Vertical Information Provision
        - Buy a topic instead of a journal!
        - Web provides new opportunities
      - Business driver: drug development
        - Rich, information-hungry market
        - Good thesaurus (EMTREE)

      The Data
      - Document repositories:
        - ScienceDirect: approx. 500,000 full-text articles
        - MEDLINE: approx. 10,000,000 abstracts
      - Extracted metadata
        - The Collexis Metadata Server: concept-extraction ("semantic fingerprinting")
      - Thesauri and ontologies
        - EMTREE: 60,000 preferred terms, 200,000 synonyms

  17. Architecture
      (diagram labels: query interface; RDF Schema; EMTREE; RDF, RDF, …; Datasource 1 … Datasource n)

  21. Web-based data integration scenario:
      - heterogeneous
      - open
      This section uses material from Zharko Aleksovski @ VU & Philips

  22. Motivating scenario
      (diagram: user devices reach music providers through the Semantic Web)
      - Providers: LaunchCast, iTunes, Rhapsody, Buy.com, Napster, consumer.philips.com, MusicNow, eMusic, Wal*Mart, MusicNet, Musicmatch

      Example
      (diagram labels: “Hits” from the “60s”; “Evergreens”; Mediator; Music Ontology)
      - Evergreens and Golden hits are related: Golden hits is mostly a subclass of Evergreens

  23. Domain characteristics
      - Many music providers
      - Wide variety of music offered
      - Constantly increasing in size and evolving
      - Cumbersome to browse and retrieve music
      - There is no agreement
        - Different terms are used
        - The same terms contain different sets of artists

      Data sources
      - CDNow (Amazon.com): 2410 classes, depth 5 levels
      - ArtistGigs: 382 classes, depth 4 levels
      - Artist Direct Network: 465 classes, depth 2 levels
      - CD baby: 222 classes, depth 2 levels
      - Yahoo: 96 classes, depth 3 levels
      - All Music Guide: 403 classes, depth 2 levels
      - MusicMoz: 1073 classes, depth 7 levels

  24. Why approximate matching
      - Genre is not precisely defined
      - Pop and Rock have no common definition on the big portals AllMusic.com, Amazon.com and MP3.com
      - Exact reasoning will not be useful
      (diagram: classes A and X, with a 99% / 1% split)

      Results
      - A = AllMusicGuide
      - B = ArtistDirectNetwork
      (chart: three series, "B subClass of A", "A subClass of B" and "equivalences"; vertical axis from 0 to 600000, horizontal axis from 0.0 to 1.0)
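One simple way to read "Golden hits is mostly subclass of Evergreens" is as a thresholded overlap between the artist sets of two genre classes. The sketch below follows that reading. It is an assumption made for illustration, not the matching method used in the work the slides report on; the functions `overlap_ratio` and `classify`, the artist sets and the threshold values are all hypothetical.

```python
def overlap_ratio(sub_instances, sup_instances):
    """Fraction of `sub_instances` that also occur in `sup_instances`."""
    if not sub_instances:
        return 0.0
    return len(sub_instances & sup_instances) / len(sub_instances)


def classify(a, b, threshold):
    """Label the relation between classes A and B for a given overlap threshold."""
    a_in_b = overlap_ratio(a, b) >= threshold
    b_in_a = overlap_ratio(b, a) >= threshold
    if a_in_b and b_in_a:
        return "equivalent"
    if a_in_b:
        return "A subClassOf B"
    if b_in_a:
        return "B subClassOf A"
    return "unrelated"


# Illustrative artist sets, invented for this example.
golden_hits = {"The Beatles", "Elvis Presley", "ABBA", "Queen"}
evergreens = {
    "The Beatles", "Elvis Presley", "ABBA",
    "Frank Sinatra", "Nat King Cole", "Ella Fitzgerald",
}

# 3 of 4 Golden hits artists are Evergreens (0.75); 3 of 6 the other way (0.5).
for threshold in (0.5, 0.7, 1.0):
    print(threshold, classify(golden_hits, evergreens, threshold))
```

With exact reasoning (threshold 1.0) the two classes come out unrelated, which matches the slide's remark that exact reasoning will not be useful; lowering the threshold recovers the approximate subclass relation.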

  25. Semantic Web Services
      This section uses material from Marta Sabou @ VU

      What are web services?
      - A web service is a software system designed to support interoperable machine-to-machine interaction over a network.
      - It has an interface described in a machine-processable format (specifically WSDL).
      - Other systems interact with a web service in a manner specified by its description, using SOAP messages.

  26. Web Service Tasks
      - Web Service Discovery & Selection
        - Find an airline that can fly me to Marina del Rey
      - Web Service Invocation
        - Book flight tickets from NWA to arrive 12th Oct.
      - Web Service Composition & Interoperation
        - Arrange taxis, flights and hotel for travel from Southampton to Portland, OR, via Marina del Rey, CA.
      - Web Service Execution Monitoring
        - Has the taxi to Gatwick Airport been reserved yet?

      Limitations of WS Technology
      - Manual discovery
      - Manual invocation
      - Manual (ad hoc) mediation
      - Manual (ad hoc) composition
