SLIDE 1

Semantic Web: a short introduction Ivan Herman, Semantic Web Activity Lead, W3C

“Webelopers Day”, Internet NG Conference, Isabel Plaza (Madrid), October 17, 2007

SLIDE 2

Ivan Herman, Semantic Web: a Short Introduction. “Webelopers day”, Isabel Plaza, 17.10.’07 (2)


> Towards a Semantic Web

The current Web represents information using

− natural language (English, Hungarian, Spanish, …)
− graphics, multimedia, page layout

Humans can process this easily

− can deduce facts from partial information
− can create mental associations
− are used to various sensory information
  • (well, sort of… people with disabilities may have serious problems on the Web with rich media!)

SLIDE 3

> Towards a Semantic Web

Tasks often require combining data on the Web:

− hotel and travel information may come from different sites
− searches in different digital libraries
− etc.

Again, humans combine this information easily

− even if different terminologies are used!

SLIDE 4

> However…

However: machines are ignorant!

− partial information is unusable
− difficult to make sense of, e.g., an image
− drawing analogies automatically is difficult
− difficult to combine information automatically
  • is <foo:creator> the same as <bar:author>?
  • how to combine different XML hierarchies?

− …

SLIDE 5

> Example: automatic airline reservation

Your automatic airline reservation

− knows about your preferences
− builds up a knowledge base using your past
− can combine the local knowledge with remote services:
  • airline preferences
  • dietary requirements
  • calendaring
  • etc.

It communicates with remote information (i.e., on the Web!)

− (M. Dertouzos: The Unfinished Revolution)

SLIDE 6

> Example: data(base) integration

Databases are very different in structure, in content

Lots of applications require managing several databases

− after company mergers
− combination of administrative data for e-Government
− biochemical, genetic, pharmaceutical research
− etc.

Most of these data are accessible from the Web (though not necessarily public yet)

SLIDE 7

> And the problem is real…

SLIDE 8

> Example: change of address & the authorities

It means change of address at “official” places
  • so you could still get the right official mail for official notices, tax information, certificates, etc.

… but you never know if you notified the right local, regional, national, etc., authorities, so that they all have your new mail address
  • i.e., you still get some mail from some agency at your old address

It should be possible to change the address in one official place only

− the administration should be smart enough to propagate the change to authorities that need to know about it
− this means that various authorities should be able to merge their data…

SLIDE 9

> Example: “smart” portal

Various types of “portals” are created (for an online journal, for a specific area of knowledge, for specific communities, etc.)

The portals may:

− integrate lots of different data sources
− have access to specialized domain knowledge

The goal is to provide better local access, search on the integrated data, and reveal new relationships among the data

SLIDE 10

> What is needed?

(Some) data should be available for machines for further processing
It should be possible to combine and merge data on a Web scale
Sometimes, data may describe other data (like the library example, using metadata)…
… but sometimes the data is to be exchanged by itself, like my calendar or my travel preferences
Machines may also need to reason about that data

SLIDE 11

> In what follows…

We will use a simplistic example to introduce the main Semantic Web concepts
We take, as an example area, data integration

SLIDE 12

> The rough structure of data integration

1. Map the various data onto an abstract data representation
− make the data independent of its internal representation…
2. Merge the resulting representations
3. Start making queries on the whole!
− queries that could not have been done on the individual data sets

SLIDE 13

> A simplified bookstore data (dataset “A”)

ID                  Author  Title             Publisher  Year
ISBN 0-00-651409-X  id_xyz  The Glass Palace  id_qpr     2000

ID      Name           Home page
id_xyz  Ghosh, Amitav  http://www.amitavghosh.com/

ID      Publ. Name      City
id_qpr  Harper Collins  London

SLIDE 14

> 1st: export your data as a set of relations
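The original slide showed the exported graph as a picture, which did not survive. As a rough stand-in, here is a minimal Python sketch of what "exporting as relations" could look like for dataset “A”. The row values come from the bookstore tables; the `urn:isbn:` form of the book identifier, the `a:` prefix on node names, and every relation name other than `a:author` (for example `a:p_name`) are assumptions made for illustration.

```python
# Rows of dataset "A", copied from the bookstore tables.
books = [{"id": "ISBN 0-00-651409-X", "author": "id_xyz",
          "title": "The Glass Palace", "publisher": "id_qpr", "year": 2000}]
authors = [{"id": "id_xyz", "name": "Ghosh, Amitav",
            "homepage": "http://www.amitavghosh.com/"}]
publishers = [{"id": "id_qpr", "name": "Harper Collins", "city": "London"}]


def isbn_uri(isbn_id):
    """Turn 'ISBN 0-00-651409-X' into 'urn:isbn:000651409X' (assumed URI scheme)."""
    return "urn:isbn:" + isbn_id.replace("ISBN", "").replace("-", "").strip()


def export_a():
    """Return dataset "A" as a set of (subject, relation, object) triples."""
    triples = set()
    for book in books:
        subject = isbn_uri(book["id"])
        triples.add((subject, "a:title", book["title"]))
        triples.add((subject, "a:year", book["year"]))
        # Internal row ids become graph nodes; the "a:" prefix keeps them local to "A".
        triples.add((subject, "a:author", "a:" + book["author"]))
        triples.add((subject, "a:publisher", "a:" + book["publisher"]))
    for author in authors:
        node = "a:" + author["id"]
        triples.add((node, "a:name", author["name"]))
        triples.add((node, "a:homepage", author["homepage"]))
    for publisher in publishers:
        node = "a:" + publisher["id"]
        triples.add((node, "a:p_name", publisher["name"]))
        triples.add((node, "a:city", publisher["city"]))
    return triples


graph_a = export_a()
```

Each table cell becomes one labelled edge, so the three tables collapse into a single small graph whose nodes are the book, the author and the publisher.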

SLIDE 15

> Some notes on exporting the data

Relations form a graph

− the nodes refer to the “real” data or contain some literal
− how the graph is represented in the machine is immaterial for now

Data export does not necessarily mean physical conversion of the data

− relations can be generated on-the-fly at query time
  • via SQL “bridges”
  • scraping HTML pages
  • extracting data from Excel sheets
  • etc.

One can export part of the data
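To make the "on-the-fly" idea a little more concrete, the sketch below uses Python's standard `sqlite3` module as a toy SQL “bridge”: nothing is converted ahead of time, and relations are produced only when someone asks for them. The table layout, the `stock` column and the relation names are hypothetical; a real bridge would be considerably more elaborate.

```python
import sqlite3

# A toy relational source; the "stock" column stands for data that stays private.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (id TEXT, title TEXT, year INTEGER, stock INTEGER)")
conn.execute("INSERT INTO books VALUES (?, ?, ?, ?)",
             ("ISBN 0-00-651409-X", "The Glass Palace", 2000, 12))

# Only these relations are exported: one can export part of the data.
EXPORTED = {"a:title": "title", "a:year": "year"}


def relations(predicate):
    """Yield (subject, predicate, object) triples, computed at query time."""
    column = EXPORTED[predicate]  # KeyError for anything that is not exported
    # The column name comes from the fixed whitelist above, never from user input.
    for subject, value in conn.execute(f"SELECT id, {column} FROM books"):
        yield (subject, predicate, value)


titles = list(relations("a:title"))
```

The database keeps its own schema; only the small mapping in `EXPORTED` has to change if that schema changes.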

SLIDE 16

> Another bookshop data (dataset “F”)

ID               Titre                  Auteur  Traducteur  Original
ISBN 2020386682  Le Palais des miroirs  i_abc   i_qrs       ISBN 0-00-651409-X

ID     Nom
i_abc  Ghosh, Amitav
i_qrs  Besse, Christiane

SLIDE 17

> 2nd: export your second set of data

SLIDE 18

> 3rd: start merging your data

SLIDE 19

> 3rd: start merging your data (cont.)

SLIDE 20

> 3rd: merge identical resources
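The merge pictures are missing from this transcript, so here is a small Python sketch of the idea under simplifying assumptions. Both datasets are written as sets of triples; the `urn:isbn:` identifiers, the `a:`/`f:` node prefixes and the relation names other than `a:author` and `f:auteur` are illustrative choices. Because both datasets name the original book with the same identifier, a plain set union is enough to join them at that node.

```python
ORIGINAL = "urn:isbn:000651409X"      # assumed URI form of ISBN 0-00-651409-X
TRANSLATION = "urn:isbn:2020386682"   # assumed URI form of ISBN 2020386682

dataset_a = {
    (ORIGINAL, "a:title", "The Glass Palace"),
    (ORIGINAL, "a:author", "a:id_xyz"),
    ("a:id_xyz", "a:name", "Ghosh, Amitav"),
}
dataset_f = {
    (TRANSLATION, "f:titre", "Le Palais des miroirs"),
    (TRANSLATION, "f:original", ORIGINAL),
    (TRANSLATION, "f:auteur", "f:i_abc"),
    ("f:i_abc", "f:nom", "Ghosh, Amitav"),
}

# Merging is a union: nodes with identical identifiers become one node.
merged = dataset_a | dataset_f


def relations_touching(graph, node):
    """Names of all relations that start or end at the given node."""
    return {p for (s, p, o) in graph if node in (s, o)}


shared = relations_touching(merged, ORIGINAL)
```

After the union, the node for the original book carries relations from both vocabularies, which is what makes cross-dataset queries possible. Note that the two author nodes (`a:id_xyz` and `f:i_abc`) are still separate at this stage.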

SLIDE 21

> Start making queries…

User of data “F” can now ask queries like:

− « donnes-moi le titre de l’original »
− (i.e., “give me the title of the original”)

This information is not in the dataset “F”…
…but can be retrieved by merging with dataset “A”!
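A hedged sketch of that query in Python, using a cut-down version of the two datasets as sets of triples. The `urn:isbn:` identifiers and the relation names `f:titre`, `f:original` and `a:title` are assumptions for illustration; the point is only that the same query returns nothing on “F” alone and an answer on the merged graph.

```python
ORIGINAL = "urn:isbn:000651409X"      # assumed URI form of the shared ISBN
TRANSLATION = "urn:isbn:2020386682"

dataset_a = {(ORIGINAL, "a:title", "The Glass Palace")}
dataset_f = {
    (TRANSLATION, "f:titre", "Le Palais des miroirs"),
    (TRANSLATION, "f:original", ORIGINAL),
}


def objects(graph, subject, predicate):
    """All objects o such that (subject, predicate, o) is in the graph."""
    return {o for (s, p, o) in graph if s == subject and p == predicate}


def title_of_original(graph, book):
    """Follow f:original from the book, then read a:title on what is found."""
    return {title
            for original in objects(graph, book, "f:original")
            for title in objects(graph, original, "a:title")}


alone = title_of_original(dataset_f, TRANSLATION)
together = title_of_original(dataset_a | dataset_f, TRANSLATION)
```

Here `alone` is empty because “F” knows the original only by its identifier, while `together` contains the English title supplied by “A”.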

SLIDE 22

> However, more can be achieved…

We “feel” that a:author and f:auteur should be the same
But an automatic merge does not know that!
Let us add some extra information to the merged data:

− a:author same as f:auteur
− both identify a “Person”
− a term that a community may have already defined:
  • a “Person” is uniquely identified by his/her name and, say, homepage
  • it can be used as a “category” for certain types of resources

SLIDE 23

> 3rd revisited: use the extra knowledge
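The picture for this step is not in the transcript. The Python sketch below shows one simple way such extra knowledge could be applied to a merged graph of triples. It is a deliberate simplification: the equivalence of `a:name` and `f:nom` is an extra assumption, and two Person nodes are treated as the same resource when their names match, which is weaker than the name-plus-homepage rule mentioned on the previous slide. All identifiers are illustrative.

```python
merged = {
    ("urn:isbn:000651409X", "a:author", "a:id_xyz"),
    ("a:id_xyz", "a:name", "Ghosh, Amitav"),
    ("a:id_xyz", "a:homepage", "http://www.amitavghosh.com/"),
    ("urn:isbn:2020386682", "f:auteur", "f:i_abc"),
    ("f:i_abc", "f:nom", "Ghosh, Amitav"),
}

# Glue: the key means the same as the value (f:nom ~ a:name is an assumption).
SAME_PROPERTY = {"f:auteur": "a:author", "f:nom": "a:name"}


def apply_glue(triples):
    """Rename equivalent relations, then merge Person nodes that share a name."""
    renamed = {(s, SAME_PROPERTY.get(p, p), o) for (s, p, o) in triples}
    # Whatever a book points to through a:author is taken to be a Person.
    persons = {o for (s, p, o) in renamed if p == "a:author"}
    canonical = {}  # name -> node chosen to represent that person
    replace = {}    # node -> representative node
    for s, p, o in sorted(renamed):  # sorted for a deterministic choice
        if p == "a:name" and s in persons:
            replace[s] = canonical.setdefault(o, s)
    return {(replace.get(s, s), p, replace.get(o, o)) for (s, p, o) in renamed}


glued = apply_glue(merged)
```

After this step the French edition points at the same author node as the English one, so everything “A” knows about the author is reachable from “F” data as well.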

SLIDE 24

> Start making richer queries!

User of dataset “F” can now query:

− « donnes-moi la page d’accueil de l’auteur de l’original »
− (i.e., “give me the home page of the original’s author”)

The information is not in datasets “F” or “A”…
…but was made available by:

− merging datasets “A” and “F”
− adding three simple extra statements as an extra “glue”
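As an illustration only, the Python sketch below answers that query by following a path of relations through a merged graph of triples, consulting the glue at query time. A user of “F” asks in F's terms (`f:auteur`), and the glue lets that term match `a:author`. The identifiers, the `follow` helper and the use of `a:homepage` as the last step of the path are assumptions, not part of the original talk.

```python
ORIGINAL = "urn:isbn:000651409X"      # assumed URI form of the shared ISBN
TRANSLATION = "urn:isbn:2020386682"

merged = {
    (TRANSLATION, "f:original", ORIGINAL),
    (ORIGINAL, "a:author", "a:id_xyz"),
    ("a:id_xyz", "a:homepage", "http://www.amitavghosh.com/"),
}

# One of the glue statements: f:auteur means the same as a:author.
GLUE = {"f:auteur": "a:author"}


def follow(graph, start, path, glue=None):
    """Walk the path of relations from start; glue maps a term to its equivalent."""
    glue = glue or {}
    nodes = {start}
    for predicate in path:
        wanted = {predicate, glue.get(predicate, predicate)}
        nodes = {o for (s, p, o) in graph if s in nodes and p in wanted}
    return nodes


path = ["f:original", "f:auteur", "a:homepage"]
without_glue = follow(merged, TRANSLATION, path)
with_glue = follow(merged, TRANSLATION, path, GLUE)
```

Without the glue the walk stops at the original book, because nothing there is labelled `f:auteur`; with it, the walk reaches the author's home page.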

SLIDE 25

> Combine with different datasets

Using, e.g., the “Person”, the dataset can be combined with other sources

For example, data in Wikipedia can be extracted using dedicated tools

− there is active development to add some simple semantic “tags” to Wikipedia entries (so-called “Semantic Wikis”)
− the “dbpedia” project can extract the “infobox” information from Wikipedia already…

SLIDE 26

> Merge with Wikipedia data

SLIDE 27

> Merge with Wikipedia data

SLIDE 28

> Merge with Wikipedia data

SLIDE 29

> Is that surprising?

Maybe but, in fact, no…
What happened via automatic means is done all the time, every day by the users of the Web!
The difference: a bit of extra rigor (e.g., naming the relationships) is necessary so that machines could do this, too

SLIDE 30

> What did we do?

We combined different datasets

− all may be of different origin somewhere on the Web
− all may have different formats (MySQL, Excel sheet, XHTML, etc.)
− all may have different names for relations (e.g., multilingual)

We could combine the data because some URIs were identical (the ISBNs in this case)
We could add some simple additional information (the “glue”), also using common terminologies that a community has produced
As a result, new relations could be found and retrieved

SLIDE 31

> It could become even more powerful

We could add extra knowledge to the merged datasets

− e.g., a full classification of various types of library data
− geographical information
− etc.

This is where ontologies, extra rules, etc., may come in
Even more powerful queries can be asked as a result

SLIDE 32

> What did we do? (cont)

SLIDE 33

> The abstraction pays off because…

… the graph representation is independent of the exact structures in, say, a relational database
… changes in local database schemas, XHTML structures, etc., do not affect the whole, only the “export” step

− “schema independence”

… new data, new connections can be added seamlessly, regardless of the structure of other data sources

SLIDE 34

> So where is the Semantic Web?

The Semantic Web provides technologies to make such integration possible! For example:

− an abstract model for the relational graphs: RDF
− extract RDF information from XML (e.g., XHTML) pages: GRDDL
− add structured information to XHTML pages: RDFa
− a query language adapted for the relational graphs: SPARQL
− characterize the relationships, categorize resources: RDFS, OWL, SKOS, Rules
  • applications may choose among the different technologies
  • some of them may be relatively simple with simple tools (RDFS), whereas some require sophisticated systems (OWL, Rules)
− reuse of existing “ontologies” that others have produced (FOAF in our case)

SLIDE 35

> So where is the Semantic Web? (cont)

SLIDE 36

> SW data begins to accumulate on the Web

IngentaConnect bibliographic metadata storage: over 200 million triples
Tracking the US Congress: data stored in RDF (around 25 million triples)
RDFS/OWL Representation of WordNet: also downloadable as 150MB of RDF/XML
“Département/canton/commune” structure of France published by the French Statistical Institute
Geonames Ontology and associated RDF data: 6 million (and growing) geographical features
RDF Book Mashup, integrating book data from, e.g., Amazon
“dbpedia”: get infobox data of Wikipedia into RDF
See, for example, the linked data index

SLIDE 37

> And what about applications?

SLIDE 38

> A number of projects in data integration

Developments are under way at various companies, institutions

− not always easy to find out the details…

Data integration comes to the fore as one of the SW application areas

SLIDE 39

> Integrate knowledge for Chinese Medicine

Integration of a large number of relational databases (on traditional Chinese medicine) using a Semantic Layer

− around 80 databases, around 200,000 records each

A visual tool to map databases to the semantic layer using a specialized ontology
Form-based query interface for end users

Courtesy of Huajun Chen, Zhejiang University, (SWEO Case Study)

SLIDE 40

> Find the right experts at NASA

Expertise locator for nearly 20,000 NASA civil servants using RDF integration techniques over 6 or 7 geographically distributed databases, data sources, and web services…

Courtesy of Kendall Clark, Clark & Parsia, LLC

SLIDE 41

> Ontology controlled annotation

Annotation of different data formats all along the full drug discovery process…

[Figure: diagram with the labels RDF Triple Store, Web API, Acrobat, Semantic Agents, Automatic Email Alerts, Project Portals, Wikis. Annotated items: Chemical Series, Compounds, Assay Data Points, Scientific Papers, Any PDF, Pathways, Lab data, Collaborations, Targets, BioMarkers, Spreadsheets, Powerpoints, Word…, Websites/Pages, Views of exp data, BrainStorming, Meeting Notes]

Courtesy of Giles Day, Pfizer

SLIDE 42

> Public health surveillance

Integrated biosurveillance system (biohazards, bioterrorism, disease control, etc)

Courtesy of Parsa Mirhaji, School of Health Information Sciences, University of Texas (SWEO Case Study)

SLIDE 43

> Help in choosing the right drug regimen

Help in finding the best drug regimen for a specific case

− find the best trade-off for a patient

Use an ontology for medical conditions, signs, symptoms
Integrate data from various sources (patients, physicians, Pharma, researchers, etc.)

Courtesy of Erick Von Schweber, PharmaSURVEYOR Inc., (SWEO Use Case)

SLIDE 44

> Some other names…

Pfizer, NASA, Eli Lilly, MITRE Corp., Elsevier, …
EU R&D Projects like Sculpteur and Artiste
UN FAO’s MeteoBroker, …
Semantic Digital Library projects (JeromeDL, Simile, Fedora, …)

SLIDE 45

> Web sites, portals, local site search

Portal’s internal organization makes use of semantic data, ontologies

− integration with external and internal data
  • there is a clear overlap here with data integration applications!
− better queries, often based on controlled vocabularies or ontologies…

SLIDE 46

> Semantic portal for art collections

Courtesy of Jacco van Ossenbruggen, CWI, and Guus Schreiber, VU Amsterdam

SLIDE 47

> Semantic portal for cultural heritage

Courtesy of Francisca Hernández, Fundación Marcelino Botín, and Richard Benjamins, iSOCO, (SWEO Case Study)

SLIDE 48

> Help for deep sea drilling operations

Integration of experience and data in the planning and operation of deep sea drilling processes
Discover relevant experiences that could affect current or planned drilling operations

− uses an ontology-backed search engine

Courtesy of David Norheim and Roar Fjellheim, Computas AS (SWEO Use Case)

SLIDE 49

> Portal to Principality of Asturias’ documents

Search through governmental documents
A “bridge” is created between the users and the juridical jargon using SW vocabularies and tools

Courtesy of Diego Berrueta and Luis Polo, CTIC, U. of Oviedo, and the Principality of Asturias, (SWEO Case Study)

SLIDE 50

> Digital music asset portal at NRK

Used by program production to find the right music in the archive for a specific show

Courtesy of Robert Engels, ESIS, and Jon Roar Tønnesen, NRK (SWEO Case Study)

SLIDE 51

> Elsevier’s DOPE browser

Single interface to multiple data sources (in life sciences)
Integration, search, etc., via thesauri and metadata in RDF(S)

Courtesy of Anita de Waard, Elsevier, Christiaan Fluit, Aduna, and Frank van Harmelen, VU Amsterdam (SWEO Use Case)

SLIDE 52

> Intelligent search for public services

Semantic Web based search engine for public services at the municipality of Zaragoza (Spain)
The search is based on a local ontology, natural language processing and ontological reasoning

Courtesy of Jesús Fernando Ruíz, Municipality of Zaragoza (SWEO Use Case)

SLIDE 53

> Vodafone live!

Integrate various vendors’ product descriptions via RDF

− ring tones, games, wallpapers
− manage complexity of handsets, binary formats

A portal is created to offer appropriate content
Significant increase in content download after the introduction

Courtesy of Kevin Smith, Vodafone Group R&D (SWEO Case Study)

SLIDE 54

> Other examples…

Sun’s White Paper and System Handbook collections
Nokia’s S60 support portal
Harper’s Online Magazine
Oracle’s virtual pressroom
Opera’s community site
Dow Jones’ Synaptica

SLIDE 55

> All kinds of other types of applications…

SLIDE 56

> Adobe’s XMP

Metadata is added by, e.g., Photoshop into files in RDF
XMP is a way of embedding + vocabulary + a set of (public) tools (there are also 3rd party tools to extract the RDF content)
Used by a number of platform solutions

SLIDE 57

> Natural interface to business applications

Courtesy of C. Anantaram, Tata Consultancy Services Limited (SWEO Case Study)

Users interact with a business application (eg, via email) in natural language; OWL helps in the retrieval of relevant concepts

SLIDE 58

> Suggestions’ database…

Employees of the bank can submit new ideas for innovation, improving the business process, reducing costs, etc.
The entry system analyses the entry and shows similar ideas already in the system, based on the concepts (not words)
User gets immediate feedback; system gets better search, analysis, etc.

Courtesy of José Luís Bas Uribe, Bankinter, and Richard Benjamins, iSOCO, (SWEO Case Study)

SLIDE 59

> Other application areas come to the fore

Content management
Business intelligence
Collaborative user interfaces
Sensor-based services
Linking virtual communities
Grid infrastructure
Multimedia data management
Etc.

SLIDE 60

> Conclusions

The Semantic Web is there to integrate data on the Web
The goal is the creation of a Web of Data

SLIDE 61

> Thank you for your attention!

These slides are publicly available on:

http://www.w3.org/2007/Talks/1017-Madrid-IH/