Towards semantic web: adding meaning and trust to the web by XML
Airi Salminen, University of Jyväskylä - PowerPoint PPT Presentation


SLIDE 1

Airi Salminen, Towards semantic web, TUCS 28.11.2002

Towards semantic web: adding meaning and trust to the web by XML

Airi Salminen University of Jyväskylä http://www.cs.jyu.fi/~airi/ TUCS 28.11.2002

SLIDE 2

Outline

  • 1. Milestones of the web
  • 2. What is XML?
  • 3. Why XML evolved
  • 4. What is semantic web?
  • 5. Metadata on the web
  • 6. XML as metadata
  • 7. The RDF model
  • 8. Semantic web architecture
  • 9. XML-based languages for semantic web
  • 10. Related research at the University of Jyväskylä

SLIDE 3

  • 1. Milestones of the web

1960-1980 ... Infrastructure for the Internet
  • RFC = Request for Comments
  • TCP/IP

1986 ... SGML (Standard Generalized Markup Language)
1991 ... WWW, HTML, Internet Society

SLIDE 4

  • 1. Milestones of the web

1992 ... computers connected to the Internet > 1.000.000
1994 ... W3C = World Wide Web Consortium
1996 ... PICS = Platform for Internet Content Selection
1998 ... XML, Dublin Core
1999 ... RDF = Resource Description Framework
2000 ... computers connected to the Internet > 100.000.000

SLIDE 5

  • 2. What is XML?

XML = Extensible Markup Language

A set of rules for defining and representing information as structured documents for applications on the Internet; a restricted form of SGML (Standard Generalized Markup Language)

  • T. Bray, J. Paoli, C. M. Sperberg-McQueen, and E. Maler (Eds.), Extensible Markup Language (XML) 1.0 (Second Edition), W3C Recommendation 6 October 2000, http://www.w3.org/TR/2000/REC-xml-20001006

SLIDE 6

  • 2. What is XML?

Rule 1: Information is represented in units called XML documents.

Rule 2: An XML document contains one or more elements.

Rule 3: An element has a name, it is denoted in the document by explicit markup, it can contain other elements, and it can be associated with attributes.

and lots of other rules ...

SLIDE 7

  • 2. What is XML?

<?xml version="1.0"?>
<poem author="Murasaki Shikibu" author_born="974">
  <info_link xmlns:xlink="http://www.w3.org/1999/xlink"
             xlink:type="simple"
             xlink:href="http://digital.library.upenn.edu/women/omori/court/murasaki.html">
    About the author
  </info_link>
  <stanza>
    <line>This life of ours would not cause you sorrow</line>
    <line>if you thought of it as like </line>
    <line>the mountain cherry blossoms</line>
    <line>which bloom and fade in a day. </line>
  </stanza>
</poem>

Example of an XML document

Note: The text of the line elements is taken from http://www.slip.net/~knabb/rexroth/translations/japanese.htm, containing Kenneth Rexroth’s translations of Japanese poetry
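The three rules can be observed by reading the poem document with an XML parser. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the helper name outline is hypothetical and the info_link element is left out to keep the example short.

```python
import xml.etree.ElementTree as ET

# The poem document from the slide, without the info_link element.
POEM = """<?xml version="1.0"?>
<poem author="Murasaki Shikibu" author_born="974">
  <stanza>
    <line>This life of ours would not cause you sorrow</line>
    <line>if you thought of it as like</line>
    <line>the mountain cherry blossoms</line>
    <line>which bloom and fade in a day.</line>
  </stanza>
</poem>"""


def outline(element, depth=0):
    """Return one text line per element: indented name plus its attributes."""
    lines = ["  " * depth + element.tag + " " + str(dict(element.attrib))]
    for child in element:  # Rule 3: an element can contain other elements
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(POEM)  # Rules 1 and 2: one document, one root element
print("\n".join(outline(root)))
```

The output lists poem with its two attributes, then stanza, then four line elements, which mirrors the nesting expressed by the markup.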

SLIDE 8

  • 2. What is XML?

XML is a metalanguage, not a specific language

  • Defines the rules for how to mark up a document; does not define the names used in markup.
  • Includes capability to prescribe a document type by a collection of declarations to constrain the markup permitted in a class of documents.
  • Intended for all natural languages, regardless of character set, orientation of script, etc.

SLIDE 9

  • 2. What is XML?

Document type declaration for a poem

<!DOCTYPE poem [
  <!ELEMENT poem (info_link?, title?, stanza+)>
  <!ATTLIST poem
      author      CDATA #REQUIRED
      author_born CDATA #IMPLIED>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT info_link (#PCDATA)>
  <!ATTLIST info_link
      xmlns:xlink CDATA #FIXED "http://www.w3.org/1999/xlink"
      xlink:type  CDATA #FIXED "simple"
      xlink:href  CDATA #REQUIRED>
  <!ELEMENT stanza (line+)>
  <!ELEMENT line (#PCDATA)>
]>
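An element declaration constrains which children may appear and in what order, so a content model behaves much like a regular expression over child element names. The sketch below is only an illustration of that idea, not a DTD validator: the content model of poem is read here as (info_link?, title?, stanza+), the regular expressions are hand-translated, the names CONTENT_MODELS and conforms are hypothetical, and attribute declarations are not checked.

```python
import re
import xml.etree.ElementTree as ET

# Content models from the DTD, hand-translated into regular expressions
# over space-terminated child element names (an assumption of this sketch).
CONTENT_MODELS = {
    "poem": r"(info_link )?(title )?(stanza )+",
    "stanza": r"(line )+",
    "title": r"",      # (#PCDATA): no child elements
    "info_link": r"",  # (#PCDATA): no child elements
    "line": r"",       # (#PCDATA): no child elements
}


def conforms(element):
    """Check the children of element, and of all descendants, against the models."""
    model = CONTENT_MODELS.get(element.tag)
    if model is None:
        return False  # element name not declared
    children = "".join(child.tag + " " for child in element)
    if re.fullmatch(model, children) is None:
        return False
    return all(conforms(child) for child in element)


valid = ET.fromstring(
    "<poem author='x'><title>t</title><stanza><line>a</line></stanza></poem>")
no_stanza = ET.fromstring("<poem author='x'><title>t</title></poem>")
print(conforms(valid), conforms(no_stanza))
```

A validating XML processor performs this kind of check, together with many others, directly from the declarations.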

SLIDE 10

  • 2. What is XML?

[Diagram: XML document → XML processor → application. The XML processor may or may not be “validating”; the information it provides to the application is the “XML Information Set”.]
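The division of labour between processor and application can be sketched with Python's standard xml.sax module, whose default expat-based parser is non-validating. The parser plays the role of the XML processor, and the handler class (a hypothetical name, ElementCounter) plays the role of the application that only sees what the processor reports.

```python
import xml.sax


class ElementCounter(xml.sax.ContentHandler):
    """The 'application': it receives element and text events from the processor."""

    def __init__(self):
        super().__init__()
        self.counts = {}  # element name -> number of occurrences
        self.text = []    # character data, in document order

    def startElement(self, name, attrs):
        self.counts[name] = self.counts.get(name, 0) + 1

    def characters(self, content):
        self.text.append(content)


DOCUMENT = (b"<stanza><line>the mountain cherry blossoms</line>"
            b"<line>which bloom and fade in a day.</line></stanza>")

handler = ElementCounter()
xml.sax.parseString(DOCUMENT, handler)  # the processor reads, the handler reacts
print(handler.counts)
```

The application never touches the angle brackets itself; it works with the names and character data that the processor extracts.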

SLIDE 11

  • 3. Why XML evolved

After the breakthrough of WWW and HTML there was an urgent need for a new, common data format for the Internet

Needs:

  • Simple, common rules that are easy to understand by people with different backgrounds (like HTML)
  • Capability to describe Internet resources and their relationships (like HTML)
  • Capability to define information structures for different kinds of business sectors (unlike HTML, like SGML)

SLIDE 12

  • 3. Why XML evolved

Needs (cont’d):

  • Format formal enough for computers and clear enough to be human-legible (like SGML)
  • Rules simple enough to allow easy building of software (unlike SGML)
  • Strong support for diverse natural languages (unlike SGML)

SLIDE 13

  • 4. What is semantic web?

The abstract representation of data on the World Wide Web, based on the RDF standards and other standards to be defined. It is being developed by the W3C, in collaboration with a large number of researchers and industrial partners

W3C Semantic Web Activity, http://www.w3.org/TR/2001/sw/

SLIDE 14

  • 4. What is semantic web?

An extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation

Tim Berners-Lee, James Hendler, Ora Lassila, The Semantic Web, Scientific American, May 2001. http://www.scientificamerican.com/2001/0501issue/0501berners-lee.html

SLIDE 15

  • 4. What is semantic web?

  • Web resources consist of primary resources and metadata resources.
  • Metadata resources are related to the meaning, use, and trustworthiness of the (primary) resources.
  • Metadata resources are first-class web resources.
  • Metadata is in standardized formats readable both by software and people.

SLIDE 16

  • 4. What is semantic web?

  • Formats based on XML and RDF.
  • Major portion of the primary resources written in various natural languages used in various communities.
  • Homogeneous metadata about heterogeneous content.
  • Enabling merging of resources.

SLIDE 17

  • 4. What is semantic web?

  • Automated reasoning about meaning and trustworthiness.
  • Enabling extensive cooperation of software.
  • Enabling and requiring cooperation of people in communities having shared understanding of the meaning of the content and shared values.
  • Development coordinated by W3C.

SLIDE 18

  • 5. Metadata on the web

metadata = data about web resources

Metadata can be about:

  • documents
  • databases
  • applications
  • services

SLIDE 19

  • 5. Metadata on the web

Examples of metadata

About a document (can be given, for example, by Dublin Core elements):

  • title
  • creator
  • subject
  • format
  • identifier
  • description
  • publisher
  • rights
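As an illustration, a few of these elements can be filled in for the course home page cited in this talk and embedded in an HTML page. The sketch below assumes the common convention of HTML meta elements named with a "DC." prefix; the function name to_meta_tags is hypothetical, and only values that appear in this talk are used.

```python
from html import escape

# Dublin Core description of the course home page cited in this talk.
# The element names come from the slide; the values come from the talk itself.
record = {
    "title": "Semanttinen web",
    "creator": "Airi Salminen",
    "identifier": "http://www.cs.jyu.fi/~airi/opetus/SemanttinenWeb.html",
}


def to_meta_tags(dc_record):
    """Render each Dublin Core element as an HTML meta element (embedded metadata)."""
    return [
        '<meta name="DC.{}" content="{}">'.format(name, escape(value, quote=True))
        for name, value in dc_record.items()
    ]


for tag in to_meta_tags(record):
    print(tag)
```

The same record could equally be stored outside the document, for example in a separate RDF description.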

SLIDE 20

  • 5. Metadata on the web

Examples of metadata (cont’d)

About a document repository:

  • structure (DTD, XML Schema)
  • words in the content (indexes)
  • concepts and their meanings (ontologies)

SLIDE 21

  • 5. Metadata on the web

Examples of metadata (cont’d)

About metadata in a repository:

  • vocabularies of the markup (namespace, DTD, XML Schema)
  • vocabularies in the metadata descriptions (RDF Schema)
  • data types in the schemas (XML Schema type definitions)

SLIDE 22

  • 5. Metadata on the web

Examples of metadata (cont’d)

  • users of an application
  • access rights related to the resources of a community
  • annotations for a document (Annotea)
  • business process where documents are created

SLIDE 23

  • 5. Metadata on the web

Metadata classifications

  • embedded / external
  • centralized / distributed
  • created by people / created by software

SLIDE 24

  • 6. XML as metadata

  • The markup used in a document serves as metadata in relationship to the character data.
  • The declarations associated with a class of documents serve as metadata in relationship to the documents.

SLIDE 25

  • 6. XML as metadata

<?xml version="1.0"?>
<poem author="Murasaki Shikibu" author_born="974">
  <info_link xmlns:xlink="http://www.w3.org/1999/xlink"
             xlink:type="simple"
             xlink:href="http://digital.library.upenn.edu/women/omori/court/murasaki.html">
    About the author
  </info_link>
  <stanza>
    <line>This life of ours would not cause you sorrow</line>
    <line>if you thought of it as like </line>
    <line>the mountain cherry blossoms</line>
    <line>which bloom and fade in a day. </line>
  </stanza>
</poem>

SLIDE 26

  • 6. XML as metadata

This life of ours would not cause you sorrow
if you thought of it as like
the mountain cherry blossoms
which bloom and fade in a day.

More information about the poet

SLIDE 27

  • 6. XML as metadata

Metadata expressed in the markup:

  • The document is called a poem and it consists of elements called info_link and stanza, and the stanza consists of elements called line.
  • The author of the poem is Murasaki Shikibu, born in 974.
  • The element info_link with the text content "About the author" is a simple link referring to the Web resource at http://digital.library.upenn.edu/women/omori/court/murasaki.html
  • ...
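The split between character data and the metadata carried by the markup can be made explicit in code. This is a minimal sketch, assuming Python's standard xml.etree.ElementTree; the dictionary keys are hypothetical labels, and the poem is shortened to two lines.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"

DOCUMENT = """<poem author="Murasaki Shikibu" author_born="974">
  <info_link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple"
             xlink:href="http://digital.library.upenn.edu/women/omori/court/murasaki.html">About the author</info_link>
  <stanza>
    <line>the mountain cherry blossoms</line>
    <line>which bloom and fade in a day.</line>
  </stanza>
</poem>"""

root = ET.fromstring(DOCUMENT)

# Metadata carried by the markup: element names, attributes, link target.
metadata = {
    "document_type": root.tag,
    "author": root.get("author"),
    "author_born": root.get("author_born"),
    # Namespaced attributes are addressed as {namespace-URI}local-name.
    "link": root.find("info_link").get("{" + XLINK + "}href"),
}

# Character data: the text of the line elements only.
content = [line.text for line in root.iter("line")]

print(metadata)
print(content)
```

Nothing in the metadata dictionary is part of the poem's text; all of it comes from the markup around the text.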

SLIDE 28

  • 6. XML as metadata

The DTD also provides metadata

<!DOCTYPE poem [
  <!ELEMENT poem (info_link?, title?, stanza+)>
  <!ATTLIST poem
      author      CDATA #REQUIRED
      author_born CDATA #IMPLIED>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT info_link (#PCDATA)>
  <!ATTLIST info_link
      xmlns:xlink CDATA #FIXED "http://www.w3.org/1999/xlink"
      xlink:type  CDATA #FIXED "simple"
      xlink:href  CDATA #REQUIRED>
  <!ELEMENT stanza (line+)>
  <!ELEMENT line (#PCDATA)>
]>

SLIDE 29

  • 6. XML as metadata

The metadata provided by the DTD

Vocabulary: poem, stanza, line, author, ...

Structure:

  • The documents are called poems.
  • A poem may contain an element called title and it always contains one or more elements called stanza.
  • A poem may be linked to a resource by a simple link.
  • For each poem there is information about the author and possibly about the year of birth of the author.

SLIDE 30

  • 7. The RDF model

RDF = Resource Description Framework

a model for describing web resources

resource
  • anything that can be identified on the Internet; identification by URI
  • examples: file, service, site, part of a file, book, person, company

RDF Specification: http://www.w3.org/TR/REC-rdf-syntax/

SLIDE 31

  • 7. The RDF model

Examples of resources

resource                                               URI
home page of a course                                  http://www.cs.jyu.fi/~airi/opetus/SemanttinenWeb.html
Department of CS & IS at the University of Jyväskylä   http://cs.jyu.fi
Airi Salminen                                          http://cs.jyu.fi/henkilot/asalminen
Home page of Airi Salminen                             http://www.cs.jyu.fi/~airi/

SLIDE 32

  • 7. The RDF model

An RDF description consists of statements.

A statement is a triple expressing the value of a property of a resource: (property, resource, value)

(language, http://www.cs.jyu.fi/~airi/opetus/SemanttinenWeb.html, "fi")
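A statement of this form maps naturally onto a three-field record. The following minimal sketch keeps the field order used on the slide; the names Statement, prop, and values_of are hypothetical, chosen only for this illustration.

```python
from collections import namedtuple

# Field order follows the slide: (property, resource, value).
# The first field is called prop to avoid shadowing Python's built-in property.
Statement = namedtuple("Statement", ["prop", "resource", "value"])

COURSE_PAGE = "http://www.cs.jyu.fi/~airi/opetus/SemanttinenWeb.html"

# An RDF description is a collection of statements.
description = [
    Statement("language", COURSE_PAGE, "fi"),
]


def values_of(statements, resource, prop):
    """Return every value stated for one property of one resource."""
    return [s.value for s in statements
            if s.resource == resource and s.prop == prop]


print(values_of(description, COURSE_PAGE, "language"))
```

Because each statement is self-contained, descriptions from different sources can be merged simply by combining their statement lists.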

SLIDE 33

  • 7. The RDF model

(dc:Creator, http://www.cs.jyu.fi/~airi/opetus/SemanttinenWeb.html, "Airi Salminen")

(dc:Language, http://www.cs.jyu.fi/~airi/opetus/SemanttinenWeb.html, "fi")
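These two statements can be written in the XML syntax of RDF. The sketch below builds such a serialization with Python's standard xml.etree.ElementTree. It is an illustration under assumptions: the function name describe is hypothetical, and the property names are written in lowercase (creator, language) as they are defined in the Dublin Core Element Set 1.1 namespace, whereas the slide capitalizes them.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
COURSE_PAGE = "http://www.cs.jyu.fi/~airi/opetus/SemanttinenWeb.html"

# Ask ElementTree to use the conventional prefixes when serializing.
ET.register_namespace("rdf", RDF)
ET.register_namespace("dc", DC)


def describe(resource, properties):
    """Build an rdf:RDF element holding one rdf:Description of the resource."""
    root = ET.Element("{" + RDF + "}RDF")
    desc = ET.SubElement(root, "{" + RDF + "}Description",
                         {"{" + RDF + "}about": resource})
    for name, value in properties:
        # One child element per statement: property name and literal value.
        ET.SubElement(desc, "{" + DC + "}" + name).text = value
    return root


rdf_xml = ET.tostring(
    describe(COURSE_PAGE, [("creator", "Airi Salminen"), ("language", "fi")]),
    encoding="unicode",
)
print(rdf_xml)
```

The resource URI becomes the rdf:about attribute, and each (property, resource, value) triple becomes one child element of the description.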

SLIDE 34

  • 7. The RDF model
  • RDF is intended to facilitate automated processing of Web resources
  • RDF does not specify a mechanism for reasoning
  • Intended to be used in a variety of application areas:
      • resource discovery
      • cataloging
      • by intelligent software agents
      • in content rating
      • to build a "web of trust" with digital signatures

SLIDE 35

  • 8. Semantic web architecture

[Diagram: primary resources, metadata resources, applications, semantic web technology]

SLIDE 36

  • 8. Semantic web architecture

primary resources

DTDs, XML Schemata, RDF Schemata, RDF Repositories, Ontologies, Annotations

applications

URI, Unicode, XML, XML Namespaces, XML Schema, RDF, RDF Schema, XTM, XML-Signature, OWL, Annotea, ...

SLIDE 37

  • 9. XML-based languages for semantic web

Languages for representing and defining structured documents

  • XML
  • XML Namespaces
  • XML Schema

SLIDE 38

  • 9. XML-based languages for semantic web

language      purpose
RDF           describing web resources
RDF Schema    defining RDF vocabularies
OWL           publishing and sharing ontologies on the web
XTM           Topic maps

SLIDE 39

  • 9. XML-based languages for semantic web

language         purpose
XML-Signature    digital signatures
XKMS             public keys
P3P              privacy practices for web sites
APPEL            preferences regarding P3P policies
XML Encryption   encrypted data

SLIDE 40

  • 10. Related work at the University of Jyväskylä

EULEGIS, European User Views to Legislative Information in Structured Form (Airi Salminen et al.)
http://www.cs.jyu.fi/~airi/docman.html#eulegis

The purpose was to offer a consistent user interface to retrieve legal information created in different legal systems and at different levels: the European Union, a member state, a region, or a municipality. The project utilized contextual metadata and ontologies in the user interface.

SLIDE 41

  • 10. Related work at the University of Jyväskylä

DrElma: Digital Rights of Electronic Learning Materials (Pasi Tyrväinen et al.)
http://www.cs.jyu.fi/~airi/docman.html#DrElma

Steve Legrand (steveleg@hotmail.com), Using ontologies for text disambiguation

The main motivation behind this research is to improve the accuracy of linguistic parsers to benefit linguistic applications used in translation, language learning, and other tasks that use parsers for disambiguation.

SLIDE 42

  • 10. Related work at the University of Jyväskylä

Airi Salminen, XML family of languages. Overview and classification of W3C specifications. Available at http://www.cs.jyu.fi/~airi/xmlfamily.html.

Airi Salminen, Semanttinen web. Home page of a course. Available at http://www.cs.jyu.fi/~airi/opetus/SemanttinenWeb.html.