Semantic Markup Languages: A Gentle Introduction (Yolanda Gil)



SLIDE 1

Yolanda Gil

Semantic Markup Languages: A Gentle Introduction

Yolanda Gil

USC/Information Sciences Institute gil@isi.edu

SLIDE 2

Outline

• I: The Big Picture
  – The Semantic Web
    http://www.scientificamerican.com/2001/0501issue/0501berners-lee.html
    http://www.w3.org/DesignIssues/Semantic.html
• II: A Gentle Introduction
  – XSD, RDFS, (DAML), OWL
    http://trellis.semanticweb.org/expect/web/semanticweb/comparison.html
• III: The Big Picture Revisited
  – W3C’s Semantic Web principles
    http://www.semanticweb.org
  – How this is changing our research in Knowledge Bases
    http://www.isi.edu/expect/papers/gil-seweb-book-01.pdf

SLIDE 3

I: THE BIG PICTURE

SLIDE 4

The Semantic Web

W3C’s Tim Berners-Lee: “Weaving the Web”: “I have a dream for the Web… and it has two parts.”

• The first Web enables communication between people
  – The Web shows how computers and networks enable the information space while getting out of the way
• The new Web will bring computers into the action
  – Step 1 -- Describe: putting data on the Web in machine-understandable form -- a Semantic Web
    – RDF (based on XML)
    – Master list of terms used in a document (RDF schema)
    – Each document mixes global standards and local agreed-upon terms (namespaces)
  – Step 2 -- Infer and reason: apply logic inference
    – Operate on partial understanding
    – Answering why
    – Heuristics

SLIDE 5

Semantics and Meaning according to TBL

• “In the extreme view, the world can be seen as only connections, nothing else. … I like the idea that a piece of information is really defined only by what it’s related to, and how it’s related. There really is little else to meaning. The structure is everything.”
• “What matters is in the connections. It isn’t the letters, it’s the way they are strung together into words. […] into phrases. […] into a document.”
• For the people, by the people: the right to link
  “Once [… something…] was made available, it should be accessible to anyone […]. And it should be possible to make a link to that thing.”

SLIDE 6

And There You Have It

SLIDE 7

II: THE GENTLE INTRODUCTION

SLIDE 8

The Layer Cake [TBL,XML2000]

SLIDE 9

The Layer Cake [TBL,XML2000]

SLIDE 10

URIs: Uniform Resource Identifiers (aka URLs)

http://trellis.semanticweb.org/semanticweb/slides/
ftp://www.allinone.org/all.gz

• The Web is an information space. URIs are the points in that space.
• Short strings that identify resources on the Web: documents, images, downloadable files, services, electronic mailboxes, and other resources.
• They make resources addressable in the same simple way. They reduce the tedium of "log in to this server, then issue this magic command ..." down to a single click.
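
To make the "points in a space" idea concrete, here is a minimal Python sketch that splits the two URIs shown on this slide into their parts with the standard library. The helper name `describe` is an illustrative assumption, not something from the slides.

```python
from urllib.parse import urlparse

def describe(uri):
    """Split a URI into the scheme, host, and path that together address a resource."""
    parts = urlparse(uri)
    return {"scheme": parts.scheme, "host": parts.netloc, "path": parts.path}

# The two URIs quoted on the slide.
slides = describe("http://trellis.semanticweb.org/semanticweb/slides/")
archive = describe("ftp://www.allinone.org/all.gz")

print(slides)
print(archive)
```

The same call handles both the `http` and `ftp` schemes, which is the sense in which URIs make different kinds of resources addressable "in the same simple way".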

SLIDE 11

SLIDE 12

Unicode

• A character encoding system, like ASCII, designed to help developers who want to create software applications that work in any language in the world
• Unicode provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language
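
The "unique number for every character" claim can be checked directly. The sketch below is illustrative only: the sample characters and the helper name `code_points` are assumptions chosen for the example.

```python
def code_points(text):
    """Return the Unicode code point of each character in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]

# Latin capital A, n with tilde, and the euro sign, written as escapes
# so the example does not depend on the encoding of this file.
sample = "A\u00f1\u20ac"
points = code_points(sample)

# The number is independent of how the character is later stored as bytes.
euro_utf8_length = len("\u20ac".encode("utf-8"))

print(points, euro_utf8_length)
```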

SLIDE 13

The Layer Cake [TBL,XML2000]

SLIDE 14

Why XML (eXtensible Markup Language)

Problems with HTML

HTML design

  • HTML is intended for presentation of information as Web pages.
  • HTML contains a fixed set of markup tags.

This design is not appropriate for data:

  • Tags don’t convey meaning of the data inside the tags.
  • Tags are not extensible.

SLIDE 15

The Design of XML

• Tags can be used to represent the meaning of data/information
  – separates syntax (structural representation) from semantics => only syntax is considered in XML
• There is no fixed set of markup tags - new tags can be defined
• Underlying data model is a tree structure
• “XML is the new ASCII” -- Tim Bray

http://www.w3.org/TR/2000/REC-xml-20001006

SLIDE 16

Simple XML Example

<Bookstore>
  <Book ID="101">
    <Author>John Doe</Author>
    <Title>Introduction to XML</Title>
    <Date>12 June 2001</Date>
    <ISBN>121232323</ISBN>
    <Publisher>XYZ</Publisher>
  </Book>
  <Book ID="102">
    <Author>Foo Bar</Author>
    <Title>Introduction to XSL</Title>
    <Date>12 June 2001</Date>
    <ISBN>12323573</ISBN>
    <Publisher>ABC</Publisher>
  </Book>
</Bookstore>

XML by itself is just hierarchically structured text
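
As a small illustration of "hierarchically structured text", the following Python sketch walks a shortened copy of the Bookstore document as a tree using the standard library. The function name `list_books` is an assumption made for the example, and only the Author and Title children are kept for brevity.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the slide's Bookstore example.
BOOKSTORE = """<Bookstore>
  <Book ID="101">
    <Author>John Doe</Author>
    <Title>Introduction to XML</Title>
  </Book>
  <Book ID="102">
    <Author>Foo Bar</Author>
    <Title>Introduction to XSL</Title>
  </Book>
</Bookstore>"""

def list_books(xml_text):
    """Return (ID, author, title) for each Book child of the root element."""
    root = ET.fromstring(xml_text)
    return [
        (book.get("ID"), book.findtext("Author"), book.findtext("Title"))
        for book in root.findall("Book")
    ]

books = list_books(BOOKSTORE)
print(books)
```

The parser recovers the tree, but nothing in it says what an "Author" is; that meaning lives only in the program that reads the tags.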

SLIDE 17

SLIDE 18

XSD: XML Schema Definition

– Written in the same syntax as XML documents (unlike XML DTDs!)
– Elements and attributes
– Enhanced set of primitive datatypes.
  • Wide range of primitive data types, supporting those found in databases (string, boolean, decimal, integer, date, etc.)
  • Can create your own datatypes (complexType)
  • Can derive new type definitions on the basis of old ones (refinement)
– Can have constraints on attributes
  • Examples: maxLength, precision, enumeration, maxInclusive (upper bound), minInclusive (lower bound), etc.
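
The idea of deriving a new type by refinement with bounds can be sketched as follows. Everything here is an assumption for illustration: the schema fragment, the type name `BookID`, and the use of the final 2001 XML Schema namespace are not taken from the slides. The Python code only reads the bounds out of the schema text and applies them by hand; it is not a schema validator.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema fragment: an integer type restricted to 100..999.
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="BookID">
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="100"/>
      <xs:maxInclusive value="999"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
"""

def read_bounds(schema_text, type_name):
    """Return (base type, lower bound, upper bound) of a named simpleType."""
    root = ET.fromstring(schema_text)
    for simple_type in root.findall(f"{{{XS}}}simpleType"):
        if simple_type.get("name") == type_name:
            restriction = simple_type.find(f"{{{XS}}}restriction")
            low = int(restriction.find(f"{{{XS}}}minInclusive").get("value"))
            high = int(restriction.find(f"{{{XS}}}maxInclusive").get("value"))
            return restriction.get("base"), low, high
    raise KeyError(type_name)

def satisfies(value, bounds):
    """Check a value against the inclusive bounds read from the schema."""
    _, low, high = bounds
    return low <= value <= high

bounds = read_bounds(SCHEMA, "BookID")
print(bounds, satisfies(101, bounds), satisfies(1000, bounds))
```

Note that the schema is itself an XML document, which is the point of the first bullet above: the same parser reads both the data and its type definitions.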

SLIDE 19

XSD (XML Schema) Example

SLIDE 20

XSL [eXtensible Stylesheet Language]

<?xml version='1.0'?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">
  <xsl:template match="/">
    <html>
      <body>
        <table cellpadding="2" cellspacing="0" border="1" bgcolor="#FFFFD5">
          <tr>
            <th>Title</th>
            […]

SLIDE 21

XML: Tools/Software

SLIDE 22

Summary of the XML + NS + XSD Layer: The Power of Simplicity

• “When I designed HTML, I chose to avoid giving it more power than it absolutely needed – a “principle of least power”, which I have stuck to ever since. I could have used a language like Knuth’s TeX but…” -- TBL
• Keeps the principles of SGML in place but its spec is thin enough to wave :-)
• To say you are “Using XML” is sort of like saying you are using ASCII
• Using XSD (XML Schema) makes a lot more sense

SLIDE 23

The Layer Cake [TBL,XML2000]

SLIDE 24

Where XML & XML Schemas Fail

  • No semantics!
  • Will XML scale in the metadata world?

1. The order in which elements appear in an XML document is often meaningful. This seems highly unnatural in the metadata world. Furthermore, maintaining the correct order of millions of data items is impractical.
2. XML allows constructions that mix up some text along with child elements, which are hard to handle. Ex.
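
The mixed-content problem in point 2 can be seen in a short Python sketch. The `<note>` document below is a made-up example, not the one from the original slide. In the standard-library tree model, text that sits between child elements is attached to the preceding child as its `tail`, so a consumer has to stitch the pieces back together in order.

```python
import xml.etree.ElementTree as ET

# Hypothetical mixed-content element: text interleaved with child elements.
mixed = ET.fromstring(
    "<note>Call <name>John Doe</name> before <day>Friday</day>.</note>"
)

# Rebuild the content in document order from text, children, and tails.
pieces = [mixed.text]
for child in mixed:
    pieces.append((child.tag, child.text))
    pieces.append(child.tail)

print(pieces)
```

Both the ordering concern in point 1 and the interleaving in point 2 show up here: dropping or reordering any entry of `pieces` changes what the note says.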

SLIDE 25

RDF (Resource Description Framework)

  • RDF provides a way of describing resources via metadata (data about data). It restricts the description of resources to triplets (subject, predicate, object).
  • It provides interoperability between applications that exchange machine-understandable information on the Web.
  • The original broad goal of RDF was to define a mechanism for describing resources that makes no assumptions about a particular application domain, nor defines (a priori) the semantics of any application domain.
  • Uses XML as the interchange syntax.
  • Provides a lightweight ontology system.

The formal specification of RDF is available at: http://www.w3.org/TR/REC-rdf-syntax/

SLIDE 26

RDF Syntax

Subject, Predicate and Object Triplets (Tuples)

  • Subject: The resource being described.
  • Predicate: A property of the resource
  • Object: The value of the property

A combination of them is said to be a Statement (or a rule)

SLIDE 27

RDF Example

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/TR/WD-rdf-syntax#"
         xmlns:s="http://description.org/schema/">
  <rdf:Description about="http://foo.bar.org/index.html">
    <s:Author>John Doe</s:Author>
  </rdf:Description>
</rdf:RDF>
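
To show how this document reduces to a single (subject, predicate, object) statement, here is a rough Python sketch. It handles only the simple pattern used on this slide (a Description with an unqualified `about` attribute and literal-valued child properties) and is not a general RDF/XML parser. The names `Triple` and `extract_triples` are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str
    predicate: str
    object: str

# Namespace used by the slide's example (an early working-draft URI).
RDF_NS = "http://www.w3.org/TR/WD-rdf-syntax#"

DOC = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/TR/WD-rdf-syntax#"
         xmlns:s="http://description.org/schema/">
  <rdf:Description about="http://foo.bar.org/index.html">
    <s:Author>John Doe</s:Author>
  </rdf:Description>
</rdf:RDF>"""

def extract_triples(rdf_xml):
    """Turn each child property of each Description into one Triple."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get("about")
        for prop in description:
            # ElementTree writes qualified names as "{namespace}local";
            # dropping the braces yields the property's full URI.
            predicate = prop.tag.replace("{", "").replace("}", "")
            triples.append(Triple(subject, predicate, prop.text))
    return triples

triples = extract_triples(DOC)
print(triples)
```

Read aloud, the one resulting statement is: the resource index.html has an Author whose value is "John Doe".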

SLIDE 28

RDF Schema

  • A schema defines the terms that will be used in the RDF statements and gives specific meanings to them.

http://www.w3.org/TR/rdf-schema/

Example:

<rdf:RDF xml:lang="en"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdf:Description ID="MotorVehicle">
    <rdf:type resource="http://www.w3.org/2000/01/rdf-schema#Class"/>
    <rdfs:subClassOf rdf:resource="http://www.w3.org/2000/01/rdf-schema#Resource"/>
  </rdf:Description>
  <rdf:Description ID="PassengerVehicle">
    <rdf:type resource="http://www.w3.org/2000/01/rdf-schema#Class"/>
    <rdfs:subClassOf rdf:resource="#MotorVehicle"/>
  </rdf:Description>
  <rdf:Description ID="Truck">
    <rdf:type resource="http://www.w3.org/2000/01/rdf-schema#Class"/>
    <rdfs:subClassOf rdf:resource="#MotorVehicle"/>
  </rdf:Description>

SLIDE 29

Example (cont..)

  <rdf:Description ID="Van">
    <rdf:type resource="http://www.w3.org/2000/01/rdf-schema#Class"/>
    <rdfs:subClassOf rdf:resource="#MotorVehicle"/>
  </rdf:Description>
  <rdf:Description ID="MiniVan">
    <rdf:type resource="http://www.w3.org/2000/01/rdf-schema#Class"/>
    <rdfs:subClassOf rdf:resource="#Van"/>
    <rdfs:subClassOf rdf:resource="#PassengerVehicle"/>
  </rdf:Description>
  <rdf:Description ID="registeredTo">
    <rdf:type resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Property"/>
    <rdfs:domain rdf:resource="#MotorVehicle"/>
    <rdfs:range rdf:resource="#Person"/>
  </rdf:Description>
  <rdf:Description ID="rearSeatLegRoom">
    <rdf:type resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Property"/>
    <rdfs:domain rdf:resource="#PassengerVehicle"/>
    <rdfs:domain rdf:resource="#Minivan"/>
    <rdfs:range rdf:resource="http://www.w3.org/2000/03/example/classes#Number"/>
  </rdf:Description>
</rdf:RDF>

Domain
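
The subClassOf statements in this example form a small hierarchy, and one thing a schema-aware application might do is follow them transitively. The Python sketch below hand-copies the subclass links from the example into a dictionary (the root link to rdfs:Resource is left out); the function name `superclasses` is an assumption for illustration.

```python
# Direct subClassOf links, transcribed from the RDF Schema example.
SUBCLASS_OF = {
    "PassengerVehicle": {"MotorVehicle"},
    "Truck": {"MotorVehicle"},
    "Van": {"MotorVehicle"},
    "MiniVan": {"Van", "PassengerVehicle"},
}

def superclasses(cls, edges=SUBCLASS_OF):
    """Return every class reachable from cls by following subClassOf links."""
    seen = set()
    stack = list(edges.get(cls, ()))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(edges.get(parent, ()))
    return seen

minivan_supers = superclasses("MiniVan")
print(sorted(minivan_supers))
```

A MiniVan has two direct parents, so the hierarchy is a graph rather than a tree, which is why the traversal keeps a `seen` set.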

SLIDE 30

RDF: Tools/Resources

SLIDE 31

Summary: RDF & RDF Schema layer

• Minimalist model - (thing), Class, Property
• Subproperty, Subclass
• Domain & Range
• RDF Schema: a W3C recommendation Feb’04
• Efficient storage and retrieval
  – “Triple store” using database backends
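
One way to picture a "triple store" on a database backend is a single three-column table with pattern-matching queries. The Python sketch below uses an in-memory SQLite database from the standard library; the table layout, index, and function names are assumptions for illustration and do not describe any particular triple-store product.

```python
import sqlite3

def make_store():
    """Create an in-memory table with one row per (subject, predicate, object)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)")
    conn.execute("CREATE INDEX idx_sp ON triples (subject, predicate)")
    return conn

def add(conn, subject, predicate, obj):
    conn.execute("INSERT INTO triples VALUES (?, ?, ?)", (subject, predicate, obj))

def match(conn, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    query = "SELECT subject, predicate, object FROM triples WHERE 1=1"
    params = []
    for column, value in (("subject", s), ("predicate", p), ("object", o)):
        if value is not None:
            query += f" AND {column} = ?"
            params.append(value)
    return conn.execute(query, params).fetchall()

store = make_store()
add(store, "#MiniVan", "rdfs:subClassOf", "#Van")
add(store, "#MiniVan", "rdfs:subClassOf", "#PassengerVehicle")
add(store, "#Van", "rdfs:subClassOf", "#MotorVehicle")

minivan_parents = match(store, s="#MiniVan", p="rdfs:subClassOf")
print(minivan_parents)
```

Because every statement has the same shape, one table and one index cover all of them, whatever vocabulary the data uses.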

SLIDE 32

The Layer Cake [TBL,XML2000]

SLIDE 33

Limitations of RDF

  • Cannot define properties of properties (unique, transitive)
  • No equivalence, disjointness, etc.
  • No mechanism of specifying necessary and sufficient conditions for class membership.
    Example: If it is given that ‘XYZ’ has a ‘car’ which is ‘7ft high’, has ‘wide wheels’ and ‘loading space is 4 cub.m’, then we should be able to reason that ‘XYZ’ has an ‘SUV’, as given by the necessary and sufficient conditions for being an ‘SUV’: height > 4ft & wide wheels & loading space > 2 cub.m
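
The SUV example amounts to a membership test that RDF Schema by itself cannot state. As a sketch of the intended inference, the Python code below hard-codes the three conditions from the slide; the `Car` record and its field names are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class Car:
    height_ft: float
    wide_wheels: bool
    loading_space_m3: float

def is_suv(car):
    """Necessary and sufficient conditions for 'SUV' as given on the slide."""
    return car.height_ft > 4 and car.wide_wheels and car.loading_space_m3 > 2

# The car owned by 'XYZ' in the slide's example.
xyz_car = Car(height_ft=7, wide_wheels=True, loading_space_m3=4)
print(is_suv(xyz_car))
```

In this sketch the rule lives in program code; the slide's point is that the ontology language itself should be able to carry such a definition so that any reasoner can apply it.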

SLIDE 34

DAML+OIL’s History

  • W3C’s Semantic Web Activity:
    – RDF and metadata markup efforts to represent data in a machine-understandable form.
  • DARPA started the DARPA Agent Markup Language (DAML) program.
    – possibly with “ARPANET -> Internet” in mind
  • EC (European Commission) funding programs
    – Ontology Interchange Language (OIL)
    – logic based language.
    – brings logic and inference to the Semantic Web

www.daml.org

DAML+OIL: http://www.daml.org/2001/03/daml+oil-index.html

SLIDE 35

DAML+OIL (www.daml.org)

• It builds on earlier W3C standards such as RDF and RDF Schema.
• DAML extends RDF and RDFS with richer modelling primitives.
  – disjointWith, intersectionOf, oneOf, cardinality
• Able to provide properties of properties
  – uniqueness, transitivity, etc.
• Current version DAML+OIL provides a semantic interpretation (model-theoretic semantics)
  http://www.daml.org/2001/03/daml+oil-index.html

SLIDE 36

DAML+OIL substrate: Description Logics

• Classes are defined in terms of other classes/relations
• Relations are first class citizens
• Powerful inference algorithms:
  – Subsumption: is classA a subclass of classB given their definitions?
  – Recognition: is instanceA of classA?
  – Classification: automatic reorganization of class hierarchy based on definitions of classes
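
The three inferences can be illustrated with a deliberately tiny model in which each class is defined as a conjunction of atomic conditions. This Python sketch is far weaker than a real description-logic reasoner (no roles, restrictions, or negation), and the class names and condition atoms are assumptions chosen for the example.

```python
# Each class is defined by the set of atomic conditions its members must meet.
DEFINITIONS = {
    "Animal": frozenset({"animal"}),
    "Male": frozenset({"animal", "male"}),
    "Person": frozenset({"animal", "person"}),
    "Man": frozenset({"animal", "person", "male"}),
}

def subsumes(general, specific, defs=DEFINITIONS):
    """Subsumption: general subsumes specific if specific requires all of general's conditions."""
    return defs[general] <= defs[specific]

def recognize(features, defs=DEFINITIONS):
    """Recognition: every class whose conditions the instance's features satisfy."""
    return {name for name, conditions in defs.items() if conditions <= features}

def classify(defs=DEFINITIONS):
    """Classification: for each class, its most specific subsumers (direct parents)."""
    parents = {}
    for cls in defs:
        supers = {d for d in defs if d != cls and subsumes(d, cls)}
        parents[cls] = {
            d for d in supers
            if not any(e != d and subsumes(d, e) for e in supers)
        }
    return parents

hierarchy = classify()
print(hierarchy)
```

Nobody told the program that Man sits below both Male and Person; that placement falls out of the definitions, which is the behaviour the "Classification" bullet describes.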

SLIDE 37

An Example (from www.daml.org)

<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xmlns:daml="http://www.daml.org/2000/12/daml+oil#"
  xmlns="http://www.daml.org/2000/12/daml+oil-ex#">

  <daml:Ontology about="">
    <daml:versionInfo>An example ontology</daml:versionInfo>
  </daml:Ontology>

  <rdfs:Class rdf:ID="Animal">
    <rdfs:label>Animal</rdfs:label>
    <rdfs:comment>This class of animals is illustrative of a number of ontological idioms.</rdfs:comment>
  </rdfs:Class>

  <rdfs:Class rdf:ID="Male">
    <rdfs:subClassOf rdf:resource="#Animal"/>
  </rdfs:Class>

  <rdfs:Class rdf:ID="Female">
    <rdfs:subClassOf rdf:resource="#Animal"/>
    <daml:disjointWith rdf:resource="#Male"/>
  </rdfs:Class>

  <rdfs:Class rdf:ID="Man">
    <rdfs:subClassOf rdf:resource="#Person"/>
    <rdfs:subClassOf rdf:resource="#Male"/>
  </rdfs:Class>

  • Start of an ontology (about = "" implies ‘this’ document)
  • The label is not used for logical interpretation
  • Can explicitly specify the set of Females to be disjoint with the set of Males
  • To be read conjunctively. A man is a sub-class of ‘Person’ and a ‘Male’
  • The Person class is defined later

SLIDE 38

Example (contd..)

  <rdfs:Class rdf:ID="Woman">
    <rdfs:subClassOf rdf:resource="#Person"/>
    <rdfs:subClassOf rdf:resource="#Female"/>
  </rdfs:Class>

  <daml:ObjectProperty rdf:ID="hasParent">
    <rdfs:domain rdf:resource="#Animal"/>
    <rdfs:range rdf:resource="#Animal"/>
  </daml:ObjectProperty>

  <daml:ObjectProperty rdf:ID="hasFather">
    <rdfs:subPropertyOf rdf:resource="#hasParent"/>
    <rdfs:range rdf:resource="#Male"/>
  </daml:ObjectProperty>

  <daml:DatatypeProperty rdf:ID="age">
    <rdfs:range rdf:resource="http://www.w3.org/2000/10/XMLSchema#nonNegativeInteger"/>
  </daml:DatatypeProperty>

  <rdfs:Class rdf:ID="Person">
    <rdfs:subClassOf rdf:resource="#Animal"/>
    <rdfs:subClassOf>
      <daml:Restriction>
        <daml:onProperty rdf:resource="#hasParent"/>
        <daml:toClass rdf:resource="#Person"/>
      </daml:Restriction>
    </rdfs:subClassOf>
    <rdfs:subClassOf>
      <daml:Restriction daml:cardinality="1">
        <daml:onProperty rdf:resource="#hasFather"/>
      </daml:Restriction>
    </rdfs:subClassOf>
  </rdfs:Class>

SLIDE 39

Example (contd..)

  <rdfs:Class rdf:about="#Animal">
    <rdfs:subClassOf>
      <daml:Restriction daml:cardinality="2">
        <daml:onProperty rdf:resource="#hasParent"/>
      </daml:Restriction>
    </rdfs:subClassOf>
  </rdfs:Class>

  <rdfs:Class rdf:about="#Person">
    <rdfs:subClassOf>
      <daml:Restriction daml:maxCardinality="1">
        <daml:onProperty rdf:resource="#hasSpouse"/>
      </daml:Restriction>
    </rdfs:subClassOf>
  </rdfs:Class>
</rdf:RDF>

Further constructs that the example doesn’t use:

  • Properties: TransitiveProperty (hasAncestor), UniqueProperty (hasMother), inverseOf (hasChild -> hasParent), etc.
  • Classes: intersectionOf (a daml:collection), unionOf (a daml:collection), sameClassAs, complementOf, etc.
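
Two of the property constructs named above, transitivity and inverses, have a simple reading over sets of pairs. The Python sketch below illustrates that reading only. The individuals are made up, and deriving hasAncestor as the transitive closure of hasParent is an assumption for the example; it is not how DAML+OIL itself is processed.

```python
# (child, parent) pairs for a hypothetical hasParent relation.
HAS_PARENT = {("Ann", "Bob"), ("Bob", "Cid")}

def inverse(pairs):
    """inverseOf: swap the two ends of every pair (hasParent -> hasChild)."""
    return {(b, a) for a, b in pairs}

def transitive_closure(pairs):
    """Keep adding (a, d) whenever (a, b) and (b, d) are present, until nothing new appears."""
    closure = set(pairs)
    while True:
        derived = {(a, d) for a, b in closure for c, d in closure if b == c}
        if derived <= closure:
            return closure
        closure |= derived

has_child = inverse(HAS_PARENT)
has_ancestor = transitive_closure(HAS_PARENT)
print(sorted(has_child), sorted(has_ancestor))
```

Declaring a property transitive or naming its inverse lets a reasoner add these extra pairs itself, so they do not have to be written into the data.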

SLIDE 40

DAML+OIL

• “Most used knowledge representation language in history”:
  – 5M statements on 20K web pages
  – 13M triples RDF-based
• www.daml.org: average 20k hits/day
• Submit a query, many things are there:
  – http://www.daml.org/ontologies/

SLIDE 41

DAML References/Tools

  • DAML Viewer: It provides a means to view the instances found in a DAML document.
    http://www.daml.org/viewer/applet.html
  • DAML Crawler Results: A list of .daml files on the Internet
    http://www.daml.org/crawler/pages.html
  • A DAML Validator
    http://www.daml.org/validator/
  • A DAML example explained: It has the same example as in the slides, discussed in detail.
    http://www.daml.org/2001/03/daml+oil-walkthru.html