
An Introduction to XML and Web Technologies

XML Documents

Anders Møller & Michael I. Schwartzbach, © 2006 Addison-Wesley

Objectives

What is XML, in particular in relation to HTML
The XML data model and its textual representation
The XML Namespace mechanism

What is XML?

XML: Extensible Markup Language
A framework for defining markup languages
Each language is targeted at its own application domain with its own markup tags
There is a common set of generic tools for processing XML documents
XHTML: an XML variant of HTML
Inherently internationalized and platform independent (Unicode)
Developed by W3C, standardized in 1998

Recipes in XML

Define our own “Recipe Markup Language”
Choose markup tags that correspond to concepts in this application domain

  • recipe, ingredient, amount, ...

No canonical choices

  • granularity of markup?
  • structuring?
  • elements or attributes?
  • ...

Example (1/2)

    <collection>
      <description>Recipes suggested by Jane Dow</description>
      <recipe id="r117">
        <title>Rhubarb Cobbler</title>
        <date>Wed, 14 Jun 95</date>
        <ingredient name="diced rhubarb" amount="2.5" unit="cup"/>
        <ingredient name="sugar" amount="2" unit="tablespoon"/>
        <ingredient name="fairly ripe banana" amount="2"/>
        <ingredient name="cinnamon" amount="0.25" unit="teaspoon"/>
        <ingredient name="nutmeg" amount="1" unit="dash"/>
        <preparation>
          <step>
            Combine all and use as cobbler, pie, or crisp.
          </step>
        </preparation>

Example (2/2)

        <comment>
          Rhubarb Cobbler made with bananas as the main sweetener.
          It was delicious.
        </comment>
        <nutrition calories="170" fat="28%" carbohydrates="58%" protein="14%"/>
        <related ref="42">Garden Quiche is also yummy</related>
      </recipe>
    </collection>
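The "common set of generic tools" mentioned earlier can be illustrated with a small sketch. It assumes Python's standard `xml.etree.ElementTree` module as the generic tool; the helper name `list_ingredients` is made up for this illustration and is not part of the slides.

```python
import xml.etree.ElementTree as ET

# The recipe collection from the two example slides, joined into one document.
RECIPES = """\
<collection>
  <description>Recipes suggested by Jane Dow</description>
  <recipe id="r117">
    <title>Rhubarb Cobbler</title>
    <date>Wed, 14 Jun 95</date>
    <ingredient name="diced rhubarb" amount="2.5" unit="cup"/>
    <ingredient name="sugar" amount="2" unit="tablespoon"/>
    <ingredient name="fairly ripe banana" amount="2"/>
    <ingredient name="cinnamon" amount="0.25" unit="teaspoon"/>
    <ingredient name="nutmeg" amount="1" unit="dash"/>
    <preparation>
      <step>Combine all and use as cobbler, pie, or crisp.</step>
    </preparation>
    <comment>Rhubarb Cobbler made with bananas as the main sweetener.</comment>
    <nutrition calories="170" fat="28%" carbohydrates="58%" protein="14%"/>
    <related ref="42">Garden Quiche is also yummy</related>
  </recipe>
</collection>
"""


def list_ingredients(xml_text):
    """Hypothetical helper: return (title, [(name, amount, unit), ...]) per recipe."""
    root = ET.fromstring(xml_text)
    result = []
    for recipe in root.findall("recipe"):
        title = recipe.findtext("title")
        ingredients = [
            # The unit attribute is optional in this markup, so it may be None.
            (i.get("name"), float(i.get("amount")), i.get("unit"))
            for i in recipe.findall("ingredient")
        ]
        result.append((title, ingredients))
    return result


recipes = list_ingredients(RECIPES)
for title, ingredients in recipes:
    print(title)
    for name, amount, unit in ingredients:
        print(" ", amount, unit or "", name)
```

The parser knows nothing about recipes; the element and attribute names are the only domain-specific part.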

Building on the XML Notation

Defining the syntax of our recipe language

  • DTD, XML Schema, ...

Showing recipe documents in browsers

  • XPath, XSLT

Recipe collections as databases

  • XQuery

Building a Web-based recipe editor

  • HTTP, Servlets, JSP, ...

... – the topics of the following weeks...

XML Trees

Conceptually, an XML document is a tree structure

  • node, edge
  • root, leaf
  • child, parent
  • sibling (ordered), ancestor, descendant
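The tree vocabulary above can be tried out on a small document. This is a sketch assuming Python's `xml.etree.ElementTree`; note that this library folds text into its element objects, so only element nodes show up as tree nodes here. The helper `ancestors` and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<collection><recipe><title>Rhubarb Cobbler</title>"
    "<date>Wed, 14 Jun 95</date></recipe></collection>"
)

# ElementTree elements keep no parent pointer, so build a child -> parent map.
parent_of = {child: parent for parent in doc.iter() for child in parent}


def ancestors(node):
    """Tag names of all ancestors of node, nearest first."""
    names = []
    while node in parent_of:
        node = parent_of[node]
        names.append(node.tag)
    return names


title = doc.find("recipe/title")
recipe = parent_of[title]

children = [c.tag for c in recipe]             # siblings, in document order
descendants = [d.tag for d in doc.iter()][1:]  # everything below the root element
element_leaves = [e.tag for e in doc.iter() if len(e) == 0]  # no child elements

print(ancestors(title), children, descendants, element_leaves)
```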

An Analogy: File Systems

Tree View of the XML Recipes

Nodes in XML Trees

Text nodes: carry the actual contents; always leaf nodes
Element nodes: define hierarchical logical groupings of contents; each has a name
Attribute nodes: unordered; each is associated with an element node and has a name and a value
Comment nodes: ignorable meta-information
Processing instructions: instructions to specific processors; each has a target and a value
Root nodes: every XML tree has one root node that represents the entire tree

Textual Representation

Text nodes: written as the text they carry
Element nodes: start-end tags

  • <bla ...> ... </bla>
  • short-hand notation for empty elements: <bla/>

Attribute nodes: name="value" in start tags
Comment nodes: <!-- bla -->
Processing instructions: <?target value?>
Root nodes: implicit
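To see the mapping from text to node kinds, one option is Python's `xml.dom.minidom`, which keeps comments and processing instructions as separate nodes. The sketch below is only an illustration; the sample document and the `KIND` table are made up for it.

```python
from xml.dom import minidom, Node

doc = minidom.parseString(
    '<bla a="b"><!-- note --><?mytool some value?>text<empty/></bla>'
)

# Human-readable labels for the DOM node types used in this example.
KIND = {
    Node.ELEMENT_NODE: "element",
    Node.TEXT_NODE: "text",
    Node.COMMENT_NODE: "comment",
    Node.PROCESSING_INSTRUCTION_NODE: "processing instruction",
}

root_element = doc.documentElement
kinds = [KIND[n.nodeType] for n in root_element.childNodes]
comment, pi, text, empty = root_element.childNodes

print(kinds)
print(root_element.getAttribute("a"))  # attribute value from the start tag
print(pi.target, "|", pi.data)         # target and value of the instruction
print(empty.toxml())                   # serialized with the short-hand notation
```

The implicit root node corresponds to the `doc` object itself, whose `nodeType` is `Node.DOCUMENT_NODE`.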

Browsing XML (without XSLT)

More Constructs

XML declaration
Character references
CDATA sections
Document type declarations and entity references

  • explained later...

Whitespace?

Example

    <?xml version="1.1" encoding="ISO-8859-1"?>
    <!DOCTYPE features SYSTEM "example.dtd">
    <features a="b">
      <?mytool here is some information specific to mytool?>
      El señor está bien, garçon!
      Copyright &#169; 2005
      <![CDATA[ <this is not a tag> ]]>
      <!-- always remember to specify the right character encoding -->
    </features>
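Character references and CDATA sections are only ways of writing text; after parsing, both end up as ordinary character data. A small sketch, assuming Python's `xml.etree.ElementTree` and a cut-down version of the example (no XML declaration or DOCTYPE):

```python
import xml.etree.ElementTree as ET

source = (
    '<features a="b">'
    "Copyright &#169; 2005 "
    "<![CDATA[<this is not a tag>]]>"
    "</features>"
)
features = ET.fromstring(source)

# The character reference and the CDATA section are both resolved to plain text.
text = features.text
print(text)

# When serializing, this library escapes the markup characters instead of
# re-creating the CDATA section; the tree content is the same either way.
serialized = ET.tostring(features, encoding="unicode")
print(serialized)
```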

Well-formedness

Every XML document must be well-formed

  • start and end tags must match and nest properly
  • well-formed: <x><y></y></x>
  • not well-formed: </z><x><y></x></y>
  • exactly one root element
  • ...

In other words, a well-formed document defines a proper tree structure
XML parser: given the textual XML document, constructs its tree representation
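Since a parser has to build the tree, a simple way to test well-formedness is to attempt a parse and see whether it fails. The sketch below assumes Python's `xml.etree.ElementTree`; the function name `is_well_formed` is illustrative.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser can build a tree from text, else False."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


print(is_well_formed("<x><y></y></x>"))       # properly nested tags
print(is_well_formed("</z><x><y></x></y>"))   # stray end tag, bad nesting
print(is_well_formed("<a/><b/>"))             # more than one root element
```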

Simpler Alternatives?

S-expressions, 1958:

    (collection
      (recipe
        (title "Rhubarb Cobbler")
        (date "Wed, 14 Jun 95")
        ...))

XML is defined as a simplified subset of SGML
XML could have been designed simpler...
... but it wasn’t [end of discussion]

Applications

Rough classification:

  • Data-oriented languages
  • Document-oriented languages
  • Protocols and programming languages
  • Hybrids

Example: XHTML

    <?xml version="1.0" encoding="UTF-8"?>
    <html xmlns="http://www.w3.org/1999/xhtml">
      <head><title>Hello world!</title></head>
      <body>
        <h1>This is a heading</h1>
        This is some text.
      </body>
    </html>

Example: CML

    <molecule id="METHANOL">
      <atomArray>
        <stringArray builtin="id">a1 a2 a3 a4 a5 a6</stringArray>
        <stringArray builtin="elementType">C O H H H H</stringArray>
        <floatArray builtin="x3" units="pm">
          -0.748 0.558 ...
        </floatArray>
        <floatArray builtin="y3" units="pm">
          -0.015 0.420 ...
        </floatArray>
        <floatArray builtin="z3" units="pm">
          -0.024 -0.278 ...
        </floatArray>
      </atomArray>
    </molecule>

Example: ebXML

    <MultiPartyCollaboration name="DropShip">
      <BusinessPartnerRole name="Customer">
        <Performs initiatingRole='//binaryCollaboration[@name="Firm Order"]/InitiatingRole[@name="buyer"]'/>
      </BusinessPartnerRole>
      <BusinessPartnerRole name="Retailer">
        <Performs respondingRole='//binaryCollaboration[@name="Firm Order"]/RespondingRole[@name="seller"]'/>
        <Performs initiatingRole='//binaryCollaboration[...]/InitiatingRole[@name="buyer"]'/>
      </BusinessPartnerRole>
      <BusinessPartnerRole name="DropShip Vendor">
        ...
      </BusinessPartnerRole>
    </MultiPartyCollaboration>

Example: ThML

    <h3 class="s05" id="One.2.p0.2">Having a Humble Opinion of Self</h3>
    <p class="First" id="One.2.p0.3">EVERY man naturally desires knowledge
      <note place="foot" id="One.2.p0.4">
        <p class="Footnote" id="One.2.p0.5"><added id="One.2.p0.6">
          <name id="One.2.p0.7">Aristotle</name>, Metaphysics, i. 1.
        </added></p>
      </note>;
      but what good is knowledge without fear of God? Indeed a humble
      rustic who serves God is better than a proud intellectual who
      neglects his soul to study the course of the stars.
      <added id="One.2.p0.8"><note place="foot" id="One.2.p0.9">
        <p class="Footnote" id="One.2.p0.10">
          Augustine, Confessions V. 4.
        </p>
      </note></added>
    </p>

XML Namespaces

  • When combining languages, element names may become ambiguous!
  • Common problems call for common solutions

    <widget type="gadget">
      <head size="medium"/>
      <big><subwidget ref="gizmo"/></big>
      <info>
        <head>
          <title>Description of gadget</title>
        </head>
        <body>
          <h1>Gadget</h1>
          A gadget contains a big gizmo
        </body>
      </info>
    </widget>

The Idea

Assign a URI to every (sub-)language
e.g. http://www.w3.org/1999/xhtml for XHTML 1.0
Qualify element names with URIs: {http://www.w3.org/1999/xhtml}head

The Actual Solution

Namespace declarations bind URIs to prefixes

    <... xmlns:foo="http://www.w3.org/TR/xhtml1">
      ...
      <foo:head>...</foo:head>
      ...
    </...>

Lexical scope
Default namespace (no prefix) declared with xmlns="..."
Attribute names can also be prefixed
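The prefix is only a local abbreviation; what identifies a name is the URI it is bound to. A sketch assuming Python's `xml.etree.ElementTree`, which happens to report names in the `{URI}name` form; the prefixes `foo`, `h`, `p` and the URI `urn:example:p` are made up for the illustration.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"

# Three spellings of the same qualified name: two prefixes and the default namespace.
with_prefix = ET.fromstring(f'<foo:head xmlns:foo="{XHTML}"><foo:title/></foo:head>')
other_prefix = ET.fromstring(f'<h:head xmlns:h="{XHTML}"><h:title/></h:head>')
default_ns = ET.fromstring(f'<head xmlns="{XHTML}"><title/></head>')

tags = {with_prefix.tag, other_prefix.tag, default_ns.tag}
print(tags)

# Lexical scope: the declaration on <head> also covers the nested <title>.
print(default_ns[0].tag)

# Attributes can be prefixed too; an unprefixed attribute stays unqualified.
attrs = ET.fromstring('<a xmlns:p="urn:example:p" p:size="1" size="2"/>').attrib
print(attrs)
```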

Widgets with Namespaces

Namespace map: for each element, maps prefixes to URIs

    <widget type="gadget" xmlns="http://www.widget.inc">
      <head size="medium"/>
      <big><subwidget ref="gizmo"/></big>
      <info xmlns:xhtml="http://www.w3.org/TR/xhtml1">
        <xhtml:head>
          <xhtml:title>Description of gadget</xhtml:title>
        </xhtml:head>
        <xhtml:body>
          <xhtml:h1>Gadget</xhtml:h1>
          A gadget contains a big gizmo
        </xhtml:body>
      </info>
    </widget>
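With the namespaces in place, the two `head` elements can be told apart by a program. The sketch below assumes Python's `xml.etree.ElementTree`; the query prefixes `w` and `x` in the `NS` map are arbitrary choices for the lookup and need not match the prefixes used in the document.

```python
import xml.etree.ElementTree as ET

WIDGET = """\
<widget type="gadget" xmlns="http://www.widget.inc">
  <head size="medium"/>
  <big><subwidget ref="gizmo"/></big>
  <info xmlns:xhtml="http://www.w3.org/TR/xhtml1">
    <xhtml:head>
      <xhtml:title>Description of gadget</xhtml:title>
    </xhtml:head>
    <xhtml:body>
      <xhtml:h1>Gadget</xhtml:h1>
      A gadget contains a big gizmo
    </xhtml:body>
  </info>
</widget>
"""

# Prefix -> URI map used only for the queries below.
NS = {"w": "http://www.widget.inc", "x": "http://www.w3.org/TR/xhtml1"}

root = ET.fromstring(WIDGET)
widget_head = root.find("w:head", NS)       # the widget language's head
xhtml_head = root.find("w:info/x:head", NS)  # the XHTML head inside info
heads = [e.tag for e in root.iter() if e.tag.endswith("}head")]

print(widget_head.get("size"))
print(xhtml_head.findtext("x:title", namespaces=NS))
print(heads)
```

Note that `info` carries no prefix, so it belongs to the default (widget) namespace, which is why the path starts with `w:info`.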

Summary

XML: a notation for hierarchically structured text
Conceptual tree model vs. concrete textual representation
Well-formedness
Namespaces

Essential Online Resources

http://www.w3.org/TR/xml11/
http://www.w3.org/TR/xml-names11
http://www.unicode.org/