XML – Extensible Markup Language

DD1335 (Lecture 9) Basic Internet Programming, Spring 2010

SLIDE 7

XML – Extensible Markup Language

Generic format for structured representation of data. No predefined tags, but a syntax similar to HTML.

Applications:

◮ Web services, business transactions
◮ XHTML – HTML on XML syntax
◮ The graphics format SVG
◮ Configuration files
◮ Much more ...

SLIDE 14

XML – Strengths

◮ Open standard from W3C
◮ Simple text format, easy to parse
◮ Supported by numerous vendors and platforms
◮ Excellent for transactions between different systems
◮ Structure allows for search
◮ Facilitates separation between content and presentation

SLIDE 15

XML – Example

<?xml version="1.0"?>
<pricelist>
  <item>
    <name>Pears</name>
    <price>12.90</price>
  </item>
  <item>
    <name>Apples</name>
    <price>19.90</price>
  </item>
</pricelist>

SLIDE 24

XML – Form

◮ The XML declaration comes first, optionally stating the file encoding. For example, one of

<?xml version="1.0" encoding="UTF-8"?>
<?xml version="1.0" encoding="ISO-8859-1"?>

◮ More declarations may follow.
◮ Thereafter exactly one XML element on the outermost level (pricelist in the example).
◮ End tags are required. (Compare with <p> in HTML.) Special case: an empty element may be abbreviated: <a></a> becomes <a/>. (<a /> is also allowed.)
◮ Correct nesting is required. <a><bbb></a></bbb> is never allowed.
◮ Attribute values must be enclosed in quotation marks. Example (SVG): <circle cx="10" cy="10" r="5" />
◮ As in HTML, "entities" are used for some characters. Example: &lt; for < (which would otherwise start a tag).
◮ A document that follows these syntactic rules is called well-formed.
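
The rules above can be checked mechanically: a parser rejects any text that breaks them. The following is a minimal sketch using the JAXP parser that ships with the JDK. The class name WellFormedCheck and the method isWellFormed are made up for this illustration and are not part of the lecture material.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper class for illustration only.
class WellFormedCheck {

    // Returns true if the text can be parsed as XML, i.e. it is well-formed.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Replace the default handler, which prints fatal errors to stderr.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) { }
                public void fatalError(SAXParseException e) throws SAXParseException {
                    throw e; // syntax errors are always fatal in XML
                }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the parser found a syntax error
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<a><bbb></bbb></a>")); // true
        System.out.println(isWellFormed("<a><bbb></a></bbb>")); // false: wrong nesting
        System.out.println(isWellFormed("<circle r=5 />"));     // false: unquoted attribute
    }
}
```

Note that this only tests syntax. Whether the content is what an application expects is a separate question.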

SLIDE 29

XML – Specifying valid content

Different applications expect different content in their XML files. There are several techniques for specifying valid content:

◮ DTD (document type definition). W3C's first standard.
◮ XML schemas. W3C's follow-up standard with data types and name spaces. Rich but complicated.
◮ Several private initiatives, including the well-supported Relax NG.

An instance document is valid if it satisfies a specification.

SLIDE 34

XML – Document Type Definition

DTD for the pricelist example:

<!ELEMENT pricelist (item*)>
<!ELEMENT item (name, price)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT price (#PCDATA)>

◮ A pricelist element contains any number of item elements.
◮ An item element contains one name element followed by one price element.
◮ The name and price elements consist of parsed character data.

Reference to an external DTD in the instance document:

<!DOCTYPE pricelist SYSTEM "pricelist.dtd">
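
As a sketch of how such a DTD is enforced, the JAXP parser in the JDK can validate while it parses. To keep the example in one file, it assumes the DTD is embedded in the document as an internal subset instead of the external pricelist.dtd. The names DtdValidation and validityErrors are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper class for illustration only.
class DtdValidation {

    // The pricelist DTD, embedded as an internal subset (an assumption of this sketch).
    static final String DOCTYPE =
          "<!DOCTYPE pricelist [\n"
        + "<!ELEMENT pricelist (item*)>\n"
        + "<!ELEMENT item (name, price)>\n"
        + "<!ELEMENT name (#PCDATA)>\n"
        + "<!ELEMENT price (#PCDATA)>\n"
        + "]>\n";

    // Parses DOCTYPE + body and returns the validity errors that were reported.
    static List<String> validityErrors(String body) {
        List<String> errors = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // turn on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) {
                    errors.add(e.getMessage()); // validity errors arrive here
                }
                public void fatalError(SAXParseException e) throws SAXParseException {
                    throw e; // not well-formed at all
                }
            });
            builder.parse(new InputSource(new StringReader(DOCTYPE + body)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        return errors;
    }

    public static void main(String[] args) {
        String valid = "<pricelist><item><name>Pears</name>"
                     + "<price>12.90</price></item></pricelist>";
        String wrongOrder = "<pricelist><item><price>12.90</price>"
                          + "<name>Pears</name></item></pricelist>";
        System.out.println(validityErrors(valid).isEmpty());      // true
        System.out.println(validityErrors(wrongOrder).isEmpty()); // false
    }
}
```

The second document is well-formed but not valid, because the content model (name, price) fixes the order of the two child elements.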

SLIDE 37

XML – Schemas

XML schemas offer more flexibility than DTDs. Data types are supported, with several built-in types such as

◮ String types
◮ Numeric types
◮ Types for date and time

Minimum and maximum values may be specified, sets may be enumerated, etc. Unlike DTDs, schemas are themselves defined in XML.

SLIDE 41

XML – Name spaces

You may need to combine parts from different schemas. Together with schemas, name spaces were introduced to avoid name conflicts.

A name space is identified with a URL, and used with an arbitrary prefix.

Note! The URL only serves as a name. Nothing needs to be retrievable from that address.

SLIDE 44

XML – Example, name spaces

<ica:pricelist xmlns:ica="http://www.ica.se/">
  ...
</ica:pricelist>

Here, xmlns stands for XML name space.

Defining a default namespace (no prefix):

<pricelist xmlns="http://www.ica.se/">
  ...
</pricelist>
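
To a namespace-aware parser the two forms above name the same element, since only the URL and the local name count. A minimal sketch with the JDK's DOM parser follows. The class NamespaceDemo and the method qualifiedRootName are invented for the illustration, and the elements are written as empty elements to keep the input short.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class for illustration only.
class NamespaceDemo {

    // Returns "{name space URL}local name" for the root element of the document.
    static String qualifiedRootName(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // off by default in JAXP
            Element root = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
            return "{" + root.getNamespaceURI() + "}" + root.getLocalName();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Single quotes around attribute values are legal XML and avoid escaping.
        String prefixed = "<ica:pricelist xmlns:ica='http://www.ica.se/'/>";
        String defaulted = "<pricelist xmlns='http://www.ica.se/'/>";
        System.out.println(qualifiedRootName(prefixed));  // {http://www.ica.se/}pricelist
        System.out.println(qualifiedRootName(defaulted)); // {http://www.ica.se/}pricelist
    }
}
```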

SLIDE 45

XML – Schema for the pricelist element (1/3)

The first part of the schema:

<?xml version="1.0"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:ica="http://www.ica.se/"
        targetNamespace="http://www.ica.se/"
        elementFormDefault="unqualified">
...

SLIDE 46

XML – Schema for the pricelist element (2/3)

...
<element name="pricelist">
  <complexType>
    <sequence>
      <element name="item" type="ica:item"
               minOccurs="0" maxOccurs="unbounded">
      </element>
    </sequence>
  </complexType>
</element>
...

SLIDE 47

XML – Schema for the pricelist element (3/3)

...
<complexType name="item">
  <sequence>
    <element name="name" type="string" />
    <element name="price" type="string" />
  </sequence>
</complexType>
</schema>

SLIDE 50

XML – Comments on the schema

With xmlns="http://www.w3.org/2001/XMLSchema" we choose W3C's schema for schema definitions as the default name space. From there we use the elements schema, element, complexType and sequence, and the type string.

With targetNamespace="http://www.ica.se/" we define the name space of the new pricelist element, as well as that of the type item. To refer to this type ourselves, we also had to define the ica prefix.

Regarding elementFormDefault="unqualified", see the next slide and http://www.xfront.com/HideVersusExpose.pdf.

SLIDE 53

XML – Using the schema

Refer like this in the instance document:

<?xml version="1.0"?>
<ica:pricelist xmlns:ica="http://www.ica.se/">
  <item>
    <name>Pears</name>
    <price>12.90</price>
  </item>
</ica:pricelist>

Note. Only pricelist is name space qualified. With elementFormDefault="qualified" all elements would have needed qualification.
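
The effect of elementFormDefault="unqualified" can be tried out with the javax.xml.validation package of the JDK. The sketch below assumes the three schema parts from the earlier slides joined into one string (with single-quoted attributes to avoid escaping). The class SchemaValidation and the method isValid are hypothetical names.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical helper class for illustration only.
class SchemaValidation {

    // The pricelist schema from the slides, joined into one document.
    static final String XSD =
          "<schema xmlns='http://www.w3.org/2001/XMLSchema'"
        + "        xmlns:ica='http://www.ica.se/'"
        + "        targetNamespace='http://www.ica.se/'"
        + "        elementFormDefault='unqualified'>"
        + "  <element name='pricelist'>"
        + "    <complexType><sequence>"
        + "      <element name='item' type='ica:item'"
        + "               minOccurs='0' maxOccurs='unbounded'/>"
        + "    </sequence></complexType>"
        + "  </element>"
        + "  <complexType name='item'><sequence>"
        + "    <element name='name' type='string'/>"
        + "    <element name='price' type='string'/>"
        + "  </sequence></complexType>"
        + "</schema>";

    // Returns true if the instance document is valid against XSD.
    static boolean isValid(String xml) {
        Validator validator;
        try {
            validator = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)))
                .newValidator();
        } catch (SAXException e) {
            throw new IllegalStateException("the schema itself is broken", e);
        }
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the instance violates the schema
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String unqualified = "<ica:pricelist xmlns:ica='http://www.ica.se/'>"
            + "<item><name>Pears</name><price>12.90</price></item>"
            + "</ica:pricelist>";
        String qualified = "<ica:pricelist xmlns:ica='http://www.ica.se/'>"
            + "<ica:item><name>Pears</name><price>12.90</price></ica:item>"
            + "</ica:pricelist>";
        System.out.println(isValid(unqualified)); // true
        System.out.println(isValid(qualified));   // false: item must not be qualified here
    }
}
```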

SLIDE 59

XML – Best Practices

A schema for an organization should perhaps

◮ work smoothly with other schemas
◮ allow updating without making old instance documents invalid
◮ allow instance documents to contain extra information

This is not easy to attain. See advice at http://www.xfront.com/BestPracticesHomepage.html

SLIDE 63

XML – Relax NG

◮ A simpler schema definition language than that from W3C.
◮ Became an ISO standard (ISO/IEC 19757-2) in Sept. 2009.
◮ Two syntaxes: a compact syntax and an XML syntax.
◮ See links at the end of http://www.xmlhack.com/read.php?item=2061

SLIDE 65

XML – Schema with Relax NG Compact Syntax

namespace ica = "http://www.ica.se/"

element ica:pricelist {
  element item {
    element name { text },
    element price { text }
  }*
}

The compact form may be translated to the XML form with the Java program trang. See http://www.abbeyworkshop.com/howto/xml/xml_relax_overview/

SLIDE 66

XML – Schema with Relax NG XML Syntax

<?xml version="1.0"?>
<element name="ica:pricelist"
         xmlns:ica="http://www.ica.se/"
         xmlns="http://relaxng.org/ns/structure/1.0">
  <zeroOrMore>
    <element name="item">
      <element name="name">
        <text />
      </element>
      <element name="price">
        <text />
      </element>
    </element>
  </zeroOrMore>
</element>

SLIDE 71

XML – Validation

On Unix/Linux:

xmllint --noout file    check the file, only show errors (well-formedness only, unless one of the options below is added)
        --dtdvalid      validate against an external DTD
        --schema        validate against a W3C schema
        --relaxng       validate against a Relax NG schema

Web pages such as http://tools.decisionsoft.com/schemaValidate.html

SLIDE 73

XML – The parse tree

<pricelist>
  <item>
    <name>Pear</name>
    <price>12.90</price>
  </item>
  <item>
    <name>Apple</name>
    <price>19.90</price>
  </item>
</pricelist>

pricelist
├── item
│   ├── name
│   └── price
└── item
    ├── name
    └── price

SLIDE 79

XSL – Extensible Stylesheet Language

For presentation of XML documents.

Compare with HTML, which conveys presentational structure by itself. Additional style information may be put in a stylesheet.

XML says nothing about presentation, so XSL supplies it through three components:

◮ XSLT (XSL Transformations) – selects elements in the XML file. Can sort, perform tests, etc.
◮ XPath – syntax for positioning in the XML tree. Similar to path notation in a file system.
◮ XSL-FO (XSL Formatting Objects) – page formatting.
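
XPath can also be used on its own, outside a stylesheet. As a rough illustration of the path notation, the sketch below evaluates two expressions against the first pricelist example (without name spaces) using javax.xml.xpath from the JDK. The class XPathDemo and the method select are made-up names.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper class for illustration only.
class XPathDemo {

    // The pricelist example from the beginning of the lecture.
    static final String PRICELIST =
          "<pricelist>"
        + "<item><name>Pears</name><price>12.90</price></item>"
        + "<item><name>Apples</name><price>19.90</price></item>"
        + "</pricelist>";

    // Evaluates an XPath expression and returns the text of the first match.
    static String select(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression,
                                  new InputSource(new StringReader(PRICELIST)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Steps are separated by "/", much like directories in a file path.
        System.out.println(select("/pricelist/item[1]/name"));               // Pears
        // A predicate in brackets filters the selected elements.
        System.out.println(select("/pricelist/item[name='Apples']/price"));  // 19.90
    }
}
```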

SLIDE 80

XSL – Example on XSLT and XPath

<?xml version="1.0" ?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ica="http://www.ica.se/">
  <xsl:template match="/">
    <html>
      <body>
        <table border="1" cellpadding="5" cellspacing="0">
          <tr><th>Item</th><th>Price</th></tr>
          <xsl:for-each select="ica:pricelist/item">
            <tr><td><xsl:value-of select="name"/></td>
                <td><xsl:value-of select="price"/></td>
            </tr>
          </xsl:for-each>
        </table>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>

SLIDE 83

XSL – Comments on the XSLT example

<xsl:template match="/"> says that the template matches the root of the XML tree, so processing starts there.

For each item in pricelist we then create a row in an HTML table. The row will contain the name and the price of the item.
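
A browser is not the only way to run the stylesheet. The sketch below applies a shortened copy of it (table attributes trimmed, single-quoted attributes) with the XSLT processor built into the JDK. The class XsltDemo and the method transform are hypothetical, and the exact whitespace of the generated HTML depends on the processor, so no output is shown.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class for illustration only.
class XsltDemo {

    // Shortened version of the stylesheet from the lecture.
    static final String STYLESHEET =
          "<xsl:stylesheet version='1.0'"
        + "    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + "    xmlns:ica='http://www.ica.se/'>"
        + "  <xsl:template match='/'>"
        + "    <html><body><table border='1'>"
        + "      <tr><th>Item</th><th>Price</th></tr>"
        + "      <xsl:for-each select='ica:pricelist/item'>"
        + "        <tr><td><xsl:value-of select='name'/></td>"
        + "            <td><xsl:value-of select='price'/></td></tr>"
        + "      </xsl:for-each>"
        + "    </table></body></html>"
        + "  </xsl:template>"
        + "</xsl:stylesheet>";

    static final String PRICELIST =
          "<ica:pricelist xmlns:ica='http://www.ica.se/'>"
        + "<item><name>Pears</name><price>12.90</price></item>"
        + "<item><name>Apples</name><price>19.90</price></item>"
        + "</ica:pricelist>";

    // Applies STYLESHEET to the given XML text and returns the generated HTML.
    static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Prints an HTML table with one row per item.
        System.out.println(transform(PRICELIST));
    }
}
```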

SLIDE 85

Referring to the stylesheet(s) in the XML file

<?xml version="1.0" ?>
<?xml-stylesheet type="text/xsl" href="pricelist.xsl" ?>
<ica:pricelist xmlns:ica="http://www.ica.se/">
  <item>
    <name>Pears</name>
    <price>12.90</price>
  </item>
  <item>
    <name>Apples</name>
    <price>19.90</price>
  </item>
</ica:pricelist>

See the result on http://www.csc.kth.se/utbildning/kth/kurser/DD1335/gruint10/test/pricelist.xml

SLIDE 88

XSL – Tip

Put the files in your public_html directory and view them in a web browser the normal way (http://...).

The browser depends on the MIME type that the server sends in the HTTP header. With "Open File" this information is not obtained.

slide-97
SLIDE 97

XML

DOM – Document Object Model

◮ W3C object-oriented APIs for XML documents.
◮ Access and change a document via the DOM parse tree.
◮ Properties and methods such as
    ◮ documentElement – returns the root element
    ◮ childNodes – returns all children of a node
    ◮ attributes – returns all attributes of an element node
    ◮ nodeType, nodeValue, etc.
    ◮ removeChild, appendChild, etc.

slide-99
SLIDE 99

XML

XML in Java

JAXP – Java API for XML Processing: a common interface to DOM, SAX, and XSLT.

SAX:

◮ “Simple” API for XML
◮ Processes an XML file while reading through it
◮ Fast, memory efficient
◮ More complicated to use than DOM

Many other APIs:

◮ JDOM, DOM4J – alternative DOM-like tree APIs
◮ JAXB – converts between XML and Java objects
◮ JAXM, JAX-RPC – for asynchronous and synchronous messaging
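To illustrate the streaming style of SAX, here is a minimal sketch that records every element name as the parser reads through the input. The class name SaxExample, the helper elementNames and the sample XML are made up for this illustration; only the JAXP and SAX classes are part of the standard library.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: lists element names using SAX callbacks.
public class SaxExample {

    // Returns the names of all elements in the order their start tags are read.
    static List<String> elementNames(String xml) throws Exception {
        List<String> names = new ArrayList<>();

        // The handler is called back while the parser reads the input;
        // no tree is built in memory.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                names.add(qName);
            }
        };

        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Sample input invented for this sketch.
        String xml = "<course code=\"DD1335\">"
                   + "<lecture>XML</lecture><lecture>Servlets</lecture>"
                   + "</course>";
        System.out.println(elementNames(xml));
    }
}
```

The application has to keep track of any state it needs between callbacks (here, just a list), which is why SAX tends to be more work to use than DOM even though it needs less memory.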

slide-100
SLIDE 100

XML

XML DOM in JAXP

    import java.io.File;
    import javax.xml.parsers.*;
    import org.w3c.dom.Document;

    // Reads an XML file into a DOM structure.
    // Usage: java DomExample filename
    public class DomExample {
        public static void main(String[] args) throws Exception {
            // The factory selects the parser implementation configured for JAXP.
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Parse from a File so that the argument is treated as a file name, not a URI.
            Document document = builder.parse(new File(args[0]));
            // Explore the document here.
        }
    }
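As a sketch of what exploring the document could look like, the following uses the DOM properties and methods listed earlier (in Java they appear as getDocumentElement, getChildNodes, getAttributes, removeChild and appendChild). The class name DomExplore, the helper methods and the sample XML are assumptions made for this illustration and are not part of the lecture code.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class: reads and modifies a small DOM tree.
public class DomExplore {

    // Parses XML given as a string (instead of a file) to keep the sketch self-contained.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Counts the element children of a node, skipping text and comment nodes.
    static int countChildElements(Node parent) {
        int count = 0;
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i).getNodeType() == Node.ELEMENT_NODE) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        // Sample input invented for this sketch.
        Document doc = parse("<course code=\"DD1335\">"
            + "<lecture>XML</lecture><lecture>JSP</lecture></course>");

        Element root = doc.getDocumentElement();
        System.out.println("Root element: " + root.getNodeName());
        System.out.println("code attribute: "
            + root.getAttributes().getNamedItem("code").getNodeValue());
        System.out.println("Element children: " + countChildElements(root));

        // Change the tree: remove the first child and append a new element.
        root.removeChild(root.getFirstChild());
        Element extra = doc.createElement("lecture");
        extra.appendChild(doc.createTextNode("Servlets"));
        root.appendChild(extra);
        System.out.println("Last lecture: " + root.getLastChild().getTextContent());
    }
}
```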

slide-105
SLIDE 105

XML

Web services

“Applications available over a network via standard protocols.”

Client-server, as well as peer-to-peer.

Typical traditional web service: HTML content, downloadable over HTTP.

New web services: composed of distributed parts that are linked dynamically to run seamlessly.

XML plays a key role.

slide-109
SLIDE 109

XML

Protocols for new web services

◮ SOAP – encodes and transmits messages between programs on the web.
◮ WSDL (Web Services Description Language) – XML format for describing web services (operations, messages, types, etc.).
◮ UDDI (Universal Description, Discovery and Integration) – protocol for distributed registries of web services. The registries hold company information such as name, category and services. Uses WSDL and SOAP. The aim is that a program can automatically find and use the services it needs.

slide-112
SLIDE 112

XML

SOAP

Simple Object Access Protocol (a misnomer, not really object oriented).

Smaller and simpler to implement than earlier distributed protocols, which did not become as widespread.

Allows programs written in different languages and running on different platforms to communicate.

slide-113
SLIDE 113

XML

The SOAP format

    <?xml version="1.0"?>
    <soap:Envelope
        xmlns:soap="http://www.w3.org/2001/12/soap-envelope"
        soap:encodingStyle="http://www.w3.org/2001/12/soap-encoding">
      <soap:Header>
        ...
      </soap:Header>
      <soap:Body>
        ...
      </soap:Body>
    </soap:Envelope>

slide-116
SLIDE 116

XML

Use of SOAP

The Body element contains the message in XML. Two common cases are “document-style” for arbitrary documents, and “RPC-style” for function calls. The Body element may also contain a Fault element to report that a fault has occurred.

The Header element is optional, and intended for instructions to intermediaries on the way to the message destination. Intermediaries may add information, verify payment, etc.

SOAP is normally sent over HTTP, but other transport protocols (even e-mail) can be used.
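As a rough sketch of an RPC-style message, the following builds an envelope with the JAXP DOM API and serializes it to text. The envelope namespace URI is the one used in the SOAP format example; a real service may expect a different URI depending on the SOAP version it implements. The application namespace http://example.org/stock, the operation GetPrice, its parameter StockName and the class name are all hypothetical. The optional Header element is left out.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical example class: builds an RPC-style SOAP request as a DOM tree.
public class SoapEnvelopeSketch {

    // Envelope namespace as used in the SOAP format example (an assumption;
    // the correct value depends on the SOAP version of the service).
    static final String SOAP_NS = "http://www.w3.org/2001/12/soap-envelope";
    // Made-up application namespace for the function call.
    static final String APP_NS = "http://example.org/stock";

    // Builds Envelope > Body > GetPrice > StockName.
    static Document buildRequest(String stockName) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().newDocument();

        Element envelope = doc.createElementNS(SOAP_NS, "soap:Envelope");
        doc.appendChild(envelope);
        Element body = doc.createElementNS(SOAP_NS, "soap:Body");
        envelope.appendChild(body);

        // RPC style: the child of Body names the function, its children are arguments.
        Element call = doc.createElementNS(APP_NS, "m:GetPrice");
        Element argument = doc.createElementNS(APP_NS, "m:StockName");
        argument.setTextContent(stockName);
        call.appendChild(argument);
        body.appendChild(call);
        return doc;
    }

    // Serializes the DOM tree to a string using an identity transformation.
    static String serialize(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize(buildRequest("IBM")));
    }
}
```

The resulting text could then be sent as the body of an HTTP POST request, or over another transport.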

slide-117
SLIDE 117

XML

XML – Summary

We have looked at

◮ the XML syntax
◮ three ways to specify allowed elements: DTDs, XML Schemas and Relax NG
◮ namespaces
◮ XSL for presentation
◮ XML support in Java
◮ SOAP, WSDL and UDDI for flexible web services
