XML and Web Services: Lecture 8




1

XML and Web Services

Lecture 8

2

Outline

  • XML (Section 17)

– XML syntax, semistructured data
– Document Type Definitions (DTDs)
– XML Schema

  • Introduction to XML based Web Services

3

Additional Readings on XML

  • XML

– http://www.w3.org/XML/1999/XML-in-10-points
– www.zvon.org/xxl/XMLTutorial/General/book_en.html
– http://www.w3.org/TR/REC-xml-names (1/99)

  • Main source: www.w3.org (but hard to read)

  • Several XML tutorials on the Web

4

XML

  • eXtensible Markup Language
  • XML 1.0 – a recommendation from W3C, 1998

  • Roots: SGML (used in publishing).

– Standard Generalized Markup Language

  • After the roots: a format for sharing data

5

XML Data

  • Relational data does not have a syntax

– I can't "give" you my relational database
– Need to import it from another syntax, like CSV (comma-separated values)

  • XML = rich syntax for data

– But XML is not relational: semi-structured

  • Usage:

– Map any data to XML
– Store it in files, exchange on the Web, Web services etc.
– Even query it directly, using XPath, XQuery

6

XML Data Sharing and Exchange

[Figure: XML data is exchanged over the Web (HTTP) between applications and data sources (relational data, object-relational data, legacy data). Operations shown: transform, integrate, warehouse. Caption: specific data management tasks.]


7

From HTML to XML

HTML describes the layout

8

HTML

<h1> Bibliography </h1>
<p> <i> Foundations of Databases </i> Abiteboul, Hull, Vianu
    <br> Addison Wesley, 1995
<p> <i> Data on the Web </i> Abiteboul, Buneman, Suciu
    <br> Morgan Kaufmann, 1999


9

XML

<bibliography>
  <book>
    <title> Foundations… </title>
    <author> Abiteboul </author>
    <author> Hull </author>
    <author> Vianu </author>
    <publisher> Addison Wesley </publisher>
    <year> 1995 </year>
  </book>
  …
</bibliography>

XML describes the structure

10

XML Terminology

  • tags: book, title, author, …
  • start tag: <book>, end tag: </book>
  • elements: <book>…</book>,<author>…</author>
  • elements are nested
  • empty element: <red></red>, abbreviated as <red/>

A well-formed XML document:

  • has matching tags
  • has properly nested tags
  • has a single root element
  • satisfies further constraints, e.g. on names
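
The well-formedness rules above can be checked with any XML parser. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the helper name `is_well_formed` and the sample strings are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if `text` parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

good = "<book><title>Foundations</title><red/></book>"
bad_nesting = "<book><title>Foundations</book></title>"  # tags overlap
two_roots = "<book/><book/>"                             # more than one root

for sample in (good, bad_nesting, two_roots):
    print(is_well_formed(sample), sample)
```

The parser only reports well-formedness here; it does not check the document against a DTD or schema.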

11

More XML: Attributes

<book price="55" currency="EUR">
  <title> Foundations of Databases </title>
  <author> Abiteboul </author>
  …
  <year> 1995 </year>
</book>

Attributes are an alternative way to represent data.

12

XML example

<?xml version='1.0' encoding='utf-8'?>
<!-- A full XML Example -->
<book price="55" currency="EUR">
  <title> Foundations of Databases </title>
  <author> Abiteboul </author>
  …
  <year> 1995 </year>
</book>


13

More XML: Comments

  • Syntax <!-- .... Comment text... -->

– Same syntax as in HTML

  • Yes, they are part of the data model !!!

– Good documentation should be provided using comments

  • In particular for Web services (see later)

14

XML Data: a Tree !

<data>
  <person id="o555">
    <name> Mary </name>
    <address>
      <street> Maple </street>
      <no> 345 </no>
      <city> Seattle </city>
    </address>
  </person>
  <person>
    <name> John </name>
    <address> Thailand </address>
    <phone> 23456 </phone>
  </person>
</data>

[Figure: the document drawn as a tree. Element nodes: data, person, name, address, street, no, city, phone. Text nodes: Mary, Maple, 345, Seattle, John, Thailand, 23456. Attribute node: id = o555.]

Order matters !!!
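
A minimal sketch of the tree view, assuming Python's standard `xml.etree.ElementTree`: it walks the document from the slide and lists element, attribute and text nodes in document order. The function name `outline` is an assumption for illustration.

```python
import xml.etree.ElementTree as ET

doc = """<data>
  <person id="o555">
    <name>Mary</name>
    <address><street>Maple</street><no>345</no><city>Seattle</city></address>
  </person>
  <person>
    <name>John</name>
    <address>Thailand</address>
    <phone>23456</phone>
  </person>
</data>"""

def outline(elem, depth=0):
    """Return indented lines: one per element, attribute and non-empty text node."""
    pad = "  " * depth
    lines = [f"{pad}element {elem.tag}"]
    for key, value in elem.attrib.items():
        lines.append(f"{pad}  attribute {key}={value}")
    text = (elem.text or "").strip()
    if text:
        lines.append(f"{pad}  text {text}")
    for child in elem:  # children are visited in document order
        lines.extend(outline(child, depth + 1))
    return lines

root = ET.fromstring(doc)
print("\n".join(outline(root)))
```

Because children are kept in document order, swapping the two person elements would produce a different tree.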


15

More XML: IDs and References

<person id="o555"> <name> Jane </name> </person>
<person id="o456"> <name> Mary </name> <children idref="o123 o555"/> </person>
<person id="o123" mother="o456"> <name>John</name> </person>

Scope of IDs and references is the document
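
References are resolved by looking up IDs within the same document. The sketch below assumes Python's standard `xml.etree.ElementTree` and wraps the three person elements in a hypothetical `<people>` root, since the slide shows them without one; `name_of` and `children_of` are illustrative helper names.

```python
import xml.etree.ElementTree as ET

# <people> is an assumed wrapper element so that the fragment has a single root.
doc = """<people>
  <person id="o555"><name>Jane</name></person>
  <person id="o456"><name>Mary</name><children idref="o123 o555"/></person>
  <person id="o123" mother="o456"><name>John</name></person>
</people>"""

root = ET.fromstring(doc)

# IDs are unique within the document, so one dictionary is enough.
by_id = {person.get("id"): person for person in root.iter("person")}

def name_of(person_id):
    return by_id[person_id].findtext("name")

def children_of(person_id):
    """Follow the space-separated references in the children element."""
    node = by_id[person_id].find("children")
    if node is None:
        return []
    return [name_of(ref) for ref in node.get("idref", "").split()]

print(children_of("o456"))
print(name_of(by_id["o123"].get("mother")))
```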

16

From Relational Data to XML Data

<persons>
  <row> <name>John</name> <phone> 3634</phone> </row>
  <row> <name>Sue</name> <phone> 6343</phone> </row>
  <row> <name>Dick</name> <phone> 6363</phone> </row>
</persons>

Relational table persons:

Name  Phone
John  3634
Sue   6343
Dick  6363

[Figure: the XML tree. The root persons has three row children; each row has a name child ("John", "Sue", "Dick") and a phone child (3634, 6343, 6363).]
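
A minimal sketch of the mapping from a relational table to XML, assuming Python's standard `xml.etree.ElementTree`. The function name `table_to_xml` is an assumption; the `row` element and the column names follow the slide.

```python
import xml.etree.ElementTree as ET

rows = [("John", "3634"), ("Sue", "6343"), ("Dick", "6363")]

def table_to_xml(table_name, columns, rows):
    """Map each tuple to a <row> element with one child element per column."""
    root = ET.Element(table_name)
    for row in rows:
        row_elem = ET.SubElement(root, "row")
        for column, value in zip(columns, row):
            ET.SubElement(row_elem, column).text = value
    return root

persons = table_to_xml("persons", ["name", "phone"], rows)
xml_text = ET.tostring(persons, encoding="unicode")
print(xml_text)
```

The column names end up inside every row, which is the repetition of schema elements discussed on the following slide.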


17

XML Data

  • XML is self-describing
  • Schema elements become part of the data

– Relational schema: persons(name,phone)
– In XML <persons>, <name>, <phone> are part of the data, and are repeated many times

  • Consequence: XML is much more flexible

– However, XML data is redundant!

  • XML = semi-structured data

18

Structured / Unstructured / Semi-Structured Data

  • Structured data

– Organised in semantic chunks (entities)
– Similar entities are grouped together and have the same format

  • Unstructured data

– Data can be of any type

  • Semi-structured data

– Entities may not have the same attributes
– Not all attributes may be required
– Order of attributes not necessarily important
– Nested and heterogeneous


19

Semi-structured Data Explained

  • Missing attributes:

<person> <name>John</name> <phone>1234</phone> </person>
<person> <name>Joe</name> </person>

(Joe has no phone!)

  • Could represent in a table with nulls:

name  phone
John  1234
Joe   -

20

Semi-structured Data Explained

  • Repeated attributes:

<person> <name>Mary</name> <phone>2345</phone> <phone>3456</phone> </person>

(Mary has two phones!)

  • Impossible in tables:

name  phone
Mary  2345  3456  ???
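
A minimal sketch of how a program can read such data, assuming Python's standard `xml.etree.ElementTree` and a hypothetical `<db>` root around the person elements. Missing and repeated phone elements both fit naturally into a list per person.

```python
import xml.etree.ElementTree as ET

# <db> is an assumed wrapper element; the persons come from the slides.
doc = """<db>
  <person><name>John</name><phone>1234</phone></person>
  <person><name>Joe</name></person>
  <person><name>Mary</name><phone>2345</phone><phone>3456</phone></person>
</db>"""

def phones_by_name(xml_text):
    """Map each person's name to a list of zero, one or many phone numbers."""
    result = {}
    for person in ET.fromstring(xml_text).iter("person"):
        name = person.findtext("name").strip()
        result[name] = [phone.text.strip() for phone in person.findall("phone")]
    return result

print(phones_by_name(doc))
```

A flat table would need a null for Joe and either a second row or a second table for Mary.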


21

Semi-structured Data Explained

  • Attributes with different types in different objects
  • Nested collections (no 1NF)
  • Heterogeneous collections:

– <db> contains both <book>s and <publisher>s

<person>
  <name> <first> John </first> <last> Smith </last> </name>
  <phone>1234</phone>
</person>

(structured name!)

22

How to describe data types/schema?

  • In XML we basically have two options:

– Document Type Definition (DTD)

  • Since early days of XML
  • Allows for document validation
  • Limited support for data types

– XML Schema

  • Allows for simple and complex data types
  • Has mainly replaced DTD

23

Document Type Definitions DTD

  • Part of the original XML specification
  • an XML document may have a DTD
  • XML document:

– well-formed = if tags are correctly closed
– valid = if it has a DTD and conforms to it

  • Validation is useful in data exchange

24

Very Simple DTD

<!DOCTYPE company [
  <!ELEMENT company ((person|product)*)>
  <!ELEMENT person (ssn, name, office, phone?)>
  <!ELEMENT ssn (#PCDATA)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT office (#PCDATA)>
  <!ELEMENT phone (#PCDATA)>
  <!ELEMENT product (pid, name, description?)>
  <!ELEMENT pid (#PCDATA)>
  <!ELEMENT description (#PCDATA)>
]>


25

Very Simple DTD

Example of valid XML document:

<company>
  <person>
    <ssn> 123456789 </ssn>
    <name> John </name>
    <office> B432 </office>
    <phone> 1234 </phone>
  </person>
  <person>
    <ssn> 987654321 </ssn>
    <name> Jim </name>
    <office> B123 </office>
  </person>
  <product> ... </product>
  ...
</company>

26

DTD: The Content Model

  • Content model:

<!ELEMENT tag (CONTENT)>

where CONTENT is the content model:

– Complex = a regular expression over other elements
– Text-only = #PCDATA
– Empty = EMPTY
– Any = ANY
– Mixed content = (#PCDATA | A | B | C)*

Occurrence operators:

– * … 0 or many
– + … at least one
– ? … 0 or 1
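
Since a complex content model is a regular expression over element names, it can be translated almost one-to-one into an ordinary regex. The following is a rough sketch, not a real DTD validator: it assumes Python's standard `re` and `xml.etree.ElementTree`, handles only element content (no `#PCDATA`, `EMPTY` or `ANY`), and the helper names `content_model_to_regex` and `children_match` are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

TOKEN = re.compile(r"[A-Za-z_][\w.-]*|[()|*+?,]")

def content_model_to_regex(model):
    """Translate a DTD element-content model into a regex over child tag names."""
    parts = []
    for token in TOKEN.findall(model):
        if token == ",":
            continue                # a sequence is plain concatenation
        elif token == "(":
            parts.append("(?:")     # non-capturing group
        elif token in ")|*+?":
            parts.append(token)     # same meaning in DTDs and in regexes
        else:
            parts.append(f"(?:{re.escape(token)} )")  # one child element
    return re.compile("".join(parts))

def children_match(elem, model):
    """Check the sequence of child tags of `elem` against the content model."""
    child_tags = "".join(f"{child.tag} " for child in elem)
    return content_model_to_regex(model).fullmatch(child_tags) is not None

person = ET.fromstring(
    "<person><ssn>1</ssn><name>John</name><office>B432</office></person>")
print(children_match(person, "(ssn, name, office, phone?)"))
print(children_match(person, "(name, ssn, office)"))
```

Each name is matched together with a trailing space, so one tag name cannot be mistaken for a prefix of another.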


27

DTD: Regular Expressions

Sequence:

DTD: <!ELEMENT name (firstName, lastName)>
XML: <name> <firstName> . . . . . </firstName> <lastName> . . . . . </lastName> </name>

Optional:

DTD: <!ELEMENT name (firstName?, lastName)>
XML: <name> <lastName> . . . . . </lastName> </name>

Star (repeated occurrence):

DTD: <!ELEMENT person (name, phone*)>
XML: <person> <name> . . . . . </name> <phone> . . . . . </phone> <phone> . . . . . </phone> <phone> . . . . . </phone> . . . . . . </person>

Alternation:

DTD: <!ELEMENT person (name, (phone|email))>
XML: <person> <name> . . . . . </name> <email> . . . . . </email> </person>

28

DTD: Attributes

  • Document Type Definition

<!ELEMENT person (ssn, name, office, phone?)>
<!ATTLIST person
    age         CDATA         #REQUIRED
    birthdate   CDATA         #IMPLIED
    nationality CDATA         #FIXED "CH"
    gender      (male|female) "female">

– age: mandatory (#REQUIRED, which takes no default value)
– birthdate: optional (#IMPLIED)
– nationality: fixed value "CH" (#FIXED)
– gender: enumeration with default "female"

  • Document

<person age="24" nationality="CH" gender="male">
  <ssn> … </ssn> … <phone> … </phone>
</person>


29

Inclusion of DTD in Documents

External DTD declaration:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE test PUBLIC "-//Test AG//DTD test V1.0//EN" "http://www.test.org/test.dtd">
<test> "test" is a document element </test>

Mixed usage (external DTD plus internal declarations):

<!DOCTYPE test SYSTEM "http://www.test.org/test.dtd" [
  <!ENTITY hello "hello world">
]>
<test>&hello;</test>

Internal DTD declaration:

<!DOCTYPE test [
  <!ELEMENT test EMPTY>
]>
<test/>

30

XML Schema

  • DTD has restrictions

– Only allows for very simple data types

  • XML Schema

– Uses same syntax as XML documents
– Allows for basic (built-in) and complex data types

  • integer, float, boolean, string etc.

31

XML Schema Example

<element name="Person" type="string"/>
<element name="age" type="integer"/>

<Person>Antonia</Person>
<age>23</age>

Would like to have an entity called Person …

32

Complex Types

<complexType name="Person">
  <sequence>
    <element name="Name" type="string"/>
    <element name="age" type="integer"/>
    <element name="Address" type="string" minOccurs="1" maxOccurs="unbounded"/>
  </sequence>
</complexType>

<Person>
  <Name>Antonia</Name>
  <age>23</age>
  <Address>1015 Lausanne</Address>
  <Address>1010 Vienna</Address>
</Person>


33

XML Namespaces

  • Different DTDs/XML Schemas can use the same names!

– How to avoid conflicts when combining names from different DTDs/XML Schemas?

  • An XML namespace is a collection of names (markup vocabulary)

– identified by a prefix (URL reference)

34

XML Namespaces

  • name ::= [prefix:]localname

<book xmlns='urn:loc.gov:book'
      xmlns:isbn='www.isbn-org.org/def'>
  <title> … </title>
  <number> 15 </number>
  <isbn:number> …. </isbn:number>
</book>

xmlns='urn:loc.gov:book' declares the default namespace; the unprefixed names (book, title, number) belong to it.
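
A minimal sketch of how a namespace-aware parser sees these names, assuming Python's standard `xml.etree.ElementTree`. The element contents are placeholders (the slide elides them), and the prefixes `b` and `i` used for lookup are arbitrary choices made here; only the namespace names matter.

```python
import xml.etree.ElementTree as ET

# Element contents are placeholders; the namespace names come from the slide.
doc = """<book xmlns="urn:loc.gov:book" xmlns:isbn="www.isbn-org.org/def">
  <title>placeholder title</title>
  <number>15</number>
  <isbn:number>placeholder isbn</isbn:number>
</book>"""

root = ET.fromstring(doc)

# ElementTree expands every name to {namespace-name}localname.
tags = [child.tag for child in root]
print(tags)

# Prefixes used for lookup are local to this program.
ns = {"b": "urn:loc.gov:book", "i": "www.isbn-org.org/def"}
print(root.findtext("b:number", namespaces=ns))
print(root.findtext("i:number", namespaces=ns))
```

The two number elements have the same local name but different expanded names, so they no longer conflict.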


35

XML Namespaces

  • syntactic: <number> , <isbn:number>
  • semantic: URL used as unique identifier

– URL may not exist, has no function

<tag xmlns:mystyle = "http://…">
  …
  <mystyle:title> … </mystyle:title>
  <mystyle:number> … </mystyle:number>
</tag>

The names prefixed with mystyle: belong to this namespace.

36

<?xml version="1.0" encoding="ISO-8859-1" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="shiporder">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="orderperson" type="xs:string"/>
        <xs:element name="shipto">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="name" type="xs:string"/>
              <xs:element name="address" type="xs:string"/>
              <xs:element name="city" type="xs:string"/>
              <xs:element name="country" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
        <xs:element name="item" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="title" type="xs:string"/>
              <xs:element name="note" type="xs:string" minOccurs="0"/>
              <xs:element name="quantity" type="xs:positiveInteger"/>
              <xs:element name="price" type="xs:decimal"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <xs:attribute name="orderid" type="xs:string" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
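
To make the schema concrete, the sketch below parses a hypothetical instance document and re-implements a few of the schema's constraints by hand in Python. This is not XML Schema validation (the standard library has no schema validator) and the type checks only approximate `xs:positiveInteger` and `xs:decimal`; all instance values and the name `check_order` are invented for illustration.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal, InvalidOperation

# Hypothetical instance document; every value is made up for this example.
order = """<shiporder orderid="A-1">
  <orderperson>Jane Doe</orderperson>
  <shipto><name>Jane Doe</name><address>1 Main St</address>
    <city>Lausanne</city><country>CH</country></shipto>
  <item><title>Data on the Web</title><quantity>2</quantity><price>49.90</price></item>
</shiporder>"""

def is_positive_integer(text):
    try:
        return int(text) > 0
    except ValueError:
        return False

def check_order(xml_text):
    """Return a list of problems found by a few hand-written checks."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != "shiporder" or "orderid" not in root.attrib:
        problems.append("root must be <shiporder> with an orderid attribute")
    items = root.findall("item")
    if not items:
        problems.append("at least one <item> is required")
    for item in items:
        quantity = item.findtext("quantity", default="")
        if not is_positive_integer(quantity):
            problems.append(f"quantity {quantity!r} is not a positive integer")
        try:
            Decimal(item.findtext("price", default="").strip())
        except InvalidOperation:
            problems.append("price is not a decimal")
    return problems

print(check_order(order))
```

A real deployment would hand the schema and the document to a schema-aware validator instead of writing such checks by hand.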


37

Outline

  • XML (Section 17)

– XML syntax, semistructured data
– Document Type Definitions (DTDs)
– XML Schema

  • Introduction to XML based Web Services

38

What is a Web Service?

  • A Web service is a network-accessible interface to application programs, built using standard Internet technologies.
  • Clients of a Web service do NOT need to know how it is implemented.

[Figure: an application client reaches an application program over the network through a Web service.]


39

Web Server vs. Web Service

  • Web Server

– Typically an HTTP server that serves Web pages such as the EPFL Web site

  • Web Service

– Provides a programmable interface to client applications

  • Typically uses SOAP for communication

– We can write Java, C++, …, applications to access the service

40

Web Services: Some Definitions

  • A Web Service is a URL-addressable software resource that performs functions (or a function).
  • "Web services are a new breed of Web application. They are self-contained, self-describing, modular applications that can be published, located, and invoked across the Web. Web services perform functions, which can be anything from simple requests to complicated business processes. … Once a Web service is deployed, other applications (and other Web services) can discover and invoke the deployed service." (IBM web service tutorial)


41

Let's work out the details

  • Client and server need to speak a certain protocol

– HTTP, "home-grown", SOAP, etc.
– Messages need to be exchanged between client and server
– Communication should be "fast"
– Independent of programming language, operating system etc.

42

Example: InfoPoint Service

  • Assume a service that returns temperature, time etc. of a certain city:

[Figure: a client talks to the InfoPoint Service. Example answers: "San Francisco, time 0:30 PST" and "San Francisco, temp 11 C".]


43

Messages, Remote Procedures

  • The client should have the impression of calling a (remote) procedure
  • She should be able to do that from her programming language

String getTime(String city)
String getTemp(String city)
int getAltitude(String city)

44

How do we exchange data/messages?

  • Server needs to

– Listen on a certain port
– Speak a certain protocol

  • Example (in class):

– Open a socket connection and use TCP/IP
– Send a few bytes

  • Write data to the network

– Wait for a response

  • Read data from network

– What is the problem with "home-grown" protocols on top of TCP/IP?
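
The steps above can be sketched with Python's standard `socket` module. The one-line text protocol (`TIME <city>` answered by `OK ...`) is invented for this example, which is exactly the issue raised by the last question: only programs that already know this private convention can talk to the server. The reply value is hard-coded from the InfoPoint example.

```python
import socket
import threading

def serve_once(server_sock):
    """Accept one client, read one request line, answer with one line."""
    conn, _ = server_sock.accept()
    with conn, conn.makefile("r", encoding="utf-8") as reader:
        request = reader.readline().strip()
        # Home-grown protocol (an assumption of this sketch): "TIME <city>"
        if request.startswith("TIME "):
            reply = f"OK {request[5:]} 0:30 PST\n"  # hard-coded sample value
        else:
            reply = "ERROR unknown request\n"
        conn.sendall(reply.encode("utf-8"))

def ask(port, line):
    """Open a TCP connection, write a few bytes, wait for the response."""
    with socket.create_connection(("127.0.0.1", port)) as sock:
        sock.sendall((line + "\n").encode("utf-8"))
        with sock.makefile("r", encoding="utf-8") as reader:
            return reader.readline().strip()

server = socket.create_server(("127.0.0.1", 0))  # port 0: let the OS pick one
port = server.getsockname()[1]
worker = threading.Thread(target=serve_once, args=(server,))
worker.start()
answer = ask(port, "TIME San Francisco")
worker.join()
server.close()
print(answer)
```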


45

HTTP (Hyper Text Transfer Protocol)

  • Standardised way of exchanging requests and responses
  • Mainly designed for exchange of Web pages or data

Request:

GET /index.html HTTP/1.1
Host: server.example.org

Response:

HTTP/1.1 200 OK
Connection: close
Content-Length: 80
Content-Type: text/html
Date: Wed, 25 Feb 2007 14:07:42 GMT

<HTML> <BODY> <H1>Introduction to Info Sys</H1> The course provides … </BODY> </HTML>
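
An HTTP message is a status or request line, header lines, an empty line, and then the body. The sketch below splits a raw response into these parts in plain Python; it is a simplified illustration (no chunked encoding, no repeated headers), and `parse_http_response` is a hypothetical helper name.

```python
def parse_http_response(raw):
    """Split a raw HTTP/1.1 response into status code, reason, headers, body."""
    head, _, body = raw.partition("\r\n\r\n")   # an empty line ends the headers
    status_line, *header_lines = head.split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers, body

raw = (
    "HTTP/1.1 200 OK\r\n"
    "Connection: close\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<HTML><BODY><H1>Introduction to Info Sys</H1></BODY></HTML>"
)
code, reason, headers, body = parse_http_response(raw)
print(code, reason, headers["content-type"])
```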

46

Example: InfoPoint Service

  • How do we tell the service:

– Which method to call?
– Which parameters (data types)?

Request:

POST /InfoPoint HTTP/1.1
Host: server.example.org

getTime("San Francisco") …

Response:

HTTP/1.1 200 OK
Connection: close
Content-Length: 80
Content-Type: text/html
Date: Wed, 25 Feb 2007 14:07:42 GMT

Time=0:30


47

XML for Message Exchange

  • Requests and responses can be translated into XML
  • Use data types such as those defined in XML Schema
  • Provides a powerful way to map methods into requests/responses
  • This is the main idea of SOAP-based Web Services

48

InfoPoint in XML

String getTime(String city)
String getTemp(String city)
int getAltitude(String city)

Request:

<InfoPoint>
  <getTimeRequest>
    <city>San Francisco</city>
  </getTimeRequest>
</InfoPoint>

Response:

<InfoPoint>
  <getTimeResponse>
    <time>0:30 PST</time>
  </getTimeResponse>
</InfoPoint>
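
A minimal sketch of mapping a method call to these messages, assuming Python's standard `xml.etree.ElementTree`. The naming convention (`<method>Request` / `<method>Response` inside `<InfoPoint>`) is read off the example; `build_request` and `parse_response` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

def build_request(method, **params):
    """Encode a call such as getTime(city=...) as an InfoPoint request."""
    root = ET.Element("InfoPoint")
    call = ET.SubElement(root, f"{method}Request")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(root, encoding="unicode")

def parse_response(xml_text, method):
    """Decode the children of <methodResponse> into a dictionary."""
    reply = ET.fromstring(xml_text).find(f"{method}Response")
    return {child.tag: child.text for child in reply}

request = build_request("getTime", city="San Francisco")
print(request)

response = ("<InfoPoint><getTimeResponse><time>0:30 PST</time>"
            "</getTimeResponse></InfoPoint>")
print(parse_response(response, "getTime"))
```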


49

Putting it all together: XML over HTTP

  • Send XML file with HTTP Request
  • Receive XML file in HTTP Response

Request:

POST /InfoPoint HTTP/1.1
Host: server.example.org

<InfoPoint>
  <getTimeRequest>
    <city>San Francisco</city>
  </getTimeRequest>
</InfoPoint>

Response:

HTTP/1.1 200 OK
Connection: close
Content-Type: text/xml
Date: Wed, 25 Feb 2007 14:07:42 GMT

<InfoPoint>
  <getTimeResponse>
    <time>0:30 PST</time>
  </getTimeResponse>
</InfoPoint>
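
The exchange can be tried end to end on one machine. The following sketch uses Python's standard `http.server` and `urllib.request`; it is an illustration under assumptions, not a SOAP implementation. The handler only understands `getTimeRequest`, the lookup table `TIMES` holds the single value from the example, and the names `InfoPointHandler` and `get_time` are invented here.

```python
import threading
import urllib.request
import xml.etree.ElementTree as ET
from http.server import BaseHTTPRequestHandler, HTTPServer

TIMES = {"San Francisco": "0:30 PST"}  # hypothetical data from the example

class InfoPointHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the XML request body and extract the city parameter.
        length = int(self.headers.get("Content-Length", 0))
        call = ET.fromstring(self.rfile.read(length)).find("getTimeRequest")
        city = call.findtext("city")
        # Build the XML response.
        root = ET.Element("InfoPoint")
        reply = ET.SubElement(root, "getTimeResponse")
        ET.SubElement(reply, "time").text = TIMES.get(city, "unknown")
        body = ET.tostring(root)
        self.send_response(200)
        self.send_header("Content-Type", "text/xml")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet

def get_time(url, city):
    """Client stub: looks like a local call, sends XML over HTTP."""
    root = ET.Element("InfoPoint")
    ET.SubElement(ET.SubElement(root, "getTimeRequest"), "city").text = city
    request = urllib.request.Request(
        url, data=ET.tostring(root), headers={"Content-Type": "text/xml"})
    # An empty ProxyHandler avoids sending the local request through a proxy.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request) as response:
        return ET.fromstring(response.read()).findtext("getTimeResponse/time")

server = HTTPServer(("127.0.0.1", 0), InfoPointHandler)  # port 0: OS picks one
threading.Thread(target=server.serve_forever, daemon=True).start()
result = get_time(f"http://127.0.0.1:{server.server_port}/InfoPoint", "San Francisco")
server.shutdown()
server.server_close()
print(result)
```

The client function hides the message format, which gives the caller the impression of an ordinary procedure call.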

50

Web Service Architecture

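[Figure: the Web service triangle]

  • Service provider ("server"): publishes its service description to the service broker (WSDL)
  • Service broker ("naming service"): lets service requestors find services (UDDI)
  • Service requestor ("client"): binds to the service provider (SOAP)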


51

Advantages of Web Services

  • Platform independence

– Not bound to any programming language or operating system
– Client only needs to understand

  • HTTP (TCP/IP)
  • XML

  • A standardised protocol exists to create XML requests/responses: SOAP

– Will be discussed next time

  • In principle, simple to understand, implement

52

Disadvantages of Web Services

  • XML data is exchanged:

– Often, lots of redundancy in messages
– Can lead to performance issues

  • Not designed for high-performance data exchange
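
The redundancy is easy to measure. The sketch below serialises the three-row persons table from earlier in the lecture once as CSV and once as XML, using Python's standard `csv` and `xml.etree.ElementTree`, and compares the sizes; the helper names are assumptions.

```python
import csv
import io
import xml.etree.ElementTree as ET

rows = [("John", "3634"), ("Sue", "6343"), ("Dick", "6363")]

def as_csv(rows):
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["name", "phone"])  # column names appear once
    writer.writerows(rows)
    return buffer.getvalue()

def as_xml(rows):
    root = ET.Element("persons")
    for name, phone in rows:
        row = ET.SubElement(root, "row")
        ET.SubElement(row, "name").text = name    # tag names repeat per row
        ET.SubElement(row, "phone").text = phone
    return ET.tostring(root, encoding="unicode")

csv_size = len(as_csv(rows))
xml_size = len(as_xml(rows))
print(csv_size, xml_size)
```

The gap grows with the number of rows, because every row repeats the start and end tags.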


53

Outlook: Web Service Implementations

  • Often provided on top of existing HTTP-based Web servers or application servers
  • Stand-alone Web service containers

– Axis for Java

[Figure: a client sends SOAP messages over HTTP to a Web server.]

54

Summary

  • XML provides a powerful means for semi-structured data exchange

– Relational tables can easily be exported into XML

  • Web Services

– Use XML to exchange messages