Data Presentation and Markup Languages MIE456 Tutorial





Acknowledgements

Some contents of this presentation are borrowed from a tutorial given at VLDB 2000, Cairo, Egypt (www.vldb.org) by D. Florescu and J. Simeon.

Agenda

  • Web Servers and HTML
  • Dynamic Web Page Generation
  • XML
  • DTD

HTML

  • Most web content is formatted in HTML
  • Displayed by web browsers
  • Uses “tags” to indicate formatting information (presentation logic)

For example:

<html>
  <body>
    <H1>Heading</H1>
    normal text
    <b>bold text</b>
    <i>italic</i>
  </body>
</html>


Static vs. Dynamic Content

  • Static content
      • Preformatted HTML pages
  • Dynamic content
      • The content is determined at runtime
      • Presentation depends on user input and data from the database
      • The HTML page is generated upon request

Web Server

  • HTTP – Hypertext Transfer Protocol
  • URL – Uniform Resource Locator
  • The web server parses the incoming request and sends a reply to the client

  • Static page
      • http://www.eecg.toronto.edu/~jacobsen/mie456/index.html
      • The server locates the “index.html” file in its file system and sends the file back to the client
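The lookup step for a static page can be sketched as follows. This is a minimal illustration, not how any particular web server is implemented: the function name resolve_static_path and the document root /var/www are assumptions, and "~jacobsen" is treated as an ordinary directory name (real servers often map it to a per-user directory).

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse, unquote


def resolve_static_path(url, doc_root):
    """Map the path component of a URL to a file path under doc_root.

    Hypothetical helper: drops "." and ".." segments so the result
    cannot point outside the document root.
    """
    path = unquote(urlparse(url).path)
    parts = [p for p in PurePosixPath(path).parts if p not in ("/", ".", "..")]
    return str(PurePosixPath(doc_root).joinpath(*parts))


url = "http://www.eecg.toronto.edu/~jacobsen/mie456/index.html"
print(resolve_static_path(url, "/var/www"))
# prints /var/www/~jacobsen/mie456/index.html
```

A server would then open that file and return its bytes in the HTTP reply, or answer with a "not found" status if the file does not exist.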


Dynamic Web Page Generation

  • URLs can indicate a request to invoke a program:
      • http://www.ibm.com/webapp/wcs/stores/servlet/CategoryDisplay?categoryId=2035724&storeId=124&catalogId=-124&langId=124
  • CGI – Common Gateway Interface
      • CGI scripts, e.g. Perl
      • CGI programs, e.g. C/C++ programs
  • Java servlets, run in servlet engines
  • JSP, ASP – combine code and scripts
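To make the structure of such a URL concrete, the sketch below splits the example URL into the program name and its parameters using Python's standard urllib.parse module. The variable names are illustrative, and how a real server dispatches to the program is not shown.

```python
from urllib.parse import urlparse, parse_qs

url = ("http://www.ibm.com/webapp/wcs/stores/servlet/CategoryDisplay"
       "?categoryId=2035724&storeId=124&catalogId=-124&langId=124")

parsed = urlparse(url)

# The last path segment names the program (here, a servlet) to invoke.
program = parsed.path.rsplit("/", 1)[-1]

# parse_qs returns a list of values per key; keep the first value of each.
params = {key: values[0] for key, values in parse_qs(parsed.query).items()}

print(program)  # prints CategoryDisplay
print(params)
```

The server passes these parameters to the program, which uses them (together with database content) to generate the page.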

Java Server Pages (JSP)

  • An easier way to write server programs for dynamic content generation
  • JSP pages are compiled into servlet code
      • Manually, or
      • When the JSP page is invoked for the first time
  • Directives control how the web container translates and executes the JSP page


JSP example

<%
  User[] users;
  users = getUsers(); // get a list of users from the database
%>
<table>
  <tr>
    <th>First name</th>
    <th>Last name</th>
  </tr>
<% for (int i = 0; i < users.length; i++) {
     User u = users[i];
     String first = u.getFirstName();
     String last = u.getLastName();
%>
  <tr>
    <td><%= first %></td>
    <td><%= last %></td>
  </tr>
<% } // end for %>
</table>

Extensible Markup Language (XML)

  • A subset of Standard Generalized Markup Language (SGML)
  • A markup language
      • Uses tags to describe the semantics and structure of data
  • Self-descriptive
      • Uses user-defined tags with meaningful tag names
  • Allows a tree-like, nested data structure
  • Semi-structured data
      • Meaningful with or without a schema
  • Extensible


XML Example

<?xml version="1.0"?>
<!-- a list of students -->
<Students>
  <Student age="20">
    <Lastname>Smith</Lastname>
    <Firstname>John</Firstname>
    <Male/>
  </Student>
  <Student age="21">
    <Lastname>Brown</Lastname>
    <Firstname>Jane</Firstname>
    <Female/>
  </Student>
</Students>

Labeled parts: prolog (the <?xml ...?> declaration), elements (e.g. <Lastname>), attribute (age), empty tags (<Male/>, <Female/>).

Well-formed XML

Correct syntax:

  • Start with an XML declaration (prolog); this is recommended, although XML 1.0 does not strictly require it
  • Match start and end tags
  • End empty tags with />
  • Have a single root element that completely contains all other elements
  • Tags may nest but may not overlap
  • Attribute values must be quoted
  • (more…)
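These rules can be checked mechanically, because any XML parser rejects a document that is not well-formed. The sketch below uses Python's standard xml.etree.ElementTree; the helper name is_well_formed and the sample strings are illustrative only.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


good = ('<?xml version="1.0"?>'
        '<Students><Student age="20"><Male/></Student></Students>')
mismatched = "<Students><Firstname>John</First></Students>"  # end tag does not match
overlapping = "<a><b></a></b>"                               # tags overlap
unquoted = "<Student age=20/>"                               # attribute value not quoted
two_roots = "<a/><b/>"                                       # more than one root element

for sample in (good, mismatched, overlapping, unquoted, two_roots):
    print(is_well_formed(sample))
```

Only the first sample is accepted; each of the others violates one of the rules listed above.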


Valid XML

  • A valid XML document is associated with a Document Type Definition (DTD)
  • An XML document that conforms to its DTD is valid
  • A valid XML document must also be well-formed
  • The DTD can be declared inline with the XML document or in a separate file

Document Type Definition (DTD)

  • Provides a grammar for the document
  • Contains or points to markup declarations for: elements, attributes, entities, notations
  • Optional


DTD Example

<!ELEMENT Students (Student*)>
<!ELEMENT Student (Lastname, Firstname, Male?, Female?)>
<!ATTLIST Student age CDATA #REQUIRED>
<!ATTLIST Student weight CDATA #IMPLIED>
<!ELEMENT Lastname (#PCDATA)>
<!ELEMENT Firstname (#PCDATA)>
<!ELEMENT Male EMPTY>
<!ELEMENT Female EMPTY>

<Students>
  <Student age="20" weight="150">
    <Lastname>Smith</Lastname>
    <Firstname>John</Firstname>
    <Male/>
  </Student>
  <Student age="21">
    <Lastname>Brown</Lastname>
    <Firstname>Jane</Firstname>
    <Female/>
  </Student>
</Students>

Elements

<?xml version="1.0"?>
<!DOCTYPE book [
  <!ELEMENT book (title, author*, publisher?, section+)>
  <!ATTLIST book year CDATA #IMPLIED>
  <!ELEMENT title (#PCDATA)>
  <!ENTITY % macro "publisher (#PCDATA)">
  <!-- The declaration of the <publisher> element -->
  <!ELEMENT %macro;>
  <!ELEMENT author (#PCDATA)>
  <!ELEMENT section (#PCDATA | title | section)*>
]>
<book year="1967">
  <title>The politics of experience</title>
  <author>R.D.Laing</author>
  <section>
    The great and true Amphibian, whose nature is disposed to…..
    <title>Persons and experience</title>
    Even facts become fictions without adequate ways to...
  </section>
  <section>
    <section>
      <![CDATA[Exploitation <must> not been….]]>
    </section>
  </section>
</book>

  • Element:
      • the logical atomic unit of data
      • has a name, a content and a set of attributes
      • the content is an ordered list of children that can be elements, character data, comments, processing instructions and references
  • Element declaration:
      • describes constraints on the content of an element
      • EMPTY: no content allowed
      • ANY: can contain any elements defined in the DTD, in any order
      • MIXED: character data mixed with the additional declared elements
      • CHILDREN: the children can be only elements and they have to satisfy the given regular expression
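The CHILDREN case can be illustrated by treating a content model literally as a regular expression over the sequence of child element names. The sketch below is not a DTD validator: the helpers model_to_regex and children_match are hypothetical, and they assume a children-only model (no #PCDATA, no parameter entities).

```python
import re
import xml.etree.ElementTree as ET


def model_to_regex(model):
    """Turn a children-only content model into a regex over 'Name ' tokens."""
    compact = re.sub(r"\s+", "", model)
    # Wrap each element name in a group that matches the name plus one space.
    pattern = re.sub(r"[A-Za-z_][\w.\-]*",
                     lambda m: "(?:" + re.escape(m.group(0)) + " )",
                     compact)
    # A comma means "followed by", which is plain concatenation in a regex.
    return re.compile(pattern.replace(",", ""))


def children_match(element, model):
    """Check the element's direct children, in order, against the model."""
    names = "".join(child.tag + " " for child in element)
    return model_to_regex(model).fullmatch(names) is not None


STUDENT_MODEL = "(Lastname, Firstname, Male?, Female?)"
ok = ET.fromstring("<Student><Lastname>Smith</Lastname>"
                   "<Firstname>John</Firstname><Male/></Student>")
wrong_order = ET.fromstring("<Student><Firstname>John</Firstname>"
                            "<Lastname>Smith</Lastname></Student>")

print(children_match(ok, STUDENT_MODEL))           # prints True
print(children_match(wrong_order, STUDENT_MODEL))  # prints False
```

The operators ?, * and + and the choice bar | carry over unchanged, which is why the content model can be read as a regular expression.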


Attributes

<?xml version="1.0"?>
<!DOCTYPE book [
  <!ELEMENT book (title, author*, publisher?, section+)>
  <!ATTLIST book year CDATA #IMPLIED>
  <!ELEMENT title (#PCDATA)>
  <!ENTITY % macro "publisher (#PCDATA)">
  <!-- The declaration of the <publisher> element -->
  <!ELEMENT %macro;>
  <!ELEMENT author (#PCDATA)>
  <!ELEMENT section (#PCDATA | title | section)*>
]>
<book year="1967">
  <title>The politics of experience</title>
  <author>R.D.Laing</author>
  <section>
    The great and true Amphibian, whose nature is disposed to…..
    <title>Persons and experience</title>
    Even facts become fictions without adequate ways to...
  </section>
  <section>
    <section>
      <![CDATA[Exploitation <must> not been….]]>
    </section>
  </section>
</book>

  • Attribute:
      • (name, string value) pair
      • associated with an element
  • Attribute declaration:
      • a triple (name, type, defaultValue)
      • type:
          • string type (CDATA)
          • tokenized type (ID, IDREF, IDREFS, ENTITY, NMTOKEN, etc.)
          • enumerated
      • default declaration:
          • #REQUIRED
          • #IMPLIED
          • #FIXED
          • default value
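As a small illustration of the (name, type, default) triple, the sketch below records the two Student attribute declarations from the DTD example as Python tuples and reports which #REQUIRED attributes an element lacks. The list STUDENT_ATTLIST and the helper missing_required are hypothetical; a validating parser performs this check itself.

```python
import xml.etree.ElementTree as ET

# (name, type, default declaration) triples mirroring
#   <!ATTLIST Student age CDATA #REQUIRED>
#   <!ATTLIST Student weight CDATA #IMPLIED>
STUDENT_ATTLIST = [
    ("age", "CDATA", "#REQUIRED"),
    ("weight", "CDATA", "#IMPLIED"),
]


def missing_required(element, attlist):
    """Return the names of #REQUIRED attributes absent from element."""
    return [name for name, _type, default in attlist
            if default == "#REQUIRED" and name not in element.attrib]


complete = ET.fromstring('<Student age="21"/>')
incomplete = ET.fromstring('<Student weight="150"/>')

print(missing_required(complete, STUDENT_ATTLIST))    # prints []
print(missing_required(incomplete, STUDENT_ATTLIST))  # prints ['age']
```

An #IMPLIED attribute such as weight may simply be omitted, so it never appears in the result.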

Entities

<?xml version="1.0"?>
<!DOCTYPE book [
  <!ELEMENT book (title, author*, publisher?, section+)>
  <!ATTLIST book year CDATA #IMPLIED>
  <!ELEMENT title (#PCDATA)>
  <!ENTITY % macro "publisher (#PCDATA)">
  <!-- The declaration of the <publisher> element -->
  <!ELEMENT %macro;>
  <!ELEMENT author (#PCDATA)>
  <!ELEMENT section (#PCDATA | title | section)*>
  <!ENTITY macro2 "<![CDATA[Exploitation <must> not been….]]>">
]>
<book year="1967">
  <title>The politics of experience</title>
  <author>R.D.Laing</author>
  <section>
    The great and true Amphibian, whose nature is disposed to…..
    <title>Persons and experience</title>
    Even facts become fictions without adequate ways to...
  </section>
  <section>
    <section>
      &macro2;
    </section>
  </section>
</book>

  • Entities:
      • the physical storage unit for the XML data
      • have a name and a content
      • can be referenced by name
  • first classification:
      • parsed entities
      • unparsed entities
  • second classification:
      • internal entities
      • external entities
  • parsed entities:
      • general entities: can occur in the data content of the document
      • parameter entities: can occur in the DTD
  • entity references:
      • general entity: &name;
      • parameter entity: %name;
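A general entity reference is replaced by the entity's content when the document is parsed. The sketch below shows this with an internal general entity and Python's standard xml.etree.ElementTree; the entity name pub and its value "Example Press" are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Internal general entity "pub", declared in the internal DTD subset
# and referenced in the document content as &pub;
doc = """<?xml version="1.0"?>
<!DOCTYPE book [
  <!ENTITY pub "Example Press">
]>
<book><publisher>&pub;</publisher></book>"""

root = ET.fromstring(doc)
print(root.find("publisher").text)  # prints Example Press

# Referencing an entity that was never declared is an error.
try:
    ET.fromstring("<book>&undeclared;</book>")
    undeclared_ok = True
except ET.ParseError:
    undeclared_ok = False
print(undeclared_ok)  # prints False
```

Parameter entities (%name;) are resolved inside the DTD instead and do not appear in the document content.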

XML and DTD Syntax

Get the XML and DTD specifications from http://www.xml.com/axml/testaxml.htm

Applications of XML

  • Communicating data between distributed applications
  • Web services
  • Data integration and transformation
  • Structuring data in flat files, e.g. configuration files
  • Large textual databases
  • B2B e-business
  • Enterprise Application Integration (EAI)
  • Voice XML
  • Many others…


Parsing XML documents

  • Free parsers
      • xml4j (Java), xml4c (C++) from IBM alphaWorks
      • Xerces from apache.org (Java and C++)
  • Document Object Model (DOM)
      • Object-oriented
      • Represents the XML document in a tree-like data structure
  • Simple API for XML (SAX)
      • Event-driven
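The difference between the two APIs can be sketched with the DOM and SAX implementations in Python's standard library (xml.dom.minidom and xml.sax) rather than the Java and C++ parsers named above. Both snippets extract the last names from the student list; the handler class name is illustrative.

```python
import xml.sax
from xml.dom import minidom

STUDENTS_XML = (
    "<Students>"
    '<Student age="20"><Lastname>Smith</Lastname>'
    "<Firstname>John</Firstname><Male/></Student>"
    '<Student age="21"><Lastname>Brown</Lastname>'
    "<Firstname>Jane</Firstname><Female/></Student>"
    "</Students>"
)

# DOM: the whole document is loaded into a tree that can be navigated freely.
dom = minidom.parseString(STUDENTS_XML)
dom_names = [node.firstChild.data
             for node in dom.getElementsByTagName("Lastname")]


# SAX: the parser calls back on each event; no tree is built.
class LastnameHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.names = []
        self._buffer = None  # None while outside a <Lastname> element

    def startElement(self, name, attrs):
        if name == "Lastname":
            self._buffer = []

    def characters(self, content):
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "Lastname":
            self.names.append("".join(self._buffer))
            self._buffer = None


handler = LastnameHandler()
xml.sax.parseString(STUDENTS_XML.encode("utf-8"), handler)

print(dom_names)      # prints ['Smith', 'Brown']
print(handler.names)  # prints ['Smith', 'Brown']
```

DOM is convenient when the program needs random access to the tree; SAX keeps memory use low because it handles the document as a stream of events.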

Extensible Stylesheet Language Transformations (XSLT)

  • HTML formats data for presentation in browsers
  • XML describes the structure and semantics of data
  • XML data can be displayed in browsers by defining transformation logic in XSLT

[Diagram: XML and XSLT are fed to an XSLT processor, which produces HTML that is sent over the Internet to the browser.]
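Python's standard library has no XSLT processor, so the sketch below is not XSLT. It is a hand-written stand-in that plays the role of the XSLT processor in the pipeline described above: it takes the student XML and produces an HTML table. The function name students_to_html and the table layout are assumptions.

```python
import xml.etree.ElementTree as ET
from html import escape


def students_to_html(xml_text):
    """Hand-coded XML-to-HTML transformation (a stand-in for an XSLT stylesheet)."""
    root = ET.fromstring(xml_text)
    rows = []
    for student in root.findall("Student"):
        first = escape(student.findtext("Firstname", default=""))
        last = escape(student.findtext("Lastname", default=""))
        rows.append(f"<tr><td>{first}</td><td>{last}</td></tr>")
    header = "<tr><th>First name</th><th>Last name</th></tr>"
    return "<table>" + header + "".join(rows) + "</table>"


xml_text = (
    "<Students>"
    '<Student age="20"><Lastname>Smith</Lastname><Firstname>John</Firstname></Student>'
    '<Student age="21"><Lastname>Brown</Lastname><Firstname>Jane</Firstname></Student>'
    "</Students>"
)
html_text = students_to_html(xml_text)
print(html_text)
```

With XSLT, the same mapping is written declaratively as template rules in a stylesheet, so the presentation can change without touching program code.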


Related XML Technologies

  • Query languages for XML data
      • XQuery, XPath
  • XML Schema
  • XHTML
  • Links to XML research
      • http://www.research.avayalabs.com/user/wadler/xml/
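To give a flavour of XPath, the sketch below runs two path expressions over the student list. It relies on the limited XPath subset supported by Python's standard xml.etree.ElementTree, not a full XPath or XQuery engine.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<Students>"
    '<Student age="20"><Lastname>Smith</Lastname>'
    "<Firstname>John</Firstname><Male/></Student>"
    '<Student age="21"><Lastname>Brown</Lastname>'
    "<Firstname>Jane</Firstname><Female/></Student>"
    "</Students>"
)

# Last names of students whose age attribute equals "21".
age_21 = [e.text for e in root.findall("./Student[@age='21']/Lastname")]

# First names of students that have a <Male/> child element.
male = [e.text for e in root.findall("./Student[Male]/Firstname")]

print(age_21)  # prints ['Brown']
print(male)    # prints ['John']
```

Each expression selects nodes by walking the element tree and filtering with a predicate in square brackets.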