Data Presentation and Markup Languages
MIE456 Tutorial
Acknowledgements
Some contents of this presentation are
borrowed from a tutorial given at VLDB 2000, Cairo, Egypt (www.vldb.org) by
- D. Florescu and J. Simeon.
Agenda
Web Servers and HTML
Dynamic Web Page Generation
XML
DTD
Most web content is formatted in HTML
Displayed by web browsers
Uses “tags” to indicate formatting information
For example:
<html>
  <body>
    <H1>Heading</H1>
    normal text
    <b>bold text</b>
    <i>italic</i>
  </body>
</html>
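To make the role of tags concrete, here is a minimal sketch that feeds the example page to the `html.parser` module from the Python standard library and records which tags and which text it sees. The class name `TagCollector` and the variable `PAGE` are illustrative names, not part of the tutorial.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collects start tags and non-blank text while parsing a page."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag names in lower case, so <H1> becomes "h1".
        self.tags.append(tag)

    def handle_data(self, data):
        if data.strip():
            self.text.append(data.strip())


# The example page from the slide (hypothetical variable name).
PAGE = ("<html><body><H1>Heading</H1> normal text "
        "<b>bold text</b> <i>italic</i></body></html>")

collector = TagCollector()
collector.feed(PAGE)
collector.close()
print(collector.tags)
print(collector.text)
```

The tags carry only formatting information; the text between them is the content a browser displays.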
Static content
Preformatted HTML pages
Dynamic content
The content is determined at runtime
Presentation is dependent on user input
HTML page is generated upon request.
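The difference from a static page can be sketched in a few lines of Python: the HTML below does not exist until a request arrives, and its content depends on the user's input. The function name `render_greeting` and the `name` query parameter are assumptions made for this illustration.

```python
from html import escape
from urllib.parse import parse_qs


def render_greeting(query_string):
    """Build an HTML page at request time from the request's query string."""
    params = parse_qs(query_string)
    # Fall back to a default when the user supplied no name.
    name = params.get("name", ["guest"])[0]
    # Escape user input so it cannot inject markup into the page.
    return "<html><body><h1>Hello, {}!</h1></body></html>".format(escape(name))


print(render_greeting("name=Ann"))
print(render_greeting(""))
```

Escaping the input matters in any dynamically generated page, because the user's text ends up inside markup.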
HTTP – Hypertext Transfer Protocol
URL – Uniform Resource Locator
The web server parses the incoming request
Static page
http://www.eecg.toronto.edu/~jacobsen/
Locate the “index.html” file in the server file system
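A rough sketch of how a server might turn a static-page URL into a file name is shown below. The document root `/var/www` and the function name `resolve_static_path` are assumptions for illustration; real servers apply their own configuration, for example special handling of `~user` directories.

```python
import posixpath
from urllib.parse import urlsplit


def resolve_static_path(url, doc_root="/var/www"):
    """Map a request URL to a file path under a (hypothetical) document root."""
    path = urlsplit(url).path or "/"
    # A URL that names a directory is served from its index.html file.
    if path.endswith("/"):
        path += "index.html"
    # Normalize the path so ".." segments cannot escape the document root.
    relative = posixpath.normpath(path).lstrip("/")
    return posixpath.join(doc_root, relative)


print(resolve_static_path("http://www.eecg.toronto.edu/~jacobsen/"))
```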
URLs can indicate a request for invoking a
http://www.ibm.com/webapp/wcs/stores/
CGI – Common Gateway Interface
CGI scripts, e.g. Perl
CGI programs, e.g. C/C++ programs
Java servlets, run in servlet engines
JSP, ASP – combine code and scripts
An easier way to write server programs for
Compile JSP pages into servlet code
Manually
When the JSP page is invoked for the first time
Use directives to control how the web
<%
    // get a list of users from the database
    User[] users = getUsers();
%>
<table>
  <tr>
    <th>First name</th>
    <th>Last name</th>
  </tr>
<%
    for (int i = 0; i < users.length; i++) {
        User u = users[i];
        String first = u.getFirstName();
        String last = u.getLastName();
%>
  <tr>
    <td><%= first %></td>
    <td><%= last %></td>
  </tr>
<%
    } // end for
%>
</table>
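For readers less familiar with JSP, the same idea (fixed markup with one table row emitted per record) can be sketched in Python. This is only an analogy, not what a JSP engine generates; `render_user_table` is a hypothetical name and the user list stands in for the database lookup.

```python
from html import escape


def render_user_table(users):
    """Return an HTML table with one row per (first name, last name) pair."""
    lines = ["<table>", "<tr><th>First name</th><th>Last name</th></tr>"]
    for first, last in users:
        # Each loop iteration emits one row, like the body of the JSP for-loop.
        lines.append("<tr><td>{}</td><td>{}</td></tr>".format(
            escape(first), escape(last)))
    lines.append("</table>")
    return "\n".join(lines)


# Stand-in for the list of users a real page would read from a database.
table = render_user_table([("John", "Smith"), ("Jane", "Brown")])
print(table)
```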
A subset of Standard Generalized Markup Language (SGML)
A markup language
Use tags to describe semantics and structure of
Self-descriptive
Use user-defined tags with meaningful tag names
Allows a tree-like, nested data structure
Semi-structured data
Meaningful with or without a schema
Extensible
<?xml version="1.0"?>
<!-- a list of students -->
<Students>
  <Student age="20">
    <Lastname>Smith</Lastname>
    <Firstname>John</Firstname>
    <Male/>
  </Student>
  <Student age="21">
    <Lastname>Brown</Lastname>
    <Firstname>Jane</Firstname>
    <Female/>
  </Student>
</Students>
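Because the tags are self-descriptive, a program can pull the data back out by name. The sketch below reads the student list with `xml.etree.ElementTree` from the Python standard library; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version="1.0"?>
<!-- a list of students -->
<Students>
  <Student age="20">
    <Lastname>Smith</Lastname>
    <Firstname>John</Firstname>
    <Male/>
  </Student>
  <Student age="21">
    <Lastname>Brown</Lastname>
    <Firstname>Jane</Firstname>
    <Female/>
  </Student>
</Students>"""

root = ET.fromstring(DOC)
students = [
    # Child elements are looked up by tag name, attributes by attribute name.
    (s.findtext("Firstname"), s.findtext("Lastname"), int(s.get("age")))
    for s in root.findall("Student")
]
print(root.tag, students)
```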
Correct syntax
Start with an XML declaration (prolog)
Match start and end tags
End empty tags with />
Has a root element that completely contains all other elements
Tags may nest but may not overlap
Attribute values must be quoted
(more…)
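Most of these rules can be checked mechanically by handing the text to any XML parser. A minimal sketch, assuming Python's standard `xml.etree.ElementTree` parser and a hypothetical helper name `is_well_formed`; note that this parser does not insist on the XML declaration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


print(is_well_formed("<a><b></b></a>"))             # properly nested
print(is_well_formed("<a><b></a></b>"))             # overlapping tags
print(is_well_formed("<a x=1/>"))                   # unquoted attribute value
print(is_well_formed("<a/><b/>"))                   # more than one root element
print(is_well_formed("<Firstname>John</First>"))    # mismatched end tag
```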
A valid XML document is associated with a DTD
An XML document that conforms to its DTD is valid
A valid XML document must also be well-formed
Can be declared inline with the XML
Provides a grammar for the document
Contains or points to markup declarations
Optional
<!ELEMENT Students (Student*)>
<!ELEMENT Student (Lastname, Firstname, Male?, Female?)>
<!ATTLIST Student age CDATA #REQUIRED>
<!ATTLIST Student weight CDATA #IMPLIED>
<!ELEMENT Lastname (#PCDATA)>
<!ELEMENT Firstname (#PCDATA)>
<!ELEMENT Male EMPTY>
<!ELEMENT Female EMPTY>

<Students>
  <Student age="20" weight="150">
    <Lastname>Smith</Lastname>
    <Firstname>John</Firstname>
    <Male/>
  </Student>
  <Student age="21">
    <Lastname>Brown</Lastname>
    <Firstname>Jane</Firstname>
    <Female/>
  </Student>
</Students>
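The difference between #REQUIRED and #IMPLIED attributes can be illustrated with a small hand-written check. This is not a DTD validator (the Python standard library does not include one); the table `REQUIRED` simply restates the ATTLIST declarations above by hand, and all names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Restates <!ATTLIST Student age CDATA #REQUIRED>; weight is #IMPLIED (optional).
REQUIRED = {"Student": ["age"]}


def missing_required_attributes(root, required):
    """List (element tag, attribute name) pairs for absent required attributes."""
    problems = []
    for element in root.iter():
        for name in required.get(element.tag, []):
            if name not in element.attrib:
                problems.append((element.tag, name))
    return problems


good = ET.fromstring(
    '<Students><Student age="20" weight="150"/><Student age="21"/></Students>')
bad = ET.fromstring('<Students><Student weight="150"/></Students>')
print(missing_required_attributes(good, REQUIRED))
print(missing_required_attributes(bad, REQUIRED))
```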
<?xml version="1.0"?>
<!DOCTYPE book [
  <!ELEMENT book (title, author*, publisher?, section+)>
  <!ATTLIST book year CDATA #IMPLIED>
  <!ELEMENT title (#PCDATA)>
  <!ENTITY % macro "publisher (#PCDATA)">
  <!-- The declaration of the <publisher> element -->
  <!ELEMENT %macro;>
  <!ELEMENT author (#PCDATA)>
  <!ELEMENT section (#PCDATA | title | section)*>
]>
<book year="1967">
  <title>The politics of experience</title>
  <author>R.D.Laing</author>
  <section>
    The great and true Amphibian, whose nature is disposed to…..
    <title>Persons and experience</title>
    Even facts become fictions without adequate ways to...
  </section>
  <section>
    <section>
      <![CDATA[Exploitation <must> not been….]]>
    </section>
  </section>
</book>
elements, character data, comments, processing instructions and references
in any order
declared elements
they have to satisfy the given regular expression
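The remark about regular expressions can be taken quite literally. The sketch below hand-translates the content model `(title, author*, publisher?, section+)` into a Python regular expression over the sequence of child tag names. It checks only the direct children of one element and is not a full DTD validator; `BOOK_MODEL` and `children_match` are hypothetical names.

```python
import re
import xml.etree.ElementTree as ET

# (title, author*, publisher?, section+), with each child tag followed by a comma.
BOOK_MODEL = re.compile(r"title,(author,)*(publisher,)?(section,)+")


def children_match(element, model):
    """Check whether the element's child tags, in order, satisfy the model."""
    sequence = "".join(child.tag + "," for child in element)
    return model.fullmatch(sequence) is not None


ok = ET.fromstring(
    "<book><title/><author/><section/><section/></book>")
no_section = ET.fromstring("<book><title/><author/></book>")
wrong_order = ET.fromstring("<book><author/><title/><section/></book>")

print(children_match(ok, BOOK_MODEL))
print(children_match(no_section, BOOK_MODEL))
print(children_match(wrong_order, BOOK_MODEL))
```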
entity, nmtoken,etc)
<?xml version="1.0"?>
<!DOCTYPE book [
  <!ELEMENT book (title, author*, publisher?, section+)>
  <!ATTLIST book year CDATA #IMPLIED>
  <!ELEMENT title (#PCDATA)>
  <!ENTITY % macro "publisher (#PCDATA)">
  <!-- The declaration of the <publisher> element -->
  <!ELEMENT %macro;>
  <!ELEMENT author (#PCDATA)>
  <!ELEMENT section (#PCDATA | title | section)*>
  <!ENTITY macro2 "<![CDATA[Exploitation <must> not been….]]>">
]>
<book year="1967">
  <title>The politics of experience</title>
  <author>R.D.Laing</author>
  <section>
    The great and true Amphibian, whose nature is disposed to…..
    <title>Persons and experience</title>
    Even facts become fictions without adequate ways to...
  </section>
  <section>
    <section>
      &macro2;
    </section>
  </section>
</book>
data
content of the document
DTD
Get XML and DTD specifications from
Communicating data between distributed
Web services
Data integration and transformation
Structuring data in flat files, e.g.
Large textual databases
B2B e-business
Enterprise Application Integration (EAI)
Voice XML
Many others…
Free parsers
xml4j (Java), xml4c (C++) from IBM alphaWorks
Xerces from apache.org (Java and C++)
Document Object Model (DOM)
Object oriented
Represents the XML document in a tree-like data structure
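As a small illustration of the DOM style, the sketch below uses `xml.dom.minidom` from the Python standard library rather than the Java and C++ parsers listed above. The whole document is first loaded into a tree of node objects, which the program then navigates; variable names are illustrative.

```python
from xml.dom.minidom import parseString

DOC = """<Students>
  <Student age="20"><Lastname>Smith</Lastname><Firstname>John</Firstname></Student>
  <Student age="21"><Lastname>Brown</Lastname><Firstname>Jane</Firstname></Student>
</Students>"""

dom = parseString(DOC)          # the complete tree is built in memory
root = dom.documentElement

# Whitespace between elements becomes text nodes, so filter on node type.
element_children = [n for n in root.childNodes if n.nodeType == n.ELEMENT_NODE]
last_names = [n.firstChild.data for n in dom.getElementsByTagName("Lastname")]
ages = [s.getAttribute("age") for s in element_children]

print(root.tagName, last_names, ages)
```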
Simple API for XML (SAX)
Event-driven
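By contrast, a SAX parser builds no tree. It calls back into the application as it meets each start tag, end tag and run of text. A minimal sketch with Python's standard `xml.sax` module; the handler class `AgeCollector` is a hypothetical name.

```python
import xml.sax

DOC = """<Students>
  <Student age="20"><Lastname>Smith</Lastname></Student>
  <Student age="21"><Lastname>Brown</Lastname></Student>
</Students>"""


class AgeCollector(xml.sax.ContentHandler):
    """Records the age attribute each time a Student start tag is reported."""

    def __init__(self):
        super().__init__()
        self.ages = []
        self.events = 0

    def startElement(self, name, attrs):
        self.events += 1            # one callback per start tag
        if name == "Student":
            self.ages.append(int(attrs["age"]))


handler = AgeCollector()
xml.sax.parseString(DOC.encode("utf-8"), handler)
print(handler.ages, handler.events)
```

Since nothing is kept except what the handler chooses to store, this style suits documents too large to hold in memory as a tree.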
HTML formats data for presenting in browsers
XML describes the structures and semantics of
XML data can be displayed in browsers by
XSLT processor HTML
Internet
Query languages for XML data
XQuery, XPath
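A flavour of path-based querying can be had with the limited XPath subset supported by Python's `xml.etree.ElementTree`. This is a sketch only: the standard library offers neither full XPath nor XQuery, and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

DOC = """<Students>
  <Student age="20"><Lastname>Smith</Lastname><Firstname>John</Firstname><Male/></Student>
  <Student age="21"><Lastname>Brown</Lastname><Firstname>Jane</Firstname><Female/></Student>
</Students>"""

root = ET.fromstring(DOC)

# Last names of students whose age attribute equals "21".
aged_21 = [e.text for e in root.findall("./Student[@age='21']/Lastname")]

# First names of students that have a <Male/> child element.
male = [e.text for e in root.findall("./Student[Male]/Firstname")]

print(aged_21, male)
```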
XML Schema
XHTML
Links to XML research
http://www.research.avayalabs.com/user/