SLIDE 1

University of Helsinki – Department of Computer Science

The XML Metalanguage

Mika Raento

mika.raento@cs.helsinki.fi

University of Helsinki – Department of Computer Science

Mika Raento The XML Metalanguage – p.1/442 2003-09-15

SLIDE 2

Preliminaries

SLIDE 3

Preliminaries

  • Motivation
  • Practicalities
  • Course Overview

SLIDE 4

Motivation

XML is used to
  • Publish documents (Linux documentation in DocBook, reference works such as dictionaries) in several formats from the same content
  • Publish news items on the web via RDF (for example: Slashdot, CNN, Mozillazine) that can be incorporated into other web sites or client software
  • Store program settings and preferences (GNOME)

SLIDE 5

Motivation

XML is used to
  • Exchange business documents, such as invoices and inventories (ebXML, a new standard for EDI, used for example for Finnish customs documents)
  • Make remote procedure calls over the Internet (SOAP, the basis of Web Services, allows calling Google from your code)
  • Build message-based large-scale software (SAP R/3 integration, Ascade Cockpit application)

SLIDE 6

Motivation

XML is
  • Fairly easy to learn
  • Human- and machine-readable
  • Lightweight to process
  • Well supported: software for it is easy to find

SLIDE 7

Practicalities

  • 2 study weeks: 8 × 2 hours of lectures, 8 × 2 hours of exercises
  • Lecturer: Mika Raento, D419, available Thu 16–17
  • Lectures in Finnish, material in English, one exercise group in English
  • One course exam on Nov 10th
  • Literature: The XML Companion, 3rd edition, by Neil Bradley (or use the web)
  • Course web site: http://www.cs.helsinki.fi/u/mraento/teaching/xml_s03/
  • Newsgroup: news:hy.opiskelu.tktl.xml

SLIDE 8

Practicalities

Exam + project work

Two ways to complete the project work:

  • Exercises + a smaller project that will be partially done at the exercises. You are allowed to miss at most two exercises.

OR

  • Larger standalone project work (details on the course web page)

So: cancel your registration for an exercise group if you don't think you'll be able to attend

SLIDE 9

Practicalities

Grading: maximum 60 points
  • Exam: 30 points, 15 points minimum to pass
  • Project work: 30 points, 15 points minimum to pass (10 points from exercise attendance and 20 points from the work, OR 30 points from the larger project work)
  • 3 extra points (above 60) for exercises attended after the minimum six
  • No exercise points are taken into account if you take the separate exam

SLIDE 10

Course Overview

  • 1. Introduction. History. Motivation
  • 2. From HTML to XML. Well-formedness and validity.
  • 3. DTD basics. Document modelling with DTDs.
  • 4. DTD limitations. Alternatives
  • 5. Namespaces. XML processing.
  • 6. XSLT transformations, XPath. Mind set. Techniques, strength.

  • 7. FO. Basics, mind set, more advanced topics.
  • 8. Combining. Wrapup. Related standards.

SLIDE 11

Introduction

SLIDE 12

Introduction

  • What is XML?
  • What does it look like?
  • What does 'metalanguage' mean?
  • XML processors
  • DTDs
  • Transformations
  • Style definitions

SLIDE 13

eXtensible Markup Language

  • W3C recommendation
      • Version 1.0 (1.1 candidate recommendation)
      • 1st edition 1998-02-10, 2nd (current) edition 2000-10-06
  • An agreed-upon textual format for representing tree-structured data
  • For storing, combining, exchanging and publishing information
  • Human- and machine-readable

SLIDE 14

XML document instance

<!-- Example document instance -->
<university>
  <department>
    <name>Department of Computer Science</name>
    <address>Teollisuuskatu 23</address>
  </department>
</university>

SLIDE 15

XML document instance

<!-- Example document instance -->
<university>
  <department>                                    ← tag
    <name>Department of Computer Science</name>
    <address>Teollisuuskatu 23</address>
  </department>                                   ← tag
</university>

SLIDE 16

XML document instance

<!-- Example document instance -->
<university>
  <department>
    <name>Department of Computer Science</name>   ← element
    <address>Teollisuuskatu 23</address>
  </department>
</university>

SLIDE 17

XML document instance

<!-- Example document instance -->                ← comment
<university>
  <department>
    <name>Department of Computer Science</name>
    <address>Teollisuuskatu 23</address>
  </department>
</university>

SLIDE 18

XML document instance

  • Actual document contents that have been marked up in an agreed way
  • Self-describing (for humans) tags
  • Elements and nested elements: meaningful units of information
  • Text within elements
  • Comments
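
As a minimal sketch of how these parts appear to a program, the example document instance can be read with Python's standard xml.etree.ElementTree module. The helper name outline is an assumption made for this illustration; note that this parser discards comments by default.

```python
import xml.etree.ElementTree as ET

# The example document instance from the slides, as a string.
DOCUMENT = """<!-- Example document instance -->
<university>
  <department>
    <name>Department of Computer Science</name>
    <address>Teollisuuskatu 23</address>
  </department>
</university>"""


def outline(element, depth=0):
    """Return one line per element: indented tag name plus its own text, if any."""
    text = (element.text or "").strip()
    lines = ["  " * depth + element.tag + (": " + text if text else "")]
    for child in element:  # nested elements, in document order
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(DOCUMENT)
print("\n".join(outline(root)))
```

The printed outline mirrors the nesting of the elements, with the text shown next to the element that contains it.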

SLIDE 19

Logical vs. physical structure

  • Logical structure
      • Logical relationships and constraints
      • Describes the structure of the information content
  • Physical structure
      • Entities
      • Characters and character set
      • Files

SLIDE 20

XML Processors

  • XML parser
      • Finds errors
      • Provides information to applications
  • Entity (document part) management
      • Combines entities into documents
      • Combines physical files
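
The error-finding role of a parser can be sketched with Python's standard xml.etree.ElementTree module, which raises ParseError for input that is not well-formed. The function name check_well_formed and the two sample strings are assumptions made for this illustration; entity management is not shown.

```python
import xml.etree.ElementTree as ET


def check_well_formed(text):
    """Return (True, None) if text parses, else (False, the parser's error message)."""
    try:
        ET.fromstring(text)
    except ET.ParseError as error:
        return False, str(error)
    return True, None


good = "<department><name>Department of Computer Science</name></department>"
# In the second string the <name> element is never closed.
bad = "<department><name>Department of Computer Science</department>"

print(check_well_formed(good))
print(check_well_formed(bad))
```

An application would normally receive the parsed tree in the first case and report the message to the user in the second.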

SLIDE 21

Metalanguage XML

  • XML provides a general syntax for tree-structured data
  • Users provide an application-specific grammar for this syntax
      • Element names
      • Element order and nesting
      • Certain reserved words
  • This grammar is called a Document Type Definition, DTD

SLIDE 22

Example DTD

<!-- Document Type Definition (DTD) example -->
<!ELEMENT university (department+)>
<!ELEMENT department (name, address)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT address (#PCDATA)>

SLIDE 23

DTD

  • Defines the type of the document, or its structure
  • One rule per element
      • Name of the element
      • Allowed content
  • Grammar for document instances
  • "Regular expressions" for element content (may be recursive)
  • Not required in XML
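
To illustrate the "regular expressions" point, here is a rough sketch that turns the content models of the example DTD into Python regular expressions over the sequence of child element names. It is not a real DTD validator: the names CONTENT_MODELS, model_to_regex and validate are assumptions made for this illustration, the models are typed in by hand rather than parsed from a DTD, and mixed content, attributes and entities are ignored.

```python
import re
import xml.etree.ElementTree as ET

# Content models copied by hand from the example DTD (declaration syntax stripped).
CONTENT_MODELS = {
    "university": "(department+)",
    "department": "(name, address)",
    "name": "(#PCDATA)",
    "address": "(#PCDATA)",
}


def model_to_regex(model):
    """Turn a simple content model into a regex over 'tag;tag;...' strings."""
    if model.strip() == "(#PCDATA)":
        return re.compile("")  # text only: no child elements allowed
    # Each element name becomes a group that matches "name;".
    pattern = re.sub(
        r"[A-Za-z_][\w.-]*",
        lambda match: "(?:" + re.escape(match.group(0)) + ";)",
        model,
    )
    # A comma means "followed by", which is plain concatenation in a regex;
    # the operators |, +, * and ? already mean the same thing in both notations.
    return re.compile(re.sub(r"[\s,]", "", pattern))


def validate(element, models):
    """Return a list of error messages for element and all its descendants."""
    errors = []
    model = models.get(element.tag)
    if model is None:
        errors.append(f"{element.tag}: element is not declared")
    else:
        children = "".join(child.tag + ";" for child in element)
        if not model_to_regex(model).fullmatch(children):
            errors.append(f"{element.tag}: children do not match {model}")
    for child in element:  # the same check applies recursively
        errors.extend(validate(child, models))
    return errors


valid = ET.fromstring(
    "<university><department><name>CS</name>"
    "<address>Teollisuuskatu 23</address></department></university>"
)
swapped = ET.fromstring(
    "<university><department><address>Teollisuuskatu 23</address>"
    "<name>CS</name></department></university>"
)
print(validate(valid, CONTENT_MODELS))
print(validate(swapped, CONTENT_MODELS))
```

The second document is well-formed but has name and address in the wrong order, so only the department rule reports a problem.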

SLIDE 24

Advantages of using DTDs

  • Allows a validating parser
      • Checks that the document instance corresponds to the DTD
      • Consistent use of the tags
  • Standard DTDs for specific applications
      • A common vocabulary

SLIDE 25

Descriptive (declarative) markup

  • Generalized markup (not necessarily any formatting information)
      • Syntactic form without semantics
      • However, includes element names that describe content
  • In addition, we need a way to describe formatting and e.g. links so that we can present the information to humans

SLIDE 26

Descriptive vs. procedural markup

Descriptive
  • Categorises the document into parts
  • Logical (logical parts and their relations)
  • Self-describing
  • Content and format separated
  • E.g. XML

SLIDE 27

Descriptive vs. procedural markup

Procedural
  • Defines what processing is to be carried out on the document
  • Visible (e.g. LaTeX) or invisible (e.g. Word)
  • Formatting information
  • Content and format mixed

This distinction is neither clear-cut (e.g. LaTeX with only \section, \subsection etc., or applications of XML such as XSL) nor all-encompassing, but provides a useful starting point

SLIDE 28

Stylesheets

  • Defines the presentation (output) format
  • Possibly several per DTD and/or document instance
  • Cascading Style Sheets (CSS)
  • Extensible Stylesheet Language (XSL)
  • (DSSSL, not covered in this course)

SLIDE 29

Other applications of XML

  • Data transfer
      • Subsets (views) of relational databases
      • EDI (Electronic Data Interchange)
  • Message-based applications
      • Coarse-grained RPC
  • Publishing
      • Documents
      • Metadata
  • Etc.

SLIDE 30

Publishing process

XML document + XSL/CSS stylesheet → Formatting → Formatted document
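
The separation of content from presentation in this process can be sketched in Python. The two functions below are hypothetical stand-ins for stylesheets, written by hand for this illustration; a real pipeline would use an XSL or CSS processor, which Python's standard library does not include.

```python
import xml.etree.ElementTree as ET
from html import escape

DOCUMENT = """<university>
  <department>
    <name>Department of Computer Science</name>
    <address>Teollisuuskatu 23</address>
  </department>
</university>"""


def to_plain_text(root):
    """Hypothetical 'stylesheet' 1: one plain-text line per department."""
    return "\n".join(
        f"{d.findtext('name').strip()}, {d.findtext('address').strip()}"
        for d in root.findall("department")
    )


def to_html(root):
    """Hypothetical 'stylesheet' 2: an HTML list with the name in bold."""
    items = "".join(
        f"<li><b>{escape(d.findtext('name').strip())}</b> "
        f"{escape(d.findtext('address').strip())}</li>"
        for d in root.findall("department")
    )
    return f"<ul>{items}</ul>"


root = ET.fromstring(DOCUMENT)  # the same content feeds both formats
print(to_plain_text(root))
print(to_html(root))
```

The document itself carries no formatting; each function decides independently how the same elements are presented.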

SLIDE 31

History of XML — SGML

  • Standard Generalised Markup Language
  • Based (in part) on IBM's GML (1969)
  • Introduced in 1974, ISO standard 1986
  • Large and complicated
  • Tools correspondingly large and complicated, and therefore few and expensive
  • XML has basically the same expressive power (almost a proper subset)
  • Still widely used in publishing of very large documents and document collections

SLIDE 32

History of XML — HTML

  • HyperText Markup Language (first proposal 1989–1990, HTML 2.0 IETF (RFC) standard 1995)
  • Huge success, basis for the web
  • Non-standard extensions problematic (although nowadays mainly in JavaScript/DOM)
  • Lots of tools available

SLIDE 33

History of XML — HTML

  • An SGML DTD + predefined semantics
  • Practical when the only purpose is to present information
  • Easy and pleasing presentation in a browser
  • Focuses on tags for book-like document structure, presentation and linking

SLIDE 34

XML — SGML — HTML

  • XML combines good features from both SGML (expressiveness, extensibility) and HTML (simple, easy to understand)
  • Lots of tools available
  • XHTML is HTML cast into an XML DTD (instead of SGML)
  • SGML may still be the best format for very large documents
  • XML does not solve all problems
  • All three languages are needed

SLIDE 35

XML Design principles

  • 1. XML shall be straightforwardly usable over the Internet
  • 2. XML shall support a wide variety of applications
  • 3. XML shall be compatible with SGML
  • 4. It shall be easy to write programs which process XML documents
  • 5. The number of optional features in XML is to be kept to the absolute minimum, ideally zero

SLIDE 36

XML Design principles

  • 6. XML documents should be human-legible and reasonably clear
  • 7. The XML design should be prepared quickly
  • 8. The design of XML shall be formal and concise
  • 9. XML documents shall be easy to create
  • 10. Terseness is of minimal importance

SLIDE 37

Related standards

  • XLink: hyperlinking for XML
  • XPath: locating and selecting XML document parts
  • XPointer: the reference language for XLink
  • XSL: Extensible Stylesheet Language
  • XSLT: XSL Transformations
  • SAX: stream-oriented XML API
  • DOM: tree-oriented XML API
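
The difference between the stream-oriented and tree-oriented APIs can be sketched with Python's standard xml.sax and xml.dom.minidom modules. The class name NameCollector and the sample document are assumptions made for this illustration; both halves extract the text of every name element.

```python
import xml.dom.minidom
import xml.sax

DOCUMENT = (
    "<university><department>"
    "<name>Department of Computer Science</name>"
    "<address>Teollisuuskatu 23</address>"
    "</department></university>"
)


class NameCollector(xml.sax.ContentHandler):
    """SAX: react to events while the parser streams through the document."""

    def __init__(self):
        super().__init__()
        self.in_name = False
        self.names = []

    def startElement(self, tag, attrs):
        if tag == "name":
            self.in_name = True
            self.names.append("")

    def characters(self, content):
        if self.in_name:
            self.names[-1] += content  # text may arrive in several chunks

    def endElement(self, tag):
        if tag == "name":
            self.in_name = False


handler = NameCollector()
xml.sax.parseString(DOCUMENT.encode("utf-8"), handler)
sax_names = handler.names

# DOM: build the whole tree in memory first, then navigate it.
dom = xml.dom.minidom.parseString(DOCUMENT)
dom_names = [node.firstChild.data for node in dom.getElementsByTagName("name")]

print(sax_names)
print(dom_names)
```

The SAX handler never holds more than the current text in memory, while the DOM version trades memory for the convenience of random access to the tree.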

SLIDE 38

On this course

  • Introduction, motivation, background
  • XML documents (XML)
  • XML DTDs (XML)
  • XML transformations (XSLT)
  • Stylesheets (CSS, XSL)
  • Some tools
  • Related information (XML Schema, XPath, XML Namespaces)

SLIDE 39

Literature

  • Bradley: The XML Companion
  • http://www.w3.org/XML/
  • http://www.xml.org
  • http://www.xml.com
  • http://www.xmlsoftware.com
  • http://www.xslinfo.com
  • http://xml.coverpages.org
  • http://xml.apache.org
  • http://www.cs.helsinki.fi/u/ruini/structure/xml/

SLIDE 40

This lecture in literature

Bradley: 2, 3, 31

or

  • XML in 10 points, http://www.w3.org/XML/1999/XML-in-10-points
  • Norman Walsh, A Technical Introduction to XML, http://nwalsh.com/docs/articles/xml/, up to 'Entity References'
  • Greg Meyer, An Overview of the XML..., http://www.stsc.hill.af.mil/crosstalk/1998/06/xml.asp
  • Connolly et al., The Evolution of Web Documents, http://www.xml.com/pub/a/w3j/s3.connolly.html
