
3. Defining the document structure (DTD)



  1. 3. Defining the document structure (DTD)
     • A DTD declares application-specific names and structural constraints.
     • A document is valid if it specifies a DTD and its contents conform to that DTD.
     • A validating parser does the checking, but validation is not mandatory.
     • Items not specified in the DTD are forbidden.
     • A DTD does not specify: the root, the precise number of element instances, data formats (everything is a string; there are some restrictions on names), or semantics (meaning).
     • Alternative to DTD: XML Schema (see later).
     XML-3, J. Teuhola, 2013

  2. Example DTD: Course document
       <!ELEMENT course (cname, teacher, semester, audience)>
       <!ELEMENT cname (#PCDATA)>
       <!ELEMENT teacher (#PCDATA)>
       <!ELEMENT semester (#PCDATA)>
       <!ELEMENT audience (student*)>
       <!ELEMENT student (#PCDATA)>
     #PCDATA (parsed character data) may contain entity references like &amp; but not tags.
     Note: the DTD syntax does not conform to the general XML syntax.

  3. Example: test documents
     Valid:
       <course>
         <cname>XML</cname>
         <teacher>JT</teacher>
         <semester>
           Spring 2013
         </semester>
         <audience>
           <student>NN</student>
         </audience>
       </course>
     Invalid:
       <course>
         <cname>XML</cname>
         <teacher>JT</teacher>
         <student>NN</student>
         <extent>5 sp</extent>
       </course>
     Errors: 'semester' and 'audience' are missing; 'extent' is not defined in the DTD.

  4. Declaring the DTD
     • Position: in the document prolog (after the XML declaration, before the root element)
     • Alternatives:
       – External DTD file, given by a URI:
           <!DOCTYPE coursetype SYSTEM "http://...">
       – External public DTD, for a unique and well-known application:
           <!DOCTYPE coursetype PUBLIC "ref" "backup">
         where backup is used if ref is not found.
       – Internal, useful in the development phase:
           <!DOCTYPE coursetype [<!ELEMENT ... >]>
       – Both (compatible internal and external subsets):
           <!DOCTYPE coursetype SYSTEM "http://..." [ ... ]>
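     As a minimal sketch of the internal alternative, the course DTD from the earlier example slide can be embedded directly in the prolog. The document content is taken from the valid test document; note that for validation the name after DOCTYPE has to match the root element, so `course` is used here instead of the placeholder `coursetype`.

     ```xml
     <?xml version="1.0"?>
     <!-- Internal subset: the declarations sit between [ and ] in the prolog -->
     <!DOCTYPE course [
       <!ELEMENT course   (cname, teacher, semester, audience)>
       <!ELEMENT cname    (#PCDATA)>
       <!ELEMENT teacher  (#PCDATA)>
       <!ELEMENT semester (#PCDATA)>
       <!ELEMENT audience (student*)>
       <!ELEMENT student  (#PCDATA)>
     ]>
     <course>
       <cname>XML</cname>
       <teacher>JT</teacher>
       <semester>Spring 2013</semester>
       <audience>
         <student>NN</student>
       </audience>
     </course>
     ```

     Keeping the declarations in the same file is convenient while the structure is still changing; once it is stable, the same declarations can be moved to an external file and referenced with SYSTEM.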

  5. Declaring elements
     • <!ELEMENT name (content)> where the content can be:
       – #PCDATA (parsed character data)
       – a child element
       – a sequence (comma-separated ordered list)
       – alternatives ('|'-separated list)
     • Repetition indicators (suffix symbols), applicable to elements and parenthesized expressions:
       – ? = zero or one
       – * = zero or many
       – + = one or many

  6. Declaring elements (cont.)
     • Examples:
         <!ELEMENT audience (student*)>
         <!ELEMENT day (sunday|monday|...)>
         <!ELEMENT semester (year,(spring|fall))>
         <!ELEMENT audience (#PCDATA|student)*>
     • Special cases:
       – Empty element: <!ELEMENT name EMPTY> allows the elements <name /> and <name></name>
       – Arbitrary contents: <!ELEMENT name ANY>
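     The following sketch shows instances that would match the `semester` and mixed-content `audience` declarations above. The declarations of `year`, `spring`, `fall` and the reduced `course` content model are assumptions made only to keep the example self-contained.

     ```xml
     <?xml version="1.0"?>
     <!DOCTYPE course [
       <!ELEMENT course   (semester, audience)>
       <!-- Sequence: a year followed by exactly one of spring or fall -->
       <!ELEMENT semester (year, (spring | fall))>
       <!ELEMENT year     (#PCDATA)>
       <!ELEMENT spring   EMPTY>
       <!ELEMENT fall     EMPTY>
       <!-- Mixed content: text and student elements in any order -->
       <!ELEMENT audience (#PCDATA | student)*>
       <!ELEMENT student  (#PCDATA)>
     ]>
     <course>
       <semester>
         <year>2013</year>
         <spring/>
       </semester>
       <audience>Open to <student>NN</student> and other students.</audience>
     </course>
     ```

     Replacing `<spring/>` with both `<spring/><fall/>` or omitting `<year>` would make the document invalid, while the mixed-content `audience` accepts any interleaving of text and `student` elements.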

  7. Declaring attributes
     • All possible attributes must be declared for each element type.
     • Syntax:
         <!ATTLIST element
           attname1 type1 default1
           attname2 type2 default2
           ... >
     • Example:
         <!ATTLIST course
           name CDATA #REQUIRED
           dept CDATA "CS-IT">
     • Attributes of one element may also be declared one by one in separate ATTLIST statements.

  8. Attribute types
     CDATA        Character string where < and & must be escaped as &lt; and &amp; (possibly also &quot; and &apos;). Numeric data is also CDATA.
     NMTOKEN      Name token; like an XML name but may start with a number or punctuation.
     NMTOKENS     Whitespace-separated list of name tokens.
     Enumeration  '|'-separated list (in parentheses) of alternative names following the XML name-token restrictions.

  9. Attribute types (cont.)
     • ID: XML name which is unique among the ID attributes in the document. Only one ID attribute per element is allowed. The ID value must be a valid XML name (a plain number is not!).
     • IDREF: XML name referring to an ID attribute. This enables relationships between elements (cf. foreign keys of relations; but referential integrity is not fully checked: a validating parser verifies only that the ID exists, not which element type carries it). Needed for M:M relationships.
     • IDREFS: Whitespace-separated list of ID references.
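     A small sketch of an M:M relationship expressed with ID and IDREFS. All element and attribute names (`university`, `sid`, `cid`, `attendees`) are hypothetical and chosen only for illustration.

     ```xml
     <?xml version="1.0"?>
     <!DOCTYPE university [
       <!ELEMENT university (student*, course*)>
       <!ELEMENT student    (#PCDATA)>
       <!ATTLIST student    sid       ID     #REQUIRED>
       <!ELEMENT course     (#PCDATA)>
       <!ATTLIST course     cid       ID     #REQUIRED
                            attendees IDREFS #IMPLIED>
     ]>
     <university>
       <!-- ID values must be XML names, so they start with a letter -->
       <student sid="s1">NN</student>
       <student sid="s2">MM</student>
       <!-- Each course lists several students; each student may appear in several lists -->
       <course cid="c1" attendees="s1 s2">XML</course>
       <course cid="c2" attendees="s1">Databases</course>
     </university>
     ```

     The DTD has no way to state that `attendees` must point at `student` elements; `attendees="c2"` would pass validation just as well, because `c2` is also a declared ID in the document.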

  10. Attribute types (cont.)
     • ENTITY: Name of an (unparsed) entity, defined elsewhere in the DTD.
     • ENTITIES: Whitespace-separated list of entity names.
     • NOTATION: '|'-separated list (in parentheses) of alternative notation names, each declared with NOTATION in the DTD.
     A NOTATION is more flexible than an enumeration because the notation identifiers (e.g. image/gif) are not restricted to XML naming rules.
     Declaring a notation, e.g.
         <!NOTATION gif SYSTEM "image/gif">
         <!NOTATION tiff SYSTEM "image/tiff">
         …
         <!ATTLIST image type NOTATION (gif | tiff) #REQUIRED>

  11. Attribute defaults
     Alternatives:
     #REQUIRED  Compulsory, no default value
     #IMPLIED   Attribute value may be omitted; no default
     #FIXED     Always the same value; may be omitted
     Literal    Quoted default value
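     The four alternatives can be put side by side in one ATTLIST. This is only a sketch: `name` and `dept` come from the earlier attribute example, while `room`, `language` and its value "en" are hypothetical.

     ```xml
     <?xml version="1.0"?>
     <!DOCTYPE course [
       <!ELEMENT course (#PCDATA)>
       <!ATTLIST course
         name     CDATA #REQUIRED
         room     CDATA #IMPLIED
         language CDATA #FIXED "en"
         dept     CDATA "CS-IT">
     ]>
     <!-- Only the required attribute is written out in the instance -->
     <course name="XML">Defining the document structure</course>
     ```

     A validating parser would report this element with name="XML", language="en" and dept="CS-IT", and with no `room` attribute at all. Writing language="fi" in the instance would be a validity error, because a #FIXED attribute may only carry its declared value.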

  12. Declaring entities
     • An entity is a name with a related replacement text.
     • Predefined: &lt; &amp; &gt; &quot; &apos;
     • Example: <!ENTITY domain "it.utu.fi">
     • Reference: &domain;
     • The replacement may contain well-formed markup:
         <!ENTITY address "<addr>
           <street>Joukahaisenkatu 3-5</street>
           <zip>20014</zip>
           <city>Turku</city>
         </addr>">
     • The replacement may contain entity references (but not loops).
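     A sketch of these entity declarations in use, including one entity that refers to another. The `contact` and `email` elements, the `mail` entity and the "user@" prefix are hypothetical additions for illustration.

     ```xml
     <?xml version="1.0"?>
     <!DOCTYPE contact [
       <!ELEMENT contact (email, addr)>
       <!ELEMENT email   (#PCDATA)>
       <!ELEMENT addr    (street, zip, city)>
       <!ELEMENT street  (#PCDATA)>
       <!ELEMENT zip     (#PCDATA)>
       <!ELEMENT city    (#PCDATA)>
       <!ENTITY domain  "it.utu.fi">
       <!-- An entity whose replacement text contains another entity reference -->
       <!ENTITY mail    "user@&domain;">
       <!-- An entity whose replacement text is well-formed markup -->
       <!ENTITY address "<addr><street>Joukahaisenkatu 3-5</street><zip>20014</zip><city>Turku</city></addr>">
     ]>
     <contact>
       <email>&mail;</email>
       &address;
     </contact>
     ```

     After expansion the `email` element contains user@it.utu.fi and the `addr` subtree appears as if it had been typed in place, so the result still has to satisfy the content model of `contact`.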

  13. External entities
     • Parsed external entity:
       – The replacement text is in a file, e.g.
           <!ENTITY addr SYSTEM "/folder/addr.xml">
       – References to it are not allowed in attribute values
       – After replacement the result must be well-formed
       – An external entity must not have a prolog (e.g. DTD)
     • Unparsed external entity:
       – Any data, e.g. a digital image:
           <!ENTITY people SYSTEM "pic.jpg" NDATA jpeg>
       – NDATA refers to an (application-specific) notation:
           <!NOTATION jpeg SYSTEM "image/jpeg">
       – Usage as an attribute value:
           <!ATTLIST course photo ENTITY #REQUIRED>
       – Instance:
           <course photo="people">

  14. Parameter entities
     • Used to name a repeating segment in the DTD
     • Syntax: <!ENTITY % name "replacement">
     • Reference (to be replaced): %name;
     • Example:
         <!ENTITY % employee "name, dept, bdate">
         <!ELEMENT professor (%employee;)>
         <!ELEMENT lecturer (%employee;)>
         <!ELEMENT assistant (%employee;)>
     • Usually appears in external DTDs, but can be redefined in an internal DTD (if both exist); the replacement can itself be external:
         <!ENTITY % name SYSTEM "http://...">
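     The redefinition mentioned in the last bullet might look as follows. This is a sketch under assumptions: the file name `staff.dtd`, the `office` element and the sample values are hypothetical. The external DTD holds the shared declarations:

     ```xml
     <!-- staff.dtd (hypothetical external DTD) -->
     <!ENTITY % employee "name, dept, bdate">
     <!ELEMENT professor (%employee;)>
     <!ELEMENT lecturer  (%employee;)>
     <!ELEMENT assistant (%employee;)>
     <!ELEMENT name      (#PCDATA)>
     <!ELEMENT dept      (#PCDATA)>
     <!ELEMENT bdate     (#PCDATA)>
     ```

     A document can then redefine the parameter entity in its internal subset, which is read before the external one, so its declaration takes precedence:

     ```xml
     <?xml version="1.0" standalone="no"?>
     <!DOCTYPE professor SYSTEM "staff.dtd" [
       <!-- The first declaration of an entity wins, so this replaces the one in staff.dtd -->
       <!ENTITY % employee "name, dept, bdate, office">
       <!ELEMENT office (#PCDATA)>
     ]>
     <professor>
       <name>NN</name>
       <dept>CS-IT</dept>
       <bdate>1970-01-01</bdate>
       <office>B123</office>
     </professor>
     ```

     With this override all three element types in `staff.dtd` require an `office` child for this document, without the external file being edited.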

  15. Example DTD (in file 'letters.dtd')
       <!ELEMENT letters (letter+)>
       <!ELEMENT letter (topic*, text)>
       <!ATTLIST letter
         num    ID         #REQUIRED
         from   CDATA      #FIXED "John Smith, IBM"
         to     CDATA      #REQUIRED
         date   CDATA      #REQUIRED
         secret (yes | no) "no">
       <!ELEMENT topic EMPTY>
       <!ATTLIST topic title CDATA #IMPLIED>
       <!ELEMENT text ANY>
       <!ENTITY signature "Cheers, John">

  16. Example: valid document
       <?xml version="1.0" standalone="no"?>
       <!DOCTYPE letters SYSTEM "letters.dtd">
       <letters>
         <letter num="A123" to="Bill" date="20.09" secret="yes">
           <topic title="Howdy" />
           <topic title="What's cooking?" />
           <text>Thanks for the party. &signature;</text>
         </letter>
         <letter num="A124" to="Jim" date="21.09">
           <topic title="Hi" />
           <text>See you again. &signature;</text>
         </letter>
       </letters>

  17. Problems with DTD
     • Does not itself use XML syntax; needs a different parser/editor/processor
     • No constraints on character data (e.g. no format, no regular expressions)
     • No strict data types (e.g. integer, float, boolean)
     • Restricting the number of repetitions is difficult
     • Namespaces are not interpreted; prefixes are just part of the names
     • Definitions cannot depend on the context (DTD allows "too much")
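     To illustrate the lack of data constraints, here is a sketch of a document that should still validate against the 'letters.dtd' example although its content is clearly not what was intended. The attribute values are made up for the purpose.

     ```xml
     <?xml version="1.0" standalone="no"?>
     <!DOCTYPE letters SYSTEM "letters.dtd">
     <letters>
       <!-- CDATA accepts any string: an empty recipient and a meaningless date both pass -->
       <letter num="A125" to="" date="not a date">
         <!-- ANY content allows every declared element, even a whole nested letters tree -->
         <text>
           <letters>
             <letter num="A126" to="x" date="99.99">
               <text/>
             </letter>
           </letters>
         </text>
       </letter>
     </letters>
     ```

     The only checks that apply here are structural ones: the required attributes are present, the ID values are unique names, and the element nesting matches the content models.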

  18. Problems with DTD (cont.)
     • The uniqueness scope of IDs cannot be restricted.
     • Referential integrity of IDREFS cannot be specified precisely (the type of the referenced element cannot be restricted).
     • Limited modularity (using ENTITY definitions); another way to build from pieces: XInclude.
     • No defaults for elements (only for attributes)
     • No wildcards for elements/attributes (only ANY content is possible for elements).
     Some of these problems were solved in the XML Schema language (see later).
