12. Application program interfaces (APIs)
• XML documents are text files, so in principle no special APIs are required.
• However, tasks such as parsing and validation are needed in almost any application.
• Predefined class libraries and standardized interfaces reduce the programmer's work and errors.
• Main alternatives:
  – Document Object Model (DOM)
  – Simple API for XML (SAX)
  – Streaming API for XML (StAX)
• Example implementation by Sun: JAXP (containing DOM, SAX, and XSLT)
XML-12 J. Teuhola 2013 209

12.1. Document Object Model (DOM)
• W3C recommendation. A tree-based interface: the parser reads the whole document and places the resulting tree in memory for processing.
• Not tied to any programming language; Java suits it well (platform-independent, like XML).
• DOM Levels 1, 2, 3: successively wider support for various features of XML.
• Interfaces are divided into modules, enabling varying degrees of support for the API.
• Here: Level 2 Core (2000; Level 3: 2004)

About DOM specifications
• Extensions have been defined for applications, such as MathML, SVG, SMIL.
• Alternatives for processing:
  – Using only generic interfaces, like manipulating the Nodes.
  – Using application-specific interfaces, e.g. HTML: paragraphs, images, etc.
• Specification language: Interface Description Language (IDL by OMG), independent of programming language and operating system.
• Here: Java mapping (rather straightforward).
• JDOM: Simplified DOM for Java

Tentative DOM example (Xerces & Java)

    import java.io.*;
    import org.w3c.dom.*;
    import org.apache.xerces.parsers.DOMParser;
    import org.xml.sax.*;
    …
    // Print the root tag name of document "example.xml"
    DOMParser parser = new DOMParser();
    try {
        parser.parse("example.xml");
    } catch (SAXException saxe) { … }   // the document is not well-formed
    catch (IOException ioe) { … }       // the file could not be read
    Document d = parser.getDocument();
    Element root = d.getDocumentElement();
    System.out.println("Root: " + root.getTagName());
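The same task can also be sketched with JAXP, which ships with the JDK, so no separate Xerces download is assumed. The class name RootTagDemo, the helper rootTagName, and the sample document are illustrative names only and do not come from the slides.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class RootTagDemo {
    // Parses XML text into an in-memory DOM tree and returns the root element's tag name.
    static String rootTagName(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document d = builder.parse(new InputSource(new StringReader(xml)));
            return d.getDocumentElement().getTagName();
        } catch (Exception e) {
            // Covers SAXException, IOException and ParserConfigurationException.
            throw new IllegalStateException("Could not parse document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Root: " + rootTagName("<course><title>XML</title></course>"));
    }
}
```

Parsing from a string keeps the sketch self-contained; builder.parse also accepts a File or an InputStream.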

Important interfaces in DOM
• Node is the root of all component interfaces.
  – The whole document can be processed by the methods and properties defined for Node.
  – The in-memory document structure consists of nodes connected by parent, child and sibling links.
• NodeList and NamedNodeMap for processing of node sets
• DocumentTraversal, NodeIterator, TreeWalker for tree traversal and iteration
• DOMImplementation for various purposes
• … and many others

Node interface hierarchy

    Node
      ├ DocumentFragment
      ├ Document
      ├ CharacterData
      │   ├ Text
      │   │   └ CDATASection
      │   └ Comment
      ├ Attr
      ├ Element
      ├ DocumentType
      ├ Notation
      ├ Entity
      ├ EntityReference
      └ ProcessingInstruction

Node methods
1. Node characteristics: getNodeType(), getNodeName(), getNodeValue(), setNodeValue(value), hasChildNodes(), getAttributes(), getOwnerDocument()
2. Accessing relatives: getFirstChild(), getLastChild(), getChildNodes(), getNextSibling(), getPreviousSibling(), getParentNode()
3. Node manipulation: removeChild(oldChild), insertBefore(newChild, refChild), appendChild(newChild), replaceChild(newChild, oldChild), cloneNode(deep), normalize()
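A minimal sketch of how the three groups of Node methods work together, assuming the JAXP parser bundled with the JDK. The names NodeWalkDemo, parse and countElements, and the sample document, are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NodeWalkDemo {
    // Helper: builds a DOM tree from XML text.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse document", e);
        }
    }

    // Counts the element nodes in a subtree, using only generic Node methods:
    // getNodeType (characteristics), getFirstChild and getNextSibling (relatives).
    static int countElements(Node node) {
        int count = (node.getNodeType() == Node.ELEMENT_NODE) ? 1 : 0;
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            count += countElements(child);
        }
        return count;
    }

    public static void main(String[] args) {
        Document d = parse("<a><b>text</b><c/></a>");
        Node root = d.getDocumentElement();
        System.out.println("Elements: " + countElements(d));

        // Manipulation: drop the last child, then append a deep copy of the first child.
        root.removeChild(root.getLastChild());
        root.appendChild(root.getFirstChild().cloneNode(true));
        System.out.println("Elements after edit: " + countElements(d));
    }
}
```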

Access directions in the document tree
[Figure: a node with links to its parent, its first child and its last child; sibling nodes share the same parent and are chained by next sibling and previous sibling links.]

Document interface
• Represents the whole document
  – Technically implemented as the root node of the document
  – Extends the Node interface.
  – Note: the root of DOM = parent of the actual document root.
• Accessing the document information:
  – getDoctype()
  – getImplementation()
  – getDocumentElement()
  – getElementsByTagName(tagName)
• DOM Level 2:
  – getElementsByTagNameNS(URI, localName)
  – getElementById(elementId)
  – importNode(importedNode, deep)
  … and many others …

Document interface (cont.)
• Factory methods for creating objects in a document:
  – createElement(tagName)
  – createTextNode(data)
  – createComment(data)
  – createCDATASection(data)
  – createProcessingInstruction(target, data)
  – createAttribute(name)
  – createEntityReference(name)
• DOM Level 2:
  – createElementNS(URI, qualifiedName)
  – createAttributeNS(URI, qualifiedName)
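The factory methods can be illustrated by building a small document from scratch. This is a sketch that assumes the JAXP DocumentBuilder from the JDK as the source of the empty Document; the class name BuildDocDemo and the element names are made up for the example.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class BuildDocDemo {
    // Builds <course><!--...--><title>XML</title></course> using Document factory methods.
    static Document build() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("course");
            doc.appendChild(root);                       // created nodes must be attached explicitly
            root.appendChild(doc.createComment("built with factory methods"));
            Element title = doc.createElement("title");
            title.appendChild(doc.createTextNode("XML"));
            root.appendChild(title);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }
    }

    public static void main(String[] args) {
        Document doc = build();
        Element root = doc.getDocumentElement();
        System.out.println("Root: " + root.getTagName());
        System.out.println("Children of root: " + root.getChildNodes().getLength());
        System.out.println("Title text: " + root.getLastChild().getFirstChild().getNodeValue());
    }
}
```

A factory method only creates a node owned by the document; the node becomes part of the tree when appendChild or insertBefore attaches it.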

DocumentType interface
• General data about the document (DTD):
  – getName() → DOCTYPE name = root name
  – getEntities() → Internal and external entities as a list
  – getNotations() → Notations as a list
• DOM Level 2:
  – getInternalSubset()
  – getPublicId()
  – getSystemId()
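As a rough illustration, the DocumentType node can be inspected after parsing a document that carries an internal DTD subset. The sketch assumes the JDK's default JAXP parser; DoctypeDemo, doctypeOf and the sample DTD are hypothetical, and the exact text returned by getInternalSubset may vary between parsers.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;
import org.xml.sax.InputSource;

class DoctypeDemo {
    // Parses XML text and returns its DocumentType node (null if there is no DOCTYPE).
    static DocumentType doctypeOf(String xml) {
        try {
            Document d = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return d.getDoctype();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse document", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<!DOCTYPE note [<!ELEMENT note (#PCDATA)><!ENTITY who \"World\">]>"
                   + "<note>Hello</note>";
        DocumentType dt = doctypeOf(xml);
        System.out.println("Name: " + dt.getName());
        System.out.println("Declared entities: " + dt.getEntities().getLength());
        System.out.println("Internal subset: " + dt.getInternalSubset());
        System.out.println("System id: " + dt.getSystemId());
    }
}
```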

Element interface
• Extends the Node interface with element-specific features:
  – getTagName()
  – getElementsByTagName(name)
  – normalize() → merge adjacent text nodes
• Attribute-related methods:
  – getAttribute(name)
  – setAttribute(name, value)
  – removeAttribute(name)
  – getAttributeNode(name)
  – setAttributeNode(newAttr)
  – removeAttributeNode(oldAttr)

Element interface (cont.)
• DOM Level 2 element-specific extension:
  – getElementsByTagNameNS(URI, localName)
• Attribute-specific extensions:
  – hasAttribute(name)
  – hasAttributeNS(URI, localName)
  – getAttributeNS(URI, localName)
  – setAttributeNS(URI, qualName, value)
  – getAttributeNodeNS(URI, localName)
  – setAttributeNodeNS(newAttr)
  – removeAttributeNS(URI, localName)
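The attribute methods of Element, including the namespace-aware Level 2 variants, might be exercised as follows. The element name, attribute values and the class AttributeDemo are assumptions made for the example; the XLink namespace URI is the standard one.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class AttributeDemo {
    static final String XLINK = "http://www.w3.org/1999/xlink";

    // Creates <img src="a.png" xlink:href="b.png"/> (not attached to any tree).
    static Element sample() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element e = doc.createElement("img");
            e.setAttribute("src", "a.png");                 // plain attribute
            e.setAttributeNS(XLINK, "xlink:href", "b.png"); // namespaced: URI + qualified name
            return e;
        } catch (ParserConfigurationException ex) {
            throw new IllegalStateException("No DOM implementation available", ex);
        }
    }

    public static void main(String[] args) {
        Element e = sample();
        System.out.println("src = " + e.getAttribute("src"));
        // Namespace-aware lookups use the URI and the local name, not the prefix.
        System.out.println("href = " + e.getAttributeNS(XLINK, "href"));
        e.removeAttribute("src");
        System.out.println("has src: " + e.hasAttribute("src"));
        // In the Java binding, getAttribute returns an empty string for a missing attribute.
        System.out.println("src is empty: " + e.getAttribute("src").isEmpty());
    }
}
```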

Attr interface
• Information about attributes:
  – getName()
  – getValue()
  – setValue(value)
  – getSpecified() → false if the value originates from DTD
  – getOwnerElement() → DOM Level 2
• Note that most attribute-accessing operations are part of the Element interface.

CharacterData interface
• Adds text processing methods to the Node interface:
  – getData()
  – setData(data)
  – getLength()
  – appendData(arg)
  – substringData(offset, count)
  – insertData(offset, arg)
  – deleteData(offset, count)
  – replaceData(offset, count, arg)

Extensions (subtypes) of CharacterData
• Text interface
  – One additional method: splitText(offset)
  – Creation by a factory method in Document: createTextNode(data)
• CDATASection interface
  – No additional methods; just identifies CDATA nodes (reminder: <![CDATA[ ... ]]>)
  – Creation by a factory method in Document
• Comment interface
  – No additional methods; identifies comments.
  – Creation by a factory method in Document
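A small sketch of the CharacterData editing methods and of Text.splitText, assuming the JDK's bundled DOM implementation. TextDemo, newDoc and edit are illustrative names; offsets are counted in characters from 0.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Text;

class TextDemo {
    static Document newDoc() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }
    }

    // Applies a fixed sequence of edits to a node holding "Hello world".
    static String edit(Text t) {
        t.appendData("!");            // Hello world!
        t.insertData(5, ",");         // Hello, world!
        t.replaceData(7, 5, "DOM");   // Hello, DOM!
        t.deleteData(5, 1);           // Hello DOM!
        return t.getData();
    }

    public static void main(String[] args) {
        Document doc = newDoc();
        Element p = doc.createElement("p");
        Text t = doc.createTextNode("Hello world");
        p.appendChild(t);

        System.out.println(edit(t));
        System.out.println(t.substringData(0, 5));

        // splitText cuts the node in two; the new node becomes the next sibling.
        Text tail = t.splitText(5);
        System.out.println("Children after split: " + p.getChildNodes().getLength());

        // normalize() merges the adjacent text nodes again.
        p.normalize();
        System.out.println("Children after normalize: " + p.getChildNodes().getLength());
    }
}
```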

ProcessingInstruction interface
• Name of node = name of target application
• Methods:
  – getTarget()
  – getData()
  – setData(data)
• Creation (by a factory method in Document):
  – createProcessingInstruction(target, data)

Entities and notations
• Whether entity references are replaced by their values is parser-dependent. External binary data cannot be replaced; entity references must be created instead.
• Entity interface:
  – getPublicId()
  – getSystemId()
  – getNotationName()
• Notation interface:
  – getPublicId()
  – getSystemId()

Node lists and named node maps
• Some DOM operations return a list of nodes; NodeList interface:
  – item(index), getLength()
• Attribute and entity declarations have no specific order; access is based on their names; NamedNodeMap interface:
  – item(index), getLength(), getNamedItem(nodeName), setNamedItem(node), removeNamedItem(nodeName)
• DOM Level 2:
  – getNamedItemNS(URI, localName), setNamedItemNS(node), removeNamedItemNS(URI, localName)
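The two collection interfaces can be sketched side by side: a NodeList is walked by index in document order, while a NamedNodeMap is also indexable but promises no particular order, so the example sorts the names itself. NodeSetDemo and its helpers are hypothetical names, and the JDK's JAXP parser is assumed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NodeSetDemo {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse document", e);
        }
    }

    // NodeList: number of elements with the given tag name anywhere in the document.
    static int countByTag(Document d, String tag) {
        NodeList list = d.getElementsByTagName(tag);
        return list.getLength();
    }

    // NamedNodeMap: attribute names of an element, sorted because the map is unordered.
    static List<String> attributeNames(Element e) {
        NamedNodeMap attrs = e.getAttributes();
        List<String> names = new ArrayList<>();
        for (int i = 0; i < attrs.getLength(); i++) {
            names.add(attrs.item(i).getNodeName());
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        Document d = parse("<list b='2' a='1'><item/><item/></list>");
        System.out.println("Items: " + countByTag(d, "item"));
        System.out.println("Attributes: " + attributeNames(d.getDocumentElement()));
        System.out.println("a = " + d.getDocumentElement().getAttributes()
                .getNamedItem("a").getNodeValue());
    }
}
```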

Testing the DOM implementation
• DOMImplementation interface: hasFeature(feature, version), where
  – feature = module name:
      Core, XML, HTML (DOM Level 1)
      Views, Events, Style, Traversal, Range (DOM Level 2)
      More modules appear in Level 3.
  – version = "1.0", "2.0", ...
• Other methods:
  – createDocument(URI, qualifiedName, docType)
  – createDocumentType(qualifiedName, publicId, systemId)
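A sketch of querying the implementation and using its creation methods, assuming the DOMImplementation is obtained from a JAXP DocumentBuilder. Which features are reported as supported depends on the implementation in use, so the output is not shown. ImplDemo, the root name "note" and the system identifier "note.dtd" are invented for the example.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;

class ImplDemo {
    // Obtains the DOMImplementation behind the default JAXP parser.
    static DOMImplementation impl() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().getDOMImplementation();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }
    }

    public static void main(String[] args) {
        DOMImplementation impl = impl();

        // Ask which Level 2 modules this implementation claims to support.
        for (String feature : new String[] {"Core", "XML", "Traversal", "Range", "Events"}) {
            System.out.println(feature + " 2.0: " + impl.hasFeature(feature, "2.0"));
        }

        // Create a document type and a document whose root element is <note>.
        DocumentType docType = impl.createDocumentType("note", null, "note.dtd");
        Document doc = impl.createDocument(null, "note", docType);
        System.out.println("Root: " + doc.getDocumentElement().getTagName());
        System.out.println("DOCTYPE: " + doc.getDoctype().getName());
    }
}
```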

Tree traversal interfaces
• DOM Level 2: Optional package for sophisticated traversal of document trees.
• DocumentTraversal interface: An iterator can be created to choose node types and filter the nodes further.
• NodeIterator interface: Iteration steps: to the next/previous node
• TreeWalker interface: Like NodeIterator, but more versatile: first/last child, next/previous sibling, parent
• NodeFilter interface: accept/reject/skip nodes.
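The traversal interfaces could be combined roughly as follows. Because the module is optional, the sketch first checks that the Document object also implements DocumentTraversal, which is the case for the DOM implementation bundled with the JDK as far as this example assumes. TraversalDemo and elementNamesExcept are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.NodeIterator;

class TraversalDemo {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(xml));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse document", e);
        }
    }

    // Small wrapper so the parse call above reads from a string.
    static final class InputSource extends org.xml.sax.InputSource {
        InputSource(String xml) { super(new StringReader(xml)); }
    }

    // Lists element names in document order, leaving out elements named 'skipped'.
    static List<String> elementNamesExcept(Document d, final String skipped) {
        if (!(d instanceof DocumentTraversal)) {
            throw new UnsupportedOperationException("Traversal module not supported");
        }
        NodeFilter filter = new NodeFilter() {
            @Override
            public short acceptNode(Node n) {
                // FILTER_SKIP drops this node but still visits its descendants.
                return n.getNodeName().equals(skipped)
                        ? NodeFilter.FILTER_SKIP : NodeFilter.FILTER_ACCEPT;
            }
        };
        NodeIterator it = ((DocumentTraversal) d).createNodeIterator(
                d.getDocumentElement(), NodeFilter.SHOW_ELEMENT, filter, true);
        List<String> names = new ArrayList<>();
        for (Node n = it.nextNode(); n != null; n = it.nextNode()) {
            names.add(n.getNodeName());
        }
        return names;
    }

    public static void main(String[] args) {
        Document d = parse("<a><b/><c><b/></c></a>");
        System.out.println(elementNamesExcept(d, "c"));
    }
}
```

The whatToShow mask (SHOW_ELEMENT) selects node types first; the NodeFilter then decides per node. A TreeWalker is created the same way with createTreeWalker and adds parent, child and sibling moves.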

Processing of ranges
• Ranges is an optional module in DOM Level 2.
• A range is a segment between start and end points; points are offsets from the start of the containing element.
• Range interface: Methods e.g. for
  – setting the start and end point,
  – comparing two ranges,
  – copying the contents of the range,
  – inserting new items to the range,
  – collapsing the range,
  – etc.
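A minimal sketch of a range inside a single text node, where the boundary offsets count characters. Since the Range module is optional, the code checks that the Document implements DocumentRange before casting; this is assumed to hold for the JDK's bundled implementation. RangeDemo, newDoc and firstChars are illustrative names.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Text;
import org.w3c.dom.ranges.DocumentRange;
import org.w3c.dom.ranges.Range;

class RangeDemo {
    static Document newDoc() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }
    }

    // Returns the first 'count' characters of a text node by way of a Range.
    static String firstChars(Text text, int count) {
        Document doc = text.getOwnerDocument();
        if (!(doc instanceof DocumentRange)) {
            throw new UnsupportedOperationException("Range module not supported");
        }
        Range range = ((DocumentRange) doc).createRange();
        range.setStart(text, 0);       // start point: offset 0 in the text node
        range.setEnd(text, count);     // end point: offset 'count' in the same node
        return range.toString();       // text content covered by the range
    }

    public static void main(String[] args) {
        Document doc = newDoc();
        Element p = doc.createElement("p");
        doc.appendChild(p);
        Text t = doc.createTextNode("Hello world");
        p.appendChild(t);

        System.out.println(firstChars(t, 5));

        // Collapsing moves both boundary points to the same position.
        Range r = ((DocumentRange) doc).createRange();
        r.setStart(t, 0);
        r.setEnd(t, 5);
        r.collapse(true);
        System.out.println("Collapsed: " + r.getCollapsed());
    }
}
```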
