
XPT 2006 XML APIs: JAXP 1

2.3 JAXP: Java API for XML Processing

  • How can applications use XML processors?
    – A Java-based answer: through JAXP
    – An overview of the JAXP interface
        » What does it specify?
        » What can be done with it?
        » How do the JAXP components fit together?

[Partly based on the tutorial “An Overview of the APIs” at http://java.sun.com/xml/jaxp/dist/1.1/docs/tutorial/overview/3_apis.html, from which some graphics are also borrowed]

XPT 2006 XML APIs: JAXP 2

JAXP 1.1

  • Included in Java since JDK 1.4
  • An interface for “plugging in” and using XML processors in Java applications
    – includes packages
        » org.xml.sax: SAX 2.0
        » org.w3c.dom: DOM Level 2
        » javax.xml.parsers: initialization and use of parsers
        » javax.xml.transform: initialization and use of transformers (XSLT processors)

XPT 2006 XML APIs: JAXP 3

Later Versions

  • JAXP 1.2 (2002) adds property strings for setting the language and source of a schema used for (non-DTD-based) validation
  • JAXP 1.3 included in JDK 5.0 (2005)
    – more flexible validation (decoupled from parsing)
    – support for DOM3 and XPath
  • We'll restrict to basic ideas from JAXP 1.1

XPT 2006 XML APIs: JAXP 4

JAXP: XML processor plugin (1)

  • Vendor-independent method for selecting processor implementation at run time
    – principally through system properties
        javax.xml.parsers.SAXParserFactory
        javax.xml.parsers.DocumentBuilderFactory
        javax.xml.transform.TransformerFactory
    – Set on command line (for example, to use Apache Xerces as the DOM implementation):
        java -Djavax.xml.parsers.DocumentBuilderFactory=org.apache.xerces.jaxp.DocumentBuilderFactoryImpl

XPT 2006 XML APIs: JAXP 5

JAXP: XML processor plugin (2)

    – Set during execution (-> Saxon as the XSLT impl):
        System.setProperty(
            "javax.xml.transform.TransformerFactory",
            "com.icl.saxon.TransformerFactoryImpl");

  • By default, reference implementations used
    – Apache Crimson/Xerces as the XML parser
    – Apache Xalan as the XSLT processor
  • Supported by a few compliant processors:
    – Parsers: Apache Crimson and Xerces, Aelfred, Oracle XML Parser for Java
    – XSLT transformers: Apache Xalan, Saxon
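
To see which processor implementations the plugin mechanism actually resolves on a given installation, a small probe such as the following may help. It is only a sketch: the class name JaxpPluginDemo and the helper domFactoryName are made up for illustration, and the printed class names depend entirely on the JDK and classpath in use.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;

// Hypothetical probe class (not part of JAXP): reports the factories JAXP picks.
final class JaxpPluginDemo {
    // The system property consulted when a DOM factory is looked up.
    static final String DOM_KEY = "javax.xml.parsers.DocumentBuilderFactory";

    /** Returns the class name of the DOM factory that JAXP resolves right now. */
    static String domFactoryName() {
        return DocumentBuilderFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        // null here means the property is unset and the platform default applies.
        System.out.println("property:     " + System.getProperty(DOM_KEY));
        System.out.println("DOM factory:  " + domFactoryName());
        System.out.println("SAX factory:  "
                + SAXParserFactory.newInstance().getClass().getName());
        System.out.println("XSLT factory: "
                + TransformerFactory.newInstance().getClass().getName());
    }
}
```

Running it once without and once with a -D option on the command line should show whether the property took effect, provided the named implementation is on the classpath.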

XPT 2006 XML APIs: JAXP 6

JAXP: Functionality

  • Parsing using SAX 2.0 or DOM Level 2
  • Transformation using XSLT
    – (We’ll study XSLT in detail later)
  • Adds functionality missing from SAX 2.0 and DOM Level 2:
    – controlling validation and handling of parse errors
        » error handling can be controlled in SAX, by implementing ErrorHandler methods
    – loading and saving of DOM Document objects

XPT 2006 XML APIs: JAXP 7

JAXP Parsing API

  • Included in JAXP package javax.xml.parsers
  • Used for invoking and using SAX ...
        SAXParserFactory spf = SAXParserFactory.newInstance();
    and DOM parser implementations:
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();

XPT 2006 XML APIs: JAXP 8

JAXP: Using a SAX parser (1)

[Figure: call chain .newSAXParser() -> .getXMLReader() -> .parse("f.xml") applied to the XML file f.xml]


XPT 2006 XML APIs: JAXP 9

JAXP: Using a SAX parser (2)

  • We have already seen this:

        SAXParserFactory spf = SAXParserFactory.newInstance();
        try {
            SAXParser saxParser = spf.newSAXParser();
            XMLReader xmlReader = saxParser.getXMLReader();
            ...
            xmlReader.setContentHandler(handler);
            xmlReader.parse(fileNameOrURI);
            ...
        } catch (Exception e) {
            System.err.println(e.getMessage());
            System.exit(1);
        }
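
A complete, runnable variant of the SAX fragment might look as follows. This is a sketch under assumptions: the class SaxElementCounter and its countElements helper are invented for illustration, the handler merely counts start tags, and the input is read from a string rather than from a file or URI.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: counts the elements of a document with SAX.
final class SaxElementCounter {

    /** Parses the given XML text and returns the number of start tags seen. */
    static int countElements(String xml) throws Exception {
        final int[] count = {0};
        // DefaultHandler supplies empty callbacks; only startElement is overridden.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                count[0]++;
            }
        };
        SAXParserFactory spf = SAXParserFactory.newInstance();
        SAXParser saxParser = spf.newSAXParser();
        XMLReader xmlReader = saxParser.getXMLReader();
        xmlReader.setContentHandler(handler);
        // An InputSource over a Reader stands in for the file name or URI.
        xmlReader.parse(new InputSource(new StringReader(xml)));
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countElements("<a><b/><b/></a>"));
    }
}
```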

XPT 2006 XML APIs: JAXP 10

JAXP: Using a DOM parser (1)

[Figure: call chain .newDocumentBuilder() -> .parse("f.xml") applied to the file f.xml, or .newDocument()]

XPT 2006 XML APIs: JAXP 11

JAXP: Using a DOM parser (2)

  • Parsing a file into a DOM Document:

        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        try {
            // to get a new DocumentBuilder:
            DocumentBuilder builder = dbf.newDocumentBuilder();
            Document domDoc = builder.parse(fileNameOrURI);
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
            System.exit(1);
        } catch (SAXException | IOException e) {
            // parse() also declares these checked exceptions
            e.printStackTrace();
            System.exit(1);
        }
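
The DOM fragment can be turned into a self-contained program along these lines. The class DomParseDemo, its parse helper and the sample reglist document are assumptions made for the sake of the example; the input comes from a string instead of a file.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical example class: parses XML text into a DOM Document.
final class DomParseDemo {

    /** Builds a DOM tree from XML text using the default JAXP DOM factory. */
    static Document parse(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = dbf.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document domDoc = parse("<reglist><reg>1</reg><reg>2</reg></reglist>");
        // Inspect the tree through the standard DOM Level 2 interfaces.
        System.out.println(domDoc.getDocumentElement().getTagName());
        System.out.println(domDoc.getElementsByTagName("reg").getLength());
    }
}
```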

XPT 2006 XML APIs: JAXP 12

DOM building in JAXP

[Figure: XML -> XML Reader (SAX Parser), with Error Handler, DTD Handler and Entity Resolver -> Document Builder (Content Handler) -> DOM Document]

DOM on top of SAX - So what?

XPT 2006 XML APIs: JAXP 13

JAXP: Controlling parsing (1)

  • Errors of DOM parsing can be handled
    – by creating a SAX ErrorHandler
        » to implement error, fatalError and warning methods
      and passing it to the DocumentBuilder:
        builder.setErrorHandler(new myErrHandler());
        domDoc = builder.parse(fileName);
  • Parser properties can be configured:
    – for both SAXParserFactories and DocumentBuilderFactories (before parser/builder creation):
        factory.setValidating(true/false)
        factory.setNamespaceAware(true/false)
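
One possible shape for such an ErrorHandler is sketched below. The names CollectingErrorHandlerDemo, CollectingHandler and problemsIn are hypothetical, and collecting messages in a list is just one design choice; a real application might log or rethrow instead.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical example class: routes DOM parse errors to a SAX ErrorHandler.
final class CollectingErrorHandlerDemo {

    /** Records every reported problem; rethrows fatal errors to stop parsing. */
    static final class CollectingHandler implements ErrorHandler {
        final List<String> messages = new ArrayList<>();

        public void warning(SAXParseException e) {
            messages.add("warning: " + e.getMessage());
        }

        public void error(SAXParseException e) {
            messages.add("error: " + e.getMessage());
        }

        public void fatalError(SAXParseException e) throws SAXException {
            messages.add("fatal: " + e.getMessage());
            throw e;
        }
    }

    /** Parses the text and returns the problems reported by the parser. */
    static List<String> problemsIn(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Factory settings must be made before the builder is created.
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        CollectingHandler handler = new CollectingHandler();
        builder.setErrorHandler(handler);
        try {
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            // Already recorded by the handler; nothing more to do here.
        }
        return handler.messages;
    }

    public static void main(String[] args) throws Exception {
        // The sample text is not well-formed: <b> is never closed.
        for (String m : problemsIn("<a><b></a>")) {
            System.out.println(m);
        }
    }
}
```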

XPT 2006 XML APIs: JAXP 14

JAXP: Controlling parsing (2)

  • Further DocumentBuilderFactory configuration methods to control the form of the resulting DOM Document:
        setIgnoringComments(true/false)
        setIgnoringElementContentWhitespace(true/false)
        setCoalescing(true/false)
          • combine CDATA sections with surrounding text?
        setExpandEntityReferences(true/false)
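
The effect of two of these settings on the shape of the DOM tree can be observed with a small experiment. The class DomShapeDemo and its childCount helper are invented for illustration; the exact node counts are what the JDK's bundled parser is expected to produce and could differ with another implementation.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical example class: compares DOM shapes under different settings.
final class DomShapeDemo {
    // One comment, plain text, a CDATA section and more text inside <a>.
    static final String XML = "<a><!-- note -->x<![CDATA[y]]>z</a>";

    /** Returns the number of child nodes of the root element. */
    static int childCount(boolean tidy) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setIgnoringComments(tidy); // drop comment nodes when true
        dbf.setCoalescing(tidy);       // merge CDATA into adjacent text when true
        Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
        return doc.getDocumentElement().getChildNodes().getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("default settings: " + childCount(false) + " child nodes");
        System.out.println("tidy settings:    " + childCount(true) + " child nodes");
    }
}
```

With both settings enabled the comment should disappear and the text and CDATA content should collapse into a single text node, so the second count should be smaller than the first.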

XPT 2006 XML APIs: JAXP 15

JAXP Transformation API

  • earlier known as TrAX
  • Allows application to apply a Transformer to a Source document to get a Result document
  • Transformer can be created
    – from XSLT transformation instructions (to be discussed later)
    – without instructions
        » gives an identity transformation, which simply copies the Source to the Result

XPT 2006 XML APIs: JAXP 16

JAXP: Using Transformers (1)

[Figure: call chain .newTransformer(…) with XSLT -> .transform(.,.) applied to a Source]


XPT 2006 XML APIs: JAXP 17

JAXP Transformation APIs

  • javax.xml.transform:
    – Classes Transformer and TransformerFactory; initialization similar to parsers and parser factories
  • Transformation Source object can be
    – a DOM tree, a SAX XMLReader or an input stream
  • Transformation Result object can be
    – a DOM tree, a SAX ContentHandler or an output stream

XPT 2006 XML APIs: JAXP 18

Source-Result combinations

[Figure: Source (XML Reader (SAX Parser), DOM, or Input Stream) -> Transformer -> Result (DOM, Content Handler, or Output Stream)]
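
As one instance of these combinations, an identity Transformer can read a stream Source and deliver a DOM Result. The sketch below assumes a made-up class StreamToDomDemo with a helper toDom, and uses a string reader in place of a real input stream.

```java
import java.io.StringReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;

// Hypothetical example class: stream Source in, DOM Result out.
final class StreamToDomDemo {

    /** Copies XML text into a new DOM Document via an identity transformation. */
    static Document toDom(String xml) throws Exception {
        // newTransformer() without arguments yields the identity transformation.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        // An empty DOMResult lets the transformer create the Document itself.
        DOMResult result = new DOMResult();
        identity.transform(new StreamSource(new StringReader(xml)), result);
        return (Document) result.getNode();
    }

    public static void main(String[] args) throws Exception {
        Document doc = toDom("<reglist><reg>1</reg></reglist>");
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```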

XPT 2006 XML APIs: JAXP 19

JAXP Transformation Packages (2)

  • Classes to create Source and Result objects from DOM, SAX and I/O streams defined in packages
    – javax.xml.transform.dom, javax.xml.transform.sax, and javax.xml.transform.stream
  • Identity transformation to an output stream is a vendor-neutral way to serialize DOM documents (and the only option in JAXP)
    – “I would recommend using the JAXP interfaces until the DOM’s own load/save module becomes available”
        » Joe Kesselman, IBM & W3C DOM WG

XPT 2006 XML APIs: JAXP 20

Serializing a DOM Document as XML text

  • By an identity transformation to an output stream:

        TransformerFactory tFactory = TransformerFactory.newInstance();
        // Create an identity transformer:
        Transformer transformer = tFactory.newTransformer();
        DOMSource source = new DOMSource(myDOMdoc);
        StreamResult result = new StreamResult(System.out);
        transformer.transform(source, result);
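
A runnable version of this serialization step could look like the following. The class DomSerializeDemo, the helpers sampleDoc and serialize, and the greeting document are assumptions for illustration; the output goes to a StringWriter so that it can be inspected, where the slide writes to System.out.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical example class: writes a DOM Document out as XML text.
final class DomSerializeDemo {

    /** Builds a tiny document in memory: one element with text content. */
    static Document sampleDoc() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element root = doc.createElement("greeting");
        root.appendChild(doc.createTextNode("hello"));
        doc.appendChild(root);
        return doc;
    }

    /** Serializes the document through an identity transformation. */
    static String serialize(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize(sampleDoc()));
    }
}
```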

XPT 2006 XML APIs: JAXP 21

Controlling the form of the result?

  • We could specify the requested form of the result by an XSLT script, say, in file saveSpec.xslt:

        <xsl:transform version="1.0"
            xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
          <xsl:output encoding="ISO-8859-1" indent="yes"
              doctype-system="reglist.dtd" />
          <xsl:template match="/">
            <!-- copy whole document: -->
            <xsl:copy-of select="." />
          </xsl:template>
        </xsl:transform>

XPT 2006 XML APIs: JAXP 22

Creating an XSLT Transformer

  • Then create a tailored transformer:

        StreamSource saveSpecSrc = new StreamSource(new File("saveSpec.xslt"));
        Transformer transformer = tFactory.newTransformer(saveSpecSrc);
        // and use it to transform a Source to a Result,
        // as before

  • The Source of transformation instructions could be given also as a DOMSource, URL, or character reader
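
The tailored transformer can also be tried without any files by supplying the XSLT script through a character reader. In this sketch the class TailoredTransformerDemo, the constant SAVE_SPEC and the helper save are hypothetical; the stylesheet text mirrors saveSpec.xslt, and the exact layout of the output depends on the XSLT processor in use.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example class: an XSLT-driven "save" with output settings.
final class TailoredTransformerDemo {
    // Same instructions as saveSpec.xslt, held in a string for this sketch.
    static final String SAVE_SPEC =
          "<xsl:transform version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output encoding=\"ISO-8859-1\" indent=\"yes\""
        + " doctype-system=\"reglist.dtd\"/>"
        + "<xsl:template match=\"/\"><xsl:copy-of select=\".\"/></xsl:template>"
        + "</xsl:transform>";

    /** Copies the input document to text, shaped by the xsl:output settings. */
    static String save(String xml) throws Exception {
        TransformerFactory tFactory = TransformerFactory.newInstance();
        // The stylesheet Source is a StreamSource over a character reader.
        Transformer transformer =
                tFactory.newTransformer(new StreamSource(new StringReader(SAVE_SPEC)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(save("<reglist><reg>1</reg></reglist>"));
    }
}
```

The result should carry a document type declaration that refers to reglist.dtd, which the plain identity transformer would not add.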

XPT 2006 XML APIs: JAXP 23

DOM vs. Other Java/XML APIs

  • JDOM (www.jdom.org), DOM4J (www.dom4j.org), JAXB (java.sun.com/xml/jaxb)
  • The others may be more convenient to use, but ...
    “The DOM offers not only the ability to move between languages with minimal relearning, but to move between multiple implementations in a single language – which a specific set of classes such as JDOM can’t support”
        » J. Kesselman, IBM & W3C DOM WG

XPT 2006 XML APIs: JAXP 24

JAXP: Summary

  • An interface for using XML Processors
    – SAX/DOM parsers, XSLT transformers
  • Supports pluggability of XML processors
  • Defines means to control parsing, and handling of parse errors (through SAX ErrorHandlers)
  • Defines means to write out DOM Documents
  • Included in Java 2