Streaming API for XML Asst. Prof. Dr. Kanda Runapongsa Saikaew - PowerPoint PPT Presentation

Streaming API for XML
Asst. Prof. Dr. Kanda Runapongsa Saikaew (krunapon@kku.ac.th)
Dept. of Computer Engineering, Khon Kaen University

Agenda
• What is StAX?
• Why StAX?
• StAX API
• Using StAX
• Sun’s Streaming Parser Implementation

What is StAX? (1/2)
• StAX stands for Streaming API for XML
• A streaming, Java-based, event-driven, pull-parsing API for reading and writing XML documents
• StAX enables you to create bidirectional XML parsers that are fast, relatively easy to program, and have a light memory footprint

What is StAX? (2/2)
• StAX provides a standard, bidirectional pull-parser interface for streaming XML processing
• Offers a simpler programming model than SAX
• Manages memory more efficiently than DOM
• Enables developers to parse and modify XML streams as events

Push APIs
• The common streaming APIs like SAX are all push APIs
• They feed the content of the document to the application as soon as they see it
• They do not pay attention to whether the application is ready to receive that data
• This causes patterns that are unfamiliar and uncomfortable to many developers

Pull APIs vs. Push APIs
• In a pull API, the client program asks the parser for the next piece of information
• The parser does not tell the client program when the next datum is available
• In a pull API, the client program drives the parser
• In a push API, the parser drives the client

Pull Parsing vs. Push Parsing (1/2)
• Streaming pull parsing
  ◦ The client only gets (pulls) XML data when it explicitly asks for it
  ◦ The client controls the application thread
• Streaming push parsing
  ◦ The parser sends the data whether or not the client is ready to use it at that time
  ◦ The parser controls the application thread
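The contrast between the two models can be sketched by counting start tags both ways. This is only an illustration: the class name PushVsPull and its two methods are made up for this example and do not come from the slides. In the SAX version the parser calls the handler; in the StAX version the client loop asks for each event.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, named for this sketch only.
class PushVsPull {

    // Push (SAX): the parser drives and calls startElement for every start tag.
    static int countWithSax(String xml) throws Exception {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                count[0]++;
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return count[0];
    }

    // Pull (StAX): the client drives and asks for the next event when ready.
    static int countWithStax(String xml) throws Exception {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        int count = 0;
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                count++;
            }
        }
        reader.close();
        return count;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<a><b/><c>t</c></a>";
        System.out.println(countWithSax(xml) + " " + countWithStax(xml));
    }
}
```

Because the pull loop is ordinary client code, it could stop early or pause between calls to next(), which a SAX handler cannot do without extra machinery.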

Pull Parsing vs. Push Parsing (2/2)
• Pull parsing libraries can be much smaller
• Pull clients can read multiple documents at one time with a single thread
• Pull parsers can filter XML documents so that elements the client does not need are ignored
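Reading several documents from one thread might look like the sketch below, which alternates between two readers and keeps only start-element names, ignoring everything else. The class TwoDocuments, its methods, and the "A:"/"B:" labels are hypothetical names chosen for the illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, named for this sketch only.
class TwoDocuments {

    // Advance one reader to its next start tag; return null when the document ends.
    static String nextElementName(XMLStreamReader r) throws XMLStreamException {
        while (r.hasNext()) {
            if (r.next() == XMLStreamConstants.START_ELEMENT) {
                return r.getLocalName();
            }
        }
        return null;
    }

    // Interleave the element names of two documents using a single thread.
    static List<String> interleave(String xmlA, String xmlB) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader a = factory.createXMLStreamReader(new StringReader(xmlA));
        XMLStreamReader b = factory.createXMLStreamReader(new StringReader(xmlB));
        List<String> names = new ArrayList<>();
        String na = nextElementName(a);
        String nb = nextElementName(b);
        while (na != null || nb != null) {
            if (na != null) { names.add("A:" + na); na = nextElementName(a); }
            if (nb != null) { names.add("B:" + nb); nb = nextElementName(b); }
        }
        a.close();
        b.close();
        return names;
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(interleave("<x><y/></x>", "<p/>"));
    }
}
```

The client decides which reader to advance next, so neither document has to be buffered in full.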

Why StAX?
• The primary goal of the StAX API is to give parsing control to the programmer by exposing a simple iterator-based API
• This allows the programmer to ask for the next event (pull the event) and allows state to be stored in a procedural fashion
• StAX was created to address limitations in the two most prevalent parsing APIs, SAX and DOM

StAX Use Cases (1/2)
• Data binding
  ◦ Unmarshalling an XML document
  ◦ Marshalling an XML document
  ◦ Parallel document processing
  ◦ Wireless communication
• SOAP message processing
  ◦ Parsing simple predictable structures
  ◦ Parsing graph representations with forward references
  ◦ Parsing WSDL

StAX Use Cases (2/2)
• Virtual data sources
  ◦ Viewing as XML data stored in databases
  ◦ Viewing data in Java objects created by XML data binding
  ◦ Navigating a DOM tree as a stream of events
• Parsing specific XML vocabularies
• Pipelined XML processing

StAX vs. SAX
• StAX-enabled clients are generally easier to code than SAX clients
• StAX is a bidirectional API
  ◦ It can both read and write XML documents
  ◦ SAX is read only
• SAX is a push API whereas StAX is a pull API

XML Parser API Feature Summary (1/2)
Feature                     StAX             SAX              DOM             TrAX
API type                    Pull, streaming  Push, streaming  In-memory tree  XSLT rule
Ease of use                 High             Medium           High            Medium
XPath capability            No               No               Yes             Yes
CPU and memory efficiency   Good             Good             Varies          Varies

XML Parser API Feature Summary (2/2)
Feature                       StAX  SAX  DOM  TrAX
Forward only                  Yes   Yes  No   No
Read XML                      Yes   Yes  Yes  Yes
Write XML                     Yes   No   Yes  Yes
Create, read, update, delete  No    No   Yes  No

StAX API
• The StAX API exposes methods for iterative, event-based processing of XML documents
• The StAX API is really two distinct API sets
  ◦ A cursor API
  ◦ An iterator API
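For comparison with the cursor API, a minimal sketch of the iterator API follows. It uses XMLEventReader, which hands back each event as an XMLEvent object instead of moving a cursor. The class name IteratorApiSketch and the method startElementNames are assumptions made for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper class, named for this sketch only.
class IteratorApiSketch {

    // Collect the local names of all start elements using the iterator API.
    static List<String> startElementNames(String xml) throws XMLStreamException {
        XMLEventReader events = XMLInputFactory.newInstance()
                .createXMLEventReader(new StringReader(xml));
        List<String> names = new ArrayList<>();
        while (events.hasNext()) {
            XMLEvent event = events.nextEvent();   // each event is a self-contained object
            if (event.isStartElement()) {
                names.add(event.asStartElement().getName().getLocalPart());
            }
        }
        events.close();
        return names;
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(startElementNames("<nation><name>Thailand</name></nation>"));
    }
}
```

Event objects can be stored or passed along after the reader has moved on, at the cost of allocating an object per event; the cursor API avoids that allocation.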

Using StAX
In general, StAX programmers create XML stream readers, writers, and events by using the following factory classes:
• XMLInputFactory
• XMLOutputFactory
• XMLEventFactory
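A rough sketch of how the three factories fit together: XMLEventFactory creates events, XMLOutputFactory creates the writer that serializes them, and XMLInputFactory creates a reader that parses the result back. The class FactorySketch, its methods, and the element name greeting are invented for the illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, named for this sketch only.
class FactorySketch {

    // Build events with XMLEventFactory and serialize them with a writer
    // obtained from XMLOutputFactory.
    static String writeGreeting(String text) throws XMLStreamException {
        XMLEventFactory eventFactory = XMLEventFactory.newInstance();
        StringWriter out = new StringWriter();
        XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);
        // Arguments are (prefix, namespaceURI, localName); no namespace is used here.
        writer.add(eventFactory.createStartElement("", "", "greeting"));
        writer.add(eventFactory.createCharacters(text));
        writer.add(eventFactory.createEndElement("", "", "greeting"));
        writer.flush();
        writer.close();
        return out.toString();
    }

    // Read the text of the root element with a reader from XMLInputFactory.
    static String readRootText(String xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        reader.nextTag();                       // move to the root start tag
        String text = reader.getElementText();  // text content up to the end tag
        reader.close();
        return text;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = writeGreeting("hello");
        System.out.println(xml + " -> " + readRootText(xml));
    }
}
```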

Cursor API
• The StAX cursor API represents a cursor with which you can walk an XML document from beginning to end
• This cursor can point to one thing at a time
• It always moves forward, never backward, usually one infoset element at a time

Cursor Interfaces
• The two main cursor interfaces are XMLStreamReader and XMLStreamWriter
• XMLStreamReader includes accessor methods for all possible information retrievable from the XML information model
• XMLStreamWriter provides methods that correspond to the StartElement and EndElement event types

XMLStreamReader
    public interface XMLStreamReader {
        public int next() throws XMLStreamException;
        public boolean hasNext() throws XMLStreamException;
        public String getText();
        public String getLocalName();
        public String getNamespaceURI();
        // ... other methods not shown
    }
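A small sketch of how these accessors are typically combined: next() returns an event code, and the accessor that is valid depends on that code. The class ReaderSketch and the "START"/"TEXT"/"END" labels are assumptions made for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, named for this sketch only.
class ReaderSketch {

    // Describe each element and text event with the accessors listed above.
    static List<String> describe(String xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        List<String> lines = new ArrayList<>();
        while (reader.hasNext()) {
            switch (reader.next()) {
                case XMLStreamConstants.START_ELEMENT:
                    lines.add("START {" + reader.getNamespaceURI() + "}" + reader.getLocalName());
                    break;
                case XMLStreamConstants.CHARACTERS:
                    lines.add("TEXT " + reader.getText());
                    break;
                case XMLStreamConstants.END_ELEMENT:
                    lines.add("END " + reader.getLocalName());
                    break;
                default:
                    break;   // other event types are ignored in this sketch
            }
        }
        reader.close();
        return lines;
    }

    public static void main(String[] args) throws XMLStreamException {
        for (String line : describe("<n:a xmlns:n=\"urn:x\">hi</n:a>")) {
            System.out.println(line);
        }
    }
}
```

Calling an accessor in the wrong state, such as getText() on a start tag, throws IllegalStateException, so the switch on the event code is what keeps the calls valid.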

XHTMLOutliner (1/7)
    package stax_parser;

    import javax.xml.stream.*;
    import java.net.URL;
    import java.io.*;
    import java.util.Properties;

    public class XHTMLOutliner {
        public static void main(String[] args) {
            if (args.length == 0) {
                System.err.println("Usage: java XHTMLOutliner url");
                return;
            }
            String input = args[0];

XHTMLOutliner (2/7)
            try {
                setProxy();
                URL u = new URL(input);
                InputStream in = u.openStream();
                XMLInputFactory factory = XMLInputFactory.newInstance();
                XMLStreamReader parser = factory.createXMLStreamReader(in);
                int inHeader = 0;
                for (int event = parser.next();
                     event != XMLStreamConstants.END_DOCUMENT;
                     event = parser.next()) {

XHTMLOutliner (3/7)
                    switch (event) {
                        case XMLStreamConstants.START_ELEMENT:
                            if (isHeader(parser.getLocalName())) {
                                inHeader++;
                            }
                            break;
                        case XMLStreamConstants.END_ELEMENT:
                            if (isHeader(parser.getLocalName())) {
                                inHeader--;
                                if (inHeader == 0) System.out.println();
                            }
                            break;

XHTMLOutliner (4/7)
                        case XMLStreamConstants.CHARACTERS:
                            if (inHeader > 0) System.out.print(parser.getText());
                            break;
                        case XMLStreamConstants.CDATA:
                            if (inHeader > 0) System.out.print(parser.getText());
                            break;
                    } // end switch
                } // end for

XHTMLOutliner (5/7)
                parser.close();
                System.out.println("Done processing");
            } catch (XMLStreamException ex) {
                System.out.println(ex);
            } catch (IOException ex) {
                System.out.println("IOException while parsing " + input);
            } // end try-catch
        } // end main

XHTMLOutliner (6/7)
        private static boolean isHeader(String name) {
            if (name.equals("h1")) return true;
            if (name.equals("h2")) return true;
            if (name.equals("h3")) return true;
            if (name.equals("h4")) return true;
            if (name.equals("h5")) return true;
            if (name.equals("h6")) return true;
            return false;
        }

XHTMLOutliner (7/7)
        private static void setProxy() {
            Properties systemSettings = System.getProperties();
            systemSettings.put("proxySet", "true");
            systemSettings.put("http.proxyHost", "202.12.97.116");
            systemSettings.put("http.proxyPort", "8088");
        }
    }

XHTMLOutliner: Sample Input
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head>
    <title>I Love HTML</title>
    <meta http-equiv="Content-Language" content="en-us" />
    <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
    </head>
    <body>
    <h1>Top 10 Strategic Technologies for 2008</h1>
    <h2>By Gartner</h2>
    <h3>Green IT</h3>
    <h4>Scheduling decisions for workloads on servers will begin to consider power efficiency as a key placement attribute.</h4>
    </body>
    </html>

XHTMLOutliner: Sample Output
    Top 10 Strategic Technologies for 2008
    By Gartner
    Green IT
    Scheduling decisions for workloads on servers will begin to consider power efficiency as a key placement attribute.
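The outliner logic can also be tried without a network connection or proxy. The sketch below is a compact variant, not the code from the slides: it reads from a String, returns the headings as a list instead of printing them, and replaces the chain of equals() calls with a regular expression. The class name OutlineSketch is hypothetical, and turning off DTD support is an assumption made here so that a DOCTYPE does not trigger a download of the external DTD.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, named for this sketch only.
class OutlineSketch {

    // Return the text of every h1..h6 element, in document order.
    static List<String> outline(String xhtml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Assumption for this sketch: skip DTD processing entirely.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        XMLStreamReader parser = factory.createXMLStreamReader(new StringReader(xhtml));

        List<String> headings = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int inHeader = 0;   // nesting depth inside header elements
        while (parser.hasNext()) {
            switch (parser.next()) {
                case XMLStreamConstants.START_ELEMENT:
                    if (isHeader(parser.getLocalName())) inHeader++;
                    break;
                case XMLStreamConstants.END_ELEMENT:
                    if (isHeader(parser.getLocalName()) && --inHeader == 0) {
                        headings.add(current.toString());
                        current.setLength(0);
                    }
                    break;
                case XMLStreamConstants.CHARACTERS:
                case XMLStreamConstants.CDATA:
                    if (inHeader > 0) current.append(parser.getText());
                    break;
                default:
                    break;
            }
        }
        parser.close();
        return headings;
    }

    static boolean isHeader(String name) {
        return name.matches("h[1-6]");
    }

    public static void main(String[] args) throws XMLStreamException {
        String page = "<html><body><h1>Top <em>10</em></h1><p>skip</p>"
                + "<h2>By Gartner</h2></body></html>";
        System.out.println(outline(page));
    }
}
```

Text inside inline markup such as em is kept because the depth counter only changes on header elements, which mirrors the inHeader counter in the slides.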

XMLStreamWriter
    public interface XMLStreamWriter {
        public void writeStartElement(String localName)
            throws XMLStreamException;
        public void writeEndElement()
            throws XMLStreamException;
        public void writeCharacters(String text)
            throws XMLStreamException;
        // ... other methods not shown
    }
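A minimal sketch of these three methods in use, writing to a String instead of a file. The class WriterSketch and the element names nation and name are illustrative assumptions; they are not taken from the slides.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper class, named for this sketch only.
class WriterSketch {

    // Write a two-level document; every start element needs a matching end.
    static String writeNation(String name) throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        writer.writeStartElement("nation");
        writer.writeStartElement("name");
        writer.writeCharacters(name);   // the writer escapes markup characters
        writer.writeEndElement();       // closes name
        writer.writeEndElement();       // closes nation
        writer.flush();
        writer.close();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(writeNation("Thailand"));
    }
}
```

writeEndElement() takes no argument because the writer tracks the open elements itself, so the calls must simply be balanced.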

Writer1 (1/4)
    package staxtutorial;

    import java.io.*;
    import javax.xml.stream.XMLOutputFactory;
    import javax.xml.stream.XMLStreamWriter;

    public class Writer1 {
        public static void main(String[] args) {
            try {
                // output file name
                String fileName = "nation.xml";
