
Tasks of a Parser, Document Parser Interfaces



2. XML Processor APIs

• How can (Java) applications manipulate structured (XML) documents?
  - An overview of XML processor interfaces
      2.1 SAX: an event-based interface
      2.2 DOM: an object-based interface
      2.3 JAXP: Java API for XML Processing

XPT 2006 XML APIs: SAX

Document Parser Interfaces

(See, e.g., Leventhal, Lewis & Fuchs: Designing XML Internet Applications, Chapter 10, and D. Megginson: Events vs. Trees [online])

• Every XML application contains some kind of a parser
  - editors, browsers
  - transformation/style engines, DB loaders, ...
• XML parsers have become standard tools of application development frameworks
  - JDK 1.4 contains JAXP, with its default parser (Apache Crimson)

Tasks of a Parser

• Document instance decomposition
  - elements, attributes, text, processing instructions, entities, ...
• Verification
  - well-formedness checking
    » syntactical correctness of XML markup
  - validation (against a DTD or Schema; optional)
• Access to contents of the DTD (if supported)
  - SAX 2.0 Extensions provide info of declarations: element type names and their content model expressions

Document Parser Interfaces

I: Event-based interfaces
  - Command line and ESIS interfaces
    » Element Structure Information Set, traditional interface to stand-alone SGML parsers
  - Event call-back interfaces: SAX
II: Tree-based (object model) interfaces
  - W3C DOM Recommendation
  - Java-specific object models: JAXB, JDOM, dom4J

Command-line ESIS interface

[Figure: a command line call starts the SGML/XML parser, which reads the input <E i="1"> Hi! </E> and passes an ESIS stream to the application. The ESIS stream shown is: Ai CDATA 1 / (E / -Hi! / )E]

Event Call-Back Interfaces

• Application implements a set of call-back methods for handling parse events
  - parser notifies the application by method calls
  - qualified further by parameters:
    » element type name
    » names and values of attributes
    » values of content strings, …
• Idea behind "SAX" (Simple API for XML)
  - an industry standard API for XML parsers
  - could think as "Serial Access XML"

An event call-back application

[Figure: the application main routine calls Parse(); while reading the input <?xml version='1.0'?> <A i="1"> Hi! </A>, the parser invokes the callback routines startDocument(), startElement() with "A",[i="1"], characters() with "Hi!", and endElement() with "A"]

Object Model Interfaces

• The parser builds ...
  - a document object consisting of sub-objects such as document, elements, attributes, text, …
• Abstraction level higher than in event based interfaces; more powerful access
  - to descendants, following siblings, …
• Drawback: Higher memory consumption
  - > used mainly in client applications (to implement document manipulation by user)
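
The two interface styles described in the slides can be tried side by side with JAXP. The following is only a minimal sketch, not code from the slides: the class name ParserInterfacesDemo and the helper methods saxEvents and domSummary are hypothetical, and the input is the small <A i="1"> Hi! </A> document from the call-back figure. It assumes a JDK whose JAXP implementation supports DOM Level 3 (for getTextContent). Because a SAX parser may deliver character data in several chunks, the handler buffers text and reports it at the next element boundary.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of the original slides.
class ParserInterfacesDemo {

    // Event-based access: record every call-back the SAX parser makes.
    static List<String> saxEvents(String xml) throws Exception {
        List<String> events = new ArrayList<>();
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            // Report buffered character data, ignoring whitespace-only runs.
            private void flushText() {
                String s = text.toString().trim();
                if (!s.isEmpty()) events.add("characters(" + s + ")");
                text.setLength(0);
            }
            @Override public void startDocument() {
                events.add("startDocument()");
            }
            @Override public void startElement(String uri, String localName,
                                               String qName, Attributes atts) {
                flushText();
                StringBuilder sb = new StringBuilder("startElement(" + qName);
                for (int i = 0; i < atts.getLength(); i++) {
                    sb.append(", ").append(atts.getQName(i))
                      .append("=").append(atts.getValue(i));
                }
                events.add(sb.append(")").toString());
            }
            @Override public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length); // may arrive in several chunks
            }
            @Override public void endElement(String uri, String localName,
                                             String qName) {
                flushText();
                events.add("endElement(" + qName + ")");
            }
            @Override public void endDocument() {
                events.add("endDocument()");
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return events;
    }

    // Tree-based access: the parser builds a Document object first,
    // and the application then navigates it.
    static String domSummary(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Element root = doc.getDocumentElement();
        return root.getTagName() + "[i=" + root.getAttribute("i") + "]: "
                + root.getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version='1.0'?><A i=\"1\"> Hi! </A>";
        saxEvents(xml).forEach(System.out::println);
        System.out.println(domSummary(xml));
    }
}
```

In the SAX half the application never holds the whole document; it only sees one event at a time, which keeps memory use low. The DOM half pays for building the full tree but can then reach any node directly, which matches the trade-off noted under Object Model Interfaces.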
