SLIDE 1

XML-9 J. Teuhola 2013 151

9. Path expressions: XPath

  • XPath is a language for selecting parts of XML documents; it is a kind of simple query language.
  • XPath does not use normal XML syntax; as far as the XML parser is concerned, path expressions are plain strings.
  • XPath is a tree traversal ('navigation') language, with techniques for moving from one node to another in the document tree, and for restricting the set of selected nodes.
  • XPath is used e.g. by XML Schema, XSLT, XLink, XPointer, XForms, ...

SLIDE 2

XML-9 J. Teuhola 2013 152

Node types in the document tree

  • Root node (with special interpretation)
  • Element nodes
  • Text nodes
  • Attribute nodes
  • Processing instruction nodes
  • Namespace nodes (≠ normal attributes)
  • Comment nodes
  • Note: the DTD, CDATA sections, and entity references are assumed to have been resolved and merged into the document before XPath is applied to the document tree.

SLIDE 3

XML-9 J. Teuhola 2013 153

Path expressions

  • Called location paths.
  • One path identifies zero, one or more nodes (attributes, elements, etc.) in the document tree.
  • A location path consists of location steps.
  • Each step continues from a context node, produced by the previous step (e.g. match).
  • Path notation resembles Unix directory paths:
    – Root = "/"; it has as children the actual root element and the stuff before the root element (processing instructions and comments).
    – Paths can be absolute (starting from the root, "/...") or relative to the current subtree.
SLIDE 4

XML-9 J. Teuhola 2013 154

Location step types

  • An element name moves from the context node to the child elements with the given name.
  • An attribute name, prefixed by '@', selects the named attribute of the context node.
  • text() matches the text nodes within the context node; each text node is a maximal run of character data.
  • comment() matches the comment nodes under the context node.
  • processing-instruction() matches the processing instructions under the context node.

SLIDE 5

XML-9 J. Teuhola 2013 155

Example document: course list

<?xml version="1.0"?>
<courses>
  <course cname="Advanced databases">
    <teacher>Jukka</teacher>
    <audience>
      <student name="Pekka"/>
      <student name="Pirkko"/>
    </audience>
  </course>
  <course cname="Medical informatics">
    <teacher>Timo</teacher>
    <audience>
      <student name="Pekka"/>
      <student name="Paula"/>
    </audience>
  </course>
</courses>

SLIDE 6

XML-9 J. Teuhola 2013 156

Element and attribute location steps for listing course names with XSLT

<?xml ... ?>
<xsl:stylesheet ... >
  <!-- Start with an absolute location step -->
  <xsl:template match="/courses">
    <html>
      <head><title>XPath test</title></head>
      <!-- Perform a location step relative to '/courses' -->
      <body>
        <xsl:apply-templates select="course"/>
      </body>
    </html>
  </xsl:template>
  <xsl:template match="course">
    <!-- Location step relative to 'course' -->
    <xsl:value-of select="@cname"/>
  </xsl:template>
</xsl:stylesheet>

Result: Advanced databases Medical informatics
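The same two location steps can also be tried outside XSLT. What follows is a minimal sketch in Python, assuming the standard-library module xml.etree.ElementTree, which implements only a subset of XPath. The element step 'course' is passed to findall(); the attribute step '@cname' is emulated with get(), because ElementTree cannot return attribute nodes. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

COURSES_XML = """<?xml version="1.0"?>
<courses>
  <course cname="Advanced databases">
    <teacher>Jukka</teacher>
    <audience>
      <student name="Pekka"/>
      <student name="Pirkko"/>
    </audience>
  </course>
  <course cname="Medical informatics">
    <teacher>Timo</teacher>
    <audience>
      <student name="Pekka"/>
      <student name="Paula"/>
    </audience>
  </course>
</courses>"""

# fromstring() returns the root element <courses>; it plays the role of
# the context node selected by match="/courses" in the stylesheet.
courses = ET.fromstring(COURSES_XML)

# Element location step 'course': child elements with that name.
course_elements = courses.findall("course")

# Attribute location step '@cname', emulated with Element.get().
course_names = [course.get("cname") for course in course_elements]

print(course_names)
```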

SLIDE 7

XML-9 J. Teuhola 2013 157

Extensions of location steps

  • Wildcards:
    – '*' matches any element node that is a child of the context node, irrespective of the name.
    – 'prefix:*' matches all child elements in the given namespace.
    – 'node()' matches all child nodes of any type; attributes are matched with '@node()'.
    – '@*' matches all attribute nodes.
    – '@prefix:*' matches all attribute nodes in the given namespace.
  • Alternatives: separated by '|'.
    – 'a|b' matches all child elements with name a or b.

SLIDE 8

XML-9 J. Teuhola 2013 158

Example: Print child elements & attributes

Document:

<?xml version="1.0" encoding="UTF-8"?>
<example type="small" name="node printing example">
  <greeting>Hello world!</greeting>
</example>

Stylesheet:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="…">
  <xsl:template match="/example">
    <html>
      <head />
      <body>
        <xsl:for-each select="*|@*">
          <xsl:value-of select="."/> <br />
        </xsl:for-each>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>

Result (browser view): small node printing example Hello world!
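The wildcard selection '*|@*' can be approximated with the Python standard library. This is only a sketch under assumptions: xml.etree.ElementTree supports the '*' step but has neither an attribute step nor the '|' operator, so the attribute dictionary is read directly and the two lists are concatenated by hand. The variable names are illustrative.

```python
import xml.etree.ElementTree as ET

EXAMPLE_XML = """<example type="small" name="node printing example">
  <greeting>Hello world!</greeting>
</example>"""

# The root element <example> acts as the context node.
example = ET.fromstring(EXAMPLE_XML)

# '*': every child element of the context node, whatever its name.
child_values = [child.text for child in example.findall("*")]

# '@*': every attribute of the context node. ElementTree has no
# attribute step, so the attribute dictionary is read directly.
attribute_values = list(example.attrib.values())

# '*|@*': union of the two selections (concatenated manually here).
selected_values = attribute_values + child_values

print(selected_values)
```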

SLIDE 9

XML-9 J. Teuhola 2013 159

Combining location steps

  • Combining operator: '/' (cf. disk directory paths)
  • In a compound path 'a/b/c/...' step a is evaluated with respect to the current context node; for b, c, ... the context is the result of the previous step.
  • At all steps, the result is a set of nodes.
  • Special cases:
    – Starting from the root: '/a/b/c/...'
    – Selecting all descendants: '//'
    – Selecting the context node: '.'
    – Selecting the parent of the context node: '..'
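The compound path and the special cases listed above can be sketched in Python, assuming the standard-library module xml.etree.ElementTree and the course list document. ElementTree covers only part of XPath: it evaluates paths relative to an element, so the absolute form '/a/b/c' is not shown, and all variable names are illustrative.

```python
import xml.etree.ElementTree as ET

COURSES_XML = """<?xml version="1.0"?>
<courses>
  <course cname="Advanced databases">
    <teacher>Jukka</teacher>
    <audience>
      <student name="Pekka"/>
      <student name="Pirkko"/>
    </audience>
  </course>
  <course cname="Medical informatics">
    <teacher>Timo</teacher>
    <audience>
      <student name="Pekka"/>
      <student name="Paula"/>
    </audience>
  </course>
</courses>"""

# Context node: the root element <courses>.
courses = ET.fromstring(COURSES_XML)

# Compound path: each step starts from the nodes found by the previous one.
students = courses.findall("course/audience/student")
student_names = [student.get("name") for student in students]

# './/student': all <student> descendants of the context node.
all_students = courses.findall(".//student")

# './course' and 'course' are the same selection; '.' is the context node.
same_selection = courses.findall("./course") == courses.findall("course")

# '..': parent step, here the parents of all <student> elements.
parent_tags = [parent.tag for parent in courses.findall(".//student/..")]

print(student_names, len(all_students), same_selection, parent_tags)
```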

SLIDE 10

XML-9 J. Teuhola 2013 160

Example: compound path for listing students of all courses

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html><body>
      <xsl:for-each select="courses/course/audience/student">
        <xsl:value-of select="@name"/> <br />
      </xsl:for-each>
    </body></html>
  </xsl:template>
</xsl:stylesheet>

Result (browser view): Pekka Pirkko Pekka Paula

SLIDE 11

XML-9 J. Teuhola 2013 161

Selection conditions

  • At any location step, one can restrict the selected set by giving a predicate.
  • The predicate is a Boolean expression in [ ... ]. Only nodes satisfying it are selected.
  • The predicate may contain normal comparison and Boolean operators.
  • The operands may be arbitrary XPath expressions, evaluated with respect to the context node.
  • Other data types can be interpreted as Boolean values by type casting (function boolean()).

SLIDE 12

XML-9 J. Teuhola 2013 162

Example path with conditions: students of 'Medical informatics'

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html><body>
      <xsl:for-each select="courses/course[@cname='Medical informatics']/audience/student">
        <xsl:value-of select="@name"/>
      </xsl:for-each>
    </body></html>
  </xsl:template>
</xsl:stylesheet>

Result (browser view): Pekka Paula
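The predicate in this path can be checked with a short Python sketch, assuming the standard-library module xml.etree.ElementTree. Its XPath subset happens to include the attribute-equality predicate used here and a child-text predicate of the form [tag='text']; general Boolean expressions are not supported. The variable names are illustrative.

```python
import xml.etree.ElementTree as ET

COURSES_XML = """<?xml version="1.0"?>
<courses>
  <course cname="Advanced databases">
    <teacher>Jukka</teacher>
    <audience>
      <student name="Pekka"/>
      <student name="Pirkko"/>
    </audience>
  </course>
  <course cname="Medical informatics">
    <teacher>Timo</teacher>
    <audience>
      <student name="Pekka"/>
      <student name="Paula"/>
    </audience>
  </course>
</courses>"""

courses = ET.fromstring(COURSES_XML)

# Only <course> elements satisfying the predicate survive this step;
# the remaining steps continue from those courses alone.
path = "course[@cname='Medical informatics']/audience/student"
student_names = [student.get("name") for student in courses.findall(path)]

# A predicate can also test the text of a child element.
timo_courses = [c.get("cname") for c in courses.findall("course[teacher='Timo']")]

print(student_names, timo_courses)
```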

SLIDE 13

XML-9 J. Teuhola 2013 163

Unabbreviated notation

  • The 'direction' of location can be expressed by a so-called axis.
  • Syntax: 'axis::node'.
  • More powerful than the abbreviated syntax, but not much used.
  • Axis types:
    – parent, child
    – descendant, descendant-or-self
    – ancestor, ancestor-or-self
    – following, preceding (in document order)
    – following-sibling, preceding-sibling (in document order)
    – attribute
    – self

SLIDE 14

XML-9 J. Teuhola 2013 164

Example of explicit axis notations: navigate to students in the ‘courses’-tree

<!-- See slide 155 for content of the 'courses' document -->
<xsl:template match="/child::courses">
  <html>
    <body>
      <!-- Navigate to the 'audience' node -->
      <xsl:for-each select="child::course/child::teacher/following-sibling::*">
        <!-- Scan the students of the audience -->
        <xsl:for-each select="child::student">
          <!-- Pick up the names of students -->
          <xsl:value-of select="attribute::name"/><br/>
        </xsl:for-each>
      </xsl:for-each>
    </body>
  </html>
</xsl:template>

Result (browser view): Pekka Pirkko Pekka Paula
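To make the axis navigation concrete, here is a rough Python sketch that walks the same route by hand. It assumes the standard-library module xml.etree.ElementTree, which does not implement the axis syntax; the function following_siblings is a hypothetical helper written for this illustration, not part of any library.

```python
import xml.etree.ElementTree as ET

COURSES_XML = """<?xml version="1.0"?>
<courses>
  <course cname="Advanced databases">
    <teacher>Jukka</teacher>
    <audience>
      <student name="Pekka"/>
      <student name="Pirkko"/>
    </audience>
  </course>
  <course cname="Medical informatics">
    <teacher>Timo</teacher>
    <audience>
      <student name="Pekka"/>
      <student name="Paula"/>
    </audience>
  </course>
</courses>"""


def following_siblings(parent, node):
    """Hypothetical helper: emulate 'following-sibling::*' for one node."""
    siblings = list(parent)
    return siblings[siblings.index(node) + 1:]


courses = ET.fromstring(COURSES_XML)
names = []
for course in courses.findall("course"):                         # child::course
    for teacher in course.findall("teacher"):                    # child::teacher
        for sibling in following_siblings(course, teacher):      # following-sibling::*
            for student in sibling.findall("student"):           # child::student
                names.append(student.get("name"))                # attribute::name

print(names)
```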

SLIDE 15

XML-9 J. Teuhola 2013 165

Other types of XPath expressions

  • The expressions above are of type nodeset; the other types are:
  • Numbers:
    – double-precision (8-byte) floating-point numbers; used also for integers,
    – normal arithmetic operations (+, -, *, div, mod).
  • Strings:
    – sequences of Unicode characters; some syntactic restrictions depending on the context,
    – (in)equality comparison available (=, !=); less/greater meaningful only for numeric strings.
  • Booleans:
    – Results from comparisons and Boolean operations.
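Since XPath numbers are doubles, 'div' is floating-point division, and 'mod' takes the remainder of a truncating division, so the result has the sign of the first operand. The sketch below mirrors this in Python; xpath_div and xpath_mod are hypothetical helper names, and division by zero (which XPath evaluates to an infinity or NaN) is deliberately left out.

```python
import math


def xpath_div(a, b):
    """Mirror 'a div b': floating-point division, also for integers."""
    return float(a) / float(b)


def xpath_mod(a, b):
    """Mirror 'a mod b': remainder of a truncating division."""
    # math.fmod keeps the sign of the first operand, unlike Python's %.
    return math.fmod(a, b)


print(xpath_div(7, 2))    # 7 div 2
print(xpath_mod(5, 2))    # 5 mod 2
print(xpath_mod(-5, 2))   # -5 mod 2
print(xpath_mod(5, -2))   # 5 mod -2
```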

SLIDE 16

XML-9 J. Teuhola 2013 166

XPath functions for nodesets

  • position(): the position of the current node within the context node list (counted from 1); used mainly in XSLT template rules.
  • last(): number of nodes in the context node list.
  • count(x): number of nodes in the argument nodeset.
  • id(x): nodeset of the elements whose IDs are given in the argument.
  • local-name(x): local part of the name (i.e. without the namespace prefix) of the first node in the argument.
  • namespace-uri(x): as above, but the namespace URI is returned.
  • name(x): returns the prefixed name of the first node in the argument.

SLIDE 17

XML-9 J. Teuhola 2013 167

Example of XPath functions: Course numbers & names (see p. 155)

<xsl:stylesheet version="1.0" xmlns:xsl="…">
  <xsl:template match="/courses">
    <html>
      <head>
        <title>XPath function test</title>
      </head>
      <body>
        <xsl:apply-templates select="course"/>
      </body>
    </html>
  </xsl:template>
  <xsl:template match="course">
    Course <xsl:value-of select="position()"/>
    of <xsl:value-of select="count(/courses/course)"/>
    is named <xsl:value-of select="@cname"/> <br/>
  </xsl:template>
</xsl:stylesheet>

Result (browser view):

Course 1 of 2 is named Advanced databases
Course 2 of 2 is named Medical informatics
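The roles of position() and count() in this example can be imitated in Python. This is a sketch only, assuming the standard-library module xml.etree.ElementTree: enumerate() stands in for position(), len() for count(), and the positional predicates [1] and [last()] are among the few that ElementTree accepts. The variable names are illustrative.

```python
import xml.etree.ElementTree as ET

COURSES_XML = """<?xml version="1.0"?>
<courses>
  <course cname="Advanced databases">
    <teacher>Jukka</teacher>
  </course>
  <course cname="Medical informatics">
    <teacher>Timo</teacher>
  </course>
</courses>"""

courses = ET.fromstring(COURSES_XML)
course_list = courses.findall("course")

# count(/courses/course): size of the selected nodeset.
total = len(course_list)

# position() counts from 1, hence start=1.
lines = [
    f"Course {position} of {total} is named {course.get('cname')}"
    for position, course in enumerate(course_list, start=1)
]

# Positional predicates: the first and the last <course> child.
first_course = courses.find("course[1]")
last_course = courses.find("course[last()]")

print("\n".join(lines))
```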

SLIDE 18

XML-9 J. Teuhola 2013 168

XPath functions for strings

  • string(x) converts the argument x (any type) to a string.
  • starts-with(a, b) is true if a starts with b.
  • contains(a, b) is true if b is a substring of a.
  • substring(s, i, n) returns n characters from s starting from position i (positions are counted from 1).
  • substring-before(a, b) returns the substring of a occurring before the first occurrence of b in a.
  • substring-after(a, b): as above but the suffix of argument a is returned.
  • string-length(s) returns the length of s.
  • normalize-space(s) trims leading and trailing whitespace and collapses internal runs of whitespace into single spaces.
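The behaviour of some of these functions can be mirrored with plain Python helpers. The functions below are hypothetical stand-ins written for illustration, not a real XPath engine; substring() is restricted to integer arguments here, so the rounding rules for fractional positions are left out.

```python
import re


def substring(s, i, n):
    """Mirror substring(s, i, n) for integer i and n; positions start at 1."""
    start = max(i, 1)
    end = max(i + n, 1)  # first position that is no longer included
    return s[start - 1:end - 1]


def substring_before(a, b):
    """Part of a before the first occurrence of b; empty if b is absent."""
    index = a.find(b)
    return a[:index] if index >= 0 else ""


def substring_after(a, b):
    """Part of a after the first occurrence of b; empty if b is absent."""
    index = a.find(b)
    return a[index + len(b):] if index >= 0 else ""


def normalize_space(s):
    """Collapse runs of space, tab, CR and LF; trim both ends."""
    return re.sub(r"[ \t\r\n]+", " ", s).strip(" ")


print(substring("12345", 2, 3))
print(substring_before("1999/04/01", "/"))
print(substring_after("1999/04/01", "/"))
print(normalize_space("  Medical   informatics "))
```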
SLIDE 19

XML-9 J. Teuhola 2013 169

Example: Courses with the teacher name starting with ‘T’ (see p. 155)

<xsl:stylesheet version="1.0" xmlns:xsl="…">
  <xsl:template match="/courses">
    <html>
      <head>
        <title>String function test</title>
      </head>
      <body>
        Courses with teacher name starting with &quot;T&quot;:<br />
        <xsl:for-each select="course[starts-with(teacher, 'T')]">
          <xsl:value-of select="teacher" />: <xsl:value-of select="@cname" />
        </xsl:for-each>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>

Result (browser view): Courses with teacher name starting with "T": Timo: Medical informatics

SLIDE 20

XML-9 J. Teuhola 2013 170

Other XPath functions

  • Numeric functions:
    – round(x), floor(x), ceiling(x): obvious meanings.
    – number(x) converts the argument to a number; a string that is not a valid number gives NaN, true gives 1 and false gives 0.
    – sum(x) converts the string value of each node in a nodeset to a number and computes their sum.
  • Boolean functions:
    – constant functions true() and false().
    – not(x) negates its Boolean argument.
    – boolean(x) converts x to a Boolean: zero and NaN give false, as do an empty nodeset and an empty string; everything else gives true.
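The conversion rules of number(), boolean() and sum() can be written out as a small Python sketch. The helpers xpath_number, xpath_boolean and xpath_sum are hypothetical names introduced for illustration; a nodeset is modelled here simply as a Python list (of string values, in the case of sum).

```python
import math
import re

# Number syntax for strings: optional whitespace, optional minus, digits
# with an optional fraction, optional whitespace.
_NUMBER = re.compile(r"^[ \t\r\n]*-?(\d+(\.\d*)?|\.\d+)[ \t\r\n]*$")


def xpath_number(value):
    """Mirror number(x) for Booleans, numbers and strings."""
    if isinstance(value, bool):
        return 1.0 if value else 0.0
    if isinstance(value, (int, float)):
        return float(value)
    return float(value) if _NUMBER.match(value) else math.nan


def xpath_boolean(value):
    """Mirror boolean(x); strings and lists are true when non-empty."""
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        return not (value == 0 or math.isnan(value))
    return len(value) > 0


def xpath_sum(string_values):
    """Mirror sum(x) over the string values of a nodeset."""
    return sum(xpath_number(v) for v in string_values)


print(xpath_boolean(0), xpath_boolean(""), xpath_boolean([]))
print(xpath_boolean("false"))  # a non-empty string counts as true
print(xpath_number(" 12 "), xpath_number("abc"))
print(xpath_sum(["1", "2.5"]))
```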