XML Path Language (XPath) 2.0

XML Foundations (INFO 242)

Erik Wilde, UC Berkeley School of Information
2007-10-09
This work is licensed under a Creative Commons Attribution 3.0 Unported License.

Abstract

The XML Path Language (XPath) is one of the most useful and frequently used languages in the area of XML technologies. Version 1.0 is used in technologies such as XSLT, XSDL, DOM, and XML Tools. XPath 2.0 greatly extends the language and is the foundation for XSLT 2.0 and XQuery. It supports regular expression matching and typed expressions, and it contains language constructs for conditional and repeated evaluation.

Outline (Why XPath?)

  1. Why XPath?
  2. How XPath Works
    1. The XPath Tree Model
    2. XPath Evaluation
  3. XPath 1.0 Revisited
  4. Ease of Use
    1. Conditional Expressions
    2. Iterations
    3. Quantified Expressions
  5. Sequences
  6. Applications
  7. Conclusions

Selecting Parts of XML Documents

Making Selection Reusable

How XPath Evolved

Outline (How XPath Works)

Outline (The XPath Tree Model)

Starting from the Infoset

What is Not in the XPath Tree

Outline (XPath Evaluation)

Tree In / Selection Out

Outline (XPath 1.0 Revisited)

Source Document

 <body>
  <div class="header">
   <h1><a href="http://dret.net/lectures/publishing-spring07/">Web-Based Publishing</a> – Class List</h1>
   <h2><a href="http://www.berkeley.edu/" title="UC Berkeley">UCB</a> <a href="http://ischool.berkeley.edu/" title="School of Information">iSchool</a> – Spring 2007</h2>
  </div>
  <ul>
   <li id="jeff">Jeff Decker</li>
   <li id="michael">Michael Lee</li>
   <li id="yiming">Yiming Liu</li>
   <li id="matty">Matthew Ochmanek</li>
   <li id="igor">Igor Pesenson</li>
   <li id="ryan">Ryan Shaw</li>
   <li id="libby">Libby Smith</li>
   <li id="john">John Ward</li>
   <li id="lois">Lois Wei</li>
   <li id="dret">Erik Wilde</li>
  </ul>
 </body>

XPath Expressions
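
As a rough illustration, the sketch below evaluates a few XPath 1.0 location paths against the source document shown above. It assumes Python's standard `xml.etree.ElementTree` module, which implements only a limited subset of XPath 1.0 (no function library and only a few axes), so the paths are restricted to what that subset accepts. The variable names are illustrative and not part of the lecture.

```python
import xml.etree.ElementTree as ET

# The class list from the "Source Document" slide.
SOURCE = """<body>
 <div class="header">
  <h1><a href="http://dret.net/lectures/publishing-spring07/">Web-Based Publishing</a> – Class List</h1>
  <h2><a href="http://www.berkeley.edu/" title="UC Berkeley">UCB</a> <a href="http://ischool.berkeley.edu/" title="School of Information">iSchool</a> – Spring 2007</h2>
 </div>
 <ul>
  <li id="jeff">Jeff Decker</li>
  <li id="michael">Michael Lee</li>
  <li id="yiming">Yiming Liu</li>
  <li id="matty">Matthew Ochmanek</li>
  <li id="igor">Igor Pesenson</li>
  <li id="ryan">Ryan Shaw</li>
  <li id="libby">Libby Smith</li>
  <li id="john">John Ward</li>
  <li id="lois">Lois Wei</li>
  <li id="dret">Erik Wilde</li>
 </ul>
</body>"""

root = ET.fromstring(SOURCE)  # context node: the <body> element

# Child steps: every <li> inside the <ul>.
students = [li.text for li in root.findall("./ul/li")]

# Attribute-value predicate: the <li> whose id is 'dret'.
instructor = root.find(".//li[@id='dret']").text

# Descendant search with an attribute-existence predicate: links that carry a title.
titled_links = [a.get("href") for a in root.findall(".//a[@title]")]

# Positional predicates: first and last list item.
first_item = root.find("./ul/li[1]").text
last_item = root.find("./ul/li[last()]").text

print(len(students), instructor, titled_links, first_item, last_item)
```

A full XPath processor would accept the same paths plus the remaining axes and the function library; the subset used here is only meant to make the selection idea concrete.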

Axes

Figure: XPath axes (xpath-axes.png)

Outline (Ease of Use)

Easier to Understand

<listing src="xlinked-class.xml" line="81-98"/>
string-join(
  tokenize(
    if ( exists(@encoding) )
      then unparsed-text($fileuri, @encoding)
      else unparsed-text($fileuri),
    '\r?\n'
  )[ (position() ge number(tokenize(current()/@line, '\-')[1])) and
     (position() le number(tokenize(current()/@line, '\-')[2])) ],
  '&#xa;'
)

Outline (Conditional Expressions)

Control Flow in XPath

if ( … ) then … else …
if ( @sex eq 'm' ) then 'Sir' else 'Madam'
if ( @sex eq 'm' ) then 'Sir' else if ( @sex eq 'f' ) then 'Madam' else 'Whatever'
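
To make the branching behavior concrete, here is a minimal sketch that mirrors the nested conditional in plain Python. Python's standard library does not evaluate XPath 2.0, so this is an imitation of the semantics rather than an XPath engine, and the `<people>` input is hypothetical (the slides do not show the document behind `@sex`).

```python
import xml.etree.ElementTree as ET

# Hypothetical input document; only the sex attribute matters here.
PEOPLE = """<people>
 <person sex="m"/>
 <person sex="f"/>
 <person/>
</people>"""


def salutation(person):
    """Mirror of: if (@sex eq 'm') then 'Sir' else if (@sex eq 'f') then 'Madam' else 'Whatever'"""
    sex = person.get("sex")  # None when the attribute is absent
    if sex == "m":
        return "Sir"
    elif sex == "f":
        return "Madam"
    else:
        return "Whatever"


greetings = [salutation(p) for p in ET.fromstring(PEOPLE).findall("person")]
print(greetings)
```

The third `<person>` has no `sex` attribute. In XPath 2.0 the value comparison `@sex eq 'm'` then yields the empty sequence, whose effective boolean value is false, so evaluation falls through to the final `else`, which is what the `None` case imitates.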

Less XSLT

<names>
 <name>
  <first>Erik</first>
  <last>Wilde</last>
 </name>
 <name>
  <last>Hasan</last>
 </name>
</names>
first | last[not(../first)]
<xsl:variable name="name">
	<xsl:choose>
		<xsl:when test="first">
			<xsl:value-of select="first"/>
		</xsl:when>
		<xsl:otherwise>
			<xsl:value-of select="last"/>
		</xsl:otherwise>
	</xsl:choose>
</xsl:variable>
if ( exists(first) ) then first else last

Outline (Iterations)

Repeating Expression Evaluation

for $… in … return …
for $i in //name return $i/last
for $i in //name return if ( exists($i/first) ) then $i/first else $i/last
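
The two `for` expressions can be imitated with Python comprehensions over the `<names>` document from the "Less XSLT" slide. This is a sketch of the semantics only, assuming the standard `xml.etree.ElementTree` module (which cannot evaluate XPath 2.0 itself); the helper `first_or_last` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

NAMES = """<names>
 <name>
  <first>Erik</first>
  <last>Wilde</last>
 </name>
 <name>
  <last>Hasan</last>
 </name>
</names>"""

root = ET.fromstring(NAMES)

# for $i in //name return $i/last
# Each iteration may contribute zero or more nodes; the results are concatenated.
last_names = [last.text for name in root.iter("name") for last in name.findall("last")]


# for $i in //name return if ( exists($i/first) ) then $i/first else $i/last
def first_or_last(name):
    first = name.findall("first")
    # An empty list plays the role of the empty sequence.
    return first if first else name.findall("last")


preferred = [node.text for name in root.iter("name") for node in first_or_last(name)]

print(last_names, preferred)
```

The nested comprehension reflects that the result of a `for` expression is one flat sequence built from the per-item results, not a sequence of sequences.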

Iterations vs. Location Paths

Outline (Quantified Expressions)

Testing Sequences

( some | every ) $… in … satisfies …
some $i in //*[@xlink:type='locator']/@xlink:href satisfies $i eq $query-uri
every $i in //li/@id satisfies //*[@xlink:type='locator'][@xlink:href=concat('#', $i)]
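
The quantifiers correspond closely to Python's `any` and `all`, which the following sketch uses to imitate both expressions. The input document, its element names, and the value of `query_uri` are hypothetical; only the `xlink:type` and `xlink:href` attributes and the `li/@id` values are taken from the expressions above.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: two list items and one XLink locator per item.
DOC = """<class xmlns:xlink="http://www.w3.org/1999/xlink">
 <ul>
  <li id="jeff">Jeff Decker</li>
  <li id="dret">Erik Wilde</li>
 </ul>
 <links>
  <loc xlink:type="locator" xlink:href="#jeff"/>
  <loc xlink:type="locator" xlink:href="#dret"/>
 </links>
</class>"""

XLINK = "{http://www.w3.org/1999/xlink}"  # ElementTree spells namespaced names as {uri}local
root = ET.fromstring(DOC)

# //*[@xlink:type='locator']/@xlink:href
hrefs = [e.get(XLINK + "href") for e in root.iter()
         if e.get(XLINK + "type") == "locator" and e.get(XLINK + "href") is not None]

query_uri = "#dret"  # assumed value of $query-uri

# some $i in ... satisfies $i eq $query-uri
some_match = any(h == query_uri for h in hrefs)

# every $i in //li/@id satisfies //*[@xlink:type='locator'][@xlink:href=concat('#', $i)]
ids = [li.get("id") for li in root.iter("li") if li.get("id") is not None]
every_li_linked = all("#" + i in hrefs for i in ids)

print(some_match, every_li_linked)
```

As in XPath 2.0, the empty case behaves asymmetrically: `any([])` is false just as `some` over an empty sequence is false, and `all([])` is true just as `every` over an empty sequence is true.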

Outline (Sequences)

Major Changes

every $i in ( 11, 22, 33, 'string' ) satisfies string(number($i)) ne 'NaN'
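
A sequence may mix item types, which is why the expression above evaluates to false: `number('string')` is `NaN`. The sketch below imitates this in Python. The helper `xpath_number` is hypothetical and only approximates `number()`, since Python's `float` accepts some lexical forms that XPath does not.

```python
import math


def xpath_number(value):
    """Rough stand-in for XPath number(): NaN instead of an error on bad input."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return float("nan")


items = [11, 22, 33, "string"]  # a heterogeneous sequence

# every $i in ( 11, 22, 33, 'string' ) satisfies string(number($i)) ne 'NaN'
all_numeric = all(not math.isnan(xpath_number(i)) for i in items)

print(all_numeric)
```

Dropping `'string'` from the sequence makes the quantified expression true, which shows that a single non-numeric item is enough to fail an `every` test.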

Divide and Conquer

Outline (Applications)

Standalone

for $i in ( 11, 22, 33, 'string' ) return ($i, number($i))
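
The expression above needs no input document, and its result illustrates that sequences are always flat: each iteration returns a two-item sequence, and the overall result is one sequence of eight items. A minimal Python imitation follows, with `xpath_number` as a hypothetical approximation of `number()`.

```python
import math


def xpath_number(value):
    """Rough stand-in for XPath number(): NaN instead of an error on bad input."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return float("nan")


# for $i in ( 11, 22, 33, 'string' ) return ($i, number($i))
result = []
for i in (11, 22, 33, "string"):
    result.extend((i, xpath_number(i)))  # extend, not append: no nested sequences

print(result)
```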

XQuery

declare variable $firstName external;
<videos featuring="{$firstName}"> {
  let $doc := .
  for $v in $doc//video, $a in $doc//actors/actor
  where ends-with($a, $firstName) and $v/actorRef = $a/@id
  order by $v/year
  return
    <video year="{$v/year}"> { $v/title } </video> }
</videos>

XSLT 2.0

Outline (Conclusions)

Easy Transition