XML Transformations (XSLT) – Part IV

XML Foundations
Fall 2011 — INFO 242 (CCN 42596)

Ray Larson, UC Berkeley School of Information
2011-11-03

This work is licensed under a CC Attribution 3.0 Unported License [http://creativecommons.org/licenses/by/3.0/]

Contents R. Larson: XML Transformations (XSLT) – Part IV

(2) Abstract

Advanced XSLT processing gives finer control over input and output documents, including how whitespace is treated. Keys are another useful feature of XSLT: they provide a shorthand notation for frequently used access paths to nodes and give XSLT processors more information for performance optimizations. Instructions for creating every kind of node in the result tree make it possible to write code that generates element or attribute names based on runtime evaluation.



Controlling Documents

Outline (Controlling Documents)

  1. Controlling Documents [6]
    1. Input Documents [3]
    2. Output Documents [2]
  2. Keys [6]
  3. Generating Result Nodes [2]
  4. Modularizing Stylesheets [2]
  5. Conclusions [1]

(4) XSLT Processing Model



Input Documents

(6) Opening Documents

  • Initially, XSLT starts with the XPath node tree of the main document
    • this step is outside of the control of the XSLT programmer
  • Additional documents can be accessed using document()
    • the function accepts URIs; relative URIs are resolved relative to the stylesheet
    • only XML documents can be used; they are parsed into an XPath tree
  • XSLT Processors are smart enough to cache documents
    • re-opening the same document will not re-parse it
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>
 <xsl:template match="/">
  <xsl:text>me: </xsl:text>
  <xsl:value-of select="count(document('document.xsl')/descendant::*)"/>
  <xsl:text>&#xa;http://dret.net/netdret/publications: </xsl:text>
  <xsl:value-of select="count(document('http://dret.net/netdret/publications')/descendant::*)"/>
 </xsl:template>
</xsl:stylesheet>


(7) Whitespace in Documents

  • Documents often contain many irrelevant whitespace text nodes
    • many XML documents are pretty-printed for readability
    • pretty-printing produces many line-feeds and tabs/spaces
  • XSLT can be instructed to ignore whitespace nodes
    • strip-space lists all elements for which whitespace children should be ignored
    • this may be a bit too much, because Mixed Content [XML Basics; Mixed Content (1)] may contain significant whitespace
    <p>do <u>not</u> <em>throw</em> <b>away</b> these whitespace nodes!</p>
  • XSLT can be instructed to preserve some whitespace nodes
    • preserve-space lists all elements for which whitespace children should be preserved
    • usually, preserve-space lists the exceptions for strip-space
    • usually, preserve-space contains a list of all mixed content elements
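
The usual combination described above can be sketched as follows. This is a minimal illustration, assuming a hypothetical input vocabulary in which p is the only mixed content element; that choice is an assumption based on the example line above, not a general rule.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>
 <!-- ignore whitespace-only text nodes in all elements of the input ... -->
 <xsl:strip-space elements="*"/>
 <!-- ... except inside p, which is assumed to have mixed content -->
 <xsl:preserve-space elements="p"/>
 <xsl:template match="/">
  <!-- number of text nodes that remain after whitespace stripping -->
  <xsl:value-of select="count(//text())"/>
 </xsl:template>
</xsl:stylesheet>
```

Applied to an input that consists only of the p element shown above, the count is 7. Without the preserve-space declaration, the two whitespace-only text nodes between the inline elements are stripped and the count drops to 5.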


(8) Controlling Whitespace

 <xsl:strip-space elements="*"/>
 <xsl:preserve-space elements="xsl:text"/>
 <xsl:template match="/">
  <xsl:text>xsl:text is used for outputting text.</xsl:text>
  <xsl:text> </xsl:text>
  <xsl:text>it also is the only element where whitespace nodes in the stylesheet are significant</xsl:text>
  <xsl:text>&#xa;</xsl:text>
  <xsl:value-of select="count(//text())"/>
 </xsl:template>


Output Documents

(10) Serialization

  • XSLT always produces a result tree
    • stylesheet processing starts with an empty tree (root node only)
    • XSLT code producing output then adds nodes to this tree
    • result nodes are created by text, value-of, copy-of, copy, element, attribute, comment, processing-instruction, and Literal Result Elements [Literal Result Elements (1)]
  • Serialization is the process of externalizing the final tree
    • output controls how the tree is serialized; its method attribute selects the format
    • xml writes the tree as an XML document
    • html writes the tree as an HTML document (<img …> instead of <img …/>)
    • text writes the tree's string value (the concatenation of all text nodes)
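
The effect of the output method can be tried with a small stylesheet like the one below. It is only a sketch: the result elements and the file name logo.png are made up for illustration, and the exact serialization (indentation, additional head content) varies between processors.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <!-- change method to "xml" or "text" to compare the serializations -->
 <xsl:output method="html" indent="yes"/>
 <xsl:template match="/">
  <html>
   <head><title>Serialization test</title></head>
   <body>
    <!-- an empty element: the html method writes it without "/>" -->
    <img src="logo.png" alt="logo"/>
   </body>
  </html>
 </xsl:template>
</xsl:stylesheet>
```

The result tree is the same in all three cases; only the way it is written out differs.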


(11) Multiple Output Documents

  • XSLT 1.0 does not support more than one output document
    • message is another output channel, but not a document
    • this was one of the most requested features for language improvements
  • How can stylesheets produce more than one document?
    • XSLT 1.0 may produce one document which is then post-processed
    • XSLT 2.0 offers language facilities for more than one output document
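
In XSLT 2.0, the facility for additional output documents is the xsl:result-document instruction. The following is a minimal sketch that requires an XSLT 2.0 processor; it assumes a hypothetical input containing entry elements with an id attribute, and the output file names are derived from that attribute purely for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="xml" indent="yes"/>
 <xsl:template match="/">
  <!-- one secondary result document per entry element -->
  <xsl:for-each select="//entry">
   <xsl:result-document href="{@id}.xml">
    <xsl:copy-of select="."/>
   </xsl:result-document>
  </xsl:for-each>
  <!-- the principal result document lists the files that were written -->
  <index>
   <xsl:for-each select="//entry">
    <file><xsl:value-of select="concat(@id, '.xml')"/></file>
   </xsl:for-each>
  </index>
 </xsl:template>
</xsl:stylesheet>
```

The href attribute is an attribute value template, so each entry is written to its own file, resolved relative to the location of the principal output.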


Keys

(13) Document Access



(14) Declaring and Using Keys



(15) XML and XSLT for using a Key

<people>
 <entry id="dret" country="de">
  <name>
   <given>Erik</given>
   <given>Thomas</given>
   <sur>Wilde</sur>
  </name>
  <email>dret@berkeley.edu</email>
  <affiliation country="us">iSchool/UCB</affiliation>
  <phone location="office" type="voice">+1-510-6432253</phone>
  <phone location="office" type="fax">+1-510-6425814</phone>
 </entry>
 <!-- further content is not shown on the slide -->
</people>

 <xsl:key name="givenNameKey" match="name" use="given"/>
 <xsl:key name="affiliationKey" match="affiliation" use="."/>
 <xsl:key name="countryKey" match="entry | affiliation" use="@country"/>


(16) XSLT Key Structure

givenNameKey
Node                       Value
[1] Erik Thomas Wilde      Erik
[1] Erik Thomas Wilde      Thomas
[2] Thomas Plagemann       Thomas
[3] Bob Glushko            Bob

countryKey
Node                       Value
[1a] Erik Thomas Wilde     de
[1b] iSchool/UCB           us
[2a] Thomas Plagemann      de
[2b] IFI/UIO               no
[3a] Bob Glushko           us
[3b] iSchool/UCB           us


(17) Using Keys
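
Declared keys are looked up with the key() function, which takes the name of a key and a value and returns the nodes indexed under that value. A minimal sketch, assuming the people document and the key declarations from the preceding slides; the text output layout is an illustration only.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>
 <xsl:key name="givenNameKey" match="name" use="given"/>
 <xsl:key name="countryKey" match="entry | affiliation" use="@country"/>
 <xsl:template match="/">
  <!-- all name elements that have a given child with the value 'Thomas' -->
  <xsl:for-each select="key('givenNameKey', 'Thomas')">
   <xsl:value-of select="sur"/>
   <xsl:text>&#xa;</xsl:text>
  </xsl:for-each>
  <!-- number of entry and affiliation elements with country="us" -->
  <xsl:value-of select="count(key('countryKey', 'us'))"/>
  <xsl:text>&#xa;</xsl:text>
 </xsl:template>
</xsl:stylesheet>
```

According to the key tables on the previous slide, the loop visits the name nodes [1] and [2], and the count covers the three nodes [1b], [3a], and [3b].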



(18) Node Set Intersection

$a[count(. | $b) = count($b)]: Select all nodes in $a for which adding the node to $b does not change the cardinality of $b. Such a node must already be in $b, and because it was selected from $a, it belongs to the intersection of $a and $b.
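
The expression can be combined with keys to intersect two lookups. The following is a sketch under the assumption that the people document and key declarations from the earlier slides are used and that name is a child of entry; the variable names a and b simply mirror the expression above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>
 <xsl:key name="givenNameKey" match="name" use="given"/>
 <xsl:key name="countryKey" match="entry | affiliation" use="@country"/>
 <xsl:template match="/">
  <!-- $a: entries of people with the given name 'Thomas' -->
  <xsl:variable name="a" select="key('givenNameKey', 'Thomas')/parent::entry"/>
  <!-- $b: entries and affiliations with country="de" -->
  <xsl:variable name="b" select="key('countryKey', 'de')"/>
  <!-- intersection: nodes of $a that do not enlarge $b when added to it -->
  <xsl:for-each select="$a[count(. | $b) = count($b)]">
   <xsl:value-of select="name/sur"/>
   <xsl:text>&#xa;</xsl:text>
  </xsl:for-each>
 </xsl:template>
</xsl:stylesheet>
```

XPath 1.0 has no intersection operator, which is why the count-based predicate is needed in XSLT 1.0.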

xpath-intersection.png

Generating Result Nodes

(20) Literal Result Elements



(21) Producing Nodes Explicitly

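 <!-- re-create every element under its upper-cased local name -->
 <xsl:template match="*">
  <xsl:element name="{translate(local-name(), 'abcdefghijklmnopqrstuvwxyz', 'ABCDEFGHIJKLMNOPQRSTUVWXYZ')}">
   <!-- attributes precede child nodes in document order, so they are created first -->
   <xsl:apply-templates select="node() | @*"/>
  </xsl:element>
 </xsl:template>
 <!-- re-create every attribute under its upper-cased local name -->
 <xsl:template match="@*">
  <xsl:attribute name="{translate(local-name(), 'abcdefghijklmnopqrstuvwxyz', 'ABCDEFGHIJKLMNOPQRSTUVWXYZ')}">
   <xsl:value-of select="."/>
  </xsl:attribute>
 </xsl:template>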


Modularizing Stylesheets

(23) Including and Importing
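
XSLT offers two top-level elements for combining stylesheet modules: xsl:include treats the included declarations as if they were part of the including stylesheet, while xsl:import gives the imported declarations lower import precedence so that the importing stylesheet can override them. The two files below are a minimal sketch; the file name base.xsl, the entry template, and the wrapper element are hypothetical and chosen only for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- base.xsl (hypothetical module) -->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:template match="entry">
  <id><xsl:value-of select="@id"/></id>
 </xsl:template>
</xsl:stylesheet>
```

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- main stylesheet, stored next to base.xsl -->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <!-- xsl:import must come before all other top-level elements -->
 <xsl:import href="base.xsl"/>
 <xsl:output method="xml" indent="yes"/>
 <!-- overrides the entry template of base.xsl, which has lower import precedence -->
 <xsl:template match="entry">
  <wrapper>
   <!-- invoke the overridden template from the imported stylesheet -->
   <xsl:apply-imports/>
  </wrapper>
 </xsl:template>
</xsl:stylesheet>
```

Replacing xsl:import with xsl:include here would put two templates with the same match pattern and priority at the same import precedence, which is a conflict rather than an override.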



(24) Import Precedence

xslt-import-precedence.png

Conclusions

(26) XSLT in Practice


