Blogging in XML

XML Foundations (INFO 242)

Erik Wilde, UC Berkeley School of Information
2007-08-30
This work is licensed under a Creative Commons Attribution 3.0 Unported License.

Abstract

XML is used in a wide variety of application scenarios, which results in a wide variety of requirements. This lecture introduces the application example used throughout this course: the representation of blog data in XML. Blogs are a good example for XML because they mix structured data (blog post metadata) with textual data (the actual blog post), require different views (such as weekly and monthly summaries) to be derived from the same set of data, and require the data to be made available in various output formats (such as HTML and RSS).

Outline (BlogXML)

  1. BlogXML
  2. Rules for BlogXML
  3. Selecting BlogXML Content
  4. Publishing BlogXML
  5. Syndicating BlogXML
  6. Managing BlogXML
  7. Conclusions

Blog Structures

dretblog.xml

<?xml version="1.0" encoding="UTF-8"?>
<blogxml>
 <post date="2007-05-15">
  <title>Half Dome</title>
  <text>The trip to half dome is a long one, but very beautiful and with a spectacular final climb.</text>
  <image src="half-dome">Me on top of half dome.</image>
 </post>
 <post date="2007-05-20">
  <title>Fifth Lake</title>
  <text>The seven lakes loop offers views of (surprise!) seven lakes settled in a remote valley.</text>
  <image src="fifth-lake">Still a lot of ice on fifth lake.</image>
 </post>
 <post date="2007-05-22">
  <title>Golden Canyon</title>
  <text>Golden Canyon leads to the famous Zabriskie Point overlook.</text>
  <image src="golden-canyon">A maze of erosion patterns carved in sediment.</image>
 </post>
</blogxml>

Outline (Rules for BlogXML)

Structural Constraints

<!ELEMENT blogxml (post+) >
<!ELEMENT post (title, text, image) >
<!ELEMENT title (#PCDATA) >
<!ELEMENT text (#PCDATA) >
<!ATTLIST post
 date CDATA #REQUIRED >
<!ELEMENT image (#PCDATA) >
<!ATTLIST image
 src CDATA #REQUIRED >
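
For a validating parser to check dretblog.xml against these declarations, the document can refer to them in a document type declaration. The following is a minimal sketch, assuming the declarations above are stored in a file named blogxml.dtd next to the document (the file name is an assumption, not taken from the lecture):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- blogxml.dtd is a hypothetical file name holding the declarations above -->
<!DOCTYPE blogxml SYSTEM "blogxml.dtd">
<blogxml>
 <post date="2007-05-15">
  <title>Half Dome</title>
  <text>The trip to half dome is a long one, but very beautiful and with a spectacular final climb.</text>
  <image src="half-dome">Me on top of half dome.</image>
 </post>
</blogxml>
```

A post without a title, or with title and text in the wrong order, would be reported as invalid. The value of the date attribute is not checked, because CDATA accepts any string.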

Adding Datatype Constraints

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <xs:element name="blogxml">
  <xs:complexType>
   <xs:sequence>
    <xs:element name="post" maxOccurs="unbounded">
     <xs:complexType>
      <xs:sequence>
       <xs:element name="title" type="xs:string"/>
       <xs:element name="text" type="xs:string"/>
       <xs:element name="image">
        <xs:complexType>
         <xs:simpleContent>
          <xs:extension base="xs:string">
           <xs:attribute name="src" type="xs:anyURI"/>
          </xs:extension>
         </xs:simpleContent>
        </xs:complexType>
       </xs:element>
      </xs:sequence>
      <xs:attribute name="date" type="xs:date"/>
     </xs:complexType>
    </xs:element>
   </xs:sequence>
  </xs:complexType>
 </xs:element>
</xs:schema>
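
The difference from the DTD shows up with a value that is a string but not a date. The following is a small sketch of such an instance, assuming the schema above is stored as blogxml.xsd (a hypothetical file name) and that the post content is made up for illustration:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- blogxml.xsd is a hypothetical file name for the schema above -->
<blogxml xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:noNamespaceSchemaLocation="blogxml.xsd">
 <!-- The DTD accepts this date value, because CDATA allows any string.
      The schema rejects it: February has no 30th day, so it is not an xs:date. -->
 <post date="2007-02-30">
  <title>Broken Date</title>
  <text>This post only exists to show a datatype error.</text>
  <image src="no-image">No image.</image>
 </post>
</blogxml>
```

A schema validator should report the date attribute as invalid, while the element structure itself is accepted.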

A Clearer View

BlogXML XSDL

Fewer Constraints

BlogXML XSDL (Repeatable Images)
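
The following is a minimal sketch of how repeatable images could be expressed. It assumes that the only change to the schema is the occurrence range of the image element; the minOccurs and maxOccurs values are assumptions, not taken from the lecture:

```xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <xs:element name="blogxml">
  <xs:complexType>
   <xs:sequence>
    <xs:element name="post" maxOccurs="unbounded">
     <xs:complexType>
      <xs:sequence>
       <xs:element name="title" type="xs:string"/>
       <xs:element name="text" type="xs:string"/>
       <!-- assumption: a post may have no image, one image, or several images -->
       <xs:element name="image" minOccurs="0" maxOccurs="unbounded">
        <xs:complexType>
         <xs:simpleContent>
          <xs:extension base="xs:string">
           <xs:attribute name="src" type="xs:anyURI"/>
          </xs:extension>
         </xs:simpleContent>
        </xs:complexType>
       </xs:element>
      </xs:sequence>
      <xs:attribute name="date" type="xs:date"/>
     </xs:complexType>
    </xs:element>
   </xs:sequence>
  </xs:complexType>
 </xs:element>
</xs:schema>
```

Every document that is valid against the stricter schema remains valid against this one, so existing blog data does not have to change.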

Outline (Selecting BlogXML Content)

Using XML Structures
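
A few XPath expressions illustrate how content can be selected from dretblog.xml. The sketch below wraps them in a small XSLT 2.0 stylesheet with text output so that they can be run directly; the choice of expressions is an assumption made for illustration:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <xsl:output method="text"/>
 <xsl:template match="/">
  <!-- number of posts in the blog -->
  <xsl:value-of select="count(/blogxml/post)"/>
  <xsl:text>&#10;</xsl:text>
  <!-- titles of all posts, in document order -->
  <xsl:value-of select="/blogxml/post/title" separator=", "/>
  <xsl:text>&#10;</xsl:text>
  <!-- titles of posts dated 2007-05-20 or later -->
  <xsl:value-of select="/blogxml/post[xs:date(@date) ge xs:date('2007-05-20')]/title" separator=", "/>
  <xsl:text>&#10;</xsl:text>
  <!-- image identifier of the last post in document order -->
  <xsl:value-of select="/blogxml/post[last()]/image/@src"/>
  <xsl:text>&#10;</xsl:text>
 </xsl:template>
</xsl:stylesheet>
```

Applied to dretblog.xml, the four lines of output are 3, then Half Dome, Fifth Lake, Golden Canyon, then Fifth Lake, Golden Canyon, and finally golden-canyon.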

Outline (Publishing BlogXML)

Generating HTML from BlogXML

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:template match="/">
  <html>
   <body>
    <table border="1" cellpadding="20">
     <xsl:for-each select="//post">
      <tr>
       <td><a href="../img/{image/@src}.jpg"><img src="../img/{image/@src}-small.jpg"/></a></td>
       <td>
        <h2><xsl:value-of select="format-date(@date, '[F] [MNn] [D], [Y]')"/>: <xsl:value-of select="title"/></h2>
        <p><xsl:value-of select="text"/></p>
       </td>
      </tr>
     </xsl:for-each>
    </table>
   </body>
  </html>
 </xsl:template>
</xsl:stylesheet>
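
The same table could also be produced with template rules instead of xsl:for-each. The following sketch is an alternative formulation, not the stylesheet used in the lecture; it should produce the same rows for dretblog.xml:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <!-- document node: page skeleton, rows are delegated to the post template -->
 <xsl:template match="/">
  <html>
   <body>
    <table border="1" cellpadding="20">
     <xsl:apply-templates select="blogxml/post"/>
    </table>
   </body>
  </html>
 </xsl:template>
 <!-- one table row per post -->
 <xsl:template match="post">
  <tr>
   <td><a href="../img/{image/@src}.jpg"><img src="../img/{image/@src}-small.jpg"/></a></td>
   <td>
    <h2><xsl:value-of select="format-date(@date, '[F] [MNn] [D], [Y]')"/>: <xsl:value-of select="title"/></h2>
    <p><xsl:value-of select="text"/></p>
   </td>
  </tr>
 </xsl:template>
</xsl:stylesheet>
```

Separate template rules tend to pay off once posts contain more varied content, because each element type can then get its own rule.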

One Page Blog

<html>
   <body>
      <table border="1" cellpadding="20">
         <tr>
            <td><a href="../img/half-dome.jpg"><img src="../img/half-dome-small.jpg"></a></td>
            <td>
               <h2>Tuesday May 15, 2007: Half Dome</h2>
               <p>The trip to half dome is a long one, but very beautiful and with a spectacular final climb.</p>
            </td>
         </tr>

Generating a Blog from BlogXML

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:template match="/">
  <html>
   <body>
    <h1>dretblog</h1>
    <xsl:for-each select="//post">
     <p><a href="{@date}"><xsl:value-of select="concat(format-date(@date, '[F] [MNn] [D], [Y]'), ': ', title)"/></a></p>
    </xsl:for-each>
   </body>
  </html>
  <xsl:for-each select="//post">
   <xsl:result-document href="{@date}.html">
    <html>
     <body>
      <h1><xsl:value-of select="title"/></h1>
      <h2><xsl:value-of select="format-date(@date, '[F] [MNn] [D], [Y]')"/></h2>
      <a href="../img/{image/@src}.jpg" title="{image}"><img src="../img/{image/@src}-small.jpg"/></a>
      <p><xsl:value-of select="text"/></p>
      <p>
       <xsl:if test="exists(preceding-sibling::post)"><a href="{preceding-sibling::post[1]/@date}">← </a></xsl:if>
       <a href="dretblog2">Home</a>
       <xsl:if test="exists(following-sibling::post)"><a href="{following-sibling::post[1]/@date}"> →</a></xsl:if>
      </p>
     </body>
    </html>
   </xsl:result-document>
  </xsl:for-each>
 </xsl:template>
</xsl:stylesheet>

Hyperlinked Blog

<html>
   <body>
      <h1>dretblog</h1>
      <p><a href="2007-05-15">Tuesday May 15, 2007: Half Dome</a></p>
      <p><a href="2007-05-20">Sunday May 20, 2007: Fifth Lake</a></p>
      <p><a href="2007-05-22">Tuesday May 22, 2007: Golden Canyon</a></p>
   </body>
</html>
<html>
   <body>
      <h1>Half Dome</h1>
      <h2>Tuesday May 15, 2007</h2>
      <a href="../img/half-dome.jpg" title="Me on top of half dome."><img src="../img/half-dome-small.jpg"></a>
      <p>The trip to half dome is a long one, but very beautiful and with a spectacular final climb.</p>
      <p><a href="dretblog2">Home</a><a href="2007-05-20"> →</a></p>
   </body>
</html>
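
Deriving a different view from the same data, such as a monthly summary, can be sketched with XSLT 2.0 grouping. The example below is an illustration under assumptions (the page title and the layout are made up); it groups posts by the year and month part of the date attribute:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:template match="/">
  <html>
   <body>
    <h1>dretblog by month</h1>
    <!-- group posts by the first seven characters of the date, e.g. 2007-05 -->
    <xsl:for-each-group select="/blogxml/post" group-by="substring(@date, 1, 7)">
     <!-- the context item is the first post of the current group -->
     <h2><xsl:value-of select="format-date(@date, '[MNn] [Y]')"/></h2>
     <p>Number of posts: <xsl:value-of select="count(current-group())"/></p>
     <ul>
      <xsl:for-each select="current-group()">
       <li><a href="{@date}"><xsl:value-of select="title"/></a></li>
      </xsl:for-each>
     </ul>
    </xsl:for-each-group>
   </body>
  </html>
 </xsl:template>
</xsl:stylesheet>
```

For dretblog.xml, all three posts fall into a single group for May 2007.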

Outline (Syndicating BlogXML)

Generating Atom from BlogXML

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:template match="/">
  <feed xmlns="http://www.w3.org/2005/Atom" xml:base="http://dret.net/lectures/xml-fall07/src/dretblog.atom">
   <title>dretblog</title>
   <id>http://dret.net/lectures/xml-fall07/src/dretblog.atom</id>
   <link rel="self" href=""/>
   <updated><xsl:value-of select="current-dateTime()"/></updated>
   <author><name>Erik Wilde</name></author>
   <xsl:for-each select="//post">
    <entry xml:base="http://dret.net/lectures/xml-fall07/src/{@date}">
     <title><xsl:value-of select="title"/></title>
     <link href="http://dret.net/lectures/xml-fall07/src/{@date}"/>
     <id><xsl:value-of select="concat('http://dret.net/lectures/xml-fall07/src/', @date)"/></id>
     <published><xsl:value-of select="@date"/>T00:00:00Z</published>
     <updated><xsl:value-of select="current-dateTime()"/></updated>
     <content type="xhtml">
      <div xmlns="http://www.w3.org/1999/xhtml">
       <h1><xsl:value-of select="title"/></h1>
       <h2><xsl:value-of select="format-date(@date, '[F] [MNn] [D], [Y]')"/></h2>
       <a href="http://dret.net/lectures/xml-fall07/img/{image/@src}.jpg" title="{image}"><img src="http://dret.net/lectures/xml-fall07/img/{image/@src}-small.jpg"/></a>
       <p><xsl:value-of select="text"/></p>
      </div>
     </content>
    </entry>
   </xsl:for-each>
  </feed>
 </xsl:template>
</xsl:stylesheet>

dretblog Atom Feed

<feed xmlns="http://www.w3.org/2005/Atom" xml:base="http://dret.net/lectures/xml-fall07/src/dretblog.atom">
 <title>dretblog</title>
 <id>http://dret.net/lectures/xml-fall07/src/dretblog.atom</id>
 <link rel="self" href=""/>
 <updated>2007-08-27T15:34:31.359-07:00</updated>
 <author>
  <name>Erik Wilde</name>
 </author>
 <entry xml:base="http://dret.net/lectures/xml-fall07/src/2007-05-15">
  <title>Half Dome</title>
  <link href="http://dret.net/lectures/xml-fall07/src/2007-05-15"/>
  <id>http://dret.net/lectures/xml-fall07/src/2007-05-15</id>
  <published>2007-05-15T00:00:00Z</published>
  <updated>2007-08-27T15:34:31.359-07:00</updated>
  <content type="xhtml">
   <div xmlns="http://www.w3.org/1999/xhtml">
    <h1>Half Dome</h1>
    <h2>Tuesday May 15, 2007</h2>
    <a href="http://dret.net/lectures/xml-fall07/img/half-dome.jpg" title="Me on top of half dome.">
     <img src="http://dret.net/lectures/xml-fall07/img/half-dome-small.jpg"/>
    </a>
    <p>The trip to half dome is a long one, but very beautiful and with a spectacular final climb.</p>
   </div>
  </content>
 </entry>
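
The same data could be syndicated as RSS 2.0 instead of Atom. The following is a minimal sketch, not part of the lecture material; the channel link and the description text are assumptions, and the date picture relies on the processor's default English month and day names:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="xml" indent="yes"/>
 <xsl:template match="/">
  <rss version="2.0">
   <channel>
    <title>dretblog</title>
    <!-- assumption: absolute address of the blog's home page -->
    <link>http://dret.net/lectures/xml-fall07/src/dretblog2</link>
    <description>dretblog posts generated from BlogXML</description>
    <xsl:for-each select="/blogxml/post">
     <item>
      <title><xsl:value-of select="title"/></title>
      <link><xsl:value-of select="concat('http://dret.net/lectures/xml-fall07/src/', @date)"/></link>
      <guid><xsl:value-of select="concat('http://dret.net/lectures/xml-fall07/src/', @date)"/></guid>
      <description><xsl:value-of select="text"/></description>
      <!-- RSS 2.0 uses RFC 822 dates, e.g. Tue, 15 May 2007 00:00:00 GMT -->
      <pubDate><xsl:value-of select="format-date(@date, '[FNn,*-3], [D01] [MNn,*-3] [Y0001]')"/> 00:00:00 GMT</pubDate>
     </item>
    </xsl:for-each>
   </channel>
  </rss>
 </xsl:template>
</xsl:stylesheet>
```

Unlike the Atom version, this sketch carries only the plain text of each post and leaves out the image link.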

Outline (Managing BlogXML)

Files vs. Databases

Outline (Conclusions)

XML Blogs!