Content Syndication

Mobile Application Design and Development [./]
Spring 2010 — INFO 152 (CCN 42504)

Erik Wilde, UC Berkeley School of Information
2010-02-10

Creative Commons License [http://creativecommons.org/licenses/by/3.0/]

This work is licensed under a CC
Attribution 3.0 Unported License
[http://creativecommons.org/licenses/by/3.0/]

(2) Abstract

For many information sources on the Web, it is useful to have a standardized way of subscribing to information updates. Syndication formats such as RSS and Atom can be used by these information sources to publish a feed of updated information items. Extensions allow feeds to carry additional data, for example in podcasts. While RSS and Atom are read-only formats, the Atom Publishing Protocol (AtomPub) builds on top of Atom and provides a protocol for submitting new items to feeds.



(3) Content Feeds



Syndication Formats

Outline (Syndication Formats)

  1. Syndication Formats [11]
    1. RSS [6]
    2. Atom [5]
  2. Syndication Aggregation [5]
    1. FeedBurner [3]
  3. Atom Publishing Protocol [7]
  4. Conclusions [1]

RSS

(6) RSS History

  • The Myth of RSS Compatibility [http://diveintomark.org/archives/2004/02/04/incompatible-rss] provides a good overview
  • RSS is a textbook example of why standards are a good thing
    • RSS 0.9 was created for the My Netscape portal in March 1999
    • RSS 0.91 (a simplification) was introduced in July 1999 (as an interim solution)
    • the AOL/Netscape merger removed the format from the company's portal
    • RSS was without an owner, and different parties claimed/denied ownership
    • RSS 1.0 was created by an informal developer group
    • RSS 0.92 (and 0.93 and 0.94) were published without acknowledging RSS 1.0
    • finally, RSS 2.0 was released as a follow-up to the RSS 0.9x versions
  • Using RSS has become an exercise in managing a menagerie of versions


(7) RSS 2.0 Example

<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:creativeCommons="http://cyber.law.harvard.edu/rss/creativeCommonsRssModule.html" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:georss="http://www.georss.org/georss" xmlns:woe="http://where.yahooapis.com/v1/schema.rng" xmlns:flickr="urn:flickr:">
 <channel>
  <title>The MobApp2010 Pool, with geodata</title>
  <link>http://www.flickr.com/photos/</link>
  <description/>
  <pubDate>Sat, 30 Jan 2010 22:10:25 -0800</pubDate>
  <lastBuildDate>Sat, 30 Jan 2010 22:10:25 -0800</lastBuildDate>
  <generator>http://www.flickr.com/</generator>
  <image>
   <url>http://l.yimg.com/g/images/buddyicon.jpg#1313291@N20</url>
   <title>The MobApp2010 Pool, with geodata</title>
   <link>http://www.flickr.com/photos/</link>
  </image>
  <item>
   <title>DSC00809</title>
   <link>http://www.flickr.com/photos/47030217@N06/4318229442/</link>
   <description>&lt;p&gt;&lt;a href=&quot;http://www.flickr.com/people/47030217@N06/&quot;&gt;stoodle246&lt;/a&gt; posted a photo:&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;http://www.flickr.com/photos/47030217@N06/4318229442/&quot; title=&quot;DSC00809&quot;&gt;&lt;img src=&quot;http://farm5.static.flickr.com/4028/4318229442_5ac597fdf5_m.jpg&quot; width=&quot;240&quot; height=&quot;180&quot; alt=&quot;DSC00809&quot; /&gt;&lt;/a&gt;&lt;/p&gt;</description>
   <pubDate>Sat, 30 Jan 2010 22:10:25 -0800</pubDate>
   <dc:date.Taken>2010-01-30T21:14:39-08:00</dc:date.Taken>
   <author flickr:profile="http://www.flickr.com/people/47030217@N06/">nobody@flickr.com (stoodle246)</author>
   <guid isPermaLink="false">tag:flickr.com,2004:/photo/4318229442</guid>
   <georss:point>37.873633 -122.256975</georss:point>
   <geo:lat>37.873633</geo:lat>
   <geo:long>-122.256975</geo:long>
   <woe:woeid>55858022</woe:woeid>
   <media:content url="http://farm5.static.flickr.com/4028/4318229442_e8ac9d23aa_o.jpg" type="image/jpeg" height="1536" width="2048"/>
   <media:title>DSC00809</media:title>
   <media:thumbnail url="http://farm5.static.flickr.com/4028/4318229442_5ac597fdf5_s.jpg" height="75" width="75"/>
   <media:credit role="photographer">stoodle246</media:credit>
  </item>
  <!-- further items omitted -->
 </channel>
</rss>
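
As a rough illustration of how a client might read the extension elements in such a feed, the following sketch uses Python's standard xml.etree.ElementTree module. The RSS string is a trimmed copy of the example above, and the function name geotagged_items is made up for this sketch.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the Flickr feed above; only the elements read below are kept.
RSS = """<rss version="2.0" xmlns:georss="http://www.georss.org/georss">
 <channel>
  <title>The MobApp2010 Pool, with geodata</title>
  <item>
   <title>DSC00809</title>
   <link>http://www.flickr.com/photos/47030217@N06/4318229442/</link>
   <georss:point>37.873633 -122.256975</georss:point>
  </item>
 </channel>
</rss>"""

# Prefix-to-namespace mapping used when looking up extension elements.
NS = {"georss": "http://www.georss.org/georss"}

def geotagged_items(rss_text):
    """Yield (title, link, lat, lon) for every item carrying a georss:point."""
    channel = ET.fromstring(rss_text).find("channel")
    for item in channel.findall("item"):
        point = item.findtext("georss:point", namespaces=NS)
        if point is None:
            continue  # item has no location
        lat, lon = (float(v) for v in point.split())
        yield item.findtext("title"), item.findtext("link"), lat, lon

items = list(geotagged_items(RSS))
print(items)
```

Core RSS 2.0 elements such as title and link are in no namespace, while extension elements such as georss:point must be looked up by namespace URI rather than by prefix.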


(8) Podcast Example

<rss xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">
 <channel>
  <lastBuildDate>Sat, 06 Feb 2010 08:41:35 CST</lastBuildDate>
  <title>TEDTalks (video)</title>
  <link>http://www.ted.com/talks/browse</link>
  <generator>TED - TED.com</generator>
  <description>Each year,  the TED (Technology, Entertainment, Design) conference hosts some of the world's most fascinating people: Trusted voices and convention-breaking mavericks, icons and geniuses. These podcasts (also available in audio format) capture the most extraordinary presentations delivered from the TED stage.</description>
  <itunes:subtitle>Each year, the TED (Technology, Entertainment, Design) conference hosts some of the world's most fascinating people: Trusted voices and convention-breaking mavericks, icons and geniuses. These podcasts (also available in audio format) capture the most ext</itunes:subtitle>
  <itunes:author>TED</itunes:author>
  <itunes:summary>Each year,  the TED (Technology, Entertainment, Design) conference hosts some of the world's most fascinating people: Trusted voices and convention-breaking mavericks, icons and geniuses. These podcasts (also available in audio format) capture the most extraordinary presentations delivered from the TED stage. </itunes:summary>
  <language>en</language>
  <copyright>Creative Commons: http://creativecommons.org/licenses/by-nc-nd/3.0/ </copyright>
  <itunes:owner>
   <itunes:name>Michael Glass</itunes:name>
   <itunes:email>contact@ted.com</itunes:email>
  </itunes:owner>
  <image>
   <url>http://ted.streamguys.net/TEDTalksvideo_tile_144.jpg</url>
   <title>TEDTalks (video)</title>
   <link>http://www.ted.com/talks/browse</link>
   <width>144</width>
   <height>144</height>
  </image>
  <itunes:image href="http://video.ted.com/assets/images/itunes/podcast_poster_600x600.jpg"/>
  <category>Science</category>
  <category>Technology</category>
  <category>Entertainment</category>
  <category>Design</category>
  <itunes:category text="Arts">
   <itunes:category text="Design"/>
  </itunes:category>
  <itunes:category text="Education">
   <itunes:category text="Higher Education"/>
  </itunes:category>
  <itunes:category text="Science &amp; Medicine">
   <itunes:category text="Natural Sciences"/>
  </itunes:category>
  <itunes:category text="Technology"/>
  <itunes:keywords>TED</itunes:keywords>
  <itunes:explicit>no</itunes:explicit>
  <media:rating scheme="urn:simple">nonadult</media:rating>
  <atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/TEDTalks_video"/>
  <feedburner:info uri="tedtalks_video"/>
  <atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com"/>
  <media:copyright>Creative Commons: http://creativecommons.org/licenses/by-nc-nd/3.0/</media:copyright>
  <media:thumbnail url="http://video.ted.com/assets/images/itunes/podcast_poster_600x600.jpg"/>
  <media:keywords>TED</media:keywords>
  <media:category scheme="http://www.itunes.com/dtds/podcast-1.0.dtd">Arts/Design</media:category>
  <media:category scheme="http://www.itunes.com/dtds/podcast-1.0.dtd">Education/Higher Education</media:category>
  <media:category scheme="http://www.itunes.com/dtds/podcast-1.0.dtd">Science &amp; Medicine/Natural Sciences</media:category>
  <media:category scheme="http://www.itunes.com/dtds/podcast-1.0.dtd">Technology</media:category>
  <feedburner:browserFriendly>Each year, TED hosts 80 of the world's most fascinating people: Trusted voices and convention-breaking mavericks, icons and geniuses. These podcasts (also available in audio format) capture the most extraordinary presentations delivered from the TED stage. Each week, we'll release a new talk to inspire, intrigue and stir the imagination. For best effect, plan to listen to at least three, start to finish. (They have a cumulative effect.) If you have a curious soul and an open mind, we think you'll be hooked...</feedburner:browserFriendly>
  <item>
   <title>TEDTalks : Tom Shannon: The painter and the pendulum - Tom Shannon (2009)</title>
   <itunes:author>Tom Shannon</itunes:author>
   <description>TED visits Tom Shannon in his Manhattan studio for an intimate look at his science-inspired art. An eye-opening, personal conversation with John Hockenberry reveals how nature's forces -- and the onset of Parkinson's tremors -- interact in his life and craft.&lt;img src="http://feeds.feedburner.com/~r/TEDTalks_video/~4/hC8_G4hdD6I" height="1" width="1"/&gt;</description>
   <itunes:subtitle>Tom Shannon: The painter and the pendulum</itunes:subtitle>
   <itunes:summary><![CDATA[TED visits Tom Shannon in his Manhattan studio for an intimate look at his science-inspired art. An eye-opening, personal conversation with John Hockenberry reveals how nature's forces -- and the onset of Parkinson's tremors -- interact in his life and craft.]]></itunes:summary>
   <link>http://feedproxy.google.com/~r/TEDTalks_video/~3/hC8_G4hdD6I/762</link>
   <guid isPermaLink="false">http://video.ted.com/talks/podcast/TomShannon_2009S.mp4</guid>
   <pubDate>Fri, 05 Feb 2010 09:08:00 -0600</pubDate>
   <category>Higher Education</category>
   <itunes:explicit>no</itunes:explicit>
   <itunes:duration>00:13:21</itunes:duration>
   <itunes:keywords>TED</itunes:keywords>
   <media:content url="http://feedproxy.google.com/~r/TEDTalks_video/~5/IMfhha_d6Xs/TomShannon_2009S.mp4" fileSize="45813001" type="video/mp4"/>
   <feedburner:origLink>http://www.ted.com/talks/view/id/762</feedburner:origLink>
   <enclosure url="http://feedproxy.google.com/~r/TEDTalks_video/~5/IMfhha_d6Xs/TomShannon_2009S.mp4" length="45813001" type="video/mp4"/>
   <feedburner:origEnclosureLink>http://video.ted.com/talks/podcast/TomShannon_2009S.mp4</feedburner:origEnclosureLink>
  </item>
  <!-- further items omitted -->
 </channel>
</rss>
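
A podcast client mostly cares about the enclosure of each item. The sketch below, assuming Python's standard library and a trimmed copy of the feed above, collects the enclosure URL, size, and media type and converts itunes:duration to seconds. The helper names duration_seconds and episodes are illustrative only.

```python
import xml.etree.ElementTree as ET

ITUNES = "http://www.itunes.com/dtds/podcast-1.0.dtd"

# Trimmed copy of the podcast feed above.
PODCAST = """<rss version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd">
 <channel>
  <title>TEDTalks (video)</title>
  <item>
   <title>TEDTalks : Tom Shannon: The painter and the pendulum - Tom Shannon (2009)</title>
   <itunes:duration>00:13:21</itunes:duration>
   <enclosure url="http://video.ted.com/talks/podcast/TomShannon_2009S.mp4" length="45813001" type="video/mp4"/>
  </item>
 </channel>
</rss>"""

def duration_seconds(text):
    """Convert a duration written as HH:MM:SS, MM:SS or SS to seconds."""
    seconds = 0
    for part in text.strip().split(":"):
        seconds = seconds * 60 + int(part)
    return seconds

def episodes(rss_text):
    """Return one dict per item that has an enclosure."""
    result = []
    for item in ET.fromstring(rss_text).iter("item"):
        enclosure = item.find("enclosure")
        if enclosure is None:
            continue  # not a podcast episode
        duration = item.findtext("{%s}duration" % ITUNES)
        result.append({
            "title": item.findtext("title"),
            "url": enclosure.get("url"),
            "bytes": int(enclosure.get("length")),
            "type": enclosure.get("type"),
            "seconds": duration_seconds(duration) if duration else None,
        })
    return result

print(episodes(PODCAST))
```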


(9) The Case for Content Management

  • RSS is very rarely produced by hand
    • by definition, RSS contains redundant information for a specific purpose
  • If a Content Management System (CMS) [../web-fall09/cms] is used, RSS can be generated
    • basic metadata can be generated by the CMS (title, author, date)
    • better tagging of content results in better tagging of feeds
    • well-tagged feeds are better foundations for large-scale reuse of feed items
  • Blogging is simply a specialized case of a CMS
    • Web-based interface for controlling everything
    • strictly time-ordered sequence of published items
    • navigation features primarily based on the time-specific facets of the blog (maybe tags)
    • all blogging tools include feed support


(10) Consuming RSS

  • RSS feeds often have quality problems
    • surprisingly often feeds do not even deliver well-formed XML
    • the use of embedded markup in RSS is not well-defined
  • Writing an RSS reader from scratch is not a good idea
  • There are three major tasks which RSS readers must do
    1. accept non-XML RSS feeds and fix them to be XML
    2. look at the feed contents and bring them into a unified form
    3. produce a unified view of feeds regardless of the RSS version
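
Tasks 2 and 3 can be sketched in a few lines, assuming the feed is already well-formed XML (task 1, repairing broken feeds, is the hard part and is not attempted here). The function name unified_entries and the choice of output fields are assumptions made for this sketch. It covers only RSS 2.0 and Atom, not the RDF-based RSS 1.0.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

def unified_entries(feed_text):
    """Return a list of {'id', 'title', 'link'} dicts for an RSS 2.0 or Atom feed."""
    root = ET.fromstring(feed_text)
    if root.tag == "rss":
        return [
            {"id": item.findtext("guid"),
             "title": item.findtext("title"),
             "link": item.findtext("link")}
            for item in root.iter("item")
        ]
    if root.tag == ATOM + "feed":
        entries = []
        for entry in root.findall(ATOM + "entry"):
            # A link without a rel attribute counts as rel="alternate".
            links = [l.get("href") for l in entry.findall(ATOM + "link")
                     if l.get("rel", "alternate") == "alternate"]
            entries.append({"id": entry.findtext(ATOM + "id"),
                            "title": entry.findtext(ATOM + "title"),
                            "link": links[0] if links else None})
        return entries
    raise ValueError("unsupported feed format: " + root.tag)

RSS_SAMPLE = """<rss version="2.0"><channel><item>
 <title>DSC00809</title>
 <link>http://www.flickr.com/photos/47030217@N06/4318229442/</link>
 <guid isPermaLink="false">tag:flickr.com,2004:/photo/4318229442</guid>
</item></channel></rss>"""

ATOM_SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom"><entry>
 <title>DSC00809</title>
 <link rel="alternate" type="text/html" href="http://www.flickr.com/photos/47030217@N06/4318229442/"/>
 <id>tag:flickr.com,2005:/photo/4318229442</id>
</entry></feed>"""

print(unified_entries(RSS_SAMPLE))
print(unified_entries(ATOM_SAMPLE))
```

In practice an established feed-parsing library is the safer choice, for the reasons given above.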


(11) RSS Political Problems

  • Multiple and incompatible RSS versions [RSS History (1)] are still in widespread use
    • RSS 1.0 and RSS 2.0 are incompatible by design (RDF vs. non-RDF)
    • none of the RSS versions is maintained by a universally accepted standards body
  • None of the specifications is being updated or fixed
    • some of the lessons learned by RSS deployment are not used in a new version
    • it is unlikely that a new version will be produced which merges the RSS landscape
  • Invent something new instead of trying to fix RSS
    • Atom [Atom (1)] started in 2003 (called Echo at first)
    • W3C or IETF would have been promising candidates for a new RSS
    • W3C is more formal, IETF is more developer-centered
    • IETF was chosen over W3C [http://www.bestkungfu.com/?p=492] because of the Atom community's preferences


Atom

(13) Atom History

  • RSS's shortcomings were very apparent and could not be fixed
  • In mid-2003, discussions started about an improved format
  • It also became apparent that the format should have a protocol
  • Atom 0.3 was released in December 2003 but had no formal home
  • IETF was chosen as the new home with a working group in June 2004
  • RFC 4287 [http://tools.ietf.org/html/rfc4287] was published in December 2005
  • AtomPub [Atom Publishing Protocol (1)] was published as RFC 5023 [http://tools.ietf.org/html/rfc5023] in October 2007


(14) Atom vs. RSS

  • Standardized by the IETF (well-defined process)
  • Classification of entries (user-defined categories)
  • More XML-like markup design (more nesting)
  • Namespaces are used and supported as standard mechanism
  • Atom feeds must be well-formed XML (there even is a schema [http://atompub.org/2005/08/17/atom.rnc])
  • Interpretation of content is well-defined (various content types)
  • Support for xml:lang and xml:base


(15) Atom Example

<feed xmlns="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:georss="http://www.georss.org/georss" xmlns:woe="http://where.yahooapis.com/v1/schema.rng" xmlns:flickr="urn:flickr:" xmlns:media="http://search.yahoo.com/mrss/">
 <title>The MobApp2010 Pool, with geodata</title>
 <link rel="self" href="http://api.flickr.com/services/feeds/geo/?g=1313291@N20&amp;lang=en-us&amp;format=atom"/>
 <link rel="alternate" type="text/html" href="http://www.flickr.com/photos/"/>
 <id>tag:flickr.com,2005:/photos/public/group/1313291@N20/geo/</id>
 <icon>http://l.yimg.com/g/images/buddyicon.jpg#1313291@N20</icon>
 <subtitle/>
 <updated>2010-01-31T06:10:25Z</updated>
 <generator uri="http://www.flickr.com/">Flickr</generator>
 <entry>
  <title>DSC00809</title>
  <link rel="alternate" type="text/html" href="http://www.flickr.com/photos/47030217@N06/4318229442/"/>
  <id>tag:flickr.com,2005:/photo/4318229442</id>
  <published>2010-01-31T06:10:25Z</published>
  <updated>2010-01-31T06:10:25Z</updated>
  <dc:date.Taken>2010-01-30T21:14:39-08:00</dc:date.Taken>
  <content type="html">&lt;p&gt;&lt;a href=&quot;http://www.flickr.com/people/47030217@N06/&quot;&gt;stoodle246&lt;/a&gt; posted a photo:&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;http://www.flickr.com/photos/47030217@N06/4318229442/&quot; title=&quot;DSC00809&quot;&gt;&lt;img src=&quot;http://farm5.static.flickr.com/4028/4318229442_5ac597fdf5_m.jpg&quot; width=&quot;240&quot; height=&quot;180&quot; alt=&quot;DSC00809&quot; /&gt;&lt;/a&gt;&lt;/p&gt;</content>
  <author>
   <name>stoodle246</name>
   <uri>http://www.flickr.com/people/47030217@N06/</uri>
  </author>
  <link rel="enclosure" type="image/jpeg" href="http://farm5.static.flickr.com/4028/4318229442_e8ac9d23aa_o.jpg"/>
  <georss:point>37.873633 -122.256975</georss:point>
  <geo:lat>37.873633</geo:lat>
  <geo:long>-122.256975</geo:long>
  <woe:woeid>55858022</woe:woeid>
 </entry>
 <!-- further entries omitted -->
</feed>


(16) Atom Content

  • RSS had no safe way of finding out what an entry's content is
    • this led to different implementations being smart about what the RSS author really wanted
    • one of Atom's main goals was to improve this in a well-defined way
    • Atom allows escaped markup (the only way to include non-XML HTML in an XML format)
  • Each content element should have a type (the default is text)
  • Atom's content interpretation algorithm (use first applicable rule):
    1. if type is text, no child elements are allowed (plain text content)
    2. if type is html then RSS's method of escaped markup is used
    3. if type is xhtml then there must be a div containing XHTML markup
    4. if type is an XML media type [../web-fall09/mediatypes] then the content should be treated as this type
    5. if type starts with text/ then no child elements are allowed
    6. for all other values, the content must be a base64-encoded entity of the specified MIME type
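
The rule list above translates almost directly into code. The following is a minimal sketch, assuming the content element has already been parsed with xml.etree.ElementTree. The function names and the (kind, value) return convention are made up for this illustration, and the test for an XML media type follows RFC 4287 (text/xml, application/xml, or a type ending in +xml or /xml). Out-of-line content referenced through a src attribute is not handled.

```python
import base64
import xml.etree.ElementTree as ET

XHTML_DIV = "{http://www.w3.org/1999/xhtml}div"

def is_xml_media_type(media_type):
    t = media_type.lower()
    return t in ("text/xml", "application/xml") or t.endswith(("+xml", "/xml"))

def interpret_content(content):
    """Apply the first matching rule and return a (kind, value) pair."""
    ctype = content.get("type", "text")          # the default type is text
    if ctype == "text":                          # rule 1: plain text
        if len(content):
            raise ValueError("text content must not have child elements")
        return "text", content.text or ""
    if ctype == "html":                          # rule 2: escaped markup
        return "html", content.text or ""        # the parser has already unescaped it
    if ctype == "xhtml":                         # rule 3: a single XHTML div
        div = content.find(XHTML_DIV)
        if div is None:
            raise ValueError("xhtml content needs an XHTML div child")
        return "xhtml", div
    if is_xml_media_type(ctype):                 # rule 4: inline XML
        return "xml", list(content)
    if ctype.lower().startswith("text/"):        # rule 5: other textual types
        if len(content):
            raise ValueError("text/* content must not have child elements")
        return "text", content.text or ""
    return "binary", base64.b64decode(content.text or "")  # rule 6: base64

html = ET.fromstring(
    '<content xmlns="http://www.w3.org/2005/Atom" type="html">&lt;p&gt;hi&lt;/p&gt;</content>')
print(interpret_content(html))
```

The order matters: rule 4 has to run before rule 5, because text/xml also starts with text/.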


Syndication Aggregation


(19) End-User Aggregation

<link rel="alternate" type="application/rdf+xml" title="…" href="…" />
<link rel="alternate" type="application/rss+xml" title="…" href="…" />
<link rel="alternate" type="application/atom+xml" title="…" href="…" />
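
Feed readers and browsers find these link elements by scanning a page's HTML. A minimal sketch of that discovery step, using Python's standard html.parser module, could look as follows. The class name FeedLinkFinder and the example.org URLs in the sample page are hypothetical.

```python
from html.parser import HTMLParser

# Media types used for RSS 1.0 (RDF), RSS 2.0 and Atom feeds.
FEED_TYPES = {"application/rdf+xml", "application/rss+xml", "application/atom+xml"}

class FeedLinkFinder(HTMLParser):
    """Collect (type, title, href) for every feed advertised by a page."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = dict(attrs)
        rels = (a.get("rel") or "").lower().split()
        media_type = (a.get("type") or "").lower()
        if "alternate" in rels and media_type in FEED_TYPES and a.get("href"):
            self.feeds.append((media_type, a.get("title"), a["href"]))

# Hypothetical page head for illustration.
PAGE = """<html><head>
<link rel="stylesheet" type="text/css" href="/style.css" />
<link rel="alternate" type="application/rss+xml" title="News (RSS)" href="http://example.org/news.rss" />
<link rel="alternate" type="application/atom+xml" title="News (Atom)" href="http://example.org/news.atom" />
</head><body></body></html>"""

finder = FeedLinkFinder()
finder.feed(PAGE)
print(finder.feeds)
```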


(20) Aggregation Intermediaries



FeedBurner


(22) Fixing Feeds

Cleaning Up Feeds

(23) Load Balancing

Providing Feed Load Balancing

(24) Statistics/Analytics

Providing Feed Statistics

Atom Publishing Protocol


(26) Syndication Format Protocols



(27) RESTified Syndication



(28) Protocol Summary

Resource       HTTP Method  Representation                                 Description
Introspection  GET          Atom Service Document [Service Documents (1)]  Enumerates a set of collections and lists their URIs and other information about the collections
Collection     GET          Atom Feed                                      A list of members of the collection (this may be a subset of all entries in the collection)
Collection     POST         Atom Entry                                     Create a new entry in the collection
Member         GET          Atom Entry                                     Get the Atom Entry
Member         PUT          Atom Entry                                     Update the Atom Entry
Member         DELETE       n/a                                            Delete the Atom Entry from the collection
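
The collection and member rows of the summary map one-to-one onto HTTP requests. The sketch below only builds the requests with Python's urllib.request and does not send them. The helper names and the example.org URIs are hypothetical, and the media type application/atom+xml;type=entry is the one RFC 5023 uses for entries.

```python
import urllib.request

ENTRY_TYPE = "application/atom+xml;type=entry"

def list_collection(collection_uri):
    """Collection GET: retrieve the Atom feed listing the members."""
    return urllib.request.Request(collection_uri, method="GET")

def create_member(collection_uri, entry_xml):
    """Collection POST: create a new entry in the collection."""
    return urllib.request.Request(
        collection_uri, data=entry_xml.encode("utf-8"),
        headers={"Content-Type": ENTRY_TYPE}, method="POST")

def update_member(member_uri, entry_xml):
    """Member PUT: replace the entry with an updated representation."""
    return urllib.request.Request(
        member_uri, data=entry_xml.encode("utf-8"),
        headers={"Content-Type": ENTRY_TYPE}, method="PUT")

def delete_member(member_uri):
    """Member DELETE: remove the entry from the collection."""
    return urllib.request.Request(member_uri, method="DELETE")

# Hypothetical entry and URIs for illustration.
ENTRY = ('<entry xmlns="http://www.w3.org/2005/Atom">'
         '<title>Hello</title><content type="text">First post</content></entry>')

request = create_member("http://example.org/reilly/main", ENTRY)
print(request.get_method(), request.full_url)
```

Passing one of these objects to urllib.request.urlopen would send it. According to RFC 5023, a successful POST is answered with 201 Created and a Location header carrying the URI of the new member, which is then the target for later PUT and DELETE requests.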


(29) Service Documents

Service Documents represent server-defined groups of Collections, and are used to initialize the process of creating and editing resources.


(30) Service Document Example

<service xmlns="http://www.w3.org/2007/app" xmlns:atom="http://www.w3.org/2005/Atom">
 <workspace>
  <atom:title>Main Site</atom:title>
  <collection href="http://example.org/reilly/main">
   <atom:title>My Blog Entries</atom:title>
   <categories href="http://example.com/cats/forMain.cats"/>
  </collection>
  <collection href="http://example.org/reilly/pic">
   <atom:title>Pictures</atom:title>
   <accept>image/*</accept>
  </collection>
 </workspace>
 <workspace>
  <atom:title>Side Bar Blog</atom:title>
  <collection href="http://example.org/reilly/list">
   <atom:title>Remaindered Links</atom:title>
   <accept>application/atom+xml;type=entry</accept>
   <categories fixed="yes">
    <atom:category scheme="http://example.org/extra-cats/" term="joke"/>
    <atom:category scheme="http://example.org/extra-cats/" term="serious"/>
   </categories>
  </collection>
 </workspace>
</service>
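
A client typically starts by reading the service document to learn which collections exist and where to POST. The following sketch lists them with Python's standard library. The function name collections is made up, and the sample is a trimmed service document that uses the namespace defined in RFC 5023. The code reads the namespace off the root element, so a service document using another namespace URI is handled the same way.

```python
import xml.etree.ElementTree as ET

ATOM_TITLE = "{http://www.w3.org/2005/Atom}title"

def collections(service_xml):
    """Return one dict per collection listed in a service document."""
    root = ET.fromstring(service_xml)
    # Namespace of the service element in "{uri}" form.
    # (Raises ValueError if the root element is in no namespace.)
    app = root.tag[: root.tag.index("}") + 1]
    found = []
    for workspace in root.findall(app + "workspace"):
        for coll in workspace.findall(app + "collection"):
            found.append({
                "workspace": workspace.findtext(ATOM_TITLE),
                "title": coll.findtext(ATOM_TITLE),
                "href": coll.get("href"),
                "accept": [a.text for a in coll.findall(app + "accept")],
            })
    return found

# Trimmed service document for illustration.
SERVICE = """<service xmlns="http://www.w3.org/2007/app" xmlns:atom="http://www.w3.org/2005/Atom">
 <workspace>
  <atom:title>Main Site</atom:title>
  <collection href="http://example.org/reilly/main">
   <atom:title>My Blog Entries</atom:title>
  </collection>
  <collection href="http://example.org/reilly/pic">
   <atom:title>Pictures</atom:title>
   <accept>image/*</accept>
  </collection>
 </workspace>
</service>"""

for c in collections(SERVICE):
    print(c["workspace"], "/", c["title"], "->", c["href"], c["accept"])
```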


(31) Category Documents



(32) Category Document Example

<app:categories xmlns:app="http://www.w3.org/2007/app" xmlns="http://www.w3.org/2005/Atom" fixed="yes" scheme="http://example.com/cats/big3">
 <category term="animal"/>
 <category term="vegetable"/>
 <category term="mineral"/>
</app:categories>
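
One plausible client-side use of a category document is checking a term before posting an entry. In the sketch below the function name term_allowed is hypothetical, and the interpretation of fixed="yes" as a closed list follows RFC 5023 (a server may still decide differently). The sample mirrors the example above and again uses the RFC 5023 namespace for the root element, which the code does not depend on.

```python
import xml.etree.ElementTree as ET

ATOM_CATEGORY = "{http://www.w3.org/2005/Atom}category"

def term_allowed(categories_xml, term):
    """Return True if the category document permits the given term."""
    root = ET.fromstring(categories_xml)
    listed = {c.get("term") for c in root.findall(ATOM_CATEGORY)}
    fixed = root.get("fixed", "no") == "yes"   # absent means the list is open
    return term in listed or not fixed

CATEGORIES = """<app:categories xmlns:app="http://www.w3.org/2007/app"
    xmlns="http://www.w3.org/2005/Atom" fixed="yes"
    scheme="http://example.com/cats/big3">
 <category term="animal"/>
 <category term="vegetable"/>
 <category term="mineral"/>
</app:categories>"""

print(term_allowed(CATEGORIES, "animal"))
print(term_allowed(CATEGORIES, "plastic"))
```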


Conclusions


(34) Semantic Web Light


