Web-Based Publishing

UCB iSchool INFO 290-19 (CCN 42704) – Spring 2007

Instructor: Erik Wilde

Lecture & Lab: Tu & Th 9.00–10.30, 110 South Hall

Description: This course is a broad survey of Web-based publishing, defined here as any well-designed service for providing information using Web formats and protocols. It touches on strategy and project planning considerations, but emphasizes design, implementation, and delivery issues. Design topics include publishing process modeling and document workflows, content reuse, document formats, compound documents, internationalization and localization, and the associated questions of usability and accessibility. Implementation issues include URI design, Web server setup, and storage management, starting from the foundation (XML databases) and moving on to specialized content management systems. Delivery issues include cross-media publishing and syndication alternatives such as RSS and Atom.

Date Subject Slides Resources
2007-01-23 Overview and Introduction: The field of Web-Based Publishing as presented in this course emphasizes design, implementation, and delivery issues of well-designed services for providing information using Web formats and protocols. In the introductory lecture, the range of topics covered in the course is presented. Additionally, the course project is presented, which spans the whole semester and is a joint project of all students taking the course. The project covers numerous areas, and students may choose the area in which they would like to specialize and acquire practical skills. Introduction (35 Slides) XML · Services
2007-01-25 XML Linking Language (XLink): One of the most important aspects of publishing is the relationship of content parts. The Web has popularized the concept of hypermedia, but HTML supports only a very simple concept of linking. The XML Linking Language (XLink) has been specified to provide a linking language for XML, and it supports a much richer concept of linking than HTML. One of the most important aspects of XLink is its ability to separate content and links, so that links can be regarded as being separate from a document's contents, making it possible to create flexible combinations of content and links. XLink (35 Slides) Spec · XLink Paper 1 · XLink Paper 2
2007-01-30 AJAXLink – Part I: AJAXLink is based on principles of Web linking and new possibilities to extend browsers with JavaScript code which talks to a server. While AJAXLink is designed to run in any browser, an alternative implementation as a browser-specific extension could use interface techniques which are not available to AJAX applications. In this lecture, we look at the foundations of AJAXLink and at the core components of the project, which are the client side, the communications protocol, and the server side. AJAXLink 1 (25 Slides) REST¹ · REST²
2007-02-01 Character Set Issues & Unicode: Every character-based document is based on some model of which characters are available, and how they are encoded. Unicode is the most popular character set today and provides a variety of encoding schemes, each of them being a Unicode Transformation Format (UTF). In addition to character sets and encodings, other issues relevant when dealing with characters are transcoding and normalization, which deal with the problems arising when using different character encodings or different encodings of particular characters. Unicode (32 Slides) Unicode · History
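The encoding and normalization issues named in the Unicode abstract can be illustrated with a minimal sketch using Python's standard `unicodedata` module. The sample characters and the helper name `same_text` are assumptions chosen for illustration; they are not taken from the lecture slides.

```python
import unicodedata

# Two different code point sequences for the same visible character "é".
precomposed = "\u00e9"   # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # "e" followed by U+0301 COMBINING ACUTE ACCENT


def same_text(a, b):
    """Compare two strings after normalizing both to NFC (hypothetical helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# One character, different byte lengths depending on the chosen UTF.
euro = "\u20ac"  # EURO SIGN
utf8_len = len(euro.encode("utf-8"))
utf16_len = len(euro.encode("utf-16-be"))
utf32_len = len(euro.encode("utf-32-be"))

# Transcoding: decode bytes from one encoding, then re-encode in another.
latin1_bytes = precomposed.encode("latin-1")
transcoded = latin1_bytes.decode("latin-1").encode("utf-8")
```

Without normalization the two spellings of "é" compare as unequal even though they render identically, which is the kind of problem normalization addresses.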
2007-02-06 AJAXLink – Part II: Based on the proposals submitted last week, individual AJAXLink project parts will be assigned today. The majority of today's class is reserved for the individual project parts to start working on the interfaces with all related parts. Today's result should be sufficient input for everybody to prepare a detailed project outline and implementation strategy, due in two weeks. AJAXLink 2 (16 Slides) YUI · REST & Rails
2007-02-08 XML Path Language (XPath) 2.0: The XML Path Language (XPath) is one of the most useful and frequently used languages in the area of XML technologies. In its version 1.0, it is used in technologies such as XSLT, XML Schema, DOM, and XML Tools. With XPath 2.0, the language has been greatly extended; the new version of XPath is the foundation for XSLT 2.0 and XQuery. XPath 2.0 provides support for regular expression matching, typed expressions, and contains language constructs for conditional and repeated evaluation. XPath 2.0 (35 Slides) XPath 1.0 · XPath 2.0
2007-02-13 XQuery 1.0 and XPath 2.0 Data Model (XDM): While XPath 2.0 syntactically is an extension of XPath 1.0, the underlying data model has changed quite radically. Instead of XPath 1.0's simple concept of four datatypes (node set, number, string, boolean), the XQuery 1.0 and XPath 2.0 Data Model (XDM) is based on sequences and allows much more sophisticated ways of data representation and manipulation. Furthermore, XDM includes the datatypes defined by XML Schema, which results in a complex and powerful collection of built-in datatypes and operations on these datatypes. XDM (28 Slides) Spec
2007-02-15 XML Query (XQuery): The XML Query (XQuery) language has been designed to query collections of XML documents. It is thus different from XSLT, which primarily transforms one document at a time. However, the core of both languages is XPath 2.0, which means that learning XQuery (and XSLT 2.0) is not very hard when starting with a solid knowledge of XPath 2.0. XQuery's main concept is an expression language which supports iteration and binding of variables to intermediate results. The final result of an XQuery is a tree, which can be serialized in various serialization formats. XQuery (31 Slides) Spec
2007-02-20 AJAXLink – Part III: The presentation of the detailed proposals for every part of the project provides a complete overview of the project work for the rest of the semester. For each project part, the presentation describes the overall goal, the implementation strategy, the dependencies with other project parts, and the individual goals for the milestone dates. AJAXLink 3 (11 Slides)
2007-02-22 AJAXLink – Part IV: AJAXLink and XLink offer a lot of opportunities and also a lot of places where design decisions need to be made. In this lecture, some of these design decisions are discussed, so that the design process and the implementation decisions can start with fewer uncertainties. While in some cases it simply is necessary to decide whether a certain feature should be supported or not, in other cases different strategies can be chosen to implement a feature. AJAXLink 4 (19 Slides) XLink Reference
2007-02-27 XML Databases: XML Databases are specialized databases for handling XML data. As their query language, they will often use XQuery, but they need additional technologies for updating and storing data. XQuery currently is a read-only language, so update facilities must be provided as an addition to XQuery querying capabilities. One of the big advantages of databases vs. file systems is optimized storage (and thus access) structures, and in the case of XML databases this means storing XML documents other than as text files. XDBMS (30 Slides) eXist
2007-03-01 XML and Databases: While XML databases are a good solution for managing XML content, frequently it is necessary to use non-XML databases for managing XML content. In most cases, these databases will be relational databases. There are two major approaches to managing XML content in a relational database. The first approach is to define a mapping between XML and relational structures and work with this mapping. The second approach is to use the XML-specific functionality, which is increasingly provided by relational databases, turning them into XML-aware databases. XML & DBMS (32 Slides) FAQ
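The first approach in the XML and Databases abstract, mapping XML structures to relational ones, might look roughly like the following sketch. The catalog document, the `book` table layout, and the function name `shred` are all hypothetical; the code uses only Python's standard `sqlite3` and `xml.etree.ElementTree` modules.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for illustration.
CATALOG = """<catalog>
  <book id="b1"><title>XML Basics</title><year>2005</year></book>
  <book id="b2"><title>Web Publishing</title><year>2007</year></book>
</catalog>"""


def shred(xml_text, conn):
    """Map each <book> element to one row: attribute and child elements become columns."""
    conn.execute(
        "CREATE TABLE book (id TEXT PRIMARY KEY, title TEXT, year INTEGER)"
    )
    for book in ET.fromstring(xml_text).iter("book"):
        conn.execute(
            "INSERT INTO book VALUES (?, ?, ?)",
            (book.get("id"), book.findtext("title"), int(book.findtext("year"))),
        )
    conn.commit()


conn = sqlite3.connect(":memory:")
shred(CATALOG, conn)

# Once mapped, the content is queried with plain SQL instead of XPath or XQuery.
recent = conn.execute("SELECT title FROM book WHERE year >= 2006").fetchall()
```

A fixed mapping like this works for regular, record-like XML; document-oriented content with mixed or deeply nested structure is harder to map, which is one motivation for the XML-aware features mentioned as the second approach.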
2007-03-06 AJAXLink – Part V: After two weeks of technical work, today the first results will be presented. The focus is to show first results, to report on design decisions which have been made or still need to be made, and to point out dependencies with other project parts. In addition, blog links (blinks) as an interesting use case for AJAXLink and XLink scenarios in general are discussed in today's class. AJAXLink 5 (8 Slides) Technorati API
2007-03-08 XLink Characterization: XLink's data model allows a number of combinations of XLink placement, XLink encoding, and XLink structure which make it hard to get an overview of all possible usages. XLink characterization aims at systematically looking at the various placements, encodings, and structures XLink allows, as well as looking at the individual participating resources of the link and their characterization in terms of their properties and their connections with the other resources participating in the link. XLink Characterization (24 Slides) Spec
2007-03-13 Asynchronous JavaScript and XML (AJAX): Asynchronous JavaScript and XML (AJAX) is a technology which allows client-side JavaScript to make requests to a server without causing a reload of the current page in the browser. Using AJAX and standard DOM methods, client-side JavaScript can request, receive, and visualize information in the context of a single Web page. The advantage of AJAX-based pages over more traditional Web pages is that they better resemble the behavior of desktop applications, providing enhanced features and usability for the user. AJAX (27 Slides) Spec
2007-03-15 Multipurpose Internet Mail Extensions (MIME) Types: One of the most important aspects of computer-based communications is the concept of media types, the question of what type of information some digital artifact represents, and how it is encoded. The most common standard for this information is the scheme introduced by Multipurpose Internet Mail Extensions (MIME). Media types can be negotiated by peers communicating through HTTP. Some media types allow fragment identifiers, which allow references to a resource to identify a fragment of the complete resource. MIME Types (36 Slides) Spec · Registry
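The three ideas in the MIME Types abstract (media type identification, negotiation over HTTP, and fragment identifiers) can be sketched with Python's standard library. The functions `guess_media_type` and `preferred_type` are hypothetical helpers, and the negotiation logic is deliberately simplified: it handles only exact matches and `q` values, not wildcards such as `*/*`.

```python
import mimetypes
from typing import List, Optional
from urllib.parse import urldefrag


def guess_media_type(filename: str) -> str:
    """Guess a media type from a file name, with a generic binary fallback."""
    media_type, _encoding = mimetypes.guess_type(filename)
    return media_type or "application/octet-stream"


def preferred_type(accept_header: str, available: List[str]) -> Optional[str]:
    """Pick the available type with the highest q value in an Accept header.

    Simplified sketch: exact matches only, no wildcard handling.
    """
    ranked = []
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        q = 1.0  # a missing q parameter means full preference
        for param in parts[1:]:
            if param.startswith("q="):
                q = float(param[2:])
        ranked.append((q, parts[0]))
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    for _q, media_type in ranked:
        if media_type in available:
            return media_type
    return None


# A fragment identifier is split off from the URI of the complete resource.
resource, fragment = urldefrag("http://example.org/doc.html#section2")
```

How a fragment such as `section2` is interpreted depends on the media type of the retrieved resource, which is why fragment identifiers are discussed together with media types.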
2007-03-20 AJAXLink – Part VI: Today's focus is on presenting and discussing the first prototype implementations, as well as issues of technical alignment and interoperability. The results of today's presentations should drive the work of the following two weeks, which lead to a stabilized version of prototypes. The most important task today is to get a better understanding of how the individual components fit together, and how code reuse can be improved and code duplication can be avoided. AJAXLink 6 (3 Slides)
2007-03-22 Internationalization (I18N) & Localization (L10N): Many publishing environments need to support multiple languages. In many cases, the requirement to support multiple languages surfaces in later stages of product development or of a publishing solution, which can cause major design changes, driving up costs. Internationalization (I18N) is the approach of designing systems which can adapt to different locales. Localization (L10N) is the activity of identifying, defining, and encoding locales, based on internationalized software. I18N & L10N (32 Slides) Spec
2007-04-03 AJAXLink – Part VII: Based on the presentations from two weeks ago, the focus of today is on presenting consolidated functionality that is implemented in a robust fashion. For the next milestone, it is important to look at the goals which can be achieved realistically, and how the functionality of the prototype can be defined in a way which makes sense both from a functionality and from an implementation point of view. AJAXLink 7 (3 Slides)
2007-04-05 Really Simple Syndication (RSS): For frequently updated content available on the Web, the concept of content syndication has become popular. It refers to the publication of periodically updated information in a special format; RSS is currently the most popular format for this. Syndication is interesting for the reuse of information in various contexts (such as a news ticker always displaying the latest headlines), or for personal information aggregation (compiling a personalized list of news from various sources). RSS (19 Slides) What is RSS?
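The news-ticker scenario from the RSS abstract can be sketched as follows: a consumer reads a feed and extracts the latest headlines. The feed content and the function name `headlines` are invented for illustration, and the sketch assumes an RSS 2.0 document without namespaced extensions.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed, invented for illustration.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>http://example.org/</link>
    <description>Hypothetical feed for illustration</description>
    <item>
      <title>First headline</title>
      <link>http://example.org/1</link>
    </item>
    <item>
      <title>Second headline</title>
      <link>http://example.org/2</link>
    </item>
  </channel>
</rss>"""


def headlines(feed_xml):
    """Return (title, link) pairs for every <item> in the feed, in document order."""
    root = ET.fromstring(feed_xml)
    return [
        (item.findtext("title"), item.findtext("link"))
        for item in root.iter("item")
    ]


ticker = headlines(SAMPLE_FEED)
```

An aggregator would run the same extraction over several feeds and merge the results into one personalized list.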
2007-04-10 XLink Visualization: Even though XLink is an interesting and sufficient foundation for the project of this course, it does have some problems for the AJAXLink application. The most important observation is that XLink is underspecified and missing important complementary specifications in various areas. The two most important areas are linkbase access (how to selectively retrieve XLinks) and link visualization (how to render a link once it has been retrieved). This lecture proposes a simple XLink visualization algorithm. XLink Visualization (19 Slides)
2007-04-12 Atom: While RSS-based syndication has become widely used (in particular for user-generated content), the technical and political problems of the format triggered a new development. The Atom format is an improved syndication format, and it is accompanied by the Atom Publishing Protocol (APP), a REST-based way of interacting with an Atom source. APP includes support for creating, reading, updating, and deleting Atom entries, thus supporting a broad range of resource-centered interactions. Atom (27 Slides) Spec · Spec
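As a rough sketch of the resource-centered interactions described in the Atom abstract, the four APP operations correspond to HTTP methods, and the payload is an Atom entry document. The dictionary `APP_METHODS`, the function `build_entry`, and the sample values are hypothetical; the entry is minimal (a complete entry may also need elements such as an author), and no network request is made.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

# How the create/read/update/delete operations map onto HTTP methods.
APP_METHODS = {
    "create": "POST",    # POST a new entry to a collection
    "read": "GET",       # GET an entry or a collection feed
    "update": "PUT",     # PUT a modified entry to its edit URI
    "delete": "DELETE",  # DELETE the entry at its edit URI
}


def build_entry(title, entry_id, updated, content):
    """Serialize a minimal Atom entry (hypothetical helper, illustration only)."""
    ET.register_namespace("", ATOM_NS)  # emit Atom as the default namespace
    entry = ET.Element("{%s}entry" % ATOM_NS)
    for tag, text in (
        ("title", title),
        ("id", entry_id),
        ("updated", updated),
        ("content", content),
    ):
        child = ET.SubElement(entry, "{%s}%s" % (ATOM_NS, tag))
        child.text = text
    return ET.tostring(entry, encoding="unicode")


entry_xml = build_entry(
    "Hello", "urn:example:entry-1", "2007-04-12T09:00:00Z", "First post"
)
```

A client would send `entry_xml` with the method from `APP_METHODS["create"]` to a collection URI; the URI itself comes from the server's service description and is not assumed here.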
2007-04-17 AJAXLink – Part VIII: The individual components are now tested up to a point where their combination can be tested. This week's presentations focus on the functionality available in each of the components, and the expected interoperability tests for the component. Based on the results of the interoperability tests, the components then can be improved to achieve better interoperability as the final stage of the course project. AJAXLink 8 (3 Slides)
2007-04-19 Web Server Setup: For Web-based publishing, one of the core components is how to make published information available, which is the task of a Web Server (technically speaking, an HTTP Server). While there are many different Web server implementations, this lecture uses the most popular Web server software, the Apache HTTP Server, as an example for the important aspects of Web server configuration. Server Setup (31 Slides) Apache
2007-04-24 Content Management Systems (CMS): Web-based publishing is based on the assumption that there is some data source providing the data which is then published in a Web-compliant way. It is the task of a Content Management System (CMS) to provide a platform for creating, managing, repurposing, and publishing content in a flexible way. CMS are often based on XML, because it provides a good foundation for flexible reuse, but this still leaves open the question of the schema and the available publication pipelines. CMS (29 Slides)
2007-04-26 Publishing Pipelines: While CMS publishing is bound to the features and limitations of a specific CMS, Web publishing often involves the integration of multiple information sources. Using Web technologies, Web publishing tasks can be solved efficiently and in a way which naturally leads to a publishing pipeline producing good Web content. Based on the context theme of the course, this lecture looks at various ways in which context can be leveraged for Web-based publishing. Pipelines (26 Slides)
2007-05-01 AJAXLink – Part IX: The final presentations present the work of the semester-long course project. Each of the project parts presents a list of tackled, solved, and unsolved issues. Work on AJAXLink will continue over the summer and present several opportunities for final projects in the iSchool. Areas of interest are using APP, working on the interaction language, and creating more robust components for publishing, processing, and presentation. AJAXLink 9 (3 Slides)

Creative Commons License please send comments to dret@berkeley.edu
last modification on Thursday, 08-Feb-2007 01:23:18 EST