Web Architecture

INFO 290 (CCN 42593) – Fall 2009
School of Information, UC Berkeley

Instructor: Erik Wilde
TA: Michael Lissner

Lecture: Tue&Thu 9.00–10.30, 202 South Hall

Description: This course is a survey of Web technologies, ranging from the basic technologies underlying the Web (URI, HTTP, HTML) to more advanced technologies being used in the context of Web engineering, for example structured data formats and Web programming frameworks. The goal of this course is to provide an overview of the technical issues surrounding the Web today, and to provide a solid and comprehensive perspective of the Web's constantly evolving landscape. Because of the large number of technologies covered in this course, only a fraction of them will be discussed and described in greater detail. The main goal of the course thus is an understanding of the interdependencies and connections of Web technologies, and of their capabilities and limitations. Implementing Web-based applications today can be done in a multitude of ways, and this course provides guidelines and best practices on which technologies to choose, and how to use them.

Date Subject Slides Required Reading Additional Resources Assignments [a/]
2009-08-27 Overview and Introduction: This introductory lecture gives the motivation for the course, some information about the people involved and the organization of the course, a high-level overview of the course's topics, and an overview of the assignments which are an important part of the course program. Introduction (21 Slides)
2009-09-01 Web Browsers: This lecture looks at Web browsers and how they work. It introduces the basic functionalities of a browser: retrieval and rendering of Web pages. Any modern browser needs to support more than just HTTP and HTML; it must support CSS for stylesheets, JavaScript for scripted Web pages, various image formats, and popular applications such as Flash. In addition, browsers can support additional functionality such as off-line operation, or in general more application-oriented features such as AIR or Silverlight. Browsers (33 Slides) Wikipedia [http://en.wikipedia.org/wiki/Web_Browser] · History [http://en.wikipedia.org/wiki/History_of_the_web_browser] · YouTube [http://www.youtube.com/watch?v=o4MwTvtyrUQ] Firefox [http://www.mozilla.com/firefox/] · Safari [http://www.apple.com/safari/] · IE [http://www.microsoft.com/windows/products/winfamily/ie/default.mspx] · Chrome [http://www.google.com/chrome] · Opera [http://www.opera.com/]
2009-09-03 Hypertext Markup Language (HTML): The Hypertext Markup Language (HTML) is the most important content type on the Web. This lecture covers a basic overview of how to use HTML markup in general. In particular, we look at page titles, meta tags, inserting text and images, using lists, and creating simple tables. Attributes can be used for more layout control in the HTML tags, but most layout issues are deferred until the CSS lecture. HTML (23 Slides) Getting started with HTML [http://www.w3.org/MarkUp/Guide/] HTML Tutorial [http://www.w3schools.com/html] · HTML Reference [https://developer.mozilla.org/en/HTML/Element] · HTML Validator [http://validator.w3.org/] A1 [a/1/] assigned (due date: 9/20)
2009-09-08 Advanced HTML: This lecture covers linking in general and in header information, and a more general view of HTML layout based on the box model used by browsers. The concept of frames is introduced, which can be used in a combination of framesets and pages, or as inline frames. Finally, image maps are introduced as a way to turn images not only into links, but into a set of linked areas overlaid on the image. Advanced HTML (30 Slides) Advanced HTML [http://www.w3.org/MarkUp/Guide/Advanced.html] Online Image Map Editor [http://www.maschek.hu/imagemap/imgmap]
2009-09-15 Cascading Style Sheets (CSS): Cascading Style Sheets (CSS) were designed as a language for better separating presentation-specific issues from the structuring of documents as provided by HTML. CSS uses a simple model of selectors and declarations. Selectors specify to which elements of a document a set of declarations (each being a value assigned to a property) applies; in addition there is a model of how property values are inherited and cascaded. The biggest limitation of CSS is that it cannot change the structure of the displayed document. CSS (32 Slides) Adding a Touch of Style [http://www.w3.org/MarkUp/Guide/Style] CSS Spec [http://www.w3.org/TR/CSS21/] · Properties [http://www.w3.org/TR/CSS21/propidx.html] · CSS Tutorial [http://www.w3schools.com/css] · CSS Validator [http://jigsaw.w3.org/css-validator/]
2009-09-16 HTML Forms: This lecture introduces HTML Forms, a way for an HTML page to provide input fields, so that users can provide data to a Web-based application. HTML forms are regular HTML pages (i.e., using regular HTML structures), but they also contain special HTML elements for data entry. Most importantly, each form contains instructions on how to submit the entered data, and the browser will use that information to compose a request containing all the data of the form submission. HTML Forms (22 Slides) HTML Forms FAQ [http://htmlhelp.com/faq/html/forms.html] Style Guide [http://www.webstyleguide.com/wsg3/10-forms-and-applications/] · Forms Spec [http://www.w3.org/TR/html401/interact/forms.html]
2009-09-22 Microformats: HTML pages are for human users and describe a resource in structural terms (headings, lists, tables, …). For machine-based interaction, it is often necessary to have more information about the application concepts. XML is a popular language for representing application structures, but is targeted at machine-based processing alone. Microformats and more formal approaches such as the Resource Description Framework (RDF), RDF in Attributes (RDFa), and Web Ontology Language (OWL) often are used to describe Web content semantically. Microformats (30 Slides) Wikipedia [http://en.wikipedia.org/wiki/Microformat] Microformats [http://microformats.org/] · Tutorials [http://www.xfront.com/microformats/]
2009-09-24 Content Management System (CMS): The fundamental architecture of the Web only requires a Web server capable of answering HTTP requests on the server side. The question, however, is what that content server is serving when responding to requests. The content served by Web servers may come from files, from some form of managed more or less static content, or from dynamic processes. In this lecture, the idea of a Content Management System (CMS) or, more specifically, a Web Content Management System (WCMS), is introduced in a structured and disciplined way. CMS (28 Slides) Wikipedia (CMS) [http://en.wikipedia.org/wiki/Content_management_system] · Wikipedia (WCMS) [http://en.wikipedia.org/wiki/Web_content_management_system] Apache [http://httpd.apache.org/docs/2.2/] · Drupal [http://drupal.org/handbooks] · MarkLogic [http://www.marklogic.com/product/marklogic-server.html] A2 [a/2/] assigned (due date: 10/4)
2009-09-29 Extensible Markup Language (XML): The Extensible Markup Language (XML) defines a simple way for structuring data. The power and popularity of XML can be explained by its versatility, its platform independence, the standards and technologies leveraging it, and the number of tools and products supporting it. Understanding XML itself is rather simple; it depends only on a very small set of other technologies. Unicode and URIs are the most important foundations of XML. XML itself specifies two different things: on the one hand the format for structured data, which are called XML documents, and on the other hand a constraint language for XML documents, which is called Document Type Definition (DTD). XML (30 Slides) XML 1.0 Press Release [http://www.w3.org/Press/1998/XML10-REC] · XML Fever [http://dret.net/netdret/docs/wilde-cacm2008-xml-fever.html] · On XML Language Design [http://www.tbray.org/ongoing/When/200x/2006/01/09/On-XML-Language-Design] Spec [http://www.w3.org/TR/REC-xml/] · Structuring Content with XML [http://dret.net/netdret/docs/wilde-elpub2006-xml.pdf] · People [http://www.tbray.org/ongoing/When/200x/2008/02/10/XML-People]
2009-10-01 Documents, Data, and Databases: XML databases often are a good solution for managing document-oriented content, but frequently it is necessary or dictated by existing solutions to use non-XML databases for managing document content. In most cases, these databases will be relational databases. There are two major approaches to managing document-oriented content in a relational database. The first approach is to define a mapping between document and relational structures and work with this mapping. The second approach is to use the XML-specific functionality, which is increasingly provided by relational databases, turning them into XML-aware databases. Documents, Data, and Databases (34 Slides) FAQ [http://www.rpbourret.com/xml/XMLAndDatabases.htm]
2009-10-06 Content Syndication: For many information sources on the Web, it is useful to have some standardized way of subscribing to information updates. Syndication formats such as RSS and Atom can be used by these information sources to publish a feed of updated information items. While RSS and Atom are read-only formats, the Atom Publishing Protocol (AtomPub) builds on top of Atom and provides a protocol for submitting new items to feeds. Syndication (42 Slides) Identifying Atom [http://www.xml.com/pub/a/2004/08/18/pilgrim.html] Atom [http://atompub.org/rfc4287.html] · AtomPub [http://atompub.org/rfc5032.html] · Validator [http://validator.w3.org/feed/]
2009-10-08 Media Types: One of the most important aspects of computer-based communications is the concept of media types, the question of what type of information some digital artifact represents, and how it is encoded. The most common standard for this information is the scheme introduced by Multipurpose Internet Mail Extensions (MIME). Media types can be negotiated by peers communicating through HTTP. Some media types allow fragment identifiers, which allow references to a resource to identify a fragment of the complete resource. Media Types (31 Slides) MIME Respect [http://www.w3.org/2001/tag/doc/mime-respect] MIME [http://dret.net/rfc-index/reference/RFC2046] · Registry [http://www.iana.org/assignments/media-types/] A3 [a/3/] assigned (due date: 10/18)
2009-10-15 Usability: According to the International Organization for Standardization (ISO), usability defines the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use. We will discuss tradeoffs in the design of Web interfaces to support users' goals, and present resources to aid design decisions. Usability Contextual Design (Chapter 3, pp. 41-64) [http://portal.acm.org/citation.cfm?id=286067] Heuristic Evaluation [http://www.useit.com/jakob/inspectbook.html] · useit.com [http://useit.com/]
2009-10-20 Scripting: Scripting is used on the majority of today's modern Web sites. Scripting can be used to improve the usability and accessibility of a Web site (for example for validating form data on the client side), it can vastly improve the user experience with new interface design (the smooth scrolling of Google Maps vs. older click-to-scroll map services), or it can be used to implement behavior that would be impossible without scripting (for example the online applications of Google Docs). Asynchronous JavaScript and XML (Ajax) takes Dynamic HTML (DHTML) to the next level by allowing server access from within scripting code. This is accomplished by using a standardized API for client/server communications, the XMLHttpRequest object. This object allows using HTTP connections from within scripting code, and thereby allows scripting code to dynamically reload data from a server in response to user interactions. Scripting (33 Slides) DHTML [http://www.yourhtmlsource.com/javascript/dhtmlexplained.html] Best Practices [http://domscripting.com/book/sample/] · Tutorial [http://www.webteacher.com/javascript/] · Wikipedia [http://en.wikipedia.org/wiki/Dynamic_HTML] A4 [a/4/] assigned (due date: 11/1)
2009-10-22 Content vs. Context: The Web often is regarded as a content delivery platform (Web 1.0) or as an application development platform (Web 2.0). However, as another part of the Web 2.0 model, context also has become much more important, because (a) users can now more easily contextualize content by creating their own content, and (b) mechanisms such as social networking provide additional context that is bound to a user's identity and the social networks in which this user is engaged. In this lecture, we discuss the move from pure content to a more contextualized view of the content on the Web, and we discuss possible developments and their technical and non-technical implications. Content vs. Context (14 Slides) A5 [a/5/] assigned (due date: 11/8)
2009-10-27 Picture Formats: Pictures are the only multimedia content on the Web that is widely supported by standardized formats. The most important picture formats are the Graphics Interchange Format (GIF), the Joint Photographic Experts Group (JPEG) format, and the Portable Network Graphics (PNG) format. These picture formats target different application areas and, depending on the picture material, choosing one format over the other can make a big difference. Audio and video in many cases are not handled by the browser itself, but are included in this overview of multimedia on the Web. Pictures (31 Slides) GIF [http://www.w3.org/Graphics/GIF/spec-gif89a.txt] · JPEG [http://www.w3.org/Graphics/JPEG/] · PNG [http://www.w3.org/TR/PNG/]
2009-10-29 Internet Architecture: The Internet is the technical infrastructure on top of which the Web is built. Some of the services provided by the Internet are essential for the Web, most importantly the naming service and the data transfer service. The Domain Name System (DNS) provides the human-readable names for computers, which can then be used in the addresses of Web servers and ultimately Web pages. The Transmission Control Protocol (TCP) provides the reliable data transfer service between Web servers and Web browsers, building on the very robust Internet Protocol (IP). Internet (24 Slides) TCP/IP [http://www.acm.org/crossroads/xrds1-1/tcpjmy.html] Internet Architecture [http://en.wikipedia.org/wiki/Category:Internet_architecture] · TCP/IP Overview [http://www.garykessler.net/library/tcpip.html] · Timeline [http://www.zakon.org/robert/internet/timeline/]
2009-11-03 Web Foundations (URI & HTTP): The Web's architecture has very simple principles revolving around the ideas of placing a heavy emphasis on a consistent and global identification mechanism for resources, a standardized way of how resource representations can be retrieved, and a standardized way of how resource representations should be usable by using standardized media types. Based on the Internet, the Web's transport protocol transmits representations of resources identified by a Uniform Resource Identifier (URI) between Web servers and clients. The most important protocol for data transfer on the Web is the Hypertext Transfer Protocol (HTTP). Foundations (25 Slides) HTTP [http://en.wikipedia.org/wiki/Http] · Cool URIs [http://www.w3.org/Provider/Style/URI] Live HTTP Headers [https://addons.mozilla.org/en-US/firefox/addon/3829] · HTTP and CGI [http://www.garshol.priv.no/download/text/http-tut.html] · URI Spec [http://tools.ietf.org/html/rfc3986] · HTTP Spec [http://tools.ietf.org/html/rfc2616]
2009-11-05 Security & Privacy: TCP and thus HTTP are clear-text protocols, which make no attempt to hide the data being transmitted. Secure data transfers thus require additional technologies. For the Web, the most interesting security feature is secure HTTP interaction, which is provided by HTTP over SSL (HTTPS), a protocol that inserts an encryption layer (SSL or TLS) between TCP and HTTP. For any task involving personalization and/or trust, it is not only necessary to have a concept for providing privacy, but also to have concepts for identity and how to prove identity, which requires authentication. Security (29 Slides) Security [http://en.wikipedia.org/wiki/Internet_security] · Privacy [http://en.wikipedia.org/wiki/Internet_privacy] · Browser Security [http://cacm.acm.org/magazines/2009/8/34494-browser-security/fulltext] Browser Options [http://support.mozilla.com/en-US/kb/Options+window] · HTTPS [http://en.wikipedia.org/wiki/Https] · HTTPS Spec [http://tools.ietf.org/html/rfc2818]
2009-11-10 State Management: HTTP is a stateless protocol, where each request/response interaction is a separate interaction and there is no protocol support for longer sessions (such as a user logging in and working on a Web site as an identified user). State management refers to mechanisms which provide support for this kind of scenario; the most popular choice for state management is cookies. Another possibility is URI-based state management. This lecture is a first glimpse into the world of Representational State Transfer (REST), the Web's fundamental model of handling interaction with resources. State (22 Slides) Wikipedia [http://en.wikipedia.org/wiki/HTTP_cookie] State [http://www.w3.org/2001/tag/doc/state.html] · Cookies Spec [http://dret.net/rfc-index/reference/RFC2965]
2009-11-17 Representational State Transfer (REST): Representational State Transfer (REST) is an architectural style for building distributed systems. The Web is an example of such a system. REST-style applications can be built using a wide variety of technologies. REST's main principles are those of resource-oriented states and functionalities, the idea of a unique way of identifying resources, and the idea of how operations on these resources are defined in terms of a single protocol for interacting with resources. REST-oriented system design leads to systems which are open, scalable, extensible, and easy to understand. REST (25 Slides) REST vs. SOAP [http://www.mulberrytech.com/Extreme/Proceedings/html/2002/Prescod01/EML2002Prescod01.html] · What is REST? [http://www.eioba.com/a69755/how_i_explained_rest_to_my_wife] · REST Interfaces [http://bitworking.org/news/193/Do-we-need-WADL] RESTwiki [http://rest.blueoxen.net/cgi-bin/wiki.pl] A6 [a/6/] assigned (due date: 11/22)
2009-11-19 Semantic Web: The Semantic Web can either be understood as a prepackaged set of languages and technologies for representing semantics and working with them, or as a more general idea of Web Semantics, which instead of predefining certain languages and technologies just looks at the various options of how more semantics can be represented on the Web. Taking the latter approach, this lecture looks at the various ways in which semantics can be introduced on the Web, and what is required in these scenarios in terms of technology and information sharing. Semantic Web (19 Slides) Microformats [http://microformats.org/] · Which Semantic Web? [http://www.google.com/search?q=%22Which%20Semantic%20Web?%22+Catherine+C.+Marshall+Frank+M.+Shipman] RDFa [http://www.w3.org/TR/xhtml-rdfa-primer/] · FAQ [http://www.w3.org/2001/sw/SW-FAQ] · RDF [http://www.w3.org/TR/rdf-primer/] · OWL [http://www.w3.org/TR/owl-features/]
2009-11-24 Architecture of the World Wide Web: The Web's architecture has very simple principles revolving around the ideas of placing a heavy emphasis on a consistent and global identification mechanism for resources, a standardized way of how resource representations can be retrieved, and a standardized way of how resource representations should be usable by using standardized media types. This lecture presents an overview of these architectural principles and illustrates them using blogs as an example of Web-based applications. Web Architecture (25 Slides) Architecture? [http://www.martinfowler.com/ieeeSoftware/whoNeedsArchitect.pdf] · Architecture Summary [http://www.w3.org/TR/webarch/summary.html] Architecture [http://www.w3.org/TR/webarch/] A7 [a/7/] assigned (due date: 12/6)
2009-12-01 Internationalization (I18N) & Localization (L10N): Many publishing environments need to support multiple languages. In many cases, the requirement to support multiple languages surfaces in later stages of a product development or publishing solution, which can cause major design changes, driving up costs. Internationalization (I18N) is the approach of designing systems which can adapt to different locales. Localization (L10N) is the activity of identifying, defining, and encoding locales, based on internationalized software. For languages using different alphabets, Unicode is the most popular character set today and provides a variety of encoding schemes, each of them being a Unicode Transformation Format (UTF). I18N & L10N (52 Slides) I18N & L10N Markup [http://www.w3.org/TR/itsreq/] Unicode [http://unicode.org/] · History [http://homepages.cwi.nl/~dik/english/codes/stand.html] · Content [http://www.w3.org/TR/i18n-html-tech-lang/] · Tag Set [http://www.w3.org/TR/its/]
2009-12-03 Web Trends: Web architecture in many cases simply lays the groundwork for developing application areas. In this final lecture we briefly look at some of the current trends on the Web, and how they connect to Web architecture. While the drivers of the trends often are not exclusively technical, they often have a substantial background in technology as an enabler of applications. Not all of the technological issues are within the realm of Web architecture, but increasingly the Web ties together a lot of formerly disconnected application areas, and serves as an integration and unification platform. Trends (23 Slides)
Creative Commons License Please send comments to dret@berkeley.edu
Last modification on Wednesday, 29-Jul-2009 14:54:00 EDT