Web Architecture

INFO 290-03 (CCN 42584) – Fall 2008
School of Information, UC Berkeley

Instructor: Erik Wilde

Lecture: Tue&Thu 9.00–10.30, 202 South Hall

Description: This course is a survey of Web technologies, ranging from the basic technologies underlying the Web (URI, HTTP, HTML) to more advanced technologies being used in the context of Web engineering, for example structured data formats and Web programming frameworks. The goal of this course is to provide an overview of the technical issues surrounding the Web today, and to provide a solid and comprehensive perspective on the Web's constantly evolving landscape. Because of the large number of technologies covered in this course, only a fraction of them will be discussed and described in greater detail. The main goal of the course thus is an understanding of the interdependencies and connections of Web technologies, and of their capabilities and limitations. Implementing Web-based applications today can be done in a multitude of ways, and this course provides guidelines and best practices on which technologies to choose, and how to use them.

Date Subject Slides Required Reading Additional Resources Assignments [a/]
2008-08-28 Overview and Introduction: This introductory lecture gives the motivation for the course, some information about the people involved and the organization of the course, a high-level overview of the course's topics, and an overview of the assignments which are an important part of the course program. Introduction (27 Slides)
2008-09-02 Architecture of the World Wide Web: The Web's architecture rests on a few very simple principles: a heavy emphasis on a consistent and global identification mechanism for resources, a standardized way of retrieving resource representations, and a standardized way of making resource representations usable through standardized media types. This lecture presents an overview of these architectural principles and illustrates them using blogs as an example of Web-based applications. Web Architecture (25 Slides) Architecture? [http://www.martinfowler.com/ieeeSoftware/whoNeedsArchitect.pdf] · Architecture Summary [http://www.w3.org/TR/webarch/summary.html] Architecture [http://www.w3.org/TR/webarch/]
2008-09-04 Internet Foundations: The Internet is the technical infrastructure on top of which the Web is built. Some of the services provided by the Internet are essential for the Web, most importantly the naming service and the data transfer service. The Domain Name System (DNS) provides the human-readable names for computers, which can then be used in the addresses of Web servers and ultimately Web pages. The Transmission Control Protocol (TCP) provides the reliable data transfer service between Web Servers and Web Browsers, building on the very robust Internet Protocol (IP). Internet (31 Slides) Timeline [http://www.zakon.org/robert/internet/timeline/] Internet Architecture [http://en.wikipedia.org/wiki/Category:Internet_architecture]
2008-09-09 Web Foundations (URI & HTTP): The Web assumes an underlying network infrastructure providing a reliable, connection-oriented, flow-controlled, end-to-end transport service. Based on such a network service (today provided by the Internet), the Web's transport protocol moves representations of resources identified by a Uniform Resource Identifier (URI) between Web servers and clients. The most important protocol for data transfer on the Web is the Hypertext Transfer Protocol (HTTP). Foundations (30 Slides) Cool URIs [http://www.w3.org/Provider/Style/URI] Language Negotiation [http://www.w3.org/International/questions/qa-apache-lang-neg]
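As a rough illustration of how a URI relates to an HTTP request, the following sketch splits a URI into its components and builds the text of a minimal GET request from them. The URI is a hypothetical example, not one from the course material, and the request is only constructed as a string, not sent.

```python
from urllib.parse import urlsplit

# Hypothetical example URI (an assumption for illustration only).
uri = "http://www.example.org:8080/courses/webarch?term=fall#schedule"
parts = urlsplit(uri)

components = {
    "scheme": parts.scheme,      # which protocol to use
    "host": parts.hostname,      # resolved through DNS
    "port": parts.port,          # TCP port on the server
    "path": parts.path,          # identifies the resource on the server
    "query": parts.query,        # additional parameters
    "fragment": parts.fragment,  # handled by the client, not sent to the server
}

# Text of a minimal HTTP/1.1 request for this URI; the fragment is omitted.
request = (
    f"GET {parts.path}?{parts.query} HTTP/1.1\r\n"
    f"Host: {parts.netloc}\r\n"
    "\r\n"
)

for name, value in components.items():
    print(f"{name}: {value}")
print(request)
```

The fragment identifier stays on the client side, which is why it does not appear in the request line.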
2008-09-11 Security Issues: TCP and thus HTTP are clear-text protocols, which make no attempt to hide the data being transmitted. Secure data transfers thus require additional technologies. This lecture looks briefly into the foundations of cryptographic primitives (such as one-way functions and encryption) and cryptographic protocols. For the Web, the most interesting security feature is secure HTTP interaction, which is provided by HTTP over SSL (HTTPS), a protocol that inserts an encryption layer (SSL or TLS) between TCP and HTTP. Security (30 Slides) TLS [http://dret.net/rfc-index/reference/RFC4346] · Code Book [http://www.simonsingh.net/The_Code_Book.html] Assignment 1 [a/1/]: HTTP Content Negotiation
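The idea of a one-way function mentioned above can be sketched with a cryptographic hash. SHA-256 is used here purely as an assumed example of such a function; the lecture does not say which functions it covers, and the input strings are made up.

```python
import hashlib


def digest(message: str) -> str:
    """Return the SHA-256 digest of a message as a hex string."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


# Two hypothetical inputs that differ by a single character.
d1 = digest("GET /index.html")
d2 = digest("GET /index.htm")

print(d1)
print(d2)
```

Computing the digest is cheap and deterministic, while recovering the input from the digest is considered infeasible; a small change in the input yields an unrelated digest.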
2008-09-16 Identity and Authentication: For any task involving personalization and/or trust, it is not only necessary to have a concept for providing privacy, but also to have concepts for identity and how to prove identity, which needs authentication. HTTP has built-in mechanisms for authentication, and the standard HTTP Authentication mechanisms are Basic Authentication and Digest Access Authentication. Instead of these mechanisms, many applications implement their own ways of authentication, which often are based around authentication using HTML Forms. Authentication (21 Slides) HTTP Authentication Spec [http://dret.net/rfc-index/reference/RFC2617]
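A minimal sketch of how Basic Authentication credentials are encoded, assuming the scheme described in the HTTP Authentication specification linked above. The helper names `basic_auth_header` and `parse_basic_auth` are hypothetical; the user name and password are the well-known example from that specification.

```python
import base64


def basic_auth_header(user: str, password: str) -> str:
    """Build the value of an Authorization header for Basic Authentication."""
    raw = f"{user}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def parse_basic_auth(header: str) -> tuple[str, str]:
    """Recover (user, password) from a Basic Authorization header value."""
    scheme, _, token = header.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic credential")
    user, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return user, password


header = basic_auth_header("Aladdin", "open sesame")
print("Authorization:", header)
print(parse_basic_auth(header))
```

Base64 is an encoding, not encryption: anyone who sees the header can recover the password, which is one reason Basic Authentication is usually combined with HTTPS.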
2008-09-18 State Management: HTTP is a stateless protocol, where each request/response interaction is a separate interaction and there is no protocol support for longer sessions (such as a user logging in and working on a Web site as an identified user). State management refers to mechanisms which provide support for this kind of scenario; the most popular choice for state management is cookies. Another possibility is URI-based state management. This lecture is a first glimpse into the world of Representational State Transfer (REST), the Web's fundamental model of handling interaction with resources. State (22 Slides) Wikipedia [http://en.wikipedia.org/wiki/HTTP_cookie] State [http://www.w3.org/2001/tag/doc/state.html] · Cookies Spec [http://dret.net/rfc-index/reference/RFC2965]
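Cookie-based state management as described above can be sketched as follows: the server hands out a session identifier in a Set-Cookie header, and later requests carry it back in a Cookie header. The in-memory `sessions` store, the cookie name `session`, and the helper names are assumptions made for illustration.

```python
from http.cookies import SimpleCookie

# Hypothetical in-memory session store: session id -> session data.
sessions = {"abc123": {"user": "alice"}}


def set_cookie_header(session_id: str) -> str:
    """Server side: build the Set-Cookie header line for a session id."""
    cookie = SimpleCookie()
    cookie["session"] = session_id
    cookie["session"]["path"] = "/"
    return cookie.output()


def lookup_user(cookie_header_value: str):
    """Server side: map the Cookie header of a later request to a user."""
    cookie = SimpleCookie()
    cookie.load(cookie_header_value)
    morsel = cookie.get("session")
    if morsel is None:
        return None
    return sessions.get(morsel.value, {}).get("user")


print(set_cookie_header("abc123"))
print(lookup_user("session=abc123"))
```

HTTP itself remains stateless here; the state lives on the server and the cookie only carries the key to it.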
2008-09-23 Representational State Transfer (REST): Representational State Transfer (REST) is an architectural style for building distributed systems. The Web is an example of such a system. REST-style applications can be built using a wide variety of technologies. REST's main principles are those of resource-oriented states and functionalities, the idea of a unique way of identifying resources, and the idea of how operations on these resources are defined in terms of a single protocol for interacting with resources. REST-oriented system design leads to systems which are open, scalable, extensible, and easy to understand. REST (25 Slides) REST vs. SOAP [http://www.mulberrytech.com/Extreme/Proceedings/html/2002/Prescod01/EML2002Prescod01.html] · What is REST? [http://www.eioba.com/a69755/how_i_explained_rest_to_my_wife] · REST Interfaces [http://bitworking.org/news/193/Do-we-need-WADL] RESTwiki [http://rest.blueoxen.net/cgi-bin/wiki.pl]
2008-09-25 Character Set Issues & Unicode: Every character-based document is based on some model of which characters are available, and how they are encoded. Unicode is the most popular character set today and provides a variety of encoding schemes, each of them being a Unicode Transformation Format (UTF). In addition to character sets and encodings, other issues relevant when dealing with characters are transcoding and normalization, which deal with the problems arising when using different character encodings or different encodings of particular characters. Unicode (32 Slides) Unicode [http://unicode.org/] · History [http://homepages.cwi.nl/~dik/english/codes/stand.html]
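The difference between a character, its encodings, and its normalization forms can be made concrete with a short sketch. The character chosen (U+00E9) is just an illustrative assumption.

```python
import unicodedata

text = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE as a single code point

# The same character in three Unicode Transformation Formats.
utf8 = text.encode("utf-8")       # two bytes
utf16 = text.encode("utf-16-be")  # two bytes (one 16-bit code unit)
utf32 = text.encode("utf-32-be")  # four bytes

# Transcoding: bytes stored as ISO-8859-1 re-encoded as UTF-8.
latin1_bytes = b"\xe9"
transcoded = latin1_bytes.decode("latin-1").encode("utf-8")

# Normalization: the same character as 'e' + COMBINING ACUTE ACCENT (NFD),
# and composed back into a single code point (NFC).
decomposed = unicodedata.normalize("NFD", text)
composed = unicodedata.normalize("NFC", decomposed)

print(utf8, utf16, utf32)
print(len(text), len(decomposed), composed == text)
```

Without normalization, the composed and decomposed strings compare as different even though they render identically.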
2008-09-30 Media Types: One of the most important aspects of computer-based communications is the concept of media types, the question of what type of information some digital artifact represents, and how it is encoded. The most common standard for this information is the scheme introduced by Multipurpose Internet Mail Extensions (MIME). Media types can be negotiated by peers communicating through HTTP. Some media types allow fragment identifiers, which allow references to a resource to identify a fragment of the complete resource. Media Types (31 Slides) MIME Respect [http://www.w3.org/2001/tag/doc/mime-respect] MIME [http://dret.net/rfc-index/reference/RFC2046] · Registry [http://www.iana.org/assignments/media-types/]
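A media type as carried in an HTTP Content-Type header consists of a type, a subtype, and optional parameters such as the character encoding. The following sketch reuses the MIME header parser from Python's standard `email` package; the helper name `parse_content_type` and the header values are assumptions for illustration.

```python
from email.message import Message


def parse_content_type(value: str):
    """Split a Content-Type header value into (media type, charset)."""
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_type(), msg.get_content_charset()


html = parse_content_type("text/html; charset=UTF-8")
atom = parse_content_type("application/atom+xml")

print(html)
print(atom)
```

The charset parameter is optional, so the second result carries `None` in its place.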
2008-10-02 Hypertext Markup Language (HTML): The Hypertext Markup Language (HTML) is the most important content type on the Web. Even though it is primarily intended for humans (by presenting formatted pages of textual content), it also has facets that are important for machine-based processing. HTML can be used in a variety of ways, and this lecture looks at some of the important rules that should be observed when creating HTML, for example how to use HTML markup in general and how to create accessible forms. HTML (29 Slides) HTML [http://www.w3.org/TR/html401/] · XHTML [http://www.w3.org/TR/xhtml1/] · Validator [http://validator.w3.org/]
2008-10-07 Accessibility: Web accessibility refers to the degree to which the Web can be used and accessed by people with disabilities. The World Wide Web Consortium (W3C) determines that accessibility specifically involves how people with disabilities can perceive, understand, navigate, interact with, as well as contribute to the Web. Techniques for supporting these needs will be discussed and reviewed through examples and exercises. Accessibility User Interface Design [http://doi.acm.org/10.1145/637848.637858] Universal Usability [http://www.w3.org/WAI/] · WAI [http://www.w3.org/WAI/] · WCAG [http://www.w3.org/TR/WAI-WEBCONTENT/]
2008-10-09 Usability: According to the International Organization for Standardization (ISO), usability defines the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use. We will discuss tradeoffs in the design of Web interfaces to support users' goals, and present resources to aid design decisions. Usability Contextual Design (Chapter 3, pp. 41-64) [http://portal.acm.org/citation.cfm?id=286067] Heuristic Evaluation [http://www.useit.com/jakob/inspectbook.html] · useit.com [http://useit.com/]
2008-10-14 Cascading Style Sheets (CSS): Cascading Style Sheets (CSS) have been designed as a language for better separating presentation-specific issues from the structuring of documents as provided by HTML. CSS uses a simple model of selectors and declarations. Selectors specify to which elements of a document a set of declarations (each being a value assigned to a property) apply; in addition there is a model of how property values are inherited and cascaded. The biggest limitation of CSS is that it cannot change the structure of the displayed document. CSS (41 Slides) CSS Snapshot [http://www.w3.org/TR/css-beijing] · Touch of Style [http://www.w3.org/MarkUp/Guide/Style] Specs [http://www.w3.org/Style/CSS/] · Validator [http://jigsaw.w3.org/css-validator/]
2008-10-16 Asynchronous JavaScript and XML (Ajax): Asynchronous JavaScript and XML (Ajax) takes Dynamic HTML (DHTML) to the next level by allowing server access from within scripting code. This is accomplished by using a standardized API for client/server communications, the XMLHttpRequest object. This object allows using HTTP connections from within scripting code, and thereby allows scripting code to dynamically reload data from a server in response to user interactions. Ajax (27 Slides) DOM [http://www.w3.org/DOM/] · XMLHttpRequest [http://www.w3.org/TR/XMLHttpRequest/]
2008-10-21 Internationalization (I18N) & Localization (L10N): Many publishing environments need to support multiple languages. In many cases, the requirement to support multiple languages surfaces in later stages of a product development or publishing solution, which can cause major design changes, driving up costs. Internationalization (I18N) is the approach of designing systems which can adapt to different locales. Localization (L10N) is the activity of identifying, defining, and encoding locales, based on internationalized software. I18N & L10N (32 Slides) I18N & L10N Markup [http://www.w3.org/TR/itsreq/] Content [http://www.w3.org/TR/i18n-html-tech-lang/] · Tag Set [http://www.w3.org/TR/its/]
2008-10-28 Picture Formats: Pictures are the only multimedia content on the Web that is widely supported by standardized formats. The most important picture formats are the Graphics Interchange Format (GIF), the Joint Photographic Experts Group (JPEG) format, and the Portable Network Graphics (PNG) format. These picture formats target different application areas and depending on the picture material, choosing one format over the other can make a big difference. Pictures (22 Slides) GIF [http://www.w3.org/Graphics/GIF/spec-gif89a.txt] · JPEG [http://www.w3.org/Graphics/JPEG/] · PNG [http://www.w3.org/TR/PNG/]
2008-10-30 Content Syndication: For many information sources on the Web, it is useful to have some standardized way of subscribing to information updates. Syndication formats such as RSS and Atom can be used by these information sources to publish a feed of updated information items. While RSS and Atom are read-only formats, the Atom Publishing Protocol (AtomPub) builds on top of Atom and provides a protocol for submitting new items to feeds. Syndication (44 Slides) Identifying Atom [http://www.xml.com/pub/a/2004/08/18/pilgrim.html] Atom [http://atompub.org/rfc4287.html] · AtomPub [http://atompub.org/rfc5032.html] · Validator [http://validator.w3.org/feed/]
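To give an impression of what a syndication feed looks like to a client, the following sketch reads the entries of a small Atom document. The feed content (titles, ids, dates) is invented for illustration; only the Atom namespace and element names come from the Atom format itself.

```python
import xml.etree.ElementTree as ET

ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Hypothetical feed with two entries (made-up content).
FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Feed</title>
  <id>urn:example:feed</id>
  <updated>2008-10-30T09:00:00Z</updated>
  <entry>
    <title>First item</title>
    <id>urn:example:entry:1</id>
    <updated>2008-10-29T12:00:00Z</updated>
  </entry>
  <entry>
    <title>Second item</title>
    <id>urn:example:entry:2</id>
    <updated>2008-10-30T09:00:00Z</updated>
  </entry>
</feed>"""

root = ET.fromstring(FEED)
feed_title = root.findtext("atom:title", namespaces=ATOM_NS)

# Collect (title, updated) for each entry, in document order.
entries = [
    (
        entry.findtext("atom:title", namespaces=ATOM_NS),
        entry.findtext("atom:updated", namespaces=ATOM_NS),
    )
    for entry in root.findall("atom:entry", ATOM_NS)
]

print(feed_title)
for title, updated in entries:
    print(updated, title)
```

A feed reader would typically compare the `updated` and `id` values against what it has already seen to detect new or changed items.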
2008-11-13 Course Project Start: Today the class project is introduced and we will have an initial discussion about the scope of the project. Course Project Start
2008-11-18 Semantic Web: HTML pages are for human users and describe a resource in very general terms (headings, lists, tables, …). For machine-based interaction, it is often necessary to have more information about the application concepts. XML is a popular language for representing application structures, but is targeted at machine-based processing alone. Microformats and more formal approaches such as the Resource Description Framework (RDF), RDF in Attributes (RDFa), and Web Ontology Language (OWL) often are used to describe Web content semantically. Semantic Web (30 Slides) Microformats [http://microformats.org/] · RDFa [http://www.w3.org/TR/xhtml-rdfa-primer/] · FAQ [http://www.w3.org/2001/sw/SW-FAQ] · RDF [http://www.w3.org/TR/rdf-primer/] · OWL [http://www.w3.org/TR/owl-features/]
2008-11-20 Course Project Proposals: Short presentations of all proposals and further discussion about the project scope and the possible interfaces between individual projects. Course Project Proposals
2008-12-02 Variants and Analysis: Today's landscape of Internet and Web technologies offers a sometimes confusingly wide array of implementation choices. Given some application idea, the implementation can use basic Web technologies or newer Web 2.0 technologies, it can use browser-embedded functionality such as Flash, Java Applets, ActiveX, Silverlight, or Google Gears, or it can be built with Web-oriented application development platforms such as Adobe Integrated Runtime (AIR) or JavaFX. Starting from a desired objective (such as the successful implementation of a well-designed Web app), it can be very informative to assess factors influencing the pursuit of this objective. One way to do this is to analyze the Strengths, Weaknesses, Opportunities, and Threats (SWOT) of implementation variants, which supports a more structured way of comparing variants, and can be a starting point for choosing the best one. Variants and Analysis (28 Slides)
2008-12-04 Course Project Presentations: Short presentations of all project results and discussion about the project future and possible improvements of the proposed architectures. Course Project Presentations
Creative Commons License Please send comments to dret@berkeley.edu
Last modification on Thursday, 20-Nov-2008 21:10:20 EST