Web Architecture and Information Management

INFO 190-02 (CCN 42509) – Spring 2009
School of Information, UC Berkeley

Instructor: Erik Wilde
TAs: Ruchi Kumar and Anuradha Roy

Lecture: Mon & Wed 15.00–16.00, 210 Wheeler Hall
Lab: Fri 11.00–12.00, 110 South Hall; Fri 14.00–15.00 202 South Hall

Description: This course focuses on understanding the Web as an information system, and how to use it for information management for personal and shared information. The Web is an open and constantly evolving system, which can make it hard to understand how the different parts of the landscape fit together. This course provides students with an overview of the Web as a whole, and how the individual parts fit together. We briefly look at topics such as Web design and Web programming, but this course is not exclusively designed to teach HTML or JavaScript. Instead, we look at the bigger picture and how and when to use these and other technologies. The Web already is and will remain a central part in many information-related activities for a long time to come, and this course provides students with the understanding and skills to better navigate and use the landscape of Web information (for example, Wikipedia), Web technologies (for example, HTML, CSS, and JavaScript), Web tools (for example, Delicious and Yahoo Pipes), and common Web patterns (for example, mashups).

Date Subject Slides Required Reading Additional Resources
2009-01-21 Overview and Introduction: This introductory lecture gives the motivation for the course, some information about the people involved and the organization of the course, a high-level overview of the course's topics, and an overview of the assignments, which are an important part of the course program. The final part of the lecture describes how to connect to and set up a Web space. Introduction (PDF [2009-01-21-intro.pdf]) bSpace [https://bspace.berkeley.edu/portal/site/d1b86df6-7f28-4db3-ba2d-bdd6b3b27318] · Flier [flier]
2009-01-26 Setup and Environment: This lecture provides a hands-on overview of the various tools that are required to create and publish Web pages. This includes authoring a Web page, transferring the page onto your Web space on the Web server, and validating the Web page. For a more detailed look at the Web page in the context of the browser, the Firebug extension allows Firefox users to understand in detail how a Web page is structured and styled. Setup (PDF [2009-01-26-setup.pdf]) Firebug [https://addons.mozilla.org/en-US/firefox/addon/1843] · XAMPP [http://www.apachefriends.org/en/xampp.html]
2009-01-28 Hypertext Markup Language (HTML): The Hypertext Markup Language (HTML) is the most important content type on the Web. This lecture covers a basic overview of how to use HTML markup in general. In particular, we look at page titles, meta tags, inserting text and images, using lists, and creating simple tables. Attributes can be used for more layout control in the HTML tags, but most layout issues are deferred until the CSS lecture. HTML (PDF [2009-01-28-html.pdf]) Getting started with HTML [http://www.w3.org/MarkUp/Guide/] HTML Tutorial [http://www.w3schools.com/html] · HTML Reference [https://developer.mozilla.org/en/HTML/Element] · HTML Validator [http://validator.w3.org/]
2009-02-02 Advanced HTML: This lecture covers linking in general and in header information, and a more general view of HTML layout based on the box model used by browsers. The concept of frames is introduced, which can be used in a combination of framesets and pages, or as inline frames. Finally, image maps are introduced as a way to turn images not only into links, but into a set of linked areas overlaid on the image. Advanced HTML (PDF [2009-02-02-html-advanced.pdf]) Advanced HTML [http://www.w3.org/MarkUp/Guide/Advanced.html] Online Image Map Editor [http://www.maschek.hu/imagemap/imgmap]
2009-02-04 Cascading Style Sheets (CSS): Cascading Style Sheets (CSS) have been designed as a language for better separating presentation-specific issues from the structuring of documents as provided by HTML. CSS uses a simple model of selectors and declarations. Selectors specify to which elements of a document a set of declarations (each being a value assigned to a property) applies; in addition, there is a model of how property values are inherited and cascaded. The biggest limitation of CSS is that it cannot change the structure of the displayed document. CSS (PDF [2009-02-04-css.pdf]) Adding a Touch of Style [http://www.w3.org/MarkUp/Guide/Style] CSS Spec [http://www.w3.org/TR/CSS21/] · Properties [http://www.w3.org/TR/CSS21/propidx.html] · CSS Tutorial [http://www.w3schools.com/css] · CSS Validator [http://jigsaw.w3.org/css-validator/]
2009-02-09 Web Browsers: This lecture looks at Web browsers and how they work. It introduces the basic functions of a browser: retrieval and rendering of Web pages. Any modern browser needs to support more than just HTTP and HTML; it must support CSS for stylesheets, JavaScript for scripted Web pages, various image formats, and popular applications such as Flash. In addition, browsers can support additional functionality such as off-line operation, or in general more application-oriented features such as AIR or Silverlight. Browsers (PDF [2009-02-09-browsers.pdf]) Wikipedia [http://en.wikipedia.org/wiki/Web_Browser] · History [http://en.wikipedia.org/wiki/History_of_the_web_browser] Firefox [http://www.mozilla.com/firefox/] · Safari [http://www.apple.com/safari/] · IE [http://www.microsoft.com/windows/products/winfamily/ie/default.mspx] · Chrome [http://www.google.com/chrome] · Opera [http://www.opera.com/]
2009-02-11 HTML Forms: This lecture introduces HTML Forms, a way for an HTML page to provide input fields, so that users can provide data to a Web-based application. HTML forms are regular HTML pages (i.e., using regular HTML structures), but they also contain special HTML elements for data entry. Most importantly, each form contains instructions on how to submit the entered data, and the browser will use that information to compose a request containing all the data of the form submission. HTML Forms (PDF [2009-02-11-forms.pdf]) HTML Forms FAQ [http://htmlhelp.com/faq/html/forms.html] Style Guide [http://www.webstyleguide.com/wsg3/10-forms-and-applications/] · Forms Spec [http://www.w3.org/TR/html401/interact/forms.html]
2009-02-18 Internet Architecture: The Internet is the technical infrastructure on top of which the Web is built. Some of the services provided by the Internet are essential for the Web, most importantly the naming service and the data transfer service. The Domain Name System (DNS) provides the human-readable names for computers, which can then be used in the addresses of Web servers and ultimately Web pages. The Transmission Control Protocol (TCP) provides the reliable data transfer service between Web servers and Web browsers, building on the very robust Internet Protocol (IP). Internet (PDF [2009-02-18-internet.pdf]) TCP/IP [http://www.acm.org/crossroads/xrds1-1/tcpjmy.html] Internet Architecture [http://en.wikipedia.org/wiki/Category:Internet_architecture] · TCP/IP Overview [http://www.garykessler.net/library/tcpip.html] · Timeline [http://www.zakon.org/robert/internet/timeline/]
2009-02-23 Security & Privacy: TCP and thus HTTP are clear-text protocols, which make no attempt to hide the data being transmitted. Secure data transfers thus require additional technologies. For the Web, the most interesting security feature is secure HTTP interactions, which are provided by HTTP over SSL (HTTPS), a protocol that inserts an encryption layer (SSL or TLS) between TCP and HTTP. For any task involving personalization and/or trust, it is not only necessary to have a concept for providing privacy, but also to have concepts for identity and how to prove identity, which requires authentication. Security (PDF [2009-02-23-security.pdf]) Security [http://en.wikipedia.org/wiki/Internet_security] · Privacy [http://en.wikipedia.org/wiki/Internet_privacy] Browser Options [http://support.mozilla.com/en-US/kb/Options+window] · HTTPS [http://en.wikipedia.org/wiki/Https] · HTTPS Spec [http://tools.ietf.org/html/rfc2818]
2009-02-25 Web Foundations (URI & HTTP): The Web's architecture has very simple principles: a heavy emphasis on a consistent and global identification mechanism for resources, a standardized way to retrieve resource representations, and a standardized way to make resource representations usable through standardized media types. Based on the Internet, the Web's transport protocol transmits representations of resources identified by a Uniform Resource Identifier (URI) between Web servers and clients. The most important protocol for data transfer on the Web is the Hypertext Transfer Protocol (HTTP). Foundations (PDF [2009-02-25-foundations.pdf]) HTTP [http://en.wikipedia.org/wiki/Http] · Cool URIs [http://www.w3.org/Provider/Style/URI] Live HTTP Headers [https://addons.mozilla.org/en-US/firefox/addon/3829] · HTTP and CGI [http://www.garshol.priv.no/download/text/http-tut.html] · URI Spec [http://tools.ietf.org/html/rfc3986] · HTTP Spec [http://tools.ietf.org/html/rfc2616]
2009-03-02 Site Navigation: Most Web pages are part of bigger structures, usually Web sites. One common goal of Web sites is to make navigation of Web pages easy to understand and use. There are two main sides to site navigation: how to design it from the user's point of view, and how to implement it from the Web site's point of view. User perspectives can be seen as a special case of Web Design Patterns: tasks for Web-based publishing that have to be addressed for a large percentage of all Web sites. Implementation perspectives look at how to efficiently manage information so that changes to the Web site are easily possible. Navigation (PDF [2009-03-02-navigation.pdf]) SSI for Navigation [http://www.yourhtmlsource.com/sitemanagement/includes.html] Style Guide [http://www.webstyleguide.com/wsg3/5-site-structure/] · SSI Tutorial [http://httpd.apache.org/docs/2.2/howto/ssi.html] · WCAG [http://www.w3.org/TR/WCAG/#navigation-mechanisms] · Web Patterns [http://groups.ischool.berkeley.edu/ui_designpatterns/webpatterns2/webpatterns/pattern.php?id=11]
2009-03-04 State Management (Cookies): HTTP is a stateless protocol, where each request/response interaction is a separate interaction and there is no protocol support for longer sessions (such as a user logging in and working on a Web site as an identified user). State management refers to mechanisms that provide support for this kind of scenario; the most popular choice for state management is cookies. Another possibility is URI-based state management. This lecture is a glimpse into the world of Representational State Transfer (REST), the Web's fundamental model of handling interaction with resources. Cookies (PDF [2009-03-04-cookies.pdf]) HowStuffWorks [http://computer.howstuffworks.com/cookie.htm/printable] Cookie Spec [http://tools.ietf.org/html/rfc2965] · Wikipedia [http://en.wikipedia.org/wiki/HTTP_cookie] · HTTP Viewer [http://www.httpviewer.net/]
2009-03-09 Multimedia Content: Pictures are the only multimedia content on the Web that is widely supported by standardized formats. The most important picture formats are the Graphics Interchange Format (GIF), the Joint Photographic Experts Group (JPEG) format, and the Portable Network Graphics (PNG) format. These picture formats target different application areas, and depending on the picture material, choosing one format over the other can make a big difference. While audio and video are not supported by Web browsers, they have nevertheless become popular media types on the Web. Multimedia (PDF [2009-03-09-multimedia.pdf]) Style Guide [http://www.webstyleguide.com/wsg3/11-graphics/] Wikipedia [http://en.wikipedia.org/wiki/Graphics_file_format] · YSlow [https://addons.mozilla.org/en-US/firefox/addon/5369] · PNG Spec [http://www.w3.org/TR/PNG/]
2009-03-11 Media Types: One of the most important aspects of computer-based communications is the concept of media types: the question of what type of information some digital artifact represents, and how it is encoded. The most common standard for this information is the scheme introduced by Multipurpose Internet Mail Extensions (MIME). Media types can be negotiated by peers communicating through HTTP. Some media types allow fragment identifiers, which allow references to a resource to identify a fragment of the complete resource. MIME (PDF [2009-03-11-mime.pdf]) Firefox Handling [https://developer.mozilla.org/En/How_Mozilla_determines_MIME_Types] Registry [http://www.iana.org/assignments/media-types/] · Wikipedia [http://en.wikipedia.org/wiki/MIME_type]
2009-03-16 Guest Lecture by Raymond Yee [http://raymondyee.net/] : Introduction to Mashups: This lecture covers the difference between static and dynamic content, how each is created, and when each is useful. It discusses JSP and server-side validation and compares them with client-side scripting using JavaScript, which looks nice but may not work in all browsers. It also covers the difference between a Web server and an application server. Mashup Intro HousingMaps [http://www.housingmaps.com/] · Flickr Sudoku [http://flickrsudoku.com/] · ProgrammableWeb [http://programmableweb.com/]
2009-03-18 Character Sets, Internationalization (I18N), and Localization (L10N): Every character-based document is based on some model of which characters are available, and how they are encoded. Unicode is the most popular character set today and provides a variety of encoding schemes, each of them being a Unicode Transformation Format (UTF). Many publishing environments need to support multiple languages. Internationalization (I18N) is the approach of designing systems that can adapt to different locales. Localization (L10N) is the activity of identifying, defining, and encoding locales, based on internationalized software. I18N (PDF [2009-03-18-i18n.pdf]) Writing Systems [http://en.wikipedia.org/wiki/Writing_system] History [http://homepages.cwi.nl/~dik/english/codes/stand.html] · Unicode Database [http://people.w3.org/rishida/scripts/uniview/] · Unicode Converter [http://people.w3.org/rishida/scripts/uniview/conversion.php]
2009-03-30 Scripting: Scripting is used on the majority of today's modern Web sites. Scripting can be used to improve the usability and accessibility of a Web site (for example, for validating form data on the client side), it can vastly improve the user experience with new interface design (the smooth scrolling of Google Maps vs. older click-to-scroll map services), or it can be used to implement behavior that would be impossible without scripting (for example, the online applications of Google Docs). This introductory lecture looks into scripting fundamentals such as JavaScript itself, the Document Object Model (DOM) for accessing the browser window's content, and XMLHttpRequest for script-server communications. Scripting (PDF [2009-03-30-scripting.pdf]) DHTML [http://www.yourhtmlsource.com/javascript/dhtmlexplained.html] Best Practices [http://domscripting.com/book/sample/] · Tutorial [http://www.webteacher.com/javascript/] · Wikipedia [http://en.wikipedia.org/wiki/Dynamic_HTML]
2009-04-06 Content Syndication: For many information sources on the Web, it is useful to have some standardized way of subscribing to information updates. Syndication formats such as RSS and Atom can be used by these information sources to publish a feed of updated information items. Feeds can be read directly in a browser, but in most cases they are read by specialized software: either a feed reader that allows users to subscribe to more than one feed and manage the information received through all these feeds, or some software module that reads feeds and embeds them, for example, in a Web page. This latter example is the classical usage of feeds: news feeds published by news agencies and then embedded as news tickers into Web pages as a constantly updated source of information. Syndication (PDF [2009-04-06-syndication.pdf]) History [http://en.wikipedia.org/wiki/History_of_web_syndication_technology] Wikipedia (Syndication) [http://en.wikipedia.org/wiki/Web_syndication] · Wikipedia (Feeds) [http://en.wikipedia.org/wiki/Web_feed] · Podcast Spec [http://www.apple.com/itunes/whatson/podcasts/specs.html]
2009-04-08 Syndication Aggregation: Feeds are useful as information sources adhering to a standardized format, but they usually have a single source. In a larger picture of information flows on the Web, the question is how feeds can be handled more efficiently and flexibly. This includes questions such as load balancing and user statistics, but also more complex scenarios such as aggregating (and maybe even filtering) feeds. Since feeds use a standardized data format, such repurposed information again can be published as a feed, creating a flexible architecture of feed-based information dissemination, allowing an arbitrary number of feed aggregation steps. Aggregation (PDF [2009-04-08-aggregation.pdf]) Delicious [http://delicious.com/help/learn] · Google Reader [http://www.google.com/support/reader/?hl=en]
2009-04-13 Location and Geocoding: Location currently is not a concept that is supported by the Web itself, but there are many location-aware applications available on the Web. Furthermore, the increasing availability of mobile devices and mobile Internet connectivity will turn location into an increasingly important concept on the Web. This lecture looks at some of the questions of how location information can be obtained, and how it is represented in today's Web standards (such as feeds) and applications (such as Flickr and Google Maps). GeoRSS (PDF [2009-04-13-georss.pdf]) Geotagging [http://en.wikipedia.org/wiki/Geotagging] Spec [http://georss.org/] · Flickr [http://blog.flickr.net/2008/08/08/introducing-a-new-way-to-geotag/]
2009-04-15 Describing Geographical Objects: The Keyhole Markup Language (KML) is a way to describe placemarks and other geographical features. It is not as powerful or sophisticated as the Geography Markup Language (GML), but it is easier to understand and use and is supported as a data format by a variety of Web-oriented services and applications. Flickr, Google Maps, and Google Earth all support KML and can use KML for exchanging geographic datasets. KML (PDF [2009-04-15-kml.pdf]) Video [http://www.youtube.com/watch?v=TftFnot5uXw] My Maps [http://maps.google.com/support/bin/answer.py?hl=en&answer=68480] · Spec [http://code.google.com/apis/kml/documentation/kmlreference.html] · Wikipedia [http://en.wikipedia.org/wiki/KML]
2009-04-20 Guest Lecture by Raymond Yee [http://raymondyee.net/] : Introduction to the Google Maps API: After studying some specific instances of map-based mashups, we will study the basics of the Google Maps API and the process of geocoding. Google Maps API [http://code.google.com/apis/maps/index.html] · KML & GeoRSS [http://googlemapsapi.blogspot.com/2007/03/kml-and-georss-support-added-to-google.html]
2009-04-22 Guest Lecture by Raymond Yee [http://raymondyee.net/] : Mashups with the Google Maps API: In this lecture, we will study how to make a simple mashup using the Google Maps API. Google Maps Mashups
2009-04-27 Semantic Web and Microformats: HTML pages are for human users and describe a resource in structural terms (headings, lists, tables, …). For machine-based interaction, it is often useful to have more information about the application concepts. The Web reflects the various ways in which the issue of semantics has been addressed in other disciplines, with the Semantic Web having the strongest commitment to highly formalized semantics. On the syntax side, the Extensible Markup Language (XML) is a popular language for representing application structures, but it represents only syntax and no semantics. Semantic Web (PDF [2009-04-27-semweb.pdf]) Wikipedia [http://en.wikipedia.org/wiki/Semantic_Web] FAQ [http://www.w3.org/2001/sw/SW-FAQ]
2009-04-29 Web Semantics in Practice: Web semantics make it possible to know more about the meaning of Web content, not only its syntactic representation. Microformats and more formal approaches such as the Resource Description Framework (RDF), RDF in Attributes (RDFa), and the Web Ontology Language (OWL) can be used to describe Web content semantically. After looking at Semantic Web concepts such as Microformats and RDF, we look into some practical issues of how to express semantics on the Web. Web Semantics (PDF [2009-04-29-websemantics.pdf]) hCard [http://microformats.org/wiki/hcard] · hCalendar [http://microformats.org/wiki/hcalendar] · geo [http://microformats.org/wiki/geo] · adr [http://microformats.org/wiki/adr] microformats.org [http://microformats.org/] · RDFa [http://www.w3.org/TR/xhtml-rdfa-primer/]
2009-05-04 Representational State Transfer (REST): The Web is built on an architectural style called Representational State Transfer (REST). The main idea of this style is to use a uniform interface for all services, which means that each Web site provides the same service. This idea of a uniform interface is apparent in Web documents (browsers can GET documents by following hyperlinks), but also can be extended to cover machine-oriented Web Services. REST is a style that supports loose coupling and massive scalability, as opposed to more traditional approaches in enterprise computing that attempt to integrate all functionality in order to hide distribution. REST (PDF [2009-05-04-rest.pdf]) Simple Explanation [http://tomayko.com/writings/rest-to-my-wife]
2009-05-06 Course Summary and Review: The course summary looks at the topics covered in the course, and we will briefly discuss all the topics and what was important about them in the context of the course. The review sheet provides an overview of how to best study for the final exam (final exam date and location: Monday, May 18, 5.00-6.30pm in 110 Barrows Hall [http://berkeley.edu/map/maps/DE45.html]). Review Sheet
2009-05-11 Evaluation and Q&A: This final class begins with the course evaluation. After that, we will go through the remaining parts of the course summary, and then have a Q&A session about course and exam topics. Please make sure to read the review sheet as a starting point for the Q&A session. Course Evaluation
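
Illustration (not part of the course materials): the HTML Forms and Web Foundations abstracts in the schedule describe how a browser uses a form's submission instructions to compose an HTTP request for a resource identified by a URI. The Python sketch below shows one way that step might look for a GET submission. The function name compose_get_request, the example.org URI, and the field names are assumptions made up for this example, and a real browser sends many more headers than shown here.

```python
from urllib.parse import urlsplit, urlencode


def compose_get_request(action_uri, fields):
    """Return the text of a minimal HTTP/1.1 GET request for a form submission.

    action_uri -- the URI a form would submit to (hypothetical in this sketch)
    fields     -- dict mapping form field names to the values a user entered
    """
    parts = urlsplit(action_uri)       # split the URI into scheme, host, path, ...
    query = urlencode(fields)          # encode the fields as name=value pairs
    path = parts.path or "/"           # an empty path means the site root
    target = path + "?" + query if query else path
    # Request line, the Host header required by HTTP/1.1, then a blank line.
    return "GET " + target + " HTTP/1.1\r\n" + "Host: " + parts.netloc + "\r\n" + "\r\n"


# Hypothetical form: a search box and a language selector.
request = compose_get_request(
    "http://example.org/search",
    {"q": "web architecture", "lang": "en"},
)
print(request)
```

The form data travels in the query component of the URI, so the same submission can be bookmarked or linked; a POST submission would instead carry the encoded fields in the request body.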
Creative Commons License Please send comments to dret@berkeley.edu
Last modification on Wednesday, 21-Jan-2009 14:15:19 PST