Overview and Introduction

XML Foundations (INFOSYS 242)

Erik Wilde, UC Berkeley iSchool
Tuesday, August 29, 2006
Creative Commons License

This work is licensed under a Creative Commons
Attribution-NonCommercial-ShareAlike 2.5 License.

Abstract

The Extensible Markup Language (XML) was introduced in 1998 to enable content providers to publish their content on the Web in an application-specific format. HTML was considered to convey too little semantics, since its only purpose was (and is) to prepare content for Web-based publishing. XML was the first step towards machine-readable data formats for the Web, a trend that has since been taken further with the idea of the Semantic Web. XML appeared when the Web was in the steepest part of its success curve, and it has since become the globally accepted format for the exchange of machine-readable structured data.

Outline (Varia)

  1. Varia [5]
  2. Why XML? [9]
    1. Pre-XML Problems [2]
    2. XML on the Web [3]
    3. XML Today [2]
  3. What is XML? [12]
    1. What is XML Good for? [8]
    2. What is XML not Good for? [3]
  4. Beyond XML [2]
  5. Conclusions [1]

About Me

About You

About this Course

About these Slides

Additional Resources

Outline (Why XML?)

Web Technologies

From Humans to Machines

Outline (Pre-XML Problems)

HTML is for Humans

A Machine-Friendly Web

Outline (XML on the Web)

SGML, HTML, and XML

XML Documents on the Web

XML Documents Elsewhere

Outline (XML Today)

Used Everywhere

This Course and XML

Outline (What is XML?)

XML Yin & Yang

Outline (What is XML Good for?)

Why Use XML?

Case Study

Pre-XML Data

@misc{xml10fourth,
    author =    "Tim Bray and Jean Paoli and C. Michael Sperberg-McQueen and Eve Maler and Fran\c{c}ois Yergeau",
    title =     "Extensible Markup Language (XML) 1.0 (Fourth Edition)",
    howpublished =  "World Wide Web Consortium, Recommendation REC-xml-20060816",
    month =     aug,
    year =      2006,
    uri =       "http://www.w3.org/TR/2006/REC-xml-20060816",
    abstract =  "The Extensible Markup Language (XML) is a subset of SGML that is completely described in this document. Its goal is to enable generic SGML to be served, received, and processed on the Web in the way that is now possible with HTML. XML has been designed for ease of implementation and for interoperability with both SGML and HTML."
}

XMLized Data (Bad Idea)

<?xml version="1.0" encoding="UTF-8"?>
<bibtex>
@misc{xml10fourth,
    author =    "Tim Bray and Jean Paoli and C. Michael Sperberg-McQueen and Eve Maler and Fran\c{c}ois Yergeau",
    title =     "Extensible Markup Language (XML) 1.0 (Fourth Edition)",
    howpublished =  "World Wide Web Consortium, Recommendation REC-xml-20060816",
    month =     aug,
    year =      2006,
    uri =       "http://www.w3.org/TR/2006/REC-xml-20060816",
    abstract =  "The Extensible Markup Language (XML) is a subset of SGML that is completely described in this document. Its goal is to enable generic SGML to be served, received, and processed on the Web in the way that is now possible with HTML. XML has been designed for ease of implementation and for interoperability with both SGML and HTML."
}
</bibtex>
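
Why this is a bad idea can be sketched with a few lines of Python, using the standard library's `xml.etree.ElementTree`. The wrapped entry below is abbreviated for illustration (only two fields are kept, and the XML declaration is left out); the variable names are assumptions, not part of the slides. The point is that an XML parser sees one opaque block of text and nothing else.

```python
import xml.etree.ElementTree as ET

# Abbreviated, illustrative version of the wrapped BibTeX entry.
wrapped = """<bibtex>
@misc{xml10fourth,
    title =     "Extensible Markup Language (XML) 1.0 (Fourth Edition)",
    year =      2006
}
</bibtex>"""

root = ET.fromstring(wrapped)

# The document is well-formed XML, but it has no element structure at all:
# the root element has no children, only a single text node.
child_count = len(root)
payload = root.text

# Getting at the title or year still requires a separate BibTeX parser
# applied to `payload`; XML tools cannot address the individual fields.
print(child_count)
print(payload.strip().splitlines()[0])
```

Wrapping existing data in a single element makes it XML only in name; none of the fields become addressable by XML tools.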

XMLized Data

 <reference id="xml10fourth" type="misc">
  <abstract>The Extensible Markup Language (XML) is a subset of SGML that is completely described in this document. Its goal is to enable generic SGML to be served, received, and processed on the Web in the way that is now possible with HTML. XML has been designed for ease of implementation and for interoperability with both SGML and HTML.</abstract>
  <author>
   <person>
    <first>Tim</first>
    <last>Bray</last>
   </person>
   <person>
    <first>Jean</first>
    <last>Paoli</last>
   </person>
   <person>
    <first>C. Michael</first>
    <last>Sperberg-McQueen</last>
   </person>
   <person>
    <first>Eve</first>
    <last>Maler</last>
   </person>
   <person>
    <first>Fran\c{c}ois</first>
    <last>Yergeau</last>
   </person>
  </author>
  <howpublished>World Wide Web Consortium, Recommendation REC-xml-20060816</howpublished>
  <month>
   <macro ref="aug"/>
  </month>
  <title>Extensible Markup Language (XML) 1.0 (Fourth Edition)</title>
  <uri>http://www.w3.org/TR/2006/REC-xml-20060816</uri>
  <year>2006</year>
 </reference>
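
Once the fields are marked up as elements, generic XML tools can address them directly. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the sample is shortened to two authors for illustration, and the variable names are assumptions.

```python
import xml.etree.ElementTree as ET

# Shortened sample in the same vocabulary as the slide (two authors only).
reference_xml = """<reference id="xml10fourth" type="misc">
  <author>
    <person><first>Tim</first><last>Bray</last></person>
    <person><first>Jean</first><last>Paoli</last></person>
  </author>
  <month><macro ref="aug"/></month>
  <year>2006</year>
</reference>"""

ref = ET.fromstring(reference_xml)

# Each field is reachable by a simple path; no BibTeX parsing is needed.
last_names = [p.findtext("last") for p in ref.findall("author/person")]
year = int(ref.findtext("year"))
month_macro = ref.find("month/macro").get("ref")

print(ref.get("id"), last_names, year, month_macro)
```

Note that the month is still a BibTeX macro reference (`aug`), so an application has to know BibTeX conventions to interpret it.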

XML Data

 <reference name="xml10fourth" type="bibtex:misc">
  <names type="sharef:author">
   <person>
    <givenname>Tim</givenname>
    <surname>Bray</surname>
   </person>
   <person>
    <givenname>Jean</givenname>
    <surname>Paoli</surname>
   </person>
   <person>
    <givenname>C. Michael</givenname>
    <surname>Sperberg-McQueen</surname>
   </person>
   <person>
    <givenname>Eve</givenname>
    <surname>Maler</surname>
   </person>
   <person>
    <givenname>François</givenname>
    <surname>Yergeau</surname>
   </person>
  </names>
  <date value="2006-08"/>
  <abstract>
   <richtext>
    <p>The Extensible Markup Language (XML) is a subset of SGML that is completely described in this document. Its goal is to enable generic SGML to be served, received, and processed on the Web in the way that is now possible with HTML. XML has been designed for ease of implementation and for interoperability with both SGML and HTML.</p>
   </richtext>
  </abstract>
  <howpublished>World Wide Web Consortium, Recommendation REC-xml-20060816</howpublished>
  <title type="sharef:primaryTitle">Extensible Markup Language (XML) 1.0 (Fourth Edition)</title>
  <identifier type="sharef:uri">http://www.w3.org/TR/2006/REC-xml-20060816</identifier>
 </reference>

Other XML Data

  <record>
   <work-type>
    <style face="normal" font="default" size="100%">World Wide Web Consortium, Recommendation REC-xml-20060816</style>
   </work-type>
   <ref-type>13</ref-type>
   <contributors>
    <authors>
     <author>
      <style face="normal" font="default" size="100%">Bray, Tim</style>
     </author>
     <author>
      <style face="normal" font="default" size="100%">Paoli, Jean</style>
     </author>
     <author>
      <style face="normal" font="default" size="100%">Sperberg-McQueen, C. Michael</style>
     </author>
     <author>
      <style face="normal" font="default" size="100%">Maler, Eve</style>
     </author>
     <author>
      <style face="normal" font="default" size="100%">Yergeau, François</style>
     </author>
    </authors>
   </contributors>
   <titles/>
   <dates>
    <year>
     <style face="normal" font="default" size="100%">2006</style>
    </year>
    <pub-dates>
     <date>
      <style face="normal" font="default" size="100%">2006-08</style>
     </date>
    </pub-dates>
   </dates>
   <abstract>
    <style face="normal" font="default" size="100%">The Extensible Markup Language (XML) is a subset of SGML that is completely described in this document. Its goal is to enable generic SGML to be served, received, and processed on the Web in the way that is now possible with HTML. XML has been designed for ease of implementation and for interoperability with both SGML and HTML.</style>
   </abstract>
   <urls/>
  </record>
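
The two vocabularies above carry the same bibliographic information, but code written for one does not work on the other. A minimal sketch in Python (standard `xml.etree.ElementTree`), with both samples reduced to a single author; the function names are hypothetical, and the "Last, First" convention in the second format is an assumption read off the sample.

```python
import xml.etree.ElementTree as ET

# Reduced sample in the vocabulary of the "XML Data" slide.
reference_xml = """<reference name="xml10fourth" type="bibtex:misc">
  <names type="sharef:author">
    <person><givenname>Tim</givenname><surname>Bray</surname></person>
  </names>
</reference>"""

# Reduced sample in the vocabulary of the "Other XML Data" slide.
record_xml = """<record>
  <contributors><authors>
    <author><style face="normal" font="default" size="100%">Bray, Tim</style></author>
  </authors></contributors>
</record>"""


def surnames_from_reference(root):
    """Surnames are explicit elements in this vocabulary."""
    return [p.findtext("surname") for p in root.findall("names/person")]


def surnames_from_record(root):
    """Surnames must be split out of a formatted 'Last, First' string."""
    styles = root.findall("contributors/authors/author/style")
    return [s.text.split(",")[0].strip() for s in styles]


reference = ET.fromstring(reference_xml)
record = ET.fromstring(record_xml)

print(surnames_from_reference(reference))
print(surnames_from_record(record))
# A path written for one vocabulary finds nothing in the other.
print(surnames_from_reference(record))
```

Both documents are well-formed XML, yet each needs its own access code and its own knowledge about what the element names mean.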

Is XML Self-Describing?

Outline (What is XML not Good for?)

XML is Character-Based

XML is a Syntax for Trees

XML Usages

Outline (Beyond XML)

Sharing Concepts

The Semantic Web

Outline (Conclusions)

What's the Plan?