XML Schema

Information Systems and the World Wide Web

International School of New Media
University of Lübeck

Erik Wilde, UC Berkeley School of Information
2007-01-09
Creative Commons License

This work is licensed under a Creative Commons
Attribution-NonCommercial-ShareAlike 2.5 License.

Abstract

XML Schema is the most popular schema language for XML today. It was introduced to overcome some commonly observed limitations of DTDs, most notably the lack of typing. Simple types describe content that is not structured by XML markup, that is, attribute values and character-only element content. New simple types can be derived from existing types by restriction. Complex types describe elements that have attributes or contain child elements, rather than only character data. XML Schema's type concepts make it easier to represent model-level information in a schema, because type hierarchies can represent model-level specializations.

Bad Names

XML Schema is a language for describing an XML schema.
An XML schema can be defined using XML Schema.
I would like to use XML Schema for my XML schema.

What's Wrong With DTDs?

Different Levels of Semantics

Schema-Validation and Applications

Validation and Typing

  1. Validation checks for structural integrity (is the document schema-valid?)
    • checking elements and attributes for proper usage (as with DTDs)
    • checking element contents and attribute values for proper values
  2. Type annotations make the types available to applications
    • instead of having to look at the schema, applications get the Post-Schema Validation Infoset (PSVI)
    • type-based applications (such as XSLT 2.0) can work on the typed instance
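
The two steps above can be sketched with a single typed element. This is a minimal illustration only: the `shipDate` element is hypothetical and does not appear in the original slides. Validation accepts or rejects the content, and a schema-aware processor can then treat the value as an `xs:date` rather than as a plain string.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <!-- hypothetical element, used only for illustration -->
 <xs:element name="shipDate" type="xs:date"/>
 <!-- schema-valid instance:  <shipDate>2007-01-09</shipDate> -->
 <!-- not schema-valid:       <shipDate>next week</shipDate>  -->
</xs:schema>
```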

Outline (XML Schema Types)

  1. XML Schema Types
  2. Simple Types
    1. Simple Type Restrictions
  3. Complex Types
    1. Content Models
  4. Local and Global Definitions
    1. Elements
    2. Attributes
  5. Names and Namespaces
  6. Identity Constraints
  7. Complex Type Derivation
    1. Complex Type Restriction
    2. Complex Type Extension
  8. Conclusions

What is a Type?

XML Schema vs. DTD

                       DTD                          XML Schema
Concepts               some conceptual model (formal/informal)
Types                  ID/IDREF and (#P)CDATA       Hierarchy of Simple and Complex Types
Markup Constructs      Element Type Declarations    Element Definitions
                       <!ELEMENT order ...          <xs:element name="order"> ...
Instances (Documents)  <order date=""> [ order content ] </order>

Document/Data Perspectives

Simple Types

What are Simple Types?

Named vs. Anonymous

 <xs:element name="home" type="phoneType"/>
 <xs:element name="office" type="phoneType"/>
 <xs:simpleType name="phoneType">
  <xs:restriction base="xs:string">
   <xs:maxLength value="30"/>
  </xs:restriction>
 </xs:simpleType>
 <xs:element name="business">
  <xs:simpleType>
   <xs:restriction base="xs:string">
    <xs:maxLength value="30"/>
   </xs:restriction>
  </xs:simpleType>
 </xs:element>

Type Definitions

Type Hierarchy

Simple Type Restrictions

Built-In Types

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <xs:simpleType name="integer">
  <xs:restriction base="xs:decimal">
   <xs:fractionDigits value="0" fixed="true"/>
  </xs:restriction>
 </xs:simpleType>
 <xs:simpleType name="nonNegativeInteger">
  <xs:restriction base="integer">
   <xs:minInclusive value="0"/>
  </xs:restriction>
 </xs:simpleType>
 <xs:simpleType name="positiveInteger">
  <xs:restriction base="nonNegativeInteger">
   <xs:minInclusive value="1"/>
  </xs:restriction>
 </xs:simpleType>
</xs:schema>

How to Restrict

Facets

Facet Applicability

Type          Applicable facets
string        length, minLength, maxLength, pattern, enumeration, whiteSpace
boolean       pattern, whiteSpace
float         pattern, enumeration, whiteSpace, maxInclusive, maxExclusive, minInclusive, minExclusive
double        pattern, enumeration, whiteSpace, maxInclusive, maxExclusive, minInclusive, minExclusive
decimal       totalDigits, fractionDigits, pattern, whiteSpace, enumeration, maxInclusive, maxExclusive, minInclusive, minExclusive
duration      pattern, enumeration, whiteSpace, maxInclusive, maxExclusive, minInclusive, minExclusive
dateTime      pattern, enumeration, whiteSpace, maxInclusive, maxExclusive, minInclusive, minExclusive
time          pattern, enumeration, whiteSpace, maxInclusive, maxExclusive, minInclusive, minExclusive
date          pattern, enumeration, whiteSpace, maxInclusive, maxExclusive, minInclusive, minExclusive
gYearMonth    pattern, enumeration, whiteSpace, maxInclusive, maxExclusive, minInclusive, minExclusive
gYear         pattern, enumeration, whiteSpace, maxInclusive, maxExclusive, minInclusive, minExclusive
gMonthDay     pattern, enumeration, whiteSpace, maxInclusive, maxExclusive, minInclusive, minExclusive
gDay          pattern, enumeration, whiteSpace, maxInclusive, maxExclusive, minInclusive, minExclusive
gMonth        pattern, enumeration, whiteSpace, maxInclusive, maxExclusive, minInclusive, minExclusive
hexBinary     length, minLength, maxLength, pattern, enumeration, whiteSpace
base64Binary  length, minLength, maxLength, pattern, enumeration, whiteSpace
anyURI        length, minLength, maxLength, pattern, enumeration, whiteSpace
QName         length, minLength, maxLength, pattern, enumeration, whiteSpace
NOTATION      length, minLength, maxLength, pattern, enumeration, whiteSpace

Patterns

([a-zA-Z]{2}|[iI]-[a-zA-Z]+|[xX]-[a-zA-Z]{1,8})(-[a-zA-Z]{1,8})*
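
The regular expression above can be attached to a type with the `pattern` facet. The following is a sketch only: the type name `languageCodeType` is an assumption made for illustration. XML Schema patterns always match the whole value, so `en`, `en-US`, and `x-klingon` would be accepted, while `english` would not.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <!-- hypothetical type name; the pattern value is the expression shown above -->
 <xs:simpleType name="languageCodeType">
  <xs:restriction base="xs:string">
   <xs:pattern value="([a-zA-Z]{2}|[iI]-[a-zA-Z]+|[xX]-[a-zA-Z]{1,8})(-[a-zA-Z]{1,8})*"/>
  </xs:restriction>
 </xs:simpleType>
</xs:schema>
```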

Simple Type Examples

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <xs:simpleType name="myIntegerType">
  <xs:restriction base="xs:integer">
   <xs:minInclusive value="10000"/>
   <xs:maxInclusive value="99999"/>
  </xs:restriction>
 </xs:simpleType>
 <xs:simpleType name="stockKeepingUnitType">
  <xs:restriction base="xs:string">
   <xs:pattern value="\d{3}-[A-Z]{2}"/>
  </xs:restriction>
 </xs:simpleType>
 <xs:simpleType name="USStateType">
  <xs:restriction base="xs:string">
   <xs:enumeration value="AK"/>
   <xs:enumeration value="AL"/>
   <xs:enumeration value="AR"/>
   <!-- and so on ... -->
  </xs:restriction>
 </xs:simpleType>
</xs:schema>

Facet Limitations

Complex Types

What is a Complex Type?

Complex Type Example

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <xs:element name="billingAddress" type="addressType"/>
 <xs:element name="shippingAddress" type="addressType"/>
 <xs:complexType name="addressType">
  <xs:sequence>
   <xs:element name="name" type="xs:string"/>
   <xs:element name="street" type="xs:string"/>
   <xs:element name="city" type="xs:string"/>
   <xs:element name="state" type="xs:string" minOccurs="0"/>
   <xs:element name="zip" type="xs:decimal"/>
  </xs:sequence>
  <xs:attribute name="country" type="xs:NMTOKEN"/>
 </xs:complexType>
</xs:schema>
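
An instance document that should validate against the schema above might look as follows. All values are invented sample data. Because `state` has `minOccurs="0"`, it could also be left out, and the `country` attribute is optional because no `use` is specified.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- sample data only; child elements must appear in the order given by xs:sequence -->
<billingAddress country="US">
 <name>Jane Doe</name>
 <street>102 South Hall</street>
 <city>Berkeley</city>
 <state>CA</state>
 <zip>94720</zip>
</billingAddress>
```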

Complex Types & Content Types

  • Simple Types
  • Complex Types
    • Simple Content
    • Complex Content
      • Element Only
      • Mixed
      • Empty

Content Models

DTD Content Models

Mixed Content

 <xs:element name="p" type="mixedType"/>
 <xs:complexType name="mixedType" mixed="true">
  <xs:choice maxOccurs="unbounded" minOccurs="0">
   <xs:element ref="b"/>
   <xs:element name="i" type="xs:string"/>
   <xs:element name="u" type="xs:string"/>
  </xs:choice>
  <xs:attribute ref="class"/>
 </xs:complexType>
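
As a sketch of what `mixed="true"` permits, the instance below interleaves character data with the child elements allowed by the choice. It assumes that the global `b` element and the global `class` attribute are declared elsewhere in the same schema, with `warning` as one of the permitted `class` values. The text itself is invented.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- text may appear before, between, and after the b, i, and u children -->
<p class="warning">Do <b>not</b> edit this file <i>by hand</i>, it is <u>generated</u>.</p>
```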

Empty Content
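
One common way to obtain empty content is a complex type without any content model, optionally carrying attributes. The sketch below is an assumption for illustration: the `pageBreak` element and its `reason` attribute are hypothetical names. A matching instance would be `<pageBreak reason="chapter end"/>`, while any child element or text inside it would be rejected.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <!-- hypothetical element: no child elements, no character data, one optional attribute -->
 <xs:element name="pageBreak">
  <xs:complexType>
   <xs:attribute name="reason" type="xs:string"/>
  </xs:complexType>
 </xs:element>
</xs:schema>
```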

Local and Global Definitions

Named and Anonymous Types

<!ELEMENT person (name, address) >
<!ATTLIST person id ID #REQUIRED >
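
A rough XML Schema counterpart of the DTD declarations above could use an anonymous complex type inside the element declaration. This is a sketch under assumptions: the DTD does not show how `name` and `address` are declared, so both are assumed here to hold plain strings.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <xs:element name="person">
  <!-- anonymous type: it has no name and cannot be reused elsewhere -->
  <xs:complexType>
   <xs:sequence>
    <xs:element ref="name"/>
    <xs:element ref="address"/>
   </xs:sequence>
   <xs:attribute name="id" type="xs:ID" use="required"/>
  </xs:complexType>
 </xs:element>
 <!-- assumed declarations; the DTD excerpt does not define these elements -->
 <xs:element name="name" type="xs:string"/>
 <xs:element name="address" type="xs:string"/>
</xs:schema>
```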

Elements

Local vs. Global Elements

 <xs:complexType name="mixedType" mixed="true">
  <xs:choice maxOccurs="unbounded" minOccurs="0">
   <xs:element ref="b"/>
   <xs:element name="i" type="xs:string"/>
   <xs:element name="u" type="xs:string"/>
  </xs:choice>
  <xs:attribute ref="class"/>
 </xs:complexType>
 <xs:element name="b" type="xs:string"/>

Reusable Elements

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <xs:element name="billingAddress" type="addressType"/>
 <xs:element name="shippingAddress" type="addressType"/>
 <xs:complexType name="addressType">
  <xs:sequence>
   <xs:element name="name" type="xs:string"/>
   <xs:element name="street" type="xs:string"/>
   <xs:element name="city" type="xs:string"/>
   <xs:element name="state" type="xs:string" minOccurs="0"/>
   <xs:element name="zip" type="xs:decimal"/>
  </xs:sequence>
  <xs:attribute name="country" type="xs:NMTOKEN"/>
 </xs:complexType>
</xs:schema>

Attributes

Attribute Definitions

Reusing Attributes

Reusing Attributes (Example)

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <xs:element name="p" type="mixedType"/>
 <xs:complexType name="mixedType" mixed="true">
  <xs:choice maxOccurs="unbounded" minOccurs="0">
   <xs:element ref="b"/>
   <xs:element name="i" type="xs:string"/>
   <xs:element name="u" type="xs:string"/>
  </xs:choice>
  <xs:attribute ref="class"/>
 </xs:complexType>
 <xs:element name="b" type="xs:string"/>
 <xs:attribute name="class">
  <xs:simpleType>
   <xs:restriction base="xs:string">
    <xs:enumeration value="comment"/>
    <xs:enumeration value="warning"/>
   </xs:restriction>
  </xs:simpleType>
 </xs:attribute>
</xs:schema>

Names and Namespaces

Definitions

Instances

<html xmlns="http://www.w3.org/1999/xhtml">
 <head>
  <title>Multicolumn Layout in HTML</title>
  <style type="text/css">

Name Qualification

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.example.com/" elementFormDefault="qualified" attributeFormDefault="unqualified">
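
The effect of these settings can be sketched with a small schema. The `order` and `item` elements and the `date` attribute are hypothetical names chosen for illustration. With `elementFormDefault="qualified"`, the local `item` element must be namespace-qualified in instances, while `attributeFormDefault="unqualified"` keeps the `date` attribute without a prefix.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.example.com/" elementFormDefault="qualified" attributeFormDefault="unqualified">
 <!-- hypothetical declarations, used only to show name qualification -->
 <xs:element name="order">
  <xs:complexType>
   <xs:sequence>
    <xs:element name="item" type="xs:string"/>
   </xs:sequence>
   <xs:attribute name="date" type="xs:date"/>
  </xs:complexType>
 </xs:element>
 <!-- matching instance:
      <ex:order xmlns:ex="http://www.example.com/" date="2007-01-09">
       <ex:item>sample item</ex:item>
      </ex:order>
 -->
</xs:schema>
```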

Identity Constraints

Element = Type + Constraints

Improvements over ID/IDREF

Types of Identity Constraints

Identity Constraint Definitions
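
An identity constraint is defined inside an element declaration, with a selector that picks the constrained nodes and one or more fields that supply the values. The sketch below is an illustration under assumptions: the `orders`, `customer`, and `order` elements and the constraint names are hypothetical. The `xs:key` requires every `customer` to have a unique `id`, and the `xs:keyref` requires every `order` to point at an existing customer.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <xs:element name="orders">
  <xs:complexType>
   <xs:sequence>
    <xs:element name="customer" maxOccurs="unbounded">
     <xs:complexType>
      <xs:attribute name="id" type="xs:string" use="required"/>
     </xs:complexType>
    </xs:element>
    <xs:element name="order" maxOccurs="unbounded">
     <xs:complexType>
      <xs:attribute name="customer" type="xs:string" use="required"/>
     </xs:complexType>
    </xs:element>
   </xs:sequence>
  </xs:complexType>
  <!-- key: each customer/@id must be present and unique within orders -->
  <xs:key name="customerKey">
   <xs:selector xpath="customer"/>
   <xs:field xpath="@id"/>
  </xs:key>
  <!-- keyref: each order/@customer must match some customerKey value -->
  <xs:keyref name="orderCustomerRef" refer="customerKey">
   <xs:selector xpath="order"/>
   <xs:field xpath="@customer"/>
  </xs:keyref>
 </xs:element>
</xs:schema>
```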

Identity Constraint Evaluation

Advanced Identity Constraints

Complex Type Derivation

Type Derivation

Complex Type Restriction

Removing Choices

Complex Type Restriction (Example)

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <xs:complexType name="addressType">
  <xs:sequence>
   <xs:element name="name" type="xs:string"/>
   <xs:element name="street" type="xs:string"/>
   <xs:element name="city" type="xs:string" minOccurs="0"/>
   <xs:choice>
    <xs:element name="state" type="xs:string"/>
    <xs:element name="canton" type="xs:string"/>
   </xs:choice>
   <xs:element name="zip" type="xs:decimal"/>
  </xs:sequence>
  <xs:attribute name="country" type="xs:NMTOKEN"/>
  <xs:attribute name="territory" type="xs:string" use="optional"/>
 </xs:complexType>
 <xs:complexType name="USaddressType">
  <xs:complexContent>
   <xs:restriction base="addressType">
    <xs:sequence>
     <xs:element name="name" type="xs:string"/>
     <xs:element name="street" type="xs:string"/>
     <xs:element name="city" type="xs:string"/>
     <xs:choice>
      <xs:element name="state" type="xs:string"/>
     </xs:choice>
     <xs:element name="zip" type="zipType"/>
    </xs:sequence>
    <xs:attribute name="country" type="xs:NMTOKEN"/>
    <xs:attribute name="territory" type="xs:string" use="prohibited"/>
   </xs:restriction>
  </xs:complexContent>
 </xs:complexType>
 <xs:simpleType name="zipType">
  <xs:restriction base="xs:decimal">
   <xs:totalDigits value="5"/>
  </xs:restriction>
 </xs:simpleType>
</xs:schema>
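
The schema above defines only types, so the following instance sketch assumes an additional global declaration `<xs:element name="address" type="addressType"/>` that is not part of the original example. Using `xsi:type`, the instance asks to be validated against the restricted type: `city` is now required, `canton` is no longer allowed, `zip` is limited to five digits, and the `territory` attribute must not appear. All values are invented sample data.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- assumes a hypothetical global element "address" of type addressType -->
<address xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="USaddressType" country="US">
 <name>Jane Doe</name>
 <street>102 South Hall</street>
 <city>Berkeley</city>
 <state>CA</state>
 <zip>94720</zip>
</address>
```

Every instance that is valid for `USaddressType` is also valid for `addressType`, which is the defining property of a restriction.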

Processing Restricted Complex Types

Complex Type Extension

Adding Content

Complex Type Extension (Example)

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <xs:complexType name="addressType">
  <xs:sequence>
   <xs:element name="name" type="xs:string"/>
   <xs:element name="street" type="xs:string"/>
   <xs:element name="city" type="xs:string" minOccurs="0"/>
   <xs:choice>
    <xs:element name="state" type="xs:string"/>
    <xs:element name="canton" type="xs:string"/>
   </xs:choice>
   <xs:element name="zip" type="xs:decimal"/>
  </xs:sequence>
  <xs:attribute name="country" type="xs:NMTOKEN"/>
  <xs:attribute name="territory" type="xs:string" use="optional"/>
 </xs:complexType>
 <xs:complexType name="businessAddressType">
  <xs:complexContent>
   <xs:extension base="addressType">
    <xs:sequence>
     <xs:element name="company" type="xs:string"/>
     <xs:element name="position" type="xs:string" minOccurs="0"/>
    </xs:sequence>
    <xs:attribute name="relationship" type="xs:NMTOKEN"/>
   </xs:extension>
  </xs:complexContent>
 </xs:complexType>
</xs:schema>
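
As with the restriction example, the schema above contains no element declarations, so this instance sketch assumes a hypothetical global declaration `<xs:element name="address" type="addressType"/>`. The content added by the extension follows the content of the base type, and the `relationship` attribute is added to the inherited attributes. All values are invented sample data.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- assumes a hypothetical global element "address" of type addressType -->
<address xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="businessAddressType" country="CH" relationship="supplier">
 <!-- content inherited from addressType -->
 <name>Hans Muster</name>
 <street>Bahnhofstrasse 1</street>
 <city>Zürich</city>
 <canton>ZH</canton>
 <zip>8001</zip>
 <!-- content added by the extension -->
 <company>Example AG</company>
 <position>Purchasing</position>
</address>
```

Unlike a restricted type, an extended type is generally not valid against its base type, because the base content model does not allow the additional `company` and `position` elements.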

Processing Extended Complex Types

Conclusions

Schema Components

XML Schema Features