XML Schema Design Issues

Web-Based Services (INFOSYS 290-3)

Erik Wilde, UC Berkeley iSchool
Wednesday, November 22, 2006
Creative Commons License

This work is licensed under a Creative Commons
Attribution-NonCommercial-ShareAlike 2.5 License.

Abstract

In most scenarios today, structured data is exchanged using XML. For non-trivial data structures, it is important to have a schema, so that users know what to expect and what to generate. XML Schema is the most popular schema language for XML today. It has many powerful and complex features, which means that a given problem can usually be solved in many different ways. This lecture describes some guidelines for writing well-designed, open, extensible, and well-documented XML schemas.

Common Models

Outline (XML Schema Types)

  1. XML Schema Types
  2. Simple Types
    1. Simple Type Restrictions
  3. Complex Types
    1. Content Models
  4. Openness and Extensibility
  5. Documentation
  6. Conclusions

What is a Type?

XML Schema vs. DTD

                       DTD                          XML Schema
Concepts               some conceptual model (formal/informal)
Types                  ID/IDREF and (#P)CDATA       Hierarchy of Simple and Complex Types
Markup Constructs      Element Type Declarations    Element Definitions
                       <!ELEMENT order …            <xs:element name="order"> …
Instances (Documents)  <order date=""> [ order content ] </order>

Document/Data Perspectives

Outline (Simple Types)

What are Simple Types?

Named vs. Anonymous

 <!-- Named type: defined once, referenced by name from several elements -->
 <xs:element name="home" type="phoneType"/>
 <xs:element name="office" type="phoneType"/>
 <xs:simpleType name="phoneType">
  <xs:restriction base="xs:string">
   <xs:maxLength value="30"/>
  </xs:restriction>
 </xs:simpleType>
 <!-- Anonymous type: defined inline, usable only by this one element -->
 <xs:element name="business">
  <xs:simpleType>
   <xs:restriction base="xs:string">
    <xs:maxLength value="30"/>
   </xs:restriction>
  </xs:simpleType>
 </xs:element>
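
One practical consequence of naming a type is that other types can be derived from it, while the anonymous type of business cannot be referenced from anywhere else. The following is a minimal sketch, assuming it sits in the same schema as phoneType; the name shortPhoneType and the limit of 15 are hypothetical and not part of the original example.

```xml
<!-- Hypothetical derived type: tightens the named phoneType from 30 to 15 characters -->
<xs:simpleType name="shortPhoneType">
 <xs:restriction base="phoneType">
  <xs:maxLength value="15"/>
 </xs:restriction>
</xs:simpleType>
```

A restriction may only narrow the base type, so a maxLength above 30 would be rejected here.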

Type Definitions

Type Hierarchy

Outline (Simple Type Restrictions)

Built-In Types

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <!-- Illustration only: shows how the built-in integer types are derived from xs:decimal by restriction -->
 <xs:simpleType name="integer">
  <xs:restriction base="xs:decimal">
   <xs:fractionDigits value="0" fixed="true"/>
  </xs:restriction>
 </xs:simpleType>
 <xs:simpleType name="nonNegativeInteger">
  <xs:restriction base="integer">
   <xs:minInclusive value="0"/>
  </xs:restriction>
 </xs:simpleType>
 <xs:simpleType name="positiveInteger">
  <xs:restriction base="nonNegativeInteger">
   <xs:minInclusive value="1"/>
  </xs:restriction>
 </xs:simpleType>
</xs:schema>

How to Restrict

Facets

Facet Applicability

Type          Applicable facets
string        length, minLength, maxLength, pattern, enumeration, whiteSpace
boolean       pattern, whiteSpace
float         pattern, enumeration, whiteSpace, maxInclusive, maxExclusive, minInclusive, minExclusive
double        pattern, enumeration, whiteSpace, maxInclusive, maxExclusive, minInclusive, minExclusive
decimal       totalDigits, fractionDigits, pattern, whiteSpace, enumeration, maxInclusive, maxExclusive, minInclusive, minExclusive
duration      pattern, enumeration, whiteSpace, maxInclusive, maxExclusive, minInclusive, minExclusive
dateTime      pattern, enumeration, whiteSpace, maxInclusive, maxExclusive, minInclusive, minExclusive
time          pattern, enumeration, whiteSpace, maxInclusive, maxExclusive, minInclusive, minExclusive
date          pattern, enumeration, whiteSpace, maxInclusive, maxExclusive, minInclusive, minExclusive
gYearMonth    pattern, enumeration, whiteSpace, maxInclusive, maxExclusive, minInclusive, minExclusive
gYear         pattern, enumeration, whiteSpace, maxInclusive, maxExclusive, minInclusive, minExclusive
gMonthDay     pattern, enumeration, whiteSpace, maxInclusive, maxExclusive, minInclusive, minExclusive
gDay          pattern, enumeration, whiteSpace, maxInclusive, maxExclusive, minInclusive, minExclusive
gMonth        pattern, enumeration, whiteSpace, maxInclusive, maxExclusive, minInclusive, minExclusive
hexBinary     length, minLength, maxLength, pattern, enumeration, whiteSpace
base64Binary  length, minLength, maxLength, pattern, enumeration, whiteSpace
anyURI        length, minLength, maxLength, pattern, enumeration, whiteSpace
QName         length, minLength, maxLength, pattern, enumeration, whiteSpace
NOTATION      length, minLength, maxLength, pattern, enumeration, whiteSpace

Patterns

([a-zA-Z]{2}|[iI]-[a-zA-Z]+|[xX]-[a-zA-Z]{1,8})(-[a-zA-Z]{1,8})*
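
A pattern like the one above is attached to a type through the xs:pattern facet. The sketch below wraps it in a simple type; the type name languageTagType and the choice of xs:token as the base are assumptions made for illustration. The pattern appears to describe language tags such as en or en-US.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <!-- languageTagType is a hypothetical name -->
 <xs:simpleType name="languageTagType">
  <xs:restriction base="xs:token">
   <!-- accepts: en, en-US, x-klingon   rejects: english, en_US -->
   <xs:pattern value="([a-zA-Z]{2}|[iI]-[a-zA-Z]+|[xX]-[a-zA-Z]{1,8})(-[a-zA-Z]{1,8})*"/>
  </xs:restriction>
 </xs:simpleType>
</xs:schema>
```

XML Schema patterns always have to match the complete value, so no ^ or $ anchors are needed.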

Simple Type Examples

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <xs:simpleType name="myIntegerType">
  <xs:restriction base="xs:integer">
   <xs:minInclusive value="10000"/>
   <xs:maxInclusive value="99999"/>
  </xs:restriction>
 </xs:simpleType>
 <xs:simpleType name="stockKeepingUnitType">
  <xs:restriction base="xs:string">
   <xs:pattern value="\d{3}-[A-Z]{2}"/>
  </xs:restriction>
 </xs:simpleType>
 <xs:simpleType name="USStateType">
  <xs:restriction base="xs:string">
   <xs:enumeration value="AK"/>
   <xs:enumeration value="AL"/>
   <xs:enumeration value="AR"/>
   <!-- and so on ... -->
  </xs:restriction>
 </xs:simpleType>
</xs:schema>
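
To see what these three types accept, they can be attached to elements. The declarations below are a sketch meant to be placed inside the xs:schema element above; the element names postalCode, sku, and state are hypothetical.

```xml
<!-- accepts 10000 through 99999, e.g. 94720; rejects 9999 and 100000 -->
<xs:element name="postalCode" type="myIntegerType"/>
<!-- accepts 926-AA; rejects 926-aa and 92-AA -->
<xs:element name="sku" type="stockKeepingUnitType"/>
<!-- accepts only the enumerated values such as AK, AL, AR; rejects ak -->
<xs:element name="state" type="USStateType"/>
```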

Facet Limitations

Outline (Complex Types)

What is a Complex Type?

Complex Type Example

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <xs:element name="billingAddress" type="addressType"/>
 <xs:element name="shippingAddress" type="addressType"/>
 <xs:complexType name="addressType">
  <xs:sequence>
   <xs:element name="name" type="xs:string"/>
   <xs:element name="street" type="xs:string"/>
   <xs:element name="city" type="xs:string"/>
   <xs:element name="state" type="xs:string" minOccurs="0"/>
   <xs:element name="zip" type="xs:decimal"/>
  </xs:sequence>
  <xs:attribute name="country" type="xs:NMTOKEN"/>
 </xs:complexType>
</xs:schema>
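
An instance that should validate against the schema above might look as follows. This is a sketch with made-up address data.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Child elements must appear in the order given by xs:sequence; state may be omitted -->
<billingAddress country="US">
 <name>Alice Smith</name>
 <street>123 Maple Street</street>
 <city>Mill Valley</city>
 <state>CA</state>
 <zip>90952</zip>
</billingAddress>
```

Because zip is declared as xs:decimal, a value such as 90952.5 would also be accepted, which illustrates why the choice of base type deserves attention.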

Complex Types & Content Types

Simple Types
Complex Types
  Simple Content
  Complex Content
    Element Only
    Mixed
    Empty

Outline (Content Models)

DTD Content Models

Mixed Content

 <xs:element name="p" type="mixedType"/>
 <xs:complexType name="mixedType" mixed="true">
  <xs:choice maxOccurs="unbounded" minOccurs="0">
   <xs:element ref="b"/>
   <xs:element name="i" type="xs:string"/>
   <xs:element name="u" type="xs:string"/>
  </xs:choice>
  <xs:attribute ref="class"/>
 </xs:complexType>
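
With mixed="true", character data may appear between the child elements. The instance below is a sketch; it assumes that the global element b and the global attribute class, which the fragment above only references, are declared elsewhere with string content.

```xml
<!-- Text and the b, i, u elements may be interleaved in any order and any number of times -->
<p class="note">Plain text, then <b>bold</b>, <i>italic</i>, and <u>underlined</u> words.</p>
```

Mixed content only allows text between the declared children; the schema cannot constrain where that text appears or what it contains.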

Empty Content

Outline (Openness and Extensibility)

Schema Use and Evolution

Wildcards

Wildcard Example

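<?xml version="1.0" encoding="UTF-8"?>
<!-- The namespace bound to r: is a placeholder; RegionsType and PartsType are assumed to be defined in it (not shown) -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:r="http://www.example.com/Report" targetNamespace="http://www.example.com/Report">
 <xs:element name="purchaseReport">
  <xs:complexType>
   <xs:sequence>
    <xs:element name="regions" type="r:RegionsType"/>
    <xs:element name="parts" type="r:PartsType"/>
    <xs:element name="htmlExample">
     <xs:complexType>
      <xs:sequence>
       <!-- Any number of XHTML elements; processContents="skip" means they are not validated -->
       <xs:any namespace="http://www.w3.org/1999/xhtml" minOccurs="0" maxOccurs="unbounded" processContents="skip"/>
      </xs:sequence>
     </xs:complexType>
    </xs:element>
   </xs:sequence>
   <xs:attribute name="period" type="xs:duration"/>
   <xs:attribute name="periodEnding" type="xs:date"/>
  </xs:complexType>
 </xs:element>
</xs:schema>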
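
The effect of the wildcard can be seen in an instance fragment. The sketch below shows only the htmlExample element of a hypothetical purchaseReport instance and assumes htmlExample is a local, unqualified element as declared above.

```xml
<!-- Children must be in the XHTML namespace; their content is skipped by the validator -->
<htmlExample>
 <h1 xmlns="http://www.w3.org/1999/xhtml">Regional summary</h1>
 <p xmlns="http://www.w3.org/1999/xhtml">Sales rose in <em>all</em> regions.</p>
</htmlExample>
```

A child element from any other namespace would be rejected, while the XHTML children only need to be well-formed because of processContents="skip".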

Openness

Extensibility

Outline (Documentation)

Schema/Namespace Policy

Namespace Descriptions

Erik Wilde, Structuring Namespace Descriptions, 15th International World Wide Web Conference (WWW2006), Edinburgh, UK, May 2006.

Outline (Conclusions)

Plan Ahead