XSD – Part I

XML Foundations
Fall 2011 — INFO 242 (CCN 42596)

Ray Larson, UC Berkeley School of Information
2011-09-06

This work is licensed under a CC Attribution 3.0 Unported License [http://creativecommons.org/licenses/by/3.0/]

R. Larson: XSD – Part I

(2) Abstract

The XML Schema Definition Language (XSD) is the most popular schema language for XML today. It was introduced to overcome some commonly observed limitations of DTDs, most notably the lack of typing. Simple types describe content that is not structured by XML markup, that is, attribute values and text-only element content. New simple types can be defined by deriving them from existing types through type restriction.




(3) Bad Names

XML Schema is a language for describing an XML schema.
An XML schema can be defined using XML Schema.
I would like to use XML Schema for my XML schema.



(4) What's Wrong With DTDs?




(5) Different Levels of Semantics




(6) Schema-Validation and Applications

[Figure: schema-valid-documents.png]


(7) Validation and Typing

  1. Validation checks for structural integrity (is the document schema-valid?)
    • checking elements and attributes for proper usage (as with DTDs)
    • checking element contents and attribute values for proper values
  2. Type annotations make the types available to applications
    • instead of having to look at the schema, applications get the Post-Schema Validation Infoset (PSVI)
    • type-based applications (such as XSLT 2.0) can work on the typed instance
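
The two roles can be illustrated with a minimal sketch. The element name shipDate is hypothetical and chosen only for this example; the comments describe what a schema-aware processor would be expected to do with it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <!-- hypothetical element, typed with the built-in xs:date -->
 <xs:element name="shipDate" type="xs:date"/>
 <!-- 1. validation: <shipDate>2011-09-06</shipDate> is schema-valid,
         <shipDate>next Tuesday</shipDate> is not
      2. typing: after validation, the PSVI annotates shipDate with the type
         xs:date, so a type-aware application can treat it as a date -->
</xs:schema>
```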



(8) XSD Syntax

[Figure: xml-technology-syntaxes.png]

XSD Types

Outline (XSD Types)

  1. XSD Types [3]
  2. Simple Types [11]
    1. Simple Type Restriction [7]
  3. Conclusions [1]

(10) What is a Type?




(11) XSD vs. DTD

                       DTD                          XSD
Concepts               some conceptual model (formal/informal), for both
Types                  ID/IDREF and (#P)CDATA       Hierarchy of Simple and Complex Types
Markup Constructs      Element Type Declarations    Element Definitions
                       <!ELEMENT order …            <xs:element name="order"> …
Instances (Documents)  <order date=""> [ order content ] </order>, for both
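
To make the "Types" row of the comparison concrete, here is a small sketch using a hypothetical quantity element (not taken from the slides). The DTD declaration is shown inside a comment because DTD syntax is not XML.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- DTD element type declaration, shown for comparison:
       <!ELEMENT quantity (#PCDATA)>
     The DTD can only say that quantity holds character data. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <!-- XSD element definition: the same element, now typed as a positive integer -->
 <xs:element name="quantity" type="xs:positiveInteger"/>
</xs:schema>
```

With the DTD, <quantity>many</quantity> would be valid; with the XSD definition it would not be.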



(12) Document/Data Perspectives



Simple Types


(14) What are Simple Types?




(15) Named vs. Anonymous

 <!-- named type: defined once, referenced by name from several elements -->
 <xs:element name="home" type="phoneType"/>
 <xs:element name="office" type="phoneType"/>
 <xs:simpleType name="phoneType">
  <xs:restriction base="xs:string">
   <xs:maxLength value="30"/>
  </xs:restriction>
 </xs:simpleType>
 <!-- anonymous type: defined inline, usable only by this one element -->
 <xs:element name="business">
  <xs:simpleType>
   <xs:restriction base="xs:string">
    <xs:maxLength value="30"/>
   </xs:restriction>
  </xs:simpleType>
 </xs:element>



(16) Type Definitions




(17) Type Hierarchy

[Figure: xsd-type-hierarchy.gif]

Simple Type Restriction


(19) Built-In Types

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <xs:simpleType name="integer">
  <xs:restriction base="xs:decimal">
   <xs:fractionDigits value="0" fixed="true"/>
  </xs:restriction>
 </xs:simpleType>
 <xs:simpleType name="nonNegativeInteger">
  <xs:restriction base="integer">
   <xs:minInclusive value="0"/>
  </xs:restriction>
 </xs:simpleType>
 <xs:simpleType name="positiveInteger">
  <xs:restriction base="nonNegativeInteger">
   <xs:minInclusive value="1"/>
  </xs:restriction>
 </xs:simpleType>
</xs:schema>



(20) How to Restrict

  • Simple types can be derived by restriction
    • the base type must be a simple type
    • the derived type will be a simple type
    • all simple types form a tree, rooted at anySimpleType
  • Restrictions are based on facets
    • each restriction can use zero or more facets
    • facets can be refined in further simple type restrictions
    • XSD designers should try to restrict types as much as possible
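
The following sketch shows a chain of restrictions in which a facet is refined step by step. The type names nameType, shortNameType, and aliasType are hypothetical and serve only as an illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <!-- first restriction: two facets applied to the built-in xs:string -->
 <xs:simpleType name="nameType">
  <xs:restriction base="xs:string">
   <xs:minLength value="1"/>
   <xs:maxLength value="100"/>
  </xs:restriction>
 </xs:simpleType>
 <!-- second restriction: refines maxLength, inherits minLength from nameType -->
 <xs:simpleType name="shortNameType">
  <xs:restriction base="nameType">
   <xs:maxLength value="30"/>
  </xs:restriction>
 </xs:simpleType>
 <!-- a restriction with zero facets: same value space, new name -->
 <xs:simpleType name="aliasType">
  <xs:restriction base="shortNameType"/>
 </xs:simpleType>
</xs:schema>
```

All three types remain simple types in the tree rooted at anySimpleType; each step can only narrow what its base type allows.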



(21) Facets

  • Each facet defines one specific way of restricting a simple type
    • facets are independent, but they may interact (e.g., minLength must not exceed maxLength)
    • XSD defines 12 constraining facets which may be used for restrictions: length, minLength, maxLength, pattern, enumeration, whiteSpace, maxInclusive, maxExclusive, minExclusive, minInclusive, totalDigits, fractionDigits
  • Facets may be repeated in different levels of the type hierarchy
    • they may only further restrict the facet (e.g., reducing the maxLength)
    • facets apply to all directly or indirectly derived subtypes
    • facets may be fixed (no further restriction is allowed)
  • Not all facets are applicable to all types
    • the applicability depends on the primitive type being used
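
A minimal sketch of repeated and fixed facets, using the hypothetical type names codeType and shortCodeType:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <xs:simpleType name="codeType">
  <xs:restriction base="xs:string">
   <!-- interacting facets: minLength must not exceed maxLength -->
   <xs:minLength value="2"/>
   <!-- fixed: derived types may not change this value -->
   <xs:maxLength value="8" fixed="true"/>
  </xs:restriction>
 </xs:simpleType>
 <xs:simpleType name="shortCodeType">
  <xs:restriction base="codeType">
   <!-- allowed: raising minLength from 2 to 4 narrows the type further -->
   <xs:minLength value="4"/>
   <!-- not allowed here: <xs:maxLength value="6"/>, because maxLength is fixed -->
  </xs:restriction>
 </xs:simpleType>
</xs:schema>
```

Values of shortCodeType must therefore be between 4 and 8 characters long.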



(22) Facet Applicability

Type          Applicable facets
string        length, minLength, maxLength, pattern, enumeration, whiteSpace
boolean       pattern, whiteSpace
float         pattern, enumeration, whiteSpace, maxInclusive, maxExclusive, minInclusive, minExclusive
double        pattern, enumeration, whiteSpace, maxInclusive, maxExclusive, minInclusive, minExclusive
decimal       totalDigits, fractionDigits, pattern, whiteSpace, enumeration, maxInclusive, maxExclusive, minInclusive, minExclusive
duration      pattern, enumeration, whiteSpace, maxInclusive, maxExclusive, minInclusive, minExclusive
dateTime      pattern, enumeration, whiteSpace, maxInclusive, maxExclusive, minInclusive, minExclusive
time          pattern, enumeration, whiteSpace, maxInclusive, maxExclusive, minInclusive, minExclusive
date          pattern, enumeration, whiteSpace, maxInclusive, maxExclusive, minInclusive, minExclusive
gYearMonth    pattern, enumeration, whiteSpace, maxInclusive, maxExclusive, minInclusive, minExclusive
gYear         pattern, enumeration, whiteSpace, maxInclusive, maxExclusive, minInclusive, minExclusive
gMonthDay     pattern, enumeration, whiteSpace, maxInclusive, maxExclusive, minInclusive, minExclusive
gDay          pattern, enumeration, whiteSpace, maxInclusive, maxExclusive, minInclusive, minExclusive
gMonth        pattern, enumeration, whiteSpace, maxInclusive, maxExclusive, minInclusive, minExclusive
hexBinary     length, minLength, maxLength, pattern, enumeration, whiteSpace
base64Binary  length, minLength, maxLength, pattern, enumeration, whiteSpace
anyURI        length, minLength, maxLength, pattern, enumeration, whiteSpace
QName         length, minLength, maxLength, pattern, enumeration, whiteSpace
NOTATION      length, minLength, maxLength, pattern, enumeration, whiteSpace
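
As a sketch of how applicability plays out, the two hypothetical types below use facets that the table lists for decimal and date. The type names and the concrete values are assumptions made for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <!-- decimal: digit-counting facets and range facets are applicable -->
 <xs:simpleType name="priceType">
  <xs:restriction base="xs:decimal">
   <xs:totalDigits value="8"/>
   <xs:fractionDigits value="2"/>
   <xs:minInclusive value="0"/>
  </xs:restriction>
 </xs:simpleType>
 <!-- date: range facets are applicable, length facets are not -->
 <xs:simpleType name="termDateType">
  <xs:restriction base="xs:date">
   <xs:minInclusive value="2011-08-25"/>
   <xs:maxInclusive value="2011-12-16"/>
  </xs:restriction>
 </xs:simpleType>
</xs:schema>
```

Using a facet that is not applicable, for example maxLength on a type derived from xs:date, makes the schema itself invalid.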



(23) Patterns

  • Patterns restrict the lexical space of simple types
    • most other facets restrict the value space (e.g., intervals of numbers)
    • in many cases, patterns are useful additions to value-oriented facets
  • Patterns are regular expressions [http://www.w3.org/TR/xmlschema-2/#regexs]
    • they support many common regex constructs and Unicode
    • the language pattern shown below allows de, de-CH, and other language tags
    • the pattern checks for lexical correctness, not against a code list
([a-zA-Z]{2}|[iI]-[a-zA-Z]+|[xX]-[a-zA-Z]{1,8})(-[a-zA-Z]{1,8})*
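
The difference between lexical space and value space can be seen in a sketch that combines a pattern with value-oriented facets. The type name percentType is hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <xs:simpleType name="percentType">
  <xs:restriction base="xs:integer">
   <!-- value space: only the integers 0 to 100 -->
   <xs:minInclusive value="0"/>
   <xs:maxInclusive value="100"/>
   <!-- lexical space: no sign and no leading zeros, so 42 is accepted,
        while +42 and 042 are rejected although they denote the same value -->
   <xs:pattern value="0|[1-9][0-9]*"/>
  </xs:restriction>
 </xs:simpleType>
</xs:schema>
```

XSD patterns always have to match the complete lexical value, so no explicit anchors are needed.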



(24) Simple Type Examples

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <xs:simpleType name="myIntegerType">
  <xs:restriction base="xs:integer">
   <xs:minInclusive value="10000"/>
   <xs:maxInclusive value="99999"/>
  </xs:restriction>
 </xs:simpleType>
 <xs:simpleType name="stockKeepingUnitType">
  <xs:restriction base="xs:string">
   <xs:pattern value="\d{3}-[A-Z]{2}"/>
  </xs:restriction>
 </xs:simpleType>
 <xs:simpleType name="USStateType">
  <xs:restriction base="xs:string">
   <xs:enumeration value="AK"/>
   <xs:enumeration value="AL"/>
   <xs:enumeration value="AR"/>
   <!-- and so on ... -->
  </xs:restriction>
 </xs:simpleType>
</xs:schema>
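
A hypothetical instance fragment shows which values the three types above would accept. The element names are assumptions; they would need matching xs:element declarations that reference the types.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<examples>
 <!-- myIntegerType: must lie between 10000 and 99999 -->
 <number>94720</number>   <!-- valid -->
 <number>123</number>     <!-- invalid: below minInclusive -->
 <!-- stockKeepingUnitType: three digits, a hyphen, two uppercase letters -->
 <sku>926-AA</sku>        <!-- valid -->
 <sku>926-aa</sku>        <!-- invalid: lowercase letters do not match [A-Z] -->
 <!-- USStateType: must be one of the enumerated values -->
 <state>AK</state>        <!-- valid -->
 <state>Alaska</state>    <!-- invalid, assuming only two-letter codes are enumerated -->
</examples>
```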



(25) Facet Limitations

  • Facets limit one dimension of a type's value space
    • using pattern, the lexical space can also be restricted
    • restrictions should be made as specific as possible
    • no limitations are possible beyond the predefined facets
  • There is no connection to the context within the document
    • facets cannot make references to other values (e.g., neighboring attributes)
  • Additional constraints should be documented
    • documentation enables applications to implement constraint checking
    • other schema languages (like Schematron [http://dret.net/lectures/xml-fall08/schemalanguages]) may be used to express these constraints
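
As a sketch of a constraint that facets cannot express, the Schematron rule below compares two attributes of the same element. The element name range and its attributes min and max are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">
 <sch:pattern>
  <!-- applies to every range element in the instance -->
  <sch:rule context="range">
   <!-- the XPath test relates two neighboring attribute values -->
   <sch:assert test="number(@min) &lt;= number(@max)">
    The min attribute must not be greater than the max attribute.
   </sch:assert>
  </sch:rule>
 </sch:pattern>
</sch:schema>
```

An XSD could still type both attributes as xs:integer; the Schematron rule adds only the cross-value check.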


Conclusions


(27) Typed XML Content


