Formal and Conceptual Models for XML

Erik Wilde

ETH Zürich

http://dret.net/netdret/docs/wilde-lmu05-xml-models

Outline

About Me (1)

About Me (2)

Abstract

Today, XML is regarded primarily as a syntax for exchanging structured data, so the question of how to develop well-designed XML models (in the sense of abstract descriptions of XML-based data models) has not been studied extensively. However, XML technologies increasingly pervade applications (Web Services), and query and programming languages increasingly provide native XML support (XQuery and E4X), so it would be beneficial to use these features together with well-designed XML models. To better focus on XML-oriented technologies in systems engineering and programming languages, an XML modeling language should be used that concentrates on modeling and structure rather than on details of the XML syntax, as typical XML schema languages tend to do. There is currently no well-established XML modeling language, but several approaches could serve as a foundation for XML modeling; the two most important developments are formal models of XML and ER-inspired conceptual modeling languages for XML. In this talk, we give an overview of existing formal and conceptual models for XML and present a list of requirements for a language that would come close to the ideal XML modeling language.

Model Views of Potential XML Users

  1. Document Processing
    • Modeling documents as XML documents
    • XML schema languages are made for this
  2. Information Management/Database Design
    • Mostly relational models (ER and derivations)
    • Mismatches between ER and XML's hierarchical model
  3. Software Engineering
    • Software engineering combines data and behavior
    • Classes are more than just documents

Usages of XML

  1. Document Processing
    • Many document-oriented systems are XML-based
    • XML import/export is a must
    • XML's restrictions are (in most cases) acceptable
  2. Information Management/Database Design
    • XML support in DBMSs (SQL/XML); XDBMSs are available
    • "Modeling" using XSD or DTDs (or simply well-formed)
    • No good support for combining RDBMS/XDBMS
  3. Software Engineering
    • XML as a way of persisting objects
    • UML as a way to model software systems
    • CASE tools for generating classes and their attributes

Missing: XML Conceptual Modeling

Example: XML in Programming Languages

Conceptual Models for XML

XML Conceptual Modeling Languages

Example XER Schema (1)

Example XER Schema

Example XER Schema (2)

Weaknesses of Conceptual Languages

  1. Targeted at a specific schema language
  2. No or weak support for mixed content
  3. Lack of formal foundation
  4. No support for reference/hierarchy mix of XML
  5. No support for multi-document scenarios
  6. No support for non-deterministic content
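Two of the items above, mixed content and the mix of hierarchy and references, can be made concrete with a small sketch. The document below is invented for illustration (the element names and the `id`/`target` attributes are assumptions, not part of any schema from the talk); it is parsed with Python's standard `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: <para> has mixed content, <ref> points at an id.
DOC = """<doc>
  <author id="a1">Erik</author>
  <para>See <ref target="a1"/> for <em>details</em> on XML.</para>
</doc>"""

root = ET.fromstring(DOC)
para = root.find("para")

# Mixed content: character data is interleaved with child elements.
# ElementTree exposes it as the parent's .text plus each child's .tail.
pieces = [para.text] + [child.tail for child in para]

# Reference: <ref> is nested inside <para> (hierarchy) but points at an
# element elsewhere in the tree (reference), resolved here via an id index.
by_id = {el.get("id"): el for el in root.iter() if el.get("id") is not None}
target = by_id[para.find("ref").get("target")]
```

The text fragments in `pieces` have no natural counterpart as attributes of an entity, and the `<ref>` element takes part in two kinds of relationship at once, which is roughly where ER-inspired notations tend to struggle.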

Formal Models for XML

XML Formal Models

Weaknesses of Formal Languages

State of the Art

List of Requirements (1)

  1. Formal Foundation
  2. Graphical Notation
  3. Hierarchical and Referential Structures
  4. Schema Language Mappings
  5. Exceptions (Inclusions and Exclusions)
  6. Non-deterministic Content
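To illustrate what non-deterministic content in item 6 refers to, here is a deliberately simplified sketch in Python. It assumes a content model is a flat choice between sequences of required element names, and it reports names that begin more than one alternative, so that a parser with one element of lookahead could not tell which alternative applies. The function name and the list-based representation are hypothetical; the actual rules in XML 1.0 (deterministic content models) and XSD (Unique Particle Attribution) cover full expressions with occurrence indicators and are considerably more involved.

```python
from collections import Counter


def ambiguous_first_names(alternatives):
    """Return element names that start more than one alternative of a choice.

    `alternatives` is a list of sequences of element names (a simplified,
    hypothetical representation of a content model's choice group).
    """
    counts = Counter(seq[0] for seq in alternatives if seq)
    return sorted(name for name, n in counts.items() if n > 1)


# (a, b) | (a, c): both alternatives start with "a", so the choice is
# ambiguous after reading a single <a> element.
nondeterministic = ambiguous_first_names([["a", "b"], ["a", "c"]])

# a, (b | c): the same language, rewritten so that the choice happens
# only after "a" has been read; the inner choice has distinct first names.
deterministic = ambiguous_first_names([["b"], ["c"]])
```

Both models accept the same documents; a modeling language that permits the first form leaves the rewriting to the schema language mapping rather than to the modeler.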

List of Requirements (2)

  1. Treating XML Nodes Consistently
  2. Model Groups
  3. Reuse of Content
  4. Generalized Mixed Content
  5. Open Content
  6. Intra- and Inter-Document Relationships

Conclusions

Thank You! Questions?