XML and Database Systems

XML Foundations (INFOSYS 242)

Erik Wilde, UC Berkeley iSchool
Thursday, October 19, 2006
Creative Commons License

This work is licensed under a Creative Commons
Attribution-NonCommercial-ShareAlike 2.5 License.

Abstract

XML is the most popular data format for exchanging data, but the majority of data within applications and closed systems is still stored in Relational Database Managements Systems (RDBMS). This leads to two main issues, the first one being how moving data between XML formats and RDBMS can be done easily and efficiently, so that moving data between these two worlds can be done as easy as possible. The second issue is how to map the data models between these two worlds. Relational data can easily be represented in XML, because tables can be easily represented in trees. Things can be more complicated in the other direction, because arbitrary XML can be hard to store in a relational database. For XML-centric scenarios, XML Database Management Systems (XDBMS) are an interesting alternative, which provide XML-specific query capabilities with XML Query (XQuery).

XML is Trees

Storing XML

Outline (XML and Databases)

  1. XML and Databases [17]
    1. Relational Databases [2]
    2. Database Support for XML [4]
    3. XML Storage in Databases [9]
  2. XML in Relational Databases [5]
    1. SQL/XML [3]
  3. XML Databases [3]
  4. Conclusions [2]

Data needs Databases

Model Mismatches

Outline (Relational Databases)

  1. XML and Databases [17]
    1. Relational Databases [2]
    2. Database Support for XML [4]
    3. XML Storage in Databases [9]
  2. XML in Relational Databases [5]
    1. SQL/XML [3]
  3. XML Databases [3]
  4. Conclusions [2]

Generic XML Storage

Tree Table

ID Type Name Value Parent Left
1 Root
2 Element a 1
3 Element b 2
4 Element c 2 3
5 Text Text 3
6 Attribute att 42 4

Outline (Database Support for XML)

  1. XML and Databases [17]
    1. Relational Databases [2]
    2. Database Support for XML [4]
    3. XML Storage in Databases [9]
  2. XML in Relational Databases [5]
    1. SQL/XML [3]
  3. XML Databases [3]
  4. Conclusions [2]

Why XML and Databases?

XML Interchange

XML Support in DBMS

XML DBMS

Outline (XML Storage in Databases)

  1. XML and Databases [17]
    1. Relational Databases [2]
    2. Database Support for XML [4]
    3. XML Storage in Databases [9]
  2. XML in Relational Databases [5]
    1. SQL/XML [3]
  3. XML Databases [3]
  4. Conclusions [2]

Model Mapping

XML is Text

XML → ∗LOB

XML as a Datatype

XML Datatype

Mapping XML to Models

Shredding (XML → Columns)

XML as First-Class Citizen

XML DBMS

Outline (XML in Relational Databases)

  1. XML and Databases [17]
    1. Relational Databases [2]
    2. Database Support for XML [4]
    3. XML Storage in Databases [9]
  2. XML in Relational Databases [5]
    1. SQL/XML [3]
  3. XML Databases [3]
  4. Conclusions [2]

RDBish XML

Problematic XML

Outline (SQL/XML)

  1. XML and Databases [17]
    1. Relational Databases [2]
    2. Database Support for XML [4]
    3. XML Storage in Databases [9]
  2. XML in Relational Databases [5]
    1. SQL/XML [3]
  3. XML Databases [3]
  4. Conclusions [2]

SQL/XML:2003

SQL/XML Example

SELECT
	e.EmpId,
	e.FirstName,
	e.LastName,
	e.StartDate,
	e.EndDate
FROM Employees e WHERE e.EmpId = 12
SELECT
	XMLELEMENT (NAME "employee",
		XMLATTRIBUTES(e.EmpId as "id"),
		XMLELEMENT(NAME "names",
		XMLELEMENT(NAME "first", e.FirstName),
		XMLELEMENT(NAME "last", e.LastName)),
		XMLELEMENT(NAME "hire-dates",
			XMLATTRIBUTES(e.StartDate as "start", e.EndDate as "end")))
FROM Employees e WHERE e.EmpId = 12

SQL/XML:2007

Outline (XML Databases)

  1. XML and Databases [17]
    1. Relational Databases [2]
    2. Database Support for XML [4]
    3. XML Storage in Databases [9]
  2. XML in Relational Databases [5]
    1. SQL/XML [3]
  3. XML Databases [3]
  4. Conclusions [2]

Storing XML

XML Query Language (XQuery)

XQuery Example

  <video id="3325584384">
   <studio></studio>
   <actorRef>325442748</actorRef>
   <actorRef>2096814035</actorRef>
   <actorRef>4231919377</actorRef>
   <director>Louis Malle</director>
   <title>Au Revoir Les Enfants</title>
  <actor id="325442748">Feito, Raphael</actor>
  <actor id="4231919377">Morier, Philippe</actor>
  <actor id="2142583927">Racette, Francine</actor>
declare variable $firstName external;
<videos featuring="{$firstName}">
{
   let $doc := .
   for $v in $doc//video,
      $a in $doc//actors/actor
   where ends-with($a, $firstName) and $v/actorRef = $a/@id
   order by $v/year
   return
      <video year="{$v/year}">
         {$v/title}
      </video>
}
</videos> 

Outline (Conclusions)

  1. XML and Databases [17]
    1. Relational Databases [2]
    2. Database Support for XML [4]
    3. XML Storage in Databases [9]
  2. XML in Relational Databases [5]
    1. SQL/XML [3]
  3. XML Databases [3]
  4. Conclusions [2]

Tables and Trees don't Mix

Database Technologies do Mix