Details on modularization can be found in Chapter 4 of Learning XML.
Some important notes:
DTD files that modularize (apart from a few modifications) the single DTD we developed for Exercise 1:
Firstly, we define the data types length and year. These are referenced in the ATTLIST declarations of the elements track and album. Secondly, we define a new data type URI, which is used as the type of the attribute href of the element link. Lastly, we use the parameter entity (PE) mechanism to declare enumerations of values through the entity declarations for styles and genders.
Details on character encodings can be found in Chapter 9 of Learning XML, but a short explanation is:
The character set for a document is the scheme by which characters are derived from the numerical values in its underlying form; e.g., a byte value of 252 corresponds to "ü". In 1991 the Unicode Consortium defined that 16 bits shall be used to describe all the international characters, i.e. 2^16 = 65,536 different characters can be represented (later versions of Unicode extend the code space beyond 16 bits). To address a certain character, you use a character reference: &#X; where X is the decimal number of the character, or &#xX; where X is its hexadecimal number; e.g., &#344; (or &#x158;) maps to "Ř". Note that the table, or mapping, of the Unicode character set is *fixed*; e.g., character number 252 *always* corresponds to "ü". For compatibility reasons, the first 128 characters of the Unicode set correspond to the US-ASCII character set, then the next 128 correspond to the upper half of ISO-8859-1 (the Latin-1 set), and then the next 256 to...
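As a small illustration of the character references described above, the following sketch writes the same two characters once in decimal and once in hexadecimal form. The element names characters and char are made up for this example; only the references themselves matter. Because the file contains nothing but ASCII, it should parse the same way regardless of which ASCII-compatible encoding it is saved in.

```xml
<?xml version="1.0"?>
<!-- Hypothetical element names, chosen only for this illustration. -->
<characters>
  <!-- decimal 252 and hexadecimal FC: small u with diaeresis -->
  <char>&#252;</char>
  <char>&#xFC;</char>
  <!-- decimal 344 and hexadecimal 158: capital R with caron -->
  <char>&#344;</char>
  <char>&#x158;</char>
</characters>
```

Note that the hexadecimal form needs the "x" after "&#"; without it the digits are read as a decimal number.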
The purpose of an XML "encoding" property is to allow typing the characters belonging to the specified encoding scheme *directly* instead of referring to them as &#X;, i.e. to type an "ü" directly. Yet another point is that these decimal or hexadecimal values are a bit hard to remember. Thus, it's convenient to define another mapping, i.e. to use &uuml;, which maps to &#252;. But this mapping has to be defined first! So that an XML parser knows this mapping, the file that defines it has to be included through a parameter entity, e.g.:
<!DOCTYPE ... [
<!ENTITY % HTMLlat1 SYSTEM "http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent">
%HTMLlat1;
]>
Note that the file "xhtml-lat1.ent" contains definitions such as
<!ENTITY uuml "&#252;">
Thus, if you use the parameter entity definition shown above, you can refer to these entity definitions like "normal" entities with "&". Therefore, to have an "ü" displayed, you can either use &#252; or, if you included the parameter entity from above, &uuml;.
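Putting the pieces together, a complete document might look like the sketch below. It is only an illustration under a few assumptions: the element names text and p are made up, the entity uuml is declared directly in the internal subset (instead of loading "xhtml-lat1.ent") so that the example does not depend on the parser fetching an external file, and the file is assumed to be really saved as ISO-8859-1, because the encoding declaration has to match the bytes on disk.

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE text [
  <!-- same kind of declaration as in xhtml-lat1.ent, repeated locally -->
  <!ENTITY uuml "&#252;">
]>
<!-- Hypothetical element names; three spellings of the same character. -->
<text>
  <!-- numeric character reference -->
  <p>Z&#252;rich</p>
  <!-- named entity declared in the internal subset above -->
  <p>Z&uuml;rich</p>
  <!-- typed directly; relies on the encoding declaration -->
  <p>Zürich</p>
</text>
```

The first two spellings are pure ASCII and therefore independent of the encoding declaration; only the third one breaks if the declared encoding and the actual file encoding differ.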
PS: If you "browse" such an XML file with Internet Explorer 6, it shows the "ü" correctly in both cases (i.e. with &#252; and &uuml;). However, doing so with Netscape, Mozilla, or Opera causes an error :-(
XML Files with different character encodings:
Please send comments to xml-vl@dret.net. Last modification on Friday, 21-Apr-2006 14:28:17 CEST.