Details on modularization can be found in Chapter 4 of Learning XML.
Some important notes:
DTD files that modularize (apart from a few modifications) the single DTD we developed for Exercise 1:
Firstly, we define the data types length and year.
These are referenced in the ATTLIST declarations of the elements
track and album. Secondly, we define a new data
type URI which is used to define the type of the attribute href
for the element link. Lastly, we use the PE mechanism for declaring
an enumeration of values through the entity declarations for styles
and genders.
Details on character encodings can be found in Chapter 9 of Learning XML, but a short explanation is:
The character set for a document is the scheme by which characters are derived from the numerical values in its underlying form, e.g., in ISO-8859-1 a byte value of 252 corresponds to "ü". Now, in 1991 the Unicode Consortium defined that 16 bits shall be used to describe all the international characters, i.e. 2^16 different characters can be represented (later versions of Unicode extended the range beyond 16 bits). To address a certain character, you use &#X; where X is the decimal number of the character, or &#xX; where X is its hexadecimal number, e.g., &#344; (or &#x158;) maps to "Ř". Note that the table, or mapping, of the Unicode character set is *fixed*, e.g., character number 252 *always* corresponds to "ü". For compatibility reasons, the first 128 characters of the Unicode set correspond to the US-ASCII character set, then the next 128 correspond to the upper half of ISO-8859-1 (the Latin-1 set) and then the next 256 to...
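The fixed mapping between numbers and characters can be checked with a few lines of Python. This is only a sketch: html.unescape is an HTML helper from the standard library, used here as a convenient stand-in because it resolves numeric character references the same way an XML parser does for these values. The variable names are made up for illustration.

```python
import html

# Byte value 252, read as ISO-8859-1, is the character "ü".
from_byte = bytes([252]).decode("iso-8859-1")

# Unicode code point 252 (hexadecimal FC) is the same character.
from_code_point = chr(252)

# Numeric character references: decimal form and hexadecimal form.
from_decimal_ref = html.unescape("&#252;")
from_hex_ref = html.unescape("&#xFC;")

# "Ř" is character number 344 (hexadecimal 158).
r_caron = html.unescape("&#344;")

print(from_byte, from_code_point, from_decimal_ref, from_hex_ref, r_caron)
print(ord("ü"), hex(ord("Ř")))
```

All four ways of writing "ü" yield the same character, because the number 252 is tied to it in the Unicode table.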
The purpose of the XML "encoding" property is to allow typing the characters belonging to the specified encoding scheme *directly* instead of referring to them as &#X;, i.e. to type an "ü" directly. Another point is that these decimal or hexadecimal values are a bit hard to remember. Thus, it's convenient to define another mapping, i.e. to use &uuml; which maps to &#252;. But this mapping has to be defined first! So that an XML parser knows this mapping, the file that contains it has to be included through a parameter entity, e.g.:
[
<!ENTITY % HTMLlat1 SYSTEM "http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent"> %HTMLlat1;
]>
Note that the file "xhtml-lat1.ent" contains definitions such as:
<!ENTITY uuml "&#252;">
Thus, if you use the parameter entity definition as shown above, you can refer to the entity definitions like a "normal" entity with "&". Therefore, to have an "ü" displayed, you can either use &#252; or, if you included the parameter entity from above, &uuml;.
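To see the difference between a declared and an undeclared named entity, here is a minimal Python sketch using the standard library parser xml.etree.ElementTree. It makes two assumptions that are not part of the exercise: the root element name "name" is invented, and the entity uuml is declared directly in the internal subset instead of being loaded from xhtml-lat1.ent, so the example does not depend on fetching an external file.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the root element "name" is an assumption.
# The internal subset declares uuml with the same kind of ENTITY
# declaration that the entity file contains.
DOC = """<!DOCTYPE name [
  <!ENTITY uuml "&#252;">
]>
<name>M&uuml;ller, M&#252;ller</name>"""

root = ET.fromstring(DOC)
text = root.text
print(text)

# Without any declaration, the named entity is unknown to the parser,
# while the numeric reference would still work.
try:
    ET.fromstring("<name>M&uuml;ller</name>")
    undeclared_ok = True
except ET.ParseError:
    undeclared_ok = False
print(undeclared_ok)
```

In the first document both spellings produce the same character. In the second, the parser rejects the named reference because nothing has declared it.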
PS: If you "browse" such an XML file with Internet Explorer 6, it shows the "ü" correctly in both cases (i.e. with &#252; and &uuml;). However, doing so with Netscape, Mozilla, or Opera causes an error :-(
XML Files with different character encodings:
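As a rough illustration of what the encoding declaration changes, the following Python sketch builds the same small document in three encodings and parses each one with the standard library. The documents and the element name "name" are invented for this example and are not the course files.

```python
import xml.etree.ElementTree as ET

# The same content, stored as bytes in three different encodings.
latin1_doc = (
    '<?xml version="1.0" encoding="ISO-8859-1"?><name>Müller</name>'
).encode("iso-8859-1")
utf8_doc = (
    '<?xml version="1.0" encoding="UTF-8"?><name>Müller</name>'
).encode("utf-8")
# US-ASCII cannot store "ü" directly, so a character reference is used.
ascii_doc = (
    '<?xml version="1.0" encoding="US-ASCII"?><name>M&#252;ller</name>'
).encode("ascii")

# The parser reads the declaration and decodes the bytes accordingly.
names = [ET.fromstring(doc).text for doc in (latin1_doc, utf8_doc, ascii_doc)]
print(names)

# The stored bytes differ: one byte in ISO-8859-1, two bytes in UTF-8.
print(b"\xfc" in latin1_doc, b"\xc3\xbc" in utf8_doc)
```

The bytes on disk differ, but after parsing all three documents contain the same text, which is why the declared encoding has to match the way the file was actually saved.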
Please send comments to xml-vl@dret.net. Last modification on Friday, 21-Apr-2006 05:28:17 PDT.