XML - Foundations, Principles, and Applications

Sample Solution for Exercise 2 (XML DTD)

Refining the DTD


Parts 1 and 2 (Parameter Entity Declarations and References, and DTD Modularization)

Details on modularization can be found in Chapter 4 of Learning XML.
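As a minimal sketch of the idea: an internal parameter entity names a reusable piece of a content model, and an external parameter entity pulls a whole module into the DTD at the point where it is referenced. The file names main.dtd and common.dtd and all element names are hypothetical; this is not the DTD from Exercise 1.

```xml
<!-- main.dtd: hypothetical driver file of a modularized DTD -->

<!-- Internal parameter entity: a reusable content-model fragment -->
<!ENTITY % inline "#PCDATA | em | strong">

<!-- External parameter entity: declares where the module is stored -->
<!ENTITY % common-module SYSTEM "common.dtd">
<!-- The reference includes the module's declarations right here -->
%common-module;

<!-- Parameter entity reference inside a declaration -->
<!ELEMENT para (%inline;)*>
<!ELEMENT em (#PCDATA)>
<!ELEMENT strong (#PCDATA)>
```

A parameter entity has to be declared before it is referenced. References inside a declaration, such as (%inline;)* above, are only allowed in an external DTD file, not in the internal subset of a document.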

Some important notes:

 

DTD files that modularize (apart from a few modifications) the single DTD we developed for Exercise 1:


Part 3 (Character References)

Details on character encodings can be found in Chapter 9 of Learning XML, but a short explanation is:

The character set of a document is the scheme by which characters are derived from the numerical values in its underlying form; e.g., in ISO-8859-1 a byte value of 252 corresponds to "ü". In 1991 the Unicode Consortium specified that 16 bits be used to describe all the international characters, i.e., 2^16 = 65,536 different characters can be represented. (Unicode has since been extended beyond 16 bits, to code points up to hexadecimal 10FFFF.)

To address a certain character, you use a character reference: &#N; where N is the decimal number of the character, or &#xN; where N is its hexadecimal number. For example, &#344; (or &#x158; in hexadecimal) maps to "Ř". Note that the table, or mapping, of the Unicode character set is *fixed*, e.g., number 252 *always* corresponds to "ü". For compatibility reasons, the first 128 characters of the Unicode set correspond to the US-ASCII character set, the next 128 to the upper half of ISO-8859-1 (the Latin-1 set), and then the next 256 to...
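For illustration, here is a small document whose file contains only ASCII characters and writes everything else as character references. The element names are made up for this sketch; each pair of lines shows the decimal and the hexadecimal form of the same character.

```xml
<?xml version="1.0" encoding="US-ASCII"?>
<!-- Hypothetical element names; the references are the point here -->
<letters>
  <letter>&#252;</letter>   <!-- decimal 252: u with diaeresis -->
  <letter>&#xFC;</letter>   <!-- hexadecimal FC = 252: same character -->
  <letter>&#344;</letter>   <!-- decimal 344: capital R with caron -->
  <letter>&#x158;</letter>  <!-- hexadecimal 158 = 344: same character -->
</letters>
```

Because character references are resolved by the parser, this document yields the same characters no matter which encoding the file itself is stored in.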

The purpose of the XML "encoding" declaration is to allow the characters belonging to the specified encoding scheme to be typed *directly* instead of being written as character references such as &#252;, i.e., to type an "ü" directly. Another point is that these decimal or hexadecimal values are a bit hard to remember. Thus, it is convenient to define another mapping, i.e., to use &uuml;, which maps to ü. But this mapping has to be defined first! So that an XML parser knows this mapping, the entity definitions have to be included via a parameter entity, e.g.:

[
<!ENTITY % HTMLlat1 SYSTEM "http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent">
%HTMLlat1;
]>

Note that the file "xhtml-lat1.ent" contains definitions such as

<!ENTITY uuml "&#252;">

Thus, if you use the parameter entity as shown above, you can refer to these definitions like "normal" (general) entities, with "&" and ";". Therefore, to have an "ü" displayed, you can use either &#252; or, if you included the parameter entity from above, &uuml;.
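Put together, a complete document might look like the following sketch. The root element "greeting" and its declaration are assumptions made for this example; the entity set URL is the one used above.

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE greeting [
  <!-- Hypothetical root element, declared so the document is valid -->
  <!ELEMENT greeting (#PCDATA)>
  <!-- Load the Latin-1 entity set and include its declarations -->
  <!ENTITY % HTMLlat1 SYSTEM "http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent">
  %HTMLlat1;
]>
<!-- Three spellings of the same character -->
<greeting>Gr&#252;ezi, Gr&uuml;ezi, Grüezi</greeting>
```

The directly typed "ü" only works if the file really is saved as ISO-8859-1, matching the encoding declaration. A parser that does not fetch external entities will not know &uuml;; declaring <!ENTITY uuml "&#252;"> directly in the internal subset is one possible way around that.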

PS: If you "browse" such an XML file with Internet Explorer 6, it shows the "ü" correctly in both cases (i.e. with &#252; and &uuml;). However, doing so with Netscape, Mozilla, or Opera causes an error :-(

XML Files with different character encodings:


please send comments to xml-vl@dret.net
last modification on Friday, 21-Apr-2006 08:28:17 EDT