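Unicode defines a character set whose code space runs from U+0000 to U+10FFFF, which fits in 21 bits (the UCS code space was originally specified as 31 bits). Unicode is closely aligned with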
UCS. The most commonly used characters, including all those found in older encoding standards, have been placed in one of the first 65534 positions (0x0000 to 0xFFFD). This 16-bit subset is called the
BMP or "Plane 0". The characters that were later added outside the 16-bit
BMP are mostly for specialist applications such as historic scripts and scientific notation. New characters are still being added, but existing characters are stable and will not be changed. Unicode assigns each character not only a code number but also an official name. A hexadecimal number that represents a Unicode or
UCS value is commonly preceded by "U+" as in U+0041 for the character "Latin capital letter A". The Unicode characters U+0000 to U+007F are identical to those in
ASCII, and the range U+0000 to U+00FF is identical to
ISO 8859-1.
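
The conventions described above can be checked with a short Python sketch. It is illustrative only: the helper names u_plus and in_bmp are hypothetical, and it assumes that Python's "latin-1" codec is an acceptable stand-in for ISO 8859-1. Note that in_bmp tests the whole of Plane 0 (up to U+FFFF), which also covers the two positions U+FFFE and U+FFFF excluded from the range quoted above.

```python
import unicodedata


def u_plus(ch: str) -> str:
    """Format one character in the conventional U+XXXX notation (hypothetical helper)."""
    return f"U+{ord(ch):04X}"


def in_bmp(ch: str) -> bool:
    """Return True if the code point lies in Plane 0, U+0000..U+FFFF (hypothetical helper)."""
    return ord(ch) <= 0xFFFF


# Code number and official name of one character.
print(u_plus("A"), unicodedata.name("A"))

# Every ASCII byte value decodes to the code point with the same number.
ascii_matches = all(ord(bytes([b]).decode("ascii")) == b for b in range(0x80))

# The same holds for the 256 byte values of ISO 8859-1 (assumed: "latin-1" codec).
latin1_matches = all(ord(bytes([b]).decode("latin-1")) == b for b in range(0x100))

print(ascii_matches, latin1_matches)

# A character outside the BMP: U+1D11E, a musical symbol.
print(u_plus("\U0001D11E"), in_bmp("\U0001D11E"))
```

The two all(...) checks compare each byte value with the code point it decodes to, which is exactly the identity the text claims for ASCII and ISO 8859-1.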