character encoding
<character> (Or "character encoding scheme") A mapping of binary values
to code positions and back; generally a 1:1 (bijective) mapping.
In the case of ASCII, this is generally the identity mapping, f(x) = x: code
point 65 maps to the byte value 65, and vice versa. This is possible because
ASCII uses only code positions representable as single bytes, i.e., values
between 0 and 255 at most. (In fact, US-ASCII uses only values 0 to 127.)
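As a minimal sketch of the identity mapping described above, the following Python snippet checks that every US-ASCII code point encodes to a single byte with the same numeric value. The function name `ascii_is_identity` is hypothetical and chosen only for this illustration.

```python
def ascii_is_identity():
    """Return True if each US-ASCII code point 0..127 encodes to the
    single byte with the same value, and decodes back unchanged."""
    for code_point in range(128):
        char = chr(code_point)
        encoded = char.encode("ascii")
        # One byte per character, and that byte equals the code point.
        if encoded != bytes([code_point]):
            return False
        # The mapping is reversible (bijective on this range).
        if encoded.decode("ascii") != char:
            return False
    return True


# Code point 65 ("A") maps to byte value 65, and vice versa.
byte_for_A = "A".encode("ascii")[0]
char_for_65 = bytes([65]).decode("ascii")
```

Here `byte_for_A` is the integer 65 and `char_for_65` is the string "A", matching the example in the text.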
Unicode and many CJK coded character sets use many more than 256 positions,
requiring more complex mappings: sometimes the characters are mapped onto pairs
of bytes (see DBCS). In many cases, this breaks programs that assume a
one-to-one mapping of bytes to characters, and so, for example, treat any
occurrence of the byte value 13 as a carriage return. To avoid this problem,
character encodings such as UTF-8 were devised, in which byte values 0 to 127
always stand for the corresponding ASCII characters and never occur as part
of a multi-byte sequence.
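The byte-13 problem can be illustrated with a short Python sketch. It assumes, purely for illustration, the character U+010D (LATIN SMALL LETTER C WITH CARON) and big-endian UTF-16 as the two-byte mapping; neither choice comes from the entry itself, and the helper name `contains_byte` is hypothetical.

```python
def contains_byte(text, encoding, byte_value):
    """Return True if encoding `text` yields a byte equal to `byte_value`."""
    return byte_value in text.encode(encoding)


CARRIAGE_RETURN = 13

# Illustrative character: U+010D, whose code point has 0x0D as its low byte.
ch = "\u010d"

# Two-byte mapping: the bytes are 0x01 0x0D, so a byte-oriented program
# scanning for value 13 would wrongly see a carriage return.
utf16_bytes = ch.encode("utf-16-be")
false_cr_in_utf16 = contains_byte(ch, "utf-16-be", CARRIAGE_RETURN)

# UTF-8: the bytes are 0xC4 0x8D; every byte of a multi-byte sequence
# is 0x80 or above, so no false carriage return can appear.
utf8_bytes = ch.encode("utf-8")
false_cr_in_utf8 = contains_byte(ch, "utf-8", CARRIAGE_RETURN)
all_high_bytes = all(b >= 0x80 for b in utf8_bytes)
```

In this sketch `false_cr_in_utf16` is true while `false_cr_in_utf8` is false, which is the property that lets byte-oriented programs handle UTF-8 text without misreading control characters.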
(1998-10-18)