ENC API

Introduction

ENC API provides character code conversion features.

Supported Character Encodings

The internal character encoding of Revolution is UTF-16BE.
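
As a small illustration of what UTF-16BE means at the byte level, the sketch below reads and writes one 16-bit code unit with the most significant byte first. The function names are hypothetical and are not part of ENC API.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only; not part of ENC API. */

/* Read one UTF-16BE code unit: the high byte comes first. */
static uint16_t utf16be_read(const uint8_t *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}

/* Write one UTF-16BE code unit into two bytes, high byte first. */
static void utf16be_write(uint16_t unit, uint8_t *p)
{
    p[0] = (uint8_t)(unit >> 8);
    p[1] = (uint8_t)(unit & 0xFF);
}
```

For example, the code unit 0x3042 is stored as the bytes 0x30, 0x42 in that order.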

ENC API supports bi-directional conversion between the following encodings and the internal character encoding.

ENC API supports a one-way conversion from the following encodings to the internal character encoding.

Character Encoding Name Matching

ENC API matches the character encoding names based on the following rules.

  1. Convert all alphabetic characters to lowercase.
  2. If the name starts with "x-" or "cs", remove that prefix.
  3. Remove all non-alphanumeric characters.
  4. Compare the result with the matching strings of the individual character encoding names.
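
Rules 1 to 3 above can be sketched as a small normalization routine. This is only an illustration: the function name is hypothetical, and it assumes that at most one prefix ("x-" or "cs") is removed and that "alphabetic" means the ASCII letters. ENC API's actual implementation is not shown in this document.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper; not part of ENC API.
 * Writes the normalized form of `name` into `out` (always NUL-terminated
 * when out_size > 0) and returns its length. */
static size_t normalize_enc_name(const char *name, char *out, size_t out_size)
{
    size_t len = 0;

    if (out_size == 0) {
        return 0;
    }

    /* Rule 2 (checked case-insensitively, which is equivalent to
     * lowercasing first): drop a leading "x-" or "cs".
     * Assumption: only one prefix is removed. */
    if ((tolower((unsigned char)name[0]) == 'x' && name[1] == '-') ||
        (tolower((unsigned char)name[0]) == 'c' &&
         tolower((unsigned char)name[1]) == 's')) {
        name += 2;
    }

    /* Rules 1 and 3: keep only letters and digits, lowercased. */
    for (; *name != '\0' && len + 1 < out_size; name++) {
        unsigned char c = (unsigned char)*name;
        if (isalnum(c)) {
            out[len++] = (char)tolower(c);
        }
    }
    out[len] = '\0';
    return len;
}
```

Under these assumptions, "Shift_JIS" becomes "shiftjis", "x-mac-ce" becomes "macce", and "csISOLatin1" becomes "isolatin1". The result is then compared with the matching strings (rule 4).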

The individual character encoding names and the matching strings are as follows.

Character Encoding Name   Matching Strings
US-ASCII                  usascii, ascii, us, ansix341968, ansix341986, cp367,
                          ibm367, iso646irv1991, iso646us, isoir6
UTF-8                     utf8, utf8n, unicode11utf8, unicode20utf8
UTF-16BE                  utf16be, ucs2, ucs2be, unicode11, unicode20,
                          unicode20utf16, unicodeascii, unicodelatin1,
                          iso10646, iso10646j1, iso10646ucs2, iso10646ucs2be,
                          iso10646ucsbasic, iso10646unicodelatin1
UTF-32BE                  utf32be, utf32, ucs4, ucs4be, iso10646ucs4,
                          iso10646ucs4be
ISO-8859-1                iso88591, latin1, l1, cp819, ibm819, isolatin1,
                          iso885911987, isoir100
ISO-8859-2                iso88592, latin2, l2, isolatin2, iso885921987,
                          isoir101
ISO-8859-3                iso88593, latin3, l3, isolatin3, iso885931988,
                          isoir109
ISO-8859-7                iso88597, greek, greek8, isolatingreek,
                          iso885971987, isoir126, ecma118, elot928, suneugreek
ISO-8859-10               iso885910, latin6, isolatin6, l6
ISO-8859-15               iso885915, latin9, iso8859101992, isoir157
ISO-2022-JP               iso2022jp, iso2022jp1, iso2022jp2
Shift_JIS                 shiftjis, sjis, mscp932, mskanji, windows31j
UHC                       euckr, ksc56011987, isoir149, ksc56011989, ksc5601,
                          korean, uhc, cp949, windows949
GB2312                    gb2312, gb231280, isoir58, chinese, iso58gb231280,
                          euccn
windows-1252              windows1252, cp1252, windows30latin1,
                          windows31latin1, iso88591windows30latin1,
                          iso88591windows31latin1

Character Encoding Name   Matching Strings
UTF-7                     utf7, unicode11utf7, unicode20utf7, cp65000
UTF-16                    utf16, cp1200, ibm1200
UTF-16LE                  utf16le, ucs2le, iso10646ucs2le
windows-1250              windows1250, cp1250, windows31latin2,
                          iso88592windowslatin2
windows-1253              windows1253, cp1253
macintosh                 macintosh, mac, macroman
x-mac-ce                  macce
x-mac-greek               macgreek
IBM850                    ibm850, cp850, 850, pc850multilingual
IBM852                    ibm852, cp852, 852, pcp852
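
The final comparison step can be pictured as a simple table lookup from a matching string to its character encoding name. The sketch below uses only a handful of entries copied from the tables above. The function name and the table layout are hypothetical, and the input is assumed to be already normalized according to the matching rules.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup table; a small excerpt of the tables above. */
typedef struct {
    const char *match;   /* matching string */
    const char *name;    /* character encoding name */
} enc_alias;

static const enc_alias enc_aliases[] = {
    { "usascii",    "US-ASCII"   }, { "ascii",   "US-ASCII"   },
    { "utf8",       "UTF-8"      }, { "utf16be", "UTF-16BE"   },
    { "ucs2",       "UTF-16BE"   }, { "latin1",  "ISO-8859-1" },
    { "iso88591",   "ISO-8859-1" }, { "sjis",    "Shift_JIS"  },
    { "shiftjis",   "Shift_JIS"  }, { "euckr",   "UHC"        },
    { "cp949",      "UHC"        }, { "gb2312",  "GB2312"     },
    { "euccn",      "GB2312"     }, { "macce",   "x-mac-ce"   },
    { "cp850",      "IBM850"     },
};

/* Returns the encoding name for an already-normalized string,
 * or NULL if no matching string is found. */
static const char *find_encoding_name(const char *normalized)
{
    size_t i;
    for (i = 0; i < sizeof enc_aliases / sizeof enc_aliases[0]; i++) {
        if (strcmp(enc_aliases[i].match, normalized) == 0) {
            return enc_aliases[i].name;
        }
    }
    return NULL;
}
```

With this excerpt, both "sjis" and "shiftjis" resolve to Shift_JIS, and an unlisted string yields NULL.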

Conversion Rules of Individual Encodings

ISO-8859

ISO-8859 conversion is supported in both directions: from ISO-8859 to the internal character encoding, and from the internal character encoding to ISO-8859.
As an exception, ISO-8859-1 is treated as windows-1252 when converting to the internal character encoding, but as ISO-8859-1 when converting from the internal character encoding.
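
The asymmetry for ISO-8859-1 can be sketched as follows. Bytes 0x80 to 0x9F are decoded with the windows-1252 table, while encoding accepts only code points up to 0x00FF. The function names are hypothetical. The handling of the five bytes that windows-1252 leaves undefined (0x81, 0x8D, 0x8F, 0x90, 0x9D) is an assumption here (they are passed through unchanged); this document does not say what ENC API does with them.

```c
#include <stdint.h>

/* windows-1252 mappings for bytes 0x80..0x9F.
 * 0 marks a byte that windows-1252 leaves undefined. */
static const uint16_t cp1252_high[32] = {
    0x20AC, 0,      0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0,      0x017D, 0,
    0,      0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0,      0x017E, 0x0178
};

/* Hypothetical: decode one "ISO-8859-1" byte the windows-1252 way. */
static uint16_t latin1_byte_to_internal(uint8_t b)
{
    if (b >= 0x80 && b <= 0x9F && cp1252_high[b - 0x80] != 0) {
        return cp1252_high[b - 0x80];
    }
    return b; /* assumption: undefined bytes pass through unchanged */
}

/* Hypothetical: encode one code unit as strict ISO-8859-1.
 * Returns 1 on success, 0 if the character has no ISO-8859-1 code. */
static int internal_to_latin1_byte(uint16_t unit, uint8_t *out)
{
    if (unit > 0x00FF) {
        return 0;
    }
    *out = (uint8_t)unit;
    return 1;
}
```

One consequence of this design is that a round trip is not always possible: byte 0x80 decodes to 0x20AC (the euro sign), which cannot be encoded back to ISO-8859-1.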

Japanese Character Encoding

The ISO-2022-JP and Shift_JIS conversion rules are compatible with Windows conversion with certain exceptions.

When converting from the internal character encoding to ISO-2022-JP or Shift_JIS, the following mappings, which are not part of the Windows conversion, are used.

Internal Character Encoding   ISO-2022-JP   Shift_JIS
0x203E                        0x7E          0x7E
0x2014                        0x213D        0x815C
0x2016                        0x2142        0x8161
0x2212                        0x215D        0x817C
0x301C                        0x2141        0x8160
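
One way to picture these exceptions is as a small override table that is consulted when converting from the internal character encoding. The sketch below holds exactly the five rows listed above. The type and function names are hypothetical, and the Unicode character names in the comments are added for readability.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical override table; values are copied from the table above. */
typedef struct {
    uint16_t internal;  /* internal character encoding (UTF-16BE unit) */
    uint16_t jis;       /* ISO-2022-JP code */
    uint16_t sjis;      /* Shift_JIS code */
} jp_override;

static const jp_override jp_overrides[] = {
    { 0x203E, 0x7E,   0x7E   },  /* OVERLINE */
    { 0x2014, 0x213D, 0x815C },  /* EM DASH */
    { 0x2016, 0x2142, 0x8161 },  /* DOUBLE VERTICAL LINE */
    { 0x2212, 0x215D, 0x817C },  /* MINUS SIGN */
    { 0x301C, 0x2141, 0x8160 },  /* WAVE DASH */
};

/* Returns the override row for `internal`, or NULL if the character
 * is not one of the listed exceptions. */
static const jp_override *find_jp_override(uint16_t internal)
{
    size_t i;
    for (i = 0; i < sizeof jp_overrides / sizeof jp_overrides[0]; i++) {
        if (jp_overrides[i].internal == internal) {
            return &jp_overrides[i];
        }
    }
    return NULL;
}
```

A converter built this way would fall back to its regular conversion table whenever the lookup returns NULL.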

ISO-2022-JP supports the following character groups.

However, JIS romaji supports only one-way conversion from ISO-2022-JP, where it is treated the same as ASCII.
Half-width kana likewise supports only one-way conversion from ISO-2022-JP; when converting to ISO-2022-JP, half-width kana is converted to full-width kana.

In the internal character encoding, the private area (1880 characters) for ISO-2022-JP and Shift_JIS is defined in the ranges shown below, with characters mapped in code order.
Private-area characters can be converted from ISO-2022-JP, but conversion to ISO-2022-JP returns ENC_ERR_NO_MAP_RULE.

Internal Character Encoding   ISO-2022-JP       Shift_JIS
0xE000 - 0xE757               0x7F21 - 0x927E   0xF040 - 0xF9FC
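
Mapping "in code order" can be sketched as simple index arithmetic. The sketch assumes the internal range is 0xE000 plus 1880 characters (ending at 0xE757), that each ISO-2022-JP row holds 94 cells (0x21 to 0x7E), and that each Shift_JIS lead byte holds 188 trail bytes (0x40 to 0xFC, skipping 0x7F). These assumptions are consistent with the range endpoints above, but the function names are hypothetical and the actual implementation is not shown in this document.

```c
#include <stdint.h>

#define PRIVATE_FIRST 0xE000u
#define PRIVATE_COUNT 1880u

/* Hypothetical: internal private-area code -> Shift_JIS code.
 * Returns 0 if `code` is outside the private area. */
static uint16_t private_to_sjis(uint16_t code)
{
    unsigned idx, lead, trail;

    if (code < PRIVATE_FIRST || code >= PRIVATE_FIRST + PRIVATE_COUNT) {
        return 0;
    }
    idx = code - PRIVATE_FIRST;
    lead = 0xF0u + idx / 188u;      /* 10 lead bytes: 0xF0..0xF9 */
    trail = 0x40u + idx % 188u;
    if (trail >= 0x7Fu) {
        trail++;                    /* 0x7F is not a valid trail byte */
    }
    return (uint16_t)((lead << 8) | trail);
}

/* Hypothetical: ISO-2022-JP two-byte code -> internal private-area code.
 * Returns 0 if `jis` is outside 0x7F21..0x927E. */
static uint16_t jis_to_private(uint16_t jis)
{
    unsigned row = jis >> 8;
    unsigned cell = jis & 0xFFu;

    if (row < 0x7Fu || row > 0x92u || cell < 0x21u || cell > 0x7Eu) {
        return 0;
    }
    return (uint16_t)(PRIVATE_FIRST + (row - 0x7Fu) * 94u + (cell - 0x21u));
}
```

Only the ISO-2022-JP to internal direction is sketched for ISO-2022-JP, because the reverse direction returns ENC_ERR_NO_MAP_RULE.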

Korean Character Encoding

UHC (CP949) is supported for the encoding of Korean characters.
Be aware that its conversion table is larger than those for Japanese or Chinese.
The supported conversion targets can be restricted to either of the following by stripping the conversion table, as described in Stripping the Conversion Table.
In either case, the resulting conversion table is about the same size as those for Japanese and Chinese.

In the former case, the conversion target is the KS X 1001:1992 character set, which contains 2350 Hangul characters.
In the latter case, all Hangul characters are supported, but Chinese characters are excluded from the conversion targets.
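
To make the regions concrete, the sketch below classifies a two-byte UHC code by its lead and trail bytes. The byte ranges come from the commonly documented CP949 layout, not from this document, and the sketch assumes that the "UHC extended Hangul region" corresponds to the Hangul codes outside KS X 1001 and that the "Chinese character region" corresponds to the KS X 1001 Chinese character rows. All names are hypothetical.

```c
#include <stdint.h>

typedef enum {
    UHC_OTHER,            /* symbols, user-defined area, invalid codes */
    UHC_KSX1001_HANGUL,   /* 25 rows x 94 cells = 2350 characters */
    UHC_KSX1001_HANJA,    /* Chinese characters in KS X 1001 */
    UHC_EXT_HANGUL        /* Hangul added by UHC outside KS X 1001 */
} uhc_region;

/* Trail bytes used by the extended region: 0x41-0x5A, 0x61-0x7A,
 * and 0x81 up to `last`. */
static int uhc_ext_trail(uint8_t t, uint8_t last)
{
    return (t >= 0x41 && t <= 0x5A) ||
           (t >= 0x61 && t <= 0x7A) ||
           (t >= 0x81 && t <= last);
}

static uhc_region uhc_classify(uint8_t lead, uint8_t trail)
{
    if (trail >= 0xA1 && trail <= 0xFE) {
        if (lead >= 0xB0 && lead <= 0xC8) return UHC_KSX1001_HANGUL;
        if (lead >= 0xCA && lead <= 0xFD) return UHC_KSX1001_HANJA;
    }
    if (lead >= 0x81 && lead <= 0xA0) {
        return uhc_ext_trail(trail, 0xFE) ? UHC_EXT_HANGUL : UHC_OTHER;
    }
    if (lead >= 0xA1 && lead <= 0xC5) {
        return uhc_ext_trail(trail, 0xA0) ? UHC_EXT_HANGUL : UHC_OTHER;
    }
    if (lead == 0xC6) {
        /* The extended region ends at 0xC652. */
        return (trail >= 0x41 && trail <= 0x52) ? UHC_EXT_HANGUL : UHC_OTHER;
    }
    return UHC_OTHER;
}
```

Under these assumptions the KS X 1001 Hangul region holds 25 x 94 = 2350 codes, which matches the figure given above.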

Conversion of the private area is not supported in either direction.

Chinese Character Encoding

For Chinese character encoding, characters not found in GB2312-80 are converted according to the internal fonts of the console.
For a listing, see List of Additional Conversion Rules for Chinese Character Encoding.

Conversion of the private area is not supported in either direction.

Stripping the Conversion Table

If some of the relatively large conversion tables are not needed, they can be stripped by defining a macro within the program.
Attempting to convert between the internal character encoding and a character encoding whose conversion table has been stripped results in an error. For the error code returned in this case, see the manual entry for each function.

The currently supported macros are shown below.

Macro                      Character Encoding
ENC_STRIP_TABLE_JP         ISO-2022-JP, Shift_JIS
ENC_STRIP_TABLE_KR_KANJI   UHC Chinese character region
ENC_STRIP_TABLE_KR_UHC     UHC extended Hangul region
ENC_STRIP_TABLE_KR         UHC
ENC_STRIP_TABLE_CN         GB2312

See Also

List of Additional Conversion Rules for Chinese Character Encoding

Revision History

2008/10/21 Revised the part where a mention of ENC_ERR_NOT_LOADED still remained.
2008/02/21 Added character codes for Korean and Chinese.
2007/02/05 Added a description about stripping conversion tables.
2006/11/14 Revised description of the private area.
2006/10/24 Initial version.

