ENC API Introduction

Introduction

The ENC API provides character encoding conversion features.

Supported Character Encodings

The internal character encoding of Revolution is UTF-16BE.

ENC API supports bi-directional conversion between the following encodings and the internal character encoding:

ENC API supports a one-way conversion from the following encodings to the internal character encoding:

Character Encoding Name Matching

The ENC API matches character encoding names according to the following rules:

  1. Convert all alphabetic characters to lowercase.
  2. If the name starts with "x-" or "cs", remove that prefix.
  3. Remove all non-alphanumeric characters.
  4. Compare the result with the matching strings of each character encoding name.
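
The four rules above can be sketched in C as follows. This is only an illustration, not the ENC API's implementation: the function names normalize_encoding_name and find_encoding are hypothetical, the table holds just a few of the matching strings listed in this document, and the sketch assumes the prefix in rule 2 is removed at most once.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not an ENC API function): applies rules 1-3.
   Names longer than the internal buffer are truncated in this sketch. */
static size_t normalize_encoding_name(const char *name, char *out, size_t out_size)
{
    char lower[64];
    size_t n = 0, i, start = 0, len = 0;

    if (out_size == 0)
        return 0;

    /* Rule 1: convert ASCII letters to lowercase. */
    for (i = 0; name[i] != '\0' && n + 1 < sizeof lower; i++) {
        char c = name[i];
        if (c >= 'A' && c <= 'Z')
            c = (char)(c - 'A' + 'a');
        lower[n++] = c;
    }
    lower[n] = '\0';

    /* Rule 2: drop a leading "x-" or "cs" (assumed to apply once). */
    if (strncmp(lower, "x-", 2) == 0 || strncmp(lower, "cs", 2) == 0)
        start = 2;

    /* Rule 3: keep only letters and digits. */
    for (i = start; i < n && len + 1 < out_size; i++) {
        char c = lower[i];
        if ((c >= 'a' && c <= 'z') || (c >= '0' && c <= '9'))
            out[len++] = c;
    }
    out[len] = '\0';
    return len;
}

/* A small excerpt of the matching strings listed in this document. */
static const struct { const char *encoding; const char *match; } kMatchTable[] = {
    { "US-ASCII",   "usascii"   },
    { "US-ASCII",   "ascii"     },
    { "UTF-8",      "utf8"      },
    { "ISO-8859-1", "latin1"    },
    { "ISO-8859-1", "isolatin1" },
    { "Shift_JIS",  "shiftjis"  },
    { "Shift_JIS",  "sjis"      },
    { "x-mac-ce",   "macce"     },
};

/* Hypothetical helper: rule 4. Returns the encoding name, or NULL if none matches. */
static const char *find_encoding(const char *name)
{
    char key[64];
    size_t i;

    normalize_encoding_name(name, key, sizeof key);
    for (i = 0; i < sizeof kMatchTable / sizeof kMatchTable[0]; i++) {
        if (strcmp(key, kMatchTable[i].match) == 0)
            return kMatchTable[i].encoding;
    }
    return NULL;
}
```

With this sketch, find_encoding("csISOLatin1") and find_encoding("Latin-1") would both resolve to ISO-8859-1, and find_encoding("X-MAC-CE") to x-mac-ce.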

The individual character encoding names and the matching strings are as follows:

Character encoding name   Matching strings
US-ASCII                  usascii, ascii, us, ansix341968, ansix341986, cp367,
                          ibm367, iso646irv1991, iso646us, isoir6
UTF-8                     utf8, utf8n, unicode11utf8, unicode20utf8
UTF-16BE                  utf16be, ucs2, ucs2be, unicode11, unicode20,
                          unicode20utf16, unicodeascii, unicodelatin1,
                          iso10646, iso10646j1, iso10646ucs2, iso10646ucs2be,
                          iso10646ucsbasic, iso10646unicodelatin1
UTF-32BE                  utf32be, utf32, ucs4, ucs4be, iso10646ucs4,
                          iso10646ucs4be
ISO-8859-1                iso88591, latin1, l1, cp819, ibm819, isolatin1,
                          iso885911987, isoir100
ISO-8859-2                iso88592, latin2, l2, isolatin2, iso885921987,
                          isoir101
ISO-8859-3                iso88593, latin3, l3, isolatin3, iso885931988,
                          isoir109
ISO-8859-7                iso88597, greek, greek8, isolatingreek,
                          iso885971987, isoir126, ecma118, elot928,
                          suneugreek
ISO-8859-10               iso885910, latin6, isolatin6, l6
ISO-8859-15               iso885915, latin9, iso8859101992, isoir157
ISO-2022-JP               iso2022jp, iso2022jp1, iso2022jp2
Shift_JIS                 shiftjis, sjis, mscp932, mskanji, windows31j
windows-1252              windows1252, cp1252, windows30latin1,
                          windows31latin1, iso88591windows30latin1,
                          iso88591windows31latin1

Character encoding name   Matching strings
UTF-7                     utf7, unicode11utf7, unicode20utf7, cp65000
UTF-16                    utf16, cp1200, ibm1200
UTF-16LE                  utf16le, ucs2le, iso10646ucs2le
windows-1250              windows1250, cp1250, windows31latin2,
                          iso88592windowslatin2
windows-1253              windows1253, cp1253
macintosh                 macintosh, mac, macroman
x-mac-ce                  macce
x-mac-greek               macgreek
IBM850                    ibm850, cp850, 850, pc850multilingual
IBM852                    ibm852, cp852, 852, pcp852

Conversion Rules of Individual Encodings

ISO-8859

ISO-8859 conversion covers both conversion from ISO-8859 to the internal character encoding and the reverse conversion.
For ISO-8859-1, however, the input is treated as windows-1252 when converting to the internal character encoding,
but the output is treated as ISO-8859-1 when converting from the internal character encoding.
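
The asymmetry described above can be illustrated with a minimal C sketch. It is not the ENC API's code: the function names are hypothetical, the table is the standard windows-1252 assignment for bytes 0x80 to 0x9F, and reporting "no mapping" for the five bytes that windows-1252 leaves undefined, as well as for code points above 0xFF on output, is an assumption made for this sketch.

```c
#include <stdint.h>

/* windows-1252 assignments for bytes 0x80..0x9F.
   0x0000 marks the five bytes that windows-1252 leaves undefined. */
static const uint16_t kWin1252High[32] = {
    0x20AC, 0x0000, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0x0000, 0x017D, 0x0000,
    0x0000, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x0000, 0x017E, 0x0178,
};

/* Hypothetical helper: "ISO-8859-1" input decoded as windows-1252.
   Returns 1 on success, 0 if the byte has no mapping (assumed behavior). */
static int latin1_to_internal(uint8_t byte, uint16_t *out)
{
    if (byte >= 0x80 && byte <= 0x9F) {
        uint16_t mapped = kWin1252High[byte - 0x80];
        if (mapped == 0)
            return 0;
        *out = mapped;
        return 1;
    }
    *out = byte;  /* Outside 0x80..0x9F the two encodings agree. */
    return 1;
}

/* Hypothetical helper: output encoded as true ISO-8859-1, so only
   code points 0x0000..0x00FF can be represented. */
static int internal_to_latin1(uint16_t code, uint8_t *out)
{
    if (code > 0x00FF)
        return 0;
    *out = (uint8_t)code;
    return 1;
}
```

In this sketch the byte 0x80 decodes to 0x20AC (the euro sign), but 0x20AC cannot be encoded back, so a round trip through the internal character encoding is not guaranteed for bytes in the 0x80 to 0x9F range.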

Japanese Character Encoding

The ISO-2022-JP and Shift_JIS conversion rules are compatible with the Windows conversion, with certain exceptions.

When converting from the internal character encoding to ISO-2022-JP or Shift_JIS, the following mappings, which are not present in the Windows conversion, are also used.

Internal Character Encoding   ISO-2022-JP   Shift_JIS
0x203E                        0x7E          0x7E
0x2014                        0x213D        0x815C
0x2016                        0x2142        0x8161
0x2212                        0x215D        0x817C
0x301C                        0x2141        0x8160
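
One way to picture these additional mappings is as a small lookup table consulted when converting from the internal character encoding. The sketch below only restates the table above in C; the type and function names are hypothetical and do not come from the ENC API.

```c
#include <stddef.h>
#include <stdint.h>

/* One row of the table above. */
typedef struct {
    uint16_t internal;  /* internal character encoding (UTF-16BE code unit) */
    uint16_t jis;       /* ISO-2022-JP code */
    uint16_t sjis;      /* Shift_JIS code */
} JpExtraMapping;

static const JpExtraMapping kJpExtraMappings[] = {
    { 0x203E, 0x007E, 0x007E },
    { 0x2014, 0x213D, 0x815C },
    { 0x2016, 0x2142, 0x8161 },
    { 0x2212, 0x215D, 0x817C },
    { 0x301C, 0x2141, 0x8160 },
};

/* Hypothetical helper: returns 1 and fills *jis and *sjis if the code unit
   is one of the additional mappings, otherwise returns 0. */
static int lookup_jp_extra_mapping(uint16_t internal, uint16_t *jis, uint16_t *sjis)
{
    size_t i;

    for (i = 0; i < sizeof kJpExtraMappings / sizeof kJpExtraMappings[0]; i++) {
        if (kJpExtraMappings[i].internal == internal) {
            *jis = kJpExtraMappings[i].jis;
            *sjis = kJpExtraMappings[i].sjis;
            return 1;
        }
    }
    return 0;
}
```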

ISO-2022-JP supports the following character groups.

However, JIS romaji is supported only for one-way conversion from ISO-2022-JP and is treated the same as ASCII.
Half-width kana is likewise supported only for one-way conversion from ISO-2022-JP; when converting to ISO-2022-JP, half-width kana is converted to full-width kana.

In the internal character encoding, the private area (1880 characters) for ISO-2022-JP and Shift_JIS is defined in the ranges shown below, with characters assigned in code order.
Characters in the private area can be converted from ISO-2022-JP, but converting them to ISO-2022-JP returns ENC_ERR_NO_MAP_RULE.

Internal Character Encoding   ISO-2022-JP       Shift_JIS
0xE000 - 0xE757               0x7F21 - 0x927E   0xF040 - 0xF9FC
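
A code-order mapping of this kind can be sketched in C as shown below. The arithmetic is an assumption based on the ranges given here and on the usual code layouts (94 cells per ISO-2022-JP row; Shift_JIS trail bytes 0x40 to 0x7E and 0x80 to 0xFC, that is, 188 per lead byte), which yields 1880 code points starting at 0xE000 and ending at 0xE757. The function names are hypothetical and are not part of the ENC API.

```c
#include <stdint.h>

#define PRIVATE_BASE 0xE000u  /* first private-area code point */

/* Hypothetical helper: Shift_JIS 0xF040..0xF9FC to the internal encoding.
   Returns 1 on success, 0 if the code is outside the private area. */
static int sjis_private_to_internal(uint16_t sjis, uint16_t *out)
{
    unsigned lead = sjis >> 8;
    unsigned trail = sjis & 0xFFu;
    unsigned offset;

    if (lead < 0xF0 || lead > 0xF9)
        return 0;
    if (trail >= 0x40 && trail <= 0x7E)
        offset = trail - 0x40;        /* first 63 trail bytes */
    else if (trail >= 0x80 && trail <= 0xFC)
        offset = trail - 0x41;        /* skip the unused 0x7F */
    else
        return 0;

    *out = (uint16_t)(PRIVATE_BASE + (lead - 0xF0) * 188u + offset);
    return 1;
}

/* Hypothetical helper: ISO-2022-JP 0x7F21..0x927E to the internal encoding.
   Returns 1 on success, 0 if the code is outside the private area. */
static int jis_private_to_internal(uint16_t jis, uint16_t *out)
{
    unsigned row = jis >> 8;
    unsigned col = jis & 0xFFu;

    if (row < 0x7F || row > 0x92 || col < 0x21 || col > 0x7E)
        return 0;

    *out = (uint16_t)(PRIVATE_BASE + (row - 0x7F) * 94u + (col - 0x21));
    return 1;
}
```

Both helpers cover 1880 codes (10 lead bytes x 188 trail bytes, or 20 rows x 94 cells), which matches the size of the private area stated above.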

Stripping the Conversion Table

If some of the relatively large conversion tables are not needed, they can be stripped by defining a macro within the program.
Attempting to convert between the internal character encoding and a character encoding whose conversion table has been stripped returns ENC_ERR_NOT_LOADED.

The currently-supported macros are shown below.

Macro                 Character encodings
ENC_STRIP_TABLE_JP    ISO-2022-JP, Shift_JIS

Revision History

2007/02/05 Added a description about stripping conversion tables.
2006/11/14 Revised description of the private area.
2006/10/24 Initial version.



