/usr/lib/x86_64-linux-gnu/transcript1/aliases.txt is in libtranscript1 0.3.3-1.
This file is owned by root:root, with mode 0o644.
The actual contents of the file can be viewed below.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 | # Aliases for character sets, derived from the list distributed with ICU
# The format of this file is as follows:
# Each line contains a converter, followed by a list of aliases. The preferred
# names for display purposes are marked with a leading asterisk (*). If none of
# the aliases is marked as display name, the name of the converter is used.
# Note that names will be normalized, so for example the names IBM-1201 and
# ibm1201 will be considered equal (see the libtranscript documentation for
# details on normalization).
# Converter names must start in column 1. Lines can be continued by indenting
# the second and later lines with any amount of white-space.
# Some converters need special handling. For this purpose, there are several
# special tags that may be used in the list of aliases. These are:
# :disable Disables this converter. Note that this also prohibits any
# other converter with any of the names specified.
# :probe_load The converter must be loaded and the probe routine of the
# converter must be called to establish its presence. Normal
# operation merely checks the presence of the .ltc file.
# Unicode encodings:
# ==================
# All UTF-X variants have multiple different IBM names for different versions
# of the Unicode standard and with and without IBM Private Use Area (PUA).
# FIXME: The current grouping doesn't destinguish between different Unicode
# versions, eventhough the IBM names do. That doesn't seem right, because if
# you really specify a version, you would expect the output to be compatible
# with that version. So an error should be reported if you use codepoints that
# are not available in that version.
# The UTF-16 and UTF-32 encodings will write a BOM and produce output in the
# platform byte order. The UTF-16 and UTF-32 encodings will switch endianess
# when encountering a BOM on input. By default they use the big endian encoding.
UTF-8 IBM-1208 IBM-1209 IBM-5304 IBM-5305 IBM-13496 IBM-13497
IBM-17592 ibm-17593 Windows-65001 cp1208
# Use UCS-2 as an alias for UTF-16, although it is actually UTF-16BE. However,
# because people simply use UCS-2 and UTF-16 interchangebly, it is better to
# do it this way.
UTF-16 ISO-10646-UCS-2 IBM-1204 IBM-1205 unicode csUnicode ucs-2
UTF-16BE x-utf-16be UnicodeBigUnmarked IBM-1200 IBM-1201
IBM-13488 IBM-13489 ibm-17584 IBM-17585 IBM-21680 IBM-21681 IBM-25776
IBM-25777 IBM-29872 ibm-29873 IBM-61955 IBM-61956 Windows-1201 cp1200
cp1201 UTF16_BigEndian
UTF-16LE x-utf-16le UnicodeLittleUnmarked IBM-1202 IBM-1203
IBM-13490 ibm-13491 IBM-17586 IBM-17587 IBM-21682 IBM-21683 IBM-25778
IBM-25779 ibm-29874 IBM-29875 UTF16_LittleEndian Windows-1200
UTF-32 ISO-10646-UCS-4 IBM-1236 IBM-1237 csUCS4 ucs-4
UTF-32BE UTF32_BigEndian IBM-1232 IBM-1233 IBM-9424
UTF-32LE UTF32_LittleEndian IBM-1234 IBM-1235
# These special names force a particular endianess on output, but further are
# equivalent to the UTF-16 and UTF-32 converters.
*x-UTF-16BE-BOM *UnicodeBig
*x-UTF-16LE-BOM *UnicodeLittle
x-UTF-32BE-BOM
x-UTF-32LE-BOM
# To allow reading from/writing to UTF-8 files which happen to have a BOM
# encoded as their first character:
x-UTF-8-BOM
UTF-7 Windows-65000
CESU-8 IBM-9400
GB18030 :probe_load IBM-1392 Windows-54936
# Non-unicode encodings:
# ======================
*ISO-8859-1 IBM-819 cp819 *Latin1 8859_1 csISOLatin1 iso-ir-100
ISO_8859-1:1987 l1 819
*ASCII *US-ASCII ANSI_X3.4-1968 ANSI_X3.4-1986 ISO_646.irv:1991
iso_646.irv:1983 ISO646-US us csASCII iso-ir-6 cp367 ascii7 646
windows-20127 IBM-367
iso-8859_2-1999 ibm-912_P100-1995 IBM-912 *ISO-8859-2 ISO_8859-2:1987 *Latin2
csISOLatin2 iso-ir-101 l2 8859_2 cp912 912 Windows-28592
iso-8859_3-1999 ibm-913_P100-2000 IBM-913 *ISO-8859-3 ISO_8859-3:1988 *Latin3
csISOLatin3 iso-ir-109 l3 8859_3 cp913 913 Windows-28593
iso-8859_4-1998 ibm-914_P100-1995 IBM-914 *ISO-8859-4 *Latin4 csISOLatin4
iso-ir-110 ISO_8859-4:1988 l4 8859_4 cp914 914 Windows-28594
iso-8859_5-1999 ibm-915_P100-1995 IBM-915 *ISO-8859-5 *Cyrillic
csISOLatinCyrillic iso-ir-144 ISO_8859-5:1988 8859_5 cp915 915
Windows-28595
iso-8859_6-1999 ibm-1089_P100-1995 IBM-1089 *ISO-8859-6 arabic csISOLatinArabic
iso-ir-127 ISO_8859-6:1987 ECMA-114 ASMO-708 8859_6 cp1089
1089 Windows-28596 ISO-8859-6-I ISO-8859-6-E
iso-8859_7-2003 ibm-9005_X110-2007 IBM-9005 *ISO-8859-7 greek greek8 ELOT_928
ECMA-118 csISOLatinGreek iso-ir-126 Windows-28597 sun_eu_greek
ISO_8859-7:2003 8859_7
# Same as IBM-9005, but without the euro update.
iso-8859_7-1987 ibm-813_P100-1995 *IBM-813 ISO_8859-7:1987 cp813 813
iso-8859_8-1999 ibm-5012_P100-1999 IBM-5012 *ISO-8859-8 hebrew csISOLatinHebrew
iso-ir-138 ISO_8859-8:1988 ISO-8859-8-I ISO-8859-8-E 8859_8 Windows-28598
hebrew8
iso-8859_9-1999 ibm-920_P100-1995 IBM-920 *ISO-8859-9 *Latin5 csISOLatin5
iso-ir-148 ISO_8859-9:1989 l5 8859_9 cp920 920 Windows-28599 ECMA-128
turkish8 turkish
iso-8859_10-1998 *ISO-8859-10 iso-ir-157 l6 ISO_8859-10:1992 csISOLatin6
*Latin6
iso-8859_11-2001 *ISO-8859-11 thai8
iso-8859_13-1998 ibm-921_P100-1995 IBM-921 *ISO-8859-13 8859_13 Windows-28603
cp921 921
iso-8859_14-1998 *ISO-8859-14 iso-ir-199 *Latin8 iso-celtic l8 # ISO_8859-14:1998
iso-8859_15-1999 ibm-923_P100-1998 IBM-923 *ISO-8859-15 *Latin9 l9 8859_15
latin0 csisolatin0 csisolatin9 iso8859_15_fdis cp923 923 Windows-28605
# The Shift-JIS naming is a mess. There are at least 10 different encodings all
# claiming to be Shift-JIS.
Shift_JIS sjis pck x-sjis
Shift_JIS-2004
Shift_JISX0213
EUC-JP ibm-33722_P12A_P12A-2004_U2
Extended_UNIX_Code_Packed_Format_for_Japanese csEUCPkdFmtJapanese
X-EUC-JP Windows-51932 IBM-33722_VPUA IBM-eucJP
EUC-JIS-2004
EUC-JISX0213
# GBK
windows-936-2000 *GBK CP936 MS936 *Windows-936
# Technically, GB2312 doesn't include ASCII, which EUC-CN does. However, it is
# often used as an alias for EUC-CN, so we do that as well.
*EUC-CN *GB2312
# EUC-KR and similar. Java KSC names are ignored in favor of IANA ones
EUC-KR ibm-970_P110_P110-2006_U2 IBM-970 Windows-51949 csEUCKR IBM-eucKR
cp970 970 IBM-970_VPUA
# Windows EUC-KR variant. KSC names ignored in favor of IANA ones. CP949 is
# normally this one
cp949 windows-949-2000 *Windows-949 ms949
EUC-TW :probe_load ibm-964_P110-1999 IBM-964 IBM-eucTW cns11643 cp964 964 IBM-964_VPUA
EUC-TW-2004
ibm-437_P100-1995 *IBM-437 cp437 437 csPC8CodePage437 Windows-437
ibm-878_P100-1996 IBM-878 *KOI8-R koi8 csKOI8R Windows-20866 cp878
ibm-1168_P100-2002 IBM-1168 *KOI8-U Windows-21866
# Windows 125*
ibm-5346_P100-1998 IBM-5346 *Windows-1250 cp1250
ibm-5347_P100-1998 IBM-5347 *Windows-1251 cp1251 ANSI1251
ibm-5348_P100-1997 IBM-5348 *Windows-1252 cp1252
ibm-5349_P100-1998 IBM-5349 *Windows-1253 cp1253
ibm-5350_P100-1998 IBM-5350 *Windows-1254 cp1254
ibm-9447_P100-2002 IBM-9447 *Windows-1255 cp1255
ibm-9448_X100-2005 IBM-9448 *Windows-1256 cp1256
ibm-9449_P100-2002 IBM-9449 *Windows-1257 cp1257
ibm-5354_P100-1998 IBM-5354 *Windows-1258 cp1258
*ISO-2022-JP :probe_load csISO2022JP JIS_Encoding csJISEncoding JIS JIS7 JIS8
*ISO-2022-JP-1 :probe_load IBM-5054
*ISO-2022-JP-2 :probe_load csISO2022JP2
ISO-2022-JP-3 :probe_load
ISO-2022-JP-2004 :probe_load
*ISO-2022-KR :probe_load csISO2022KR
*ISO-2022-CN :probe_load csISO2022CN
ISO-2022-CN-EXT :probe_load
ibm-37_P100-1995 *IBM-37 ebcdic-cp-us ebcdic-cp-ca
ebcdic-cp-wt ebcdic-cp-nl csIBM037 cp037 037 cpibm37
ibm-1047_P100-1995 *IBM-1047 cp1047 1047
*KOI8-RU ibm-1167_P100-2002 *IBM-1167
jis-x-0201-1969 ibm-897_P100-1995 *IBM-897 *JIS-X-0201 X0201 csHalfWidthKatakana
iso-8859_16-2001 *ISO-8859-16 iso-ir-226 *Latin10 l10 # ISO_8859-16:2001
viscii *VISCII iso-ir-180
# Big5 variants
*Big5 x-big5 csbig5
*Windows-950
big5-hkscs-1999 *Big5-HKSCS ibm-1375_P100-2007 IBM-1375 big5hk HKSCS-BIG5
Big5-HKSCS-2001
Big5-HKSCS-2004
Big5-HKSCS-2008
|