ISO/IEC 8859-11
ISO/IEC 8859-11:2001, Information technology — 8-bit single-byte coded graphic character sets — Part 11: Latin/Thai alphabet, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 2001. It is informally referred to as Latin/Thai. It is nearly identical to the national Thai standard TIS-620 (1990). The sole difference is that ISO/IEC 8859-11 allocates non-breaking space to code 0xA0, while TIS-620 leaves it undefined. (In practice, this small distinction is usually ignored.)
ISO-8859-11 is not a primary registered IANA charset name, even though it follows the usual naming pattern for IANA charsets based on the ISO 8859 series. It is, however, defined as an alias[1] of the closely equivalent TIS-620. TIS-620 lacks the no-break space, but because that character occupies a code that TIS-620 leaves unallocated, the TIS-620 label can be used for ISO/IEC 8859-11 data without problems. Microsoft has assigned code page 28601, also known as Windows-28601, to ISO-8859-11 in Windows.[2] A draft of the standard placed the Thai letters at different code positions.[3]
As with all parts of ISO/IEC 8859, the lower 128 codes are equivalent to ASCII. The additional characters, apart from the no-break space, appear in Unicode in the same order, so that byte 0xA1 corresponds to U+0E01, 0xA2 to U+0E02, and so on.
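The fixed offset between the Thai byte values and the Unicode Thai block can be illustrated with a short Python sketch. The function name `iso8859_11_to_codepoint` is hypothetical, and the unassigned ranges are taken from the character set table in this article; this is an illustration of the arithmetic, not a reference implementation.

```python
from typing import Optional


def iso8859_11_to_codepoint(byte: int) -> Optional[int]:
    """Map one ISO/IEC 8859-11 byte to a Unicode code point.

    Returns None for bytes that have no graphic character assigned
    (0x80-0x9F, 0xDB-0xDE and 0xFC-0xFF in the table of this article).
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a single byte value")
    if byte < 0x80:
        return byte                      # lower half is identical to ASCII
    if byte == 0xA0:
        return 0x00A0                    # no-break space
    if 0xA1 <= byte <= 0xDA or 0xDF <= byte <= 0xFB:
        return byte - 0xA1 + 0x0E01      # Thai block, same order as Unicode
    return None                          # unassigned position


# Example: 0xA1 is the first Thai letter, 0xDF is the baht sign.
print(hex(iso8859_11_to_codepoint(0xA1)))  # 0xe01
print(hex(iso8859_11_to_codepoint(0xDF)))  # 0xe3f
```

Because the offset is constant, no lookup table is needed for the Thai range; only the no-break space and the unassigned positions require special handling.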
Microsoft Windows code page 874 and MacThai, the code page used in the Thai version of the Apple Macintosh, are both variants of TIS-620, but they are incompatible with each other.
Character set
| | _0 | _1 | _2 | _3 | _4 | _5 | _6 | _7 | _8 | _9 | _A | _B | _C | _D | _E | _F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0_ 0 | | | | | | | | | | | | | | | | |
| 1_ 16 | | | | | | | | | | | | | | | | |
| 2_ 32 | SP 0020 | ! 0021 | " 0022 | # 0023 | $ 0024 | % 0025 | & 0026 | ' 0027 | ( 0028 | ) 0029 | * 002A | + 002B | , 002C | - 002D | . 002E | / 002F |
| 3_ 48 | 0 0030 | 1 0031 | 2 0032 | 3 0033 | 4 0034 | 5 0035 | 6 0036 | 7 0037 | 8 0038 | 9 0039 | : 003A | ; 003B | < 003C | = 003D | > 003E | ? 003F |
| 4_ 64 | @ 0040 | A 0041 | B 0042 | C 0043 | D 0044 | E 0045 | F 0046 | G 0047 | H 0048 | I 0049 | J 004A | K 004B | L 004C | M 004D | N 004E | O 004F |
| 5_ 80 | P 0050 | Q 0051 | R 0052 | S 0053 | T 0054 | U 0055 | V 0056 | W 0057 | X 0058 | Y 0059 | Z 005A | [ 005B | \ 005C | ] 005D | ^ 005E | _ 005F |
| 6_ 96 | ` 0060 | a 0061 | b 0062 | c 0063 | d 0064 | e 0065 | f 0066 | g 0067 | h 0068 | i 0069 | j 006A | k 006B | l 006C | m 006D | n 006E | o 006F |
| 7_ 112 | p 0070 | q 0071 | r 0072 | s 0073 | t 0074 | u 0075 | v 0076 | w 0077 | x 0078 | y 0079 | z 007A | { 007B | \| 007C | } 007D | ~ 007E | |
| 8_ 128 | | | | | | | | | | | | | | | | |
| 9_ 144 | | | | | | | | | | | | | | | | |
| A_ 160 | NBSP 00A0 | ก 0E01 | ข 0E02 | ฃ 0E03 | ค 0E04 | ฅ 0E05 | ฆ 0E06 | ง 0E07 | จ 0E08 | ฉ 0E09 | ช 0E0A | ซ 0E0B | ฌ 0E0C | ญ 0E0D | ฎ 0E0E | ฏ 0E0F |
| B_ 176 | ฐ 0E10 | ฑ 0E11 | ฒ 0E12 | ณ 0E13 | ด 0E14 | ต 0E15 | ถ 0E16 | ท 0E17 | ธ 0E18 | น 0E19 | บ 0E1A | ป 0E1B | ผ 0E1C | ฝ 0E1D | พ 0E1E | ฟ 0E1F |
| C_ 192 | ภ 0E20 | ม 0E21 | ย 0E22 | ร 0E23 | ฤ 0E24 | ล 0E25 | ฦ 0E26 | ว 0E27 | ศ 0E28 | ษ 0E29 | ส 0E2A | ห 0E2B | ฬ 0E2C | อ 0E2D | ฮ 0E2E | ฯ 0E2F |
| D_ 208 | ะ 0E30 | ◌ั 0E31 | า 0E32 | ำ 0E33 | ◌ิ 0E34 | ◌ี 0E35 | ◌ึ 0E36 | ◌ื 0E37 | ◌ุ 0E38 | ◌ู 0E39 | ◌ฺ 0E3A | | | | | ฿ 0E3F |
| E_ 224 | เ 0E40 | แ 0E41 | โ 0E42 | ใ 0E43 | ไ 0E44 | ๅ 0E45 | ๆ 0E46 | ◌็ 0E47 | ◌่ 0E48 | ◌้ 0E49 | ◌๊ 0E4A | ◌๋ 0E4B | ◌์ 0E4C | ◌ํ 0E4D | ◌๎ 0E4E | ๏ 0E4F |
| F_ 240 | ๐ 0E50 | ๑ 0E51 | ๒ 0E52 | ๓ 0E53 | ๔ 0E54 | ๕ 0E55 | ๖ 0E56 | ๗ 0E57 | ๘ 0E58 | ๙ 0E59 | ๚ 0E5A | ๛ 0E5B | | | | |
Code values 0xD1, 0xD4–0xDA, and 0xE7–0xEE are combining characters.
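The list of combining positions can be cross-checked against the Unicode character database. The sketch below assumes Python's standard `iso8859_11` codec and treats the Unicode general category `Mn` (nonspacing mark) as the criterion for "combining"; the function name `combining_bytes` is hypothetical.

```python
import unicodedata


def combining_bytes() -> list:
    """Return the ISO/IEC 8859-11 byte values that decode to nonspacing marks."""
    result = []
    for byte in range(0xA0, 0x100):
        try:
            char = bytes([byte]).decode("iso8859_11")
        except UnicodeDecodeError:
            continue  # position is unassigned in the codec
        if unicodedata.category(char) == "Mn":
            result.append(byte)
    return result


# Print the positions as hexadecimal byte values.
print([f"{b:02X}" for b in combining_bytes()])
```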
Vendor extensions
Code page 874 (IBM) / 9066
IBM code page 874 (CP874, IBM-874, x-IBM874), also known as code page 9066 (IBM-9066),[5] differs from ISO/IEC 8859-11 at only nine code positions (0xA0, 0xDB–0xDE, and 0xFC–0xFF), as the following table shows:[6][7][8]
| | _0 | _1 | _2 | _3 | _4 | _5 | _6 | _7 | _8 | _9 | _A | _B | _C | _D | _E | _F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A_ 160 | ◌่ 0E48 | ก 0E01 | ข 0E02 | ฃ 0E03 | ค 0E04 | ฅ 0E05 | ฆ 0E06 | ง 0E07 | จ 0E08 | ฉ 0E09 | ช 0E0A | ซ 0E0B | ฌ 0E0C | ญ 0E0D | ฎ 0E0E | ฏ 0E0F |
| B_ 176 | ฐ 0E10 | ฑ 0E11 | ฒ 0E12 | ณ 0E13 | ด 0E14 | ต 0E15 | ถ 0E16 | ท 0E17 | ธ 0E18 | น 0E19 | บ 0E1A | ป 0E1B | ผ 0E1C | ฝ 0E1D | พ 0E1E | ฟ 0E1F |
| C_ 192 | ภ 0E20 | ม 0E21 | ย 0E22 | ร 0E23 | ฤ 0E24 | ล 0E25 | ฦ 0E26 | ว 0E27 | ศ 0E28 | ษ 0E29 | ส 0E2A | ห 0E2B | ฬ 0E2C | อ 0E2D | ฮ 0E2E | ฯ 0E2F |
| D_ 208 | ะ 0E30 | ◌ั 0E31 | า 0E32 | ำ 0E33 | ◌ิ 0E34 | ◌ี 0E35 | ◌ึ 0E36 | ◌ื 0E37 | ◌ุ 0E38 | ◌ู 0E39 | ◌ฺ 0E3A | ◌้ 0E49 | ◌๊ 0E4A | ◌๋ 0E4B | ◌์ 0E4C | ฿ 0E3F |
| E_ 224 | เ 0E40 | แ 0E41 | โ 0E42 | ใ 0E43 | ไ 0E44 | ๅ 0E45 | ๆ 0E46 | ◌็ 0E47 | ◌่ 0E48 | ◌้ 0E49 | ◌๊ 0E4A | ◌๋ 0E4B | ◌์ 0E4C | ◌ํ 0E4D | ◌๎ 0E4E | ๏ 0E4F |
| F_ 240 | ๐ 0E50 | ๑ 0E51 | ๒ 0E52 | ๓ 0E53 | ๔ 0E54 | ๕ 0E55 | ๖ 0E56 | ๗ 0E57 | ๘ 0E58 | ๙ 0E59 | ๚ 0E5A | ๛ 0E5B | ¢ 00A2 | ¬ 00AC | ¦ 00A6 | NBSP 00A0 |
Code page 1161
Code page 1161 (CP1161, IBM-1161) is a variant of IBM code page 874. The only difference is the euro sign (€) at position 0xDE (222).[11][12]
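The relationship between the two IBM code pages and ISO/IEC 8859-11 can be sketched as override tables layered on the base mapping. The Python below takes its code points from the IBM code page 874 table in this article and from the statement that code page 1161 changes only 0xDE. The names `IBM_874_OVERRIDES`, `IBM_1161_OVERRIDES`, and `decode_with_overrides` are hypothetical, and the sketch is not IBM's own conversion table.

```python
from typing import Dict

# Positions where IBM code page 874 departs from ISO/IEC 8859-11.
IBM_874_OVERRIDES: Dict[int, str] = {
    0xA0: "\u0e48",  # Thai mai ek instead of no-break space
    0xDB: "\u0e49",
    0xDC: "\u0e4a",
    0xDD: "\u0e4b",
    0xDE: "\u0e4c",
    0xFC: "\u00a2",  # cent sign
    0xFD: "\u00ac",  # not sign
    0xFE: "\u00a6",  # broken bar
    0xFF: "\u00a0",  # no-break space
}

# Code page 1161: same as 874, except the euro sign at 0xDE.
IBM_1161_OVERRIDES: Dict[int, str] = {**IBM_874_OVERRIDES, 0xDE: "\u20ac"}


def decode_with_overrides(data: bytes, overrides: Dict[int, str]) -> str:
    """Decode bytes using ISO/IEC 8859-11 plus a table of overrides."""
    chars = []
    for byte in data:
        if byte in overrides:
            chars.append(overrides[byte])
        elif byte < 0x80:
            chars.append(chr(byte))                    # ASCII
        elif 0xA1 <= byte <= 0xDA or 0xDF <= byte <= 0xFB:
            chars.append(chr(byte - 0xA1 + 0x0E01))    # Thai block offset
        else:
            chars.append("\ufffd")                     # unassigned in this sketch
    return "".join(chars)
```

Decoding the single byte 0xDE with each table shows the one position where the two code pages disagree.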
Code page 874 (Microsoft) / 1162
Windows code page 874 (windows-874, MS874, x-windows-874), known to IBM as code page 1162 (CP1162, IBM-1162),[13][14] is used by Microsoft Windows. It differs from ISO/IEC 8859-11 only in nine additional characters in rows 8_ and 9_, as shown in the following table:
| | _0 | _1 | _2 | _3 | _4 | _5 | _6 | _7 | _8 | _9 | _A | _B | _C | _D | _E | _F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 8_ 128 | € 20AC | | | | | … 2026 | | | | | | | | | | |
| 9_ 144 | | ‘ 2018 | ’ 2019 | “ 201C | ” 201D | • 2022 | – 2013 | — 2014 | | | | | | | | |
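The nine additions can be expressed as a small lookup table applied before falling back to ISO/IEC 8859-11. The following Python sketch assumes the standard library codecs `iso8859_11` and `cp874` (the latter being Python's name for the Windows code page); `WINDOWS_874_EXTRAS` and `decode_windows_874` are hypothetical names used only for illustration.

```python
# The nine characters Windows code page 874 adds in the 0x80-0x9F range.
WINDOWS_874_EXTRAS = {
    0x80: "\u20ac",  # euro sign
    0x85: "\u2026",  # horizontal ellipsis
    0x91: "\u2018",  # left single quotation mark
    0x92: "\u2019",  # right single quotation mark
    0x93: "\u201c",  # left double quotation mark
    0x94: "\u201d",  # right double quotation mark
    0x95: "\u2022",  # bullet
    0x96: "\u2013",  # en dash
    0x97: "\u2014",  # em dash
}


def decode_windows_874(data: bytes) -> str:
    """Decode bytes as ISO/IEC 8859-11 plus the nine Windows additions."""
    chars = []
    for byte in data:
        if byte in WINDOWS_874_EXTRAS:
            chars.append(WINDOWS_874_EXTRAS[byte])
        else:
            # Unassigned bytes become U+FFFD rather than raising an error.
            chars.append(bytes([byte]).decode("iso8859_11", errors="replace"))
    return "".join(chars)


sample = b"\x80100 \x93\xa1\x94"
# The layered sketch should agree with the built-in codec on this sample.
print(decode_windows_874(sample) == sample.decode("cp874"))
```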
Mac OS Thai
This is the variant used on the Classic Mac OS.
| | _0 | _1 | _2 | _3 | _4 | _5 | _6 | _7 | _8 | _9 | _A | _B | _C | _D | _E | _F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 8_ 128[a] | « 00AB | » 00BB | … 2026 | ◌่ 0E48 | ◌้ 0E49 | ◌๊ 0E4A | ◌๋ 0E4B | ◌์ 0E4C | ◌่ 0E48 | ◌้ 0E49 | ◌๊ 0E4A | ◌๋ 0E4B | ◌์ 0E4C | “ 201C | ” 201D | ◌ํ 0E4D |
| 9_ 144[b] | | • 2022 | ◌ั 0E31 | ◌็ 0E47 | ◌ิ 0E34 | ◌ี 0E35 | ◌ึ 0E36 | ◌ื 0E37 | ◌่ 0E48 | ◌้ 0E49 | ◌๊ 0E4A | ◌๋ 0E4B | ◌์ 0E4C | ‘ 2018 | ’ 2019 | |
| A_ 160 | NBSP 00A0 | ก 0E01 | ข 0E02 | ฃ 0E03 | ค 0E04 | ฅ 0E05 | ฆ 0E06 | ง 0E07 | จ 0E08 | ฉ 0E09 | ช 0E0A | ซ 0E0B | ฌ 0E0C | ญ 0E0D | ฎ 0E0E | ฏ 0E0F |
| B_ 176 | ฐ 0E10 | ฑ 0E11 | ฒ 0E12 | ณ 0E13 | ด 0E14 | ต 0E15 | ถ 0E16 | ท 0E17 | ธ 0E18 | น 0E19 | บ 0E1A | ป 0E1B | ผ 0E1C | ฝ 0E1D | พ 0E1E | ฟ 0E1F |
| C_ 192 | ภ 0E20 | ม 0E21 | ย 0E22 | ร 0E23 | ฤ 0E24 | ล 0E25 | ฦ 0E26 | ว 0E27 | ศ 0E28 | ษ 0E29 | ส 0E2A | ห 0E2B | ฬ 0E2C | อ 0E2D | ฮ 0E2E | ฯ 0E2F |
| D_ 208 | ะ 0E30 | ◌ั 0E31 | า 0E32 | ำ 0E33 | ◌ิ 0E34 | ◌ี 0E35 | ◌ึ 0E36 | ◌ื 0E37 | ◌ุ 0E38 | ◌ู 0E39 | ◌ฺ 0E3A | WJ 2060 | ZWSP 200B | – 2013 | — 2014 | ฿ 0E3F |
| E_ 224 | เ 0E40 | แ 0E41 | โ 0E42 | ใ 0E43 | ไ 0E44 | ๅ 0E45 | ๆ 0E46 | ◌็ 0E47 | ◌่ 0E48 | ◌้ 0E49 | ◌๊ 0E4A | ◌๋ 0E4B | ◌์ 0E4C | ◌ํ 0E4D | ™ 2122 | ๏ 0E4F |
| F_ 240 | ๐ 0E50 | ๑ 0E51 | ๒ 0E52 | ๓ 0E53 | ๔ 0E54 | ๕ 0E55 | ๖ 0E56 | ๗ 0E57 | ๘ 0E58 | ๙ 0E59 | ® 00AE | © 00A9 | | | | |
Footnotes
a. The otherwise-duplicate diacritical marks in row 8_ are intended to display in a "low left position" (0x83–0x87), "low position" (0x88–0x8C) or "left position" (0x8F), and are followed in Apple's round-trip mapping by an appended Private Use Area character U+F875, U+F873 or U+F874 respectively.
b. The otherwise-duplicate diacritical marks in row 9_ are intended to display in a "left position", and are followed by an appended Private Use Area character U+F874 in Apple's round-trip mapping.
References
- "IANA Character Sets".
- "js-codepage, Getting codepages".
- Everson, Michael. "Proposed ISO 8859-11".
- Whistler, Ken (2002-10-07), ISO/IEC 8859-11:2001 to Unicode, Unicode Consortium
- IBM; Unicode Consortium. "convrtrs.txt". International Components for Unicode. v. 59180.0.1. Quote: "Yes ibm-874 == ibm-9066. ibm-1161 has the euro update."
- "Code page 874 information document". Archived from the original on 2017-01-16.
- "CCSID 874 information document". Archived from the original on 2016-03-27.
- "CCSID 9066 information document". Archived from the original on 2016-03-27.
- IBM. "Code Page CPGID 00874" (PDF). REGISTRY: Graphic Character Sets and Code Pages.
- Code Page CPGID 00874 (txt), IBM
- "Code Page 01161" (PDF).
- "CCSID 1161 information document". Archived from the original on 2016-03-27.
- "Code page 1162 information document". Archived from the original on 2016-03-17.
- "CCSID 1162 information document". Archived from the original on 2016-03-27.
- "Code Page 01162" (PDF).
- Steele, Shawn (1998-02-28). "cp874 to Unicode table". Unicode Consortium, Microsoft.
- Code Page CPGID 01162 (txt), IBM
- International Components for Unicode (ICU), ibm-1162_P100-1999.ucm, 2002-12-03
- Apple (2005-04-05). "Map (external version) from Mac OS Thai character set to Unicode 3.2 and later". Unicode Consortium.
External links
- ISO/IEC 8859-11:2001
- ISO/IEC 8859-11:1999 - 8-bit single-byte coded graphic character sets, Part 11: Latin/Thai character set (draft dated June 22, 1999; superseded by ISO/IEC 8859-11:2001, published December 15, 2001)
- Windows code page 874
- ISO-IR 166 Thai character set (July 13, 1992, from Thai Standard TIS 620-2533 (1990))
- Standardization and Implementations of Thai Language (PDF, 175 KB)