Smartling supports the following character sets. When uploading source files via API, you can optionally specify a custom character set for text-format files.
- Big5 : Traditional Chinese encoding (Taiwan, Hong Kong)
- Big5-HKSCS : Big5 plus Hong Kong Supplementary Character Set
- EUC-JP : Extended Unix Code for Japanese (multi-byte)
- EUC-KR : Extended Unix Code for Korean (multi-byte)
- GB18030 : Chinese national standard encoding (covers the full Unicode range, superset of GBK)
- GB2312 : Simplified Chinese national standard (older subset of GBK/GB18030)
- GBK : Extension of GB2312 for Simplified Chinese
- IBM-Thai : IBM code page for Thai language (legacy)
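- IBM00858 : IBM PC code page 858 (code page 850 with Euro sign)
- IBM01140 : IBM EBCDIC code page 1140 (code page 037 with Euro sign; US/Canada)
- IBM01141 : IBM EBCDIC code page 1141 (code page 273 with Euro sign; Germany/Austria)
- IBM01142 : IBM EBCDIC code page 1142 (code page 277 with Euro sign; Denmark/Norway)
- IBM01143 : IBM EBCDIC code page 1143 (code page 278 with Euro sign; Finland/Sweden)
- IBM01144 : IBM EBCDIC code page 1144 (code page 280 with Euro sign; Italy)
- IBM01145 : IBM EBCDIC code page 1145 (code page 284 with Euro sign; Spain)
- IBM01146 : IBM EBCDIC code page 1146 (code page 285 with Euro sign; UK)
- IBM01147 : IBM EBCDIC code page 1147 (code page 297 with Euro sign; France)
- IBM01148 : IBM EBCDIC code page 1148 (code page 500 with Euro sign; International)
- IBM01149 : IBM EBCDIC code page 1149 (code page 871 with Euro sign; Iceland)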
- IBM037 : IBM EBCDIC code page 037 (US/Canada)
- IBM1026 : IBM EBCDIC code page 1026 (Turkish, Latin-5)
- IBM1047 : IBM EBCDIC Latin-1 (Open Systems)
- IBM273 : IBM EBCDIC code page for Germany/Austria
- IBM277 : IBM EBCDIC code page for Denmark/Norway
- IBM278 : IBM EBCDIC code page for Finland/Sweden
- IBM280 : IBM EBCDIC code page for Italy
- IBM284 : IBM EBCDIC code page for Spain
- IBM285 : IBM EBCDIC code page for UK English
- IBM290 : IBM EBCDIC variant for Japanese Katakana
- IBM297 : IBM EBCDIC code page for France
- IBM420 : IBM EBCDIC Arabic
- IBM424 : IBM EBCDIC Hebrew
- IBM437 : IBM PC OEM code page 437 (original DOS Latin US)
- IBM500 : IBM EBCDIC International variant
- IBM775 : IBM PC code page for Baltic languages
- IBM850 : IBM PC code page 850 (Western European)
- IBM852 : IBM PC code page 852 (Central European)
- IBM855 : IBM PC code page 855 (Cyrillic)
- IBM857 : IBM PC code page 857 (Turkish)
- IBM860 : IBM PC code page 860 (Portuguese)
- IBM861 : IBM PC code page 861 (Icelandic)
- IBM862 : IBM PC code page 862 (Hebrew)
- IBM863 : IBM PC code page 863 (French Canadian)
- IBM864 : IBM PC code page 864 (Arabic)
- IBM865 : IBM PC code page 865 (Nordic)
- IBM866 : IBM PC code page 866 (Cyrillic, Russian)
- IBM868 : IBM PC code page 868 (Urdu, Pakistan)
- IBM869 : IBM PC code page 869 (Greek)
- IBM870 : IBM EBCDIC Multilingual Latin-2
- IBM871 : IBM EBCDIC Icelandic
- IBM918 : IBM EBCDIC code page 918 (Urdu, Pakistan)
- ISO-2022-CN : ISO-2022 variant for Chinese (escape sequences)
- ISO-2022-JP : ISO-2022 variant for Japanese (escape sequences)
- ISO-2022-JP-2 : Extended ISO-2022-JP with more character sets
- ISO-2022-KR : ISO-2022 variant for Korean (escape sequences)
- ISO-8859-1 : ISO Latin-1 (Western European languages)
- ISO-8859-13 : ISO Latin-7 (Baltic languages)
- ISO-8859-15 : ISO Latin-9 (Western European with Euro sign)
- ISO-8859-2 : ISO Latin-2 (Central European languages)
- ISO-8859-3 : ISO Latin-3 (South European languages)
- ISO-8859-4 : ISO Latin-4 (North European languages)
- ISO-8859-5 : ISO Latin/Cyrillic (Cyrillic script)
- ISO-8859-6 : ISO Latin/Arabic (Arabic script)
- ISO-8859-7 : ISO Latin/Greek (Greek script)
- ISO-8859-8 : ISO Latin/Hebrew (Hebrew script)
- ISO-8859-9 : ISO Latin-5 (Turkish; Latin-1 with the Icelandic letters replaced by Turkish ones)
- JIS_X0201 : JIS X 0201 (Roman + half-width Katakana)
- JIS_X0212-1990 : Supplementary Japanese character set (JIS X 0212)
- KOI8-R : KOI8-R Cyrillic (Russian)
- KOI8-U : KOI8-U Cyrillic (Ukrainian)
- Shift_JIS : Shift JIS (Japanese legacy multi-byte)
- TIS-620 : Thai Industrial Standard character set (Thai)
- US-ASCII : 7-bit ASCII (basic English characters)
- UTF-16 : Unicode Transformation Format - 16-bit (byte order indicated by BOM)
- UTF-16BE : UTF-16 Big Endian (no BOM)
- UTF-16LE : UTF-16 Little Endian (no BOM)
- UTF-32 : Unicode Transformation Format - 32-bit (fixed-length)
- UTF-32BE : UTF-32 Big Endian (no BOM)
- UTF-32LE : UTF-32 Little Endian (no BOM)
- UTF-8 : Unicode Transformation Format - 8-bit (variable-length)
- windows-1250 : Windows code page 1250 (Central European)
- windows-1251 : Windows code page 1251 (Cyrillic)
- windows-1252 : Windows code page 1252 (Western European)
- windows-1253 : Windows code page 1253 (Greek)
- windows-1254 : Windows code page 1254 (Turkish)
- windows-1255 : Windows code page 1255 (Hebrew)
- windows-1256 : Windows code page 1256 (Arabic)
- windows-1257 : Windows code page 1257 (Baltic)
- windows-1258 : Windows code page 1258 (Vietnamese)
- windows-31j : Windows variant of Shift_JIS (Japanese)
- x-Big5-HKSCS-2001 : Big5 with the 2001 edition of the Hong Kong Supplementary Character Set
- x-Big5-Solaris : Big5 variant used by the Solaris zh_TW.BIG5 locale
- x-COMPOUND_TEXT : X Window System compound text (multi-charset)
- x-euc-jp-linux : EUC-JP variant commonly used on Linux
- x-EUC-TW : EUC-TW (Traditional Chinese, Taiwan)
- x-eucJP-Open : EUC-JP variant used on Solaris (vendor-specific)
- x-IBM1006 : IBM AIX code page 1006 (Urdu, Pakistan)
- x-IBM1025 : IBM EBCDIC code page 1025 (multilingual Cyrillic)
- x-IBM1046 : IBM code page 1046 (Arabic)
- x-IBM1097 : IBM EBCDIC code page 1097 (Farsi/Persian)
- x-IBM1098 : IBM PC code page 1098 (Farsi/Persian)
- x-IBM1112 : IBM EBCDIC code page 1112 (Latvia, Lithuania)
- x-IBM1122 : IBM EBCDIC code page 1122 (Estonia)
- x-IBM1123 : IBM EBCDIC code page 1123 (Ukraine)
- x-IBM1124 : IBM AIX code page 1124 (Ukraine)
- x-IBM1364 : IBM EBCDIC code page 1364 (Korean, mixed single/double-byte)
- x-IBM1381 : IBM PC code page 1381 (Simplified Chinese)
- x-IBM1383 : IBM AIX code page 1383 (Simplified Chinese, EUC)
- x-IBM300 : IBM EBCDIC code page 300 (Japanese double-byte Kanji)
- x-IBM33722 : IBM code page 33722 (IBM-eucJP, Japanese)
- x-IBM737 : IBM PC code page 737 (Greek)
- x-IBM833 : IBM EBCDIC code page 833 (Korean, single-byte)
- x-IBM834 : IBM EBCDIC code page 834 (Korean, double-byte)
- x-IBM856 : IBM PC code page 856 (Hebrew)
- x-IBM874 : IBM code page 874 (Thai)
- x-IBM875 : IBM EBCDIC code page 875 (Greek)
- x-IBM921 : IBM code page 921 (Latvia, Lithuania; AIX, DOS)
- x-IBM922 : IBM code page 922 (Estonia; AIX, DOS)
- x-IBM930 : IBM EBCDIC code page 930 (Japanese Katakana-Kanji mixed)
- x-IBM933 : IBM EBCDIC code page 933 (Korean mixed)
- x-IBM935 : IBM EBCDIC code page 935 (Simplified Chinese mixed)
- x-IBM937 : IBM EBCDIC code page 937 (Traditional Chinese mixed)
- x-IBM939 : IBM EBCDIC code page 939 (Japanese Latin-Kanji mixed)
- x-IBM942 : IBM OS/2 code page 942 (Japanese, Shift_JIS family)
- x-IBM942C : Variant of IBM code page 942 (Japanese)
- x-IBM943 : IBM OS/2 code page 943 (Japanese, Shift_JIS family)
- x-IBM943C : Variant of IBM code page 943 (Japanese)
- x-IBM948 : IBM OS/2 code page 948 (Traditional Chinese, Taiwan)
- x-IBM949 : IBM PC code page 949 (Korean)
- x-IBM949C : Variant of IBM code page 949 (Korean)
- x-IBM950 : IBM PC code page 950 (Traditional Chinese; Hong Kong, Taiwan)
- x-IBM964 : IBM AIX code page 964 (Traditional Chinese, Taiwan)
- x-IBM970 : IBM AIX code page 970 (Korean)
- x-ISCII91 : ISCII (Indian scripts) - 1991 standard
- x-ISO-2022-CN-CNS : CNS 11643 in ISO-2022-CN form (Traditional Chinese)
- x-ISO-2022-CN-GB : GB2312 in ISO-2022-CN form (Simplified Chinese)
- x-iso-8859-11 : ISO-8859-11 (Thai)
- x-JIS0208 : JIS X 0208 mapping (Japanese character set)
- x-JISAutoDetect : Auto-detects Shift_JIS, EUC-JP, or ISO-2022-JP input (Japanese, decoding only)
- x-Johab : Johab (older Korean encoding, Windows variant)
- x-MacArabic : Mac OS legacy encoding for Arabic
- x-MacCentralEurope : Mac OS legacy encoding for Central European languages
- x-MacCroatian : Mac OS legacy encoding for Croatian
- x-MacCyrillic : Mac OS legacy encoding for Cyrillic
- x-MacDingbat : Mac OS legacy encoding for the Zapf Dingbats symbol font
- x-MacGreek : Mac OS legacy encoding for Greek
- x-MacHebrew : Mac OS legacy encoding for Hebrew
- x-MacIceland : Mac OS legacy encoding for Icelandic
- x-MacRoman : Mac OS legacy encoding for Western European languages (Mac Roman)
- x-MacRomania : Mac OS legacy encoding for Romanian
- x-MacSymbol : Mac OS legacy encoding for the Symbol font
- x-MacThai : Mac OS legacy encoding for Thai
- x-MacTurkish : Mac OS legacy encoding for Turkish
- x-MacUkraine : Mac OS legacy encoding for Ukrainian
- x-MS932_0213 : Windows Shift_JIS variant supporting JIS X 0213
- x-MS950-HKSCS : MS CP950 variant with HKSCS (Traditional Chinese - Hong Kong)
- x-MS950-HKSCS-XP : XP-era variant of MS950 with HKSCS mappings
- x-mswin-936 : Windows CP936 (GBK/GB2312 family for Simplified Chinese)
- x-PCK : PCK (Solaris variant of Shift_JIS)
- x-SJIS_0213 : Shift_JIS variant supporting JIS X 0213
- x-UTF-16LE-BOM : UTF-16LE with BOM (byte order mark)
- X-UTF-32BE-BOM : UTF-32BE with BOM (byte order mark)
- X-UTF-32LE-BOM : UTF-32LE with BOM (byte order mark)
- x-windows-50220 : Windows variant mapping for ISO-2022-JP (50220)
- x-windows-50221 : Windows variant mapping for ISO-2022-JP (50221)
- x-windows-874 : Windows code page 874 (Thai variant)
- x-windows-949 : Microsoft CP949 (extension of EUC-KR for Korean)
- x-windows-950 : Microsoft CP950 (Big5 variant for Traditional Chinese)
- x-windows-iso2022jp : Windows variant of ISO-2022-JP (Japanese)
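
Before passing one of the names above as the custom character set on upload, it can help to confirm locally that the file really decodes under that name. The following is a minimal Python sketch, not Smartling's own tooling. The helper names `check_charset` and `build_upload_fields` are hypothetical, and the form field names (`fileUri`, `fileType`, `charset`) are assumptions that should be checked against the Smartling API reference. Python's codec registry also does not recognize every name in the list (several of the `x-` prefixed names have no Python equivalent), so a `False` result may mean either "unknown to Python" or "bytes do not match".

```python
import codecs


def check_charset(data: bytes, charset: str) -> bool:
    """Return True if Python knows `charset` and `data` decodes cleanly under it."""
    try:
        codecs.lookup(charset)   # raises LookupError for names Python does not know
        data.decode(charset)     # raises UnicodeDecodeError on invalid byte sequences
    except (LookupError, UnicodeDecodeError):
        return False
    return True


def build_upload_fields(file_uri: str, file_type: str, charset: str) -> dict:
    """Assemble form fields for an upload request.

    The field names here are assumptions for illustration only;
    consult the API reference for the actual parameter names.
    """
    return {"fileUri": file_uri, "fileType": file_type, "charset": charset}


if __name__ == "__main__":
    sample = "日本語".encode("shift_jis")        # bytes as they would sit on disk
    print(check_charset(sample, "Shift_JIS"))    # True
    print(check_charset(sample, "UTF-8"))        # False: not valid UTF-8
    print(build_upload_fields("strings/ja.txt", "plain_text", "Shift_JIS"))
```

A clean decode does not prove the declared charset is the right one, because many single-byte encodings accept any byte sequence. The check only catches outright mismatches such as Shift_JIS bytes declared as UTF-8.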