JsChardet

Port of Python's chardet (https://github.com/chardet/chardet).

License

LGPL

How To Use It

Node

npm install jschardet

    var jschardet = require("jschardet")

    // "àíàçã" in UTF-8
    jschardet.detect("\xc3\xa0\xc3\xad\xc3\xa0\xc3\xa7\xc3\xa3")
    // { encoding: "UTF-8", confidence: 0.9690625 }

    // "次常用國字標準字體表" in Big5
    jschardet.detect("\xa6\xb8\xb1\x60\xa5\xce\xb0\xea\xa6\x72\xbc\xd0\xb7\xc7\xa6\x72\xc5\xe9\xaa\xed")
    // { encoding: "Big5", confidence: 0.99 }

    // Martin Kühl
    jschardet.detectAll("\x3c\x73\x74\x72\x69\x6e\x67\x3e\x4d\x61\x72\x74\x69\x6e\x20\x4b\xfc\x68\x6c\x3c\x2f\x73\x74\x72\x69\x6e\x67\x3e")
    // [
    //   {encoding: "windows-1252", confidence: 0.95},
    //   {encoding: "ISO-8859-2", confidence: 0.8796300205763055},
    //   {encoding: "SHIFT_JIS", confidence: 0.01}
    // ]
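The calls above pass raw bytes as a string in which each character stands for one byte. A minimal sketch of building such a string from a byte array follows. `toBinaryString` is a hypothetical helper name, not part of jschardet, and the commented-out `detect` call assumes the package is installed.

```javascript
// Hypothetical helper (not part of jschardet): turn raw bytes into a
// "binary string" in which each character code is one byte (0-255).
function toBinaryString(bytes) {
  let out = "";
  for (const b of bytes) {
    out += String.fromCharCode(b);
  }
  return out;
}

// "àíàçã" encoded as UTF-8 bytes.
const utf8Bytes = new Uint8Array([
  0xc3, 0xa0, 0xc3, 0xad, 0xc3, 0xa0, 0xc3, 0xa7, 0xc3, 0xa3,
]);
const byteString = toBinaryString(utf8Bytes);

// With jschardet installed, the string could then be passed on:
//   var jschardet = require("jschardet")
//   jschardet.detect(byteString)
```

The loop only relies on `String.fromCharCode`, so the same helper should work on a plain array, a `Uint8Array`, or a Node `Buffer`.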

Browser

Copy and include jschardet.min.js in your web page.

This library is also available in cdnjs at https://cdnjs.cloudflare.com/ajax/libs/jschardet/1.4.1/jschardet.min.js

Options

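    // Print debug information about the confidence computed for each encoding.
    jschardet.enableDebug();

    // Lower the minimum confidence a result must reach to be returned.
    jschardet.detect(str, { minimumThreshold: 0 });

    // Restrict detection to the listed encodings.
    jschardet.detect(str, { detectEncodings: ["UTF-8", "windows-1252"] });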
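The same two ideas, a confidence floor and an allow-list of encodings, can also be applied after the fact to an array shaped like the `detectAll` result. The sketch below is only an illustration: `pickEncoding` is a hypothetical helper, not a jschardet function, and the sample data is copied from the `detectAll` example in this README.

```javascript
// Hypothetical helper (not part of jschardet): choose the most confident
// candidate whose encoding is in `allowed` and whose confidence is at
// least `minConfidence`. Returns null when nothing qualifies.
function pickEncoding(candidates, allowed, minConfidence) {
  const allowedSet = new Set(allowed.map((name) => name.toLowerCase()));
  let best = null;
  for (const candidate of candidates) {
    if (!candidate.encoding) continue;
    if (!allowedSet.has(candidate.encoding.toLowerCase())) continue;
    if (candidate.confidence < minConfidence) continue;
    if (best === null || candidate.confidence > best.confidence) {
      best = candidate;
    }
  }
  return best;
}

// Sample data copied from the detectAll() example in this README.
const candidates = [
  { encoding: "windows-1252", confidence: 0.95 },
  { encoding: "ISO-8859-2", confidence: 0.8796300205763055 },
  { encoding: "SHIFT_JIS", confidence: 0.01 },
];

const picked = pickEncoding(candidates, ["ISO-8859-2", "SHIFT_JIS"], 0.2);
console.log(picked);
// { encoding: 'ISO-8859-2', confidence: 0.8796300205763055 }
```

Filtering afterwards keeps the full candidate list available for logging, whereas passing `detectEncodings` to `detect` narrows the search up front.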

Supported Charsets

Big5, GB2312/GB18030, EUC-TW, HZ-GB-2312, and ISO-2022-CN (Traditional and Simplified Chinese)

EUC-JP, SHIFT_JIS, and ISO-2022-JP (Japanese)

EUC-KR and ISO-2022-KR (Korean)

KOI8-R, MacCyrillic, IBM855, IBM866, ISO-8859-5, and windows-1251 (Russian)

ISO-8859-2 and windows-1250 (Hungarian)

ISO-8859-5 and windows-1251 (Bulgarian)

windows-1252

ISO-8859-7 and windows-1253 (Greek)

ISO-8859-8 and windows-1255 (Visual and Logical Hebrew)

TIS-620 (Thai)

UTF-32 BE, LE, 3412-ordered, or 2143-ordered (with a BOM)

UTF-16 BE or LE (with a BOM)

UTF-8 (with or without a BOM)

ASCII
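Several entries in the list above depend on a byte order mark (BOM). As a rough illustration of what those marks look like, here is a standalone sketch that checks the leading bytes of an input. It is not jschardet's internal code; `sniffBom` and the labels it returns are hypothetical names chosen for readability.

```javascript
// BOM signatures. The 4-byte UTF-32 marks are listed before the 2-byte
// UTF-16 marks because some of them start with the same two bytes.
// The labels are descriptive only and are not jschardet's return values.
const BOM_SIGNATURES = [
  { label: "UTF-32 BE", bytes: [0x00, 0x00, 0xfe, 0xff] },
  { label: "UTF-32 LE", bytes: [0xff, 0xfe, 0x00, 0x00] },
  { label: "UTF-32 3412-ordered", bytes: [0xfe, 0xff, 0x00, 0x00] },
  { label: "UTF-32 2143-ordered", bytes: [0x00, 0x00, 0xff, 0xfe] },
  { label: "UTF-8", bytes: [0xef, 0xbb, 0xbf] },
  { label: "UTF-16 BE", bytes: [0xfe, 0xff] },
  { label: "UTF-16 LE", bytes: [0xff, 0xfe] },
];

// Return the label of the first signature matching the start of `bytes`,
// or null when the input does not begin with a known BOM.
function sniffBom(bytes) {
  for (const { label, bytes: signature } of BOM_SIGNATURES) {
    if (bytes.length < signature.length) continue;
    if (signature.every((value, i) => bytes[i] === value)) return label;
  }
  return null;
}

console.log(sniffBom([0xff, 0xfe, 0x41, 0x00])); // UTF-16 LE
console.log(sniffBom([0xff, 0xfe, 0x00, 0x00])); // UTF-32 LE
```

Note that `FF FE 00 00` is ambiguous in principle (it could also be a UTF-16 LE BOM followed by a NUL character); this sketch simply prefers the longer match.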

Technical Information

I haven't been able to create tests to correctly detect:

ISO-2022-CN

windows-1250 in Hungarian

windows-1251 in Bulgarian

windows-1253 in Greek

EUC-CN

Development

Use npm run dist to update the distribution files. They're available at https://github.com/aadsm/jschardet/tree/master/dist.

Authors

Ported from Python to JavaScript by António Afonso (https://github.com/aadsm/jschardet)

Transformed into an npm package by Markus Ast (https://github.com/brainafk)