Question
I'm using a Latin1-encoded database and can't change it to UTF-8, which means I run into issues with certain application data. I'm using Tesseract to OCR a document (Tesseract outputs UTF-8) and tried to use iconv-lite; however, it produces a buffer, which I then have to convert into a string. But again, buffer-to-string conversion does not allow the "latin1" encoding.
I've read a bunch of questions and answers; however, all I find is advice about setting the client encoding and the like.
Any ideas?
Answer 1:
You can create a buffer from the UTF-8 string you have, and then decode that buffer as Latin-1 using iconv-lite, like this:
var iconv = require('iconv-lite');
// Buffer.from replaces the deprecated new Buffer(...) constructor
var buff = Buffer.from(tesseract_string, 'utf8');
var DB_str = iconv.decode(buff, 'ISO-8859-1');
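For comparison, here is a minimal sketch of the opposite direction, assuming the goal is to end up with real Latin-1 bytes rather than a re-decoded string. It uses Node's built-in Buffer instead of iconv-lite; as far as I know, 'latin1' is accepted as a Buffer encoding name from Node.js 6.4 onward. The helper name toLatin1Buffer and the sample text are made up for illustration.

```javascript
// Sketch using only Node's built-in Buffer (no iconv-lite required).
// toLatin1Buffer is a hypothetical helper name, not part of any library.
function toLatin1Buffer(text) {
  // Latin-1 only covers U+0000..U+00FF. Replace everything else with '?'
  // so that Buffer.from does not silently truncate higher code points.
  var safe = Array.from(text, function (ch) {
    return ch.codePointAt(0) <= 0xff ? ch : '?';
  }).join('');
  return Buffer.from(safe, 'latin1');
}

// Stand-in for the UTF-8 output of the OCR step: "café" is 5 bytes in UTF-8.
var utf8Bytes = Buffer.from('café', 'utf8');

// Decode to a JavaScript string first, then re-encode as Latin-1.
var text = utf8Bytes.toString('utf8');
var latin1Bytes = toLatin1Buffer(text); // 4 bytes, 'é' becomes 0xE9

// Reading the Latin-1 bytes back gives the original text.
var roundTrip = latin1Bytes.toString('latin1');
```

Whether the database driver wants the Buffer or a string depends on the driver and its connection settings, so treat this as a starting point only.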
Answer 2:
I've found a way to convert a text file in any encoding to UTF-8:
var fs = require('fs'),
    charsetDetector = require('node-icu-charset-detector'),
    iconvlite = require('iconv-lite');

/* Having different encodings
 * on text files in a git repo
 * but need to serve always on
 * standard 'utf-8'
 */
function getFileContentsInUTF8(file_path) {
    var content = fs.readFileSync(file_path);
    var original_charset = charsetDetector.detectCharset(content);
    var jsString = iconvlite.decode(content, original_charset.toString());
    return jsString;
}
It's also in a gist here: https://gist.github.com/jacargentina/be454c13fa19003cf9f48175e82304d5
Maybe you can try this, where content should be your database buffer data (in Latin-1 encoding).
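If the encoding is already known to be Latin-1, charset detection may not be needed at all. The following is a rough sketch using only Node's built-in Buffer, under the assumption that the data really is ISO-8859-1 (Node's 'latin1' maps each byte directly to the code point of the same value, so it is not Windows-1252). The byte values are a made-up stand-in for a database value.

```javascript
// Hypothetical stand-in for a value read from a Latin-1 column: "café".
var dbBytes = Buffer.from([0x63, 0x61, 0x66, 0xe9]);

// Decode the Latin-1 bytes into a regular JavaScript string.
var decoded = dbBytes.toString('latin1');

// Re-encode as UTF-8 when bytes are needed, e.g. to write a file.
var asUtf8 = Buffer.from(decoded, 'utf8');

// Decoding the same bytes as UTF-8 would be wrong: a lone 0xE9 is not
// valid UTF-8, so it turns into the replacement character U+FFFD.
var misread = dbBytes.toString('utf8');
```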
Source: https://stackoverflow.com/questions/28594498/converting-a-string-from-utf8-to-latin1-in-nodejs