Converting a string from utf8 to latin1 in NodeJS

Submitted by 流过昼夜 on 2019-12-23 15:23:46

Question


I'm using a Latin1-encoded DB and can't change it to UTF-8, which means I run into issues with certain application data. I'm using Tesseract to OCR a document (Tesseract outputs UTF-8) and tried to use iconv-lite; however, it creates a buffer, which I then have to convert into a string. But again, the buffer-to-string conversion does not allow the "latin1" encoding.

I've read a bunch of questions/answers; however, all I find is advice about setting the client encoding and the like.

Any ideas?


Answer 1:


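You can create a buffer from the UTF-8 string you have, and then decode that buffer as Latin-1 using iconv-lite, like this:

var iconv  = require('iconv-lite');
// Take the UTF-8 bytes of the OCR output, then read them back as ISO-8859-1 characters.
var buff   = Buffer.from(tesseract_string, 'utf8');
var DB_str = iconv.decode(buff, 'ISO-8859-1');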
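
Note that the snippet above reinterprets the UTF-8 bytes as Latin-1 characters, so an accented letter ends up as two characters. If what the database needs is the text itself in Latin-1, a minimal sketch using only Node's built-in Buffer might look like the following. It assumes a Node.js version that accepts the 'latin1' encoding name; the helper name toLatin1Buffer and the sample string are made up for illustration.

```javascript
// Hypothetical helper: encode a JS string as Latin-1 (ISO-8859-1) bytes.
// Characters outside U+0000..U+00FF have no Latin-1 representation, so they
// are replaced with '?' first; Node's own 'latin1' encoder would otherwise
// silently truncate them to a single byte.
function toLatin1Buffer(str) {
  const safe = str.replace(/[^\u0000-\u00ff]/gu, '?');
  return Buffer.from(safe, 'latin1');
}

// Stand-in for the Tesseract output (an assumption, not real OCR data).
const tesseractString = 'café €5';
const latin1Bytes = toLatin1Buffer(tesseractString);

console.log(latin1Bytes);                    // 7 bytes: 63 61 66 e9 20 3f 35
console.log(latin1Bytes.toString('latin1')); // café ?5
```

Whether the driver wants a Buffer or a string depends on the database client in use, so treat this as a starting point rather than a drop-in fix.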



Answer 2:


I've found a way to convert a text file in any encoding to UTF-8:

var 
  fs = require('fs'),
  charsetDetector = require('node-icu-charset-detector'),
  iconvlite = require('iconv-lite');

/* Text files in a git repo can have
 * different encodings, but they
 * always need to be served as
 * standard 'utf-8'.
 */
function getFileContentsInUTF8(file_path) {
  var content = fs.readFileSync(file_path);
  var original_charset = charsetDetector.detectCharset(content);
  var jsString = iconvlite.decode(content, original_charset.toString());
  return jsString;
}

It's also in a gist here: https://gist.github.com/jacargentina/be454c13fa19003cf9f48175e82304d5

Maybe you can try this, where content should be your database buffer data (in latin1 encoding).
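
When the charset is already known to be Latin-1, detection can be skipped. Here is a rough sketch of that case using only Node's built-in Buffer, again assuming a Node.js version that accepts the 'latin1' encoding name. The dbBuffer value is a hypothetical stand-in for bytes read from the database.

```javascript
// Hypothetical stand-in for bytes read from the Latin-1 database: "café".
const dbBuffer = Buffer.from([0x63, 0x61, 0x66, 0xe9]);

// Decode the Latin-1 bytes into a regular JS string (one character per byte).
const text = dbBuffer.toString('latin1');

// Re-encode as UTF-8 bytes if something downstream needs raw UTF-8.
const utf8Bytes = Buffer.from(text, 'utf8');

console.log(text);                      // café
console.log(utf8Bytes.toString('hex')); // 636166c3a9
```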



Source: https://stackoverflow.com/questions/28594498/converting-a-string-from-utf8-to-latin1-in-nodejs
