iconv

How do I convert files between encodings where only some of them are wrong?

梦想的初衷 posted on 2019-12-02 08:40:09
Question: I have a large set of nested directories containing PHP, HTML, and JavaScript files that should all be encoded as UTF-8. However, someone edited several of the files and saved them with ISO-8859-1 encoding. Unfortunately, they're all mixed in with the UTF-8 files. I'd like to use the iconv tool to convert the incorrectly-encoded files to UTF-8 (as described here). Primarily, the problems occur with characters that are valid ISO-8859-1 but invalid UTF-8. I think an appropriate starting point
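One way to approach the mixed-encoding tree described above is to test each file for UTF-8 validity and re-encode only the files that fail. The Python sketch below is an illustration under stated assumptions, not the asker's iconv-based solution: the function names `is_valid_utf8` and `fix_tree` and the extension list are hypothetical, and every file that is not valid UTF-8 is assumed to be ISO-8859-1. An ISO-8859-1 file whose bytes happen to form valid UTF-8 would be missed by this check.

```python
import os


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (pure ASCII also passes)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def fix_tree(root: str, extensions=(".php", ".html", ".js")) -> list:
    """Re-encode files under root that are not valid UTF-8.

    Assumption: any file that fails the UTF-8 check is ISO-8859-1.
    Returns the paths that were rewritten.
    """
    converted = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                raw = fh.read()
            if is_valid_utf8(raw):
                continue  # leave correctly encoded files untouched
            with open(path, "wb") as fh:
                fh.write(raw.decode("iso-8859-1").encode("utf-8"))
            converted.append(path)
    return converted
```

Because the files are overwritten in place, it is safer to run this on a copy of the tree or under version control first.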

ISO-8859-1 Character truncates text inserting into utf-8 mysql column

a 夏天 posted on 2019-12-02 08:38:08
So I have a weird truncation issue! Can't find a specific answer on this. So basically there's an issue with an apparent ISO character ½ that truncates the rest of the text upon insertion into a column with UTF-8 specified. Let's say that my string is: "You need to add ½ cup of water." MySQL will truncate that to "You need to add" if I: print iconv("ISO-8859-1", "UTF-8//IGNORE", $text); Then it outputs: ½ O_o OK that doesn't work because I need the 1/2 by itself. If I go to phpMyAdmin and copy and paste the sentence in and submit it, it works like a charm as the whole string is in there with
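The byte-level picture behind this kind of problem can be shown with a short Python sketch. It is an illustration only: the excerpt is truncated, so the actual cause in the asker's setup is not known, and the helper name `to_utf8_once` is hypothetical. In ISO-8859-1 the character ½ is the single byte 0xBD, which is not a valid UTF-8 sequence on its own. In UTF-8 it is the two bytes 0xC2 0xBD, and converting text that is already UTF-8 a second time produces mojibake.

```python
text = "You need to add ½ cup of water."

latin1_bytes = text.encode("iso-8859-1")  # ½ becomes the single byte 0xBD
utf8_bytes = text.encode("utf-8")         # ½ becomes the two bytes 0xC2 0xBD


def to_utf8_once(raw: bytes) -> bytes:
    """Convert ISO-8859-1 bytes to UTF-8 unless they are already valid UTF-8."""
    try:
        raw.decode("utf-8")
        return raw  # already UTF-8, do not convert again
    except UnicodeDecodeError:
        return raw.decode("iso-8859-1").encode("utf-8")


# Treating UTF-8 bytes as ISO-8859-1 and converting again double-encodes them
double_encoded = utf8_bytes.decode("iso-8859-1").encode("utf-8")
```

Checking validity before converting avoids the double-encoding case, at the cost of assuming that anything invalid as UTF-8 is ISO-8859-1.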

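The Node.js iconv module: reading Windows-encoded files on Linux

大憨熊 posted on 2019-12-02 07:07:51
Sometimes we save text files containing Chinese on Windows, but because the character sets differ, the file shows up as garbled text on Linux. One solution is the iconv-lite module for Node.js:
var fs = require('fs');
var iconv = require('iconv-lite');
var readstream = fs.createReadStream('./新建文本文档.txt');
var chunks = [];
var count = 0;
readstream.on('data', function (chunk) {
  // Collect the raw buffers; a multi-byte GBK character may be split across two chunks
  chunks.push(chunk);
  count++;
});
readstream.on('end', function () {
  // Decode once, after all bytes have arrived
  var str = iconv.decode(Buffer.concat(chunks), 'GBK');
  console.log(str);
  console.log("File read in " + count + " chunk(s)");
});
Source: https://www.cnblogs.com/saintdingspage/p/11735862.html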
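When a file is read in chunks, a multi-byte GBK character can be split across a chunk boundary, so the decoder has to either wait for all the bytes or carry state from one chunk to the next. The Python sketch below shows the stateful variant with the standard library's incremental decoder. It illustrates the general technique and is not a port of the iconv-lite API; the function name `decode_chunks` and the sample data are assumptions.

```python
import codecs


def decode_chunks(chunks, encoding="gbk"):
    """Decode an iterable of byte chunks, keeping partial characters between chunks."""
    decoder = codecs.getincrementaldecoder(encoding)()
    parts = [decoder.decode(chunk) for chunk in chunks]
    # final=True flushes the decoder and raises if a character is left incomplete
    parts.append(decoder.decode(b"", final=True))
    return "".join(parts)


data = "新建文本文档".encode("gbk")  # 6 characters, 2 bytes each in GBK
chunks = [data[:3], data[3:]]       # the split falls inside the second character

text = decode_chunks(chunks)
```

Decoding `data[:3]` on its own with `bytes.decode("gbk")` fails, because the third byte is only the first half of a character.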

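An introduction to the vim text editor and how to use it

僤鯓⒐⒋嵵緔 posted on 2019-12-02 03:21:09
(1) What is the vim editor?
Configuring a service on a Linux system really means editing its configuration files (there may be several, each holding different parameters), and day-to-day work also involves writing documents. Both are done in a text editor. Popular Linux distributions install a very capable text editor by default, called "vim", which is an improved version of the vi editor. Vim has won over so many vendors and users because it has three modes: command mode, last-line mode, and edit mode. Each mode has its own commands and key combinations, which greatly improves efficiency and feels natural once you are used to it. To edit text efficiently, we first need to understand how the three modes differ and how to switch between them.
(2) Switching between the three modes
(3) Common operations in normal (command) mode
h (or Left arrow): move the cursor one character to the left
j (or Down arrow): move the cursor down one line
k (or Up arrow): move the cursor up one line
l (or Right arrow): move the cursor one character to the right
Ctrl + f: scroll the screen down one page (same as Page Down)
Ctrl + b: scroll the screen up one page (same as Page Up)
0 or Home: move the cursor to the start of the current line
$ or End: move the cursor to the end of the current line
G: move the cursor to the last line of the file (at its first character)
nG: n is a number (here and below); move to line n of the current file
gg: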

_libiconv or _iconv undefined symbol on Mac OSX

孤街浪徒 posted on 2019-12-01 23:10:43
When compiling some packages from source on Mac OSX, I get the following iconv error:
Undefined symbols for architecture x86_64:
"_iconv", referenced from:
"_iconv_close", referenced from:
"_iconv_open", referenced from:
or I get:
Undefined symbols for architecture x86_64:
"_libiconv", referenced from:
"_libiconv_open", referenced from:
"_libiconv_close", referenced from:
Why does this happen and how can I get around this dependency or, more generally, figure out what is going on and how to fix it?
John Q
I have run into this problem over multiple years / upgrades of Mac OSX. I have thoroughly

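ThinkPHP pagination encoding error garbles the SQL query on page two

别来无恙 posted on 2019-12-01 21:03:46
When clicking through to page two, the value passed in the URL arrives as GBK, which garbles the SQL query, even though both the application and the database use UTF-8. The fix is below.
$keyword = $this->_param('key');
// mb_check_encoding checks whether a string is valid in the specified encoding.
// It returns TRUE on success or FALSE on failure.
// mb_check_encoding([ string $var = NULL [, string $encoding = mb_internal_encoding() ] ]);
if (!mb_check_encoding($keyword, 'utf-8')) {
    // iconv: convert a string to the requested character encoding
    // string iconv(string $in_charset, string $out_charset, string $str)
    // $in_charset: input character set; $out_charset: output character set; $str: string to convert
    $keyword = iconv('gbk', 'utf-8', $keyword);
}
Source: oschina Link: https://my.oschina.net/u/1412997/blog/221213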
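The check-then-convert idea in the fix above can also be sketched outside PHP. The Python version below works on a percent-encoded query value: it recovers the raw bytes, tries UTF-8 first, and falls back to GBK only when the bytes are not valid UTF-8. The function name `decode_keyword` is hypothetical, and GBK as the only fallback is an assumption taken from the post.

```python
from urllib.parse import quote, unquote_to_bytes


def decode_keyword(raw_query_value: str) -> str:
    """Decode a percent-encoded query value, preferring UTF-8 over GBK."""
    raw = unquote_to_bytes(raw_query_value)
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 was sent as GBK
        return raw.decode("gbk")


gbk_value = quote("中文".encode("gbk"))     # what a GBK-encoding browser would send
utf8_value = quote("中文".encode("utf-8"))  # what a UTF-8-encoding browser would send
```

UTF-8 is tried first because it is the stricter of the two: GBK accepts many byte sequences, so testing it first would misread valid UTF-8 input.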

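Converting string encodings in PHP: the difference between iconv and mb_convert_encoding

不问归期 posted on 2019-12-01 21:03:27
A summary of mb_detect_encoding, PHP's function for detecting a string's encoding. iconv: Convert string to requested character encoding (PHP 4 >= 4.0.5, PHP 5). mb_convert_encoding: Convert character encoding (PHP 4 >= 4.0.6, PHP 5). The two functions are similar: both convert a string from one encoding to another. Usage: string mb_convert_encoding ( string str, string to_encoding [, mixed from_encoding] ) Note: the mbstring extension must be enabled first; in php.ini, remove the leading ; from the line ;extension=php_mbstring.dll. Parameters: str is the string to convert; to_encoding is the encoding to convert str to; from_encoding names the encoding of str before conversion. It can be an array or a comma-separated list. If from_encoding is not provided, the internal encoding is used. See the supported encodings. Supported character encodings: Currently, mbstring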

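An introduction to the mb_convert_encoding and iconv functions in PHP

妖精的绣舞 posted on 2019-12-01 21:03:10
mb_convert_encoding is a function for converting between character encodings. I never really understood the concept of encoding in programs, but it is starting to make sense now. Plain ASCII English text rarely runs into encoding problems, because its bytes are the same in GBK and UTF-8; the problem shows up with non-ASCII data such as Chinese. For example, if you write a program in Zend Studio or EditPlus using GBK and the data has to go into a database whose encoding is utf8, the data must be converted first, otherwise it ends up garbled in the database. For the usage of mb_convert_encoding, see the official manual: http://cn.php.net/manual/zh/function.mb-convert-encoding.php
A GBK to UTF-8 conversion:
<?php header("content-Type: text/html; charset=Utf-8"); echo mb_convert_encoding("妳係我的友仔", "UTF-8", "GBK"); ?>
And a GB2312 to Big5 conversion:
<?php header("content-Type: text/html; charset=big5"); echo mb_convert_encoding("你是我的朋友", "big5", "GB2312"); ?>
To use the function above, the mbstring extension must be enabled first. Another PHP function, iconv, also converts string encodings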
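The two conversions in the post both follow the same pattern: decode the source bytes to Unicode, then encode to the target charset. A minimal Python sketch of that pattern follows. The function name `transcode` is hypothetical and it is not a reimplementation of mb_convert_encoding; by default it raises an error when a character has no representation in the target charset rather than substituting one.

```python
def transcode(raw: bytes, from_encoding: str, to_encoding: str) -> bytes:
    """Convert bytes between encodings by going through Unicode."""
    return raw.decode(from_encoding).encode(to_encoding)


# GBK to UTF-8, using the sample string from the post
gbk_bytes = "妳係我的友仔".encode("gbk")
utf8_bytes = transcode(gbk_bytes, "gbk", "utf-8")

# GB2312 to Big5, using the second sample string from the post
gb2312_bytes = "你是我的朋友".encode("gb2312")
big5_bytes = transcode(gb2312_bytes, "gb2312", "big5")
```

Going through Unicode means the source encoding must be stated correctly; if the bytes are decoded with the wrong charset, the output is garbled even though no error is raised.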

How to Convert UTF-16 to UTF-32 and Print the Resulting wchar_t in C?

南笙酒味 posted on 2019-12-01 19:27:57
I'm trying to print out a string of UTF-16 characters. I posted this question a while back, and the advice given was to convert to UTF-32 using iconv and print it as a string of wchar_t. I've done some research and managed to code the following:
// *c is the pointer to the characters (UTF-16) i'm trying to print
// sz is the size in bytes of the input i'm trying to print
iconv_t icv;
char in_buf[sz];
char* in;
size_t in_sz;
char out_buf[sz * 2];
char* out;
size_t out_sz;
icv = iconv_open("UTF-32", "UTF-16");
memcpy(in_buf, c, sz);
in = in_buf;
in_sz = sz;
out = out_buf;
out_sz = sz * 2;
size_t
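The size relationship between UTF-16 input and UTF-32 output can be checked quickly in Python. This sketch is not a translation of the C code above and says nothing certain about what a particular iconv implementation does; the function name and sample text are assumptions. It shows that a generic "UTF-32" encoder may prepend a 4-byte byte-order mark, in which case the output is larger than twice the input, while an explicit byte order such as "utf-32-le" produces no mark.

```python
def utf16_to_utf32(raw: bytes, src: str = "utf-16-le", dst: str = "utf-32") -> bytes:
    """Transcode UTF-16 bytes to UTF-32 by going through Unicode."""
    return raw.decode(src).encode(dst)


text = "hi €"
src = text.encode("utf-16-le")                   # 4 code units, 8 bytes, no BOM

with_bom = utf16_to_utf32(src)                   # 4-byte BOM + 4 code points
without_bom = utf16_to_utf32(src, dst="utf-32-le")  # 4 code points only
```

For an output buffer sized at exactly twice the input, the variant with a byte-order mark would not fit, which is worth keeping in mind when sizing buffers for a conversion like the one in the question.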