iconv

Ruby converting string encoding from ISO-8859-1 to UTF-8 not working

我与影子孤独终老i submitted on 2019-11-29 04:22:58
I am trying to convert a string from ISO-8859-1 encoding to UTF-8 but I can't seem to get it to work. Here is an example of what I have done in irb.
    irb(main):050:0> string = 'Norrlandsvägen'
    => "Norrlandsvägen"
    irb(main):051:0> string.force_encoding('iso-8859-1')
    => "Norrlandsv\xC3\xA4gen"
    irb(main):052:0> string = string.encode('utf-8')
    => "NorrlandsvÃ¤gen"
I am not sure why Norrlandsvägen in iso-8859-1 will be converted into NorrlandsvÃ¤gen in utf-8. I have tried encode, encode!, encode(destinationEncoding, originalEncoding), iconv, force_encoding, and all kinds of weird work-arounds I could
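The `\xC3\xA4` in the irb output suggests the literal was already UTF-8 before it was relabelled. The following is a minimal Python sketch of that effect, not the asker's Ruby code; the variable names are illustrative only.

```python
# The typed literal is already UTF-8: "ä" is stored as the two bytes C3 A4.
original = "Norrlandsvägen"
utf8_bytes = original.encode("utf-8")

# Relabelling those bytes as ISO-8859-1 (roughly what force_encoding does)
# makes every byte count as its own character.
mislabelled = utf8_bytes.decode("iso-8859-1")

# Transcoding the mislabelled text to UTF-8 then bakes the mistake in.
double_encoded = mislabelled.encode("utf-8")

# A genuine ISO-8859-1 input stores "ä" as the single byte E4 and converts cleanly.
latin1_bytes = original.encode("iso-8859-1")
converted = latin1_bytes.decode("iso-8859-1").encode("utf-8")

print(repr(mislabelled))
print(converted == utf8_bytes)
```

If the input really is ISO-8859-1, the last two lines show the conversion that works; if it is already UTF-8, no conversion is needed at all.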

How can I force PHP to use the libiconv version of iconv instead of the CentOS-installed glibc version?

人走茶凉 submitted on 2019-11-29 03:05:15
Question: The code I'm working on runs perfectly on Windows XP and on Mac OS X. When testing it on CentOS (and on Fedora and Ubuntu), it's not working properly. Searching the net led me to the conclusion that the glibc version of iconv is causing the problem. So now I need the libiconv version of iconv for Zend Lucene to work properly. I already downloaded libiconv and configured it with --prefix=/usr/local, then ran make and make install without any errors. It seems that it was successfully

How to convert character encoding from CP932 to UTF-8 in nodejs javascript, using the nodejs-iconv module (or other solution)

和自甴很熟 submitted on 2019-11-29 00:15:35
I'm attempting to convert a string from CP932 (aka Windows-31J) to UTF-8 in JavaScript. Basically I'm crawling a site that ignores the utf-8 request in the request header and returns CP932-encoded text (even though the HTML meta tag indicates that the page is shift_jis). Anyway, I have the entire page stored in a string variable called "html". From there I'm attempting to convert it to UTF-8 using this code:
    var Iconv = require('iconv').Iconv;
    var conv = new Iconv('CP932', 'UTF-8//TRANSLIT//IGNORE');
    var myBuffer = new Buffer(html.length * 3);
    myBuffer.write(html, 0, 'utf8')
    var utf8html = (conv
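A conversion like this generally has to start from the raw response bytes rather than from an already-decoded string. Here is a small Python sketch of that idea, not the nodejs-iconv API; the function name `cp932_to_utf8` and the sample text are assumptions made for illustration.

```python
def cp932_to_utf8(raw: bytes) -> bytes:
    """Decode CP932 (Windows-31J) bytes and re-encode them as UTF-8."""
    # errors="replace" swaps undecodable bytes for U+FFFD instead of raising.
    text = raw.decode("cp932", errors="replace")
    return text.encode("utf-8")


# Hypothetical stand-in for the bytes of a crawled page.
sample = "日本語のページ".encode("cp932")
utf8_html = cp932_to_utf8(sample)
print(utf8_html.decode("utf-8"))
```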

iconv any encoding to UTF-8

三世轮回 submitted on 2019-11-28 23:12:46
I am trying to point iconv at a directory so that all files are converted to UTF-8 regardless of their current encoding. I am using this script, but you have to specify what encoding you are going FROM. How can I make it auto-detect the current encoding?
dir_iconv.sh
    #!/bin/bash
    ICONVBIN='/usr/bin/iconv' # path to iconv binary
    if [ $# -lt 3 ]
    then
        echo "$0 dir from_charset to_charset"
        exit
    fi
    for f in "$1"/*
    do
        if test -f "$f"
        then
            echo -e "\nConverting $f"
            /bin/mv "$f" "$f.old"
            "$ICONVBIN" -f "$2" -t "$3" "$f.old" > "$f"
        else
            echo -e "\nSkipping $f - not a regular file";
        fi
    done
terminal line
    sudo convert/dir_iconv.sh
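iconv itself does not guess the source encoding, so some detection step has to come first. One simple heuristic, sketched here in Python with only the standard library, is to try a list of candidate encodings in order. The candidate list and the name `guess_and_decode` are assumptions, and the approach can misidentify files whose bytes are valid in more than one encoding.

```python
from typing import Iterable, Tuple

# Assumed candidate list. ISO-8859-1 accepts every byte sequence,
# so it only makes sense as the last resort.
CANDIDATES = ("utf-8", "iso-8859-1")


def guess_and_decode(raw: bytes, candidates: Iterable[str] = CANDIDATES) -> Tuple[str, str]:
    """Return (encoding_name, text) for the first candidate that decodes cleanly."""
    for name in candidates:
        try:
            return name, raw.decode(name)
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings fit the input")


name, text = guess_and_decode(b"caf\xe9")
print(name, text)
```

Once the encoding is known, its name can be passed as the FROM argument of a conversion step.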

Linux compression, decompression, and related commands: a summary

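半世苍凉 submitted on 2019-11-28 21:22:47
zip unzip tar gzip zcat znew bzip2 bzcat bzmore bzless bzcmp bzgrep bzdiff bzip2recover gzexe compress uncompress
The lha command is a compression program that evolved from lharc; compressing a file with it produces a separate archive with the .lzh extension.
The zipsplit command splits a large "zip" archive into several smaller "zip" archives.
The zipinfo command lists information about an archive; running it shows the details of a zip file.
The arj command is a manager for ".arj" archives, used to create and maintain them.
The unarj command extracts archives created by arj.
The zforce command forces a ".gz" suffix onto gzip-compressed files.
The cpio command is mainly a tool for creating or restoring backup archives; it can copy files into an archive or copy files out of one.
The iconv command converts the encoding of a file; for example, it can convert UTF-8 text to GB18030 and back again. The JDK provides a similar tool, native2ascii. On Linux, the iconv development library includes the C functions iconv_open, iconv_close, and iconv, which make it easy to convert character encodings in C/C++ programs. This is very useful in programs that crawl web pages, and the iconv command comes in handy when debugging such programs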
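The UTF-8 to GB18030 round trip mentioned for iconv can be illustrated with a short Python sketch. It uses Python's built-in codecs rather than the iconv C library, and the helper name `transcode` is made up for this example.

```python
def transcode(raw: bytes, from_enc: str, to_enc: str) -> bytes:
    """Decode raw bytes from one encoding and re-encode them in another."""
    return raw.decode(from_enc).encode(to_enc)


utf8 = "编码转换".encode("utf-8")
gb = transcode(utf8, "utf-8", "gb18030")   # UTF-8 -> GB18030
back = transcode(gb, "gb18030", "utf-8")   # and back again

print(back == utf8)
```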

Transliterate any convertible utf8 char into ascii equivalent

痞子三分冷 submitted on 2019-11-28 20:30:37
Is there any good solution out there that does this transliteration well? I've tried using iconv(), but it is very annoying and does not behave as one might expect.
Using //TRANSLIT will try to replace what it can, leaving everything nonconvertible as "?".
Using //IGNORE will not leave "?" in the text, but will also not transliterate, and will also raise E_NOTICE when a nonconvertible char is found, so you have to use iconv with the @ error suppressor.
Using //IGNORE//TRANSLIT (as some people suggested in the PHP forum) is actually the same as //IGNORE (tried it myself on php versions 5.3.2 and 5.3.13
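An alternative to iconv-style transliteration is Unicode decomposition: split each accented character into a base letter plus combining marks, then drop the marks. Below is a minimal Python sketch of that technique. The function name `to_ascii` is hypothetical, and characters with no decomposition (for example "ø" or "ß") are silently dropped here instead of being transliterated.

```python
import unicodedata


def to_ascii(text: str) -> str:
    """Strip accents via NFKD decomposition and drop anything still non-ASCII."""
    decomposed = unicodedata.normalize("NFKD", text)
    # Combining marks (accents, diaereses, ...) have a non-zero combining class.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.encode("ascii", errors="ignore").decode("ascii")


print(to_ascii("crème brûlée"))
```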

Migrating a php application to handle UTF-8

混江龙づ霸主 submitted on 2019-11-28 12:48:59
I am working on a multi-language app in PHP. All was fine until recently, when I was asked to support Chinese characters. The actions I took to support UTF-8 characters are the following:
All DB tables are now UTF-8.
HTML templates contain the tag <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />.
The controllers send out a header specifying the encoding (utf-8) to use for the HTTP response.
All was good until I started doing some string manipulation (substr and the like). With Chinese it won't work, because Chinese is represented as multibyte characters, and hence if you do a normal
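The difference between cutting by bytes and cutting by characters can be shown with a short Python sketch. It is an illustration of the general problem, not PHP's substr; the sample text and the helper `is_valid_utf8` are assumptions.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


text = "多语言应用程序"
raw = text.encode("utf-8")       # each of these characters takes 3 bytes

first_two_chars = text[:2]       # character-based slice: safe
first_four_bytes = raw[:4]       # byte-based slice: cuts the 2nd character apart

print(first_two_chars, is_valid_utf8(first_four_bytes))
```

A byte-oriented substring function behaves like the second slice, which is why a multibyte-aware function is needed for text like this.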

Convert UTF-16LE to UTF-8 in php

微笑、不失礼 submitted on 2019-11-28 11:14:50
I use the iconv PHP function but some characters don't convert correctly:
    ...
    $s = iconv('UTF-16', 'UTF-8', $s);
    ...
    $s = iconv('UTF-16//IGNORE', 'UTF-8', $s);
    ...
    $s = iconv('UTF-16LE', 'UTF-8', $s);
    ...
    $s = iconv('UTF-16LE//IGNORE', 'UTF-8', $s);
    ...
I also tried the mb_convert_encoding function but couldn't solve my problem. A sample text file: 9px.ir/utf8-16LE.rar
iconv supports the UTF-16LE encoding. You can use it to convert the encoding from UTF-16LE to UTF-8:
    $result = iconv($in_charset = 'UTF-16LE', $out_charset = 'UTF-8', $str);
    if (false === $result) {
        throw new Exception('Input string
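For comparison, the same UTF-16LE to UTF-8 conversion can be sketched in Python. The function name `utf16le_to_utf8` and the sample bytes are assumptions; the explicit byte-order-mark handling is included because a codec that is told the byte order up front does not remove a leading BOM by itself.

```python
def utf16le_to_utf8(raw: bytes) -> bytes:
    """Convert UTF-16LE bytes to UTF-8, dropping a leading byte order mark."""
    text = raw.decode("utf-16-le")   # raises UnicodeDecodeError on malformed input
    if text.startswith("\ufeff"):    # the BOM survives decoding as U+FEFF
        text = text[1:]
    return text.encode("utf-8")


# Hypothetical file contents: BOM (FF FE) followed by UTF-16LE text.
sample = b"\xff\xfe" + "héllo €".encode("utf-16-le")
print(utf16le_to_utf8(sample).decode("utf-8"))
```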

PHP: Dealing special characters with iconv

天涯浪子 submitted on 2019-11-28 07:17:40
I still don't understand how iconv works. For instance,
    $string = "Löic & René";
    $output = iconv("UTF-8", "ISO-8859-1//TRANSLIT", $string);
I get,
    Notice: iconv() [function.iconv]: Detected an illegal character in input string in...
With $string = "Löic"; or $string = "René"; I get,
    Notice: iconv() [function.iconv]: Detected an incomplete multibyte character in input string in.
I get nothing with $string = "&";
There are two sets of different outputs that I need to store in two different columns of a table in my database. I need to convert Löic & René to Loic & Rene for clean URL purposes. I
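Both notices say that the input is not valid UTF-8, which is what the call declares it to be. A quick way to test that, sketched in Python, is to try decoding the raw bytes. The Latin-1 byte strings below are an assumption about what the source file might contain (the excerpt does not confirm it), and `utf8_problem` is a made-up helper.

```python
from typing import Optional


def utf8_problem(raw: bytes) -> Optional[str]:
    """Return None if raw is valid UTF-8, else the decoder's complaint."""
    try:
        raw.decode("utf-8")
        return None
    except UnicodeDecodeError as exc:
        return exc.reason


# Assumed scenario: the strings were saved as ISO-8859-1, not UTF-8.
for raw in (b"L\xf6ic & Ren\xe9", b"Ren\xe9", b"&", "René".encode("utf-8")):
    print(raw, "->", utf8_problem(raw))
```

Pure ASCII input such as "&" passes in either encoding, which would be consistent with that string producing no notice.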

Batch convert latin-1 files to utf-8 using iconv

随声附和 submitted on 2019-11-28 03:12:25
I have a PHP project on my OS X machine which is in latin1 encoding. Now I need to convert the files to UTF-8. I'm not much of a shell coder, and I tried something I found on the internet:
    mkdir new
    for a in `ls -R *`; do iconv -f iso-8859-1 -t utf-8 <"$a" >new/"$a" ; done
But that does not create the directory structure and it gives me a heck of a load of errors when run. Can anyone come up with a neat solution?
You shouldn't use ls like that, and a for loop is not appropriate either. Also, the destination directory should be outside the source directory.
    mkdir /path/to/destination
    find . -type f -exec iconv
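The same batch job, including recreating the directory structure, can be sketched in Python as an alternative to the shell approach. This is not the answer's find command, which is cut off in the excerpt. The function name `convert_tree` is hypothetical, and the sketch assumes every regular file under the source is Latin-1 text and that the destination lies outside the source tree.

```python
from pathlib import Path


def convert_tree(src: Path, dst: Path,
                 from_enc: str = "iso-8859-1", to_enc: str = "utf-8") -> int:
    """Copy every regular file under src to dst, transcoding its contents.

    dst must be outside src, otherwise freshly written files could be
    picked up by the walk. Returns the number of files converted.
    """
    count = 0
    for path in src.rglob("*"):
        if not path.is_file():
            continue
        target = dst / path.relative_to(src)
        target.parent.mkdir(parents=True, exist_ok=True)  # mirror subdirectories
        target.write_bytes(path.read_bytes().decode(from_enc).encode(to_enc))
        count += 1
    return count
```

A call such as `convert_tree(Path("project"), Path("/path/to/destination"))` would mirror the tree; binary files such as images would need to be excluded first, since they would be corrupted by transcoding.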