iconv

Importing an Excel file with Greek characters into R in the correct encoding

核能气质少年 submitted on 2019-11-28 00:57:43
Question: I am having some trouble importing the following file: http://www.kuleuven.be/bio/ento/temp/test.xlsx into R in the correct encoding. In particular,
    library("xlsx")
    read.xlsx("test.xlsx",1,header=F,colClasses=c("character"),encoding="UTF-8")
gives me
       X1
    1  a-cadinol
    2  a-calacorene
    3  a-caryophyllene alcohol
    4  a-curcumene
    5  a-elemol
    6  a-muurolene
    7  a-terpineol acetate
    8  ß-4-dimethyl-3-cyclohexane-1-ethanol acetate
    9  ß-bisabolene
    10 ß-bisabolol
    11 ß-bourbonene
    12 ß-caryophyllene alcohol
    13 ß

in `require': no such file to load -- iconv (LoadError)

北慕城南 submitted on 2019-11-27 23:33:13
    ➜ expertiza git:(master) ✗ ruby -v
    ruby 1.8.7 (2011-06-30 patchlevel 352) [i686-darwin11.1.0]
    ➜ expertiza git:(master) ✗ rails -v
    Rails 2.3.14
    ➜ expertiza git:(master) ✗ script/server
    /Users/HPV/.rvm/gems/ruby-1.8.7-p352/gems/activesupport-2.3.14/lib/active_support/inflector.rb:3:in `require': no such file to load -- iconv (LoadError)
        from /Users/HPV/.rvm/gems/ruby-1.8.7-p352/gems/activesupport-2.3.14/lib/active_support/inflector.rb:3
        from /Users/HPV/.rvm/gems/ruby-1.8.7-p352/gems/activesupport-2.3.14/lib/active_support/core_ext/integer/inflections.rb:1:in `require'
        from /Users/HPV/.rvm/gems

How to get the list of encodings supported by the iconv library in PHP?

[亡魂溺海] submitted on 2019-11-27 23:31:09
Is there something like the mcrypt library's mcrypt_list_algorithms() function for iconv? Is there an iconv_list_encodings-like function?
In PHP, the iconv extension does not have a function to list all available encodings. Which encodings are available depends on the library that iconv uses internally, for example libiconv. The libiconv website also contains a list of charsets you can use. You can also connect to your server via SSH and execute the following command:
    $ iconv -l
This will give you the available list on that system if PHP is compiled against the same library as the command-line

iconv_strlen function causing execution timeout, running on MAMP

 ̄綄美尐妖づ submitted on 2019-11-27 20:45:12
Has anyone had issues with the iconv_strlen function while running MAMP? I have been having a timeout issue with it, but without any exceptions being thrown. I'm working on a Zend Framework site. By following the debugger deep into the guts, I tracked the problem down to the use of iconv_strlen. It's not being called on any strange string; it's a simple function being used to validate a hostname. To verify the issue, I tried a simple
    iconv_strlen("test", 'UTF-8');
This causes the error to come up: an endless spinning loader in the browser but no error log message, and the script goes

Ruby converting string encoding from ISO-8859-1 to UTF-8 not working

两盒软妹~` submitted on 2019-11-27 18:14:05
Question: I am trying to convert a string from ISO-8859-1 encoding to UTF-8 but I can't seem to get it to work. Here is an example of what I have done in irb.
    irb(main):050:0> string = 'Norrlandsvägen'
    => "Norrlandsvägen"
    irb(main):051:0> string.force_encoding('iso-8859-1')
    => "Norrlandsv\xC3\xA4gen"
    irb(main):052:0> string = string.encode('utf-8')
    => "NorrlandsvÃ¤gen"
I am not sure why Norrlandsvägen in iso-8859-1 will be converted into NorrlandsvÃ¤gen in utf-8. I have tried encode, encode!, encode
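The effect described in this question can be reproduced outside Ruby. The following is a minimal Python sketch, not part of the original post, and the variable names are illustrative. It assumes the terminal supplied the string as UTF-8 bytes, which the `\xC3\xA4` in the transcript suggests. Relabelling those bytes as ISO-8859-1 and then transcoding to UTF-8 encodes each byte of `ä` separately, whereas bytes that really are ISO-8859-1 convert cleanly.

```python
# Illustration only: mirrors the irb session with Python bytes and str.
original = "Norrlandsvägen"

# Assumption: the input arrives as UTF-8, so 'ä' is the two bytes C3 A4.
utf8_bytes = original.encode("utf-8")

# Equivalent of force_encoding('iso-8859-1') followed by encode('utf-8'):
# every byte is read as one Latin-1 character, then re-encoded as UTF-8.
mislabelled = utf8_bytes.decode("iso-8859-1")
double_encoded = mislabelled.encode("utf-8")

# Bytes that really are ISO-8859-1 ('ä' is the single byte E4) convert cleanly.
latin1_bytes = original.encode("iso-8859-1")
converted = latin1_bytes.decode("iso-8859-1").encode("utf-8")

print(repr(mislabelled))        # the two bytes of 'ä' shown as two characters
print(double_encoded)           # 'ä' now occupies four bytes
print(converted == utf8_bytes)  # the genuine Latin-1 input round-trips
```

Under that assumption, relabelling a string only helps when its bytes are actually in the claimed source encoding.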

Several ways to fix garbled Chinese text in cocos2d-x

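戏子无情 submitted on 2019-11-27 15:18:41
1. Save the source files as UTF-8. Because of compiler issues, however, this approach causes many unpredictable problems.
2. Store all strings, UTF-8 encoded, in a single file and read them from code when needed. This approach also supports multi-language versions well.
3. Convert each string to UTF-8 at the point where it is used.
I ended up using the third method. Leaving the first aside, the second is fairly troublesome to implement, while the third is much more convenient. On Windows we usually use the MultiByteToWideChar API for all kinds of encoding conversions. That API only exists on Windows, though, so using it with cocos2d-x feels out of place; Android has no such API. Fortunately cocos2d-x has thought of this and ships with an iconv library. You only need to add libiconv.lib to the project's additional dependencies and include the header iconv/iconv.h. I wrapped a few encoding-conversion functions around this library; the code is as follows:
    #include "Tool.h"
    int code_convert(const char *from_charset, const char *to_charset, const char *inbuf, size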

How to convert character encoding from CP932 to UTF-8 in nodejs javascript, using the nodejs-iconv module (or other solution)

喜夏-厌秋 submitted on 2019-11-27 15:16:59
Question: I'm attempting to convert a string from CP932 (aka Windows-31J) to UTF-8 in JavaScript. Basically, I'm crawling a site that ignores the utf-8 request in the request header and returns CP932-encoded text (even though the HTML meta tag indicates that the page is shift_jis). Anyway, I have the entire page stored in a string variable called "html". From there I'm attempting to convert it to UTF-8 using this code:
    var Iconv = require('iconv').Iconv;
    var conv = new Iconv('CP932', 'UTF-8//TRANSLIT/

iconv any encoding to UTF-8

安稳与你 submitted on 2019-11-27 14:46:55
Question: I am trying to point iconv at a directory so that all files in it are converted to UTF-8 regardless of their current encoding. I am using this script, but you have to specify what encoding you are converting FROM. How can I make it autodetect the current encoding?
dir_iconv.sh
    #!/bin/bash
    ICONVBIN='/usr/bin/iconv' # path to iconv binary
    if [ $# -lt 3 ]
    then
      echo "$0 dir from_charset to_charset"
      exit
    fi
    for f in $1/*
    do
      if test -f $f
      then
        echo -e "\nConverting $f"
        /bin/mv $f $f.old
        $ICONVBIN -f $2 -t $3 $f.old >

Emoticons in Twitter Sentiment Analysis in R

吃可爱长大的小学妹 submitted on 2019-11-27 13:38:21
How do I handle or get rid of emoticons so that I can sort tweets for sentiment analysis? I am getting:
    Error in sort.list(y) : invalid input
Thanks. This is how the emoticons look when they come from Twitter into R:
    \xed��\xed�\u0083\xed��\xed�� \xed��\xed�\u008d\xed��\xed�\u0089
This should get rid of the emoticons, using iconv as suggested by ndoogan. Some reproducible data:
    require(twitteR)
    # note that I had to register my twitter credentials first
    # here's the method: http://stackoverflow.com/q/9916283/1036500
    s <- searchTwitter('#emoticons', cainfo="cacert.pem")
    # convert to data frame
    df <-
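The general idea behind this answer is to drop characters that have no ASCII equivalent before sorting. It can be sketched in Python as a rough analogue. This is not the R code from the answer, which is cut off above; the helper name `strip_non_ascii` and the sample tweets are hypothetical.

```python
def strip_non_ascii(text: str) -> str:
    """Drop every character that cannot be encoded as ASCII."""
    return text.encode("ascii", errors="ignore").decode("ascii")


# Hypothetical sample data standing in for tweet text.
tweets = ["great day \U0001F600", "so sad \U0001F622 today", "plain text"]

cleaned = [strip_non_ascii(t) for t in tweets]

# With the emoticons removed, the strings are plain ASCII and sort normally.
print(sorted(cleaned))
```

Note that this also discards accented letters and any other non-ASCII text, which may or may not be acceptable for a given sentiment analysis.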

Force encode from US-ASCII to UTF-8 (iconv)

岁酱吖の submitted on 2019-11-27 10:33:42
I'm trying to transcode a bunch of files from US-ASCII to UTF-8. For that, I'm using iconv:
    iconv -f US-ASCII -t UTF-8 file.php > file-utf8.php
The thing is, my original files are US-ASCII encoded, so the conversion does not appear to happen. Apparently this is because ASCII is a subset of UTF-8: http://www.linuxquestions.org/questions/linux-software-2/iconv-us-ascii-to-utf-8-or-iso-8859-15-a-705054/
And quoting:
    There's no need for the textfile to appear otherwise until non-ascii characters are introduced
True. If I introduce a non-ASCII character in the file and save it, let's say with Eclipse,
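The subset relationship mentioned here can be checked directly. The following Python sketch is an illustration rather than part of the original question; the sample strings and the helper `is_pure_ascii` are hypothetical. It shows that ASCII-only text has exactly the same bytes in US-ASCII and UTF-8, so a converted file is byte-for-byte identical to the input, and that the two only diverge once a non-ASCII character appears.

```python
def is_pure_ascii(data: bytes) -> bool:
    """True if every byte is below 0x80, i.e. valid US-ASCII."""
    return all(b < 0x80 for b in data)


# Hypothetical file content containing only ASCII characters.
text = "<?php echo 'hello'; ?>"
ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")

# Identical bytes: transcoding US-ASCII to UTF-8 changes nothing on disk.
same = ascii_bytes == utf8_bytes

# A single non-ASCII character makes the UTF-8 form distinguishable.
accented_utf8 = "café".encode("utf-8")

print(same)
print(is_pure_ascii(utf8_bytes), is_pure_ascii(accented_utf8))
```

A tool that guesses the encoding from the bytes therefore has no way to tell the two apart for an ASCII-only file, and reporting it as US-ASCII is consistent with it also being valid UTF-8.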