Question
R doesn't display Arabic text correctly. I get garbled output whenever I use Arabic.
The problem is that I want to create a wordcloud with Arabic text and I need to solve this problem first.
R version: R 2.15.2 GUI 1.53 Leopard build 64-bit (6335)
Here is more information:
> options("encoding")
$encoding
[1] "native.enc"
> Encoding("الله")
[1] "unknown"
Output of sessionInfo():
> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
locale:
[1] C/C/C/C/de_DE/C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] tools_2.15.2
>
Some tinkering:
> x = "مرحبا"
> Encoding(x) = "UTF-8"
> x
[1] "<U+0645><U+0631><U+062D><U+0628><U+0627>"
> Encoding(iconv(x))
[1] "unknown"
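The `<U+0645>...` output above suggests that R holds the right code points but cannot render them in the current (C) locale. A minimal diagnostic sketch, assuming the string is written with `\u` escapes so that the encoding of the source file does not matter (the variable names are illustrative only):

```r
# Build the word with Unicode escapes instead of typing the Arabic letters,
# so the result does not depend on how the console or file is encoded.
x <- "\u0645\u0631\u062d\u0628\u0627"

# TRUE when the current locale is a UTF-8 locale, FALSE otherwise.
is_utf8_locale <- isTRUE(l10n_info()[["UTF-8"]])
print(is_utf8_locale)

# The encoding mark R has attached to the string.
print(Encoding(x))

# The underlying Unicode code points, independent of how x is displayed.
code_points <- utf8ToInt(x)
print(code_points)
```

If the code points are correct but printing `x` still shows `<U+....>` escapes, the string itself is intact and the locale is the likely culprit.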
More info:
> Sys.getlocale()
[1] "C/C/C/C/de_DE/C"
> Sys.setlocale("LC_ALL", "en_US.utf8")
[1] ""
Warning message:
In Sys.setlocale("LC_ALL", "en_US.utf8") :
OS reports request to set locale to "en_US.utf8" cannot be honored
>
This solved the problem:
Sys.setlocale("LC_ALL", "en_US.UTF-8")
Answer 1:
This works:
Sys.setlocale("LC_ALL", "en_US.UTF-8")
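The failed and the successful calls in the question differ only in how the locale name is spelled ("en_US.utf8" versus "en_US.UTF-8"), and which spelling is accepted depends on the operating system. A rough sketch of a helper that tries several spellings and keeps the first one accepted; `set_utf8_locale` and its candidate list are hypothetical, not part of R, and it sets only `LC_CTYPE` rather than `LC_ALL` as an assumption about what is needed for character display:

```r
# Hypothetical helper: try several spellings of a UTF-8 locale name and
# keep the first one that the operating system accepts.
set_utf8_locale <- function(candidates = c("en_US.UTF-8", "en_US.utf8", "C.UTF-8")) {
  for (name in candidates) {
    # Sys.setlocale() returns "" (and warns) when the request cannot be honored.
    result <- suppressWarnings(Sys.setlocale("LC_CTYPE", name))
    if (nzchar(result)) {
      return(result)
    }
  }
  warning("no candidate locale was accepted; locale left unchanged")
  invisible("")
}

# Usage: returns the accepted locale name, or "" if every candidate was refused.
accepted <- suppressWarnings(set_utf8_locale())
print(accepted)
```

On systems where none of the candidates exist (for example Windows, which uses names such as French_France.1252), the helper leaves the locale as it was.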
Answer 2:
I just want to point out that I don't have this problem: Arabic characters are displayed correctly without any change to the locale, even though I am not in a UTF-8 locale. I'm not sure what to make of this, so if anyone can explain it, please enlighten us.
I'm using RStudio 0.98.1091, and my sessionInfo() output is as follows:
> sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: i386-w64-mingw32/i386 (32-bit)
locale:
[1] LC_COLLATE=French_France.1252 LC_CTYPE=French_France.1252 LC_MONETARY=French_France.1252
[4] LC_NUMERIC=C LC_TIME=French_France.1252
Source: https://stackoverflow.com/questions/18677571/assigning-arabic-text-to-r-variables