Question
I was working on a toy project and tried using some Unicode variable names to match the notation of a paper I was attempting to implement.
The following code works fine on R 3.4.3 on Windows (RStudio version 1.1.456) and R 3.5.1 on OSX:
> µ <- function(ß, n) ß * n
> µ(2, 3)
[1] 6
The following code, however, gives an error on the Windows machine, with α typed as ALT+224:
> α <- 2
Error: unexpected input in "\"
The file was saved as UTF-8, so this is surprising to me.
make.names is consistent with the results above:
> make.names('µ')
[1] "µ"
> make.names('α')
[1] "a"
What is the rule for non-ASCII letters? Why are mu (µ) and scharfes S (ß) accepted when alpha (α) is not?
Edit: Output of sessionInfo()
> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_3.4.3 tools_3.4.3 yaml_2.2.0
Edit2: It seems like Sys.setlocale should be the answer, but here is what happens when I try this:
> Sys.setlocale("LC_ALL", 'en_US.UTF-8')
[1] ""
Warning message:
In Sys.setlocale("LC_ALL", "en_US.UTF-8") :
OS reports request to set locale to "en_US.UTF-8" cannot be honored
Answer 1:
Working with Ben Bolker, we determined that the issue was the character encoding of the current session: Windows-1252, which covers only a small set of non-ASCII characters. It includes µ and ß but not α, which is why the first two are accepted as names and the third is not. This is despite the fact that RStudio saved the file as UTF-8.
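To see why µ and ß behave differently from α, one can check which characters Windows-1252 is able to encode. The sketch below is an illustration rather than part of the original diagnosis. It assumes the platform's iconv knows the encoding under the name "CP1252", and the variable names are made up for the example.

```r
# Three test characters, written with \u escapes so this script is pure ASCII.
chars <- c(mu = "\u00b5", eszett = "\u00df", alpha = "\u03b1")

# iconv() returns NA for input it cannot convert to the target encoding.
converted <- iconv(chars, from = "UTF-8", to = "CP1252")

# TRUE means the character has a code point in Windows-1252.
representable <- !is.na(converted)
names(representable) <- names(chars)
print(representable)
```

Characters that cannot be converted come back as NA, so on most platforms this should report µ and ß as representable and α as not.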
Changing the locale of a running R session to a UTF-8 one does not seem to be possible, at least on Windows, where I get a warning (see the question).
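Before relying on non-ASCII names, it may help to check which encoding the session is actually using, since that, rather than the encoding of the file, appears to decide what the parser accepts. A minimal sketch using base R only (the variable names ctype and info are arbitrary):

```r
# The LC_CTYPE category controls how R classifies and encodes characters.
ctype <- Sys.getlocale("LC_CTYPE")
cat("LC_CTYPE:", ctype, "\n")

# l10n_info() summarises the session's character set.
info <- l10n_info()
cat("UTF-8 session:", info[["UTF-8"]], "\n")
cat("Latin-1 session:", info[["Latin-1"]], "\n")

# The codepage element is only present on Windows.
if (!is.null(info$codepage)) {
  cat("Windows code page:", info$codepage, "\n")
}
```

On the setup from the question one would expect LC_CTYPE to read English_United States.1252 and the UTF-8 flag to be FALSE.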
I have a partial solution. If you are given a file like this and want to run it with interactive access to the results, the following will mostly work (variable names will be translated to Windows-1252):
> source('utf-8-file.r', encoding='UTF-8')
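A self-contained way to try the source() approach is sketched below. It writes a small UTF-8 script to a temporary file and sources it into a separate environment. The file contents and the names script, env, and can_represent are hypothetical, and the guard assumes that converting to the native encoding with iconv is a fair proxy for whether the session can hold the name.

```r
# Write a one-line script to a temporary file as raw UTF-8 bytes.
script <- tempfile(fileext = ".R")
con <- file(script, open = "wb")
writeLines(enc2utf8("\u00b5 <- function(x, n) x * n"), con, useBytes = TRUE)
close(con)

# Can the session's native encoding represent the micro sign at all?
can_represent <- !is.na(iconv("\u00b5", from = "UTF-8", to = ""))

env <- new.env()
if (can_represent) {
  # Declare the file's encoding so R re-encodes it while reading.
  source(script, local = env, encoding = "UTF-8")
  print(ls(env))
} else {
  message("The session encoding cannot represent this name; skipping source().")
}
unlink(script)
```

Sourcing into a separate environment keeps the translated names out of the global workspace, which makes it easier to see what R actually created.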
I would be very excited to see a better solution, one which allows editing and running the file and entering such snippets into the console of RStudio on Windows.
Source: https://stackoverflow.com/questions/52020256/unicode-variable-names-in-r