Question
UPDATE (April 2018):
The problem persists across different settings and computers.
I believe it is related to all Unicode (UTF-8) characters. See:
https://www.joelonsoftware.com/2003/10/08/the-absolute-minimum-every-software-developer-absolutely-positively-must-know-about-unicode-and-character-sets-no-excuses/
PROBLEM:
My Rmd/R file is saved with UTF-8 encoding. Other sessionInfo() details:
Platform: x86_64-w64-mingw32/x64 (64-bit)
LC_CTYPE=English_Canada.1252
other attached packages:
[1] knitr_1.17
Here is a simple data frame that I need to print as a table in an HTML document, e.g. with kable(dt) or in any other way.
dt <- data.frame(
name=c("Борис Немцов","Martin Luter King"),
year=c("2015","1968")
)
Neither of the following works:
Way 1
If I leave the locale as is (i.e. "English_Canada.1252"), I get this:
> dt;
name year
1 <U+0411><U+043E><U+0440><U+0438><U+0441> <U+041D><U+0435><U+043C><U+0446><U+043E><U+0432> 2015
2 Martin Luter King 1968
> kable(dt)
|name |year |
|:-----------------------------------------------------------------------------------------|:----|
|<U+0411><U+043E><U+0440><U+0438><U+0441> <U+041D><U+0435><U+043C><U+0446><U+043E><U+0432> |2015 |
|Martin Luter King |1968 |
Note that <U+....> escapes are printed instead of the characters.
Using dt$name <- enc2utf8(as.character(dt$name)) did not help.
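To see why enc2utf8() changes nothing here, it can help to check how R has marked the string. The following is a minimal diagnostic sketch, not a fix: the variable names are hypothetical, and the string is written with \u escapes so that the snippet does not depend on how the file itself is saved. If the string is already marked as UTF-8, enc2utf8() has nothing left to convert, which suggests the data are fine and the problem lies in how they are rendered in the native locale.

```r
# "Борис Немцов" written with Unicode escapes; R marks such literals as UTF-8.
x <- "\u0411\u043e\u0440\u0438\u0441 \u041d\u0435\u043c\u0446\u043e\u0432"

declared  <- Encoding(x)                 # how R has marked the string
valid     <- validUTF8(x)                # TRUE if the bytes are valid UTF-8
n_chars   <- nchar(x, type = "chars")    # number of characters
n_bytes   <- nchar(x, type = "bytes")    # number of bytes (2 per Cyrillic letter)
first_cp  <- utf8ToInt(x)[1]             # code point of the first character
unchanged <- identical(enc2utf8(x), x)   # enc2utf8() is a no-op on UTF-8 input

# Native encoding of the current session (includes the code page on Windows).
native <- l10n_info()

print(list(declared = declared, valid = valid, chars = n_chars,
           bytes = n_bytes, first_cp = first_cp, unchanged = unchanged))
```

The same checks can be run on as.character(dt$name) to confirm whether the column in the data frame is marked the same way.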
Way 2
If I change the locale with Sys.setlocale("LC_CTYPE", "russian") (i.e. "Russian_Russia.1251"),
then I get this:
> dt;
name year
1 Áîðèñ Íåìöîâ 2015
2 Martin Luter King 1968
> kable(dt)
|name |year |
|:-----------------|:----|
|Áîðèñ Íåìöîâ |2015 |
|Martin Luter King |1968 |
Note that the characters have become gibberish.
Using print(dt, encoding="windows-1251") or print(dt, encoding="UTF-8") had no effect.
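The gibberish in Way 2 looks like the Windows-1251 bytes of the name being displayed as if they were Windows-1252. The sketch below reproduces that mapping for the first word only. It assumes the local iconv supports the encoding names "CP1251" and "CP1252" (this varies by platform), and the variable names are hypothetical.

```r
# "Борис" as a UTF-8 string, written with escapes.
x <- "\u0411\u043e\u0440\u0438\u0441"

# Encode to Windows-1251: one byte per Cyrillic letter.
bytes1251 <- iconv(x, from = "UTF-8", to = "CP1251", toRaw = TRUE)[[1]]

# Reinterpret the same bytes as Windows-1252 and convert back to UTF-8.
misread <- iconv(rawToChar(bytes1251), from = "CP1252", to = "UTF-8")

print(bytes1251)  # the raw 1251 bytes
print(misread)    # the 1252 misreading of those bytes
```

If misread matches the start of the garbled output above, the strings were probably converted to code page 1251 correctly and then decoded as 1252 somewhere in the display path, rather than being corrupted in the data frame.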
Any advice?
The closest material I could find on this problem is at the following links, but none of it helped: http://blog.rolffredheim.com/2013/01/r-and-foreign-characters.html, https://tomizonor.wordpress.com/2013/04/17/file-utf8-windows, https://www.smashingmagazine.com/2012/06/all-about-unicode-utf8-character-sets
I also tried saving my file with 1251 encoding (instead of the current UTF-8 encoding) and tried some other character conversion/processing packages. Nothing has helped so far.
UPDATE:
I opened a related question: How to change Sys.setlocale, when you get Error "request to set locale … cannot be honored"
Answer 1:
The only solution that worked was the one suggested by Yihui Xie (the knitr developer):
create a file named .Rprofile containing the single line Sys.setlocale("LC_CTYPE", "russian") and place it in your home or working directory.
Note, however, that this works only with kable(), i.e. with the help of the knitr package.
If you print with print(dt$name[1]), you still get Áîðèñ Íåìöîâ.
If you use kable(dt$name[1]) instead, you get what you need: Борис Немцов!
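The .Rprofile described above could look like the sketch below. Only the Sys.setlocale() call comes from the answer. The Windows check, the tryCatch() wrapper and the variable name locale_set are assumptions added here so that the file does no harm on systems where the "russian" locale name is not recognized.

```r
# .Rprofile (in the home directory or the project's working directory)
# The Sys.setlocale() call is the one line from the answer above.
# The OS check and tryCatch() are assumptions added for this sketch.
locale_set <- if (.Platform$OS.type == "windows") {
  tryCatch(
    Sys.setlocale("LC_CTYPE", "russian"),
    warning = function(w) ""   # the locale request could not be honored
  )
} else {
  NA_character_                # "russian" is a Windows locale name; skip elsewhere
}
```

R reads this file at startup, so a new session is needed before knitting for the change to take effect.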
Source: https://stackoverflow.com/questions/48307007/printing-utf-8-characters-in-r-rmd-knitr-bookdown