Question
UPDATE (April 2018):
The problem persists across different settings and computers.
I believe it affects Unicode (UTF-8) characters in general.
https://www.joelonsoftware.com/2003/10/08/the-absolute-minimum-every-software-developer-absolutely-positively-must-know-about-unicode-and-character-sets-no-excuses/
PROBLEM:
My Rmd/R file is saved with UTF-8 encoding. Other sessionInfo() details:
Platform: x86_64-w64-mingw32/x64 (64-bit)
LC_CTYPE=English_Canada.1252
other attached packages:
[1] knitr_1.17
Here is a simple data frame that I need to print as a table in an HTML document, e.g. with kable(dt) or in any other way.
dt <- data.frame(
name=c("Борис Немцов","Martin Luter King"),
year=c("2015","1968")
)
Neither of the following works:
Way 1
If I leave the locale unchanged (i.e. "English_Canada.1252"), then I get this:
> dt;
name year
1 <U+0411><U+043E><U+0440><U+0438><U+0441> <U+041D><U+0435><U+043C><U+0446><U+043E><U+0432> 2015
2 Martin Luter King 1968
> kable(dt)
|name |year |
|:-----------------------------------------------------------------------------------------|:----|
|<U+0411><U+043E><U+0440><U+0438><U+0441> <U+041D><U+0435><U+043C><U+0446><U+043E><U+0432> |2015 |
|Martin Luter King |1968 |
Note that <U+....> escape codes are printed instead of the characters.
Using dt$name <- enc2utf8(as.character(dt$name)) did not help.
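One way to check whether the data or only the display is at fault is sketched below. This is a hedged diagnostic, not part of the original attempts: the variable `name` is introduced here for illustration, and the string is built from `\u` escapes so that the result does not depend on the encoding of the source file.

```r
# Build the Cyrillic name from Unicode escapes (hypothetical helper variable).
name <- "\u0411\u043e\u0440\u0438\u0441 \u041d\u0435\u043c\u0446\u043e\u0432"

# How R has marked the string internally.
Encoding(name)               # "UTF-8" for a string built from \u escapes

# Character count versus byte count of the stored string.
nchar(name, type = "chars")
nchar(name, type = "bytes")

# enc2utf8() leaves a string that is already UTF-8 unchanged.
identical(enc2utf8(name), name)

# Whether the current session locale is itself UTF-8.
l10n_info()[["UTF-8"]]
```

If the string is already marked as UTF-8 while the session locale is not UTF-8, that would be consistent with enc2utf8() having no visible effect: the data are intact and only their rendering in the native code page fails.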
Way 2
If I change the locale with Sys.setlocale("LC_CTYPE", "russian") (i.e. "Russian_Russia.1251"), then I get this:
> dt;
name year
1 Áîðèñ Íåìöîâ 2015
2 Martin Luter King 1968
> kable(dt)
|name |year |
|:-----------------|:----|
|Áîðèñ Íåìöîâ |2015 |
|Martin Luter King |1968 |
Note that the characters have become gibberish.
Using print(dt, encoding="windows-1251") and print(dt, encoding="UTF-8") had no effect.
Any advice?
The closest material I could find on this problem is at the following links, but none of them helped: http://blog.rolffredheim.com/2013/01/r-and-foreign-characters.html, https://tomizonor.wordpress.com/2013/04/17/file-utf8-windows, https://www.smashingmagazine.com/2012/06/all-about-unicode-utf8-character-sets
I also tried saving my file with 1251 encoding (instead of the current UTF-8 encoding) and tried some other character conversion/processing packages. Nothing has helped so far.
UPDATE:
I opened a related question: How to change Sys.setlocale, when you get Error "request to set locale … cannot be honored"
Answer 1:
The only solution that worked was the one suggested by Yihui Xie (the knitr developer): create a file named .Rprofile that contains the single line Sys.setlocale("LC_CTYPE", "russian") and place it in your home or working directory.
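As a minimal sketch, the .Rprofile described above could look like this. The comments are added here for illustration and are not part of the original suggestion; the locale name "russian" is the one used in the question and is assumed to be accepted by the Windows installation in use.

```r
# .Rprofile: run by R at startup when found in the working or home directory.
# The single line from the answer: switch the character-type locale to Russian.
# Assumption: "russian" is a locale name this Windows installation accepts.
Sys.setlocale("LC_CTYPE", "russian")
```

R must be restarted (or the document re-knit in a fresh session) for the file to take effect.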
However, please note that this works only with kable(), i.e. with the help of the knitr package.
If you print with print(dt$name[1]), you still get Áîðèñ Íåìöîâ.
However, if you use kable(dt$name[1]), you get what you need: Борис Немцов!
Source: https://stackoverflow.com/questions/48307007/printing-utf-8-characters-in-r-rmd-knitr-bookdown