Using “cat” to write non-English characters into a .html file (in R)

Posted by 半城伤御伤魂 on 2019-12-04 04:05:50

Question


Here is the code showing the problem:

myPath = getwd()
cat("abcd", append = T, file =paste(myPath,"temp1.html", sep = "\\")) # This is fine
cat("<BR/><BR/><BR/>", append = T, file =paste(myPath,"temp1.html", sep = "\\")) # This is fine
cat("שלום", append = F, file =paste(myPath,"temp1.html", sep = "\\")) # This text gets garbled when the html is opened using google chrome on windows 7.
cat("שלום", append = F, file =paste(myPath,"temp1.txt", sep = "\\")) # but if I open this file in a text editor - the text looks fine

# The text in the HTML file looks as if I had run this in R:
(x <- iconv("שלום", from = "CP1252", to = "UTF8") )
# But if I try to write that into the file, nothing gets written:
cat(x, append = T, file =paste(myPath,"temp1.html", sep = "\\")) # empty

Edit: I've also tried the following encodings, without success:

ff <-file(paste(myPath,"temp1.html", sep = "\\"), encoding="CP1252")
cat("שלום", append = F, file =ff)
ff<-file(paste(myPath,"temp1.html", sep = "\\"), encoding="utf-8")
cat("שלום", append = F, file =ff)
ff<-file(paste(myPath,"temp1.html", sep = "\\"), encoding="ANSI_X3.4-1986")
cat("שלום", append = F, file =ff)
ff<-file(paste(myPath,"temp1.html", sep = "\\"), encoding="iso8859-8")
cat("שלום", append = F, file =ff)

Any suggestions? Thanks.


Answer 1:


Your code is a bit redundant. Is temp1.txt on line 5 a typo for temp1.html? In any case, you should probably set the charset in a <meta> tag.

Take this as an example:

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<%
cat("abcd")
cat("<BR/><BR/><BR/>")
cat("שלום")
cat("שלום")
(x <- iconv("שלום", from = "CP1252", to = "UTF8") )
cat(x)
-%>
</body>
</html>

This is brew code, so if you run it through brew you will get the correct output. Long story short, the key is the charset declaration.




Answer 2:


The problem isn’t with R (R is correctly producing UTF-8 encoded output) … it’s just that your web browser assumes the wrong encoding in the absence of an explicitly specified encoding. Just use the following snippet (from inside R) instead:

<html>
    <head>
        <meta http-equiv="content-type" content="text/html; charset=utf-8">
    </head>
    <body>
        שלום
    </body>
</html>

This specifies the correct encoding (UTF-8), and hence causes the browser to treat the text that follows correctly.
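
As a rough illustration, the page above could be produced from R along the following lines. This is only a sketch, not the answerer's code: it swaps cat() for writeLines() with useBytes = TRUE so that the UTF-8 bytes are written unchanged whatever the session locale is. The variable names and the tempdir() location are made up for the example.

```r
# Hebrew "shalom" written with \u escapes, so the script does not depend on
# the encoding of the source file itself.
shalom <- "\u05e9\u05dc\u05d5\u05dd"

html <- c(
  "<html>",
  "<head>",
  '<meta http-equiv="content-type" content="text/html; charset=utf-8">',
  "</head>",
  "<body>",
  shalom,
  "</body>",
  "</html>"
)

# Hypothetical output location; the question used temp1.html in getwd().
out_file <- file.path(tempdir(), "temp1.html")

# enc2utf8() makes sure every element is UTF-8; useBytes = TRUE tells
# writeLines() to write those bytes as they are, without converting them
# to the native encoding first.
writeLines(enc2utf8(html), out_file, useBytes = TRUE)
```

Opening the resulting file in a browser should then show the Hebrew text, because the bytes on disk and the charset declared in the <meta> tag agree.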




Answer 3:


Try it this way:

cat("abcd", file = (con <- file("temp1.html", "w", encoding="UTF-8"))); close(con)
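
To check what such a call actually puts on disk, one option is to read the raw bytes back. The sketch below repeats the same pattern with the connection handling split out, then inspects the result; the file name is hypothetical and validUTF8() needs R 3.3.0 or later.

```r
out_file <- file.path(tempdir(), "temp1_check.html")  # hypothetical path

# Same pattern as the one-liner above, split up so that the connection is
# closed even if cat() fails.
con <- file(out_file, open = "w", encoding = "UTF-8")
tryCatch(cat("abcd", file = con), finally = close(con))

# Read the raw bytes back, bypassing any re-encoding on input.
bytes <- readBin(out_file, what = "raw", n = file.size(out_file))
text  <- rawToChar(bytes)

print(bytes)            # byte values actually stored in the file
print(validUTF8(text))  # TRUE when the bytes form valid UTF-8
```

The same check can be run on a file containing non-ASCII text to see whether the bytes match the encoding announced in the HTML header.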


Source: https://stackoverflow.com/questions/7483742/using-cat-to-write-non-english-characters-into-a-html-file-in-r
