Convert a file encoding using R? (ANSI to UTF-8)

终归单人心 2020-12-16 18:52

I wish to convert an HTML file encoded in ANSI to UTF-8, using R.

Is there a tool, or a combination of tools, that can make this work?

Thanks.

2 Answers
  • 2020-12-16 19:19

    You can use iconv:

    writeLines(iconv(readLines("tmp.html"), from = "ANSI_X3.4-1986", to = "UTF8"), "tmp2.html")
    

    tmp2.html should then be encoded in UTF-8.


    Edit by Henrik in June 2015:
    A working solution for Windows distilled from the comments is as follows:

    writeLines(iconv(readLines("tmp.html"), from = "ANSI_X3.4-1986", to = "UTF8"), 
               file("tmp2.html", encoding="UTF-8"))
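
    One caveat: ANSI_X3.4-1986 is a name for US-ASCII, whereas a file saved as "ANSI" on a Western European Windows system is usually Windows-1252. A minimal sketch under that assumption follows. The helper name convert_to_utf8 and the CP1252 default are illustrative, not part of the original answer, and the from argument should be changed if the file uses another code page. It also stops if iconv returns NA for a line instead of silently writing "NA" into the output.

```r
# Hypothetical helper (not from the answer): convert a text file to UTF-8.
# Assumes the input is Windows-1252; pass another `from` if that is wrong.
convert_to_utf8 <- function(infile, outfile, from = "CP1252") {
  # Read the lines as stored; warn = FALSE silences the
  # "incomplete final line" warning for files without a trailing newline.
  lines <- readLines(infile, warn = FALSE)

  # iconv() returns NA for any line it cannot convert.
  converted <- iconv(lines, from = from, to = "UTF-8")
  if (anyNA(converted)) {
    stop("Some lines could not be converted from ", from)
  }

  # useBytes = TRUE writes the UTF-8 bytes unchanged instead of
  # re-encoding them to the session's native encoding.
  con <- file(outfile, open = "w")
  on.exit(close(con))
  writeLines(converted, con, useBytes = TRUE)
  invisible(outfile)
}
```

    Usage would be convert_to_utf8("tmp.html", "tmp2.html").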
    
  • 2020-12-16 19:35

    I had some problems with the solutions proposed above, especially with the TAB character. This alternative never disappointed me. Unfortunately it only works on UNIX-like systems.

    system('iconv -f CP1252 -t UTF-8 < tmp.html > tmp2.html')
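
    For systems without a command-line iconv, one possible alternative is to stay in R and convert the file as raw bytes, so that TAB characters and line endings pass through untouched. This is only a sketch: the helper name convert_bytes_to_utf8 is hypothetical, and CP1252 is assumed as the source encoding to match the command above.

```r
# Hypothetical helper: convert a whole file to UTF-8 at the byte level.
# Assumes the input is CP1252, as in the iconv command above.
convert_bytes_to_utf8 <- function(infile, outfile, from = "CP1252") {
  # Read the entire file as raw bytes, so no line splitting takes place.
  raw_in <- readBin(infile, what = "raw", n = file.size(infile))

  # iconv() accepts a list of raw vectors; with toRaw = TRUE it returns
  # a list of raw vectors, with NULL for any element it could not convert.
  converted <- iconv(list(raw_in), from = from, to = "UTF-8", toRaw = TRUE)[[1]]
  if (is.null(converted)) {
    stop("The file could not be converted from ", from)
  }

  # Write the converted bytes back out without any further translation.
  writeBin(converted, outfile)
  invisible(outfile)
}
```

    Usage would be convert_bytes_to_utf8("tmp.html", "tmp2.html").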
    