Convert non-ASCII chars from ASCII-8BIT to UTF-8

鱼传尺愫 2021-02-01 12:46

I'm pulling text from remote sites and trying to load it into a Ruby 1.9/Rails 3 app that uses UTF-8 by default.

Here is an example of some offending text:



        
4 answers
  •  情话喂你
    2021-02-01 13:37

    I used to do this for a script that scraped Greek Windows-encoded pages, using open-uri, iconv and Hpricot:

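    require 'open-uri'
    require 'iconv'
    require 'hpricot'

    doc = open(DATA_URL)  # DATA_URL is the address of the page to scrape
    # doc.read returns the body unchanged; readlines.join("\n") would double every line break
    data = Hpricot(Iconv.conv('UTF-8', 'WINDOWS-1253', doc.read))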
    

    I believe that was Ruby 1.8.7; I'm not sure how things work in Ruby 1.9.
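
For Ruby 1.9 and later, the same conversion can probably be done with the built-in String encoding methods instead of Iconv. The following is only a minimal sketch: `to_utf8` is a hypothetical helper name, and the `source_encoding` argument is an assumption that has to match whatever the remote site actually sends. `force_encoding` only changes the label on the bytes, while `encode` rewrites the bytes themselves.

```ruby
# Hypothetical helper (not from the original answer): return a UTF-8 copy
# of a string that arrived tagged as ASCII-8BIT.
def to_utf8(raw, source_encoding = nil)
  text = raw.dup # leave the caller's string untouched
  if source_encoding
    # The bytes are in another encoding: label them correctly, then transcode.
    text.force_encoding(source_encoding)
        .encode("UTF-8", invalid: :replace, undef: :replace)
  else
    # Assume the bytes are already UTF-8 and only the label is wrong.
    text.force_encoding("UTF-8")
    return text if text.valid_encoding?
    # The assumption failed: fall back to replacing every non-ASCII byte.
    raw.encode("UTF-8", "binary", invalid: :replace, undef: :replace)
  end
end
```

If the remote page is already UTF-8, calling `to_utf8(body)` just relabels it. If it is in a legacy encoding such as Windows-1253, passing the name, as in `to_utf8(body, "Windows-1253")`, transcodes it much like the `Iconv.conv` call above.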
