Equivalent of Iconv.conv("UTF-8//IGNORE", ...) in Ruby 1.9.X?

情歌与酒 2021-02-04 09:47

I'm reading data from a remote source, and occasionally get some characters in another encoding. They're not important.

I'd like to get a "best guess" UTF-8 st…

6 answers
  •  小蘑菇 (original poster)
    2021-02-04 10:50

    I thought this was it:

    string.encode("UTF-8", :invalid => :replace, :undef => :replace, :replace => "?")

    This replaces every unknown (invalid or undefined) character with '?'.

    To drop the unknowns instead of marking them, use :replace => '':

    string.encode("UTF-8", :invalid => :replace, :undef => :replace, :replace => "")
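A minimal sketch of the two calls above on a concrete input. The sample string and the variable names are made up for illustration, and the bytes are tagged as ASCII-8BIT here (an assumption about how the remote data arrives) so that the conversion to UTF-8 has real work to do:

```ruby
# Hypothetical sample: "caf" followed by a stray Latin-1 byte (0xE9).
# Tagging it as ASCII-8BIT (binary) means the high byte has no defined
# mapping to UTF-8, so the :undef option applies.
raw = "caf\xE9 ok".dup.force_encoding("ASCII-8BIT")

# Mark every unconvertible byte with '?'.
replaced = raw.encode("UTF-8", :invalid => :replace, :undef => :replace, :replace => "?")

# Drop every unconvertible byte instead.
dropped = raw.encode("UTF-8", :invalid => :replace, :undef => :replace, :replace => "")

puts replaced.inspect
puts dropped.inspect
```

Both results are tagged UTF-8 and pass valid_encoding?.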

    Edit:

    I'm not sure this is reliable. I've gone into paranoid-mode, and have been using:

    string.encode("UTF-8", ...).force_encoding('UTF-8')

    The script seems to be running OK now, but I'm pretty sure I'd gotten errors with this earlier.

    Edit 2:

    Even with this, I continue to get intermittent errors. Not every time, mind you. Just sometimes.
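One possible explanation for the intermittent errors, offered as an assumption rather than something confirmed in this thread: when a string is already tagged as UTF-8, encode("UTF-8", ...) on Ruby 1.9 may return it unchanged, and force_encoding only relabels the bytes without checking them. The sketch below works around that. The helper name to_valid_utf8 is hypothetical; it uses String#scrub where available (Ruby 2.1 and later) and otherwise round-trips through UTF-16BE so that the transcoder actually runs.

```ruby
# Hypothetical helper, not part of the original post.
# Returns a copy of +raw+ that is tagged UTF-8 and contains no invalid
# byte sequences. Invalid bytes are replaced with +replacement+
# (empty string by default, i.e. they are dropped).
def to_valid_utf8(raw, replacement = "")
  # Work on a copy; force_encoding only changes the tag, not the bytes.
  str = raw.dup.force_encoding("UTF-8")

  if str.respond_to?(:scrub)
    # Ruby 2.1+: scrub replaces invalid byte sequences directly.
    str.scrub(replacement)
  else
    # Older Rubies: converting to a different encoding and back forces
    # the :invalid / :undef options to be applied.
    str.encode("UTF-16BE", :invalid => :replace, :undef => :replace, :replace => replacement)
       .encode("UTF-8")
  end
end

cleaned = to_valid_utf8("abc\xFFdef")
puts cleaned.inspect
puts cleaned.valid_encoding?
```

Passing "?" as the second argument marks the bad bytes instead of dropping them.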
