Equivalent of Iconv.conv("UTF-8//IGNORE",…) in Ruby 1.9.X?

情歌与酒 2021-02-04 09:47

I'm reading data from a remote source, and occasionally get some characters in another encoding. They're not important.

I'd like to get a "best guess" utf-8 st

6 Answers
  •  伪装坚强ぢ
    2021-02-04 10:40

    I have not had luck with the one-line uses of String#encode, à la string.encode("UTF-8", :invalid => :replace, :undef => :replace, :replace => "?"). They do not work reliably for me.
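
One frequently reported reason for this is that on MRI 1.9 and 2.0, encode may do nothing when the string is already tagged with the target encoding, so invalid bytes pass through untouched. A workaround often suggested for that case is a round trip through a different encoding. The following is only a minimal sketch of that idea, assuming the input is tagged as UTF-8; the sample string and variable names are made up for illustration.

```ruby
# Hypothetical input: mostly valid UTF-8 with one stray Latin-1 byte (\xE9).
raw = "caf\xE9 au lait".dup.force_encoding("UTF-8")

# Round trip through UTF-16BE so that a real conversion takes place,
# then convert back to UTF-8. Invalid bytes become "?".
cleaned = raw
  .encode("UTF-16BE", :invalid => :replace, :undef => :replace, :replace => "?")
  .encode("UTF-8")

puts cleaned.valid_encoding?
puts cleaned
```

The extra conversion costs time and memory on large inputs, which is one reason a dedicated method such as String#scrub can be preferable.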

    But I wrote a pure-Ruby "backfill" of String#scrub for MRI 1.9 or 2.0, or any other Ruby that does not offer String#scrub.

    https://github.com/jrochkind/scrub_rb

    It makes String#scrub available in rubies that don't have it; if loaded in MRI 2.1, it will do nothing and you'll still be using the built-in String#scrub, so it can allow you to easily write code that will work on any of these platforms.

    Its implementation is somewhat similar to some of the other char-by-char solutions proposed in other answers, but it does not use exceptions for flow control (don't do that), is tested, and provides an API compatible with MRI 2.1 String#scrub.
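
For reference, this is roughly how String#scrub is used on MRI 2.1 or later, where it is built in. On older rubies the gem linked above is meant to provide the same method; the require name shown in the comment is an assumption, so check the project's README. The sample string is hypothetical.

```ruby
# On Ruby < 2.1, load the backfill first (assumed require name):
# require 'scrub_rb'

# Hypothetical input: valid UTF-8 text with one stray Latin-1 byte (\xE9).
raw = "caf\xE9 au lait".dup.force_encoding("UTF-8")

# Drop invalid bytes entirely, similar in spirit to Iconv's //IGNORE.
ignored = raw.scrub("")

# Replace each invalid sequence with a marker of your choice.
replaced = raw.scrub("?")

# With no argument, invalid sequences become U+FFFD for UTF-8 strings.
default = raw.scrub

# A block receives the invalid bytes and returns their replacement.
hexed = raw.scrub { |bytes| "<" + bytes.unpack("H*")[0] + ">" }

puts ignored
puts replaced
puts hexed
```

Passing an empty string is the closest match to the //IGNORE behaviour asked about in the question; the other forms are useful when you want to see where data was lost.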
