I'm reading data from a remote source, and occasionally get some characters in another encoding. They're not important.
I'd like to get a "best guess" utf-8 string.
I have not had luck with the one-line uses of String#encode, such as string.encode("UTF-8", :invalid => :replace, :undef => :replace, :replace => "?"). They do not work reliably for me.
So I wrote a pure-Ruby "backfill" of String#scrub for MRI 1.9, 2.0, or any other Ruby that does not offer String#scrub:
https://github.com/jrochkind/scrub_rb
It makes String#scrub available in rubies that don't have it. If loaded in MRI 2.1, it does nothing and you'll still be using the built-in String#scrub, so you can easily write code that works on any of these platforms.
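As a rough sketch of how that might look in practice: the snippet below loads the backfill only when String#scrub is missing, then tags the incoming bytes as UTF-8 and scrubs them. The require name 'scrub_rb' is assumed from the gem's name, and best_guess_utf8 is a hypothetical helper name used only for illustration.

```ruby
# Assumption: the gem is required as 'scrub_rb'. On rubies that already
# ship String#scrub (MRI 2.1+), the require is skipped entirely.
require 'scrub_rb' unless String.method_defined?(:scrub)

# Hypothetical helper: treat the raw bytes as UTF-8 and replace any
# invalid byte sequences with the given replacement string.
def best_guess_utf8(raw, replacement = "?")
  raw.dup.force_encoding(Encoding::UTF_8).scrub(replacement)
end

bad = "abc\xFFdef"            # \xFF can never appear in valid UTF-8
puts bad.valid_encoding?      # => false
clean = best_guess_utf8(bad)
puts clean                    # => abc?def
puts clean.valid_encoding?    # => true
```

Calling scrub with no argument uses the Unicode replacement character (U+FFFD) for UTF-8 strings instead of "?".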
Its implementation is somewhat similar to some of the other char-by-char solutions proposed in other answers, but it does not use exceptions for flow control (don't do that), is tested, and provides an API compatible with MRI 2.1 String#scrub.
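To illustrate the general idea of a char-by-char approach without exceptions, here is a minimal sketch. It is not the gem's actual code: scrub_sketch is a hypothetical name, it assumes the input is tagged as UTF-8, and it may group runs of invalid bytes differently from the built-in String#scrub.

```ruby
# Illustrative only: walk the string one character at a time and ask each
# character whether it is validly encoded, rather than rescuing an
# encoding exception. Assumes str is tagged as UTF-8.
def scrub_sketch(str, replacement = "\uFFFD")
  result = "".dup  # mutable, UTF-8 encoded buffer
  str.each_char do |ch|
    # each_char yields invalid bytes as their own "characters",
    # for which valid_encoding? returns false.
    result << (ch.valid_encoding? ? ch : replacement)
  end
  result
end

puts scrub_sketch("abc\xFFdef", "?")   # => abc?def
```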