Equivalent of Iconv.conv("UTF-8//IGNORE", …) in Ruby 1.9.X?


情歌与酒 2021-02-04 09:47

I'm reading data from a remote source, and occasionally get some characters in another encoding. They're not important.

I'd like to get a "best guess" UTF-8 st…

6 answers

伪装坚强ぢ (OP)

2021-02-04 10:40

I have not had luck with the one-line uses of String#encode, à la string.encode("UTF-8", :invalid => :replace, :undef => :replace, :replace => "?"). They do not work reliably for me.

So I wrote a pure-Ruby "backfill" of String#scrub for MRI 1.9 or 2.0, or any other Ruby that does not offer String#scrub.
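For context, one commonly reported reason the one-liner can appear to do nothing (the answer itself does not confirm this) is that some Ruby versions treat a UTF-8 to UTF-8 encode as a no-op, so invalid bytes pass straight through. A frequently used workaround is to round-trip through a different encoding. The sketch below assumes that approach; the helper name best_guess_utf8 is hypothetical and not from the post.

```ruby
# Hypothetical helper (not from the post): replace bytes that are not valid
# UTF-8 with "?" by transcoding to UTF-16LE and back, so the conversion is
# never a same-encoding no-op.
def best_guess_utf8(raw)
  # Work on a copy tagged as UTF-8 so the caller's string is left untouched.
  text = raw.dup.force_encoding(Encoding::UTF_8)
  text.encode(Encoding::UTF_16LE,
              :invalid => :replace, :undef => :replace, :replace => "?")
      .encode(Encoding::UTF_8)
end

puts best_guess_utf8("abc\xFFdef")
```

The round trip costs an extra pass over the data, which is usually acceptable for occasional cleanup of remote input.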

https://github.com/jrochkind/scrub_rb

It makes String#scrub available in rubies that don't have it; if loaded in MRI 2.1, it will do nothing and you'll still be using the built-in String#scrub, so it can allow you to easily write code that will work on any of these platforms.
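A minimal sketch of how code written against String#scrub might look on any of these platforms. It assumes the gem can be loaded with require "scrub_rb" (check the project README), and the helper names are hypothetical, chosen only for illustration.

```ruby
# Assumption: the backfill gem is loadable as "scrub_rb". The require is
# skipped entirely on rubies that already ship String#scrub.
require "scrub_rb" unless String.method_defined?(:scrub)

# Hypothetical helpers showing the three call styles of String#scrub.

# No argument: each invalid byte sequence becomes U+FFFD.
def clean_default(raw)
  raw.scrub
end

# String argument: use a custom replacement.
def clean_with(raw, replacement)
  raw.scrub(replacement)
end

# Block form: the block receives the invalid bytes and returns the replacement.
def clean_hex(raw)
  raw.scrub { |bytes| "<#{bytes.unpack('H*').first}>" }
end

puts clean_with("abc\xFFdef", "?")
```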

Its implementation is somewhat similar to some of the other char-by-char solutions proposed in other answers, but it does not use exceptions for flow control (don't do that), is tested, and provides an API compatible with MRI 2.1 String#scrub.
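To illustrate the general idea of a char-by-char cleanup that avoids exceptions, here is a rough sketch. It is not the gem's actual code: the method name replace_invalid_chars is hypothetical, and it relies on String#each_char yielding invalid bytes as separate chunks whose valid_encoding? is false.

```ruby
# Hypothetical sketch, NOT the scrub_rb implementation: walk the string one
# character at a time and swap out anything that is not validly encoded.
# Validity is checked with a predicate instead of rescuing an exception.
def replace_invalid_chars(raw, replacement = "?")
  raw.each_char.map { |ch| ch.valid_encoding? ? ch : replacement }.join
end

puts replace_invalid_chars("abc\xFFdef")
```

Unlike the built-in String#scrub, this simple version replaces every invalid chunk individually and offers no block form.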
