I'm getting "invalid byte sequence in UTF-8" on page requests (permalinks). I have no idea why, and I can't reproduce it, but I do get a lot of exceptions like this:
I've just posted a new gem called UTF8Cleaner, which is heavily based on @phoet's and @pithyless's work. It includes a Railtie, so you can just drop it into your Gemfile and forget about those "invalid byte sequence" errors.
https://github.com/singlebrook/utf8-cleaner
If you are using Apache (and mod_rails), you can prevent these invalid URL requests from reaching your Rails application at all by following this answer:
https://stackoverflow.com/questions/13512727/how-can-i-configure-apache-to-respond-400-when-request-contains-an-invalid-byte/13527812#13527812
We created a Rails middleware that filters out all the strange encodings that cannot be handled within our app.
The problem we ran into is that some requests arrive in other encodings, for example Cp1252 / Windows-1252. When Ruby 1.9 tries to match those strings against UTF-8 regexps, it blows up.
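To make the failure concrete, here is a minimal sketch of what likely happens, assuming the request carried a single Windows-1252 byte that Ruby has tagged as UTF-8. The sample string and variable names are made up for illustration.

```ruby
# "\xE9" is "é" in Windows-1252, but on its own it is not a valid UTF-8 sequence.
raw = "caf\xE9".dup.force_encoding("UTF-8")

# The string is tagged UTF-8 but its bytes are not valid UTF-8.
broken = !raw.valid_encoding?

begin
  # On the Ruby 1.9 setup described above, regexp matching on such a
  # string raises ArgumentError ("invalid byte sequence in UTF-8").
  raw =~ /caf/
rescue ArgumentError => e
  warn e.message
end

# If the real source encoding is known, re-tagging the bytes and then
# transcoding gives a valid UTF-8 string.
fixed = raw.dup.force_encoding("Windows-1252").encode("UTF-8")
```

In practice the source encoding of a stray request is usually unknown, which is why rejecting or stripping the bad bytes tends to be more practical than transcoding.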
I tried various ways of dealing with this problem using Iconv, but solutions that worked on my Mac didn't work on the servers, so the simplest approach is probably the best...
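For readers who want a starting point, a "simplest approach" middleware could look roughly like the sketch below. This is not the code from this answer: the class name `RejectInvalidUtf8`, the list of checked keys, and the choice to answer with a 400 are all assumptions made for illustration.

```ruby
# Hypothetical Rack middleware: answers 400 when the path or query string
# does not decode to valid UTF-8, so the request never reaches the app.
class RejectInvalidUtf8
  # Rack env keys to check (an assumption; extend as needed).
  CHECKED_KEYS = %w[PATH_INFO QUERY_STRING REQUEST_URI].freeze

  def initialize(app)
    @app = app
  end

  def call(env)
    if CHECKED_KEYS.all? { |key| valid_utf8?(env[key].to_s) }
      @app.call(env)
    else
      [400, { "Content-Type" => "text/plain" }, ["Bad Request"]]
    end
  end

  private

  # Percent-decode on a binary copy, then check whether the resulting
  # bytes form valid UTF-8.
  def valid_utf8?(value)
    decoded = value.b.gsub(/%(\h\h)/) { $1.hex.chr }
    decoded.force_encoding("UTF-8").valid_encoding?
  end
end
```

In a Rails app this would probably be registered early in the stack, for example with `config.middleware.insert_before 0, RejectInvalidUtf8` in `config/application.rb`, so it runs before routing. Note that `String#b` requires Ruby 2.0 or later; on 1.9 you could use `value.dup.force_encoding("BINARY")` instead.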
Like @phoet, I also used a Rails middleware to solve similar encoding issues.
Tested on Ruby 1.9.3 (no Iconv):
https://gist.github.com/3639014
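In case the gist is unavailable, the general idea of cleaning a string without Iconv can be sketched as follows. This is not the gist's code; `clean_utf8` is a hypothetical helper. It assumes that dropping the invalid bytes is acceptable, and it uses `String#scrub` where available (Ruby 2.1+), falling back to a round trip through UTF-16BE, because on 1.9.3 a UTF-8 to UTF-8 `encode` does not remove invalid bytes.

```ruby
# Hypothetical helper: returns a copy of str with invalid UTF-8 bytes removed.
def clean_utf8(str)
  str = str.dup.force_encoding("UTF-8")
  return str if str.valid_encoding?

  if str.respond_to?(:scrub)
    # Ruby 2.1+: replace each invalid sequence with the empty string.
    str.scrub("")
  else
    # Ruby 1.9.3: transcode through another encoding so that the
    # :invalid option is actually applied, then come back to UTF-8.
    str.encode("UTF-16BE", invalid: :replace, undef: :replace, replace: "")
       .encode("UTF-8")
  end
end

clean_utf8("caf\xE9 ok")  # => "caf ok"
```

A middleware could apply such a helper to the relevant Rack env values instead of rejecting the request outright.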