I have some errors triggered by a chinese bot: http://www.easou.com/search/spider.html when it scrolls my websites.
Versions of my applications are all with Ruby 1.9.3 and Rails 3.2.X
Here a stacktrace :
An ArgumentError occurred in listings#show:
invalid byte sequence in UTF-8
rack (1.4.5) lib/rack/utils.rb:104:in `normalize_params'
-------------------------------
Request:
-------------------------------
* URL : http://www.my-website.com
* IP address: X.X.X.X
* Parameters: {"action"=>"show", "controller"=>"listings", "id"=>"location-t7-villeurbanne--58"}
* Rails root: /.../releases/20140708150222
* Timestamp : 2014-07-09 02:57:43 +0200
-------------------------------
Backtrace:
-------------------------------
rack (1.4.5) lib/rack/utils.rb:104:in `normalize_params'
rack (1.4.5) lib/rack/utils.rb:96:in `block in parse_nested_query'
rack (1.4.5) lib/rack/utils.rb:93:in `each'
rack (1.4.5) lib/rack/utils.rb:93:in `parse_nested_query'
rack (1.4.5) lib/rack/request.rb:332:in `parse_query'
actionpack (3.2.18) lib/action_dispatch/http/request.rb:275:in `parse_query'
rack (1.4.5) lib/rack/request.rb:209:in `POST'
actionpack (3.2.18) lib/action_dispatch/http/request.rb:237:in `POST'
actionpack (3.2.18) lib/action_dispatch/http/parameters.rb:10:in `parameters'
-------------------------------
Session:
-------------------------------
* session id: nil
* data: {}
-------------------------------
Environment:
-------------------------------
* CONTENT_LENGTH : 514
* CONTENT_TYPE : application/x-www-form-urlencoded
* HTTP_ACCEPT : text/html, application/xml;q=0.9, application/xhtml+xml, image/png, image/jpeg, image/gif, image/x-xbitmap, */*;q=0.1
* HTTP_ACCEPT_ENCODING : gzip, deflate
* HTTP_ACCEPT_LANGUAGE : zh;q=0.9,en;q=0.8
* HTTP_CONNECTION : close
* HTTP_HOST : www.my-website.com
* HTTP_REFER : http://www.my-website.com/
* HTTP_USER_AGENT : Mozilla/5.0 (compatible; EasouSpider; +http://www.easou.com/search/spider.html)
* ORIGINAL_FULLPATH : /
* PASSENGER_APP_SPAWNER_IDLE_TIME : -1
* PASSENGER_APP_TYPE : rack
* PASSENGER_CONNECT_PASSWORD : [FILTERED]
* PASSENGER_DEBUGGER : false
* PASSENGER_ENVIRONMENT : production
* PASSENGER_FRAMEWORK_SPAWNER_IDLE_TIME : -1
* PASSENGER_FRIENDLY_ERROR_PAGES : true
* PASSENGER_GROUP :
* PASSENGER_MAX_REQUESTS : 0
* PASSENGER_MIN_INSTANCES : 1
* PASSENGER_SHOW_VERSION_IN_HEADER : true
* PASSENGER_SPAWN_METHOD : smart-lv2
* PASSENGER_USER :
* PASSENGER_USE_GLOBAL_QUEUE : true
* PATH_INFO : /
* QUERY_STRING :
* REMOTE_ADDR : 183.60.212.153
* REMOTE_PORT : 52997
* REQUEST_METHOD : GET
* REQUEST_URI : /
* SCGI : 1
* SCRIPT_NAME :
* SERVER_PORT : 80
* SERVER_PROTOCOL : HTTP/1.1
* SERVER_SOFTWARE : nginx/1.2.6
* UNION_STATION_SUPPORT : false
* _ : _
* action_controller.instance : listings#show
* action_dispatch.backtrace_cleaner : #<Rails::BacktraceCleaner:0x000000056e8660>
* action_dispatch.cookies : #<ActionDispatch::Cookies::CookieJar:0x00000006564e28>
* action_dispatch.logger : #<ActiveSupport::TaggedLogging:0x0000000318aff8>
* action_dispatch.parameter_filter : [:password, /RAW_POST_DATA/, /RAW_POST_DATA/, /RAW_POST_DATA/]
* action_dispatch.remote_ip : 183.60.212.153
* action_dispatch.request.content_type : application/x-www-form-urlencoded
* action_dispatch.request.parameters : {"action"=>"show", "controller"=>"listings", "id"=>"location-t7-villeurbanne--58"}
* action_dispatch.request.path_parameters : {:action=>"show", :controller=>"listings", :id=>"location-t7-villeurbanne--58"}
* action_dispatch.request.query_parameters : {}
* action_dispatch.request.request_parameters : {}
* action_dispatch.request.unsigned_session_cookie: {}
* action_dispatch.request_id : 9f8afbc8ff142f91ddbd9cabee3629f3
* action_dispatch.routes : #<ActionDispatch::Routing::RouteSet:0x0000000339f370>
* action_dispatch.show_detailed_exceptions : false
* action_dispatch.show_exceptions : true
* rack-cache.allow_reload : false
* rack-cache.allow_revalidate : false
* rack-cache.cache_key : Rack::Cache::Key
* rack-cache.default_ttl : 0
* rack-cache.entitystore : rails:/
* rack-cache.ignore_headers : ["Set-Cookie"]
* rack-cache.metastore : rails:/
* rack-cache.private_headers : ["Authorization", "Cookie"]
* rack-cache.storage : #<Rack::Cache::Storage:0x000000039c5768>
* rack-cache.use_native_ttl : false
* rack-cache.verbose : false
* rack.errors : #<IO:0x000000006592a8>
* rack.input : #<PhusionPassenger::Utils::RewindableInput:0x0000000655b3a0>
* rack.multiprocess : true
* rack.multithread : false
* rack.request.cookie_hash : {}
* rack.request.form_hash :
* rack.request.form_input : #<PhusionPassenger::Utils::RewindableInput:0x0000000655b3a0>
* rack.request.form_vars : ���W�"��陷q�B��)���
�F��P Z� 8�� & G\y�P��u�T ed �.�%�mxEAẳ\�d*�Hg� �C賳�lj��� � U 1��]pgt�P�
Ɗ ��c"� ��LX��D���HR�y��p`6�l���lN�P �l�S����`V4y��c����X2� &JO!��*p �l��-�гU��w }g�ԍk�� (� F J�� q�:�5G�Jh�pί����ࡃ] �z�h���� d }�}
* rack.request.query_hash : {}
* rack.request.query_string :
* rack.run_once : false
* rack.session : {}
* rack.session.options : {:path=>"/", :domain=>nil, :expire_after=>nil, :secure=>false, :httponly=>true, :defer=>false, :renew=>false, :coder=>#<Rack::Session::Cookie::Base64::Marshal:0x000000034d4ad8>, :id=>nil}
* rack.url_scheme : http
* rack.version : [1, 0]
As you can see there is no invalid utf-8 in the url but only in the rack.request.form_vars
. I have about hundred errors per days, and all similar as this one.
So, I tried to force utf-8 in rack.request.form_vars
with something like this:
class RackFormVarsSanitizer
def initialize(app)
@app = app
end
def call(env)
if env["rack.request.form_vars"]
env["rack.request.form_vars"] = env["rack.request.form_vars"].force_encoding('UTF-8')
end
@app.call(env)
end
end
And I call it in my application.rb
:
config.middleware.use "RackFormVarsSanitizer"
It doesn't seem to work because I already have errors. The problem is I can't test in development mode because I don't know how to set rack.request.form_vars
.
I installed utf8-cleaner
gem but it fixes nothing.
Somebody have an idea to fix this? or to trigger it in development?
So you don't have to piece together the comments in my other reply, this is what I'm doing now – I've seen no errors for 24 hours, so it looks very promising:
Add rack-utf8_sanitizer to your Gemfile:
gem 'rack-utf8_sanitizer'
and run
bundle
Put this middleware in app/middleware/handle_invalid_percent_encoding.rb
and rename the class HandleInvalidPercentEncoding
(because ExceptionApp
is a bit too general).
In the config
block of config/application.rb
do:
require "#{Rails.root}/app/middleware/handle_invalid_percent_encoding.rb"
# NOTE: These must be in this order relative to each other.
# HandleInvalidPercentEncoding just raises for encoding errors it doesn't cover,
# so it must run after (= be inserted before) Rack::UTF8Sanitizer.
config.middleware.insert 0, HandleInvalidPercentEncoding
config.middleware.insert 0, Rack::UTF8Sanitizer # from a gem
Deploy. Done.
(app
happens to be the location for middleware in the project I'm working on, but I'd probably prefer lib
. Whatever. Either should work.)
Add this line into your Gemfile
, then run bundle
in your terminal:
gem "handle_invalid_percent_encoding_requests"
This solution is based on Henrik's answer, turned into a Rails Engine gem.
There is an issue in the gem repo with a link to someone's possible solution – they say it works for them but they're not sure if it's a good solution.
I've yet to try it, but I think I will.
来源:https://stackoverflow.com/questions/24648206/ruby-on-rails-invalid-byte-sequence-in-utf-8-due-to-bot