Large number of threads under unicorn

前提是你 提交于 2019-12-13 01:01:01

问题


I am in the process of debugging some Posgtres connection leaks in our application. Few days back we suddenly crossed 100 connections when we should not be - coz we just have 8 unicorn workers and a sidekiq process (25 threads).

I was looking at htop today and saw that a ton of threads were being spawned from my unicorn workers. Eg:

Am I reading this correctly? This should not be happening right? If these are threads being spawned, any idea how to debug this?

Thanks! Btw, my other problem - (Postgres connections) Debugging unicorn postgres connection leak

EDIT

I just followed some tips here - http://varaneckas.com/blog/ruby-tracing-threads-unicorn/ - and when I printed the stack trace from the workers' threads, here's what I got when there are many threads..

[17176] ---
[17176] /u/apps/eventstream_production/shared/bundle/ruby/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:1057:in `pop'
[17176] /u/apps/eventstream_production/shared/bundle/ruby/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:1057:in `block in spawn_threadpool'
[17176] ---
[17176] /u/apps/eventstream_production/shared/bundle/ruby/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:1057:in `pop'
[17176] /u/apps/eventstream_production/shared/bundle/ruby/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:1057:in `block in spawn_threadpool'
[17176] ---
[17176] /u/apps/eventstream_production/shared/bundle/ruby/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:1057:in `pop'
[17176] /u/apps/eventstream_production/shared/bundle/ruby/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:1057:in `block in spawn_threadpool'
[17176] ---
[17176] /u/apps/eventstream_production/shared/bundle/ruby/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:1057:in `pop'
[17176] /u/apps/eventstream_production/shared/bundle/ruby/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:1057:in `block in spawn_threadpool'
[17176] ---
[17176] /u/apps/eventstream_production/shared/bundle/ruby/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:1057:in `pop'
[17176] /u/apps/eventstream_production/shared/bundle/ruby/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:1057:in `block in spawn_threadpool'
[17176] ---
[17176] /u/apps/eventstream_production/shared/bundle/ruby/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:1057:in `pop'
[17176] /u/apps/eventstream_production/shared/bundle/ruby/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:1057:in `block in spawn_threadpool'
[17176] ---
[17176] /u/apps/eventstream_production/shared/bundle/ruby/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:1057:in `pop'
[17176] /u/apps/eventstream_production/shared/bundle/ruby/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:1057:in `block in spawn_threadpool'
[17176] ---
[17176] /u/apps/eventstream_production/shared/bundle/ruby/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:1057:in `pop'
[17176] /u/apps/eventstream_production/shared/bundle/ruby/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:1057:in `block in spawn_threadpool'
[17176] ---
[17176] /u/apps/eventstream_production/shared/bundle/ruby/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:1057:in `pop'
[17176] /u/apps/eventstream_production/shared/bundle/ruby/2.2.0/gems/eventmachine-1.0.8/lib/eventmachine.rb:1057:in `block in spawn_threadpool'
[17176] ---
[17176] -------------------

This is my unicorn.rb https://gist.github.com/steverob/b83e41bb49d78f9aa32f79136df5af5f and it spawns a thread for EventMachine in after_fork.

The reason for EventMachine is this --> https://github.com/keenlabs/keen-gem#asynchronous-publishing

Is this normal? Shouldn't the threads be killed? Could this be also causing unncessary db connections to be open? Thanks

UPDATE: I just found out that I am using an older version of the PubNub gem which uses EM and I ran into these lines in the pubnub.log file -

D, [2016-04-06T21:31:12.130123 #1573] DEBUG -- pubnub: Created event Pubnub::Publish
D, [2016-04-06T21:31:12.130144 #1573] DEBUG -- pubnub: Pubnub::SingleEvent#fire
D, [2016-04-06T21:31:12.130162 #1573] DEBUG -- pubnub: Pubnub::SingleEvent#fire | Adding event to async_events
D, [2016-04-06T21:31:12.130178 #1573] DEBUG -- pubnub: Pubnub::SingleEvent#fire | Starting railgun
D, [2016-04-06T21:31:12.130194 #1573] DEBUG -- pubnub: Pubnub::Client#start_event_machine | starting EM in new thread
D, [2016-04-06T21:31:12.130243 #1573] DEBUG -- pubnub: Pubnub::Client#start_event_machine | We aren't running on thin
D, [2016-04-06T21:31:12.130264 #1573] DEBUG -- pubnub: Pubnub::Client#start_event_machine | EM already running

回答1:


So, after all, the behavior seems to be normal in your particular case.

The unicorn threads stack traces you provided (obtained using this method) point to the spawn_threadpool method in EventMachine. This code in EventMachine is called when some other code calls EventMachine.defer, a method which spawns a pool of 20 threads by default on its first invocation. I found usage of EventMachine.defer in an older version of the pubnub gem (e.g. here) but it may as well be used from other places.

So, I think this explains the high volume of threads you observe on each worker. They mostly wait in the pop method which suspends the thread until something is pushed in the queue (deferred again in EventMachine). So unless you have a high load of deferred operations, the threads are mostly doing nothing.

If you don't need to have 20 threads on each unicorn worker ready for deferrable operations (most probably you don't), you can try to lower the number of the threads in the pool by setting the threadpoolsize variable to some reasonable number, e.g.:

EventMachine.threadpool_size = 5

I would put this somewhere in the after_fork block in unicorn config.

Also, as another option, you might consider using the unicorn-worker-killer gem to periodically kill unicorn's workers.

By the way, the messages that pubnub spits into its log seem to be OK, as it just tells us that it found an already initialized EventMachine threads so it does not have to start a new one. This source code clarifies it.




回答2:


Ran into this issue today with version 4. When using PubNub in a background worker the thread count would continue to climb until we got an error. The solution was as follows:

client = Pubnub.new(...)
client.publish(...)
client.telemetry.terminate


来源:https://stackoverflow.com/questions/36497328/large-number-of-threads-under-unicorn

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!