Are there any examples on the web of how to monitor delayed_job with Monit?
Everything I can find uses God, but I refuse to use God since long running processes in Ruby generally suck. (The most current post in the God mailing list? God Memory Usage Grows Steadily.)
Update: delayed_job now comes with a sample monit config based on this question.
Here is how I got this working.
- Use the collectiveidea fork of delayed_job. Besides being actively maintained, this version has a nice script/delayed_job daemon you can use with monit. Railscasts has a good episode about this version of delayed_job (ASCIICasts version). This script also has some other nice features, like the ability to run multiple workers. I don't cover that here.
- Install monit. I installed from source because Ubuntu's version is so ridiculously out of date. I followed these instructions to get the standard init.d scripts that come with the Ubuntu packages. I also needed to configure with
./configure --sysconfdir=/etc/monit
so the standard Ubuntu configuration dir was picked up.
- Write a monit script. Here's what I came up with:
check process delayed_job with pidfile /var/www/app/shared/pids/delayed_job.pid
start program = "/var/www/app/current/script/delayed_job -e production start"
stop program = "/var/www/app/current/script/delayed_job -e production stop"
I store this in my source control system and point monit at it with
include /var/www/app/current/config/monit
in the /etc/monit/monitrc file.
- Configure monit. These instructions are laden with ads but otherwise OK.
- Write a task for capistrano to stop and start. monit start delayed_job and monit stop delayed_job are what you want to run. I also reload monit when deploying to pick up any config file changes.
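As a rough sketch of the deploy step in the last item, the monit commands could be wrapped in small shell functions that a capistrano task calls on the server. The function names and the DRY_RUN switch are assumptions made for illustration; only the monit stop, monit start and monit reload commands come from monit itself.

```shell
#!/bin/sh
# Hypothetical helper: with DRY_RUN=1 the command is printed instead of run,
# which makes it easy to check what a deploy would do.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Before the new release goes live: stop the worker through monit.
dj_before_deploy() {
  run sudo monit stop delayed_job
}

# After the new release is in place: re-read the monit config, then start.
dj_after_deploy() {
  run sudo monit reload
  run sudo monit start delayed_job
}
```

Calling the functions with DRY_RUN=1 first shows the exact command sequence without touching a running monit.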
Problems I ran into:
- The daemons gem must be installed for script/delayed_job to run.
- You must pass the Rails environment to script/delayed_job with -e production (for example). This is documented in the README file but not in the script's help output.
- I use Ruby Enterprise Edition, so I needed to get monit to start with that copy of Ruby. Because of the way sudo handles the PATH in Ubuntu, I ended up symlinking /usr/bin/ruby and /usr/bin/gem to the REE versions.
When debugging monit, I found it helps to stop the init.d version and run it from the command line, so you can see the error messages. Otherwise it is very difficult to figure out why things are going wrong.
sudo /etc/init.d/monit stop
sudo monit start delayed_job
Hopefully this helps the next person who wants to monitor delayed_job with monit.
For what it's worth, you can always use /usr/bin/env with monit to set up the environment. This is especially important in the current version of delayed_job, 1.8.4, where the environment (-e) option is deprecated.
check process delayed_job with pidfile /var/app/shared/pids/delayed_job.pid
start program = "/usr/bin/env RAILS_ENV=production /var/app/current/script/delayed_job start"
stop program = "/usr/bin/env RAILS_ENV=production /var/app/current/script/delayed_job stop"
In some cases, you may need to set the PATH with env, too.
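To see what env does here without involving monit, the following sketch runs a child command with an explicit RAILS_ENV and an extended PATH. The helper name with_rails_env and the /usr/local/bin directory are assumptions for illustration, not part of delayed_job or monit.

```shell
#!/bin/sh
# Hypothetical helper: run a command with RAILS_ENV set and one extra
# directory appended to PATH, using env the same way the monit lines do.
with_rails_env() {
  rails_env=$1
  extra_dir=$2
  shift 2
  /usr/bin/env RAILS_ENV="$rails_env" PATH="$PATH:$extra_dir" "$@"
}

# Example: show what the child process actually sees.
with_rails_env production /usr/local/bin sh -c 'echo "RAILS_ENV=$RAILS_ENV"'
```

Swapping the final command for the absolute path to script/delayed_job gives a quick way to try the same environment by hand before putting it into the monit config.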
I found it was easier to create an init script for delayed job. It is available here: http://gist.github.com/408929 or below:
#! /bin/sh
set_path="cd /home/rails/evatool_staging/current"

case "$1" in
  start)
    echo -n "Starting delayed_job: "
    su - rails -c "$set_path; RAILS_ENV=staging script/delayed_job start" >> /var/log/delayed_job.log 2>&1
    echo "done."
    ;;
  stop)
    echo -n "Stopping delayed_job: "
    su - rails -c "$set_path; RAILS_ENV=staging script/delayed_job stop" >> /var/log/delayed_job.log 2>&1
    echo "done."
    ;;
  *)
    N=/etc/init.d/delayed_job_staging
    echo "Usage: $N {start|stop}" >&2
    exit 1
    ;;
esac

exit 0
Then make sure that monit is set to start and restart the app. In your monitrc file:
check process delayed_job with pidfile "/path_to_my_rails_app/shared/pids/delayed_job.pid"
start program = "/etc/init.d/delayed_job start"
stop program = "/etc/init.d/delayed_job stop"
and that works great!
I found a nice way to start delayed_job with cron on boot. I'm using whenever to control cron.
My schedule.rb:
# custom job type to control delayed_job
job_type :delayed_job, 'cd :path;RAILS_ENV=:environment script/delayed_job ":task"'

# delayed job start on boot
every :reboot do
  delayed_job "start"
end
Note: I upgraded the whenever gem to version 0.5.0 to be able to use job_type.
I don't know about Monit, but I've written a couple of Munin plugins to monitor queue size and average job run time. The changes I made to delayed_job in that patch might also make it easier for you to write Monit plugins in case you stick with that.
Thanks for the script.
One gotcha -- since monit by definition has a 'spartan path' of
/bin:/usr/bin:/sbin:/usr/sbin
... and for me ruby was installed / linked in /usr/local/bin, I had to thrash around for hours trying to figure out why monit was silently failing when trying to restart delayed_job (even with -v for monit verbose mode).
In the end I had to do this:
check process delayed_job with pidfile /var/www/app/shared/pids/delayed_job.pid
start program = "/usr/bin/env PATH=$PATH:/usr/local/bin /var/www/app/current/script/delayed_job -e production start"
stop program = "/usr/bin/env PATH=$PATH:/usr/local/bin /var/www/app/current/script/delayed_job -e production stop"
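One way to reproduce this kind of failure outside monit is to run a lookup under the same minimal PATH. The sketch below only approximates monit's environment, and the helper name visible_under_path is an assumption for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if command $2 can be found when PATH is
# exactly $1 and every other environment variable is cleared (env -i).
visible_under_path() {
  env -i PATH="$1" /bin/sh -c 'command -v "$1" >/dev/null 2>&1' _ "$2"
}

# The 'spartan path' quoted above.
SPARTAN_PATH=/bin:/usr/bin:/sbin:/usr/sbin

if visible_under_path "$SPARTAN_PATH" ruby; then
  echo "ruby is visible on the spartan path"
else
  echo "ruby is NOT visible; extend PATH in the monit start/stop lines"
fi
```

If the check fails for ruby but passes once /usr/local/bin is added to the first argument, the PATH is the likely reason for a silent restart failure.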
I had to combine the solutions on this page with another script made by toby to make it work with monit and starting with the right user.
So my delayed_job.monitrc looks like this:
check process delayed_job
with pidfile /var/app/shared/pids/delayed_job.pid
start program = "/bin/su -c '/usr/bin/env RAILS_ENV=production /var/app/current/script/delayed_job start' - rails"
stop program = "/bin/su -c '/usr/bin/env RAILS_ENV=production /var/app/current/script/delayed_job stop' - rails"
If your monit is running as root and you want to run delayed_job as my_user then do this:
/etc/init.d/delayed_job:
#!/bin/sh
# chmod 755 /etc/init.d/delayed_job
# chown root:root /etc/init.d/delayed_job
case "$1" in
start|stop|restart)
DJ_CMD=$1
;;
*)
echo "Usage: $0 {start|stop|restart}" >&2
exit 1
esac
su -c "cd /var/www/my_app/current && /usr/bin/env bin/delayed_job $DJ_CMD" - my_user
/var/www/my_app/shared/monit/delayed_job.monitrc:
check process delayed_job with pidfile /var/www/my_app/shared/tmp/pids/delayed_job.pid
start program = "/etc/init.d/delayed_job start"
stop program = "/etc/init.d/delayed_job stop"
if 5 restarts within 5 cycles then timeout
/etc/monit/monitrc:
# add at bottom
include /var/www/my_app/shared/monit/*
Since I didn't want to run as root, I ended up creating a bash init script that monit used for starting and stopping (PROGNAME would be the absolute path to script/delayed_job):
start() {
echo "Starting $PROGNAME"
sudo -u $USER /usr/bin/env HOME=$HOME RAILS_ENV=$RAILS_ENV $PROGNAME start
}
stop() {
echo "Stopping $PROGNAME"
sudo -u $USER /usr/bin/env HOME=$HOME RAILS_ENV=$RAILS_ENV $PROGNAME stop
}
I have spent quite a bit of time on this topic. I was fed up with not having a good solution for it so I wrote the delayed_job_tracer plugin that specifically addresses monitoring of delayed_job and its jobs.
Here's an article I've written about it: http://modernagility.com/articles/5-monitoring-delayed_job-and-its-jobs
This plugin will monitor your delayed_job process and send you an e-mail if delayed_job crashes or if one of its jobs fails.
For Rails 3, you may need to set the HOME environment variable to make compass work properly. The config below works for me:
check process delayed_job
with pidfile /home/user/app/shared/pids/delayed_job.pid
start program = "/bin/sh -c 'cd /home/user/app/current; HOME=/home/user RAILS_ENV=production script/delayed_job start'"
stop program = "/bin/sh -c 'cd /home/user/app/current; HOME=/home/user RAILS_ENV=production script/delayed_job stop'"
I ran into an issue where if the delayed job dies while it still has a job locked, that job will not be freed. I wrote a wrapper script around delayed job that will look at the pid file and free any jobs from the dead worker.
The script is for rubber/capistrano
roles/delayedjob/delayed_job_wrapper:
<% @path = '/etc/monit/monit.d/monit-delayedjob.conf' %>
<% workers = 4 %>
<% workers.times do |i| %>
<% pidfile = "/mnt/custora-#{RUBBER_ENV}/shared/pids/delayed_job.#{i}.pid" %>
<%= "check process delayed_job.#{i} with pidfile #{pidfile}"%>
group delayed_job-<%= RUBBER_ENV %>
<%= " start program = \"/bin/bash /mnt/#{rubber_env.app_name}-#{RUBBER_ENV}/current/script/delayed_job_wrapper #{i} start\"" %>
<%= " stop program = \"/bin/bash /mnt/#{rubber_env.app_name}-#{RUBBER_ENV}/current/script/delayed_job_wrapper #{i} stop\"" %>
<% end %>
roles/delayedjob/delayed_job_wrapper
#!/bin/bash
<% @path = "/mnt/#{rubber_env.app_name}-#{RUBBER_ENV}/current/script/delayed_job_wrapper" %>
<%= "pid_file=/mnt/#{rubber_env.app_name}-#{RUBBER_ENV}/shared/pids/delayed_job.$1.pid" %>
if [ -e $pid_file ]; then
pid=`cat $pid_file`
if [ $2 == "start" ]; then
# ps -p matches the exact pid; grepping `ps -e` output can miss padded pids or match longer ones
if ps -p "$pid" > /dev/null 2>&1; then
echo "already running $pid"
exit
fi
rm $pid_file
fi
locked_by="delayed_job.$1 host:`hostname` pid:$pid"
<%=" /usr/bin/mysql -e \"update delayed_jobs set locked_at = null, locked_by = null where locked_by='$locked_by'\" -u#{rubber_env.db_user} -h#{rubber_instances.for_role('db', 'primary' => true).first.full_name} #{rubber_env.db_name} " %>
fi
<%= "cd /mnt/#{rubber_env.app_name}-#{RUBBER_ENV}/current" %>
. /etc/profile
<%= "RAILS_ENV=#{RUBBER_ENV} script/delayed_job -i $1 $2"%>
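The stale-pidfile check at the heart of the wrapper can be tried on its own. This is a minimal sketch, assuming a pidfile that contains a single pid; the function name pidfile_status is hypothetical, and kill -0 stands in for the wrapper's ps lookup.

```shell
#!/bin/sh
# Hypothetical helper: report whether the process named in a pidfile is
# still running. Prints "missing", "running" or "stale".
pidfile_status() {
  pid_file=$1
  if [ ! -e "$pid_file" ]; then
    echo "missing"
    return 0
  fi
  pid=$(cat "$pid_file")
  # kill -0 sends no signal; it only tests whether the pid can be signalled.
  # Caveat: it also fails for a live process owned by another user unless
  # the caller is root.
  if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
    echo "running"
  else
    echo "stale"
  fi
}

# Example: check a worker pidfile (the path is a placeholder).
pidfile_status /tmp/delayed_job.0.pid
```

A "stale" result is the case where the wrapper would remove the pidfile and release any jobs still locked by that worker.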
To see what is going on, run monit in foreground verbose mode: sudo monit -Iv
Using rvm installed under user "www1" and group "www1", in file /etc/monit/monitrc:
#delayed_job
check process delayed_job with pidfile /home/www1/your_app/current/tmp/pids/delayed_job.pid
start program "/bin/bash -c 'PATH=$PATH:/home/www1/.rvm/bin;source /home/www1/.rvm/scripts/rvm;cd /home/www1/your_app/current;RAILS_ENV=production bundle exec script/delayed_job start'" as uid www1 and gid www1
stop program "/bin/bash -c 'PATH=$PATH:/home/www1/.rvm/bin;source /home/www1/.rvm/scripts/rvm;cd /home/www1/your_app/current;RAILS_ENV=production bundle exec script/delayed_job stop'" as uid www1 and gid www1
if totalmem is greater than 200 MB for 2 cycles then alert
Source: https://stackoverflow.com/questions/1226302/how-to-monitor-delayed-job-with-monit