I find debugging monit to be a major pain. Monit\'s shell environment basically has nothing in it (no paths or other environment variables). Also, there are no log file that
Yeah monit isn't too easy to debug.
Here a few best practices
shell:
#!/usr/bin/env bash
logfile=/var/log/myjob.log
touch ${logfile}
echo $$ ": ################# Starting " $(date) "########### pid " $$ >> ${logfile}
echo "Command: the-command $@" >> ${logfile} # log your command arguments
{
exec the-command $@
} >> ${logfile} 2>&1
That helps a lot.
The other thing I find that helps is to run monit with '-v', which gives you verbosity. So the workflow is
Be sure to always double check your conf and monitor your processes by hand before letting monit handle everything. systat(1), top(1) and ps(1) are your friends to figure out resource usage and limits. Knowing the process you monitor is essential too.
Regarding the start and stop scripts i use a wrapper script to redirect output and inspect environment and other variables. Something like this :
$ cat monit-wrapper.sh
#!/bin/sh
{
echo "MONIT-WRAPPER date"
date
echo "MONIT-WRAPPER env"
env
echo "MONIT-WRAPPER $@"
$@
R=$?
echo "MONIT-WRAPPER exit code $R"
} >/tmp/monit.log 2>&1
Then in monit :
start program = "/home/billitch/bin/monit-wrapper.sh my-real-start-script and args"
stop program = "/home/billitch/bin/monit-wrapper.sh my-real-stop-script and args"
You still have to figure out what infos you want in the wrapper, like process infos, id, system resources limits, etc.
I've had the same problem. Using monit's verbose command-line option helps a bit, but I found the best way was to create an environment as similar as possible to the monit environment and run the start/stop program from there.
# monit runs as superuser
$ sudo su
# the -i option ignores the inherited environment
# this PATH is what monit supplies by default
$ env -i PATH=/bin:/usr/bin:/sbin:/usr/sbin /bin/sh
# try running start/stop program here
$
I've found the most common problems are environment variable related (especially PATH
) or permission-related. You should remember that monit usually runs as root.
Also if you use as uid myusername
in your monit config, then you should change to user myusername
before carrying out the test.
I hope that helps.
monit -c /path/to/your/config -v
By default, monit logs to your system message log and you can check there to see what's happening.
Also, depending on your config, you might be logging to a different place
tail -f /var/log/monit
http://mmonit.com/monit/documentation/monit.html#LOGGING
Assuming defaults (as of whatever old version of monit I'm using), you can tail the logs as such:
CentOS:
tail -f /var/log/messages
Ubuntu:
tail -f /var/log/syslog
Mac OSX
tail -f /var/log/system.log
Windows
Here be Dragons
But there is a neato project I found while searching on how to do this out of morbid curiosity: https://github.com/derFunk/monit-windows-agent
You can start Monit in verbose/debug mode by adding MONIT_OPTS="-v"
to /etc/default/monit
(don't forget to restart; /etc/init.d/monit restart
).
You can then capture the output using tail -f /var/log/monit.log
[CEST Jun 4 21:10:42] info : Starting Monit 5.17.1 daemon with http interface at [*]:2812
[CEST Jun 4 21:10:42] info : Starting Monit HTTP server at [*]:2812
[CEST Jun 4 21:10:42] info : Monit HTTP server started
[CEST Jun 4 21:10:42] info : 'ocean' Monit 5.17.1 started
[CEST Jun 4 21:10:42] debug : Sending Monit instance changed notification to monit@example.io
[CEST Jun 4 21:10:42] debug : Trying to send mail via smtp.sendgrid.net:587
[CEST Jun 4 21:10:43] debug : Processing postponed events queue
[CEST Jun 4 21:10:43] debug : 'rootfs' succeeded getting filesystem statistics for '/'
[CEST Jun 4 21:10:43] debug : 'rootfs' filesytem flags has not changed
[CEST Jun 4 21:10:43] debug : 'rootfs' inode usage test succeeded [current inode usage=8.5%]
[CEST Jun 4 21:10:43] debug : 'rootfs' space usage test succeeded [current space usage=59.6%]
[CEST Jun 4 21:10:43] debug : 'ws.example.com' succeeded testing protocol [WEBSOCKET] at [ws.example.com]:80/faye [TCP/IP] [response time 114.070 ms]
[CEST Jun 4 21:10:43] debug : 'ws.example.com' connection succeeded to [ws.example.com]:80/faye [TCP/IP]