I've written a script that works fine to start and stop a server.
#!/bin/bash

PID_FILE='/var/run/rserve.pid'

start() {
    touch $PID_FILE
    eval "/usr/bin/R CMD Rserve"
    PID=$(ps aux | grep Rserve | grep -v grep | awk '{print $2}')
    echo "Starting Rserve with PID $PID"
    echo $PID > $PID_FILE
}

stop () {
    pkill Rserve
    rm $PID_FILE
    echo "Stopping Rserve"
}

case $1 in
    start)
        start
        ;;
    stop)
        stop
        ;;
    *)
        echo "usage: rserve {start|stop}"
        ;;
esac
exit 0
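The PID lookup in the script goes through `ps aux | grep Rserve | grep -v grep | awk`. For comparison, here is a minimal sketch of the same step using `pgrep`, assuming `pgrep` (from procps) is installed. The names `PROC_NAME`, `find_pid` and `write_pid_file` are made up for illustration and are not part of the original script, and this is not claimed to be the cause of the monit behaviour described below.

```shell
#!/bin/bash
# Sketch only: PROC_NAME and the two helper functions are illustrative names.
PID_FILE="${PID_FILE:-/var/run/rserve.pid}"
PROC_NAME="${PROC_NAME:-Rserve}"

# Print the PID of every process whose name matches PROC_NAME, one per line.
# pgrep uses the same name matching as the `pkill Rserve` call in stop(),
# and it never reports itself, so no `grep -v grep` step is needed.
find_pid() {
    pgrep "$PROC_NAME"
}

# Write the matching PID(s) to PID_FILE.
# Returns non-zero and leaves PID_FILE untouched if nothing matches.
write_pid_file() {
    local pid
    pid=$(find_pid) || return 1
    echo "$pid" > "$PID_FILE"
}
```

Inside `start()`, a call such as `write_pid_file || echo "Rserve does not appear to be running"` would make a failed launch visible instead of silently writing an empty PID file.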
If I start it by running
rserve start
and then start monit
it will correctly capture the PID and the server:
The Monit daemon 5.3.2 uptime: 0m

Remote Host 'localhost'
  status                            Online with all services
  monitoring status                 Monitored
  port response time                0.000s to localhost:6311 [DEFAULT via TCP]
  data collected                    Mon, 13 May 2013 20:03:50

System 'system_gauss'
  status                            Running
  monitoring status                 Monitored
  load average                      [0.37] [0.29] [0.25]
  cpu                               0.0%us 0.2%sy 0.0%wa
  memory usage                      524044 kB [25.6%]
  swap usage                        4848 kB [0.1%]
  data collected                    Mon, 13 May 2013 20:03:50
If I stop it through monit, it properly kills the process and unmonitors it. However, if I then start it again through monit, the server does not come back up:
ps ax | grep Rserve | grep -vc grep
1
monit stop localhost
ps ax | grep Rserve | grep -vc grep
0
monit start localhost
[UTC May 13 20:07:24] info : 'localhost' start on user request
[UTC May 13 20:07:24] info : monit daemon at 4370 awakened
[UTC May 13 20:07:24] info : Awakened by User defined signal 1
[UTC May 13 20:07:24] info : 'localhost' start: /usr/bin/rserve
[UTC May 13 20:07:24] info : 'localhost' start action done
[UTC May 13 20:07:34] error : 'localhost' failed, cannot open a connection to INET[localhost:6311] via TCP
Here is the monitrc:
check host localhost with address 127.0.0.1
    start = "/usr/bin/rserve start"
    stop = "/usr/bin/rserve stop"
    if failed host localhost port 6311 type tcp with timeout 15 seconds for 5 cycles then restart