What\'s a quick-and-dirty way to make sure that only one instance of a shell script is running at a given time?
For shell scripts, I tend to go with the mkdir
over flock
as it makes the locks more portable.
Either way, using set -e
isn't enough. That only exits the script if any command fails. Your locks will still be left behind.
For proper lock cleanup, you really should set your traps to something like this psuedo code (lifted, simplified and untested but from actively used scripts) :
#=======================================================================
# Predefined Global Variables
#=======================================================================
TMPDIR=/tmp/myapp
[[ ! -d $TMP_DIR ]] \
&& mkdir -p $TMP_DIR \
&& chmod 700 $TMPDIR
LOCK_DIR=$TMP_DIR/lock
#=======================================================================
# Functions
#=======================================================================
function mklock {
__lockdir="$LOCK_DIR/$(date +%s.%N).$$" # Private Global. Use Epoch.Nano.PID
# If it can create $LOCK_DIR then no other instance is running
if $(mkdir $LOCK_DIR)
then
mkdir $__lockdir # create this instance's specific lock in queue
LOCK_EXISTS=true # Global
else
echo "FATAL: Lock already exists. Another copy is running or manually lock clean up required."
exit 1001 # Or work out some sleep_while_execution_lock elsewhere
fi
}
function rmlock {
[[ ! -d $__lockdir ]] \
&& echo "WARNING: Lock is missing. $__lockdir does not exist" \
|| rmdir $__lockdir
}
#-----------------------------------------------------------------------
# Private Signal Traps Functions {{{2
#
# DANGER: SIGKILL cannot be trapped. So, try not to `kill -9 PID` or
# there will be *NO CLEAN UP*. You'll have to manually remove
# any locks in place.
#-----------------------------------------------------------------------
function __sig_exit {
# Place your clean up logic here
# Remove the LOCK
[[ -n $LOCK_EXISTS ]] && rmlock
}
function __sig_int {
echo "WARNING: SIGINT caught"
exit 1002
}
function __sig_quit {
echo "SIGQUIT caught"
exit 1003
}
function __sig_term {
echo "WARNING: SIGTERM caught"
exit 1015
}
#=======================================================================
# Main
#=======================================================================
# Set TRAPs
trap __sig_exit EXIT # SIGEXIT
trap __sig_int INT # SIGINT
trap __sig_quit QUIT # SIGQUIT
trap __sig_term TERM # SIGTERM
mklock
# CODE
exit # No need for cleanup code here being in the __sig_exit trap function
Here's what will happen. All traps will produce an exit so the function __sig_exit
will always happen (barring a SIGKILL) which cleans up your locks.
Note: my exit values are not low values. Why? Various batch processing systems make or have expectations of the numbers 0 through 31. Setting them to something else, I can have my scripts and batch streams react accordingly to the previous batch job or script.
Some unixes have lockfile
which is very similar to the already mentioned flock
.
From the manpage:
lockfile can be used to create one or more semaphore files. If lock- file can't create all the specified files (in the specified order), it waits sleeptime (defaults to 8) seconds and retries the last file that didn't succeed. You can specify the number of retries to do until failure is returned. If the number of retries is -1 (default, i.e., -r-1) lockfile will retry forever.
Another option is to use shell's noclobber
option by running set -C
. Then >
will fail if the file already exists.
In brief:
set -C
lockfile="/tmp/locktest.lock"
if echo "$$" > "$lockfile"; then
echo "Successfully acquired lock"
# do work
rm "$lockfile" # XXX or via trap - see below
else
echo "Cannot acquire lock - already locked by $(cat "$lockfile")"
fi
This causes the shell to call:
open(pathname, O_CREAT|O_EXCL)
which atomically creates the file or fails if the file already exists.
According to a comment on BashFAQ 045, this may fail in ksh88
, but it works in all my shells:
$ strace -e trace=creat,open -f /bin/bash /home/mikel/bin/testopen 2>&1 | grep -F testopen.lock
open("/tmp/testopen.lock", O_WRONLY|O_CREAT|O_EXCL|O_LARGEFILE, 0666) = 3
$ strace -e trace=creat,open -f /bin/zsh /home/mikel/bin/testopen 2>&1 | grep -F testopen.lock
open("/tmp/testopen.lock", O_WRONLY|O_CREAT|O_EXCL|O_NOCTTY|O_LARGEFILE, 0666) = 3
$ strace -e trace=creat,open -f /bin/pdksh /home/mikel/bin/testopen 2>&1 | grep -F testopen.lock
open("/tmp/testopen.lock", O_WRONLY|O_CREAT|O_EXCL|O_TRUNC|O_LARGEFILE, 0666) = 3
$ strace -e trace=creat,open -f /bin/dash /home/mikel/bin/testopen 2>&1 | grep -F testopen.lock
open("/tmp/testopen.lock", O_WRONLY|O_CREAT|O_EXCL|O_LARGEFILE, 0666) = 3
Interesting that pdksh
adds the O_TRUNC
flag, but obviously it's redundant:
either you're creating an empty file, or you're not doing anything.
How you do the rm
depends on how you want unclean exits to be handled.
Delete on clean exit
New runs fail until the issue that caused the unclean exit to be resolved and the lockfile is manually removed.
# acquire lock
# do work (code here may call exit, etc.)
rm "$lockfile"
Delete on any exit
New runs succeed provided the script is not already running.
trap 'rm "$lockfile"' EXIT
I use a simple approach that handles stale lock files.
Note that some of the above solutions that store the pid, ignore the fact that the pid can wrap around. So - just checking if there is a valid process with the stored pid is not enough, especially for long running scripts.
I use noclobber to make sure only one script can open and write to the lock file at one time. Further, I store enough information to uniquely identify a process in the lockfile. I define the set of data to uniquely identify a process to be pid,ppid,lstart.
When a new script starts up, if it fails to create the lock file, it then verifies that the process that created the lock file is still around. If not, we assume the original process died an ungraceful death, and left a stale lock file. The new script then takes ownership of the lock file, and all is well the world, again.
Should work with multiple shells across multiple platforms. Fast, portable and simple.
#!/usr/bin/env sh
# Author: rouble
LOCKFILE=/var/tmp/lockfile #customize this line
trap release INT TERM EXIT
# Creates a lockfile. Sets global variable $ACQUIRED to true on success.
#
# Returns 0 if it is successfully able to create lockfile.
acquire () {
set -C #Shell noclobber option. If file exists, > will fail.
UUID=`ps -eo pid,ppid,lstart $$ | tail -1`
if (echo "$UUID" > "$LOCKFILE") 2>/dev/null; then
ACQUIRED="TRUE"
return 0
else
if [ -e $LOCKFILE ]; then
# We may be dealing with a stale lock file.
# Bring out the magnifying glass.
CURRENT_UUID_FROM_LOCKFILE=`cat $LOCKFILE`
CURRENT_PID_FROM_LOCKFILE=`cat $LOCKFILE | cut -f 1 -d " "`
CURRENT_UUID_FROM_PS=`ps -eo pid,ppid,lstart $CURRENT_PID_FROM_LOCKFILE | tail -1`
if [ "$CURRENT_UUID_FROM_LOCKFILE" == "$CURRENT_UUID_FROM_PS" ]; then
echo "Script already running with following identification: $CURRENT_UUID_FROM_LOCKFILE" >&2
return 1
else
# The process that created this lock file died an ungraceful death.
# Take ownership of the lock file.
echo "The process $CURRENT_UUID_FROM_LOCKFILE is no longer around. Taking ownership of $LOCKFILE"
release "FORCE"
if (echo "$UUID" > "$LOCKFILE") 2>/dev/null; then
ACQUIRED="TRUE"
return 0
else
echo "Cannot write to $LOCKFILE. Error." >&2
return 1
fi
fi
else
echo "Do you have write permissons to $LOCKFILE ?" >&2
return 1
fi
fi
}
# Removes the lock file only if this script created it ($ACQUIRED is set),
# OR, if we are removing a stale lock file (first parameter is "FORCE")
release () {
#Destroy lock file. Take no prisoners.
if [ "$ACQUIRED" ] || [ "$1" == "FORCE" ]; then
rm -f $LOCKFILE
fi
}
# Test code
# int main( int argc, const char* argv[] )
echo "Acquring lock."
acquire
if [ $? -eq 0 ]; then
echo "Acquired lock."
read -p "Press [Enter] key to release lock..."
release
echo "Released lock."
else
echo "Unable to acquire lock."
fi
I wanted to do away with lockfiles, lockdirs, special locking programs and even pidof
since it isn't found on all Linux installations. Also wanted to have the simplest code possible (or at least as few lines as possible). Simplest if
statement, in one line:
if [[ $(ps axf | awk -v pid=$$ '$1!=pid && $6~/'$(basename $0)'/{print $1}') ]]; then echo "Already running"; exit; fi
You need an atomic operation, like flock, else this will eventually fail.
But what to do if flock is not available. Well there is mkdir. That's an atomic operation too. Only one process will result in a successful mkdir, all others will fail.
So the code is:
if mkdir /var/lock/.myscript.exclusivelock
then
# do stuff
:
rmdir /var/lock/.myscript.exclusivelock
fi
You need to take care of stale locks else aftr a crash your script will never run again.