What is the best way to ensure only one instance of a Bash script is running?

Solution 1

If the script is the same across all users, you can use a lockfile approach: if you acquire the lock, proceed; otherwise show a message and exit.

As an example:

[Terminal #1] $ lockfile -r 0 /tmp/the.lock
[Terminal #1] $ 

[Terminal #2] $ lockfile -r 0 /tmp/the.lock
[Terminal #2] lockfile: Sorry, giving up on "/tmp/the.lock"

[Terminal #1] $ rm -f /tmp/the.lock
[Terminal #1] $ 

[Terminal #2] $ lockfile -r 0 /tmp/the.lock
[Terminal #2] $ 

Once /tmp/the.lock has been acquired, your script is the only one executing. When you are done, just remove the lock. In script form this might look like:

#!/bin/bash

lockfile -r 0 /tmp/the.lock || exit 1

# Do stuff here

rm -f /tmp/the.lock
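
If the script dies before reaching the final rm, the lock is left behind and later runs will refuse to start (see the comments below about stale locks). A minimal hardening sketch, assuming a shell with EXIT trap support, registers the cleanup immediately after acquiring the lock:

#!/bin/bash

lockfile -r 0 /tmp/the.lock || exit 1

# Remove the lock on any normal or signalled exit (SIGKILL excepted).
trap 'rm -f /tmp/the.lock' EXIT

# Do stuff here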

Solution 2

Advisory locking has been used for ages, and it can be used in Bash scripts. I prefer the simple flock (from util-linux[-ng]) over lockfile (from procmail). And always remember to set a trap on exit (sigspec == EXIT or 0; trapping specific signals is superfluous) in those scripts.

In 2009 I released my lockable script boilerplate (originally available on my wiki page, nowadays available as a gist). Transforming it into one-instance-per-user is trivial. Using it you can also easily write scripts for other scenarios requiring some locking or synchronization.

Here is the mentioned boilerplate for your convenience.

#!/bin/bash
# SPDX-License-Identifier: MIT

## Copyright (C) 2009 Przemyslaw Pawelczyk <[email protected]>
##
## This script is licensed under the terms of the MIT license.
## https://opensource.org/licenses/MIT
#
# Lockable script boilerplate

### HEADER ###

LOCKFILE="/var/lock/`basename $0`"
LOCKFD=99

# PRIVATE
_lock()             { flock -$1 $LOCKFD; }
_no_more_locking()  { _lock u; _lock xn && rm -f $LOCKFILE; }
_prepare_locking()  { eval "exec $LOCKFD>\"$LOCKFILE\""; trap _no_more_locking EXIT; }

# ON START
_prepare_locking

# PUBLIC
exlock_now()        { _lock xn; }  # obtain an exclusive lock immediately or fail
exlock()            { _lock x; }   # obtain an exclusive lock
shlock()            { _lock s; }   # obtain a shared lock
unlock()            { _lock u; }   # drop a lock

### BEGIN OF SCRIPT ###

# Simplest example is avoiding running multiple instances of script.
exlock_now || exit 1

# Remember! Lock file is removed when one of the scripts exits and it is
#           the only script holding the lock or lock is not acquired at all.
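
As an illustration of the one-instance-per-user transformation mentioned above, only the lock path needs to change. A sketch (the uid suffix and the ${TMPDIR:-/tmp} fallback are my choices, not part of the boilerplate, since /var/lock is often not writable by unprivileged users, as the comments below note):

# One instance per user instead of one system-wide:
LOCKFILE="${TMPDIR:-/tmp}/$(basename "$0")-$(id -u).lock"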

Solution 3

I think flock is probably the easiest (and most memorable) variant. I use it in a cron job to auto-encode DVDs and CDs:

# try to run a command, but fail immediately if it's already running
flock -n /var/lock/myjob.lock   my_bash_command

Use -w <seconds> for a timeout, or leave out the options to wait until the lock is released. Finally, the man page shows a nice example for multiple commands:

   (
     flock -n 9 || exit 1
     # ... commands executed under lock ...
   ) 9>/var/lock/mylockfile
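
If you would rather have the whole script hold the lock than wrap commands in a subshell, a common idiom (a sketch, not taken from the man page) is to open the lock file on a fixed descriptor with exec and then flock that descriptor; the kernel releases the lock automatically when the process exits, even on kill -9:

#!/bin/bash

# Open FD 9 on the lock file, then try to lock it without blocking.
exec 9>/var/lock/myjob.lock
flock -n 9 || { echo "already running" >&2; exit 1; }

# ... commands executed under lock; the lock is released when FD 9
# closes, which happens automatically when the script exits ...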

Solution 4

Use Bash's set -o noclobber option and attempt to create a common lock file: with noclobber set, the > redirection fails if the file already exists, which makes the creation an atomic test-and-set.

This "bash friendly" technique is useful when flock is not available or not applicable.

A short example

if ! (set -o noclobber ; echo > /tmp/global.lock) ; then
    exit 1  # the global.lock already exists
fi

# ... remainder of script ...
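
The lock should also be removed once the script finishes; a sketch (my addition, mirroring the EXIT trap used in the longer example below):

if ! (set -o noclobber ; echo > /tmp/global.lock) ; then
    exit 1  # the global.lock already exists
fi
trap 'rm -f /tmp/global.lock' EXIT

# ... remainder of script ...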

A longer example

This example will wait for the global.lock file but time out after too long.

function lockfile_waithold()
{
    declare -ir time_beg=$(date '+%s')
    declare -ir time_max=7140  # 7140 s = 1 hour 59 min.

    # Poll for the lock file for up to ${time_max}s.
    # Record debugging info in the lock file in case of issues.
    while ! \
        (set -o noclobber ;
         echo -e "DATE:$(date)\nUSER:$(whoami)\nPID:$$" > /tmp/global.lock
        ) 2>/dev/null
    do
        if [ $(($(date '+%s') - ${time_beg})) -gt ${time_max} ] ; then
            echo "Error: waited too long for lock file /tmp/global.lock" 1>&2
            return 1
        fi
        sleep 1
    done

    return 0
}

function lockfile_release()
{
    rm -f /tmp/global.lock
}

if ! lockfile_waithold ; then
    exit 1
fi
trap lockfile_release EXIT

# ... remainder of script ...

This technique worked reliably for me on a long-running Ubuntu 16 host, where many queued instances of a Bash script coordinated work using the same single system-wide "lock" file.

(This is similar to this post by @Barry Kelly, which I noticed afterward.)

Solution 5

I found this in the procmail package dependencies:

apt install liblockfile-bin

To run: dotlockfile -l file.lock

file.lock will be created.

To unlock: dotlockfile -u file.lock

To list the files and commands this package provides: dpkg-query -L liblockfile-bin
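
Put together in a script, this might look like the following sketch (the lock path and the -r 0 retry count are my choices; check dotlockfile(1) on your system):

#!/bin/bash

# Fail immediately instead of retrying if the lock is already held.
dotlockfile -r 0 -l /tmp/myscript.lock || exit 1

# Release the lock when the script exits.
trap 'dotlockfile -u /tmp/myscript.lock' EXIT

# ... do work ...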

Comments

  • Admin
    Admin almost 2 years

    What is the simplest/best way to ensure only one instance of a given script is running - assuming it's Bash on Linux?

    At the moment I'm doing:

    ps -C script.name.sh > /dev/null 2>&1 || ./script.name.sh
    

    but it has several issues:

    1. it puts the check outside of the script
    2. it doesn't let me run the same script from separate accounts - which I would like sometimes
    3. -C checks only the first 14 characters of the process name

    Of course, I can write my own pidfile handling, but I sense that there should be a simple way to do it.

  • Admin
    Admin over 14 years
    This has the relatively well-known bug of grep finding itself. Of course I can work around it, but it's not something I would call simple and robust.
  • outis
    outis over 14 years
    +1. Even if behavior differs across users, OP could use lockfile. Just have a separate lockfile for each user or group that's allowed to run their own instance.
  • martin clayton
    martin clayton over 14 years
    I've seen many 'grep -v grep's. Your ps might support -u $LOGNAME too.
  • martin clayton
    martin clayton over 14 years
    Can we have an example code snippet?
  • ennuikiller
    ennuikiller over 14 years
    it's relatively robust in that it uses $0 and whoami to ensure you're getting only the script started by your userid
  • Admin
    Admin over 14 years
    ennuikiller: no - grep $0 will find processes like $0 (for example the one that is running this ps right now), but it will also find the grep itself! So basically - it will virtually always succeed.
  • ezpz
    ezpz over 14 years
    Added an example and skeleton script.
  • ennuikiller
    ennuikiller over 14 years
    @depesz, yes of course I'm assuming you're doing grep -v grep as well!
  • SourceSeeker
    SourceSeeker over 14 years
    That's not a bug, that's a feature! Also, ps -ef | grep [\ ]$0 eliminates finding the grep.
  • Admin
    Admin over 14 years
    I don't have the lockfile program on my Linux box, but one thing bothers me - will it work if the first script dies without removing the lock? i.e. in such a case I want the next run of the script to proceed, and not die "because the previous copy is still working"
  • Admin
    Admin over 14 years
    @ennuikiller: that assumption was not in your example. Besides - it will find "call.sh" even in things like "call.sh". And it will also fail if I call it from ./call.sh itself (it will find the call.sh copy that is doing the check, not some previous one). So, in short - this is not a solution. It can be changed into a solution by adding at least 2 more greps, or changing the existing one, but it doesn't solve the problem on its own.
  • ezpz
    ezpz over 14 years
    That would involve some notion of try / catch. This is not impossible to fake, but AFAIK is not directly implemented in bash. Any (bash) solution will be confronted with this same problem, however...
  • martin clayton
    martin clayton over 14 years
    @depesz - if you make your lockfile name include the process id somehow (maybe filename.$$) you can check whether a lockfile has gone stale. I'm not saying it's pretty though...
  • Shannon Nelson
    Shannon Nelson over 14 years
    You should also use the trap builtin to catch any signals that might kill your script prematurely. Near the top of the script, add something like: trap " [ -f /var/run/my.lock ] && /bin/rm -f /var/run/my.lock" 0 1 2 3 13 15 You can search /usr/bin/* for more examples.
  • mgalgs
    mgalgs over 12 years
    +1 for mentioning trap
  • Angelo
    Angelo almost 11 years
    depesz is correct. Instead use pgrep with -u or -U options and exclude $$ for current process id
  • qed
    qed over 10 years
    +1 for its simplicity.
  • qed
    qed over 10 years
    What is the 0 signal? It can't be seen in kill -l
  • martin clayton
    martin clayton over 10 years
    @qed - it means run the trap on exit from the script. See gnu.org/software/bash/manual/bashref.html#index-trap
  • qed
    qed over 10 years
    It looks much like the try...catch...finally... in python.
  • Carlos P
    Carlos P over 10 years
    Excellent script; is there a way within this framework to simply check if the lock exists rather than always obtaining a lock when doing so?
  • przemoc
    przemoc over 10 years
    @CarlosP: No. Under the hood flock simply uses the flock(2) syscall, and it doesn't provide such information, nor should it. If you want to unreliably check whether a lock is present (or absent), i.e. without holding it, then you have to try to acquire it in a non-blocking way (exlock_now) and release it immediately (unlock) if you succeed. If you think you need to check the lock's presence without changing its state, then you're possibly using the wrong tools to solve your problems.
  • Edouard Thiel
    Edouard Thiel about 10 years
    In a more bash way you may replace LOCKFILE="/var/lock/`basename $0`" by LOCKFILE="/var/lock/${0##*/}"
  • Kalin
    Kalin about 10 years
    @user80168 Current Ubuntu (14.04) has available a package called "lockfile-progs" (NFS-safe locking library) that provides lockfile-{check,create,remove,touch}. man page says: "Once a file is locked, the lock must be touched at least once every five minutes or the lock will be considered stale, and subsequent lock attempts will succeed...". Seems like a good package to use and mentions a "--use-pid" option.
  • overthink
    overthink over 9 years
    I feel like I'm missing something obvious, but why does the exec call have to be wrapped in eval? e.g. why eval "exec $LOCKFD>\"$LOCKFILE\"" rather than exec $LOCKFD>"$LOCKFILE"?
  • George Young
    George Young over 9 years
    This template is very cool. But I don't understand why you do { _lock u; _lock xn && rm -f $LOCKFILE; }. What is the purpose of the xn lock after you just unlocked it?
  • przemoc
    przemoc over 9 years
    @EdouardThiel Yeah. I actually try to avoid any bashisms, but your basename equivalent is not a BASH thing, it's POSIX thing, so it's fine indeed. BTW I also don't use backticks anymore and prefer much more sane $() notation. Why is $(...) preferred over backticks?
  • przemoc
    przemoc over 9 years
    @overthink only a literal number next to > is treated as a file descriptor number, so without eval there exec tries to execute a binary called 99 (or whatever else is put in $LOCKFD). It's worth adding that some shells (like dash) have a bug that requires the fd number to be a single digit. I chose a high fd number to avoid possible collisions (they depend on the use case, though). I went with BASH also because of the convenient EXIT condition in trap IIRC, but it looks like I was wrong, as it is part of the POSIX shell.
  • przemoc
    przemoc over 9 years
    @GeorgeYoung Removing the lock file at the end of the script only if an immediate exclusive lock succeeds (after the earlier unlocking) covers the case of another script instance patiently waiting to obtain an exclusive or shared lock (i.e. using exlock or shlock): when it finally starts, the previous instance shouldn't remove the file.
  • geirha
    geirha almost 9 years
    That solution has a very glaring race condition (not that the others don't).
  • przemoc
    przemoc almost 9 years
    Important update about trap fn EXIT. It must be supported by any POSIX-compliant shell, as I already wrote 6 months ago, but the thing is that it's implemented differently. We simply want fn to be executed when the script ends (be it a normal exit or one invoked via some signal). EXIT works that way in bash or ksh, but not in zsh or dash, for instance. That's why I originally went with bash despite having an otherwise quite clean sh script. I relearned this trap issue a few months ago (I have to finally start blogging to ease relearning things), but forgot to write about it here as well, sorry for that!
  • Jay Paroline
    Jay Paroline over 8 years
    This solution doesn't automatically clean up stale lockfiles if the process dies. Simple test: add a long sleep to the bash script, run it in the background, kill it. Stale lockfile will exist. Try running the script again, it will immediately exit because of the lockfile.
  • przemoc
    przemoc over 8 years
    @JayParoline You're misinterpreting what you observe. When you kill (-9) the script, i.e. the bash instance running the script file, it will surely die, but processes fork()+exec()-ed from it (like your sleep) inherit copies of open file descriptors along with flock() locks. Killing the script while sleep is sleeping won't unlock, because the sleep process is still holding the lock. For a lockable script this is important, because you usually want to protect the "environment" (do not start another instance while something is still running).
  • przemoc
    przemoc over 8 years
    @JayParoline But you may change the behavior explained above by adding ( eval "exec $LOCKFD>&-" before your stuff and ) after, so everything running within such a block won't inherit LOCKFD (and obviously the lock put on it).
  • Jay Paroline
    Jay Paroline over 8 years
    @przemoc cool, that makes sense, I just assumed that killing the parent would kill the child, but I never checked.. I ended up going with stackoverflow.com/a/1441036/160709 instead, which works the way I need but I can see how this is a more correct solution
  • Cerin
    Cerin over 8 years
    I agree, flock is nice, especially compared to lockfile, since flock is usually pre-installed on most Linux distros and doesn't require a large unrelated utility like procmail the way lockfile does.
  • Charles Duffy
    Charles Duffy over 8 years
    @przemoc, incidentally, that's no longer true with bash 4.1 or newer, where eval is no longer needed for redirection with FDs from expansion results.
  • Charles Duffy
    Charles Duffy over 8 years
    @overthink, (see above -- with new enough bash, the eval is in fact no longer necessary).
  • Ivan Hamilton
    Ivan Hamilton over 7 years
    @przemoc We've been using this code, and there is a subtle race condition here. It's possible for a 2nd process to acquire a file descriptor on the lockfile between "_lock xn" and " && rm -f $LOCKFILE" when the 1st process exits. The 2nd process is then running, but the lockfile will have been unlinked from the file system by the "rm". A 3rd process can come along, create a new lockfile at the same name, acquire a lock on it, and also start running. After acquiring the lock, you need to check that what you acquired is still on the filesystem - stackoverflow.com/questions/17708885
  • A Sahra
    A Sahra over 7 years
    @Jake Biesinger Am I locking the .sh file, or the file that I write my application's output to with the .sh file? I'm new to scripting bash, so where do I have to put this in my script, and how do I do the unlocking?
  • A Sahra
    A Sahra over 7 years
    @Cerin I need to do the same thing with an ffmpeg conversion process, so I need the first process to finish regardless of the crontab firing every minute? Please, I need help with this.
  • studgeek
    studgeek over 7 years
    Note the Stack Overflow contribution terms require contributed code be under MIT (now) and previously under creative commons. So including the GPL is a violation of SO terms. See meta.stackexchange.com/q/12527/153541 and meta.stackexchange.com/q/272956/153541.
  • przemoc
    przemoc over 7 years
    @studgeek Thank you for the info. I planned to change the license long time ago, so I finally did it. Hopefully I'll introduce some other changes I mentioned here or on gist in the near future.
  • BVengerov
    BVengerov about 7 years
    Also exit $? will always return zero.
  • Jitesh Sojitra
    Jitesh Sojitra almost 7 years
    Works for me! I'd like to understand: is there any reason for the downvote on this answer?
  • Charles Duffy
    Charles Duffy almost 7 years
    One disadvantage of this (as opposed to flock-style locking) is that your lock isn't automatically released on kill -9, reboot, power loss, etc.
  • Tylla
    Tylla about 6 years
    @user80168 On Debian (at least on Jessie (8) and Stretch (9)) lockfile is part of the procmail package. Moreover it has a "locktimeout" parameter. If locktimeout is given, the program checks if the modification time of the lockfile is older than locktimeout seconds, and if so, it ignores the lockfile as it must belong to some old, already dead program.
  • Tylla
    Tylla about 6 years
    @Kalin Yes, the lockfile-progs suite seems usable as well, but the usage of the lockfile-touch utility as suggested in the manual page of lockfile-create (start lockfile-touch in background which will periodically touch the lock file until killed, do the business in parallel, kill lockfile-touch) is prone to the same error of the "control program dying in mid-air" as any other naive solution.
  • Paul
    Paul over 5 years
    Very nice! Thanks!
  • Pro Backup
    Pro Backup over 5 years
    pgrep -fn ... -fo $0 also matches your text editor which has the script open for editing. Is there a workaround for that situation?
  • Dm1
    Dm1 over 5 years
    This is a very specific solution for situations when traditional ways can't be used; if it doesn't match your needs you can still use a lockfile. If you need this one-line solution anyway, you can modify it using $* with $0 and pass a unique parameter to your script, which will not be present in a text editor's command line.
  • Alexandros
    Alexandros over 5 years
    I used this in a script A which launches a background process (script B) with nohup. I found that as long as B is running, the eval statement fails with "flock: 9: Bad file descriptor". Is it possible to use this in such a case? So I want to be able to run only one instance of A no matter how many background B processes are running.
  • Alexandros
    Alexandros over 5 years
    To clarify: when running A the second time (so there's a B instance active) the _prepare_locking fails with "/var/lock/pserver.sh: Permission denied" and thus the file descriptor is then invalid and locking commands fail. So even though the file /var/lock/myscript.sh has been removed after the first instance of A terminated, it cannot be recreated for the second run...
  • JamesThomasMoon
    JamesThomasMoon over 5 years
    @CharlesDuffy , you could add a trap lockfile_release EXIT which should cover most cases. If power loss is a concern, then using a temporary directory for the lock file would work, e.g. /tmp.
  • Charles Duffy
    Charles Duffy over 5 years
    In addition to reboots &c, exit traps don't fire on SIGKILL (which is used by the OOM killer, and thus a very real-world concern in some environments). I still consider this approach generally less robust to anything where the kernel provides a guarantee of release. (/tmp being memory-backed and thus given a hard guarantee of being cleared on reboot is mostly the case in recent years, but I'm old-school enough not to trust such facilities to be available; I suppose some rant about kids and a yard is appropriate).
  • JamesThomasMoon
    JamesThomasMoon over 5 years
    @CharlesDuffy good point, set -o noclobber doesn't claim guarantees like flock. In my uncommon situation, using flock wouldn't work because the script didn't know the target directory it was going process until it had processed some user-passed options (which might also affect the target directory for processing). And, the script might, in some cases, want to release the lock file long before it was done processing. This particular script was used heavily on a multi-host build system. Despite valid concerns, set -o noclobber never had a problem of errant multiple lock holders.
  • Charles Duffy
    Charles Duffy over 5 years
    I'm not sure I follow why that's a concern; you can certainly grab a lock with a dynamic filename with flock after your program has started, and release it without exiting. Using some modern (bash 4.1) facilities to avoid needing to assign a FD manually: exec {lock_fd}>"$filename" && flock -x "$lock_fd" || { echo "Lock failed" >&2; exit 1; }; ...stuff here...; exec {lock_fd}>&-
  • Charles Duffy
    Charles Duffy over 5 years
    One can also use a code block: { flock -x 3 || exit; ...stuff here...; } 3>"$lockfile" will release the lock on reaching the closing }.
  • swdev
    swdev about 5 years
    lsof is not always in your $PATH.
  • Adrian Zaugg
    Adrian Zaugg almost 5 years
    This solution suffers under race conditions: The test construct is not atomic.
  • Adrian Zaugg
    Adrian Zaugg almost 5 years
    lsof is probably not an atomic action, hence it suffers under race conditions.
  • James Tan
    James Tan over 4 years
    flock works well until you realise your application didn't terminate or hung. I have to use it together with timeout to limit the execution time and to prevent the lock file from not being released when the application hangs.
  • hagello
    hagello almost 4 years
    @qed: @martin is right, the documentation states that trap ... 0 is an alias for trap ... EXIT. However, when sending signal 0 with kill -0 ..., you just check whether the process exists and whether you are allowed to send a signal to it. This is used for waiting (polling) for the end of one of your processes that is not a child of the current process. Signal 0 does not have any effect.
  • JamesThomasMoon
    JamesThomasMoon over 3 years
    I see your point. Perhaps I didn't understand flock correctly. I do recall testing various locking strategies with flock but deciding that set -o noclobber worked better. Unfortunately, I have moved on from that script so I cannot review.
  • nielsen
    nielsen over 3 years
    This solution is useful in my case where flock and lockfile are not available in the environment.
  • Compholio
    Compholio over 2 years
    The trap on EXIT doesn't work very well in some unusual exit circumstances (especially if you use dash); if you trap on "EXIT INT TERM" then you can cover your bases.
  • plijnzaad
    plijnzaad over 2 years
    Directory /var/lock may not be writable, resulting in obscure "flock: 99: Bad file descriptor" errors; it might be good to point this out.
  • Paul
    Paul over 2 years
    How does it behave after power loss?