Safe rm -rf function in shell script

regex bash sed

Solution 1

I've found that a big danger with rm in bash is that bash, by default, doesn't stop on errors. That means that:

cd $SOMEPATH
rm -rf *

Is a very dangerous combination if the change directory fails. A safer way would be:

cd $SOMEPATH && rm -rf *

Which will ensure the rm -rf won't run unless you are really in $SOMEPATH. This doesn't protect you from a bad $SOMEPATH, but it can be combined with the advice given by others to help make your script safer.

EDIT: @placeybordeaux makes a good point that if $SOMEPATH is undefined or empty, cd doesn't treat it as an error and returns 0. In light of that, this answer should be considered unsafe unless $SOMEPATH is validated as existing and non-empty first. I believe cd with no args should be an illegal command, since at best it performs a no-op and at worst it can lead to unexpected behaviour, but it is what it is.
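
The validation mentioned in the EDIT could look roughly like the sketch below. The function name clean_dir is made up for illustration, and the checks shown (non-empty argument, existing directory) are only the two named above; they do not protect against a valid but wrong path.

```bash
# clean_dir is a hypothetical helper, not part of the original answer.
# It empties the directory named by $1, but only after confirming that the
# argument is non-empty and names an existing directory.
clean_dir() {
    local target="${1:-}"
    if [ -z "$target" ]; then
        echo "clean_dir: empty path, refusing to continue" >&2
        return 1
    fi
    # Prefix relative paths with ./ so cd does not consult CDPATH
    # and a name such as "-" is not given a special meaning.
    case "$target" in
        /*) ;;
        *)  target="./$target" ;;
    esac
    if [ ! -d "$target" ]; then
        echo "clean_dir: not a directory: $target" >&2
        return 1
    fi
    # The subshell leaves the caller's working directory unchanged, and rm
    # runs only if cd succeeded. As with the plain "rm -rf *", names
    # starting with a dot are not matched by the glob.
    ( cd -- "$target" && rm -rf -- ./* )
}
```

Called as clean_dir "$SOMEPATH", an unset or empty $SOMEPATH arrives as an empty argument and is rejected before cd is ever attempted.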

Solution 2

There is a set -u bash directive that causes the script to exit when an uninitialized variable is used. I read about it here, with rm -rf as an example. I think that's what you're looking for. And here is set's manual.
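
A small sketch of the effect, with the rm replaced by echo so nothing is deleted. DEMO_DIR and demo_nounset are made-up names, and the subshell is there only so that set -u does not leak into the calling shell. Note that set -u catches unset variables only; a variable that is set but empty still expands without complaint.

```bash
# DEMO_DIR is a hypothetical variable that is deliberately left unset.
unset DEMO_DIR

demo_nounset() {
    (
        set -u
        # With set -u active, expanding an unset DEMO_DIR aborts the
        # subshell with an error before echo can run.
        echo "would run: rm -rf /${DEMO_DIR}"
    )
}

if demo_nounset; then
    echo "command ran"
else
    echo "aborted: DEMO_DIR is unset"
fi
```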

Solution 3

I would recommend using realpath(1) rather than the command argument directly, so that you can avoid things like /A/B/../ or symbolic links.
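
One way this might be applied is to resolve both the target and an allowed root, then compare them. The helper name resolves_under is made up for illustration, and the sketch assumes a realpath command is installed (it is part of GNU coreutils but is not guaranteed everywhere).

```bash
# resolves_under is a hypothetical helper: it succeeds only if $1, once
# symlinks and ".." components are resolved, lies strictly below root $2.
resolves_under() {
    local target root
    target=$(realpath -- "$1") || return 1
    root=$(realpath -- "$2") || return 1
    case "$target" in
        "$root"/*) return 0 ;;   # something inside the root
        *)         return 1 ;;   # the root itself, or anything outside it
    esac
}
```

A caller could then write something like resolves_under "$SOMEPATH" /scratch && rm -rf -- "$SOMEPATH", where /scratch stands for whatever directory the script is allowed to touch.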

Solution 4

Generally, when I'm developing a command with operations such as 'rm -fr' in it, I will neutralize the remove during development. One way of doing that is:

RMRF="echo rm -rf"
...
$RMRF "/${PATH1}"

This shows me what should be deleted - but does not delete it. I will do a manual clean up while things are under development - it is a small price to pay for not running the risk of screwing up everything.

The notation '"/${PATH1}"' is a little unusual; normally, you would ensure that PATH1 simply contains an absolute pathname.

Using the metacharacter with '"${PATH2}/"*' is unwise and unnecessary. Apart from leaving the PATH2 directory itself in place, the only difference between using that and using just '"${PATH2}"' is that if the directory specified by PATH2 contains any files or directories with names starting with dot, then those files or directories will not be removed. Such a design is unlikely and is rather fragile. It would be much simpler just to pass PATH2 and let the recursive remove do its job. Adding the trailing slash is not necessarily a bad idea; the system would have to ensure that $PATH2 contains a directory name, not just a file name, but the extra protection is rather minimal.
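
The dot-file behaviour can be checked in a throwaway directory. This is a sketch assuming bash with its default globbing options (dotglob off); list_glob_matches and demo_dir are names invented for the illustration.

```bash
# list_glob_matches is a hypothetical helper: it prints the entry names
# that "$1"/* expands to under the shell's current globbing options.
list_glob_matches() {
    local entry
    for entry in "$1"/*; do
        # An unmatched glob is left as a literal pattern; skip it.
        [ -e "$entry" ] || [ -L "$entry" ] || continue
        printf '%s\n' "${entry##*/}"
    done
}

demo_dir=$(mktemp -d) && {
    touch "$demo_dir/visible" "$demo_dir/.hidden"
    # With default options the glob does not match .hidden.
    list_glob_matches "$demo_dir"
    # Removing the directory itself takes .hidden with it.
    rm -rf -- "$demo_dir"
}
```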

Using globbing with 'rm -fr' is usually a bad idea. You want to be precise and restrictive and limiting in what it does - to prevent accidents. Of course, you'd never run the command (shell script you are developing) as root while it is under development - that would be suicidal. Or, if root privileges are absolutely necessary, you neutralize the remove operation until you are confident it is bullet-proof.
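
The neutralized remove can also be kept behind a switch, so that the real deletion has to be enabled explicitly. A minimal sketch; DRY_RUN and rmrf_maybe are made-up names, not part of any standard tool.

```bash
# DRY_RUN is a hypothetical switch: anything other than 0 only prints
# the command. It defaults to dry-run mode.
DRY_RUN="${DRY_RUN:-1}"

# rmrf_maybe is a hypothetical wrapper around rm -rf.
rmrf_maybe() {
    if [ "$DRY_RUN" != 0 ]; then
        # Show what would be removed without touching anything.
        echo "would run: rm -rf -- $*"
    else
        rm -rf -- "$@"
    fi
}
```

During development the script calls rmrf_maybe "${PATH1}" and only prints the command; setting DRY_RUN=0 turns on the real removal once the paths have been checked by eye.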

Solution 5

Meanwhile I've found this perl project: http://code.google.com/p/safe-rm/

Author by

Max Arnold

Updated on July 27, 2022

Comments

Max Arnold over 1 year
This question is similar to "What is the safest way to empty a directory in *nix?"

I'm writing a bash script which defines several path constants and will use them for file and directory manipulation (copying, renaming and deleting). Often it will be necessary to do something like:
```
rm -rf "/${PATH1}"
rm -rf "${PATH2}/"*
```
While developing this script I want to protect myself from mistyping names like PATH1 and PATH2, and to avoid situations where they expand to an empty string and the command wipes the whole disk. I decided to create a special wrapper:
```
rmrf() {
    if [[ $1 =~ "regex" ]]; then
        echo "Ignoring possibly unsafe path ${1}"
        exit 1
    fi

    shopt -s dotglob
    rm -rf -- $1
    shopt -u dotglob
}
```
Which will be called as:
```
rmrf "/${PATH1}"
rmrf "${PATH2}/"*
```
The regex (or sed expression) should catch paths like "*", "/*", "/**/", "///*" etc. but allow paths like "dir", "/dir", "/dir1/dir2/", "/dir1/dir2/*". I also don't know how to enable shell globbing in a case like "/dir with space/*". Any ideas?

EDIT: this is what I came up with so far:
```
rmrf() {
    local RES SAFE
    local RMPATH="${1}"
    # sed blanks out the whole path when it matches an unsafe pattern such as "*", "/*" or "///*"
    SAFE=$(echo "${RMPATH}" | sed -r 's:^((\.?\*+/+)+.*|(/+\.?\*+)+.*|[\.\*/]+|.*/\.\*+)$::g')
    if [ -z "${SAFE}" ]; then
        echo "ERROR! Unsafe deletion of ${RMPATH}"
        return 1
    fi

    shopt -s dotglob
    if [ '*' == "${RMPATH: -1}" ]; then
        echo rm -rf -- "${RMPATH/%\*/}"*
        RES=$?
    else
        echo rm -rf -- "${RMPATH}"
        RES=$?
    fi
    shopt -u dotglob

    return $RES
}
```
Intended use is (note an asterisk inside quotes):
```
rmrf "${SOMEPATH}"
rmrf "${SOMEPATH}/*"
```
where $SOMEPATH is not a system or /home directory (in my case all such operations are performed on a filesystem mounted under the /scratch directory).

CAVEATS:
- not tested very well
- not intended for use with paths possibly containing '..' or '.'
- should not be used with user-supplied paths
- rm -rf with asterisk probably can fail if there are too many files or directories inside $SOMEPATH (because of limited command line length) - this can be fixed with 'for' loop or 'find' command
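
The 'find' variant mentioned in the last caveat could be sketched as follows. It addresses only the command-line-length problem and assumes the path has already been validated; empty_dir is a made-up name, and -mindepth and -delete are extensions found in GNU and BSD find rather than POSIX options.

```bash
# empty_dir is a hypothetical helper: it deletes everything inside the
# directory $1 (dot-files included) but keeps the directory itself.
# find removes the entries one by one, so no long argument list is built.
empty_dir() {
    local dir="${1:-}"
    # Require an absolute path so the argument cannot be read as a find option.
    case "$dir" in
        /*) ;;
        *)  echo "empty_dir: absolute path required" >&2; return 1 ;;
    esac
    [ -d "$dir" ] || { echo "empty_dir: not a directory: $dir" >&2; return 1; }
    find "$dir" -mindepth 1 -delete
}
```
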
Max Arnold almost 15 years

To delete subdirectories and files starting with dot I use "shopt -s dotglob". Using rm -rf "${PATH2}" is not appropriate because in my case PATH2 can only be removed by the superuser, and this results in an error status for the "rm" command (which I check in order to track other errors).
Max Arnold almost 15 years

Useful but non-standard command. I've found a possible bash replacement: archlinux.org/pipermail/pacman-dev/2009-February/008130.html
Jonathan Leffler almost 15 years

Then, with due respect, you should use a private sub-directory under $PATH2 that you can remove. Avoid glob expansion with commands like 'rm -rf' like you would avoid the plague (or should that be A/H1N1?).
vadipp over 7 years

Actually there is a way: if PATH1 is something like ../../someotherdir
placeybordeaux almost 6 years

If $SOMEPATH is empty won't this rm -rf the user's home directory?
Mamey almost 6 years

@placeybordeaux The && only runs the second command if the first succeeds - so if cd fails rm never runs
placeybordeaux almost 6 years

@SpliFF at least in ZSH the return value of cd $NONEXISTANTVAR is 0
Mamey almost 6 years

@placeybordeaux Good point. Since cd with no args is apparently a valid command.
ruakh almost 6 years

Instead of cd $SOMEPATH, you should write cd "${SOMEPATH:?}". The ${varname:?} notation ensures that the expansion fails with a warning message if the variable is unset or empty (such that the && ... part is never run); without the colon, ${varname?} only catches an unset variable. The double quotes ensure that special characters in $SOMEPATH, such as whitespace, don't have undesired effects.
placeybordeaux almost 6 years

cd with no args moves to your home dir. linuxcommand.org/lc3_man_pages/cdh.html