Delete directory by referencing symbolic link

command-line scripts symbolic-link rm

5,372

Solution 1

Using rm -rf $(readlink temporary) has some flaws: it will fail if the directory name has spaces. Try it:

$ mkdir 'I have spaces'
$ ln -s 'I have spaces' test
$ ls -log
drwxr-xr-x 2 4096 Oct 31 19:19 I have spaces
lrwxrwxrwx 1   14 Oct 31 19:19 test -> I have spaces/
$ rm -rf $(readlink test)
$ ls -log
drwxr-xr-x 2 4096 Oct 31 19:19 I have spaces
lrwxrwxrwx 1   14 Oct 31 19:19 test -> I have spaces/

and it is dangerous if directories named I, have and spaces already exist, since those are what rm would delete.

Adding quotes helps:

$ rm -rf "$(readlink test)"
$ ls -log
lrwxrwxrwx 1 14 Oct 31 19:19 test -> I have spaces/

(shown in red, since the symlink now points to nothing).

Yet, there's still a case where this fails. Look:

$ mkdir $'a dir with a trailing newline\n'
$ ln -s $'a dir with a trailing newline\n' test
$ ls -log
drwxr-xr-x  2 4096 Oct 31 19:30 a dir with a trailing newline?
lrwxrwxrwx  1   30 Oct 31 19:30 test -> a dir with a trailing newline?
$ rm -rf "$(readlink test)"
$ ls -log
drwxr-xr-x  2 4096 Oct 31 19:30 a dir with a trailing newline?
lrwxrwxrwx  1   30 Oct 31 19:30 test -> a dir with a trailing newline?

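What?????? The directory is still there: command substitution, $(...), strips trailing newlines, so rm was handed a name without its final newline. That name doesn't exist, and -f keeps rm quiet about it.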
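The effect can be reproduced without touching the filesystem. Here is a minimal sketch, assuming bash; the variable names name and captured are only for illustration. It compares the length of a string before and after it passes through $(...):

```shell
#!/usr/bin/env bash
# A name ending in a newline, as in the mkdir example above.
name=$'a dir with a trailing newline\n'

# Pass it through command substitution, as rm -rf "$(readlink test)" does.
captured=$(printf '%s' "$name")

echo "original: ${#name} characters"
echo "captured: ${#captured} characters"   # one fewer: the newline is gone
```

The captured copy is one character shorter, which is exactly why rm never sees the real directory name.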

So, what can I do?

If you're using bash, a good strategy is the following: use cd with the -P option, to cd into the directory after all links have been dereferenced. Watch the difference:

$ cd test
$ pwd
/home/gniourf/lalala/test
$ cd ..
$ cd -P test
$ pwd
/home/gniourf/lalala/a dir with a trailing newline

$ # Cool!

Type help cd for more help about the cd bash builtin. You'll read that cd has another switch, namely -L which is the one used by default.
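If you'd like to compare the two switches side by side, here is a small sketch, assuming bash and a throwaway directory from mktemp (the names sandbox, real and link are made up for the example):

```shell
#!/usr/bin/env bash
# Work in a throwaway directory so nothing real is touched.
sandbox=$(mktemp -d) && cd -P -- "$sandbox" || exit 1
mkdir real
ln -s real link

logical=$(cd -L link && pwd)    # path as typed, symlink kept
physical=$(cd -P link && pwd)   # symlink resolved

echo "cd -L: $logical"          # ends in /link
echo "cd -P: $physical"         # ends in /real

cd / && rm -rf -- "$sandbox"
```

Each cd runs inside $(...), so the script's own working directory is not changed by the comparison.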

Now what?

Well, the strategy could be to cd -P into the directory, and use that. The problem is that I can't delete the directory I'm in (well, in fact I could, see the bottom of this post). Try it:

$ rm -rf .
rm: cannot remove directory: `.'

And sudo will not help much here.

So I could cd -P into that directory, save the working directory in a variable, and then come back to where I was. That's a beautiful idea! But how do you guys "save the current working directory in a variable"? How do you tell what directory you're in? I'm sure you all know the pwd command. (Oh, I actually used it 2 minutes ago.) But if I do:

$ saved_dir=$(pwd)

I'll run into the same trouble, because command substitution strips the trailing newline. Fortunately, Bash has the variable PWD that automagically contains the working directory (with the trailing newline, if necessary).
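To check the difference between $(pwd) and $PWD for yourself, something like the following should do. It is a sketch assuming bash and a scratch directory from mktemp; the names from_pwd and from_var are invented for the example:

```shell
#!/usr/bin/env bash
sandbox=$(mktemp -d) && cd -P -- "$sandbox" || exit 1
mkdir $'trailing newline\n'
cd -- $'trailing newline\n' || exit 1

from_pwd=$(pwd)    # command substitution drops the final newline
from_var=$PWD      # plain assignment keeps it

echo "\$(pwd): ${#from_pwd} characters"
echo "\$PWD:   ${#from_var} characters"   # one more than above

cd / && rm -rf -- "$sandbox"
```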

Ok, at this point there's something I'd like to tell you. How do you, for sure, know the exact content of a variable? echo is not enough. Look:

$ a="lalala    " # with 4 trailing spaces
$ echo "$a"
lalala
$ # :(

A trick is to use this:

$ a="lalala    "
$ echo "'$a'"
'lalala    '
$ # :)

Cool. But sometimes you won't be able to see everything. A good strategy is to use bash's builtin printf together with its %q modifier. Look:

$ a="lalala    "
$ printf '%q\n' "$a"
lalala\ \ \ \ 
$ # :)

Application:

$ cd -P test
$ pwd
/home/gniourf/lalala/a dir with a trailing newline

$ printf '%q\n' "$PWD"
$'/home/gniourf/lalala/a dir with a trailing newline\n'
$ # :)
$ cd .. # let's go back where we were

Amazing!

So, a strategy to remove this dir is the following:

$ cd -P test
$ saved_dir=$PWD
$ cd -
/home/gniourf/lalala
$ rm -rf "$saved_dir"
$ ls -log
lrwxrwxrwx 1   30 Oct 31 19:30 test -> a dir with a trailing newline?
$ # :)

Done!

Now, wait, what was that cd -? Well, this just means: go back to where I was before cd-ing here. :)

You want to put this in a function? Let's go:

rm_where_this_points_to() {
    cd -P -- "$1" || { echo >&2 "Sorry, I couldn't go where \`$1' points to."; return 1; }
    local saved_dir=$PWD
    cd -
    if [[ "$saved_dir" = "$PWD" ]]; then
        echo >&2 "Oh dear, \`$1' points here exactly."
        return 1
    fi
    rm -rfv -- "$saved_dir"
}

Done!

Notice how I'm using -- just in case the dir starts with a hyphen, so as to not confuse cd and rm. Or,

rm_where_this_points_to() {
    local here=$PWD
    cd -P -- "$1" || { echo >&2 "Sorry, I couldn't go where \`$1' points to."; return 1; }
    local saved_dir=$PWD
    cd -- "$here"
    if [[ "$saved_dir" = "$PWD" ]]; then
        echo >&2 "Oh dear, \`$1' points here exactly."
        return 1
    fi
    rm -rfv -- "$saved_dir"
}

if you don't want to use cd - inside the function. Even better, bash has the variable OLDPWD, which contains the directory you were in before the last cd (this is the variable cd - uses). You could use it to avoid the extra variables that save the locations (but it's a bit confusing):

rm_where_this_points_to() {
    cd -P -- "$1" || { echo >&2 "Sorry, I couldn't go where \`$1' points to."; return 1; }
    cd -- "$OLDPWD"
    # At this point, OLDPWD is the directory to remove, not the one we're in!
    if [[ "$OLDPWD" = "$PWD" ]]; then
        echo >&2 "Oh dear, \`$1' points to here exactly."
        return 1
    fi
    rm -rfv -- "$OLDPWD"
}

One side effect of this method is that it changes OLDPWD (so after executing this function, cd - will not quite work as you'd like).
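To try the cd-based approach safely, here is a sketch that runs the variant saving the start directory in here against a scratch directory. It assumes bash and mktemp; sandbox, dir_left and link_left are names made up for the test:

```shell
#!/usr/bin/env bash
# Same logic as the variant above that remembers the start directory in `here`.
rm_where_this_points_to() {
    local here=$PWD
    cd -P -- "$1" || { echo >&2 "Sorry, I couldn't go where \`$1' points to."; return 1; }
    local saved_dir=$PWD
    cd -- "$here" || return 1
    if [[ "$saved_dir" = "$PWD" ]]; then
        echo >&2 "Oh dear, \`$1' points here exactly."
        return 1
    fi
    rm -rfv -- "$saved_dir"
}

# Scratch area, so nothing real can be deleted.
sandbox=$(mktemp -d) && cd -P -- "$sandbox" || exit 1
mkdir $'a dir with a trailing newline\n'
ln -s $'a dir with a trailing newline\n' test

rm_where_this_points_to test

# Record what is left before cleaning up.
[[ -d $'a dir with a trailing newline\n' ]] && dir_left=yes || dir_left=no
[[ -L test ]] && link_left=yes || link_left=no
echo "directory left: $dir_left, link left: $link_left"

cd / && rm -rf -- "$sandbox"
```

The directory should be gone while the now-dangling link stays behind, as in the transcript earlier.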

Alright, there's one more thing I'd like to tell you. It will actually use printf with its %q modifier, and also it will use the terrible, the evil eval command. Just because it's funny. This will have the advantage of not needing to cd back. We'll call cd from a subshell. Oh, you know subshells don't you? It's the cute little thing surrounded by parentheses. When something happens in a subshell, it's not seen by the parent shell. Look:

$ a=1
$ echo "$a"
1
$ ( a=42 ) # <--- in a subshell!
$ echo "$a"
1
$ # :)

Now this also works with cd:

$ pwd
/home/gniourf/lalala
$ ( cd ~ ; pwd )
/home/gniourf
$ pwd
/home/gniourf/lalala
$ # :)

That's why, sometimes, in scripts, you will see stuff done in subshells, just so as to not mess around with the global state. For example when messing around with IFS. It's usually considered good practice.
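As an illustration of the IFS case, here is a minimal sketch, assuming bash (before and joined are made-up names). IFS is changed only inside the subshell, and the parent shell never notices:

```shell
#!/usr/bin/env bash
before=$IFS

# "$*" joins the positional parameters with the first character of IFS.
joined=$( IFS=,; set -- one two three; printf '%s' "$*" )

echo "$joined"   # one,two,three
[[ $IFS == "$before" ]] && echo "IFS in the parent shell is untouched"
```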

Back to our thing: how are we going to get back the value of PWD from inside a subshell, since I just showed you that variables set inside a subshell can't be seen by the parent shell? That's where we'll use printf with its %q modifier:

$ ( cd -P test; printf '%q' "$PWD" )
$'/home/gniourf/lalala/a dir with a trailing newline\n'
$ # :)

Now, can we put the stdout of a subshell in a variable? Sure, just prefix the subshell with a $ (that's command substitution), as in:

$ labas=$(cd -P test; printf '%q' "$PWD")
$ echo "$labas"
$'/home/gniourf/lalala/a dir with a trailing newline\n'
$ # :)

(là-bas means over there in French).

But what are we going to do with this silly string? Easy, we'll use the terrible, the evil eval. Really, never, never, ever use eval. It's bad. God kills a kitten each time you use eval. Really, believe me. Unless you know exactly what you're doing, of course. Unless you're absolutely sure your data has been sanitized. (You know Bobby Tables?) But here, that's the case: printf's %q modifier does just that for us! Then:

rm_where_this_points_to() {
    local labas=$( cd -P -- "$1" && printf '%q' "$PWD" )
    [[ -z "$labas" ]] && { echo >&2 "Couldn't go where \`$1' points to."; return 1; }
    if [[ "$labas" = "$(printf '%q' "$PWD")" ]]; then
        echo >&2 "Oh dear, \`$1' points here exactly."
        return 1
    fi
    # eval and unquoted $labas, I know what I'm doing:
    # $labas has been sanitized and comes directly from `printf '%q'`
    eval "rm -rfv -- $labas"
}

Look, there's an unquoted $labas and an eval! You should feel really weird deep inside when you see this! (I do.) That's why I added the comment, so that later I'll realize it's fine.
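If you want some reassurance that printf '%q' plus eval really gives back the same string, a round trip is easy to check. This is a sketch assuming bash; original, quoted and restored are names invented for the example, and the nasty value is arbitrary:

```shell
#!/usr/bin/env bash
# A deliberately nasty value: quotes, a command substitution, a trailing newline.
original=$'it\'s "nasty"; $(echo oops) and a trailing newline\n'

quoted=$(printf '%q' "$original")   # the newline is escaped as \n, so $(...) cannot strip it
eval "restored=$quoted"             # the shell parses the escaped text back into a value

if [[ $restored == "$original" ]]; then
    echo "round trip OK"
else
    echo "round trip FAILED"
fi
```

Note that the $(echo oops) inside the value is never executed: after %q it is just literal text inside the escaped string.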

There's a caveat, though. The part that checks that we're not deleting our current directory is flawed as shown by:

$ ln -s . wtf
$ rm_where_this_points_to wtf
Oh dear, `wtf' points here exactly.
$ # Good :) but look:
$ cd wtf
$ rm_where_this_points_to wtf
$ # Oh, it did it! :(

To fix this, we'll need to dereference the PWD, escape it with printf '%q' and compare this with the dereferenced and escaped thing we want to delete. It'll just make the code slightly more complex:

rm_where_this_points_to() {
    local labas=$( cd -P -- "$1" && printf '%q' "$PWD" )
    local ici=$( cd -P . && printf '%q' "$PWD" )
    [[ -z "$labas" ]] && { echo >&2 "Couldn't go where \`$1' points to."; return 1; }
    [[ -z "$ici" ]] && { echo >&2 "Something really weird happened: I can't dereference the current directory."; return 1; }
    if [[ "$labas" = "$ici" ]]; then
        echo >&2 "Oh dear, \`$1' points here exactly."
        return 1
    fi
    # eval and unquoted $labas, I know what I'm doing:
    # $labas has been sanitized and comes directly from `printf '%q'`
    eval "rm -rfv -- $labas"
}

(ici means here in French).

This is the function I would use (until I find it still has some flaws... if you find any, please let me know in the comments).
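Here is one way to check the wtf caveat against this last version without risking anything real. It is a sketch assuming bash and a scratch directory from mktemp; the function body is lightly adapted (separate local declaration, quieter cd and rm), and sandbox, status and survived are made-up names:

```shell
#!/usr/bin/env bash
rm_where_this_points_to() {
    local labas ici
    labas=$( cd -P -- "$1" 2>/dev/null && printf '%q' "$PWD" )
    ici=$( cd -P . && printf '%q' "$PWD" )
    [[ -z "$labas" ]] && { echo >&2 "Couldn't go where \`$1' points to."; return 1; }
    [[ -z "$ici" ]] && { echo >&2 "Can't dereference the current directory."; return 1; }
    if [[ "$labas" = "$ici" ]]; then
        echo >&2 "Oh dear, \`$1' points here exactly."
        return 1
    fi
    # $labas comes straight from printf '%q', so eval is safe here.
    eval "rm -rf -- $labas"
}

sandbox=$(mktemp -d) && cd -P -- "$sandbox" || exit 1
ln -s . wtf
cd wtf || exit 1              # the logical PWD now ends in /wtf

rm_where_this_points_to wtf   # should refuse: wtf resolves to where we are
status=$?

[[ -d "$sandbox" ]] && survived=yes || survived=no
echo "exit status: $status, sandbox survived: $survived"

cd / && rm -rf -- "$sandbox"
```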

I said before that a cool solution would be to cd -P to the directory I want to remove, and remove it from there using rm -rf ., but this doesn't work. Well, a really dirty method would be to rm -rf -- "$PWD" instead. That's horribly dirty but a lot of fun:

rm_where_this_points_to_dirty() {
    ( cd -P -- "$1" && rm -rfv -- "$PWD" )
}

Done!

Use at your own risk!!!

Cheers!

In this post you might have learned one of the following:

To use more quotes!
To use more quotes!
cd -P vs. cd -L.
That a trailing newline can be disastrous when using command substitution, i.e., $(...).
The variables PWD and OLDPWD.
To use $'...' to have some cool effects.
printf and its %q modifier.
To use --.
Subshells.
How to say over there and here in French.
To never use eval. Never.

Solution 2

You can use the following command:

rm -rf $(ls -l ~/temporary | awk '{print $NF}')

If you don't want to parse ls, then you can use:

rm -rf $(file ~/temporary | awk '{print $NF}' | tr -d "\`\',")

or simply:

rm -rf $(readlink ~/temporary)

To take care of spaces both in the name of the directory and in the name of the link, you can modify the last command as follows:

rm -rf "$(readlink ~/"temporary with spaces")"
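One thing to watch with the readlink variants: readlink prints the target exactly as it is stored in the link, so a relative target (as created by ln -s temp temporary) is relative to the link's directory, not to wherever the command is run. Here is a hedged sketch of one way to account for that, assuming bash and a scratch directory; the names sandbox, home, link and target are invented, and a target ending in a newline would still be mangled by the command substitution:

```shell
#!/usr/bin/env bash
sandbox=$(mktemp -d) && cd -P -- "$sandbox" || exit 1
mkdir -p home/'temp dir'
ln -s 'temp dir' home/'temporary with spaces'   # relative target

link=home/'temporary with spaces'
target=$(readlink -- "$link")                   # the stored text: temp dir

# Resolve the relative target from the link's own directory, in a subshell
# so the caller's working directory is left alone.
( cd -- "$(dirname -- "$link")" && rm -rf -- "$target" )

[[ -e home/'temp dir' ]] && dir_left=yes || dir_left=no
[[ -L $link ]] && link_left=yes || link_left=no
echo "directory left: $dir_left, link left: $link_left"

cd / && rm -rf -- "$sandbox"
```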


Author by

Adam

Updated on September 18, 2022

Comments

Adam over 1 year
To set up the question, imagine this scenario:
```
mkdir ~/temp
cd ~/
ln -s temp temporary
```
rm -rf temporary, rm -f temporary, and rm temporary each will remove the symbolic link but leave the directory ~/temp/.

I have a script where the name of the symbolic link is easily derived but the name of the linked directory is not.

Is there a way to remove the directory by referencing the symbolic link, short of parsing the name of the directory from ls -od ~/temporary?
- Adam over 10 years
  
  I found a method by reading info coreutils 'Special File Types': rm -rf $(readlink temporary)
Adam over 10 years

That's great, and it works, but it is also exactly what I said I'm not looking for: "short of parsing the name of the directory from ls -od ~/temporary". If there is actually no other way to do this then I'll mark your answer as correct.
Adam over 10 years

I found readlink from info pages (see above). Thanks for your help though.
enzotib over 10 years

instructive, but too much work to take into account for an improbable newline in a filename. Worth to take care of for a production system, but not for a casual user.
gniourf_gniourf over 10 years

@enzotib thanks for your feedback. My only hope is that some curious people reading the post and interested in good practices, security or scripting in general will find something to dig. :).