convert soft- to hardlinks with cp

Solution 1

The example in the info page shows you how, though it is a bit hard to follow:

$ mkdir c; : > a; ln -s a b; cp -aH a b c; ls -i1 c
74161745 a
74161745 b

Let's break that down into its component commands:

  • mkdir c; : creates the directory c/
  • : > a; : just a quick way of creating an empty file. It is equivalent to true > a (echo "" > a is not quite the same, since it writes a newline into the file). : is a shell builtin which does nothing, see help : in bash.
  • ln -s a b : create a softlink to a called b. At this point, these are the contents of the current directory:

    $ ls -l
    total 4
    -rw-r--r-- 1 terdon terdon    0 Oct  9 02:50 a
    lrwxrwxrwx 1 terdon terdon    1 Oct  9 02:50 b -> a
    drwxr-xr-x 2 terdon terdon 4096 Oct  9 02:50 c
    

    Note that b is a symbolic link (soft link); it does not point to the same inode as a:

    $ ls -i1c a b
    16647344 a
    16647362 b
    
  • cp -aH a b c; : copy files a and b into directory c. This is where the conversion happens. The options passed to cp are:

    -a, --archive
          same as -dR --preserve=all
    -d    same as --no-dereference --preserve=links
    -H    follow command-line symbolic links in SOURCE
    

    The -H is necessary because (from info cp):

    When copying from a symbolic link, `cp' normally follows the link only when not copying recursively.

    Since -a activates recursive copying (-R) and, through -d, --no-dereference, -H is needed to make cp follow the symbolic links named on the command line. With -H, b is dereferenced despite the recursion, and because -a also implies --preserve=links, the two copies end up as hard links in the target directory. These are the contents of c/ after the last step (the first column is the inode number):

    $ ls -li c 
    total 0
    17044704 -rw-r--r-- 2 terdon terdon 0 Oct  9 02:50 a
    17044704 -rw-r--r-- 2 terdon terdon 0 Oct  9 02:50 b
    

Now as to how exactly it works, as far as I can figure out from playing around with it, cp --preserve=links combined with -L or -H will convert symbolic links to hard links if both the link and the target are being copied to the same directory.


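In fact, as the OP found out, at least on Debian systems, cp --preserve=links on its own is sufficient to convert symlinks to hard links if the target directory is the same. This fits the info page quote above: when not copying recursively, cp follows symbolic links by default, so no -H or -L is needed.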
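
To check the behaviour described above without relying on reading ls output, the following sketch compares inode numbers directly. It assumes GNU cp and GNU stat (for stat -c %i); the directory names with_H, without_H and plain are made up for this illustration, and everything happens in a scratch directory from mktemp.

```shell
#!/bin/sh
# Sketch: reproduce the info-page example in a scratch directory.
# Assumes GNU coreutils (cp -a/-H, stat -c); directory names are illustrative.
work=$(mktemp -d) && cd "$work" || exit 1

mkdir with_H without_H plain
: > a            # empty regular file
ln -s a b        # b is a symlink to a

cp -aH a b with_H               # -H dereferences b; --preserve=links hard-links the copies
cp -a  a b without_H            # no -H: b is copied as a symlink
cp --preserve=links a b plain   # non-recursive copy: cp follows b by default

# The same inode number twice means the two names are hard links to one file.
inode_a=$(stat -c %i with_H/a)
inode_b=$(stat -c %i with_H/b)
echo "with -H:          a=$inode_a b=$inode_b"

# Without -H the copy of b is still a symbolic link.
if [ -L without_H/b ]; then
  echo "without -H:       b is still a symlink"
fi

echo "--preserve=links: a=$(stat -c %i plain/a) b=$(stat -c %i plain/b)"
```

If the two inode numbers printed for with_H (and for plain) match, the symlink was converted to a hard link in the destination.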

Solution 2

It'd be difficult to convert hard links to symlinks. In the case of a hard link, there is a data block on the filesystem which has two or more file entries pointing at it. There is no "source" and "destination"; it's literally one file with multiple equivalent names. You can use GNU find to identify those this way:

sauer@zipper:~$ find . -type f -links +1 -printf "%i: %p (%n)\n"
609: ./link1 (2)
609: ./link2 (2)

Once you've got all of the files with the same inode, you'd have to pick one to be the "real" file and then just replace all of the others with symlinks to the master file. Probably the way to do that would be to use this:

sauer@zipper:~$ find . -type f -links +1 -printf "%i %p\n" | sort -nk1
609 ./link1
609 ./link2

And then have a script figure out how to pick one of the values with the same number to have all the others link to it. Maybe the first one becomes the target, and any more with the same inode are symlinked to it. Here's one really simple, untested shell script example:

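#!/bin/sh
# Replace every extra hard link under /tmp with a symlink to the first name found.
prev=""
target=""
# Print "inode path" for files with more than one link, grouped by inode.
find /tmp -type f -links +1 -printf "%i %p\n" | sort -nk1 \
| while read -r inode file
do
  # Use [ ] rather than [[ ]]: the latter is not available in plain sh.
  if [ "$inode" != "$prev" ]
  then
    target="$file"   # first name seen for this inode becomes the link target
    prev=$inode
  else
    ln -sf "$target" "$file"
  fi
done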

There are potential problems, in that links from different directories may be created with an invalid target if the path in find (/tmp in this example) is not absolute. But the general idea should be fine.
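
As a way to try the idea safely, here is a sketch that runs the same loop against a throwaway directory instead of /tmp. It assumes GNU find (for -printf) and file names without newlines; the names link1 and link2 follow the find output above, and the echo line is only there to show what gets replaced.

```shell
#!/bin/sh
# Sketch: exercise the hard-link-to-symlink loop in a scratch directory.
# Assumes GNU find (-printf) and file names that contain no newlines.
work=$(mktemp -d) || exit 1      # mktemp -d returns an absolute path

echo data > "$work/link1"
ln "$work/link1" "$work/link2"   # second name for the same inode

prev=""
target=""
find "$work" -type f -links +1 -printf '%i %p\n' | sort -n -k1,1 |
while read -r inode file
do
  if [ "$inode" != "$prev" ]; then
    target=$file                 # first name for this inode is kept as the real file
    prev=$inode
  else
    echo "replacing $file with a symlink to $target"
    ln -sf "$target" "$file"
  fi
done

ls -l "$work"
```

Because find is given an absolute path here, the symlink targets are absolute too, which sidesteps the invalid-target problem mentioned above.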

Solution 3

I sent a report about a possible bug in the info cp documentation to the coreutils team at gnu.org and got this reply:

The docs are a bit terse here. The main issue is that -a implies -d and that implies --no-dereference which is required to get your commands to work as expected. I.E. --no-dereference is required to stop cp implicitly following symlinks in the source.

To verify and split out the detail being demonstrated here:

$ mkdir links; : > a; ln -s a b;

Here we see that -d overrides -H as it comes after. Therefore we will not dereference symlinks in the first place.

$ rm links/*; cp -H -d a b links
$ l links/
lrwxrwxrwx. 1 padraig 1 Oct 10 09:37 b -> a
-rw-rw-r--. 1 padraig 0 Oct 10 09:37 a

Here we see that -H is now honored as it comes last, and therefore symlinks are followed in the source, resulting in hardlinks in the destination.

$ rm links/*; cp -d -H a b links
$ l links
-rw-rw-r--. 2 padraig 0 Oct 10 09:37 b
-rw-rw-r--. 2 padraig 0 Oct 10 09:37 a

I'll make the docs a bit more explicit with the following:

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index b273627..aeed4ca 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -8257,9 +8257,11 @@ $ mkdir c; : > a; ln -s a b; cp -aH a b c; ls -i1 c
 @noindent
 Note the inputs: @file{b} is a symlink to regular file @file{a}, yet the files in destination directory, @file{c/}, are hard-linked.
-Since @option{-a} implies @option{--preserve=links}, and since @option{-H} tells @command{cp} to dereference command line arguments, it sees two files with the same inode number, and preserves the perceived hard link.
+Since @option{-a} implies @option{--no-dereference} it would copy the symlink, but the later @option{-H} tells @command{cp} to dereference the command line arguments where it then sees two files with the same inode number. Then the @option{--preserve=links} option also implied by @option{-a} will preserve the perceived hard link.
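
The option-order effect described in the reply can be checked with a small sketch like the one below. It assumes GNU cp and GNU stat; the directory names hd and dh are invented here to stand for the "-H -d" and "-d -H" orderings.

```shell
#!/bin/sh
# Sketch: the later of -d and -H decides whether b is dereferenced.
# Assumes GNU coreutils; hd/ and dh/ are illustrative directory names.
work=$(mktemp -d) && cd "$work" || exit 1

mkdir hd dh
: > a
ln -s a b

cp -H -d a b hd   # -d comes last: b is copied as a symlink
cp -d -H a b dh   # -H comes last: b is followed, --preserve=links (from -d) hard-links it

if [ -L hd/b ]; then
  echo "-H -d: b is a symlink"
fi

if [ "$(stat -c %i dh/a)" = "$(stat -c %i dh/b)" ]; then
  echo "-d -H: a and b share an inode"
fi
```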

Author: erch

Updated on September 18, 2022

Comments

  • erch
    erch over 1 year

    The cp command's info page says the following about the option --preserve=:

    links
    Preserve in the destination files any links between corresponding source files. Note that with -L or -H, this option can convert symbolic links to hard links.

    followed by an example I don't get [now]; anyhow:

    Question: How to turn soft- into hardlinks with cp? And is there a way back too [converting hard- into softlinks]?


    Secondary Issue: Where does the "can" in the quote above come into play? I understand the purpose of -L and -H, I'm able to copy fully functional softlinks etc., but so far I haven't managed to turn soft- into hardlinks.

  • erch
    erch over 10 years
    For some reason, it seems to work without the -H and/or -L. Anyhow: : > file was new for me and is great! Also: According to the UNIX philosophy ["… programs that do one thing and do it well"] there is ln for creating links. But then, why not make this possible if it makes sense in the workflow :)
  • terdon
    terdon over 10 years
    @chirp the link will be copied without the -H but it will be a soft link. Adding the -H makes it convert to a hard link.
  • erch
    erch over 10 years
    I've tried it with[out] -H and both the copied link and the copied file [the link originally pointed to] share the same inode -> thus are hardlinked. For whatever reason. Is there a place where I can put a screenshot or something as proof? I wonder if there is [still] a misunderstanding on my side. A bug/feature?
  • terdon
    terdon over 10 years
    @chirp I just tried the exact command from the info page with and without the -H and one gave me a hardlink and the other a symlink with different inodes. This was on an ext4 filesystem running Debian and cp (GNU coreutils) 8.21. Do you really get different results? If you want to post somewhere, ping me in chat.
  • erch
    erch over 10 years
    Just sent a bug report to the coreutils mailing list about a possible bug in the documentation.
  • kurtm
    kurtm over 10 years
    As an FYI, : also works for ksh, csh, and OpenBSD's sh (which is really based on pdksh).
  • Fredrick Gauss
    Fredrick Gauss almost 10 years
    So many things learned with an answer.
  • rubo77
    rubo77 over 9 years
    Nice answer, but it doesn't answer the Question, instead it would answer how to convert-a-hardlink-into-a-symbolic-link
  • rubo77
    rubo77 over 9 years
    I created a script that takes working directory and source directory as options: gist.github.com/rubo77/7a9a83695a28412abbcd
  • rubo77
    rubo77 over 9 years
    You should add -v for verbose to the ln command, and at first just write echo ln -sfv "$target" "$file" to see what would happen, otherwise this would be really dangerous. If everything looks fine, you can remove the echo
  • dannysauer
    dannysauer over 9 years
    The question says the man page explains how to go symlink->hardlink and there was already an answer for that; I answered the remaining question about how to go back. Adding a -v only tells you what blew up, but yes, good point about running with an echo at first.