bash force copy over same file


Normally, filesystem implementations are supposed to guarantee to application programs that, at any given time on a given machine, each file can be uniquely identified by the combination of its device ID (the st_dev field in the stat structure) and its inode number (the st_ino field). The device ID indicates which mounted filesystem the file is on, and the inode number identifies one file within that filesystem. cp considers two files to be identical if they have the same device ID and the same inode number.
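As a rough illustration of that comparison, here is a minimal sketch that prints the (device ID, inode) pair of two paths and compares them. It assumes GNU stat (the -c format with %d and %i); file_id and same_file are names made up for this example, and this only approximates what cp does internally.

```shell
#!/bin/sh
# Print "st_dev:st_ino" for a path, following symlinks.
# Assumes GNU stat: %d is the device ID in decimal, %i is the inode number.
file_id() {
    stat -L -c '%d:%i' -- "$1"
}

# Succeed if both paths resolve to the same (device ID, inode) pair.
# Returns 2 if either path cannot be examined.
same_file() {
    id1=$(file_id "$1") || return 2
    id2=$(file_id "$2") || return 2
    [ "$id1" = "$id2" ]
}

# Demo on scratch files: a hard link shares the pair, an independent copy does not.
demo_dir=$(mktemp -d)
echo hello > "$demo_dir/a"
ln "$demo_dir/a" "$demo_dir/hardlink"
cp "$demo_dir/a" "$demo_dir/copy"

same_file "$demo_dir/a" "$demo_dir/hardlink" && echo "a and hardlink: same file"
same_file "$demo_dir/a" "$demo_dir/copy" || echo "a and copy: different files"

rm -rf "$demo_dir"
```

In bash, the test [ "$a" -ef "$b" ] performs the same device-and-inode comparison without calling stat.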

It is possible to have more than one (device ID, inode) pair for a file if it is accessed through different means, for example an NFS mount to localhost (such scenarios tend to be exotic).

It should not be possible for different files to have the same (device ID, inode) pairs. However, this is up to the filesystem implementation. If you can change the content of the source without changing the snapshot, then I would expect the snapshot to exhibit a different device ID from the source, but it's possible that some implementations out there don't do this.

Note that apart from changing the file, your tests prove nothing. Deleting one hard link doesn't delete the file's other names. Copying files only on demand is common for snapshots, so it would not be abnormal if the file in the snapshot was exactly the same as the file outside the snapshot as long as the file contents remain identical. The inode number would typically remain the same.
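To see why the deletion test is inconclusive, here is a small sketch on scratch files (the names source and snapshot are only illustrative, and GNU stat is assumed). Removing one name of a hard-linked file leaves the other name and the data intact.

```shell
#!/bin/sh
work=$(mktemp -d)
echo "data" > "$work/source"
ln "$work/source" "$work/snapshot"     # second name for the same inode

# Both names report the same inode and a link count of 2.
stat -c '%n: inode %i, links %h' "$work/source" "$work/snapshot"

rm "$work/source"                      # removes one name only
cat "$work/snapshot"                   # the content is still readable
stat -c '%n: inode %i, links %h' "$work/snapshot"   # link count is now 1

rm -rf "$work"
```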

When you change the file, make sure you're writing to the same file, and not removing one file and then immediately creating another file with the same name.
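The difference can be checked by watching the inode number across a change. The sketch below uses a scratch file and assumes GNU stat; inode_of is a helper invented for this example. Appending keeps the inode, while writing a new file and renaming it over the old name (which is how many editors save) gives the name a new inode.

```shell
#!/bin/sh
# Print the inode number of a path (GNU stat assumed).
inode_of() {
    stat -c '%i' -- "$1"
}

work=$(mktemp -d)
f="$work/foo.bar"
echo "version 1" > "$f"
before=$(inode_of "$f")

# In-place modification: appending writes to the existing file.
echo "x" >> "$f"
after_append=$(inode_of "$f")

# Replace by rename: a new file is created while the old one still exists,
# then renamed over the old name, so the name now refers to a new inode.
echo "version 2" > "$f.tmp"
mv -f "$f.tmp" "$f"
after_rename=$(inode_of "$f")

echo "before=$before after_append=$after_append after_rename=$after_rename"
rm -rf "$work"
```

If the inode changes after your edit, the test did not modify the original file, so it says nothing about whether the snapshot shares it.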

If you do have two different files (the snapshot and the source) with the same device ID and the same inode but different contents, most applications are going to believe that they're the same. You'll have to find a way to test for file equality that depends on the snapshot technology, or else checksum both sides' content or systematically remove the target.
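One possible way to combine the last two suggestions is sketched below. It checksums each side through standard input, so the tool only sees bytes and cannot short-circuit on file identity, and it replaces the target by copying to a temporary file and renaming it into place. restore_file and content_sum are hypothetical helpers, GNU coreutils (sha256sum, mktemp, cp -p) are assumed, and this has not been tested against any particular snapshot technology.

```shell
#!/bin/sh
# Checksum of a file's content, read via stdin so that only the bytes matter.
content_sum() {
    sha256sum < "$1" | cut -d' ' -f1
}

# restore_file SRC DST (hypothetical helper)
# Replace DST with a copy of SRC unless their contents already match.
restore_file() {
    src=$1
    dst=$2
    if [ -e "$dst" ] && [ "$(content_sum "$src")" = "$(content_sum "$dst")" ]; then
        return 0                       # same content, nothing to do
    fi
    # Copy into a fresh file next to DST, then rename it over DST.
    # The temporary file is a new file, so cp has no reason to refuse.
    tmp=$(mktemp "$(dirname -- "$dst")/.restore.XXXXXX") || return 1
    if cp -p -- "$src" "$tmp" && mv -f -- "$tmp" "$dst"; then
        return 0
    fi
    rm -f -- "$tmp"
    return 1
}
```

With the paths from the question, it would be called as restore_file /path/to/snapshots/foo.bar /path/to/folder/foo.bar. Renaming over the target avoids a window in which the file does not exist, at the cost of briefly needing space for two copies.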


Author by

Yotam

Updated on September 18, 2022

Comments

  • Yotam
    Yotam over 1 year

    Our sysadmin has created a backup system that creates snapshots of the hard drive.

    When I try to restore an older version from the snapshot:

    cp /path/to/snapshots/foo.bar /path/to/folder/foo.bar
    

    I get an error:

    cp: `/path/to/snapshots/foo.bar' and  `/path/to/folder/foo.bar' are the same file. 
    

    I can delete the file and then copy it, but I wanted to know if there is a way to do this with cp alone.

    I know that the files are not linked because:

    1. I can change the content of the source and the snapshots are kept unchanged (tested)
    2. I can delete the source and the snapshots are there to restore (tested)
    3. The files are stored on the university computation cluster. If something like that were true, somebody would have been fired already (or at least shouted at).

    Nevertheless, the inode number of the files is the same.

    The cluster runs Red Hat Linux, and I don't know which filesystem it uses.

    df result:

    Filesystem           1K-blocks      Used Available Use% Mounted on
    <ipadress>:/vol/hpc/storage
                          67633152  67633152         0 100% /storage
    <ipadress>:/vol/hpc/storage
                         2186805248 982498048 1204307200  45% /storage
    

    stat result:

      File: `/path/to/snapshots/foo.bar'
      Size: 404         Blocks: 8          IO Block: 4096   regular file
    Device: 17h/23d Inode: 19750461    Links: 1
    Access: (0644/-rw-r--r--)  Uid: (<num1>/  yotama9)   Gid: ( <num2>/ <groupname>)
    Access: 2012-01-22 00:03:27.246852000 +0200
    Modify: 2012-01-19 23:10:32.746397000 +0200
    Change: 2012-01-19 23:10:32.746397000 +0200
      File: `/path/to/folder/foo.bar'
      Size: 404         Blocks: 8          IO Block: 4096   regular file
    Device: 17h/23d Inode: 26335134    Links: 1
    Access: (0644/-rw-r--r--)  Uid: (<num1>/  yotama9)   Gid: ( <num2>/ <groupname>)
    Access: 2012-01-24 16:03:48.732453000 +0200
    Modify: 2012-01-24 16:03:30.728900000 +0200
    Change: 2012-01-24 16:03:30.728900000 +0200
    
    • Admin
      Admin over 12 years
      Are you sure the source path is not a symlink to the destination path?
    • Admin
      Admin over 12 years
      It seems that the files are one and the same, i.e. that one is a hard link of the other. Run ls -l -i on both files to see if the inode number is the same. If it is, then you do not have a backup: there is only one real file and one hard link to that file.
    • Admin
      Admin over 12 years
      I have edited my question, they are not hard linked
    • Admin
      Admin over 12 years
      Of course they are. 1. How do you change the source? Maybe your editor uses a temporary file and then copies it over. Try echo x >> foo.bar. Or test whether the inode is still the same after you change it your way. 2. That's the idea. You delete one link, the other one remains.
    • Admin
      Admin over 12 years
      What OS and what filesystem are involved? What do you know about the way the snapshots are set up? What does df /path/to/snapshot /path/to/source; ls -l /path/to/snapshot /path/to/source; stat /path/to/snapshot /path/to/source show?