bash force copy over same file
Normally, filesystem implementations are supposed to guarantee to application programs that at any given time on a given machine, each file can be uniquely identified by the combination of its device ID (the st_dev field in the stat structure) and its inode (the st_ino field). The device ID indicates which mounted filesystem the file is on, and the inode characterises one file inside a particular filesystem. cp considers two files to be identical if they have the same device ID and the same inode.
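To see what cp is looking at, you can compare the (device ID, inode) pair of two paths yourself. The following is a minimal sketch, assuming GNU stat (which the Red Hat system in the question should have); same_file is a hypothetical helper name, not a standard command, and the temporary files exist only for the demonstration.

```shell
#!/bin/sh
# same_file A B: succeed if both paths report the same device ID and inode.
# (hypothetical helper; mirrors the comparison described above)
same_file() {
    # -L follows symlinks, %d is the device number (st_dev),
    # %i is the inode number (st_ino). This is GNU stat syntax.
    [ "$(stat -L -c '%d:%i' -- "$1")" = "$(stat -L -c '%d:%i' -- "$2")" ]
}

tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

echo hello > "$tmp/a"
ln "$tmp/a" "$tmp/hardlink"   # second name for the same file
cp "$tmp/a" "$tmp/copy"       # independent file with equal contents

if same_file "$tmp/a" "$tmp/hardlink"; then
    echo "a and hardlink: same (device, inode)"
fi
if ! same_file "$tmp/a" "$tmp/copy"; then
    echo "a and copy: different (device, inode)"
fi
```

In bash, the built-in test [ "$a" -ef "$b" ] performs an equivalent device-and-inode comparison.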
It is possible to have more than one (device ID, inode) pair for a file if it is accessed through different means, for example an NFS mount to localhost (such scenarios tend to be exotic).
It should not be possible for different files to have the same (device ID, inode) pairs. However, this is up to the filesystem implementation. If you can change the content of the source without changing the snapshot, then I would expect the snapshot to exhibit a different device ID from the source, but it's possible that some implementations out there don't do this.
Note that apart from changing the file, your tests prove nothing. Deleting one hard link doesn't delete the file's other names. Copying files only on demand is common for snapshots, so it would not be abnormal if the file in the snapshot was exactly the same as the file outside the snapshot as long as the file contents remain identical. The inode number would typically remain the same.
When you change the file, make sure you're writing to the same file, and not removing one file and then immediately creating another file with the same name.
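The difference between the two kinds of change can be observed through the inode number. This is a small sketch, assuming GNU stat and an ordinary local filesystem; the file names are made up for the demonstration.

```shell
#!/bin/sh
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

f="$tmp/foo.bar"
echo original > "$f"
inode_before=$(stat -c '%i' -- "$f")

# In-place write: appending modifies the existing file, so the inode stays.
echo x >> "$f"
inode_after_append=$(stat -c '%i' -- "$f")

# Replace by rename (what many editors do): a new file takes over the name,
# so the name now refers to a different inode.
echo replaced > "$f.tmp"
mv -- "$f.tmp" "$f"
inode_after_rename=$(stat -c '%i' -- "$f")

[ "$inode_before" = "$inode_after_append" ] && echo "append: same inode"
[ "$inode_before" != "$inode_after_rename" ] && echo "rename: different inode"
```

If your way of editing the file behaves like the second case, the test says nothing about whether the snapshot and the source were the same file beforehand.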
If you do have two different files (the snapshot and the source) with the same device ID and the same inode but different contents, most applications are going to believe that they're the same. You'll have to find a way to test for file equality that depends on the snapshot technology, or else checksum both sides' content or systematically remove the target.
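One way to combine the last two suggestions (compare contents, then remove the target before copying) is sketched below. restore_file is a hypothetical helper name, and this is only an illustration under the assumption that comparing contents with cmp is acceptable for your file sizes; it is not something cp does for you.

```shell
#!/bin/sh
# restore_file SRC DST: make DST a copy of SRC even when cp would report
# "are the same file". (hypothetical helper)
restore_file() {
    src=$1
    dst=$2
    # Equal contents, including the case where both names really are one
    # file: nothing to do, and nothing gets deleted.
    if [ -e "$dst" ] && cmp -s -- "$src" "$dst"; then
        return 0
    fi
    # Contents differ, so these are genuinely different files:
    # remove the target first, then copy.
    rm -f -- "$dst" && cp -p -- "$src" "$dst"
}

tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
echo "old version" > "$tmp/snapshot"
echo "new version" > "$tmp/current"

restore_file "$tmp/snapshot" "$tmp/current"
cat "$tmp/current"
```

Checking the contents before removing anything is a deliberate choice: if the two paths do turn out to be the same file, the comparison succeeds and the rm is never reached.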
Yotam
Updated on September 18, 2022
Comments
-
Yotam over 1 year
Our sysadmin has created a backup system that creates snapshots of the hard drive.
When I try to restore an older version from the snapshot:
cp /path/to/snapshots/foo.bar /path/to/folder/foo.bar
I get an error:
cp: `/path/to/snapshots/foo.bar' and `/path/to/folder/foo.bar' are the same file.
I can delete the file and copy it but I wanted to know if there is a way to do this with cp.
I know that the files are not linked because:
- I can change the content of the source and the snapshots are kept unchanged (tested)
- I can delete the source and the snapshots are there to restore (tested)
- The files are stored on the university computation cluster. If something like that were true, somebody would have been fired already (or at least shouted at).
Nevertheless, the inode number of the files is the same.
The cluster is implemented over Red Hat Linux and I don't know what the file system is.
df result:
Filesystem                   1K-blocks      Used  Available Use% Mounted on
<ipadress>:/vol/hpc/storage   67633152  67633152          0 100% /storage
<ipadress>:/vol/hpc/storage 2186805248 982498048 1204307200  45% /storage
stat result:
  File: `/path/to/snapshots/foo.bar'
  Size: 404 Blocks: 8 IO Block: 4096 regular file
Device: 17h/23d Inode: 19750461 Links: 1
Access: (0644/-rw-r--r--) Uid: (<num1>/ yotama9) Gid: ( <num2>/ <groupname>)
Access: 2012-01-22 00:03:27.246852000 +0200
Modify: 2012-01-19 23:10:32.746397000 +0200
Change: 2012-01-19 23:10:32.746397000 +0200
  File: `/path/to/folder/foo.bar'
  Size: 404 Blocks: 8 IO Block: 4096 regular file
Device: 17h/23d Inode: 26335134 Links: 1
Access: (0644/-rw-r--r--) Uid: (<num1>/ yotama9) Gid: ( <num2>/ <groupname>)
Access: 2012-01-24 16:03:48.732453000 +0200
Modify: 2012-01-24 16:03:30.728900000 +0200
Change: 2012-01-24 16:03:30.728900000 +0200
-
Admin over 12 years
Are you sure the source path is not a symlink to the destination path?
-
Admin over 12 years
It seems that the files are one and the same, i.e. one is a hard link of the other. Run ls -l -i on both files to see if the inode number is the same. If the inode number is the same, then you do not have a backup; it means there is only 1 real file and 1 hard link to that file.
-
Admin over 12 years
I have edited my question, they are not hard linked
-
Admin over 12 years
Of course they are. 1. How do you change the source? Maybe your editor uses a temporary file and then copies it over. Try echo x >> foo.bar. Or test whether the inode is still the same after you change it your way. 2. That's the idea. You delete one link, the other one remains.
-
Admin over 12 years
What OS and what filesystem are involved? What do you know about the way the snapshots are set up? What does
df /path/to/snapshot /path/to/source; ls -l /path/to/snapshot /path/to/source; stat /path/to/snapshot /path/to/source
show?