Determine if a file has been modified

Solution 1

If you want to detect whether a file has been modified through normal means (editing it in some application, checking out a new version from a revision control system, rebuilding it, etc.), check whether its modification time (mtime) has changed since the last check. That's what stat -c %Y reports.
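As a rough sketch of that check (assuming GNU stat; the function name mtime_changed and the state file it writes are hypothetical, not something from the question), one could save the last mtime seen and compare it on the next run:

```shell
# Hypothetical helper: mtime_changed FILE STATEFILE
# Exit status 0 if FILE's mtime differs from the value saved in STATEFILE
# (or if nothing was saved yet), 1 if it is the same, 2 if stat fails.
# The mtime just read is saved for the next check.
mtime_changed() {
    mc_new=$(stat -c %Y -- "$1") || return 2
    mc_old=$(cat -- "$2" 2>/dev/null) || mc_old=
    printf '%s\n' "$mc_new" > "$2"
    [ "$mc_new" != "$mc_old" ]
}
```

It could be called as: if mtime_changed hello.txt hello.mtime; then echo "hello.txt changed"; fi. Note that the very first call reports a change, because there is no saved value to compare against.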

The modification time can be set by the touch command. If you want to detect whether the file has changed in any way (including the use of touch, extracting an archive, etc.), check whether its inode change time (ctime) has changed from the last check. That's what stat -c %Z reports. The ctime cannot be spoofed except by the system administrator (and even then, only through indirect means: by changing the system clock, or by accessing the disk directly, bypassing the filesystem).
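To see the difference for yourself, here is a small demo, assuming GNU stat and GNU touch (the function name show_backdated_write is made up for illustration). It appends to a file, puts the old mtime back with touch, and prints both timestamps before and after:

```shell
# Hypothetical demo: show_backdated_write FILE
# Appends to FILE, then restores the old mtime with touch -d @SECONDS.
# Prints "mtime ctime" before and after: mtime looks unchanged, ctime does not.
show_backdated_write() {
    sb_before=$(stat -c '%Y %Z' -- "$1") || return 1
    sleep 2                                   # let the change land in a later second
    echo "extra line" >> "$1"
    touch -d "@${sb_before%% *}" -- "$1"      # restore the old mtime
    sb_after=$(stat -c '%Y %Z' -- "$1")
    echo "before (mtime ctime): $sb_before"
    echo "after  (mtime ctime): $sb_after"
}
```

Run it on a scratch file only, since it modifies the file it is given. The first number on each line should match, while the second should differ.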

Solution 2

The %Y format of stat only has a resolution of one second. So if the file is modified twice in the same second, you could miss a modification. Newer filesystems like ext4 store timestamps with nanosecond resolution, but some of the older tools haven't caught up yet.
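If your stat is the GNU coreutils one, its %y format prints the modification time in human-readable form with nine fractional digits, which is one way to get at the finer resolution. A minimal sketch (mtime_full is a hypothetical name):

```shell
# Hypothetical helper: print FILE's mtime including the fractional-second part.
# GNU stat's %y gives a human-readable time with nine fractional digits.
mtime_full() {
    stat -c %y -- "$1"
}
```

Whether those fractional digits carry real information depends on the filesystem; on one that only stores whole seconds they will simply be zeros.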

Also, it's possible for other programs to set an arbitrary modification time. You can see how this can happen via the touch command.

If you're concerned about either of those two possibilities it wouldn't be a bad idea to look at the file size as well. This is what rsync does when it's looking for modified files.
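A sketch of such a combined check, assuming GNU stat (quick_fingerprint is a hypothetical name), prints the mtime and the size together so that both can be compared in one go:

```shell
# Hypothetical helper: print "MTIME SIZE" for FILE
# (mtime in seconds since the epoch, size in bytes).
quick_fingerprint() {
    stat -c '%Y %s' -- "$1"
}
```

Save the output and compare it with a later call. This still misses an edit that leaves both the size and the mtime second unchanged, so it narrows the gap rather than closing it.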

Solution 3

"My feeling is that one wants to throw in more parameters to be even more sure."

What you have is the correct method. The only reason for that to fail would be if the filesystem is not updating properly -- in which case you will end up with a whole bunch of more serious problems.

Of course, I presume someone with the right knowledge and root access to a system where the partition is accessible might be able to alter the information to make it look as if the file hasn't been changed. However, in this case they would surely have made sure to do the same with the size, etc.

Author by

DustByte

Updated on September 18, 2022

Comments

  • DustByte
    DustByte over 1 year

    In Linux (currently using an ext4 filesystem), how can one quickly check whether the contents of a file have been modified without reading any of its contents?

    Is the stat command a recommended approach? I currently do

    $ stat --format "%Y" hello.txt
    

    and later I can check if the same command yields the same output. If it does, I conclude that hello.txt has not changed.

    My feeling is that one wants to throw in more parameters to be even more sure. For example, would adding the file size, file name, etc, provide an even better "fingerprint" of the file?

    On this topic, I recall that a TrueCrypt volume I once had was always ignored by my incremental backup program, possibly because TrueCrypt made sure to leave no metadata changes behind. I suppose it is indeed possible to change all the data returned by stat, so it cannot be guaranteed to pick up on every possible modification of the file?

    • Ramesh
      Ramesh over 9 years
      md5sum filename?
    • DustByte
      DustByte over 9 years
      md5sum or any sort of checksum reads the contents of the file. I do not want to do that as it is way too slow for my purposes.
    • Matej Vrzala M4
      Matej Vrzala M4 over 9 years
      ls -t will sort the contents in a directory by modification time.
    • Ray Andrews
      Ray Andrews over 9 years
      "has been modified"? Every file has been modified, the question is when was it modified. You can use 'find' to search for a specific range of modification times.
  • DustByte
    DustByte over 9 years
    Thanks, I gather that ctime is what I should use. It did not follow from my question that the purpose of this is to use it in my own backup script, where checksums will be computed only for new files or files that have changed. I can afford computing checksums for files that have changed just "slightly", say permissions have changed, etc. I prefer being as close as possible to actually looking at the contents of the file to determine a change.