How do you compare two folders and copy the difference to a third folder?


Solution 1

I am not sure whether this can be done with existing Linux commands such as rsync or diff. In my case I wrote my own Python script, since Python's standard library has the "filecmp" module for comparing files and directories. I have posted the whole script and its usage on my personal site: http://linuxfreelancer.com/

Its usage is simple: give it the absolute paths of the new directory, the old directory, and the difference directory, in that order.

#!/usr/bin/env python3
"""Copy files that are new or changed in one directory tree into a third."""

import filecmp
import os
import shutil
import sys


def compareme(dir1, dir2, found=None):
    """Return absolute paths under dir1 that are missing from or differ in dir2."""
    if found is None:
        found = []
    dircomp = filecmp.dircmp(dir1, dir2)
    # left_only: entries that exist only in dir1.
    # diff_files: files present on both sides that filecmp reports as different.
    for name in dircomp.left_only + dircomp.diff_files:
        found.append(os.path.abspath(os.path.join(dir1, name)))
    # Recurse into subdirectories that exist on both sides.
    for item in dircomp.common_dirs:
        compareme(os.path.join(dir1, item), os.path.join(dir2, item), found)
    # Always return the list, even when there are no common subdirectories.
    return found


def main():
    if len(sys.argv) > 3:
        dir1, dir2, dir3 = sys.argv[1:4]
    else:
        print("Usage:", sys.argv[0], "currentdir olddir difference")
        sys.exit(1)

    source_files = compareme(dir1, dir2)
    dir1 = os.path.abspath(dir1)
    dir3 = os.path.abspath(dir3)
    for src in source_files:
        # Rebuild the path relative to dir1 underneath dir3.
        dst = os.path.join(dir3, os.path.relpath(src, dir1))
        if os.path.isdir(src):
            # A directory that exists only in dir1: copy it whole
            # (dirs_exist_ok needs Python 3.8 or later).
            shutil.copytree(src, dst, dirs_exist_ok=True)
        elif os.path.isfile(src):
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.copyfile(src, dst)


if __name__ == "__main__":
    main()
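
One caveat about the script: filecmp.dircmp compares files shallowly, so two files whose os.stat() signature (file type, size, and modification time) matches are treated as equal without being read. If that matters, a stricter byte-by-byte check is possible with filecmp.cmp(..., shallow=False). The following is only a minimal sketch of that idea; the helper name changed_files is hypothetical, and it looks at a single directory level rather than a whole tree.

```python
import filecmp
import os


def changed_files(new_dir, old_dir):
    """Return names of files in new_dir that are missing from old_dir
    or whose bytes differ from the copy in old_dir.

    Hypothetical helper for illustration; handles one directory level only.
    """
    result = []
    for name in sorted(os.listdir(new_dir)):
        new_path = os.path.join(new_dir, name)
        old_path = os.path.join(old_dir, name)
        if not os.path.isfile(new_path):
            continue  # subdirectories are out of scope for this sketch
        # shallow=False forces a byte-by-byte comparison of the contents.
        if not os.path.isfile(old_path) or not filecmp.cmp(
            new_path, old_path, shallow=False
        ):
            result.append(name)
    return result
```

Calling changed_files("new", "old") would list the file names that are new or whose contents differ, regardless of timestamps.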

Solution 2

I have figured out what the problem was in my case:

The files I was comparing had different timestamps. I shouldn't have used the -a option, I assume because it makes rsync preserve timestamps (-a implies -t), so files with identical content but different modification times were still treated as changed. The command that worked for me was:

rsync -rvcm --compare-dest=../old/ new/ difference/
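
In that command, -r recurses into directories, -v prints what is transferred, -c compares files by checksum instead of size and modification time, and -m prunes empty directories. For readers who want to see the same idea spelled out, here is a rough Python sketch: it is not a replacement for rsync, and the function names file_digest and copy_difference are made up for illustration.

```python
import hashlib
import os
import shutil


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_difference(new_dir, old_dir, diff_dir):
    """Copy files that are new or changed in new_dir into diff_dir.

    Hypothetical helper; returns the sorted relative paths that were copied.
    """
    copied = []
    for root, _dirs, files in os.walk(new_dir):
        for name in files:
            src = os.path.join(root, name)
            rel = os.path.relpath(src, new_dir)
            old = os.path.join(old_dir, rel)
            # Skip files whose content matches the old copy, whatever their timestamps.
            if os.path.isfile(old) and file_digest(src) == file_digest(old):
                continue
            dst = os.path.join(diff_dir, rel)
            # Parent directories are created only when a file is copied,
            # so no empty directories end up in diff_dir.
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.copyfile(src, dst)
            copied.append(rel)
    return sorted(copied)
```

With the folder names from the command above, copy_difference("new", "old", "difference") would fill difference/ with only the new and changed files.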

Solution 3

This might help some readers: on Windows, an older, small freeware program called Third Dir does exactly what's being asked for here. It's no longer available from the developer, Robert Vašíček, but it can probably still be found in online software archives.

Here's the developer's description, which remains on his site:

Third Dir: An unusual directory-synchronizer - the different files are copied to third directory. It is very useful to extract e.g. new or edited photos from a huge directory tree on fixed disk to temporary folder, then add them to archive CD (note - the original files are compared against the CD). Version 1.4, size 23kB. Created 2005-02-12.

History: Version 1.14 - More efficient when many ten of thousands of files are compared.


Author: Thane

Updated on September 18, 2022

Comments

  • Thane
    Thane over 1 year

    You've got three folders:

    • folder current, which contains your current files
    • folder old, which contains an older version of the same files
    • folder difference, which is just an empty folder

    How do you compare old with current and copy the files which are different (or entirely new) in current to difference?


    I have searched all around and it seems like a simple thing to tackle, but I can't get it to work in my particular example. Most sources suggested the use of rsync so I ended up with the following command:

    rsync -ac --compare-dest=../old/ new/ difference/
    

    However, this copies all the files from new to difference, even those that are identical to the ones in old.

    In case it helps (maybe the command is fine and the fault lies elsewhere), this is how I tested this:

    1. I made the three folders.
    2. I made several text files with different contents in old.
    3. I copied the files from old to new.
    4. I changed the contents of some of the files in new and added a few additional files.
    5. I ran the above command and checked the results in difference.

    I have been looking for a solution for the past couple of days and I'd really appreciate some help. It doesn't necessarily have to be using rsync, but I'd like to know what I'm doing wrong if possible.

    • Admin
      Admin over 10 years
      possible duplicate of How do I save changed files?
    • Admin
      Admin over 10 years
      @wingedsubmariner I don't think it is a duplicate, as the accepted answer to the linked question is the very command that the OP is asking about.
    • Admin
      Admin over 10 years
      @Bernhard Ah, my bad. I guess I misunderstood the original question.
    • Admin
      Admin over 10 years
      @wingedsubmariner No worries, you said "possible", and I agree it looks very similar :)
  • Anthon
    Anthon over 10 years
    If you are talking about non standard software, you should include a link. If you mean XYplorer that is not going to help the OP at all.
  • sage
    sage over 9 years
    I think that to test this with the -a (archive) option, you should have used rsync -a (or the cp equivalent) to "copy" the files initially, and only then deleted or modified some of them. (I like to stick to rsync because I know it is self-consistent without thinking about what it might be doing.) I think the original command should have worked in that case. The -a option includes -t (preserve modification times), and rsync's default check compares size and modification time, which is the alternative to -c (compare by checksum).
  • Yamaneko
    Yamaneko over 8 years
    In my opinion, this answer should be the accepted one, as it's far simpler. Also, the command only worked for me when I provided the full paths for old/ and new/.
  • Ryan Williams
    Ryan Williams almost 5 years
    The caveat seems to be that a relative --compare-dest path must be written as seen from inside the actual destination (difference/), not from the directory where the command is run.
  • mivk
    mivk almost 5 years
    Note that instead of -empty -exec rmdir {} \; you can use -empty -delete.
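
The timestamp point raised in the comments can be checked without rsync. If the copy from old to new in the question's test did not preserve modification times, the two folders would hold identical content with different timestamps. The sketch below only illustrates that effect in Python, where shutil.copy gives the copy a fresh modification time and shutil.copy2 preserves the original one; the helper name copied_mtimes and the file names are made up for the example.

```python
import os
import shutil
import tempfile


def copied_mtimes(old_mtime=1_000_000_000):
    """Copy one file with shutil.copy and with shutil.copy2.

    Returns the modification times of the two copies as whole seconds.
    Hypothetical helper for illustration only.
    """
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "old.txt")
        with open(src, "w") as handle:
            handle.write("same content\n")
        # Backdate the source so that a fresh timestamp is easy to tell apart.
        os.utime(src, (old_mtime, old_mtime))
        plain = shutil.copy(src, os.path.join(tmp, "plain.txt"))
        preserved = shutil.copy2(src, os.path.join(tmp, "preserved.txt"))
        return int(os.stat(plain).st_mtime), int(os.stat(preserved).st_mtime)


plain_mtime, preserved_mtime = copied_mtimes()
print("shutil.copy kept the old mtime: ", plain_mtime == 1_000_000_000)
print("shutil.copy2 kept the old mtime:", preserved_mtime == 1_000_000_000)
```

Both copies have the same content, so a checksum comparison (-c) sees no difference, while any comparison that also looks at modification times will flag the plain copy.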