One-way-sync a directory, but leave deleted files deleted on the destination


If you're not going to use the remote file system as the source of truth for what has been transferred, you need to track externally which files have already been transferred successfully, then exclude them from future transfers.

rsync can include and exclude files based on patterns read from a file, so you can pass a specific list of files to one transfer and then exclude that same list from future transfers.

#!/usr/bin/env bash

set -e

track_dir=~/.track_xfer
inc_file="$track_dir/include_files"
exc_file="$track_dir/exclude_files"
xfer_dir=~/testrsync
xfer_dest=~/testrsync_dest

mkdir -p "$track_dir"
touch "$exc_file"
cd "$xfer_dir"

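# find files and create rsync filter list
# rsync filter patterns are globs, not regexes: escape \ * ? [ only in names
# that contain a wildcard, because rsync matches a backslash literally otherwise
find . -type f -print0 | perl -e '
  $/="\0";
  while (<>){
   chomp;
   s!^\.!!;                          # remove leading . so the pattern is anchored at /
   s/([\\*?\[])/\\$1/g if /[*?\[]/;  # escape rsync wildcard characters
   print "$_\n";
  }' > "$inc_file"

# Run the rsync
rsync -va --delete --exclude-from "$exc_file" --include-from "$inc_file" "$xfer_dir/" "$xfer_dest"

# Add the included/transferred files to the exclusion list
sort -u "$inc_file" "$exc_file" > "$exc_file.tmp"
mv "$exc_file.tmp" "$exc_file"

You might need some more rsync-specific quoting. Perl's quotemeta function was the first easy solution that came to mind, but it escapes every non-word character, and rsync treats a backslash as a literal character when the pattern contains no wildcards, so a name such as report.txt would no longer match. The script above therefore escapes only the rsync wildcard characters.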

The main problem will be dealing with any special characters in file names. If you want to handle newlines, tabs and other strange things in the names, you will have to put a bit more work into the Perl (or whatever) that parses and generates the inclusion pattern list. If you can restrict the names of your transfer files to a simple character set, you don't need to worry about this step as much. The Perl is a halfway solution that should get you past the most common special characters.
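As a rough illustration of the quoting involved, the sketch below escapes a single path for use as an rsync filter pattern. It relies on the rule from the rsync man page that a backslash is matched literally when a pattern contains no wildcard characters, so it only touches names containing *, ? or [. The name escape_rsync_pattern is a hypothetical helper, not part of rsync, and names containing newlines are not handled.

```shell
#!/usr/bin/env bash
# escape_rsync_pattern is a hypothetical helper name used for illustration.
# It prints its argument in a form intended to match literally as an rsync
# filter pattern.
escape_rsync_pattern() {
  case $1 in
    *[\*\?\[]*)
      # The name contains a wildcard character, so rsync will treat the
      # pattern as a glob: backslash-escape \ * ? and [.
      printf '%s\n' "$1" | sed 's/[\\*?[]/\\&/g'
      ;;
    *)
      # No wildcard characters: rsync compares the pattern literally,
      # so the name is printed unchanged (backslashes included).
      printf '%s\n' "$1"
      ;;
  esac
}
```

For example, escape_rsync_pattern '/a/b*.txt' prints /a/b\*.txt, while '/a/b.txt' is printed unchanged.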

The reason for using the include list rather than letting rsync pull the whole directory itself is so that you have a defined, complete list of files for the subsequent exclude list. You could probably achieve the same result by parsing the rsync output, or a --log-file=FILE, for the files that were transferred, but that looked a little harder.
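For comparison, here is a minimal sketch of the output-parsing alternative, assuming rsync is run with -i (--itemize-changes), where lines beginning with <f or >f describe regular files that were sent or received. The name transferred_patterns is a hypothetical helper, the exact column layout may differ between rsync versions, and wildcard characters in names are not escaped here.

```shell
#!/usr/bin/env bash
# transferred_patterns is a hypothetical helper name used for illustration.
# It reads rsync --itemize-changes output on stdin and prints one anchored
# filter pattern per transferred regular file.
transferred_patterns() {
  # Keep only lines for regular files that were sent (<f) or received (>f),
  # drop the change-summary column up to the first space, and prefix the
  # path with / so the pattern is anchored at the transfer root.
  sed -n 's|^[<>]f[^ ]* |/|p'
}

# Example use, assuming the variables from the script above:
#   rsync -ai --exclude-from "$exc_file" "$xfer_dir/" "$xfer_dest" \
#     | transferred_patterns >> "$exc_file"
```

Directory entries, attribute-only changes and deletion messages are skipped, so only files that were actually copied end up in the list.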


Author: maxschlepzig

My name is Georg Sauthoff. 'Max Schlepzig' is just a silly old pseudonym (I am hesitant to change it because existing @-replies will not be updated). I studied computer science. In my current line of work, I work on trading system software and thus care about low latency.

Updated on September 18, 2022

Comments

  • maxschlepzig
    maxschlepzig over 1 year

    I want to sync a directory between two systems. To make it more interesting the syncing must only be done in one direction, i.e.:

    • if a file is deleted in the source directory, it must also be deleted in the destination, if it was previously transferred
    • deleted files in the destination directory must not be deleted in the source
    • partially transferred files (e.g. because of network problems) must be finished on the next sync
    • new files in the source directory must be transferred to the destination
    • deleted files in the destination directory must not be re-transferred

    That means the source system has basically a master role, except that deleted files in the destination will not be forced back.

    Both Linux systems have rsync/ssh/scp available.

    New files in the source directory are created in such a way that one can use their mtime to detect them, e.g.:

    if mtime(file) > date-of-last-sync then: it is a new file that needs to be transferred
    

    Also, existing files are not changed in the source directory, i.e. the sync does not need to check for differences in already (completely) transferred files.

    • Admin
      Admin over 9 years
      @terdon, yes, but the challenge is that rsync would (by default) re-transfer files which were deleted from the destination directory.
    • Admin
      Admin over 9 years
      Sorry, I had only glanced at your question and did not notice you had mentioned rsync nor the full breadth of your requirements.
  • Anthon
    Anthon over 9 years
    This will re-transfer files deleted from the destination directory; the OP stated that this is not appropriate.
  • maxschlepzig
    maxschlepzig over 9 years
    and how to avoid re-transfers of files deleted in the destination directory?
  • BillThor
    BillThor over 9 years
    @maxschlepzig See my edit for three options. (One is to use a tool more appropriate for incremental transfers.) Rsync is not designed to automatically handle your use case.
  • AReddy
    AReddy about 8 years
    If --delete is added to rsync, it will transfer the files to the destination. It will not fulfill the question.