Run command automatically when files are copied into a directory


Solution 1

Per your bonus question, add the following line below the rsync command in the shell script I provide below. (I wrote this in a comment, but I'll officially add it to my answer here.)

    find /auto/std2/nat2/B -name '*.zip' -exec sh -c 'unzip -d "$(dirname "$1")" "$1"' sh {} ';'

This will unzip all the zip files that rsync copies from folder /auto/std1/nat1/A to /auto/std2/nat2/B. (Note the filename is passed to sh as a positional argument rather than embedded in the -c string, so filenames containing spaces or quotes can't break the command.)
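One caveat: the find above re-extracts every zip on every run. A sketch that only touches archives added since the previous run, tracked via a marker file (the `unzip_new` name and `.last_unzip` marker are my inventions, not part of the answer):

```shell
# unzip_new DIR: extract every *.zip under DIR that is newer than the
# marker file DIR/.last_unzip, then advance the marker. On the first run
# (no marker yet) it extracts everything.
unzip_new() {
    dir=$1
    marker=$dir/.last_unzip
    touch "$marker.next"                    # stamp taken before the scan
    if [ -e "$marker" ]; then
        find "$dir" -name '*.zip' -newer "$marker" \
            -exec sh -c 'for f; do unzip -o -d "$(dirname "$f")" "$f"; done' sh {} +
    else
        find "$dir" -name '*.zip' \
            -exec sh -c 'for f; do unzip -o -d "$(dirname "$f")" "$f"; done' sh {} +
    fi
    mv "$marker.next" "$marker"             # next run only sees newer zips
}
```

Call it after the rsync in the script below, e.g. `unzip_new /auto/std2/nat2/B`.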



If you have rsync installed why not just cron it and have rsync manage the file mirroring?

Create script myrsyncscript.sh

Don't forget to make it executable: chmod 700 myrsyncscript.sh

#!/bin/sh

LOCKFILE=/tmp/.hiddenrsync.lock

if [ -e "$LOCKFILE" ]; then
        echo "Lockfile exists, process currently running."
        echo "If no processes exist, remove $LOCKFILE to clear."
        echo "Exiting..."
#        mailx -s "Rsync Lock - Lock File found" [email protected] <<+
#Lockfile exists, process currently running.
#If no processes exist, remove $LOCKFILE to clear.
#+
        exit 1
fi

touch "$LOCKFILE"
timestamp=`date +%Y-%m-%d::%H:%M:%S`
echo "Process started at: $timestamp" >> "$LOCKFILE"

## Run rsync if no lockfile was found
rsync -a --no-compress /auto/std1/nat1/A /auto/std2/nat2/B

echo "Task finished, removing lock file now at `date +%Y-%m-%d::%H:%M:%S`"
rm "$LOCKFILE"
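One weakness of a hand-rolled lock file: if the script is killed, the file stays behind and blocks every later run until someone removes it by hand. If your system has flock(1) from util-linux, a kernel-held lock avoids that, because it is released automatically when the process dies. A minimal sketch (the lock path is my choice):

```shell
#!/bin/sh
# flock-based variant of the lock: the lock is tied to an open file
# descriptor, so it vanishes automatically if the script is killed;
# no stale-lock cleanup is ever needed.
exec 9>/tmp/.hiddenrsync.flock
if ! flock -n 9; then
    echo "Another sync is still running, exiting." >&2
    exit 1
fi

rsync -a --no-compress /auto/std1/nat1/A /auto/std2/nat2/B
```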

Options breakdown:

-a is for archive, which preserves ownership, permissions, timestamps, symlinks etc.
--no-compress disables compression; between local directories there is no bandwidth bottleneck, so compression would only add CPU overhead

Additional options you might consider (see man rsync):

--ignore-existing

skip updating files that exist on receiver

--update

This forces rsync to skip any files which exist on the destination and have a modified time that is newer than the source file. (If an existing destination file has a modification time equal to the source file’s, it will be updated if the sizes are different.) Note that this does not affect the copying of symlinks or other special files. Also, a difference of file format between the sender and receiver is always considered to be important enough for an update, no matter what date is on the objects. In other words, if the source has a directory where the destination has a file, the transfer would occur regardless of the timestamps.

This option is a transfer rule, not an exclude, so it doesn’t affect the data that goes into the file-lists, and thus it doesn’t affect deletions. It just limits the files that the receiver requests to be transferred.
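Putting those together, a typical one-way mirror that never overwrites newer edits on the destination might look like this (a sketch using the question's paths; drop --update if B should always mirror A exactly):

```shell
# Mirror A into B, skipping any file in B whose mtime is newer than the
# source copy (see the --update description above).
rsync -a --no-compress --update /auto/std1/nat1/A /auto/std2/nat2/B
```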

Add it to cron like so, and set the frequency to whatever you feel most comfortable with:

Open cron with crontab -e and add the below:

### Every 5 minutes
*/5 * * * * /path/to/my/script/myrsyncscript.sh > /path/to/my/logfile 2>&1 

# * * * * *  command to execute
# │ │ │ │ │
# │ │ │ │ └───── day of week (0 - 6) (0 to 6 are Sunday to Saturday, or use names; 7 is Sunday, the same as 0)
# │ │ │ └────────── month (1 - 12)
# │ │ └─────────────── day of month (1 - 31)
# │ └──────────────────── hour (0 - 23)
# └───────────────────────── min (0 - 59)

Solution 2

Implementing @Izkata's suggestion using inotifywait with paced event response to keep the rsyncs down to at most 1 every 5 minutes while still responding quickly to initial changes:

#!/bin/sh
# usage: whateveryouwanttocallthis "$directorytowatch" rsync args here

cd "$1" || { echo "${0##*/}: can't cd to $1" >&2; exit 1; }
shift
# dry-run first to validate the rsync arguments before watching
rsync -nq "$@" || { echo "rsync doesn't like your $# arguments: $*" >&2; exit 1; }

notbefore=0
watchlock=.lock.inotifywait
rsynclock=.lock.rsync-pending

mkdir "$watchlock" ||
       { echo "${0##*/}: already running, rmdir '$PWD/$watchlock' if kill -9 took it out" >&2
         exit 1
       }
trap "rmdir '$watchlock'" 0 1 2 3 15

inotifywait -m -e close_write . |    # maybe also add moved_to
        while IFS= read -r _event; do
                # if a sync is already pending, let it pick up this change
                mkdir "$rsynclock" 2>/dev/null || continue
                now=$(date +%s)
                # schedule at least 2s out, but no sooner than notbefore
                schedule=$(( now + 2 >= notbefore ? now + 2 : notbefore ))
                notbefore=$(( schedule + 300 ))
                ( ( trap "rmdir '$rsynclock'" 0 1 2 3 15
                    sleep $(( schedule - now ))
                  )
                  # substitute your payload here
                  rsync --exclude='.lock.*' "$@" \
                          || echo ssia | mail -s "${0##*/} '$PWD' rsync failed" opswatch
                ) &
        done

The two-second delay both helps with batching up small bursts and allows time for the renames some programs do when a write is complete. Maybe 15 seconds would be better.
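The scheduling rule in the loop above, pulled out for clarity: each event schedules a sync at least two seconds out, but never earlier than `notbefore`, which then advances five minutes past whatever was scheduled. `next_schedule` is a hypothetical helper name, not part of the script:

```shell
# next_schedule NOW NOTBEFORE -> epoch second at which the next rsync may run
next_schedule() {
    now=$1 notbefore=$2
    echo $(( now + 2 >= notbefore ? now + 2 : notbefore ))
}
```

So a lone event fires about two seconds later, while a burst arriving just after a sync waits out the five-minute window.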

I can't test this (I'm on Windows at the moment), but I hope and believe it's at least pretty darn close.

Solution 3

You could use what DevNull suggested, which rsyncs periodically. Personally, I would use inotify. It is a nifty tool that you can point at a folder; it sets up kernel watches and notifies you whenever a filesystem change occurs. You can then trigger an rsync from inotify's notification.

For the specific case you mention at the end, you can inspect the change reported by inotify and write a simple bash script that checks whether a zip file has been added to folder A; if so, unzip it into folder B (or do whatever else you like) instead of just copying the zip.
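A minimal sketch of that idea, assuming inotify-tools is installed; `handle_file` and `watch_loop` are names I made up, and the paths default to the ones from the question:

```shell
#!/bin/sh
# Watch folder A and react per file type: unzip archives into B,
# plain-copy everything else. Requires inotifywait from inotify-tools.
SRC=${SRC:-/auto/std1/nat1/A}
DST=${DST:-/auto/std2/nat2/B}

# Decide what to do with one new file in A, given its bare name.
handle_file() {
    case $1 in
        *.zip) unzip -o -d "$DST" "$SRC/$1" ;;
        *)     cp -p "$SRC/$1" "$DST/" ;;
    esac
}

# Emit one filename per completed write or move into A and dispatch it.
watch_loop() {
    inotifywait -m -e close_write -e moved_to --format '%f' "$SRC" |
        while IFS= read -r name; do
            handle_file "$name"
        done
}
```

Run `watch_loop &` from a startup script; note that inotifywait only sees events in the watched directory itself unless you add `-r` for recursive watches.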


Author: Admin

Updated on September 18, 2022

Comments

  • Admin
    Admin over 1 year

    I have two folders called: A and B, in different paths on the same computer. When I add any new file(s) into folder A, I want to copy it to folder B automatically.

    My folders:

    /auto/std1/nat1/A
    /auto/std2/nat2/B
    

    What I currently do to copy the files:

    cp -r A B
    

    But I want this process to run automatically in the background for every new file and folder in A into B.

    Added question/problem

    While copying files I would like specific actions to be performed on certain files types, example: when I have a zip file in folder A, I would like it to unzip this file in folder B automatically.

    This is on a CentOS 7 system.

  • Admin
    Admin over 9 years
    Thanks! Do I need to make it a bash script that runs in the background all the time?
  • devnull
    devnull over 9 years
    No, cron will automatically run this rsync every 5 minutes. You can test it by kicking it off manually first (as you always should before setting up a cron job): run rsync -a --no-compress /auto/std1/nat1/A /auto/std2/nat2/B in your terminal to make sure it works. Then when you set the cron, it will run every 5 minutes. You can make it more frequent if you feel frequency is an issue, e.g. every minute with */1, but you might run into rsync conflicts. I will update my answer with a lock file to ensure no conflict.
  • Admin
    Admin over 9 years
    Thanks for your kindness. This is great. One more question regarding rsync: can I use it to do jobs other than copying files? i.e. if I add a zip file to folder A, does it offer a way to unzip it automatically into folder B?
  • devnull
    devnull over 9 years
    @MJA No problem. I updated my answer with a script version that creates a lock file first, to ensure that if you run it frequently and there are a lot of files, you don't spin off multiple rsyncs and things get out of hand. I also included an email you can send yourself if you uncomment the email part in the lock-file check and add your email address (assuming you have mailx on the server as well). Let me know if you have any problems with this. Also note I updated the cron too, so test-run it first: /path/to/my/script/myrsyncscript.sh > /path/to/my/logfile 2>&1
  • devnull
    devnull over 9 years
    @MJA, no, rsync doesn't unzip files (that I'm aware of), but you can easily add that functionality to the script I posted above by running a find on directory B after rsync completes to unzip the files.
  • Admin
    Admin over 9 years
    It works fine. This is fantastic. Many thousands of thanks :)
  • devnull
    devnull over 9 years
    @MJA Right below the rsync, add find /auto/std2/nat2/B -name '*.zip' -exec sh -c 'unzip -d `dirname {}` {}' ';' and it should do the file unzipping for you.
  • Sobrique
    Sobrique over 9 years
    inotify keeps your sysadmin happy; he'll get quite grumpy about five-minute find/rsync runs.
  • Sobrique
    Sobrique over 9 years
    Be cautious with the run frequency of rsync and find. If a significant number of files are involved, it can thrash your filesystem, and that's not good news.
  • devnull
    devnull over 9 years
    @Sobrique, rsync can be tweaked enough to keep the performance impact to a minimum (assuming you aren't mirroring TB of data, but in that case you won't be able to save much performance anyway). I updated my answer with some additional performance-saving options in case the OP or anyone else is interested.
  • Sobrique
    Sobrique over 9 years
    Yes, there are options for doing it; I'm just trying to suggest caution. I have had systems hammered by badly thought-out cron jobs, and find/rsync is one of the key culprits: even if there's nothing to do, they perform a full filesystem traversal and generate a burst of IOPS.