Checking if a file exists in several directories
Solution 1
You don't mention whether you need to keep the files (perhaps removing duplicates?), hard-link them, or do something else.
So, depending on your intention, the best solution would be to use a program such as rdfind (not interactive), fdupes (more interactive, letting you choose which files to keep), duff (which only reports duplicate files), or one of many others.
If you want something fancier with a GUI that lets you choose what to keep via a point-and-click interface, then fslint (via its fslint-gui command) would be my recommended choice.
All of the above are available in Debian's repository and, by extension, I think they are also in Ubuntu's and Linux Mint's repositories, if that's what you are using.
Solution 2
This could be very slow if you traverse /downloads or /media for each file name. So traverse each hierarchy only once, store the list of file names, and then process the lists.
For simplicity, I assume that your file names don't contain any newlines.
find /downloads -type f | sed 's!^.*/\(.*\)$!\1/&!' |
sort -t / -k1,1 >/tmp/downloads.find
find /media/tv /media/music /media/movie -type f |
sed 's!^.*/\(.*\)$!\1/&!' |
sort -t / -k1,1 >/tmp/media.find
At this point, the two .find files contain lists of file paths, with the name of the file prepended, sorted by file name. Join the files on the first /-separated field, and clean up the result a bit.
join -j 1 -t / /tmp/downloads.find /tmp/media.find |
sed -e 's![^/]*/!!' -e 's!//! has the same name as /!'
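The asker's stated goal in the comments is to find downloads that have not yet been copied to the media directories. A minimal sketch of that variant, under the same no-newlines restriction: join's -v 1 option prints the lines of the first list that have no partner in the second. The function names list_by_name and not_yet_copied are hypothetical, and mktemp is assumed to be available.

```shell
# Hypothetical helper: print every regular file under the given directories
# as "name/path", sorted by name (same transformation as the pipeline above).
list_by_name() {
    find "$@" -type f | sed 's!^.*/\(.*\)$!\1/&!' | LC_ALL=C sort -t / -k1,1
}

# Hypothetical helper: print the files under the first directory whose name
# does not occur anywhere under the remaining directories.
# The body runs in a subshell, so its variables do not leak into the caller.
not_yet_copied() (
    src=$1
    shift
    src_list=$(mktemp) || exit 1
    dst_list=$(mktemp) || exit 1
    list_by_name "$src" >"$src_list"
    list_by_name "$@" >"$dst_list"
    # -v 1: keep only lines of the first list without a partner in the second;
    # the sed then strips the prepended name again.
    LC_ALL=C join -t / -j 1 -v 1 "$src_list" "$dst_list" | sed 's![^/]*/!!'
    rm -f "$src_list" "$dst_list"
)
```

It could be called as not_yet_copied /downloads /media/tv /media/movie /media/music. Like the rest of this solution, it compares names only, not contents.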
Solution 3
This will list all files in downloads that are also in your specified /media subdirectories:
find /downloads -type f | while IFS= read -r file ; do
bn=$(basename "$file")
find /media/tv /media/movie /media/music -type f -name "$bn"
done
And this will just print a message for each file that has been found in one of those /media subdirectories:
find /downloads -type f | while IFS= read -r file ; do
bn=$(basename "$file")
count=$(find /media/tv /media/movie /media/music -type f -name "$bn" | wc -l)
[ "$count" -gt 0 ] && printf "found %s\n" "$file"
done
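One caveat with the loops above: -name treats its argument as a shell pattern, so a file name containing *, ? or [ may match the wrong files, or fail to match itself. A minimal sketch of a workaround, assuming that backslash-escaping those characters makes find match them literally; escape_glob and exists_by_name are hypothetical names.

```shell
# Hypothetical helper: backslash-escape the characters that find's -name
# would otherwise interpret as pattern syntax ([, ], *, ? and \ itself).
escape_glob() {
    printf '%s\n' "$1" | sed 's/[][*?\\]/\\&/g'
}

# Hypothetical helper: succeed if a regular file literally named "$1" exists
# somewhere under the remaining arguments (the directories to search).
# The body runs in a subshell, so its variables do not leak into the caller.
exists_by_name() (
    pattern=$(escape_glob "$1")
    shift
    [ -n "$(find "$@" -type f -name "$pattern" | head -n 1)" ]
)
```

In the loops above, the test could then read: exists_by_name "$bn" /media/tv /media/movie /media/music && printf 'found %s\n' "$file".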
If there are many files in /downloads, running find once for each file will be very slow. That can be solved (if you are using GNU find) by building a regular expression containing all the filenames you want to search for and using GNU find's -regex or -iregex options.
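REGEXP="^.*/\("
# Read the list from a file rather than a pipe: in bash, a loop on the
# receiving end of a pipe runs in a subshell and its changes to REGEXP are lost.
find /downloads -type f >/tmp/downloads.list
while IFS= read -r file ; do
bn=$(basename "$file" | sed -e 's/\./\\./g')
REGEXP="$REGEXP\|$bn"
done </tmp/downloads.list
REGEXP="$REGEXP\)$"
find /media/tv /media/movie /media/music -type f -iregex "$REGEXP"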
And here's another version that doesn't use the shell built-in read, so it should be much faster:
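# The final sed drops the trailing \| (an empty alternative would match every
# file); the / before the group makes the names match whole file names only.
REGEXP=$(find /downloads -type f | sed -e 's/^.*\/// ; s/\([]*\ .|[]\)/\\\1/g ;
s/$/\\|/' | tr -d '\n' | sed -e 's/\\|$//')
find /media/tv /media/movie /media/music -type f -iregex "^.*/\($REGEXP\)$"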
Both of these regexp versions are limited by the maximum size of the arguments that can be passed to a command (the system's ARG_MAX limit): with too many files, they will fail.
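One way around that limit is to keep the names out of the command line altogether. A minimal sketch, assuming file names without newlines and case-sensitive matching (unlike -iregex): the download names go into a temporary file, and awk compares the last /-separated field of each media path against them. The function name same_name_in is hypothetical, and mktemp is assumed to be available.

```shell
# Hypothetical helper: print the files under the remaining directories whose
# name also occurs somewhere under the first directory.
# The body runs in a subshell, so its variables do not leak into the caller.
same_name_in() (
    src=$1
    shift
    names=$(mktemp) || exit 1
    # One base name per line; the names never appear on a command line.
    find "$src" -type f | sed 's!^.*/!!' | sort -u >"$names"
    # First input (the names file): remember each name.
    # Second input (stdin): print paths whose last field is a remembered name.
    find "$@" -type f |
        awk -F / 'NR == FNR { want[$0]; next } $NF in want' "$names" -
    rm -f "$names"
)
```

It could be called as same_name_in /downloads /media/tv /media/movie /media/music. No regular expression is built, so nothing in the names needs escaping.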
NOTE: like most other answers here, these examples do not cope with filenames that have newlines (\n) in them. Any other character, including space, is fine.
Solution 4
Here is an implementation in bash using brace expansion:
the_file=foo.mp3
for file in /downloads/media/{tv,movie,music}/"$the_file"; do
if [[ -e $file ]]; then
printf '%s found in %s:\n' "$the_file" "${file%/*}"
fi
done
andrew.vh
Updated on September 18, 2022
Comments
-
andrew.vh almost 2 years
I need a script that will look at the files in a directory and see whether each one exists in one of several directories.
I need something like this:
for files in /downloads/ # may or may not be in a sub-directory
do
  print if file exists in /media/tv, /media/movie, or /media/music
done
The files will not be in the root of the directory. I can't just search /media, because I don't want to search in cd-rom or videos.
I am using the latest version of Ubuntu server.
-
Bernhard over 11 years
As I understand it, /media/ is not a subfolder of /downloads/
-
user1106106 over 11 years
The original poster didn't mention it, but your method essentially only looks at the file names, not at the file contents. Which of the two the original poster intends is not clear, though.
-
andrew.vh over 11 years
Also, I didn't realize you could compare files based on contents. That would be nice, as sometimes I would like to rename the files.
-
andrew.vh over 11 years
The goal is to find files that I have downloaded but have not yet copied to my media directory. Although finding duplicates would be a nice way to clean up my collection, it's not what I'm looking to do in this script.
-
user1106106 over 11 years
@andrew.vh, any of the utilities above would catch files with the same names since, more generally, they look at the contents of the files, not just their names, as you already noticed.
-
IronSummitMedia over 11 years
This is the join command that worked for me:
join -j 1 -t / /tmp/downloads.find /tmp/media.find | sed -e 's![^/]*/!!' -e 's!//! has the same name as /!'
(tested on two computers running SLES 10 and 11.)
-
vonbrand about 11 years
That only checks in the current directory...