Sort files according to their extensions
Solution 1
While the general problem of identifying extensions is hard, you can clean up the script a bit:
- Tell find to only consider files with an extension: -iname '*.*'
- Use awk instead of cutting yourself.
- Use a script, and then tell find to exec that script.
Thus: a script called, say, move.sh:
#! /bin/bash
for i
do
ext=/some/where/else/$(awk -F. '{print $NF}' <<<"$i")
mkdir -p "$ext"
mv "$i" "$ext"
done
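To see what the awk call picks out, here is a minimal sketch. The helper name last_field and the sample names are made up for illustration, and it pipes the name in with printf rather than a here-string so that it also runs in plain sh.

```shell
# last_field is a hypothetical helper, not part of move.sh.
# It prints the last dot-separated field of its argument.
last_field() {
    printf '%s\n' "$1" | awk -F. '{print $NF}'
}

last_field "photo.jpg"        # jpg
last_field "archive.tar.gz"   # gz
last_field "README"           # README (no dot, so the whole name is the last field)
```

The last case is why find is told to match only '*.*': a name without a dot would otherwise be moved into a directory named after itself.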
Then make the script executable and run find thus (give the script's path, e.g. ./move.sh, if it is not on your PATH):
find . -name '*.*' -type f -exec move.sh {} +
This has the problem that you can't rearrange files within the folder that find is still traversing, so you could save the list first and then use xargs:
find . -name '*.*' -type f -print0 > /tmp/temp
xargs -0 move.sh < /tmp/temp
I'm not too sure of the efficiency involved, but another approach would be to get all the extensions, then move all the files involved in one swoop.
Something like:
find . -name '*.*' -type f -print0 | sed -z 's/.*\.//g' | sort -zu > /tmp/file-exts
This should get you a list of unique file extensions. Then our move.sh will look like:
#!/bin/bash
for i
do
mkdir -p "$i"
find . -name "*.$i" -type f -exec mv -t "$i" {} +
done
And we'll run it:
xargs -0 move.sh < /tmp/file-exts
I make quite a few assumptions in this post, such as sed and sort supporting -z (allowing them to work with the NUL-terminated lines that find and xargs thrive on).
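If you are unsure whether your sed and sort accept -z, a quick probe along these lines may help. This is only a sketch: the variable name z_supported is made up, and it assumes that a tool which does not know the option exits with a non-zero status.

```shell
# Probe for -z (NUL-separated records) in sort and sed.
# Assumption: an unsupported option makes the tool exit non-zero.
if printf 'b\0a\0' | sort -z >/dev/null 2>&1 &&
   printf 'a.b\0' | sed -z 's/.*\.//' >/dev/null 2>&1
then
    z_supported=yes
else
    z_supported=no
fi
echo "NUL-separated sed/sort available: $z_supported"
```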
Solution 2
Recursing into subdirectories
Parsing the output of find is unreliable. What if there was a file name with a newline in it? Use find … -exec …, which guarantees reliable processing.
find . -type f -exec sh -c '…' {} \;
The shell snippet … receives the file name in $0. Note that this is a separate shell process; it doesn't inherit shell variables or functions from the grandparent script. You can speed up processing by using the same shell subprocess to handle multiple files.
find . -type f -exec sh -c 'for x; do … done' _ {} +
This time, inside the loop, the file name is in the variable x.
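As a harmless way to try this pattern out, the sketch below fills the loop body with a printf instead of a move. The function name list_files and the bracketed output format are assumptions for illustration only.

```shell
# list_files is a hypothetical wrapper; $1 is the directory to scan.
list_files() {
    # The lone _ becomes $0 of the inner shell, so the file names
    # that find appends land in "$@", which "for x" iterates over.
    find "$1" -type f -exec sh -c '
        for x
        do
            printf "[%s]\n" "$x"
        done
    ' _ {} +
}

list_files .
```

Names with spaces reach the loop intact because find passes each one as a separate argument and nothing is parsed from text output.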
Breaking up the file name
Invoking external utilities such as sed, cut, etc. is fragile: you have to be extremely careful to avoid mangling some file names. You don't need that: the shell's built-in string processing features are enough for what you want to do here. Given a file name $x:
directory=${x%/*}
basename=${x##*/}
extension=…
if [ -n "$extension" ]; then
  mkdir -p "$directory/$extension"
  mv "$x" "$directory/$extension"
fi
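A small sketch of what these expansions produce may help. The sample path and the extra variables last and stem are made up for illustration and are not part of the snippet above.

```shell
x=./photos/2014/holiday.snap.jpg   # sample path, in the form find prints
directory=${x%/*}      # drop shortest suffix matching /*  -> ./photos/2014
basename=${x##*/}      # drop longest prefix matching */   -> holiday.snap.jpg
last=${basename##*.}   # drop longest prefix matching *.   -> jpg
stem=${basename%.*}    # drop shortest suffix matching .*  -> holiday.snap
printf '%s\n' "$directory" "$basename" "$last" "$stem"
```

A single % or # removes the shortest match and a doubled one removes the longest, which is what decides whether you split at the first or the last separator.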
The extension
What is the extension of a file? It's the part after one of the . characters in the name. There's no standard that says which one. It's up to you to decide what you consider to be the extension in cases like foo.tar.gz or bar-1.2.
Here's some example code that considers common compression extensions to nest, and that requires extensions to contain a letter, so that foo-1.2.tar.gz is considered to have the extension tar.gz.
extension=
while case "$basename" in # match on the whole name so that a . is required
    *.gz|*.bz2|*.xz) extension=.${basename##*.}$extension;; # stackable extension
    *) false;;
  esac
do
  basename=${basename%.*}
done
case "${basename##*.}" in
  "$basename") :;; # no . ==> no extension
  *[!0-9A-Za-z]*) :;; # only allow alphanumeric characters
  *[A-Za-z]*) extension=${basename##*.}$extension;; # non-stackable extension
  *) false;; # require at least one letter
esac
extension=${extension#.}
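To check the rules against a few names, the logic can be wrapped in a function, as in the sketch below. The name get_extension is hypothetical. One adjustment is made here as an assumption about the intent: the loop matches *.gz, *.bz2 and *.xz against the whole name, so that a file literally named gz (with no dot) cannot keep the loop spinning.

```shell
# get_extension is a hypothetical wrapper; it prints the extension of $1,
# or an empty line when the name has none under these rules.
get_extension() {
    basename=${1##*/}
    extension=
    # Peel off stackable compression suffixes; each pattern requires a dot.
    while case "$basename" in
            *.gz|*.bz2|*.xz) extension=.${basename##*.}$extension;;
            *) false;;
          esac
    do
        basename=${basename%.*}
    done
    case "${basename##*.}" in
        "$basename") :;;            # no dot left, so nothing more to add
        *[!0-9A-Za-z]*) :;;         # reject non-alphanumeric characters
        *[A-Za-z]*) extension=${basename##*.}$extension;;
    esac
    printf '%s\n' "${extension#.}"
}

get_extension ./a/foo-1.2.tar.gz   # tar.gz
get_extension abc.jpg              # jpg
get_extension bar-1.2              # (empty: "2" contains no letter)
```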
Edward Torvalds
Updated on September 18, 2022
Comments
-
Edward Torvalds over 1 year
I have made a script that will sort files according to their extension and place them in the proper folder. For example, place abc.jpg in the directory jpg.
#!/bin/bash
#this script sorts files according to their extensions
oldIFS=$IFS
IFS=$'\n'
(find . -type f) > /tmp/temp
for var in `cat /tmp/temp`
do
name=`basename "$var"`
ext=`echo $name | cut -d'.' -f2- | cut -d'.' -f2- | cut -d'.' -f2- | cut -d'.' -f2- | cut -d'.' -f2- | cut -d'.' -f2- | cut -d'.' -f2-`
mkdir -p $ext
mv "$var" $ext/ 2> /dev/null
done
IFS=$oldIFS
Problems with this script:
- It relies on IFS, and I have been told to avoid using IFS as much as possible.
- It does not sort files that have no extension.
- It sorts a file like abc.tar.bz into a folder named bz, but such a file should go into a folder named tar.bz.
- See line 9 of my script: if a file name contains more dots than there are cut -d'.' -f2- calls in the script, part of the file name ends up in the extension. For example, a file named i.am.live.in.india.and.i.study.computer.science.txt will be placed in a folder named study.computer.science.txt.
You may also suggest any tweaks to make this script smaller and neater.
-
Edward Torvalds over 9 years
Let us continue this discussion in chat.
-
muru over 9 years
@edwardtorvalds it's the path where you want the sorted files to go to. Use it if you use the first approach (find with -exec). If you want the files to go in the same directory where you ran find, remove it and use the second approach (find, followed by xargs).
-
Edward Torvalds over 9 years
how come the script works fine for i.am.live.in.india.and.i.study.computer.science.txt and not for abcdg.tar.bz ?
-
Edward Torvalds over 9 years
oh i see you grabbing the last part :p
-
muru over 9 years
@edwardtorvalds as I said, determining whether an arbitrary string is an extension or not is too hard. Gilles used extra code for some known extensions. I didn't.
-
Mauricio Gracia Gutierrez about 3 years
I have just used the second approach and it works very fast and fine, thanks