Creating directories in a loop and moving files to those directories

Solution 1

list=(*)     # an array containing all the current file and subdir names
nd=5         # number of new directories to create
((nf = (${#list[@]} / nd) + 1))  # number of files to move to each dir
                                 # add 1 to deal with shell's integer arithmetic

# brace expansion doesn't work with variables due to order of expansions
echo mkdir $(printf "direc%d " $(seq $nd))

# move the files in chunks
# this is an exercise in arithmetic to get the right indices into the array
for (( i=1; i<=nd; i++ )); do
    echo mv "${list[@]:(i-1)*nf:nf}" "direc$i"
done

Remove the two "echo" commands after you have tested this.
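To see what the slice arithmetic produces before touching any files, here is a minimal bash sketch that runs the same loop over a made-up list of 12 names (the names `f01` ... `f12` and the `plan` array are assumptions for illustration only, not part of the answer above):

```shell
#!/usr/bin/env bash
# Dry run of the chunking arithmetic on a hypothetical list of 12 names.
list=(f01 f02 f03 f04 f05 f06 f07 f08 f09 f10 f11 f12)
nd=5
(( nf = ${#list[@]} / nd + 1 ))      # 12 / 5 is 2 in integer arithmetic, so nf is 3

plan=()
for (( i = 1; i <= nd; i++ )); do
    chunk=("${list[@]:(i-1)*nf:nf}")  # slice: offset (i-1)*nf, length nf
    plan+=("direc$i: ${chunk[*]}")
done
printf '%s\n' "${plan[@]}"
```

With these numbers the first four directories each receive three names and direc5 receives none, so the real mv for direc5 would be called with a destination but no source files. Checking that the slice is non-empty before calling mv is one way to guard against that.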

Or, if you want to go with a fixed number of files per directory, this is simpler:

list=(*)
nf=10
for ((d=1, i=0; i < ${#list[@]}; d++, i+=nf)); do
    echo mkdir "direc$d"
    echo mv "${list[@]:i:nf}" "direc$d"
done

Solution 2

#!/bin/sh
d=1                # index of the directory
f=0                # number of files already copied into direc$d
for x in *; do     # iterate over the files in the current directory
  # Create the target directory if this is the first file to be copied there
  if [ $f -eq 0 ]; then mkdir "direc$d"; fi
  # Move the file
  mv -- "$x" "direc$d"
  f=$((f+1))
  # If we've moved 5 files, go on to the next directory
  if [ $f -eq 5 ]; then
    f=0 d=$((d+1))
  fi
done
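The loop above moves everything that `*` matches, including pre-existing subdirectories, and keeps going if mkdir or mv fails. A hedged variant, assuming you only want regular files moved and want to stop at the first error, might look like this (the function name `distribute` and its two parameters are hypothetical):

```shell
#!/bin/sh
# distribute PREFIX COUNT (hypothetical helper): move regular files from the
# current directory into PREFIX1, PREFIX2, ... with COUNT files in each.
distribute() {
    prefix=$1 per_dir=$2
    d=1 f=0
    for x in ./*; do
        [ -f "$x" ] || continue              # skip directories and other non-files
        if [ "$f" -eq 0 ]; then
            mkdir -- "$prefix$d" || return 1 # stop if the name is already taken
        fi
        mv -- "$x" "$prefix$d/" || return 1  # stop on the first failed move
        f=$((f + 1))
        if [ "$f" -eq "$per_dir" ]; then
            f=0 d=$((d + 1))
        fi
    done
}
```

Calling `distribute direc 5` in a directory with seven regular files should leave five of them in direc1 and two in direc2.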

Solution 3

d=0; set ./somedirname                         #init dir counter; name
for f in ./*                                   #glob current dir
do   [ -f "$f" ] && set "$f" "$@" &&           #if "$f" is file and...
     [ "$d" -lt "$((d+=$#>5))" ]  || continue  #d<(d+($#>5)) or continue
     mkdir "$6$d" && mv "$@$d"    || !  break  #mk tgtdir and mv 5 files or
     shift 5                                   #break with error
done

The above command takes advantage of the shell's arg array's ability to concatenate strings to its head and tail. For example, if you wrote a function:

fn(){ printf '<%s>\n' "42$@"; }

...and called it like:

fn show me

...it would print:

<42show>
<me>

...because you can prepend to the first element or append to the last element of the arg array simply by placing your prefix or suffix inside the same quotes as "$@".
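The same rule covers the suffix case that the loop relies on. A small sketch, with function names made up for the demonstration:

```shell
#!/bin/sh
# Text before "$@" sticks to the first argument; text after it sticks to the last.
prefix_demo() { printf '<%s>\n' "42$@"; }
suffix_demo() { printf '<%s>\n' "$@42"; }
both_demo()   { printf '<%s>\n' "pre-$@-post"; }

prefix_demo show me   # prints <42show> then <me>
suffix_demo show me   # prints <show> then <me42>
both_demo show me     # prints <pre-show> then <me-post>
```

The suffix form is what `mv "$@$d"` uses: the counter is glued onto the last argument, which is the directory name.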

The arg array does double duty here in that it also serves as the counter - the $# shell parameter always tells us exactly how many elements we have stacked so far.

But... Here's a step-by-step:

  1. d=0; set ./somedirname
    • The $d var will increment by one for every new directory created. Here it is initialized to zero.
    • ./somedirname can be whatever name you like. The ./ prefix is important, though: it anchors all operations to the current directory, and it lets you use any kind of name safely (even one that contains newlines or begins with a hyphen, though that is probably not advisable). Because the argument always starts with ./, no command will ever misinterpret it as an option on its command line.
  2. for f in ./*
    • This starts a loop over all (if any) of the matches for * in the current directory. Again, each match is prefixed with ./.
  3. [ -f "$f" ]
    • verifies that each iteration's match is definitely a regular file (or a link to one) and...
  4. set "$f" "$@"
    • stacks the matches one in front of another in the shell array. In this way the ./somedirname is always at the tail of the array.
  5. [ "$d" -lt "$((d+=$#>5))" ]
    • adds 1 to $d if there are more than 5 array elements in "$@" while simultaneously testing the result for an increment.
  6. || continue
    • If any one of the [ -f "$f" ] set ... [ "$d" -lt... commands does not return true the loop continues to the next iteration and does not attempt to complete the rest of the loop. This is both efficient and safe.
  7. mkdir "$6$d"
    • Because the continue clause ensures we only reach this point if $# > 5, our ./somedirname is now in $6 and the value of $d has just been incremented by one. So for the first group of 5 files matched and moved this creates a directory named ./somedirname1, for the fifth group ./somedirname5, and so on. Importantly, mkdir fails if anything already exists at the target pathname. In other words, this command succeeds only if no directory (or other file) of the same name exists already.
  8. mv "$@$d"

    • This expands the array while affixing the value for $d to the tail of the last element - which is the target directory name. So it expands like:

    mv ./file5 ./file4 ./file3 ./file2 ./file1 ./somedirname1

    • ... which is exactly what you want to happen.
  9. || ! break

    • If either of the previous two commands does not complete successfully for any reason, the for loop breaks. The ! inverts break's return status, which is typically zero, so the loop returns 1. This way the loop returns false if an error happens in either of those commands. This is important: for loops, unlike while/until loops, do not imply tests, only iteration. Without explicitly testing the return of those two commands the shell will not necessarily halt on error, and set -e would likely kill the parent shell altogether. Testing explicitly instead ensures a meaningful return value and that the loop will not continue to iterate if anything goes wrong.

    • At a quick glance, this appears to be the only answer here that will halt if, for example, mkdir ./somedirname does not return true. All the others will continue to loop (and likely repeat the error or, worse, move files from the current directory into an existing directory and possibly over other files of the same name there). When working with arbitrary filenames in loops you should always test for the existence of both the source file and the target.

  10. shift 5
    • This shifts away the first 5 elements in the shell's arg array - which puts ./somedirname back in $1 and resets the array state for the next iteration.
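The steps above can be traced without touching the filesystem. This sketch replaces the `./*` glob with seven made-up names and echoes the mkdir and mv commands instead of running them (the `trace` function and the names `./a` ... `./g` are assumptions for illustration):

```shell
#!/bin/sh
# Trace the argument array through the loop with made-up names; nothing is moved.
trace() {
    d=0
    set -- ./somedirname
    for f in ./a ./b ./c ./d ./e ./f ./g; do   # stand-ins for the ./* matches
        set -- "$f" "$@"                       # push the new name onto the front
        [ "$d" -lt "$((d += $# > 5))" ] || continue
        echo "mkdir $6$d"
        echo "mv" "$@$d"
        shift 5                                # drop the five names just handled
    done
    echo "left over: $# args, d=$d"
}
trace
```

In this trace one group of five is dispatched to ./somedirname1, and the last two names are still sitting in the array when the loop ends, so a trailing group of fewer than five files is never moved by the loop as written.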

Solution 4

With this awk program you can create the shell commands and, in case of doubt, inspect in advance whether they are correct...

awk -v n=5 '{ printf "mv \"%s\" %s\n", $0, "direc" int((NR-1)/n)+1 }' list

If you are happy with the output, pipe the whole command into sh. Also, if you want to avoid the extra file 'list', you can generate the listing on the fly; the whole program would then be...

ls  |  awk -v n=5 '{ printf "mv \"%s\" %s\n", $0, "direc" int((NR-1)/n)+1 }'  |  sh

You can use a value other than 5 by changing the setting n=5.

If you also want to create the target directories on the fly here's a variant...

ls  |  awk -v n=5 '
         # start a new directory before every group of n names (also works for n=1)
         (NR-1)%n==0 { ++c ; dir="direc"c ; print "mkdir "dir }
         { printf "mv \"%s\" %s\n", $0, dir }
'  |  sh
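
To inspect what would be executed, you can leave off the final `| sh`. The following sketch feeds seven made-up names to the same kind of awk program and prints the generated commands (the `plan` wrapper and the sample names are assumptions; the group test is written as `(NR - 1) % n == 0` so that n=1 also works):

```shell
#!/bin/sh
# Dry run: print the generated commands instead of piping them into sh.
plan() {
    awk -v n="$1" '
        (NR - 1) % n == 0 { ++c; dir = "direc" c; print "mkdir " dir }
        { printf "mv \"%s\" %s\n", $0, dir }
    '
}
printf '%s\n' a b c d e f g | plan 5
```

This prints `mkdir direc1`, five mv lines for a through e, then `mkdir direc2` and two mv lines for f and g. Because the generated text is parsed again by sh, names that contain double quotes, `$`, backticks, or newlines would not survive this approach unchanged.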
Author by

Polar.Ice

Trying to learn R and Linux !

Updated on September 18, 2022

Comments

  • Polar.Ice
    Polar.Ice over 1 year

    Consider a directory with N files.

    I can list the files alphabetically sorted and store the list by

    ls > list
    

    Then I want to create n sub-directories in same directory, which can be done by

    mkdir direc{1..n}
    

    Now I want to move the first m, say the first 5 (1-5), files from list to direc1, the next 5 files (6-10) to direc2, and so on.

    This may be a trivial task for you guys, but I am not able to do it at the moment. Please help.

    • Polar.Ice
      Polar.Ice about 9 years
      Thank you all for the answers. All of them helped me learn something useful. Sadly, I am new here, so I can't upvote any of your answers and can only accept one as the best answer.
  • mikeserv
    mikeserv about 9 years
    Your echo mkdir $(printf "$direc%d " $(seq $nd)) command is subject to all kinds of environment-related issues. It's also usually best to ensure everything has gone well, by verifying exit statuses and so on, when working with arbitrary filenames. In other words, it is probably bad practice to create a bunch of directories before ensuring that you have the files you mean to put in them.
  • peterph
    peterph about 9 years
    Short explanation would be nice (like 1 sentence per line). :) What does the || ! break part do, actually?
  • mikeserv
    mikeserv about 9 years
    @peterph - it's pretty self-explanatory. && means execute the following command only if the previous one was successful. || means the opposite. ! reverses the following command's return. You should always use that if breaking for an error - the loop should not return 0 if an error occurs.
  • mikeserv
    mikeserv about 9 years
    @peterph - I did just realize that -p was probably a bad way to go though.
  • peterph
    peterph about 9 years
    Well, it is self-explanatory for people with a certain level of knowledge - I was afraid your level was a bit above the OP's - and mine as well, it seems, for that matter. Which is why I asked about the ! break - I was somehow unable to find it in bash(1) at first (it is in SHELL GRAMMAR - Pipelines for those curious). :)
  • peterph
    peterph about 9 years
    I agree, knowing what POSIX allows and only then laying back and using whatever one's preferred shell offers is better. However, those of us who mostly write shell scripts for their own use, or who have some sort of guarantee that a particular shell will be used, usually take the easier way. There is always the threat of WTF moments when POSIX is the best one has at hand for some reason, but there are costs to both approaches.