Bash script to rename files from a text file source

Solution 1

This ought to work:

sh <(sed -r 's/^\s*(.*)\s+([0-9\.]+)\s+([0-9A-Z]{8}\.dat)\s*$/mv -iv \3 "\2 \1"/' files)

... where files is the name of your source file.

This passes the output of the sed command to a new instance of sh (the shell) using process substitution, a feature of bash and similar shells. The output of the sed command is:

mv -iv 000011F4.dat "0.1 New File Name.xlsx"
mv -iv 000011F5.dat "0.2 New File Name.xlsx"
mv -iv 000011F6.dat "0.3 New File Name.xlsx"
mv -iv 000011F7.dat "0.4 New File Name.xlsx"
mv -iv 000011F8.dat "0.5 New File Name.xlsx"
mv -iv 000011F9.dat "0.6 New File Name.xlsx"
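Where process substitution is not available, piping the generated commands into sh should have the same effect. The following is only a sketch: generate is a hypothetical stand-in for the sed command and simply prints two harmless commands.

```shell
# Hypothetical stand-in for the sed command: prints one shell command per line.
generate() { printf '%s\n' 'echo first' 'echo second'; }

# Pipe form: sh reads the generated commands from standard input.
# In bash, the process-substitution form would be: sh <(generate)
piped=$(generate | sh)

printf '%s\n' "$piped"
```

With the real command, this would read sed -r '...' files | sh.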

Taking the sed command apart, it searches for a pattern:

  • ^ - the beginning of the line
  • \s* - any whitespace at the start
  • (.*) - any characters (the parentheses store the result to \1)
  • \s+ - at least one whitespace character
  • ([0-9\.]+) - at least one of 0-9 and . (stored to \2)
  • \s+ - at least one whitespace character
  • ([0-9A-Z]{8}\.dat) - 8 characters in 0-9 or A-Z, followed by .dat (stored to \3)
  • \s* - any whitespace at the end
  • $ - the end of the line

... and replaces it with mv -iv \3 "\2 \1", where \1 to \3 are the previously stored values. You can use something other than a space between the version number and the rest of the filename, if you like.
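To check the substitution before letting it rename anything, the generated commands can be printed instead of executed. This is a sketch under assumptions, not part of the original answer: it works in a scratch directory on a two-line sample list, and it uses sed -E with the POSIX class [[:space:]] in place of GNU's \s.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample list in the same three-column layout as the question.
printf '%s\n' \
  'New File Name.xlsx 0.1  000011F4.dat ' \
  'New File Name.xlsx 0.2  000011F5.dat ' > files

# Same substitution as above, captured and printed rather than passed to sh.
preview=$(sed -E 's/^[[:space:]]*(.*)[[:space:]]+([0-9.]+)[[:space:]]+([0-9A-Z]{8}\.dat)[[:space:]]*$/mv -iv \3 "\2 \1"/' files)
printf '%s\n' "$preview"
```

If the printed mv lines look right, the same command can then be fed to sh.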

Here's the result:

$ ls -l
total 60
-rw-rw-r-- 1 z z   0 Aug  8 14:15 000011F4.dat
-rw-rw-r-- 1 z z   0 Aug  8 14:15 000011F5.dat
-rw-rw-r-- 1 z z   0 Aug  8 14:15 000011F6.dat
-rw-rw-r-- 1 z z   0 Aug  8 14:15 000011F7.dat
-rw-rw-r-- 1 z z   0 Aug  8 14:15 000011F8.dat
-rw-rw-r-- 1 z z   0 Aug  8 14:15 000011F9.dat
-rw-rw-r-- 1 z z 222 Aug  8 13:47 files
$ sh <(sed -r 's/^\s*(.*)\s+([0-9\.]+)\s+([0-9A-Z]{8}\.dat)\s*$/mv -iv \3 "\2 \1"/' files)
`000011F4.dat' -> `0.1 New File Name.xlsx'
`000011F5.dat' -> `0.2 New File Name.xlsx'
`000011F6.dat' -> `0.3 New File Name.xlsx'
`000011F7.dat' -> `0.4 New File Name.xlsx'
`000011F8.dat' -> `0.5 New File Name.xlsx'
`000011F9.dat' -> `0.6 New File Name.xlsx'
$ ls -l
total 60
-rw-rw-r-- 1 z z   0 Aug  8 14:15 0.1 New File Name.xlsx
-rw-rw-r-- 1 z z   0 Aug  8 14:15 0.2 New File Name.xlsx
-rw-rw-r-- 1 z z   0 Aug  8 14:15 0.3 New File Name.xlsx
-rw-rw-r-- 1 z z   0 Aug  8 14:15 0.4 New File Name.xlsx
-rw-rw-r-- 1 z z   0 Aug  8 14:15 0.5 New File Name.xlsx
-rw-rw-r-- 1 z z   0 Aug  8 14:15 0.6 New File Name.xlsx
-rw-rw-r-- 1 z z 222 Aug  8 13:47 files

Solution 2

sed 's/^\(.*\.xlsx\) \+\([[:digit:]]\+\.[[:digit:]]\+\) \+\(.[^ ]*\)/"\3" "\2\1"/' \
  <file_list | xargs -n 2 mv

This divides each line into three parts. The first, everything up to and including .xlsx, is the second part of the new name and becomes accessible as \1. Then it grabs the version and assigns it to \2. Last comes the old file name, captured as \3, which excludes the trailing space.

The result is quoted and passed to mv as arguments. The -n 2 ensures that each mv invocation receives exactly two arguments: the old and the new file name.

The spaces themselves do not pose a problem; what complicates matters is that your input list is not well structured. If the columns were swapped and the file names quoted, you could just use xargs and mv, without prior manipulation.
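As an illustration of that last point, here is a sketch with a hypothetical list named quoted_list, in which the old name comes first and the new name is double-quoted. It runs in a scratch directory with empty placeholder files.

```shell
# Scratch directory with two empty placeholder files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch 000011F4.dat 000011F5.dat

# Hypothetical well-structured list: old name, then the quoted new name.
cat > quoted_list <<'EOF'
000011F4.dat "0.1 New File Name.xlsx"
000011F5.dat "0.2 New File Name.xlsx"
EOF

# xargs honours the double quotes, so each mv receives exactly two arguments.
xargs -n 2 mv < quoted_list

ls
```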

Solution 3

The spaces in the file name, and the use of multiple spaces between some columns, make this harder, but by no means insurmountable.

Read the list file line by line. Usually one would use while IFS= read -r; do …, but here it might be more robust to strip leading and trailing whitespace. For each line:

  • Break each line into three parts. One way to do that is with regex matching. [[:space:]]+ matches one or more whitespace characters (space or tab); [^[:space:]]+ matches one or more non-whitespace characters. Parenthesized groups can be retrieved via the BASH_REMATCH variable.
    Another way, less convenient here, would be with ${VAR##PATTERN} and ${VAR%PATTERN} to strip off a prefix or suffix from a variable respectively.
  • Finally perform the move. Don't forget to log any errors.
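The parameter-expansion alternative mentioned above could look roughly like this. It is a sketch on one hard-coded sample line, and it assumes trailing whitespace has already been stripped (as read does by default); the variable rest is just a helper introduced for illustration.

```shell
# Sample line; in the real loop this would come from the list file.
line='New File Name.xlsx 0.1  000011F4.dat'

# Last field: remove the longest prefix that ends in whitespace.
old_name=${line##*[[:space:]]}

# Drop the old name, then the whitespace that preceded it.
rest=${line%"$old_name"}
rest=${rest%"${rest##*[![:space:]]}"}

# The version is now the last field of what remains.
version=${rest##*[[:space:]]}

# Whatever is left, minus trailing whitespace, is the new name.
new_name=${rest%"$version"}
new_name=${new_name%"${new_name##*[![:space:]]}"}

printf '%s | %s | %s\n' "$old_name" "$version" "$new_name"
```

Each field needs two expansions (one to cut, one to trim), which is why the regex approach is more convenient here.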

Putting it all together:

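ret=0
# read with the default IFS strips leading and trailing whitespace;
# -r keeps any backslashes in the line literal.
while read -r line; do
  # Groups: new name, version, old name, separated by runs of whitespace.
  if [[ $line =~ (.*[^[:space:]])[[:space:]]+([^[:space:]]+)[[:space:]]+([^[:space:]]+) ]]; then
    new_name="${BASH_REMATCH[1]}"
    version="${BASH_REMATCH[2]}"
    old_name="${BASH_REMATCH[3]}"
    mv -- "$old_name" "$version$new_name" || ret=1
  else
    # Report unparseable lines on standard error.
    echo "Malformed line: $line" >&2
  fi
done <name_list.txt
exit $ret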

Author: user2472419

Updated on September 18, 2022

Comments

  • user2472419
    user2472419 almost 2 years

    I'm fairly new to bash; I can just about perform simple administrative tasks with simple commands 1 at a time. However, I've been tasked with renaming some files in a directory using a text file as the source for my renaming and would really appreciate a few pointers, as I am well out of my depth.

    Let me explain:

    New File Name.xlsx 0.1  000011F4.dat 
    New File Name.xlsx 0.2  000011F5.dat 
    New File Name.xlsx 0.3  000011F6.dat 
    New File Name.xlsx 0.4  000011F7.dat 
    New File Name.xlsx 0.5  000011F8.dat 
    New File Name.xlsx 0.6  000011F9.dat 
    

    The source text file I have resembles the above somewhat. The intention is that the first 'column' is the new name for the file, the middle is the version and the third is the current filename.

    I need to rename the .dat files in the directory, changing them to the names presented in the first column. I also need to prepend the version number 0.1, 0.2 etc... to the beginning of each file.

    I have a few questions:

    Is it a massive problem that the files have whitespace in them? Would it be better adding " " around each file string?

    Basically I have no idea where to start and any help would be massively appreciated. As you can see, it's slightly more complex than a usual renaming, given the need to add the version column to the beginning of the filename and the whitespace in the list.

    • evilsoup
      evilsoup almost 11 years
      It isn't an insurmountable problem that the filenames have spaces in them, but it does rule out using many simple approaches. Without spaces this would be pretty trivial with awk or cut, but with the spaces you have to go with uglier, longer commands as in the answers given.
  • user2472419
    user2472419 almost 11 years
    Thanks a lot. You've saved my bacon. I also appreciate the accompanying explanation; it's nice to see what's going on in a command before you run it. To be totally honest, I'm still not 100% certain what's happening. Is that using the sort of syntax you'd see in a regular expression? More specifically, I don't quite understand how that has defined the variables /1 /2 and /3. Thanks as well to the other responses, I appreciate the help!
  • user2472419
    user2472419 almost 11 years
    Actually, this worked when I tested it at home last night but hasn't worked today in production. Apparently there's a problem with /dev/fd/63. I browsed to it and it doesn't exist, any ideas anybody? Google hasn't been much help on this one. Cheers.
  • Eli
    Eli almost 11 years
    What do you get if you don't do the process substitution? That is to say, just run sed -r 's/^\s*(.*)\s+([0-9\.]+)\s+([0-9A-Z]{8}\.dat)\s*$/mv -iv \3 "\2 \1"/' files without the sh <(...) part.
  • user2472419
    user2472419 almost 11 years
    In fact, it did execute, I forgot to remove the last ). However, nothing appears to have been renamed, I get a large list of mv commands after it has executed but all the files remain unchanged
  • user2472419
    user2472419 almost 11 years
    As a further addition, the script did run eventually. I think it was a permission error. After resolving this the script did rename some of the files, however many it just renamed to the version number, rather than putting it at the beginning. Some others were simply left as 000001f5.dat but with a ? on the end.
  • Eli
    Eli almost 11 years
    That suggests a problem with your input data; sed isn't going to just randomly ignore some filenames and not others.