How to rename file names to avoid conflict in Windows or Mac?
Solution 1
You could do something like:
rename 's/[<>:"\\|?*]/_/g' /path/to/file
This will replace each of these characters with a _. Note that you need not replace /, since it's an invalid character for filenames in both filesystems, but is used as the Unix path separator.
Extend to a directory and all its contents with:
find /path/to/directory -depth -exec rename 's/[<>:"\\|?*]/_/g' {} +
Note that in the example below, both / (which would otherwise mark the end of the pattern) and \ are escaped. To retain uniqueness, you could prepend a random number to the name:
$ rename -n 's/[<>:"\/\\|?*]/_/g && s/^/int(rand(10000))/e' a\\b
a\b renamed as 8714a_b
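For comparison, the same character replacement can be sketched in Python. This is a minimal illustration, not part of the original answer: the names RESERVED and sanitize are made up here, and the character class simply mirrors the one passed to rename above.

```python
import re

# Characters reserved on Windows, mirroring the rename pattern above.
# "/" is left out on purpose: it is the Unix path separator.
RESERVED = re.compile(r'[<>:"\\|?*]')


def sanitize(name: str, replacement: str = "_") -> str:
    """Return name with every reserved character replaced (hypothetical helper)."""
    return RESERVED.sub(replacement, name)


print(sanitize("Screenshot 2015-09-07-25:10:10"))
print(sanitize("a\\b"))
```

Like the plain rename call, this does nothing about two different names collapsing into the same result.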
A more complete solution should, at least:
- Convert all characters to the same case
- Use a sane counting system
That's to say, foo.mp3 should not become foo.mp3.1, but foo.1.mp3, since Windows is more reliant on extensions.
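The two requirements above (one case, and a counter placed before the extension) can be sketched as follows. This is only an illustration under stated assumptions: unique_name and the taken set are hypothetical, and the foo.1.mp3 numbering format is taken from the example above rather than from the script that follows.

```python
import os


def unique_name(name: str, taken: set) -> str:
    """Lower-case name and number it before the extension until it is unused."""
    candidate = name.lower()
    stem, ext = os.path.splitext(candidate)
    n = 0
    while candidate in taken:
        n += 1
        candidate = f"{stem}.{n}{ext}"
    taken.add(candidate)
    return candidate


taken = set()
for original in ["foo.mp3", "Foo.mp3", "FOO.MP3", "README"]:
    print(original, "->", unique_name(original, taken))
```

Here foo.mp3, Foo.mp3 and FOO.MP3 end up as foo.mp3, foo.1.mp3 and foo.2.mp3, so the extension survives.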
With that in mind, I wrote the following script. I tried to be non-destructive, by using a prefix path into which I can copy the renamed files, instead of modifying the original.
#! /bin/bash
windows_chars='<>:"\|?*'
prefix="windows/"
# Find number of files/directories that have this name as a prefix
find_num_files ()
(
    if [[ -e $prefix$1$2 ]]
    then
        shopt -s nullglob
        files=( "$prefix$1-"*"$2" )
        echo ${#files[@]}
    fi
)
# From http://www.shell-fu.org/lister.php?id=542
# Joins strings with a separator. Separator not present for
# edge case of single string.
str_join ()
(
    IFS=${1:?"Missing separator"}
    shift
    printf "%s" "$*"
)
for i
do
    # convert to lower case, then replace special chars with _
    new_name=$(tr "$windows_chars" _ <<<"${i,,}")
    # if a directory, make it, instead of copying contents
    if [[ -d $i ]]
    then
        mkdir -p "$prefix$new_name"
        echo mkdir -p "$prefix$new_name"
    else
        # get filename without extension
        name_wo_ext=${new_name%.*}
        # get extension
        # The trick is to make sure that, for:
        # "a.b.c", name_wo_ext is "a.b" and ext is ".c"
        # "abc", name_wo_ext is "abc" and ext is empty
        # Then, we can join the strings without worrying about the
        # . before an extension
        ext=${new_name#$name_wo_ext}
        count=$(find_num_files "$name_wo_ext" "$ext")
        name_wo_ext=$(str_join - "$name_wo_ext" $count)
        cp "$i" "$prefix$name_wo_ext$ext"
        echo cp "$i" "$prefix$name_wo_ext$ext"
    fi
done
In action:
$ tree a:b
a:b
├── b:c
│   ├── a:d
│   ├── A:D
│   ├── a:d.b
│   └── a:D.b
├── B:c
└── B"c
    └── a<d.b
3 directories, 5 files
$ find a:b -exec ./rename-windows.sh {} +
mkdir -p windows/a_b
mkdir -p windows/a_b/b_c
mkdir -p windows/a_b/b_c
cp a:b/B"c/a<d.b windows/a_b/b_c/a_d.b
mkdir -p windows/a_b/b_c
cp a:b/b:c/a:D.b windows/a_b/b_c/a_d-0.b
cp a:b/b:c/A:D windows/a_b/b_c/a_d
cp a:b/b:c/a:d windows/a_b/b_c/a_d-1
cp a:b/b:c/a:d.b windows/a_b/b_c/a_d-1.b
$ tree windows/
windows/
└── a_b
    └── b_c
        ├── a_d
        ├── a_d-0.b
        ├── a_d-1
        ├── a_d-1.b
        └── a_d.b
2 directories, 5 files
The script is available in my GitHub repo.
Solution 2
Recursively replace a list of strings or characters in filenames by other strings or characters
The script below can be used to replace a list of strings or characters, possibly occurring in a file's name, by an arbitrary replacement per string. Since the script only renames the file itself (not the path), there is no risk of messing with directories.
The replacements are defined in the list chars (see further below). Each string can be given its own replacement, which makes it possible to reverse the renaming later, assuming each replacement is a unique string. To replace all problematic strings with an underscore, simply define the list like:
chars = [
("<", "_"),
(">", "_"),
(":", "_"),
('"', "_"),
("/", "_"),
("\\", "_"),
("|", "_"),
("?", "_"),
("*", "_"),
]
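If each string is instead given its own unique replacement, the renaming can be undone. The sketch below illustrates that idea only. The tokens (_lt_, _gt_, _col_, _bsl_) and the encode/decode helpers are hypothetical, and the round trip is only reliable when none of the tokens already occurs in the original names.

```python
# Hypothetical replacement tokens: any strings work as long as each one is
# unique and does not already occur in the file names being renamed.
chars = [
    ("<", "_lt_"),
    (">", "_gt_"),
    (":", "_col_"),
    ("\\", "_bsl_"),
]


def encode(name):
    """Apply every (string, replacement) pair in order."""
    for old, new in chars:
        name = name.replace(old, new)
    return name


def decode(name):
    """Reverse the renaming by swapping each replacement back."""
    for old, new in chars:
        name = name.replace(new, old)
    return name


original = "a:b<c>.txt"
renamed = encode(original)
print(renamed)                      # a_col_b_lt_c_gt_.txt
print(decode(renamed) == original)  # True
```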
Dupes
To prevent duplicate names, the script first creates the "new" name. It then checks whether a file with that name already exists in the same directory. If so, it prefixes the name with dupe_1_, dupe_2_, and so on, until it finds an "available" new name for the file.
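The naming loop can be tried in isolation, without touching the disk. In this sketch the existing set stands in for the os.path.exists check used by the script, and dupe_name is a hypothetical helper name. The file names follow the 1<1.txt / 1:1.txt collision raised in the comments.

```python
def dupe_name(newfile, existing):
    """Prefix dupe_<n>_ until the name is not in `existing`."""
    tempname = newfile
    n = 0
    while newfile in existing:
        n += 1
        newfile = "dupe_" + str(n) + "_" + tempname
    return newfile


# 1<1.txt and 1:1.txt both map to 1_1.txt; assume two earlier renames
# already claimed the plain name and the first dupe name.
existing = {"1_1.txt", "dupe_1_1_1.txt"}
print(dupe_name("1_1.txt", existing))  # dupe_2_1_1.txt
print(dupe_name("2_2.txt", existing))  # 2_2.txt
```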
The script
#!/usr/bin/env python3
import os
import shutil
import sys
directory = sys.argv[1]
# --- set replacement below in the format ("<string>", "<replacement>") as below
chars = [
("<", "_"),
(">", "_"),
(":", "_"),
('"', "_"),
("/", "_"),
("\\", "_"),
("|", "_"),
("?", "_"),
("*", "_"),
]
# ---
for root, dirs, files in os.walk(directory):
    for file in files:
        newfile = file
        for c in chars:
            newfile = newfile.replace(c[0], c[1])
        if newfile != file:
            tempname = newfile
            n = 0
            while os.path.exists(root+"/"+newfile):
                n = n+1
                newfile = "dupe_"+str(n)+"_"+tempname
            shutil.move(root+"/"+file, root+"/"+newfile)
How to use
- Copy the script into an empty file and save it as rename_chars.py.
- Edit the replacement list if you want. As it is, the script replaces all occurrences of problematic characters with an underscore, but the choice is yours.
- Test-run it on a directory with the command:
python3 /path/to/rename_chars.py <directory_to_rename>
Note
Note that in a line such as ("\\", "_bsl_"), the backslash needs to be escaped by another backslash in Python.
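A quick way to see what the escaping does (a small illustration; the variable names are arbitrary):

```python
single = "\\"   # written with two characters, but holds one backslash
raw = r"a\b"    # raw string: the backslash is kept literally

print(len(single))                 # 1
print(raw == "a\\b")               # True
print("a\\b".replace("\\", "_"))   # a_b
```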
don.joey
Updated on September 18, 2022
Comments
-
don.joey over 1 year
How can I batch-rename files so that their names do not include characters that clash with other file systems? For instance:
Screenshot 2015-09-07-25:10:10
Note that the colons are the issue in this file name: Windows and Mac will not accept them.
These files could be renamed to
Screenshot 2015-09-07-25--10--10
I have to move a large number of files from Ubuntu to another OS. I copied them to an NTFS drive using rsync, but that lost some files. I also copied them to an ext4 drive.
The following are the reserved characters:
- < (less than)
- > (greater than)
- : (colon)
- " (double quote)
- / (forward slash)
- \ (backslash)
- | (vertical bar or pipe)
- ? (question mark)
- * (asterisk)
Another issue is that Windows is not case-sensitive when it comes to file names (and neither are most OS X systems).
-
Panther over 8 years
How much of the information do you want to preserve? Use a loop: for i in Screenshot* .. n=1 ... mv $i $i$n ... n=n+1 ...
-
Jacob Vlijm over 8 years
Do you have any preferences for how to rename? Also: is there a risk of dupes after renaming?
-
Rinzwind over 8 years
@JacobVlijm I would assume that to be a yes (just to be safe and yes I know ... that regex will be long :D )
-
Rinzwind over 8 years
You probably need to name a character for every char you want to replace. And that could be added to 1 regex or to multiple rename.ul instructions.
-
don.joey over 8 years
To be honest: the chance of me having dupes is small. For future purposes, though, I think a snippet should avoid dupes and should preserve as much info as possible.
-
Rinzwind over 8 years
1 slight issue: 1<1.txt and 1:1.txt will leave you with 1 file fewer than intended.
-
muru over 8 years
@Rinzwind True. But how do you decide which one is which in the Windows world?
-
don.joey over 8 years
I think Rinz has a point. Maybe manpages.ubuntu.com/manpages/oneiric/man1/rename.ul.1.html could come in handy?
-
muru over 8 years
@don.joey Ok - then where do you stop? Have you taken into account case-sensitivity?
-
don.joey over 8 years
Could you take it into account?
-
Rinzwind over 8 years
"case" could be solved with a backup parameter with "cp" or "rsync"(?)
-
muru over 8 years
@Rinzwind will ruin extensions (not a bother for Linux, but will mess up the Windows world).
-
don.joey over 8 years
Actually this messes up the dir structure because it will replace the slashes.
-
muru over 8 years
@don.joey yes, it does. I was in the process of editing it, then my attention went to other things.
-
muru over 8 years
@don.joey see update.
-
conualfy about 4 years
Length is also a thing in the Windows world: NTFS does not allow file and folder names as long as ext4 does. That should be handled, too.
-
Andy almost 3 years
Yo this ruined my git subfolder. All I did was find . <etc>. I guess it kinda was my fault, but still, no warning.
-
muru almost 3 years
@AndiHamolli none of the problematic characters are likely to appear in a .git folder (unless used in a branch name or something). In any case, files are only copied, not moved, so your original stuff should remain as is.
-
Andy almost 3 years
Does it delete anything?
-
Andy almost 3 years
@muru, can you also make a version that just renames the existing files, without creating a separate windows folder?
-
muru almost 3 years
@AndiHamolli Currently it doesn't delete anything. It just makes copies of whatever needs to be renamed. If you want to just rename, change cp to mv and delete the prefix="windows" line.