How do I check if a given string is a legal/valid file name under Windows?
Solution 1
You can get a list of invalid characters from Path.GetInvalidPathChars and Path.GetInvalidFileNameChars.
UPD: See Steve Cooper's suggestion on how to use these in a regular expression.
UPD2: Note that according to the Remarks section in MSDN, "The array returned from this method is not guaranteed to contain the complete set of characters that are invalid in file and directory names." The answer provided by sixlettervaliables goes into more detail.
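As a minimal sketch of this approach (the class and method names are hypothetical, not from the original answer), the array can be fetched once and checked with IndexOfAny. As the MSDN remark warns, passing this check does not guarantee a valid name; it only rules out the characters .NET reports for the current platform.

```csharp
using System.IO;

public static class InvalidCharCheck
{
    // Fetched once and reused, so the array is not requested again on every call.
    private static readonly char[] InvalidFileNameChars = Path.GetInvalidFileNameChars();

    // Returns true when the name is non-empty and contains none of the
    // characters reported by Path.GetInvalidFileNameChars.
    public static bool HasOnlyValidFileNameChars(string name)
    {
        if (string.IsNullOrEmpty(name))
        {
            return false;
        }
        return name.IndexOfAny(InvalidFileNameChars) < 0;
    }
}
```

This deliberately says nothing about reserved names such as CON or about trailing periods; those need separate checks.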
Solution 2
From MSDN's "Naming a File or Directory," here are the general conventions for what a legal file name is under Windows:
You may use any character in the current code page (Unicode/ANSI above 127), except:
- Any of the reserved characters: < > : " / \ | ? *
- Characters whose integer representations are 0-31 (less than ASCII space)
- Any other character that the target file system does not allow (say, trailing periods or spaces)
- Any of the DOS names: CON, PRN, AUX, NUL, COM0, COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9, LPT0, LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, LPT9 (and avoid AUX.txt, etc)
- The file name is all periods
Some optional things to check:
- File paths (including the file name) may not have more than 260 characters (when they don't use the \\?\ prefix)
- Unicode file paths (including the file name) may not have more than 32,000 characters when using \\?\ (note that the prefix may expand directory components and cause the path to overflow the 32,000 limit)
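The naming rules listed above can be sketched as a single check. This is an illustrative helper rather than an official API: the class name, the method name, and the choice to treat everything before the first period as the device-name stem are assumptions. It does not cover path length or file-system-specific limits.

```csharp
using System;
using System.Linq;

public static class WindowsFileNameRules
{
    // Reserved punctuation from the list above.
    private static readonly char[] ReservedChars =
        { '<', '>', ':', '"', '/', '\\', '|', '?', '*' };

    // DOS device names from the list above.
    private static readonly string[] ReservedNames =
    {
        "CON", "PRN", "AUX", "NUL",
        "COM0", "COM1", "COM2", "COM3", "COM4", "COM5", "COM6", "COM7", "COM8", "COM9",
        "LPT0", "LPT1", "LPT2", "LPT3", "LPT4", "LPT5", "LPT6", "LPT7", "LPT8", "LPT9"
    };

    public static bool IsLegalFileName(string name)
    {
        if (string.IsNullOrEmpty(name)) return false;

        // Reserved punctuation and control characters 0-31.
        if (name.IndexOfAny(ReservedChars) >= 0) return false;
        if (name.Any(c => c < 32)) return false;

        // A trailing period or space; this also rejects names made only of periods.
        char last = name[name.Length - 1];
        if (last == '.' || last == ' ') return false;

        // Device names are rejected with or without an extension (AUX, aux.txt).
        // Assumption: the stem is the text before the first period.
        int dot = name.IndexOf('.');
        string stem = (dot >= 0 ? name.Substring(0, dot) : name).TrimEnd(' ');
        return !ReservedNames.Contains(stem, StringComparer.OrdinalIgnoreCase);
    }
}
```

A name such as ".gitignore" passes because its stem is empty, while "AUX.txt" and "name." are rejected.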
Solution 3
For .NET Framework versions prior to 3.5 this should work:
Regular expression matching should get you some of the way. Here's a snippet using the System.IO.Path.InvalidPathChars field:
bool IsValidFilename(string testName)
{
    // InvalidPathChars is a char[], so wrap it in a string before escaping.
    Regex containsABadCharacter = new Regex("["
        + Regex.Escape(new string(System.IO.Path.InvalidPathChars)) + "]");
    if (containsABadCharacter.IsMatch(testName)) { return false; }
    // other checks for UNC, drive-path format, etc
    return true;
}
For .NET Framework versions after 3.0 this should work:
http://msdn.microsoft.com/en-us/library/system.io.path.getinvalidpathchars(v=vs.90).aspx
Regular expression matching should get you some of the way. Here's a snippet using the System.IO.Path.GetInvalidPathChars() method:
bool IsValidFilename(string testName)
{
    Regex containsABadCharacter = new Regex("["
        + Regex.Escape(new string(System.IO.Path.GetInvalidPathChars())) + "]");
    if (containsABadCharacter.IsMatch(testName)) { return false; }
    // other checks for UNC, drive-path format, etc
    return true;
}
Once you know that, you should also check for different formats, e.g. c:\my\drive and \\server\share\dir\file.ext
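The snippets above test path characters. For a bare file name, commenters on this page suggest building the pattern from Path.GetInvalidFileNameChars() instead and keeping the Regex static and compiled. A hedged sketch of that variant follows; the class and member names are made up for illustration.

```csharp
using System.IO;
using System.Text.RegularExpressions;

public static class FileNameCharRegex
{
    // Built once; RegexOptions.Compiled trades startup cost for faster matching.
    private static readonly Regex BadCharacter = new Regex(
        "[" + Regex.Escape(new string(Path.GetInvalidFileNameChars())) + "]",
        RegexOptions.Compiled);

    // True when testName is non-null and contains none of the characters
    // reported by Path.GetInvalidFileNameChars for the current platform.
    public static bool ContainsNoInvalidChars(string testName)
    {
        return testName != null && !BadCharacter.IsMatch(testName);
    }
}
```

Like the originals, this only filters characters; reserved names and trailing periods still need their own checks.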
Solution 4
Try to use it, and trap for the error. The allowed set may change across file systems, or across different versions of Windows. In other words, if you want to know if Windows likes the name, hand it the name and let it tell you.
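One way to sketch this idea is to try creating a throwaway file and treat the relevant exceptions as "not usable". The helper below is hypothetical (its name and signature are not from the answer) and rests on some assumptions: the caller can write to the directory, a name that already exists counts as a failure because CreateNew is used, and permission errors are left to propagate rather than being read as an invalid name. Commenters also note that device names such as CON can open successfully, so this probe is not conclusive on its own.

```csharp
using System;
using System.IO;

public static class FileNameProbe
{
    // Attempts to create (and immediately delete) directory\fileName.
    // Returns false if the file system rejects the name or the file already exists.
    public static bool CanCreate(string directory, string fileName)
    {
        try
        {
            string fullPath = Path.Combine(directory, fileName);
            // DeleteOnClose removes the probe file when the stream is disposed.
            using (new FileStream(fullPath, FileMode.CreateNew, FileAccess.Write,
                                  FileShare.None, 4096, FileOptions.DeleteOnClose))
            {
            }
            return true;
        }
        catch (ArgumentException) { return false; }      // illegal characters, empty or null name
        catch (NotSupportedException) { return false; }  // unsupported path format
        catch (IOException) { return false; }            // too long, missing directory, already exists
    }
}
```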
Solution 5
This class cleans filenames and paths; use it like this:
var myCleanPath = PathSanitizer.SanitizeFilename(myBadPath, ' ');
Here's the code:
using System;
using System.Text;

/// <summary>
/// Cleans paths of invalid characters.
/// </summary>
public static class PathSanitizer
{
    /// <summary>
    /// The set of invalid filename characters, kept sorted for fast binary search
    /// </summary>
    private readonly static char[] invalidFilenameChars;
    /// <summary>
    /// The set of invalid path characters, kept sorted for fast binary search
    /// </summary>
    private readonly static char[] invalidPathChars;

    static PathSanitizer()
    {
        // set up the two arrays -- sorted once for speed.
        invalidFilenameChars = System.IO.Path.GetInvalidFileNameChars();
        invalidPathChars = System.IO.Path.GetInvalidPathChars();
        Array.Sort(invalidFilenameChars);
        Array.Sort(invalidPathChars);
    }

    /// <summary>
    /// Cleans a filename of invalid characters
    /// </summary>
    /// <param name="input">the string to clean</param>
    /// <param name="errorChar">the character which replaces bad characters</param>
    /// <returns>the cleaned filename, or null if the input was null</returns>
    public static string SanitizeFilename(string input, char errorChar)
    {
        return Sanitize(input, invalidFilenameChars, errorChar);
    }

    /// <summary>
    /// Cleans a path of invalid characters
    /// </summary>
    /// <param name="input">the string to clean</param>
    /// <param name="errorChar">the character which replaces bad characters</param>
    /// <returns>the cleaned path, or null if the input was null</returns>
    public static string SanitizePath(string input, char errorChar)
    {
        return Sanitize(input, invalidPathChars, errorChar);
    }

    /// <summary>
    /// Cleans a string of invalid characters.
    /// </summary>
    /// <param name="input">the string to clean</param>
    /// <param name="invalidChars">the sorted array of characters to replace</param>
    /// <param name="errorChar">the character which replaces bad characters</param>
    /// <returns>the cleaned string, or null if the input was null</returns>
    private static string Sanitize(string input, char[] invalidChars, char errorChar)
    {
        // null always sanitizes to null
        if (input == null) { return null; }
        StringBuilder result = new StringBuilder();
        foreach (var characterToTest in input)
        {
            // we binary search for the character in the invalid set. This should be lightning fast.
            if (Array.BinarySearch(invalidChars, characterToTest) >= 0)
            {
                // we found the character in the array of invalid characters, so replace it.
                result.Append(errorChar);
            }
            else
            {
                // the character was not found in invalid, so it is valid.
                result.Append(characterToTest);
            }
        }
        // we're done.
        return result.ToString();
    }
}
tomash
Updated on October 01, 2021
Comments
-
tomash over 2 years
I want to include a batch file rename functionality in my application. A user can type a destination filename pattern and (after replacing some wildcards in the pattern) I need to check if it's going to be a legal filename under Windows. I've tried to use a regular expression like [a-zA-Z0-9_]+ but it doesn't include many national-specific characters from various languages (e.g. umlauts and so on). What is the best way to do such a check?
-
AMissico over 9 years
I suggest using a static compiled Regex if you are going to use any of the answers with Regex.
-
Eugene Katz over 15 years
Doesn't this only test the path, not the filename?
-
eugened about 15 years
+1 for including reserved filenames - those were missed in previous answers.
-
Marbal over 14 years
"AUX" is a perfectly usable filename if you use the "\\?\" syntax. Of course, programs that don't use that syntax have real problems dealing with it... (Tested on XP)
-
Christian Hayter over 13 years
Surely that's not due to an NTFS naming rule, but merely because a file called $Boot already exists in the directory?
-
rao over 13 years
string strTheseAreInvalidFileNameChars = new string(System.IO.Path.GetInvalidFileNameChars()); Regex regFixFileName = new Regex("[" + Regex.Escape(strTheseAreInvalidFileNameChars) + "]");
-
gap about 12 years
This seems to be the only one that tests against all constraints. Why are the other answers being chosen over this?
-
Antimony over 11 years
@gap because it doesn't always work. For example, trying to access CON will often succeed, even though it's not a real file.
-
Antimony over 11 years
Also characters <= 31 are forbidden.
-
Owen Blacker over 11 years
It's always better to avoid the memory overhead of throwing an Exception, where possible, though.
-
Werner Henze about 11 years
This is only half of the truth. You can create files with these names if calling the unicode version of CreateFile (prefixing the file name with "\\?\").
-
nawfal almost 11 years
Your answer could be a better fit here: stackoverflow.com/questions/146134/…
-
Dour High Arch almost 11 years
This does not answer the question; there are many strings consisting only of valid characters (e.g. "....", "CON", strings hundreds of chars long) that are not valid filenames.
-
Erik Philips over 10 years
A little research from people would work wonders. I've updated the post to reflect the changes.
-
Thomas Nguyen about 10 years
Anyone else disappointed that MS doesn't provide a system-level function/API for this capability instead of each developer having to cook his/her own solution? Wondering if there's a very good reason for this or just an oversight on MS's part.
-
Paul Hunt about 10 years
2nd piece of code doesn't compile. "Cannot convert from char[] to string"
-
yar_shukan over 9 years
sPattern regex doesn't allow files starting with a period character. But MSDN says "it is acceptable to specify a period as the first character of a name. For example, ".temp"". I would remove "\..*" to make .gitignore a correct file name :)
-
tcbrazil over 9 years
Your example didn't work for a CON file (C:\temp\CON).
-
whywhywhy about 9 years
The correct regex for all these conditions mentioned above is as below:
Regex unspupportedRegex = new Regex("(^(PRN|AUX|NUL|CON|COM[1-9]|LPT[1-9]|(\\.+)$)(\\..*)?$)|(([\\x00-\\x1f\\\\?*:\";|/<>])+)|(([\\. ]+)", RegexOptions.IgnoreCase);
-
Wilky over 8 years
@whywhywhy I think you've got an extra opening bracket in that Regex. "(^(PRN|AUX|NUL|CON|COM[1-9]|LPT[1-9]|(\\.+)$)(\\..*)?$)|(([\\x00-\\x1f\\\\?*:\";|/<>])+)|([\\. ]+)" worked for me.
-
mmmmmmmm over 8 years
@High Arch: See answer for question "In C# check that filename is possibly valid (not that it exists)". (Although some clever guys closed that question in favour of this one...)
-
Mark A. Donohoe over 8 years
But isn't 'C:\temp\CON' a valid filename? Why wouldn't it be?
-
Hyndrix over 8 years
Wilky: your regex will also remove "." within the filename, which is perfectly valid.
-
Hyndrix over 8 years
This is better:
(^(PRN|AUX|NUL|CON|COM[1-9]|LPT[1-9]|(\\.+)$)(\\..*)?$)|(([\\x00-\\x1f\\\\?*:\"|/<>])+)|(^([\\.]+))
-
dlf about 8 years
All regexes above reject filenames that begin with '.', which is allowed by the OS.
-
Rich Jenks about 8 years
Depends on how you define "allowed". Windows allows filenames that begin with a dot but Explorer does not let you name a file as such, unless it also has an extension. For example, .foo is not allowed, but .foo.bar is.
-
mejdev about 8 years
I read the same article mentioned in this answer and found through experimentation that COM0 and LPT0 are also not allowed. @dlf this one works with filenames that begin with '.':
^(?!^(?:PRN|AUX|CLOCK\$|NUL|CON|COM\d|LPT\d)(?:\..+)?$)(?:\.*?(?!\.))[^\x00-\x1f\\?*:\";|\/<>]+(?<![\s.])$
-
mejdev about 8 years
(I have incrementally made this better and deleted prev comments I left) This one is better than the answer's regex because it allows ".gitignore", "..asdf", doesn't allow '<' and '>' or the yen sign, and doesn't allow space or period at the end (which disallows names consisting only of dots):
@"^(?!(?:PRN|AUX|CLOCK\$|NUL|CON|COM\d|LPT\d)(?:\..+)?$)[^\x00-\x1F\xA5\\?*:\"";|\/<>]+(?<![\s.])$"
-
magicandre1981 almost 8 years
This fails for all files I tested. Running it for C:\Windows\System32\msxml6.dll reports false.
-
Scott Dorman almost 8 years
@magicandre1981 You need to give it just the file name, not the fully qualified path.
-
magicandre1981 almost 8 years
OK, but I need to check if the full path is valid. I used a different solution now.
-
ρяσѕρєя K over 7 years
Add some explanation to the answer of how it helps the OP fix the current issue.
-
nawfal about 7 years
Is there a library which handles all of these cases?
-
Tony Sun about 7 years
See the MSDN doc for ArgumentException; it reads: path is a zero-length string, contains only white space, or contains one or more of the invalid characters defined in GetInvalidPathChars. -or- The system could not retrieve the absolute path.
-
CodeLurker almost 7 years
Also, you might not have permissions to access it; e.g. to test it by writing, even if you can read it if it does or will exist.
-
rory.ap over 6 years
I can create a file named "CLOCK$" just fine. Windows 7.
-
rory.ap over 6 years
@MarqueIV -- no, it's not valid. Read all the answers and comments above, or try it yourself and see.
-
rory.ap over 6 years
@Jer, "/example" is not legal, yet your method returns true.
-
rory.ap over 6 years
@papaiatis -- "CLOCK$" works just fine for me. Windows 7.
-
Mark A. Donohoe over 6 years
Aaaah... I missed the 'CON' part. The name itself is valid from a string standpoint (which is what I was referring to), but I see now CON is a reserved name, making it non-valid from a Windows standpoint. My bad.
-
rory.ap over 6 yearsYour pattern fails on
.foo.bar
. -
rory.ap over 6 yearsIt also allows
<
and>
. -
Ashkan Mobayen Khiabani over 6 years+1 for the code, but please replace
Path.GetInvalidPathChars()
withPath.GetInvalidFileNameChars()
asPath.GetInvalidPathChars()
is obsolete now -
Jack Griffin almost 6 yearsDid you mean : "return !fileName.Any(f=>Path.GetInvalidFileNameChars().Contains(f));" ?
-
tmt almost 6 years@JackGriffin Of course! Thank you for your attentiveness.
-
Oleg Savelyev almost 6 yearsBTW "the file name is all periods" rule is already contained in "trailing periods or spaces rule"
-
Thomas Weller over 5 years
This statement is incomplete and misses LPT#
-
Piotr Zierhoffer about 4 years
While this code is very nice to read, we should take into account the sorry internals of Path.GetInvalidFileNameChars. Take a look here: referencesource.microsoft.com/#mscorlib/system/io/path.cs,289 - for each character of your fileName, a clone of the array is created.
-
IvanH about 4 years
@AshkanMobayenKhiabani: InvalidPathChars is obsolete but GetInvalidPathChars is not.
-
Michel Jansson about 4 years
In theory (according to the docs) this should work; the problem is that, at least in .NET Core 3.1, it does not.
-
Ciccio Pasticcio almost 4 years
"DD:\\\\\AAA.....AAAA". Not valid, but for your code, it is.
-
Igor Levicki almost 4 years
This should be the accepted answer (with the possible exception of network paths).