How do I check if a given string is a legal/valid file name under Windows?

Solution 1

You can get a list of invalid characters from Path.GetInvalidPathChars and Path.GetInvalidFileNameChars.

UPD: See Steve Cooper's suggestion on how to use these in a regular expression.

UPD2: Note that according to the Remarks section in MSDN, "The array returned from this method is not guaranteed to contain the complete set of characters that are invalid in file and directory names." The answer provided by sixlettervaliables goes into more detail.
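
A minimal sketch of the character check described above. The class and method names (FileNameCharCheck, HasOnlyAllowedChars) are made up for illustration. The result depends on the platform the code runs on, and per the MSDN remark quoted above it is a character check only, so reserved names such as CON still pass.

```csharp
using System;
using System.IO;

public static class FileNameCharCheck
{
    // Fetch the array once instead of asking for it on every call.
    private static readonly char[] InvalidChars = Path.GetInvalidFileNameChars();

    // True when the name is non-empty and contains none of the characters
    // reported by Path.GetInvalidFileNameChars() on the current platform.
    public static bool HasOnlyAllowedChars(string fileName)
    {
        if (string.IsNullOrEmpty(fileName)) { return false; }
        return fileName.IndexOfAny(InvalidChars) < 0;
    }
}
```

Use it like FileNameCharCheck.HasOnlyAllowedChars("report.txt"), and treat a true result as "no known bad characters" rather than "definitely a legal name".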

Solution 2

From MSDN's "Naming a File or Directory," here are the general conventions for what a legal file name is under Windows:

You may use any character in the current code page (Unicode/ANSI above 127), except:

  • < > : " / \ | ? *
  • Characters whose integer representations are 0-31 (less than ASCII space)
  • Any other character that the target file system does not allow (say, trailing periods or spaces)
  • Any of the DOS names: CON, PRN, AUX, NUL, COM0, COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9, LPT0, LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, LPT9 (and avoid AUX.txt, etc)
  • A file name that consists only of periods

Some optional things to check:

  • File paths (including the file name) may not have more than 260 characters unless they use the \\?\ prefix
  • Unicode file paths (including the file name) may not have more than roughly 32,000 characters (documented as approximately 32,767) when using \\?\ (note that the prefix may expand directory components and cause the path to overflow that limit)
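
The name rules above can be sketched as a single check. This is only an illustration under the assumptions listed in this answer: the class name WindowsFileNameRules and the method IsLegalFileName are hypothetical, the path-length limits are left out, and file-system-specific rules beyond trailing periods and spaces are not covered.

```csharp
using System;
using System.Linq;

public static class WindowsFileNameRules
{
    // The nine punctuation characters listed above.
    private static readonly char[] ForbiddenChars =
        { '<', '>', ':', '"', '/', '\\', '|', '?', '*' };

    // The DOS device names listed above.
    private static readonly string[] ReservedNames =
    {
        "CON", "PRN", "AUX", "NUL",
        "COM0", "COM1", "COM2", "COM3", "COM4", "COM5", "COM6", "COM7", "COM8", "COM9",
        "LPT0", "LPT1", "LPT2", "LPT3", "LPT4", "LPT5", "LPT6", "LPT7", "LPT8", "LPT9"
    };

    public static bool IsLegalFileName(string name)
    {
        if (string.IsNullOrEmpty(name)) { return false; }

        // Forbidden punctuation and control characters 0-31.
        if (name.IndexOfAny(ForbiddenChars) >= 0) { return false; }
        if (name.Any(c => c < 32)) { return false; }

        // Trailing period or space; this also rejects names made only of periods.
        char last = name[name.Length - 1];
        if (last == '.' || last == ' ') { return false; }

        // Reserved device names, with or without an extension (AUX, AUX.txt).
        int dot = name.IndexOf('.');
        string stem = dot < 0 ? name : name.Substring(0, dot);
        if (ReservedNames.Contains(stem, StringComparer.OrdinalIgnoreCase)) { return false; }

        return true;
    }
}
```

With this sketch, ".gitignore" and "report.txt" pass, while "CON", "aux.txt", "name." and "..." are rejected.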

Solution 3

For .NET Framework versions prior to 3.5 this should work:

Regular expression matching should get you some of the way. Here's a snippet using the System.IO.Path.InvalidPathChars field:

bool IsValidFilename(string testName)
{
    // InvalidPathChars is a char[], so convert it to a string before escaping.
    Regex containsABadCharacter = new Regex("["
          + Regex.Escape(new string(System.IO.Path.InvalidPathChars)) + "]");
    if (containsABadCharacter.IsMatch(testName)) { return false; }

    // other checks for UNC, drive-path format, etc

    return true;
}

For .NET Framework versions after 3.0 this should work:

http://msdn.microsoft.com/en-us/library/system.io.path.getinvalidpathchars(v=vs.90).aspx

Regular expression matching should get you some of the way. Here's a snippet using the System.IO.Path.GetInvalidPathChars() method:

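bool IsValidFilename(string testName)
{
    // Note: this tests path characters only. For a bare file name,
    // Path.GetInvalidFileNameChars() also covers characters such as \ / : * ? on Windows.
    Regex containsABadCharacter = new Regex("["
          + Regex.Escape(new string(System.IO.Path.GetInvalidPathChars())) + "]");
    if (containsABadCharacter.IsMatch(testName)) { return false; }

    // other checks for UNC, drive-path format, etc

    return true;
}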

Once you know that, you should also check for different formats, e.g. c:\my\drive and \\server\share\dir\file.ext
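
As a rough sketch of telling those two formats apart, plain string checks are enough. The PathFormat class, its Kind enum and the Classify method are hypothetical names, and the rules are deliberately simplistic: anything that is not clearly a drive-letter path or a \\server\share path is reported as Other.

```csharp
using System;

public static class PathFormat
{
    public enum Kind { DriveAbsolute, Unc, Other }

    public static Kind Classify(string path)
    {
        if (string.IsNullOrEmpty(path) || path.Length < 3) { return Kind.Other; }

        // \\server\share\... : two backslashes followed by the start of a server name.
        // Requiring a letter or digit here keeps \\?\ and \\.\ prefixes out of this bucket.
        if (path[0] == '\\' && path[1] == '\\' && char.IsLetterOrDigit(path[2]))
        {
            return Kind.Unc;
        }

        // c:\... : an ASCII drive letter, a colon, then a backslash.
        char drive = path[0];
        bool isDriveLetter = (drive >= 'A' && drive <= 'Z') || (drive >= 'a' && drive <= 'z');
        if (isDriveLetter && path[1] == ':' && path[2] == '\\')
        {
            return Kind.DriveAbsolute;
        }

        return Kind.Other;
    }
}
```

After classifying, the remaining components can be split on the backslash and each one checked with the character test.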

Solution 4

Try to use it, and trap for the error. The allowed set may change across file systems, or across different versions of Windows. In other words, if you want to know if Windows likes the name, hand it the name and let it tell you.
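
One way to sketch this, assuming you have a directory you are allowed to write to: try to create the file and treat any of the usual exceptions as "no". FileNameProbe and CanCreate are hypothetical names. As the comments on this page point out, the probe can mislead: a reserved name such as CON may open successfully, and a permissions failure says nothing about the name itself.

```csharp
using System;
using System.IO;

public static class FileNameProbe
{
    // Tries to create a new file with the given name inside an existing directory.
    // The file is deleted again when the stream is closed.
    // Returns false if the request is rejected for any of the caught reasons.
    public static bool CanCreate(string directory, string fileName)
    {
        try
        {
            string fullPath = Path.Combine(directory, fileName);
            using (new FileStream(fullPath, FileMode.CreateNew, FileAccess.Write,
                                  FileShare.None, 4096, FileOptions.DeleteOnClose))
            {
                return true;
            }
        }
        catch (ArgumentException) { return false; }           // malformed or null name
        catch (NotSupportedException) { return false; }       // unsupported path format
        catch (IOException) { return false; }                 // rejected, too long, or already exists
        catch (UnauthorizedAccessException) { return false; } // no permission, or the name is a directory
    }
}
```

Because FileMode.CreateNew is used, an existing file with the same name also yields false, so this answers "can I create this file here right now" rather than "is the name legal in general".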

Solution 5

This class cleans filenames and paths; use it like this:

var myCleanPath = PathSanitizer.SanitizeFilename(myBadPath, ' ');

Here's the code:

using System;
using System.Text;

/// <summary>
/// Cleans paths of invalid characters.
/// </summary>
public static class PathSanitizer
{
    /// <summary>
    /// The set of invalid filename characters, kept sorted for fast binary search
    /// </summary>
    private readonly static char[] invalidFilenameChars;
    /// <summary>
    /// The set of invalid path characters, kept sorted for fast binary search
    /// </summary>
    private readonly static char[] invalidPathChars;

    static PathSanitizer()
    {
        // set up the two arrays -- sorted once for speed.
        invalidFilenameChars = System.IO.Path.GetInvalidFileNameChars();
        invalidPathChars = System.IO.Path.GetInvalidPathChars();
        Array.Sort(invalidFilenameChars);
        Array.Sort(invalidPathChars);

    }

    /// <summary>
    /// Cleans a filename of invalid characters
    /// </summary>
    /// <param name="input">the string to clean</param>
    /// <param name="errorChar">the character which replaces bad characters</param>
    /// <returns></returns>
    public static string SanitizeFilename(string input, char errorChar)
    {
        return Sanitize(input, invalidFilenameChars, errorChar);
    }

    /// <summary>
    /// Cleans a path of invalid characters
    /// </summary>
    /// <param name="input">the string to clean</param>
    /// <param name="errorChar">the character which replaces bad characters</param>
    /// <returns></returns>
    public static string SanitizePath(string input, char errorChar)
    {
        return Sanitize(input, invalidPathChars, errorChar);
    }

    /// <summary>
    /// Cleans a string of invalid characters.
    /// </summary>
    /// <param name="input"></param>
    /// <param name="invalidChars"></param>
    /// <param name="errorChar"></param>
    /// <returns></returns>
    private static string Sanitize(string input, char[] invalidChars, char errorChar)
    {
        // null always sanitizes to null
        if (input == null) { return null; }
        StringBuilder result = new StringBuilder();
        foreach (var characterToTest in input)
        {
            // we binary search for the character in the invalid set. This should be lightning fast.
            if (Array.BinarySearch(invalidChars, characterToTest) >= 0)
            {
                // we found the character in the array of invalid characters, so replace it.
                result.Append(errorChar);
            }
            else
            {
                // the character was not found in invalid, so it is valid.
                result.Append(characterToTest);
            }
        }

        // we're done.
        return result.ToString();
    }

}
Author: tomash

Updated on October 01, 2021

Comments

  • tomash
    tomash over 2 years

    I want to include a batch file rename functionality in my application. A user can type a destination filename pattern and (after replacing some wildcards in the pattern) I need to check if it's going to be a legal filename under Windows. I've tried to use regular expression like [a-zA-Z0-9_]+ but it doesn't include many national-specific characters from various languages (e.g. umlauts and so on). What is the best way to do such a check?

    • AMissico
      AMissico over 9 years
      I suggest using a static compiled Regex if you are going to use any of the answers with Regex.
  • Eugene Katz
    Eugene Katz over 15 years
    doesn't this only test the path, not the filename?
  • eugened
    eugened about 15 years
    +1 for including reserved filenames - those were missed in previous answers.
  • Marbal
    Marbal over 14 years
    "AUX" is a perfectly usable filename if you use the "\\?\" syntax. Of course, programs that don't use that syntax have real problems dealing with it... (Tested on XP)
  • Christian Hayter
    Christian Hayter over 13 years
    Surely that's not due to an NTFS naming rule, but merely because a file called $Boot already exists in the directory?
  • rao
    rao over 13 years
    string strTheseAreInvalidFileNameChars = new string( System.IO.Path.GetInvalidFileNameChars() ) ; Regex regFixFileName = new Regex("[" + Regex.Escape(strTheseAreInvalidFileNameChars ) + "]");
  • gap
    gap about 12 years
    This seems to be the only one that tests against all constraints. Why are the other answers being chosen over this?
  • Antimony
    Antimony over 11 years
    @gap because it doesn't always work. For example, trying to access CON will often succeed, even though it's not a real file.
  • Antimony
    Antimony over 11 years
    Also characters <= 31 are forbidden.
  • Owen Blacker
    Owen Blacker over 11 years
    It's always better to avoid the memory overhead of throwing an Exception, where possible, though.
  • Werner Henze
    Werner Henze about 11 years
    This is only half of the truth. You can create files with these names if calling the unicode version of CreateFile (prefixing the file name with "\\?\").
  • nawfal
    nawfal almost 11 years
    your answer could be better fit here:stackoverflow.com/questions/146134/…
  • Dour High Arch
    Dour High Arch almost 11 years
    This does not answer the question; there are many strings consisting only of valid characters (e.g. "....", "CON", strings hundreds of chars long) that are not valid filenames.
  • Erik Philips
    Erik Philips over 10 years
    A little research from people would work wonders. I've updated the post to reflect the changes.
  • Thomas Nguyen
    Thomas Nguyen about 10 years
    Anyone else disappointed that MS doesn't provide a system-level function/API for this capability, instead of each developer having to cook up his/her own solution? Wondering if there's a very good reason for this or if it's just an oversight on MS's part.
  • Paul Hunt
    Paul Hunt about 10 years
    2nd piece of code doesn't compile: "Cannot convert from char[] to string"
  • yar_shukan
    yar_shukan over 9 years
    sPattern regex doesn't allow files started with period character. But MSDN says "it is acceptable to specify a period as the first character of a name. For example, ".temp"". I would remove "\..*" to make .gitignore correct file name :)
  • tcbrazil
    tcbrazil over 9 years
    Your example didn't work for a CON file (C:\temp\CON).
  • whywhywhy
    whywhywhy about 9 years
    The correct regex for all these conditions mentioned above is as below: Regex unspupportedRegex = new Regex("(^(PRN|AUX|NUL|CON|COM[1-9]|LPT[1-9]|(\\.+)$)(\\..*)?$)|(([\\x00-\\x1f\\\\?*:\";|/<>])+)|(([\\. ]+)", RegexOptions.IgnoreCase);
  • Wilky
    Wilky over 8 years
    @whywhywhy I think you've got an extra opening bracket in that Regex. "(^(PRN|AUX|NUL|CON|COM[1-9]|LPT[1-9]|(\\.+)$)(\\..*)?$)|(([\\x00-\\x1f\\\\?*:\";|/<>])+)|([\\. ]+)" worked for me.
  • mmmmmmmm
    mmmmmmmm over 8 years
    @High Arch: See answer for question "In C# check that filename is possibly valid (not that it exists)". (Although some clever guys closed that question in favour of this one...)
  • Mark A. Donohoe
    Mark A. Donohoe over 8 years
    But isn't 'C:\temp\CON' a valid filename? Why wouldn't it be?
  • Hyndrix
    Hyndrix over 8 years
    Wilky: your regex will also remove "." within the filename which are perfectly valid.
  • Hyndrix
    Hyndrix over 8 years
    This is better: (^(PRN|AUX|NUL|CON|COM[1-9]|LPT[1-9]|(\\.+)$)(\\..*)?$)|(([\\x00-\\x1f\\\\?*:\"|/<>])+)|(^([\\.]+))
  • dlf
    dlf about 8 years
    All regexes above reject filenames that begin with '.', which is allowed by the OS.
  • Rich Jenks
    Rich Jenks about 8 years
    Depends on how you define "allowed". Windows allows filenames that begin with a dot, but Explorer does not let you name a file as such unless it also has an extension. For example, .foo is not allowed, but .foo.bar is.
  • mejdev
    mejdev about 8 years
    I read the same article mentioned in this answer and found through experimentation that COM0 and LPT0 are also not allowed. @dlf this one works with filenames that begin with '.': ^(?!^(?:PRN|AUX|CLOCK\$|NUL|CON|COM\d|LPT\d)(?:\..+)?$)(?:\.*?(?!\.))[^\x00-\x1f\\?*:\";|\/<>]+(?<![\s.])$
  • mejdev
    mejdev about 8 years
    (I have incrementally made this better and deleted prev comments I left) This one is better than the answer's regex because it allows ".gitignore", "..asdf", doesn't allow '<' and '>' or the yen sign, and doesn't allow space or period at the end (which disallows names consisting only of dots): @"^(?!(?:PRN|AUX|CLOCK\$|NUL|CON|COM\d|LPT\d)(?:\..+)?$)[^\x00-\x1F\xA5\\?*:\"";|\/<>]+(?<![\s.])$"
  • magicandre1981
    magicandre1981 almost 8 years
    this fails for all files I tested. running it for C:\Windows\System32\msxml6.dll reports false.
  • Scott Dorman
    Scott Dorman almost 8 years
    @magicandre1981 You need to give it just the file name, not the fully qualified path.
  • magicandre1981
    magicandre1981 almost 8 years
    ok, but I need to check if the full path is valid. I used now a different solution.
  • ρяσѕρєя K
    ρяσѕρєя K over 7 years
    Add some explanation with answer for how this answer help OP in fixing current issue
  • nawfal
    nawfal about 7 years
    Is there a library which handles all of these cases?
  • Tony Sun
    Tony Sun about 7 years
    See the doc in MSDN for the ArgumentException; it reads: path is a zero-length string, contains only white space, or contains one or more of the invalid characters defined in GetInvalidPathChars. -or- The system could not retrieve the absolute path.
  • CodeLurker
    CodeLurker almost 7 years
    Also, you might not have permissions to access it; e.g. to test it by writing, even if you can read it if it does or will exist.
  • rory.ap
    rory.ap over 6 years
    I can create a file named "CLOCK$" just fine. Windows 7.
  • rory.ap
    rory.ap over 6 years
    @MarqueIV -- no, it's not valid. Read all the answers and comments above, or try it yourself and see.
  • rory.ap
    rory.ap over 6 years
    @Jer, "/example" is not legal, yet your method returns true.
  • rory.ap
    rory.ap over 6 years
    @papaiatis -- "CLOCK$" works just fine for me. Windows 7.
  • Mark A. Donohoe
    Mark A. Donohoe over 6 years
    Aaaah... I missed the 'CON' part. The name itself is valid from a string standpoint (which is what I was referring to), but I see now CON is a reserved name, making it non-valid from a Windows standpoint. My bad.
  • rory.ap
    rory.ap over 6 years
    Your pattern fails on .foo.bar.
  • rory.ap
    rory.ap over 6 years
    It also allows < and >.
  • Ashkan Mobayen Khiabani
    Ashkan Mobayen Khiabani over 6 years
    +1 for the code, but please replace Path.GetInvalidPathChars() with Path.GetInvalidFileNameChars() as Path.GetInvalidPathChars() is obsolete now
  • Jack Griffin
    Jack Griffin almost 6 years
    Did you mean: "return !fileName.Any(f=>Path.GetInvalidFileNameChars().Contains(f));"?
  • tmt
    tmt almost 6 years
    @JackGriffin Of course! Thank you for your attentiveness.
  • Oleg Savelyev
    Oleg Savelyev almost 6 years
    BTW "the file name is all periods" rule is already contained in "trailing periods or spaces rule"
  • Thomas Weller
    Thomas Weller over 5 years
    This statement is incomplete and misses LPT#
  • Piotr Zierhoffer
    Piotr Zierhoffer about 4 years
    While this code is very nice to read, we should take into account the sorry internals of Path.GetInvalidFileNameChars. Take a look here: referencesource.microsoft.com/#mscorlib/system/io/path.cs,289 - for each character of your fileName, a clone of the array is created.
  • IvanH
    IvanH about 4 years
    @AshkanMobayenKhiabani: InvalidPathChars is obsolete but GetInvalidPathChars is not.
  • Michel Jansson
    Michel Jansson about 4 years
    In theory (according to the docs) this should work, problem is though at least in .NET Core 3.1, it does not.
  • Ciccio Pasticcio
    Ciccio Pasticcio almost 4 years
    "DD:\\\\\AAA.....AAAA". Not valid, but for your code, it is.
  • Igor Levicki
    Igor Levicki almost 4 years
    This should be the accepted answer (with the possible exception of network paths).