How can I compare (directory) paths in C#?


Solution 1

From this answer, this method can handle a few edge cases:

public static string NormalizePath(string path)
{
    return Path.GetFullPath(new Uri(path).LocalPath)
               .TrimEnd(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar)
               .ToUpperInvariant();
}

More details in the original answer. Call it like:

bool pathsEqual = NormalizePath(path1) == NormalizePath(path2);

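This should work for both file and directory paths. As the comments point out, though, new Uri(path) throws for relative paths such as "temp" or "\temp" and gives the wrong local path when the path contains a # character, and ToUpperInvariant forces a case-insensitive comparison, which is not appropriate on case-sensitive file systems.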
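A minimal sketch of a variant that sidesteps the caveats raised in the comments: it skips the Uri round trip, so relative paths and # are not a problem, and it compares with an ordinal comparison instead of upper-casing the strings. The names PathComparison, NormalizeForComparison and PathsEqual are made up for this illustration and are not from the original answer. It is still a purely textual comparison relative to the current working directory.

```csharp
using System;
using System.IO;

public static class PathComparison
{
    // Hypothetical helper (not from the original answer): resolves ".", ".." and
    // relative paths via Path.GetFullPath, then drops trailing separators while
    // keeping a bare root such as "C:\" or "/" intact.
    public static string NormalizeForComparison(string path)
    {
        string full = Path.GetFullPath(path); // throws on null or empty input
        string root = Path.GetPathRoot(full) ?? string.Empty;
        string trimmed = full.TrimEnd(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar);
        return trimmed.Length < root.Length ? root : trimmed;
    }

    // ignoreCase is left to the caller because case sensitivity depends on the
    // file system; ordinal comparison avoids culture-dependent surprises.
    public static bool PathsEqual(string path1, string path2, bool ignoreCase = true)
    {
        StringComparison comparison = ignoreCase
            ? StringComparison.OrdinalIgnoreCase
            : StringComparison.Ordinal;
        return string.Equals(NormalizeForComparison(path1), NormalizeForComparison(path2), comparison);
    }
}
```

Call it like PathComparison.PathsEqual(path1, path2), passing ignoreCase: false when the target file system is case-sensitive.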

Solution 2

GetFullPath seems to do the work, except for case differences (Path.GetFullPath("test") != Path.GetFullPath("TEST")) and the trailing slash. So the following comparison should work fine; it returns 0 when the two paths are equal:

String.Compare(
    Path.GetFullPath(path1).TrimEnd('\\'),
    Path.GetFullPath(path2).TrimEnd('\\'), 
    StringComparison.InvariantCultureIgnoreCase)

Or, if you want to start with DirectoryInfo:

String.Compare(
    dirinfo1.FullName.TrimEnd('\\'),
    dirinfo2.FullName.TrimEnd('\\'), 
    StringComparison.InvariantCultureIgnoreCase)
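If you need this in a HashSet, a Dictionary or LINQ's Distinct, the same idea can be wrapped in an equality comparer. This is only a sketch: DirectoryPathComparer is a made-up name, and it deliberately swaps InvariantCultureIgnoreCase for OrdinalIgnoreCase so that Equals and GetHashCode are guaranteed to agree and no culture rules are involved. It also trims both separator characters rather than just '\\'.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical comparer (not part of the original answer): textual, case-insensitive
// equality of DirectoryInfo.FullName with trailing separators ignored.
public sealed class DirectoryPathComparer : IEqualityComparer<DirectoryInfo>
{
    // FullName is already an absolute path with "." and ".." resolved.
    private static string Key(DirectoryInfo dir) =>
        dir.FullName.TrimEnd(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar);

    public bool Equals(DirectoryInfo x, DirectoryInfo y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return string.Equals(Key(x), Key(y), StringComparison.OrdinalIgnoreCase);
    }

    // Must use the same key and the same comparison as Equals.
    public int GetHashCode(DirectoryInfo dir) =>
        StringComparer.OrdinalIgnoreCase.GetHashCode(Key(dir));
}
```

Usage would be along the lines of new HashSet<DirectoryInfo>(new DirectoryPathComparer()). Like the code above, it never touches the disk, so junctions and symbolic links are not resolved.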

Solution 3

The question has been edited and clarified since it was originally asked and since this answer was originally posted. As the question currently stands, this answer below is not a correct answer. Essentially, the current question is asking for a purely textual path comparison, which is quite different from wanting to determine if two paths resolve to the same file system object. All the other answers, with the exception of Igor Korkhov's, are ultimately based on a textual comparison of two names.

If you actually want to know whether two paths resolve to the same file system object, you must do some IO. Trying to compute two "normalized" names that take into account the myriad possible ways of referencing the same file object is next to impossible. There are issues such as junctions, symbolic links, directory links, and network file shares (which can reference the same file object in different ways). In fact, every single answer above, with the exception of Igor Korkhov's, will give incorrect results in certain circumstances for the question "do these two paths reference the same file system object?".

The question specifically requested that the solution not require any I/O, but if you are going to deal with networked paths, you will absolutely need to do IO: there are cases where it is simply not possible to determine, from local path-string manipulation alone, whether two file references refer to the same physical file. (This can be understood as follows. Suppose a file server has a Windows directory junction somewhere within a shared subtree. In this case, a file can be referenced either directly or through the junction. But the junction resides on, and is resolved by, the file server, so a client cannot determine purely from local information that the two file names refer to the same physical file: the information is simply not available locally. Thus one must do some minimal IO - e.g. open two file object handles - to determine whether the references refer to the same physical file.)

The following solution does some IO, though very little, and correctly determines whether two file system references are semantically identical, i.e. reference the same file object. (If neither file specification refers to a valid file object, all bets are off.)

// Requires: using System; using System.IO; using System.Runtime.InteropServices; using Microsoft.Win32.SafeHandles;
public static bool AreDirsEqual(string dirName1, string dirName2, bool resolveJunctionaAndNetworkPaths = true)
{
    if (string.IsNullOrEmpty(dirName1) || string.IsNullOrEmpty(dirName2))
        return dirName1 == dirName2;
    dirName1 = NormalizePath(dirName1); // assume NormalizePath normalizes/fixes case and path separators to Path.DirectorySeparatorChar
    dirName2 = NormalizePath(dirName2);
    // Walk backwards from the end of both strings to find their common suffix.
    int i1 = dirName1.Length;
    int i2 = dirName2.Length;
    do
    {
        --i1; --i2;
        if (i1 < 0 || i2 < 0)
            return i1 < 0 && i2 < 0; // equal only if both strings were consumed completely
    } while (dirName1[i1] == dirName2[i2]); // if NormalizePath does not fix case (e.g. for international character sets), this comparison must be tweaked
    if (!resolveJunctionaAndNetworkPaths)
        return false;
    // Cut the common suffix off at a separator, so that only whole path components
    // are removed and the differing prefixes can be compared as file system objects.
    for (++i1, ++i2; i1 < dirName1.Length; ++i1, ++i2)
    {
        if (dirName1[i1] == Path.DirectorySeparatorChar)
        {
            dirName1 = dirName1.Substring(0, i1);
            dirName2 = dirName2.Substring(0, i2); // fixed: previously took the substring of dirName1
            break;
        }
    }
    return AreFileSystemObjectsEqual(dirName1, dirName2);
}

public static bool AreFileSystemObjectsEqual(string dirName1, string dirName2)
{
    //NOTE: we cannot lift the call to GetFileHandle out of this routine, because we _must_
    // have both file handles open simultaneously in order for the objectFileInfo comparison
    // to be guaranteed as valid.
    using (SafeFileHandle directoryHandle1 = GetFileHandle(dirName1), directoryHandle2 = GetFileHandle(dirName2))
    {
        BY_HANDLE_FILE_INFORMATION? objectFileInfo1 = GetFileInfo(directoryHandle1);
        BY_HANDLE_FILE_INFORMATION? objectFileInfo2 = GetFileInfo(directoryHandle2);
        return objectFileInfo1 != null
                && objectFileInfo2 != null
                && (objectFileInfo1.Value.FileIndexHigh == objectFileInfo2.Value.FileIndexHigh)
                && (objectFileInfo1.Value.FileIndexLow == objectFileInfo2.Value.FileIndexLow)
                && (objectFileInfo1.Value.VolumeSerialNumber == objectFileInfo2.Value.VolumeSerialNumber);
    }
}

static SafeFileHandle GetFileHandle(string dirName)
{
    const int FILE_ACCESS_NEITHER = 0;
    //const int FILE_SHARE_READ = 1;
    //const int FILE_SHARE_WRITE = 2;
    //const int FILE_SHARE_DELETE = 4;
    const int FILE_SHARE_ANY = 7;//FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE
    const int CREATION_DISPOSITION_OPEN_EXISTING = 3;
    const int FILE_FLAG_BACKUP_SEMANTICS = 0x02000000;
    return CreateFile(dirName, FILE_ACCESS_NEITHER, FILE_SHARE_ANY, System.IntPtr.Zero, CREATION_DISPOSITION_OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, System.IntPtr.Zero);
}


static BY_HANDLE_FILE_INFORMATION? GetFileInfo(SafeFileHandle directoryHandle)
{
    BY_HANDLE_FILE_INFORMATION objectFileInfo;
    if ((directoryHandle == null) || (!GetFileInformationByHandle(directoryHandle.DangerousGetHandle(), out objectFileInfo)))
    {
        return null;
    }
    return objectFileInfo;
}

[DllImport("kernel32.dll", EntryPoint = "CreateFileW", CharSet = CharSet.Unicode, SetLastError = true)]
static extern SafeFileHandle CreateFile(string lpFileName, int dwDesiredAccess, int dwShareMode,
 IntPtr SecurityAttributes, int dwCreationDisposition, int dwFlagsAndAttributes, IntPtr hTemplateFile);
[DllImport("kernel32.dll", SetLastError = true)]
static extern bool GetFileInformationByHandle(IntPtr hFile, out BY_HANDLE_FILE_INFORMATION lpFileInformation);

[StructLayout(LayoutKind.Sequential)]
public struct BY_HANDLE_FILE_INFORMATION
{
    public uint FileAttributes;
    public System.Runtime.InteropServices.ComTypes.FILETIME CreationTime;
    public System.Runtime.InteropServices.ComTypes.FILETIME LastAccessTime;
    public System.Runtime.InteropServices.ComTypes.FILETIME LastWriteTime;
    public uint VolumeSerialNumber;
    public uint FileSizeHigh;
    public uint FileSizeLow;
    public uint NumberOfLinks;
    public uint FileIndexHigh;
    public uint FileIndexLow;
};

Note that the code above contains two lines of the form dirName1 = NormalizePath(dirName1); without specifying what the function NormalizePath is. NormalizePath can be any path-normalization function; many have been provided in other answers to this question. Providing a reasonable NormalizePath function means that AreDirsEqual will give a reasonable answer even when the two input paths refer to non-existent file system objects, i.e. to paths that you simply want to compare at the string level. (Ishmaeel's comment above should be heeded as well; this code does not do that.)

(There may be subtle permissions issues with this code: if a user has only traversal permissions on some initial directories, I am not sure whether the file system accesses required by AreFileSystemObjectsEqual are permitted. The parameter resolveJunctionaAndNetworkPaths at least allows the caller to revert to a purely textual comparison in this case.)

The idea for this came from a reply by Warren Stevens in a similar question I posted on SuperUser: https://superuser.com/a/881966/241981

Solution 4

The implementation of paths in .NET has some shortcomings, and there are many complaints about it. Patrick Smacchia, the creator of NDepend, published an open source library that handles common and complex path operations. If you do a lot of path comparisons in your application, this library might be useful to you.

Solution 5

It seems that P/Invoking GetFinalPathNameByHandle() would be the most reliable solution.

UPD: Oops, I didn't take into account your desire not to use any I/O
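For reference, a rough sketch of what that P/Invoke could look like. The Win32 functions CreateFileW and GetFinalPathNameByHandleW are real; the wrapper names FinalPath, Resolve and SameObject are invented for this example. It is Windows-only (Vista or newer), requires that both paths exist and can be opened, and comparing the resulting names case-insensitively is an assumption on my part rather than something the API guarantees.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;
using System.Text;
using Microsoft.Win32.SafeHandles;

public static class FinalPath
{
    private const uint FILE_SHARE_ALL = 7;                       // READ | WRITE | DELETE
    private const uint OPEN_EXISTING = 3;
    private const uint FILE_FLAG_BACKUP_SEMANTICS = 0x02000000;  // needed to open directories

    [DllImport("kernel32.dll", EntryPoint = "CreateFileW", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern SafeFileHandle CreateFile(string lpFileName, uint dwDesiredAccess, uint dwShareMode,
        IntPtr lpSecurityAttributes, uint dwCreationDisposition, uint dwFlagsAndAttributes, IntPtr hTemplateFile);

    [DllImport("kernel32.dll", EntryPoint = "GetFinalPathNameByHandleW", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern uint GetFinalPathNameByHandle(SafeFileHandle hFile, StringBuilder lpszFilePath,
        uint cchFilePath, uint dwFlags);

    // Hypothetical wrapper: returns the final path (with a \\?\ prefix) of an existing file or directory.
    public static string Resolve(string path)
    {
        using (SafeFileHandle handle = CreateFile(path, 0, FILE_SHARE_ALL, IntPtr.Zero,
                   OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, IntPtr.Zero))
        {
            if (handle.IsInvalid)
                throw new Win32Exception(Marshal.GetLastWin32Error());

            var buffer = new StringBuilder(512);
            uint length = GetFinalPathNameByHandle(handle, buffer, (uint)buffer.Capacity, 0);
            if (length >= buffer.Capacity)
            {
                // Buffer too small: the return value is the required size, so retry once.
                buffer.EnsureCapacity((int)length);
                length = GetFinalPathNameByHandle(handle, buffer, (uint)buffer.Capacity, 0);
            }
            if (length == 0)
                throw new Win32Exception(Marshal.GetLastWin32Error());
            return buffer.ToString();
        }
    }

    // Assumption: two references to the same object yield the same final name, ignoring case.
    public static bool SameObject(string path1, string path2) =>
        string.Equals(Resolve(path1), Resolve(path2), StringComparison.OrdinalIgnoreCase);
}
```

Because both paths have to be opened, this does not help with the "no I/O" and "paths may not exist" requirements of the question; it only answers whether two existing paths end up at the same place.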


Author: Eamon Nerbonne

My work and hobby concern programming: I'm interested in data-mining, and enjoy collecting interesting stats from last.fm's openly accessible web-services. Open source libraries: ValueUtils (nuget: ValueUtils) provides a .NET base class for ValueObjects with auto-implemented GetHashCode and Equals using runtime code generation to perform similar to hand-rolled versions. Can also generate hash function and equality delegates for other types. ExpressionToCode (nuget: ExpressionToCodeLib) generates C# source code from LINQ expression trees and can annotate that code with runtime values, which is hopefully useful in Unit Testing (integrates with NUnit, xUnit.net & mstest, but runs fine without a unit test framework too). a-vs-an (nuget: AvsAn) determines whether "a" or "an" is more appropriate before a word, symbol, or acronym. Fast & accurate. Uses real-world statistics aggregated from wikipedia, and can therefore deal well even with cases that might trip up rules-based systems (e.g. an NSA analyst vs. a NASA flight plan). Includes a C# and Javascript implementation; the javascript implementation you can try online.

Updated on July 05, 2022

Comments

  • Eamon Nerbonne
    Eamon Nerbonne almost 2 years

    If I have two DirectoryInfo objects, how can I compare them for semantic equality? For example, the following paths should all be considered equal to C:\temp:

    • C:\temp
    • C:\temp\
    • C:\temp\.
    • C:\temp\x\..\..\temp\.

    The following may or may not be equal to C:\temp:

    • \temp if the current working directory is on drive C:\
    • temp if the current working directory is C:\
    • C:\temp.
    • C:\temp...\

    If it's important to consider the current working directory, I can figure that out myself, so that's not that important. Trailing dots are stripped on Windows, so those paths really should be equal - but they aren't stripped on Unix, so under Mono I'd expect other results.

    Case sensitivity is optional. The paths may or may not exist, and the user may or may not have permissions to the path - I'd prefer a fast robust method that doesn't require any I/O (so no permission checking), but if there's something built-in I'd be happy with anything "good enough" too...

    I realize that without I/O it's not possible to determine whether some intermediate storage layer happens to have mapped the same storage to the same file (and even with I/O, when things get messy enough it's likely impossible). However, it should be possible to at least positively identify paths that are equivalent, regardless of the underlying filesystem, i.e. paths that necessarily would resolve to the same file (if it exists) on all possible file-systems of a given type. The reason this is sometimes useful is (A) because I certainly want to check this first, before doing I/O, (B) I/O sometimes triggers problematic side-effects, and (C) various other software components sometimes mangle paths provided, and it's helpful to be able to compare in a way that's insensitive to most common transformations of equivalent paths, and finally (D) to prepare deployments it's useful to do some sanity checks beforehand, but those occur before the to-be-deployed-on system is even accessible.

    • Elaskanator
      Elaskanator almost 5 years
      Why does the System.IO.DirectoryInfo class not implement bool Equals(DirectoryInfo other) to handle this? Seems to me that this stuff should be so standardized by now that we shouldn't even be able to mess up simple things like comparing two paths.
    • Good Night Nerd Pride
      Good Night Nerd Pride about 3 years
    • Eamon Nerbonne
      Eamon Nerbonne about 3 years
      @GoodNightNerdPride no, largely because the question doesn't clarify what it means for paths to be identical; e.g. the need to deal with non-existing paths. I'm looking for path equivalence, not file system object equivalence on a specific system.
    • Good Night Nerd Pride
      Good Night Nerd Pride about 3 years
      Have you seen this answer though? It detects almost all your examples cases as equal paths (except temp and C:\temp...\ ) without any IO.
  • Steven
    Steven about 14 years
    This is the easy version of Binary Worrier's solution. Please note there is a problem however with the trailing slash: "c:\temp" is unequal to "c:\temp\".
  • Steven
    Steven about 14 years
    The Name property will only return the name of the deepest subdirectory, so "c:\foo\bar" will return "bar". Comparing "d:\foo\bar" with "c:\bar" will result true, which is not good.
  • Andy Shellam
    Andy Shellam about 14 years
    That's why you do a recursive comparison on all parents! Please remove the downvote, this is a perfectly acceptable solution. I've modified my answer with a full code sample.
  • Eamon Nerbonne
    Eamon Nerbonne about 14 years
    Well, I'd prefer no I/O, but a simple I/O using solution is better than writing something from scratch...
  • Eamon Nerbonne
    Eamon Nerbonne about 14 years
    Hmm, interesting - have you used it?
  • Andy Shellam
    Andy Shellam about 14 years
    You could do Path.GetFullPath(pathx).ToUpperInvariant().TrimEnd('\\') to get rid of the case sensitivity. This should be used with caution on UNIX though, as UNIX treats two names of different case as different folders, whereas Windows treats them as one and the same.
  • Eamon Nerbonne
    Eamon Nerbonne about 14 years
    Ok, so if I normalize out the trailing slashes and potentially the casing - and accept the fact that it does some FileIOPermission stuff - this looks like a good start, thanks!
  • Igor Korkhov
    Igor Korkhov about 14 years
    @Eamon Nerbonne: my solution has two more downsides: 1) it will work only on Vista and newer OSs 2) it won't work if at least one of the paths does not exist. But it also has one benefit: it works with symbolic links, i.e. answers your question "how can I compare them for semantic equality?", while GetFullPath() doesn't. So it's up to you to decide if you need real semantic equality or not.
  • Eamon Nerbonne
    Eamon Nerbonne about 14 years
    If you edit this to be case invariant and use DirectoryInfo's (via FullName), you'll have a perfect answer :-)
  • Eamon Nerbonne
    Eamon Nerbonne about 14 years
    Hmm, this works even for D:\temp vs. C:\temp. Fine idea; you didn't deal with case sensitivity though, and it could be a bit shorter: while(dir1!=null && dir2!=null) if(!string.Equals(dir1.Name,dir2.Name,StringComparison.InvariantCultureIgnoreCase)) return false; else {dir1=dir1.Parent; dir2=dir2.Parent;} return dir1==dir2;
  • Andy Shellam
    Andy Shellam about 14 years
    Yeah case sensitivity is a difficult one because the OP wanted code to work on both Mono and Windows, but on Linux 2 names of different case are considered different, but on Windows they're considered to be the same file, so it's a per-platform decision.
  • VladV
    VladV about 14 years
    @Eamon, I've added DirectoryInfo variant for you :-) . And it was case invariant already - that's what StringComparison.InvariantCultureIgnoreCase does.
  • Eamon Nerbonne
    Eamon Nerbonne about 14 years
    Oh, and in case some other user stumbles across this answer; FullName does require path-discovery security permissions and is sensitive to the current-working-directory (which effectively means you can only compare absolute paths - or relative paths as evaluated in the CWD).
  • Denis
    Denis over 13 years
    List<T> _list = new List<T>(bla-bla-bla); _list.Sort(new FSChangeElemComparerByPath());
  • Mashmagar
    Mashmagar about 12 years
    Will this handle two paths to the same directory when one path uses a junction point?
  • VladV
    VladV about 12 years
    @Mashmagar, no, it won't. Only paths are compared, not any real directories (as OP said, they might not exist at all).
  • Kevin Coulombe
    Kevin Coulombe over 11 years
    I have used it to determine if one directory contains another (e.g. C:\A/B contains C:\a\b\c\d\e\..\..\..\f\g) and it works very well.
  • Pato
    Pato almost 11 years
    Maybe you should use System.IO.Path.DirectorySeparatorChar instead of '\\'.
  • Eamon Nerbonne
    Eamon Nerbonne over 10 years
    Somewhat amusingly, that's actually what I ended up doing since this has no I/O!
  • Eamon Nerbonne
    Eamon Nerbonne about 10 years
    This is essentially equivalent to what @nawfal suggests (which is currently the accepted answer)
  • ili
    ili over 9 years
    new Uri(path).LocalPath - would give wrong path in case of # symbol in path
  • Sam
    Sam over 9 years
    What about the trailing slash issue?
  • 2xMax
    2xMax over 8 years
    System.IO.Path.GetFullPath(@"C:\LOL").Equals(System.IO.Path.GetFullPath(@"C:\LOL\")) returns false
  • David I. McIntosh
    David I. McIntosh over 7 years
    There are cases, in particular for network files, where this will give the incorrect answer. In fact, when dealing with network files, there are cases where it will simply not be possible to correctly determine the answer without doing any IO, i.e. without dealing with file handles. See my answer below for a "more correct" solution, which admittedly uses IO contrary to the specific request in the question.
  • John Reynolds
    John Reynolds about 7 years
    Note that this will throw an exception for examples like "\temp" and "temp" in the question.
  • user3613932
    user3613932 almost 7 years
    If you stick in Path.GetFullPath before you pass the path to the Uri constructor then paths such as ., .., temp, etc. can be handled. Code will become Path.GetFullPath(new Uri(Path.GetFullPath(path)).LocalPath).TrimEnd(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar).ToUpperInvariant()
  • Ishmaeel
    Ishmaeel over 6 years
    Used ToUpperInvariant on a file system path? Congratulations. You now have an application that has a chance to blow up mysteriously on operating systems with Turkish regional settings.
  • David I. McIntosh
    David I. McIntosh about 4 years
    This will give incorrect answer in certain network file cases. If one is going to include networked files, it is NOT POSSIBLE to do this correctly without doing some IO - any answer that attempts to do this based on resolving path names WILL give incorrect answers in certain cases. As a result, almost every answer on this page WILL give incorrect answers for networked files in certain cases.
  • Eamon Nerbonne
    Eamon Nerbonne about 3 years
    This answer won't work for not-yet existing paths, but it's pretty nifty for existing objects that you have permissions for in the way it can look through local smb handles. In general, any path may be remapped to refer to the same storage; so the question of whether two existing accessible file system objects refer to the same storage is a different one (and likely one without a completely correct answer, because it depends on all kinds of layers in between).
  • David I. McIntosh
    David I. McIntosh about 3 years
    @Eamon.Nerbonne - Thanks for the edits. Re: "This answer won't work for not-yet existing paths" - correct, hence my comment in my answer above: "if neither file specification refers to a valid file object, all bets are off". Note that if one refers to a valid file object and the other does not, they are demonstrably not equal and the code above will correctly return "false" in this case. If neither exists but the strings are equal, it returns true, which is also correct. If neither exists and they are not equal at the string level, it returns false, see newly added comments in answer.
  • David I. McIntosh
    David I. McIntosh about 3 years
    Technically, suppose the two string paths dirNameX refer to non-existent file system objects, and suppose NormalizePath normalizes all path separators to \ . Suppose also the two string paths are of the form <path1>\<terminalPath> and <path2>\<terminalPath>, where <path1> and <path2> refer to the same existing directory (but, as mentioned, may not be equal at the string level). Then one could reasonably say the two dirNameX's refer to the same path. The newly provided code above accounts for this. The code is so far untested...
  • nawfal
    nawfal about 3 years
    @Ishmaeel you are right. To everyone else: please consider the answers here as a starting point. The main idea was to give the gist of why case should be handled (or rather ignored, on a case-insensitive OS like Windows).
  • Eamon Nerbonne
    Eamon Nerbonne over 2 years
    what about AreEqual(@"C:\Temp", @"C:\Temp2")?