Normalize file path with WinAPI

Solution 1

Depending on whether the paths can be relative, contain "..", or involve junction points or UNC paths, this may be more difficult than you think. The best way might be to use the GetFileInformationByHandle() function, as in this answer.

Edit: I agree with the comment by RBerteig that this may become hard to impossible to do if the paths are not pointing to a local file. Any comment on how to safely handle this case would be greatly appreciated.
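A minimal sketch of the GetFileInformationByHandle() approach, assuming both paths are local and can be opened. The names file_id, same_file_id, open_for_query, read_file_id and paths_refer_to_same_file are hypothetical helpers, not part of WinAPI. The idea is to compare the volume serial number and the file index of both files while both handles are still open. The Windows-only part is guarded so the comparison helper compiles anywhere.

```c
#include <assert.h>
#include <stddef.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical identity record: the three fields that identify a file. */
struct file_id {
    unsigned long volume;      /* from dwVolumeSerialNumber */
    unsigned long index_high;  /* from nFileIndexHigh */
    unsigned long index_low;   /* from nFileIndexLow */
};

/* Returns 1 when both records describe the same file, 0 otherwise. */
static int same_file_id(const struct file_id *a, const struct file_id *b)
{
    assert(a != NULL && b != NULL);
    return a->volume == b->volume &&
           a->index_high == b->index_high &&
           a->index_low == b->index_low;
}

#ifdef _WIN32
/* Opens a file or directory for metadata queries only (no read/write access). */
static HANDLE open_for_query(const char *path)
{
    return CreateFileA(path, 0,
                       FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                       NULL, OPEN_EXISTING,
                       FILE_FLAG_BACKUP_SEMANTICS, /* needed to open directories */
                       NULL);
}

/* Fills *out from an open handle. Returns 0 on success, -1 on failure. */
static int read_file_id(HANDLE h, struct file_id *out)
{
    BY_HANDLE_FILE_INFORMATION info;
    if (!GetFileInformationByHandle(h, &info))
        return -1;
    out->volume = info.dwVolumeSerialNumber;
    out->index_high = info.nFileIndexHigh;
    out->index_low = info.nFileIndexLow;
    return 0;
}

/* Returns 1 if both paths name the same file, 0 if not, -1 on error. */
static int paths_refer_to_same_file(const char *a, const char *b)
{
    struct file_id ia, ib;
    int result = -1;
    HANDLE ha = open_for_query(a);
    HANDLE hb = open_for_query(b);

    /* Both handles stay open while the identities are read and compared. */
    if (ha != INVALID_HANDLE_VALUE && hb != INVALID_HANDLE_VALUE &&
        read_file_id(ha, &ia) == 0 && read_file_id(hb, &ib) == 0)
        result = same_file_id(&ia, &ib);

    if (ha != INVALID_HANDLE_VALUE) CloseHandle(ha);
    if (hb != INVALID_HANDLE_VALUE) CloseHandle(hb);
    return result;
}
#endif
```

This compares the files themselves rather than the path strings, so casing and slash direction stop mattering. It does not address the non-local case raised in the edit above.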

Solution 2

May I suggest PathCanonicalize?

Solution 3

I found a blog posting with the most thorough, even elaborate, function I have ever seen to solve this problem. It handles anything, even horrible corner cases like V:foo.txt where you used the subst command to map V: to Z: but you already used subst to map Z: to some other drive; it loops until all subst commands are unwound. URL:

http://pdh11.blogspot.com/2009/05/pathcanonicalize-versus-what-it-says-on.html

My project is pure C code, and that function is C++. I started to translate it, but then I figured out that I could get the normalized path that I wanted with one function call: GetLongPathName(). This won't handle the horrible corner cases, but it handled my immediate needs.

I discovered that GetLongPathName("foo.txt") simply returns foo.txt, but by prepending ./ to the filename I got the expansion to normalized form:

GetLongPathName("./foo.txt"), if executed in directory C:\Users\steveha, returns C:\Users\steveha\foo.txt.

So, in pseudocode:

if the second char of the pathname is ':' or the first char is '/' or '\':
    just call GetLongPathName()
else:
    copy "./" to a temp buffer
    copy the filename to temp buffer + 2, to get a copy of the filename prepended with "./"
    call GetLongPathName() on the temp buffer
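The pseudocode above could be sketched in C roughly as follows. The helper names has_drive_or_root, prepare_for_long_path and normalize_with_long_path are hypothetical; only GetLongPathNameA is a real API call, and it is compiled only on Windows. This is a sketch under the same assumptions as the answer, not a tested replacement for the blog's function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Returns 1 if the second char is ':' or the first char is '/' or '\\'. */
static int has_drive_or_root(const char *path)
{
    if (path[0] == '/' || path[0] == '\\')
        return 1;
    return path[0] != '\0' && path[1] == ':';
}

/* Copies path into out, prepending "./" when it has no drive or root.
   Returns 0 on success, -1 if out is too small. */
static int prepare_for_long_path(const char *path, char *out, size_t out_size)
{
    const char *prefix;
    size_t need;

    assert(path != NULL && out != NULL);
    prefix = has_drive_or_root(path) ? "" : "./";
    need = strlen(prefix) + strlen(path) + 1; /* +1 for the terminator */
    if (need > out_size)
        return -1;
    strcpy(out, prefix);
    strcat(out, path);
    return 0;
}

#ifdef _WIN32
/* Returns what GetLongPathNameA returns, or 0 if the temp buffer is too small. */
static DWORD normalize_with_long_path(const char *path, char *out, DWORD out_size)
{
    char prepared[MAX_PATH + 2];
    if (prepare_for_long_path(path, prepared, sizeof prepared) != 0)
        return 0;
    return GetLongPathNameA(prepared, out, out_size);
}
#endif
```

With this sketch, "foo.txt" is passed on as "./foo.txt", while "C:\x\foo.txt", "/foo.txt" and drive-relative names such as "V:foo.txt" are passed on unchanged.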

Solution 4

There are odd cases. For example, "c:\windows\..\data\myfile.txt" is the same as "c:\data\.\myfile.txt" and "c:\data\myfile.txt". You can have any number of "\.\" and "\..\" in there. You might look into the Windows API function GetFullPathName. It might do canonicalization for you.
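To illustrate the kind of collapsing involved, here is a small portable sketch that works on the string only. The function collapse_dots is a hypothetical helper, and it assumes an absolute path of the form X:\... (it is not a reimplementation of GetFullPathName). On Windows you would more likely call GetFullPathName itself, shown in the guarded wrapper with the hypothetical name full_path.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Converts '/' to '\\' and removes "." and ".." segments in place.
   Assumption: path is absolute, e.g. "c:\\dir\\file". A ".." at the root is dropped. */
static void collapse_dots(char *path)
{
    char *p, *root, *out;
    const char *in;

    assert(path != NULL);
    for (p = path; *p != '\0'; ++p)   /* unify the separators first */
        if (*p == '/')
            *p = '\\';

    root = path;                      /* segments are never popped past root */
    if (path[0] != '\0' && path[1] == ':')
        root = path + 2;
    if (*root == '\\')
        root++;

    out = root;                       /* write cursor */
    in = root;                        /* read cursor, always >= out */
    while (*in != '\0') {
        const char *seg = in;
        size_t len = strcspn(in, "\\");
        in += len;
        if (*in == '\\')
            in++;                     /* step over the separator */

        if (len == 0 || (len == 1 && seg[0] == '.'))
            continue;                 /* empty segment or "." */
        if (len == 2 && seg[0] == '.' && seg[1] == '.') {
            /* drop the previously written segment and its separator */
            while (out > root && out[-1] != '\\')
                out--;
            if (out > root)
                out--;
            continue;
        }
        if (out > root)
            *out++ = '\\';
        memmove(out, seg, len);       /* regions may overlap */
        out += len;
    }
    *out = '\0';
}

#ifdef _WIN32
/* Thin wrapper around the real API; returns what GetFullPathNameA returns. */
static DWORD full_path(const char *path, char *out, DWORD out_size)
{
    return GetFullPathNameA(path, out_size, out, NULL);
}
#endif
```

Applied to the examples above, all three spellings collapse to "c:\data\myfile.txt". Like GetFullPathName, this kind of processing does not consult the file system, so it cannot see through junction points or subst drives.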

Author: Alex B

Updated on June 04, 2022

Comments

  • Alex B
    Alex B almost 2 years

    Possible Duplicate:
    Best way to determine if two path reference to same file in C/C++

    Given two file path strings with potentially different casing and slashes ('\' vs '/'), is there a quick way (that does not involve writing my own function) to normalize both paths to the same form, or at least to test them for equivalence?

    I'm restricted to WinAPI and standard C++. All files are local.

  • RBerteig
    RBerteig about 15 years
    As long as the two paths resolve to files on the same computer, then it looks like GetFileInformationByHandle() is the right answer. If they resolve to different computers, I don't see a guarantee, and I don't see a trivial way to get one, either. It isn't necessarily easy to test for this.
  • RBerteig
    RBerteig about 15 years
    It doesn't look like that addresses either Junction points or UNC paths... but it does look useful to know about.
  • Alex B
    Alex B about 15 years
    All files are local in my case, so this works.
  • Jim Mischel
    Jim Mischel about 15 years
    That's the method I was looking for. Not GetFullPathName.
  • steveha
    steveha almost 13 years
    After viewing this answer, I tried using PathCanonicalize() and discovered that it's horribly broken. PathCanonicalize("../foo.txt") always returns /foo.txt! PathCanonicalize() just does trivial editing on the string, and the above broken-ness is documented behavior. Useless. I will post another answer with what I found.
  • steveha
    steveha almost 13 years
    @RBerteig: I don't see a trivial way to get one, either. But I found a very non-trivial one and put it in an answer; take a look. Even that one is only mostly foolproof, but it ought to be more than enough for most people.
  • RBerteig
    RBerteig almost 13 years
    Well, at least that post provides code that looks like it does all the heavy lifting. It certainly handles more corner cases, and more completely, than you'd expect. Clearly the author has burned his fingers on these issues a few times... Your simpler fallback is probably good enough for most cases, and lets the exotic cases be used to fool software that needs fooling.
  • steveha
    steveha almost 13 years
    My users have probably never even heard of the subst command, and my simple C code has worked perfectly so far.
  • RBerteig
    RBerteig almost 13 years
    The big surprise to us old timers is that subst still exists... it is even present in Win7 64-bit, but it dates back to DOS 5.0 if not earlier. Win7 comes with a bunch of JUNCTIONS for backward compatibility. They are giving some of my older Perl scripts that wander through the file system fits, as they aren't quite directories and certainly aren't files. One example is that C:\Documents and Settings is a JUNCTION mapped to C:\Users.
  • Serge Rogatch
    Serge Rogatch almost 9 years
  • steveha
    steveha almost 9 years
    I answered this question four years ago, and I no longer remember why I used GetLongPathName() rather than GetFullPathName(). That blog posting from 2009 that I linked shows code that uses GetFullPathName(), and if it's good enough for that guy, I'm sure it's a good way to go.
  • 0xC0000022L
    0xC0000022L over 2 years
    Why should it bother handling junction points? That's done by the object manager.