How do I "normalize" a pathname using boost::filesystem?
Solution 1
Boost v1.48 and above
You can use boost::filesystem::canonical:
path canonical(const path& p, const path& base = current_path());
path canonical(const path& p, system::error_code& ec);
path canonical(const path& p, const path& base, system::error_code& ec);
http://www.boost.org/doc/libs/1_48_0/libs/filesystem/v3/doc/reference.html#canonical
v1.48 and above also provide the boost::filesystem::read_symlink function for resolving symbolic links.
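As a minimal sketch of the non-throwing overload, the helper below reports failure through an error code instead of an exception. It is written against C++17 std::filesystem, which mirrors the Boost API here; that substitution is an assumption made so the snippet builds without Boost. With Boost the error type would be boost::system::error_code rather than std::error_code. The name try_canonical is hypothetical.

```cpp
#include <cassert>
#include <filesystem>
#include <optional>
#include <system_error>

// Assumption: C++17 std::filesystem stands in for boost::filesystem (v1.48+).
namespace fs = std::filesystem;

// Hypothetical helper: returns the canonical form of p, or std::nullopt
// when p cannot be resolved (for example because it does not exist).
std::optional<fs::path> try_canonical(const fs::path& p) {
    std::error_code ec;
    fs::path resolved = fs::canonical(p, ec);  // non-throwing overload
    if (ec) {
        return std::nullopt;
    }
    assert(resolved.is_absolute());  // canonical always yields an absolute path
    return resolved;
}
```

Calling try_canonical on an existing directory should return a value, while a path with a missing component should return std::nullopt.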
Boost versions prior to v1.48
As mentioned in other answers, you can't normalise because boost::filesystem can't follow symbolic links. However, you can write a function that normalises "as much as possible" (assuming "." and ".." are treated normally) because boost offers the ability to determine whether or not a file is a symbolic link.
That is to say, if the parent of the ".." is a symbolic link then you have to retain it, otherwise it is probably safe to drop it and it's probably always safe to remove ".".
It's similar to manipulating the actual string, but slightly more elegant.
boost::filesystem::path resolve(
    const boost::filesystem::path& p,
    const boost::filesystem::path& base = boost::filesystem::current_path())
{
    boost::filesystem::path abs_p = boost::filesystem::absolute(p, base);
    boost::filesystem::path result;
    for (boost::filesystem::path::iterator it = abs_p.begin();
         it != abs_p.end();
         ++it)
    {
        if (*it == "..")
        {
            // /a/b/.. is not necessarily /a if b is a symbolic link
            if (boost::filesystem::is_symlink(result))
                result /= *it;
            // /a/b/../.. is not /a/b/.. under most circumstances
            // We can end up with ..s in our result because of symbolic links
            else if (result.filename() == "..")
                result /= *it;
            // Otherwise it should be safe to resolve the parent
            else
                result = result.parent_path();
        }
        else if (*it == ".")
        {
            // Ignore
        }
        else
        {
            // Just cat other path entries
            result /= *it;
        }
    }
    return result;
}
Solution 2
With version 3 of boost::filesystem you can also try to remove all the symbolic links with a call to canonical. This can be done only for existing paths, so a function that also works for non-existing ones requires two steps (tested on MacOS Lion and updated for Windows thanks to @void.pointer's comment):
boost::filesystem::path normalize(const boost::filesystem::path &path) {
    boost::filesystem::path absPath = absolute(path);
    boost::filesystem::path::iterator it = absPath.begin();
    boost::filesystem::path result = *it++;
    // Get canonical version of the existing part
    // (check for the end first, so the end iterator is never dereferenced)
    for (; it != absPath.end() && exists(result / *it); ++it) {
        result /= *it;
    }
    result = canonical(result);
    // For the rest remove ".." and "." in a path with no symlinks
    for (; it != absPath.end(); ++it) {
        // Just move back on ../
        if (*it == "..") {
            result = result.parent_path();
        }
        // Ignore "."
        else if (*it != ".") {
            // Just cat other path entries
            result /= *it;
        }
    }
    // Make sure the dir separators are correct even on Windows
    return result.make_preferred();
}
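The same two-step idea (canonicalize the part that exists, then collapse "." and ".." in the remainder) is available ready-made as weakly_canonical, which the comments below also mention for Boost. The sketch uses C++17 std::filesystem as a stand-in for Boost, which is an assumption for portability of the example; the wrapper name normalize_maybe_missing is hypothetical.

```cpp
#include <cassert>
#include <filesystem>

// Assumption: C++17 std::filesystem stands in for boost::filesystem.
namespace fs = std::filesystem;

// Hypothetical wrapper: resolves symlinks in the leading part of p that
// exists on disk and lexically normalizes whatever follows it.
fs::path normalize_maybe_missing(const fs::path& p) {
    fs::path result = fs::weakly_canonical(fs::absolute(p));
    assert(result.is_absolute());  // input was made absolute first
    return result;
}
```

Unlike canonical, this does not fail merely because the tail of the path is missing, but it can still throw on other filesystem errors.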
Solution 3
Your complaints and/or wishes about canonical have been addressed by Boost 1.60 [1], which added the path member function
path lexically_normal() const;
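A small sketch of what this looks like for the path in the question. lexically_normal is a member of path and never touches the disk, so it also works for files that do not exist. The example uses C++17 std::filesystem as a stand-in for Boost 1.60+ (an assumption, so that it builds without Boost) and forward slashes so it behaves the same on POSIX and Windows. The function names are hypothetical.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

// Assumption: C++17 std::filesystem stands in for boost::filesystem (1.60+).
namespace fs = std::filesystem;

// Hypothetical helper: purely lexical cleanup for display purposes.
// Collapses "." and "dir/.." without consulting the filesystem.
std::string display_form(const fs::path& p) {
    return p.lexically_normal().generic_string();
}

// Usage sketch with the path from the question.
void display_form_example() {
    const std::string shown = display_form(
        "c:/some/deep/application/folder/../configuration/instance/../instance/myfile.cfg");
    assert(shown == "c:/some/deep/application/configuration/instance/myfile.cfg");
}
```

Because the result ignores symbolic links, it is best suited to showing a path to the user rather than to deciding which file a path refers to.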
Solution 4
The explanation is at http://www.boost.org/doc/libs/1_40_0/libs/filesystem/doc/design.htm:
Work within the realities described below.
Rationale: This isn't a research project. The need is for something that works on today's platforms, including some of the embedded operating systems with limited file systems. Because of the emphasis on portability, such a library would be much more useful if standardized. That means being able to work with a much wider range of platforms that just Unix or Windows and their clones.
where the "reality" applicable to the removal of normalize is:
Symbolic links cause canonical and normal form of some paths to represent different files or directories. For example, given the directory hierarchy /a/b/c, with a symbolic link in /a named x pointing to b/c, then under POSIX Pathname Resolution rules a path of "/a/x/.." should resolve to "/a/b". If "/a/x/.." were first normalized to "/a", it would resolve incorrectly. (Case supplied by Walter Landry.)
In short, the library cannot really normalize a path without access to the underlying filesystem, which makes the operation a) unreliable, b) unpredictable, c) wrong, d) all of the above.
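The quoted case can be reproduced on disk. The sketch below builds a/b/c with a symlink a/x pointing to b/c inside a scratch directory, then compares resolving "a/x/.." through the filesystem with collapsing it lexically. It assumes a POSIX-like system (creating symlinks on Windows may need extra privileges) and uses C++17 std::filesystem as a stand-in for Boost. The function name and the scratch directory name are hypothetical.

```cpp
#include <cassert>
#include <filesystem>

// Assumption: C++17 std::filesystem stands in for boost::filesystem.
namespace fs = std::filesystem;

// Hypothetical demo: returns true when filesystem resolution and lexical
// normalization of "a/x/.." name different directories.
bool symlink_changes_parent() {
    // Scratch area; canonicalized because the temp dir may itself be a symlink.
    const fs::path root =
        fs::canonical(fs::temp_directory_path()) / "hypothetical_symlink_demo_7f3a";
    fs::remove_all(root);
    const fs::path a = root / "a";
    fs::create_directories(a / "b" / "c");
    fs::create_directory_symlink("b/c", a / "x");  // a/x -> b/c (relative link)

    const fs::path p = a / "x" / "..";
    const fs::path resolved = fs::canonical(p);      // follows the link: <root>/a/b
    const fs::path lexical = p.lexically_normal();   // ignores the link: <root>/a/

    assert(resolved == a / "b");
    const bool differs = (resolved != lexical);
    fs::remove_all(root);  // clean up the scratch directory
    return differs;
}
```

On a system where the symlink can be created, the two results differ, which is exactly the situation the design document describes.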
Solution 5
It's still there. Keep using it.
I imagine they deprecated it because symbolic links mean that the collapsed path isn't necessarily equivalent. If c:\full\path were a symlink to c:\rough, then c:\full\path\.. would be c:\, not c:\full.
Mike Willekes
Updated on May 04, 2020
Comments
-
Mike Willekes almost 4 years
We are using boost::filesystem in our application. I have a 'full' path that is constructed by concatenating several paths together:
#include <boost/filesystem/operations.hpp>
#include <iostream>

namespace bf = boost::filesystem;

int main()
{
    bf::path root("c:\\some\\deep\\application\\folder");
    bf::path subdir("..\\configuration\\instance");
    bf::path cfgfile("..\\instance\\myfile.cfg");
    bf::path final(root / subdir / cfgfile);
    std::cout << final.file_string();
}
The final path is printed as:
c:\some\deep\application\folder\..\configuration\instance\..\instance\myfile.cfg
This is a valid path, but when I display it to the user I'd prefer it to be normalized. (Note: I'm not even sure if "normalized" is the correct word for this). Like this:
c:\some\deep\application\configuration\instance\myfile.cfg
Earlier versions of Boost had a normalize() function, but it seems to have been deprecated and removed (without any explanation). Is there a reason I should not use the BOOST_FILESYSTEM_NO_DEPRECATED macro? Is there an alternative way to do this with the Boost Filesystem library? Or should I write code to directly manipulate the path as a string?
-
Kieveli over 14 years
I think wanting to normalize the path is sane, natural, and expected behaviour. Looks like they have over-thought this one and erred on the side of wrong.
-
just somebody over 14 years
Boost.Filesystem is aiming at inclusion in the C++ standard, which is why they removed the features that are useful only on some of the platforms. There's already a de facto and de jure standard for the feature you're longing for; it's realpath() in POSIX:
The realpath() function shall derive, from the pathname pointed to by file_name, an absolute pathname that resolves to the same directory entry, whose resolution does not involve '.', '..', or symbolic links.
% cd /home/foo/tmp
% ln -s foo ..
% echo $PWD/foo/..
/home/foo/tmp/foo/..
% realpath $PWD/foo/..
/home/foo
-
Matthieu M. over 14 years
This part of symbolic links always bugged me; that's quite a violation of the Principle of Least Astonishment :/
-
just somebody over 14 years
Which part? AFAICS "this part" is the whole point of symlinks, no?
-
Mike Willekes over 14 years
At the very least the macro to re-enable this functionality should have been called BOOST_FILESYSTEM_NOT_NECESSARILY_PORTABLE (or something like that). Calling the code 'deprecated' makes one think that it could be dropped from a future release.
-
Krish1992 about 13 years
Sucks majorly. Interesting to claim "this isn't a research project" and then pretty much directly afterwards come up with an excuse which leads everyone to believe that it is. Surely a better solution would have been to implement it in terms of, for example, realpath() on POSIX and whatever is needed on Windows, and then throw an exception on unsupported platforms?
-
just somebody over 11 years
Not sure why this answer has gotten a downvote, as it's a copy/paste straight from the horse's mouth.
-
jarzec over 11 years
Sorry, a ++ was missing in line 4 above.
-
jarzec about 11 years
canonical works only for existing files. I needed something that also works for non-existing paths (canonical is used by normalize for the existing bit of the path).
-
zett42 almost 5 years
Note that boost::filesystem::canonical() requires a path that actually exists in the filesystem. So you cannot use it to normalize a path that may point to a filesystem item that currently does not exist, such as a path on a removable medium or a disconnected network drive. In these cases the function will report an error. Compare with boost::filesystem::path::lexically_normal.
-
zett42 almost 5 years
This doesn't really answer the question.
-
void.pointer over 4 years
This doesn't work right on Windows. If I pass in "E:\\foo\\.\\bar", I get back "E:/foo\\bar". The slashes are inconsistent. Change the return expression to return result.make_preferred() and it fixes the issue. Now I get "E:\\foo\\bar".
-
jarzec over 4 years
@void.pointer Thanks a lot. I never had a chance to test this on Windows.
-
Evgen about 4 years
Typo in "make_prefered()" in the example. Also note that canonical has problems with Windows links and junctions, at least as of Boost 1.72. See github.com/boostorg/filesystem/issues
-
Evgen about 4 years
Note that canonical has problems with Windows links and junctions, at least as of Boost 1.72. See github.com/boostorg/filesystem/issues. The same goes for weakly_canonical and read_symlink.
-
jarzec almost 4 years
@Evgen Thanks. I fixed the typo.