Files.walk(), calculate total size
Solution 1
No, this exception cannot be avoided.
The exception occurs inside the lazy fetch of Files.walk(), which is why you do not see it early and why there is no way to circumvent it. Consider the following code:
long size = Files.walk(Paths.get("C://"))
.peek(System.out::println)
.mapToLong(this::count)
.sum();
On my computer this prints:
C:\
C:\$Recycle.Bin
Exception in thread "main" java.io.UncheckedIOException: java.nio.file.AccessDeniedException: C:\$Recycle.Bin\S-1-5-18
Because the exception is thrown on the main thread at the third entry, all further execution on that thread stops.
I believe this is a design failure. As it stands, Files.walk is practically unusable, because you can never guarantee that walking a directory will be free of errors.
One important point: the stack trace includes the sum() and reduce() operations. This is because the paths are loaded lazily, so the bulk of the stream machinery runs at the point of reduce() (visible in the stack trace). Only then is the next path fetched, and that is where the UncheckedIOException occurs.
It could possibly be circumvented if you let every walking operation execute on its own thread, but that is not something you would want to do anyway.
Also, checking beforehand whether a file is accessible is of limited use, because you cannot guarantee that it is still readable even 1 ms later.
Future extension
I believe it can still be fixed, though I do not know exactly how FileVisitOptions work.
Currently there is a FileVisitOption.FOLLOW_LINKS. If it operates on a per-file basis, then I suspect that a FileVisitOption.IGNORE_ON_IOEXCEPTION could also be added; however, we cannot correctly inject that functionality ourselves.
Solution 2
2017 for those who keep arriving here.
Use Files.walk() when you are certain of the file system behaviour and really want to stop when there is any error. Generally Files.walk is not useful in standalone apps. I make this mistake so often, perhaps I am lazy. I realize my mistake the moment it takes more than a few seconds for something small like 1 million files.
I recommend walkFileTree. Start by implementing the FileVisitor interface; here I only want to count files. Bad class name, I know.
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.FileVisitor;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

class Recurse implements FileVisitor<Path> {
    private long filesCount;

    @Override
    public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) throws IOException {
        return FileVisitResult.CONTINUE;
    }

    @Override
    public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
        // This is where I need my logic
        filesCount++;
        return FileVisitResult.CONTINUE;
    }

    @Override
    public FileVisitResult visitFileFailed(Path file, IOException exc) throws IOException {
        // This is important to note: continue instead of rethrowing. Test this behaviour
        return FileVisitResult.CONTINUE;
    }

    @Override
    public FileVisitResult postVisitDirectory(Path dir, IOException exc) throws IOException {
        return FileVisitResult.CONTINUE;
    }

    public long getFilesCount() {
        return filesCount;
    }
}
Then use your defined class like this.
Recurse r = new Recurse();
Files.walkFileTree(Paths.get("G:"), r);
System.out.println("Total files: " + r.getFilesCount());
I am sure you know how to modify your own class's implementation of the FileVisitor<Path> interface to do other things, such as summing file sizes, using the example I posted. Refer to the docs for the other methods in this interface.
Speed:
- Files.walk : 20+ minutes and failing with exception
- Files.walkFileTree: 5.6 seconds, done with perfect answer.
Edit: As with everything, use tests to confirm the behaviour. Handle exceptions; they still occur, except for the ones we choose not to care about, as above.
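As a rough sketch of how the visitor approach above could be adapted to sum file sizes rather than count files: the class and method names below (TotalSize, SizeVisitor, totalSize) are made up for illustration, and the choice to count failures and silently continue is an assumption about the desired behaviour, not something the answer prescribes.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical helper: sums the sizes of regular files below a root directory.
class TotalSize {

    // SimpleFileVisitor supplies defaults; only the methods that matter are overridden.
    static class SizeVisitor extends SimpleFileVisitor<Path> {
        long totalBytes;
        long failures;

        @Override
        public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
            if (attrs.isRegularFile()) {
                totalBytes += attrs.size();
            }
            return FileVisitResult.CONTINUE;
        }

        @Override
        public FileVisitResult visitFileFailed(Path file, IOException exc) {
            // Assumption: unreadable entries are counted and skipped, not rethrown.
            failures++;
            return FileVisitResult.CONTINUE;
        }

        @Override
        public FileVisitResult postVisitDirectory(Path dir, IOException exc) {
            // exc is non-null when iterating the directory ended prematurely.
            if (exc != null) {
                failures++;
            }
            return FileVisitResult.CONTINUE;
        }
    }

    static long totalSize(Path root) throws IOException {
        SizeVisitor visitor = new SizeVisitor();
        Files.walkFileTree(root, visitor);
        return visitor.totalBytes;
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println("size=" + totalSize(root));
    }
}
```

The failures counter is there so a caller can tell a complete total from a partial one; whether to report it is left open.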
Solution 3
I found that using Guava's Files class solved the issue for me:
Iterable<File> files = Files.fileTreeTraverser().breadthFirstTraversal(dir);
long size = toStream( files ).mapToLong( File::length ).sum();
Where toStream is my static utility function to convert an Iterable to a Stream. Just this:
StreamSupport.stream(iterable.spliterator(), false);
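For completeness, the one-liner above might be wrapped into a utility method roughly as follows. The class name IterableStreams and the small demo in main are illustrative only; the demo uses a plain list of strings so that it runs without Guava.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical utility class wrapping the StreamSupport call shown above.
class IterableStreams {

    // Converts any Iterable into a sequential (non-parallel) Stream.
    static <T> Stream<T> toStream(Iterable<T> iterable) {
        return StreamSupport.stream(iterable.spliterator(), false);
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("a", "bb", "ccc");
        long totalLength = toStream(names).mapToLong(String::length).sum();
        System.out.println(totalLength); // prints 6
    }
}
```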
Solution 4
The short answer is you can't.
The exception is coming from FileTreeWalker.visit.
To be precise, it is trying to build a newDirectoryStream when it fails (this code is out of your control):
// file is a directory, attempt to open it
DirectoryStream<Path> stream = null;
try {
stream = Files.newDirectoryStream(entry);
} catch (IOException ioe) {
return new Event(EventType.ENTRY, entry, ioe); // ==> Culprit <==
} catch (SecurityException se) {
if (ignoreSecurityException)
return null;
throw se;
}
Maybe you should submit a bug.
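Since the failing call is Files.newDirectoryStream, one possible way around it is to skip Files.walk() and do the traversal by hand, catching the failure per directory. The following is only a sketch of that swapped-in technique, not the JDK's walker: the class name TolerantWalk and the policy of silently skipping unreadable entries are assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.DirectoryIteratorException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical hand-written walker that tolerates unreadable directories.
class TolerantWalk {

    // Sums regular-file sizes under root; symbolic links are not followed.
    // If root itself is not a readable directory, the result is 0.
    static long totalSize(Path root) {
        long total = 0;
        Deque<Path> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            Path dir = pending.pop();
            try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
                for (Path entry : entries) {
                    if (Files.isDirectory(entry, LinkOption.NOFOLLOW_LINKS)) {
                        pending.push(entry);
                    } else if (Files.isRegularFile(entry, LinkOption.NOFOLLOW_LINKS)) {
                        try {
                            total += Files.size(entry);
                        } catch (IOException e) {
                            // Assumption: a file whose size cannot be read counts as 0.
                        }
                    }
                }
            } catch (IOException | DirectoryIteratorException e) {
                // Assumption: a directory that cannot be opened or listed is skipped.
            }
        }
        return total;
    }
}
```

An explicit stack is used instead of recursion so that very deep trees do not exhaust the call stack.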
Solution 5
Filter out directories with Files::isRegularFile:
try (Stream<Path> pathStream = Files.walk(Path.of("/path/to/your/dir"))) {
    pathStream
        .filter(Files::isRegularFile)
        .forEach(System.out::println);
} catch (IOException e) {
    e.printStackTrace();
}
Aksel Willgert
Updated on June 20, 2022
Comments
-
Aksel Willgert about 2 years
I'm trying to calculate the size of the files on my disk. In Java 7 this could be done using Files.walkFileTree, as shown in my answer here.
However, if I do this using Java 8 streams, it works for some folders but not for all.
public static void main(String[] args) throws IOException {
    long size = Files.walk(Paths.get("c:/")).mapToLong(MyMain::count).sum();
    System.out.println("size=" + size);
}

static long count(Path path) {
    try {
        return Files.size(path);
    } catch (IOException | UncheckedIOException e) {
        return 0;
    }
}
The code above works for the path a:/files/, but for c:/ it throws the exception below:
Exception in thread "main" java.io.UncheckedIOException: java.nio.file.AccessDeniedException: c:\$Recycle.Bin\S-1-5-20
    at java.nio.file.FileTreeIterator.fetchNextIfNeeded(Unknown Source)
    at java.nio.file.FileTreeIterator.hasNext(Unknown Source)
    at java.util.Iterator.forEachRemaining(Unknown Source)
    at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Unknown Source)
    at java.util.stream.AbstractPipeline.copyInto(Unknown Source)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(Unknown Source)
    at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(Unknown Source)
    at java.util.stream.AbstractPipeline.evaluate(Unknown Source)
    at java.util.stream.LongPipeline.reduce(Unknown Source)
    at java.util.stream.LongPipeline.sum(Unknown Source)
    at MyMain.main(MyMain.java:16)
I understand where it is coming from and how to avoid it using Files.walkFileTree API.
But how can this exception be avoided using Files.walk() API?
-
Boon about 10 years
Yes, agreed. This is a design failure.
-
Aksel Willgert about 10 years
Good analysis. I think I would prefer another Files.walk() overload that also accepted an errorHandler or similar.
-
Muhammad Hewedy about 10 years
I've submitted one that describes my exact case: stackoverflow.com/questions/23220542
-
Stuart Marks about 10 years
Yes, good analysis, +1. The bug (enhancement request) that covers this is JDK-8039910.
-
Drakes over 5 years
fileTreeTraverser is now deprecated
-
MariuszS over 5 years
@Drakes Files.fileTraverser().breadthFirst() or MoreFiles.fileTraverser(sourcePath).breadthFirst()
-
user1050755 over 5 years
Apache Commons-IO's DirectoryWalker is about as fast as Files.walkFileTree. docs.leponceau.org/java-examples/java-evaluation/… vs docs.leponceau.org/java-examples/java-evaluation/…
-
Mark Jeronimus over 5 years
It's best to extend SimpleFileVisitor (abstract class) instead of FileVisitor (interface). However, you have to override visitFileFailed() because the default implementation ironically mimics Files.walk().
-
zb226 about 4 years
6 years later, the bug is closed with resolution "Future Project" and there are no signs of the latter :(
-
Abhishek Dujari about 2 years
Yeah, people will insist on using Files.walk() and hit the same problem. It isn't a bug per se; the interface exists for these reasons. This is the way.