Files.walk(), calculate total size

java nio java-8 java-stream

18,703

Solution 1

No, this exception cannot be avoided.

The exception occurs inside the lazy fetch of Files.walk(), which is why you do not see it early and why there is no way to circumvent it. Consider the following code:

long size = Files.walk(Paths.get("C://"))
        .peek(System.out::println)
        .mapToLong(this::count)
        .sum();

On my system this prints:

C:\
C:\$Recycle.Bin
Exception in thread "main" java.io.UncheckedIOException: java.nio.file.AccessDeniedException: C:\$Recycle.Bin\S-1-5-18

Because the exception is thrown on the (main) thread at the third entry, all further execution on that thread stops.

I believe this is a design failure. As it stands, Files.walk is effectively unusable, because you can never guarantee that walking a directory will be free of errors.

One important point: the stack trace includes the sum() and reduce() operations. This is because the paths are loaded lazily, so it is only at the point of reduce() that the bulk of the stream machinery gets called (visible in the stack trace) and the next path is fetched, at which point the UncheckedIOException occurs.
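Because the failure surfaces in the terminal operation, a try/catch inside the mapping function never sees it; only a catch around the whole pipeline does. The following is a minimal sketch of that point, assuming a hypothetical class LazyWalkDemo with a made-up helper sizeOrMinusOne. It does not avoid the exception: the walk still aborts at the first directory that cannot be opened, and the sketch only shows where the exception can be caught.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical demo class, not part of the original answer.
final class LazyWalkDemo {

    // Sums regular-file sizes under root; returns -1 if the walk aborts.
    public static long sizeOrMinusOne(Path root) {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile)
                    .mapToLong(LazyWalkDemo::sizeOf)
                    .sum(); // terminal operation: directories are opened lazily from here
        } catch (UncheckedIOException e) {
            // Thrown during traversal when a directory cannot be opened.
            return -1L;
        } catch (IOException e) {
            // Thrown by Files.walk itself if the starting path cannot be accessed.
            return -1L;
        }
    }

    // Per-file failures can be handled here, but traversal failures cannot.
    public static long sizeOf(Path file) {
        try {
            return Files.size(file);
        } catch (IOException e) {
            return 0L;
        }
    }
}
```

Returning -1 is only an illustrative choice; the partial sum computed before the failure is lost either way.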

It could possibly be circumvented if you let every walking operation execute on its own thread. But that is not something you would want to be doing anyway.

Also, checking up front whether a file is accessible is of limited use, because you cannot guarantee that it is still readable even 1 ms later.

Future extension

I believe it can still be fixed, though I do not know exactly how FileVisitOption works.
Currently there is a FileVisitOption.FOLLOW_LINKS. If it operates on a per-file basis, then I suspect that something like a FileVisitOption.IGNORE_ON_IOEXCEPTION could also be added; however, we cannot inject that functionality ourselves.
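Until such an option exists (IGNORE_ON_IOEXCEPTION above is hypothetical), similar behaviour can be approximated with Files.walkFileTree. The sketch below is one possible approach, not a JDK API: the class TolerantWalk and the method walkIgnoringErrors are made-up names, and unlike Files.walk it collects the paths eagerly into a list instead of producing them lazily.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a JDK API.
final class TolerantWalk {

    // Collects every reachable path under root, skipping entries that fail.
    public static List<Path> walkIgnoringErrors(Path root) throws IOException {
        List<Path> found = new ArrayList<>();
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) {
                found.add(dir);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                found.add(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc) {
                // Unreadable file or directory: skip it instead of aborting.
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc) {
                // exc is non-null if iterating the directory ended early; ignore it.
                return FileVisitResult.CONTINUE;
            }
        });
        return found;
    }
}
```

Calling walkIgnoringErrors(root).stream() then gives a Stream<Path> that can be used much like the result of Files.walk, at the cost of holding all paths in memory.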

Solution 2

A 2017 update for those who keep arriving here.

Use Files.walk() only when you are certain of the file system's behaviour and really do want to stop on any error. Generally, Files.walk is not useful in standalone apps. I make this mistake so often; perhaps I am lazy. I realize my mistake the moment the run takes more than a few seconds for something small like 1 million files.

I recommend walkFileTree. Start by implementing the FileVisitor interface; here I only want to count files. Bad class name, I know.

import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.FileVisitor;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

class Recurse implements FileVisitor<Path> {

    private long filesCount;

    @Override
    public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) throws IOException {
        return FileVisitResult.CONTINUE;
    }

    @Override
    public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
        // This is where the per-file logic goes; here it only counts files.
        filesCount++;
        return FileVisitResult.CONTINUE;
    }

    @Override
    public FileVisitResult visitFileFailed(Path file, IOException exc) throws IOException {
        // Important: continuing here skips entries that cannot be visited
        // (for example, access denied). Test this behaviour.
        return FileVisitResult.CONTINUE;
    }

    @Override
    public FileVisitResult postVisitDirectory(Path dir, IOException exc) throws IOException {
        return FileVisitResult.CONTINUE;
    }

    public long getFilesCount() {
        return filesCount;
    }
}

Then use your class like this:

Recurse r = new Recurse();
Files.walkFileTree(Paths.get("G:"), r);
System.out.println("Total files: " + r.getFilesCount());

I am sure you know how to modify your own implementation of the FileVisitor<Path> interface to do other things, such as computing file sizes, based on the example I posted. Refer to the FileVisitor docs for its other methods.
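As one way to adapt this to file sizes, here is a minimal sketch that sums sizes instead of counting files. The class name SizeVisitor and the helper totalSize are hypothetical. It extends SimpleFileVisitor so that only the relevant methods need overriding; visitFileFailed and postVisitDirectory are overridden because the defaults rethrow the IOException.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical variant of the visitor above that sums file sizes.
final class SizeVisitor extends SimpleFileVisitor<Path> {

    private long totalBytes;

    @Override
    public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
        // The attributes were already read by the walker, so no extra I/O is needed.
        if (attrs.isRegularFile()) {
            totalBytes += attrs.size();
        }
        return FileVisitResult.CONTINUE;
    }

    @Override
    public FileVisitResult visitFileFailed(Path file, IOException exc) {
        // Skip entries that cannot be read (for example, access denied).
        return FileVisitResult.CONTINUE;
    }

    @Override
    public FileVisitResult postVisitDirectory(Path dir, IOException exc) {
        // Ignore errors raised while iterating a directory.
        return FileVisitResult.CONTINUE;
    }

    public long getTotalBytes() {
        return totalBytes;
    }

    // Convenience wrapper: walks root and returns the summed size in bytes.
    public static long totalSize(Path root) throws IOException {
        SizeVisitor visitor = new SizeVisitor();
        Files.walkFileTree(root, visitor);
        return visitor.getTotalBytes();
    }
}
```

Silently skipping failures means the total is a lower bound; counting the skipped entries in visitFileFailed would make that visible.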

Speed:

Files.walk : 20+ minutes and failing with exception
Files.walkFileTree: 5.6 seconds, done with perfect answer.

Edit: As with everything, use tests to confirm the behaviour. Handle exceptions; they still occur, except for the ones we choose not to care about, as above.

Solution 3

I found that using Guava's Files class solved the issue for me:

    Iterable<File> files = Files.fileTreeTraverser().breadthFirstTraversal(dir);
    long size = toStream( files ).mapToLong( File::length ).sum();

Where toStream is my static utility function to convert an Iterable to a Stream. Just this:

StreamSupport.stream(iterable.spliterator(), false);
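Written out in full, the utility might look like the sketch below. The enclosing class name IterableStreams is an assumption; only the one-line body comes from the answer.

```java
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical holder class for the toStream utility.
final class IterableStreams {

    // Wraps any Iterable in a sequential (non-parallel) Stream.
    public static <T> Stream<T> toStream(Iterable<T> iterable) {
        return StreamSupport.stream(iterable.spliterator(), false);
    }
}
```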

Solution 4

The short answer is you can't.

The exception is coming from FileTreeWalker.visit.

To be precise, it is trying to build a newDirectoryStream when it fails (this code is out of your control):

// file is a directory, attempt to open it
DirectoryStream<Path> stream = null;
try {
    stream = Files.newDirectoryStream(entry);
} catch (IOException ioe) {
    return new Event(EventType.ENTRY, entry, ioe); // ==> Culprit <== 
} catch (SecurityException se) {
    if (ignoreSecurityException)
        return null;
    throw se;
}

Maybe you should submit a bug.

Solution 5

Filter out directories with Files::isRegularFile:

try (Stream<Path> pathStream = Files.walk(Path.of("/path/to/your/dir"))) {
    pathStream
            .filter(Files::isRegularFile)
            .forEach(System.out::println);
} catch (IOException e) {
    e.printStackTrace();
}


Aksel Willgert

Updated on June 20, 2022

Comments

Aksel Willgert about 2 years

I'm trying to calculate the size of the files on my disk. In java-7 this could be done using Files.walkFileTree, as shown in my answer here.

However, if I want to do this using java-8 streams, it works for some folders but not for all.

public static void main(String[] args) throws IOException {
    long size = Files.walk(Paths.get("c:/")).mapToLong(MyMain::count).sum();
    System.out.println("size=" + size);
}

static long count(Path path) {
    try {
        return Files.size(path);
    } catch (IOException | UncheckedIOException e) {
        return 0;
    }
}

The above code works well for the path a:/files/, but for c:/ it throws the exception below:

Exception in thread "main" java.io.UncheckedIOException: java.nio.file.AccessDeniedException: c:\$Recycle.Bin\S-1-5-20
at java.nio.file.FileTreeIterator.fetchNextIfNeeded(Unknown Source)
at java.nio.file.FileTreeIterator.hasNext(Unknown Source)
at java.util.Iterator.forEachRemaining(Unknown Source)
at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Unknown Source)
at java.util.stream.AbstractPipeline.copyInto(Unknown Source)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(Unknown Source)
at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(Unknown Source)
at java.util.stream.AbstractPipeline.evaluate(Unknown Source)
at java.util.stream.LongPipeline.reduce(Unknown Source)
at java.util.stream.LongPipeline.sum(Unknown Source)
at MyMain.main(MyMain.java:16)

I understand where it is coming from and how to avoid it using the Files.walkFileTree API.

But how can this exception be avoided using the Files.walk() API?

Boon about 10 years

Yes, agreed. This is a design failure.

Aksel Willgert about 10 years

Good analysis. I think I would prefer maybe another Files.walk() that also accepted an errorHandler or similar.

Muhammad Hewedy about 10 years

I've submitted one that describes my exact case: stackoverflow.com/questions/23220542

Stuart Marks about 10 years

Yes, good analysis, +1. The bug (enhancement request) that covers this is JDK-8039910.

Drakes over 5 years

fileTreeTraverser is now deprecated

MariuszS over 5 years

@Drakes Files.fileTraverser().breadthFirst() or MoreFiles.fileTraverser(sourcePath).breadthFirst()

user1050755 over 5 years

Apache Commons-IO's DirectoryWalker is about as fast as Files.walkFileTree. docs.leponceau.org/java-examples/java-evaluation/… vs docs.leponceau.org/java-examples/java-evaluation/…

Mark Jeronimus over 5 years

It's best to extend SimpleFileVisitor (a class) instead of implementing FileVisitor (an interface). However, you have to override visitFileFailed(), because the default implementation ironically mimics Files.walk().

zb226 about 4 years

6 years later, the bug's closed with resolution "Future Project" and there are no signs of the latter :(

Abhishek Dujari about 2 years

Yeah, people will insist on using Files.walk() and hit the same problem. It isn't a bug per se, but the interface exists for these reasons. This is the way.