What is the fastest way of deleting files in a directory? (Except specific file extension)


Solution 1

If you are using .NET 4, you can benefit from the way .NET now parallelizes your functions. This code is the fastest way to do it, and it scales with the number of cores on your processor too.

DirectoryInfo di = new DirectoryInfo(yourDir);
var files = di.GetFiles();

files.AsParallel().Where(f => f.Extension != ".zip").ForAll((f) => f.Delete());
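One thing to watch in the snippet above: `f.Extension != ".zip"` is a case-sensitive comparison, so a file named `backup.ZIP` would be deleted. Below is a minimal sketch of the same idea with a case-insensitive check. The names `ZipAwareCleaner`, `IsZip` and `DeleteAllExceptZip` are made up for illustration and are not part of the original answer.

```csharp
using System;
using System.IO;
using System.Linq;

public static class ZipAwareCleaner
{
    // Hypothetical helper: true when the path ends in ".zip", ignoring case.
    public static bool IsZip(string path) =>
        string.Equals(Path.GetExtension(path), ".zip", StringComparison.OrdinalIgnoreCase);

    // Deletes every non-zip file directly inside 'directory'
    // and returns how many files were selected for deletion.
    public static int DeleteAllExceptZip(string directory)
    {
        // Materialize the list first so the count is known before deleting.
        var toDelete = new DirectoryInfo(directory)
            .GetFiles()
            .Where(f => !IsZip(f.Name))
            .ToList();

        // Same PLINQ pattern as above: delete the selected files in parallel.
        toDelete.AsParallel().ForAll(f => f.Delete());
        return toDelete.Count;
    }
}
```

Calling `ZipAwareCleaner.DeleteAllExceptZip(yourDir)` should then leave both `a.zip` and `b.ZIP` in place.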

Solution 2

By fastest, are you asking for the fewest lines of code or the quickest execution time? Here is a sample using LINQ with a Parallel.ForEach loop to delete them quickly.

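string[] files = System.IO.Directory.GetFiles("c:\\temp", "*.*", System.IO.SearchOption.TopDirectoryOnly);

// Keep every path that does not end in ".zip" (case-insensitive)
List<string> del = (
   from string s in files
   where !s.EndsWith(".zip", StringComparison.OrdinalIgnoreCase)
   select s).ToList();

// Delete the remaining files in parallel
Parallel.ForEach(del, (string s) => { System.IO.File.Delete(s); });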

Solution 3

At the time of writing this answer, none of the previous answers used Directory.EnumerateFiles(), which lets you operate on the list of files while the list is still being constructed. Code:

Parallel.ForEach(Directory.EnumerateFiles(path, "*", SearchOption.AllDirectories).AsParallel(), Item =>
        {
            if(!string.Equals(Path.GetExtension(Item), ".zip",StringComparison.OrdinalIgnoreCase))
                File.Delete(Item);
        });

As far as I know, the performance gain from using AsParallel() shouldn't be significant (if there is any) in this case; however, it did make a difference in my case.

I compared the time it takes to delete all but the .zip files in a set of 4689 files, of which 10 were zip files, using:

1 - foreach
2 - Parallel.ForEach
3 - IEnumerable().AsParallel().ForAll
4 - Parallel.ForEach using IEnumerable().AsParallel(), as illustrated above
5 - a normal foreach using Directory.GetFiles() (the fifth and last case)

Results:

1 - 1545
2 - 1015
3 - 1103
4 - 839
5 - 2266

Of course, the results weren't conclusive; as far as I know, to carry out proper benchmarking you need to use a RAM drive instead of an HDD.

Note that the performance difference between EnumerateFiles and GetFiles becomes more apparent as the number of files increases.
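For anyone who wants to repeat a comparison like this on their own machine, here is a rough sketch of a timing harness built on Stopwatch. It is not the code used for the numbers above. The class `DeleteBenchmark` and all of its method names are hypothetical, it covers only a plain foreach and Parallel.ForEach, and it runs against the system temp directory rather than a RAM drive, so the results will depend heavily on the disk and file system cache.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Threading.Tasks;

public static class DeleteBenchmark
{
    // Hypothetical helper: case-insensitive ".zip" check.
    static bool IsZip(string path) =>
        string.Equals(Path.GetExtension(path), ".zip", StringComparison.OrdinalIgnoreCase);

    // Creates 'total' empty files in a fresh temp directory;
    // the first 'zips' of them get a .zip extension.
    public static string CreateTestDirectory(int total, int zips)
    {
        string dir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(dir);
        for (int i = 0; i < total; i++)
        {
            string ext = i < zips ? ".zip" : ".tmp";
            File.WriteAllBytes(Path.Combine(dir, "file" + i + ext), new byte[0]);
        }
        return dir;
    }

    // Plain foreach over the lazily enumerated file list.
    public static void DeleteSequential(string dir)
    {
        foreach (string file in Directory.EnumerateFiles(dir))
            if (!IsZip(file)) File.Delete(file);
    }

    // Parallel.ForEach over the same lazily enumerated file list.
    public static void DeleteParallel(string dir)
    {
        Parallel.ForEach(Directory.EnumerateFiles(dir), file =>
        {
            if (!IsZip(file)) File.Delete(file);
        });
    }

    // Builds a fresh test directory, times one strategy, then cleans up.
    public static long TimeMs(Action<string> strategy, int total, int zips)
    {
        string dir = CreateTestDirectory(total, zips);
        try
        {
            var sw = Stopwatch.StartNew();
            strategy(dir);
            sw.Stop();
            return sw.ElapsedMilliseconds;
        }
        finally
        {
            Directory.Delete(dir, recursive: true);
        }
    }
}
```

A call such as `DeleteBenchmark.TimeMs(DeleteBenchmark.DeleteParallel, 4689, 10)` returns the elapsed milliseconds for one run. Repeating each strategy several times and comparing the spread is probably more informative than a single measurement.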

Solution 4

Here's plain old C#

foreach(string file in Directory.GetFiles(Server.MapPath("~/yourdirectory")))
{
    if(Path.GetExtension(file) != ".zip")
    {
        File.Delete(file);
    }
}

And here's LINQ

var files = from f in Directory.GetFiles(Server.MapPath("~/yourdirectory"))
            where Path.GetExtension(f) != ".zip"
            select f;

foreach(string file in files)
    File.Delete(file);

Author: GeorgeBoy

Updated on November 13, 2020

Comments

  • GeorgeBoy, over 3 years ago

    I have seen questions like "What is the best way to empty a directory?"

    But I need to know,

    what is the fastest way of deleting all the files found within the directory, except any .zip files found.

    Smells like linq here... or what?

    By saying fastest way, I mean the Fastest execution time.

  • GeorgeBoy, over 13 years ago
    Sorry for not mentioning it. I meant the quickest execution time.
  • illegal-immigrant, over 10 years ago
    Parallel.ForEach. Well, running disk operations (especially HDD operations) in parallel will most likely just slow things down. And "By fastest are you asking for the least lines of code or the quickest execution time?" - it's a really strange definition of "fast" (I mean lines of code). How do lines of code correlate with speed?
  • Renaud Gauthier, almost 9 years ago
    I'll run a benchmark after work, but I too am pretty sure Parallel will slow things down. Which is unfortunate, by the way, because I love the idea.
  • Marcus Mangelsdorf, about 8 years ago
    Are the timings in ms? Anyway, thank you for actually testing this and documenting the performance differences!!!