Sluggish performance on NTFS drive with large number of files

The server did not have enough memory. Because the NTFS metafile data could not be cached in memory, every file access required multiple disk reads. As usual, the issue is obvious once you see it. Let me share what clouded my perspective:

  • The server showed 2 GB of memory available in both Task Manager and RamMap. So either Windows decided that the available memory was not enough to hold a meaningful part of the metafile data, or some internal restriction does not allow the last bit of memory to be used for metafile data.

  • After upgrading the RAM, Task Manager did not show more memory being used. However, RamMap reported multiple GB of metafile data being held as standby data. Apparently, the standby data can have a substantial impact.
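
Putting the numbers from the analysis together makes the diagnosis concrete. This is only rough arithmetic using the figures from the question and its update (4-6 extra metadata reads per open, 10-20 ms per read on the disk array):

    # Back-of-envelope: extra metadata I/Os per open, times the per-I/O
    # latency of the disk array, brackets the observed 60-100 ms per file.
    reads_per_open = (4, 6)   # $Mft/$Directory reads seen in Process Monitor
    ms_per_read = (10, 20)    # per-read latency measured with PerfMon
    low = reads_per_open[0] * ms_per_read[0]     #  40 ms
    high = reads_per_open[1] * ms_per_read[1]    # 120 ms
    print(f"expected open latency: {low}-{high} ms")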

Tools used for the analysis:

  • fsutil fsinfo ntfsinfo driveletter: to show the NTFS MFT size (or NTFSInfo); a small scripted wrapper is sketched after this list
  • RamMap to show memory allocation
  • Process Monitor to show that every file read is preceded by about 4 read operations on drive:\$Mft and drive:\$Directory. Though I could not find an exact definition of $Directory, it seems to be related to the MFT as well.
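
The fsutil check can also be scripted. The sketch below is a best-effort wrapper rather than a reference parser: it assumes a Windows machine, an elevated prompt, and that the output contains a "Mft Valid Data Length" line (the exact label and number format vary between Windows versions), and it simply prints that line for comparison with the installed RAM.

    # Print the MFT size line reported by "fsutil fsinfo ntfsinfo <drive>:".
    # Assumes Windows and an elevated prompt; output format varies by version.
    import subprocess

    def mft_size_line(drive="C:"):
        out = subprocess.run(["fsutil", "fsinfo", "ntfsinfo", drive],
                             capture_output=True, text=True, check=True).stdout
        for line in out.splitlines():
            if "Mft Valid Data Length" in line:
                return line.strip()
        raise RuntimeError("MFT size line not found; output format may differ")

    if __name__ == "__main__":
        print(mft_size_line("C:"))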
Updated on September 18, 2022

Comments

  • Paul B. (almost 2 years ago)

    I am looking at this setup:

    • Windows Server 2012
    • 1 TB NTFS drive, 4 KB clusters, ~90% full
    • ~10M files stored in 10,000 folders = ~1,000 files/folder
    • Files mostly quite small < 50 KB
    • Virtual drive hosted on disk array

    When an application accesses files stored in random folders, it takes 60-100 ms to read each file. With a test tool (sketched below) it seems that the delay occurs when opening the file; reading the data then takes only a fraction of the time.

    In summary, this means that reading 50 files can easily take 3-4 seconds, which is much more than expected. Writing is done in batches, so performance is not an issue there.
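
    A minimal sketch of such a test tool (for illustration only; it is a reconstruction rather than the exact tool I used, and D:\data and the sample size are placeholders):

        # Time open() and read() separately for a sample of small files.
        # Caveat: walking the tree to collect paths warms the metadata cache,
        # so for a realistic first-read test, feed in a pre-built path list.
        import os, random, time

        def sample_files(root, count=50):
            paths = [os.path.join(d, f)
                     for d, _, files in os.walk(root) for f in files]
            return random.sample(paths, min(count, len(paths)))

        def time_open_vs_read(paths):
            open_total = read_total = 0.0
            for path in paths:
                t0 = time.perf_counter()
                with open(path, "rb") as fh:    # open: metadata (MFT/ACL) lookups
                    t1 = time.perf_counter()
                    fh.read()                   # read: actual data transfer
                    t2 = time.perf_counter()
                open_total += t1 - t0
                read_total += t2 - t1
            print(f"open: {open_total / len(paths) * 1000:.1f} ms/file")
            print(f"read: {read_total / len(paths) * 1000:.1f} ms/file")

        time_open_vs_read(sample_files(r"D:\data"))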

    I already followed advice on SO and SF to arrive at these numbers.

    What to do about the read times?

    • Consider 60-100 ms per file to be ok (it isn't, is it?)
    • Any ideas how the setup can be improved?
    • Are there low-level monitoring tools that can tell what exactly the time is spent on?

    UPDATE

    1. As mentioned in the comments, the system runs Symantec Endpoint Protection. However, disabling it does not change the read times.

    2. PerfMon measures 10-20 ms per read. This would mean that each file read involves ~6 I/O operations, right? Would this be MFT lookups and ACL checks?

    3. The MFT has a size of ~8.5 GB, which is more than the main memory (see the back-of-envelope check below).
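
    A back-of-envelope check of that MFT size (my own estimate, not a measurement from the system): NTFS uses one 1 KB file record per file by default, so ~10 million files alone account for roughly 10 GB of MFT, consistent with the ~8.5 GB reported and clearly more than the installed RAM.

        # Estimate MFT size from the file count (1 KB per NTFS file record).
        files = 10_000_000
        record_bytes = 1024
        print(f"~{files * record_bytes / 2**30:.1f} GiB of MFT records")  # ~9.5 GiB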

    • Tomas Dabasinskas (about 8 years ago)
      To rule something out, would you mind sharing a screenshot of RAMMap?
    • Paul B. (about 8 years ago)
      Do you mean the File Summary table? Now that you mention it I see a SYMEFA.DB file with 900 MB in memory which reminds me that Symantec Endpoint Protection is installed on the system. Maybe that's the culprit? I'll try to find out more.
    • Tomas Dabasinskas (about 8 years ago)
      Actually, I was more interested in Metafile usage
    • Paul B. (about 8 years ago)
      Ok, got it. Metafile shows 250 MB total, 40 active, 210 standby. Seems normal or not?
    • Tomas Dabasinskas (about 8 years ago)
      Yes, it seems so
    • shodanshok (about 8 years ago)
      A question: if the same file is opened twice in a row, is the second read faster? If so, how much faster?
    • Paul B. (about 8 years ago)
      The second read is around 500x faster than the first. Looks like the data comes directly from a cache.
    • Paul B. (about 8 years ago)
      @TomasDabasinskas - getting back to Metafile usage: if the amount kept in memory is much smaller than the actual MFT, can I assume that multiple MFT reads are required before accessing a file? This might explain the issue. Do you happen to know whether adding RAM will make Windows cache the MFT? Currently 1 GB is unused. But maybe Windows determines that it cannot fit the entire MFT and therefore doesn't cache anything?
    • jarvis (about 8 years ago)
      Reads will take time if there are a lot of files in a folder. The number of files matters much more than the size of each file. If it's possible, you can break them down further into more sub-folders with fewer files each. That should improve read speeds.
    • phuclv (almost 6 years ago)
      Since you have mainly small files, a 4 KB cluster size wastes a lot of space, on average 2 KB/50 KB = 4%. Moreover, you're 90% full, which means Windows already has to write into the space reserved for the MFT (12.5% by default), which is not a good sign. Reducing the cluster size would give you a few percent of free space back for the MFT.
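
      Spelled out with those figures (rough numbers, assuming ~10 M files averaging ~50 KB on 4 KB clusters):

          # Average tail waste is about half a cluster per file.
          files = 10_000_000
          slack_kb = 4 / 2                                    # ~2 KB per file
          print(f"~{files * slack_kb / 2**20:.0f} GiB wasted,"
                f" ~{slack_kb / 50:.0%} of the stored data")  # ~19 GiB, ~4%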
  • D-Klotz (almost 8 years ago)
    So increasing physical memory did improve the response times? You did not configure any registry settings?
  • Paul B. (almost 8 years ago)
    Yes. I had previously played around with registry settings, but in the end no change was needed after adding memory.
  • phuclv (almost 6 years ago)
    Standby memory consists of memory regions that are ready for programs to use. Since they are not in use yet, the OS uses them as a cache. As soon as a program needs that memory, it is released immediately.