Does hard drive cache size matter in a RAID?

Solution 1

Caching can increase access speed; however, the difference you notice may be minimal for the purpose you are using it for (storage). Drives in a RAID will still use their full cache.

Edit: Just to explain a bit more about what caching is: it stores data so that future requests can be served faster. That means the larger the cache, the more block data it can hold, and the more likely it is that a request can be served from the cache instead of from the platters.
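
As a rough illustration of that explanation, the usual effective-access-time estimate weights the cache and platter latencies by the hit rate. The latencies used here (0.1 ms for a cache hit, 10 ms for a platter read) are assumed round numbers chosen for illustration, not measurements of any particular drive, and `effective_access_ms` is a hypothetical helper name.

```python
def effective_access_ms(hit_rate, cache_ms=0.1, platter_ms=10.0):
    """Average time per request, given the fraction served from cache.

    cache_ms and platter_ms are assumed, illustrative latencies.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * cache_ms + (1.0 - hit_rate) * platter_ms


if __name__ == "__main__":
    for rate in (0.0, 0.5, 0.9):
        print(f"hit rate {rate:.0%}: {effective_access_ms(rate):.2f} ms per request")
```

A larger cache only helps to the extent that it actually raises the hit rate for your workload.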

Solution 2

Hard drive cache size does not matter anywhere since all modern operating systems do their own caching, and have MUCH more memory to use for it. If it's been recently accessed, then it's going to be in the OS cache anyhow so having it in the drive cache doesn't matter since the OS won't ask the drive for that data again.

To compare with CPU caches: it's like having the nice fat 8 MB L3 cache that modern CPUs have, then watching the years go by and finding CPUs with a 128 MB L2 cache that is 32 times faster, while still having that old, slow 8 MB L3 cache. The L3 won't be doing any good, since the L2 is always consulted first and is both larger and faster. At that point, arguing about whether the L3 cache should be 8 or 16 MB is moot, since anything in the L3 will also be in the L2, so the L3 won't even see the request.
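
The argument above can be sketched as a toy simulation: a large LRU cache (standing in for the OS) sits in front of a small LRU cache (standing in for the drive), and a working set that fits in the large cache is read repeatedly. This is a minimal model under stated assumptions, not how any real kernel or drive firmware works: both caches are plain LRU, sizes are counted in blocks, and read-ahead and write buffering are ignored. `LRUCache` and `simulate` are hypothetical names.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used block."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def access(self, block):
        """Return True on a hit; on a miss, insert the block and return False."""
        if block in self.blocks:
            self.blocks.move_to_end(block)  # mark as most recently used
            return True
        self.blocks[block] = None
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used
        return False


def simulate(os_blocks, drive_blocks, working_set, passes):
    """Read blocks 0..working_set-1 in order, `passes` times over."""
    os_cache = LRUCache(os_blocks)
    drive_cache = LRUCache(drive_blocks)
    stats = {"os_hits": 0, "drive_hits": 0, "platter_reads": 0}
    for _ in range(passes):
        for block in range(working_set):
            if os_cache.access(block):        # the OS cache is consulted first
                stats["os_hits"] += 1
            elif drive_cache.access(block):   # the drive only sees OS misses
                stats["drive_hits"] += 1
            else:
                stats["platter_reads"] += 1
    return stats


if __name__ == "__main__":
    for drive_blocks in (16, 64):
        print(drive_blocks, simulate(1024, drive_blocks, 512, 4))
```

In this model the drive cache is only asked for a block on the first, cold pass, so its size makes no difference to re-reads. Workloads dominated by read-ahead or write buffering, which the comments raise, are outside what this sketch covers.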

To see the drive and kernel caches in action, you can play around with dd to see how fast you can read from the drive.

sudo dd if=/dev/sda of=/dev/null bs=524288 count=1

This will read 512 KiB from the drive. Repeat it a few times and you will start to see some very fast numbers. On this old machine I have handy, I'm seeing on the order of 751 MB/s. That is with the kernel cache. Now if you add the iflag=direct option, dd bypasses the kernel cache, so repeated reads of the same block measure the speed of the drive's cache instead. Repeating this, I see only around 100 MB/s, which is about the max transfer rate of this old IDE interface. That isn't much better than the drive's unbuffered throughput of around 61 MB/s.

Now ask yourself what good that slower, smaller drive cache is doing when you aren't bypassing the kernel cache.
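
To put the quoted figures side by side, here is a small worked calculation of how long a single 512 KiB read takes at each reported rate. It assumes dd's "MB/s" means 10^6 bytes per second (GNU dd's default reporting); the throughput numbers are the ones from the answer, and `read_time_ms` and `speedup` are hypothetical helper names.

```python
KIB = 1024
BLOCK_BYTES = 512 * KIB  # 524288 bytes, i.e. 512 KiB

# Throughput figures quoted in the answer, in MB/s.
# Assumption: 1 MB = 1,000,000 bytes, as GNU dd reports by default.
REPORTED_MB_PER_S = {
    "kernel cache": 751,
    "drive cache (iflag=direct)": 100,
    "platters (unbuffered)": 61,
}


def read_time_ms(num_bytes, mb_per_s):
    """Milliseconds needed to transfer num_bytes at the given rate."""
    return num_bytes / (mb_per_s * 1_000_000) * 1000


def speedup(faster, slower):
    """Ratio between two of the reported throughput figures."""
    return REPORTED_MB_PER_S[faster] / REPORTED_MB_PER_S[slower]


if __name__ == "__main__":
    for source, rate in REPORTED_MB_PER_S.items():
        print(f"{source}: {read_time_ms(BLOCK_BYTES, rate):.2f} ms per 512 KiB read")
    print(f"kernel cache vs drive cache: {speedup('kernel cache', 'drive cache (iflag=direct)'):.2f}x")
    print(f"drive cache vs platters: {speedup('drive cache (iflag=direct)', 'platters (unbuffered)'):.2f}x")
```

On these numbers the kernel cache is several times faster than the drive cache, while the drive cache is well under twice as fast as the unbuffered read.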

Andrew
Author by

Andrew

I build websites, I care about user experience, I make films (sporadically) and love working with video on the web. I like interesting people, music, movies, and video games.

Updated on September 18, 2022

Comments

  • Andrew
    Andrew almost 2 years

    Possible Duplicate:
    Does hard drive buffer size matter?

    Ok, so not a RAID exactly, but I just bought a drive-less Drobo to use as raw storage for video/photo work and am now browsing hard drives to put into it. In most cases there's a pretty big price difference between drives with 16 MB, 32 MB, and 64 MB caches. In my particular case, with four 1 TB drives in a Drobo, does the cache size increase performance in any way? Thanks in advance!

    • Piotr Kula
      Piotr Kula over 12 years
      If you are mirroring, then yes. If it is striped, it will be useless; in that case dedicated RAID cards with their own cache are usually used, since a single drive cannot cache a partial file, but a RAID card can cache with knowledge of which drives the entire file lies on.
  • Amadeusz Wieczorek
    Amadeusz Wieczorek over 12 years
    -1. Hard drive caching is still used and it will be used for years to come. The operating system uses that memory cache you're talking about primarily for system data and files. Andrew will use multiple hard drives, so the chance that data is in that memory cache decreases further. It appears that Andrew will use the system for photo/video editing, not just storage. That means a lot of IO, and cache will help. For example, the cache will serve as a buffer for writes so that Andrew's app can continue working while the HDD is writing data from the cache to the platters.
  • psusi
    psusi over 12 years
    @Amadeu, whether it is used or not has nothing to do with whether it matters. See the detailed explanation on Wikipedia linked to in the other question that Dave M mentioned. The OS uses its own cache to buffer the IO so the app can continue to work, so there is no need for, and no benefit to, having more cache on the drives.
  • Andrew
    Andrew over 12 years
    Thank you muchly! I'm not trying to be cheap with hard drives, especially when they're holding important work, I just needed to make sure I'm buying the right tools for the job. Thanks again.
  • Ethabelle
    Ethabelle over 12 years
    Understandable! I'd say if there isn't a huge price difference, go with the higher cache, but if there is (and sometimes there is) I wouldn't bother. You aren't going to notice much of a speed difference if you're just using it as storage.
  • Psycogeek
    Psycogeek over 12 years
    There is a darn good reason for cache on the drives: the path to the drive is faster than the write to the platters. If you can get more data headed into the drive without that data actually being written at that time, then it is going to speed up transfers of all kinds. Not only that, but most drives have implemented read-ahead, whether or not it is necessary for the read you're doing. Moving this or anything else into the cache on the hard drive makes transporting it out a LOT faster than reading it off the platters, or off the platters again in the case of cached re-reads.
  • Psycogeek
    Psycogeek over 12 years
    Best way to tell: turn it off :-) I would agree that the difference in sizes does not make that much difference, but then I want a 1-4 GB sized one, sooo what do I know. For anyone to make assumptions that it could make a 100% difference, LOL, maybe it would in some useless benchmark. "They apply to only a small amount of transfers": yes, it depends. As far as having a piece of separate hardware doing the work, I am all for it; many, many disks, all with leetel caches, suit me just fine.
  • psusi
    psusi over 12 years
    @Psycogeek, you can't disable the drive's cache, but instead you can disable the kernel's and compare the difference. Added example to answer.
  • Psycogeek
    Psycogeek over 12 years
    And on the other hand, isn't this more like debating whether the data cache is even needed on a CPU? Gee, the memory is right there, and anything in it will be in the next one. NO, because the shorter, faster path is why they do that stuff to begin with. Even if the system cache has all that stuff in it, the transfer from/to the system and disk cache is faster; the transfer from/to the platters, not so fast. At what point is it no longer useful? At the point it isn't big enough :-) but every little bit helps. We used drives back when they had a "few sectors" of cache; not going back.
  • psusi
    psusi over 12 years
    @Psycogeek, no, it isn't. The question is not whether to cache or not, the question is whether or not a smaller, slower cache on the drive has been rendered obsolete by a larger, faster cache in the OS. It doesn't matter how fast or slow the disk transfer is, when you never perform that transfer in the first place ( because the request is satisfied by the OS cache ).
  • Adam Crume
    Adam Crume over 11 years
    It's a little old, but this paper concludes that increasing on-drive cache size beyond 512 KB has little benefit, assuming a reasonable OS cache: Yingwu Zhu and Yiming Hu, "Disk built-in caches: evaluation on system performance," 11th IEEE/ACM International Symposium on Modeling, Analysis and Simulation of Computer Telecommunications Systems (MASCOTS 2003), IEEE, 2003.