Speed Calculator for RAID sets

Solution 1

I don't know of a calculator that can tell you that, partly because there are so many factors beyond just the disk and connection type. The RAID controller makes a huge difference, as can its firmware, the type of data, and the motherboard's ability to push data. Your best bet is benchmarking on your own; I can't think of a way to write a calculator that would cover all of that. Also, for most operations the network will probably bottleneck before the RAID does.

Solution 2

There are quite a few variables that can affect speed, but here are some basic ideas to get a feel for what a given RAID set should be capable of.

Raw disk throughput

Assuming that a random seek completes an average of 1/2 of a rotation (180 degrees) away from the sector you want, the average random access time is one average seek plus the time the disk takes to rotate 180 degrees.

  • On a 10K RPM disk 1/2 of a rotation takes approximately 3ms.

  • On a 15K RPM disk 1/2 of a rotation takes approximately 2ms.

  • Average seek time for a Seagate Cheetah 15K.6 is quoted at 3.5ms for reads and 3.9ms for writes (I presume the write figure includes time to align the head on the servo tracks). A 10K disk's seek time is slightly longer.

So, a rough estimate is an average of 5.5ms per random access for a 15K drive and about 7ms for a 10K drive. Tagged command queuing will improve on this slightly. That gives a theoretical random throughput of about 180 IOPS for a 15K drive and about 140 IOPS for a 10K drive.
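
As a back-of-the-envelope check, here is a minimal Python sketch of that calculation. The seek figures are the assumed values quoted above, and the function name is just illustrative:

```python
def random_iops(rpm, avg_seek_ms):
    """Rough single-disk random IOPS: one average seek plus half a rotation."""
    half_rotation_ms = (60_000 / rpm) / 2  # time to rotate 180 degrees, in ms
    return 1000 / (avg_seek_ms + half_rotation_ms)

print(round(random_iops(15_000, 3.5)))  # ~182 IOPS for a 15K drive (3.5ms seek)
print(round(random_iops(10_000, 4.0)))  # ~143 IOPS for a 10K drive (assumed 4.0ms seek)
```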

RAID-1

On a non-striped RAID-1, reads can be split between the two disks, but writes must go to both drives. Random operations will give you twice the throughput of a single disk for reads and approximately the throughput of a single disk for writes. Sequential I/O tends to peak at the maximum throughput of a single disk. Interface cables may or may not present a bottleneck.
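
A minimal sketch of that scaling, reusing the illustrative random_iops() figure from the sketch above:

```python
def raid1_random_iops(single_disk_iops, mirrors=2):
    """RAID-1: random reads can be balanced across the mirrors, but writes hit every mirror."""
    return {"read": single_disk_iops * mirrors, "write": single_disk_iops}

print(raid1_random_iops(180))  # {'read': 360, 'write': 180} for a pair of 15K drives
```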

Striped RAID sets

RAID-5, RAID-10 or RAID-50 disks have the data split up into chunks spread in a round-robin fashion amongst the members of the RAID set. Assuming no read-ahead optimisation a disk can read at most one stripe per revolution of the disk. A 10K disk revolves about 170 times per second and a 15K disk revolves about 250 times per second.

For a 64K stripe this comes to approximately 10MB/sec per 10K disk or 15MB/sec per 15K disk. Larger stripe sizes give you better sequential throughput on the disks - for example a 256K stripe size on an array of 15K disks would give you 60MB/sec per disk. A heavily random access workload will reduce this by introducing more latency between seeks. Read-ahead on a controller might increase it.

Thus, an array with 14 15K disks using 64K stripes would have a theoretical streaming throughput of around 210MB/sec, assuming no other constraints. If the controller is not fast enough the practical rate may be lower (for example, I could never get a Dell PV660 (Mylex DAC-FFX) to manage more than one read per two revolutions of the disks). A heavily random-access workload would also be somewhat slower, because the disk accesses will average less than one per revolution of the disk. Some reads also go to parity data, so the actual application data throughput would be a bit lower.
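
Here is a small sketch of that streaming estimate, under the same assumptions (one stripe-sized chunk per revolution per disk, no read-ahead, no controller or bus limits); the function is illustrative only:

```python
def streaming_mb_per_sec(rpm, stripe_kb, num_disks=1):
    """Streaming estimate: each disk transfers one stripe-sized chunk per revolution."""
    revs_per_sec = rpm / 60
    return revs_per_sec * stripe_kb / 1024 * num_disks

print(round(streaming_mb_per_sec(15_000, 64)))      # ~16 MB/s per 15K disk with a 64K stripe
print(round(streaming_mb_per_sec(15_000, 64, 14)))  # ~219 MB/s for 14 such disks (the ~210MB/s case)
print(round(streaming_mb_per_sec(15_000, 256)))     # ~62 MB/s per disk with a 256K stripe
```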

Write bottlenecks

A small (partial-stripe) write on a RAID-5 involves two reads and two writes: the controller has to read the old block and the corresponding parity block, XOR the old and new data with the parity to recalculate it, and write out the new block and the new parity. Caching can reduce the disk activity if the old block and parity block are already in cache. The same applies to RAID-50. (A full-stripe write can skip the reads, since the parity can be computed from the new data alone.)

A RAID-10 needs two disk accesses per write - one to the primary disk and one to its mirror. Read performance is roughly equivalent to a RAID-5.
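
A minimal sketch of the resulting write penalties for the small-write case, assuming no cache hits; the penalty factors come straight from the descriptions above, and the 180 IOPS figure is the earlier 15K estimate:

```python
# Disk operations consumed per small application write (no cache hits assumed).
WRITE_PENALTY = {
    "raid10": 2,  # write the primary copy and its mirror
    "raid5": 4,   # read old data + old parity, write new data + new parity
}

def array_random_write_iops(raid_level, num_disks, single_disk_iops):
    """Upper bound on small random write IOPS for the whole array."""
    return num_disks * single_disk_iops / WRITE_PENALTY[raid_level]

print(round(array_random_write_iops("raid5", 14, 180)))   # ~630 small writes/s for 14 x 15K disks
print(round(array_random_write_iops("raid10", 14, 180)))  # ~1260 small writes/s for the same disks
```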

Controller bottlenecks

In some cases (fibre channel is prone to this) the connections to the physical disk subsystem are of somewhat lower bandwidth than the disks are theoretically capable of delivering. Also, disk controllers can perform poorly. In many cases this is a more significant limitation than the disks themselves. High-end SAN hardware often has large multiprocessor machines as controllers - they may also have custom hardware for fast parity calculations. The controller for an EMC DMX takes up half a rack by itself - before you put any disks on it.

Tuning the disk itself

Caching and read-ahead parameters on the disks themselves may also affect performance for certain workloads. For example, disks using Seagate's 'V' firmware might be set up with fewer, larger cache segments and aggressive read-ahead to optimise for streaming throughput of media data. The same physical disk configured for use in a Clariion would be set up with more, smaller cache segments in order to support a larger number of smaller writes from many clients on a SAN.

Solution 3

Here is an example: I ran benchmarks with the same drives (7x 750GB Seagate Barracuda ES.2), same RAID configuration (stripe size, etc.), same motherboard (Supermicro H8DMe), same CPU (dual Opteron 2214), same RAM (8GB ECC), same operating system (Linux), same filesystem (XFS, nobarrier option), but different RAID controllers. The results:

  • Areca 1280: 250MB/s write, 350MB/s read, 21000 files created/s
  • Adaptec ASR52445: 240MB/s write, 350MB/s read, 18000 files created/s
  • 3Ware 9550: 310MB/s write, 410MB/s read, 6500 files created/s
  • 3Ware 9650: 440MB/s write, 410MB/s read, 4500 files created/s

Of course these are the optimal results after fine-tuning all software parameters for each controller (read-ahead, caching options, request queue length, request size...) through long, repeated benchmarks while adjusting the various knobs.

One of the funny things I discovered through careful benchmarking is that the optimal settings are entirely different for Barracuda ES.2 (32MB cache) and Barracuda ES (16MB cache) drives, even though the top performance is about the same.

Unfortunately, storage and RAID are hard. That's why you won't find a simple, ready-made performance calculator.

Solution 4

The sort of speed that you will see varies a lot depending on the drives, the controller and your workload, so you are not going to find a nice easy calculator that gives good, accurate and precise results.

Solution 5

I found a calculator that will give you multipliers of speed.

It boils down to the following multipliers (see the sketch after this list):

  • JBOD:
    • Read: 1X
    • Write: 1X
  • RAID 0 (striped set):
    • Read: [NumberOfVolumes]X
    • Write: [NumberOfVolumes]X
  • RAID 1 (mirror set):
    • Read: [NumberOfVolumes]X
    • Write: 1X
  • RAID 5:
    • Read: [NumberOfVolumes-1]X
    • Write: N/A, dependent on the controller
  • RAID 10 (mirror of striped sets, 4 drives):
    • Read: [4]X
    • Write: [2]X
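
Here is a minimal sketch that turns those multipliers into rough MB/s estimates, given an assumed per-disk throughput. The RAID 5 write multiplier is left as None since, as noted in the list, it depends on the controller, and the RAID 10 entry generalises the 4-drive case:

```python
# Read/write multipliers from the list above: (read_multiplier, write_multiplier) per RAID level.
MULTIPLIERS = {
    "jbod":   (lambda n: 1,     lambda n: 1),
    "raid0":  (lambda n: n,     lambda n: n),
    "raid1":  (lambda n: n,     lambda n: 1),
    "raid5":  (lambda n: n - 1, lambda n: None),   # write multiplier is controller-dependent
    "raid10": (lambda n: n,     lambda n: n // 2), # 4 drives -> read 4X, write 2X
}

def estimated_mb_per_sec(raid_level, num_drives, per_disk_mb_per_sec):
    """Very rough read/write MB/s estimate; ignores controller, bus and workload effects."""
    read_mult, write_mult = MULTIPLIERS[raid_level]
    w = write_mult(num_drives)
    return {
        "read": read_mult(num_drives) * per_disk_mb_per_sec,
        "write": None if w is None else w * per_disk_mb_per_sec,
    }

# Example: 4 drives at an assumed 80MB/s each in RAID 1/0 -> read 320, write 160.
print(estimated_mb_per_sec("raid10", 4, 80))
```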

Comments

  • Alan
    Alan almost 2 years

    First, my apologies if this question has been asked before. I've googled… and googled some more and can't seem to find what I'm looking for.

    Anyone know of a software or web-based calculator that will let you plug in a RAID configuration (example below) and output expected R/W speeds, hopefully in MB/s?

    Number of disks, size, spin, type, RAID type

    E.g. (8, 73GB, 15K, SAS, RAID 1/0)

    Or

    E.g. (6, 146GB, 10K, FC, RAID 5)

    I found several that calculate available space, and some that give speed information, but they can't be realistic because they don't take spin or type into consideration.

  • sjas
    sjas almost 10 years
    This is one damn fine piece of an answer.
  • Hannah Vernon
    Hannah Vernon about 8 years
    As always, @COTW, this is an excellent answer with great details that are hard to find elsewhere! Thanks!
  • ConcernedOfTunbridgeWells
    ConcernedOfTunbridgeWells about 8 years
    @MaxVernon - Thanks. I spent an ungodly amount on fibre channel kit to find that out, right down to poking about with disk and controller firmware. Now it's all in landfill - the last of it went just a few weeks ago. R.I.P.