How to check hard disk performance

Solution 1

Terminal method

hdparm is a good place to start.

sudo hdparm -Tt /dev/sda

/dev/sda:
Timing cached reads:   12540 MB in  2.00 seconds = 6277.67 MB/sec
Timing buffered disk reads: 234 MB in  3.00 seconds =  77.98 MB/sec

sudo hdparm -v /dev/sda will give information as well.

dd will give you information on write speed.

If the drive doesn't have a file system (and only then), use of=/dev/sda.

Otherwise, mount it on /tmp, then write and delete the test output file.

dd if=/dev/zero of=/tmp/output bs=8k count=10k; rm -f /tmp/output

10240+0 records in
10240+0 records out
83886080 bytes (84 MB) copied, 1.08009 s, 77.7 MB/s
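If you also want a rough read-speed number from dd (a sketch that is not part of the original answer), keep the test file around, flush the page cache so the read actually hits the disk, and then read the file back:

dd if=/dev/zero of=/tmp/output bs=8k count=10k        # write a test file, as above, but keep it
sync && echo 3 | sudo tee /proc/sys/vm/drop_caches    # flush the page cache so the next read hits the disk
dd if=/tmp/output of=/dev/null bs=8k                  # this dd reports the read speed
rm -f /tmp/output                                     # clean up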

Graphical method

  1. Open the “Disks” application. (In older versions of Ubuntu, go to System -> Administration -> Disk Utility.)
    • Alternatively, launch the GNOME disk utility from the command line by running gnome-disks
  2. Select your hard disk in the left pane.
  3. Click the “Benchmark Disk...” menu item under the three-dots menu button in the pane on the right.
  4. A new window with charts opens. Click “Start Benchmark...”. (In older versions there are two buttons instead: “Start Read Only Benchmark” and “Start Read/Write Benchmark”. Clicking either one starts the benchmark.)

See also the article “How to benchmark disk I/O”.

Is there something more you want?

Solution 2

Suominen is right that we should use some kind of sync, but there is a simpler method: conv=fdatasync will do the job:

dd if=/dev/zero of=/tmp/output conv=fdatasync bs=384k count=1k; rm -f /tmp/output
1024+0 records in
1024+0 records out
402653184 bytes (403 MB) copied, 3.19232 s, 126 MB/s
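A related variation (my own sketch, not from the original answer) is to bypass the page cache entirely with oflag=direct, which usually gives lower but more device-representative numbers. Direct I/O is not supported on tmpfs, so this assumes /tmp lives on a regular filesystem such as ext4:

dd if=/dev/zero of=/tmp/output bs=384k count=1k oflag=direct; rm -f /tmp/output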

Solution 3

If you want accuracy, you should use fio. It requires reading the manual (man fio) but it will give you accurate results. Note that for any accuracy, you need to specify exactly what you want to measure. Some examples:

Sequential READ speed with big blocks (this should be near the number you see in the specifications for your drive):

fio --name TEST --eta-newline=5s --filename=fio-tempfile.dat --rw=read --size=500m --io_size=10g --blocksize=1024k --ioengine=libaio --fsync=10000 --iodepth=32 --direct=1 --numjobs=1 --runtime=60 --group_reporting

Sequential WRITE speed with big blocks (this should be near the number you see in the specifications for your drive):

fio --name TEST --eta-newline=5s --filename=fio-tempfile.dat --rw=write --size=500m --io_size=10g --blocksize=1024k --ioengine=libaio --fsync=10000 --iodepth=32 --direct=1 --numjobs=1 --runtime=60 --group_reporting

Random 4K read QD1 (this is the number that really matters for real world performance unless you know better for sure):

fio --name TEST --eta-newline=5s --filename=fio-tempfile.dat --rw=randread --size=500m --io_size=10g --blocksize=4k --ioengine=libaio --fsync=1 --iodepth=1 --direct=1 --numjobs=1 --runtime=60 --group_reporting

Mixed random 4K read and write QD1 with sync (this is the worst-case number you should ever expect from your drive, usually less than 1% of the numbers listed in the spec sheet):

fio --name TEST --eta-newline=5s --filename=fio-tempfile.dat --rw=randrw --size=500m --io_size=10g --blocksize=4k --ioengine=libaio --fsync=1 --iodepth=1 --direct=1 --numjobs=1 --runtime=60 --group_reporting

Increase the --size argument to increase the file size. Using bigger files may reduce the numbers you get, depending on drive technology and firmware. Small files will give "too good" results for rotational media because the read head does not need to move that much. If your device is near empty, using a file big enough to almost fill the drive will get you the worst-case behavior for each test. In the case of an SSD, the file size does not matter that much.
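For example, the random read test above could be run against a 10 GB test file instead of a 500 MB one (the exact size here is only illustrative; pick something appropriate for your drive):

fio --name TEST --eta-newline=5s --filename=fio-tempfile.dat --rw=randread --size=10g --io_size=10g --blocksize=4k --ioengine=libaio --fsync=1 --iodepth=1 --direct=1 --numjobs=1 --runtime=60 --group_reporting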

However, note that for some storage media the size of the file is not as important as the total bytes written during a short time period. For example, some SSDs have significantly faster performance with pre-erased blocks, or they might have a small SLC flash area that's used as a write cache, and the performance changes once the SLC cache is full (e.g. the Samsung EVO series, which have 20-50 GB of SLC cache). As another example, Seagate SMR HDDs have about a 20 GB PMR cache area with pretty high performance, but once it gets full, writing directly to the SMR area may cut the performance to 10% of the original. And the only way to see this performance degradation is to first write 20+ GB as fast as possible and continue with the real test immediately afterwards.

Of course, this all depends on your workload: if your write access is bursty, with longish delays that allow the device to clean its internal cache, shorter test sequences will reflect your real-world performance better. If you need to do lots of IO, you need to increase both the --io_size and --runtime parameters. Note that some media (e.g. most cheap flash devices) will suffer from such testing because the flash chips are poor enough to wear down very quickly. In my opinion, if any device is poor enough not to handle this kind of testing, it should not be used to hold any valuable data in any case. That said, do not repeat big write tests thousands of times, because all flash cells will suffer some level of wear from writing.

In addition, some high-quality SSD devices may have even more intelligent wear-leveling algorithms, where the internal SLC cache has enough smarts to replace data in place if it's being re-written while the data is still in the SLC cache. For such devices, if the test file is smaller than the total SLC cache of the device, the full test only ever writes to the SLC cache and you get higher performance numbers than the device can sustain for larger writes. So for such devices, the file size starts to matter again. If you know your actual workload, it's best to test with the file sizes you'll actually see in real life. If you don't know the expected workload, using a test file size that fills about 50% of the storage device should give a good average result for all storage implementations. Of course, for a 50 TB RAID setup, doing a write test with a 25 TB test file will take quite some time!

Note that fio will create the required temporary file on first run. It will be filled with pseudorandom data to avoid getting too-good numbers from devices that try to cheat in benchmarks by compressing the data before writing it to permanent storage. The temporary file will be called fio-tempfile.dat in the above examples and stored in the current working directory, so you should first change to a directory that is mounted on the device you want to test. fio also supports using the raw device as the test target, but I definitely suggest reading the manual page before trying that, because a typo can overwrite your whole operating system when one uses direct storage media access (e.g. accidentally writing to the OS device instead of the test device).
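In practice that just means running fio from a directory on the filesystem under test; /mnt/target below is a hypothetical mount point, so substitute your own:

cd /mnt/target        # a directory on the device you want to test (hypothetical path)
fio --name TEST --eta-newline=5s --filename=fio-tempfile.dat --rw=randread --size=500m --io_size=10g --blocksize=4k --ioengine=libaio --fsync=1 --iodepth=1 --direct=1 --numjobs=1 --runtime=60 --group_reporting
rm fio-tempfile.dat   # remove the temporary test file when you are done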

If you have a good SSD and want to see even higher numbers, increase --numjobs above. That defines the concurrency of the reads and writes. The above examples all have numjobs set to 1, so the test is about a single-threaded process reading and writing (possibly with the queue depth, or QD, set with iodepth). High-end SSDs (e.g. Intel Optane 905p) should get high numbers even without increasing numjobs a lot (e.g. 4 should be enough to reach the highest spec numbers), but some "Enterprise" SSDs require going to the 32-128 range to get the spec numbers, because the internal latency of those devices is higher while the overall throughput is insane. Note that increasing numjobs to high values usually increases the resulting benchmark numbers but rarely reflects real-world performance in any way.
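As a sketch of what a higher-concurrency random read test might look like (the iodepth and numjobs values below are illustrative, not a recommendation):

fio --name TEST --eta-newline=5s --filename=fio-tempfile.dat --rw=randread --size=500m --io_size=10g --blocksize=4k --ioengine=libaio --iodepth=32 --direct=1 --numjobs=4 --runtime=60 --group_reporting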

Solution 4

I would not recommend using /dev/urandom because it's software-based and slow as a pig. Better to take a chunk of random data on a ramdisk. For hard disk testing the randomness doesn't matter, because every byte is written as-is (also on an SSD with dd). But if we test a deduplicated ZFS pool with pure-zero versus random data, there is a huge performance difference.
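A minimal sketch of that idea, assuming /mnt/ramdisk and /mnt/target are hypothetical mount points (the random data is generated once into a tmpfs ramdisk, then written to the disk under test):

sudo mkdir -p /mnt/ramdisk
sudo mount -t tmpfs -o size=256m tmpfs /mnt/ramdisk              # RAM-backed scratch space
dd if=/dev/urandom of=/mnt/ramdisk/random.bin bs=1M count=200    # pay the urandom cost only once, in RAM
dd if=/mnt/ramdisk/random.bin of=/mnt/target/testfile bs=1M conv=fdatasync   # write the random data to the disk under test
rm /mnt/target/testfile; sudo umount /mnt/ramdisk                # clean up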

Another point to consider is including the sync time; all modern filesystems use caching for file operations.

To really measure disk speed and not memory, we must sync the filesystem to get rid of the caching effect. That can be easily done by:

time sh -c "dd if=/dev/zero of=testfile bs=100k count=1k && sync"

With that method you get output like this:

sync ; time sh -c "dd if=/dev/zero of=testfile bs=100k count=1k  && sync" ; rm testfile 
1024+0 records in
1024+0 records out
104857600 bytes (105 MB) copied, 0.270684 s, 387 MB/s

real    0m0.441s
user    0m0.004s
sys 0m0.124s

So the disk data rate is just 104857600 / 0.441 = 237772335 B/s --> ~237 MB/s.

That is over 100 MB/s lower than with caching.
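If you want the shell to do that division for you, a quick sketch (assumes bc is installed):

echo "scale=1; 104857600 / 0.441 / 1000000" | bc    # prints 237.7 (MB/s), matching the figure above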

Happy benchmarking,

Solution 5

If you want to monitor the disk read and write speed in real-time you can use the iotop tool.

This is useful to get information about how a disk performs for a particular application or workload. The output will show you read/write speed per process, and total read/write speed for the server, similar to top.

Install iotop:

sudo apt-get install iotop  

Run it:

sudo iotop

This tool is helpful to understand how a disk performs for a specific workload versus more general and theoretical tests.
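A few commonly useful iotop flags (the combination below is just an example): -o shows only processes actually doing I/O, -P groups by process rather than thread, and -a shows accumulated totals instead of current bandwidth:

sudo iotop -o -P -a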


Author: Luis Alvarado

Updated on September 18, 2022

Comments

  • Luis Alvarado
    Luis Alvarado over 1 year

    How to check the performance of a hard drive (either via terminal or GUI): the write speed, the read speed, cache size and speed, and random speed.

  • Knowledge Cube
    Knowledge Cube over 12 years
    Seeing as the question is tagged for 11.10, I just thought I'd point out the Disk Utility can also be searched for easily in the Unity dash, or located under the Accessories category of the Applications lens using the provided filter.
  • Fran Marzoa
    Fran Marzoa over 11 years
    There is no such "System ->" on my Ubuntu 12.04 Unity, or at least I haven't found it, and I do not see that disk tool within System Settings either... O_o But I finally managed to run it: /usr/bin/palimpsest
  • Alaa Ali
    Alaa Ali over 10 years
    It's an answer using a different command/option than the others. I see it's an answer worthy of a post of its own.
  • Paul Lammertsma
    Paul Lammertsma about 10 years
    Note that since 12.10 it's simply called Disks and can be found through Unity.
  • Ken Sharp
    Ken Sharp about 10 years
    On Gnome this has moved to Applications --> System Tools --> Preferences --> Disk Utility. For those of us who hate Unity.
  • Panther
    Panther about 10 years
    If /tmp is a ramdisk, then you will have to write to /home/your_user . I do not think Ubuntu uses a ramdisk for /tmp , see askubuntu.com/questions/62928/why-doesnt-tmp-use-tmpfs and askubuntu.com/questions/20783/…
  • Diego Fernández Durán
    Diego Fernández Durán almost 10 years
    Why have you used 384k as block size?
  • Tele
    Tele almost 10 years
    @Diego There is no reason, it was just an example; you can use anything else (between about 4k ... 1M). Of course a bigger blocksize will give better performance. And of course decrease the count number when you use a big bs, or it will take a year to finish.
  • VishApp
    VishApp over 8 years
    So limited; bonnie is the minimum test to get anything but linear read/write performance.
  • Jonatan Öström
    Jonatan Öström almost 8 years
    On Ubuntu 16.04 with an SSD. For me the dd command needs to be dd if=/dev/zero of=/home/user/file bs=8k count=200k; rm -f /home/user/file (change to correct username). With a smaller count or writing to tmp the results are inconsistent and unrealistically good.
  • Mustafa Chelik
    Mustafa Chelik over 7 years
    Will the benchmark overwrite my data on the disk? Because it does a write benchmark too. I think it will corrupt my data, right?
  • Panther
    Panther over 7 years
    @MustafaChelik - Yes, it writes data to the disk, but it should not overwrite existing data or corrupt your disk. I am not aware of any method to benchmark your disk without writing data, lol. Why would you think writing zeros to a temp file would "corrupt my data"?
  • Mustafa Chelik
    Mustafa Chelik over 7 years
    Because my USB stick is encrypted with Truecrypt and nothing should be written to it by a benchmark. So will my USB stick's data be corrupted if I start the benchmark?
  • Panther
    Panther over 7 years
    @MustafaChelik as I said earlier, you can not really benchmark your disk if you do not write data to it. You will need to decrypt the disk and then run the benchmark. You will need to change the command to account for the crypt such as dd if=/dev/zero of=/media/your_crypt_mount_point bs=8k count=10k; rm -f /media/your_crypt_mount_point
  • MSS
    MSS over 7 years
    It's not reliable; with benchmark tools like iozone and sysbench the numbers are much, much lower.
  • sekrett
    sekrett over 7 years
    @MustafaChelik, if it writes zeros to a temp file, why does the dialog say to back up important data before the write benchmark?
  • Anon
    Anon almost 6 years
    Some of the fio settings are a bit strange and may be non-optimal. For example, having such a huge block size (2Mbytes) when you're doing direct I/O with an asynchronous I/O engine is likely to lead to a lot of splitting in the kernel thus creating overhead. Sending periodic fsyncs when you're only doing reads also looks unusual. I agree fio is useful but I'd recommend readers carefully investigate which parameters they wish to use with rather than just copying them verbatim from the 20180102 version of the answer above...
  • Mikko Rantalainen
    Mikko Rantalainen almost 6 years
    @Anon: you're right, the optimum for sequential read would be to match /sys/block/sd?/queue/max_sectors_kb because it may be lower than the actual hardware limit which is usually way more than the 2MB in the example above. However, I assume that the minor overhead caused by the CPU does not matter compared to the speed of actual I/O device. The fsync is a no-operation for reads so it will not affect the results - I kept it so that it's easier to understand the differences between different command lines. Are you having problems getting the results matching the manufacturer specs?
  • Anon
    Anon almost 6 years
    Not exactly, I just have (some) experience working with fio and Linux. Actually if you're guessing the best block size it would be wise to start with optimal_io_size if it's available (but you can assume 64Kbytes if it's 0 - that's what the kernel does).
  • Mikko Rantalainen
    Mikko Rantalainen almost 6 years
    I just re-tested some devices. Using above sequential read test (2MB block size) I got 280 MB/s from Samsung SSD 850 EVO and 1070 MB/s from Intel 910 SSD. With 64k block size and otherwise identical commandline I got 268 MB/s from 850 EVO and 1055 MB/s from 910 SSD. At least for this kind of devices, using 2 MB block size seems to improve results around 1-5% even though it causes kernel to split requests to hardware. I guess even with kernel optimizations the overhead of submitting more syscalls is worse than splitting inside kernel.
  • Mikko Rantalainen
    Mikko Rantalainen almost 6 years
    Upon further testing it seems that I get the highest sequential throughput using power of 2 value that is less than max_sectors_kb. I changed the above example commands to use 1 MB block size because that seems to work with real world hardware. And I also tested that fsync does not matter for reading.
  • Anon
    Anon almost 6 years
    Depending on how the drive is connected you may find that your iodepth was too low. You would have to watch what Linux is actually sending down to the device and what depth it's doing it at...
  • Mikko Rantalainen
    Mikko Rantalainen almost 6 years
    I set iodepth to 1 for random access exactly because real world programs often run algorithms/logic that does not work with depth any higher than 1. As a result, if such depth is "too low" your I/O device is bad. It's true that some SSD devices will benefit from depth higher than 32. However, can you point to any real world workload that requires read access and is able to keep up iodepth higher than 32? TL; DR: if you want to reproduce some insanely high read benchmark number with high latency device, use iodepth=256 --numjobs=4 but never expect to see such numbers for real.
  • Anon
    Anon almost 6 years
    Most "real world" programs aren't actually submitting I/O (o_)directly let alone asynchronously so all of our examples are in unusual workloads to push the limits benchmark territory (as they say, the best benchmark is your real workload). Having said that doing things like running multiple busy virtual machines are easily able to generate workloads with crazy high depths but where the I/O often looks random from the disk perspective and is a simple example of where you can see a huge speedup from things like NVMe. PS: setting numbers too high will reduce throughput so there's a sweet spot...
  • Anon
    Anon over 5 years
    Be careful with using zeros for your write data - some disks (such as SSDs) and some filesystems will have a special case path for it. This results in artificially high benchmark numbers when using zero buffers. Other highly compressible data patterns can also distort results...
  • Tele
    Tele over 5 years
    Yes, that's true, Anon. But if you use the random generator, you will measure that, not the disk. If you create a random file first, then you will also measure reading the file, not only writing. Maybe you should create a big memory chunk filled with random bytes.
  • Anon
    Anon over 5 years
    Bonnie is flawed for disk benchmarking and can easily generate numbers that actually reflect non-disk aspects of your system so a high degree of care is required if you choose to use it. See Brendan Gregg's Active Benchmarking: Bonnie++ for details.
  • Mikko Rantalainen
    Mikko Rantalainen almost 5 years
    If you want graphs, see askubuntu.com/q/1108347/50254
  • Scott
    Scott almost 5 years
    Answers like this one are highly dependent upon the distribution and window manager used. Disk Utility is an alias for some underlying program - what program specifically?? On what window manager? On what distribution?
  • Anon
    Anon over 4 years
    Quick update (and correction of a mistake I wrote): as @mikko-rantalainen alluded to, max_sectors_kb will control the maximum I/O size a block device will submit down to the disk. This is a desirable value because we assume that the bigger the I/O, the less work has to be done per I/O. The slight wrinkle is that some devices expose a "preferred unit for sustained I/O" value, and if so the value in /sys/block/<dev>/queue/optimal_io_size won't be zero (typically this only happens on certain RAID devices) and using that may yield a better speed.
  • Mikko Rantalainen
    Mikko Rantalainen over 4 years
    I think optimal_io_size really means data transfer smaller than this should be avoided. For example, a HDD having physical sector size of 4 KB but logical sector size of 512 bytes might report optimal_io_size 4096. However, that does not mean that the optimal way to write big files is to write in multiple 4 KB blocks. Instead it instructs that if the software can avoid using smaller writes than optimal_io_size the device will work much faster.
  • Anon
    Anon over 4 years
    In the specific scenario you gave I believe you are thinking of /sys/block/<disk>/queue/physical_block_size vs [...]/logical_block_size (e.g. an SSD that accepts 512 byte (logical) I/O but its "blocks" are (physically) 4096 bytes big meaning it will do extra work to cope with 512 byte I/O). There it's rare the optimal_io_size will be anything but 0 HOWEVER the minimum_io_size ("preferred minimum granularity") might be 4096. When I've seen a non-zero optimal_io_size it's been huge (e.g. 4MiB) compared to physical and was reflecting RAID array stripe size.
  • Anon
    Anon over 4 years
    "In case of SSD, the file size does not matter that much" - this is not true. For writes the file size can be massively significant because if it's too small (and/or not enough write I/O is being done) you may not exhaust pre-erased blocks and thus won't see the effect where sustained write speed falls to the speed of garbage collection. See snia.org/sites/default/education/tutorials/2011/fall/SolidSt‌​ate/… for more discussion.
  • Mikko Rantalainen
    Mikko Rantalainen over 4 years
    I'll rephrase the file size thing to make it clearer. In the case of an SSD, it doesn't matter how big a file you have, but it may matter how much IO you do in total. This will also make a difference for SMR HDDs, too.
  • Anon
    Anon over 4 years
    I was thinking that but when I was writing it into the above someone pointed out that the file size can still matter because if the SSD spots you doing overwrites of things in its cache it can choose to throw the overwritten data away.
  • Anon
    Anon over 4 years
    Additionally partial overwrites lead to more garbage collection work (because you have to read the bits of the data that weren't overwritten and preserve them) whereas with a full overwrite you can mark the whole cell as free. See codecapsule.com/2014/02/12/… for some more background.
  • funder7
    funder7 almost 4 years
    Thanks for this test. I'm no expert with disk benchmarks, but these results are really what I see when copying a large number of files on my drives. Today I configured a RAID array for the first time with mdadm, with 3 different HDDs (I mean, no two HDDs in this array are the same!). I'm a little confused: the RAID performed better than my M.2 PCIe SSD (it's a Kingston SHPM228). Not that I'm unhappy with the actual speed, but I have a slight sense that something is not working properly.
  • Mikko Rantalainen
    Mikko Rantalainen almost 4 years
    @Funder I assume you tested either sequential read or sequential write. If you get better performance from the HDD raid with the mixed 4k QD1 test above, you've probably made some mistake during the testing. I'm not familiar with that specific Kingston SSD, but I'd assume it is not bad enough to lose to an HDD raid for non-sequential access. That said, I personally nowadays only use Samsung, Intel or premium A-Data SSDs. Also, if you're interested in performance only, Linux software raid wins any hardware raid setup. Battery backup on the write cache may still be a reason to use a RAID card, though.
  • funder7
    funder7 almost 4 years
    @Mikko you're right, today I retried and the raid was definitely less performant; probably I did something wrong. Well, this SSD should be good enough, I bought it 2 years ago and the price is still high.. The brand is HyperX, I think that's the high-end division of Kingston. Anyway, looking at the reviews on e-shops, I see that some Samsungs are the best, speaking of price/performance balance. Returning to RAID, yes, I don't care much about data redundancy; I tried to build an array just to improve the transfer rate between solid state and HDD drives. I thought RAID cards were better!
  • Mikko Rantalainen
    Mikko Rantalainen over 3 years
    @Funder the problem with RAID cards is that they are basically trying to compete with real CPUs and whatever embedded processor the card has will be much slower than modern AMD or Intel CPU with optimized code. Forget getting a pricey card and use the money on system CPU instead. Linux kernel contains 2-4 different implementations for all RAID algorithms and the kernel benchmarks the CPU during boot so you get a really good implementation for any CPU. In addition, the real CPU has much better connection to RAM / disk cache so latency is going to be lower, too, with software RAID.
  • funder7
    funder7 over 3 years
    @MikkoRantalainen well, I don't know which interface is most used now for RAID cards, but I'm pretty sure you don't need the same CPU power to stay at the same speed level... probably hw cards have a cache mounted for this reason, having a buffer to handle high-traffic situations, plus they only do data transfer, while the CPU has many different duties to handle. My SSD is in M.2 format, so you can mount it also on its PCIe adapter; it should reach 6 Gb/s. I'd say that using a dedicated RAID card, you no longer depend on the SSD's cache and bus speed.
  • Mikko Rantalainen
    Mikko Rantalainen over 3 years
    @Funder if you have lots of HDDs connected to a single RAID controller with hardware acceleration, the system CPU will need less power to use the RAID system. However, the performance of the whole system is still probably worse than just using Linux software RAID with direct connection to disks. Of course, most systems will run out of connectors to attach very many disks without a RAID card and depending on RAID card implementation you cannot directly talk to individual HDDs through the card which is required for high performance software RAID.
  • Mikko Rantalainen
    Mikko Rantalainen over 3 years
    For maximum performance you want NVMe devices directly attached to PCIe and put Linux software RAID over those. And in that case, the limit is probably going to be the max throughput between PCIe subsystem and system RAM.
  • funder7
    funder7 over 3 years
    Thanks @MikkoRantalainen, good things to know!
  • Anon
    Anon over 3 years
    @MikkoRantalainen "Upon further testing it seems that I get the highest sequential throughput using power of 2 value that is less than max_sectors_kb." - this is likely because the Linux kernel was choosing to break the I/O up at a smaller size. See stackoverflow.com/a/59403297/2732969 for details.
  • Mikko Rantalainen
    Mikko Rantalainen over 3 years
    @Anon Thanks. I was aware of max_sectors_kb but hadn't considered that DMA engine could be limiting the request size, too.
  • Dee
    Dee over 3 years
    @Panther your dd command gives me the same write speed on hdd and ssd
  • Bill Zhao
    Bill Zhao about 3 years
    The dd command on my Android device does not support conv=fdatasync.