Poor IO performance - PCIe NVMe Samsung 950 pro
Solution 1
Thank you for your question; it has been incredibly helpful for me.
I had a very similar experience with a different hardware setup (I am using an Intel NVMe SSD), but I am also running Ubuntu 16.04. Given your evidence and a similar result found in this article, I was convinced that the issue was with how Ubuntu was setting up the NVMe drives.
I was determined to solve the issue without giving up completely on Ubuntu. But no matter what I did, I was not able to get speeds above 2000 MB/sec when testing with hdparm exactly as you described.
So I did some digging and found a guide provided by Intel. I tried everything it suggested and found that one part was different: near the bottom it discusses aligning the drive partitions correctly. This was the one part that didn't match my installation: my starting block was not divisible by 4096 bytes, because the layout assumed a 512 byte sector size instead of a 4k sector size.
Sure enough, I formatted the disk to start the partition at a value divisible by 4096, and FINALLY I was able to break speeds of 2000 MB/s.
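For anyone wanting to check this on their own drive, here is a minimal sketch (assuming the drive shows up as /dev/nvme0n1, reports 512-byte logical sectors as in the question, and that the first partition is nvme0n1p1):

# Print the starting sector of every partition on the drive
sudo fdisk -l /dev/nvme0n1 | grep '^/dev/nvme0n1p'

# With 512-byte sectors, a partition is 4096-byte aligned when its start
# sector is divisible by 8 (8 * 512 = 4096)
start=$(cat /sys/block/nvme0n1/nvme0n1p1/start)
if [ $(( start % 8 )) -eq 0 ]; then echo "aligned"; else echo "NOT aligned"; fi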
Right now it is averaging 2.3 GB/s, when I expect it to be a bit higher. I blame this on the fact that when I run sudo fdisk -l, the NVMe drive is still shown with a physical sector size of 512 bytes. I plan to continue investigating, but I hope this helps you!
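If you want to see which sector formats the drive itself exposes (as opposed to what the partition table uses), the nvme-cli tool can list the supported LBA formats. A sketch, assuming the nvme-cli package is available for your release and the drive is /dev/nvme0n1; note that many consumer drives only advertise a 512-byte format, and reformatting the namespace destroys all data:

sudo apt-get install nvme-cli

# List the LBA formats the controller supports; "(in use)" marks the active one
sudo nvme id-ns -H /dev/nvme0n1 | grep "LBA Format"

# If a 4096-byte format is listed (say, at format index 1), the namespace could
# be switched to it -- THIS ERASES THE DRIVE, so only as a last resort:
# sudo nvme format /dev/nvme0n1 --lbaf=1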
Solution 2
Caution: This answer is old. As of Linux 4.19, blk_mq is the default scheduler. The problem with your PCIe NVMe SSD running slowly most likely stems from elsewhere.
Original answer:
Please add scsi_mod.use_blk_mq=1 to your kernel boot parameters, otherwise I don't think you will see the benefit of NVMe's increased command queue and commands per queue.
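On Ubuntu this is typically done through GRUB; a minimal sketch, assuming the stock /etc/default/grub layout:

# Add the parameter to GRUB_CMDLINE_LINUX_DEFAULT, e.g.:
#   GRUB_CMDLINE_LINUX_DEFAULT="quiet splash scsi_mod.use_blk_mq=1"
sudoedit /etc/default/grub

# Regenerate the GRUB configuration and reboot
sudo update-grub
sudo reboot

# After the reboot, verify the parameter took effect
cat /proc/cmdline
cat /sys/module/scsi_mod/parameters/use_blk_mq   # should report Y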
Note: I know it's for Arch, but you might also want to take a look at the Arch Wiki for more info about tuning I/O.
Solution 3
This thread is one year old (October 2016). One of the highest-upvoted answers recommends an Intel NVMe driver that is two years old (2015).
In February 2017, though, Samsung released a firmware update that uses a Linux-based bootable ISO installer. At the same link there are drivers you can install for Windows 7/8/10. I'll be installing both soon on my new Samsung 960 Pro and brand-new Dell-based i7-6700 laptop, along with flashing the BIOS and updating other Dell drivers.
I think it's important to revisit these old threads and give new users current (as of October 11, 2017, anyway) links so they have all options open.
Many Google searches turn up reports of the Samsung 960 Pro running at half its Windows speed under Linux, so I encourage everyone to search out as many options as possible.
After implementing the scsi_mod.use_blk_mq=1 kernel parameter:
$ systemd-analyze
Startup finished in 7.052s (firmware) + 6.644s (loader) + 2.427s (kernel) + 8.440s (userspace) = 24.565s
Removing the kernel parameter and rebooting:
$ systemd-analyze
Startup finished in 7.060s (firmware) + 6.045s (loader) + 2.712s (kernel) + 8.168s (userspace) = 23.986s
So it would appear now that scsi_mod.use_blk_mq=1 makes the system slower, not faster. At one time it may have been beneficial, though.
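Boot time is only a rough proxy for disk throughput, so if you test this parameter it may be worth re-running a direct, read-only disk benchmark before and after the change as well. A sketch, assuming the drive is /dev/nvme0n1 and fio/hdparm are installed:

# Quick read-throughput check (non-destructive)
sudo hdparm -tT --direct /dev/nvme0n1

# Random-read IOPS/latency check with fio (also read-only)
sudo fio --filename=/dev/nvme0n1 --direct=1 --rw=randread --bs=4k --ioengine=libaio --iodepth=32 --numjobs=1 --runtime=30 --time_based --name=readtest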
Solution 4
Here's some interesting information: on Windows, the drive doesn't perform according to review benchmarks until cache flushing is disabled. Usually this isn't done directly; instead, the vendor's driver (in this case, Samsung NVMe driver) is installed.
If you benchmark with the vendor's driver, and then disable cache flushing in Windows, you get the same numbers. That would be unlikely unless the vendor's driver were itself ignoring cache flushing.
Translated to Linux-land, that means that to get the big benchmark numbers you see in all the reviews, you need to disable fsync, with everything that implies for reliability (no fsync, or specifically no write barrier, means that a power loss at the wrong time could break the whole FS, depending on the implementation; reordered writes create "impossible" situations).
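If you want to see how much those flushes cost on your own drive without touching the raw device, fio can issue an fsync after every write. A minimal sketch (assumes fio is installed and writes a scratch file in the current directory):

# Journal-style workload: fsync after every 4k write (this is what the cache
# flush / write barrier protects)
fio --name=flush-test --filename=./fio-scratch.tmp --size=1G --rw=write --bs=4k --fsync=1 --runtime=30 --time_based

# Same workload with no explicit flushes, for comparison
fio --name=noflush-test --filename=./fio-scratch.tmp --size=1G --rw=write --bs=4k --runtime=30 --time_based
rm ./fio-scratch.tmp

A large gap between the two runs is expected on consumer drives without power-loss protection; the vendor-driver trick on Windows essentially makes the first case behave like the second.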
Samsung's "data center" SSDs come with capacitors to ensure cached data is flushed correctly. This is not the case with their consumer drives.
I've just worked this out from first principles, having added a 1TB NVMe to my new build yesterday. I'm not particularly happy, and I've initiated contact with Samsung support to see what they say - but I doubt I'll hear back.
Updated on September 18, 2022
Comments
-
kross over 1 year
I just finished a hardware build expecting a big gain from the new NVMe drive. My prior performance was lower than expected (~3gb transferred), so I've replaced the motherboard/cpu/memory/hdd. While performance is double what it was, it is still half what I get on my 3 year old macbook pro with a SATA6 drive.
- CPU: i7-5820k 6core
- Mobo: MSI X99A MPOWER
- Memory: 32GB
- Drive: Samsung 950 pro NVMe PCIe
Ubuntu (also confirmed with 16.04.1 LTS):

Release:    15.10
Codename:   wily
4.2.0-16-generic

$ sudo blkid
[sudo] password for kross:
/dev/nvme0n1p4: UUID="2997749f-1895-4581-abd3-6ccac79d4575" TYPE="swap"
/dev/nvme0n1p1: LABEL="SYSTEM" UUID="C221-7CA5" TYPE="vfat"
/dev/nvme0n1p3: UUID="c7dc0813-3d18-421c-9c91-25ce21892b9d" TYPE="ext4"
Here are my test results:
sysbench --test=fileio --file-total-size=128G prepare
sysbench --test=fileio --file-total-size=128G --file-test-mode=rndrw --max-time=300 --max-requests=0 run
sysbench --test=fileio --file-total-size=128G cleanup

Operations performed:  228000 Read, 152000 Write, 486274 Other = 866274 Total
Read 3.479Gb  Written 2.3193Gb  Total transferred 5.7983Gb  (19.791Mb/sec)
 1266.65 Requests/sec executed

Test execution summary:
    total time:                          300.0037s
    total number of events:              380000
    total time taken by event execution: 23.6549
    per-request statistics:
         min:                                  0.01ms
         avg:                                  0.06ms
         max:                                  4.29ms
         approx.  95 percentile:               0.13ms

Threads fairness:
    events (avg/stddev):           380000.0000/0.00
    execution time (avg/stddev):   23.6549/0.00
The scheduler is set to none:

# cat /sys/block/nvme0n1/queue/scheduler
none
Here is the lspci information:

# lspci -vv -s 02:00.0
02:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd Device a802 (rev 01) (prog-if 02 [NVM Express])
    Subsystem: Samsung Electronics Co Ltd Device a801
    Physical Slot: 2-1
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 32 bytes
    Interrupt: pin A routed to IRQ 45
    Region 0: Memory at fb610000 (64-bit, non-prefetchable) [size=16K]
    Region 2: I/O ports at e000 [size=256]
    Expansion ROM at fb600000 [disabled] [size=64K]
    Capabilities: [40] Power Management version 3
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [50] MSI: Enable- Count=1/8 Maskable- 64bit+
        Address: 0000000000000000  Data: 0000
    Capabilities: [70] Express (v2) Endpoint, MSI 00
        DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
            ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
        DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
            RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
            MaxPayload 128 bytes, MaxReadReq 512 bytes
        DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend-
        LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM L1, Exit Latency L0s <4us, L1 <64us
            ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
        LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 8GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR+, OBFF Not Supported
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
        LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
             Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
             Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+, EqualizationPhase1+
             EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
    Capabilities: [b0] MSI-X: Enable+ Count=9 Masked-
        Vector table: BAR=0 offset=00003000
        PBA: BAR=0 offset=00002000
    Capabilities: [100 v2] Advanced Error Reporting
        UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
        CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
        CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
        AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
    Capabilities: [148 v1] Device Serial Number 00-00-00-00-00-00-00-00
    Capabilities: [158 v1] Power Budgeting <?>
    Capabilities: [168 v1] #19
    Capabilities: [188 v1] Latency Tolerance Reporting
        Max snoop latency: 0ns
        Max no snoop latency: 0ns
    Capabilities: [190 v1] L1 PM Substates
        L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
              PortCommonModeRestoreTime=10us PortTPowerOnTime=10us
    Kernel driver in use: nvme
hdparm:

$ sudo hdparm -tT --direct /dev/nvme0n1

/dev/nvme0n1:
 Timing O_DIRECT cached reads:   2328 MB in  2.00 seconds = 1163.98 MB/sec
 Timing O_DIRECT disk reads: 5250 MB in  3.00 seconds = 1749.28 MB/sec
hdparm -v:

sudo hdparm -v /dev/nvme0n1

/dev/nvme0n1:
SG_IO: questionable sense data, results may be incorrect
 multcount     =  0 (off)
 readonly      =  0 (off)
 readahead     = 256 (on)
 geometry      = 488386/64/32, sectors = 1000215216, start = 0
fstab:

UUID=453cf71b-38ca-49a7-90ba-1aaa858f4806 /               ext4    noatime,nodiratime,errors=remount-ro 0       1
# /boot/efi was on /dev/sda1 during installation
#UUID=C221-7CA5  /boot/efi       vfat    defaults        0       1
# swap was on /dev/sda4 during installation
UUID=8f716653-e696-44b1-8510-28a1c53f0e8d none            swap    sw              0       0
UUID=C221-7CA5  /boot/efi       vfat    defaults        0       1
fio

This has some comparable benchmarks, and mine are way off. When I tested with fio and disabled sync, it is a different story:

sync=1
1 job  - write: io=145712KB, bw=2428.5KB/s, iops=607,    runt= 60002msec
7 jobs - write: io=245888KB, bw=4097.9KB/s, iops=1024,   runt= 60005msec

sync=0
1 job  - write: io=8157.9MB, bw=139225KB/s, iops=34806,  runt= 60001msec
7 jobs - write: io=32668MB,  bw=557496KB/s, iops=139373, runt= 60004msec
Here's the full sync results for one job and 7 jobs:

$ sudo fio --filename=/dev/nvme0n1 --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting --name=journal-test
journal-test: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
fio-2.1.11
Starting 1 process
Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/2368KB/0KB /s] [0/592/0 iops] [eta 00m:00s]
journal-test: (groupid=0, jobs=1): err= 0: pid=18009: Wed Nov 18 18:14:03 2015
  write: io=145712KB, bw=2428.5KB/s, iops=607, runt= 60002msec
    clat (usec): min=1442, max=12836, avg=1643.09, stdev=546.22
     lat (usec): min=1442, max=12836, avg=1643.67, stdev=546.23
    clat percentiles (usec):
     |  1.00th=[ 1480],  5.00th=[ 1496], 10.00th=[ 1512], 20.00th=[ 1528],
     | 30.00th=[ 1576], 40.00th=[ 1592], 50.00th=[ 1608], 60.00th=[ 1608],
     | 70.00th=[ 1608], 80.00th=[ 1624], 90.00th=[ 1640], 95.00th=[ 1672],
     | 99.00th=[ 2192], 99.50th=[ 6944], 99.90th=[ 7328], 99.95th=[ 7328],
     | 99.99th=[ 7520]
    bw (KB  /s): min= 2272, max= 2528, per=100.00%, avg=2430.76, stdev=61.45
    lat (msec) : 2=98.44%, 4=0.58%, 10=0.98%, 20=0.01%
  cpu          : usr=0.39%, sys=3.11%, ctx=109285, majf=0, minf=8
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=36428/d=0, short=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: io=145712KB, aggrb=2428KB/s, minb=2428KB/s, maxb=2428KB/s, mint=60002msec, maxt=60002msec

Disk stats (read/write):
  nvme0n1: ios=69/72775, merge=0/0, ticks=0/57772, in_queue=57744, util=96.25%

$ sudo fio --filename=/dev/nvme0n1 --direct=1 --sync=1 --rw=write --bs=4k --numjobs=7 --iodepth=1 --runtime=60 --time_based --group_reporting --name=journal-test
journal-test: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
...
fio-2.1.11
Starting 7 processes
Jobs: 6 (f=6): [W(2),_(1),W(4)] [50.4% done] [0KB/4164KB/0KB /s] [0/1041/0 iops] [eta 01m:00s]
journal-test: (groupid=0, jobs=7): err= 0: pid=18025: Wed Nov 18 18:15:10 2015
  write: io=245888KB, bw=4097.9KB/s, iops=1024, runt= 60005msec
    clat (usec): min=0, max=107499, avg=6828.48, stdev=3056.21
     lat (usec): min=0, max=107499, avg=6829.10, stdev=3056.16
    clat percentiles (usec):
     |  1.00th=[    0],  5.00th=[ 2992], 10.00th=[ 4512], 20.00th=[ 4704],
     | 30.00th=[ 5088], 40.00th=[ 6176], 50.00th=[ 6304], 60.00th=[ 7520],
     | 70.00th=[ 7776], 80.00th=[ 9024], 90.00th=[10048], 95.00th=[12480],
     | 99.00th=[15936], 99.50th=[18048], 99.90th=[22400], 99.95th=[23936],
     | 99.99th=[27008]
    bw (KB  /s): min=  495, max=  675, per=14.29%, avg=585.60, stdev=28.07
    lat (usec) : 2=4.41%
    lat (msec) : 2=0.57%, 4=4.54%, 10=80.32%, 20=9.92%, 50=0.24%
    lat (msec) : 250=0.01%
  cpu          : usr=0.14%, sys=0.72%, ctx=173735, majf=0, minf=63
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=61472/d=0, short=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: io=245888KB, aggrb=4097KB/s, minb=4097KB/s, maxb=4097KB/s, mint=60005msec, maxt=60005msec

Disk stats (read/write):
  nvme0n1: ios=21/122801, merge=0/0, ticks=0/414660, in_queue=414736, util=99.90%
Alignment

I have checked the alignment with parted, and did the math based on http://www.intel.com/content/dam/www/public/us/en/documents/technology-briefs/ssd-partition-alignment-tech-brief.pdf

kross@camacho:~$ sudo parted
GNU Parted 3.2
Using /dev/nvme0n1
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) unit s
(parted) print all
Model: Unknown (unknown)
Disk /dev/nvme0n1: 1000215216s
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number  Start       End          Size        File system     Name                  Flags
 1      2048s       206847s      204800s     fat32           EFI system partition  boot, esp
 2      206848s     486957055s   486750208s  ntfs                                  msftdata
 3      486957056s  487878655s   921600s     ntfs                                  hidden, diag
 4      590608384s  966787071s   376178688s  ext4
 5      966787072s  1000214527s  33427456s   linux-swap(v1)

kross@camacho:~$ sudo parted /dev/nvme0n1
GNU Parted 3.2
Using /dev/nvme0n1
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) align-check opt 1
1 aligned
(parted) align-check opt 2
2 aligned
(parted) align-check opt 3
3 aligned
(parted) align-check opt 4
4 aligned
(parted) align-check opt 5
5 aligned
TLDR;
I feel like I have something fundamentally set incorrectly, though my research hasn't turned up anything. I'm expecting throughput ~4x my 3-year-old MacBook Pro with SATA6, and I'm getting half of it with NVMe. I added noatime,nodiratime, which gave me a very small improvement, but nothing like the 4x I'm expecting. I have re-partitioned and re-installed a fresh 15.10 server just to be sure I didn't have anything lingering, and had the same results.
Are my fio sync/no-sync results above indicative of a problem?
So I have a clean slate and can try anything. What can I try to get my performance up to par? Any references are welcome.
-
Fabby over 8 years: What's the output of smartctl --scan and then a smartctl --all /dev/xxx where xxx is whatever came up in the first command???
-
kross over 8 years: @Fabby apt-get install smartmontools fails with grub-probe: error: cannot find a GRUB drive for /dev/nvme0n1p3. Check your device.map. It appears (based on my endeavors) that update-grub doesn't work well due to a grub-probe error. smartctl -i /dev/nvme0n1 returns /dev/nvme0n1: Unable to detect device type. Please specify device type with the -d option. NVMe does not appear in the smartctl -h output as a device type.
-
Fabby over 8 years: What's the output of uname --kernel-release&&lsb_release --code --short???
-
kross over 8 years: 4.2.0-16-generic wily
-
wawa over 8 years: I could be completely wrong and I can't find the source currently, but as I recall, you need a Skylake processor to run those SSDs at full speed...
-
Fabby over 8 years: OK, could you update your answer with the above and the output of sudo blkid? (I'll be responding tomorrow as I'm keeling over from lack of sleep right now.)
-
kross over 8 years: Answer updated - no device.map. I had a real struggle with that, documented in this issue: askubuntu.com/questions/697446/…
-
kross over 8 years: @wawa I researched and picked this X99 board and this Haswell-E i7 processor to be sure I could run it at full speed. Please correct me though, as I didn't verify it definitively!
-
wawa over 8 years: @kross Unfortunately I just found a German article: computerbild.de/artikel/… Under Neue Hauptplatinen, which translates to "new motherboards", they write something like: you need a new motherboard too, which brings the advantage of running M.2-sized SSDs at full speed, since additional PCI-Express 3.0 connections become available; with earlier ones the M.2 SSD had to use the classical PCI-Express connection (to the southbridge, not the northbridge). Now I'm not sure if it's mainboard or CPU related, but could that be it?
wawa over 8 years: @kross Hmm, I think I was mistaken; it isn't about the CPU but about the motherboard. From en.wikipedia.org/wiki/M.2: M.2 sockets keyed to support SATA or two PCI Express lanes (PCIe ×2) are referred to as "socket 2 configuration" or "socket 2", while the sockets keyed for four PCI Express lanes (PCIe ×4) are referred to as "socket 3 configuration". So I'm guessing your motherboard only supports socket 2, which doesn't perform as well as socket 3. Is that possible?
-
wawa over 8 years: Further reading could be this article: pcworld.com/article/2977024/storage/…
-
kross over 8 years: @wawa I linked the brand new motherboard; it has "Turbo M.2: delivering next generation M.2 Gen3 x4 performance with transfer speeds up to 32 Gb/s", so I think I'm set due to the x4 PCIe lanes.
-
wawa over 8 years: Strange. Then I don't see any reason why it shouldn't work.
-
Jan-Marten Spit over 8 years: What does iostat -xm (averages since boot) report for your nvme disk? Did you check /sys/block/device/queue/max_hw_sectors_kb and /sys/block/device/queue/max_sectors_kb? Not a misaligned partition? During the tests, what does /proc/pid/wchan give as the kernel wait channel most often when the pid status is in D state?
-
zloster over 8 years: Servethehome.com seems to have a similar problem. Check out this and their forum post. They do not have a solution yet.
-
rm-vanda about 8 years: Damn, I just got my Samsung 950, and I have the same problem on an x99s - - -
-
mplappert about 8 years: I'm also facing the same problem. Have you ever found a viable solution?
-
kross over 7 years: I have no solution yet, and have confirmed the same with 16.04.1 LTS.
-
WinEunuuchs2Unix over 5 years: I'm curious if you've worked out these speed issues yet?
-
kross over 5 years: No, I abandoned Linux on this hardware and just use it for PC gaming now.
-
ikwyl6 over 3 years: Has anyone had any luck with a newer Ubuntu install to see if these results change?
-
kross over 8 years: I'm using it in an identical way to the mbpro, and it is 1/2 the performance, which is the thing that doesn't make sense.
-
kross over 8 years: I just added a fio test with 1 and 7 threads, and a reference to a bunch of benchmarks using it as a basis.
-
kross over 8 years: Thanks, but I already referenced that article above under the fio heading, and you can see from the benchmarks there that my SSD is underperforming the Intel 750 NVMe 400GB (261 MB/s (1 job), 884 MB/s (5 jobs)) by a large margin with sync, and even underperforming against the previous generation Samsung XP941 256GB (2.5 MB/s (1 job), 5 MB/s (7 jobs)). So while it may be well known, it is still less than it should be.
-
kross over 7 years: Thank you for adding this, I tried it on Ubuntu 16.04.1 LTS and saw no difference. I was quite hopeful, but unfortunately this didn't change anything.
-
kross over 7 years: Thanks, I will check my alignment again. I know I investigated this at one point, but it is definitely worth taking a fresh look with this information.
-
kross over 7 years: I updated the question with my alignment. parted says it is aligned, based on the 512 block size, but it isn't divisible by 4096. So I just want to confirm: your sector size remains at 512 and the only thing you did is start the partition at a location divisible by 4096, correct?
-
kross over 7 years: Good explanation: blog.kihltech.com/2014/02/…
-
kross over 7 years: Ugh, now what to do with my existing disk...try and resize/move, or dd, hmmm, not sure. Indeed this seems to be the root cause though.
-
kross over 7 years: Another resource: intel.com/content/dam/www/public/us/en/documents/…
-
kross over 7 years: Based on ^^^, my ext4 partition is aligned:
590608384s * 512 / 4096 == (whole number)
-
cwoodwar6 over 7 years: Correct, my sector size is still showing as 512 and I started the partition at a number divisible by 4096. What I am not understanding is why parted still shows 512. You mention root cause; did you see an increase in performance after making any tweaks?
-
kross over 7 years: I haven't made any tweaks. My early comments were based on math calculated on incorrect units from parted/fdisk. The Alignment section is updated above, and the math based on the Intel PDF confirms that my partitions are aligned (as does the fact that parted states they are aligned). So I've not tried to move the partitions because...they already seem to be aligned and I don't want to randomly move them. Do you see misalignment based on the data I added?
-
cwoodwar6 over 7 years: Sorry, I wasn't following your comments. Your math checks out; the partition is aligned. This just leaves me even more confused: I expected the sector size to be 4096.
-
wordsforthewise over 6 years: Same for me, no noticeable difference in performance from hdparm benchmarks.
-
Csaba Toth about 6 years: Did they say anything?
-
WinEunuuchs2Unix over 5 years: Same for me. I've updated my answer below showing a 1 second decrease in boot speed.
-
Anon over 5 years: Just an FYI: at one point enabling SCSI multiqueue did indeed slow down certain devices, but various issues have been fixed. From the v4.19 kernel onwards Linux enables scsi-mq by default. Note: it is unclear to me whether this option would impact NVMe drives (as opposed to SCSI/SATA drives).