Extremely slow SSD (SanDisk SD8TB8U5) write speeds in brand new Linux Mint 18.2 install

Solution 1

More of a workaround than a proper fix to the problem -- replacing Linux Mint 18.2 32-bit with Linux Mint 18.2 64-bit resolved the insanely slow write speeds permanently.


Interested in more detail? Alas, dear reader, please read on!

We have a requirement to use a particular 32-bit compiler toolchain. Given that, I assumed it would be best to install a 32-bit OS, and I selected Linux Mint Cinnamon (kernel 4.8) because it strikes me as a more desktop-friendly distro for some of our developers who aren't very familiar with Linux.

Immediately after bootup, writes were blazing fast at > 200 MiB/sec; after an hour or so, they dropped back down to < 5 MiB/sec.

The exact same behavior was observed running Manjaro Linux 32-bit (kernel 4.9 LTS, and some 4.10 as well).
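One way to capture the decay described above would be to log write throughput at intervals after boot. The following is a minimal sketch, not the method actually used here: `write_speed_mib` is a hypothetical helper name, and it assumes GNU `dd`, GNU `date` (for `%N`) and `awk` are available.

```shell
#!/bin/sh
# write_speed_mib DIR SIZE_MIB
# Hypothetical helper: writes SIZE_MIB MiB of zeros into DIR with dd,
# forces the data to disk (conv=fdatasync), and prints the rate in MiB/s.
write_speed_mib() {
    dir="$1"
    size_mib="$2"
    out="$dir/write_speed_test.$$"

    start=$(date +%s.%N)
    dd if=/dev/zero of="$out" bs=1M count="$size_mib" conv=fdatasync 2>/dev/null
    end=$(date +%s.%N)
    rm -f "$out"

    # Guard against a zero interval so the division is always defined.
    awk -v s="$start" -v e="$end" -v n="$size_mib" \
        'BEGIN { t = e - s; if (t <= 0) t = 0.000001; printf "%.1f\n", n / t }'
}
```

For example, `while true; do echo "$(date +%T) $(write_speed_mib "$HOME" 256) MiB/s"; sleep 600; done` would print one sample every ten minutes. On an affected system the figure would be expected to fall sharply as uptime grows.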

So I tried installing Linux Mint 64-bit (again, the same 4.8 kernel). Now the disk writes start fast and stay fast; no disk issues at all. This strikes me as a kernel bug, possibly in one of the SSD drivers, that is specific to 32-bit environments.

Fortunately, apt has a meta-package ia32-libs that installs a solid collection of x86 libraries, which turned out very nearly sufficient to run our 32-bit compiler toolchain -- just had to manually install one extra lib.

Solution 2

For those wondering, this is a bug in the Linux kernel, tracked here: 100+ times slower disk writes on 4.x+/i386/16+RAM, compared to 3.x

(and here: 32-bit kernel HDD slow write speed, and here: Slow disk writes after some uptime, only on 32bit/16+RAM/4+ kernels)

It is triggered on 32-bit kernels with more than 12GB (8GB?) of RAM enabled. It was apparently introduced in kernel v4.2.0 and has not yet been resolved (Red Hat seems to have provided a mitigating patch, but it never made it upstream).
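A rough check for whether a machine falls into this category might look like the sketch below. `maybe_affected` is a hypothetical helper, and the 12 GiB threshold is only the uncertain figure quoted above, not a confirmed boundary. It also does not check the kernel version.

```shell
#!/bin/sh
# maybe_affected ARCH MEM_KIB
# Hypothetical helper: succeeds (exit 0) when ARCH looks like 32-bit x86
# and MEM_KIB exceeds 12 GiB. The threshold is an assumption taken from
# the uncertain "12GB (8GB?)" figure in the answer.
maybe_affected() {
    arch="$1"
    mem_kib="$2"
    threshold_kib=$((12 * 1024 * 1024))

    case "$arch" in
        i?86) ;;            # i386, i486, i586, i686
        *) return 1 ;;
    esac
    [ "$mem_kib" -gt "$threshold_kib" ]
}
```

It could be called as `maybe_affected "$(uname -m)" "$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)" && echo "32-bit kernel with more than 12 GiB of RAM"`.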

One user found a workaround by setting this in the GRUB cmdline options:

mem=12G

(or any value lower than that) but, as you can guess, it limits the amount of memory available in that session.
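As a sketch of how the option could be made persistent on a Debian-style system such as Mint, the helper below appends it to the default kernel command line. `add_mem_limit` is a hypothetical name, and the code assumes GNU `sed` and the usual `GRUB_CMDLINE_LINUX_DEFAULT="..."` line in `/etc/default/grub`.

```shell
#!/bin/sh
# add_mem_limit GRUB_FILE LIMIT
# Hypothetical helper: appends mem=LIMIT to GRUB_CMDLINE_LINUX_DEFAULT in
# GRUB_FILE (normally /etc/default/grub). Passing the path as an argument
# lets the edit be tried on a copy first. Assumes GNU sed (for -i).
add_mem_limit() {
    grub_file="$1"
    limit="$2"

    # Do nothing if a mem= option is already set on that line.
    if grep -q '^GRUB_CMDLINE_LINUX_DEFAULT=.*mem=' "$grub_file"; then
        return 0
    fi

    # Insert " mem=LIMIT" just before the closing double quote.
    sed -i "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"/\1 mem=$limit\"/" "$grub_file"
}
```

After running something like `sudo sh -c '. ./add_mem_limit.sh; add_mem_limit /etc/default/grub 12G'` (the script file name is illustrative), the change would typically be applied with `sudo update-grub` and a reboot, and `cat /proc/cmdline` should then show the option.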

I personally found that setting vm.highmem_is_dirtyable=1 worked, back when I was still using the 32-bit kernel (this option does not exist on 64-bit kernels), but Andrew Morton said doing that might lead to other problems. You can test it by running:

 sudo sysctl -w vm.highmem_is_dirtyable=1
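Before and after changing the setting, it may help to confirm that the knob exists and what it currently holds. The following is a small sketch; `highmem_dirtyable_status` is a hypothetical helper, and the optional path argument exists only so it can be tried against a test file.

```shell
#!/bin/sh
# highmem_dirtyable_status [KNOB_PATH]
# Hypothetical helper: prints the current value of vm.highmem_is_dirtyable,
# or a note if the knob is absent (as on 64-bit kernels).
highmem_dirtyable_status() {
    knob="${1:-/proc/sys/vm/highmem_is_dirtyable}"
    if [ -r "$knob" ]; then
        echo "vm.highmem_is_dirtyable = $(cat "$knob")"
    else
        echo "vm.highmem_is_dirtyable is not available on this kernel"
    fi
}
```

Note that `sysctl -w` does not survive a reboot. If the setting helps, a line reading `vm.highmem_is_dirtyable = 1` in a file under `/etc/sysctl.d/` (for example a hypothetically named `99-highmem.conf`) should reapply it at boot.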

Here's a nice writeup on the problem: Extreme I/O slowdown with PAE kernel

Solution 3

With an SSD, you should definitely see speeds of at least 100 MiB/s. Usually more.

For the second part of your question: yes, you can duplicate the exact contents of your installation to a new disk.

To diagnose your slow disk, you can try a couple of things in addition to what others have suggested:

  • see dmesg for any messages about timeouts regarding sda/sdb and similar
  • change the cable
  • try the disk in a different computer
  • boot a different distro (e.g. from a live USB) and check disk speed with a different kernel
  • try a different SSD on the same cable, same port and same distro
  • run smartctl -a to see if it reports anything suspicious
  • try a blkdiscard or full erase (some older/low quality SSDs might still suffer badly due to performance degradation)
  • see if you can update SSD firmware
  • try the manufacturer diagnostic tools
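
The first check in the list above can be sketched as a small filter over the kernel log. `disk_log_problems` is a hypothetical helper, and the keyword list is an assumption about what typical libata/SCSI trouble messages contain rather than an exhaustive set.

```shell
#!/bin/sh
# disk_log_problems
# Hypothetical helper: reads kernel log lines on stdin and keeps those that
# mention an ATA port or sdX device together with a trouble keyword.
disk_log_problems() {
    grep -iE '(ata[0-9]+|sd[a-z])' | grep -iE '(timeout|timed out|error|reset|failed)'
}
```

Typical use would be `dmesg | disk_log_problems`, followed by `sudo smartctl -a /dev/sda` for the SMART report. No output from the filter only means none of these keywords appeared, not that the disk is healthy.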

Author: ardnew

Updated on September 18, 2022

Comments

  • ardnew
    ardnew over 1 year

    I recently installed Linux Mint 18.2 on a brand new PC containing 2 SanDisk SSDs. The installation completed without error.

    After setting up all of the software needed on this new PC, I've noticed the performance of the disk is absurdly slow -- the write speeds in particular. For example, dd is showing about 3.5MB/s on average:

    $ dd bs=1M count=256 if=/dev/zero of=test conv=fdatasync
    256+0 records in
    256+0 records out
    268435456 bytes (268 MB, 256 MiB) copied, 61.9192 s, 4.3 MB/s
    
    $ dd bs=1M count=256 if=/dev/urandom of=test conv=fdatasync
    256+0 records in
    256+0 records out
    268435456 bytes (268 MB, 256 MiB) copied, 86.7794 s, 3.1 MB/s
    

    I'm not sure where to even begin looking to diagnose or fix this. It is rendering this otherwise very fast PC almost unusable.

    The filesystem installed is ext4, and is mounted with options noatime,errors=remount-ro in /etc/fstab.

    I've tried lowering/disabling Advanced Power Management (hdparm -B 254 /dev/sda) and I've also manually run TRIM on the root fs (fstrim /). Neither of these seemed to make any difference.

    I'm pretty confident it shouldn't take 15 sec to copy a 25MiB file onto the same disk and partition as the original file itself:

    $ stat test
      File: 'test'
      Size: 26214400    Blocks: 51200      IO Block: 4096   regular file
    Device: 801h/2049d  Inode: 10880436    Links: 1
    Access: (0664/-rw-rw-r--)  Uid: ( 1000/     aps)   Gid: ( 1000/     aps)
    Access: 2017-10-14 15:48:24.460658106 -0500
    Modify: 2017-10-14 15:37:22.577357279 -0500
    Change: 2017-10-14 15:37:22.577357279 -0500
     Birth: -
    
    $ \time -v cp test test.out
      Command being timed: "cp test test.out"
      User time (seconds): 0.03
      System time (seconds): 0.00
      Percent of CPU this job got: 0%
      Elapsed (wall clock) time (h:mm:ss or m:ss): 0:14.36
      Average shared text size (kbytes): 0
      Average unshared data size (kbytes): 0
      Average stack size (kbytes): 0
      Average total size (kbytes): 0
      Maximum resident set size (kbytes): 2188
      Average resident set size (kbytes): 0
      Major (requiring I/O) page faults: 0
      Minor (reclaiming a frame) page faults: 134
      Voluntary context switches: 1452
      Involuntary context switches: 1
      Swaps: 0
      File system inputs: 0
      File system outputs: 51200
      Socket messages sent: 0
      Socket messages received: 0
      Signals delivered: 0
      Page size (bytes): 4096
      Exit status: 0
    

    hdparm shows the following details:

    $ sudo hdparm -I /dev/sda
    
    /dev/sda:
    
    ATA device, with non-removable media
      Model Number:       SanDisk SD8TB8U512G1001                 
      Serial Number:      165125801567        
      Firmware Revision:  X4133101
      Media Serial Num:   
      Media Manufacturer: 
      Transport:          Serial, ATA8-AST, SATA 1.0a, SATA II Extensions, SATA Rev 2.5, SATA Rev 2.6, SATA Rev 3.0
    Standards:
      Used: unknown (minor revision code 0x0110) 
      Supported: 9 8 7 6 5 
      Likely used: 9
    Configuration:
      Logical   max current
      cylinders 16383 0
      heads   16  0
      sectors/track 63  0
      --
      LBA    user addressable sectors:  268435455
      LBA48  user addressable sectors: 1000215216
      Logical  Sector size:                   512 bytes
      Physical Sector size:                   512 bytes
      Logical Sector-0 offset:                  0 bytes
      device size with M = 1024*1024:      488386 MBytes
      device size with M = 1000*1000:      512110 MBytes (512 GB)
      cache/buffer size  = unknown
      Form Factor: 2.5 inch
      Nominal Media Rotation Rate: Solid State Device
    Capabilities:
      LBA, IORDY(can be disabled)
      Queue depth: 32
      Standby timer values: spec'd by Standard, no device specific minimum
      R/W multiple sector transfer: Max = 1 Current = 1
      Advanced power management level: 254
      DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6 
           Cycle time: min=120ns recommended=120ns
      PIO: pio0 pio1 pio2 pio3 pio4 
           Cycle time: no flow control=120ns  IORDY flow control=120ns
    Commands/features:
      Enabled Supported:
         *  SMART feature set
            Security Mode feature set
         *  Power Management feature set
         *  Write cache
         *  Look-ahead
         *  WRITE_BUFFER command
         *  READ_BUFFER command
         *  DOWNLOAD_MICROCODE
         *  Advanced Power Management feature set
            SET_MAX security extension
         *  48-bit Address feature set
         *  Mandatory FLUSH_CACHE
         *  FLUSH_CACHE_EXT
         *  SMART error logging
         *  SMART self-test
         *  General Purpose Logging feature set
         *  64-bit World wide name
         *  WRITE_UNCORRECTABLE_EXT command
         *  {READ,WRITE}_DMA_EXT_GPL commands
         *  Segmented DOWNLOAD_MICROCODE
            unknown 119[8]
         *  Gen1 signaling speed (1.5Gb/s)
         *  Gen2 signaling speed (3.0Gb/s)
         *  Gen3 signaling speed (6.0Gb/s)
         *  Native Command Queueing (NCQ)
         *  Phy event counters
         *  READ_LOG_DMA_EXT equivalent to READ_LOG_EXT
         *  DMA Setup Auto-Activate optimization
            Device-initiated interface power management
         *  Software settings preservation
            Device Sleep (DEVSLP)
         *  reserved 69[4]
         *  DOWNLOAD MICROCODE DMA command
         *  WRITE BUFFER DMA command
         *  READ BUFFER DMA command
         *  Data Set Management TRIM supported (limit 8 blocks)
         *  Deterministic read ZEROs after TRIM
    Security: 
      Master password revision code = 1
        supported
      not enabled
      not locked
        frozen
      not expired: security count
        supported: enhanced erase
      2min for SECURITY ERASE UNIT. 2min for ENHANCED SECURITY ERASE UNIT. 
    Logical Unit WWN Device Identifier: 5001b448b48ce759
      NAA   : 5
      IEEE OUI  : 001b44
      Unique ID : 8b48ce759
    Device Sleep:
      DEVSLP Exit Timeout (DETO): 30 ms (drive)
      Minimum DEVSLP Assertion Time (MDAT): 30 ms (drive)
    Checksum: correct
    

    The second SSD installed in the PC is currently unformatted. Would it be possible to prepare that other disk in a way that will solve this problem, while retaining this current Mint's user-space installation/configuration copied onto it? It will be a nightmare to have to re-install all of the software that has been setup on this slow disk.


    Edit

    System specs from inxi:

    $ inxi -Fz
    System:    Host: dapadev Kernel: 4.8.0-53-generic i686 (32 bit) Desktop: Cinnamon 3.4.3
               Distro: Linux Mint 18.2 Sonya
    Machine:   System: LENOVO product: 30B7000YUS v: ThinkStation P710
               Mobo: LENOVO model: 1030 v: SBB0J05441 WIN 3305058809791 Bios: LENOVO v: S01KT40A date: 05/04/2017
    CPU:       Octa core Intel Xeon E5-2620 v4 (-HT-MCP-) cache: 20480 KB 
               clock speeds: max: 3000 MHz 1: 2099 MHz 2: 2100 MHz 3: 2120 MHz 4: 2100 MHz 5: 2099 MHz 6: 2100 MHz
               7: 2299 MHz 8: 2100 MHz
    Graphics:  Card: NVIDIA GK107GL [Quadro K420]
               Display Server: X.Org 1.18.4 drivers: nvidia (unloaded: fbdev,vesa,nouveau)
               Resolution: [email protected]
               GLX Renderer: Quadro K420/PCIe/SSE2 GLX Version: 4.5.0 NVIDIA 375.66
    Audio:     Card-1 NVIDIA GK107 HDMI Audio Controller driver: snd_hda_intel Sound: ALSA v: k4.8.0-53-generic
               Card-2 Intel C610/X99 series HD Audio Controller driver: snd_hda_intel
    Network:   Card-1: Intel Ethernet Connection (2) I218-LM driver: e1000e
               IF: eth0 state: up speed: 1000 Mbps duplex: full mac: <filter>
               Card-2: Intel I210 Gigabit Network Connection driver: igb
               IF: eth1 state: down mac: <filter>
               Card-3: Intel I210 Gigabit Network Connection driver: igb
               IF: eth2 state: up speed: 100 Mbps duplex: full mac: <filter>
    Drives:    HDD Total Size: 2024.4GB (2.1% used) ID-1: /dev/sda model: SanDisk_SD8TB8U5 size: 512.1GB
               ID-2: /dev/sdb model: SanDisk_SD8TB8U5 size: 512.1GB ID-3: /dev/sdc model: ST1000DM003 size: 1000.2GB
    Partition: ID-1: / size: 438G used: 8.9G (3%) fs: ext4 dev: /dev/sda1
               ID-2: swap-1 size: 34.24GB used: 0.00GB (0%) fs: swap dev: /dev/sda5
    RAID:      No RAID devices: /proc/mdstat, md_mod kernel module present
    Sensors:   System Temperatures: cpu: 43.0C mobo: N/A gpu: 53C
               Fan Speeds (in rpm): cpu: N/A
    Info:      Processes: 275 Uptime: 3 days Memory: 1887.9/32323.8MB Client: Shell (bash) inxi: 2.2.35
    

    Drive details from lsblk:

    $ lsblk -Sfalt
    NAME HCTL       TYPE VENDOR   MODEL             REV TRAN   NAME FSTYPE LABEL UUID MOUNTPOINT NAME ALIGNMENT MIN-IO OPT-IO PHY-SEC LOG-SEC ROTA SCHED    RQ-SIZE  RA WSAME
    sdb  3:0:0:0    disk ATA      SanDisk SD8TB8U5 3101 sata   sdb                               sdb          0    512      0     512     512    0 deadline     128 128    0B
    sr0  6:0:0:0    rom  PLDS     DVD-RW DH16AFSH  DL3M sata   sr0                               sr0          0    512      0     512     512    1 deadline     128 128    0B
    sdc  4:0:0:0    disk ATA      ST1000DM003-1SB1 CC62 sata   sdc                               sdc          0   4096      0    4096     512    1 deadline     128 128    0B
    sda  2:0:0:0    disk ATA      SanDisk SD8TB8U5 3101 sata   sda                               sda          0    512      0     512     512    0 deadline     128 128    0B
    

    Drive details from smartctl:

    $ sudo smartctl -a /dev/sda
    smartctl 6.5 2016-01-24 r4214 [i686-linux-4.8.0-53-generic] (local build)
    Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
    
    === START OF INFORMATION SECTION ===
    Device Model:     SanDisk SD8TB8U512G1001
    Serial Number:    165125801567
    LU WWN Device Id: 5 001b44 8b48ce759
    Firmware Version: X4133101
    User Capacity:    512,110,190,592 bytes [512 GB]
    Sector Size:      512 bytes logical/physical
    Rotation Rate:    Solid State Device
    Form Factor:      2.5 inches
    Device is:        Not in smartctl database [for details use: -P showall]
    ATA Version is:   ACS-2 T13/2015-D revision 3
    SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
    Local Time is:    Sat Oct 14 16:05:13 2017 CDT
    SMART support is: Available - device has SMART capability.
    SMART support is: Enabled
    
    === START OF READ SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED
    
    General SMART Values:
    Offline data collection status:  (0x00) Offline data collection activity
              was never started.
              Auto Offline Data Collection: Disabled.
    Self-test execution status:      (  16) The self-test routine was aborted by
              the host.
    Total time to complete Offline 
    data collection:    (    0) seconds.
    Offline data collection
    capabilities:        (0x11) SMART execute Offline immediate.
              No Auto Offline data collection support.
              Suspend Offline collection upon new
              command.
              No Offline surface scan supported.
              Self-test supported.
              No Conveyance Self-test supported.
              No Selective Self-test supported.
    SMART capabilities:            (0x0003) Saves SMART data before entering
              power-saving mode.
              Supports SMART auto save timer.
    Error logging capability:        (0x01) Error logging supported.
              General Purpose Logging supported.
    Short self-test routine 
    recommended polling time:    (   2) minutes.
    Extended self-test routine
    recommended polling time:    (  10) minutes.
    
    SMART Attributes Data Structure revision number: 4
    Vendor Specific SMART Attributes with Thresholds:
    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
      5 Reallocated_Sector_Ct   0x0032   100   100   ---    Old_age   Always       -       0
      9 Power_On_Hours          0x0032   100   100   ---    Old_age   Always       -       519
     12 Power_Cycle_Count       0x0032   100   100   ---    Old_age   Always       -       309
    170 Unknown_Attribute       0x0032   100   100   ---    Old_age   Always       -       0
    171 Unknown_Attribute       0x0032   100   100   ---    Old_age   Always       -       0
    172 Unknown_Attribute       0x0032   100   100   ---    Old_age   Always       -       0
    173 Unknown_Attribute       0x0032   100   100   ---    Old_age   Always       -       0
    174 Unknown_Attribute       0x0032   100   100   ---    Old_age   Always       -       48
    178 Used_Rsvd_Blk_Cnt_Chip  0x0032   100   100   ---    Old_age   Always       -       0
    180 Unused_Rsvd_Blk_Cnt_Tot 0x0033   100   100   010    Pre-fail  Always       -       100
    184 End-to-End_Error        0x0033   100   100   097    Pre-fail  Always       -       0
    187 Reported_Uncorrect      0x0032   100   100   ---    Old_age   Always       -       0
    194 Temperature_Celsius     0x0022   068   033   ---    Old_age   Always       -       32 (Min/Max 23/33)
    199 UDMA_CRC_Error_Count    0x0032   100   100   ---    Old_age   Always       -       0
    233 Media_Wearout_Indicator 0x0033   100   100   001    Pre-fail  Always       -       16772743
    234 Unknown_Attribute       0x0032   100   100   ---    Old_age   Always       -       90
    241 Total_LBAs_Written      0x0030   253   253   ---    Old_age   Offline      -       101
    242 Total_LBAs_Read         0x0030   253   253   ---    Old_age   Offline      -       13
    249 Unknown_Attribute       0x0032   100   100   ---    Old_age   Always       -       40
    
    SMART Error Log Version: 1
    No Errors Logged
    
    SMART Self-test log structure revision number 1
    Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
    # 1  Short offline       Aborted by host               00%       519         -
    
    Selective Self-tests/Logging not supported
    
    • spikey_richie
      spikey_richie over 6 years
      I've no idea what the answer is, but I've up voted this question as it's a strong question! Does Linux care about AHCI like Windows does?
    • ardnew
      ardnew over 6 years
      @spikey_richie yes, the disks are using the ahci drivers in kernel 4.8.0-53-generic (default with Mint 18.2)
    • xenoid
      xenoid over 6 years
      Last time I had a disk that slow, hdparm was showing that it was used without DMA, but that doesn't appear to be the case here. The $64K question is whether this is a problem with write rate (i.e., time roughly proportional to amount written), or with first access (long time even for smaller files). And yes, on my SSD your tests report 400+MB/s :)
    • ejbytes
      ejbytes over 6 years
      I remember doing a lot of Linux stuff for college as a CS major, where using Linux was mandatory. However, in my real life I don't use it at all. But I recall that one time I used a laptop to do just this, and during installation I used the incorrect bootloader location. It's a distant bad memory, but it's worth doing a re-install and taking your time on it. Something about the correct partition... Good luck with that!
    • ejbytes
      ejbytes over 6 years
      Oh! One more thing! I was at Best Buy and asking the "Tech" about what type of hard drive was used for the Surface Pro (I was new to the whole buzz)... and he said "SSD blah blah blah". I was like, "Oh... hmmmm", "What's wrong?", he asked. I said, "I don't like those, although they're fast... they're write once." He was like, "Uh no... blah blah". Then he googled for like 10 minutes.. "Oh.. well I guess you are right. Hmmm". So yeah, if it's an old (or used a bit) SSD the "space" could be not "spacially" good for an OS install. If you follow. Research SSD, it really is write once.
    • ardnew
      ardnew over 6 years
      @ejbytes i'm not really sure where to begin responding. first, the disks are slow whether i boot off of them or not (which i am doing with one of them), so it's clearly and entirely unrelated to the bootloader. second, the disks are slow whether i install an OS on them or not, so i also do not believe it's a matter of them being "spacially good for an OS install" -- though admittedly, i have no idea what that means
    • jurez
      jurez over 6 years
      @ejbytes SSDs are not write-once. They use flash memory (usually NAND) which can be erased by applying a higher voltage to it. From the computer's perspective, this is completely transparent and they behave just like regular hard disks. Erasing wears them down just a little bit, but their total life expectancy is on par with mechanical disks too.
    • ejbytes
      ejbytes over 6 years
      @jurez I can read Google and "Selling Ads" too. You can believe the hype if you want, it's a free country. Well, I don't know where you live, so... and overprovisioning is a thing that exists under the radar to amp up its reliability; well, that makes you think it's more reliable because it's neither obvious nor visible. Think about this one. Go back 40 years. Vinyl records are WAY superior to ANY digital media. The same can be said about magnetic tape. Drink the Kool-Aid, it's cool, man.
    • jurez
      jurez over 6 years
      @ejbytes Then why not try a PC with an SSD once? You won't get cancer if you try to delete a file. In fact, the file will be deleted.
  • ardnew
    ardnew over 6 years
    i wasn't seeing any errors from dmesg on the mint install, but i've now installed manjaro linux where the disks are performing wonderfully. unfortunately 32-bit manjaro does not have (or support) PAE, and with 32GB RAM (and a 32-bit cross-compiler toolchain requirement), it's kind of a deal breaker. so i recompiled the manjaro kernel with PAE, have full 32GB RAM available, but now the disk slowdown is back. could this possibly be related or merely coincidence? we don't really have the option of swapping out hardware components at the moment for further diagnosis from that angle
  • ardnew
    ardnew over 6 years
    i've also considered the possibility that it's a problem with our power supply. the PC is plugged into a probably-overloaded UPS, so i'll try a more reliable power source to see if that helps at all. finally, i'll try installing a 64-bit OS and creating a 32-bit library to use for the cross-compiler tools -- this will be the most time-consuming fix, and it isn't guaranteed to resolve it. ANYWAY -- i'd hate to see that bounty go to waste, so it's all yours. but i'll reserve marking it as an answer for now to encourage more feedback. thanks
  • jurez
    jurez over 6 years
    A few more things to try: update the BIOS, check disk settings and run a thorough memory test. You should try to collect as many data points as possible by experimenting with different combinations. From what you've described, it looks like either a hardware problem or a kernel bug. Cable problems are common, but often overlooked as well. You can (and should) ask this question on the kernel mailing list. In the long run, I'd recommend you focus on a 64-bit OS and use a 32-bit VM when needed.
  • jurez
    jurez over 6 years
    @ardnew Post your system's detailed specs.
  • ardnew
    ardnew over 6 years
    edited question with full specs
  • Tim_Stewart
    Tim_Stewart about 3 years
    This community is a suck with low rep/new users. I'll try to keep an eye on your contributions to make sure you get upvotes on decent quality answers.