How can I test to see if there is something wrong with my SSD?
Solution 1
Edit: I forgot that not everyone would know this: SMART normalized data counts down, not up!
The specific command you're likely looking for is:
# smartctl -a /dev/sda | grep Media_Wearout_Indicator
The lower this value gets, the more likely you are to run into issues. As an aside, I would recommend considering replacing your drive once it drops to:
50% - mission-critical drives (things that, for reasons beyond the scope here, NEED to be accessible no matter what)
30% - your /home drive (your movies/music/personal files, things you care about having at hand)
20% - everything else (drives only brought online for backups before being committed to cold storage, drives that hold OSes you only use occasionally, etc.)
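For reference, the attribute name varies by vendor (Media_Wearout_Indicator is an Intel name; the Samsung drive in the question reports Wear_Leveling_Count instead), and the number that counts down is the normalized VALUE column, not the raw value. A minimal sketch of pulling that column out, using a sample line from the question's output so it runs without root:

```shell
# Sample attribute line taken from the question's smartctl output; on a real
# system you would pipe `sudo smartctl -A /dev/sda` through the same awk.
sample='177 Wear_Leveling_Count     0x0013   098   098   000    Pre-fail  Always       -       117'
# Column 4 is the normalized VALUE; "+ 0" strips the leading zeros.
value=$(echo "$sample" | awk '{ print $4 + 0 }')
echo "normalized wear value: ${value} (counts down from 100)"
```

On this sample the normalized value is 98, i.e. still far above the replacement thresholds suggested above.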
Solution 2
Install GNOME Disk Utility and check the SMART data, in particular the wear-leveling-count attribute or any similar one.
The higher the reported percentage, the more worn your SSD is, which means you are more likely to encounter problems.
Install using:
apt-get install gnome-disk-utility
Launch via command line
sudo palimpsest
or via the application menu under the name Disk Utility.
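If you prefer to stay on the command line, smartmontools can run the drive's built-in self-tests directly; the GUI triggers the same tests. A sketch, where the device path /dev/sda is an assumption and DRYRUN=1 only prints the commands so the snippet is safe to execute as-is:

```shell
# Print the smartctl commands instead of running them while DRYRUN=1
# (unset it to really run them; requires root and smartmontools).
DRYRUN=1
run() {
    if [ "${DRYRUN:-0}" = "1" ]; then
        echo "would run: sudo $*"
    else
        sudo "$@"
    fi
}
run smartctl -t short /dev/sda      # start the short self-test (~2 min on this drive)
run smartctl -l selftest /dev/sda   # after it finishes, read the self-test log
```

The short test takes about two minutes on this drive according to its own smartctl output; the extended test is quoted at 272 minutes.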
casolorz
Updated on September 18, 2022

Comments
- casolorz over 1 year
So a few weeks ago I updated to 17.10, and at pretty much the same time I updated to Android Studio 3; it was probably a mistake updating both, as now I don't know where the problem lies.
Basically it seems like disk IO has gotten really bad. At first I noticed I was swapping, so I doubled my RAM (32 GB now) and I'm never swapping anymore. But the machine still pretty much freezes when disk IO happens. By freezes I mean it gets really slow, to the point that I can type and not see what I'm typing for a few seconds; often I'll get a long string of one key when that happens.
When I go to commit my code, Android Studio will do an analysis of the code and the UI just freezes while it does that. Takes a few seconds. None of these issues used to happen before updating both things.
Also, when the cloud station backup runs to my NAS, it gets ridiculously slow.
I have a Samsung SSD 850 PRO 512GB SSD. So what can I run to see what the issue is?
Thanks.
Edit:
Smartctl output:
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.13.0-16-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Samsung based SSDs
Device Model:     Samsung SSD 850 PRO 512GB
Serial Number:    S250NSAG809789J
LU WWN Device Id: 5 002538 8a0af305f
Firmware Version: EXM02B6Q
User Capacity:    512,110,190,592 bytes [512 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4c
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Tue Nov 28 16:22:20 2017 CST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (    0) seconds.
Offline data collection
capabilities:                    (0x53) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 272) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       23126
 12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       75
177 Wear_Leveling_Count     0x0013   098   098   000    Pre-fail  Always       -       117
179 Used_Rsvd_Blk_Cnt_Tot   0x0013   100   100   010    Pre-fail  Always       -       0
181 Program_Fail_Cnt_Total  0x0032   100   100   010    Old_age   Always       -       0
182 Erase_Fail_Count_Total  0x0032   100   100   010    Old_age   Always       -       0
183 Runtime_Bad_Block       0x0013   100   100   010    Pre-fail  Always       -       0
187 Uncorrectable_Error_Cnt 0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0032   070   057   000    Old_age   Always       -       30
195 ECC_Error_Rate          0x001a   200   200   000    Old_age   Always       -       0
199 CRC_Error_Count         0x003e   100   100   000    Old_age   Always       -       0
235 POR_Recovery_Count      0x0012   099   099   000    Old_age   Always       -       34
241 Total_LBAs_Written      0x0032   099   099   000    Old_age   Always       -       37060089586

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
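As a side note, the Total_LBAs_Written raw value in the output above can be turned into a rough total-bytes-written figure, since the same output reports a 512-byte sector size. A sketch of that arithmetic:

```shell
# Total_LBAs_Written raw value, copied from the smartctl output above.
lbas=37060089586
# Each LBA is 512 bytes (per the "Sector Size" line); convert to terabytes.
tb=$(awk -v l="$lbas" 'BEGIN { printf "%.1f", l * 512 / 1e12 }')
echo "total written: ${tb} TB"
```

That works out to roughly 19 TB written over the drive's 23126 power-on hours, which is consistent with the wear attribute still sitting at 98.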
df -i output:
Filesystem            Inodes   IUsed    IFree IUse% Mounted on
udev                 4096227     613  4095614    1% /dev
tmpfs                4111096    1024  4110072    1% /run
/dev/sda1           29908992 4301747 25607245   15% /
tmpfs                4111096     524  4110572    1% /dev/shm
tmpfs                4111096       5  4111091    1% /run/lock
tmpfs                4111096      18  4111078    1% /sys/fs/cgroup
tmpfs                4111096      17  4111079    1% /run/user/122
tmpfs                4111096     458  4110638    1% /run/user/1000
/home/mydir/.Private 29908992 4301747 25607245  15% /home/mydir
- Sergiy Kolodyazhnyy over 6 years: Run the smartctl command and post the output; it might reveal something about your SSD health. Also check the inode count with df -i. Swap and RAM usage might be worth checking too for IO stuff.
- casolorz over 6 years: Added the output. Thanks. According to top and the GUI monitor app, I have zero swap usage since adding more RAM.
- casolorz over 6 years: I thought pre-fail was just the type of statistic, not the status?
- Sergiy Kolodyazhnyy over 6 years: Yes, the TYPE column only indicates the type of statistic. The VALUE column is supposed to be compared with the THRESH column, and it's bad only when VALUE gets low enough that it approaches THRESH. In your case it's alright. I only looked through it briefly last time, on a phone, so my previous comment was wrong.
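The comparison Sergiy describes can be sketched mechanically: an attribute has failed only when its normalized VALUE has dropped to or below THRESH, regardless of the Pre-fail/Old_age label. Using two attribute lines from the question's output as sample input:

```shell
# Two attribute lines copied from the question's smartctl output.
sample='  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
177 Wear_Leveling_Count     0x0013   098   098   000    Pre-fail  Always       -       117'
# VALUE is column 4, THRESH is column 6; flag only when VALUE <= THRESH.
result=$(echo "$sample" | awk '{
    value = $4 + 0; thresh = $6 + 0
    printf "%s: value=%d thresh=%d -> %s\n", $2, value, thresh, (value <= thresh) ? "FAILING" : "ok"
}')
echo "$result"
```

Both sample attributes come out "ok", matching Sergiy's reading that this drive is fine.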
- thecarpy over 6 years: Can you elaborate on "Also, when the cloud station backup runs to my NAS, it gets ridiculously slow"? And why did you run df -i? Inodes are not really useful in this case; please provide df -h or df instead. It might be that you are running out of space (less than, say, 15% available) on the drive, which causes slowdowns...
- casolorz over 6 years: Cloud Station backup is a backup utility from the company that makes my NAS.
- casolorz over 6 years: Sorry, hit enter before finishing the reply. I think I am down to 20% or so; I'll have to check tonight when I'm near the computer.
- casolorz over 6 years: Wear leveling count is 118. How do I convert that to a percentage?
- casolorz over 6 years: Are you talking about this one: Wear_Leveling_Count 0x0013 098 098 000 Pre-fail Always - 118? I don't know what 118 means; that can't possibly be the percentage, can it? It can't go above 100, right?
- Fabby over 6 years: @casolorz the above is the correct command, and 118 is not the %, but 98 is, and you're in "pre-fail" status: back up right now. I mean like: STOP USING THE DISK and do a ddrescue right now onto a bigger disk, because SSDs die suddenly without warning.
- casolorz over 6 years: Pre-Fail is the type, not the status. So 98 sounds really bad, but on another post I read it sounded like the number goes backwards. You can see my SMART output in my post.
- Gartral over 6 years: OK, after my brain derped earlier I found it prudent to update my answer and leave information gleaned from z-a-recovery.com/manual/smart.aspx there.
- Gartral over 6 years: @casolorz your normalized count is 98%, meaning that your drive is in near-perfect condition! YAY!
- casolorz over 6 years: Any more ideas? I guess maybe I just need to clear some space on the drive.
- Elder Geek over 6 years: @casolorz 118 is the raw data value (typically in hexadecimal). You don't convert it to a percentage.
- Gartral over 6 years: @casolorz do yourself a favor and check iotop or nmon (I prefer the latter); this will let you identify exactly which processes are eating your disk IO and causing the thrashing.