zpool status reports error ... what next?
Solution 1
Run zpool clear raid2 to reset the error counters, then start a scrub with zpool scrub raid2.
If the errors persist following that, replace the disk.
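As a rough sketch, the clear-then-scrub sequence could be wrapped in a small POSIX shell function. The name clear_and_scrub is made up for illustration; raid2 is the pool from the question. Keep in mind that zpool scrub returns immediately and the scrub itself runs in the background, so the status printed at the end shows progress, not the final result.

```shell
#!/bin/sh
# Hypothetical helper: reset the error counters, start a scrub, show status.
clear_and_scrub() {
    pool="$1"
    zpool clear "$pool" || return 1   # reset the READ/WRITE/CKSUM counters
    zpool scrub "$pool" || return 1   # starts the scrub in the background
    zpool status -v "$pool"           # run again later to see the outcome
}

# Usage (as root): clear_and_scrub raid2
```

If the counters climb again once the scrub has finished, that is the point at which replacing the disk becomes the likely next step.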
More details about the hardware would help; without them, this is generic advice. My recommendation for a bunch of consumer disks connected to a PC motherboard is different from what I'd do for enterprise-level gear.
Solution 2
The tool tells you what you need to do: "Determine if the device needs to be replaced".
The tools are only so intelligent and need you, as the human administrator, to figure some things out. The steps required are specific to your hardware and your setup, so you will need to make some decisions based on your knowledge of the system.
Take a look at the output from the command. It looks like device gptid/5fe33556-3ff2-11e2-9437-f46d049aaeca is experiencing WRITE errors (plus a handful of READ errors). A count of 1.13M is very high, and I suspect the problem has been occurring for a while without you noticing. See if you can figure out why and then replace the disk.
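If you would rather not eyeball the table, something like the following could pick out the offending rows. This is only a sketch: the function name failing_devices is invented here, and it assumes the usual NAME STATE READ WRITE CKSUM column layout shown in the question. It reads zpool status output on stdin and prints every row in the config section whose counters are not all zero.

```shell
#!/bin/sh
# Hypothetical helper: list devices with non-zero READ, WRITE or CKSUM counts.
# Expects `zpool status` output on stdin.
failing_devices() {
    awk '
        $1 == "NAME" && $3 == "READ" { in_config = 1; next }  # table header
        /^errors:/                   { in_config = 0 }        # end of table
        in_config && NF >= 5 && ($3 != "0" || $4 != "0" || $5 != "0") {
            print $1, "read=" $3, "write=" $4, "cksum=" $5
        }
    '
}

# Usage: zpool status raid2 | failing_devices
```

The counters are compared as strings because ZFS abbreviates large values (1.13M rather than a plain integer).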
If you have a hardware controller, that controller might have additional tools to help you determine the nature of the failure.
ZFS can deal with corrupt sectors, so there is no need to panic. But don't ignore the problem either.
As a preventative measure, you should also run a ZFS scrub regularly. See http://doc.freenas.org/index.php/ZFS_Scrubs . This will alert you when ZFS first encounters a problem, well before you hit the 1.13M mark.
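On FreeNAS the scrub scheduler described at that link is the normal route. For a box without it, a minimal sketch of a scheduled scrub might look like the function below. The name scrub_if_idle is hypothetical, and the check relies on the phrase "scrub in progress" appearing in zpool status while a scrub is running. The guard matters because zpool scrub reports an error if the pool is already being scrubbed.

```shell
#!/bin/sh
# Hypothetical helper: start a scrub unless one is already running.
scrub_if_idle() {
    pool="$1"
    if zpool status "$pool" | grep -q "scrub in progress"; then
        echo "scrub already running on $pool"
    else
        zpool scrub "$pool"
    fi
}

# Usage (as root, e.g. from a weekly cron job): scrub_if_idle raid2
```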
Solution 3
Use the following command to read a drive's serial number, substituting your own device for /dev/ada0.
[blackout@freenas ~]# smartctl -a /dev/ada0 | grep "Serial"
Serial Number: WD-WCC4EXXXXXXXX
Another helpful command, which lists each gptid label alongside the device it lives on:
[blackout@freenas ~]# glabel status
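Putting the two commands together, a small filter could translate the gptid from zpool status into a disk name that smartctl accepts. This is a sketch under assumptions: gptid_to_disk is an invented name, it expects the three-column Name / Status / Components layout that glabel status prints, and it assumes GPT-style partition names ending in pN (for example ada1p2), which it trims to get the whole disk.

```shell
#!/bin/sh
# Hypothetical helper: print the disk that backs a given gptid label.
# Expects `glabel status` output on stdin; $1 is the label to look up.
gptid_to_disk() {
    awk -v label="$1" '
        $1 == label {
            sub(/p[0-9]+$/, "", $3)   # ada1p2 -> ada1 (strip partition suffix)
            print $3
        }
    '
}

# Usage on FreeBSD/FreeNAS (as root):
#   disk="$(glabel status | gptid_to_disk gptid/5fe33556-3ff2-11e2-9437-f46d049aaeca)"
#   smartctl -a "/dev/$disk" | grep "Serial"
```

The serial number printed by smartctl can then be matched against the label on the physical drive before pulling anything.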
Solution 4
Although the question is old, it might be looked at by other people.
If so, remember that the output of zpool status and zpool status -v reflects all errors experienced, not just those from the disks. That includes errors caused by your motherboard SATA ports (if used), the HBA card (if used), and the SATA cables themselves.
Three quick diagnostic tests: check the disk quickly using smartctl, check that the card is correctly seated and not loose, and try a different port or SATA cable (the cable is a common cause of read/write errors).
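For the smartctl part, one possible quick check is to look at a few attributes in the smartctl -A table. The sketch below (smart_red_flags is a made-up name) prints attributes 5, 197, 198 and 199 when their raw value is non-zero. It assumes the classic ten-column ATA attribute table with RAW_VALUE last; attribute names vary by vendor, so it matches on the ID. A non-zero UDMA_CRC_Error_Count (199) commonly points at the cable or port, while reallocated or pending sectors (5, 197, 198) point at the disk itself.

```shell
#!/bin/sh
# Hypothetical helper: show selected SMART attributes with non-zero raw values.
# Expects `smartctl -A /dev/adaX` output on stdin.
smart_red_flags() {
    awk '$1 ~ /^(5|197|198|199)$/ && $10 != "0" { print $1, $2, $10 }'
}

# Usage (as root): smartctl -A /dev/ada0 | smart_red_flags
```

No output means none of those four counters has moved, which shifts suspicion toward the controller, port or cabling.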
Dan
Updated on September 18, 2022
Comments
-
Dan almost 2 years
On our FreeNAS server, zpool status gives me:

      pool: raid2
     state: ONLINE
    status: One or more devices has experienced an unrecoverable error.  An
            attempt was made to correct the error.  Applications are unaffected.
    action: Determine if the device needs to be replaced, and clear the errors
            using 'zpool clear' or replace the device with 'zpool replace'.
       see: http://www.sun.com/msg/ZFS-8000-9P
     scrub: none requested
    config:

            NAME                                            STATE     READ WRITE CKSUM
            raid2                                           ONLINE       0     0     0
              raidz1                                        ONLINE       0     0     0
                gptid/5f3c0517-3ff2-11e2-9437-f46d049aaeca  ONLINE       0     0     0
                gptid/5fe33556-3ff2-11e2-9437-f46d049aaeca  ONLINE       3 1.13M     0
                gptid/60570005-3ff2-11e2-9437-f46d049aaeca  ONLINE       0     0     0
                gptid/60ebeaa5-3ff2-11e2-9437-f46d049aaeca  ONLINE       0     0     0
                gptid/61925b86-3ff2-11e2-9437-f46d049aaeca  ONLINE       0     0     0

    errors: No known data errors

What should I do? scrub the pool?
-
Dan about 10 years
Uh oh ... after zpool clear raid2, zpool status gave DEGRADED and that disk is UNAVAIL. No point in scrubbing now, right? Need to replace the disk? But ... not sure how to identify it. Is there a way to get the serial number for gptid/5fe33556-3ff2-11e2-9437-f46d049aaeca?
-
ewwhite about 10 years
+1. ZFS is hard.
-
Andreas Mattisson almost 10 years
zdb raid2 will give the GUID for the disk, but I don't think it will give the serial number.