"kernel: Buffer I/O error on device" - Does my server have a hardware problem?
This I/O error message is written to warn about a hardware error with sdb. It could be with the disks or with the cable, for example.
I suppose it is less likely to be an error in the disks themselves, if you have a large number of disks all showing errors at the same time :-). It could be an error in the disk controller.
If you see "Buffer I/O error" but no specific messages about ATA or SCSI error codes, or about retry attempts in general, maybe that gives some hint. But I do not really know :-).
Of course, a software error could cause any messages whatsoever :-).
To give an example of a software error, although I know this is not the same error: I have seen a kernel bug where "Buffer I/O error" was shown, without any error messages about ATA or SCSI or retry attempts. Fedora bug 1553979.
The "Buffer" part just means that it happened during a request for file data which is cacheable in the page cache. For historical reasons, people sometimes call these requests "buffered IO".
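The check described above (do the "Buffer I/O error" lines come with lower-level ATA/SCSI or block-layer messages, or do they appear alone?) can be sketched in Python. This is only a rough illustration, not a diagnostic tool. The first pattern follows the message format quoted in the question; the second pattern is an assumption about what lower-level kernel lines tend to look like, and the function name summarize is made up for this example.

```python
import re
from collections import Counter

# Matches the message format quoted in the question.
BUFFER_ERR = re.compile(r"kernel: Buffer I/O error on device (\S+), logical block")

# ASSUMPTION: rough patterns for lower-level kernel messages (libata ports,
# SCSI disk addresses, block-layer request errors). Adjust them for your
# kernel version and driver; they are not an exhaustive list.
LOW_LEVEL = re.compile(
    r"kernel: .*?(\bata\d+[.:]|\bsd \d+:\d+:\d+:\d+:|blk_update_request|end_request)"
)


def summarize(lines):
    """Count "Buffer I/O error" lines per device and collect lower-level lines."""
    buffer_errors = Counter()
    low_level = []
    for line in lines:
        match = BUFFER_ERR.search(line)
        if match:
            buffer_errors[match.group(1)] += 1
        elif LOW_LEVEL.search(line):
            low_level.append(line.rstrip("\n"))
    return buffer_errors, low_level
```

To try it on a real log, pass an open file, for example summarize(open("/var/log/messages", errors="replace")). If the counter is non-empty for many devices while the second list is empty, that matches the "Buffer I/O error only" situation discussed above. It does not prove where the fault is.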
yael
Updated on September 18, 2022
Comments
-
yael over 1 year
We have a Linux DB server running Red Hat 7.2.
We are seeing many messages like the ones below, for all mounted disks, in /var/log/messages.
We need to understand whether this behavior indicates a hardware problem.
Mar 29 13:28:22 server_DB kernel: Buffer I/O error on device sdb, logical block *N4980*
Mar 29 13:28:22 server_DB kernel: Buffer I/O error on device sdb, logical block *N4981*
Mar 29 13:28:22 server_DB kernel: Buffer I/O error on device sdb, logical block *N4982*
Mar 29 13:28:22 server_DB kernel: Buffer I/O error on device sdb, logical block *N4983*
Mar 29 13:28:22 server_DB kernel: Buffer I/O error on device sdb, logical block *N4984*
Mar 29 13:28:22 server_DB kernel: Buffer I/O error on device sdb, logical block *N4985*
Mar 29 13:28:22 server_DB kernel: Buffer I/O error on device sdb, logical block *N4986*
Mar 29 13:28:22 server_DB kernel: Buffer I/O error on device sdb, logical block *N4987*
Mar 29 13:28:22 server_DB kernel: Buffer I/O error on device sdb, logical block *N4988*
Mar 29 13:28:22 server_DB kernel: Buffer I/O error on device sdb, logical block *N4989*
Mar 29 13:28:22 server_DB kernel: Buffer I/O error on device sdb, logical block *N4990*
Mar 29 13:28:22 server_DB kernel: Buffer I/O error on device sdb, logical block *N4991*
Mar 29 13:28:22 server_DB kernel: Buffer I/O error on device sdb, logical block *N4992*
Mar 29 13:28:22 server_DB kernel: Buffer I/O error on device sdb, logical block *N4993*
Mar 29 13:28:22 server_DB kernel: Buffer I/O error on device sdb, logical block *N4994*
Mar 29 13:28:22 server_DB kernel: Buffer I/O error on device sdb, logical block *N4995*
Mar 29 13:28:22 server_DB kernel: Buffer I/O error on device sdb, logical block *N4996*
Mar 29 13:28:22 server_DB kernel: Buffer I/O error on device sdb, logical block *N4997*
We have also seen these messages:
Mar 27 09:18:08 server_DB smartd[1734]: Monitoring 0 ATA and 26 SCSI devices
Mar 27 09:18:08 server_DB ModemManager[1755]: <warn> Couldn't find support for device at '/sys/devices/pci0000:00/0000:00*CO*/0000:02*CO*': not supported by any plugin
Mar 27 09:18:08 server_DB ModemManager[1755]: <warn> Couldn't find support for device at '/sys/devices/pci0000:00/0000:00*CO*/0000:02*CO*': not supported by any plugin
Mar 27 09:18:08 server_DB ModemManager[1755]: <warn> Couldn't find support for device at '/sys/devices/pci0000:00/0000:00*CO*/0000:01*CO*': not supported by any plugin
Mar 27 09:18:08 server_DB ModemManager[1755]: <warn> Couldn't find support for device at '/sys/devices/pci0000:00/0000:00*CO*/0000:01*CO*': not supported by any plugin
Mar 27 09:18:08 server_DB ModemManager[1755]: <warn> Couldn't find support for device at '/sys/devices/pci0000:80/0000:80*CO*/0000:81*CO*': not supported by any plugin
Mar 27 09:18:08 server_DB ModemManager[1755]: <warn> Couldn't find support for device at '/sys/devices/pci0000:80/0000:80*CO*/0000:81*CO*': not supported by any plugin
I also checked the disk:
smartctl -a -d megaraid,0 /dev/sdb
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.10.0-327.el7.x86_64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor: SEAGATE
Product: ST600MM0238
Revision: BS04
User Capacity: 600,127,266,816 bytes [600 GB]
Logical block size: 512 bytes
Formatted with type 2 protection
Logical block provisioning type unreported, LBPME=0, LBPRZ=0
Rotation Rate: 10000 rpm
Form Factor: 2.5 inches
Logical Unit id: 0x5000c500a0f28343
Serial number: W0M0LYD2
Device type: disk
Transport protocol: SAS
Local Time is: Wed Mar 27 10:51:30 2019 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Temperature Warning: Disabled or Not Supported

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK
Current Drive Temperature: 24 C
Drive Trip Temperature: 60 C
Manufactured in week 45 of year 2017
Specified cycle count over device lifetime: 10000
Accumulated start-stop cycles: 50
Specified load-unload count over device lifetime: 300000
Accumulated load-unload cycles: 177
Elements in grown defect list: 0

Vendor (Seagate) cache information
Blocks sent to initiator = 412242328
Blocks received from initiator = 3213595579
Blocks read from cache and sent to initiator = 312462212
Number of read and write commands whose size <= segment size = 31915885
Number of read and write commands whose size > segment size = 0

Vendor (Seagate/Hitachi) factory information
number of hours powered up = 3178.45
number of minutes until next internal SMART test = 12
-
Admin about 5 years
When you say "many message as below about all disks that are mounted", do you mean you're seeing error messages about not just sdb but other disks as well?
-
Admin about 5 years
Is sdb a hard disk or a DVD?
-
Admin about 5 years
Yes, I mean other disks as well, and sdb is a hard disk, not a DVD.
-
Admin about 5 years
Be aware, this question is quite broad, because of how few details it includes. It might not work well on this site. The preset reasons for closing questions include both "too broad" and "Primarily opinion-based". You are asking about errors which might be in hardware or drivers, but you have not specified what the hardware is (and what the relevant driver is). It is also good to mention the specific kernel version that you saw the errors with. Also, if you can write a good question about this, ServerFault.com might know more about e.g. the hardware and drivers used on servers.
-
Admin about 5 years
I have also added details about the disk. The output is the same for the other disks. I hope the extra details help.
-
Admin about 5 years
@yael Which kernel version? What is the disk controller called? What is the driver for the controller?
-
Admin about 5 years
Two suggestions of ways to find drivers here: unix.stackexchange.com/questions/15274/…
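For the questions just above (kernel version, controller, driver), here is a small Python sketch of one possible approach. It is not taken from the linked page. It assumes the usual Linux sysfs layout, where /sys/block/<name>/device points into the device tree and each bound device directory carries a "driver" symlink. The function name drivers_for_block_device and its sysfs_root parameter are made up for this example.

```python
import os
import platform


def drivers_for_block_device(name, sysfs_root="/sys"):
    """Walk up from <sysfs_root>/block/<name>/device and collect driver names.

    ASSUMPTION: standard sysfs layout, where a bound device directory
    contains a "driver" symlink pointing at its driver.
    """
    root = os.path.realpath(sysfs_root)
    path = os.path.realpath(os.path.join(sysfs_root, "block", name, "device"))
    found = []
    # Stop once we reach the sysfs root (or leave it).
    while path != root and path.startswith(root + os.sep):
        link = os.path.join(path, "driver")
        if os.path.islink(link):
            found.append(os.path.basename(os.readlink(link)))
        path = os.path.dirname(path)
    return found


if __name__ == "__main__":
    # Kernel release string, the same value "uname -r" reports on Linux.
    print("kernel:", platform.release())
    print("drivers for sdb:", drivers_for_block_device("sdb"))
```

The list is ordered from the disk upwards, so the first entry is normally the disk-level driver and later entries belong to the controller and the buses above it. An empty list just means nothing was found at that path.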
-
-
yael about 5 years
Can I ask a small question: if we reinstall the OS from scratch and then the application, could that help? I mean, could the problem be at the OS level, or can we be sure it is hardware?
-
user2948306 about 5 years
@yael It could be a driver error. It could also be another rare error in the core (different from the one I linked to).
-
yael about 5 years
Yes, it is strange: most of the disks show the same error as sdb.
-
user2948306 about 5 years
@yael If the only error you see on these disks is "Buffer I/O error", and it does not show a specific error about SCSI (or ATA, or generally retrying), maybe that says something. I really don't know, all I am saying is that often I have seen them together in the past. Maybe if you only see "Buffer I/O error", that could mean the kernel has an error communicating with the disk controller.