Dell Open Manage message in Windows Event Log, should I be worried?


As Mitch suggested, I would first ensure that all components on your server have the latest firmware as well as latest drivers installed. We have had entire hard drives fail because they did not have the latest firmware on them (3 in a row as a matter of fact, until Dell figured out the firmware on the actual drives was out of date). This has nothing to do with your case, but I'm trying to illustrate that keeping your devices on the latest firmware is more than just "good practice".

Just navigate here and enter your service tag: http://www.dell.com/support/home/us/en/04/home2.

We have been monitoring Dell events for many years now, and events logged by OpenManage should not be taken lightly. The warning you are getting most likely points to a problem that needs to be addressed.

You are probably not noticing any problems because you are running RAID 1. With a mirror, even disconnecting one of the drives entirely will not cause any noticeable issue, though it may result in a RAID rebuild (which would be logged as well).

If you keep seeing those events after you have updated all the drivers and firmware, I would power the server down (if possible), then disconnect and reconnect the hard disk cables.

If the issue persists, then I would call Dell, as it is most likely a hardware issue such as a defective cable or backplane.
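
To judge whether the link drops are getting more frequent, it can help to pair each "lost link" warning with the "restored link" message that follows it and look at how long each outage lasted. Below is a minimal Python sketch of that idea. It assumes the events have been exported to plain text, one event per line, in a `(DD-MM-YYYY HH:MM:SS) - message` layout. That layout, the function name `find_link_flaps`, and the sample timestamps are assumptions made for illustration; adjust the patterns to whatever your export actually looks like.

```python
import re
from datetime import datetime

# Assumed export layout: "(DD-MM-YYYY HH:MM:SS) - message text"
LINE_RE = re.compile(r"^\((\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2})\) - (.*)$")
LOST_RE = re.compile(r"SAS wide port (\d+) lost link on PHY (\d+)")
RESTORED_RE = re.compile(r"SAS wide port (\d+) restored link on PHY (\d+)")


def find_link_flaps(lines):
    """Pair each 'lost link' event with the next 'restored link'
    event on the same port/PHY.

    Returns (flaps, unresolved): flaps is a list of
    ((port, phy), lost_at, outage_duration) tuples, and unresolved maps
    (port, phy) to the time of a loss that was never restored.
    """
    pending = {}  # (port, phy) -> time the link was lost
    flaps = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines that do not fit the assumed layout
        when = datetime.strptime(match.group(1), "%d-%m-%Y %H:%M:%S")
        message = match.group(2)

        lost = LOST_RE.search(message)
        restored = RESTORED_RE.search(message)
        if lost:
            pending[(int(lost.group(1)), int(lost.group(2)))] = when
        elif restored:
            key = (int(restored.group(1)), int(restored.group(2)))
            if key in pending:
                lost_at = pending.pop(key)
                flaps.append((key, lost_at, when - lost_at))
    return flaps, pending


# Hypothetical sample: the message text matches the warnings in the
# question, but the timestamps are made up for this example.
sample = [
    "(07-07-2012 10:15:00) - SAS port report: SAS wide port 2 lost link on PHY 2.: Controller 0 (SAS 6/iR Adapter)",
    "(07-07-2012 10:15:20) - SAS port report: SAS wide port 2 restored link on PHY 2.: Controller 0 (SAS 6/iR Adapter)",
]

flaps, unresolved = find_link_flaps(sample)
for (port, phy), lost_at, outage in flaps:
    print(f"port {port} PHY {phy}: link down for {outage.total_seconds():.0f}s from {lost_at}")
print(f"losses never restored: {len(unresolved)}")
```

If the number of flaps per day keeps climbing, or a loss shows up with no matching restore, I would treat that as a reason to call Dell sooner rather than later.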


Author by

Baby

Desktop/Web developer, Sysadmin and also DBA. Pick your favourite.

Updated on September 18, 2022

Comments

  • Baby
    Baby over 1 year

    I have a Dell T110 server with a SAS 6/iR controller and two hard disks in RAID 1. Occasionally a warning appears in the Windows Event Log with the following message:

    SAS port report: SAS wide port 2 lost link on PHY 2.: Controller 0 (SAS 6/iR Adapter)

    And about 20 seconds later, the following informational message appears:

    SAS port report: SAS wide port 2 restored link on PHY 2.: Controller 0 (SAS 6/iR Adapter)

    Until now, I haven't noticed any disruption in the programs running on this server. Is this a sign of future problems with the controller or the disks? Have any of you ever seen something like this?

    Update

    Yesterday, 3 days after I posted this question, the RAID 1 setup lost redundancy. After a sequence of the messages mentioned above, the following messages were logged:

    (10-07-2012 21:42:42) - An invalid SAS configuration has been detected. Details: SAS topology error: Unaddressable device.: Controller 0 (SAS 6/iR Adapter)

    (10-07-2012 21:42:45) - Reset to device, \Device\RaidPort0, was issued.

    (10-07-2012 21:43:02) - Device failed: Physical Disk 0:2 Controller 0, Connector 0

    (10-07-2012 21:52:59) - The driver detected a controller error on \Device\RaidPort0.

    (10-07-2012 21:53:02) - Redundancy lost: Virtual Disk 1 (Virtual Disk 1) Controller 0 (SAS 6/iR Adapter)

    (10-07-2012 21:53:02) - Virtual disk degraded: Virtual Disk 1 (Virtual Disk 1) Controller 0 (SAS 6/iR Adapter)

    (10-07-2012 21:53:02) - The rebuild failed due to errors on the target physical disk.: Physical Disk 0:2 Controller 0, Connector 0

    From these messages one can assume that the problem is with one of the disks in the array. I'm now using the Dell Online Diagnostics tool to test the disks. One of the disks finished the tests, but the other is stuck at 20%, so I think I have found the culprit.

    • gravyface
      gravyface over 11 years
      Assuming you have warranty on the server, call Dell.
    • Mitch
      Mitch over 11 years
      What do the logs in OpenManage Admin say?
    • floyd
      floyd over 11 years
      Are the timestamps on these logs during reboots by any chance? I have seen DOM throw these types of alerts when the server is powering on, though mostly for NIC adapter status.
    • Baby
      Baby over 11 years
      @Mitch. The logs in OpenManage Admin say the same thing the Windows Event Log is saying. There's nothing different.
    • Baby
      Baby over 11 years
      @floyd. No. There's no sign that the server was rebooting around the time these messages appeared.
    • Mitch
      Mitch over 11 years
      Firmware and drivers up-to-date? Are the entries repeated through the history of the machine (perhaps this is normal behavior)?
    • Baby
      Baby over 11 years
      @Mitch. I checked the firmware and driver versions and noticed something strange: the driver version that OpenManage reports is 1.28.03.52, while the latest version you can download from Dell is 1.28.03.01.
  • Baby
    Baby over 11 years
    Yesterday, the RAID 1 finally failed and lost redundancy. Now I'll need to find out which component is failing: the controller, one of the disks, or the driver. Luckily, the system is still running.
  • Lucky Luke
    Lucky Luke over 11 years
    Yes, you almost certainly have a failed disk. I would try to get a replacement as soon as possible and create a backup as well (in case you're not doing that already). I would also refrain from running anything that will create a lot of load/stress on the working drive, other than a backup.