Proliant RAID 1 Rebuild Questions

Solution 1

You've got something funky going on there, though I'm not exactly sure what it is.

The server should boot and operate normally with just 1 drive in it. All that should happen is the controller marks the array as degraded, but Operating Systems don't care (or even know) about this condition and should carry on as normal.

With regard to the rebuild, ordinarily I'd say look at the HP Array Diag Utility, as that will give you some indication of rebuild progress. Since the Operating System sounds hosed at this point, the BIOS may have some rudimentary way of configuring arrays and displaying their status. Failing that, you should be able to boot off of a SmartStart CD, which contains the HP Array Diag Utility. A 36GB drive should rebuild relatively quickly - I've seen a 36GB RAID1 on an ML370 rebuild in a morning.

Is it definitely the BIOS telling you drive C: isn't found? C: is a very Windows thing, and I'd be surprised that a BIOS would reference a very Windows-centric thing like that when other Operating Systems can be installed (it may well do, it just strikes me as odd).

Solution 2

is this normal behaviour not to start up on 1 disk?

No, not on a good controller. You should operate on either disk in a hardware RAID1.

My second question is how long will this rebuild take?

It will take as long as it takes. This usually can happen in the background while the system is running. If your system is waiting for this to happen, that may be a bad sign.

It still says during the boot up process that the C: drive is not found

This is troubling. I would be tempted to boot off a Livecd or something and see if you can see any data on the disks. Maybe the bootloader is messed up. Hopefully you have good recent backups.

I have seen power supply failures destroy drives in the past. It would be unusual, but I guess it could wipe out data.

Author: Nicholas

Updated on September 17, 2022

Comments

  • Nicholas
    Nicholas over 1 year

    I have an HP Proliant ML350 G5 server that experienced a power supply failure overnight. The power supply was replaced, but unfortunately the server was restarted with only 1 disk of the RAID 1 set plugged in. (The RAID controller is the built-in E200i.)

    The raid BIOS then said on start-up that it had entered Interim Recovery Mode. However I would have expected it to still start up with only the 1 drive. The bios however says that it cannot find a C: drive and enters a reboot loop polling the other boot devices. First question is, is this normal behaviour not to start up on 1 disk?

    The second drive was then plugged in (all drives are ok) and the raid bios started an automatic rebuild on that disk. This appears to be a background process as there is no progress shown. However based on the light flashing it looks like it is working. My second question is how long will this rebuild take? (36GB 15K SAS drive).

    I cannot see any error messages and it looks like it is rebuilding the drive ok, but the computer still will not start-up. It still says during the boot up process that the C: drive is not found. If I wait for the rebuild to finish, is it likely to fix itself and find the C: drive? Or is there some other problem here?

    Answers

    These are the conclusions I made after solving this issue.

    1) No, it is not normal. On our system (as on most others), if one of the RAID 1 disks is missing or in the process of being rebuilt, the single remaining disk should still operate fine and boot up correctly. (Although the controller does drop into a reduced performance mode.)

    2) The RAID 1 rebuild on our system took about 4.5 hours to reconstruct the disk after it was put back in. Seemed like a long time to me for a RAID 1+0 36GB 15k rpm SAS drive that wasn't being used at the time. But that's what it took. (As an experiment, I pulled and replaced a 10k rpm 146GB SAS drive from this machine's companion RAID 5 array which uses 4 disks. It took less than 2 hours. Go figure.)

    3) The fundamental problem I was having with this machine turned out to be corruption in the machine's NVRAM. I can only assume the power supply fault was responsible for corrupting it, although there were no obvious signs in the BIOS of anything being wrong; all the settings looked as they should. However, after clearing the NVRAM via the S6 switch on the motherboard, the system booted without problem. I guess the referenced boot controller had somehow changed in some underlying BIOS setting. (Incidentally, if you do this, don't forget to reset the date and time before letting your server get carried away with receiving mail and missing backups.)
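
To put the rebuild times from point 2 in perspective, here is a rough back-of-the-envelope sketch of the average throughput they imply. It assumes decimal units (1 GB = 1000 MB) and that the controller rewrote the full capacity of the replaced drive; the function name is just illustrative, and the 2-hour figure is an upper bound on time, so the second rate is a lower bound.

```python
# Effective rebuild throughput implied by the times quoted in point 2.
# Assumption: decimal units (1 GB = 1000 MB) and a full-capacity rewrite.

def rebuild_rate_mb_per_s(capacity_gb: float, hours: float) -> float:
    """Average MB/s needed to rewrite capacity_gb in the given number of hours."""
    return capacity_gb * 1000 / (hours * 3600)

raid1_rate = rebuild_rate_mb_per_s(36, 4.5)   # 36GB mirror, about 4.5 hours
raid5_rate = rebuild_rate_mb_per_s(146, 2.0)  # 146GB member, under 2 hours

print(f"RAID 1 rebuild: about {raid1_rate:.1f} MB/s")
print(f"RAID 5 rebuild: at least {raid5_rate:.1f} MB/s")
```

That works out to roughly 2.2 MB/s for the mirror against at least 20.3 MB/s for the RAID 5 member, so the mirror rebuild ran at around a tenth of the rate. Both are far below what a 15k SAS drive can sustain, which suggests the rebuild was being throttled rather than limited by the disks, though the thread does not establish why.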

    • Aashraya Singal
      Aashraya Singal about 13 years
      Does your controller have a battery-backed write-cache? Do you have write-caching enabled without a battery?
    • Nicholas
      Nicholas about 13 years
      Yes, it has an attached BBU.
  • Nicholas
    Nicholas about 13 years
    Thanks, I didn't realize the HP SmartStart disk contained bootable diagnostic utilities. I downloaded the latest CD from HP and was able to get a percentage progress indicator from inside the ACU. The disk took about 4.5 hours to rebuild.
  • Nicholas
    Nicholas about 13 years
    Turned out my fundamental problem was NVRAM corruption. Toggling the S6 switch on the motherboard cleared the NVRAM and allowed the machine to find the C: drive and boot Windows. Nothing looked amiss in the normal BIOS settings that control disk selection on boot-up, but I know there are often behind-the-scenes BIOS settings that are not shown. I have no idea why the corruption might have occurred, though.
  • ewwhite
    ewwhite about 13 years
    Your motherboard battery might be on its way out.