RAID-1 drive failure - will the missing data be "rebuilt"?


It sounds as though your RAID controller booted from, or rebuilt the array using, the failed drive. When one drive fails, the controller keeps writing to the surviving drive; if the system then somehow reboots off the failed drive, everything written since the failure appears to be missing. Perhaps the drive has only partially failed.

Hopefully the controller actually failed the drive out of the array and didn't try to rebuild from it.

In any case, my first suggestion is this: turn off the system and disconnect one of the drives (start with the one making noises). Then boot it up and see whether your data is present. If not, swap the drives so that only the other one is connected. You might need to boot the system from a live CD of some sort so you can inspect the contents of the drives without changing anything.

If you don't see your data on either drive, then you are most likely out of luck.
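When inspecting each drive on its own, it can help to search for files modified during the window in question rather than browsing by hand. The following is only a minimal sketch: it assumes the drive's filesystem can be mounted read-only in the live environment (not guaranteed for a member of a hardware RAID set), and the mount point and dates in the usage comment are hypothetical placeholders.

```python
import os
from datetime import datetime


def files_modified_between(root, start, end):
    """Return sorted paths under root whose modification time is in [start, end].

    The function only reads metadata (os.walk and os.stat); it never writes
    to the inspected tree.
    """
    start_ts = start.timestamp()
    end_ts = end.timestamp()
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.stat(path).st_mtime
            except OSError:
                # Unreadable entries are common on a failing disk; skip them.
                continue
            if start_ts <= mtime <= end_ts:
                hits.append(path)
    return sorted(hits)


# Hypothetical usage: "/mnt/drive_a" and both dates are placeholders.
# for path in files_modified_between("/mnt/drive_a",
#                                    datetime(2012, 1, 1),
#                                    datetime(2012, 1, 3)):
#     print(path)
```

Running the same search against each drive in turn shows whether either one holds files from the period between the alarm and the first reboot.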


Author: Steve K

Updated on September 18, 2022

Comments

  • Steve K
    Steve K almost 2 years

    We are a small company with an old Dell PowerEdge 830 with a CERC 6ch RAID controller. The server is our file server, domain controller (Windows Server 2003), MySQL server, etc. We have a sysadmin we have worked with for a couple of years who usually keeps things working well for us, but he's out of the country and unreachable right now.

    Yesterday I received a call from my manager that the server had an alarm going off, quite loud, and it would not stop. No one at the office complained of errors saving files to the server or reading files. I came into the office, did some googling, and determined that the alarm was related to the RAID and that there was a BIOS setting to silence it (until we can replace the bad drive). Oh yeah, I forgot to mention I could hear a mechanical failure in one of the drives. So I went into the RAID configuration, found the alarm, and silenced it. This of course required a reboot, and during the reboot I could hear the poor, dead drive, and there were also a few BIOS messages to the effect of "Raid SATA 0 offline or rebuilding" (not exactly what it said; I apologize, I didn't write it down).

    Long story short, the server booted back up and we soon found that all the data that had been written to the disks between the time the alarm went off (i.e. the disk failed) and the time I rebooted was gone. I saved some files post-reboot and they persisted across an additional reboot. But the files that were saved Sunday, yesterday, and today up until the first reboot are gone.

    This completely surprises me. RAID-1 is mirrored, so why would data be missing? People in the office started grumbling about all the files they would need to recreate (ah yes, the backup is also missing the files), and I stopped them until I could figure out a bit more about all this. My question to you pros is: is there anything that can be done to restore that data? Is there a RAID utility or process that should be followed in order to fix the problem? In other words, does what I've described thus far sound normal in a failure event, and are there simply some additional steps that need to be taken to tell the RAID that the other disk is dead and to rely on the data that is mirrored on the remaining drive?

    I'm fairly comfortable administering our server and the various services it's running, but when it comes to RAID and hardware in general I'm a total newb and considering we've got real-world data at stake I'm reluctant to start trial-n-erroring my way through the process.

    • Zoredache
      Zoredache about 12 years
      What are you using for backups? Why would that be missing the files? You are making backups onto something other than the server, right?
  • Steve K
    Steve K about 12 years
    Thank you for the tips and info. I understand what you've described and will give this a shot. Hopefully it's as you've said and the system is booting off the lame drive.
  • psusi
    psusi about 12 years
    If the system can boot from the failed drive instead of the good one, that is a pretty big bug in the raid controller. If backups were made before the system rebooted though, and they don't contain the files either, that would indicate that is not what happened. The only explanation I can think of in that case is user error -- i.e. they didn't actually save what they think they did to the server.
  • Zoredache
    Zoredache about 12 years
    @psusi, I agree, something sounds weird. Would you agree that if he looks at both drives individually, and can't find his data, then it is almost certainly gone?
  • psusi
    psusi about 12 years
    Obviously, if it isn't on either drive, nor in the backup, then it's gone (or never was there in the first place -- user error).