md/raid:md2: cannot start dirty degraded array, kernel panic

centos kernel-panic

5,028

From the GRUB screen, edit the boot entry and append emergency to the end of the kernel command line, then boot. This isn't guaranteed to work (if md2 holds your root filesystem, it will fail). If it fails to get you a shell, you will have to go find a CD-ROM drive and boot from that instead.

Once you have a shell, you can run mdadm to attempt to recover your RAID array.

Find out what devices are supposed to be part of it:

mdadm -D /dev/md2

You'll see a listing of devices. If some are marked as removed or failed then you'll have to deal with the failed disks first.
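As a rough aid for reading that listing, the sketch below filters mdadm -D output for member slots that are not healthy. The function name list_bad_members is made up for illustration, and the filter assumes the usual device table layout (Number, Major, Minor, RaidDevice, State) in which mdadm reports such members as "removed" or "faulty"; check it against your own output before relying on it.

```shell
#!/bin/sh
# list_bad_members: hypothetical helper, not part of mdadm.
# Reads `mdadm -D` output on stdin and prints one line per device-table row
# whose state contains "removed" or "faulty".
# Assumption: table rows start with four columns (Number, Major, Minor,
# RaidDevice) that are numeric, or "-" for Number/RaidDevice.
list_bad_members() {
    awk '$1 ~ /^([0-9]+|-)$/ && $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ && $4 ~ /^([0-9]+|-)$/ {
        # Everything from column 5 onward is the state (plus the device path, if any).
        state = ""
        for (i = 5; i <= NF; i++)
            state = (state == "" ? $i : state " " $i)
        if (state ~ /removed|faulty/)
            print "slot " $4 ": " state
    }'
}

# Typical use on the real system (needs root):
#   mdadm -D /dev/md2 | list_bad_members
```

If the function prints nothing, every slot in the table is reported as something other than removed or faulty.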

After that, reassemble your RAID array:

mdadm --assemble --force /dev/md2 /dev/**** /dev/**** /dev/**** ...

(Replace each /dev/**** with one of the devices that the previous output lists as a member of the array.)
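The sequence can be sketched as a small script. Everything named here (reassemble_array, run, the DRY_RUN switch) is hypothetical scaffolding, and the device names in the example are placeholders; only the mdadm invocations themselves are real commands. By default the sketch only prints what it would run, so the commands can be reviewed before they touch real disks.

```shell
#!/bin/sh
# Hypothetical wrapper around the mdadm steps described above.
# DRY_RUN=1 (the default) prints the commands instead of executing them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: reassemble_array /dev/mdN member1 [member2 ...]
reassemble_array() {
    if [ "$#" -lt 2 ]; then
        echo "usage: reassemble_array /dev/mdN member..." >&2
        return 1
    fi
    array=$1
    shift
    # If an earlier attempt left the array partially assembled and inactive,
    # it has to be stopped before it can be assembled again.
    run mdadm --stop "$array"
    # --force tells mdadm to assemble even if the metadata on some members
    # appears to be out of date.
    run mdadm --assemble --force "$array" "$@"
    # Show the kernel's view of the result.
    run cat /proc/mdstat
}

# Example with placeholder devices (take the real ones from mdadm -D):
#   reassemble_array /dev/md2 /dev/sda3 /dev/sdb3
#   DRY_RUN=0 reassemble_array /dev/md2 /dev/sda3 /dev/sdb3   # actually run it
```

Keeping the dry run as the default is a deliberate choice: a forced assembly with the wrong member list is hard to undo.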

nl-x

Updated on September 18, 2022

Comments

nl-x almost 2 years
After having made use of a remote power switch, my server did not come back online. When I went to the datacenter and rebooted the computer on the spot, I saw the server booting (the CentOS progress bar ran almost all the way to the end) and eventually printing the following messages:
```
md/raid:md2: cannot start dirty degraded array.
md/raid:md2: failed to run raid set.
md: pers->run() failed ...
md/raid:md2: cannot start dirty degraded array.
md/raid:md2: failed to run raid set.
md: pers->run() failed ...



Kernel panic - not syncing: Attempted to kill init!
Pid: 1, comm: init not tainted 2.6.32-279.1.1.el6.i686 #1
Call Trace:
 [<c083bfbc>] ? panic+0x68/0x11c
 [<c045a501>] ? do_exit+0x741/0x750
 [<c045a54c>] ? do_group_exit+0x3c/0xa0
 [<c045a5c1>] ? sys_exit_group+0x11/0x20
 [<c083eba4>] ? syscall_call+0x7/0xb
 [<c083007b>] ? cmos_wake_setup+0x62/0x112
```
The server runs CentOS and has software raid, and I don't have backups of the raid settings. The only backup I have is of /home and the database dumps. (Glad to at least have those though.)

Since the server is an old Dell PowerEdge 1750 with no CD-ROM drive, I have no way of booting the machine from a boot disk. I also remember in the past that the server also wouldn't boot from a bootable USB disk. So the only way I know how to boot the server is to go to the datacenter, pick up the server and take it to the office. Screw open the server. Attach a cdrom drive to an IDE slot on the motherboard. And then boot it. I am hoping you guys could help me avoid this.

I have looked through the boot options a bit. When CentOS is about to boot and I interrupt the boot countdown, I see the following entries:
```
CentOS (2.6.32-279.1.1.el6.i686)
CentOS Linux (2.6.32-71.29.1.el6.i686)
centos (2.6.32-71.el6.i686)
```
I think the first configuration is the default one, because choosing that gets me to the above mentioned kernel panic. The other ones end with something like "Sleeping forever".

I can press 'e' to edit boot commands, press 'a' to modify kernel arguments and press 'c' for grub command line.

The command line gives a grub> prompt. But I have no idea how to get the system to boot without (trying to) access the dirty partitions.

What I want to do is, of course: boot the machine, check the hard drive for errors, and mark the drive as clean.
- Hennes almost 12 years
  
  Does your Dell have a DRAC? If it has one you can avoid going to the data center and edit the console via that. Disclaimer: if it has one and if it is connected. I think you would know if you bought it (it is a few hundred euro more expensive if you include that option), but it might just have slipped through if this is an old second-hand server.
- Michael Hampton almost 12 years
  
  Used DRAC cards for the 1750 can be found quite cheaply on eBay these days.
- Hennes almost 12 years
  
  Aye. But he will still have to install it if it is not present. If the physical machine gets pulled out of the data centre then I strongly recommend adding a DRAC (or configuring the one already present).
nl-x almost 12 years

Adding emergency to the end of the boot command changes nothing. However, when I remove some stuff from the boot command and leave only the UUIDs, I do see things happening: some md devices are recognised, and there is a message about 2 out of 3 mirrors being active. But it still ends with a kernel panic.
Michael Hampton almost 12 years

Quoting myself: "If this fails to get you a shell, you will have to go find a CD-ROM drive."
nl-x almost 12 years

I have found a CD-ROM drive, by the way. I was able to boot with an old Knoppix boot CD, but when I tried to mount /dev/sda1, it wouldn't. When I ran dmesg | tail, it said something about /dev/sda1 containing unsupported features. (Maybe a newer boot CD will have a Linux that supports these features?)
Michael Hampton almost 12 years

You're running CentOS 6. Boot with a CentOS 6 disc.
nl-x almost 12 years

By the way, I have 3 partitions: the 1st is /boot, the 2nd is swap, and the 3rd is / (in this order, or so Knoppix told me with gparted).
nl-x almost 12 years

One more question: with Knoppix the md devices weren't recognized (that is, there was no /dev/md1 and so on). Do you expect them to be available automatically with the CentOS 6 Live CD? If not, what should I do to have them recognized so I can then do the mdadm stuff? I'm asking this in advance to avoid going to the datacenter again, running into the same problem, coming back empty-handed, and only then asking this question.
nl-x almost 12 years

Hooray. With the CentOS 6 Live CD I can see the following md devices: /dev/md125, /dev/md126 and /dev/md127, and also /dev/md/0_0 and /dev/md/1_0. md125 and md126 show RaidDevice 2 as removed. md127 shows all 3 RaidDevices in active sync.
Michael Hampton almost 12 years

I hope you took a DRAC card with you to install in that thing, or at least a CD-ROM drive.
nl-x almost 12 years

Success! I recovered the partitions. I wasn't able to address the "removed" notices correctly, but running "mdadm -D /dev/md126" and then "mdadm --assemble --force /dev/md127 /dev/sda2 /dev/sdb2 /dev/sdc2" still did the trick. The root partition is still not optimal, as it only runs on 2 of 3 disks. I will start a new thread on this topic, asking how to repair the disks.
nl-x almost 12 years

Google helped out a bit, so I won't start a new topic. As /dev/sdc3 was 'removed' from md2, all I needed to do was "mdadm --add /dev/md2 /dev/sdc3". It was re-added and the drive started recovering. I could keep track of the recovery status by periodically typing "mdadm -D /dev/md2".
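As an alternative to rerunning mdadm -D by hand, the kernel also reports rebuild progress in /proc/mdstat. The sketch below pulls the percentage out of that file; the function name recovery_progress is made up, and the parsing assumes the usual "recovery = NN.N%" (or "resync = NN.N%") progress line, so treat it as a starting point rather than a finished tool.

```shell
#!/bin/sh
# recovery_progress: hypothetical helper, not part of mdadm.
# Reads /proc/mdstat-style text on stdin and prints "<array> <percent>"
# for each array that has a recovery or resync in progress.
recovery_progress() {
    awk '
        # An array header looks like "md2 : active raid5 ..."; remember its name.
        /^md[0-9]+ :/ { array = $1 }
        # The progress line contains "recovery = 12.6%" or "resync = 12.6%".
        /(recovery|resync) *=/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /%$/) print array, $i
        }'
}

# On the real system:
#   recovery_progress < /proc/mdstat
#   watch -n 10 cat /proc/mdstat    # or just watch the raw file refresh
```

Once the rebuild finishes, the progress line disappears from /proc/mdstat and the function prints nothing.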