Oracle Linux 6 kernel panic boot - anything I can do?
The first message you see during init, [Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 38d is 330),
is not an issue. That's standard on EL6 and ProLiant systems. If you want to remove the message anyway, the fix is available here.
As for the crazy Oracle Linux kernel version, 2.6.39-300.26.1.el6uek.x86_64, can you try booting with the previous kernel in GRUB?
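To see which kernels GRUB can offer before rebooting, the menu file can be read from a rescue shell or a working system. The following is a minimal Python sketch, assuming the GRUB legacy layout that EL6 normally uses (a `default` line plus one `title` line per entry, usually in /boot/grub/grub.conf). The function name `parse_grub_conf` and the sample entries are hypothetical and only illustrate the idea; they are not taken from the poster's server.

```python
import re

# Hypothetical grub.conf content for illustration only.
SAMPLE_CONF = """\
# grub.conf generated by anaconda
default=0
timeout=5
title Oracle Linux Server (2.6.39-300.26.1.el6uek.x86_64)
    root (hd0,0)
title Oracle Linux Server (2.6.32-279.19.1.el6.x86_64)
    root (hd0,0)
"""


def parse_grub_conf(text):
    """Return (default_index, titles) from GRUB legacy menu text."""
    default_index = 0  # entry 0 is used when no numeric default line is found
    titles = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = re.match(r"default[\s=]+(\d+)$", line)
        if match:
            default_index = int(match.group(1))
        elif re.match(r"title\s", line):
            # everything after the keyword is the menu label
            titles.append(line.split(None, 1)[1])
    return default_index, titles


default_index, titles = parse_grub_conf(SAMPLE_CONF)
for index, title in enumerate(titles):
    marker = "*" if index == default_index else " "
    print(f"{marker} {index}: {title}")
```

The entry marked with `*` is the one booted by default; any other index can be picked by hand from the GRUB menu at boot. A non-numeric value such as `default=saved` is not handled by this sketch and falls back to index 0.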
Sotis
Updated on September 18, 2022
Comments
-
Sotis over 1 year
I'm running Oracle Linux 6 on an HP ProLiant server. It's been running fine for the last week, but seemed slow earlier so the Oracle service was stopped. Rather than restart the service, I was asked to reboot the server, but on start we got a kernel panic.
First I get the following, which HP said isn't important, but I'm inclined not to believe them:
[Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 38d is 330)
ERST: Can not request iomem region <0xffff88030c1dfe20-0xffff1006183bfc40> for ERST
Then the kernel panic:
Kernel panic - not syncing: Attempted to kill init!
Pid: 1, comm: init Not tainted 2.6.39-300.26.1.el6uek.x86_64 #1
Call Trace:
[<ffffffff81509077>] panic+0x91/0x1a8
[<ffffffff81061562>] ? enqueue_entity+0x52/0x210
[<ffffffff8107196b>] forget_original_parent+0x32b/0x330
[<ffffffff8105adbd>] ? sched_move_task+0x9d/0x150
[<ffffffff8107198b>] exit_notify+0x1b/0x190
[<ffffffff81072a8e>] do_exit+0x1fe/0x430
[<ffffffff81072d15>] do_group_exit+0x55/0xd0
[<ffffffff81072da7>] sys_exit_group+0x17/0x20
[<ffffffff81514402>] system_call_fastpath+0x16/0x1b
panic occurred: switching back to text console
Could anyone give me a pointer as to what is or even could be causing this? I'm completely stumped at this point. (System administration isn't my day job - I can get a server running, but kernel panics are outside my comfort zone.)
Edit: Tested with the following kernels
2.6.39-300.26.1.el6uek.x86_64
2.6.39-200.24.1.el6uek.x86_64
2.6.32-279.19.1.el6.x86_64
2.6.32-279.el6.x86_64
-
ewwhite over 11 years
What specific server model is it?
-
Sotis over 11 years
@ewwhite it's an HP ProLiant DL380 G7
-
Sotis over 11 years
Memtest seems fine, but I'll run a longer test soon. Currently doing some diagnostics on the disks. Thanks.
-
Sotis over 11 years
@ewwhite That would make sense: I'm running diagnostics at the moment, but if they don't come up with anything I'll try one of the other 3 kernels in GRUB. I'm not sure what they are right now, but I'll post the options before trying it out.
-
ewwhite over 11 years
There's nothing wrong with your server hardware. This is a kernel/OS interaction. Don't waste the time.
-
Sotis over 11 years
@ewwhite The same error occurs with all 4 kernel options available (I've posted these in the main question as I can't put line breaks in the comment).
-
ewwhite over 11 years
I fear an issue with an initial ramdisk. The server can't find its root filesystem, it seems... What type of storage is this system using? Local? SAN?
-
Sotis over 11 years
@ewwhite A local 6-disk RAID-5 array
-
ewwhite over 11 years
If I were you, I'd check the firmware on the server/controllers. Use the Firmware Update DVD or Service Pack for ProLiant DVD. Outside of that, it's possible that some autoupdating of packages could have rendered the system unbootable.
-
Sotis over 11 years
Will do, cheers. We've ended up nuking the disk and re-installing - not ideal, but it's got us a working system again. I'll try to keep a closer eye on what gets updated in case anything similar happens. We've gone for CentOS this time, so I know exactly what's being installed - Oracle Linux starts off with too many packages for my liking. Thanks for your help.
-
Sotis over 9 years
Apologies, I forgot to come back to this - we never found any hardware issue. After a re-install, the system has been rock solid for 18 months.