Oracle Linux 6 kernel panic boot - anything I can do?


The first message you see during init, [Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 38d is 330), is not an issue. It is standard on EL6 running on ProLiant systems. A fix that removes the message is available here.

As for the crazy Oracle Linux kernel version, 2.6.39-300.26.1.el6euk.x86-64, can you try booting with the previous kernel in GRUB?
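
To see which kernels GRUB can offer before rebooting, something like the following may help. It is only a minimal sketch: it assumes a GRUB legacy `grub.conf` (the format EL6 normally uses), and the sample text and the `list_grub_entries` helper are hypothetical illustrations, not taken from the affected server.

```python
import re

# Hypothetical sample of a GRUB legacy grub.conf; the titles are made up,
# the kernel versions are the ones quoted in the question.
SAMPLE_GRUB_CONF = """\
# grub.conf (illustrative only)
default=0
timeout=5
title Oracle Linux Server (2.6.39-300.26.1.el6euk.x86_64)
    root (hd0,0)
title Oracle Linux Server (2.6.32-279.19.1.el6.x86_64)
    root (hd0,0)
"""

def list_grub_entries(conf_text):
    """Return (default_index, titles) parsed from GRUB legacy config text."""
    default = 0  # GRUB legacy boots entry 0 when no default is set
    titles = []
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        m = re.match(r"default\s*=\s*(\d+)\s*$", line)
        if m:
            default = int(m.group(1))
            continue
        m = re.match(r"title\s+(.+)$", line)
        if m:
            titles.append(m.group(1))
    return default, titles

default_index, entries = list_grub_entries(SAMPLE_GRUB_CONF)
for i, title in enumerate(entries):
    marker = "*" if i == default_index else " "
    print(f"{marker} {i}: {title}")
```

Any entry other than the one marked with `*` is a candidate to pick by hand from the GRUB menu at boot.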


Author: Sotis

Updated on September 18, 2022

Comments

  • Sotis
    Sotis over 1 year

    I'm running Oracle Linux 6 on an HP ProLiant server. It's been running fine for the last week, but it seemed slow earlier, so the Oracle service was stopped. Rather than restart the service, I was asked to reboot the server, but on startup we got a kernel panic.

    First I get the following, which HP said isn't important, but I'm inclined not to believe them:

    [Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 38d is 330)
    ERST: Can not request iomem region <0xffff88030c1dfe20-0xffff1006183bfc40> for ERST

    Then the kernel panic:

    Kernel panic - not syncing: Attempted to kill init!
    Pid: 1, comm: init Not tainted 2.6.39-300.26.1.el6euk.x86-64 #1
    Call Trace:
    [<ffffffff81509077>] panic+0x91/0x1a8
    [<ffffffff81061562>] ? enqueue_entity+0x52/0x210
    [<ffffffff8107196b>] forget_original_parent+0x32b/0x330
    [<ffffffff8105adbd>] ? sched_move_task+0x9d/0x150
    [<ffffffff8107198b>] exit_notify+0x1b/0x190
    [<ffffffff81072a8e>] do_exit+0x1fe/0x430
    [<ffffffff81072d15>] do_group_exit+0x55/0xd0
    [<ffffffff81072da7>] sys_exit_group+0x17/0x20
    [<ffffffff81514402>] system_call_fastpath+0x16/0x1b
    panic occurred: switching back to text console

    Could anyone give me a pointer as to what is or even could be causing this? I'm completely stumped at this point. (System administration isn't my day job - I can get a server running but kernel panics are outside my comfort zone)

    Edit: The same panic occurs with each of the following kernels:

    2.6.39-300.26.1.el6euk.x86_64
    2.6.39-200.24.1.el6euk.x86_64
    2.6.32-279.19.1.el6.x86_64
    2.6.32-279.el6.x86_64

    • ewwhite
      ewwhite over 11 years
      What specific server model is it?
    • Sotis
      Sotis over 11 years
      @ewwhite it's a HP ProLiant DL380 G7
  • Sotis
    Sotis over 11 years
    Memtest seems fine, but I'll run a longer test soon. Currently doing some diagnostics on the disks. Thanks.
  • Sotis
    Sotis over 11 years
    @ewwhite That would make sense. I'm running diagnostics at the moment, but if they don't come up with anything I'll try one of the other 3 kernels in GRUB. I'm not sure what they are right now, but I'll post the options before trying it out.
  • ewwhite
    ewwhite over 11 years
    There's nothing wrong with your server hardware. This is a kernel/OS interaction. Don't waste the time.
  • Sotis
    Sotis over 11 years
    @ewwhite The same error occurs with all 4 kernel options available (I've posted these in the main question as I can't put line breaks in the comment).
  • ewwhite
    ewwhite over 11 years
    I fear an issue with an initial ramdisk. The server can't find its root filesystem, it seems... What type of storage is this system using? Local? SAN?
  • Sotis
    Sotis over 11 years
    @ewwhite A local 6-disk RAID-5 array
  • ewwhite
    ewwhite over 11 years
    If I were you, I'd check the firmware on the server/controllers. Use the Firmware Update DVD or Service Pack for ProLiant DVD. Outside of that, it's possible that some autoupdating of packages could have rendered the system unbootable.
  • Sotis
    Sotis over 11 years
    Will do, cheers. We've ended up nuking the disk and re-installing - not ideal, but it's got us a working system again. I'll try to keep a closer eye on what gets updated in case anything similar happens. We've gone for CentOS this time, so I know exactly what's being installed - Oracle Linux starts off with too many packages for my liking. Thanks for your help.
  • Sotis
    Sotis over 9 years
    Apologies, I forgot to come back to this - we never found any hardware issue. After the re-install, the system has been rock solid for 18 months.
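
For anyone hitting a similar panic, the initial ramdisk suspicion raised in the comments can be checked from a rescue environment before resorting to a re-install. The sketch below is a rough illustration only: it assumes the usual EL6 naming of `vmlinuz-<version>` and `initramfs-<version>.img` in `/boot`, and the `kernels_missing_initramfs` helper and the sample file list are hypothetical.

```python
def kernels_missing_initramfs(boot_files):
    """Return kernel versions that have a vmlinuz file but no matching initramfs.

    boot_files is a list of file names, e.g. from os.listdir() on the
    installed system's /boot directory.
    """
    names = set(boot_files)
    prefix = "vmlinuz-"
    missing = []
    for name in sorted(names):
        if name.startswith(prefix):
            version = name[len(prefix):]
            # EL6 names the initial ramdisk initramfs-<version>.img
            if f"initramfs-{version}.img" not in names:
                missing.append(version)
    return missing

# Hypothetical /boot listing; versions borrowed from the question.
SAMPLE_BOOT_FILES = [
    "vmlinuz-2.6.32-279.19.1.el6.x86_64",
    "initramfs-2.6.32-279.19.1.el6.x86_64.img",
    "vmlinuz-2.6.32-279.el6.x86_64",
    "grub",
]

print(kernels_missing_initramfs(SAMPLE_BOOT_FILES))
```

On a real system the list would come from `os.listdir()` on wherever the rescue environment mounts the installed `/boot`. A missing file is only one possible cause; an initramfs that exists but lacks the storage controller driver would not be caught by this check.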