What is the cause of this kernel panic?


Solution 1

I solved it. The problem turned out to be very simple: I was serving the PXE client a 3.13.0-30 kernel, but I had run mkinitramfs on a machine running a 3.13.0-24 kernel, so the kernel and the initramfs did not match.

Once I served the PXE client the 3.13.0-24 kernel instead, it worked.
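A mismatch like this can be caught before booting by comparing the release strings of the two files. The following is only a minimal sketch: it assumes Ubuntu-style file names such as vmlinuz-3.13.0-30-generic and initrd.img-3.13.0-24-generic, and the file names and helper names are made up for illustration (they do not come from the question).

```python
import re

# Matches an Ubuntu-style kernel release such as "3.13.0-30-generic".
# Assumption: both files carry the release in their file names.
RELEASE_RE = re.compile(r"(\d+\.\d+\.\d+-\d+(?:-[a-z]+)?)")


def kernel_release(filename):
    """Return the kernel release embedded in a file name, or None."""
    match = RELEASE_RE.search(filename)
    return match.group(1) if match else None


def releases_match(kernel_file, initrd_file):
    """True only if both names carry the same, non-empty release."""
    kernel = kernel_release(kernel_file)
    initrd = kernel_release(initrd_file)
    return kernel is not None and kernel == initrd


if __name__ == "__main__":
    # Hypothetical file names mirroring the versions in this answer.
    print(releases_match("vmlinuz-3.13.0-30-generic",
                         "initrd.img-3.13.0-24-generic"))  # False
    print(releases_match("vmlinuz-3.13.0-24-generic",
                         "initrd.img-3.13.0-24-generic"))  # True
```

If the files served over TFTP have been renamed (for example to plain vmlinuz and initrd), the names carry no version and this check cannot help; in that case the versions have to be tracked some other way.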

Solution 2

It is actually possible to work out where the kernel panicked. You need the debug kernel for that particular distro.

On a separate host:

  • Download the debug version of that kernel. It will contain a vmlinux file.

Open the vmlinux file in gdb.

$ gdb /usr/lib/debug/lib/modules/3.14.9-200.fc20.x86_64/vmlinux
GNU gdb (GDB) Fedora 7.7.1-13.fc20
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/lib/debug/lib/modules/3.14.9-200.fc20.x86_64/vmlinux...done.
(gdb) 

Judging from the stack trace, the most useful function the kernel was in before the panic was mount_block_root.

In order to determine where we failed, we need to feed in the function name plus an offset into GDB. This is done by de-referencing the address to the function, plus the offset. The stack trace supplies the offset as the first value after the function.

I.e. mount_block_root+0x225 means "I was at mount_block_root plus 549 bytes" (0x225 in hexadecimal is 549 in decimal).

Finally, we tell GDB to print the source code of that area. On my Linux system, this results in the following:
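As a small illustration of that arithmetic, the sketch below pulls the function name and offset out of a stack-trace entry and builds the matching GDB command. It assumes the usual function+0xOFFSET/0xSIZE layout of kernel stack traces; the /0x2b0 size and the bracketed address in the usage example are invented, and the helper names are hypothetical.

```python
import re

# function+0xOFFSET, optionally followed by /0xSIZE.
FRAME_RE = re.compile(
    r"(?P<func>[A-Za-z_][\w.]*)\+0x(?P<offset>[0-9a-fA-F]+)"
    r"(?:/0x(?P<size>[0-9a-fA-F]+))?"
)


def parse_frame(text):
    """Return (function, offset, size) with offset and size as integers.

    size is None when the entry has no /0xSIZE part; the result is None
    when the text contains no function+offset entry at all.
    """
    match = FRAME_RE.search(text)
    if match is None:
        return None
    size = match.group("size")
    return (
        match.group("func"),
        int(match.group("offset"), 16),
        int(size, 16) if size else None,
    )


def gdb_list_command(text):
    """Build the 'list' command to type at the (gdb) prompt."""
    func, offset, _size = parse_frame(text)
    return "list *({}+{:#x})".format(func, offset)


if __name__ == "__main__":
    # Made-up trace line: address and size are illustrative only.
    frame = "[<ffffffff81d26513>] mount_block_root+0x225/0x2b0"
    print(parse_frame(frame))       # ('mount_block_root', 549, 688)
    print(gdb_list_command(frame))  # list *(mount_block_root+0x225)
```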

(gdb) list *(mount_block_root+0x225)
0xffffffff81d26513 is in mount_block_root (init/do_mounts.c:422).
417                "explicit textual name for \"root=\" boot option.\n");
418 #endif
419         panic("VFS: Unable to mount root fs on %s", b);
420     }
421 
422     printk("List of all partitions:\n");
423     printk_all_partitions();
424     printk("No filesystem could mount root, tried: ");
425     for (p = fs_names; *p; p += strlen(p)+1)
426         printk(" %s", p);

From here we can tell exactly where we were at the point of the crash. Note that my kernel is not your kernel, so the offsets are probably slightly off. Since the two kernels are likely to be nearly the same, I'll bet that the real panic actually occurs at line 419, not line 422 (as the output suggests).

Reading slightly further up the code indicates it was unable to open the block device specified, but without a crash dump it's not possible to tell why from the information given. So it's probably one of the following:

  • You don't want to mount a block device (likely).
  • You specified a non-existent block device address (or partition).
  • Your initrd does not contain the proper filesystem module to mount it.
  • There is no filesystem on the disk.
  • The superblock for the filesystem is not at the beginning of that location.

The link in your reference suggests you are trying to mount NFS as the root, in which case you should never land in this function at all. That leaves two possibilities:

  • Your kernel command line contains multiple root directives.
  • You have mistyped your NFS address such that it does not get parsed correctly to go into the real function you want (mount_nfs_root).
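Both of these can be checked mechanically against the kernel command line (the APPEND line, or /proc/cmdline on a machine that does boot). This is only a rough sketch: it assumes the classic root=/dev/nfs plus nfsroot= convention for NFS root, and the function names and example command lines are invented for illustration.

```python
def root_directives(cmdline):
    """Return every root= token found on a kernel command line."""
    return [tok for tok in cmdline.split() if tok.startswith("root=")]


def check_nfs_root(cmdline):
    """Return a list of human-readable problems (empty if none found)."""
    problems = []
    tokens = cmdline.split()
    roots = root_directives(cmdline)

    if len(roots) == 0:
        problems.append("no root= directive")
    elif len(roots) > 1:
        problems.append("multiple root= directives: " + " ".join(roots))
    elif roots[0] != "root=/dev/nfs":
        # Assumption: NFS root is requested with root=/dev/nfs.
        problems.append("root is not /dev/nfs: " + roots[0])

    if not any(tok.startswith("nfsroot=") for tok in tokens):
        problems.append("no nfsroot= directive")

    return problems


if __name__ == "__main__":
    # Hypothetical command lines; the address and path are placeholders.
    good = "root=/dev/nfs nfsroot=192.168.1.123:/path/to/exportfs ip=dhcp"
    bad = "root=/dev/sda1 root=/dev/nfs nfsroot=192.168.1.123:/path/to/exportfs"
    print(check_nfs_root(good))  # []
    print(check_nfs_root(bad))
```

A check like this only catches gross mistakes; a subtly mistyped server address or export path would still pass.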

So overall, based on the information in the question, I assume you have omitted something or made a typo.

Solution 3

Some of the output is missing, since it scrolled off the screen already, but it's possible to see that the kernel crashed in mount_root(). This means that it had a problem with mounting whatever you passed as the root filesystem. Check to ensure that you have passed the correct parameters to the kernel to boot from whatever media it is supposed to be booting from.

Solution 4

  1. The quoted link does not have the complete set of parameters (append line) required for PXE booting Lubuntu 14.04.
  2. The kernel panic => the mount cannot be performed correctly because of 1).

You can see how Serva gets the correct lines here (disclosure: I'm related to Serva development): http://vercot.com/~serva/an/NonWindowsPXE3.html

Serva uses a CIFS share instead of NFS, but you could very well use NFS if you want. Of course you do not need to use Serva; you can use its parameters in your own PXE server:

[PXESERVA_MENU_ENTRY]
asset    = Lubuntu 14.04 Desktop Live
platform = amd64
kernel   = NWA_PXE/$HEAD_DIR$/casper/vmlinuz
append   = showmounts toram root=/dev/cifs initrd=NWA_PXE/$HEAD_DIR$/casper/initrd.lz,NWA_PXE/$HEAD_DIR$/casper/INITRD_N11.GZ boot=casper netboot=cifs nfsroot=//$IP_BSRV$/NWA_PXE_SHARE/$HEAD_DIR$ NFSOPTS=-ouser=serva,pass=avres,ro ip=dhcp ro

Please consider:

  1. Ubuntu/Lubuntu have a bug: if you PXE boot them using CIFS, you must add the complementary initrd INITRD_N11.GZ (freely available from Serva's page).
  2. If you are installing the 64-bit version, the parameters above require you to rename the file \casper\vmlinuz.efi to \casper\vmlinuz.

Solution 5

I had the same issue with Ubuntu 14.04 today, and it was quite obnoxious so I want to share the solution I found with the world here...

I was using pxelinux.0, NFS for the root filesystem, and TFTP for serving up the kernel image and initramfs. As mentioned above by @MatthewIfe, looking at the stack backtrace and functions being called clearly indicates this issue was occurring in a block device related function, and mount_nfs_root was never being called.

So I turned to the TFTP logs, as indicated by the author of this post, and noted my configuration file was named as:

tftproot/pxelinux.cfg/default

Also it looked like this:

DEFAULT vmlinuz
LABEL Ubuntu 14.04 Blah Blah
KERNEL vmlinuz
APPEND initrd=initrd root=/dev/nfs nfsroot=192.168.1.123:/path/to/exportfs

My iPXE loader was also looking for other files, just like in the post:

pxelinux.cfg/40709cda-a8e0-d411-8c6c-001e68e210ae
pxelinux.cfg/01-00-1e-68-e2-10-ae
pxelinux.cfg/C0A8010E
pxelinux.cfg/C0A8010
pxelinux.cfg/C0A801
pxelinux.cfg/C0A80
pxelinux.cfg/C0A8
pxelinux.cfg/C0A
pxelinux.cfg/C0
pxelinux.cfg/C
pxelinux.cfg/default
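The order of that list follows PXELINUX's documented search: the client UUID, then the MAC address prefixed with the ARP hardware type (01 for Ethernet), then the client's IPv4 address in uppercase hex with one digit dropped per attempt, and finally default. The sketch below reproduces the list; the function name is made up, and the IP address 192.168.1.14 is inferred from C0A8010E rather than stated in the logs.

```python
import ipaddress


def pxelinux_config_candidates(mac, ip, uuid=None):
    """Return the config file names PXELINUX tries, in search order."""
    names = []
    if uuid:
        names.append(uuid.lower())

    # ARP hardware type 01 (Ethernet), then the MAC with dashes.
    names.append("01-" + mac.lower().replace(":", "-"))

    # IPv4 address as 8 uppercase hex digits, shortened one digit at a time.
    hex_ip = "{:08X}".format(int(ipaddress.IPv4Address(ip)))
    for length in range(8, 0, -1):
        names.append(hex_ip[:length])

    names.append("default")
    return ["pxelinux.cfg/" + name for name in names]


if __name__ == "__main__":
    for path in pxelinux_config_candidates(
        mac="00:1e:68:e2:10:ae",
        ip="192.168.1.14",  # inferred from C0A8010E
        uuid="40709cda-a8e0-d411-8c6c-001e68e210ae",
    ):
        print(path)
```

Because default is tried last, any earlier file that exists on the TFTP server takes precedence over it.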

But I saw no record in the log of initrd being pulled down, so I decided to test whether my APPEND line was working at all. I added "panic=10", again as in the linked post, and it had no effect. So none of my kernel command-line directives were being used! On a hunch I decided to do two things: simplify my file to match the post

DEFAULT linux
LABEL linux
KERNEL vmlinuz
APPEND root=/dev/nfs nfsroot=192.168.1.123:/path/to/exportfs initrd=initrd panic=10

and rename it to something like

tftproot/pxelinux.cfg/01-00-1e-68-e2-10-ae

And voilà -- the initrd gets pulled down, no more kernel panic, and NFS is mounted as the root filesystem properly using the default/generic kernel and initramfs. I'm sure I can change the label back, etc. I think the actual issue was with the naming of the configuration file and what pxelinux.0 expects.

Author by

syl410

Updated on September 18, 2022

Comments

  • syl410
    syl410 almost 2 years

    Can anyone help to explain why I got this runtime error for the code below? (no solution is needed) Thanks!

    import java.util.*;
    
    class GFG {
    
        public static void main (String[] args) {
            ArrayList<Integer>[] arr = (ArrayList<Integer>[]) new Object[2];
        }
    }
    

    Runtime Errors: Exception in thread "main" java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to [Ljava.util.ArrayList; at GFG.main(File.java:5)

    • Michael Hampton
      Michael Hampton almost 10 years
      Can you get the rest of it?
    • Grigory
      Grigory almost 10 years
      @MichaelHampton the output actually stops at this point, so there is nothing more to get.
    • Pat
      Pat almost 10 years
      what are you trying to boot? a regular Ubuntu install or a Live distribution?
    • Grigory
      Grigory almost 10 years
      @Pat a regular Ubuntu install. It is Lubuntu 14.04
    • Pat
      Pat almost 10 years
      ok see my answer.
    • Grigory
      Grigory almost 10 years
      Gentlemen, thank you for your comments, but the problem appears to be deeper. Please see my new question here: serverfault.com/questions/612024/…
    • Pat
      Pat almost 10 years
      your problem is not deeper; this question and your new question have the same answer that for some reason you refuse to take. If you want to PXE boot/install Lubuntu just do what the answer says.
  • Grigory
    Grigory almost 10 years
    Hmm. The root is actually supposed to mount from NFS. I'll check this for sure. Thank you.
  • Nathan C
    Nathan C almost 10 years
    There's a kernel panic thrown in the code: panic("VFS: Unable to mount root fs on %s", b); so my money's on a filesystem issue too.
  • Pat
    Pat almost 10 years
    Before debugging a kernel it is always a better idea to see around what's going on. You usually do not need a mass spectrometer to know what's in the meatloaf.
  • Matthew Ife
    Matthew Ife almost 10 years
    @Pat how would you determine what was going on in this case?
  • Pat
    Pat almost 10 years
    Easy: PXE boot, using NFS, and the kernel panic mentions a mounting problem. The first thing that comes to my mind is to check whether the distro has the required NFS support (most of them do), then to check mount points, NFS parameters and loop devices. If I had to debug a kernel every time I saw a kernel panic, my life would really be a nightmare.
  • Grigory
    Grigory almost 10 years
    @MatthewIfe thank you for your comments. Please, welcome to my new question serverfault.com/questions/612024/…
  • Abhishek
    Abhishek about 4 years
    But Object class is a super class to each class. No?
  • Harshal Parekh
    Harshal Parekh about 4 years
    @Abhishek, yes.
  • syl410
    syl410 about 4 years
    Hi @HarshalParekh, thanks for your response. My question was more about the reason for the runtime error. I saw some OpenJDK code use "E[] tmp = (E[])new Object[input.length]; ". I don't understand why that code works fine but my code doesn't. (hg.openjdk.java.net/jdk9/jdk9/jdk/file/65464a307408/src/…)
  • Harshal Parekh
    Harshal Parekh about 4 years
    What you have posted is generics. It’s not the same as your code, hence the exception.