How to read, understand, analyze, and debug a Linux kernel panic?
Solution 1
It's just an ordinary backtrace. The functions are listed in reverse call order: each function was called by the one listed below it, so the first entry is the most recent call:
unwind_backtrace+0x0/0xf8
warn_slowpath_common+0x50/0x60
warn_slowpath_null+0x1c/0x24
local_bh_enable_ip+0xa0/0xac
bdi_register+0xec/0x150
bdi_register+0xec/0x150 is the symbol + the offset/length. There's more information about that in Understanding a Kernel Oops and in how you can debug a kernel oops. There's also this excellent tutorial on Debugging the Kernel.
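To make the symbol + offset/length notation concrete, here is a minimal Python sketch that splits one backtrace entry into its three parts. The name parse_frame and the regular expression are illustrative assumptions, not part of any kernel tooling; the pattern also accepts dots in symbol names, since compiler-generated suffixes can appear there.

```python
import re

# Matches "symbol+0xOFFSET/0xLENGTH". Symbol names may contain dots
# (compiler-generated suffixes), so "." is allowed in the name.
FRAME_RE = re.compile(
    r"^(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-fA-F]+)/0x(?P<length>[0-9a-fA-F]+)$"
)


def parse_frame(entry):
    """Split one backtrace entry into (symbol, offset, length)."""
    match = FRAME_RE.match(entry.strip())
    if match is None:
        raise ValueError(f"not a symbol+offset/length entry: {entry!r}")
    return (
        match.group("symbol"),
        int(match.group("offset"), 16),  # bytes from the start of the function
        int(match.group("length"), 16),  # total size of the function in bytes
    )


if __name__ == "__main__":
    symbol, offset, length = parse_frame("bdi_register+0xec/0x150")
    print(f"{symbol}: {offset} bytes into a {length}-byte function")
```

For bdi_register+0xec/0x150 this reports an offset of 236 bytes into a 336-byte function.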
Note: as suggested below by Eugene, you may want to try addr2line first. It still needs an image with debugging symbols, though. For example, using the bracketed address that the trace prints for bdi_register+0xec (it already includes the offset):
addr2line -e vmlinux_with_debug_info 0019594c
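If you want to script that lookup, a small wrapper could look like the sketch below. The helper names addr2line_command and resolve are hypothetical; the only flags used are -e (select the image) and -f (also print the function name). The tool parameter is an assumption that lets you substitute a cross-toolchain binary or eu-addr2line.

```python
import shutil
import subprocess


def addr2line_command(image, address, tool="addr2line"):
    """Build the argument list for resolving one address to file:line."""
    # -e selects the image, -f also prints the enclosing function name.
    return [tool, "-f", "-e", image, f"{address:#x}"]


def resolve(image, address, tool="addr2line"):
    """Run the tool and return its output, or None if it is not installed."""
    if shutil.which(tool) is None:
        return None
    result = subprocess.run(
        addr2line_command(image, address, tool),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # 0x0019594c is the bracketed address printed next to bdi_register+0xec.
    print(" ".join(addr2line_command("vmlinux_with_debug_info", 0x0019594C)))
```

The image name vmlinux_with_debug_info is the placeholder from the command above; resolve only produces useful output when that file really contains debug info.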
Solution 2
Here are two alternatives to addr2line. Assuming you have the proper toolchain for the target, you can do one of the following:
Use objdump:
- Locate your vmlinux or the .ko file under the kernel root directory, then disassemble the object file: objdump -dS vmlinux > /tmp/kernel.s
- Open the generated assembly file, /tmp/kernel.s, with a text editor such as vim. Go to unwind_backtrace+0x0/0xf8, i.e. search for the address of unwind_backtrace plus the offset. Finally, you have located the problematic part in your source code.
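The search in the second step can also be automated. The sketch below scans a disassembly listing for the instruction at symbol + offset. It assumes the usual GNU objdump layout, where a function starts with a line like "001360ac <unwind_backtrace>:" and each instruction line begins with its hex address, a colon, and a tab; find_instruction is a hypothetical helper name, and other disassemblers may format their output differently.

```python
import os
import re

HEADER_RE = re.compile(r"^([0-9a-f]+) <(.+)>:$")  # function header line
INSN_RE = re.compile(r"^\s*([0-9a-f]+):\t")       # instruction line


def find_instruction(listing_lines, symbol, offset):
    """Return (line_number, text) of the instruction at symbol+offset, or None."""
    target = None
    for number, line in enumerate(listing_lines, start=1):
        line = line.rstrip("\n")
        header = HEADER_RE.match(line)
        if header:
            # A new function starts; compute the target only for our symbol.
            if header.group(2) == symbol:
                target = int(header.group(1), 16) + offset
            else:
                target = None
            continue
        insn = INSN_RE.match(line)
        if target is not None and insn and int(insn.group(1), 16) == target:
            return number, line
    return None


if __name__ == "__main__":
    path = "/tmp/kernel.s"  # the file produced by objdump -dS above
    if os.path.exists(path):
        with open(path, errors="replace") as listing:
            print(find_instruction(listing, "unwind_backtrace", 0x0))
```

The returned line number can be opened directly in an editor; with -S output, the interleaved source lines just above it show the corresponding C code.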
Use gdb:
IMO, an even more elegant option is to use the one and only gdb. Assuming you have a suitable toolchain on your host machine:
- Run gdb <path-to-vmlinux>.
- Execute at gdb's prompt: list *(unwind_backtrace+0x10).
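When several frames need checking, the same list expression can be run non-interactively. This sketch only builds the command line, using gdb's -batch and -ex options; gdb_list_command is a hypothetical helper, and the gdb parameter is an assumption that lets you name a cross-toolchain gdb instead.

```python
import shlex


def gdb_list_command(vmlinux, frames, gdb="gdb"):
    """Build a non-interactive gdb invocation listing source for each frame.

    frames is an iterable of (symbol, offset) pairs.
    """
    argv = [gdb, "-batch"]
    for symbol, offset in frames:
        # The same expression one would type at the (gdb) prompt.
        argv += ["-ex", f"list *({symbol}+{offset:#x})"]
    argv.append(vmlinux)
    return argv


if __name__ == "__main__":
    frames = [("unwind_backtrace", 0x10), ("bdi_register", 0xEC)]
    # Print a copy-pasteable shell command rather than running gdb here.
    print(shlex.join(gdb_list_command("vmlinux", frames)))
```

Passing the resulting list to subprocess.run would execute it, provided vmlinux was built with debug info.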
For additional information, you may check out the following resources:
Solution 3
In unwind_backtrace+0x0/0xf8, what does the +0x0/0xf8 stand for?
The first number (+0x0) is the offset from the beginning of the function (unwind_backtrace in this case). The second number (0xf8) is the total length of the function. Given these two pieces of information, if you already have a hunch about where the fault occurred, this might be enough to confirm your suspicion: you can tell roughly how far along in the function you were.
To get the exact source line of the corresponding instruction (generally better than hunches), use addr2line or the methods in the other answers.
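As a worked example of that arithmetic, the trace in the question pairs the entry bdi_register+0xec/0x150 with the address 0019594c. Subtracting the offset gives the function's start address, and offset divided by length gives a rough position inside it. The function name describe is a hypothetical helper for illustration.

```python
def describe(address, offset, length):
    """Return (function_start, function_end, fraction_through_function)."""
    start = address - offset  # address of the function's first byte
    end = start + length      # one past the function's last byte
    return start, end, offset / length


if __name__ == "__main__":
    # Values from the entry "[<0019594c>] (bdi_register+0xec/0x150)".
    start, end, fraction = describe(0x0019594C, 0xEC, 0x150)
    print(f"function spans {start:#010x}..{end:#010x}, {fraction:.0%} of the way in")
```

Here the function starts at 0x00195860, ends at 0x001959b0, and the address in the trace lies about 70% of the way through it.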
0x90
Updated on January 19, 2022
Comments
-
0x90 over 2 years
Consider the following Linux kernel dump stack trace; e.g., you can trigger a panic from the kernel source code by calling panic("debugging a Linux kernel panic");:
[<001360ac>] (unwind_backtrace+0x0/0xf8) from [<00147b7c>] (warn_slowpath_common+0x50/0x60)
[<00147b7c>] (warn_slowpath_common+0x50/0x60) from [<00147c40>] (warn_slowpath_null+0x1c/0x24)
[<00147c40>] (warn_slowpath_null+0x1c/0x24) from [<0014de44>] (local_bh_enable_ip+0xa0/0xac)
[<0014de44>] (local_bh_enable_ip+0xa0/0xac) from [<0019594c>] (bdi_register+0xec/0x150)
- In unwind_backtrace+0x0/0xf8 what does +0x0/0xf8 stand for?
- How can I see the C code of unwind_backtrace+0x0/0xf8?
- How to interpret the panic's content?
-
iabdalkader over 11 years
@0x90 I don't think you can get the exact line without debugging the kernel, because that's an instruction offset; the best you can do with an oops dump is to identify the function that crashed.
-
Eugene over 11 years
Sometimes addr2line can resolve the address and determine the appropriate source lines. Of course, it is not always possible to map the instructions to the locations in the source code, but still, better than nothing. Debug symbols for the kernel are needed for that, of course. If one is lucky, they can be found either in vmlinux itself (for custom-built kernels) or in a separate package. Some distros provide such packages; the names may vary. addr2line -e vmlinux_with_debug_info 0019594c might help to find the source lines corresponding to bdi_register+0xec.
-
Eugene over 11 years
Perhaps this could be useful: I've recently experimented with addr2line and eu-addr2line (a similar tool from the elfutils package) and found that the latter is more reliable. I used both to resolve an address in the "e1000" driver with appropriate debug info. I copied the file with debug info (e1000.ko.debug in my case) to a different machine and tried to analyse it there with addr2line -f -e e1000.ko.debug -j .devinit.text 0x424 and the same with eu-addr2line. Only the latter gave the correct results.
-
Eugene over 11 years
(continued) When I did the analysis on the machine the driver was from, the results were the same as I described: addr2line pointed to a wrong location in the source code, while eu-addr2line did things right. I cannot say now why it is this way; perhaps something else is needed for addr2line. In the meantime, I would recommend installing elfutils and using eu-addr2line.
-
Hi-Angel over 9 years
@mux but how do you interpret a dot in the trace? I.e., in my case I saw (receive_room.isra.9+0x8/0x4c).
-
kavadias about 7 years
gdb did not work for me, although full debug symbols are included and I could debug remotely in the past (I must have changed something...). objdump, though, produced a very nice listing, including the source code, which was very precise and showed me the point causing the panic.
-
kavadias about 7 years
@Eugene: On my platform (Zynq board), addr2line works well with a complete address (e.g., addr2line -e vmlinux 4004578c) but not with "symbol+offset", while eu-addr2line is the reverse (e.g., eu-addr2line -e vmlinux gwrr_balance.isra.20+0x11ac gives the same result as addr2line with the complete address).
-
libbkmz almost 6 years
I suppose the link has been moved here: opensourceforu.com/2011/01/understanding-a-kernel-oops
-
user3693586 over 4 years
I was seeing warn_slowpath_null and warn_slowpath_common warnings from sg_alloc_table. Can you help me understand under what scenarios these slowpath warnings occur?