How to debug a Linux kernel that freezes during boot?


Solution 1

One approach is to enable CONFIG_EARLY_PRINTK and add printk() statements to the kernel code you suspect of freezing. The last message that reaches the console shows how far the boot got. The most likely cause is a wrong driver configuration parameter.
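A minimal sketch of such a marker follows. The names BOOT_TRACE, BOOT_TRACE_OUT and suspect_init_example are hypothetical and not part of the kernel tree. The user-space fallback is only there so the macro can be tried outside a kernel build. In the boot log from the question, the last message comes from brcm-pm enabling power to the ENET block, so the Ethernet driver's init path might be a reasonable first place to put markers.

```c
/* boot_trace.h: hypothetical helper for bracketing a boot hang with markers. */
#ifdef __KERNEL__
#include <linux/kernel.h>	/* printk() and the KERN_* log levels */
/* KERN_EMERG so the marker is not filtered out by the console log level. */
#define BOOT_TRACE_OUT(fmt, ...) printk(KERN_EMERG fmt, ##__VA_ARGS__)
#else
#include <stdio.h>		/* user-space fallback, for trying the macro only */
#define BOOT_TRACE_OUT(fmt, ...) printf(fmt, ##__VA_ARGS__)
#endif

/* Print file, line and function of the call site. */
#define BOOT_TRACE() \
	BOOT_TRACE_OUT("BOOT_TRACE %s:%d %s()\n", __FILE__, __LINE__, __func__)

/* Hypothetical stand-in for a suspect driver's init function. */
static int suspect_init_example(void)
{
	BOOT_TRACE();	/* reached the start of init */
	/* hardware setup that might hang would go here */
	BOOT_TRACE();	/* survived the setup */
	return 0;
}
```

If the first marker appears on the serial console and the second does not, the hang is between them, and the markers can be moved closer together on the next build.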

You may also be able to recover the old kernel's configuration from /boot/config-* or from /proc/config.gz on the running device (the latter exists only if the old kernel was built with CONFIG_IKCONFIG_PROC enabled).

Solution 2

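Add initcall_debug to the kernel command line (CONFIG_CMDLINE, if the command line is built into the kernel). The kernel then prints each initcall before invoking it, so the last one printed is the one that hangs. If the extra messages do not show up on the console, they may be logged at debug level; adding debug to the command line raises the console log level. For example: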

CONFIG_CMDLINE="root=/dev/ram0 rw mem=512M@0x0 initrd=0x800000,16M console=ttyS0,38400n8 rootfstype=ext2 init=/bin/busybox init -s initcall_debug"
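To illustrate why this pinpoints a hang, here is a simplified user-space model of the idea. It is not the kernel's code: the init_* functions and the table are made up, and the message text is only an approximation, since the exact format differs between kernel versions. The real kernel walks a linker-generated table of initcalls rather than a hand-written array.

```c
#include <stdio.h>

typedef int (*initcall_t)(void);	/* an init function: no arguments, int result */

struct named_initcall {
	const char *name;
	initcall_t fn;
};

/* Hypothetical init functions; imagine one of them never returns. */
static int init_serial(void) { return 0; }
static int init_loop(void)   { return 0; }
static int init_enet(void)   { return 0; }

static const struct named_initcall initcalls[] = {
	{ "init_serial", init_serial },
	{ "init_loop",   init_loop },
	{ "init_enet",   init_enet },
};

/* Run every initcall in order and return how many were run.
 * With debug set, announce each one before calling it. */
static int run_initcalls(int debug)
{
	int n = (int)(sizeof(initcalls) / sizeof(initcalls[0]));
	int i;

	for (i = 0; i < n; i++) {
		if (debug) {
			printf("Calling initcall %s()\n", initcalls[i].name);
			fflush(stdout);	/* make sure the line is out before a possible hang */
		}
		initcalls[i].fn();
	}
	return n;
}
```

Because the name is printed before the call, a function that never returns is always the last one named in the output.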
Author by

ivarec

Updated on June 28, 2022

Comments

  • ivarec
    ivarec almost 2 years

    I have a legacy device with a binary Linux 2.6.18 kernel that boots normally to its rootfs. However, if I try to compile this kernel from the source, the resulting kernel binary will freeze during the boot. I don't have the .config file used to build the previous kernel binary that is currently booting normally.

    The boot is freezing and no error output is provided. Here is the boot log:

    Linux version 2.6.18-6.2 (myuser@host) (gcc version 4.2.0 20070124 (prerelease) - BRCM 10ts-20080721) #10 SMP Sun Apr 28 18:25:24 BRT 2013
    Fetching vars from bootloader... OK (E,d,B,C)
    Detected 512 MB on MEMC0 (strap 0x23430310)
    Board strapped at 512 MB, default is 256 MB
    Options: sata=1 enet=1 emac_1=1 no_mdio=0 docsis=0 ebi_war=0 pci=1 smp=1
    CPU revision is: 0002a044
    FPU revision is: 00130001
    Primary instruction cache 32kB, physically tagged, 2-way, linesize 64 bytes.
    Primary data cache 64kB, 4-way, linesize 64 bytes.
    <6>Synthesized TLB refill handler (23 instructions).
    <6>Synthesized TLB load handler fastpath (37 instructions).
    <6>Synthesized TLB store handler fastpath (37 instructions).
    <6>Synthesized TLB modify handler fastpath (36 instructions).
    Determined physical RAM map:
     memory: 10000000 @ 00000000 (usable)
     memory: 10000000 @ 20000000 (usable)
    Using 32MB for memory, overwrite by passing mem=xx
    User-defined physical RAM map:
    node [00000000, 02000000: RAM]
    node [02000000, 0e000000: RSVD]
    node [20000000, 10000000: RAM]
    <5>Reserving 224 MB upper memory starting at 02000000
    <7>On node 0 totalpages: 65536
    <7>  DMA zone: 65536 pages, LIFO batch:15
    <7>On node 1 totalpages: 65536
    <7>  Normal zone: 65536 pages, LIFO batch:15
    Built 2 zonelists.  Total pages: 131072
    <5>Kernel command line: root=/dev/mtdblock3 rw rootfstype=jffs2 console=ttyS0,115200
    PID hash table entries: 4096 (order: 12, 16384 bytes)
    mips_counter_frequency = 202000000 from Calibration, = 202500000 from header(CPU_MHz/2)
    Dentry cache hash table entries: 65536 (order: 6, 262144 bytes)
    Inode-cache hash table entries: 32768 (order: 5, 131072 bytes)
    Memory: 286336k/524288k available (2924k kernel code, 237760k reserved, 544k data, 164k init, 0k highmem)
    Mount-cache hash table entries: 512
    Checking for 'wait' instruction...  available.
    plat_prepare_cpus: ENABLING 2nd Thread...
    TP0: prom_boot_secondary: Kick off 2nd CPU...
    CPU revision is: 0002a044
    FPU revision is: 00130001
    Primary instruction cache 32kB, physically tagged, 2-way, linesize 64 bytes.
    Primary data cache 64kB, 4-way, linesize 64 bytes.
    Synthesized TLB refill handler (23 instructions).
    Brought up 2 CPUs
    migration_cost=1000
    NET: Registered protocol family 16
    registering PCI controller with io_map_base unset
    registering PCI controller with io_map_base unset
    SCSI subsystem initialized
    usbcore: registered new driver usbfs
    usbcore: registered new driver hub
    NET: Registered protocol family 2
    IP route cache hash table entries: 16384 (order: 4, 65536 bytes)
    TCP established hash table entries: 65536 (order: 7, 524288 bytes)
    TCP bind hash table entries: 32768 (order: 6, 262144 bytes)
    TCP: Hash tables configured (established 65536 bind 32768)
    TCP reno registered
    brcm-pm: disabling power to USB block
    brcm-pm: disabling power to ENET block
    brcm-pm: disabling power to SATA block
    squashfs: version 3.2-r2 (2007/01/15) Phillip Lougher
    JFFS2 version 2.2. (NAND) (SUMMARY)  (C) 2001-2006 Red Hat, Inc.
    io scheduler noop registered
    io scheduler anticipatory registered (default)
    io scheduler deadline registered
    io scheduler cfq registered
    Serial: 8250/16550 driver $Revision: 1.1.1.1 $ 3 ports, IRQ sharing disabled
    serial8250: ttyS0 at MMIO 0x0 (irq = 22) is a 16550A
    serial8250: ttyS1 at MMIO 0x0 (irq = 66) is a 16550A
    serial8250: ttyS2 at MMIO 0x0 (irq = 67) is a 16550A
    loop: loaded (max 8 devices)
    brcm-pm: enabling power to ENET block
    

    How do I go about debugging this? Any insights on possible solutions to the freeze are welcome as well.