How to kill a process which can't be killed without rebooting?


Solution 1

You don't have zombies: cat /proc/$PID/cmdline has no trouble with a zombie. If kill -9 doesn't kill the program, the program is blocked inside the kernel in an uninterruptible operation, usually I/O (ps reports such a process in state D). That usually indicates one of three things:

  • a network filesystem that isn't responding;
  • a kernel bug;
  • a hardware bug.

Utilities such as ps may hang when they try to read per-process information, such as the executable path or the command line, that the kernel cannot provide for one of the reasons above.
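
One way to see which processes are stuck like this is to look for state D (uninterruptible sleep). The sketch below is not part of the original answer. It assumes Linux with /proc mounted and reads only /proc/PID/status, on the assumption that this file stays readable even when cmdline does not (the question reports that ps -e still works). The function name list_by_state is made up for illustration.

```shell
#!/usr/bin/env bash
# List "PID STATE NAME" for every process whose state letter equals $1
# (D = uninterruptible sleep, Z = zombie, S = interruptible sleep, ...).
# Only /proc/PID/status is read; cmdline and exe are never touched.
list_by_state() {
    local want=$1 f
    for f in /proc/[0-9]*/status; do
        # A process may exit while we scan, so a failed read is ignored.
        awk -v want="$want" '
            /^Name:/  { sub(/^Name:[ \t]*/, ""); name = $0 }
            /^State:/ { state = $2 }
            /^Pid:/   { pid = $2 }
            END { if (state == want) print pid, state, name }
        ' "$f" 2>/dev/null || true
    done
}

# Processes that kill -9 cannot currently reach:
list_by_state D
```

An empty result means nothing is in uninterruptible sleep at that moment. Calling list_by_state Z instead lists zombies.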

Try cat /proc/16181/syscall to see what process 16181 is doing. This may or may not work depending on how far gone your system is.
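
Next to syscall, procfs has a few other per-process files that can hint at where a task is blocked: wchan (the kernel function the task is sleeping in) and stack (the kernel stack, normally readable only by root and only if the kernel was built with stack trace support). The following is a minimal sketch, not the answer's own method; inspect_pid is a hypothetical helper name, and any of these reads may itself fail or block on a badly wedged system.

```shell
#!/usr/bin/env bash
# Print what the kernel reports about one PID. Each file is read on its own,
# so an unreadable one (permissions, missing kernel option) does not stop
# the others from being shown.
inspect_pid() {
    local pid=$1 name
    grep -E '^(Name|State|PPid):' "/proc/$pid/status" 2>/dev/null
    for name in wchan syscall stack; do
        printf '%s: ' "$name"
        cat "/proc/$pid/$name" 2>/dev/null || printf '(not readable)'
        printf '\n'
    done
}

# Inspect the current shell as a harmless example; on the affected machine
# you would pass one of the stuck PIDs instead, e.g. inspect_pid 16181
inspect_pid "$$"
```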

If the problem is a network filesystem, you may be able to force-unmount it, or to make it come online. If the problem is a kernel or hardware bug, what you can do will depend on the nature of the bug. Rebooting (and upgrading to a fixed kernel, or replacing the broken hardware) is strongly recommended.
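
If a network filesystem is the suspect, a first step is to find out which network mounts exist. The sketch below only prints candidate commands and executes nothing. It assumes the /proc/mounts format (device, mount point, type, options); the list of filesystem types is an illustrative guess and not exhaustive, and the function names are made up. umount -f (force) and umount -l (lazy detach) are standard umount options, but whether either one frees an already blocked process depends on the filesystem and the kernel.

```shell
#!/usr/bin/env bash
# Print the mount points whose filesystem type looks like a network
# filesystem. Reads a file in /proc/mounts format (default: /proc/mounts).
network_mounts() {
    awk '$3 ~ /^(nfs4?|cifs|smb3|ceph|fuse\.(sshfs|glusterfs))$/ { print $2 }' \
        "${1:-/proc/mounts}"
}

# Dry run: print, but do not execute, the unmount commands to consider.
suggest_unmounts() {
    local mp
    network_mounts "$1" | while read -r mp; do
        echo "umount -f $mp   # if that fails, consider: umount -l $mp"
    done
}

suggest_unmounts
```

Mount points containing spaces appear escaped (as \040) in /proc/mounts and would need unescaping before use.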

Solution 2

The other answers are assuming these are zombie processes. A zombie process is a process that has finished running, but is still in the process table in case the parent wants to know the exit status. These are normal, and init will automatically clean up zombie processes that get assigned to it.

Zombie processes should never cause anything to hang, so it sounds like that may not be your problem. If it's a system call or driver hanging, then the process may be in an uninterruptible state. There's a good explanation here.

Solution 3

To find zombie processes on Linux:

$ ps axo stat,ppid,pid,comm | grep -w defunct

Z 555 10242 Damn-Zombie <defunct>

First, you can try sending the SIGCHLD signal to the zombie's parent process using the kill command. Note that the ps command above also prints the PPID (the PID of the parent process) of each zombie. In our example, the PPID of the zombie is 555.

$ sudo kill -s SIGCHLD 555

If a zombie process still does not go away, you can kill the parent process (e.g., 555) of the zombie.

$ sudo kill -9 555

Once its parent process is killed, the zombie is adopted by the init process, which is the ancestor of all processes in Linux. The init process calls wait() to reap any zombie it has adopted.
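
The sequence can be tried safely on a throwaway zombie. The sketch below is an illustration under assumptions, not part of the original answer: it assumes Linux with /proc mounted, all function names are hypothetical, and the zombie is created on purpose by a parent (a plain sleep) that never calls wait(). After the parent is killed, the zombie is handed to a new parent; whether it then vanishes depends on that new parent actually reaping it, which a real init does.

```shell
#!/usr/bin/env bash
# Look up one field of /proc/PID/status, e.g. "status_field 1234 PPid".
# Prints nothing if the process no longer exists.
status_field() {
    awk -v key="$2:" '$1 == key { print $2; exit }' "/proc/$1/status" 2>/dev/null
}

# Print the PID of the first zombie whose parent is PID $1.
first_zombie_child() {
    local f pid
    for f in /proc/[0-9]*/status; do
        pid=${f#/proc/}
        pid=${pid%/status}
        if [ "$(status_field "$pid" PPid)" = "$1" ] &&
           [ "$(status_field "$pid" State)" = "Z" ]; then
            echo "$pid"
            return 0
        fi
    done
    return 1
}

kill_parent_demo() {
    local parent zombie
    # sh starts a short-lived child, then replaces itself with a long sleep
    # that never calls wait(), so the exited child is left as a zombie.
    sh -c 'sleep 1 & exec sleep 30' >/dev/null 2>&1 &
    parent=$!
    sleep 2
    zombie=$(first_zombie_child "$parent")
    echo "before: pid=$zombie state=$(status_field "$zombie" State) ppid=$(status_field "$zombie" PPid)"
    # Same step as "sudo kill -9 555" above, applied to our own test parent.
    kill -9 "$parent"
    wait "$parent" 2>/dev/null
    sleep 1
    echo "after: pid=$zombie state=$(status_field "$zombie" State) ppid=$(status_field "$zombie" PPid)"
    return 0
}

kill_parent_demo
```

Empty state and ppid values on the "after" line mean the zombie has been reaped and its /proc entry is gone.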

Solution 4

You cannot kill a zombie directly; you can only get rid of it by killing its parent. A zombie process has released all its resources and is waiting for its exit status to be picked up by its parent. It becomes a zombie when the parent does not execute a wait to pick up the exit status from its child. When you kill the zombie's parent, init picks up the exit status and the zombie finally goes away.


Author: Sam Stoelinga

Updated on September 18, 2022

Comments

  • Sam Stoelinga
    Sam Stoelinga over 1 year

    There are 5 processes which can't be killed by kill -9 $PID and executing cat /proc/$PID/cmdline will hang the current session. Maybe they're zombie processes.

    Executing ps -ef or htop will also hang the current session. But top and ps -e are working fine.

    So it seems that there are two problems, one of them being that the filesystem is not responding.

    This is a production machine running virtual machines, so rebooting isn't an option.

    The following process IDs aren't responding: 16181 16765 5985 7427 7547

    The parent of these processes is init

            ├─collectd(16765)─┬─{collectd}(16776)
            │                 ├─{collectd}(16777)
            │                 ├─{collectd}(16778)
            │                 ├─{collectd}(16779)
            │                 ├─{collectd}(16780)
            │                 └─{collectd}(16781)
            ├─collectd(28642)───{collectd}(28650)
            ├─collectd(29868)─┬─{collectd}(29873)
            │                 ├─{collectd}(29874)
            │                 ├─{collectd}(29875)
            │                 └─{collectd}(29876)
    

    And one of the qemu processes is not working:

    |-qemu-system-x86(16181)-+-{qemu-system-x86}(16232)
    |                        |-{qemu-system-x86}(16238)
    |                        |-{qemu-system-x86}(16803)
    |                        |-{qemu-system-x86}(17990)
    |                        |-{qemu-system-x86}(17991)
    |                        |-{qemu-system-x86}(17992)
    |                        |-{qemu-system-x86}(18062)
    |                        |-{qemu-system-x86}(18066)
    |                        |-{qemu-system-x86}(18072)
    |                        |-{qemu-system-x86}(18073)
    |                        |-{qemu-system-x86}(18074)
    |                        |-{qemu-system-x86}(18078)
    |                        |-{qemu-system-x86}(18079)
    |                        |-{qemu-system-x86}(18086)
    |                        |-{qemu-system-x86}(18088)
    |                        |-{qemu-system-x86}(18092)
    |                        |-{qemu-system-x86}(18107)
    |                        |-{qemu-system-x86}(18108)
    |                        |-{qemu-system-x86}(18111)
    |                        |-{qemu-system-x86}(18113)
    |                        |-{qemu-system-x86}(18114)
    |                        |-{qemu-system-x86}(18119)
    |                        |-{qemu-system-x86}(23147)
    |                        `-{qemu-system-x86}(27051)
    
  • Sam Stoelinga
    Sam Stoelinga almost 11 years
    So you want me to kill init? It's not clear from the question sorry hehe but the parent seems to be init :( I've edited the question.
  • tripleee
    tripleee almost 11 years
    No, we want you to not try to kill the zombie. You cannot kill a zombie. This FAQ is as old as Unix itself.
  • tripleee
    tripleee almost 11 years
    "Kill the parent" is the way to reap a regular zombie. You cannot kill init. If a zombie is reparented under init, you cannot kill it.
  • enigment
    enigment about 7 years
    Too many zombie processes can prevent fork from succeeding (when the hard nproc limit is reached) because they still occupy space in the process table.
  • RodjamsB
    RodjamsB about 5 years
    Cat never responds. I don't think this is a bug. I think it's a "feature".
  • Andrew
    Andrew about 5 years
    This is the real answer. Killing the parent worked, thanks.