killall gives me `no process found` but ps

Solution 1

Is this on Linux?

There are actually a few subtly different versions of the command name that are used by ps, killall, etc.

The two main variants are: 1) the long command name, which is what you get when you run ps u; and 2) the short command name, which is what you get when you run ps without any flags.

Probably the biggest difference happens if your program is a shell script or anything that requires an interpreter, e.g. Python, Java, etc.

Here's a really trivial script that demonstrates the difference. I called it mycat:

#!/bin/sh
cat

After running it, here are the two different kinds of ps output.

Firstly, without u:

$ ps -p 5290
  PID TTY      ... CMD
 5290 pts/6    ... mycat

Secondly, with u:

$ ps u 5290
USER       PID ... COMMAND
mikel     5290 ... /bin/sh /home/mikel/bin/mycat

Note how the second version starts with /bin/sh?

Now, as far as I can tell, killall actually reads /proc/<pid>/stat and takes the second field, the name in parentheses, as the command name, so that's really what you need to specify when you run killall. Logically, that should be the same as what ps without the u flag prints, but it would be a good idea to check.
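To make the "name in parentheses" concrete, here is a minimal sketch that pulls that field out of a stat line. The helper name stat_comm and the sample stat line are made up for illustration; the only assumption about the format is that the name sits between the first "(" and the last ")".

```shell
# Extract the command name from a /proc/<pid>/stat line read on stdin.
# The name sits between the first "(" and the last ")" and may itself
# contain spaces or parentheses, so cut from both ends.
stat_comm() {
    sed -e 's/^[^(]*(//' -e 's/)[^)]*$//'
}

# Made-up stat line (hypothetical PID and field values):
echo '5290 (mycat) S 5200 5290 5200 34822 5290 4194304' | stat_comm

# On Linux, the same for a real process would be:
#   stat_comm < /proc/<pid>/stat
```

The first command prints mycat, which is the string you would then pass to killall.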

Things to check:

  1. what does cat /proc/<pid>/stat say the command name is?
  2. what does ps -e | grep db2 say the command name is?
  3. do ps -e | grep db2 and ps au | grep db2 show the same command name?
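The first check can also be done straight from /proc, which avoids any doubt about which ps flags are in play. This is only a sketch: it assumes Linux with /proc mounted and a kernel new enough to provide /proc/<pid>/comm, and the function name proc_names is made up for illustration.

```shell
# Print the short name and the full command line of one PID, read from /proc.
proc_names() {
    pid=$1
    # Short name: the same string that appears in parentheses in /proc/<pid>/stat.
    printf 'short: %s\n' "$(cat "/proc/$pid/comm")"
    # Full command line: arguments are NUL-separated, so turn NULs into spaces.
    printf 'long:  %s\n' "$(tr '\0' ' ' < "/proc/$pid/cmdline")"
}

proc_names "$$"   # inspect the current shell as a harmless example
```

If the "short" line is not the name you have been passing to killall, that mismatch is the likely reason for "no process found".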

Notes

If you're using other ps flags too, then you might find it simpler to use ps -o comm to see the short name and ps -o cmd to see the long name.

You also might find pkill a better alternative. In particular, pkill -f tries to match using the full command name, i.e. the command name as printed by ps u or ps -o cmd.

Solution 2

killall tries to match on a process name (but is not really that good at the matching part).

Since ps | grep and ps | grep | kill do a much better job, someone simplified these pipelines and created pgrep and pkill. Read those commands as "ps grep" and "ps kill": each one first runs ps, then grep, and then, if wanted, kill.
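For comparison, the hand-rolled pipeline that pgrep replaces might look roughly like this. It is a sketch, not what pgrep actually does internally: the helper name pids_by_name is made up, it matches the short name exactly, and it assumes the name contains no spaces.

```shell
# Read "PID NAME" lines on stdin and print the PIDs whose name equals $1.
pids_by_name() {
    awk -v name="$1" '$2 == name { print $1 }'
}

# Sample input shaped like the output of: ps -e -o pid= -o comm=
printf '1125 db2sysc\n1126 db2ckpwd\n1127 db2ckpwd\n' | pids_by_name db2ckpwd

# Against live processes it would be used as:
#   ps -e -o pid= -o comm= | pids_by_name db2ckpwd
```

With the sample input this prints 1126 and 1127, one per line. Matching on a separate column also avoids the classic ps | grep problem of grep matching its own command line.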

Solution 3

I had a similar problem but /proc/<pid>/stat contained the expected string. By using strace I could see that killall also accessed /proc/<pid>/cmdline.

I continued to investigate with gdb and found that, in my case, it failed on a check that compares my command against the full command, including all arguments, found in /proc/<pid>/cmdline. That code path seemed to be triggered because the filename was longer than 15 characters (a hardcoded value in the killall source). I didn't fully investigate whether I could somehow get it to work with killall.
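The 15-character limit is not specific to killall: Linux itself keeps at most 15 characters of the short name, so anything longer is cut off in /proc/<pid>/stat. The sketch below predicts that short name from a path, assuming the name is derived from the file's base name as in the mycat example. The function name short_name and the sample path are made up for illustration.

```shell
# Predict the short process name for a program path:
# the base name, cut to the 15 characters the kernel keeps.
short_name() {
    base=${1##*/}              # strip the directory part
    printf '%.15s\n' "$base"   # keep at most 15 characters
}

short_name /usr/local/bin/a-very-long-program-name
short_name /home/mikel/bin/mycat
```

The first call prints a-very-long-pro and the second prints mycat, so a name longer than 15 characters never appears in full in the stat file.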

But as mentioned in other comments here, pkill is a better alternative that does not have the same issues.

For the interested, the source code of killall can be found at https://github.com/acg/psmisc.

Author: Radek

Updated on September 18, 2022

Comments

  • Radek
    Radek over 1 year

    Could somebody explain to me the difference between kill and killall? Why doesn't killall see what ps shows?

    # ps aux |grep db2
    root      1123  0.0  0.8 841300 33956 pts/1    Sl   11:48   0:00 db2wdog                                         
    db2inst1  1125  0.0  3.5 2879496 143616 pts/1  Sl   11:48   0:02 db2sysc                                        
    root      1126  0.0  0.6 579156 27840 pts/1    S    11:48   0:00 db2ckpwd                                        
    root      1127  0.0  0.6 579156 27828 pts/1    S    11:48   0:00 db2ckpwd                                        
    root      1128  0.0  0.6 579156 27828 pts/1    S    11:48   0:00 db2ckpwd 
    
    # killall db2ckpwd
    db2ckpwd: no process found
    
    # kill -9 1126
    # kill -9 1127
    # kill -9 1128
    

    System is SuSe 11.3 (64 bit); kernel 2.6.34-12; procps version 3.2.8; killall from PSmisc 22.7; kill from GNU coreutils 7.1

    • Admin
      Admin about 11 years
      Never kill processes with SIGKILL (-9).
    • Admin
      Admin about 11 years
      What to do then when a process needs to be terminated?
    • Admin
      Admin about 11 years
      This is the very, very last resort.
  • Radek
    Radek almost 13 years
    Very good explanation. And I guess you were right the first time. ps -e |grep db2 gives me 3084 ? 00:00:00 db2syscr and ps aux |grep db2 gives me root 3084 0.0 0.6 579292 28304 ? S 13:02 0:00 db2ckpwd. Could you comment on that? I am a bit lost.
  • Mikel
    Mikel almost 13 years
    I'm not sure. It's possible that the program is changing its name. Do you know how it's being run? What does ls -l /proc/3084/exe say? What about which or whence or type to find the file and then ls and type to see if it's a symlink or a script or a binary?
  • Radek
    Radek almost 13 years
    ls -l /proc/3084/exe gives us lrwxrwxrwx 1 root root 0 Jun 6 16:49 /proc/3084/exe -> /var/lib/db2/db2inst1/sqllib/adm/db2syscr
  • Radek
    Radek almost 13 years
    ls -l /var/lib/db2/db2inst1/sqllib/adm/db2syscr gives me -r-sr-s--- 1 root db2iadm1 147K Feb 1 23:32 /var/lib/db2/db2inst1/sqllib/adm/db2syscr*
  • Radek
    Radek almost 13 years
    type gives me /var/lib/db2/db2inst1/sqllib/adm/db2syscr /var/lib/db2/db2inst1/sqllib/adm/db2syscr is /var/lib/db2/db2inst1/sqllib/adm/db2syscr
  • 0x01
    0x01 over 4 years
    Thanks a lot for the hint about pkill. It's MUCH better suited for my task than killall, which I thought I knew how to use - but suddenly stopped working like intended when I upgraded my host system from Ubuntu 16.04 to 18.04. The "Things to check" didn't give any insights about what's wrong, but pkill -f works.