How to monitor a single process?

Solution 1

If you don't want to go for a full Nagios (or whatever) install to monitor a single process, why not just write a script to do it yourself? I've done something similar to keep track of DB connections from one of our boxes, using the output of netstat to do the count and logging the results to a file. Adding an extra few lines to send an email if the count is >3000 should be trivial.
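A minimal sketch of such a do-it-yourself check, written as two helper functions that a cron job could call. The function names, the port number, the log path and the mail recipient are all placeholders and not part of the original setup; the column positions assume the usual `netstat -nt` layout on Linux.

```shell
#!/bin/sh
# Count ESTABLISHED TCP connections whose remote port is $1.
# Reads `netstat -nt` output on stdin (columns: proto, recv-q, send-q,
# local address, foreign address, state).
count_port_connections() {
    awk -v port="$1" '
        $1 ~ /^tcp/ && $6 == "ESTABLISHED" && $5 ~ (":" port "$") { n++ }
        END { print n + 0 }'
}

# Append a timestamped count to a log file and mail a warning when the
# count is above the limit.
# Arguments: count, limit, log file, mail recipient (all hypothetical).
log_and_alert() {
    count=$1 limit=$2 logfile=$3 admin=$4
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$count" >> "$logfile"
    if [ "$count" -gt "$limit" ]; then
        echo "Connection count is $count (limit $limit)" |
            mail -s "Connection limit exceeded" "$admin"
    fi
}
```

From a cron job this might be used as `n=$(netstat -nt | count_port_connections 5432)` followed by `log_and_alert "$n" 3000 /var/log/dbconn.log root`.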

Solution 2

This may not be the most sophisticated solution, but, especially if no other processes on the machine open that many sockets, you could check the output of

netstat -nutp

(n: no name resolution, u: UDP, t: TCP, p: show PID and program name; you might want to give only one of u or t, depending on whether your process opens UDP or TCP connections).

You can grep for the pid from the output:

netstat -nutp | grep -c ' 12345/progname *$'

where '12345' should be replaced with your PID and 'progname' with the name of your process. The -c option makes grep count the matching lines instead of printing them, and the ' *' before '$' allows for the trailing spaces with which netstat may pad the last column. You might want to refine the search to match your needs more accurately (e.g. include only ESTABLISHED connections).
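One way to restrict the count to ESTABLISHED connections is to match on columns with awk instead of grepping whole lines. This is only a sketch: the function name is made up, and it assumes TCP lines from `netstat -ntp`, where the state is the sixth column and PID/program the seventh (UDP lines have no state column, so they are not handled here).

```shell
#!/bin/sh
# Count ESTABLISHED TCP connections owned by PID $1.
# Reads `netstat -ntp` output on stdin.
count_established() {
    awk -v pid="$1" '
        # column 7 looks like "12345/progname"; require it to start with "PID/"
        $6 == "ESTABLISHED" && index($7, pid "/") == 1 { n++ }
        END { print n + 0 }'
}
```

It could be called as `netstat -ntp 2>/dev/null | count_established 12345`. Because awk splits on whitespace, padding at the end of the line does not matter.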

Also 'lsof' might be your friend. You can try

lsof -p 12345 -a -i4

and grep through the output. Have a look at the lsof manual page to see whether you can change the output format to better suit scripted parsing.
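For scripted parsing, lsof has a field output mode (`-F`) that prints one field per line, each prefixed with a letter; with `-F n` the name field (for sockets, the address pair) appears on lines starting with `n`. A rough sketch that counts connected sockets from such output follows; the function name is an assumption, and treating every `n` line containing `->` as one connection is a simplification.

```shell
#!/bin/sh
# Count connected sockets in `lsof -F n` output read from stdin.
# Connected sockets show "local->remote" in the name field; listening
# sockets have no "->" and are skipped.
count_lsof_connections() {
    grep -c '^n.*->'
}
```

A possible invocation is `lsof -nP -a -p 12345 -i4 -F n | count_lsof_connections`, where `-n` and `-P` turn off host and port name lookups.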

You can write a simple script to run the command periodically. With a huge number of connections, it is worth measuring how much resources a single netstat or lsof run takes and adjusting the interval accordingly. For example, once per minute by default:

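#!/bin/sh
# Log the number of connections held by one process at a fixed interval.

prog=progname
threshold=3000          # warn above this many connections
interval="${1:-60}"     # seconds between checks; defaults to 60

# -s makes pidof return a single PID even if several instances run
pid=$(pidof -s "$prog") || { echo "$prog is not running" >&2; exit 1; }

while :; do
    # ' *' allows for trailing spaces that netstat may pad the last column with
    n=$(netstat -nutp 2>/dev/null | grep -c " ${pid}/${prog} *\$")
    # append (>>) so that earlier samples are kept
    date +"Number of connections [%Y-%m-%d %H:%M:%S]: $n" >> connection.log
    if [ "$n" -gt "$threshold" ]; then
        # warn the admin here, e.g. by sending a mail
        echo "$prog has $n connections" >&2
    fi
    sleep "$interval"
done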

(This is not very useful as it stands; it is only meant to illustrate the idea.)
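To answer questions such as "how many connections were there at 01:20?", the log can be searched by timestamp afterwards. The helpers below are a sketch with made-up names; they assume the log accumulates one line per sample in the form `Number of connections [YYYY-MM-DD HH:MM:SS]: N`.

```shell
#!/bin/sh
# Print the log entries whose timestamp starts with the given prefix.
# $1: timestamp prefix, e.g. "2022-09-17 01:2"; $2: log file
entries_at() {
    grep -F "[$1" "$2"
}

# Print the highest connection count recorded in the log file $1.
peak_count() {
    # the count is the last field after the final ": "
    awk -F': ' '{ if ($NF + 0 > max) max = $NF + 0 } END { print max + 0 }' "$1"
}
```

For example, `entries_at "2022-09-17 01:2" connection.log` would list the samples taken between 01:20 and 01:29 on that day.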

Solution 3

You can use a ready-made solution, ps-watcher.

Your config could look like this:

[processname$]
    trigger = $count > 3000
    action  = <<EOF
    mail -s "processname threshold exceeded" root <<< "You have $count processes"
    /root/bin/run_some_cleanup
EOF

[[p]rocessname$]
    action = echo "$count processes are running" 

This will send a mail when the process count exceeds the threshold. The second section uses a different regular expression that matches the same process name, and it logs the number of processes. As it is not limited by any trigger, its action runs at every ps-watcher check. You can change the checking interval with the "--sleep 150" option to ps-watcher.

Solution 4

If you want alerts and monitoring, I would look at Nagios; if you only want graphs, I would look at Munin or Cacti. If you just want to know how many connections a process has open at a given moment, use lsof.

Author: lexsys

Updated on September 17, 2022

Comments

  • lexsys, almost 2 years

    I need to monitor a single process (e.g. be warned when more than 3000 connections are established) and collect statistics on it (e.g. to find out how many connections were established today at 01:20 AM, when, according to a client, the server was too slow). What tools should I use?

    • Govindarajulu, almost 15 years
      What process is this? Something that can handle 3000 connections would come with some statistics functionality, I reckon.
    • lexsys, almost 15 years
      It is a daemon that I wrote myself. I don't want to add statistics functionality to it.