Bash script to kill specific process running longer than 5m


Solution 1

I really suggest finding the root cause of this issue (or this issue or this issue).

A killall is a heavy-handed approach to process management, and your real issue is probably an application or resource problem.

Can you outline what you've tried so far? The types of things I would check are:

  • System vitals at the time these runaway Ghostscript processes occur: RAM? CPU?
  • Make sure the system this is running on has enough memory and doesn't have major contention for other resources.
  • Is this a physical or virtual server?
  • Talk to the vendor. There's a community and some level of support around PrinceXML.
  • Run strace against the affected PIDs and their parent PIDs to see where they stall.
  • Are all of the requisite fonts installed?
  • Try logging the times that this happens to see if there's a correlation between the hang and other system events.
  • If you don't have historical and granular monitoring, you should. You could even try something like New Relic to get a picture of what is happening, or what happened, at a given time.
  • Check the Apache settings. It looks as though Ghostscript is being spawned by the apache user. Are there any limits or server settings that should be examined here?
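
The logging idea in the list above can be sketched as a small snapshot script. This is only an illustration, assuming a Linux-style `ps` and `uptime`; the process name `gs` and the log path are placeholders, not values taken from the question.

```shell
#!/bin/bash
# Hypothetical defaults: adjust the process name and log path for your system.
PROC_NAME=${PROC_NAME:-gs}
LOG_FILE=${LOG_FILE:-/tmp/gs-snapshots.log}

# Print one timestamped snapshot: load average plus the matching processes.
snapshot() {
    echo "=== $(date '+%Y-%m-%d %H:%M:%S') load: $(uptime | sed 's/.*load average[s]*: //')"
    # Columns: PID, parent PID, elapsed time, CPU%, memory%, command name.
    # Keep the header row and any row whose command name matches exactly.
    ps -eo pid,ppid,etime,pcpu,pmem,comm | awk -v name="$1" 'NR == 1 || $6 == name'
}

# Append one snapshot to the log on each run.
snapshot "$PROC_NAME" >> "$LOG_FILE"
```

Run periodically, this gives a timeline to compare against other system events when a hang is reported.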

Based on your output from an earlier question, it looks like you've allocated only 1 GB of RAM to this system and possibly only a single CPU, with no swap either...

If all else fails, you can write a script that can clean up old or stalled processes... or just compile a version of killall that supports the --older-than flag.
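
As a rough sketch of the `killall` route, assuming the installed `killall` (psmisc) is recent enough to support `--older-than`: the wrapper name `kill_older`, the `DRY_RUN` switch, and the process name `gs` are illustrative choices, not part of the original answer.

```shell
#!/bin/bash
# kill_older NAME AGE: signal every process named NAME that is older than AGE.
# AGE uses killall's own syntax: a number followed by a unit (s, m, h, d, w, M, y).
# Set DRY_RUN=1 to print the command instead of running it.
kill_older() {
    local name=$1 age=$2
    local cmd=(killall --older-than "$age" "$name")
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "${cmd[*]}"
    else
        "${cmd[@]}"
    fi
}
```

For example, `DRY_RUN=1 kill_older gs 5m` prints the command that would be run, and `kill_older gs 5m` sends the default SIGTERM to every `gs` process older than five minutes.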

Solution 2

Will something like this be ok?

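#!/bin/bash

PROC_NAME=my_proc_name
MAX_SECONDS=300   # 5 minutes

# Get all PIDs whose process name matches exactly.
# Unlike "ps aux | grep", pgrep never matches itself.
for pid in $(pgrep -x "$PROC_NAME"); do
    # etime is formatted [[dd-]hh:]mm:ss; the trailing "=" suppresses the header
    etime=$(ps -o etime= -p "$pid" | tr -d ' ')
    [ -z "$etime" ] && continue   # the process has already exited

    # Split on ":" and "-" and read the fields from the right:
    # seconds, minutes, then optional hours and days
    IFS=':-' read -r -a t <<< "$etime"
    n=${#t[@]}
    secs=$(( 10#${t[n-1]} + 60 * 10#${t[n-2]} ))
    [ "$n" -ge 3 ] && secs=$(( secs + 3600 * 10#${t[n-3]} ))
    [ "$n" -ge 4 ] && secs=$(( secs + 86400 * 10#${t[n-4]} ))

    # if the process has run longer than the limit then kill it
    if [ "$secs" -gt "$MAX_SECONDS" ]; then
        kill -9 "$pid"
    fi
done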

Of course it should be executed by cron or something like that to check processes periodically.
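
As a sketch of that cron setup, a crontab entry like the one below would run the script once a minute. The script and log paths are hypothetical; the entry would be installed with `crontab -e` as a user that is allowed to signal the target processes.

```
# m h dom mon dow  command
* * * * * /usr/local/bin/kill-stale-procs.sh >> /var/log/kill-stale-procs.log 2>&1
```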

Author by

AbetR

Updated on September 18, 2022

Comments

  • Zoredache
    Zoredache over 9 years
    Can you just start the original process with timeout?
  • ewwhite
    ewwhite over 9 years
    @Zoredache I've never even seen that before!
  • Jonathan
    Jonathan over 9 years
    I'm a programmer, not a sysadmin, so I'll be honest: I'm completely clueless about how to fix this the right way. Are you interested in taking a contract job to diagnose this for us? PM me your email if you're interested.
  • ewwhite
    ewwhite over 9 years
    @Jonathan Well, how often does this happen? Can you increase RAM on your Rackspace instance? Is it just one CPU?
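
Zoredache's `timeout` suggestion in the comments could look roughly like this, assuming GNU coreutils `timeout` is installed and that you control the command line that launches the process. Here `sleep 10` stands in for the real command, and the 2-second limit is only for demonstration.

```shell
#!/bin/bash
# Run a command under a time limit. `sleep 10` is a stand-in for the real
# long-running command (for example, the Ghostscript invocation).
# After 2 seconds timeout sends SIGTERM; -k 1 follows up with SIGKILL
# one second later if the command is still alive.
timeout -k 1 2 sleep 10
status=$?

# GNU timeout exits with status 124 when the time limit was hit.
if [ "$status" -eq 124 ]; then
    echo "command timed out"
fi
```

For the five-minute limit discussed here, the equivalent call would be `timeout -k 10 5m` followed by the real command.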