Monit "check program" and restart based on exit code

(sorry for the long answer in advance ^^)

Getting Things Right

But when I use check program Monit will not automatically start it. If the program is running and for some reason it stops with an exit code other than 0 the monit will not restart it (see my configuration below).

There is a significant difference between a process and a program in Monit:

A process is a binary running in the background, a daemon (like an HTTPd, a DB server, etc.). A process is not run by Monit, and Monit does not really control it. It can, however, interact with whatever daemonizes it. Example: systemctl start nginx will not run nginx in the foreground (and block your bash), but will start a daemon in the background (that keeps running even after you kill your session).

A program is a binary executed and controlled by Monit. It should exit after some time, and Monit reacts to how the run went. Example: du -hd1 will run, create output, and exit. You can then react to the result of the execution, based on the exit status, the output, etc.
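To make the "program" idea more concrete, here is a minimal sketch of the kind of script a check program entry could run. The function name check_dir_size and the limit are made up for illustration; the only point is that the script runs, prints something, and finishes with an exit status Monit can test.

```shell
#!/usr/bin/env bash
# Hypothetical check script body for a Monit "check program" entry.

# Returns 0 if the directory uses at most max_kb kilobytes, 1 otherwise.
check_dir_size() {
    local dir=$1 max_kb=$2 used_kb
    # du -sk prints "<size in KB><TAB><path>"; keep only the size
    used_kb=$(du -sk "$dir" | cut -f1)
    echo "$dir uses ${used_kb} KB (limit ${max_kb} KB)"
    [ "$used_kb" -le "$max_kb" ]
}

# In a real script the last line would be a call such as
#   check_dir_size /var/log 1048576
# so that the function's result becomes the script's exit status.
```

Monit would then only see the exit status (0 or 1) and the printed line, which is exactly what a status test reacts to.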


Notes on the Config

You are trying to use check program to daemonize your application. That will afaik not work. I think that you need to go for check process in this case. You need to daemonize your script for that. Getting back to that later...

    if status > 200 then restart
    if status < 201 then stop

Monit is there to interact if something unusual happens. This config tells Monit:

  • There is something unusual happening if status > 200
  • There is something unusual happening if status < 201

As a result, there is no success state where everything is okay: every possible exit status matches one of the two rules. You'll keep yourself in an alert loop ;)
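If you want to convince yourself of this, here is a small sketch that replays the two rules over every exit status a process can return (0 to 255). The function names are mine, and this only models the two conditions; it is not how Monit evaluates them internally.

```shell
#!/usr/bin/env bash

# Prints the action the two rules would trigger for a given exit status.
action_for_status() {
    local status=$1
    if [ "$status" -gt 200 ]; then
        echo restart
    elif [ "$status" -lt 201 ]; then
        echo stop
    else
        echo ok   # never reached for integer statuses
    fi
}

# Counts how many of the possible exit statuses (0..255) trigger no action.
count_ok_statuses() {
    local s ok=0
    for s in $(seq 0 255); do
        if [ "$(action_for_status "$s")" = ok ]; then
            ok=$((ok + 1))
        fi
    done
    echo "$ok"
}
```

Calling count_ok_statuses shows that no status is left over as "okay"; even exit status 0 falls under the stop rule.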

    if 2 restart 5 cycles then exec "/monit/custom_script.sh"
    if 2 restart 5 cycles then stop

These are two actions that react to the same event: Monit should both run a script and stop the service. It's possible that this will work, but it's hard to reason about...


Result - Kind Of

Best way to solve this would be to have a pidfile from your python-script. Since I do not know s*it about python, that's up to you ;-)

A super-dirty way to create yourself a pidfile would be a bash script (/monit/MyProgram-daemonize):

#!/usr/bin/env bash
(
    /monit/MyProgram.py &
    echo -n "$!" > "/tmp/MyProgram.pid"
) &

And another one (/monit/MyProgram-kill):

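#!/usr/bin/env bash

pidfile=/tmp/MyProgram.pid

if [[ -f "$pidfile" ]]; then
    pid=$(cat "$pidfile")
    kill -SIGTERM "$pid"
    # "wait" only works for children of this shell, and the daemon is not one,
    # so poll until the process is gone instead
    while kill -0 "$pid" 2>/dev/null; do
        sleep 1
    done
    rm -f "$pidfile"
    exit 0
fi

exit 1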

Comments:

  • I used /tmp/ instead of /run for permission reasons
  • I used kill -SIGTERM, because kill -SIGKILL or kill -9 are evil ;) You might have to adjust this...
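If SIGTERM alone turns out not to be enough for your program, one possible adjustment is to give it some time and only then fall back to SIGKILL. This is just a sketch: the function name stop_with_timeout and the 10 second default are my own assumptions, not anything Monit prescribes.

```shell
#!/usr/bin/env bash

# Sends SIGTERM to pid, waits up to timeout seconds for it to disappear,
# then falls back to SIGKILL as a last resort.
stop_with_timeout() {
    local pid=$1 timeout=${2:-10} waited=0
    # If the signal cannot be delivered, the process is already gone
    kill -TERM "$pid" 2>/dev/null || return 0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            kill -KILL "$pid" 2>/dev/null
            break
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

In the kill script this would be called as stop_with_timeout "$(cat /tmp/MyProgram.pid)" 10, followed by removing the pidfile.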

You can use check process then:

check process MyProgram pidfile "/tmp/MyProgram.pid"
  start program = "/monit/MyProgram-daemonize" as uid MyNonRootUserHere
  stop program = "/monit/MyProgram-kill" as uid MyNonRootUserHere

  if does not exist then restart
  if 3 restarts within 5 cycles then unmonitor

The biggest flaw with this approach is the possible reuse of pids. Monit has no way to connect the pidfile to the binary. So if your python script starts with pid 100 and gets killed somehow, and another process then takes pid 100, Monit will not notice and will think everything is fine. You should therefore add a further check (these are only examples; if your program does not provide an HTTPd, the HTTP check will not be the right one):

  if failed uid MyNonRootUserHere then restart

  if failed
    host 127.0.0.1
    port 80
    protocol http
    request "/"
  then alert
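
The pid-reuse problem can also be reduced on the script side, by checking that the pid in the pidfile still belongs to the expected program before trusting it. A rough sketch, assuming Linux (it reads /proc/<pid>/cmdline) and using a helper name, pid_matches_program, that I made up:

```shell
#!/usr/bin/env bash

# Returns 0 only if the pid stored in pidfile is alive AND its command line
# contains the expected string. Linux-only: relies on /proc/<pid>/cmdline.
pid_matches_program() {
    local pidfile=$1 expected=$2 pid cmdline
    [ -f "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    [ -r "/proc/$pid/cmdline" ] || return 1
    # cmdline separates arguments with NUL bytes; turn them into spaces
    cmdline=$(tr '\0' ' ' < "/proc/$pid/cmdline")
    case $cmdline in
        *"$expected"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

The kill script could, for example, call pid_matches_program /tmp/MyProgram.pid MyProgram.py and refuse to send a signal when it fails, so that it never kills an unrelated process that happened to inherit the pid.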

Onema
Author by

Onema

Software Engineer interested in serverless technologies, distributed systems, automation, AWS, functional programming, best practices, Scala development, OSS, component development.

Updated on September 18, 2022

Comments

  • Onema
    Onema over 1 year

    When I use check process, monit will start the program I define under start program then monit will restart it if it stops.

    But when I use check program monit will not automatically start it. If the program is running and for some reason it stops with an exit code other than 0 the monit will not restart it (see my configuration below).

    I’m really not sure how to properly start and restart the program based on my exit codes.

    My config file looks like this:

    set logfile /tmp/monit.log
    
    set daemon  1
    check program MyProgram with path "/monit/MyProgram.py"
            and with timeout 3600 seconds
        every 1 cycles
        start program = "/monit/MyProgram.py" with timeout 3600 seconds
        if status > 200 then restart
        if status < 201 then stop
        if 2 restart 5 cycles then exec "/monit/custom_script.sh"
        if 2 restart 5 cycles then stop
    

    and I have tried starting monit like this:

    • monit -c monitrc -vv
    • monit -c monitrc start all -vv
    • monit -c monitrc start MyProgram -vv