Simulate Nagios notifications


Solution 1

If you only want to verify that the email alerts are working properly, you could create a simple test service, which generates a warning once a day.

test_alert.sh:

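#!/bin/bash

# Current UTC time as HHMM, e.g. 0830 or 1915.
now=$(date -u +%H%M)

echo "$now"
echo "Nagios test script. Intentionally generates a warning daily."

# 10# forces base 10; otherwise times such as 0830 are parsed as octal and fail.
# Exit 1 (WARNING) between 19:00 and 19:20 UTC, otherwise exit 0 (OK).
if (( 10#$now >= 1900 && 10#$now <= 1920 )); then
  exit 1
else
  exit 0
fi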

commands.cfg:

define command{
  command_name  test_alert
  command_line  /bin/bash /usr/local/scripts/test_alert.sh
}

services.cfg:

define service {
  host_name             localhost
  service_description   Test Alert
  check_command         test_alert
  use                   generic-service
}
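
To try the time-window logic of test_alert.sh without waiting for 19:00 UTC, the comparison can be moved into a function that takes the time as an argument. This is only a sketch: the function name `window_status` is hypothetical and not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper: prints the exit status test_alert.sh would return
# for a given UTC time in HHMM form (1 = WARNING, 0 = OK).
window_status() {
  local hhmm=$1
  # 10# forces base 10, so times like 0830 are not parsed as octal.
  if (( 10#$hhmm >= 1900 && 10#$hhmm <= 1920 )); then
    echo 1
  else
    echo 0
  fi
}

window_status 0830   # prints 0
window_status 1910   # prints 1
window_status 1921   # prints 0
```

Note that the service only catches the warning if its check interval is short enough for at least one check to land inside the 20-minute window.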

Solution 2

This is an old post but maybe my solution can help someone.

I use the "check_dummy" plugin, which ships with the Nagios plugins package. As its name says, it is a dummy: it simply returns the state you pass to it.

Some examples of how it works:

Usage:
 check_dummy <integer state> [optional text]
$ ./check_dummy 0
OK
$ ./check_dummy 2
CRITICAL
$ ./check_dummy 3 salut
UNKNOWN: salut
$ ./check_dummy 1 azerty
WARNING: azerty
$ echo $?
1
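
The output above follows the usual Nagios plugin convention: the exit status carries the state (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) and the first line of output is the status text. A rough shell imitation of that behavior, for illustration only; `dummy_check` is a hypothetical function, not the real plugin's source:

```shell
#!/bin/bash
# Hypothetical imitation of check_dummy: print a state label plus optional
# text, and return the state (0-3) as the exit status.
dummy_check() {
  local state=$1; shift
  local text="$*"
  local label
  case "$state" in
    0) label="OK" ;;
    1) label="WARNING" ;;
    2) label="CRITICAL" ;;
    3) label="UNKNOWN" ;;
    *) echo "UNKNOWN: invalid state '$state'"; return 3 ;;
  esac
  if [ -n "$text" ]; then
    echo "$label: $text"
  else
    echo "$label"
  fi
  return "$state"
}

status=0
dummy_check 1 azerty || status=$?   # prints: WARNING: azerty
echo "exit status: $status"         # prints: exit status: 1
```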

I create a file which contains the integer state and the optional text:

echo 0 OKAY | sudo tee /usr/local/nagios/libexec/dummy.txt
sudo chown nagios:nagios /usr/local/nagios/libexec/dummy.txt

With the command definition:

# Dummy check (notifications tests)
define command {
    command_name    my_check_dummy
    command_line    $USER1$/check_dummy $(cat /usr/local/nagios/libexec/dummy.txt)
}

Associated with the service definition:

define service {
    use                             generic-service
    host_name                       localhost
    service_description             Dummy check
    check_period                    24x7
    check_interval                  1
    max_check_attempts              1
    retry_interval                  1
    notifications_enabled           1
    notification_options            w,u,c,r
    notification_interval           0
    notification_period             24x7
    check_command                   my_check_dummy
}

So I just change the contents of the file "dummy.txt" to change the service state:

echo "2 Oups" | sudo tee /usr/local/nagios/libexec/dummy.txt
echo "1 AHHHH" | sudo tee /usr/local/nagios/libexec/dummy.txt
echo "0 Parfait !" | sudo tee /usr/local/nagios/libexec/dummy.txt
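
Since a typo in the state number would make the check report something unintended, the echo commands can be wrapped in a small helper that validates the state first. This is a sketch under assumptions: `set_dummy_state` and the `DUMMY_FILE` override are hypothetical names, and writing to the real file still needs suitable permissions (for example via sudo or as the nagios user).

```shell
#!/bin/bash
# Hypothetical helper: validate the state, then write "<state> <text>"
# to the file read by my_check_dummy.
# DUMMY_FILE is an assumed override; the default is the path used above.
set_dummy_state() {
  local file=${DUMMY_FILE:-/usr/local/nagios/libexec/dummy.txt}
  local state=$1; shift
  case "$state" in
    0|1|2|3) ;;
    *) echo "usage: set_dummy_state <0|1|2|3> [text]" >&2; return 64 ;;
  esac
  printf '%s %s\n' "$state" "$*" > "$file"
}

# Demo against a temporary file, so no sudo is needed.
DUMMY_FILE=$(mktemp)
set_dummy_state 2 "Oups"
cat "$DUMMY_FILE"   # prints: 2 Oups
```

With check_interval set to 1, the new state should be picked up at the next check, about a minute later.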

This allowed me to debug my notification program.

Hope it helps!

Author by

Admin

Updated on June 04, 2022

Comments

  • Admin
    Admin about 2 years

    My normal method of testing the notification and escalation chain is to simulate a failure by causing one, for example by blocking a port.

    But this is thoroughly unsatisfying. I don't want downtime recorded in Nagios where there was none. I also don't want to wait.

    Does anyone know a way to test a notification chain without causing the outage? For example something like this:

    $ ./check_notifications_chain <service|host> <time down>
    at <x> minutes notification email sent to group <people>
    at <2x> minutes notification email sent to group <people>
    at <3x> minutes escalated to group <management>
    at <200x> rm -rf; shutdown -h now executed.
    

    Extending this paradigm, I might make the notification chain a Nagios check in itself, but I'll stop here before my brain explodes.

    Anyone?

  • Philip Kearns
    Philip Kearns over 9 years
    How is this called? I thought it would have to be associated with a server otherwise it wouldn't get called. Clearly I'm missing something here.
  • sekrett
    sekrett almost 8 years
    What do you mean by a server? In Nagios you define hosts, services and commands. That's enough for a test email.