Nagios: NRPE: Unable to read output, Can't find the reason, can you?


Solution 1

Nice detailed write-up Itai! Have you tried reducing the complexity of the config to see if it works?

For starters, I would change the line in nrpe.cfg to

command[check_kvm]=/usr/lib64/nagios/plugins/check_kvm

and temporarily change the /usr/lib64/nagios/plugins/check_kvm script to be something really simple like:

#!/bin/sh
echo Hi
exit 0

If that works, then you can start ratcheting up the complexity. Perhaps, instead of giving the nagios user sudo access to the script, what it really needs is access to the virsh command, in which case you can leave out the sudo part in the nrpe.cfg command line.
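One way to get more out of the simple stub is to have it record who is running it and with what environment, since that is what differs between a manual run and a run under NRPE. The following is a minimal sketch, not a definitive recipe: the stub path and log path are hypothetical choices, and it assumes the temporary directory is writable by the nagios user. In practice the stub body would temporarily stand in for /usr/lib64/nagios/plugins/check_kvm.

```shell
# Sketch only: write a debug stub to a scratch path (hypothetical location).
stub="${TMPDIR:-/tmp}/check_kvm_stub.sh"
cat > "$stub" <<'EOF'
#!/bin/sh
# Record who is running the plugin and with what environment.
LOG="${TMPDIR:-/tmp}/check_kvm_debug.log"   # hypothetical log path
{
  date
  id
  echo "PATH=$PATH"
  echo "HOME=$HOME"
  command -v virsh || echo "virsh not found in PATH"
} >> "$LOG" 2>&1
# Plugin output and exit status stay as simple as in the stub above.
echo Hi
exit 0
EOF
chmod +x "$stub"
sh "$stub"
```

Comparing the log entries written by a manual run with those written by a check_nrpe run should show whether the user, PATH, or HOME differ.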

Solution 2

I had the same issue and managed to solve it by killing the nagios process (on the monitored machine):

ps -ef | grep nagios
kill -9 [NagiosProcessNumber]
/etc/init.d/nagios-nrpe-server start

All went fine after that.
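Since kill -9 gives the process no chance to clean up, a gentler variant may be worth trying first. This is a minimal sketch under stated assumptions: stop_pid is a hypothetical helper name, the daemon's process name is assumed to be nrpe (as in the ps output in the question), and pgrep is assumed to be installed.

```shell
# stop_pid (hypothetical helper): send SIGTERM first, and SIGKILL only if
# the process is still alive after about five seconds.
stop_pid() {
  pid=$1
  kill "$pid" 2>/dev/null || return 0      # nothing to stop
  i=0
  while kill -0 "$pid" 2>/dev/null && [ "$i" -lt 5 ]; do
    sleep 1
    i=$((i + 1))
  done
  if kill -0 "$pid" 2>/dev/null; then
    kill -9 "$pid" 2>/dev/null             # last resort
  fi
  return 0
}

# Usage on the monitored machine (assumes the daemon is named nrpe):
# for pid in $(pgrep -x nrpe); do stop_pid "$pid"; done
# /etc/init.d/nagios-nrpe-server start
```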

Solution 3

I saw a problem on a Gentoo server that resembles yours at http://forums.gentoo.org/viewtopic-t-806014-start-0.html

There is a nice method there for debugging the issue.

The user in that post had a problem with check_disk and got the exact same error message as you.

He was told to execute the following command:

ssh remote_ip /usr/lib/nagios/plugins/check_disk -w 10 -c 5 -p "/"  2>&1

The 2>&1 redirects stderr to stdout, so any error message shows up in the output and might reveal the exact error.

In your case, replace remote_ip with the IP address of the server on which check_nrpe fails, and replace the check_disk command with the full command that check_kvm is supposed to execute. If you run it without any parameters, you can simply execute:

  ssh <remote_ip> /usr/lib64/nagios/plugins/check_kvm 2>&1

Hopefully that will reveal information about the problem.
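One possible difference between an ssh session and the NRPE daemon is the environment the plugin inherits. The sketch below is an assumption-laden approximation, not how NRPE itself launches plugins: run_bare is a hypothetical helper that runs a command with an empty environment via env -i, merges stderr into stdout, and reports the exit status.

```shell
# run_bare (hypothetical helper): run a command with an empty environment,
# capture stdout and stderr together, and print the exit status afterwards.
run_bare() {
  out=$(env -i "$@" 2>&1)
  status=$?
  printf '%s\n' "$out"
  printf 'exit status: %s\n' "$status"
}

# Example, using the plugin path from the question:
# run_bare /usr/lib64/nagios/plugins/check_kvm
```

If the plugin behaves differently here than in a login shell, the environment (PATH, HOME, and similar variables) is a likely suspect.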

Good luck!


Author: Itai Ganot

Architect and Lecturer in the field of DevOps Engineering. LinkedIn: https://www.linkedin.com/in/itaiganot Personal Website: http://geek-kb.com

Updated on September 18, 2022

Comments

  • Itai Ganot
    Itai Ganot almost 2 years

    I have a Nagios server and a monitored server. On the monitored server:

    [root@Monitored ~]# netstat -an |grep :5666
    tcp        0      0 0.0.0.0:5666                0.0.0.0:*                   LISTEN      
    [root@Monitored ~]# locate check_kvm
    /usr/lib64/nagios/plugins/check_kvm
    [root@Monitored ~]# /usr/lib64/nagios/plugins/check_kvm -H localhost
    hosts:3 OK:3 WARN:0 CRIT:0 - ab2c7:running alpweb5:running istaweb5:running
    [root@Monitored ~]# /usr/lib64/nagios/plugins/check_nrpe -H localhost -c check_kvm
    NRPE: Unable to read output
    [root@Monitored ~]# /usr/lib64/nagios/plugins/check_nrpe -H localhost
    NRPE v2.14
    [root@Monitored ~]# ps -ef |grep nrpe
    nagios   21178     1  0 16:11 ?        00:00:00 /usr/sbin/nrpe -c /etc/nagios/nrpe.cfg -d
    [root@Monitored ~]#
    

    On the Nagios server:

    [root@Nagios ~]# /usr/lib64/nagios/plugins/check_nrpe -H 1.1.1.159 -c check_kvm
    NRPE: Unable to read output
    [root@Nagios ~]# /usr/lib64/nagios/plugins/check_nrpe -H 1.1.1.159
    NRPE v2.14
    [root@Nagios ~]#
    

    When I check another server in the network using the same command it works:

    [root@Nagios ~]# /usr/lib64/nagios/plugins/check_nrpe -H 1.1.1.80 -c check_kvm
    hosts:4 OK:4 WARN:0 CRIT:0 - karmisoft:running ab2c4:running kidumim1:running travel2gether1:running
    [root@Nagios ~]#
    

    Running the check locally using the nagios account:

    [root@Monitored ~]# su - nagios
    -bash-4.1$ /usr/lib64/nagios/plugins/check_kvm
    hosts:3 OK:3 WARN:0 CRIT:0 - ab2c7:running alpweb5:running istaweb5:running
    -bash-4.1$
    

    Running the check remotely from the Nagios server using the nagios account:

    -bash-4.1$ /usr/lib64/nagios/plugins/check_nrpe -H 1.1.1.159 -c check_kvm
    NRPE: Unable to read output
    -bash-4.1$ /usr/lib64/nagios/plugins/check_nrpe -H 1.1.1.159
    NRPE v2.14
    -bash-4.1$
    

    Running the same check_kvm against a different server in the network using the nagios account:

    -bash-4.1$ /usr/lib64/nagios/plugins/check_nrpe -H 1.1.1.80 -c check_kvm
    hosts:4 OK:4 WARN:0 CRIT:0 - karmisoft:running ab2c4:running kidumim1:running travel2gether1:running
    -bash-4.1$ 
    

    Permissions:

    -rwxr-xr-x. 1 root root 4684 2013-10-14 17:14 nrpe.cfg (aka /etc/nagios/nrpe.cfg)
    drwxrwxr-x. 3 nagios nagios 4096 2013-10-15 03:38 plugins (aka /usr/lib64/nagios/plugins)
    

    /etc/sudoers:

    [root@Monitored ~]# grep -i requiretty /etc/sudoers
    #Defaults    requiretty
    

    iptables/selinux:

    [root@Monitored xinetd.d]# service iptables status
    iptables: Firewall is not running.
    [root@Monitored xinetd.d]# service ip6tables status
    ip6tables: Firewall is not running.
    [root@Monitored xinetd.d]# grep disable /etc/selinux/config 
    #     disabled - No SELinux policy is loaded.
    SELINUX=disabled
    [root@Monitored xinetd.d]#
    

    The command in /etc/nagios/nrpe.cfg is:

    [root@Monitored ~]# grep kvm /etc/nagios/nrpe.cfg 
    command[check_kvm]=sudo /usr/lib64/nagios/plugins/check_kvm
    

    and the nagios user has been added to /etc/sudoers:

    nagios  ALL=(ALL) NOPASSWD:/usr/lib64/nagios/plugins/check_kvm
    nagios  ALL=(ALL) NOPASSWD:/usr/lib64/nagios/plugins/check_nrpe
    

    check_kvm is a shell script that looks like this:

    #!/bin/sh
    
    LIST=$(virsh list --all | sed '1,2d' | sed '/^$/d'| awk '{print $2":"$3}')
    
    if [ ! "$LIST" ]; then
      EXITVAL=3 #Status 3 = UNKNOWN (orange) 
      echo "Unknown guests"
      exit $EXITVAL
    fi
    
    OK=0
    WARN=0
    CRIT=0
    NUM=0
    
    for host in $(echo $LIST)
    do
      name=$(echo $host | awk -F: '{print $1}')
      state=$(echo $host | awk -F: '{print $2}')
      NUM=$(expr $NUM + 1)
    
      case "$state" in
        running|blocked) OK=$(expr $OK + 1) ;;
        paused) WARN=$(expr $WARN + 1) ;;
        shutdown|shut*|crashed) CRIT=$(expr $CRIT + 1) ;;
        *) CRIT=$(expr $CRIT + 1) ;;
      esac
    done
    
    if [ "$NUM" -eq "$OK" ]; then
      EXITVAL=0 #Status 0 = OK (green)
    fi
    
    if [ "$WARN" -gt 0 ]; then
      EXITVAL=1 #Status 1 = WARNING (yellow)
    fi
    
    if [ "$CRIT" -gt 0 ]; then
      EXITVAL=2 #Status 2 = CRITICAL (red)
    fi
    
    echo hosts:$NUM OK:$OK WARN:$WARN CRIT:$CRIT - $LIST
    
    exit $EXITVAL
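
    The parsing and counting logic above can be exercised without libvirt by feeding it canned text in the shape of `virsh list --all` output, which helps separate a parsing problem from virsh returning nothing. This is a minimal sketch, not the script itself: summarize_guests is a hypothetical function name, it reads the listing from stdin, and it uses shell arithmetic instead of expr.

    ```shell
    # summarize_guests (hypothetical): read `virsh list --all`-style text on
    # stdin, print the summary line, and return the Nagios status code.
    summarize_guests() {
      list=$(sed '1,2d' | sed '/^$/d' | awk '{print $2":"$3}')
      if [ -z "$list" ]; then
        echo "Unknown guests"
        return 3                       # UNKNOWN
      fi
      ok=0 warn=0 crit=0 num=0
      for host in $list; do
        state=${host#*:}               # text after the first colon
        num=$((num + 1))
        case "$state" in
          running|blocked) ok=$((ok + 1)) ;;
          paused)          warn=$((warn + 1)) ;;
          *)               crit=$((crit + 1)) ;;
        esac
      done
      echo "hosts:$num OK:$ok WARN:$warn CRIT:$crit -" $list
      [ "$crit" -gt 0 ] && return 2    # CRITICAL
      [ "$warn" -gt 0 ] && return 1    # WARNING
      return 0                         # OK
    }

    # Usage: virsh list --all | summarize_guests
    ```

    An empty input produces "Unknown guests", which is what the script prints when virsh returns no listing.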
    

    Edit (10/22/13): Following all that, I am now able to get some response from the script:

    [root@Monitored ~]# /usr/lib64/nagios/plugins/check_nrpe -H localhost -c check_kvm
    Unknown guests
    [root@Monitored ~]# /usr/lib64/nagios/plugins/check_nrpe -H localhost
    NRPE v2.14
    [root@Monitored ~]# /usr/lib64/nagios/plugins/check_kvm
    hosts:3 OK:3 WARN:0 CRIT:0 - ab2c7:running alpweb5:running istaweb5:running
    [root@Monitored ~]# su - nagios
    -bash-4.1$ /usr/lib64/nagios/plugins/check_kvm
    hosts:3 OK:3 WARN:0 CRIT:0 - ab2c7:running alpweb5:running istaweb5:running
    -bash-4.1$ /usr/lib64/nagios/plugins/check_nrpe -H localhost -c check_kvm
    Unknown guests
    -bash-4.1$ /usr/lib64/nagios/plugins/check_nrpe -H localhost
    NRPE v2.14
    

    It seems like the problem is somehow related to the check_nrpe command or to the NRPE installation on the server.

    Edit 12/2/13: Other checks on the problematic server work.

  • Itai Ganot
    Itai Ganot over 10 years
    I have tried it and am still getting NRPE: Unable to read output. Any more suggestions?
  • KJH
    KJH over 10 years
    What are the ownership and permissions on /usr/lib64/nagios/plugins/check_kvm ?
  • KJH
    KJH over 10 years
    Did you try changing the script itself to something simple? I don't think the basic "Hi" one I suggested would be 2581 bytes?
  • Itai Ganot
    Itai Ganot over 10 years
    Yes, I've tried changing the script, but to no avail. More than that, the script works just fine when checked against another server or when I run it locally; only when I use the check_nrpe -H localhost -c check_kvm method does it return Unknown guests.
  • KJH
    KJH over 10 years
    Hi Itai - join this chat room: chat.stackexchange.com/rooms/11147/…
  • Itai Ganot
    Itai Ganot over 10 years
    It seems I missed you in the chat, but I've updated the question. Thank you.
  • KJH
    KJH over 10 years
    Try turning on debug in NRPE (might need a restart) and capture the output from wherever it logs to.
  • Itai Ganot
    Itai Ganot over 10 years
    Unfortunately, I get the same outputs:

    [root@Nagios-SRV ~]# ssh 1.1.1.159 /usr/lib64/nagios/plugins/check_kvm "/" 2>&1
    root@1.1.1.159's password:
    hosts:3 OK:3 WARN:0 CRIT:0 - ab2c7:running alpweb5:running istaweb5:running
    [root@Nagios-SRV ~]# ssh 1.1.1.159 /usr/lib64/nagios/plugins/check_nrpe -H localhost "/" 2>&1
    root@1.1.1.159's password:
    NRPE v2.14
    [root@Nagios-SRV ~]# ssh 1.1.1.159 /usr/lib64/nagios/plugins/check_nrpe -H localhost -c check_kvm "/" 2>&1
    root@1.1.1.159's password:
    Unknown guests
    [root@Nagios-SRV ~]#
  • ufk
    ufk over 10 years
    Have you tried running other scripts like check_disk? Does this behaviour happen with every script or just this one?
  • KJH
    KJH over 10 years
    Have you tried running virsh list --all as root and as nagios on that system?
  • Itai Ganot
    Itai Ganot over 10 years
    Yes, everything else works, and so does the check_kvm script when checking other remote machines.
  • Some Linux Nerd
    Some Linux Nerd almost 10 years
    Mine says "sorry, you must have a tty to run sudo" :)