I have a host on which, for specific reasons, I cannot install the NRPE daemon. To monitor its internals, I have to make it report its own status to the Nagios master through passive checks.
First, install the following batchcheck.sh script on the target host.
#!/bin/sh
# File: /etc/nagios/batchcheck.sh
# Batch run passive checks
# customize your config here:
NAGIOS_SERVER_IP=10.0.0.1
MY_HOSTNAME=example-host
SEND_NSCA=/usr/bin/send_nsca
SEND_NSCA_CFG=/etc/nagios/send_nsca.cfg
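# directory holding the check scripts (one file per service)
CHECKS_DIR=/etc/nagios/checks
for check_cmd in "$CHECKS_DIR"/*; do
    # skip anything that is not an executable file
    [ -x "$check_cmd" ] || continue
    # the file name doubles as the service description on the master
    svc_description=$(basename "$check_cmd")
    plugin_output=$("$check_cmd")
    return_code=$?
    printf "%s\t%s\t%s\t%s\n" "$MY_HOSTNAME" "$svc_description" "$return_code" "$plugin_output" | "$SEND_NSCA" -H "$NAGIOS_SERVER_IP" -c "$SEND_NSCA_CFG"
done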
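To see what the script actually hands to send_nsca, it can help to build one line by hand without sending it. The sketch below is only an illustration: format_nsca_line is a hypothetical helper name that is not part of the script above, and the sample plugin output is made up. It reproduces the tab-separated layout (host name, service description, return code, plugin output) and replaces the tabs with | so they are visible on a terminal.

```shell
#!/bin/sh
# Hypothetical helper (not in batchcheck.sh): builds one passive-check
# line in the same tab-separated layout that the script pipes to send_nsca.
format_nsca_line() {
    # $1 = host name, $2 = service description,
    # $3 = plugin return code, $4 = plugin output
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4"
}

# Dry run: print the line with tabs shown as '|' instead of sending it.
format_nsca_line "example-host" "CPULoad" 0 "OK - sample output" | tr '\t' '|'
```

The host name and service description in that line have to match the host_name and service_description defined on the master, otherwise the result cannot be assigned to a service.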
Then, put all check scripts in a folder, for example /etc/nagios/checks. Each script must be executable, and its file name is used as the service description.
Example for CPULoad:
# File: /etc/nagios/checks/CPULoad
/usr/lib/nagios/plugins/check_load -w 5.0,4.0,3.0 -c 10.0,6.0,4.0
Add a cron job to run the batchcheck.sh script periodically. Example:
# File: /etc/cron.d/nagios_checks
1-59/5 * * * * nagios /etc/nagios/batchcheck.sh
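The minute field 1-59/5 means "every fifth minute, starting at minute 1". As a quick worked example, the following sketch lists the minutes that such a range/step expression selects. cron_minutes is a hypothetical helper written only for this illustration; it is not a cron tool.

```shell
#!/bin/sh
# Hypothetical helper: expand a cron "start-end/step" minute field
# into the list of minutes it matches.
cron_minutes() {
    # $1 = range start, $2 = range end, $3 = step
    m=$1
    out=""
    while [ "$m" -le "$2" ]; do
        out="${out:+$out }$m"
        m=$((m + $3))
    done
    printf '%s\n' "$out"
}

cron_minutes 1 59 5
# prints: 1 6 11 16 21 26 31 36 41 46 51 56
```

So the host reports twelve times per hour, which is well inside the 1800-second freshness threshold used in the service template.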
Make sure NSCA (Nagios Service Check Acceptor) is enabled on the Nagios master.
Define the template for passive checking:
define service{
name example-passive-generic-service
...
active_checks_enabled 0
check_freshness 1
freshness_threshold 1800
check_command no_report_warn
...
}
And a sample service definition for the CPULoad check script:
define service{
use example-passive-generic-service
host_name example-host
service_description CPULoad
...
}
no_report_warn calls a script that sets the service status to WARNING. It is needed because, once the freshness_threshold is exceeded, Nagios forces an active check by running the service's check_command, and there is no way to actively check this host. This happens whenever, for some reason, we stop receiving status reports. The no_report_warn command must be defined on the master so that it runs the following script:
#!/bin/bash
# File: /usr/lib/nagios/plugins/no_report_warn.sh
echo "WARNING: Did not receive service status report for a long time!"
exit 1
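The freshness logic can be pictured with a small sketch. This is a simplified model, not how Nagios implements it internally: is_stale is a hypothetical helper, the timestamps are made-up epoch seconds, and Nagios's own calculation has more details than a plain comparison.

```shell
#!/bin/sh
# Simplified model of a freshness check (assumption: a result is stale
# when it is older than the threshold; Nagios itself does more than this).
is_stale() {
    # $1 = time of last passive result, $2 = current time,
    # $3 = freshness threshold, all in seconds
    [ $(( $2 - $1 )) -gt "$3" ]
}

# Last report 2000 seconds ago, threshold 1800 seconds.
if is_stale 1000 3000 1800; then
    echo "stale: the master would run check_command (no_report_warn)"
else
    echo "fresh: the last passive result is kept"
fi
```

With the five-minute cron job, a single missed report does not trip the threshold. The service only goes to WARNING after roughly half an hour of silence.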