====== Passive Monitoring ======
I have a host on which, for specific reasons, I cannot install the NRPE daemon. To monitor its internals, the host has to report its own status to the Nagios master (passive checks).
===== Install the batchcheck.sh script =====
First, install the following ''batchcheck.sh'' script on the target host.
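  #!/bin/sh
  # File: /etc/nagios/batchcheck.sh
  # Batch run passive checks and report each result to the Nagios master
  # customize your config here:
  NAGIOS_SERVER_IP=10.0.0.1
  MY_HOSTNAME=example-host
  SEND_NSCA=/usr/bin/send_nsca
  SEND_NSCA_CFG=/etc/nagios/send_nsca.cfg
  # folder holding the individual check scripts (not the stock plugins folder)
  CHECKS_DIR=/etc/nagios/checks
  for check_cmd in "$CHECKS_DIR"/*; do
      # skip anything that is not an executable regular file
      [ -f "$check_cmd" ] && [ -x "$check_cmd" ] || continue
      # the script name doubles as the Nagios service description
      svc_description=$(basename "$check_cmd")
      plugin_output=$("$check_cmd")
      return_code=$?
      # send_nsca expects: host<TAB>service<TAB>return code<TAB>output
      printf "%s\t%s\t%s\t%s\n" "$MY_HOSTNAME" "$svc_description" "$return_code" "$plugin_output" | "$SEND_NSCA" -H "$NAGIOS_SERVER_IP" -c "$SEND_NSCA_CFG"
  done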
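Each line piped to ''send_nsca'' is tab-separated: host name, service description, return code, plugin output. To inspect that format without contacting the master, the formatting step can be tried in isolation. A minimal sketch; ''format_result'' is a hypothetical helper name used only for illustration and is not part of the script above:

```shell
#!/bin/sh
# Hypothetical helper: build one NSCA service-check line.
# Fields are tab-separated: host, service description, return code, plugin output.
format_result() {
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4"
}

# Print a sample line with the tabs shown as "|" so the fields are visible
format_result example-host CPULoad 0 "OK - load average: 0.10, 0.20, 0.30" | tr '\t' '|'
# prints: example-host|CPULoad|0|OK - load average: 0.10, 0.20, 0.30
```

The return code in the third field is what Nagios uses as the service state, so it has to be the exit status of the check itself.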
===== Prepare individual check scripts =====
Then put all check scripts in one folder, for example ''/etc/nagios/checks'', and make them executable.
The name of each script must match the service description in the Nagios configuration.
Example for the **CPULoad** service:
  #!/bin/sh
  # File: /etc/nagios/checks/CPULoad
  /usr/lib/nagios/plugins/check_load -w 5.0,4.0,3.0 -c 10.0,6.0,4.0
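A check script does not have to wrap a stock plugin. Any executable works as long as it prints one line of status text and exits with the standard Nagios plugin return code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN), because both are forwarded to the master. A minimal sketch of that status logic; the name ''check_usage'', the thresholds, and the disk-usage scenario are made up for illustration:

```shell
#!/bin/sh
# Hypothetical thresholds (percent), chosen only for this example
WARN=80
CRIT=90

# Map a usage percentage to Nagios status text and return code.
check_usage() {
    pct=$1
    case $pct in
        # empty or non-numeric input cannot be evaluated
        ''|*[!0-9]*) echo "UNKNOWN: could not read disk usage"; return 3 ;;
    esac
    if [ "$pct" -ge "$CRIT" ]; then
        echo "CRITICAL: / is ${pct}% full"; return 2
    elif [ "$pct" -ge "$WARN" ]; then
        echo "WARNING: / is ${pct}% full"; return 1
    fi
    echo "OK: / is ${pct}% full"; return 0
}

# In a real check the percentage would come from the system, for example:
#   pct=$(df -P / | awk 'NR==2 { sub(/%/, "", $5); print $5 }')
check_usage 42    # prints "OK: / is 42% full" and returns 0
```

Because the function call is the last command, its return code becomes the exit status of the script, which is the value the batch script reports.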
===== Schedule check run =====
Add a cron job that runs the ''batchcheck.sh'' script periodically. The example below runs it as user ''nagios'' every five minutes, starting at minute 1 of each hour:
  # File: /etc/cron.d/nagios_checks
  1-59/5 * * * * nagios /etc/nagios/batchcheck.sh
===== Enable NSCA =====
Make sure that NSCA (Nagios Service Check Acceptor) is [[building_from_source_on_centos_5#NSCA|enabled]] on the Nagios master.
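The client's ''send_nsca.cfg'' (the file passed with ''-c'' to ''send_nsca'') has to agree with ''nsca.cfg'' on the master: both need the same password, and the client's ''encryption_method'' must equal the master's ''decryption_method''. Nagios itself must also accept external commands and passive service results. A sketch with placeholder values; the password, the method number, and the file locations on the master are assumptions to be adapted to your installation:

```
# /etc/nagios/send_nsca.cfg  (on the monitored host)
password=CHANGE_ME
encryption_method=1

# nsca.cfg  (on the Nagios master; location depends on the installation)
password=CHANGE_ME
decryption_method=1

# nagios.cfg  (on the Nagios master)
check_external_commands=1
accept_passive_service_checks=1
```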
===== Configure service for passive check =====
Define the template for passive checking:
  define service{
      name                    example-passive-generic-service
      ...
      active_checks_enabled   0
      check_freshness         1
      freshness_threshold     1800
      check_command           no_report_warn
      ...
  }
A sample service definition for the **CPULoad** [[#prepare_individual_check_scripts|check script]]:
  define service{
      use                     example-passive-generic-service
      host_name               example-host
      service_description     CPULoad
      ...
  }
== Notes ==
''no_report_warn'' calls a script that sets the service status to **Warning**. It is needed because once ''freshness_threshold'' (1800 seconds in the template above) is exceeded, Nagios forces an active check of the service, and no real active check is possible for this host. This happens whenever status reports stop arriving for some reason.
  #!/bin/bash
  # file: /usr/lib/nagios/plugins/no_report_warn.sh
  # Return code 1 means WARNING in the Nagios plugin convention
  echo "WARNING: Did not receive service status report for a long time!"
  exit 1
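The service template refers to this script by command name, so a matching command definition is needed on the master. A sketch, assuming the ''$USER1$'' macro points at ''/usr/lib/nagios/plugins'' (this is set in ''resource.cfg'' and may differ on your system):

```
define command{
    command_name    no_report_warn
    command_line    $USER1$/no_report_warn.sh
}
```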
----