HP Insight Control Software for Linux User Manual
Reports the number of new records processed in the
/hptc_cluster/adm/logs/consolidated.log file.
A warning or critical message occurs when there is insufficient time to process a huge volume of
messages before the Nagios service_check_timeout period expires.
Nagios examines the recent incoming consolidated log messages and issues a warning or critical
message if the incoming rate since the last interval exceeds a configured number of records. The
default values are 2 for warnings and 20 for critical. See
/opt/hptc/nagios/libexec/check_syslogalerts for details.
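The threshold comparison described above can be sketched in shell. This is only an illustration of the logic, not the actual check_syslogalerts code. The function name classify_rate is hypothetical, the strict "greater than" comparison is an assumption based on the word "exceeds", and the return values follow the usual Nagios plug-in convention (0 for OK, 1 for WARNING, 2 for CRITICAL).

```shell
# Hypothetical helper, not part of the product: classify a count of new
# log records against warning and critical thresholds.
classify_rate() {
    cr_count=$1
    cr_warn=${2:-2}     # default warning threshold quoted in the text
    cr_crit=${3:-20}    # default critical threshold quoted in the text
    if [ "$cr_count" -gt "$cr_crit" ]; then
        echo "CRITICAL"
        return 2
    elif [ "$cr_count" -gt "$cr_warn" ]; then
        echo "WARNING"
        return 1
    fi
    echo "OK"
    return 0
}
```

For example, calling classify_rate 5 prints WARNING with the default thresholds, because 5 is greater than 2 but not greater than 20.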
No specific action is required unless the service times out. In that case, an excessive number of syslog
messages is collected across the system; this is more than the plug-in can process in the
service_check_timeout period. See the /opt/hptc/nagios/etc/nagios.cfg file for the value of the
service_check_timeout parameter.
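One way to look up the current timeout is to extract the service_check_timeout line from the configuration file. The sketch below assumes the usual name=value layout of nagios.cfg; the function name get_service_check_timeout is hypothetical, and the default path is the one given above.

```shell
# Hypothetical helper: print the service_check_timeout value (in seconds)
# from a Nagios main configuration file.
get_service_check_timeout() {
    gt_cfg=${1:-/opt/hptc/nagios/etc/nagios.cfg}
    # Match uncommented "service_check_timeout=<number>" lines only;
    # if the directive appears more than once, report the last one.
    sed -n 's/^[[:space:]]*service_check_timeout[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$gt_cfg" | tail -n 1
}
```

Running get_service_check_timeout with no argument reads the default file; pass another path to inspect a copy of the configuration.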
To solve the problem, run the following command on the system reporting the error:
# /opt/hptc/nagios/libexec/check_syslogalerts --domain node:nagios_monitor --nsca
Otherwise, wait for the nightly log to roll over.
Service: Syslog Alerts
Status Information: Node Syslog alerts information
Reports the number of alerts in a specified period of time and allows you to access the most recent
log.
A warning or critical message indicates that one or more rules defined in the
/opt/hptc/nagios/etc/syslogAlertRules file match the specified system's consolidated log file.
Take the appropriate action based on the message.
Service: System Event Log
Status Information: Node Syslog alerts information
A warning or critical message indicates that one or more rules defined in the
/opt/hptc/nagios/etc/selRules file match the specified system's firmware System Event Log.
Take the appropriate action based on the System Event Log message.
Service: System Free Space
Status Information: Node / and /var free space
This entry typically displays the status of the /, /var, and /hptc_cluster file systems on the
system.
A warning or critical message indicates that the thresholds for the specific managed system were
exceeded.
Clean up disk space.
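Before cleaning up, it can help to see which of the monitored file systems is actually close to full. The following is a minimal sketch, not the plug-in's own logic: the function name report_full and the 90 percent limit in the usage note are hypothetical, and the code assumes the POSIX df -P output format with mount points that contain no spaces.

```shell
# Hypothetical helper: read `df -P` output on standard input and print each
# mount point whose usage is at or above the given percentage.
report_full() {
    rf_limit=$1
    awk -v limit="$rf_limit" '
        NR > 1 {                      # skip the df header line
            use = $5
            sub(/%/, "", use)         # "95%" becomes "95"
            if (use + 0 >= limit + 0)
                print $6, use "%"
        }'
}
```

A typical invocation would be df -P / /var /hptc_cluster | report_full 90, which lists only the file systems at 90 percent usage or higher.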
23.14.6 A check_nrpe error occurs during management agents installation
When the gather_all_data script is running, a check_nrpe error like the following is
reported:
check_nrpe error: Connection refused by host => server
Corrective Actions:
• If the check_nrpe error is reported for the CMS, use the following command to verify
that the nrpe script is running on the CMS:
# ps auxww | grep nrpe
If the nrpe script is not running, use the following commands to start it and to rerun the
gather_all_data script:
# /etc/init.d/nagios start_nrpe
# /opt/hptc/nagios/libexec/gather_all_data --verbose
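The check and the restart above can be combined into one small script. This is a sketch under stated assumptions: the function names has_nrpe and ensure_nrpe are hypothetical, and only the commands already shown in this section are invoked. The [n]rpe pattern keeps grep from matching its own entry in the process list.

```shell
# Hypothetical helper: succeed if the process listing on standard input
# contains an nrpe process.
has_nrpe() {
    grep -q '[n]rpe'
}

# Hypothetical wrapper: start nrpe and rerun gather_all_data only when
# nrpe is not already running. Run as root on the CMS.
ensure_nrpe() {
    if ps auxww | has_nrpe; then
        echo "nrpe is already running"
    else
        /etc/init.d/nagios start_nrpe &&
            /opt/hptc/nagios/libexec/gather_all_data --verbose
    fi
}
```

Note that the pattern also matches any other process whose command line contains the string nrpe, so a match is a hint rather than proof that the daemon is healthy.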
23.14 Nagios Troubleshooting