4 using the nagios report generator analyze mode, 5 messages reported by nagios – HP Insight Control Software for Linux User Manual


$ ./check_sel --help
check_sel <--help> -H hostname <-t timeout>

--help
-H                      Host to check
--cache file            Persistent cache to remember where we last read/processed
                        default /hptc_cluster/adm/logs/sel/cache/selcache-$nodename.db
--clear                 Clear the SEL log after reading
--clearifused n         Clear the log only if it is past n% full
--ignoreunknownhosts    Ignore unknown hosts, default is to return critical
--last n                Only log entries from the last n days, hours, or minutes
--logfile file          Log output to file,
                        default /hptc_cluster/adm/logs/sel/sel-{host}.log
--rules file            File containing rule patterns,
                        default /opt/hptc/nagios/etc/selRules
--cp cptype             Console type string

Run the Nagios plug-in:

$ ./check_sel -H iclx7
No new entries in event log
$
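
Nagios plug-ins conventionally report their result through the exit status as well as the output line: 0 for OK, 1 for Warning, 2 for Critical, and 3 for Unknown. The following is a minimal sketch that turns an exit status into its state name. It assumes that check_sel follows this standard convention, which this page does not state explicitly. The helper name nagios_status_name is hypothetical.

```shell
# Map a Nagios plug-in exit status to its conventional state name.
# Assumption: the plug-in follows the standard Nagios plug-in convention
# (0 OK, 1 Warning, 2 Critical, 3 Unknown).
nagios_status_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "Warning" ;;
        2) echo "Critical" ;;
        3) echo "Unknown" ;;
        *) echo "Unexpected exit status $1" ;;
    esac
}

# Example usage, run from the plug-in directory:
#   ./check_sel -H iclx7; nagios_status_name $?
```

Checking the exit status rather than parsing the message text keeps a wrapper script independent of the exact wording of the plug-in output.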

25.14.4 Using the Nagios report generator analyze mode

The Nagios Report Generator (nrg) command features an analyze mode that can help you
determine the cause of a problem. It also offers information on the solution.

Enter the following command to run the nrg analyze mode:

# nrg --mode analyze
Nodelist            Description
------------------  -------------------------------------------------------------------------
USE6371RA4          Enclosure Status - Warning> The enclosure is reporting one or more
                    warning conditions for environmental sensors gathered from the device.
                    Check the sensor status on the enclosure. Verify the status of the
                    Enclosures Collection Monitor which provides this data.

nh                  The Enclosure Collection Monitor collects sensor information from
                    the blade system enclosures.
                    Enclosure status can be found in the Nagios Enclosure Status service
                    plug-in status. A critical status indicates that one or more of the
                    monitored enclosures has either posted a critical status or the
                    plug-in has been unable to contact the enclosure. Verify logon
                    credentials are correct if sensor data is not available.

mercury nh neptune  Nodes have crossed the load average warning
                    thresholds set in nagios_vars.ini. These are site specific values and
                    simply indicate resource usage above an expected range. Verify that the
                    node does not have a misbehaving job or some other problem causing it to
                    use more CPU then expected.

mercury neptune     NodeInfo - Critical> Critical thresholds have been reached for
                    max users, processes, or zombies etc. See nagios_vars.ini for
                    threshold values. Values are site specific and critical values
                    indicate values have been exceeded.
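
On a large cluster the analyze output can be long. The following is a minimal sketch for picking out only the lines that carry a given severity marker. It assumes that entries keep the "Service - Severity>" form seen in the sample output. The function name nrg_filter_severity is hypothetical, and only the first line of each matching entry is printed.

```shell
# Print the lines of nrg analyze output that carry a given severity marker.
# Assumption: entries use the "<service> - <severity>>" form from the sample output.
nrg_filter_severity() {
    # -F treats the pattern as a fixed string; -- protects the leading "-".
    grep -F -- "- $1>"
}

# Example usage:
#   nrg --mode analyze | nrg_filter_severity Critical
```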

25.14.5 Messages reported by Nagios

Nagios reports messages in the Service Detail and Service Problems windows. These messages
can help you detect current problems and prevent future problems.

Figure 35 shows a portion of the Nagios Service Detail window that displays two messages.

Figure 35 Sample Nagios messages

Messages are categorized in the Status column as OK, Unknown, Pending, Warning, and
Critical, and are color-coded.

The messages described in this section are indexed by the Service and Status Information columns.
The messages in this section are arranged alphabetically by the Service column entry. If there is

25.14 Nagios Troubleshooting 223