5 understanding nagios alert messages – HP Insight Control Software for Linux User Manual
Page 172
Table 22 Services monitored on managed systems (continued)

Service name | Function/Description
(continued from previous page) | The System Event Log is collected through the management processor, either an iLO or an IPMI BMC. System Events are hardware-related alerts such as memory errors, power supply faults, and so on.
System Free Space (2) | Displays the system free space in /root, /tmp, /var, and /hptc_cluster. This data is compared to thresholds defined in the nagios_vars.ini file.
Configuration (1) | Reports static system configuration information for a single system such as server type, memory, and processors.
Swap Info | Reports total swap space, amount of swap space used, free swap space, and page cache per server.

(1) This information is collected without agents, thus it is available for any host that Insight Control for Linux monitors.
(2) This service uses mond to collect its data.
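The threshold comparison described for the System Free Space service can be illustrated with a short Python sketch. It is not the product's plug-in: the threshold values and the function names classify_free_space and check_path are hypothetical, and the format of nagios_vars.ini is not shown in this manual page, so the thresholds are hard-coded here. Return code 2 (Critical) follows the general Nagios plug-in convention.

```python
import shutil

# Hypothetical thresholds, expressed as percent of space that is free.
# The real thresholds are defined in nagios_vars.ini (format not shown here).
WARN_PCT_FREE = 20.0
CRIT_PCT_FREE = 10.0


def classify_free_space(total_bytes, free_bytes,
                        warn_pct=WARN_PCT_FREE, crit_pct=CRIT_PCT_FREE):
    """Return (return_code, percent_free) for one file system.

    Return codes: 0 = OK, 1 = Warning, 2 = Critical
    (2 is the usual Nagios plug-in convention, assumed here).
    """
    pct_free = 100.0 * free_bytes / total_bytes
    if pct_free < crit_pct:
        return 2, pct_free
    if pct_free < warn_pct:
        return 1, pct_free
    return 0, pct_free


def check_path(path):
    """Measure free space for the file system that holds `path`."""
    usage = shutil.disk_usage(path)
    return classify_free_space(usage.total, usage.free)
```

For example, check_path("/var") returns a return code together with the measured percentage of free space for the file system that holds /var.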
20.5 Understanding Nagios alert messages
Insight Control for Linux provides several value-added plug-ins that can generate alert messages
based on patterns provided by data sources, such as syslog and the Hardware System Event
logs.
These plug-ins use a common syntax to describe patterns and the status to report when a pattern matches.
The rules that trigger alarms are configured in the following files:
• /opt/hptc/nagios/etc/selRules
  Contains patterns for alerting on System Event Log messages.
  You can modify the selRules file as follows:
  ◦ Add a rule to this file for a new alert.
  ◦ Modify the corresponding rule to change an alert.
  ◦ Comment out a rule to remove the corresponding alert.
• /opt/hptc/nagios/etc/syslogAlertRules
  Contains patterns for alerting on consolidated log entries.
• /opt/hptc/nagios/libexec/sensorData.dat
  Contains patterns for alerting based on sensor results.
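The idea of pairing a pattern with a status to report can be sketched in Python as follows. This is only a conceptual illustration: it does not reproduce the syntax of the selRules, syslogAlertRules, or sensorData.dat files, which is not shown here, and the RULES list, the sample patterns, and the match_status function are all hypothetical.

```python
import re

# Hypothetical in-memory rules: (pattern, status to report on a match).
# This is NOT the on-disk syntax of the rule files listed above.
RULES = [
    (re.compile(r"power supply.*fail", re.IGNORECASE), "CRITICAL"),
    (re.compile(r"correctable memory error", re.IGNORECASE), "WARNING"),
]


def match_status(message, rules=RULES):
    """Return the status of the first rule whose pattern matches, else None."""
    for pattern, status in rules:
        if pattern.search(message):
            return status
    return None
```

In this sketch, adding, changing, or removing an entry in RULES corresponds to adding, modifying, or commenting out a rule in one of the rule files.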
Nagios uses email to send formatted alerts. The following is the default format of a Nagios alert:
Type: PROBLEM (1)
State: return code (2)
Service: service (3)
Host: system (4)
Address: IP Address (5)
Info: message output (6)
Date/Time: date and time (7)
Elapsed: time (8)
Number: number (9)
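Because the default alert format is a series of "Field: value" lines, an alert body can be split into fields with a few lines of Python. The following is a minimal sketch, assuming one field per line; the function name parse_alert is hypothetical, and any multi-line message output in the Info field would need extra handling.

```python
def parse_alert(body):
    """Split a Nagios alert body into a dict of field name -> value.

    Assumes one "Field: value" pair per line. Only the first colon
    separates the field name from the value, so values that contain
    colons (such as a time of day) are kept whole.
    """
    fields = {}
    for line in body.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines that have no "Field:" prefix
        fields[key.strip()] = value.strip()
    return fields
```

The resulting dictionary can then be filtered, for example on the Type field to separate PROBLEM alerts from RECOVERY alerts.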
(1) Valid values are PROBLEM or RECOVERY.
(2) The Nagios plug-in return code; the values for this code are as follows:
    0  OK
    1  Warning