20.5 Understanding Nagios alert messages – HP Insight Control Software for Linux User Manual

Table 22 Services monitored on managed systems (continued)

| Service name | Function/Description |
|---|---|
| (continued from previous page) | The System Event Log is collected through the management processor, either an iLO or an IPMI BMC. System Events are hardware-related alerts such as memory errors, power supply faults, and so on. |
| System Free Space [2] | Displays the system free space in /root, /tmp, /var, and /hptc_cluster. This data is compared to thresholds defined in the nagios_vars.ini file. |
| Configuration [1] | Reports static system configuration information for a single system such as server type, memory, and processors. |
| Swap Info [1] | Reports total swap space, amount of swap space used, free swap space, and page cache per server. |

[1] This information is collected without agents, thus it is available for any host that Insight Control for Linux monitors.

[2] This service uses mond to collect its data.
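The threshold comparison described for the System Free Space service can be sketched as follows. This is only an illustration, not the plug-in that Insight Control for Linux ships: the function names and the threshold values are hypothetical, the real thresholds are read from nagios_vars.ini (whose key names are not shown here), and the status codes follow the standard Nagios plug-in convention.

```python
import os
import shutil

# Standard Nagios plug-in exit codes (general Nagios convention).
OK, WARNING, CRITICAL = 0, 1, 2


def classify_free_space(free_pct, warn_pct, crit_pct):
    """Return a Nagios status code for a free-space percentage.

    warn_pct and crit_pct are hypothetical thresholds: the status
    degrades when free space drops to or below them.
    """
    if free_pct <= crit_pct:
        return CRITICAL
    if free_pct <= warn_pct:
        return WARNING
    return OK


def free_percent(path):
    """Return the percentage of free space on the file system holding path."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.free / usage.total


if __name__ == "__main__":
    # Hypothetical thresholds; the real ones are defined in nagios_vars.ini.
    WARN_PCT, CRIT_PCT = 20.0, 10.0
    for mount in ("/root", "/tmp", "/var", "/hptc_cluster"):
        if not os.path.isdir(mount):
            continue  # skip directories that do not exist on this host
        try:
            pct = free_percent(mount)
        except OSError:
            continue  # skip directories this user cannot inspect
        status = classify_free_space(pct, WARN_PCT, CRIT_PCT)
        print(f"{mount}: {pct:.1f}% free, status {status}")
```

Keeping the classification separate from the disk query makes the threshold logic easy to check without touching the file system.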

20.5 Understanding Nagios alert messages

Insight Control for Linux provides several value-added plug-ins that generate alert messages
when they match patterns in data sources such as syslog and the hardware System Event
Logs.

These plug-ins share a common syntax for describing the patterns to match and the status to report for each match.

The rules that trigger alarms are configured in the following files:

/opt/hptc/nagios/etc/selRules

Contains patterns for alerting on System Event Log messages.

You can modify the selRules file as follows:

- Add a rule to this file for a new alert.

- Modify the corresponding rule to change an alert.

- Comment out a rule to remove the corresponding alert.

/opt/hptc/nagios/etc/syslogAlertRules

Contains patterns for alerting on consolidated log entries.

/opt/hptc/nagios/libexec/sensorData.dat

Contains patterns for alerting based on sensor results.
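As an illustration of removing an alert by commenting out its rule, the following sketch disables every active line that contains a given text. It rests on assumptions: the rule syntax of these files is not shown in this section, so the sample lines are placeholders, and the `#` comment prefix and the function name `comment_out_rules` are hypothetical. Check the comment convention used in the file itself before applying anything similar.

```python
def comment_out_rules(lines, needle, comment_prefix="#"):
    """Return a copy of lines with each active line containing needle commented out.

    Blank lines and lines that already start with comment_prefix are
    left unchanged. The '#' prefix is an assumption, not a documented
    property of the rule files.
    """
    result = []
    for line in lines:
        stripped = line.lstrip()
        is_active = bool(stripped) and not stripped.startswith(comment_prefix)
        if is_active and needle in line:
            result.append(comment_prefix + " " + line)
        else:
            result.append(line)
    return result


# Placeholder content: these lines do NOT reflect the real selRules syntax.
sample = [
    "# placeholder header comment",
    "placeholder rule mentioning fan failure",
    "placeholder rule mentioning memory error",
]

for line in comment_out_rules(sample, "fan failure"):
    print(line)
```

Working on a list of lines rather than on the file directly makes it easy to review the result, and to keep a backup copy of the original file, before anything is written back.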

Nagios uses email to send formatted alerts. The following is the default format of a Nagios alert:

Type: PROBLEM                  (1)
State: return code             (2)
Service: service               (3)
Host: system                   (4)
Address: IP Address            (5)
Info: message output           (6)
Date/Time: date and time       (7)
Elapsed: time                  (8)
Number: number                 (9)
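A message in the format above can be split into its fields with a few lines of code, for example to filter or forward alerts from a mailbox. This is a minimal sketch under stated assumptions: the function name `parse_alert` is hypothetical, and every value in the sample message is a made-up placeholder rather than real Nagios output.

```python
def parse_alert(body):
    """Split a 'Field: value' alert body into a dictionary.

    Each line is split at its first colon only, so values that
    contain colons (such as a time of day) are kept whole.
    """
    fields = {}
    for line in body.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


# Placeholder alert text; all values are invented for illustration.
sample_alert = """Type: PROBLEM
State: 1
Service: System Free Space
Host: node1
Address: 192.0.2.10
Info: placeholder message output
Date/Time: 2010-01-01 12:30:05
Elapsed: 0d 0h 5m 0s
Number: 1"""

alert = parse_alert(sample_alert)
for name, value in alert.items():
    print(f"{name} = {value}")
```

Lines without a colon are ignored, so surrounding text in the email body does not break the parse.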

(1) Valid values are PROBLEM or RECOVERY.

(2) The Nagios plug-in return code; the values for this code are as follows:

    0    OK
    1    Warning

Using graphical tools to monitor managed systems