HP Insight Control Software for Linux User Manual

Page 226

Cause/Symptom:

The Nagios default threshold values for total processes,
user processes, and zombie processes might be too small
for certain system configurations, particularly those with
virtualization operating systems. If so, you will encounter
CRITICAL or WARNING alerts for the NodeInfo service.

Corrective Actions:

The thresholds are defined in the nagios_vars.ini file.
HP recommends saving a copy of the original file before
making any updates.

1. Save a copy of the original file:

# cp /opt/hptc/nagios/etc/nagios_vars.ini /opt/hptc/nagios/etc/nagios_vars.ini.orig

2. Edit the nagios_vars.ini file to update the values
for the *_procs_warning and *_procs_critical
parameters to values that apply to your environment.
For example:

[default]

total_procs_warning = 400
total_procs_critical = 500
user_procs_warning = 200
user_procs_critical = 300
zombie_procs_warning = 1
zombie_procs_critical = 5

[service_nodes]

total_procs_warning = 500
total_procs_critical = 600
user_procs_warning = 300
user_procs_critical = 400
zombie_procs_warning = 1
zombie_procs_critical = 5

3. Rebuild the /opt/hptc/etc/sysconfig/vars.ini
file and push out the new vars.ini file to all the
managed systems:

# /opt/hptc/nagios/libexec/check_nagios_vars --rebuild

# /opt/hptc/nagios/libexec/check_nagios_vars --update
Vars Warning - iclx[2-6] vars.ini have been resynchronized

NOTE: The vars.ini file is dynamically created from
the nagios_vars.ini file, so any changes to the default
Nagios thresholds must be made to nagios_vars.ini,
not to vars.ini.
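
When choosing threshold values that apply to your environment, it can help to see how many processes a managed system runs under normal load. The following is a minimal sketch, not an HP or Nagios tool: the function names are hypothetical, and it assumes the standard ps and awk utilities are available. This manual page does not state exactly how the NodeInfo check counts processes, so treat the numbers as a rough guide only.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical and are not part of
# HP Insight Control for Linux or Nagios.

# Count the lines on stdin (one line per process).
count_procs() {
    awk 'END { print NR }'
}

# Count the lines on stdin whose process state code starts with Z (zombie).
count_zombies() {
    awk '$1 ~ /^Z/ { n++ } END { print n + 0 }'
}

# Print the current process counts for this system, using ps as the source.
report_proc_counts() {
    echo "total processes:  $(ps -e -o pid= | count_procs)"
    echo "zombie processes: $(ps -e -o stat= | count_zombies)"
}
```

Running report_proc_counts on a representative managed system, and then setting the warning and critical values comfortably above the observed counts, is one way to arrive at starting values for nagios_vars.ini.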

Cause/Symptom:

Nagios host check shows host status as "DOWN"

Nagios incorrectly reports a managed system as down
when the host name for an existing server in the HP SIM
database has changed since Options→IC-Linux→Configure
Management Services was last run.

Corrective Actions:

See “Resolving host names on the CMS” (page 78).

Cause/Symptom:

mcelog Nagios alerts for the Syslog Alerts service on
SLES11 SP1 managed systems

Nagios periodically calls the /usr/sbin/mcelog utility
on managed systems. There is a known issue in which
programs using mmap to read SMBIOS info (like mcelog)
fail when hp-health is installed.

For example, the following mcelog messages appear in
/var/log/messages and will cause Nagios to
generate an alert.

Corrective Actions:

Edit the /opt/hptc/nagios/etc/syslogAlertRules
file on the CMS and update the mcelog_error rule as
shown below so that Nagios alerts are not generated for
the mcelog: Cannot mmap SMBIOS tables syslog
event.

rule mcelog_error {
name (!/Cannot mmap SMBIOS tables at/)
relevance ($subsystem =~ /mcelog/)

226 Troubleshooting