Failure indicators – HP Insight Management Agents User Manual


• SCSI Bus Faults: Displays the number of times that SCSI bus parity, overrun, or underrun errors
have been detected on the SCSI bus. Since the controller retries the operation, SCSI bus faults
can cause a drop in performance, or, in some cases, data corruption.

If the count is not zero and the drive has failed, the failure might be correctable without
replacing the drive. Follow the steps below:
1. Ensure that all system and storage system cables are intact and seated properly. You may
need to replace the cables.

2. Check the physical proximity of the system to other electrical devices. Since electrical
noise may cause a Bus Fault error, check the AC circuit for other electrical devices.

3. Ensure that the system temperature is within specified limits. Ensure that fans are operating
and are not blocked.

4. SCSI Bus Faults can be caused when two or more drives are set to the same SCSI ID.
Ensure that storage system and system SCSI IDs do not conflict.

5. In some instances, drive failure can cause SCSI Bus Faults. If you continue to receive many
of these errors, replace the drive.

• IRQ Deglitch: Displays the number of times that a glitch has been detected on the drive
interface cable. Because the controller retries the operation, glitches can cause a drop in
performance or, in some cases, data corruption. Glitches indicate electrical noise on the drive
cable or an intermittent failure of the drive electronics.

This counter is a Problem Indicator: the failure it points to may be correctable without
replacing the drive. If this counter is not zero and the drive has failed, follow the steps below:
1. Ensure that all system and storage system cables are intact and seated properly. You may
need to replace cables.

2. Check the physical proximity of the system to other electrical devices. Since electrical
noise may cause a glitch error, check the AC circuit for other electrical devices.

3. If you continue to receive many of these errors, replace the drive.

NOTE: If the drive has not failed, the above counts simply provide a cumulative record of past
errors that have been corrected.
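The rules above for the two counters can be summarized in a short sketch. The names used here (`DriveCounters`, `triage_problem_indicators`, and the step strings) are hypothetical and are not part of the Insight Management Agents. The sketch only encodes the checks listed in this section and assumes that the counter values and drive status have already been read from the agent display.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DriveCounters:
    # Hypothetical container; field names are illustrative only.
    scsi_bus_faults: int
    irq_deglitch: int
    failed: bool


def triage_problem_indicators(c: DriveCounters) -> List[str]:
    """Return the checks suggested by the text for a given drive state."""
    if not c.failed:
        # Per the NOTE: on a drive that has not failed, non-zero counts
        # are only a cumulative record of corrected errors.
        return []
    steps: List[str] = []
    if c.scsi_bus_faults > 0 or c.irq_deglitch > 0:
        # Steps shared by both counters.
        steps.append("check cables")
        steps.append("check electrical noise")
    if c.scsi_bus_faults > 0:
        # Steps listed only for SCSI Bus Faults.
        steps.append("check temperature and fans")
        steps.append("check SCSI ID conflicts")
    if steps:
        steps.append("replace drive if errors persist")
    return steps
```

For example, a failed drive with a non-zero IRQ Deglitch count and no SCSI Bus Faults yields the cable check, the electrical noise check, and the replacement step, in that order.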

Failure Indicators

Use the Failure Indicators to determine the cause of a drive failure. Typically, each counter
is zero when the drive is operating normally. If a counter is not zero and the drive has not failed,
there could be an intermittent problem that may require the drive to be replaced.

The Failure Indicators are:

• Spinup Errors: A Spinup Error is recorded when the physical drive fails because a spin-up
command failed. If the failure count is not zero and the drive has failed, replace the drive.

If the counter is not zero and the drive is OK (has not failed), there may be an intermittent
problem that requires drive replacement. If you observe that the count is increasing over time,
replace the drive.

• Aborted Commands: Records the number of times that a physical SCSI drive returned an
Aborted Command status when a SCSI command was attempted. This error count indicates
unsuccessful termination of the SCSI command. Aborted Commands errors are recorded when
the physical drive is marked as failed because aborted commands could not be retried
successfully. If the number of errors is not zero and the drive has failed, replace the drive.
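The Spinup Errors guidance above can be sketched as a small decision function. The function name, its parameters, and the returned labels are hypothetical; comparing two successive readings is one simple way to express "increasing over time" and is an assumption of this sketch, not a documented agent feature.

```python
def assess_spinup_errors(previous: int, current: int, failed: bool) -> str:
    """Map two successive Spinup Errors readings and the drive status to an action."""
    if current == 0:
        # A zero count is the normal state for a healthy drive.
        return "ok"
    if failed:
        # Non-zero count and the drive has failed: replace the drive.
        return "replace"
    if current > previous:
        # Drive is OK but the count is increasing over time: replace the drive.
        return "replace"
    # Non-zero but stable on a working drive: possible intermittent problem.
    return "monitor"
```

The text states the increasing-count rule only for Spinup Errors, so the sketch is limited to that counter.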
