
HP Insight Diagnostics Software User Manual


4 Troubleshooting

Troubleshooting memory

The memory test component can perform the following tests:

Address test–This test verifies the integrity of the address buses connecting the processors to
the memory modules. Verification is done by writing data to all possible addresses that have
only 1 bit either set (1) or reset (0), having alternate bits set, having all bits high, and having
all bits low. The purpose of this test is to check for address lines that are either shorted to
ground, shorted to a high-voltage signal, shorted to other address lines, or floating
(disconnected). This test alone might not indicate a hard failure.

Walk test–This test verifies the integrity of the data buses connecting the processors to the
memory modules. Verification is done by writing data to all possible addresses that have only
1 bit either set (1) or reset (0), having alternate bits set, having all bits high, and having all
bits low. The purpose of this test is to check for data lines that are either shorted to ground,
shorted to a high-voltage signal, shorted to other data lines, or floating (disconnected).
This test alone might not indicate a hard failure.

Noise test–This test verifies memory integrity by writing the inverse of the current test address
to the current test address. The current test address alternates between the start and the end
of the current test block, incrementing or decrementing the address until the entire block has
been accessed. The purpose of this test is to check for address and data bus transition problems
when these lines are forced high and low as rapidly as possible. A failure of this test indicates
a failure of the DIMM.

March test–This test is similar to a true walk bit test and is able to detect the following: address
faults, stuck-at faults, transition faults, coupling faults, and linked coupling faults. Coupling
faults occur when memory cells within a bit cell array affect the operation of nearby memory
cells. In many cases, static type tests do not detect these failures. A failure of this test indicates
a failure of the DIMM.

Random address test–This test verifies memory integrity by running a random pattern across
a given test range. The addresses used to store the patterns are selected randomly and
normalized to fit within the current test block. The purpose of this test is to detect intermittent
memory problems that can be caused by temperature, variable clock speeds, variable voltages,
signal timing, manufacturing faults, variable refresh rates, and decay. This test is also useful
in detecting memory faults that might not be detected by other static tests. A failure of this test
indicates a failure of the DIMM.
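
As an illustration only, the pattern families named for the Address and Walk tests, and a march-style pass, can be sketched in Python against a simulated memory. This is not the HP Insight Diagnostics implementation. The `SimulatedMemory` class, its `stuck_at` fault model, and the function names are hypothetical. March C- is assumed here as one well-known march algorithm, because the manual does not state which variant the March test uses.

```python
class SimulatedMemory:
    """Toy word-addressable memory (hypothetical; no real hardware access)."""

    def __init__(self, size, stuck_at=None):
        self.cells = [0] * size
        # Assumed fault model: maps an address to the value it always reads back.
        self.stuck_at = stuck_at or {}

    def __len__(self):
        return len(self.cells)

    def write(self, addr, value):
        self.cells[addr] = value

    def read(self, addr):
        return self.stuck_at.get(addr, self.cells[addr])


def bus_patterns(width):
    """Yield the pattern families described for the Address and Walk tests."""
    mask = (1 << width) - 1
    for bit in range(width):
        yield 1 << bit                 # only 1 bit set
    for bit in range(width):
        yield mask ^ (1 << bit)        # only 1 bit reset
    alternating = sum(1 << b for b in range(0, width, 2))
    yield alternating                  # alternate bits set
    yield mask ^ alternating           # the complementary alternation
    yield mask                         # all bits high
    yield 0                            # all bits low


def march_c_minus(mem, ones=0xFF):
    """Run March C- and return the sorted addresses whose reads mismatched."""
    n = len(mem)
    up, down = range(n), range(n - 1, -1, -1)
    failures = set()

    def element(order, expect, write):
        # One march element: optional read-and-compare, then optional write.
        for addr in order:
            if expect is not None and mem.read(addr) != expect:
                failures.add(addr)
            if write is not None:
                mem.write(addr, write)

    element(up, None, 0)       # write 0 everywhere
    element(up, 0, ones)       # ascending: read 0, write 1s
    element(up, ones, 0)       # ascending: read 1s, write 0
    element(down, 0, ones)     # descending: read 0, write 1s
    element(down, ones, 0)     # descending: read 1s, write 0
    element(up, 0, None)       # final read of 0
    return sorted(failures)
```

For a 4-bit bus, `bus_patterns(4)` yields 0001, 0010, 0100, 1000, then 1110, 1101, 1011, 0111, then 0101, 1010, 1111, and 0000. Calling `march_c_minus(SimulatedMemory(16, stuck_at={5: 0}))` reports address 5, because that cell never reads back the all-ones value.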

Not all the memory in a system can be tested, because the operating system and running
applications occupy part of it. As a best practice, use the default setting for each test. The
default settings help ensure that as much of the available memory as possible is tested.

To test memory thoroughly, run as many loops as possible in the time allowed. If time is limited
and not all memory tests can be run, HP recommends running the Random address test and
the Noise test, because these two tests can catch the most errors.
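
As a rough illustration of the two recommended tests run in a loop, the following Python sketch works on an ordinary list standing in for a test block. It is not the HP Insight Diagnostics implementation. All function names, the 16-bit word width, the number of random writes per loop, and the choice to write a full pass before verifying are assumptions made for the example.

```python
import random


def noise_order(start, end):
    """Addresses of [start, end), alternating between the low and high ends."""
    lo, hi = start, end - 1
    order = []
    while lo < hi:
        order += [lo, hi]
        lo, hi = lo + 1, hi - 1
    if lo == hi:
        order.append(lo)       # odd-sized block: the middle address comes last
    return order


def noise_test(mem, start, end, width=16):
    """Write the inverse of each address to that address, then verify."""
    mask = (1 << width) - 1
    order = noise_order(start, end)
    for addr in order:
        mem[addr] = ~addr & mask   # inverse of the address, cut to the word width
    return sorted(a for a in order if mem[a] != (~a & mask))


def random_address_test(mem, start, end, writes, seed, width=16):
    """Store random patterns at random addresses normalized into the block."""
    rng = random.Random(seed)
    size = end - start
    expected = {}
    for _ in range(writes):
        addr = start + rng.getrandbits(32) % size   # normalize into [start, end)
        value = rng.getrandbits(width)
        mem[addr] = value
        expected[addr] = value                      # remember the last value written
    return sorted(a for a, v in expected.items() if mem[a] != v)


def run_quick_pass(mem, start, end, loops=3, seed=0):
    """Loop both tests and return every address that failed at least once."""
    failures = set()
    for loop in range(loops):
        failures.update(noise_test(mem, start, end))
        failures.update(
            random_address_test(mem, start, end, 4 * (end - start), seed + loop)
        )
    return sorted(failures)
```

For a five-word block, `noise_order(0, 5)` visits addresses 0, 4, 1, 3, 2. Raising `loops` corresponds to the advice above to run as many loops as time allows.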

Troubleshooting disk drives and storage systems

To further troubleshoot a disk drive, or if you continue experiencing storage-related issues after
running Diagnose, perform the following tasks:

Search for known storage-related issues on the HP website at http://www.hp.com. To search
for customer advisories related to ProLiant servers configured with Smart Array controllers,
use the following search string: +ProLiant +Advisory +"Smart Array".

Update the controller driver and firmware revision and any drive-related software components
such as firmware updates, management agents, and storage utilities.
