Use the CLI, Monitor event notification, View the enclosure LEDs – HP MSA 2040 SAN Storage User Manual

Troubleshooting

Use the CLI

As an alternative to using the SMU, you can run the show system command in the CLI to view the health of the system and its components. If any component has a problem, the system health will be Degraded, Fault, or Unknown, and those components will be listed as Unhealthy Components. Follow the recommended actions in the component Health Recommendations field to resolve the problem.
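
Once the health values have been collected, the check described above is easy to automate. The following Python sketch is only an illustration. It assumes the health of each component has already been read from the show system output by some means and stored in a dictionary. The dictionary layout, the component names, the OK value, and the function name unhealthy_components are hypothetical and are not taken from this manual. Only the values Degraded, Fault, and Unknown come from the text above.

```python
# Health values that, per the text above, indicate a problem.
PROBLEM_HEALTH = {"Degraded", "Fault", "Unknown"}


def unhealthy_components(health_by_component):
    """Return the sorted names of components whose health indicates a problem.

    health_by_component: dict mapping a component name to its health string.
    This layout is an assumption made for the sketch; it is not the format
    of the show system output.
    """
    return sorted(
        name
        for name, health in health_by_component.items()
        # Normalize whitespace and letter case before comparing.
        if health.strip().title() in PROBLEM_HEALTH
    )


if __name__ == "__main__":
    # Hypothetical sample data, not real CLI output.
    sample = {
        "controller A": "OK",
        "power supply 1": "Degraded",
        "disk 1.4": "Fault",
    }
    for name in unhealthy_components(sample):
        print(f"{name}: follow the Health Recommendations for this component")
```

A set of problem values is used rather than a test for "not OK" because the text names the problem states explicitly, so the sketch does not depend on how a healthy state is spelled.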

Monitor event notification

With event notification configured and enabled, you can view event logs to monitor the health of the system and its components. If a message tells you to check whether an event has been logged, or to view information about an event in the log, you can do so using either the SMU or the CLI. Using the SMU, you would view the event log and then click on the event message to see detail about that event. Using the CLI, you would run the show events detail command (with additional parameters to filter the output) to see the detail for an event.
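
Checking whether a particular event has been logged amounts to a simple filter over the event records. The Python sketch below assumes the records have already been obtained (for example, copied from the event log) and converted to dictionaries. The field names code and message, the sample records with their made-up codes, and the function names are all hypothetical. The sketch does not reproduce the filtering parameters of show events detail; consult the CLI reference for those.

```python
def events_with_code(events, code):
    """Return the event records whose numeric code equals `code`."""
    return [event for event in events if event.get("code") == code]


def was_logged(events, code):
    """Report whether at least one event with the given code was logged."""
    return any(event.get("code") == code for event in events)


if __name__ == "__main__":
    # Hypothetical records with made-up codes, not real event log content.
    log = [
        {"code": 1001, "message": "first sample event"},
        {"code": 1002, "message": "second sample event"},
        {"code": 1001, "message": "third sample event"},
    ]
    print(was_logged(log, 1002))
    for event in events_with_code(log, 1001):
        print(event["message"])
```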

View the enclosure LEDs

You can view the LEDs on the hardware (while referring to LED descriptions for your enclosure model) to identify component status. If a problem prevents access to either the SMU or the CLI, this is the only option available. However, monitoring/management is often done at a management console using storage management interfaces, rather than relying on line-of-sight to LEDs of racked hardware components.

Performing basic steps

You can use any of the available options in performing the basic steps comprising the fault isolation methodology.

Gather fault information

When a fault occurs, it is important to gather as much information as possible. Doing so will help you determine the correct action needed to remedy the fault.

Begin by reviewing the reported fault:

• Is the fault related to an internal data path or an external data path?

• Is the fault related to a hardware component such as a disk drive module, controller module, or power supply?

By isolating the fault to one of the components within the storage system, you will be able to determine the necessary action more quickly.

Determine where the fault is occurring

Once you have an understanding of the reported fault, review the enclosure LEDs. The enclosure LEDs are designed to alert users to any system faults, and might be what alerted the user to a fault in the first place.

When a fault occurs, the Fault ID status LED on the enclosure right ear (see "Front panel components" (page 13)) illuminates. Check the LEDs on the back of the enclosure to narrow the fault to a FRU, connection, or both. The LEDs also help you identify the location of a FRU reporting a fault.

Use the SMU to verify any faults found while viewing the LEDs. The SMU is also a good tool to use in determining where the fault is occurring if the LEDs cannot be viewed due to the location of the system. The SMU provides you with a visual representation of the system and where the fault is occurring. It can also provide more detailed information about FRUs, data, and faults.

Review the event logs

The event logs record all system events. Each event has a numeric code that identifies the type of event that occurred, and has one of the following severities:

• Critical. A failure occurred that may cause a controller to shut down. Correct the problem immediately.

• Error. A failure occurred that may affect data integrity or system stability. Correct the problem as soon as possible.

• Warning. A problem occurred that may affect system stability, but not data integrity. Evaluate the problem and correct it if necessary.