Compromised fault tolerance, Recovering from compromised fault tolerance, Replacing drives – HP Smart Array P431 Controller User Manual
Compromised fault tolerance
CAUTION: When fault tolerance is compromised, data loss can occur. However, it may be
possible to recover the data. For more information, see "Recovering from compromised fault
tolerance."
If more drives fail than the fault-tolerance method can manage, fault tolerance is compromised, and the
logical drive fails. If this failure occurs, the operating system rejects all requests and indicates unrecoverable
errors.
For example, compromised fault tolerance might occur when a drive in an array fails while another drive in
the array is being rebuilt.
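The threshold described above can be sketched in a few lines of Python. This is an illustrative model only, not part of the controller firmware or any HP tool: the `TOLERATED_FAILURES` table and the `fault_tolerance_compromised` function are hypothetical names, and the per-level counts are the generally known limits for these RAID levels rather than figures taken from this manual. RAID 1+0 is left out because its limit depends on which drives fail.

```python
# Hypothetical model, not an HP API: the number of simultaneous drive
# failures that some common RAID levels survive (assumed, not from the manual).
TOLERATED_FAILURES = {
    "RAID 0": 0,  # striping only, no redundancy
    "RAID 1": 1,  # two-drive mirror
    "RAID 5": 1,  # single parity
    "RAID 6": 2,  # dual parity
}


def fault_tolerance_compromised(raid_level: str, unavailable_drives: int) -> bool:
    """Return True when more drives are unavailable than the level tolerates."""
    return unavailable_drives > TOLERATED_FAILURES[raid_level]


# A drive that is still being rebuilt does not yet hold valid data,
# so this sketch counts it as unavailable.
print(fault_tolerance_compromised("RAID 5", 1))  # one drive failed
print(fault_tolerance_compromised("RAID 5", 2))  # one rebuilding, one newly failed
```

The second call models the example in the text: with single parity, a second failure during a rebuild exceeds what the array can manage, so the logical drive fails.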
Compromised fault tolerance can also be caused by problems unrelated to drives. In such cases, replacing
the physical drives is not required.
Recovering from compromised fault tolerance
If fault tolerance is compromised, inserting replacement drives does not improve the condition of the logical
volume. Instead, if the screen displays unrecoverable error messages, perform the following procedure to
recover data:
1. Power down the entire system, and then power it back up. In some cases, a marginal drive will work
   again for long enough to enable you to make copies of important files.
   If a 1779 POST message is displayed, press the F2 key to re-enable the logical volumes. Remember that
   data loss has probably occurred and any data on the logical volume is suspect.
2. Make copies of important data, if possible.
3. Replace any failed drives.
4. After you have replaced the failed drives, fault tolerance may again be compromised. If so, cycle the
   power again. If the 1779 POST message is displayed:
   a. Press the F2 key to re-enable the logical drives.
   b. Recreate the partitions.
   c. Restore all data from backup.
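The branching in the procedure above can be summarized as a short Python sketch. It only restates the steps as a checklist generator. The function name `recovery_plan` and its parameters are hypothetical, and nothing here talks to the controller: the operator still observes the POST messages and presses the keys.

```python
def recovery_plan(post_1779_first_boot: bool,
                  compromised_after_replacement: bool,
                  post_1779_second_boot: bool = False) -> list[str]:
    """Build the ordered list of operator actions from what was observed.

    All three flags are observations made by the operator (hypothetical inputs).
    """
    steps = ["Power down the entire system, then power it back up"]
    if post_1779_first_boot:
        steps.append("Press F2 to re-enable the logical volumes")
    steps.append("Make copies of important data, if possible")
    steps.append("Replace any failed drives")
    if compromised_after_replacement:
        steps.append("Cycle the power again")
        if post_1779_second_boot:
            steps.append("Press F2 to re-enable the logical drives")
            steps.append("Recreate the partitions")
            steps.append("Restore all data from backup")
    return steps


# Example: 1779 message on both boots, fault tolerance compromised again
for number, step in enumerate(recovery_plan(True, True, True), start=1):
    print(f"{number}. {step}")
```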
To minimize the risk of data loss caused by compromised fault tolerance, make frequent backups of all
logical volumes.
Replacing drives
The most common reason for replacing a drive is that it has failed. However, another reason is to gradually
increase the storage capacity of the entire system.
For systems that support hot-pluggable drives, if you replace a failed drive that belongs to a fault-tolerant
configuration while the system power is on, all drive activity in the array pauses for 1 or 2 seconds while the
new drive is initializing. When the drive is ready, data recovery to the replacement drive begins
automatically.
If you replace a drive belonging to a fault-tolerant configuration while the system power is off, a POST
message appears when the system is next powered up. This message prompts you to press the F1 key to start
automatic data recovery. If you do not enable automatic data recovery, the logical volume remains in a
ready-to-recover condition and the same POST message appears whenever the system is restarted.
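The two replacement scenarios above lead to different outcomes, which the following Python sketch makes explicit. It is a simplified model under stated assumptions: the function `state_after_replacement` and the state strings are hypothetical labels chosen for illustration, and the power-on case assumes a system that supports hot-pluggable drives.

```python
def state_after_replacement(replaced_with_power_on: bool,
                            f1_pressed_at_post: bool = False) -> str:
    """Return the logical volume's state after a failed drive is replaced.

    Assumes a fault-tolerant configuration. State names are illustrative only.
    """
    if replaced_with_power_on:
        # Hot-plug case: after a brief pause while the new drive initializes,
        # data recovery to the replacement drive begins automatically.
        return "recovering"
    if f1_pressed_at_post:
        # Power-off case: pressing F1 at the POST prompt starts
        # automatic data recovery.
        return "recovering"
    # Power-off case without F1: the same POST message reappears
    # at every restart until recovery is enabled.
    return "ready-to-recover"


print(state_after_replacement(replaced_with_power_on=True))
print(state_after_replacement(replaced_with_power_on=False))
print(state_after_replacement(replaced_with_power_on=False, f1_pressed_at_post=True))
```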