HP StorageWorks Scalable File Share User Manual

Replacing hardware components
8.1.10 Replacing a disk in an SFS20 array
TIP:
Section 9.34 provides useful information on how to determine whether you need to replace a disk in
an SFS20 array.
Configuration impact
When a disk fails, the SFS20 array automatically rebuilds the logical drives to use the spare disk (if one is
available).
It is unusual for multiple disks in an SFS20 array to fail at the same time. However, if more than two disks
fail where ADG redundancy is used, or more than one disk fails where RAID5 redundancy is used, the logical
drive that uses the disks is catastrophically affected, and you may not be able to access the original data
on the LUN associated with the logical drive. In this event, you can attempt to reconstruct the logical drive
as described in Section 9.40; if this does not allow you to recover the data from the logical drive, contact
your HP Customer Support representative.
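The failure thresholds described above can be expressed as a small check. The following Python sketch is illustrative only: the function name and the table are hypothetical and are not part of the SFS software. It encodes only the two limits stated in the text (two failed disks tolerated with ADG, one with RAID5).

```python
# Maximum number of simultaneously failed disks that each redundancy
# type tolerates, as stated in the text above (hypothetical table name).
MAX_TOLERATED_FAILURES = {"ADG": 2, "RAID5": 1}


def logical_drive_at_risk(redundancy: str, failed_disks: int) -> bool:
    """Return True when more disks have failed than the redundancy tolerates."""
    return failed_disks > MAX_TOLERATED_FAILURES[redundancy.upper()]
```

For example, `logical_drive_at_risk("RAID5", 2)` returns `True`, which corresponds to the case where reconstruction as described in Section 9.40 may be needed.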
For information on the length of time that a RAID rebuild operation takes to complete, see Appendix D.
Replacing a disk that is logging URE errors
If you are replacing a disk that is logging Unrecovered read error (URE) messages, note the following
points:
• If the disk is designated as a RAID spare, you can replace the disk as described in the Process section
below.
• If the disk is not designated as a RAID spare, read all blocks on all LUNs that use the disk before you
replace it. This minimizes the probability of losing data if the RAID rebuild fails due to unrecoverable
errors. (Typically, a disk in an array is used by all LUNs on the array.)
When the blocks are read, any blocks that are unreadable due to unrecoverable disk errors are
detected and recreated on a different physical sector using available parity.
NOTE:
A quiescent array would perform such a read operation automatically using the surface scan
process, which is a thread that runs on the SFS20 array searching for bad blocks and swapping
them out using the available parity. However, because there is no reliable way to monitor the
progress of the surface scan process, you must intervene manually to read the blocks on the
LUNs in order to minimize the risk of the RAID rebuild failing.
Note that reading all blocks on all LUNs that use a disk does not guarantee that no unrecoverable
disk errors will be encountered during a RAID5 rebuild, because new errors may occur after the RAID
rebuild has started. However, reading all blocks minimizes the probability of encountering a URE.
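To illustrate what reading all blocks involves, the following Python sketch reads a file or device sequentially from start to end and counts the bytes read. It is a generic illustration under stated assumptions, not the procedure from this manual: the function name and the path you pass to it are hypothetical, and the documented steps that follow remain the supported method.

```python
def read_all_blocks(path: str, chunk_size: int = 1024 * 1024) -> int:
    """Read `path` from start to end and return the number of bytes read.

    `path` is a hypothetical file or block device name supplied by the
    caller; nothing here is specific to the SFS20 array.
    """
    total = 0
    # buffering=0 opens the file unbuffered, so each read() maps to one
    # read request of at most chunk_size bytes.
    with open(path, "rb", buffering=0) as device:
        while True:
            chunk = device.read(chunk_size)
            if not chunk:  # an empty result means end of file or device
                break
            total += len(chunk)
    return total
```

The data itself is discarded. The point of the pass is that every block is requested once, which gives the array the opportunity to detect unreadable blocks and recreate them from parity.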
To read all blocks on all LUNs on an array, perform the following steps:
1. Enter the show lun command and identify the LUNs on the array where the disk is to be replaced.
In the following example, LUNs 24 and 25 are on array 8:
sfs> show lun
LUN Array Role    Used by    Size(GB) Preferred Server Visible to
--- ----- -----   ---------- -------- ---------------- ----------
.
.
.
24  8     service south3     1        -                south[3-4]
25  8     ost     ost7       2048     south3           south[3-4]
.
.
.
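If you capture the show lun output to a file or string, the LUNs on a given array can be picked out programmatically. The following Python sketch is a minimal illustration: the names `Lun`, `parse_show_lun`, and `luns_on_array` are hypothetical, and the parser assumes that every data row has exactly seven whitespace-separated fields, as in the two rows shown in the example. Rows in real output that do not match that shape are skipped.

```python
from typing import List, NamedTuple

SAMPLE_OUTPUT = """\
sfs> show lun
LUN Array Role Used by Size(GB) Preferred Server Visible to
--- ----- ----- ---------- -------- ---------------- ----------
24 8 service south3 1 - south[3-4]
25 8 ost ost7 2048 south3 south[3-4]
"""


class Lun(NamedTuple):
    lun: int
    array: int
    role: str
    used_by: str
    size_gb: int
    preferred_server: str
    visible_to: str


def parse_show_lun(output: str) -> List[Lun]:
    """Parse data rows of `show lun` output (assumed seven-field layout)."""
    luns = []
    for line in output.splitlines():
        fields = line.split()
        # Skip the prompt, header, separator, and elision lines: only rows
        # with seven fields that start with a LUN number are data rows.
        if len(fields) != 7 or not fields[0].isdigit():
            continue
        luns.append(Lun(int(fields[0]), int(fields[1]), fields[2],
                        fields[3], int(fields[4]), fields[5], fields[6]))
    return luns


def luns_on_array(luns: List[Lun], array: int) -> List[int]:
    """Return the LUN numbers that reside on the given array."""
    return [entry.lun for entry in luns if entry.array == array]


print(luns_on_array(parse_show_lun(SAMPLE_OUTPUT), 8))  # prints [24, 25]
```

This matches the example above, where LUNs 24 and 25 are on array 8.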