
HP StorageWorks Scalable File Share User Manual


Replacing hardware components


8.1.10 Replacing a disk in an SFS20 array

TIP:

Section 9.34 provides useful information on how to determine whether you need to replace a disk in an SFS20 array.

Configuration impact

When a disk fails, the SFS20 array automatically rebuilds the logical drives to use the spare disk (if one is available).

It is unusual for multiple disks in an SFS20 array to fail at the same time. However, if more than two disks fail where ADG redundancy is used, or more than one disk fails where RAID5 redundancy is used, the logical drive that uses the disks is catastrophically affected, and you may not be able to access the original data on the LUN associated with the logical drive. In this event, you can attempt to reconstruct the logical drive as described in Section 9.40. If this does not allow you to recover the data from the logical drive, contact your HP Customer Support representative.
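The failure thresholds described above can be expressed as a simple check. The following Python sketch is illustrative only: the function name and the redundancy labels are assumptions made for this example and are not part of the SFS software.

```python
# Illustrative only: the number of simultaneous disk failures that each
# redundancy type tolerates, as stated in the text above.
TOLERATED_FAILURES = {"ADG": 2, "RAID5": 1}


def logical_drive_survives(redundancy: str, failed_disks: int) -> bool:
    """Return True if a logical drive should remain accessible.

    `redundancy` is "ADG" or "RAID5" (hypothetical labels for this sketch);
    `failed_disks` is the number of disks that have failed at the same time.
    """
    if failed_disks < 0:
        raise ValueError("failed_disks must not be negative")
    return failed_disks <= TOLERATED_FAILURES[redundancy]


if __name__ == "__main__":
    for redundancy, failed in [("ADG", 2), ("ADG", 3), ("RAID5", 1), ("RAID5", 2)]:
        print(redundancy, failed, logical_drive_survives(redundancy, failed))
```

The sketch counts only disks that are failed at the same moment. It does not model a spare disk that has already finished rebuilding, which would restore redundancy before a later failure.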

For information on the length of time that a RAID rebuild operation takes to complete, see Appendix D.

Replacing a disk that is logging URE errors

If you are replacing a disk that is logging Unrecovered read error (URE) messages, note the following points:

If the disk is designated as a RAID spare, you can replace the disk as described in the Process section below.

If the disk is not designated as a RAID spare, perform a read operation on all blocks on all LUNs that use the relevant disk before you replace it. This minimizes the probability of losing data if the RAID rebuild fails due to unrecoverable errors. (Typically, a disk in an array is used by all LUNs on the array.)

When the blocks are read, any blocks that are unreadable due to unrecoverable disk errors are detected and recreated on a different physical sector using available parity.

NOTE:

A quiescent array would perform such a read operation automatically using the surface scan process, which is a thread that runs on the SFS20 array searching for bad blocks and swapping them out using the available parity. However, because there is no reliable way to monitor the progress of the surface scan process, you must intervene manually to read the blocks on the LUNs in order to minimize the risk of the RAID rebuild failing.

Note that reading all blocks on all LUNs that use a disk does not guarantee that no unrecoverable disk errors will be encountered during a RAID5 rebuild, because new errors may occur after the RAID rebuild has started. However, reading all blocks beforehand reduces the probability of encountering a URE during the rebuild.
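To illustrate what reading all blocks involves, the following Python sketch reads a device or file once from start to end and records the offsets of any chunks that return a read error. It is a minimal sketch under stated assumptions: the function name, the chunk size, and the idea of passing a device path are all hypothetical, and it is not the procedure documented in this manual.

```python
import os


def read_all_blocks(path: str, block_size: int = 1024 * 1024):
    """Read every chunk of `path` once.

    Returns (bytes_read, failed_offsets). `path` is a hypothetical device
    or file path supplied by the caller; nothing here is SFS-specific.
    """
    bytes_read = 0
    failed_offsets = []
    fd = os.open(path, os.O_RDONLY)
    try:
        # Seeking to the end reports the size of a regular file or,
        # on Linux, of a block device.
        size = os.lseek(fd, 0, os.SEEK_END)
        offset = 0
        while offset < size:
            length = min(block_size, size - offset)
            try:
                data = os.pread(fd, length, offset)
                bytes_read += len(data)
            except OSError:
                # Record the failing chunk and continue with the next one.
                failed_offsets.append(offset)
            offset += length
    finally:
        os.close(fd)
    return bytes_read, failed_offsets
```

Reads that the operating system satisfies from its cache do not reach the disk, so a pass like this is only meaningful when the data is actually fetched from the array.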

To read all blocks on all LUNs on an array, perform the following steps:

1. Enter the show lun command and identify the LUNs on the array where the disk is to be replaced.

In the following example, LUNs 24 and 25 are on array 8:

sfs> show lun

LUN Array Role    Used by    Size(GB) Preferred Server Visible to
--- ----- ------- ---------- -------- ---------------- ----------
.
.
.
24  8     service south3     1        -                south[3-4]
25  8     ost     ost7       2048     south3           south[3-4]
.
.
.
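If the output of the show lun command has been captured as text, the LUNs on a given array can be picked out programmatically. The following Python sketch is one possible approach, assuming that every data row starts with the LUN number followed by the array number, as in the example above. The function name and the way the output is captured are assumptions for this illustration.

```python
from typing import List

# Sample text taken from the example output above (elided rows omitted).
SAMPLE = """\
LUN Array Role Used by Size(GB) Preferred Server Visible to
--- ----- ----- ---------- -------- ---------------- ----------
24 8 service south3 1 - south[3-4]
25 8 ost ost7 2048 south3 south[3-4]
"""


def luns_on_array(show_lun_output: str, array: int) -> List[int]:
    """Return the LUN numbers whose Array column equals `array`."""
    luns = []
    for line in show_lun_output.splitlines():
        fields = line.split()
        # Data rows begin with two integers: the LUN number and the array
        # number. Header, separator, and ellipsis lines are skipped.
        if len(fields) >= 2 and fields[0].isdigit() and fields[1].isdigit():
            if int(fields[1]) == array:
                luns.append(int(fields[0]))
    return luns


if __name__ == "__main__":
    print(luns_on_array(SAMPLE, 8))  # [24, 25]
```

Only the first two columns are parsed, so the sketch does not depend on the width or content of the remaining columns.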