Transferring sense information between sites, File and database recovery procedures – HP StorageWorks XP Remote Web Console Software User Manual
Hitachi TrueCopy z/OS for HP XP12000/XP10000 and SVS 200 storage systems
R-VOL Status (CRIT=Y(PATHS)). When this fence level is selected, the M-VOL is fenced only if the MCU is
not able to change the R-VOL pair status to suspended. If the MCU successfully changes the R-VOL pair
status to suspended, subsequent write I/O operations to the M-VOL will be accepted and the MCU will
keep track of updates to the M-VOL. This allows the volume pair to be resumed quickly using the resync
(out-of-sync-cylinders) copy operation (MODE=RESYNC). This setting will also reduce the amount of time
required to analyze the R-VOL currency during disaster recovery.
Never (CRIT=NO). When this fence level is selected, the M-VOL is never fenced when the pair is
suspended. This M-VOL fence level setting ensures that the M-VOL remains available to applications for
updates, even if all TC390 copy operations have failed. The R-VOL may no longer be in sync with the
M-VOL, but the MCU will keep track of updates to the M-VOL while the pair is suspended. ERC is essential
if this fence level setting is used. For disaster recovery, the currency of the R-VOL is determined by using the
sense information transferred through ERC or by comparing the R-VOL contents with other files confirmed to
be current.
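The fencing rules for the two fence levels described above can be summarized as a small decision function. This is a minimal sketch for illustration only: the FenceLevel enum, the function name, and its parameter are hypothetical and are not part of any TC390 interface. It covers only the R-VOL Status and Never fence levels.

```python
from enum import Enum


class FenceLevel(Enum):
    """Hypothetical labels for the two fence levels described above."""
    RVOL_STATUS = "CRIT=Y(PATHS)"
    NEVER = "CRIT=NO"


def mvol_is_fenced(level: FenceLevel, rvol_marked_suspended: bool) -> bool:
    """Return True if the M-VOL is fenced after the pair is suspended.

    rvol_marked_suspended: whether the MCU was able to change the
    R-VOL pair status to suspended.
    """
    if level is FenceLevel.NEVER:
        # Never: the M-VOL always stays available for updates.
        return False
    if level is FenceLevel.RVOL_STATUS:
        # R-VOL Status: fenced only if the R-VOL status change failed.
        return not rvol_marked_suspended
    raise ValueError(f"fence level not covered by this sketch: {level!r}")
```

For example, mvol_is_fenced(FenceLevel.RVOL_STATUS, False) returns True, because the MCU could not change the R-VOL pair status to suspended.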
NOTE:
To exchange CRIT=Y(ALL) and CRIT=Y(PATHS), XP12000/XP10000 system option mode 36 can
be used. For more information on the XP12000/XP10000 modes, see
Transferring sense information between sites
When the MCU (or RCU for TC390A) suspends a TC390 pair due to an error condition, the MCU/RCU
sends sense information with unit check status to the appropriate host(s). This sense information is used
during disaster recovery to determine the currency of the R-VOL. If the host system does not support IBM
PPRC, you must transfer the sense information to the remote site through the error reporting communications
(ERC). If the host system supports IBM PPRC and receives PPRC-compatible sense information related to a
TC390 pair, the host operating system will:
1. Temporarily suspend all application I/O operations to the M-VOL.
2. Enter an IEA491E message in the system log (SYSLOG) that indicates the time that the M-VOL was
suspended. Verify that the system log is common to both the main and remote operating systems.
3. Place specific information about the failure (SIM) in the SYS1.LOGREC dataset for use by service
personnel. For more information on the TC390 SIMs, see ”
4. Wait for the IEA491E message to reach the remote system.
5. Resume all host application I/O operations to the M-VOL. If the M-VOL fence level setting does not
allow subsequent updates, the MCU will return a unit check for all subsequent write I/O operations and
the application will terminate.
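During disaster recovery, the IEA491E entries in a copy of the system log can be located with a simple text scan. The following is a minimal sketch, assuming the log has been exported as plain text lines. The function name is hypothetical, and the layout of SYSLOG records and of the message text is deliberately not assumed: the scan only looks for the message identifier.

```python
from typing import Iterable, List

# Message identifier named in the steps above.
MESSAGE_ID = "IEA491E"


def find_suspend_messages(syslog_lines: Iterable[str]) -> List[str]:
    """Return every log line that contains the IEA491E message identifier.

    No assumption is made about the record layout; each matching line is
    returned unchanged apart from its trailing newline.
    """
    return [line.rstrip("\n") for line in syslog_lines if MESSAGE_ID in line]
```

The function accepts any iterable of strings, so an open text file can be passed to it directly. Extracting the suspension time from the matching lines depends on the site's log format and is left out of this sketch.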
NOTE:
Verify that the MCUs and RCUs are configured to report the service-level SIMs to the host. Select
the Service SIM of Remote Copy = Report setting on the RCU Option window.
File and database recovery procedures
When a TC390 Synchronous pair is suspended or when the MCU fails due to a disaster, the R-VOL may
contain in-process data. A data set could be open or transactions may not have completed. Even if you use
the R-VOL Data fence level for all TC390 Synchronous pairs, you need to establish file recovery
procedures. These procedures should be the same as those used for recovering any volume that is
inaccessible due to control unit failure. These procedures are more important if the R-VOL Status or Never
fence level settings are used.
TC390A does not provide any procedure for detecting and retrieving lost updates. To detect and recreate
lost updates, you must check other current information, such as a database journal log file that was active
at the primary system when the disaster occurred. Note that the journal log file entries of most DBMSs have
the same system TOD clock information that is used for the I/O time-stamps (when timer type = system).
The TC390A group consistency time can be extremely useful when performing this detection and retrieval.
Because this detection/retrieval process can take a while, your disaster recovery scenario should be
designed so that detection and retrieval of lost updates is performed after the application has been started
at the secondary system.
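One way to use the group consistency time is sketched below. This is an illustration under stated assumptions, not a TC390A procedure: it assumes that journal entries stamped later than the group consistency time are the candidates for lost updates, and that journal timestamps and the consistency time come from the same system TOD clock. The JournalEntry type and the function name are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List


@dataclass(frozen=True)
class JournalEntry:
    """Hypothetical view of one database journal record."""
    timestamp: datetime   # system TOD clock time recorded by the DBMS
    description: str


def candidate_lost_updates(
    entries: Iterable[JournalEntry], consistency_time: datetime
) -> List[JournalEntry]:
    """Return journal entries stamped after the group consistency time.

    Assumption: updates later than the consistency time may not have
    reached the R-VOLs and need to be checked and possibly recreated.
    """
    later = [e for e in entries if e.timestamp > consistency_time]
    return sorted(later, key=lambda e: e.timestamp)
```

The returned list is sorted by timestamp so that any updates that need to be recreated can be reviewed in their original order.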