Failback, Modifying your failover policy – Dell PowerVault 775N (Rackmount NAS Appliance) User Manual
For example, if an application depends on a Physical Disk resource, the Cluster Service takes the application offline
first, allowing the application to write changes to the disk before the disk is taken offline.
The resource is taken offline.
Cluster Service takes a resource offline by invoking, through the Resource Monitor, the resource DLL that manages the
resource. If the resource does not shut down within a specified time limit, the Cluster Service forces the resource to
shut down.
The group is transferred to the next preferred host node.
When all of the resources are offline, the Cluster Service attempts to transfer the group to the node that is listed next
on the group's list of preferred host nodes.
For example, if cluster node 1 fails, the Cluster Service moves the resources to the next-numbered cluster node, which is
cluster node 2.
The group's resources are brought back online.
If the Cluster Service successfully moves the group to another node, it tries to bring all of the group's resources online.
Failover is complete when all of the group's resources are online on the new node.
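The sequence above can be sketched in a few lines of Python. This is only an illustrative model, not the Cluster Service's actual implementation: the names Resource, offline_order, and fail_over are hypothetical, and wrapping around to the start of the preferred host list is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    name: str
    depends_on: list = field(default_factory=list)  # names of resources this one needs
    online: bool = True


def offline_order(resources):
    """Return resource names so that dependents come before what they depend on."""
    by_name = {r.name: r for r in resources}
    visited, order = set(), []

    def visit(name):
        if name in visited:
            return
        visited.add(name)
        for dep in by_name[name].depends_on:
            visit(dep)
        order.append(name)  # a dependency is appended before its dependents

    for r in resources:
        visit(r.name)
    return list(reversed(order))  # dependents first, e.g. application before disk


def fail_over(resources, preferred_nodes, failed_node):
    """Take resources offline, pick the next preferred node, bring them back online."""
    by_name = {r.name: r for r in resources}
    order = offline_order(resources)
    for name in order:
        by_name[name].online = False  # step 1: offline in dependency order
    # Step 2: next node on the preferred list (wrap-around is an assumption).
    idx = preferred_nodes.index(failed_node)
    target = preferred_nodes[(idx + 1) % len(preferred_nodes)]
    for name in reversed(order):
        by_name[name].online = True  # step 3: online again, dependencies first
    return target, order
```

With an application that depends on a Physical Disk resource, offline_order places the application ahead of the disk, which matches the example given for the first step.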
The Cluster Service keeps trying to fail over a group until it succeeds or until it reaches the maximum number of attempts
allowed within a predetermined time span. A group's failover policy specifies the maximum number of failover attempts that can
occur in an interval of time. The Cluster Service discontinues the failover process when the number of attempts exceeds the
limit in the group's failover policy.
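One way to picture an attempt limit is a sliding time window. The sketch below is a simplified model, not the Cluster Service's real algorithm: the class name FailoverLimiter, the use of seconds, and the sliding-window behavior are all assumptions chosen for illustration.

```python
from collections import deque


class FailoverLimiter:
    """Allow at most `threshold` failover attempts within any `period` seconds."""

    def __init__(self, threshold, period):
        self.threshold = threshold
        self.period = period
        self.attempts = deque()  # timestamps of recent attempts, oldest first

    def allow(self, now):
        # Forget attempts that have fallen out of the time window.
        while self.attempts and now - self.attempts[0] >= self.period:
            self.attempts.popleft()
        if len(self.attempts) >= self.threshold:
            return False  # limit reached: stop trying to fail over
        self.attempts.append(now)
        return True


limiter = FailoverLimiter(threshold=3, period=60)
results = [limiter.allow(t) for t in (0, 10, 20, 30, 70)]
```

In this run the fourth attempt (at 30 seconds) is refused because three attempts already fall inside the 60-second window, while the attempt at 70 seconds is allowed again because the two oldest attempts have aged out.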
Modifying Your Failover Policy
Because a group's failover policy provides a framework for the failover process, make sure that your failover policy is
appropriate for your particular needs. When you modify your failover policy, consider the following guidelines:
Define how the Cluster Service detects and responds to individual resource failures in a group.
Establish dependency relationships between the cluster resources to control the order in which the Cluster Service
takes resources offline.
Specify Time-out, failover Threshold, and failover Period for your cluster resources:
Time-out controls how long the Cluster Service waits for the resource to shut down.
Threshold and Period control how many times the Cluster Service attempts to fail over a resource in a
particular period of time.
Specify a Possible owner list for your cluster resources. The Possible owner list for a resource controls which
cluster nodes are allowed to host the resource.
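The settings in these guidelines can be thought of as a small record attached to each resource. The following sketch is hypothetical: ResourcePolicy, its field names, the units, and the sample values are assumptions for illustration and do not correspond to an actual Cluster Service API.

```python
from dataclasses import dataclass


@dataclass
class ResourcePolicy:
    timeout_s: int          # Time-out: how long to wait for a clean shutdown
    threshold: int          # Threshold: maximum failover attempts...
    period_s: int           # Period: ...allowed within this many seconds
    possible_owners: tuple  # Possible owner list: nodes allowed to host the resource


def eligible_hosts(policy, live_nodes):
    """Return the live nodes that the Possible owner list permits, in order."""
    return [n for n in live_nodes if n in policy.possible_owners]


def must_force_shutdown(policy, elapsed_s):
    """True once a resource has taken longer than its Time-out to shut down."""
    return elapsed_s > policy.timeout_s


# Hypothetical values, chosen only to show how the settings interact.
policy = ResourcePolicy(timeout_s=180, threshold=3, period_s=900,
                        possible_owners=("node1", "node2"))
```

Here a resource limited to node1 and node2 would never be placed on a third node, even if that node were the only one still running.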
Failback
When the system administrator repairs and restarts the failed cluster node, the opposite process may occur. After the original
cluster node restarts and rejoins the cluster, the Cluster Service takes the running application and its resources
offline, moves them from the failover cluster node to the original cluster node, and then restarts the application. This process of
returning the resources to their original cluster node is called failback.
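As a rough illustration, failback can be modeled as the same three steps run in the opposite direction. The function fail_back and its arguments are hypothetical names used only for this sketch, which assumes failback happens as soon as the original node is a cluster member again.

```python
def fail_back(group_host, original_node, cluster_members):
    """Return the group's new host and the ordered steps taken to move it."""
    # Nothing to do if the original node has not rejoined, or already hosts the group.
    if original_node not in cluster_members or group_host == original_node:
        return group_host, []
    steps = [
        f"take application and resources offline on {group_host}",
        f"move group from {group_host} to {original_node}",
        f"restart application on {original_node}",
    ]
    return original_node, steps
```

Calling fail_back("node2", "node1", {"node1", "node2"}) returns node1 as the new host, while the same call with node1 missing from the member set leaves the group on node2.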